@anonymous-piwik-user opened this issue on June 18th 2012

I have a problem on one of my sites that uses ISO-8859-1 encoded pages. When tracking Firefox (Mac FF 12.0), the URL shows up as "f%F6rvaltning". No problem here: the character "ö" is URL-encoded with the ISO encoding (the UTF-8 encoding of the same character would be %C3%B6 instead of %F6).

But when tracking a Safari browser (Mac SF 5.1), it shows up as: "frvaltning"

This leads to duplicate entries in log_action and does not look so good in the reports.

I have not seen this problem in the sites using UTF-8 encoding so I believe it is limited to ISO-8859-1 but I might be wrong.
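
To make the two encodings of "ö" mentioned above concrete, here is a minimal JavaScript sketch. The helper name `latin1PercentEncode` is made up for illustration and is not part of Piwik; it relies only on the fact that ISO-8859-1 maps code points 0x00 to 0xFF to single bytes one to one.

```javascript
// Percent-encoding of "ö" (U+00F6) under the two charsets discussed here.

// encodeURIComponent always emits UTF-8 bytes: C3 B6 for "ö".
const utf8Form = encodeURIComponent("\u00F6");

// Hypothetical helper: percent-encode one character as a single
// ISO-8859-1 byte. Only valid for code points up to 0xFF.
function latin1PercentEncode(ch) {
  const code = ch.charCodeAt(0);
  if (code > 0xff) {
    throw new RangeError("character is not representable in ISO-8859-1");
  }
  return "%" + code.toString(16).toUpperCase().padStart(2, "0");
}

const latin1Form = latin1PercentEncode("\u00F6");

console.log(utf8Form);   // %C3%B6
console.log(latin1Form); // %F6
```

The same visible character therefore produces two different URL strings, which is why the two forms can end up as separate rows in log_action.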

@mattab commented on June 19th 2012

Thanks for the report. Are you able to reproduce the issue with a very simple HTML page that contains the Piwik JS code? That would be VERY useful. If so, can you please attach the page you use to reproduce the issue here?


@anonymous-piwik-user commented on June 20th 2012

I have tried the following browsers:

WI7 FF  13.0
WI7 SF  5.1
WI7 CH  19.0

When looking at log_visit and live visitors for the test page (see urlencode.zip) everything looks to be in order. On the Visitors tab I see invalid characters with the latin1 page (see screenshot.31.png).

I have not been able to recreate the original issue yet.

The original issue is that invalid characters are saved to the log_visit table. I still see invalid characters in live visitors, and being saved to log_visit, with Safari 5.1 on Mac, iPhone and iPad, so I will try to get hold of one of these and attempt to recreate the issue again.

@anonymous-piwik-user commented on June 20th 2012

OK, found it: the URL with encoded characters needs to be interpreted by the browser as a folder or file, not as parameters. (It is OK to use rewrite rules in the web server.)

I am attaching a file that should replicate the error using Safari 5.1 on Win 7 with IIS when clicking the Latin1 link. (Replicating the error on Linux might require rewrite rules; let me know if I should post an Apache config that replicates this.)

@mattab commented on July 19th 2012

Thanks, we'll try to take a look soon, before the next release, unless you submit a patch first ;)

@robocoder commented on August 1st 2012

We can't determine the page's encoding using cross-browser javascript.

If you have latin1 characters in filenames, you'll have to utf8_encode it, e.g.,
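
`utf8_encode` is a PHP function that converts ISO-8859-1 text to UTF-8. As a rough client-side illustration of the same idea, the sketch below rewrites ISO-8859-1 percent-escapes in a URL as UTF-8 percent-escapes. The function name `toUtf8PercentEncoding` is hypothetical and not part of Piwik, and the approach assumes that any URL that fails UTF-8 decoding was encoded as ISO-8859-1.

```javascript
// Hypothetical helper: rewrite ISO-8859-1 percent-escapes as UTF-8 ones.
function toUtf8PercentEncoding(url) {
  try {
    // If every escape sequence is already valid UTF-8, decodeURI succeeds
    // and the URL is returned unchanged.
    decodeURI(url);
    return url;
  } catch (e) {
    // decodeURI throws URIError on invalid UTF-8; rethrow anything else.
    if (!(e instanceof URIError)) {
      throw e;
    }
  }
  // Assumption: escapes for bytes 0x80-0xFF are ISO-8859-1, where each byte
  // value equals the Unicode code point. Re-encode each one as UTF-8.
  return url.replace(/%([89A-Fa-f][0-9A-Fa-f])/g, function (match, hex) {
    return encodeURIComponent(String.fromCharCode(parseInt(hex, 16)));
  });
}

console.log(toUtf8PercentEncoding("/f%F6rvaltning"));    // /f%C3%B6rvaltning
console.log(toUtf8PercentEncoding("/f%C3%B6rvaltning")); // /f%C3%B6rvaltning
```

Assuming the asynchronous `_paq` tracker is in use, the normalized value could be passed to Piwik with `_paq.push(['setCustomUrl', toUtf8PercentEncoding(location.href)])` before the page view is tracked. This is a heuristic: an ISO-8859-1 sequence that happens to be valid UTF-8 is left as-is, and it does not help when the browser sends the raw byte without percent-encoding it.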

This issue was closed on August 1st 2012
