@cauboy opened this Issue on October 13th 2014

Problem:

The conversion rates in the Goal Plugin are floats but always integer valued (1,0%, 2,0%, …) on a server located in Germany.

Reason:

In PHP the automatic float-to-string conversion is based on the value of setlocale(LC_NUMERIC). In English speaking countries a dot is used as the decimal separator while in some countries, e.g. Germany, a comma is used.
Example, where this issue is critical:

setlocale(LC_NUMERIC, "en_US");
$query = "INSERT INTO `values` SET `value` = $PI, `key` = 'PI'";
echo $query;
// output: INSERT INTO `values` SET `value` = 3.1415926535898, `key` = 'PI'
setlocale(LC_NUMERIC, "de_DE");
$query = "INSERT INTO `values` SET `value` = $PI, `key` = 'PI'";
// output: INSERT INTO `values` SET `value` = 3,1415926535898, `key` = 'PI'

This is harmful as the SQL query becomes invalid in the latter example.

Fix

Probably not a good solution, but for me the following change in piwik/core/Db.php solved the problem:

    public static function query($sql, $parameters = array())
    {
        setlocale(LC_NUMERIC, 'en_US'); // this line is a quick fix for this issue
        try {
            self::logSql(__FUNCTION__, $sql, $parameters);

            return self::get()->query($sql, $parameters);
        } catch (Exception $ex) {
            self::logExtraInfoIfDeadlock($ex);
            throw $ex;
        }
    }

Here are two posts about the setlocale issue:
http://mark-story.com/posts/view/php-floats-localization-and-landmines
http://blog.azure.km.ua/2012/12/php-typecasting-float-to-string-trap.html

@mattab commented on November 3rd 2014 Owner

Thanks for the bug report!

until we properly support locale number representations in #4114 we should probably add setlocale(LC_NUMERIC, 'en_US'); to the core/frontcontroller.php init() function

could you confirm it works for you to add the code in https://github.com/piwik/piwik/blob/master/core/FrontController.php#L311-310 ?

@cauboy commented on November 3rd 2014

Doesn't seem to work there. I updated PIWIK to v2.8.3 and cleared the numeric values via DROP TABLE archive_numeric_2014_11. Then I put setlocale(LC_NUMERIC, 'en_US'); just after the try{ in line 311. I am a bit wondering why my Piwik installation does have the try part and the code behind your link doesn't… Anyway, shouldn't make a difference. Here's the code fragment:

 public function init()
    {
        static $initialized = false;
        if ($initialized) {
            return;
        }
        $initialized = true;

        try {

            // make sure decimal points are used for string representation
            setlocale(LC_NUMERIC, 'en_US');

            Registry::set('timer', new Timer);

            $directoriesToCheck = array(
                '/tmp/',
                '/tmp/assets/',
                '/tmp/cache/',
                '/tmp/logs/',
                '/tmp/tcpdf/',
                '/tmp/templates_c/',
            );
            …

Not sure why that didn't work our. So I tested to put setlocale(LC_NUMERIC, 'en_US'); in core/Db.php and everything works fine again…

@tsteur commented on November 4th 2014 Owner

How did you start the archiving? Probably it was overwritten by https://github.com/piwik/piwik/blob/master/core/Translate.php#L222-L223 . Can you try to add it to FrontController and comment out the two lines in Translate.php temporarily?

@cauboy commented on November 4th 2014

For archiving I just opened …/piwik/index.php?module=CoreHome&action=index&idSite=1&period=day&date=today#/module=Goals&action=ecommerceReport&idSite=1&period=day&date=today&idGoal=ecommerceOrder

So It was indeed overwritten. Here's what my working code is looking like:

core/Translate.php (line 220 - 226)

 private static function setLocale()
    {
        $locale = $GLOBALS['Piwik_translations']['General']['Locale'];
        // $locale_variant = str_replace('UTF-8', 'UTF8', $locale);
        // setlocale(LC_ALL, $locale, $locale_variant);
        setlocale(LC_CTYPE, '');
    }

and

core/FrontController.php (Line 303 - 316)

    public function init()
    {
        static $initialized = false;
        if ($initialized) {
            return;
        }
        $initialized = true;

        try {

            // make sure decimal points are used in string representation
            setlocale(LC_NUMERIC, 'en_US');

            Registry::set('timer', new Timer);
@tsteur commented on November 5th 2014 Owner

Maybe we should use something like numer_format() http://us2.php.net/manual/en/function.number-format.php before writing the value to the database? Don't think there are so many float columns so might be doable?

@mattab commented on November 5th 2014 Owner

:+1: for number_format the value before writing in archive_numeric_*

@tsteur commented on November 5th 2014 Owner

What about goals / conversion table etc?

@mattab commented on November 5th 2014 Owner

+1 forgot that are other FLOAT columns that this should apply also besides this DOUBLE

@tsteur commented on November 5th 2014 Owner

Alright. Figured out we cannot use number_format as we cannot tell it to keep the same precision. We would either round or add more precision. So looking for something else.

@tsteur commented on November 5th 2014 Owner

From what I have seen so far the best solution seems to be a str_replace(',', '.', $float). The code for this will be in one method so we will be able to change it.

@tsteur commented on November 5th 2014 Owner

The tests seem to still pass but the more I think about it the more it is a hack and it will most likely fail again at some point. Also it is very confusing when to use/convert it and when not. I am even sure I haven't covered all places and there could be also errors in some calculations.

It also affects API requests. Requesting something in JSON + language DE will return a float as an actual float as supported by JSON using a "dot" see http://demo.piwik.org/index.php?module=API&method=API.getProcessedReport&idSite=7&period=day&date=2014-11-04&apiModule=UserCountry&apiAction=getCountry&showTimer=1&format=JSON&token_auth=anonymous&language=de whereas when requesting XML one will get a float with a comma as it was converted to a string in PHP before rendering it http://demo.piwik.org/index.php?module=API&method=API.getProcessedReport&idSite=7&period=day&date=2014-11-04&apiModule=UserCountry&apiAction=getCountry&showTimer=1&format=XML&token_auth=anonymous&language=de

To actually fix this issue we should set the locale to en_* in the beginning and also make sure this locale is installed. If not, show an error in SystemCheck etc. Before rendering something we should set the locale to the correct value. Eg if someone has configured German language it should use de_DE. Problem is we render everywhere, even in files like Session.php before throwing an exception. Or sometimes we might render something, then query or calculate data and render something again. So there is no clear point I think when we can set the de_DE locale.

Alternatively we could maybe always set en_* as locale and then in the UI whenever we output any numbers call filters that make sure to output the numbers correctly formatted with a comma or dot etc. I think Twig supports this already and as we want to get rid of Twig long term there are similar libs for JS / AngularJS. This means we'd always have to take care of calling a filter when outputting a number / float but I think this is much better as the current implemented fix see https://github.com/piwik/piwik/commit/38047056e83ccb00ef606fbdb921f6ddf6ef1f68 . As described in the first paragraph this is rather a hack and very hard to understand when to use this method and when not.

@diosmosis @mnapoli @mattab thoughts?

@tsteur commented on November 6th 2014 Owner

Here is a nice example of what I meant: When using German locale (or any other locale using comma instead of dot) it results in 'UTC-5,5' instead of 'UTC-5.5' which probably nobody would expect... resulting in a bug that it is not possible to add a site under circumstances as it says "Invalid Timezone" Such bugs are almost impossible to find: https://github.com/piwik/piwik/blob/19a2adbe7d8d7c9033cb99b4e57d2c5db515e836/plugins/SitesManager/API.php#L1247

Even safe_serialize did serialize wrong: https://github.com/piwik/piwik/commit/6e01277e60cd14d3bc6b38ac37cc46015d310011 and getRequestVar was buggy https://github.com/piwik/piwik/commit/f824e9ad93a4588c682da259bae27ea42be22367

Not sure how many more such issues there are. They are hard to identify and it is almost impossible to prevent more of those bugs unless we always work with en_* locale I think

@tsteur commented on November 6th 2014 Owner

If we do not always use a German locale I propose to run all tests using German locale in the future as most likely most of our users are using a locale that use a comma instead of a dot. We wouldn't notice such bugs otherwise

@mattab commented on November 6th 2014 Owner

It seems to me that we should try to set the locale to en_US or so - wouldnt' that help solve many problems and we wouldnt have to care as much about wrong locales?

Though I know it's hard to set the setlocale since I see in codebase some other setlocale calls: https://github.com/piwik/piwik/search?utf8=%E2%9C%93&q=setlocale

@tsteur commented on November 6th 2014 Owner

Yes, as mentioned. Problem is we then have to always format numbers manually when rendering anything to make sure we display a number using a comma instead of a dot in floats when a German or similar locale is used. This can be loads of work as well.

Ideally Piwik wouldn't render everywhere in the code and then it would be easier to handle... so we probably have to weigh up the pros and cons

@tsteur commented on November 6th 2014 Owner

There are code parts where it is hard to tell whether it causes bugs or not see https://github.com/piwik/piwik/blob/master/plugins/Actions/Reports/Base.php#L74

@tsteur commented on November 6th 2014 Owner

I just checked the Twig templates if they output any floats for usage in JavaScript but haven't found any yet. Also clicked through the UI with a German locale and couldn't find any JS error. If someone outputs something like var x = {{ 5.55 }} in a Twig template it would lead in a JS error because var x = 5,5

@tsteur commented on November 7th 2014 Owner

I think most of the issues related to German locale (or other locale) are fixed but it is hard to tell as most parts of the code are not really covered by tests. Especially not by integration / unit tests which would be very helpful here.

I think we can merge the current changes so it'll work at least for now (mostly). Then I think there can be only one conclusion the more I think about it: We always have to set an en_* locale and format numbers in Twig templates and API (if we want so) etc using Twig filter or using a library or so.

It is easy to explain: While we can live with the fact or bug when a number is not formatted correct in the UI (eg 5.5 instead of 5,5 => easy to fix) we cannot let the platform track or calculate wrong values. It is so easy to cause a bug / wrong values if not a en_* locale is used and those bugs are in places where nobody would expect them see the changes and my comments. Even the PiwikTracker was sending wrong values with non en_* locale. Another thing is that we cannot make sure all the libraries work correctly. For instance the pdf library uses floats quite a lot but also Zend etc. Last but not least we cannot expect plugin developers to be aware of this problem. The worst part: As we do not use a German locale in our tests we would not even notice once we track or display wrong metrics.

Be aware that our code sets the German locale as soon as someone uses German language. Probably most of our users use a language where a comma is used instead of a dot. As soon as this locale is installed on the server we track wrong values.

I am not sure how much work it is and whether we will be able to format the numbers everywhere but it is worth starting it even if it'll take a while. I think it is also worth it to run the UI tests with a locale/language that uses a comma instead of a dot in floats. Shall we handle all this in a separate issue?

@mattab commented on November 7th 2014 Owner

@cauboy can you please try with 2.9.0-b7 just released? http://piwik.org/faq/how-to-update/faq_159/

This should be fixed, we'll close the issue once you confirm it's ok!

@mattab commented on November 10th 2014 Owner

ping @cauboy

@cauboy commented on November 10th 2014

@mattab: Sorry for the late response… was a busy weekend.

Approved: I updated to 2.9.0-b8 and it works fine for me.

@mattab commented on November 11th 2014 Owner

@cauboy that's what we like to hear! kuddos @tsteur

This Issue was closed on November 11th 2014
Powered by GitHub Issue Mirror