@mattab opened this issue on March 9th 2009

Baidu is the biggest search engine in China, and Piwik currently fails to detect keywords from Baidu.

Example queries:



Resolving this issue involves writing unit tests to cover these bits of code. We should also check whether the code path around line 715 in core/Tracker/Visit.php is useful; if not, fix it or delete it.

@robocoder commented on March 10th 2009

The problems with baidu might be more complex than at first glance:

- the second url uses the variable name "word" instead of "wd"
- gb2312 is an encoding; are the keywords not utf-8?
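
To make the two concerns concrete, here is a minimal sketch in Python (not Piwik's actual PHP code) of a keyword extractor that accepts either parameter name and falls back from utf-8 to gb2312. The function name `extract_keyword`, the parameter order, and the sample URLs are assumptions made for illustration; they are not the example queries from the original report.

```python
from urllib.parse import urlsplit, unquote_to_bytes


def extract_keyword(url, param_names=("wd", "word")):
    """Return the search keyword from a referrer URL as a Unicode string.

    Hypothetical helper: looks for any of `param_names` in the query
    string, percent-decodes the value to raw bytes, then tries utf-8
    first and falls back to gb2312 when the bytes are not valid utf-8.
    """
    query = urlsplit(url).query
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if name in param_names and value:
            # Decode to bytes first so the charset can be chosen afterwards.
            raw = unquote_to_bytes(value.replace("+", " "))
            try:
                return raw.decode("utf-8")
            except UnicodeDecodeError:
                # Assumption: non-utf-8 Baidu keywords are gb2312-encoded.
                return raw.decode("gb2312", errors="replace")
    return None


# Made-up URLs: the same two-character keyword, once gb2312-encoded under
# "wd" and once utf-8-encoded under "word".
gb_url = "http://www.baidu.com/s?wd=%D6%D0%CE%C4"
utf8_url = "http://www.baidu.com/s?word=%E4%B8%AD%E6%96%87"
print(extract_keyword(gb_url) == extract_keyword(utf8_url))
```

Trying utf-8 first works here because gb2312 byte pairs are rarely valid utf-8 sequences, but it is a heuristic; an explicit charset hint in the URL, when present, would be more reliable.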

@mattab commented on March 20th 2009

Also see #435, which is closely related.

@mattab commented on March 24th 2009

(In [1014])

- cleaning up the search engine parsing code, adding tests, recording UTF8 keywords in the DB rather than encoded (as tables are now utf8, refs #5730)
- adding tests in url.test.php and fixed double encoding in some edge cases
- fixed #589 Piwik fails to properly decode and store some chinese keywords (eg. from baidu.com)
- fixed #435 Exotic encoded keywords should be stored as utf-8 in the DB
- refs #575 hopefully fixed, will give it a few days of tests on piwik.org

This issue was closed on March 24th 2009
