@mattab opened this Issue on June 20th 2017 Owner

Reproduce

  • Open outlinks report on demo
  • Find an outlink with an & in the URL such as http://demo.piwik.org/index.php?module=SitesManager&action=downloadPiwikTracker&idSite={$IDSITE}&piwikUrl=http://piwik.example.org/
  • Click on the segmented visitor log icon

Got

No data:

segmented visitor log no data

URL requested was

https://demo.piwik.org/index.php?segment=outlinkUrl==http%3A%2F%2Fdemo.piwik.org%2Findex.php%3Fmodule%3DSitesManager%26amp%3Baction%3DdownloadPiwikTracker%26amp%3BidSite%3D%7B%24IDSITE%7D%26amp%3BpiwikUrl%3Dhttp%3A%2F%2Fpiwik.example.org%2F&date=today&module=Live&action=indexVisitorLog&disableLink=1&small=1&hideProfileLink=1&period=week&idSite=1

Expected instead

The visitor log should show all users who clicked on that outlinks.

There seems to be a problem with the encoding, the & is encoded too many times and end up as amp etc.

Similar to https://github.com/piwik/piwik/issues/10126 https://github.com/piwik/piwik/issues/6287 https://github.com/piwik/piwik/issues/8395

@sgiehl commented on July 17th 2017 Member

Tried to fix that for a while now, but it seems to be a bit more tricky then I thought.
I don't think the problem is double encoding. The Outlink might be stored with & in the piwik database, so it would be correct to include that in the segment. Currently not sure, why it doesn't work. Always decoding the segment value didn't help.

@mneudert would you maybe be keen to have a look here? :slightly_smiling_face:

@mneudert commented on July 18th 2017 Member

I have taken a good look at the behavior and perhaps can explain what is going on (ignoring the potential encoding error in the popup title).

To clarify my findings I whipped together some hackish fix. One hackish enough to break a lot of other things completely unrelated...

Let's begin the journey with the popup URL getting opened. The URL is both url- and html-encoded (%26amp%3B) in the segment GET-Parameter. This way it gets transported to the constructor of \Piwik\Segment and the funny things begin. The first round of initializeSegment exits with an Exception because the semicolon of & gets treated as an AND-Operator and that leads to a broken expression tree. The second check is not much more helpful resulting in an SQL-Statement containing ( 1 = 0 ) (I think the SQL_WHERE_DO_NOT_MATCH_ANY_ROW from \Piwik\Segment\SegmentExpression is the one kicking in here) and therefore no results.

My first fix was to tell the Expression parser what an & is and that it is a good thing. Resulting in a properly parsed expression tree containing [ 'outlinkUrl', '==', 'url?with=some&ampersand=stuff' ]. But during the following cleaning process the TableLogAction::getIdActionFromSegment now messes up the encoding and returns it completely decoded. Without the & in place. And now we are back to no results because the database has the url encoded. Removing the sanitization for outlinks helped to fix that one but that seems really off to me and might break more than worth it.

A different way might be to double-encode the segmentation parameter. But that seems equally hackish... perhaps Travis can shed some light on what gets broken...

@sgiehl commented on July 19th 2017 Member

@mneudert Well, no tests have been failing due to your changes. But guess we should start by adding a couple of new tests for segments (see #11384), so we can later prove changes required to fix this here, don't break anything else.

Note: Moving out of 3.0.5 as we aim to release this soon

Powered by GitHub Issue Mirror