There should be a way to switch from POST bulk requests to GET.
Example scenario: Users want to migrate Piwik to new infrastructure without losing data.
Outcome: It won't be possible to replay the gap from logs, because the access log will contain POST requests without the needed parameters.
This issue also causes some partial data loss any time the log replay option --replay-logs is used (learn more in the newly added FAQ about log replay). Maybe web servers can be configured to log the POST data as well. If not, maybe the only alternative would be to let users e.g. disable the POSTing of Content Tracking data and have it sent as GET requests instead. This would have performance implications for the end user on the website. cc @tsteur maybe you have some idea about this?
Not really. We can use bulk tracking with GET. I tried it with content tracking initially but ran into issues with the max URL length, which is configured differently everywhere. We'd have to assume a length that is most likely safe everywhere. Depending on the content tracking data, we may have to send one request per content impression; sometimes we can group two or three content impressions together. The problem is that when we have to do multiple requests we will run into the 0 visits bug, so we'd have to delay each request by about 800ms, meaning some banners might not be tracked if the user is not on the site long enough, etc. See #6415. This bug can be fixed with a queue, see #6075, but that needs to be implemented and requires special software, e.g. Redis. Personally I'd like to work on #6075 soonish anyway, but it won't fix it for all users.
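To illustrate the grouping problem described above, here is a minimal sketch of greedily packing content impressions into GET URLs under an assumed length limit. It is not how piwik.js does it: the `requests` query parameter, the JSON encoding, the 2000-character limit and the function names are all assumptions made for the example.

```python
import json
from urllib.parse import urlencode

# Assumed conservative limit; real limits differ per server, proxy and browser.
MAX_URL_LENGTH = 2000


def build_url(base_url, batch):
    # Hypothetical encoding: the whole batch travels as JSON in one
    # "requests" query parameter.
    payload = json.dumps(batch, separators=(",", ":"))
    return base_url + "?" + urlencode({"requests": payload})


def pack_impressions(base_url, impressions, max_len=MAX_URL_LENGTH):
    """Group impressions into as few GET URLs as fit within max_len."""
    urls = []
    batch = []
    for impression in impressions:
        candidate = batch + [impression]
        if len(build_url(base_url, candidate)) <= max_len:
            batch = candidate
            continue
        if batch:
            urls.append(build_url(base_url, batch))
        # An impression that is too long on its own is still sent alone;
        # this sketch does not try to split or truncate it.
        batch = [impression]
    if batch:
        urls.append(build_url(base_url, batch))
    return urls
```

With long content names or pieces, each URL may hold only one or two impressions, which is where the multiple-requests problem and the delay between requests come in.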
@quba do you have some more information? Honestly, I'm not sure what the best next step is here, besides accepting some data loss until we figure out a proper solution to this challenge.
@mattab, @tsteur: maybe we should increase the priority of building a queue solution so there won't be a need to replay from logs (instead we could replay from a queue that accepts both types of requests).
I am working on the queue already
The solution for this issue will be to use the new QueuedTracking plugin. This plugin will store the requests in a Redis database, including the POST values. This will let us replay all the requests the same way they were initially sent via piwik.js. Unfortunately, we cannot make Content Tracking Log Replay work with MySQL only, as we do not store POST values and the web server log files don't log POST values.
See the doc at: https://github.com/piwik/plugin-QueuedTracking#readme
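As a rough illustration of why a queue helps here, the sketch below stores each tracking request together with its POST body and replays it later. It does not reflect the actual QueuedTracking implementation: the `RequestQueue` class, its methods and the stored fields are hypothetical, and an in-memory deque stands in for Redis.

```python
import json
from collections import deque


class RequestQueue:
    """In-memory stand-in for a Redis list (assumption for this sketch)."""

    def __init__(self):
        self._items = deque()

    def push(self, method, query, body=""):
        # Serialize the whole request so the POST body is kept,
        # unlike a typical web server access log line.
        record = {"method": method, "query": query, "body": body}
        self._items.append(json.dumps(record))

    def replay(self, handler):
        """Pass each stored request to handler in arrival order."""
        count = 0
        while self._items:
            handler(json.loads(self._items.popleft()))
            count += 1
        return count
```

Because GET and POST requests are stored in the same form, the replay step can treat both types alike.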