@tsteur opened this Pull Request on December 18th 2014 Owner

refs #6850

Workflow is basically as follows:

  • Tracker will "remember" an archived report to invalidate if the tracked date is not today (in the website's timezone)
  • Before archiving, we invalidate all "remembered" reports
  • Once a report has been invalidated, we "forget" the remembered entry so that it is not invalidated again on every run.

For each archived report to remember we store an entry in the option table. We make sure to create only one entry per idSite and date.
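To make the remember / invalidate / forget cycle concrete, here is a minimal sketch in Python. It is not the PHP code from this Pull Request: the class `InvalidationStore`, its method names, the key format, and the in-memory dict standing in for the option table are all assumptions made for illustration.

```python
from datetime import date


class InvalidationStore:
    """In-memory stand-in for the option table (hypothetical, not Piwik's API)."""

    def __init__(self):
        # option name -> (id_site, day); one entry per idSite and date
        self.options = {}

    @staticmethod
    def _key(id_site, day):
        # Assumed key format; the real option name may differ.
        return f"report_to_invalidate_{id_site}_{day.isoformat()}"

    def remember(self, id_site, day, today_in_site_tz):
        """Remember a report to invalidate, unless the date is today."""
        if day == today_in_site_tz:
            return  # today's archive is not final yet, nothing to invalidate
        # Writing to the same key twice keeps a single entry per site and date.
        self.options[self._key(id_site, day)] = (id_site, day)

    def invalidate_remembered(self, invalidate):
        """Invalidate every remembered report, then forget it."""
        for key, (id_site, day) in list(self.options.items()):
            invalidate(id_site, day)
            del self.options[key]  # "forget" so it is not invalidated again


store = InvalidationStore()
today = date(2014, 12, 18)
store.remember(1, date(2014, 12, 17), today)
store.remember(1, date(2014, 12, 17), today)  # duplicate, still one entry
store.remember(1, today, today)               # ignored, date is today

invalidated = []
store.invalidate_remembered(lambda id_site, day: invalidated.append((id_site, day)))
```

After the run, `invalidated` holds a single `(idSite, date)` pair and the store is empty again, so a second archiving run has nothing left to invalidate.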

I removed the invalidate_report call from the log importer for now. If something fails, we should fix the actual bug in core (Tracker or Archiver). If we notice problems before the 2.10 release, we can temporarily add the call back to the log importer, but only if there is an actual issue that we cannot fix quickly.

@mattab commented on December 19th 2014 Owner

Feedback

  • If a user does not use core:archive and simply archives via the browser, the function invalidateArchivedReportsForSitesThatNeedToBeArchivedAgain will not be executed. This creates different code paths for core:archive and browser-triggered archiving.

Besides this, looks nice!!

@tsteur commented on December 19th 2014 Owner

Different code paths are no good. So we'd have to invalidate the reports there https://github.com/piwik/piwik/blob/2.10.0-b9/core/Archive.php#L490 ? Or somewhere else?

@mattab commented on December 19th 2014 Owner

@tsteur I believe this is the best place to put it, yes! But it needs to be super fast, as this function is called thousands and thousands of times when archiving (once for each blob and each set of numeric metrics).

@tsteur commented on December 22nd 2014 Owner

It is now done in Archive::get(). It will invalidate at most once per request. I added a test to make sure it is only done once and that it actually works.

It currently invalidates all "remembered invalid archives" in Archive::get(), which I reckon can take a while if there are many remembered invalid archives for many different sites. We could invalidate only for the sites that are actually requested, but that would make the code more complex... I presume it is needed, maybe?

Edit: I will soon push code that only invalidates archived reports for the requested site IDs, for better performance.
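The "at most once per request, and only for requested sites" idea could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the actual change to Archive::get(): the classes `ArchiveInvalidator` and `Archive`, the `remembered` dict layout, and the per-site guard (rather than a single per-request flag) are all hypothetical.

```python
class ArchiveInvalidator:
    """Hypothetical per-request helper; one instance lives for one request."""

    def __init__(self, remembered):
        # remembered: id_site -> set of ISO date strings (assumed layout)
        self.remembered = remembered
        self.invalidated = []        # records what was invalidated, for inspection
        self._checked_sites = set()  # sites already handled in this request

    def invalidate_for_sites(self, id_sites):
        for id_site in id_sites:
            if id_site in self._checked_sites:
                continue  # cheap set lookup keeps repeated get() calls fast
            self._checked_sites.add(id_site)
            # pop() both reads and "forgets" the remembered dates for this site
            for day in sorted(self.remembered.pop(id_site, set())):
                self.invalidated.append((id_site, day))


class Archive:
    """Stand-in for an archive query object; get() is called many times."""

    def __init__(self, id_sites, invalidator):
        self.id_sites = id_sites
        self.invalidator = invalidator

    def get(self, metric):
        # Only the requested sites are checked, and each only once per request.
        self.invalidator.invalidate_for_sites(self.id_sites)
        return f"{metric} for sites {self.id_sites}"


invalidator = ArchiveInvalidator({1: {"2014-12-17"}, 2: {"2014-12-16"}})
archive = Archive([1], invalidator)
archive.get("nb_visits")
archive.get("nb_actions")  # second call does no invalidation work
```

In this sketch, site 2 stays remembered because it was never requested, which is the trade-off discussed above: less work per request at the cost of a little more bookkeeping.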

@mattab commented on December 28th 2014 Owner

Well done @tsteur

This Pull Request was closed on December 22nd 2014