@quba opened this Issue on February 11th 2016 Contributor

Background:

  • a Piwik instance with 47k websites
  • MySQL master-master replication
  • one archiving thread

Issue:
The size of binlog files used to replicate data is so big, that the slave DB is not able to process them. Currently, after 2 minutes we have 400 huge queries to piwik_option to update this value. Situation will be even worse when running a few archiving threads which is a must have when you have thousands of sites.

Idea:
Maybe we could use Redis to store this option and maybe even all Piwik settings?

@kaz231 commented on February 11th 2016

It's a matter of \Piwik\CronArchive\SharedSiteIds::getNextSiteId().

@tsteur commented on February 11th 2016 Owner

If it's caused by \Piwik\CronArchive\SharedSiteIds::getNextSiteId() then we probably have to use Redis for it as we kinda have to always perform this update.

If there are like 10.000 sites, shouldn't there be like max 10.000 queries to update the option table entry? To put this in relation we might execute much more inserts for archive record entries (eg like 100 per site which would be 1mio). Is it actually inserting or updating? Probably doesn't make a difference :)

@quba commented on February 11th 2016 Contributor

Standard insert or:
INSERT INTOpiwik_option(option_name option_value autoload) VALUES ('SharedSiteIdsToArchive' '' '') ON DUPLICATE KEY UPDATE option_value = '' ...

@quba commented on February 11th 2016 Contributor

The workaround could be to force-idsites one by one, but this makes it really complicated if you want to start a few threads..

@tsteur commented on February 13th 2016 Owner

Let's have a chat next week about it? I can schedule a meeting

cc @kaz231 @quba @mattab

@kaz231 commented on February 13th 2016

:+1:

@tsteur commented on February 16th 2016 Owner

Summary of what we discussed:

We already have various caches like Piwik\Cache\Eager, Piwik\Cache\Lazy and Piwik\Cache\Transient see https://github.com/piwik/piwik/blob/2.16.1-b1/core/Cache.php#L17-L57 that can be configured to use different kind of backends in global.ini.php see https://github.com/piwik/piwik/blob/2.16.1-b1/config/global.ini.php#L74-L97.

We are aware that several servers could request the same idSite at the same time because we do not have a proper lock in place but this is ok for now. We need to ideally fix the problem that multiple archives at the same time for same idSite could be created and slow down each other.

@mattab commented on March 15th 2016 Owner

Hi @kaz231 Feel free to assign yourself when you start working on this one :+1:

Powered by GitHub Issue Mirror