@quba opened this Issue on June 19th 2015 Contributor

Hi,

would it be possible to change this query:
https://github.com/piwik/piwik/blob/master/core/Sequence.php#L91-L105

to set exact number instead of incrementing the sequence? This would make Piwik more resistant in case there are DB replication issues. We should avoid situations when sequence has lower value on one of the nodes.

@mgazdzik commented on June 19th 2015 Contributor

:+1: for strengthening this

@mattab commented on June 22nd 2015 Owner

Context from mysql docs:

  • LAST_INSERT_ID() returns a BIGINT (64-bit) value representing the first automatically generated value that was set for an AUTO_INCREMENT column by the most recently executed INSERT statement to affect such a column.
  • If expr is given as an argument to LAST_INSERT_ID(), the value of the argument is returned by the function and is remembered as the next value to be returned by LAST_INSERT_ID(). This can be used to simulate sequences [...].
  • The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for most recent statement affecting an AUTO_INCREMENT column by that client.

Problem description:

  • Piwik runs on replicated DB master-slave
  • @quba what is the problematic use case / when does it occur?
    • is it related to #7554 or do we already have sequence ID problem before #7554 is implemented?
@quba commented on June 22nd 2015 Contributor

When there are any issues with DB replication, this table won't sync automatically because this query is only incrementing the value. It's not related to #7554.

@tsteur commented on June 30th 2015 Owner

master-slave shouldn't cause any issues but I presume master-master does here? Can you provide us the solution which queries to rewrite? We had a quick look but we're not sure. Eg here's the used code: https://github.com/piwik/piwik/blob/2.14.0-rc1/core/Sequence.php#L93

I'm not sure what you mean with exact value, it has to be atomic so we cannot read and then set the value.

@mattab commented on July 10th 2015 Owner

I don't really understand full problem, moving out of milestone until we have more information cc @quba

@mattab commented on July 15th 2015 Owner

@quba is the problem for master-master replication only?

@quba commented on July 15th 2015 Contributor

piwik_sequence is a crucial table so we need to make sure that values are synchronized across all databases.

There are two ways to go:

  • administrators can develop checks to monitor if these tables are in sync,
  • we can build Piwik to be resistant if there are replication issues (e.g. one of the queries timed out).

The thing is, that if these tables are desynchronized, Piwik produces really random results for reports (e.g. mixed site IDs). It's also really hard to debug. Therefore I wanted to let you know that maybe we could improve something here.

@tsteur commented on July 15th 2015 Owner

Can maybe an administrator or whoever has more experience with master-master replication have a look at the queries and tell us how to improve? Otherwise we won't be able to do much.

@quba commented on July 15th 2015 Contributor

@tsteur the thing is that it can't be implemented like set value = value + 1. In previous comment you mentioned that it makes no sense to select the value before executing the update query so I think that there's no real solution.

As I said, currently it's a minor issue for us (we implemented additional monitoring) but I wanted to discuss this further and ask for your feedback.

@tsteur commented on July 16th 2015 Owner

In previous comment you mentioned that it makes no sense to select the value before executing the update query so I think that there's no real solution.

Yes it has to be atomic.

Maybe a solution would be at some point to generate random values including [0-9a-f]? The DB index would get a bit bigger etc but something like this might work and should be backwards compatible but haven't thought too much about it. We'd just need to try to avoid collisions (which should be doable)

@mattab commented on July 16th 2015 Owner

@quba if there is any risk that this bug hits us in production, please comment here, as i would move it back to Short term or even schedule it for a next version

Powered by GitHub Issue Mirror