@iMarkus opened this Issue on October 13th 2017 Contributor

fix-duplicate-log-actions found about 21.000 entries out of 50.000.000 to fix. Each fix takes about 1 hour in a 100GB database. This means it would take about 2.5 years to fix all duplicate entries.

Is there any way to make things faster?

@mattab commented on October 13th 2017 Owner

Hi @iMarkus could you please send the command output with the times so we can see how slow it is?

@iMarkus commented on October 13th 2017 Contributor

Hi @mattab - here is the command output. I just let it run for 90 entries so far to see if it may speed up, but it takes always ~1hour to finish.

Found 21137 actions with duplicates.

DEBUG [2017-10-13 02:23:15]   idaction = 8465009, duplicate idactions = [array]
    Fixed duplicates in piwik_log_link_visit_action. Time elapsed: 3564.746s.
    Fixed duplicates in piwik_log_conversion. Time elapsed: 1.103s.
    Fixed duplicates in piwik_log_conversion_item. Time elapsed: 0.041s.

DEBUG [2017-10-13 03:22:41]   idaction = 8335013, duplicate idactions = [array]
    Fixed duplicates in piwik_log_link_visit_action. Time elapsed: 3618.117s.
    Fixed duplicates in piwik_log_conversion. Time elapsed: 1.204s.
    Fixed duplicates in piwik_log_conversion_item. Time elapsed: 0.030s.

DEBUG [2017-10-13 04:23:00]   idaction = 8363875, duplicate idactions = [array]
    Fixed duplicates in piwik_log_link_visit_action. Time elapsed: 3583.670s.
    Fixed duplicates in piwik_log_conversion. Time elapsed: 1.268s.
    Fixed duplicates in piwik_log_conversion_item. Time elapsed: 0.025s.

DEBUG [2017-10-13 05:22:45]   idaction = 8271979, duplicate idactions = [array]
    Fixed duplicates in piwik_log_link_visit_action. Time elapsed: 3633.765s.
    Fixed duplicates in piwik_log_conversion. Time elapsed: 1.266s.
    Fixed duplicates in piwik_log_conversion_item. Time elapsed: 0.044s.
@mattab commented on October 15th 2017 Owner

@iMarkus would you please enable MySQL Slow query log, and then run the command again, and post here the Slow queries from the slow query logs?

@iMarkus commented on October 16th 2017 Contributor

@mattab enclosed the update query:

UPDATE piwik_log_link_visit_action SET
idaction_url_ref = IF((idaction_url_ref IN (9787130)), 9787126, idaction_url_ref),
idaction_name_ref = IF((idaction_name_ref IN (9787130)), 9787126, idaction_name_ref),
idaction_name = IF((idaction_name IN (9787130)), 9787126, idaction_name),
idaction_url = IF((idaction_url IN (9787130)), 9787126, idaction_url),
idaction_event_action = IF((idaction_event_action IN (9787130)), 9787126, idaction_event_action),
idaction_event_category = IF((idaction_event_category IN (9787130)), 9787126, idaction_event_category),
idaction_content_interaction = IF((idaction_content_interaction IN (9787130)), 9787126, idaction_content_interaction),
idaction_content_name = IF((idaction_content_name IN (9787130)), 9787126, idaction_content_name),
idaction_content_piece = IF((idaction_content_piece IN (9787130)), 9787126, idaction_content_piece),
idaction_content_target = IF((idaction_content_target IN (9787130)), 9787126, idaction_content_target)WHERE idaction_url_ref IN (9787130) OR idaction_name_ref IN (9787130) OR idaction_name IN (9787130) OR idaction_url IN (9787130) OR idaction_event_action IN (9787130) OR idaction_event_category IN (9787130) OR idaction_content_interaction IN (9787130) OR idaction_content_name IN (9787130) OR idaction_content_piece IN (9787130) OR idaction_content_target IN (9787130);
@mattab commented on October 16th 2017 Owner

Thanks for the query @iMarkus :+1:
this SQL query looks nasty! Hopefully we can find a better way. unfortunately we likely won't be able to work on this for a little while as we don't have much time available, but we'll get to it.

Powered by GitHub Issue Mirror