@diosmosis opened this Pull Request on August 22nd 2015 Member

I created a new command that will analyze a single archive tables to find the following information:

  • number of archives present
  • number of invalidated archives
  • number of temporary archives
  • number of error archives
  • number of segment archives
  • number of numeric rows
  • number of blob rows

This information can be used by users or by people maintaining other installs to see why some archive tables are bigger than others. Even if the table isn't bloated, sysadmins may not be aware of how Piwik is being used, and this command can give insights regarding, for example, if many segments are being used on certain months or if many archives result in an error, etc.

Sample output:

Statistics for the archive_numeric_2015_01 and archive_blob_2015_01 tables:

+-------------------------------------------+------------+---------------+-------------+---------+-----------+----------------+-------------+
| Group                                     | # Archives | # Invalidated | # Temporary | # Error | # Segment | # Numeric Rows | # Blob Rows |
+-------------------------------------------+------------+---------------+-------------+---------+-----------+----------------+-------------+
| day[2015-01-01 - 2015-01-01] idSite = 1   | 5          | 0             | 0           | 0       | 4         | 0              | 0           |
| month[2015-01-01 - 2015-01-31] idSite = 1 | 5          | 0             | 0           | 0       | 4         | 45             | 95          |
| day[2015-01-02 - 2015-01-02] idSite = 1   | 5          | 0             | 0           | 0       | 4         | 27             | 63          |
| day[2015-01-03 - 2015-01-03] idSite = 1   | 5          | 0             | 0           | 0       | 4         | 27             | 63          |
| day[2015-01-04 - 2015-01-04] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 43             | 79          |
| day[2015-01-05 - 2015-01-05] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 31             | 67          |
| week[2015-01-05 - 2015-01-11] idSite = 1  | 5          | 0             | 0           | 0       | 4         | 45             | 95          |
| day[2015-01-06 - 2015-01-06] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 35             | 71          |
| day[2015-01-07 - 2015-01-07] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 31             | 67          |
| day[2015-01-08 - 2015-01-08] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 29             | 63          |
| day[2015-01-09 - 2015-01-09] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-10 - 2015-01-10] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-11 - 2015-01-11] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-12 - 2015-01-12] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| week[2015-01-12 - 2015-01-18] idSite = 1  | 5          | 0             | 0           | 0       | 4         | 0              | 0           |
| day[2015-01-13 - 2015-01-13] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-14 - 2015-01-14] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-15 - 2015-01-15] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-16 - 2015-01-16] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-17 - 2015-01-17] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-18 - 2015-01-18] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-19 - 2015-01-19] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| week[2015-01-19 - 2015-01-25] idSite = 1  | 5          | 0             | 0           | 0       | 4         | 0              | 0           |
| day[2015-01-20 - 2015-01-20] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-21 - 2015-01-21] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-22 - 2015-01-22] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-23 - 2015-01-23] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-24 - 2015-01-24] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-25 - 2015-01-25] idSite = 1   | 3          | 0             | 0           | 0       | 2         | 0              | 0           |
| day[2015-01-26 - 2015-01-26] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| week[2015-01-26 - 2015-02-01] idSite = 1  | 5          | 0             | 0           | 0       | 4         | 0              | 0           |
| day[2015-01-27 - 2015-01-27] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-28 - 2015-01-28] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-29 - 2015-01-29] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-30 - 2015-01-30] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
| day[2015-01-31 - 2015-01-31] idSite = 1   | 4          | 0             | 0           | 0       | 3         | 0              | 0           |
+-------------------------------------------+------------+---------------+-------------+---------+-----------+----------------+-------------+

Total # Archives: 138
Total # Invalidated Archives: 0
Total # Temporary Archives: 0
Total # Error Archives: 0
Total # Segment Archives: 102

Possible improvements

  • Analyze only specific archives, eg, invalidated / temporary / segment archives.
  • Display number of rows per individual archive (would display one archive per row, should combine w/ above improvement).

    TODO

  • [x] Tests (1 integration for ArchiveTableDao + 1 system for command)
  • [x] Switch from using tables_prefix to Db\Settings::getTablePrefix().
@diosmosis commented on August 24th 2015 Member

Ready for review or merge.

This Pull Request was closed on August 25th 2015
Powered by GitHub Issue Mirror