@mattab opened this Issue on October 10th 2017 Owner

Currently when we export a flattened report the CSV output (and others) looks like this:

Label | Visits | Total events | Events with a value | Total value | Minimum value | Maximum value | Unique visitors (daily sum) | The average of all values for this event
-- | -- | -- | -- | -- | -- | -- | -- | --
PluginTabs - Preview | 78 | 97 | 0 | 0 | 0 | 0 | 76 | 0
PluginTabs - Description | 35 | 41 | 0 | 0 | 0 | 0 | 34 | 0
PluginTabs - Faq | 19 | 20 | 0 | 0 | 0 | 0 | 19 | 0
PluginTabs - Documentation | 18 | 24 | 0 | 0 | 0 | 0 | 18 | 0

Instead, we would like each individual column listed as shown below, with its correct name and value, for all reports that combine several levels and are exported flattened with flat=1:

Event name | Event category | Label | Visits | Total events | Events with a value | Total value | Minimum value | Maximum value | Unique visitors (daily sum) | The average of all values for this event
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
PluginTabs | Preview | PluginTabs - Preview | 78 | 97 | 0 | 0 | 0 | 0 | 76 | 0
PluginTabs | Description | PluginTabs - Description | 35 | 41 | 0 | 0 | 0 | 0 | 34 | 0
PluginTabs | Faq | PluginTabs - Faq | 19 | 20 | 0 | 0 | 0 | 0 | 19 | 0
PluginTabs | Documentation | PluginTabs - Documentation | 18 | 24 | 0 | 0 | 0 | 0 | 18 | 0

Notes

  • It's important that we keep the Label column for backward compatibility.
  • The new columns should be in both the reporting API and the processed report/metadata API.
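As a rough illustration of the requested output, here is a minimal Python sketch, not the actual implementation. It assumes the label parts of each flattened row are still available separately (the `label_parts` and `metrics` keys and the function names are hypothetical), adds one column per dimension, and keeps the combined Label column.

```python
import csv
import io


def expand_flat_rows(rows, dimension_names, separator=" - "):
    """Add one column per dimension while keeping the combined Label."""
    expanded = []
    for row in rows:
        parts = row["label_parts"]  # hypothetical key: the un-joined label parts
        if len(parts) != len(dimension_names):
            raise ValueError("label parts do not match the dimension names")
        new_row = dict(zip(dimension_names, parts))  # new per-dimension columns
        new_row["Label"] = separator.join(parts)     # kept for backward compatibility
        new_row.update(row["metrics"])               # metrics follow unchanged
        expanded.append(new_row)
    return expanded


def to_csv(rows):
    """Serialize the expanded rows, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"label_parts": ["PluginTabs", "Preview"], "metrics": {"Visits": 78, "Total events": 97}},
    {"label_parts": ["PluginTabs", "Description"], "metrics": {"Visits": 35, "Total events": 41}},
]
expanded = expand_flat_rows(rows, ["Event name", "Event category"])
print(to_csv(expanded))
```

The sketch builds the columns from the separate parts rather than splitting the Label string, since a label that itself contains " - " could not be split back reliably.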

Why is this important? Having the components of the Label column clearly separated (un-flattened) allows easy data analysis in a spreadsheet or in other data analysis systems. This will also be very helpful for Custom Reports exports, where the exported report should have all dimensions clearly listed.

@tsteur commented on October 10th 2017 Owner

BTW: If you do it like this and it is not controlled by a new API parameter, you will likely be breaking the API, and that needs to be announced in advance. Luckily the mobile app is not using flatten, otherwise this could break it.

@tsteur commented on October 10th 2017 Owner

And for processed reports you might also need to add new elements that state which columns are the label columns, and in which order, similar to <metrics> and <processedMetrics>. Otherwise this API output is hard for a computer to process.
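To make the point about machine processing concrete, here is a small Python sketch of how a consumer might use such metadata. The `labelColumns` key and the column ids (`eventName`, `eventCategory`, `nb_visits`, `nb_events`) are hypothetical names chosen for illustration, not existing API elements.

```python
# Hypothetical metadata shape: an ordered list of label columns next to the metrics.
metadata = {
    "labelColumns": ["eventName", "eventCategory"],  # assumed key, order matters
    "metrics": ["nb_visits", "nb_events"],           # assumed metric ids
}


def split_row(row, metadata):
    """Separate a flattened row into ordered dimension values and metrics."""
    dimensions = [(column, row[column]) for column in metadata["labelColumns"]]
    metrics = {name: row[name] for name in metadata["metrics"] if name in row}
    return dimensions, metrics


row = {
    "eventName": "PluginTabs",
    "eventCategory": "Preview",
    "label": "PluginTabs - Preview",
    "nb_visits": 78,
    "nb_events": 97,
}
dimensions, metrics = split_row(row, metadata)
print(dimensions)
print(metrics)
```

With the label columns and their order declared in the metadata, a client does not have to guess which columns are dimensions or parse the combined label.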

@sgiehl commented on October 12th 2017 Member

While trying to implement this I encountered one problem, which I'm not sure how to solve.

How should those "metrics" be named when the metrics are not returned translated?
It is possible to get the translated name of the label column using the report classes and their associated dimensions, but I don't see any useful name to use when it should be untranslated.
Using label1, label2,... instead does not make much sense, as no one could see what it stands for.

My current implementation always uses the translated name, but that might be hard to process automatically, as the name might change when a translation changes.

@mattab @tsteur Any opinion how we should solve this?

@tsteur commented on October 12th 2017 Owner

Not quite sure what you mean when metrics are not returned translated? From this I am reading there is a URL param to do that? I presume when metrics are not translated, we return the metric name?

For dimensions we would then return the dimensionId? Also could use label1, label2 but not sure which is best. I reckon CSV export etc is mostly for human processing and if a computer wants to process it, they almost have to use API.getProcessedReport where all the metadata is known. That's why I mentioned we need to put the label columns into the metadata output of a processed report.

Can you send a link to an example when metrics are returned untranslated?

@sgiehl commented on October 12th 2017 Member

Translated column names would be https://demo.piwik.org/index.php?module=API&method=Events.getName&idSite=3&period=year&date=yesterday&format=HTML&translateColumnNames=1

or untranslated: https://demo.piwik.org/index.php?module=API&method=Events.getName&idSite=3&period=year&date=yesterday&format=HTML
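For reference, the two requests above differ only in the translateColumnNames parameter. A minimal Python sketch that builds them, with flat=1 from the issue description as an optional extra (the function name `build_report_url` is just illustrative):

```python
from urllib.parse import urlencode

BASE_URL = "https://demo.piwik.org/index.php"


def build_report_url(translate_column_names=False, flat=False):
    """Build the Events.getName request used in the examples above."""
    params = {
        "module": "API",
        "method": "Events.getName",
        "idSite": 3,
        "period": "year",
        "date": "yesterday",
        "format": "HTML",
    }
    if translate_column_names:
        params["translateColumnNames"] = 1  # return translated column names
    if flat:
        params["flat"] = 1  # request the flattened report
    return BASE_URL + "?" + urlencode(params)


print(build_report_url(translate_column_names=True))
print(build_report_url())
```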

The main problem is that in the datatables there is only a label column. This label column doesn't represent any metric in most cases, so there is only a translated name for it. But using the dimension id might be a good solution.

@tsteur commented on October 12th 2017 Owner

I reckon dimension id would be consistent, and if not available for some reason, fall back to label1 and label2 (we could still show eg dimensionid_foo, label2 when we have eg the dimension id for only one dimension? I reckon...)

Apart from this the output of those reports is not very good for computer processing anyway and cannot really be used, even when requesting JSON format etc. If someone wants to process data automatically, API.getProcessedReport is the way to go. So it should be totally fine to use dimension id
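The naming fallback suggested in this comment could be sketched as follows. This is a Python illustration under the assumption that each level either has a known dimension id or none; the function name is hypothetical.

```python
def label_column_names(dimension_ids):
    """Use the dimension id where known, else fall back to label<position>."""
    names = []
    for position, dimension_id in enumerate(dimension_ids, start=1):
        if dimension_id:
            names.append(dimension_id)
        else:
            names.append(f"label{position}")  # positional fallback
    return names


# One known dimension id, one unknown: the second column falls back to label2.
print(label_column_names(["dimensionid_foo", None]))
# No dimension ids available at all.
print(label_column_names([None, None]))
```

Numbering the fallback by position rather than by count of missing ids keeps a name like label2 pointing at the second level even when the first level has a dimension id.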

@mattab commented on October 12th 2017 Owner

Good point, the extra columns would also be needed in the Processed report API output. (edited issue)
