@iamwebrocker opened this Issue on October 5th 2017

Hi,

I noticed a (to me) strange behavior when exporting CSV data:
Every month I export the metrics from, say, "Users" overview. I keep the range for the export as "3 months", so I get the three recent months (but I played around with larger ranges as well, it doesn't change the behaviour). I collect these CSVs in a spreadsheet and have some kind of dashboard table where I collect/display the data.
I noticed that though the amount of columns don't change, and always the same "view" is used, the order of the columns vary across the months. One month the "unique visitors" are in column D, another month they are in another column.
The critical thing in my workflow is currently that I need to manually check if the data is indeed in the column I referenced from my "dashboard" table. It is easy to spot when a time-based metric like visitor time swaps places with a simple count, but it is very hard to spot if "Download" swaps places with "Pageviews".

I think I could tweak my workflow to re-order the columns alphabetically or something like this, but what I'd like to know, why do the columns swap places, if similar views are exported from month to month?

cheers, and thanks for all the work done here,
Tom

@mattab commented on October 6th 2017 Owner

Hi @iamwebrocker
Thanks for the report. There is an issue indeed where the order of columns may change. there is another probably duplicate issue about TSV exports and the Live API: https://github.com/piwik/piwik/issues/10186

In your case could you list some example of CSV data exports causing random column order? wondering if all APIs are affected or just a few.

Thanks,

@iamwebrocker commented on October 6th 2017

Hi @mattab,
thanks for looking into this. :-)
The random column order has happened across the PIWIK updates during this year, so it is not dependent on a specific version. It may be a hickup in the specific install, though (it's on a client's production website that is one or two minor versions behind the latest one). From what I recall it happens on CSV exported from 'Segment'ed views. (We're looking at different "chapters" on this site and comparing them).

Select the segment from dashboard:
01-select-segment

Change to vistors/overview:
02-change-visitors-overview

Export csv, limit to three months:
03-export-csv-limit-3m

The resulting csv from August and September show different column orders (see ):
```Datum,Eindeutige Besucher,Besuche,Benutzer,Aktionen,Maximale Aktionen pro Besuch,Absprünge,Von Besuchern verbrachte Gesamtzeit (in Sekunden),
Wiederkehrende Besuche,Aktionen bei wiederkehrenden Besuchen,Eindeutige wiederkehrende Besucher,Wiederkehrende Benutzer,Maximale Aktionen eines wiederkehrenden Besuchs,Absprungsrate bei wiederkehrenden Besuchen,Aktionen pro wiederkehrendem Besuch,Durchschnittszeit von wiederkehrenden Besuchen,Konversionen***,Besuche mit Konversionen,Umsatz,Konversionsrate,nb_conversions_new_visit,nb_visits_converted_new_visit,revenue_new_visit,conversion_rate_new_visit,nb_conversions_returning_visit,nb_visits_converted_returning_visit,revenue_returning_visit,conversion_rate_returning_visit,Seitenansichten,Eindeutige Seitenansichten,Downloads,Eindeutige Downloads,Ausgehende Verweise,Eindeutige ausgehende Verweise,Suchanfragen,Eindeutige Suchbegriffe,nb_hits_with_time_generation,Durchschnittliche Generierungszeit,Absprungsrate,Aktionen pro Besuch,Durchschnittliche Besuchszeit (in Sekunden)```

Datum,Eindeutige Besucher,Besuche,Benutzer,Aktionen,Maximale Aktionen pro Besuch,Absprünge,Von Besuchern verbrachte Gesamtzeit (in Sekunden),***Konversionen***,Besuche mit Konversionen,Umsatz,Konversionsrate,nb_conversions_new_visit,nb_visits_converted_new_visit,revenue_new_visit,conversion_rate_new_visit,Seitenansichten,Eindeutige Seitenansichten,Downloads,Eindeutige Downloads,Ausgehende Verweise,Eindeutige ausgehende Verweise,Suchanfragen,Eindeutige Suchbegriffe,nb_hits_with_time_generation,Durchschnittliche Generierungszeit,Absprungsrate,Aktionen pro Besuch,Durchschnittliche Besuchszeit (in Sekunden),Durchschnittszeit von wiederkehrenden Besuchen,Aktionen pro wiederkehrendem Besuch,Absprungsrate bei wiederkehrenden Besuchen,***Wiederkehrende Besuche***,Aktionen bei wiederkehrenden Besuchen,Eindeutige wiederkehrende Besucher,Wiederkehrende Benutzer,Maximale Aktionen eines wiederkehrenden Besuchs,nb_conversions_returning_visit,nb_visits_converted_returning_visit,revenue_returning_visit,conversion_rate_returning_visit

@iamwebrocker commented on October 6th 2017

oops, wrong button, sorry :-)

@iamwebrocker commented on October 6th 2017

Here's a better picture to show the differences
04-csv-diff

@sgiehl commented on October 9th 2017 Member

@iamwebrocker Did you generate those two exports with the same Piwik version or did the order change after doing an update of Piwik?

@iamwebrocker commented on October 9th 2017

hi @sgiehl, the exports were made with the same piwik version/install (3.1.0). The "August" export happened mid-september, the "September" last week.
A simliar behaviour has happened earlier with earlier versions as well (although I didn't check for a causality with updates then). When I first observed this, I thought it may be caused by using different range settings (24 months, 12months etc), but with those two I am very sure that they both had the "3 month" range set before exporting the csv

@sgiehl commented on October 9th 2017 Member

Thanks for clarification. Will try to reproduce that on my local instance and have a look if I'm able to fix it

@iamwebrocker commented on October 9th 2017

@sgiehl thanks for taking care of this. If it helps, I would give you temporarily access to the live instance where this happen(s|ed). What I didn't check: Does the order change when I repeatetly export from the same range/view. I could do/check this in the meantime, if it helps.

@sgiehl commented on October 9th 2017 Member

Yes, it would help if you would check that. I'll get back to you if I need some access...

@iamwebrocker commented on October 9th 2017

Okay, I tried 3 times, with a log out between the second and the third. It seems that the order-change is consistent and the same as in the picture above from three days ago.

test-bu-diff-3
test-bu-diff-2
test-bu-diff-1

Next test is using another segment and see if the effect can be reproduced there.

@iamwebrocker commented on October 9th 2017

Here's the order of another segment which doesn't show differences with the same settings as above in the same range:

Datum
Eindeutige Besucher
Besuche
Benutzer
Aktionen
Maximale Aktionen pro Besuch
Absprünge
Von Besuchern verbrachte Gesamtzeit (in Sekunden)
Wiederkehrende Besuche
Aktionen bei wiederkehrenden Besuchen
Eindeutige wiederkehrende Besucher
Wiederkehrende Benutzer
Maximale Aktionen eines wiederkehrenden Besuchs
Absprungsrate bei wiederkehrenden Besuchen
Aktionen pro wiederkehrendem Besuch
Durchschnittszeit von wiederkehrenden Besuchen
Konversionen
Besuche mit Konversionen
Umsatz
Konversionsrate
nb_conversions_new_visit
nb_visits_converted_new_visit
revenue_new_visit
conversion_rate_new_visit
nb_conversions_returning_visit
nb_visits_converted_returning_visit
revenue_returning_visit
conversion_rate_returning_visit
Seitenansichten
Eindeutige Seitenansichten
Downloads
Eindeutige Downloads
Ausgehende Verweise
Eindeutige ausgehende Verweise
Suchanfragen
Eindeutige Suchbegriffe
nb_hits_with_time_generation
Durchschnittliche Generierungszeit
Absprungsrate
Aktionen pro Besuch
Durchschnittliche Besuchszeit (in Sekunden)
@iamwebrocker commented on October 9th 2017

I checked two more segment, no differences.
Here's another one, that shows differences, tho:
test-fsp-diff-1
It seems in this case it is the other way round, but the "blocks" that swapped position are the same as in the tests above.

@iamwebrocker commented on October 9th 2017

I compared the resulting CSV upon noticing that this seems only to happen with segements that have fairly low counts. I notice that the "wrong" ones (assuming that the order with "Durschnittliche Besuchszeit (in Sekunden)" at the last position is the correct one) have empty values in the the CSV, e.g. ",,," instead of "3,0.4,100," etc.
For example, here is the complete content of the CSV that is shown directly above this comment on the left side of the picture:

Datum,Eindeutige Besucher,Besuche,Benutzer,Aktionen,Maximale Aktionen pro Besuch,Absprünge,Von Besuchern verbrachte Gesamtzeit (in Sekunden),Konversionen,Besuche mit Konversionen,Umsatz,Konversionsrate,nb_conversions_new_visit,nb_visits_converted_new_visit,revenue_new_visit,conversion_rate_new_visit,Seitenansichten,Eindeutige Seitenansichten,Downloads,Eindeutige Downloads,Ausgehende Verweise,Eindeutige ausgehende Verweise,Suchanfragen,Eindeutige Suchbegriffe,nb_hits_with_time_generation,Durchschnittliche Generierungszeit,Absprungsrate,Aktionen pro Besuch,Durchschnittliche Besuchszeit (in Sekunden),Durchschnittszeit von wiederkehrenden Besuchen,Aktionen pro wiederkehrendem Besuch,Absprungsrate bei wiederkehrenden Besuchen,Wiederkehrende Besuche,Aktionen bei wiederkehrenden Besuchen,Eindeutige wiederkehrende Besucher,Wiederkehrende Benutzer,Maximale Aktionen eines wiederkehrenden Besuchs,nb_conversions_returning_visit,nb_visits_converted_returning_visit,revenue_returning_visit,conversion_rate_returning_visit
2017-06,14,14,0,98,19,3,2453,0,0,0,0%,0,0,0,0%,31,14,0,0,0,0,0,0,29,"0,45 s",21%,7,2 Minuten 55s,0 s,0,0%
2017-07,16,17,0,77,35,10,3802,0,0,0,0%,0,0,0,0%,28,17,0,0,0,0,0,0,25,"0,79 s",59%,4.5,3 Minuten 44s,0 s,1,100%,1,1,1,0,1,0,0,0,0%
2017-08,5,5,0,11,5,3,185,0,0,0,0%,0,0,0,0%,6,5,0,0,0,0,0,0,6,"0,68 s",60%,2.2,37 s,0 s,0,0%,,,,,,,,,

Notice the succession of empty values at the end of the last row.

And here is another one, this time the "August" CSV of the first example:

Datum,Eindeutige Besucher,Besuche,Benutzer,Aktionen,Maximale Aktionen pro Besuch,Absprünge,Von Besuchern verbrachte Gesamtzeit (in Sekunden),Wiederkehrende Besuche,Aktionen bei wiederkehrenden Besuchen,Eindeutige wiederkehrende Besucher,Wiederkehrende Benutzer,Maximale Aktionen eines wiederkehrenden Besuchs,Absprungsrate bei wiederkehrenden Besuchen,Aktionen pro wiederkehrendem Besuch,Durchschnittszeit von wiederkehrenden Besuchen,Konversionen,Besuche mit Konversionen,Umsatz,Konversionsrate,nb_conversions_new_visit,nb_visits_converted_new_visit,revenue_new_visit,conversion_rate_new_visit,nb_conversions_returning_visit,nb_visits_converted_returning_visit,revenue_returning_visit,conversion_rate_returning_visit,Seitenansichten,Eindeutige Seitenansichten,Downloads,Eindeutige Downloads,Ausgehende Verweise,Eindeutige ausgehende Verweise,Suchanfragen,Eindeutige Suchbegriffe,nb_hits_with_time_generation,Durchschnittliche Generierungszeit,Absprungsrate,Aktionen pro Besuch,Durchschnittliche Besuchszeit (in Sekunden)
2017-06,9,9,0,116,35,3,6874,2,30,2,0,25,0%,15,8 Minuten 36s,0,0,0,0%,0,0,0,0%,0,0,0,0%,9,9,0,0,0,0,0,0,9,"0,51 s",33%,12.9,12 Minuten 44s
2017-07,3,3,0,55,35,0,3621,,,,,,0%,0,0 s,0,0,0,0%,0,0,0,0%,,,,,7,3,0,0,0,0,0,0,7,"0,41 s",0%,18.3,20 Minuten 7s
2017-08,14,14,0,200,63,4,9914,2,26,2,0,13,0%,13,27 Minuten 34s,0,0,0,0%,0,0,0,0%,0,0,0,0%,21,14,0,0,0,0,0,0,19,"0,81 s",29%,14.3,11 Minuten 48s

Again, empty values, this time in the second row.

Maybe this is a pointer to what's going on there?
cheers,
Tom

Powered by GitHub Issue Mirror