@wikiloops opened this Issue on November 5th 2014

This issue is also related to #6576. Following @tsteur's hint, I have tested the insights functionality and would like to point out a few things about "insights" that should be revised IMO.

There are two things which I see as problematic:
First, there really needs to be a default exclusion of "low populated" data.
Example: When comparing browser usage with insights, the least used browsers will be displayed as those with the most dramatic gains and losses, which is misleading or at least hides the relevant information among lots of irrelevant data. I don't really care about a 200% drop in use of the Iceweasel browser :)

Second: The reports accessible via the "bulb" icon in widgets have default evaluation time spans which will display misleading data if the global time range of PIWIK is set to a date range including today.
It all comes down to: You cannot compare the data of an unfinished day/week/month with a complete timespan from the past. Otherwise you will see horrifying drops in evolution, depending on how much time is left until the chosen timespan has complete data.
I know this is not a bug, but a basic fact about data analysis.

For UX reasons, I think it would be good to prevent this by choosing other defaults if the global time range includes today, e.g.:

  • day timeframe: compare yesterday with the previous day by default, and offer a "yesterday vs. same day last week" comparison
  • week/month timeframe: compare previous week/month with the one before previous week/month by default
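
A minimal sketch of the defaults proposed above, written in Python purely for illustration (Piwik itself is PHP). The function names and the assumption that weeks start on Monday are hypothetical and not taken from the Piwik code base.

```python
from datetime import date, timedelta


def default_day_comparison(selected_day, today):
    """Return (current_day, baseline_day) for a day timeframe.

    Hypothetical helper: if the selected day is today (still incomplete),
    fall back to yesterday vs. the day before yesterday.
    """
    current = selected_day
    if current == today:
        current = today - timedelta(days=1)
    return current, current - timedelta(days=1)


def week_start(day):
    """Monday of the week containing `day` (assumption: weeks start Monday)."""
    return day - timedelta(days=day.weekday())


def default_week_comparison(selected_day, today):
    """Return (current_week_start, baseline_week_start) for a week timeframe.

    If the selected week contains today, it is unfinished, so compare the
    previous week with the one before it instead.
    """
    current = week_start(selected_day)
    if current <= today <= current + timedelta(days=6):
        current -= timedelta(weeks=1)
    return current, current - timedelta(weeks=1)


today = date(2014, 11, 5)
print(default_day_comparison(today, today))
print(default_week_comparison(today, today))
```

A month timeframe would follow the same pattern: detect whether the selected month contains today and, if so, step back one month before picking the baseline.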

@tsteur commented on November 5th 2014 Owner

Re 1) I think for Insights there is already a minimum impact of 1% or 2% to filter out low population. Not sure about movers and shakers. If a site does not have many visitors (e.g. < 100), basically all rows are included. In the future we could slightly adjust the minimum impact automatically, depending on the number of visits or on another metric. It shouldn't be too hard to implement.

Re 2) Not sure what the best way would be to solve this issue

@wikiloops commented on November 5th 2014

re 1) Filtering by "impact in %" only seems not ideal, because one visit of a rare or new kind (browser, search engine, whatever) will have an impact of +100% and won't be filtered out this way, while being of little relevance. Good point about being careful with auto-exclusions on low-traffic sites, though!

re 2) Is changing the default comparison timeframes within insights overly difficult?
That's what I was trying to point at: I assume it would not take too much work to check whether the global PIWIK date range includes today. If it does, use the changed insights defaults as outlined in my first post; if it doesn't, the current comparisons are fine. Hope this clarifies things a bit...

@tsteur commented on November 5th 2014 Owner

Re 1) What I mean by 2% is that when your site had 1000 visits on a given day, it would exclude all rows / actions / entries having less than 20 visits. This is not about growth, or at least shouldn't be about growth. It used to be possible to adjust this percentage in the report with the bulb, but we removed it as there were already quite a few setting screws.
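
To make the arithmetic concrete, here is a rough sketch in Python of such a minimum-share filter. It only illustrates the rule described above; the function name, the parameter name and the sample numbers are made up and do not reflect how the Insights plugin is actually implemented.

```python
def filter_low_population(rows, total_visits, min_percent=2):
    """Drop rows whose visits are below `min_percent` of all visits.

    `rows` maps a label (e.g. a browser name) to its visit count.
    Integer arithmetic is used to avoid floating point rounding issues.
    Hypothetical helper, not the actual Insights code.
    """
    return {
        label: visits
        for label, visits in rows.items()
        if visits * 100 >= total_visits * min_percent
    }


# Made-up sample data: 1000 visits in total, so the threshold is 20 visits.
browsers = {"Firefox": 600, "Chrome": 379, "Opera": 20, "Iceweasel": 1}
print(filter_low_population(browsers, total_visits=1000))
```

With these sample numbers the Iceweasel row is dropped no matter how large its relative growth is, because the filter looks at the share of total visits and not at the change.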

@tsteur commented on November 5th 2014 Owner

re 2) It should be doable, but the user also has to be shown which dates are being compared etc., and making that look nice / not disruptive might be the difficult part.

@wikiloops commented on November 6th 2014

re 1) I'm afraid the filtering is not working the way you describe it (which would be absolutely fine): I am seeing a lot of data on browsers used by 1 - 3 visitors, while tracking over 1400 a day.
+1 for not offering too many confusing setting screws

re 2) agreed, this will be a little hard to word or visualize without adding a calendar widget, which is not an option really.
Maybe one should split the "fine tuning settings" out of the insights interface entirely and just offer one standard evaluation, which corresponds to the globally chosen date range.
I know removing customization options feels a little wrong, but I feel customization should never lead to confusing/useless data if a user is not spotting the need to set customized settings.
The defaults need to be well chosen, otherwise the whole widget creates a bad impression at first sight.

Thinking out of the box - one could add an extra notice to the global date-range-picker.
If a user chooses a date range including today, prompt a message saying something along the lines of:

  • "Attention: You have chosen to evaluate the traffic of a date range that has not ended.
    Since there is no use in comparing a period that has not ended with one that has, the "insights" (add other widgets here) will compare the data of the last two ended periods."

Granted, that message needs better wording, but it might help the inexperienced user understand the effects right away when playing with date ranges for the first time.
It gives a nice hint why all your graphs will drop on the current time frame when choosing a not-ended one, which I felt to be a bit confusing (depressing!) UX.

With such a message in place, one would not need to display too much info within the insights widget. One single dropdown offering a selection between
"short term" (yesterday vs. last week's "yesterday" / last ended week vs. previous week etc.)
"mid term" (yesterday vs. same day 4 weeks ago / last ended week vs. 5 weeks ago)
"long term" (yesterday vs. same date last year / last week vs. same week last year)
might be all that's needed.
Of course, the advanced analyst should be able to find out about the parameters somewhere... a simple space-saving option might be to display the evaluated timespans on mouse-over of the selector.
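
As a sketch of how the three presets could map to baseline dates for the day timeframe (Python, for illustration only): the function name and the term keys are hypothetical, and the handling of February 29 in the "long term" case is my own assumption.

```python
from datetime import date, timedelta


def baseline_day(yesterday, term):
    """Return the day that `yesterday` is compared against for a preset.

    Hypothetical mapping of the dropdown entries:
      "short" -> same weekday one week earlier
      "mid"   -> same weekday four weeks earlier
      "long"  -> same date one year earlier
    """
    if term == "short":
        return yesterday - timedelta(weeks=1)
    if term == "mid":
        return yesterday - timedelta(weeks=4)
    if term == "long":
        try:
            return yesterday.replace(year=yesterday.year - 1)
        except ValueError:
            # Assumption: Feb 29 has no counterpart in the previous year,
            # so fall back to Feb 28.
            return yesterday.replace(year=yesterday.year - 1, day=28)
    raise ValueError("unknown term: " + repr(term))


yesterday = date(2014, 11, 5)
for term in ("short", "mid", "long"):
    print(term, yesterday, "vs.", baseline_day(yesterday, term))
```

The printed pairs are also what could be shown on mouse-over of the selector, so the advanced analyst can see exactly which timespans are evaluated.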

@tsteur commented on November 6th 2014 Owner

Re 1) Then it needs to be added. Maybe it was there in an earlier version and removed at some point.

Re 2) I can imagine the message can also be annoying for some users, but I reckon we would display it only once or twice anyway, as some users might pick it quite often. Would be nice to collect 3 or 4 more ideas. I just do not really have much time to think about this right now. I might add some more ideas sometime later, or maybe we'll think about it before working on it. I couldn't find the issue, but there should be one covering something similar already (it is about the many features we have which many people don't know about, improving UX etc.).

@hpvd commented on July 12th 2016

This should be added here (was marked as duplicate):
"comparison of data of "unfinished periods" with "complete periods" does not make big sense e.g. in widget" #10282

When comparing data (e.g. number of visits, entry pages etc.) with the data of the day before, there is a problem: when checking earlier than 23:59 there is always a difference, because the period taken into account is shorter than the complete day before, and so not all "things" of this day have happened yet.

An example:
Looking at data at 9:00 in the morning:
the current data is taken from 0:00 to 9:00 = 9h
the data for comparison is taken from 0:00 to 0:00 = 24h

the result looks like this:


=> there are two ways to solve this:

  • always compare the last 24h (9:00-9:00) to the 24h before that (9:00-9:00)
  • compare the last full day (0:00-0:00) to the full day before that (0:00-0:00)

Of course this is not only relevant when comparing the current day with the day before, but for all comparisons of data from unfinished periods with complete periods before.
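
The two options listed above can be sketched like this (Python, illustration only). The function names are made up, and the sketch uses naive datetimes, so time zones and daylight saving changes are ignored.

```python
from datetime import datetime, timedelta


def rolling_24h_windows(now):
    """Option 1: the last 24 hours vs. the 24 hours before that.

    Returns two (start, end) tuples: (current, previous).
    """
    one_day = timedelta(hours=24)
    current = (now - one_day, now)
    previous = (now - 2 * one_day, now - one_day)
    return current, previous


def last_full_day_windows(now):
    """Option 2: the last complete day vs. the complete day before it."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    current = (midnight - timedelta(days=1), midnight)
    previous = (midnight - timedelta(days=2), midnight - timedelta(days=1))
    return current, previous


now = datetime(2016, 7, 12, 9, 0)
print(rolling_24h_windows(now))
print(last_full_day_windows(now))
```

In both cases the two windows have the same length (24h), which is the point: the 9h vs. 24h mismatch from the example can no longer occur.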

The same problem for graphs is reported in "make a difference in graphs for data of "unfinished periods" and "complete periods"" #10291
