Listing the referrer websites can be significantly improved by normalizing the domain names. Currently, subdomains such as "www7" are treated as separate websites. Here's an example of such a referrer list, in which you see that lemonde.fr is listed several times:
Of course this is not trivial, as some sub-domains are pointing to separate websites while others are only mirrors or mobile variants of the same site.
To solve this issue, Mozilla maintains a list of "effective" TLD names. This list includes domains such as bl0gsp0t.com and dyndns.org, because X.dyndns.org should be treated as a separate website.
Using this list it is easy to normalize the domains, or in other words, to extract the "effective" websites. The list is not perfect (for instance tumblr.com is missing) but it should solve 95% of the problem.
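To make the idea concrete, here is a minimal sketch of suffix-based normalization. The function name `effectiveDomain` and the tiny hard-coded suffix array are assumptions for illustration only; the real list is much longer and also has wildcard and exception rules, which this sketch ignores.

```php
<?php
// Hypothetical helper, not part of Piwik: returns the listed suffix
// plus one extra label in front of it.
function effectiveDomain(string $host, array $suffixes): string
{
    $labels = explode('.', strtolower(trim($host, '.')));
    $count = count($labels);
    // Walk from the longest candidate suffix to the shortest,
    // so "dyndns.org" wins over "org".
    for ($i = 0; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (isset($suffixes[$candidate])) {
            // Keep one label in front of the matched suffix, if there is one.
            return implode('.', array_slice($labels, max(0, $i - 1)));
        }
    }
    // No listed suffix matched: fall back to the last two labels.
    return implode('.', array_slice($labels, -2));
}

// Assumed excerpt standing in for the real list (flipped for fast lookups).
$suffixes = array_flip(['com', 'fr', 'org', 'co.uk', 'com.br', 'dyndns.org']);

echo effectiveDomain('www7.lemonde.fr', $suffixes), "\n";
echo effectiveDomain('X.dyndns.org', $suffixes), "\n";
```

With this excerpt, www7.lemonde.fr collapses to lemonde.fr, while X.dyndns.org stays a separate website because dyndns.org itself is on the list.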
Good idea to use a list to improve the referrer website. For the lemonde example though, I feel that keeping all the subdomains brings value, as it helps to see which sub-sites bring more traffic. lemonde is not in the list so it makes sense.
We could also implement this as a plugin in the upcoming marketplace at: http://plugins.piwik.org/
Another very smart solution would be to just group the visits by domain and subdomain. This seems to be easier as we don't need to maintain the effective TLD list at all. The result could look like this:
||= Website =||= Visits =||
|| guardian.co.uk || 503108 ||
|| lemonde.fr || 303471 ||
|| - www.lemonde.fr || 177113 ||
|| - decodeurs.blog.lemonde.fr || 83375 ||
|| - emploi.blog.lemonde.fr || 30323 ||
|| - abonnes.lemonde.fr || 7412 ||
|| - mobile.lemonde.fr || 2652 ||
|| - alicedsl.lemonde.fr || 2596 ||
|| derstandard.at || 58850 ||
Ok, we might still need to maintain a shorter list of effective TLDs where we put some country-specific TLDs in, such as co.uk, but we don't need to cover company specific TLDs such as blogsp0t.com, as users can easily unfold the domain to see what blogs are linking most.
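A rough sketch of this grouping, assuming a short list that holds only country-specific suffixes such as co.uk. The function names `registrableDomain` and `groupByDomain` and the array layout are made up for illustration; the visit counts are the ones from the table above.

```php
<?php
// Hypothetical helper: keep three labels when the last two form a
// listed suffix (e.g. co.uk), otherwise keep two.
function registrableDomain(string $host, array $suffixes): string
{
    $labels = explode('.', strtolower($host));
    $lastTwo = implode('.', array_slice($labels, -2));
    $keep = isset($suffixes[$lastTwo]) ? 3 : 2;
    return implode('.', array_slice($labels, -$keep));
}

// Sum visits per domain and remember each host underneath its domain,
// so the report can be unfolded to show the subdomains.
function groupByDomain(array $visitsByHost, array $suffixes): array
{
    $groups = [];
    foreach ($visitsByHost as $host => $visits) {
        $domain = registrableDomain($host, $suffixes);
        if (!isset($groups[$domain])) {
            $groups[$domain] = ['visits' => 0, 'subdomains' => []];
        }
        $groups[$domain]['visits'] += $visits;
        $groups[$domain]['subdomains'][$host] = $visits;
    }
    // Largest domain first.
    uasort($groups, function ($a, $b) {
        return $b['visits'] <=> $a['visits'];
    });
    return $groups;
}

$suffixes = array_flip(['co.uk']); // assumed short list
$visitsByHost = [
    'www.lemonde.fr'            => 177113,
    'decodeurs.blog.lemonde.fr' => 83375,
    'emploi.blog.lemonde.fr'    => 30323,
    'abonnes.lemonde.fr'        => 7412,
    'mobile.lemonde.fr'         => 2652,
    'alicedsl.lemonde.fr'       => 2596,
    'guardian.co.uk'            => 503108,
    'derstandard.at'            => 58850,
];

foreach (groupByDomain($visitsByHost, $suffixes) as $domain => $group) {
    echo $domain, ' ', $group['visits'], "\n";
}
```

The six lemonde.fr hosts add up to the 303471 visits shown for lemonde.fr in the table, and guardian.co.uk is kept whole only because co.uk is on the short list.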
(btw I hate this comment system which always blacklists my comments just because I include blogsp0t.com. silly!)
Great idea to add a new "view" of the report with subtables showing subdomains.
Maybe we could show this new report as a "Related Report" footer link, "Websites by Domain", under the "Websites" report. Maybe we could save it as a preference on click, as part of #1915, or maybe in general we could make the "Related Report" link more visible (see for example under the Page Titles report).
Or maybe as a "COG" dropdown option.
I would prefer making the hierarchical view the new default and then let the user "make it flat" as we are doing with the Pages report.
Anyone thinking that the flat view is better than grouping by domain?
Nice idea for a plugin which could filter out the Referrers dataTable to make the grouping as explained here!
As a first step toward this I worked on a PHP implementation for extracting the "effective" domain name of a hostname.
Usage is very simple:
> include('EffectiveDomainName.php');
> print EffectiveDomainName::get('mobile.nytimes.com') . "\n";
nytimes.com
> print EffectiveDomainName::get('flightjs.github.io') . "\n";
flightjs.github.io
> print EffectiveDomainName::get('www.google.com.br') . "\n";
google.com.br
@gka Thanks for the tip.
Weird that this issue got closed, I don't think I closed it unless it was by mistake...
It would be relatively easy to create a plugin that will either modify the existing
getWebsites report or add a new related report in which we call a
GroupBy filter that groups rows by "effective domain".
Would you also group
and maybe group
Since facebook.com is not listed as effective TLD (aka "public suffix"), any subdomain *.facebook.com will indeed be "normalized" to facebook.com. However, t.co is not being "grouped" with twitter.com, as both are entirely different domains.
Hi @gka alright
maybe we could use your list and then customise it with all known social networks domains for example.
I'm setting to
Short term as it's quite easy to build this, at least in a plugin on the Marketplace:
we'd simply apply the normalisation function in a custom filter that would
GroupBy the labels. It would ideally be possible to disable it in the Cog icon menu.
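As a rough illustration of that filter, here is a plain-PHP sketch that merges rows whose labels normalise to the same value. It does not use the actual Piwik DataTable or GroupBy classes: `groupRowsByLabel`, the row layout, the `$enabled` switch (standing in for the Cog menu option), and the sample visit counts are all assumptions. The closure is a crude stand-in for a real normalisation function such as EffectiveDomainName::get.

```php
<?php
// Hypothetical stand-in for a GroupBy-style filter: rows whose labels
// normalise to the same value are merged and their visits summed.
function groupRowsByLabel(array $rows, callable $normalise, bool $enabled = true): array
{
    if (!$enabled) {
        return $rows; // flat view: leave the report untouched
    }
    $grouped = [];
    foreach ($rows as $row) {
        $label = $normalise($row['label']);
        if (!isset($grouped[$label])) {
            $grouped[$label] = ['label' => $label, 'nb_visits' => 0];
        }
        $grouped[$label]['nb_visits'] += $row['nb_visits'];
    }
    return array_values($grouped);
}

// Crude normaliser for the demo only: keep the last two labels.
$lastTwoLabels = function (string $host): string {
    return implode('.', array_slice(explode('.', $host), -2));
};

// Made-up sample rows.
$rows = [
    ['label' => 'www.facebook.com', 'nb_visits' => 40],
    ['label' => 'm.facebook.com',   'nb_visits' => 25],
    ['label' => 't.co',             'nb_visits' => 10],
    ['label' => 'twitter.com',      'nb_visits' => 5],
];

foreach (groupRowsByLabel($rows, $lastTwoLabels) as $row) {
    echo $row['label'], ' ', $row['nb_visits'], "\n";
}
```

Here the two facebook.com subdomains merge into one row, while t.co and twitter.com stay separate, as noted earlier in the thread. Merging those would need an extra, hand-maintained mapping.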