Good afternoon everyone,
I had difficulty importing log files from GoDaddy into Piwik (site names and URLs replaced for privacy) using
piwik/htdocs/misc/log-analytics/import_logs.py --idsite=1 --url=piwikurl --enable-http-errors --enable-http-redirects --enable-static -d /home/bitnami/logfile.log
If the GET field looks like "GET www.site.org/index.htm" the import fails and produces 'Page URL not defined.' If it looks like "GET /index.htm" or "GET http://www.site.org/index.htm" the import is successful.
I believe the problem occurs in the log import tool at piwik/htdocs/misc/log-analytics/import_logs.py and not on the Piwik PHP side.
I printed the JSON data that the import tool sends to the Piwik instance for each hit; in every successful import, the URL passed to Piwik contains 'http://'.
Tested with a Vagrant/Puppet install provided at http://piwik.org/blog/2012/08/get-started-with-piwik-development-with-puppet-and-vagrant/ (v 2.0.3) and with the Piwik install provided by Bitnami (v 2.0.2)
Both produced working Piwik installations. I configured the sites in settings to accept site.org and www.site.org.
Workaround: if the host is specified in the log file, either add 'http://' in front of it or remove the host, e.g. sed -i 's/GET site.org/GET /g' logfile.log
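For anyone who prefers adding 'http://' over stripping the host, here is a minimal Python sketch of that preprocessing step. It is not part of import_logs.py. The function name add_scheme, the regular expression, and the sample log line are my own assumptions for illustration, and it assumes the request is the first double-quoted field on each line, as in the common and combined log formats.

```python
import re

# Assumption: the request is the first quoted field on the line,
# e.g. "GET www.site.org/index.htm HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+)')


def add_scheme(line, scheme="http"):
    """Prefix a bare 'host/path' request target with a scheme.

    Targets that already start with '/' or contain '://' are left alone,
    as is the special '*' target.
    """
    def fix(match):
        target = match.group("target")
        if target == "*" or target.startswith("/") or "://" in target:
            return match.group(0)  # already in a form the importer accepts
        return '"%s %s://%s' % (match.group("method"), scheme, target)

    # Only rewrite the first quoted field so referrer/user-agent stay untouched
    return REQUEST_RE.sub(fix, line, count=1)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from a real GoDaddy log
    sample = '1.2.3.4 - - [01/Jan/2014:00:00:00 -0700] "GET www.site.org/index.htm HTTP/1.1" 200 512'
    print(add_scheme(sample))
```

Running each line of the log through add_scheme and writing the result to a new file before calling import_logs.py should give the same effect as the sed workaround, without dropping the host name.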
Thanks for the report. I've never seen logs in this format before. Which software / server is generating access logs such as "GET www.site.org/index.htm"?
I'm asking to find out whether it would impact a lot of users or just a few. Cheers
Consolidating milestones FTW
@user10001001 do you mind replying to my question? Is it GoDaddy that generates these server files?
Issue was moved to the new repository for Piwik Log Analytics: https://github.com/piwik/piwik-log-analytics/issues