@soulcreate opened this Issue on July 26th 2014

I use piwik/misc/log-analytics/import_logs.py to import my logs.
A small share of the search keywords is displayed as ????. However, everything looks correct in the source HTML.
The screenshot shows the link address http://yandex.ru/yandsearch?text=%EA%F3%EF%E8%F2%FC+%E2%FB%EF%F3%F1%EA%ED%EE%E5+%EF%EB%E0%F2%FC%E5&lr=213
which converts correctly into Russian.
This affects about 20% of all requests coming from the Russian versions of Yandex and Google.
[Screenshot: piwik log analytics]

@mattab commented on August 3rd 2014 Owner

Thanks for the report! Could you please attach a small log file that helps us reproduce the issue?

Also which command line did you run to import it?

We will investigate and fix the issue then!

@soulcreate commented on August 5th 2014

I use this bash script:
SCRIPT="python piwik/misc/log-analytics/import_logs.py --url=http://........../piwik/"
PARAM="--enable-http-redirects --enable-static --enable-bots --enable-http-errors --enable-reverse-dns --recorders=2 --add-sites-new-hosts"
PATHLOG="/var/log/nginx/"
# Import every nginx access log found in $PATHLOG
for i in "$PATHLOG"*-access.log
do
  # Skip the unexpanded pattern when no file matches
  [ -e "$i" ] || continue
  $SCRIPT "$i" $PARAM
  echo "$i"
done

@soulcreate commented on August 8th 2014

The log file contains this line:
**_.ru 95..._* - - [08/Aug/2014:13:31:19 +0600] "GET /%D0%B8%D0%BC%D0%BF%D0%BB%D0%B0%D0%BD%D1%82%D0%B0%D1%86%D0%B8%D1%8F-%D0%B7%D1%83%D0%B1%D0%BE%D0%B2 HTTP/1.0" 200 68785 "http://yandex.ru/yandsearch?text=%E8%EC%EF%EB%E0%ED%F2%E0%F6%E8%FF+%E7%F3%E1%EE%E2+%E5%EA%E0%F2%E5%F0%E8%ED%E1%F3%F0%E3&lr=213" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)"
This is how the visitor appears in Piwik:
[Screenshot: piwik log analytics2]

@mattab commented on December 18th 2014 Owner

Hi @soulcreate, thanks for the report. I tried to add detection for this search engine, but it didn't work. @sgiehl, maybe you have an idea? My test fixture was:


- url: 'http://yandex.ru/yandsearch?text=%EA%F3%EF%E8%F2%FC+%E2%FB%EF%F3%F1%EA%ED%EE%E5+%EF%EB%E0%F2%FC%E5&lr=213'
  engine: 'Yandex'
  keywords: 'купить выпускное платье'

@soulcreate commented on January 5th 2015

In PHP this is very easy to fix using the "rawurldecode" function:
http://php.net/manual/en/function.rawurldecode.php
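
For comparison, here is a minimal Python sketch of the same idea, since import_logs.py is written in Python. It is not the code used by import_logs.py or Piwik. It assumes the Yandex referrer in this thread is percent-encoded Windows-1251 rather than UTF-8, which is consistent with the fixture above. The function name `extract_keyword` and the parameters `param` and `fallback_charset` are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit, unquote_to_bytes


def extract_keyword(referrer, param="text", fallback_charset="cp1251"):
    """Return the decoded value of `param` from a referrer URL, or None.

    Hypothetical helper: try UTF-8 first, then fall back to an assumed
    legacy charset (Windows-1251 here) if the bytes are not valid UTF-8.
    """
    query = urlsplit(referrer).query
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if name != param:
            continue
        # '+' means a space in query strings; percent-decode to raw bytes
        raw = unquote_to_bytes(value.replace("+", " "))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: assume the legacy single-byte charset
            return raw.decode(fallback_charset, errors="replace")
    return None


referrer = (
    "http://yandex.ru/yandsearch?text=%EA%F3%EF%E8%F2%FC+"
    "%E2%FB%EF%F3%F1%EA%ED%EE%E5+%EF%EB%E0%F2%FC%E5&lr=213"
)
print(extract_keyword(referrer))
```

Trying UTF-8 first and falling back only on a decoding error keeps UTF-8 referrers working. The fallback charset is a guess, so a referrer in some other legacy encoding would still come out wrong.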

@mattab commented on March 12th 2015 Owner

Issue was moved to the new repository for Piwik Log Analytics: https://github.com/piwik/piwik-log-analytics/issues

refs #7163

This Issue was closed on March 12th 2015