@magnus-84 opened this Issue on June 28th 2017

Hello

I have problems trying to import W3C logs from Incapsula services into Piwik. Below is the command line I use to try to import the log file. IP and domain info have been changed for protection.

/usr/bin/python /var/www/html/piwik/misc/log-analytics/import_logs.py --url=http://10.1.2.3 --idsite=8 --recorders=4 --enable-http-errors --enable-http-redirects --enable-static --enable-bots --log-format-name=w3c_extended --w3c-fields='#Fields: date time cs-vid cs-clapp cs-browsertype cs-js-support cs-co-support c-ip s-caip cs-clappsig s-capsupport s-suid cs(User-Agent) cs-sessionid s-siteid cs-countrycode s-tag cs-cicode s-computername cs-lat cs-long s-accountname cs-uri cs-postbody cs-version sc-action s-externalid cs(Referrer) s-ip s-port cs-method cs-uri-query sc-status s-xff cs-bytes cs-start cs-rule cs-severity cs-attacktype cs-attackid s-ruleName' /root/web.log --debug --debug

Debug output below

2017-06-28 11:21:17,172: [DEBUG] Accepted hostnames: all
2017-06-28 11:21:17,172: [DEBUG] Piwik Tracker API URL is: http://10.1.2.3
2017-06-28 11:21:17,172: [DEBUG] Piwik Analytics API URL is: http://10.1.2.3
2017-06-28 11:21:17,172: [DEBUG] No token-auth specified
2017-06-28 11:21:17,172: [DEBUG] No credentials specified, reading them from "/var/www/html/piwik/config/config.ini.php"
2017-06-28 11:21:17,240: [DEBUG] Authentication token token_auth is: 90871c8584ddf2265f54553a305b6ae1
2017-06-28 11:21:17,240: [DEBUG] Resolver: static
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2017-06-28 11:21:17,343: [DEBUG] Launched recorder
2017-06-28 11:21:17,343: [DEBUG] Launched recorder
2017-06-28 11:21:17,344: [DEBUG] Launched recorder
2017-06-28 11:21:17,344: [DEBUG] Launched recorder
Parsing log /root/web.log...
2017-06-28 11:21:17,345: [DEBUG] Based on 'Fields:' line, computed regex to be (?P<date>\d+[-\d+]+\s+[\d+:]+)[.\d]*?\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+"?(?P<ip>[\w*.:-]*)"?\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?P<user_agent>".*?"|\S*)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?P<query_string>\S*)\s+(?P<status>\d+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)
2017-06-28 11:21:17,350: [DEBUG] Invalid line detected (line did not match): #Software: Incapsula LOGS API

2017-06-28 11:21:17,350: [DEBUG] Invalid line detected (line did not match): #Version: 1.1

2017-06-28 11:21:17,350: [DEBUG] Invalid line detected (line did not match): #Date: 28/Jun/2017 07:28:59

2017-06-28 11:21:17,350: [DEBUG] Invalid line detected (line did not match): #Fields: date time cs-vid cs-clapp cs-browsertype cs-js-support cs-co-support c-ip s-caip cs-clappsig s-capsupport s-suid cs(User-Agent) cs-sessionid s-siteid cs-countrycode s-tag cs-cicode s-computername cs-lat cs-long s-accountname cs-uri cs-postbody cs-version sc-action s-externalid cs(Referrer) s-ip s-port cs-method cs-uri-query sc-status s-xff cs-bytes cs-start cs-rule cs-severity cs-attacktype cs-attackid s-ruleName

2017-06-28 11:21:17,351: [DEBUG] Invalid line detected (line did not match): "2017-06-28" "07:26:35" "a1f36498-c34a-45b9-b3a5-ee0bd00f91b6" "Chrome" "Browser" "false" "true" "123.123.123.123" "" "62a660e57ba257275cf7ccf699919eae18e07e84cb11c1075e99b1be98456059d3064ec14d3932ba6e89f5393a158b8b8c2572ad7ad7dadb0fe02a34ae4c3d504c035017bf9a6a7802bb898226378938" "NA" "774502" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "452000660051880893" "44850949" "SE" "LS" "Stockholm" "www.example.com" "32.0000" "32.0000" "Customer" "www.example.com/artiklar/x/y/z/" "" "HTTP" "REQ_PASSED" "118866685985031205" "" "124.124.124.124" "80" "GET" "" "200" "123.123.123.123" "10117" "1498634795555" "" "" "" "" ""

Logs import summary

0 requests imported successfully
0 requests were downloads
5 requests ignored:
    0 HTTP errors
    0 HTTP redirects
    5 invalid log lines
    0 requests did not match any known site
    0 requests did not match any --hostname
    0 requests done by bots, search engines...
    0 requests to static resources (css, js, images, ico, ttf...)
    0 requests to file downloads did not match any --download-extensions

Website import summary

0 requests imported to 1 sites
    1 sites already existed
    0 sites were created:

0 distinct hostnames did not match any existing site:

Performance summary

Total time: 0 seconds
Requests imported per second: 0.0 requests per second

Original logfile example below.

#Software: Incapsula LOGS API

#Version: 1.1

#Date: 28/Jun/2017 07:28:59

#Fields: date time cs-vid cs-clapp cs-browsertype cs-js-support cs-co-support c-ip s-caip cs-clappsig s-capsupport s-suid cs(User-Agent) cs-sessionid s-siteid cs-countrycode s-tag cs-cicode s-computername cs-lat cs-long s-accountname cs-uri cs-postbody cs-version sc-action s-externalid cs(Referrer) s-ip s-port cs-method cs-uri-query sc-status s-xff cs-bytes cs-start cs-rule cs-severity cs-attacktype cs-attackid s-ruleName

"2017-06-28" "07:26:35" "a1f36498-c34a-45b9-b3a5-ee0bd00f91b6" "Chrome" "Browser" "false" "true" "123.123.123.123" "" "62a660e57ba257275cf7ccf699919eae18e07e84cb11c1075e99b1be98456059d3064ec14d3932ba6e89f5393a158b8b8c2572ad7ad7dadb0fe02a34ae4c3d504c035017bf9a6a7802bb898226378938" "NA" "774502" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "452000660051880893" "44850949" "SE" "LS" "Stockholm" "www.example.com" "32.0000" "32.0000" "Customer" "www.example.com/artiklar/x/y/z/" "" "HTTP" "REQ_PASSED" "118866685985031205" "" "124.124.124.124" "80" "GET" "" "200" "123.123.123.123" "10117" "1498634795555" "" "" "" "" ""

I guess the problem is something in the regex? Any help would be appreciated. I have no knowledge of regex myself.

Regards
Magnus

@sgiehl commented on June 30th 2017 Member

@magnus-84: I've recreated the issue in the log importer repo: https://github.com/piwik/piwik-log-analytics/issues/179
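
For anyone landing here, a minimal sketch of one likely reason the data line is rejected: the computed regex starts with a date pattern that expects a digit, while the Incapsula line starts with a double quote because every value is quoted. The sketch uses only the leading date/time part of the regex from the debug output; the group name `date`, the shortened sample line, and the helper `unquote_simple_fields` are assumptions made for illustration and are not part of import_logs.py. Whether the importer accepts the rewritten lines end to end has not been verified here.

```python
import re
import shlex

# Leading part of the computed regex (date + time fields).
# The group name "date" is an assumption for this sketch.
DATE_TIME = re.compile(r'(?P<date>\d+[-\d+]+\s+[\d+:]+)[.\d]*?')

# Shortened, hypothetical sample in the Incapsula style: every value is quoted.
quoted = '"2017-06-28" "07:26:35" "Chrome" "" "Mozilla/5.0 (Windows NT 6.1)" "200"'


def unquote_simple_fields(line):
    """Drop the quotes around values that contain no whitespace.

    Values with spaces keep their quotes so the column count is unchanged,
    and empty values become "-" (a placeholder chosen for this sketch).
    """
    fields = []
    for value in shlex.split(line):
        if value == '':
            fields.append('-')
        elif any(ch.isspace() for ch in value):
            fields.append('"%s"' % value)
        else:
            fields.append(value)
    return ' '.join(fields)


# match() anchors at the start of the line, where the pattern needs a digit,
# so the opening double quote makes the quoted line fail.
print(DATE_TIME.match(quoted))

cleaned = unquote_simple_fields(quoted)
print(cleaned)
print(DATE_TIME.match(cleaned).group('date'))
```

A pre-processing step like this could be run over the log file before handing it to the importer, but treat it as a starting point for debugging rather than a confirmed fix.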

This Issue was closed on June 30th 2017