@anonymous-piwik-user opened this Issue on August 11th 2009

I very often see a visitor from the country "unknown" with the provider "googlebot", resolution 1024 x 1024, browser Mozilla 5.0, and an unknown operating system.

I'm seeing this on several sites.
Maybe a new version of Googlebot?
Keywords: googlebot

@robocoder commented on August 11th 2009 Contributor

Can you check your web server log and give us a User Agent string? Sounds like Google's version of the Bing spambot.

@anonymous-piwik-user commented on August 11th 2009

"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

@robocoder commented on September 11th 2009 Contributor

Reference: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=80553

Maybe it's time to add an example bot tracking plugin to move bot-specific detection logic out of Visit.php...

@robocoder commented on September 11th 2009 Contributor

Can you provide a few lines from your web server's access log showing the Googlebot requests? Thanks.

@anonymous-piwik-user commented on September 11th 2009

66.249.71.35 - - +0200 "GET /ro/tag/englisch/ HTTP/1.1" 200 10033 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /wp-content/plugins/simple-ajax-shoutbox/ajax_shoutbox_process.php?1252281600 HTTP/1.1" 200 83 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /sk/tag/linux/ HTTP/1.1" 200 10899 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
83.64.31.37 - - +0200 "POST /wp-cron.php?doing_wp_cron HTTP/1.0" 200 - "-" "WordPress/2.8.4; http://blog.prasi.at"

66.249.71.35 - - +0200 "GET /tag/englisch/&rurl=translate.google.com&lang=de&usg=ALkJrhja1EyzL9WcZVjz7LgKxhfVOVrJEw HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /2009/06/20/jailbreak-iphone-os-3-0-ist-ab-sofort-verfugbar/ HTTP/1.1" 200 10777 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /tag/nova-rock/&rurl=translate.google.com&lang=de&usg=ALkJrhi6maYH0aia7iAfuV7rpgHyGmOUMA HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /tag/magento/&rurl=translate.google.com&lang=de&usg=ALkJrhgLB2Y3EEXHn82mkWp80wufYKHKwA HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /tag/usb-stick/&rurl=translate.google.com&lang=de&usg=ALkJrhigtRYLjyjLlIUykrlJ13sz7ofdNw HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /en/about-me/ HTTP/1.1" 200 8398 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /2009/06/23/left-4-dead-patch-diese-woche/&rurl=translate.google.com&lang=de&usg=ALkJrhhG8ZQLkAD2-GicJfebbgFaFc7bng HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.35 - - +0200 "GET /tag/schweden/&rurl=translate.google.com&lang=de&usg=ALkJrhhPiufQvYU-FDqR0mTbZW8qKcqViA HTTP/1.1" 302 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

@mattab commented on September 12th 2009 Owner

A DNS lookup of the visitor host is done in the provider plugin when it is enabled.

Technically, Piwik should not require this DNS lookup to behave correctly; it should always be optional, since it can cause performance issues when DNS latency goes up.
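
To illustrate the point about keeping the lookup optional, here is a minimal Python sketch (not Piwik's PHP code). The function name `provider_hostname`, the `enabled` flag, and the injectable `resolver` argument are all assumptions made for this example; only `socket.gethostbyaddr` is a real standard-library call.

```python
import socket


def provider_hostname(ip, enabled=True, resolver=socket.gethostbyaddr):
    """Return the reverse-DNS host name for ip, or None.

    Hypothetical helper: `enabled` stands in for "the provider plugin is
    switched on", and `resolver` is injectable so the lookup can be
    replaced or stubbed out.
    """
    if not enabled:
        # Lookup disabled: tracking continues without any DNS traffic.
        return None
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses;
        # a failed lookup must not break the tracking request.
        return None
    return hostname
```

With `enabled=False` the function returns immediately, so tracking never waits on DNS. The sketch does not bound the latency of an enabled lookup; that would need a separate mechanism.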

@robocoder commented on September 12th 2009 Contributor

prasi: do you have one showing Googlebot fetching piwik.php?

matt: nod, a similar latency issue arises with the honeypot suggestion in #653.

@anonymous-piwik-user commented on September 12th 2009

66.249.71.210 - - +0200 "GET /robots.txt HTTP/1.1" 200 21 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

66.249.71.210 - - +0200 "GET /piwik.php?idsite=1&url=http%3A%2F%2Fblog.prasi.at%2F&res=1024x1024&h=3&m=51&s=21&cookie=1&urlref=&rand=0.278324234&pdf=0&qt=0&realp=0&wma=0&dir=0&fla=0&java=0&gears=0&ag=0&action_name= HTTP/1.1" 200 43 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
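
The request above can be decoded to see what the tracker received. The following is a small Python sketch using only the standard library; the helper name `tracking_params` is made up for this example. It suggests that the `res=1024x1024` value matches the resolution reported at the top of this issue.

```python
from urllib.parse import parse_qs, urlsplit


def tracking_params(request_target):
    """Return the query parameters of a tracker request as a flat dict."""
    query = urlsplit(request_target).query
    # keep_blank_values=True preserves empty fields such as "urlref=".
    parsed = parse_qs(query, keep_blank_values=True)
    # Each parameter appears once in this request, so keep the first value.
    return {name: values[0] for name, values in parsed.items()}


# Request target copied from the access log line above.
target = (
    "/piwik.php?idsite=1&url=http%3A%2F%2Fblog.prasi.at%2F&res=1024x1024"
    "&h=3&m=51&s=21&cookie=1&urlref=&rand=0.278324234&pdf=0&qt=0&realp=0"
    "&wma=0&dir=0&fla=0&java=0&gears=0&ag=0&action_name="
)
params = tracking_params(target)
print(params["res"], params["url"])
```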

@robocoder commented on September 12th 2009 Contributor

In [1470], fixes #918 and #958 - Filter out Googlebot and Bing bot
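
The changeset itself is not shown in this thread. As a rough illustration of user-agent based filtering, here is a Python sketch (Piwik's own code is PHP). The marker list and the name `is_known_bot` are assumptions for this example, not the contents of [1470].

```python
# Hypothetical marker list; the actual strings matched in [1470] are not
# shown in this thread.
BOT_MARKERS = ("googlebot", "msnbot", "bingbot")


def is_known_bot(user_agent):
    """Return True if the User-Agent contains a known crawler marker."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BOT_MARKERS)


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
wordpress_ua = "WordPress/2.8.4; http://blog.prasi.at"
print(is_known_bot(googlebot_ua), is_known_bot(wordpress_ua))
```

A substring check like this needs no DNS lookup, so it avoids the latency concern raised earlier, but it trusts whatever User-Agent the client sends.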

This Issue was closed on September 12th 2009