@joostdekeijzer opened this Pull Request on November 18th 2014

Hi,

PR #6574 had issues when the GeoIP PECL extension was installed (thus the PR was reverted in #6647).

This is now fixed.

Tested on php53 and php56 with and without the GeoIP PECL extension enabled.
Manually tested with ISP and ORG GeoIP databases.

@tsteur @mattab are there any other test-vector/option to try?

@mattab commented on November 28th 2014 Owner

Hi @joostdekeijzer Thank you for following up and testing on 5.3 and 5.6 with and without PECL. I will merge the PR shortly and hopefully it will work now. We have a couple weeks left until the stable release so hopefully enough time to catch any issue :+1:

@joostdekeijzer commented on November 28th 2014

Great! Let me know if there are any issues so I can try to fix those.

@Globulopolis commented on January 5th 2015 Contributor

Any documentation on how to use ipv6 maxmind DB?

@joostdekeijzer commented on January 6th 2015

I assume you want to use the "new" maxmind databases (.mmdb files) ?

The current implementation uses the legacy Maxmind code and only supports the legacy database (.dat) files.

The currently used datafile (GeoLiteCity.dat) includes IPv6 information.

@Globulopolis commented on January 6th 2015 Contributor

Thank you for clarification @joostdekeijzer

@olavmrk commented on March 24th 2015

Hi,

I tried this functionality today, but as far as I can tell, GeoLiteCity.dat does not include IPv6 information. For that, one needs GeoLiteCityv6.dat

I did a simple test using the script at: https://gist.github.com/olavmrk/1cc21f890ffc97998b84

That script gives the following output:

GeoLiteCity(158.38.62.0): 'Norway'
GeoLiteCity(0:0:0:0:0:ffff:9e26:3e00): NULL
GeoLiteCity(2001:700:1::): 'United States'
GeoLiteCityv6(158.38.62.0): NULL
GeoLiteCityv6(0:0:0:0:0:ffff:9e26:3e00): 'Norway'
GeoLiteCityv6(2001:700:1::): 'Norway'

The correct country is Norway, so we can see that using GeoLiteCity.dat gives the incorrect country. Meanwhile, GeoLiteCityv6.dat is unable to look up IPv4-addresses directly.

What appears to work is looking up addresses in GeoLiteCityv6 using IPv6 addresses and IPv6-mapped IPv4 addresses.

@joostdekeijzer commented on March 25th 2015

Hi,

When I test your IP addresses using the Piwik API, 0:0:0:0:0:ffff:9e26:3e00 does return Norway for me (using http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz ; updated 2015-03-04)

Also see http://demo.piwik.org/index.php?module=API&method=UserCountry.getLocationFromIP&format=JSON&ip=0:0:0:0:0:ffff:9e26:3e00

But that's not to say that the database is good. I find many IPv6 addresses to be reported in China when I'm quite sure they shouldn't.

Eg. a reported IPv6 for google.com using dig <a class='mention' href='https://github.com/8'>@8</a>.8.8.8 aaaa google.com is 2a00:1450:4013:c00::8a which is reported to be in China.
http://demo.piwik.org/index.php?module=API&method=UserCountry.getLocationFromIP&format=JSON&ip=2a00:1450:4013:c00::8a

@olavmrk commented on March 25th 2015

Hi,

I believe Piwik transforms a IPv6-mapped IPv4 address into a IPv4 address before passing it to the GeoIP module. I only included the lines with the IPv6-mapped IPv4 address since it indicates a possible workaround.

To clarify:

My problem isn't IPv4 addresses. My problem is that the GeoLiteCity.dat does not work for IPv6 addresses, while the GeoLiteCityv6.dat file does not (directly) work for IPv4 addresses.

The GeoLiteCity.dat returns a result for IPv6-addresses, but the result appears to be incorrect. Meanwhile the result from GeoLiteCityv6.dat is correct, so the problem isn't the MaxMind database. It is the use of GeoLiteCity.dat.

For example, looking up 2a00:1450:4013:c00::8a in GeoLiteCityv6.dat yields Ireland instead of China.

Another IPv6-address belonging in Norway is 2a02:fe0:c610:9f50::, which GeoLiteCityv6.dat get correct while GeoLiteCity.dat places it in Hong Kong.

@olavmrk commented on March 25th 2015

Hi,

I had an idea about the cause of the strange behavior of IPv6-addresses: It appears that when you try to look up an IPv6 address in GeoLiteCity.dat, it simply casts it to a byte array and looks up the first four bytes as an IPv4 address.

E.g.:

@joostdekeijzer commented on March 25th 2015

Hi Olav,

That's good research!

I think I've found the base of the issue in the code.

Based on your findings, I guess there will also be the need to periodicly update the GeoLiteCityv6.dat file...

@olavmrk commented on March 26th 2015

I'm seeing three options for fixing this:

  1. Change plugins/UserCountry/LocationProvider/GeoIp/Php.php to load and maintain both GeoLiteCity.dat and GeoLiteCityv6.dat, using one for IPv4 addresses and the other for IPv6 addresses.
  2. Change plugins/UserCountry/LocationProvider/GeoIp/Php.php to load and maintain GeoLiteCityv6.dat, and transform IPv4 addresses into IPv6-mapped IPv4 addresses before passing it to the lookup function. Some simple tests indicate that this should work.
  3. Create a new Piwik\Plugins\UserCountry\LocationProvider\GeoIp2 location provider using the new GeoIP2 API.

I guess option 2 is the one requiring the least amount of code changes, but some tricks would probably have to be employed in order to be backwards-compatible with existing installations. I thing it should be possible to look at the database type and switch the method for looking up IPv4 addresses depending on whether it is the Geo(Lite)City.dat or Geo(Lite)Cityv6.dat database.

Option 3 looks like a better long term solution, but I don't know how tied Piwik is to the existing GeoIp location provider.

@joostdekeijzer commented on March 26th 2015

Hi,

Loading and maintaining both GeoLiteCity and GeoLiteCityv6 is quite possible, see the modifications here: https://github.com/piwik/piwik/compare/master...joostdekeijzer:geoip-legacy-IPv6-datafiles?expand=1

I'm afraid the code did not get any prettier... Moving to GeoIP2 would be nice but is a lot of work, also because the output is different from GeoIP1(different spelling of names, etc.).

I think it needs some review and I haven't run any tests yet.

@mattab should this be in a separate ticket or shall I just create a PR with above changes and continue the discussion there?

@olavmrk commented on March 26th 2015

To me that looks like a lot of changes -- I think option 2 would have been prettier :)

Looking at the code -- isn't the same changes required for the other database types (i.e. isp and org)? Or do they have a different structure combining IPv4 and IPv6 addresses?

Another thing I see is that you dropped in 2a02:fe0:c610:9f50:: in some test code. Since that address is an IPv6 address most likely in use by a single subscriber on a residential ISP, I don't think that address should be in that test.

@joostdekeijzer commented on March 26th 2015

I did not try option 2 because I was afraid of side-effects with backward-compatibility and the existing visting archives.
Also, I don't see any IPv6 reference for the ISP and ORG databases in the GeoIP code.

I just tried your IPv6 address (the one in the testcase https://github.com/piwik/piwik/blob/master/tests/PHPUnit/Fixtures/ManyVisitsWithGeoIP.php#L37 doesn't actually have any GeoIP info). Maybe use the Google IPv6 address instead?

@joostdekeijzer commented on March 26th 2015

Just tested: the ISP and ORG databases handle IPv6 addresses the same as GeoLiteCity.dat.

Entering the Google IPv6 addres will return:

  • Country name: Ireland (etc.)
  • isp: China Telecom Guangdong
  • org: China Telecom Guangdong

So this will need to be handled...

@olavmrk commented on March 27th 2015

Yes, backwards compatibility can be a challenge. I think it would be best solved by changing the defaults so that new installations would end up using GeoLiteCityv6.dat, while not changing anything for existing installations.

Regarding the IPv6 address that should be tested -- I just don't think the address of a random subscriber on a random ISP is the best approach. 2001:700:1:: would be slightly better -- that is the main employee network at my workplace, but I still cannot promise that it is persistent in any way.

@joostdekeijzer commented on March 30th 2015

I'd like some lead developer feedback on this.

About your backward compatibility solution: that should indeed work but my feeling is it should be done when migrating to GeoIP2 (I've done some tests with GeoIP2 and the output is really different)

Also agree on your IPv6 test address comment. Maybe Piwik has an address themselves?

@joostdekeijzer commented on March 30th 2015
@mattab commented on March 31st 2015 Owner

Hi there,

I'd like some lead developer feedback on this.

for sure I would like to help here, but this issue is closed. In general it is helpful for us if we discuss in issues that are still opened, so that we will not forget about them. Would you mind creating an issue to describe the specific problem as above? if there is more than one problem, it is best to create more than one issue :+1: hope it works...

This Pull Request was closed on November 28th 2014
Powered by GitHub Issue Mirror