How to build a IP geolocation database from scratch? (opens in new tab)

withinboredom2y ago

I'm very curious why you'd do VPN/proxy detection...

But at a previous company I worked at that ran a very large chunk of the internet, we did indexing of nearly the entire internet (even large portions of the dark web) approximately every two weeks. There were about 500 servers doing that non-stop. So, I think it is relatively reasonable if you have 600 servers to do that.

meroje2y ago

In the business of media streaming, rightholder will require that you check for vpn and proxies in addition to countries when deciding if a given viewer will be able to stream a given media.

lillecarl2y ago

You can guess pretty well how IP's are related by BGP announcements, so as long as a few per block and if small, ASN. You can use that logic.

matsur2y ago

ICMP response time not useful for “locating” an anycasted address, some of which have logical location associated with them. See https://blog.cloudflare.com/icloud-private-relay/ for an example

cuu5082y ago

Well, at least you can detect it is an anycast address, and mark it as such.

welder2y ago

Great comment. I'm a big fan and customer of IPinfo, using your API in our login notification emails to say "You just logged in from Berlin, Germany. If this wasn't you click here." To provide country data for customers in their audit logs. And for anti-spam and fraud detection.

reincoder2y ago

I appreciate it, sir! If you have any questions or feedback, please let us know.

The challenge of being a data provider is that you can use our data in a million ways, and we don't have coverage of all. So, when you come up with questions or ideas, we can help you better.

As you mentioned, audit logs. I highly recommend you look into the ASN field.

The ASN identifies an organization that owns a block of IP addresses. In my experience, I have found that the combination of ASN+Country is the most valuable information you can use in spam and fraud detection. You can fake the IP geolocation information with a VPN. However, it is not as easy to fake the ASN information of the IP address. So, when you use a combination of country + ASN, you can have a robust cybersecurity system.

welder2y ago

Can you explain more how to use ASN to detect fraud and how it's different from the country detected for the IP? I thought ASN was derived from the IP, basically the route to that IP? Here's the ipinfo response for an IP used by a recent fraud signup attempt. The asn field matches country.

  {
    "city": "Mumbai",
    "connection": {
      "asn": 24560,
      "isp": "Bharti Airtel Ltd."
    },
    "continent_code": "AS",
    "continent_name": "Asia",
    "country_code": "IN",
    "country_name": "India",
    "currency": {
      "code": "INR",
      "name": "Indian Rupee",
      "plural": "Indian rupees",
      "symbol": "Rs",
      "symbol_native": "\u099f\u0995\u09be"
    },
    "ip": "2401:4900:1f38:7402:5569:2e45:3bb:9c0d",
    "latitude": 19.076000213623047,
    "location": {
      "calling_code": "91",
      "capital": "New Delhi",
      "country_flag": "https://assets.ipstack.com/flags/in.svg",
      "country_flag_emoji": "\ud83c\uddee\ud83c\uddf3",
      "country_flag_emoji_unicode": "U+1F1EE U+1F1F3",
      "geoname_id": 1275339,
      "is_eu": false,
      "languages": [
        {
          "code": "hi",
          "name": "Hindi",
          "native": "\u0939\u093f\u0928\u094d\u0926\u0940"
        },
        {
          "code": "en",
          "name": "English",
          "native": "English"
        }
      ]
    },
    "longitude": 72.87770080566406,
    "region_code": "MH",
    "region_name": "Maharashtra",
    "time_zone": {
      "code": "IST",
      "current_time": "2023-09-15T10:52:42+05:30",
      "gmt_offset": 19800,
      "id": "Asia/Kolkata",
      "is_daylight_saving": false
    },
    "type": "ipv6",
    "zip": "400203"
  }

Here's the response from ipinfo.io which includes privacy fields. It's technically a proxy but might be hard to detect because it's probably a crowdsourced/botnet proxy not a public one. We don't pay for

  {
    "ip": "2401:4900:1f38:7402:5569:2e45:3bb:9c0d",
    "city": "Najafgarh",
    "region": "Delhi",
    "country": "IN",
    "loc": "28.6114,77.2982",
    "org": "AS24560 Bharti Airtel Ltd., Telemedia Services",
    "postal": "110097",
    "timezone": "Asia/Kolkata",
    "asn": {
      "asn": "AS24560",
      "name": "Bharti Airtel Ltd., Telemedia Services",
      "domain": "airtel.com",
      "route": "2401:4900:1f38::/48",
      "type": "isp"
    },
    "company": {
      "name": "ABTS (Karnataka),",
      "domain": "airtel.com",
      "type": "isp"
    },
    "privacy": {
      "vpn": false,
      "proxy": false,
      "tor": false,
      "relay": false,
      "hosting": false,
      "service": ""
    },
    "abuse": {
      "address": "Bharti Airtel Ltd., ISP Division - Transport Network Group, 234 , Okhla Industrial Estate,, Phase III, New Delhi-110020, INDIA",
      "country": "IN",
      "email": "ip.misuse@airtel.com",
      "name": "ABUSE BHARTIIN",
      "network": "2401:4900:1f30::/44",
      "phone": "+000000000"
    }
  }

EDIT: Oops, I confused ipinfo with ipstack. I'm actually using ipstack. Their security field also doesn't detect this IP as a proxy, which is why we only pay for Professional (no security field).

  {
    "ip": "2401:4900:1f38:7402:5569:2e45:3bb:9c0d",
    "type": "ipv6",
    "continent_code": "AS",
    "continent_name": "Asia",
    "country_code": "IN",
    "country_name": "India",
    "region_code": "MH",
    "region_name": "Maharashtra",
    "city": "Mumbai",
    "zip": "400203",
    "latitude": 19.076000213623047,
    "longitude": 72.87770080566406,
    "location": {
      "geoname_id": 1275339,
      "capital": "New Delhi",
      "languages": [
        {
          "code": "hi",
          "name": "Hindi",
          "native": "\u0939\u093f\u0928\u094d\u0926\u0940"
        },
        {
          "code": "en",
          "name": "English",
          "native": "English"
        }
      ],
      "country_flag": "https://assets.ipstack.com/flags/in.svg",
      "country_flag_emoji": "\ud83c\uddee\ud83c\uddf3",
      "country_flag_emoji_unicode": "U+1F1EE U+1F1F3",
      "calling_code": "91",
      "is_eu": false
    },
    "time_zone": {
      "id": "Asia/Kolkata",
      "current_time": "2023-09-15T12:27:08+05:30",
      "gmt_offset": 19800,
      "code": "IST",
      "is_daylight_saving": false
    },
    "currency": {
      "code": "INR",
      "name": "Indian Rupee",
      "plural": "Indian rupees",
      "symbol": "Rs",
      "symbol_native": "\u099f\u0995\u09be"
    },
    "connection": {
      "asn": 24560,
      "isp": "Bharti Airtel Ltd."
    },
    "security": {
      "is_proxy": false,
      "proxy_type": null,
      "is_crawler": false,
      "crawler_name": null,
      "crawler_type": null,
      "is_tor": false,
      "threat_level": "low",
      "threat_types": null
    }
  }

EwanToo2y ago

Have you considered making your database available for download as Parquet format so people could just copy the file to S3, Google Cloud, etc, and query it immediately with various tools?

I know it can be done with CSV but it's not as smooth.

reincoder2y ago

Thank you for the feature request.

We usually just send users the documentation of ingesting the data in CSV or NDJSON format (Newline Delimited JSON). We don't actually get many requests for data downloads in Parquet format. I think we have a few customers where we deliver the data in parquet format directly to their cloud storage bucket.

But keep an eye out for our emails if we announce the parquet data downloads. I will talk with the folks about this.

BUT, there are some good news.

At least for the free database, we deliver the data directly to data warehouse platforms. Not even storage buckets. And we supply a good amount of documentation.

We have the free database in Snowflake, GCP, Kaggle, and Splitgraph, and we are working on a few more deals. For the free database, atleast, we are working on better things than parquet. Like literally one-click solution to bring the IP data to your data warehouse.

Kaggle: https://www.kaggle.com/code/ipinfo/ipinfo-ip-to-country-asn-...

Snowflake: https://app.snowflake.com/marketplace/listing/GZSTZSHKQ4QY/i...

If you want to use our free IP database on Google Cloud or BigQuery, please send us an email (support@ipinfo.io) and mention that the DevRel sent you from HN. I can easily set you up with the free IP database in GCP/BQ.

TheClassic2y ago

Your comment is extremely interesting and what I was hoping to learn from the article (without an existing source of information, how do we determine the location of an IP address). Thank you!

reincoder2y ago

I really appreciate. Thank you. We are very transparent about our process. If you have any questions, you can always reach out to us.

We have a simplified explanation of our probe network here: https://ipinfo.io/blog/probe-network-how-we-make-sure-our-da...

The only update is the number of servers is like 600+ now. The probe network is growing extremely rapidly.

Our IP geolocation process is quite complicated, and we have a team of data engineers, infrastructure engineers, and data scientists working on various aspects of it. Therefore, our approach is users can ask us questions, and we will try our best to answer them.

freedomben2y ago

Just wanted to let you know, it's this transparency that turned me into a customer!

I love your company and service, but I hate your pricing. I work with a lot of small clients/apps that paying for usage would be a no-brainer, but the defined monthly price buckets don't make any economical sense at their scale. If you added a "pay as you go" tier that a small app could reasonably start by using dollars worth of API calls per month and grow from there, I'd be spreading your seed all over the place. I'm not saying this to rag on you, just trying to provide some constructive feedback as a thank you for your info sharing!

3 more replies

detourdog2y ago

I just noticed that my wifes iphone uses the same mycingular ip address while driving accross 3 states over 5 hours.l while checking mail.

inemesitaffia2y ago

There's several options/techniques for doing it. But just imagine you have a permanent zero overhead VPN.

I don't know if that provider terminates long running calls, but the calls would stay up too regardless of tower.

detourdog2y ago

Yes, I’m sure it is iOS anti-tracking and directly related to why firewall apps inside SIP my not know what is going on.

Daviey2y ago

Would you consider no-signup inspection of the data you hold on the requesters IP address? I would love to see what you have on MY IP address, and if sufficiency accurate it feels that it would be a good incentive to sign up to use commerically.

It feels like it couldn't be abused by 'freeloaders', because i'd guess their use-case is viewing other peoples.

reincoder2y ago

We have a very open approach to our data. In fact, our website is extremely accessible. It is quite useful for researching IP addresses and does not require signing up. The data is largely available to view on the website. Although we display all IP address meta data on the home page, if you intend to use our website frequently, I recommend utilizing the IP data pages.

You can enter IP addresses on the right side to look up information here: https://ipinfo.io/what-is-my-ip

Additionally, we offer some enjoyable tools that you can use here: https://ipinfo.io/tools

The CLI tool is particularly entertaining.

You can also use our API service without signing up, with a limit of 1000 requests per day.

If you do choose to sign up for a free account, you will receive 50,000 requests per month, free IP databases, a bulk lookup feature, and more.

kam2y ago

This is literally the most prominent thing on the https://ipinfo.io home page.

Daviey2y ago

That's embarrassing for me... I thought that was a static image of an example. And I did look through the site looking for a search. Oops.

qingcharles2y ago

Huh, that's cool. It got my home IP about 15 miles from where I am, but still not bad.

Wait - how does this work for cell IPs? A lot of cellphone v4 IPs are now shared between hundreds or thousands of devices, right?

chankstein382y ago

That's pretty neat! You're basically using ping triangulation!

sib2y ago

Trilateration (same technique as used for mobile network location - in addition to the GPS on the phone)

chaps2y ago

Not gonna lie, this creeps the heck out of me.

reincoder2y ago

Thousands of people live in a zip code, while hundreds and thousands of people live in a city. We are literally giving away that data for free through our API and database. The creepiness of IP geolocation is mostly a meme.

IP geolocation is mainly used in cybersecurity and marketing analytics. There are many ways to geolocate someone. I once came across a project that could estimate the country a user is from based on their writing style and grammar mistakes. For example, American people sometimes use "should of" instead of "should have". Knowing the geolocation of an IP address isn't super creepy. It's just how things work on the internet.

chaps2y ago

And you're literally advertising this project as being helpful for targeted ads. So it's pretty clear from the get go that what you consider creepy isn't what I consider creepy. And having done enough reidentification work to scare myself, "thousands of people" might as well be a couple dozen or less. I get why you're defensive and why you think it's not creepy, but calling it a "meme" is insultingingly dismissive.

Just because it's "how things work on the internet" doesn't make its mass collection right. Under the same logic, any side channel attack is just "how it works", and its abuse warrants no ethical question.

http://shouldiblockicmp.com/

giantrobot2y ago

You might want to unplug your router then. A conceit of being connected to a network is you're connected to the network. If you can see other nodes they can see you.

jiveturkey2y ago

Is this an accepted usage of the word `conceit`? I love the construction, and it does feel like it belongs, but I'm not finding this usage. https://www.wordnik.com/words/conceit has a bunch of meanings collected from various sources.

I wonder if you meant `concession`.

Also, it's a false dichotomy. One can use VPN or proxies, to limit exposure or to encapsulate it. Of course, you can't get perfect location privacy.

fragmede2y ago

Your IP address is LEAKING!

goodpoint2y ago

Together with the tons of data leaked by browsers it makes it very easy to track people across places and devices.

voltagex_2y ago

Can your probes be identified and blocked?

reincoder2y ago

It is just ping data. We ping an IP address, get the RTT, draw a radius on the globe, and say that the IP could be anywhere inside that radius. Then we do another ping and draw another radius, and at the cross-section of the two radii could be your IP address. Now, if we do it enough times, we can get an estimate of where the IP address is located.

The data is not derived from the IP address itself, but rather from the process itself. And it's just a ping. Moreover, the majority of the IP addresses are not pingable. So, we rely on other in house statistical and scientific models to estimate the location. The probe infrastructure is extremely complicated and there are billions and billions of IP addresses, which is why we do not have a robust range filter mechanism.

You can implement a dynamic ping blocking mechanism or use our data to find hosting ASNs and block ranges of those ASNs. You can download the database for free: https://ipinfo.io/developers/ip-to-country-asn-database

kube-system2y ago

    iptables -A INPUT -p icmp -j DROP

voltagex_2y ago

(But the guy running the probes is making a good counter argument)

j16sdiz2y ago

This breaks PMTU and is the source of many mystery download stalls

jiveturkey2y ago

This doesn't help. Even if you apply this at your router, you are locatable up to your ISP. Which is generally close enough.

Maybe if you delay pings by some amount (20ms? 100ms?), or randomize the delay, you can do a lot better at masking location.

eptyc12y ago

Indeed. Openwrt for some reason defaults to reply to pings. I see the value of ICMP for servers, but I don't see the value for home ISP routers.

I disabled ICMP reply on my home router.

chaps2y ago

This isn't helpful. The comment was specifically asking about the probes, not ICMP traffic.

https://tech.marksblogg.com/ipinfo-free-ip-address-location-...

theogravity2y ago

How does that work with edge servers that use anycast to assume the same IP across different regions?

SnorkelTan2y ago

Aren’t any cast addresses a specific subset of ips and thus knowable? Iirc, each autonomous system is allocated anycast ip space?

sgjohnson2y ago

No. Any prefix can be anycasted.

BonoboIO2y ago

Hi, cool idea with the geolocation via latency.

But I encountered 2 things using ipinfo: Hetzner Server that are in Germany in a fixed location that never moved are sometimes located in another country, for me it was once s Server placed into Moscow and once in South America.

How does this happen?

reincoder2y ago

If you can give me some information about the IP address or the IP range, I can take a closer look.

I guess it is because of IPv4 trading or IP address shuffling.

As far as I know, Hertzer, like many hosting companies, is buying IPv4 addresses around the world. Here is an article on the IPv4 trades:

When a company buys an IP address block or relocates an IP block from one of its data centers to another, the location of those IP addresses changes.

If your IP address is static, but we have made an error in geolocation, I would love to take a closer look. You can email our support (support@ipinfo.io) and send a link to the comment. We can discuss it further from there.

tmaly2y ago

Are there any historical sources for geo ip info?

reincoder2y ago

We don't have any free data for that. We have historical data that we sell as part of our custom enterprise deals. Historical data requests are rare, though.

A time series IP database requires a substantial amount of storage and computational cost to query, as I imagine. The city level geolocation data we have is ~1.5 gb in size. IP range data is complicated to query efficiently as you need to understand data platform settings and good amount of computer network math and computer science stuff. Adding a layer of time series complexities on top of that, makes this process quite difficult.

To give you some context of how IP metadata lookups work, you can check out this article

https://ipinfo.io/blog/ip-address-data-in-snowflake/

Even if you keep all your database in a binary format, the computational cost is still non-negligible.

yrro2y ago

hm, ipinfo.io tells me that I'm using a VPN even though I'm not...

reincoder2y ago

Our VPN recognition is behavior-based. So, there is probably a chance that the IP address you are using is showing some of those behavior patterns.

A behavior pattern could be that your IP address is being shuffled around random locations that go beyond the normal location shuffling of an ISP connection.

Also, if your IP range is listed in some public datasets that belong to a VPN service, we could recognize your IP as a VPN.

Please reach out to our support and let us know about this. Thanks

yrro2y ago

Will do thanks

louison112y ago

If you don't want to do this yourself, you can actually just get Cloudflare to do it for you for free using a simple Worker since all Cloudflare requests contain approximate IP location information.

You can also just send a request to my URL (Cloudflare Worker operated - so it should have global low latency): https://www.edenmaps.net/iplocation

Use it for small applications, I don't mind. Just don't start sending me 10M requests per day ;-)

Aachen2y ago

Or you download an IP database rather than sharing with a third party which IP address is likely connecting to your service with a third party

hotgeart2y ago

Located 100km from the Somali coast... I'm in Brussels, Belgium, thx for protecting my privacy :D

louison112y ago

The result is [lon, lat]. You’ve most likely copied it onto Google maps, which works with [lat, lon]. Believe it or not, the industry still hasn’t come up with a standard order.

mrklol2y ago

You could also just return a json with lat and lon as fields so nobody is getting confused. {lat:5, lon:7}

somedude8952y ago

Got me to within 1km, that's pretty crazy

helmchenlord2y ago

Same here. Scaring.

emadda2y ago

Does anyone know how accurate Cloudflare geolocation is (for workers requests)?

banana_giraffe2y ago

As accurate as MaxMind[1], since that's what they use [2]. In my experience, it's reasonably accurate for the US, less so for other countries. MaxMind publishes some accuracy data which might be an interesting starting point [3]

That said, for any analytics use cases of this data, be aware that MaxMind will group a lot of what should be unknowns in the middle of a country. Or, in the case the US now, I think they all end up in the middle of some lake, since some farm owners in Butler County, Kansas got tired of cops showing up and sued MaxMind. It can cause odd artifacts unless you filter the addresses out somehow.

1 https://developers.cloudflare.com/support/network/configurin...

2 https://www.maxmind.com/en/geoip-demo

3 https://www.maxmind.com/en/geoip2-city-accuracy-comparison

matwood2y ago

Yeah, MaxMind is the best I have used with caveats. You need to update it frequently, and you need to allow for overrides.

reincoder2y ago

I work for IPinfo and we do ping based geolocation. The best thing you can do to verify geolocation accuracy is the following:

- Download a few free IP databases - Generate a random list of IP addresses - Do the IP address lookups across all those databases - Identify the IP address that can be pinged - Visit a site that can ping an IP address from multiple server - Sort the results by lowest avg ping time

Then check where the geolocation provider is locating the IP address and what is the nearest server from there.

noizejoy2y ago

On one hand, I love that there’s some good alternatives in the geolocation space, but misleading geolocation precision can lead to very undesirable side effects[0].

[0] https://www.theguardian.com/technology/2016/aug/09/maxmind-m...

tiffanyh2y ago

This is excellent!

Would you mind open sourcing the code for that?

louison112y ago

This is the code running this endpoint:

  export function onRequest(context) {
    return new Response(JSON.stringify([parseFloat(context.request.cf.longitude), parseFloat(context.request.cf.latitude)]), {headers: {"Content-Type": "application/json;charset=UTF-8"}})
  }

This is a function on Cloudflare Pages (which is just a different name for Cloudflare Workers). Minor adjustment needed for Workers (get rid of "context", I believe)

carstenhag2y ago

I'm in Munich. Cloudflare tells a position that is 730km to the north in a random forest.

louison112y ago

You've inverted lat, lon.

carstenhag2y ago

No, the other one was in the ocean

junto2y ago

As someone that lives in a country where the national language is not my first language, I hate websites that use IP location to make assumptions about my choice of language and it being forced on me based on a lazy assumption, when my browser is sending language headers quite clearly, and they are ignored.

reincoder2y ago

I work for an IP geolocations service, even I hate this thing.

They call it "web experience personalization" in the industry, and it is annoying. I have never recommended anyone to do that. The best way to do website personalization through IP geolocation:

- Taxes and stuff (if applicable)

- Delivery costs (if applicable)

- Putting the user's country first in those country selection drop-down menu

And that's about it from the top of my head. In my experience, these translations never work and only create distractions. Regardless of the positive intention the website has, using Google Translate to create a native language version of the website is just not a good idea.

junto2y ago

A fair number of popular internationalization frameworks also drive the idea that region and language are fixed pairs.

The example I often use to illustrate this problem is that there are roughly 4 million Norwegian speakers in the world, but 14 million speakers of Catalan. Visit an international website in Spain and you rarely get given the option to have it in Catalan.

Good example is Amazon.es https://www.amazon.es/customer-preferences/edit?from=mobile&...

hedora2y ago

I live in the US, and IP geolocation points to the incorrect regions (plural) on all my devices.

Few technologies manage to make my day-to-day internet experience than these sorts of databases.

I wish they would just go away.

Websites could just ask me my zipcode on first load instead of guessing it wrong every single time and then burying the flow to fix it behind multiple links and page loads.

Also: There is no way to fix the database to produce the “correct” or “better” answer. I rarely want a website to use my current location.

Instead, I check inventory for stores in places where I will be. This whole space is trying to solve an ill-posed problem.

fer2y ago

I share your frustration so much that I wrote an article about it: https://www.fer.xyz/2021/04/i18n

jedberg2y ago

It all depends on what you want to use it for and how accurate it needs to be.

The best way to build a geolocation service is to have a billion devices that report their location to you at the same time they report their IP to you. That's basically Apple and Google. They have by far the best geolocation databases in the world, because they get constant updates of IP and location.

The trick is basically to make an app where people willingly give you their location, and then get a lot of people to use it. That's the best way to build an accurate geo-location database, and why every app in the world now asks for your location.

4-square had the right idea, they were just ahead of their time.

flounder32y ago

Even 10 years ago, Apple internal privacy policies prevented itself from collecting precise lat/long. We had to use HTTP session telemetry to determine which endpoints were best for a given IP (or subnet, but not ASN), which informed our own pseudo-geoIP database so we knew which endpoint to connect to based on real world conditions.

Even still, it had to be as ephemeral as possible for the sake of privacy. We weren’t allowed to use or record results from Apple Maps’ reverse geo service outside of the context of a live user request (finding nearby restaurants, etc).

jedberg2y ago

You don't need precise lat/lon to make a good database. Even a 1km circle would be more than enough.

> but not ASN

Why wasn't ASN allowed? That's what Netflix used to make endpoint routing decisions and worked really well.

flounder32y ago

You’re not wrong, but privacy concerns were paramount.

ASNs were allowed but too vague. We needed more granularity. Corporate proxies, subdelegations, many providers aggregating announcements below /24, etc.

hddqsb2y ago

Somewhat relevant: Google Maps can learn the location of your IP based on which locations you browse in the map. If you browse a specific location enough times, it will use that as the default location when you open Google Maps, even if you clear all cookies. (I discovered this just from using Google Maps, and I'm a little concerned by the privacy implications, considering that multiple people may share an IP address.)

gniv2y ago

I suspect it's the other way around. Google just has a very good IP geolocation db, so it uses that when you browse, absent any other info.

hddqsb2y ago

Google certainly uses its geolocation DB, but it also learns based on map browsing patterns.

To clarify, the scenario I described is as follows: 1. Initially, when I open Google Maps in a clean browser it defaults to my real location. 2. I repeatedly browse some other location. 3. When I open Google Maps in a clean browser, it defaults to that other location. The only reason for Google Maps to pick that other location is my map browsing.

gniv2y ago

Thanks for clarifying. That is indeed surprising and you are probably right.

netsharc2y ago

Well it has reporting beacons all over the world with GPS receivers, in the form of Android phones, and perhaps Google Maps users on iPhone too..

is_true2y ago

That would explain why it sometimes it thinks I'm in a river I paddle often and other times where I have my summer house.

cvalka2y ago

They use this for Google Workspace and data localization (including law enforcement localization).

dboreham2y ago

Interesting but this isn't actually how geolocation is done, right? The ARIN/RIPE data isn't sufficiently accurate to be useful beyond country. Commercial geolocation involves correlating client IP vs known physical location e.g. from WiFi AP or mailing a package to the user. At least that's what I have been told over the decades.

I work in adtech and this is how we do geolocation. There's also device geolocation but if the user doesn't consent to sharing their GPS data with us, we just use IP address for targeting. Common provider for this is Maxmind; they ship a database that you host locally and query

dawnerd2y ago

Even the free maxmind db is accurate enough for most applications.

tiffanyh2y ago

Does Cloudflare have the same data as Maxmind?

Because Cloudflare and Maxmind geolocate me to the exact same longitude/latitude.

klaussilveira2y ago

CloudFlare uses Maxmind: https://developers.cloudflare.com/support/network/configurin...

klaussilveira2y ago

Since you are in adtech: do you buy MaxMind, or roll your own? Are there any providers for US-only data, and therefore, cheaper?

https://github.com/Ne00n/yammdb

We licensed Maxmind's DB recently (it's like $300 a year or something). idk if there are US-only databases. Our customers are all in the US, and we use geo IP to filter european users for compliance (GDPR and otherwise)

nonethewiser2y ago

Comments seem fairly dismissive but I actually found this really interesting. It reminds me of a task I had in my first position to add PostGIS to our database and a location based search. That was based off addresses and zipcodes.

mannyv2y ago

That's relatively simple to do, even in mysql. One trick is to use a square instead of a circle, which avoids a lot of math.

spacedcowboy2y ago

So, at the risk of outing myself, I wrote http://www.hostip.info a long time ago* which used a community approach to get ip address location ("is this guess wrong ? Fix it please").

The last time I checked (maybe a decade ago [grin]) it worked pretty much perfectly for a country, imperfectly for a region, and better-than-a-coin-toss for city resolution. All the data is free.

I don't think they have it on the site any more, but I used to have a rotating 3D-cube thing (x,y,z were the first 3 octets of the address) for things like known-addresses, recent lookups, etc. I used different colours for different groups (country, continent,...) It was so old it was written as a Java applet. Yeah. I guess if I were to do it again, it'd be WebGL.

*: I sold it a long time ago, with the proviso that the data must always remain free. I actually didn't believe the offer at first (it came as an email, and looked like a scam) but it went through escrow.com just fine, and I think we both walked away happy. That was almost 2 decades ago now though.

kiririn2y ago

A modern version of the ping-based geoip mentioned

JoshGlazebrook2y ago

This just links to a mmdb file that is already compiled, there isn't anything relevant to show this is a "modern" implementation of anything if the implementation isn't available.

mootothemax2y ago

Any suggestions for geolocating datacenter IPs, even very roughly? I'm analysing traceroute data, and while I have known start and end locations, it's the bit in the middle I'm interested in.

I can infer certain details from airport codes in node hostnames, for example.

It would also be possible - I guess - to infer locations based on average RTT times, presuming a given node's not having a bad day.

Anyone have any other ideas?

Edit: A couple of troublesome example IPs are 193.142.125.129, 129.250.6.113, and 129.250.3.250. They come up in a UK traceroute - and I believe they're in London - but geolocate all over the world.

toast02y ago

Those IPs are owned by Google and NTT, who both run large international networks and can redeploy their IPs around the world when they feel like it. So lookup based geolocation is going to be iffy, as you've seen.

Traceroute to those IPs certainly looks like the networking goes to London.

The google IP doesn't respond to ping, but the NTT/Verio ones do. I'd bet if you ping from London based hosting, you'll get single digit ms ping responses, which sets an upper bound on the distance from London. Ping from other hosting in the country and across the channel, and you can confirm the lowest ping you can get is from London hosting, and there you go. It could also be that its connectivity is through London, but it's elsewhere --- you can't really tell.

Check from other vantage points, just to make sure it's not anycast; if you ping 8.8.8.8 from most networks around the world, you'll get something nearby; but these IPs give traceroutes to london from the Seattle area, so probably not anycast (at least at the moment, things can change).

If you don't have hosting around the world, search for public looking glasses at well connected network that you can use for pings like this from time to time.

tyingq2y ago

This looked promising:

"TULIP's purpose is to geolocate a specified target host (identified by IP name or address) using ping RTT delay measurements to the target from reference landmark hosts whose positions are well known (see map or table)."

https://tulip.slac.stanford.edu/

But the endpoint it posts to seems dead.

dontdoxxme2y ago

https://ensa.fi/papers/geolocation_imc17.pdf has some ideas.

Using RIPE atlas probes to get RTT to the IPs from known locations is close to your idea and probably the best anyway.

vinay_ys2y ago

> A couple of troublesome example IPs are 193.142.125.129, 129.250.6.113, and 129.250.3.250. They come up in a UK traceroute - and I believe they're in London - but geolocate all over the world.

If I'm running a popular app/web service, I would have my own AS number and I will have purchased a few blocks of IP addresses under this AS and then I would advertize these addresses from multiple owned/rented datacenters around the world.

These BGP advertisements would be to my different upstream Internet service providers (ISPs) in different locations.

For a given advertisement from a particular location, if you see a regional ISP as upstream, you can make an educated guess that this particular datacenter is in that region. If these are Tier 1 ISPs who provide direct connectivity around the world, then even that guess is not possible.

You can see the BGP relationships in a looking glass tool like bgp.tools – https://bgp.tools/prefix/193.142.125.0/24#connectivity

If you have ability to do traceroute from multiple probes sprinkled across the globe with known locations, then you could triangulate by looking at the fixed IPs of the intermediate router interfaces.

Even this is is defeated if I were to use a CDN like Cloudflare to advertise my IP blocks to their 200+ PoPs and ride their private networks across the globe to my datacenters.

sgjohnson2y ago

> If you have ability to do traceroute from multiple probes sprinkled across the globe with known locations

Everyone who's aware of RIPE Atlas has that ability.

I have almost a billion RIPE Atlas credits. A single traceroute costs 60. I have enough credits to run several traceroutes on the entire IPv4 internet. (the smallest possible BGP announcement is /24, so max of 2^24 traceroutes, but in reality it's even less).

SirMaster2y ago

These IP geolocation lookups never seen to work for me.

They are always multiple states off, and checking multiple different services pretty much never even seem to agree.

TZubiri2y ago

"how to scrape an ip geolocation database"

You know you can just run a whois query per ip you want to analyze, no point in scraping the whole ipvN space.

incolumitasOP2y ago

I have to scrape the whole IP address space since I offer location information as part of my API.

Also I only need to scrape as many WHOIS records as there are different networks out there. So for example for the IPv4 address space, there are much less networks as there are IPv4 addresses (2^32).

Also, most RIR's provide their WHOIS databases for download.

Therefore, "scraping" is not really the correct word, it's an hybrid approach, but mostly based on publicly available data from the five RIR's.

notlukesky2y ago

What was the easiest and the most frustrating part?

incolumitasOP2y ago

LACNIC is a fucking pain in the ass since they block after like 27 records

ARIN is awesome

djbusby2y ago

The whois data for IP is not accurate.

gsich2y ago

whois has no sane format.

ggm2y ago

RDAP is run by all the RIR and is Json and has all the whois data except IRR.

And it does 302 redirect to best source.

overcast2y ago

Step 1: Download Geolocation Database

Aachen2y ago

Scroll down, the article is confusingly below that

nonethewiser2y ago

Step 1: Download Geolocation Data

Unless you think CSV is a database?

debesyla2y ago

Maybe a dumb question (I have no knowledge), but why wouldn't we think of .CSV files as databases? It can have columns and rows filled with information and isn't that what makes a thing a database?

nobleach2y ago

Best I can guess here, the reply is considering relational databases as "real databases" and flat files.... not real.

nobleach2y ago

Are we really going to do the mincing of words here? Did you need the word "dump" or "export" before you understood? Although I wasn't wild about the original poster's "step 1" terseness, it's silly to think a normal person wouldn't be able to parse the sentence well enough to understand "download the database contents - perhaps stored in CSV format".

tmpX7dMeXU2y ago

If in your mind database implies a type of technology and not something conceptual, you’re really just outing yourself as someone that needs someone between you and the boardroom. Certainly not something to show off on Hacker News.

fasteo2y ago

>>> Consider Open Source Geolocation Projects

Not the definition of "from scratch" in my book

ChopSticksPlz2y ago

This is a very useful .csv, what is the license? Is it free for personal and commercial use?

johnklos2y ago

I think it's interesting that the one IP range I decided to check has correct information on the ipapi.is web site, but unambiguously incorrect information in the downloadable geolocationDatabaseIPv4.csv. Somehow Bedford, New Hampshire (which came straight from WHOIS) became Bedford, Texas.

How'd that happen?

cstuder2y ago

Question: What’s the motivation to put coordinates in one’s own WHOIS record? (geoloc/geofeed)

incolumitasOP2y ago

Many service providers actually want their clients to be able to locate them.

noizejoy2y ago

I’m duplicating my comment elsewhere in this thread, so each serves as a direct reply to the different geolocation providers in this thread, in the hope that it will be recognized as a problem with data that implies that it’s more precise than it really is:

> On one hand, I love that there’s some good alternatives in the geolocation space, but misleading geolocation precision can lead to very undesirable side effects[0].

[0] https://www.theguardian.com/technology/2016/aug/09/maxmind-m...

dontdoxxme2y ago

geofeed is used by big CDNs, it can actually help save money for the provider by meaning a CDN uses a more optimal network location.

sneak2y ago

I feel like a more useful and accurate way would be to buy client ip and GPS location data in bulk from one of the mobile data brokers who have their spyware embedded in zillions of popular apps/games and then group it by /24 or something.

Fileformat2y ago

Shameless self-promotion:

I built a page to compare IP geolocation providers: https://resolve.rs/ip/geolocation.html

I'll work on adding ipapi.is shortly!

XCSme2y ago

I am using the free version of https://db-ip.com, seems pretty good.

reincoder2y ago

Fantastic stuff. I work for IPinfo.io and I actually came across your site about a month ago. I was planning to contact to you about mentioning the free IP databases [0].

However, when I saw that a few API didn't return any response, I thought maybe the site was not maintained.

[0] https://ipinfo.io/products/free-ip-database

----

Tangent

I find the geographic coordinate values returning up to 15 decimal places is absurd for an IP geolocation response. IP geolocation is never that precise and, this level of "precision" is not warranted and frankly distracting. Like at best it should be 4 decimal places.

relevant xkcd: https://xkcd.com/2170/

bjornsing2y ago

I expected traceroute to play a bigger part in this. If you know the route to an IP address and the location of routers, perhaps even from a few different servers, then you should be able to locate it fairly well.

jl62y ago

Is anybody maintaining a historical archive of “IP address metadata” (which would include geolocation)?

If I have logs from 10 years ago, can I look up information about that IP as it was at the time?

quectophoton2y ago

Maybe not exactly what you're looking for, but the ipfire project has a git repository[1] mapping address ranges to countries. It apparently going back to 2017 only.

[1]: https://git.ipfire.org/?p=location/location-database.git;a=s...

bullen2y ago

Here is a solution for those that care about speed:

https://www.miyuru.lk/geoiplegacy

ggm2y ago

rfc8805 https://datatracker.ietf.org/doc/rfc8805/

rfc9092 https://datatracker.ietf.org/doc/rfc9092/

fabioyy2y ago

a long time ago i build a project like that but instead of relying on whois. i did a traceroute to every ipv4 address avaliable. several router hops, have a reverse dns that uses some names that include city codes, (like airport codes ). most providers have a single hop for a city. so its easy to correlate the latest router hop to a city.

nanmu422y ago

Thanks for sharing.

I have heard there is much effort to use BGP data to build GeoIP database.

bagels2y ago

Surely someone is using online shopping shipping addresses for this?

alberth2y ago

What are common use cases for needing IP geolocation?

jwie2y ago

The easiest way to get a geolocation is to ask the user. Maybe they’ll just tell you, and if that’s good enough for your application there’s no need for such solutions.

n2dasun2y ago

Step 1. Download Visual Basic

j / k navigate · click thread line to collapse

173 comments

reincoder2y ago

First, I am big fan of your articles even before I joined IPinfo, where we provide IP geolocation data service.

The publicly available information that you are referring to is sometimes not very reliable in providing IP location data as:

- They are often stale and not frequently updated.

- They are not precise enough to be generally useful.

- They provide location context at an large IP range level or even at organization level scale.

We have a free IP to Country ASN database that you can use in your project if you like.

https://ipinfo.io/developers/ip-to-country-asn-database

incolumitasOP2y ago

Big fan of what articles? On https://incolumitas.com/ or on https://ipapi.is/?

Great idea with latency triangulation, I used latency information for a lot of things, especially VPN and Proxy detection.

But I didn't assume you can obtain that accurate location. I am honestly impressed. But latency triangulation with 600 servers gives some very good approximation. Nice man!

Some questions:

- ICMP traffic is penalised/degraded by some ISP's. How do you deal with that?

- In order to geolocate every IPv4 address, you need to constantly ping billions of IPv4's, how do you do that? You only ping an arbitrary IP of each allocated inetnum/NetRange?

reincoder2y ago

https://incolumitas.com/

This is my all-time favorite article: https://incolumitas.com/2021/11/03/so-you-want-to-scrape-lik...

For the questions..... we have to kinda wait a bit, someone from our engineering team might come here and reply.

By the way, as I have you here have you considered converting the CSV files to MMDB format? I was planning to do that with our mmdbctl tool later today.

https://github.com/ipinfo/mmdbctl

withinboredom2y ago

I'm very curious why you'd do VPN/proxy detection...

meroje2y ago

In the business of media streaming, rightholder will require that you check for vpn and proxies in addition to countries when deciding if a given viewer will be able to stream a given media.

lillecarl2y ago

You can guess pretty well how IP's are related by BGP announcements, so as long as a few per block and if small, ASN. You can use that logic.

matsur2y ago

ICMP response time not useful for “locating” an anycasted address, some of which have logical location associated with them. See https://blog.cloudflare.com/icloud-private-relay/ for an example

cuu5082y ago

Well, at least you can detect it is an anycast address, and mark it as such.

welder2y ago

reincoder2y ago

I appreciate it, sir! If you have any questions or feedback, please let us know.

The challenge of being a data provider is that you can use our data in a million ways, and we don't have coverage of all. So, when you come up with questions or ideas, we can help you better.

As you mentioned, audit logs. I highly recommend you look into the ASN field.

welder2y ago

  {
    "city": "Mumbai",
    "connection": {
      "asn": 24560,
      "isp": "Bharti Airtel Ltd."
    },
    "continent_code": "AS",
    "continent_name": "Asia",
    "country_code": "IN",
    "country_name": "India",
    "currency": {
      "code": "INR",
      "name": "Indian Rupee",
      "plural": "Indian rupees",
      "symbol": "Rs",
      "symbol_native": "\u099f\u0995\u09be"
    },
    "ip": "2401:4900:1f38:7402:5569:2e45:3bb:9c0d",
    "latitude": 19.076000213623047,
    "location": {
      "calling_code": "91",
      "capital": "New Delhi",
      "country_flag": "https://assets.ipstack.com/flags/in.svg",
      "country_flag_emoji": "\ud83c\uddee\ud83c\uddf3",
      "country_flag_emoji_unicode": "U+1F1EE U+1F1F3",
      "geoname_id": 1275339,
      "is_eu": false,
      "languages": [
        {
          "code": "hi",
          "name": "Hindi",
          "native": "\u0939\u093f\u0928\u094d\u0926\u0940"
        },
        {
          "code": "en",
          "name": "English",
          "native": "English"
        }
      ]
    },
    "longitude": 72.87770080566406,
    "region_code": "MH",
    "region_name": "Maharashtra",
    "time_zone": {
      "code": "IST",
      "current_time": "2023-09-15T10:52:42+05:30",
      "gmt_offset": 19800,
      "id": "Asia/Kolkata",
      "is_daylight_saving": false
    },
    "type": "ipv6",
    "zip": "400203"
  }

  {
    "ip": "2401:4900:1f38:7402:5569:2e45:3bb:9c0d",
    "city": "Najafgarh",
    "region": "Delhi",
    "country": "IN",
    "loc": "28.6114,77.2982",
    "org": "AS24560 Bharti Airtel Ltd., Telemedia Services",
    "postal": "110097",
    "timezone": "Asia/Kolkata",
    "asn": {
      "asn": "AS24560",
      "name": "Bharti Airtel Ltd., Telemedia Services",
      "domain": "airtel.com",
      "route": "2401:4900:1f38::/48",
      "type": "isp"
    },
    "company": {
      "name": "ABTS (Karnataka),",
      "domain": "airtel.com",
      "type": "isp"
    },
    "privacy": {
      "vpn": false,
      "proxy": false,
      "tor": false,
      "relay": false,
      "hosting": false,
      "service": ""
    },
    "abuse": {
      "address": "Bharti Airtel Ltd., ISP Division - Transport Network Group, 234 , Okhla Industrial Estate,, Phase III, New Delhi-110020, INDIA",
      "country": "IN",
      "email": "ip.misuse@airtel.com",
      "name": "ABUSE BHARTIIN",
      "network": "2401:4900:1f30::/44",
      "phone": "+000000000"
    }
  }

EDIT: Oops, I confused ipinfo with ipstack. I'm actually using ipstack. Their security field also doesn't detect this IP as a proxy, which is why we only pay for Professional (no security field).

  {
    "ip": "2401:4900:1f38:7402:5569:2e45:3bb:9c0d",
    "type": "ipv6",
    "continent_code": "AS",
    "continent_name": "Asia",
    "country_code": "IN",
    "country_name": "India",
    "region_code": "MH",
    "region_name": "Maharashtra",
    "city": "Mumbai",
    "zip": "400203",
    "latitude": 19.076000213623047,
    "longitude": 72.87770080566406,
    "location": {
      "geoname_id": 1275339,
      "capital": "New Delhi",
      "languages": [
        {
          "code": "hi",
          "name": "Hindi",
          "native": "\u0939\u093f\u0928\u094d\u0926\u0940"
        },
        {
          "code": "en",
          "name": "English",
          "native": "English"
        }
      ],
      "country_flag": "https://assets.ipstack.com/flags/in.svg",
      "country_flag_emoji": "\ud83c\uddee\ud83c\uddf3",
      "country_flag_emoji_unicode": "U+1F1EE U+1F1F3",
      "calling_code": "91",
      "is_eu": false
    },
    "time_zone": {
      "id": "Asia/Kolkata",
      "current_time": "2023-09-15T12:27:08+05:30",
      "gmt_offset": 19800,
      "code": "IST",
      "is_daylight_saving": false
    },
    "currency": {
      "code": "INR",
      "name": "Indian Rupee",
      "plural": "Indian rupees",
      "symbol": "Rs",
      "symbol_native": "\u099f\u0995\u09be"
    },
    "connection": {
      "asn": 24560,
      "isp": "Bharti Airtel Ltd."
    },
    "security": {
      "is_proxy": false,
      "proxy_type": null,
      "is_crawler": false,
      "crawler_name": null,
      "crawler_type": null,
      "is_tor": false,
      "threat_level": "low",
      "threat_types": null
    }
  }

EwanToo2y ago

Have you considered making your database available for download as Parquet format so people could just copy the file to S3, Google Cloud, etc, and query it immediately with various tools?

I know it can be done with CSV but it's not as smooth.

reincoder2y ago

Thank you for the feature request.

But keep an eye out for our emails if we announce the parquet data downloads. I will talk with the folks about this.

BUT, there are some good news.

At least for the free database, we deliver the data directly to data warehouse platforms. Not even storage buckets. And we supply a good amount of documentation.

Kaggle: https://www.kaggle.com/code/ipinfo/ipinfo-ip-to-country-asn-...

Snowflake: https://app.snowflake.com/marketplace/listing/GZSTZSHKQ4QY/i...

TheClassic2y ago

Your comment is extremely interesting and what I was hoping to learn from the article (without an existing source of information, how do we determine the location of an IP address). Thank you!

reincoder2y ago

I really appreciate. Thank you. We are very transparent about our process. If you have any questions, you can always reach out to us.

We have a simplified explanation of our probe network here: https://ipinfo.io/blog/probe-network-how-we-make-sure-our-da...

The only update is the number of servers is like 600+ now. The probe network is growing extremely rapidly.

freedomben2y ago

Just wanted to let you know, it's this transparency that turned me into a customer!

3 more replies

detourdog2y ago

I just noticed that my wifes iphone uses the same mycingular ip address while driving accross 3 states over 5 hours.l while checking mail.

inemesitaffia2y ago

There's several options/techniques for doing it. But just imagine you have a permanent zero overhead VPN.

I don't know if that provider terminates long running calls, but the calls would stay up too regardless of tower.

detourdog2y ago

Yes, I’m sure it is iOS anti-tracking and directly related to why firewall apps inside SIP my not know what is going on.

Daviey2y ago

It feels like it couldn't be abused by 'freeloaders', because i'd guess their use-case is viewing other peoples.

reincoder2y ago

You can enter IP addresses on the right side to look up information here: https://ipinfo.io/what-is-my-ip

Additionally, we offer some enjoyable tools that you can use here: https://ipinfo.io/tools

The CLI tool is particularly entertaining.

You can also use our API service without signing up, with a limit of 1000 requests per day.

If you do choose to sign up for a free account, you will receive 50,000 requests per month, free IP databases, a bulk lookup feature, and more.

kam2y ago

This is literally the most prominent thing on the https://ipinfo.io home page.

Daviey2y ago

That's embarrassing for me... I thought that was a static image of an example. And I did look through the site looking for a search. Oops.

qingcharles2y ago

Huh, that's cool. It got my home IP about 15 miles from where I am, but still not bad.

Wait - how does this work for cell IPs? A lot of cellphone v4 IPs are now shared between hundreds or thousands of devices, right?

chankstein382y ago

That's pretty neat! You're basically using ping triangulation!

sib2y ago

Trilateration (same technique as used for mobile network location - in addition to the GPS on the phone)

chaps2y ago

Not gonna lie, this creeps the heck out of me.

reincoder2y ago

chaps2y ago

http://shouldiblockicmp.com/

giantrobot2y ago

You might want to unplug your router then. A conceit of being connected to a network is you're connected to the network. If you can see other nodes they can see you.

jiveturkey2y ago

I wonder if you meant `concession`.

Also, it's a false dichotomy. One can use VPN or proxies, to limit exposure or to encapsulate it. Of course, you can't get perfect location privacy.

fragmede2y ago

Your IP address is LEAKING!

goodpoint2y ago

Together with the tons of data leaked by browsers it makes it very easy to track people across places and devices.

voltagex_2y ago

Can your probes be identified and blocked?

reincoder2y ago

kube-system2y ago

    iptables -A INPUT -p icmp -j DROP

voltagex_2y ago

(But the guy running the probes is making a good counter argument)

j16sdiz2y ago

This breaks PMTU and is the source of many mystery download stalls

jiveturkey2y ago

This doesn't help. Even if you apply this at your router, you are locatable up to your ISP. Which is generally close enough.

Maybe if you delay pings by some amount (20ms? 100ms?), or randomize the delay, you can do a lot better at masking location.

eptyc12y ago

Indeed. Openwrt for some reason defaults to reply to pings. I see the value of ICMP for servers, but I don't see the value for home ISP routers.

I disabled ICMP reply on my home router.

chaps2y ago

This isn't helpful. The comment was specifically asking about the probes, not ICMP traffic.

https://tech.marksblogg.com/ipinfo-free-ip-address-location-...

theogravity2y ago

How does that work with edge servers that use anycast to assume the same IP across different regions?

SnorkelTan2y ago

Aren’t any cast addresses a specific subset of ips and thus knowable? Iirc, each autonomous system is allocated anycast ip space?

sgjohnson2y ago

No. Any prefix can be anycasted.

BonoboIO2y ago

Hi, cool idea with the geolocation via latency.

How does this happen?

reincoder2y ago

If you can give me some information about the IP address or the IP range, I can take a closer look.

I guess it is because of IPv4 trading or IP address shuffling.

As far as I know, Hertzer, like many hosting companies, is buying IPv4 addresses around the world. Here is an article on the IPv4 trades:

When a company buys an IP address block or relocates an IP block from one of its data centers to another, the location of those IP addresses changes.

tmaly2y ago

Are there any historical sources for geo ip info?

reincoder2y ago

We don't have any free data for that. We have historical data that we sell as part of our custom enterprise deals. Historical data requests are rare, though.

To give you some context of how IP metadata lookups work, you can check out this article

https://ipinfo.io/blog/ip-address-data-in-snowflake/

Even if you keep all your database in a binary format, the computational cost is still non-negligible.

yrro2y ago

hm, ipinfo.io tells me that I'm using a VPN even though I'm not...

reincoder2y ago

Our VPN recognition is behavior-based. So, there is probably a chance that the IP address you are using is showing some of those behavior patterns.

A behavior pattern could be that your IP address is being shuffled around random locations that go beyond the normal location shuffling of an ISP connection.

Also, if your IP range is listed in some public datasets that belong to a VPN service, we could recognize your IP as a VPN.

Please reach out to our support and let us know about this. Thanks

yrro2y ago

Will do thanks

louison112y ago

If you don't want to do this yourself, you can actually just get Cloudflare to do it for you for free using a simple Worker since all Cloudflare requests contain approximate IP location information.

You can also just send a request to my URL (Cloudflare Worker operated - so it should have global low latency): https://www.edenmaps.net/iplocation

Use it for small applications, I don't mind. Just don't start sending me 10M requests per day ;-)

Aachen2y ago

Or you download an IP database rather than sharing with a third party which IP address is likely connecting to your service with a third party

hotgeart2y ago

Located 100km from the Somali coast... I'm in Brussels, Belgium, thx for protecting my privacy :D

louison112y ago

The result is [lon, lat]. You’ve most likely copied it onto Google maps, which works with [lat, lon]. Believe it or not, the industry still hasn’t come up with a standard order.

mrklol2y ago

You could also just return a json with lat and lon as fields so nobody is getting confused. {lat:5, lon:7}

somedude8952y ago

Got me to within 1km, that's pretty crazy

helmchenlord2y ago

Same here. Scaring.

emadda2y ago

Does anyone know how accurate Cloudflare geolocation is (for workers requests)?

banana_giraffe2y ago

1 https://developers.cloudflare.com/support/network/configurin...

2 https://www.maxmind.com/en/geoip-demo

3 https://www.maxmind.com/en/geoip2-city-accuracy-comparison

matwood2y ago

Yeah, MaxMind is the best I have used with caveats. You need to update it frequently, and you need to allow for overrides.

reincoder2y ago

I work for IPinfo and we do ping based geolocation. The best thing you can do to verify geolocation accuracy is the following:

Then check where the geolocation provider is locating the IP address and what is the nearest server from there.

noizejoy2y ago

On one hand, I love that there’s some good alternatives in the geolocation space, but misleading geolocation precision can lead to very undesirable side effects[0].

[0] https://www.theguardian.com/technology/2016/aug/09/maxmind-m...

tiffanyh2y ago

This is excellent!

Would you mind open sourcing the code for that?

louison112y ago

This is the code running this endpoint:

  export function onRequest(context) {
    return new Response(JSON.stringify([parseFloat(context.request.cf.longitude), parseFloat(context.request.cf.latitude)]), {headers: {"Content-Type": "application/json;charset=UTF-8"}})
  }

This is a function on Cloudflare Pages (which is just a different name for Cloudflare Workers). Minor adjustment needed for Workers (get rid of "context", I believe)

carstenhag2y ago

I'm in Munich. Cloudflare tells a position that is 730km to the north in a random forest.

louison112y ago

You've inverted lat, lon.

carstenhag2y ago

No, the other one was in the ocean

junto2y ago

reincoder2y ago

I work for an IP geolocations service, even I hate this thing.

They call it "web experience personalization" in the industry, and it is annoying. I have never recommended anyone to do that. The best way to do website personalization through IP geolocation:

- Taxes and stuff (if applicable)

- Delivery costs (if applicable)

- Putting the user's country first in those country selection drop-down menu

junto2y ago

A fair number of popular internationalization frameworks also drive the idea that region and language are fixed pairs.

Good example is Amazon.es https://www.amazon.es/customer-preferences/edit?from=mobile&...

hedora2y ago

I live in the US, and IP geolocation points to the incorrect regions (plural) on all my devices.

Few technologies manage to make my day-to-day internet experience than these sorts of databases.

I wish they would just go away.

Websites could just ask me my zipcode on first load instead of guessing it wrong every single time and then burying the flow to fix it behind multiple links and page loads.

Also: There is no way to fix the database to produce the “correct” or “better” answer. I rarely want a website to use my current location.

Instead, I check inventory for stores in places where I will be. This whole space is trying to solve an ill-posed problem.

fer2y ago

I share your frustration so much that I wrote an article about it: https://www.fer.xyz/2021/04/i18n

jedberg2y ago

It all depends on what you want to use it for and how accurate it needs to be.

4-square had the right idea, they were just ahead of their time.

flounder32y ago

jedberg2y ago

You don't need precise lat/lon to make a good database. Even a 1km circle would be more than enough.

> but not ASN

Why wasn't ASN allowed? That's what Netflix used to make endpoint routing decisions and worked really well.

flounder32y ago

You’re not wrong, but privacy concerns were paramount.

ASNs were allowed but too vague. We needed more granularity. Corporate proxies, subdelegations, many providers aggregating announcements below /24, etc.

hddqsb2y ago

gniv2y ago

I suspect it's the other way around. Google just has a very good IP geolocation db, so it uses that when you browse, absent any other info.

hddqsb2y ago

Google certainly uses its geolocation DB, but it also learns based on map browsing patterns.

gniv2y ago

Thanks for clarifying. That is indeed surprising and you are probably right.

netsharc2y ago

Well it has reporting beacons all over the world with GPS receivers, in the form of Android phones, and perhaps Google Maps users on iPhone too..

is_true2y ago

That would explain why it sometimes it thinks I'm in a river I paddle often and other times where I have my summer house.

cvalka2y ago

They use this for Google Workspace and data localization (including law enforcement localization).

dboreham2y ago

dawnerd2y ago

Even the free maxmind db is accurate enough for most applications.

tiffanyh2y ago

Does Cloudflare have the same data as Maxmind?

Because Cloudflare and Maxmind geolocate me to the exact same longitude/latitude.

klaussilveira2y ago

CloudFlare uses Maxmind: https://developers.cloudflare.com/support/network/configurin...

klaussilveira2y ago

Since you are in adtech: do you buy MaxMind, or roll your own? Are there any providers for US-only data, and therefore, cheaper?