25 September 2024

WhosOnFirst GeoParquet Downloads

To celebrate the release of DuckDB v1.1.0 we’re now publishing regular updates of the WhosOnFirst Gazetteer in GeoParquet format.

DuckDB

DuckDB has been gaining a lot of popularity with map nerds, and it’s easy to see why. It’s super fast and instantly familiar to SQLite users. It comes bundled with geospatial extensions, allowing you to perform spatial joins, indexing, and other spatial operations directly in your terminal.

If you haven’t had the opportunity to try it out, this is your excuse! brew install duckdb yourself and give it a go. The Spatial Functions are instantly familiar to anyone who’s used PostGIS or SpatiaLite.

GeoParquet

GeoParquet has also been gaining popularity due to its columnar storage format, which allows for better compression and faster retrieval of large geospatial datasets.

Most interestingly, you can query the file directly without downloading the entire dataset first 😲.

WhosOnFirst

Who’s On First is a gazetteer of all the places in the world, from continents to neighborhoods, including their geometries and metadata, such as population counts.

It’s a project we’ve been involved with for almost a decade and is the source of all the administrative regions we use in our Geocoding APIs.

Without further ado, let’s see some examples…

Find a feature by ID and return the centroid:

SELECT name, placetype, ST_AsText(ST_Centroid(geometry)) as wkt
FROM read_parquet('https://data.geocode.earth/wof/dist/parquet/whosonfirst-data-admin-de-latest.parquet')
WHERE id='101909779';
┌─────────┬───────────┬───────────────────────────────────────────────┐
│  name   │ placetype │                      wkt                      │
│ varchar │  varchar  │                    varchar                    │
├─────────┼───────────┼───────────────────────────────────────────────┤
│ Berlin  │ locality  │ POINT (13.401858898704846 52.501526801979985) │
└─────────┴───────────┴───────────────────────────────────────────────┘

Return all geometries containing a point:

SELECT id, name, placetype
FROM read_parquet('https://data.geocode.earth/wof/dist/parquet/whosonfirst-data-admin-nz-latest.parquet')
WHERE (
  geometry_bbox.xmin <= 174.767595 AND
  geometry_bbox.xmax >= 174.767595 AND
  geometry_bbox.ymin <= -41.285379 AND
  geometry_bbox.ymax >= -41.285379
)
AND ST_ContainsProperly(geometry, ST_Point(174.767595, -41.285379));
┌────────────┬───────────────────┬────────────┐
│     id     │       name        │ placetype  │
│   int32    │      varchar      │  varchar   │
├────────────┼───────────────────┼────────────┤
│   85687233 │ Wellington Region │ region     │
│  102079339 │ Wellington City   │ county     │
│ 1729238583 │ Wellington        │ localadmin │
│ 1729339019 │ Kelburn           │ locality   │
│   85633345 │ New Zealand       │ country    │
└────────────┴───────────────────┴────────────┘

Head on over to our WhosOnFirst data downloads to see what’s available.

Image credit: Rubber Ducks with Sunglasses by David Dennis.