This guide covers how to use our command-line tool to geocode any CSV file.
The goal is to take values from an existing CSV file and append new geocoded columns which contain detailed geographical information about each row.
Your input CSV must contain a header row on the first line with column names.
With a modern version of node installed, simply execute:
# note: some systems (such as Ubuntu) may require you to use 'sudo'.
npm install -g @geocodeearth/ge
In order to authenticate, you’ll need a valid API key from Geocode Earth.
Use the environment variable GE_API_KEY to make the key available in your shell:
export GE_API_KEY=<YOUR API KEY>
You can check that it’s been set correctly with the env command.
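Note that env prints every exported variable, including the key itself. As an alternative, a minimal sketch of a check that confirms the key is exported without echoing its value might look like this. The name check_api_key is a hypothetical helper and is not part of the ge tool.

```shell
# Hypothetical helper (not part of ge): confirm GE_API_KEY is exported
# and non-empty, without printing the key itself.
check_api_key() {
  # 'env' lists only exported variables, which are the ones that child
  # processes such as ge are able to see.
  if env | grep '^GE_API_KEY=.' >/dev/null; then
    echo "GE_API_KEY is exported (${#GE_API_KEY} characters)"
  else
    echo "GE_API_KEY is not exported or is empty" >&2
    return 1
  fi
}
```

Running check_api_key after the export command above should report the length of the key rather than its contents.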
ge batch csv --help
ge batch csv <file>
append geocoded columns to a CSV file
Positionals:
file location of the input CSV file. [string] [required]
Options:
--version Show version number [boolean]
-v, --verbose enable verbose logging [boolean] [default: false]
--help Show help [boolean]
-p, --param Define a parameter. [string]
-t, --template Define a template. [string]
--endpoint API endpoint to query. [string] [default: "/v1/search"]
--concurrency Maximum queries per-second. [number] [default: 5]
--discovery Maximum concurrency will be applied based on your plan
limits. [boolean] [default: true]
<file> can be either a normal file or a stream; you can use /dev/stdin to accept data from a pipe.
Without configuration, ge batch csv <file> alone will not yield results.
You’ll first need to define a mapping from the field names in your CSV file to HTTP request parameters which will be sent to Geocode Earth.
This can be achieved using a pair of flags:
-p to name the parameter
-t to define a template for the parameter value
For example, the following will set the querystring parameter text to equal 1 Main Street, London.
* assuming your CSV file contains columns named number, street and city
ge batch csv \
-p 'text' \
-t '${row.number} ${row.street}, ${row.city}' \
example.csv
Templating is based on the lodash template engine; data from each row of your CSV file is available in the row variable.
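Because templates refer to columns by name, it can help to list the header names before writing one. The sketch below defines a hypothetical helper, csv_columns (not part of ge), which prints each column name found on the first line of a file. It assumes a simple comma-separated header with no quoted commas.

```shell
# Hypothetical helper (not part of ge): print one column name per line.
# Assumes the header line is plain comma-separated text without quoted commas.
csv_columns() {
  # Take the first line, drop any Windows carriage return,
  # then split the names onto separate lines.
  head -n 1 "$1" | tr -d '\r' | tr ',' '\n'
}
```

For a file whose header is number,street,city, calling csv_columns example.csv prints those three names, each of which can then be referenced in a template as a property of row.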
You can add multiple pairs of parameters, but take care to match each -p with a -t.
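Before committing to a long job, it may be worth trying a mapping on a handful of rows. The sketch below feeds a truncated copy of the file to ge through /dev/stdin with a single -p and -t pair. The name geocode_sample is a hypothetical helper, and the address column in the template is an assumed column name that you would replace with one from your own file.

```shell
# Hypothetical helper (not part of ge): geocode only the header plus the
# first N data rows of a CSV file.
# Usage: geocode_sample <file> [rows]   (rows defaults to 10)
geocode_sample() {
  file=$1
  rows=${2:-10}
  # Keep the header line plus the requested number of data rows,
  # then hand them to ge through a pipe via /dev/stdin.
  # Assumes one record per line (no line breaks inside quoted fields).
  # 'address' is an assumed column name; substitute your own.
  head -n "$(( rows + 1 ))" "$file" |
    ge batch csv \
      -p 'text' \
      -t '${row.address}' \
      /dev/stdin
}
```

For example, geocode_sample example.csv 25 would send only the first 25 data rows, which keeps the number of requests small while you adjust the template.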
Use single quotes ' instead of double quotes " on the command line to avoid your shell interpolating the string.
The Structured Geocoding endpoint is perfect for tasks where you have all request information already split into individual columns.
See the Structured Geocoding documentation for a list of which parameters are available.
ge batch csv \
--endpoint '/v1/search/structured' \
-p 'address' -t '${row.NUMBER} ${row.STREET}' \
-p 'locality' -t '${row.CITY}' \
-p 'country' -t 'US' \
example.csv
The Reverse Geocoding endpoint is suitable for tasks where you only have lat/lon co-ordinates and you’d like to discover what places are at/near that location.
See the Reverse Geocoding documentation for a list of which parameters are available.
ge batch csv \
--endpoint '/v1/reverse' \
-p 'point.lat' -t '${row.LAT}' \
-p 'point.lon' -t '${row.LON}' \
example.csv
The Search endpoint is for tasks where you’re not able to use the Structured Geocoding endpoint due to the data not being completely normalized.
You can set the text parameter to a single string which concatenates multiple fields; it will be parsed and interpreted before searching.
See the Search documentation for a list of which parameters are available.
ge batch csv \
--endpoint '/v1/search' \
-p 'text' -t '${row.NUMBER} ${row.STREET}, ${row.CITY}' \
-p 'boundary.country' -t 'NZ' \
example.csv
If your CSV file contains many rows, it can take some time to complete. You can estimate the running time in seconds by dividing the row count by your plan's queries-per-second (QPS) limit.
For example, a 10,000-row CSV file on the Basic plan (10 QPS) will take ~17 minutes.
10,000 rows / 10 QPS / 60 seconds per minute ≈ 16.67 minutes
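The same arithmetic can be scripted. The following is a rough sketch: estimate_minutes is a hypothetical helper (not part of ge) that counts the data rows in a file, meaning every line except the header, and divides by a QPS value you supply. It assumes one record per line and a trailing newline at the end of the file, so quoted fields containing line breaks would inflate the count.

```shell
# Hypothetical helper (not part of ge): estimate the running time in minutes.
# Usage: estimate_minutes <file> <qps>
estimate_minutes() {
  file=$1
  qps=$2
  # Count data rows: total lines minus the header line.
  rows=$(( $(wc -l < "$file") - 1 ))
  # rows / QPS gives seconds; dividing by 60 converts to minutes.
  awk -v rows="$rows" -v qps="$qps" \
    'BEGIN { printf "%.2f\n", rows / qps / 60 }'
}
```

For a file with 10,000 data rows, estimate_minutes example.csv 10 prints 16.67, matching the figure above.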
If you have a job with 2 million+ rows then we can handle the process for you. Contact us for a quote.
If you find a bug, or would like to request a feature, please open an issue on GitHub.