Importing a database

As described in the introduction vignette, you can download pre-built search indices for selected country extracts. If you need more control over the geocoding data, you can instead import from an existing Nominatim database or from a JSON dump. This vignette guides you through the setup and import of an external database.

Importing from Nominatim

Technically, Nominatim databases can only be reliably set up on Linux systems. Here, we use the mediagis/nominatim Docker image to set up Nominatim irrespective of the operating system. You can use the helper functions cmd_options() and run() to run a Nominatim Docker container. It is important to expose port 5432 on the host machine; otherwise, photon cannot connect to the database.

opts <- cmd_options(
  e = "PBF_URL=https://download.geofabrik.de/australia-oceania/samoa-latest.osm.pbf",
  e = "NOMINATIM_PASSWORD=mypassword",
  e = "FREEZE=true",
  p = "8080:8080",
  p = "5432:5432",
  name = "nominatim",
  "mediagis/nominatim:4.4",
  use_double_hyphens = TRUE
)

# Note: on Windows, make sure you have Docker Desktop running!
# The process class is provided by the processx package
nominatim <- processx::process$new("docker", c("run", opts))

# Wait until Nominatim is ready
ready <- FALSE
while (!ready) {
  Sys.sleep(5)
  logs <- run("docker", c("logs", "nominatim"))
  ready <- any(grepl("ready to accept requests", logs))
}

# Set the password of the nominatim database user inside the container
run(
  "docker",
  c(
    "exec", "--user", "postgres", "nominatim", "psql", "-d", "nominatim", "-c",
    "ALTER USER nominatim WITH ENCRYPTED PASSWORD 'mypassword'"
  )
)
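
The polling loop above waits indefinitely if the container never becomes ready, for example when the PBF download fails. A minimal sketch of a bounded wait, using only base R, could look like the following. Note that wait_until() and its arguments are hypothetical names and not part of photon or processx; the check function is assumed to wrap the docker logs call shown above.

```r
# Hypothetical helper (not part of photon): call `check()` repeatedly until it
# returns TRUE, or stop with an error once `timeout` seconds have elapsed.
wait_until <- function(check, timeout = 600, interval = 5) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(check())) {
      return(invisible(TRUE))
    }
    if (Sys.time() >= deadline) {
      stop("Timed out after ", timeout, " seconds.", call. = FALSE)
    }
    Sys.sleep(interval)
  }
}

# Usage with the log check from above (requires a running container):
# wait_until(function() {
#   logs <- run("docker", c("logs", "nominatim"))
#   any(grepl("ready to accept requests", logs))
# })
```

The default timeout of ten minutes is an arbitrary choice; larger extracts than Samoa will likely need considerably more time.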

To verify that the database accepts connections, you can connect to it from R.

library(RPostgres)
db <- dbConnect(Postgres(), password = "mypassword", user = "nominatim")
dbGetInfo(db)
#> $dbname
#> [1] "nominatim"
#> 
#> $host
#> [1] "localhost"
#> 
#> $port
#> [1] "5432"
#> 
#> $username
#> [1] "nominatim"
#> 
#> $protocol.version
#> [1] 3
#> 
#> $server.version
#> [1] 140013
#> 
#> $db.version
#> [1] 140013
#> 
#> $pid
#> [1] 604

dbDisconnect(db)
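
If the connection fails, it can help to distinguish a port that is not exposed from a wrong password. The following base R sketch only tests whether a TCP connection to the database port can be opened. The helper name port_is_open() is hypothetical, and a reachable port does not guarantee that PostgreSQL accepts logins.

```r
# Hypothetical helper: returns TRUE if a TCP connection to host:port can be
# opened within `timeout` seconds, and FALSE otherwise.
port_is_open <- function(host = "localhost", port = 5432L, timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(
        host, port = port, open = "r+", blocking = TRUE, timeout = timeout
      )
    ),
    error = function(e) NULL
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

# Check the port that was mapped with p = "5432:5432"
port_is_open("localhost", 5432L)
```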

Once the connection works, you can create a new photon instance and import the database using $import(). The database import creates the folder photon_data inside the given photon directory.

dir <- file.path(tempdir(), "photon")
photon <- new_photon(dir, overwrite = TRUE)
#> ℹ java version "22" 2024-03-19
#> ℹ Java(TM) SE Runtime Environment (build 22+36-2370)
#> ℹ Java HotSpot(TM) 64-Bit Server VM (build 22+36-2370, mixed mode, sharing)
#> ✔ Successfully downloaded photon 1.0.0. [8.2s]        
#> ℹ No search index downloaded! Download one or import from a Nominatim database.
#> • Version: 1.0.0

photon$import(host = "localhost", password = "mypassword")
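
Since the import is described as creating the folder photon_data inside the photon directory, a quick way to check that it produced output is to test for that folder. The following is a minimal sketch in base R; has_photon_data() is a hypothetical helper and not part of photon.

```r
# Hypothetical helper: check whether a photon directory contains the
# photon_data folder that the import is described to create.
has_photon_data <- function(photon_dir) {
  dir.exists(file.path(photon_dir, "photon_data"))
}

# Same photon directory as used above
dir <- file.path(tempdir(), "photon")
has_photon_data(dir)
```

This only confirms that the folder exists, not that the index inside it is complete.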

After the import has finished, you can start the photon instance.

photon$start()
#> 2024-10-24 23:26:46,360 [main] WARN  org.elasticsearch.node.Node - version [5.6.16-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production
#> ✔ Photon is now running. [11.1s]
geocode("Apia", limit = 3)
#> Simple feature collection with 3 features and 13 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -171.7631 ymin: -13.83613 xmax: -171.7512 ymax: -13.82611
#> Geodetic CRS:  WGS 84
#> # A tibble: 3 × 14
#>     idx osm_type     osm_id country osm_key city        street     countrycode osm_value name  state type  extent
#>   <int> <chr>         <int> <chr>   <chr>   <chr>       <chr>      <chr>       <chr>     <chr> <chr> <chr> <list>
#> 1     1 W        1322127938 Samoa   place   NA          NA         WS          city      Apia  Tuam… city  <dbl> 
#> 2     1 W         723300892 Samoa   landuse Matautu Tai NA         WS          harbour   Apia… Tuam… other <dbl> 
#> 3     1 W         666117780 Samoa   tourism Levili      Levili St… WS          attracti… Apia… Tuam… house <dbl> 
#> # ℹ 1 more variable: geometry <POINT [°]>

Importing from a JSON dump

Since photon 0.7.0, databases can be dumped to and imported from JSON files (so-called Nominatim Dump Files, see the docs). While pre-built databases are not available for every region through $download_data(), JSON dumps are. To download a JSON dump instead of a pre-built database, set json = TRUE.

photon$remove_data()
photon$download_data("Andorra", json = TRUE)

You can then import the downloaded dump by calling the $import() method with json = TRUE.

photon$import(json = TRUE)