16 How to install ogr2ogr in a Databricks notebook (incl. Parquet support)
[!NOTE] This was tested on a Serverless notebook (environment version 2), as well as on a classic notebook with DBR 15.4 LTS. The setup described here is for a single node only.
ogr2ogr is a geospatial file conversion tool that is part of GDAL. For example, you can use it to read in a directory of GML (geo XML) files and write them out to GeoPackage (.gpkg), or even GeoParquet.
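As a rough illustration of that kind of conversion, here is a minimal sketch, assuming ogr2ogr is already installed and on the PATH. The two function names are hypothetical helpers, not part of GDAL; `-f` and `-append` are standard ogr2ogr options, and the `Parquet` output driver only exists in builds with Arrow/Parquet support.

```shell
# Hypothetical helper: merge every .gml file in a directory into one GeoPackage.
gml_dir_to_gpkg() {
  local src_dir="$1" out_gpkg="$2" f
  for f in "$src_dir"/*.gml; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    # -f GPKG selects the GeoPackage driver; -append adds features to the
    # existing output rather than recreating it.
    ogr2ogr -f GPKG -append "$out_gpkg" "$f"
  done
}

# Hypothetical helper: write one layer of a GeoPackage to a (Geo)Parquet file.
gpkg_layer_to_parquet() {
  local in_gpkg="$1" layer="$2" out_parquet="$3"
  ogr2ogr -f Parquet "$out_parquet" "$in_gpkg" "$layer"
}
```

For example, `gml_dir_to_gpkg ./gml combined.gpkg` followed by `gpkg_layer_to_parquet combined.gpkg my_layer my_layer.parquet`, where the directory, file, and layer names are placeholders.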
Mirroring the Colab version, we start with the apt install, which does not include Parquet support yet: as of June 2025, you cannot apt install libgdal-arrow-parquet.
%sh
# if http traffic is blocked, we need to use https for `apt` sources
sudo sed -i 's|http://|https://|g' /etc/apt/sources.list.d/ubuntu.sources
# following https://r-spatial.github.io/sf/#ubuntu
sudo apt -y update && sudo apt install -y gdal-bin libgdal-dev
The libgdal-arrow-parquet extension package that we need can be installed via conda-forge. So let's first install conda from conda-forge [1]:
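The original cell is not reproduced here, so the following is only a minimal sketch of what such a step could look like, assuming the Miniforge installer published in the conda-forge/miniforge GitHub repository. The function names and the default prefix `$HOME/miniforge3` are assumptions made for illustration.

```shell
# In a Databricks notebook, put %sh on the first line of the cell.

# Build the installer URL; the asset naming (Miniforge3-<OS>-<arch>.sh)
# follows the Miniforge installation instructions on GitHub.
miniforge_installer_url() {
  echo "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
}

# Hypothetical helper: download the installer and run it non-interactively.
install_miniforge() {
  local prefix="${1:-$HOME/miniforge3}"   # assumed install location
  local installer="/tmp/miniforge.sh"
  curl -fsSL -o "$installer" "$(miniforge_installer_url)"
  # -b: batch mode (no prompts); -p: install prefix
  bash "$installer" -b -p "$prefix"
}
```

To run it, add `install_miniforge` as the last line of the same `%sh` cell, since function definitions do not carry over between `%sh` cells.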
Now we can add arrow/parquet support:
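A minimal sketch of this step, assuming conda was installed under `$HOME/miniforge3` (an assumed path) and that the package is pulled from the conda-forge channel. The function name is hypothetical, and installing the `gdal` package alongside the plugin is an assumption made here so that the conda environment also contains a matching ogr2ogr.

```shell
# In a Databricks notebook, put %sh on the first line of the cell.

# Hypothetical helper: install the Arrow/Parquet plugin into the conda base env.
add_parquet_support() {
  local prefix="${1:-$HOME/miniforge3}"   # assumed conda install location
  # -y: do not prompt for confirmation; -c conda-forge: channel to install from
  "$prefix/bin/conda" install -y -c conda-forge gdal libgdal-arrow-parquet
}
```

As before, call `add_parquet_support` at the end of the same cell, optionally passing the prefix used when installing conda.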
And that’s it:
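One way to check the result is to list the drivers ogr2ogr knows about and look for Parquet. This is a minimal sketch: the function name is hypothetical, and the path to the ogr2ogr binary is an assumption that depends on where conda was installed. `--formats` is a general GDAL command-line option that prints the available drivers.

```shell
# In a Databricks notebook, put %sh on the first line of the cell.

# Hypothetical helper: succeed if the given ogr2ogr lists a Parquet driver.
has_parquet_driver() {
  local ogr2ogr_bin="${1:-ogr2ogr}"
  # grep -q suppresses output; -i makes the match case-insensitive
  "$ogr2ogr_bin" --formats | grep -qi "parquet"
}
```

For example, `has_parquet_driver "$HOME/miniforge3/bin/ogr2ogr" && echo "Parquet driver available"` at the end of the same cell.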
[1] The curl download link comes from conda-forge and their installation instructions on GitHub.