8 Export Delta Lake table to other formats with DuckDB
8.1 Setup
Let’s first create an example table with GEOMETRY columns:
%sql
create or replace table tmp_geometries as
select
st_point(0, 0, 4326) as geometry,
"Null Island" as name
union all
select
st_transform(st_point(155000, 463000, 28992), 4326) as geometry,
"Onze Lieve Vrouwetoren" as name
union all
select
st_makepolygon(
st_makeline(
array(
st_point(- 80.1935973, 25.7741566, 4326),
st_point(- 64.7563086, 32.3040273, 4326),
st_point(- 66.1166669, 18.4653003, 4326),
st_point(- 80.1935973, 25.7741566, 4326)
)
)
) as geometry,
"Bermuda Triangle" as name;
select
*
from
tmp_geometries
-- Returns:
-- _sqldf:pyspark.sql.connect.dataframe.DataFrame
-- geometry:geometry(OGC:CRS84)
-- name:string
-- geometry name
-- SRID=4326;POINT(0 0) Null Island
-- SRID=4326;POINT(5.3872035084137675 52.15517230119224) Onze Lieve Vrouwetoren
-- SRID=4326;POLYGON((-80.1935973 25.7741566,-64.7563086 32.3040273,-66.1166669 18.4653003,-80.1935973 25.7741566)) Bermuda Triangle
8.2 Parquet files
We’ll use DuckDB Spatial to write he Geoparquet file, so first, we output the above Delta Lake table as a directory of Parquet files, using lon/lat coordinates.
(You could also use Databricks Temporary Table Credentials API to directly read the Delta Lake table with the DuckDB Delta Extension instead.)
We will use the above parquet export as a stepping stone to produce other formats below.
8.3 Geoparquet
We can use duckdb to transform the Parquet files into a valid Geoparquet files:
(Note that if you didn’t load the DuckDB Spatial extension, the below would still succeed but Geoparquet metadata would not be written.)
There are more details around writing Geoparquet such as writing custom CRS’s or defining a “covering” using bounding boxes, but the above example is already a valid Geoparquet. For example, if your QGIS already supports the Parquet format (as of Aug 2025, the latest Windows version does but the latest macOS version doesn’t), then you can open this file in QGIS:
(in fact, the GDAL Parquet reader used by QGIS can even open parquet files that are not valid geoparquet, as long as they have a WKB or WKT column and the column name and CRS matches the expected defaults or correctly defined)
8.4 Flatgeobuf
Exporting to Flatgeobuf is very similar to the above. Flatgeobuf as a format has two key advantages here: - It is faster to render (e.g. in QGIS) than Geoparquet, and - It can act as input to tippecanoe
(see below), which we’ll use to produce PMTiles, which is even better suited for web mapping.
query = f"""
load spatial;
copy (
select
* replace (st_geomfromwkb({GEOMETRY_COLUMN}) as geometry)
from
read_parquet('/Volumes/{CATALOG}/{SCHEMA}/{VOLUME}/geometries.parquet/part-*.parquet')
) to '/Volumes/{CATALOG}/{SCHEMA}/{VOLUME}/geometries.fgb'
(FORMAT GDAL, DRIVER flatgeobuf, LAYER_CREATION_OPTIONS 'TEMPORARY_DIR=/tmp/')"""
duckdb.sql(query)
9 PMTiles
For PMTiles, while theoretically we could keep using DuckDB Spatial with GDAL, we’ll instead use tippecanoe.
import os
HOME = os.environ["HOME"]
# see https://github.com/felt/tippecanoe/blob/main/README.md#try-this-first and e.g.
# https://github.com/OvertureMaps/overture-tiles/blob/main/scripts/2024-07-22/places.sh
# for possible options
!{HOME}/.local/bin/tippecanoe -zg -rg -o /tmp/geometries.pmtiles --drop-densest-as-needed --extend-zooms-if-still-dropping --maximum-tile-bytes=2500000 --progress-interval=10 -l geometries --force /Volumes/{CATALOG}/{SCHEMA}/{VOLUME}/geometries.fgb
# NOTE: this mv will emit an error related to updating metadata ("mv: preserving
# permissions for ‘[...]’: Operation not permitted"), this can be ignored.
!mv /tmp/geometries.pmtiles /Volumes/{CATALOG}/{SCHEMA}/{VOLUME}/geometries.pmtiles
To visualize, download the PMTiles from Volumes, and upload it to https://pmtiles.io/ (see below screenshot). To directly visualize it via Databricks Apps via downloading, see TODO:.
TO be clear: the advantage the PMTiles format is to be able to visualize very large datasets such as all of OpenStreetMap – this notebook only uses a very simple example but see TODO: for a larger case.