6  Storing spatial data in Delta Lake as WKB

You can store geometry or geography data in a Delta Lake table in a BINARY column as Well-known binary (WKB or EWKB). This is a more compact representation than well-known text (WKT), and widely supported incl. in the Geoparquet specification. On the other hand, unlike the newer GEOMETRY and GEOGRAPHY types, there is no higher level semantic support possible such as automatic spatial indexing (as of Aug 2025, coming soon). Also, you need to use the conversion function st_geomfromwkb or st_geomfromwekb before any other ST function.

%sql
create temporary view t_ewkb as
select st_asewkb(st_point(3,0,4326)) wkb_geometry
%sql
select
  st_geomfromewkb(wkb_geometry),
  wkb_geometry
from
  t_ewkb

Another example of Delta Lake tables with WKB columns are the Overture Maps datasets prepared by CARTO, available via the Databricks Marketplace. Follow the previous link to add any them (at no cost) to your catalog, if you haven’t yet. For example, for the below query, use Divisions (borders of countries and other administrative divisions):

%sql
select
  names:primary as name,
  geometry
from
  carto_overture_maps_divisions.carto.division_area
where
  subtype = 'country'

-- Returns:
-- name geometry
-- """Australia"""  AQYAAAAjAAAAAQMAAAABAAAA (truncated)
-- """Timor-Leste"""    AQYAAAAEAAAAAQMAAAABAAAA (truncated)
-- """Vanuatu"""    AQYAAABlAAAAAQMAAAABAAAA (truncated)
-- ... ...

These CARTO tables also show one pattern to organize and cluster tables with geometries: they include the bounding box columns __carto_xmin, __carto_xmax, __carto_ymin, ___carto_ymax and are clustered by these colums.

Another pattern would be to make use of spatial indexing such as H3.

6.1 Example usage with ST functions

NOTE: as of 2025-08-03, ST functions are only available on DBR 17.1+ and thus not available on Databricks Free Edition, but this limitation might be lifted soon.

But what if you wanted to use

show srid

%sql
-- The distance between the UK and France
with countries as (
  select
    country,
    st_geomfromwkb(geometry) wkb_geometry
  from
    carto_overture_maps_divisions.carto.division_area
  where
    subtype = 'country'
    and class = 'land'
),
uk as (
  select
    wkb_geometry
  from
    countries
  where
    country = 'GB'
),
fr as (
  select
    wkb_geometry
  from
    countries
  where
    country = 'FR'
)
select
  st_distancespheroid(uk.wkb_geometry, fr.wkb_geometry) / 1e3 distance_km
from
  uk, fr

Now the question is, if the Strait of Dover is about 32 kms narrow, why did we get more above? For this, we’d need to find out where exactly the shortest line is between the two countries. There is no st_shortestline yet in Databricks ST functions as of 2025-08-03, but we can use a Spark UDF with DuckDB spatial to fill this gap, see here TODO: .