Storing spatial data in Delta Lake as WKB
You can store geometry or geography data in a Delta Lake table in a BINARY column as Well-known binary (WKB or EWKB). This is a more compact representation than well-known text (WKT), and widely supported incl. in the Geoparquet specification. On the other hand, unlike the newer GEOMETRY and GEOGRAPHY types, there is no higher level semantic support possible. Also, you need to use the conversion function st_geomfromwkb or st_geomfromwekb before any other ST function.
Another example of Delta Lake tables with WKB columns are the Overture Maps datasets prepared by CARTO, available via the Databricks Marketplace. Follow the previous link to add any them (at no cost) to your catalog, if you haven’t yet. For example, for the below query, use Divisions (borders of countries and other administrative divisions):
The CARTO/Overture Maps tables are stored in us-west-2
as of writing, so if you are not using Databricks Free Edition and you are in any other region, you will have to pay egress charges based on the amount of data you read.
%sql
select
names:primary as name,
geometry
from
carto_overture_maps_divisions.carto.division_area
where
subtype = 'country'
-- Returns:
-- name geometry
-- """Australia""" AQYAAAAjAAAAAQMAAAABAAAA (truncated)
-- """Timor-Leste""" AQYAAAAEAAAAAQMAAAABAAAA (truncated)
-- """Vanuatu""" AQYAAABlAAAAAQMAAAABAAAA (truncated)
-- ... ...
These CARTO tables also show one pattern to organize and cluster tables with geometries: they include the bounding box columns __carto_xmin
, __carto_xmax
, __carto_ymin
, ___carto_ymax
and are clustered by these colums.
Another pattern would be to make use of spatial indexing such as H3.
Example usage with ST functions
%sql
with countries as (
select
country,
st_geogfromwkb(geometry) geography
from
carto_overture_maps_divisions.carto.division_area
where
subtype = 'country'
and class = 'land'
and country in ('GB', 'FR')
)
select
country,
st_area(geography) / 1e6 area_km2s
from
countries
-- Returns:
-- country area_km2s
-- FR 549231.6644010496
-- GB 244408.1099778328