Install DuckDB Extensions via HTTPS

Note

This issue should not occur on Databricks Free Edition, because it runs on Serverless and the network is set up by Databricks.

If your network blocks HTTP requests, you cannot install DuckDB extensions with a plain INSTALL spatial; statement. DuckDB needs httpfs to download over HTTPS, and httpfs is itself an extension that has to be downloaded first, so the install fails after about 10 minutes. See details here.

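The workaround is to download the httpfs extension separately over HTTPS, install it from the local file, and then point DuckDB at the HTTPS extension repository: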

%pip install duckdb --quiet

import os
import platform
from urllib.parse import urlparse

import duckdb
import requests

# Map the machine architecture to DuckDB's platform name (Linux only).
arch = platform.machine()
if arch == "x86_64":
    architecture = "linux_amd64"
elif arch == "aarch64":
    architecture = "linux_arm64"
else:
    raise RuntimeError(f"Unsupported architecture: {arch}")

# The extension build must match the installed DuckDB version.
duckdb_version = duckdb.__version__
url = f"https://extensions.duckdb.org/v{duckdb_version}/{architecture}/httpfs.duckdb_extension.gz"

# Download the gzipped extension into the current working directory.
output_file = os.path.basename(urlparse(url).path)
response = requests.get(url, timeout=30)
response.raise_for_status()
with open(output_file, "wb") as f:
    f.write(response.content)

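# Install httpfs from the local file, then remove the download.
duckdb.install_extension(output_file)

os.remove(output_file)

# Fetch all further extensions from the repository over HTTPS.
duckdb.sql("SET custom_extension_repository='https://extensions.duckdb.org'")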

You can now install other extensions, for example:

duckdb.sql("install spatial; load spatial")
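
The download URL used for httpfs follows a fixed pattern: repository, DuckDB version, platform, extension name. As a minimal sketch, that pattern can be pulled into a small helper, which makes the architecture mapping easy to check without touching the network. The names `PLATFORMS` and `extension_url` are hypothetical, the version `1.1.0` is only an illustrative value, and the mapping covers Linux only, as in the code above.

```python
# Hypothetical helper, not part of DuckDB: maps platform.machine() values
# to the Linux platform names used in the extension repository URL.
PLATFORMS = {"x86_64": "linux_amd64", "aarch64": "linux_arm64"}


def extension_url(name: str, version: str, machine: str) -> str:
    """Build the HTTPS download URL for a gzipped DuckDB extension."""
    try:
        platform_name = PLATFORMS[machine]
    except KeyError:
        raise ValueError(f"Unsupported architecture: {machine}") from None
    return (
        f"https://extensions.duckdb.org/v{version}/"
        f"{platform_name}/{name}.duckdb_extension.gz"
    )


# Illustrative call with a made-up version string.
print(extension_url("httpfs", "1.1.0", "x86_64"))
```

In a notebook you would pass `duckdb.__version__` and `platform.machine()` instead of the literal values. Keeping the function free of I/O means the URL can be inspected before any download is attempted.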