to_lance

to_lance(catalog: HealpixDataset, *, base_catalog_path: str | Path | UPath, table_name: str = 'data', overwrite: bool = False, progress_bar: bool = True, optimize_dataset: bool = True) -> None

Writes a catalog to a Lance dataset.

All primary catalog partitions are written as a single flat Lance dataset. Every column in the catalog, including the HEALPix spatial index, is preserved. The margin catalog, if present, is not written to Lance. The resulting dataset can be opened with lancedb.connect(base_catalog_path).open_table(table_name), where table_name is "data" by default.

Parameters:
catalog : HealpixDataset

The catalog to export.

base_catalog_path : str | Path | UPath

Path where the Lance dataset will be written.

table_name : str, default "data"

Name of the table to create in the Lance database. This is the name used to open the table later with lancedb.

overwrite : bool, default False

If True, an existing dataset at base_catalog_path is overwritten. If False and a dataset already exists there, an error is raised.

progress_bar : bool, default True

If True, shows a progress bar while writing partitions.

optimize_dataset : bool, default True

If True, optimizes the Lance dataset after all partitions are written. This improves query performance but increases the total write time.

Raises:
ImportError

If the lancedb package is not installed.

ValueError

If a dataset already exists at base_catalog_path and overwrite=False.

RuntimeError

If the catalog is empty and no data is written.
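
The documented exceptions can be handled in one place. The following is a minimal sketch, not part of LSDB: `export_to_lance` and its return labels are hypothetical. It assumes the method form `catalog.to_lance(path, ...)` used in the Examples accepts the keyword options listed under Parameters, and that a ValueError signals an existing dataset.

```python
def export_to_lance(catalog, base_catalog_path, **options):
    """Hypothetical wrapper: call catalog.to_lance and report the outcome.

    Returns "ok" on success, otherwise a short label for the documented
    failure mode. `options` is forwarded unchanged (e.g. overwrite=True).
    """
    try:
        catalog.to_lance(base_catalog_path, **options)
    except ImportError:
        # lancedb is an optional dependency and is not installed
        return "missing-lancedb"
    except ValueError:
        # assumed here to mean: a dataset already exists and overwrite=False
        return "already-exists"
    except RuntimeError:
        # the catalog was empty, so nothing was written
        return "empty-catalog"
    return "ok"
```

A caller could retry with `overwrite=True` after seeing "already-exists", or skip the export entirely on "empty-catalog".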

Examples

Export a catalog:

>>> import lsdb
>>> catalog = lsdb.read_hats("path/to/small_sky")
>>> catalog.to_lance("/tmp/my_catalog")

Open the result with lancedb:

>>> import lancedb
>>> db = lancedb.connect("/tmp/my_catalog")
>>> tbl = db.open_table("data")
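
Once the table is open, a quick check of the export might look like the sketch below. `summarize_lance_table` is a hypothetical helper, not part of LSDB. It assumes lancedb is installed and relies on the lancedb table methods `count_rows` and `head` and the `schema` property. The path and table name must match the values passed to to_lance.

```python
def summarize_lance_table(base_catalog_path, table_name="data", n_preview=5):
    """Hypothetical helper: summarize a table written by to_lance."""
    import lancedb  # optional dependency, imported only when the helper runs

    db = lancedb.connect(base_catalog_path)
    tbl = db.open_table(table_name)
    return {
        # total number of rows across all exported partitions
        "num_rows": tbl.count_rows(),
        # column names taken from the table's Arrow schema
        "columns": tbl.schema.names,
        # first few rows as a pandas DataFrame
        "preview": tbl.head(n_preview).to_pandas(),
    }
```

For the export above, this would be called as `summarize_lance_table("/tmp/my_catalog")`.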