lsdb.io.to_hats
Functions
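perform_write(...) – Writes a pandas dataframe to a single parquet file and returns the total count for the partition as well as a count histogram at the specified order.
calculate_histogram(...) – Splits a partition into pixels at a specified order and computes the sparse histogram with the respective counts.
to_hats(...) – Writes a catalog to disk, in HATS format. The output catalog comprises partition parquet files and respective metadata, as well as JSON files detailing partition, catalog and provenance info.
write_partitions(...) – Saves catalog partitions as parquet to disk and computes the sparse count histogram for each partition.
create_modified_catalog_structure(...) – Creates a modified version of the HATS catalog structure.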
Module Contents
- perform_write(df: nested_pandas.NestedFrame, hp_pixel: hats.pixel_math.HealpixPixel, base_catalog_dir: str | pathlib.Path | upath.UPath, histogram_order: int, **kwargs) → tuple[int, hats.pixel_math.sparse_histogram.SparseHistogram]
Writes a pandas dataframe to a single parquet file and returns the total count for the partition as well as a count histogram at the specified order.
- Parameters:
df (npd.NestedFrame) – dataframe to write to file
hp_pixel (HealpixPixel) – HEALPix pixel of file to be written
base_catalog_dir (path-like) – Location of the base catalog directory to write to
histogram_order (int) – Order of the count histogram
**kwargs – other kwargs to pass to pq.write_table method
- Returns:
The total number of points in the partition and the sparse count histogram at the specified order.
- calculate_histogram(df: nested_pandas.NestedFrame, histogram_order: int) → hats.pixel_math.sparse_histogram.SparseHistogram
Splits a partition into pixels at a specified order and computes the sparse histogram with the respective counts.
- Parameters:
df (npd.NestedFrame) – Partition data frame
histogram_order (int) – Order of the count histogram
- Returns:
The sparse count histogram for the partition, at the specified order.
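The binning step can be pictured with a small standard-library sketch. It is not the library's implementation: it assumes each row carries a HEALPix nested-scheme index at order 29 (`SPATIAL_INDEX_ORDER` is an assumption, not stated on this page), and it uses a plain dict in place of SparseHistogram. The name `sketch_histogram` is hypothetical.

```python
from collections import Counter

# Assumption: rows are indexed by a nested-scheme HEALPix index at this order.
SPATIAL_INDEX_ORDER = 29


def sketch_histogram(spatial_indices, histogram_order):
    """Count rows per HEALPix pixel at `histogram_order`.

    Returns a dict {pixel_index: count} holding only non-empty pixels,
    standing in for a sparse histogram.
    """
    if not 0 <= histogram_order <= SPATIAL_INDEX_ORDER:
        raise ValueError("histogram_order must be between 0 and 29")
    # In the nested scheme each step down in order drops two bits,
    # because every pixel has four children.
    shift = 2 * (SPATIAL_INDEX_ORDER - histogram_order)
    return dict(Counter(index >> shift for index in spatial_indices))
```

Only pixels that actually contain rows appear in the result, which is why a sparse representation suits partitions that cover a small patch of sky.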
- to_hats(catalog: lsdb.catalog.dataset.healpix_dataset.HealpixDataset, *, base_catalog_path: str | pathlib.Path | upath.UPath, catalog_name: str | None = None, default_columns: list[str] | None = None, histogram_order: int = 8, overwrite: bool = False, **kwargs)
Writes a catalog to disk, in HATS format. The output catalog comprises partition parquet files and respective metadata, as well as JSON files detailing partition, catalog and provenance info.
- Parameters:
catalog (HealpixDataset) – A catalog to export
base_catalog_path (path-like) – Location where the catalog is saved
catalog_name (str, optional) – The name of the output catalog
default_columns (list[str], optional) – A metadata property listing the catalog columns to be loaded by default. Uses the default columns from the original HATS catalog if they exist.
histogram_order (int) – The order of the count histogram. Defaults to 8.
overwrite (bool) – If True, an existing catalog is overwritten. Defaults to False.
**kwargs – Arguments to pass to the parquet write operations
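A minimal usage sketch, based only on the signature documented above: every argument after `catalog` is keyword-only, so `base_catalog_path` must be passed by name. The wrapper `export_catalog` and its parameter names are hypothetical, and the import is deferred so the helper can be defined even where lsdb is not installed.

```python
from pathlib import Path


def export_catalog(catalog, out_dir, name, **parquet_kwargs):
    """Write `catalog` (an lsdb HealpixDataset) to `out_dir` in HATS format."""
    # Deferred import: calling this function requires lsdb to be installed.
    from lsdb.io.to_hats import to_hats

    to_hats(
        catalog,
        base_catalog_path=Path(out_dir),  # keyword-only in the signature
        catalog_name=name,
        histogram_order=8,  # the documented default
        overwrite=False,  # the documented default
        **parquet_kwargs,  # forwarded to the parquet write operations
    )
```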
- write_partitions(catalog: lsdb.catalog.dataset.healpix_dataset.HealpixDataset, base_catalog_dir_fp: str | pathlib.Path | upath.UPath, histogram_order: int, **kwargs) → tuple[list[hats.pixel_math.HealpixPixel], list[int], list[hats.pixel_math.sparse_histogram.SparseHistogram]]
Saves catalog partitions as parquet to disk and computes the sparse count histogram for each partition. The histogram is either of order 8 or the maximum pixel order in the catalog, whichever is greater.
- Parameters:
catalog (HealpixDataset) – A catalog to export
base_catalog_dir_fp (path-like) – Path to the base directory of the catalog
histogram_order (int) – The order of the count histogram to generate
**kwargs – Arguments to pass to the parquet write operations
- Returns:
A tuple of three lists: the non-empty pixels, the total count for each of those pixels, and the sparse count histogram for each of those pixels.
- create_modified_catalog_structure(catalog_structure: hats.catalog.healpix_dataset.healpix_dataset.HealpixDataset, catalog_base_dir: str | pathlib.Path | upath.UPath, catalog_name: str, **kwargs) → hats.catalog.healpix_dataset.healpix_dataset.HealpixDataset
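To show how the three parallel results might be consumed, here is a standard-library sketch. It makes two assumptions: each sparse histogram is modeled as a dict `{pixel_index: count}` rather than a SparseHistogram, and all histograms share the same order. Both function names are hypothetical. The first helper mirrors the order rule quoted in the description above (order 8 or the maximum pixel order, whichever is greater).

```python
from collections import Counter


def histogram_order_for(pixel_orders, minimum_order=8):
    """Return the greater of `minimum_order` and the deepest pixel order."""
    return max(minimum_order, max(pixel_orders))


def merge_partition_results(counts, histograms):
    """Combine per-partition counts and histograms into catalog-wide totals.

    `counts` is a list of row counts, one per partition; `histograms` is a
    list of dicts {pixel_index: count}, all at the same histogram order.
    """
    total_rows = sum(counts)
    merged = Counter()
    for histogram in histograms:
        merged.update(histogram)  # adds counts pixel by pixel
    return total_rows, dict(merged)
```

Because each row is counted once in its partition total and once in its partition histogram, the merged histogram values should sum to the total row count.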
Creates a modified version of the HATS catalog structure.
- Parameters:
catalog_structure (HealpixDataset) – HATS catalog structure
catalog_base_dir (path-like) – Base location for the catalog
catalog_name (str) – The name of the catalog to be saved
**kwargs – The remaining parameters to be updated in the catalog info object
- Returns:
A HATS structure, modified with the parameters provided.
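The "copy with updated fields" idea can be pictured as follows. This is an illustration under stated assumptions, not how the library stores catalog info: `CatalogInfoSketch` and `modified_copy` are hypothetical, and a frozen dataclass stands in for the catalog info object.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CatalogInfoSketch:
    """Hypothetical stand-in for a catalog info object."""

    catalog_name: str
    total_rows: int = 0


def modified_copy(info, catalog_name, **updates):
    """Return a new info object with the name and any other fields updated.

    The original object is left untouched.
    """
    return replace(info, catalog_name=catalog_name, **updates)
```

Returning a new object instead of mutating the input keeps the in-memory catalog unchanged while the written copy receives its new name and metadata.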