lsdb.loaders.dataframe.margin_catalog_generator
Classes
MarginCatalogGenerator | Creates a HATS-formatted margin catalog
Module Contents
- class MarginCatalogGenerator(catalog: lsdb.Catalog, margin_order: int = -1, margin_threshold: float | None = 5.0, use_pyarrow_types: bool = True, **kwargs)
Creates a HATS formatted margin catalog
- _resolve_margin_order()
Calculate the order of the margin cache to be generated. If no margin order is provided, it is derived from the threshold by choosing the smallest pixel that can still accommodate the threshold.
- Raises:
ValueError – if the margin order and thresholds are incompatible with the catalog.
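The idea of picking "the smallest pixel possible for the threshold" can be illustrated with a rough, standard-library-only sketch. It is not the lsdb implementation: the function names are hypothetical, the threshold is assumed to be in arcseconds, and the pixel size is approximated as the square root of the pixel area rather than a true minimum edge distance.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi
MAX_ORDER = 29  # highest order commonly supported for HEALPix with 64-bit indices


def approx_pixel_size_arcsec(order: int) -> float:
    """Approximate pixel size as sqrt(pixel area), in arcseconds."""
    npix = 12 * 4**order  # number of HEALPix pixels at this order
    return math.sqrt(4.0 * math.pi / npix) * ARCSEC_PER_RADIAN


def approx_margin_order(threshold_arcsec: float) -> int:
    """Hypothetical helper: highest order whose pixels are still >= threshold."""
    if threshold_arcsec <= 0:
        raise ValueError("threshold must be positive")
    if approx_pixel_size_arcsec(0) < threshold_arcsec:
        raise ValueError("threshold is larger than an order-0 pixel")
    order = 0
    # Pixels shrink by a factor of two per order; stop before they get too small.
    while order < MAX_ORDER and approx_pixel_size_arcsec(order + 1) >= threshold_arcsec:
        order += 1
    return order


if __name__ == "__main__":
    print(approx_margin_order(5.0))
```

Under this approximation, a larger threshold yields a lower margin order, because coarser pixels are needed to contain the margin.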
- create_catalog() -> lsdb.catalog.margin_catalog.MarginCatalog | None
Create a margin catalog for another pre-computed catalog.
Only one of margin order and margin threshold can be specified. When the margin order is not specified, the result depends on the threshold: if the threshold is zero, the margin is an empty catalog; if the threshold is None, no margin is generated and the method returns None.
- Returns:
Margin catalog object or None if the margin is not generated.
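The branching described above can be sketched as a small decision function. This is an illustration only: `margin_plan` and its string labels are hypothetical, and treating a negative `margin_order` as "not specified" is an assumption based on the default value of -1 in the class signature.

```python
from typing import Optional


def margin_plan(margin_order: int = -1, margin_threshold: Optional[float] = 5.0) -> str:
    """Return a label for what create_catalog() is described as producing.

    Hypothetical labels:
      "none"     -> no margin is generated (the result is None)
      "empty"    -> an empty margin catalog
      "generate" -> a non-empty margin catalog
    """
    if margin_order < 0:  # assumption: a negative order means "not specified"
        if margin_threshold is None:
            return "none"
        if margin_threshold == 0:
            return "empty"
    return "generate"


if __name__ == "__main__":
    print(margin_plan())                       # defaults
    print(margin_plan(margin_threshold=None))  # no margin
    print(margin_plan(margin_threshold=0))     # empty margin
```

The three labels correspond to returning None, calling `_create_empty_catalog()`, and calling `_create_catalog()`, respectively.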
- _create_catalog() -> lsdb.catalog.margin_catalog.MarginCatalog
Create a non-empty margin catalog
- _create_empty_catalog() -> lsdb.catalog.margin_catalog.MarginCatalog
Create an empty margin catalog
- _get_margins() -> tuple[list[hats.pixel_math.HealpixPixel], list[nested_pandas.NestedFrame]]
Generates the list of pixels that have margin data, and the dataframes with the margin data for each partition
- Returns:
A tuple of the list of HealpixPixels corresponding to partitions that have margin data, and a list of the dataframes with the margin data for each partition.
- _generate_dask_df_and_map(pixels: list[hats.pixel_math.HealpixPixel], partitions: list[pandas.DataFrame]) -> tuple[lsdb.nested.NestedFrame, dict[hats.pixel_math.HealpixPixel, int], int]
Create the Dask DataFrame containing the data points in the margins of the catalog, as well as the mapping of margin HEALPix pixels to the corresponding partition indices.
- Parameters:
pixels (List[HealpixPixel]) – The list of healpix pixels in the catalog with margins
partitions (List[pd.DataFrame]) – The list of dataframes containing the margin rows for each partition, aligned with the pixels list
- Returns:
Tuple containing the Dask Dataframe, the mapping of margin HEALPix to the respective partitions and the total number of rows.
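The pixel-to-partition mapping and row count in the return value can be sketched without Dask. This is a simplified illustration, not the library code: `build_pixel_map` is a hypothetical helper, `(order, pixel)` tuples stand in for `HealpixPixel`, and each partition only needs to support `len()` (as a pandas DataFrame does). Building the Dask DataFrame itself is omitted.

```python
from typing import Dict, Sequence, Sized, Tuple

# Stand-in for hats.pixel_math.HealpixPixel: (order, pixel)
Pixel = Tuple[int, int]


def build_pixel_map(
    pixels: Sequence[Pixel], partitions: Sequence[Sized]
) -> Tuple[Dict[Pixel, int], int]:
    """Map each margin pixel to its partition index and count all rows."""
    if len(pixels) != len(partitions):
        raise ValueError("pixels and partitions must be aligned")
    # The i-th pixel owns the i-th partition.
    pixel_to_index = {pixel: index for index, pixel in enumerate(pixels)}
    # len() of a partition is its number of rows.
    total_rows = sum(len(partition) for partition in partitions)
    return pixel_to_index, total_rows


if __name__ == "__main__":
    example_pixels = [(1, 4), (1, 7)]
    example_partitions = [[{"id": 1}], [{"id": 2}, {"id": 3}]]
    print(build_pixel_map(example_pixels, example_partitions))
```

Keeping the two input lists aligned by position is what makes the integer index a sufficient key into the partitions.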
- _find_margin_pixel_pairs(pixels: list[hats.pixel_math.HealpixPixel]) -> pandas.DataFrame
Calculate the pairs of catalog pixels and their margin pixels
- Parameters:
pixels (List[HealpixPixel]) – The list of HEALPix pixels to compute margin pixels for. These include the catalog pixels as well as the negative pixels.
- Returns:
A Pandas Dataframe with the many-to-many mapping between each catalog HEALPix and the respective margin pixels.
- _create_margins(margin_pairs_df: pandas.DataFrame) -> dict[hats.pixel_math.HealpixPixel, pandas.DataFrame]
Compute the margins for all the pixels in the catalog
- Parameters:
margin_pairs_df (pd.DataFrame) – A DataFrame containing all the combinations of catalog pixels and respective margin pixels
- Returns:
A dictionary mapping each margin pixel to the respective DataFrame.
- _create_catalog_info(catalog_name: str | None = None, **kwargs) -> hats.catalog.TableProperties
Create the margin catalog info object
- Parameters:
catalog_name (str) – Name of the primary catalog being created. The margin catalog takes a name of the form <catalog_name>_margin.
**kwargs – Arguments to pass to the creation of the catalog info.
- Returns:
The margin catalog info object.