lsdb.loaders.dataframe.dataframe_catalog_loader
Classes
DataframeCatalogLoader | Creates a HATS-formatted Catalog from a pandas DataFrame
Module Contents
- class DataframeCatalogLoader(dataframe: pandas.DataFrame, *, ra_column: str = 'ra', dec_column: str = 'dec', lowest_order: int = 0, highest_order: int = 7, drop_empty_siblings: bool = False, partition_size: int | None = None, threshold: int | None = None, should_generate_moc: bool = True, moc_max_order: int = 10, use_pyarrow_types: bool = True, schema: pyarrow.Schema | None = None, **kwargs)
Creates a HATS-formatted Catalog from a pandas DataFrame.
- _calculate_threshold(partition_size: int | None = None, threshold: int | None = None) -> int
Calculates the maximum number of data points per HEALPix pixel (the threshold) for the desired partition size.
- Parameters:
partition_size (int) – The desired partition size, in number of rows
threshold (int) – The maximum number of data points per pixel
- Returns:
The HEALPix pixel threshold
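The relationship between `partition_size` and `threshold` described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `estimate_threshold`, the `num_rows` argument, the even-split rounding, and the error raised when both arguments are given are all assumptions.

```python
import math


def estimate_threshold(num_rows, partition_size=None, threshold=None):
    """Hypothetical helper: derive a per-pixel row limit (threshold).

    Assumption: only one of partition_size / threshold may be given.
    """
    if partition_size is not None and threshold is not None:
        raise ValueError("Specify only one: threshold or partition_size")
    if threshold is not None:
        # An explicit threshold is used as-is.
        return threshold
    if partition_size is None or partition_size <= 0:
        raise ValueError("This sketch needs a positive partition_size or a threshold")
    # Round the number of partitions up, then spread the rows evenly,
    # so that no partition exceeds the requested size.
    num_partitions = max(1, math.ceil(num_rows / partition_size))
    return num_rows // num_partitions
```

With 1000 rows and a requested partition size of 300 rows, this sketch yields 4 partitions and a threshold of 250 rows per pixel.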
- _create_catalog_info(catalog_name: str = 'from_lsdb_dataframe', ra_column: str = 'ra', dec_column: str = 'dec', catalog_type: hats.catalog.CatalogType = CatalogType.OBJECT, **kwargs) -> hats.catalog.TableProperties
Creates the catalog info object.
- Parameters:
catalog_name – name of the catalog; providing your own name instead of the default is recommended
ra_column – name of the column holding the right ascension coordinate
dec_column – name of the column holding the declination coordinate
catalog_type – type of table being created (e.g. OBJECT, SOURCE, MAP)
**kwargs – Arguments to pass to the creation of the catalog info
- Returns:
The catalog info object
- load_catalog() -> lsdb.catalog.catalog.Catalog
Loads a catalog from a pandas DataFrame.
- Returns:
Catalog object with data from the source given at loader initialization
- _set_spatial_index()
Generates the spatial indices for each data point and assigns the spatial index column as the Dataframe index.
- _compute_pixel_list() -> list[hats.pixel_math.HealpixPixel]
Computes the object histogram and generates the sorted list of HEALPix pixels. The pixels are sorted by ascending spatial index.
- Returns:
List of HEALPix pixels for the final partitioning.
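Sorting pixels of mixed orders by ascending spatial index can be sketched in plain Python. The sketch assumes the NESTED HEALPix numbering scheme and a spatial index defined at order 29; the `Pixel` tuple is a hypothetical stand-in for `hats.pixel_math.HealpixPixel`, and `first_index_at_max_order` and `sort_pixels` are illustrative names, not part of the library.

```python
from typing import NamedTuple

SPATIAL_INDEX_ORDER = 29  # assumption: order at which the spatial index is defined


class Pixel(NamedTuple):
    """Hypothetical stand-in for hats.pixel_math.HealpixPixel."""

    order: int
    pixel: int


def first_index_at_max_order(p, max_order=SPATIAL_INDEX_ORDER):
    # In the NESTED scheme every pixel splits into 4 children per order,
    # so its first descendant at max_order is pixel * 4**(max_order - order).
    return p.pixel << (2 * (max_order - p.order))


def sort_pixels(pixels):
    # Non-overlapping pixels of different orders become comparable once
    # each one is mapped to the start of its index range at max_order.
    return sorted(pixels, key=first_index_at_max_order)


pixels = [Pixel(1, 5), Pixel(0, 0), Pixel(2, 16), Pixel(1, 44)]
ordered = sort_pixels(pixels)
# ordered: (0, 0), (2, 16), (1, 5), (1, 44)
```

The order-2 pixel 16 sorts before the order-1 pixel 5 because it is a child of order-1 pixel 4, whose index range starts earlier.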
- _generate_dask_df_and_map(pixel_list: list[hats.pixel_math.HealpixPixel]) -> tuple[lsdb.nested.NestedFrame, lsdb.types.DaskDFPixelMap, int]
Builds the Dask DataFrame from the per-pixel Dataframes and generates a mapping of HEALPix pixels to their respective Dataframes.
- Parameters:
pixel_list (List[HealpixPixel]) – final partitioning of data
- Returns:
Tuple containing the Dask Dataframe, the mapping of HEALPix pixels to the respective Pandas Dataframes and the total number of rows.
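The shape of the returned tuple can be illustrated without Dask. In this sketch, plain lists stand in for the per-pixel Dataframes, and each pixel is mapped to the position of its partition, which is one way to identify the respective Dataframe. The helper `build_partitions` and the `rows_by_pixel` input are hypothetical and only mirror the structure described above.

```python
def build_partitions(rows_by_pixel, pixel_list):
    """Hypothetical helper mirroring the (data, pixel map, row count) result.

    rows_by_pixel: dict mapping an (order, pixel) tuple to a list of rows.
    pixel_list: (order, pixel) tuples in their final partitioning order.
    """
    partitions = []          # one entry per pixel, in pixel_list order
    pixel_to_partition = {}  # pixel -> position of its partition
    total_rows = 0
    for index, pixel in enumerate(pixel_list):
        rows = rows_by_pixel.get(pixel, [])
        partitions.append(rows)
        pixel_to_partition[pixel] = index
        total_rows += len(rows)
    return partitions, pixel_to_partition, total_rows
```

For two pixels holding two rows and one row respectively, the sketch returns two partitions, a map from each pixel to index 0 or 1, and a total of 3 rows.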