lsdb.dask.merge_map_catalog_data

Functions

perform_merge_map(catalog_partition, map_partition, ...)

Applies a function to a single pair of aligned partitions, one from the point-source catalog and one from the map catalog.

merge_map_catalog_data(...)

Applies a function to each pair of aligned partitions in the point-source catalog and the map catalog.

Module Contents

perform_merge_map(catalog_partition: nested_pandas.NestedFrame, map_partition: nested_pandas.NestedFrame, catalog_pixel: hats.pixel_math.HealpixPixel, map_pixel: hats.pixel_math.HealpixPixel, catalog_structure: hats.catalog.TableProperties, map_structure: hats.catalog.TableProperties, func: Callable[Ellipsis, nested_pandas.NestedFrame], *args, **kwargs)

Applies a function to a single pair of aligned partitions, one from the point-source catalog and one from the map catalog.

Parameters:
  • catalog_partition (npd.NestedFrame) – partition of the point-source catalog

  • map_partition (npd.NestedFrame) – partition of the continuous map catalog

  • catalog_pixel (HealpixPixel) – the HEALPix pixel of the catalog partition

  • map_pixel (HealpixPixel) – the HEALPix pixel of the map partition

  • catalog_structure (hc.TableProperties) – the table properties of the point-source catalog

  • map_structure (hc.TableProperties) – the table properties of the map catalog

  • func (Callable) – function to apply to the two partitions

Returns:

A dataframe with the result of calling func

merge_map_catalog_data(point_catalog: lsdb.catalog.Catalog, map_catalog: lsdb.catalog.MapCatalog, func: Callable[Ellipsis, nested_pandas.NestedFrame], *args, meta: nested_pandas.NestedFrame | None = None, **kwargs) → tuple[nested_dask.NestedFrame, lsdb.types.DaskDFPixelMap, hats.pixel_tree.PixelAlignment]

Applies a function to each pair of aligned partitions in the point-source catalog and the map catalog.

The pixels from each catalog are aligned via a PixelAlignment, and the dataframes of each aligned pair are passed to the function. The resulting catalog will have the same partitions as the point-source catalog.
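The alignment idea above can be illustrated with a toy model in plain Python. This is only a sketch, not LSDB's implementation: ToyPixel, contains, and toy_merge_map are hypothetical names, partitions are plain lists of dicts rather than NestedFrames, and it assumes every map pixel is at the same or a coarser order than the catalog pixel it overlaps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPixel:
    """Hypothetical stand-in for a HEALPix pixel in the nested scheme."""
    order: int
    pixel: int


def contains(coarse: ToyPixel, fine: ToyPixel) -> bool:
    """Return True if `coarse` is `fine` itself or one of its ancestors."""
    if coarse.order > fine.order:
        return False
    # In the nested scheme each additional order appends two bits to the index.
    return fine.pixel >> (2 * (fine.order - coarse.order)) == coarse.pixel


def toy_merge_map(point_parts, map_parts, func, *args, **kwargs):
    """Pair each point partition with the map partition covering it (toy version)."""
    result = {}
    for cat_pixel, cat_rows in point_parts.items():
        for map_pixel, map_rows in map_parts.items():
            if contains(map_pixel, cat_pixel):
                # The output is keyed by the point-catalog pixel, so the
                # result keeps the point catalog's partitioning.
                result[cat_pixel] = func(cat_rows, map_rows, cat_pixel, *args, **kwargs)
                break
    return result


def tag_with_map_value(cat_rows, map_rows, pixel, scale=1.0):
    """Example func: copy a (scaled) map value onto every catalog row."""
    value = map_rows[0]["value"] * scale
    return [{**row, "map_value": value} for row in cat_rows]


# Two order-1 point partitions that both fall inside one order-0 map pixel.
point_parts = {
    ToyPixel(1, 4): [{"id": 1}, {"id": 2}],
    ToyPixel(1, 5): [{"id": 3}],
}
map_parts = {ToyPixel(0, 1): [{"value": 0.5}]}

merged = toy_merge_map(point_parts, map_parts, tag_with_map_value, scale=2.0)
```

Here merged has exactly the keys of point_parts, mirroring the statement that the result keeps the point-source catalog's partitions.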

Parameters:
  • point_catalog (lsdb.Catalog) – the point-source catalog whose partitions are passed to func

  • map_catalog (lsdb.MapCatalog) – the continuous map catalog whose partitions are passed to func

  • func (Callable) – The function applied to each pair of aligned partitions. It is called as func(catalog_partition: npd.NestedFrame, map_partition: npd.NestedFrame, healpix_pixel: HealpixPixel, *args, **kwargs), where the additional args and kwargs are those passed to merge_map_catalog_data.

  • *args – Additional positional arguments to call func with.

  • meta (pd.DataFrame | pd.Series | Dict | Iterable | Tuple | None) – An empty pandas DataFrame whose columns match the output of the function applied to a catalog partition. Other types are also accepted to describe the output dataframe format; for full details, see the Dask documentation: https://blog.dask.org/2022/08/09/understanding-meta-keyword-argument. If meta is None (the default), LSDB tries to infer the output schema by calling the function with an empty DataFrame. If the function does not work with an empty DataFrame, this raises an error and meta must be set explicitly. Note that some operations in LSDB generate empty partitions; these can be removed by calling the Catalog.prune_empty_partitions method.

  • **kwargs – Additional keyword arguments to pass to the function. These are passed through the Dask DataFrame dask.dataframe.map_partitions function, so any of that function’s own keyword arguments, such as transform_divisions, are also accepted and behave as described in the Dask documentation: https://docs.dask.org/en/stable/generated/dask.dataframe.DataFrame.map_partitions.html

Returns:

A tuple of the dask dataframe with the result of the operation, the pixel map from HEALPix pixel to partition index within the dataframe, and the PixelAlignment of the two input catalogs.
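As a rough illustration of the func and meta parameters, the sketch below defines a function with the documented call shape and an explicit empty meta frame. The names add_map_mean, the column names, and the sample values are hypothetical, and plain pandas DataFrames stand in for NestedFrame partitions so the snippet runs without loading any catalog.

```python
import pandas as pd


def add_map_mean(catalog_partition, map_partition, healpix_pixel, *, column="value"):
    """Hypothetical func: attach the mean map value to every catalog row."""
    result = catalog_partition.copy()
    # Guard against empty map partitions, which LSDB operations can produce.
    if len(map_partition):
        result["map_mean"] = map_partition[column].mean()
    else:
        result["map_mean"] = float("nan")
    return result


# An explicit meta: an empty frame with the columns and dtypes of the output.
meta = pd.DataFrame(
    {
        "ra": pd.Series(dtype="float64"),
        "dec": pd.Series(dtype="float64"),
        "map_mean": pd.Series(dtype="float64"),
    }
)

# Stand-in partitions (assumed column names, made-up values).
catalog_partition = pd.DataFrame({"ra": [10.0, 10.5], "dec": [-5.0, -5.2]})
map_partition = pd.DataFrame({"value": [0.2, 0.4]})

# healpix_pixel is unused by this func, so None serves as a placeholder here.
out = add_map_mean(catalog_partition, map_partition, None)

# The function also tolerates empty inputs, which is what schema inference
# relies on when meta is None.
empty_out = add_map_mean(meta[["ra", "dec"]], map_partition.iloc[:0], None)

# With real catalogs loaded, the call would look roughly like:
# merge_map_catalog_data(point_catalog, map_catalog, add_map_mean, meta=meta, column="value")
```

Passing meta explicitly avoids the empty-DataFrame probe altogether, which helps when func cannot handle an empty input.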