lsdb.nested.datasets.generation

Functions

generate_data(n_base, n_layer[, npartitions, seed, ...])

Generates a toy dataset.

_generate_cone_search(→ tuple[numpy.ndarray, ...)

_generate_box_radec(ra_range, dec_range, n_base[, seed])

Generates a random set of RA and Dec values within a given range.

_generate_pixel_search(→ tuple[numpy.ndarray, ...)

generate_catalog(n_base, n_layer[, seed, ra_range, ...])

Generates a toy catalog.

Module Contents

generate_data(n_base, n_layer, npartitions=1, seed=None, ra_range=(0.0, 360.0), dec_range=(-90, 90), search_region=None)

Generates a toy dataset.

Docstring copied from nested-pandas.

Parameters:
  • n_base (int) – The number of rows to generate for the base layer

  • n_layer (int or dict) – The number of rows per n_base row to generate for a nested layer. Alternatively, a dictionary of (layer label, layer size) pairs may be specified to create multiple nested columns with custom sizing.

  • npartitions (int) – The number of partitions to split the data into.

  • seed (int) – A seed to use for random generation of data

  • ra_range (tuple) – A tuple of the min and max values for the ra column in degrees

  • dec_range (tuple) – A tuple of the min and max values for the dec column in degrees

  • search_region (AbstractSearch) – A search region to apply to the generated data. Currently supports the ConeSearch, BoxSearch, and PixelSearch regions. Note that if provided, this will override the ra_range and dec_range parameters.

Returns:

The constructed Dask NestedFrame.

Return type:

NestedFrame

Examples

>>> from lsdb.nested.datasets import generate_data
>>> nf = generate_data(10, 100)
>>> nf = generate_data(10, {"nested_a": 100, "nested_b": 200})

Constraining spatial ranges:

>>> nf = generate_data(10, 100, ra_range=(0., 10.), dec_range=(-5., 0.))

Using a search region:

>>> from lsdb.core.search import ConeSearch
>>> nf = generate_data(10, 100, search_region=ConeSearch(5, 5, 1))
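The sizing rule described for the n_layer parameter of generate_data can be illustrated with a small standalone sketch. It does not call lsdb; the helper name `expected_layer_rows` and the default label `"nested"` are assumptions made for illustration, and the real function may label an integer-sized layer differently.

```python
def expected_layer_rows(n_base, n_layer, default_label="nested"):
    """Return {layer_label: total_nested_rows} for a generate_data-style call."""
    # Assumption: an integer n_layer describes a single nested layer; the
    # label "nested" is a placeholder chosen for this sketch.
    if isinstance(n_layer, int):
        n_layer = {default_label: n_layer}
    # Every base row carries `size` nested rows in each layer.
    return {label: n_base * size for label, size in n_layer.items()}


print(expected_layer_rows(10, 100))
# {'nested': 1000}
print(expected_layer_rows(10, {"nested_a": 100, "nested_b": 200}))
# {'nested_a': 1000, 'nested_b': 2000}
```

Under these assumptions, the two calls mirror the first two doctest examples: 10 base rows with 100 nested rows each, and 10 base rows with two nested layers of 100 and 200 rows each.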

_generate_box_radec(ra_range, dec_range, n_base, seed=None)

Generates a random set of RA and Dec values within a given range.

Parameters:
  • ra_range (tuple) – A tuple of the min and max values for the ra column in degrees

  • dec_range (tuple) – A tuple of the min and max values for the dec column in degrees

  • n_base (int) – The number of rows to generate for the base layer

  • seed (int) – A seed to use for random generation of data

Returns:

An array of shape (n_base, 2) containing the generated RA and Dec values.

Return type:

np.ndarray
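As a rough illustration of what _generate_box_radec returns, the sketch below draws RA/Dec pairs inside a box using only the standard library. The function name `sketch_box_radec` is hypothetical, the result is a list of pairs rather than a NumPy array, and independent uniform sampling of each coordinate is an assumption; the actual sampling scheme is not stated in this page.

```python
import random


def sketch_box_radec(ra_range, dec_range, n_base, seed=None):
    """Draw n_base [ra, dec] pairs, in degrees, inside the given ranges."""
    rng = random.Random(seed)  # local generator, reproducible for a fixed seed
    # Assumption: each coordinate is sampled uniformly and independently.
    # This is uniform in the RA/Dec box, not uniform in area on the sphere.
    return [
        [rng.uniform(*ra_range), rng.uniform(*dec_range)]
        for _ in range(n_base)
    ]


points = sketch_box_radec((0.0, 10.0), (-5.0, 0.0), n_base=5, seed=42)
print(len(points), len(points[0]))  # number of rows, values per row
```

The output has one row per base-layer object and two values per row, matching the documented (n_base, 2) shape.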

generate_catalog(n_base, n_layer, seed=None, ra_range=(0.0, 360.0), dec_range=(-90, 90), search_region=None, **kwargs)

Generates a toy catalog.

Parameters:
  • n_base (int) – The number of rows to generate for the base layer

  • n_layer (int or dict) – The number of rows per n_base row to generate for a nested layer. Alternatively, a dictionary of (layer label, layer size) pairs may be specified to create multiple nested columns with custom sizing.

  • seed (int) – A seed to use for random generation of data

  • ra_range (tuple) – A tuple of the min and max values for the ra column in degrees

  • dec_range (tuple) – A tuple of the min and max values for the dec column in degrees

  • search_region (AbstractSearch) – A search region to apply to the generated data. Currently supports the ConeSearch and BoxSearch regions. Note that if provided, this will override the ra_range and dec_range parameters.

  • **kwargs – Additional keyword arguments to pass to lsdb.from_dataframe.

Returns:

The constructed LSDB Catalog.

Return type:

Catalog

Examples

>>> from lsdb.nested.datasets import generate_catalog
>>> gen_cat = generate_catalog(10, 100)
>>> gen_cat = generate_catalog(1000, 10, ra_range=(0., 10.), dec_range=(-5., 0.))

Constraining spatial ranges:

>>> gen_cat = generate_catalog(10, 100, ra_range=(0., 10.), dec_range=(-5., 0.))

Using a search region:

>>> from lsdb.core.search import ConeSearch  # doctest: +SKIP
>>> gen_cat = generate_catalog(10, 100, search_region=ConeSearch(5, 5, 1))
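When a cone-shaped search region is supplied, one way to sanity-check generated coordinates is to test them against the cone directly. The sketch below is independent of lsdb: the helper names are hypothetical, and it assumes the third ConeSearch argument is a radius in arcseconds, which should be confirmed against the ConeSearch documentation.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, via the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def in_cone(ra, dec, center_ra, center_dec, radius_arcsec):
    """True if (ra, dec) lies within radius_arcsec of the cone center."""
    # Assumption: the cone radius is expressed in arcseconds.
    sep_arcsec = angular_separation_deg(ra, dec, center_ra, center_dec) * 3600.0
    return sep_arcsec <= radius_arcsec


print(in_cone(5.0, 5.0002, 5.0, 5.0, 1.0))  # 0.72 arcsec from the center
print(in_cone(5.0, 5.001, 5.0, 5.0, 1.0))   # 3.6 arcsec from the center
```

With a 1 arcsecond radius, the first point falls inside the cone and the second falls outside it.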