Region Selection#
TODO - Help wanted
In this tutorial, we will demonstrate how to:
Set up a Dask client and load an object catalog
Select data from regions in the sky
cone
radec box
polygon
Introduction#
Large astronomical surveys contain a massive volume of data. Billion-object, multi-terabyte catalogs are challenging to store and manipulate because they demand state-of-the-art hardware. Processing them is expensive in both runtime and memory consumption, and doing so on a single machine has become impractical. LSDB is a solution that enables scalable algorithm execution. It handles loading, querying, filtering, and crossmatching astronomical data (in the HATS format) in a distributed environment.
[1]:
import lsdb
1. Load a catalog#
We create a basic Dask client and load an existing HATS catalog, the ZTF DR22 catalog.
Additional Help
For additional information on Dask client creation, please refer to the official Dask documentation and our Dask cluster configuration page for LSDB-specific tips. Dask also publishes its own best practices, which may be useful to consult.
For tips on accessing remote data, see our Accessing remote data tutorial.
[2]:
from dask.distributed import Client
# Start a local cluster with 4 workers; "auto" lets Dask pick a per-worker memory limit
client = Client(n_workers=4, memory_limit="auto")
client
[2]:
Client
Client-228f3ff8-2b78-11f0-8cd8-42cb0b321d21
Connection method: Cluster object | Cluster type: distributed.LocalCluster |
Dashboard: http://127.0.0.1:8787/status |
Cluster Info
LocalCluster
49724d8e
Dashboard: http://127.0.0.1:8787/status | Workers: 4 |
Total threads: 4 | Total memory: 13.09 GiB |
Status: running | Using processes: True |
Scheduler Info
Scheduler
Scheduler-7dd4defc-dbdc-42d2-b3e9-585fdd1b7aed
Comm: tcp://127.0.0.1:42337 | Workers: 4 |
Dashboard: http://127.0.0.1:8787/status | Total threads: 4 |
Started: Just now | Total memory: 13.09 GiB |
Workers
Worker: 0
Comm: tcp://127.0.0.1:45935 | Total threads: 1 |
Dashboard: http://127.0.0.1:43507/status | Memory: 3.27 GiB |
Nanny: tcp://127.0.0.1:34565 | |
Local directory: /tmp/dask-scratch-space/worker-y437nfxx |
Worker: 1
Comm: tcp://127.0.0.1:41271 | Total threads: 1 |
Dashboard: http://127.0.0.1:45311/status | Memory: 3.27 GiB |
Nanny: tcp://127.0.0.1:43069 | |
Local directory: /tmp/dask-scratch-space/worker-5xv6h3xq |
Worker: 2
Comm: tcp://127.0.0.1:40281 | Total threads: 1 |
Dashboard: http://127.0.0.1:35607/status | Memory: 3.27 GiB |
Nanny: tcp://127.0.0.1:44551 | |
Local directory: /tmp/dask-scratch-space/worker-2vxruo5l |
Worker: 3
Comm: tcp://127.0.0.1:45839 | Total threads: 1 |
Dashboard: http://127.0.0.1:45985/status | Memory: 3.27 GiB |
Nanny: tcp://127.0.0.1:44955 | |
Local directory: /tmp/dask-scratch-space/worker-flgk2z2n |
[3]:
ztf_object_path = "https://data.lsdb.io/hats/ztf_dr22/ztf_lc"
ztf_object = lsdb.read_hats(ztf_object_path)
ztf_object
[3]:
objectid | filterid | fieldid | rcid | objra | objdec | nepochs | hmjd | mag | magerr | clrcoeff | catflags | Norder | Dir | Npix | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=10839 | |||||||||||||||
Order: 4, Pixel: 0 | int64[pyarrow] | int8[pyarrow] | int16[pyarrow] | int8[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | list<element: double>[pyarrow] | list<element: float>[pyarrow] | list<element: float>[pyarrow] | list<element: float>[pyarrow] | list<element: int32>[pyarrow] | uint8[pyarrow] | uint64[pyarrow] | uint64[pyarrow] |
Order: 4, Pixel: 1 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 5, Pixel: 12286 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 5, Pixel: 12287 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2. Selecting a region of the sky#
There are 3 common types of spatial filters to select a portion of the sky: cone, polygon and box.
Filtering consists of two main steps:
A coarse stage, in which we find what pixels cover our desired region in the sky. These may overlap with the region and only be partially contained within the region boundaries. This means that some data points inside that pixel may fall outside of the region.
A fine stage, where we filter the data points from each pixel to make sure they fall within the specified region.
The fine parameter lets us specify, for each search, whether to run the fine stage. The fine stage adds some overhead, so if you only need a rough estimate of the data points in a region, you may disable it. It is enabled by default.
catalog.box_search(..., fine=False)
catalog.cone_search(..., fine=False)
catalog.polygon_search(..., fine=False)
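The two stages can be illustrated with a small, self-contained sketch. It is not LSDB's implementation: the square 10-degree tiles, the helper names (`tile_of`, `coarse_tiles`, `toy_box_search`) and the sample points are all assumptions made for illustration, and real HATS catalogs are partitioned into HEALPix pixels rather than RA/Dec tiles.

```python
from collections import defaultdict

TILE_DEG = 10  # toy tile size (assumption); HATS uses HEALPix pixels instead


def tile_of(ra, dec):
    """Return the (column, row) index of the toy tile containing a point."""
    return (int(ra // TILE_DEG), int((dec + 90) // TILE_DEG))


def coarse_tiles(ra_range, dec_range):
    """Return every tile that overlaps the box, even if only partially."""
    cols = range(int(ra_range[0] // TILE_DEG), int(ra_range[1] // TILE_DEG) + 1)
    rows = range(
        int((dec_range[0] + 90) // TILE_DEG),
        int((dec_range[1] + 90) // TILE_DEG) + 1,
    )
    return {(col, row) for col in cols for row in rows}


def toy_box_search(points, ra_range, dec_range, fine=True):
    """Hypothetical two-stage search over a list of (ra, dec) points."""
    # Group the points by tile, mimicking a partitioned catalog.
    partitions = defaultdict(list)
    for ra, dec in points:
        partitions[tile_of(ra, dec)].append((ra, dec))

    # Coarse stage: keep whole partitions whose tile overlaps the region.
    candidates = [
        point
        for tile in sorted(coarse_tiles(ra_range, dec_range))
        for point in partitions.get(tile, [])
    ]
    if not fine:
        return candidates

    # Fine stage: test each candidate point against the exact boundaries.
    return [
        (ra, dec)
        for ra, dec in candidates
        if ra_range[0] <= ra <= ra_range[1] and dec_range[0] <= dec <= dec_range[1]
    ]


points = [(296.0, 13.0), (299.5, 14.9), (291.0, 18.0), (298.0, 11.0), (120.0, -30.0)]
coarse = toy_box_search(points, (295, 300), (12, 15), fine=False)
exact = toy_box_search(points, (295, 300), (12, 15), fine=True)
print(len(coarse), len(exact))
```

In this toy data the coarse result keeps two points that share a tile with the box but lie outside it, which is the kind of over-selection that `fine=False` accepts in exchange for skipping the per-point check.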
Throughout this notebook, we will use the Catalog’s plot_pixels method to display the HEALPix pixels of each resulting catalog as filters are applied.
[4]:
ztf_object.plot_pixels(plot_title="ZTF_DR22 - pixel map")
[4]:
(<Figure size 1000x500 with 2 Axes>,
<WCSAxes: title={'center': 'ZTF_DR22 - pixel map'}>)

3. Cone search#
A cone search is defined by a center (ra, dec), in degrees, and a radius r, in arcseconds.
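As a rough illustration of what the per-point cone test involves, the sketch below checks whether a point lies within a given angular radius of a center, using the haversine formula. The function names are hypothetical and this is not LSDB's code. It only shows the unit handling (arcseconds to degrees) and that a negative right ascension such as -60.3 describes the same direction as 299.7.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, computed with the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def in_cone(ra, dec, center_ra, center_dec, radius_arcsec):
    """True if (ra, dec) lies within radius_arcsec of the cone center."""
    radius_deg = radius_arcsec / 3600.0  # 3600 arcseconds per degree
    return angular_separation_deg(ra, dec, center_ra, center_dec) <= radius_deg


radius_arcsec = 5 * 3600  # a 5-degree radius

# A point 3.5 degrees north of the center, and one 5.5 degrees north of it.
print(in_cone(-60.3, 24.0, -60.3, 20.5, radius_arcsec))
print(in_cone(-60.3, 26.0, -60.3, 20.5, radius_arcsec))

# The same direction as the center, written with a positive right ascension.
print(in_cone(299.7, 20.5, -60.3, 20.5, radius_arcsec))
```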
[5]:
# 5 * 3600 arcseconds is a 5-degree radius
ztf_object_cone = ztf_object.cone_search(ra=-60.3, dec=20.5, radius_arcsec=5 * 3600)
ztf_object_cone
[5]:
objectid | filterid | fieldid | rcid | objra | objdec | nepochs | hmjd | mag | magerr | clrcoeff | catflags | Norder | Dir | Npix | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=142 | |||||||||||||||
Order: 6, Pixel: 12843 | int64[pyarrow] | int8[pyarrow] | int16[pyarrow] | int8[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | list<element: double>[pyarrow] | list<element: float>[pyarrow] | list<element: float>[pyarrow] | list<element: float>[pyarrow] | list<element: int32>[pyarrow] | uint8[pyarrow] | uint64[pyarrow] | uint64[pyarrow] |
Order: 6, Pixel: 12844 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 6, Pixel: 14400 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 6, Pixel: 14401 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
[6]:
ztf_object_cone.plot_pixels(plot_title="ZTF_DR22 - cone pixel map")
[6]:
(<Figure size 1000x500 with 2 Axes>,
<WCSAxes: title={'center': 'ZTF_DR22 - cone pixel map'}>)

4. The Search object#
There are two ways to perform a search on a catalog: a shape-specific call, or passing a search object to the search() method. The cone search above uses the shape-specific call.
Using a search object can be useful if you intend to reuse the shape for filtering multiple catalogs. We also provide some basic plotting for cone and box searches. The 5-degree cone search is outlined in red in the plot below.
[7]:
from lsdb.core.search import ConeSearch
cone_search = ConeSearch(ra=-60.3, dec=20.5, radius_arcsec=5 * 3600)
[8]:
ztf_object.plot_pixels(plot_title="ZTF_DR22 - pixel map")
cone_search.plot(fc="#00000000", ec="red")
[8]:
(<Figure size 1000x500 with 2 Axes>, <WCSAxes: >)
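The benefit of a reusable search object can be sketched without LSDB at all. In the example below, `ToyConeSearch`, its `contains` and `apply` methods, and the two small row lists are hypothetical stand-ins; the point is only that one immutable object carries the shape parameters and can be applied to any number of datasets.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyConeSearch:
    """Hypothetical stand-in for a search object: holds the cone parameters."""

    ra: float
    dec: float
    radius_arcsec: float

    def contains(self, ra, dec):
        # separation <= radius is equivalent to cos(separation) >= cos(radius)
        c_ra, c_dec = math.radians(self.ra), math.radians(self.dec)
        p_ra, p_dec = math.radians(ra), math.radians(dec)
        cos_sep = math.sin(c_dec) * math.sin(p_dec) + math.cos(c_dec) * math.cos(
            p_dec
        ) * math.cos(p_ra - c_ra)
        return cos_sep >= math.cos(math.radians(self.radius_arcsec / 3600.0))

    def apply(self, rows):
        """Return the rows (dicts with 'ra' and 'dec' keys) inside the cone."""
        return [row for row in rows if self.contains(row["ra"], row["dec"])]


cone = ToyConeSearch(ra=-60.3, dec=20.5, radius_arcsec=5 * 3600)

# Two unrelated toy datasets filtered with the same search object.
objects = [{"id": 1, "ra": -61.0, "dec": 21.0}, {"id": 2, "ra": -40.0, "dec": 20.5}]
sources = [{"id": "a", "ra": -60.0, "dec": 18.0}, {"id": "b", "ra": -60.3, "dec": 30.0}]

objects_in_cone = cone.apply(objects)
sources_in_cone = cone.apply(sources)
print([row["id"] for row in objects_in_cone], [row["id"] for row in sources_in_cone])
```

Making the dataclass frozen is a design choice for this sketch: the shape cannot change between the first and second use, so both datasets are guaranteed to be filtered by the same region.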

5. Polygon search#
A polygon search is defined by a convex polygon with vertices [(ra1, dec1), (ra2, dec2)...], in degrees.
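To make the convexity requirement concrete, here is a minimal sketch of a point-in-convex-polygon test on the sphere. It assumes the edges are great-circle arcs and the polygon is smaller than a hemisphere. The helper names are hypothetical and the approach (signed triple products of unit vectors) is a textbook technique, not necessarily what LSDB does internally.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Convert (ra, dec) in degrees to a 3D unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def in_convex_polygon(vertices, ra, dec):
    """True if (ra, dec) is strictly inside the convex polygon `vertices`."""
    vecs = [unit_vector(v_ra, v_dec) for v_ra, v_dec in vertices]
    n = len(vecs)
    # Normal of the great-circle plane through each edge.
    normals = [cross(vecs[i], vecs[(i + 1) % n]) for i in range(n)]
    # In a convex polygon, every vertex turn has the same handedness.
    turns = [dot(normals[i], vecs[(i + 2) % n]) for i in range(n)]
    if not (all(t > 0 for t in turns) or all(t < 0 for t in turns)):
        raise ValueError("vertices do not describe a convex polygon")
    orientation = 1.0 if turns[0] > 0 else -1.0
    # The point must lie on the interior side of every edge.
    point = unit_vector(ra, dec)
    return all(orientation * dot(normal, point) > 0 for normal in normals)


vertices = [(-60.5, 15.1), (-62.5, 18.5), (-65.2, 15.3), (-64.2, 12.1)]
print(in_convex_polygon(vertices, -63.0, 15.3))  # a point near the middle
print(in_convex_polygon(vertices, -60.3, 20.5))  # a point north of the polygon
```

With a non-convex vertex list the "same side of every edge" rule no longer describes the interior, which is why the sketch rejects such input instead of returning a misleading answer.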
[9]:
vertices = [(-60.5, 15.1), (-62.5, 18.5), (-65.2, 15.3), (-64.2, 12.1)]
ztf_object_polygon = ztf_object.polygon_search(vertices)
ztf_object_polygon
[9]:
objectid | filterid | fieldid | rcid | objra | objdec | nepochs | hmjd | mag | magerr | clrcoeff | catflags | Norder | Dir | Npix | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=35 | |||||||||||||||
Order: 6, Pixel: 12842 | int64[pyarrow] | int8[pyarrow] | int16[pyarrow] | int8[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | list<element: double>[pyarrow] | list<element: float>[pyarrow] | list<element: float>[pyarrow] | list<element: float>[pyarrow] | list<element: int32>[pyarrow] | uint8[pyarrow] | uint64[pyarrow] | uint64[pyarrow] |
Order: 6, Pixel: 12843 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 7, Pixel: 122748 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 7, Pixel: 122749 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
[10]:
ztf_object_polygon.plot_pixels(plot_title="ZTF_DR22 - polygon pixel map")
[10]:
(<Figure size 1000x500 with 2 Axes>,
<WCSAxes: title={'center': 'ZTF_DR22 - polygon pixel map'}>)

6. Box search#
A box search can be defined by right ascension and declination bands [(ra1, ra2), (dec1, dec2)].
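The main subtlety of a box test is that right ascension is periodic. The sketch below shows one way to handle that, assuming right ascension values are interpreted modulo 360 degrees and that the band runs eastward from its first value to its second. The helper names are hypothetical and the wrap-around convention is an assumption, not a statement about LSDB's internals.

```python
def ra_in_range(ra, ra_min, ra_max):
    """True if `ra` lies on the arc running eastward from ra_min to ra_max."""
    ra, lo, hi = ra % 360.0, ra_min % 360.0, ra_max % 360.0
    if lo <= hi:
        return lo <= ra <= hi
    # The arc crosses RA = 0, for example (350, 10).
    return ra >= lo or ra <= hi


def in_box(ra, dec, ra_range, dec_range):
    """True if the point falls inside both the RA band and the Dec band."""
    dec_min, dec_max = dec_range
    return ra_in_range(ra, *ra_range) and dec_min <= dec <= dec_max


ra_band, dec_band = (-65, -60), (12, 15)
print(in_box(-62.0, 13.0, ra_band, dec_band))  # negative RA, inside both bands
print(in_box(298.0, 13.0, ra_band, dec_band))  # same direction, positive RA
print(in_box(-62.0, 16.0, ra_band, dec_band))  # outside the Dec band
print(ra_in_range(5.0, 350, 10))               # band that crosses RA = 0
```

Note that in this sketch a band such as (0, 360) collapses to the single value 0 after the modulo, so a full-circle band would need separate handling.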
[11]:
ztf_object_box = ztf_object.box_search(ra=[-65, -60], dec=[12, 15])
ztf_object_box
[11]:
objectid | filterid | fieldid | rcid | objra | objdec | nepochs | hmjd | mag | magerr | clrcoeff | catflags | Norder | Dir | Npix | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=36 | |||||||||||||||
Order: 6, Pixel: 12834 | int64[pyarrow] | int8[pyarrow] | int16[pyarrow] | int8[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | list<element: double>[pyarrow] | list<element: float>[pyarrow] | list<element: float>[pyarrow] | list<element: float>[pyarrow] | list<element: int32>[pyarrow] | uint8[pyarrow] | uint64[pyarrow] | uint64[pyarrow] |
Order: 6, Pixel: 12840 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 7, Pixel: 122737 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 7, Pixel: 122738 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
[12]:
ztf_object_box.plot_pixels(plot_title="ZTF_DR22 - box pixel map")
[12]:
(<Figure size 1000x500 with 2 Axes>,
<WCSAxes: title={'center': 'ZTF_DR22 - box pixel map'}>)

We can stack several filters, which are applied in sequence. For example, catalog.box_search().polygon_search() should result in a perfectly valid HATS catalog containing the objects that match both filters.
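Stacking works because each filter returns a new catalog that can itself be filtered. A toy model of that chaining is sketched below. `ToyCatalog`, its `where` method and the band helpers are hypothetical names, and the rows are made-up (ra, dec) pairs. The sketch only shows that chained filters keep the rows matching every filter, regardless of the order in which they are applied.

```python
class ToyCatalog:
    """Minimal stand-in for a catalog: an immutable sequence of (ra, dec) rows."""

    def __init__(self, rows):
        self.rows = tuple(rows)

    def where(self, predicate):
        """Return a new ToyCatalog holding only the rows the predicate accepts."""
        return ToyCatalog(row for row in self.rows if predicate(row))


def ra_band(lo, hi):
    """Predicate accepting rows whose right ascension is within [lo, hi]."""
    return lambda row: lo <= row[0] <= hi


def dec_band(lo, hi):
    """Predicate accepting rows whose declination is within [lo, hi]."""
    return lambda row: lo <= row[1] <= hi


catalog = ToyCatalog([(296.0, 13.0), (296.0, 20.0), (310.0, 13.0), (299.0, 14.5)])

# Each call returns a new catalog, so the calls can be chained.
stacked = catalog.where(ra_band(295, 300)).where(dec_band(12, 15))
swapped = catalog.where(dec_band(12, 15)).where(ra_band(295, 300))
print(stacked.rows)
```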
Closing the Dask client#
[13]:
client.close()
About#
Authors: Sandro Campos and Melissa DeLucchi
Last updated on: April 14, 2025
If you use lsdb for published research, please cite it by following the citation instructions.