Row Filtering#

In this tutorial, we will demonstrate how to:

  • Set up a Dask client and load an object catalog

  • Filter data by column values

[1]:
import lsdb
from dask.distributed import Client

1. Load a catalog#

We create a basic dask client, and load an existing HATS catalog - the ZTF DR22 catalog.

Additional Help

For additional information on dask client creation, please refer to the official Dask documentation and our Dask cluster configuration page for LSDB-specific tips. Note that dask also provides its own best practices, which may also be useful to consult.

For tips on accessing remote data, see our Accessing remote data tutorial

[2]:
client = Client(n_workers=4, memory_limit="auto")
client
[2]:

Client

Client-2f8b4eac-2b78-11f0-8e36-42cb0b321d21

Connection method: Cluster object Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Cluster Info

[3]:
ztf_object_path = "https://data.lsdb.io/hats/ztf_dr22/ztf_lc"
ztf_object = lsdb.read_hats(ztf_object_path)
ztf_object
[3]:
lsdb Catalog ztf_lc:
objectid filterid fieldid rcid objra objdec nepochs hmjd mag magerr clrcoeff catflags Norder Dir Npix
npartitions=10839
Order: 4, Pixel: 0 int64[pyarrow] int8[pyarrow] int16[pyarrow] int8[pyarrow] float[pyarrow] float[pyarrow] int64[pyarrow] list<element: double>[pyarrow] list<element: float>[pyarrow] list<element: float>[pyarrow] list<element: float>[pyarrow] list<element: int32>[pyarrow] uint8[pyarrow] uint64[pyarrow] uint64[pyarrow]
Order: 4, Pixel: 1 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Order: 5, Pixel: 12286 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Order: 5, Pixel: 12287 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
The catalog has been loaded lazily, meaning no data has been read, only the catalog schema

2. Selecting data by column query#

We filter by column values via query().

The expression inside () follows the same syntax accepted by Pandas .query(), which supports a subset of Python expressions for filtering DataFrames.

The column names that are not valid Python variables names should be wrapped in backticks, and any variable values can be injected using f-strings. The use of ‘@’ to reference variables is not supported.

More information about Pandas query strings is available here.

[4]:
ztf_object.query("mean_mag_i < 16")
[4]:
lsdb Catalog ztf_lc:
objectid filterid fieldid rcid objra objdec nepochs hmjd mag magerr clrcoeff catflags Norder Dir Npix
npartitions=10839
Order: 4, Pixel: 0 int64[pyarrow] int8[pyarrow] int16[pyarrow] int8[pyarrow] float[pyarrow] float[pyarrow] int64[pyarrow] list<element: double>[pyarrow] list<element: float>[pyarrow] list<element: float>[pyarrow] list<element: float>[pyarrow] list<element: int32>[pyarrow] uint8[pyarrow] uint64[pyarrow] uint64[pyarrow]
Order: 4, Pixel: 1 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Order: 5, Pixel: 12286 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Order: 5, Pixel: 12287 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
The catalog has been loaded lazily, meaning no data has been read, only the catalog schema

3. Previewing part of the data#

Computing an entire catalog requires loading all of its resulting data into memory, which is expensive and may lead to out-of-memory issues.

Often, our goal is to have a peek at a slice of data to make sure the workflow output is reasonable (e.g., to assess if some new created columns are present and their values have been properly processed). head() is a pandas-like method which allows us to preview part of the data for this purpose. It iterates over the existing catalog partitions, in sequence, and finds up to n number of rows.

Notice that this method implicitly calls compute().

[5]:
ztf_object.head()
[5]:
objectid filterid fieldid rcid objra objdec nepochs hmjd mag magerr clrcoeff catflags Norder Dir Npix
_healpix_29
1174453390 402314400009559 3 402 55 45.003078 0.003229 1 [59223.22946] [20.787151] [0.21386838] [0.18703164] [0] 4 0 0
1256983831 402214400001930 2 402 55 45.001633 0.003649 4 [59081.44555 59110.46323 59149.35751 59821.49103] [21.612473 20.959953 20.576942 21.740543] [0.27192712 0.21536374 0.18216264 0.28302884] [0.11257833 0.10523333 0.09787548 0.1067627 ] [ 0 32768 0 0] 4 0 0
3121267776 1447107200003291 1 1447 25 44.996162 0.005613 41 [58384.36177 58427.33331 58430.31481 58436.24749 58483.18223 58495.2053 58726.50285 58734.4817 58734.48215 58746.46852 58746.46898 58749.42242 58749.42287 58758.43125 58758.4317 58760.44971 58760.45016 58766.33611 58766.33656 58767.36208 58767.36254 58768.40135 58768.40422 58876.156 58876.15645 59084.42936 59084.42983 59118.48358 59118.48404 59120.35467 59120.36564 59121.36772 59121.36819 59130.39867 59130.39914 59131.37422 59131.37469 59136.35829 59136.35875 59869.43353 59906.27311] [18.1267 18.178541 18.151234 18.142717 18.18145 18.157625 18.159729 18.15543 18.145329 18.13134 18.159607 18.148232 18.144363 18.132896 18.155525 18.12857 18.116041 18.193005 18.194855 18.12814 18.120125 18.146465 18.133575 18.169474 18.16064 18.161383 18.14731 18.133022 18.172457 18.09327 18.105621 18.172886 18.039137 18.190388 18.135687 18.184774 18.226088 18.139923 18.169136 18.127535 18.138668] [0.03005696 0.03114585 0.03056585 0.03038791 0.03120849 0.03070029 0.03074473 0.03065404 0.03044231 0.03015235 0.03074216 0.03050295 0.03042217 0.03018441 0.03065605 0.03009536 0.02983943 0.03145896 0.0314993 0.03008652 0.02992253 0.03046605 0.03019843 0.03095163 0.03076402 0.0307797 0.03048368 0.03018702 0.03101536 0.02938179 0.02962882 0.03102454 0.02833176 0.03140198 0.03024204 0.03128028 0.03219078 0.03032983 0.03094443 0.03007411 0.03030378] [-0.07027889 -0.06930869 -0.06497753 -0.03894358 -0.06473728 -0.05909613 -0.06853689 -0.07077068 -0.07396315 -0.06756518 -0.07377605 -0.09147864 -0.07347372 -0.06398817 -0.09142346 -0.06623353 -0.05255399 -0.05801508 -0.04943917 -0.01583591 -0.06443135 -0.0613529 -0.04225167 -0.0841284 -0.08837119 -0.08912608 -0.10244943 -0.09214832 -0.08353224 -0.05507442 -0.04923208 -0.06755196 -0.03072934 -0.06771254 -0.0803161 -0.06243316 -0.07899416 -0.09654921 -0.09602686 -0.06111438 -0.08093766] [ 0 0 0 0 0 32768 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 65535] 4 0 0
3121276937 1447207200005775 2 1447 25 44.996151 0.005617 47 [58356.39095 58386.38345 58423.37971 58424.29682 58425.30809 58429.32805 58431.35375 58432.31822 58435.33433 58436.32125 58439.30662 58457.24186 58482.22483 58739.46234 58740.49768 58740.49813 58748.38794 58748.3884 58761.42479 58761.42524 58773.33312 58773.33358 58775.35025 58775.3507 58776.32762 58776.32807 58777.33909 58777.43856 58778.41056 58778.41101 58783.3964 58783.39685 58787.27175 58787.2722 58812.29226 58812.29271 58828.2959 58833.27386 58833.27431 58859.20851 58859.20895 58860.18411 58860.18456 58861.17803 58861.17848 59061.45521 59882.33655] [17.579742 17.525295 17.566818 17.599663 17.599651 17.55067 17.599087 17.557632 17.536266 17.507595 17.78836 17.60382 17.585388 17.574453 17.561443 17.567154 17.548437 17.534819 17.536034 17.536688 17.534334 17.622917 17.537136 17.569315 17.57914 17.551102 17.534225 17.53706 17.528645 17.576523 17.542791 17.555126 17.550135 17.49578 17.613556 17.568148 17.744974 17.545502 17.543709 17.506718 17.598688 17.577156 17.57929 17.536074 17.53376 17.522251 17.587633] [0.02421532 0.02341327 0.0240211 0.02451942 0.02451925 0.02378184 0.02451055 0.02388454 0.02357152 0.0231615 0.02770521 0.02458363 0.02430092 0.02413554 0.02394105 0.02402612 0.02374902 0.02355053 0.02356813 0.02357762 0.02354353 0.02488187 0.02358413 0.02405844 0.0242062 0.02378816 0.02354196 0.02358302 0.02346141 0.0241667 0.02366642 0.02384747 0.02377395 0.02299585 0.02473499 0.02404098 0.02692159 0.02370602 0.02367982 0.02314913 0.02450441 0.02417625 0.02420849 0.02356872 0.02353521 0.02336967 0.02433508] [0.11228631 0.10507433 0.11654762 0.10412797 0.11153019 0.11898478 0.10963035 0.11548576 0.11220532 0.11345611 0.12203256 0.09424207 0.11780193 0.12842824 0.12737516 0.12518984 0.12666526 0.12699775 0.1236252 0.12323357 0.12692362 0.12783302 0.1123318 0.11313702 0.12791719 0.12769237 0.12146648 0.12329257 0.12511027 0.13168044 0.12156131 0.13163903 0.12227581 0.12884793 0.11143624 0.11146308 0.05330697 0.12057662 0.12140149 0.13080874 0.1319859 0.13584742 0.13523754 0.1343387 0.12839234 0.13436702 0.13036682] [ 0 0 0 32768 0 0 0 0 0 32768 32768 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 32768 0 0 0 0 0 0 0 0 0 0] 4 0 0
3122682099 402314400002005 3 402 55 44.996151 0.005628 73 [59182.29524 59184.33479 59187.27466 59190.26198 59200.28723 59203.25701 59206.28057 59211.19298 59217.21117 59220.18692 59223.22946 59226.1476 59229.16704 59232.24831 59253.13798 59256.20072 59266.1693 59270.17178 59424.4673 59436.40053 59450.48997 59459.40896 59462.44927 59465.3838 59468.40558 59471.41378 59476.38813 59479.3292 59484.4454 59485.43599 59487.31647 59491.34123 59497.40423 59502.31252 59505.34089 59511.42599 59512.34794 59520.34159 59522.33097 59523.24577 59526.30151 59529.30279 59532.32355 59540.2942 59550.3256 59623.17513 59787.4692 59811.40706 59816.40128 59824.45955 59827.40493 59842.35277 59846.35119 59849.42443 59853.42465 59857.44244 59860.38685 59865.31487 59866.41391 59870.40832 59873.31693 59877.33314 59880.32264 59883.28889 59889.42486 59895.27372 59898.35843 59901.32126 59904.36283 59907.23722 59913.31565 59940.30368 59969.21834] [17.406069 17.381184 17.397192 17.350647 17.372389 17.34214 17.449512 17.369156 17.339355 17.336542 17.396152 17.342094 17.427944 17.364958 17.368366 17.33733 17.350737 17.389849 17.35707 17.351816 17.387192 17.399292 17.37222 17.316654 17.386658 17.345432 17.34167 17.351562 17.419113 17.351486 17.361557 17.360413 17.422794 17.350395 17.318989 17.476593 17.42629 17.365421 17.324104 17.349354 17.351711 17.367695 17.36921 17.34206 17.3719 17.3498 17.37325 17.369358 17.373829 17.388357 17.405914 17.341066 17.403149 17.395748 17.345264 17.368683 17.386166 17.378468 17.382153 17.398455 17.254221 17.370487 17.389965 17.34571 17.393509 17.428799 17.386578 17.35696 17.348564 17.39962 17.369831 17.393126 17.357119] [0.02651762 0.02607874 0.02635986 0.02555448 0.02592614 0.0254112 0.02730958 0.02587037 0.02536455 0.02531756 0.02634147 0.02541042 0.02691225 0.02579824 0.02585679 0.02533071 0.025556 0.02623036 0.02566348 0.02557427 0.02618373 0.02639705 0.02592322 0.02498901 0.02617437 0.02546649 0.0254033 0.02556998 0.02675193 0.02556868 0.02574 0.02572045 0.02681859 0.02555023 0.02502723 0.02782026 0.02688213 0.02580617 0.02511132 0.02553262 0.0255725 0.02584524 0.0258713 0.02540986 0.02591772 0.02554015 0.02594101 0.02587386 0.02595103 0.02620416 0.02651486 0.02539319 0.02646556 0.0263343 0.02546369 0.02586222 0.02616575 0.02603146 0.02609563 0.02638222 0.02399843 0.02589332 0.02623239 0.02547119 0.02629478 0.02692784 0.02617298 0.02566159 0.0255193 0.02640286 0.02588202 0.02628801 0.02566429] [0.17656344 0.17923029 0.1795861 0.18652731 0.1819551 0.18744485 0.17160816 0.1820151 0.18973497 0.18618271 0.18703164 0.18489346 0.19011725 0.18261212 0.18494834 0.192506 0.18079472 0.19385093 0.18811963 0.19210212 0.18436077 0.18707308 0.1920265 0.18582326 0.18608862 0.18769678 0.17905413 0.17896281 0.19011566 0.19187912 0.18861422 0.1811538 0.18779151 0.18099068 0.18606457 0.13817345 0.18287414 0.18685363 0.18310606 0.18019493 0.18017785 0.18656172 0.1880041 0.177176 0.18003187 0.17572905 0.19030909 0.17817988 0.19272278 0.18783969 0.1830531 0.18174969 0.18278527 0.18171775 0.1894914 0.18257731 0.18529885 0.18744282 0.18613681 0.18401226 0.1973768 0.18431398 0.1752168 0.18595129 0.18680006 0.18646672 0.187953 0.1954474 0.18847504 0.18657699 0.1914743 0.18184996 0.19201401] [ 0 0 0 32768 0 0 32768 0 0 0 0 0 0 0 32768 0 0 0 0 0 0 0 0 0 0 0 0 0 0 32768 0 0 0 0 0 32768 0 0 0 0 0 0 0 0 0 0 0 32768 0 0 0 0 0 0 0 0 0 0 0 0 32768 0 0 0 0 0 0 0 0 0 0 0 0] 4 0 0

5 rows × 15 columns

Closing the Dask client#

[6]:
client.close()

About#

Authors: Sandro Campos and Melissa DeLucchi

Last updated on: April 14, 2025

If you use lsdb for published research, please cite following instructions.