Array library options in ccdproc#
Note
Who needs this? If you are currently using numpy for your image processing
there is no need to change anything about what you currently do. The changes
made in ccdproc to adopt the array API were made with the intent of
requiring no change to existing code that continues to use numpy.
What is the “Array API”?#
The Python array API specifies an interface that has been adopted by many different array libraries (e.g. jax, dask, CuPy). The API is very similar to the familiar numpy interface. The array API was constructed to allow users with specialized needs to use any of the large variety of array options available in Python.
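As a rough illustration of what library-agnostic code looks like, the sketch below looks up the namespace that owns an array and then calls only functions defined by the standard. The helper names `get_namespace` and `subtract_mean` are hypothetical and are not part of ccdproc; the fallback to numpy for objects without `__array_namespace__` is an assumption made for this example.

```python
import numpy as np


def get_namespace(x):
    """Return the array namespace for x (hypothetical helper, not a ccdproc function)."""
    # Arrays that follow the standard expose __array_namespace__
    # (NumPy arrays do from NumPy 2.0 onward).
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    # Assumption for this sketch: anything else is handled with NumPy.
    return np


def subtract_mean(image):
    """Subtract the mean pixel value using whichever library owns `image`."""
    xp = get_namespace(image)
    return image - xp.mean(image)


image = np.asarray([[1.0, 2.0], [3.0, 4.0]])
result = subtract_mean(image)  # the result is still a NumPy array
```

The same `subtract_mean` should work unchanged with an array from another library that implements the standard, because the function never refers to numpy directly once the namespace has been found.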
What array libraries are supported?#
The best list of array libraries that implement the array API is at array-api-compat. ccdproc is currently regularly tested against numpy, dask, and jax. It is occasionally tested against CuPy; any errors you encounter running ccdproc on a GPU using CuPy should be reported as an issue.
Though the sparse array library supports the array API, ccdproc does not currently work with sparse. A pull request to add support for sparse would be a welcome contribution to the project.
What limitations should I be aware of?#
The median function is not part of the array API, but most array libraries do provide a median. If the array library you choose does not have a median function, then ccdproc will automatically fall back to using a median function from bottleneck, if that is installed, or to numpy.
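The fallback order described above can be sketched roughly as follows. This is an illustration of the idea only, not ccdproc's actual implementation; `pick_median` is a hypothetical helper, and the empty `SimpleNamespace` stands in for an array library that has no median.

```python
from types import SimpleNamespace

import numpy as np


def pick_median(xp):
    """Choose a median function for the array namespace xp (hypothetical helper)."""
    # Prefer the library's own median when it provides one.
    if hasattr(xp, "median"):
        return xp.median
    # Otherwise try bottleneck, and use NumPy if bottleneck is not installed.
    try:
        import bottleneck
    except ImportError:
        return np.median
    return bottleneck.median


# NumPy has its own median, so that function is returned directly.
numpy_median = pick_median(np)

# A stand-in namespace with no median triggers the fallback path.
no_median = SimpleNamespace()
fallback = pick_median(no_median)
value = float(fallback(np.asarray([1.0, 2.0, 3.0, 10.0])))
```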
Which array library should I use?#
If you have access to a GPU then using CuPy will be noticeably faster than
using numpy. If you routinely use very large datasets, consider using dask.
The array library that the maintainers of ccdproc most often use is numpy.
How do I use the array API?#
There are two ways to use the array API in ccdproc:
Use the ccdproc functions as you normally would, but pass in an array from the array library of your choice. For example, if you want to use dask arrays, you can do this:
    import dask.array as da
    import ccdproc
    from astropy.nddata import CCDData

    data = da.random.random((1000, 1000))
    ccd = CCDData(data, unit='adu')
    ccd = ccdproc.trim_image(ccd[:900, :900])
Use ccdproc functions to read/write data in addition to using ccdproc functions to process the data. For example, if you want to use dask arrays to process a set of images, you can do this:
    import dask.array as da
    import ccdproc
    from astropy.nddata import CCDData

    images = ccdproc.ImageFileCollection('path/to/images/*.fits', array_package=da)
    for ccd in images.ccds():
        ccd = ccdproc.trim_image(ccd[:900, :900])
        # Do more processing with ccdproc functions
        # ...
If you do this, image combination will also be done using the array library you specified.
To do image combination with the array library of your choice without doing any other processing, you can either create a ccdproc.Combiner object with a list of file names and the array_package argument set to the array library you want to use, or call the ccdproc.combine function with a list of file names and the same array_package argument. For example, to combine images using dask arrays, you can do this:

    import dask.array as da
    import ccdproc
    from astropy.nddata import CCDData

    images = ccdproc.ImageFileCollection('path/to/images/*.fits', array_package=da)
    combined = ccdproc.combine(images.ccds(), method='median')
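To make concrete what a median combination computes, here is a minimal numpy-only sketch with made-up pixel values. It does not use ccdproc and leaves out everything ccdproc adds on top (units, masks, uncertainties, scaling); it only shows the per-pixel median across a stack of frames.

```python
import numpy as np

# Three tiny "images"; the second has a cosmic-ray-like outlier at [0, 0].
frames = [
    np.asarray([[10.0, 11.0], [12.0, 13.0]]),
    np.asarray([[500.0, 11.0], [12.0, 13.0]]),
    np.asarray([[10.0, 12.0], [11.0, 13.0]]),
]

# Stack along a new leading axis, then take the median across frames
# independently for each pixel.
stack = np.stack(frames, axis=0)
combined = np.median(stack, axis=0)
```

Because the median is taken pixel by pixel, the single outlying value of 500 does not affect the combined image.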