Abstracting the Storage and Retrieval of Image Data at the LSST

Tim Jenness, James Bosch, Pim Schellart, Kian-Tat Lim, Andrei Salnikov, Michelle Gower

2019

Writing generic data processing pipelines requires that algorithmic code never needs to know the formats of data files or where those files are located. At LSST we have a software system known as the "Data Butler" that abstracts these details away from the software developer. Scientists specify the dataset they want in terms they understand, such as filter, observation identifier, date of observation, and instrument name, and the Butler translates that request into one or more files, which are read and returned to them as a single Python object. Conversely, once they have created a new dataset, they can hand it back to the Butler with a label describing its new status, and the Butler writes it in whatever format it has been configured to use. All configuration is written in YAML, with standard defaults that can be overridden.
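
The put/get pattern described above can be illustrated with a minimal sketch. This is not the LSST Butler API: `ToyButler`, `merge_config`, `DEFAULT_CONFIG`, the `calexp` label, and the `ToyCam` instrument are all hypothetical names chosen for illustration. Configuration is shown as plain dictionaries rather than YAML so that the example needs only the standard library, and each dataset maps to a single file, whereas the real system may assemble an object from several.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical defaults; the abstract says the real system keeps these in YAML.
DEFAULT_CONFIG = {"root": None, "formatters": {"default": "pickle"}}


def merge_config(defaults, overrides):
    """Return a copy of defaults with overrides applied recursively."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


class ToyButler:
    """Toy stand-in: callers name datasets, never file paths or formats."""

    # Format name -> (file suffix, serializer, deserializer).
    FORMATTERS = {
        "pickle": (".pkl", pickle.dumps, pickle.loads),
        "json": (
            ".json",
            lambda obj: json.dumps(obj).encode(),
            lambda raw: json.loads(raw.decode()),
        ),
    }

    def __init__(self, config):
        self.config = merge_config(DEFAULT_CONFIG, config)
        self.root = Path(self.config["root"])

    def _formatter(self, dataset_type):
        # Use the per-dataset-type format if configured, else the default.
        formatters = self.config["formatters"]
        name = formatters.get(dataset_type, formatters["default"])
        return self.FORMATTERS[name]

    def _path(self, dataset_type, data_id):
        # Derive a deterministic file location from the data ID.
        suffix = self._formatter(dataset_type)[0]
        key = "_".join(f"{k}-{data_id[k]}" for k in sorted(data_id))
        return self.root / dataset_type / f"{key}{suffix}"

    def put(self, obj, dataset_type, data_id):
        _, dump, _ = self._formatter(dataset_type)
        path = self._path(dataset_type, data_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(dump(obj))

    def get(self, dataset_type, data_id):
        _, _, load = self._formatter(dataset_type)
        return load(self._path(dataset_type, data_id).read_bytes())


def demo():
    """Round-trip one dataset through a temporary repository."""
    with tempfile.TemporaryDirectory() as root:
        # Override only the format used for the hypothetical "calexp" label.
        butler = ToyButler({"root": root, "formatters": {"calexp": "json"}})
        data_id = {"instrument": "ToyCam", "visit": 42, "filter": "r"}
        butler.put({"pixels": [1, 2, 3]}, "calexp", data_id)
        return butler.get("calexp", data_id)
```

In this sketch the calling code in `demo` never mentions a path or a file format; switching `calexp` from JSON to pickle is a configuration change only, which is the separation of concerns the abstract describes.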

