LSST Data Products Definition Document
LSE-163
Latest Revision 9/26/2016
The contents of this document are subject to configuration control and may not be changed, altered, or their provisions
waived without prior approval.
Large Synoptic Survey Telescope (LSST)
Data Products Definition Document
LSE-163
Latest Revision Date: September 26, 2016
This LSST document has been approved as a Content-Controlled Document. Its contents are subject to
configuration control and may not be changed, altered, or their provisions waived without prior
approval. If this document is changed or superseded, the new document will retain the Handle
designation shown above. The control is on the most recent digital document with this Handle in the
LSST digital archive and not printed versions.
Change Record

Version | Date      | Description | Owner name
1       | 10/7/2013 | Initial Release | Mario Juric
2       | 9/26/2016 | Implementation of LCR-758 Update Data Products Definition Document, LSE-163 | Gregory Dubois-Felsmann (LCR), Tim Jenness (document), Robert McKercher (DocuShare)
Large Synoptic Survey Telescope
Data Products Definition Document
LSST Document LSE-163
M. Jurić*, T. Axelrod, A.C. Becker, J. Becla, J.F. Bosch, D. Ciardi,
A.J. Connolly, G.P. Dubois-Felsmann, F. Economou, M. Freemon,
M. Gelman, M. Graham, Ž. Ivezić, T. Jenness, J. Kantor, S. Krughoff,
K-T Lim, R.H. Lupton, F. Mueller, D. Nidever, D. Petravick, D. Shaw,
C. Slater, M. Strauss, J. Swinbank, J.A. Tyson, M. Wood-Vasey and X. Wu
for the LSST Project
September 26, 2016
Abstract
This document describes the data products and processing services
to be delivered by the Large Synoptic Survey Telescope (LSST).
The LSST will deliver three levels of data products and services.
Level 1 (nightly) data products will include images, difference images,
catalogs of sources and objects detected in difference images,
and catalogs of Solar System objects. Their primary purpose is to
enable rapid follow-up of time-domain events. Level 2 (annual) data
products will include well calibrated single-epoch images, deep coadds,
and catalogs of objects, sources, and forced sources, enabling static
sky and precision time-domain science. Level 3 (user-created) data
product services will enable science cases that greatly benefit from
co-location of user processing and/or data within the LSST Archive
Center. LSST will also devote 10% of observing time to programs
with special cadence. Their data products will be created using the
same software and hardware as Levels 1 and 2. All data products will
be made available using user-friendly databases and web services.
* Please direct comments to <mjuric@lsst.org>.
Contents
1 Preface
2 Introduction
  2.1 The Large Synoptic Survey Telescope
  2.2 General Image Processing Concepts for LSST
  2.3 Classes of LSST Data Products
3 General Considerations
  3.1 Estimator and Naming Conventions
  3.2 Image Characterization Data
  3.3 Fluxes and Magnitudes
  3.4 Uniqueness of IDs across database versions
  3.5 Repeatability of Queries
4 Level 1 Data Products
  4.1 Overview
  4.2 Level 1 Data Processing
    4.2.1 Difference Image Analysis
    4.2.2 Solar System Object Processing
  4.3 Level 1 Catalogs
    4.3.1 DIASource Table
    4.3.2 DIAObject Table
    4.3.3 SSObject Table
    4.3.4 Precovery Measurements
    4.3.5 Reprocessing the Level 1 Data Set
  4.4 Level 1 Image Products
    4.4.1 Visit Images
    4.4.2 Difference Images
    4.4.3 Image Differencing Templates
  4.5 Alerts to DIASources
    4.5.1 Information Contained in Each Alert
    4.5.2 Receiving and Filtering the Alerts
5 Level 2 Data Products
  5.1 Overview
  5.2 Level 2 Data Processing
    5.2.1 Object Characterization Measures
    5.2.2 Supporting Science Cases Requiring Full Posteriors
    5.2.3 Source Characterization
    5.2.4 Forced Photometry
    5.2.5 Crowded Field Photometry
  5.3 The Level 2 Catalogs
    5.3.1 The Object Table
    5.3.2 Source Table
    5.3.3 ForcedSource Table
  5.4 Level 2 Image Products
    5.4.1 Visit Images
    5.4.2 Calibration Data
    5.4.3 Coadded Images
  5.5 Data Release Availability and Retention Policies
6 Level 3 Data Products and Capabilities
  6.1 Level 3 Data Products and Associated Storage Resources
  6.2 Level 3 Processing Resources
  6.3 Level 3 Programming Environment and Framework
  6.4 Migration of Level 3 data products to Level 2
7 Data Products for Special Programs
8 Appendix: Conceptual Pipeline Design
1 Preface
The purpose of this document is to describe the data products produced by
the Large Synoptic Survey Telescope (LSST).

To a future LSST user, it should clarify what catalogs, image data,
software, and services they can expect from LSST. To LSST builders, it
provides direction on how to flow down the LSST System Requirements Document
to system design, sizing, budget and schedule as they pertain to the data
products.

Though under strict change control [1], this is a living document. LSST
will undergo a period of construction and commissioning lasting no less than
seven years, followed by a decade of survey operations. To ensure their
continued scientific adequacy, the designs and plans for LSST Data Products
will be periodically reviewed and updated.

[1] LSST Docushare handle for this document is LSE-163.
2 Introduction
2.1 The Large Synoptic Survey Telescope
LSST will be a large, wide-field ground-based optical telescope system
designed to obtain multiple images covering the sky that is visible from
Cerro Pachón in Northern Chile. The current baseline design, with an 8.4m
(6.7m effective) primary mirror, a 9.6 deg^2 field of view, and a
3.2 Gigapixel camera, will allow about 10,000 square degrees of sky to be
covered every night using pairs of 15-second exposures, with typical 5σ
depth for point sources of r ~ 24.5 (AB). The system is designed to yield
high image quality as well as superb astrometric and photometric accuracy.
The total survey area will include ~30,000 deg^2 with δ < +34.5°, and will
be imaged multiple times in six bands, ugrizy, covering the wavelength range
320-1050 nm. For a more detailed, but still concise, summary of LSST, please
see the LSST Overview paper [2].
The project is scheduled to begin the regular survey operations at the
start of the next decade. About 90% of the observing time will be devoted to
a deep-wide-fast survey mode which will uniformly observe an 18,000 deg^2
region about 1000 times (summed over all six bands) during the anticipated
10 years of operations, and yield a coadded map to r ~ 27.5. These data
will result in catalogs including over 38 billion stars and galaxies, that
will serve the majority of the primary science programs. The remaining 10% of
the observing time will be allocated to special projects such as a Very Deep
and Fast time domain survey [3].
The LSST will be operated in fully automated survey mode. The images
acquired by the LSST Camera will be processed by LSST Data Management
software to a) detect and characterize imaged astrophysical sources and
b) detect and characterize temporal changes in the LSST-observed universe.
The results of that processing will be reduced images, catalogs of detected
objects and the measurements of their properties, and prompt alerts to
"events": changes in astrophysical scenery discovered by differencing
incoming images against older, deeper, images of the sky in the same
direction (templates, see §4.4.3). Measurements will be internally and
absolutely calibrated.
The broad, high-level requirements for LSST Data Products are given by
the LSST Science Requirements Document [4] (SRD). This document lays out
the specifics of what the data products will comprise, how those data will
be generated, and when. It serves to inform the flow-down from the LSST
SRD through the LSST System Requirements Document (the LSR; LSE-29)
and the LSST Observatory System Specifications (OSS; LSE-30), to the LSST
Data Management System Requirements (DMSR; LSE-61), the UML model,
and the database schema.

[2] arXiv:0805.2366, http://ls.st/2m9
[3] Informally known as "Deep Drilling Fields".
2.2 General Image Processing Concepts for LSST
A raw image (baselined as a pair of successive 15-second exposures, called
snaps), delivered by the LSST camera, is processed by the Instrument
Signature Removal (ISR) pipeline, to produce a single-visit image with, at
least conceptually, counts proportional to photon flux entering the telescope
pupil (in reality, there are many additional optical, pixel and bandpass
effects, including random counting noise and various subtle systematic
errors, that are treated during subsequent processing). This single-visit
image processed by the ISR is called a "calibrated exposure" and its main
data structures include counts, their variance and various masks, all defined
on a per pixel basis. After the ISR step is completed, the pixel values and
their variance are not modified any more. These single-visit images are used
downstream to produce coadded and difference images. The rest of the
processing is essentially a model-based interpretation of imaging
observations that includes numerous astrophysical and other assumptions.
The basic interpretation model assumes a sum of discrete (but possibly
overlapping) sources and a relatively smooth background. The background
has a different spectral energy distribution than discrete sources, and it
can display both spatial gradients and temporal changes. Discrete sources can
vary in brightness and position. The motion can be slow or fast (between
two successive observations, less or more motion than about the seeing disk
size), and naturally separates stars with proper motions and trigonometric
parallax from moving objects in the Solar System. Some objects that vary in
brightness can be detectable for only a short period of time (e.g.,
supernovae and other cosmic explosions).
The image interpretation model separates a time-independent model
component from a temporally changing component ("DC" and "AC",
respectively). Discrete DC sources are neither operationally nor
astrophysically associated with discrete AC sources, even when they are
spatially coincident.

Images (a series of Footprints, where a Footprint is a set of connected
pixels with counts above some threshold level set by noise properties) of
discrete objects are modeled using two models. A two-component galaxy model
includes a linear combination of bulge and disk, with their radial intensity
variation described using Sérsic profiles. Stars are modeled using a moving
point source model with its parallax motion superposed on a linear proper
motion. This model shares motion parameters across the six bandpasses and
assumes constant flux in each band, and thus includes 11 free parameters
(two reference coordinates, two proper motion components, one parallax, and
six fluxes). Both galaxy and stellar models are fit to all objects, except
for fast-moving objects (the Solar System objects), which are treated
separately. Discrete objects detected in difference images will be modeled
using three models: a point source model, a trailed point source model, and
a point source dipole model.

[4] LSST Document Handle LPM-17, available at http://ls.st/srd
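The parameter count of the moving point source model can be illustrated with a small sketch. This is not LSST code: the class name, field names, and units are hypothetical, and the parallax factors (the projection of the Earth's orbital displacement onto each coordinate) are assumed to be supplied by the caller rather than computed from an ephemeris.

```python
from dataclasses import dataclass

BANDS = ("u", "g", "r", "i", "z", "y")


@dataclass
class MovingPointSource:
    """Illustrative star model: 5 shared motion parameters + 1 flux per band."""
    ra0: float       # reference position (deg); hypothetical field name
    dec0: float      # reference position (deg)
    pm_ra: float     # proper motion (arcsec / yr)
    pm_dec: float    # proper motion (arcsec / yr)
    parallax: float  # parallax (arcsec)
    flux: dict       # band -> constant flux, one entry per bandpass

    def n_free_parameters(self) -> int:
        # 2 coordinates + 2 proper motion components + 1 parallax + N fluxes
        return 5 + len(self.flux)

    def offset(self, dt_years, plx_factor_ra, plx_factor_dec):
        """Offset (arcsec) from the reference position at time dt_years.

        Linear proper motion plus a parallax term; the parallax factors
        are assumed inputs (not derived here).
        """
        d_ra = self.pm_ra * dt_years + self.parallax * plx_factor_ra
        d_dec = self.pm_dec * dt_years + self.parallax * plx_factor_dec
        return d_ra, d_dec


star = MovingPointSource(10.0, -30.0, 0.1, -0.05, 0.02,
                         {band: 1.0e-9 for band in BANDS})
print(star.n_free_parameters())  # six bands give 5 + 6 parameters
```

Because the motion parameters are shared across bands, adding a band adds only one parameter (its flux) to the fit.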
2.3 Classes of LSST Data Products
The main LSST data products are illustrated in Figure 1 (see the Appendix
for a conceptual design of the pipelines which will produce these data
products). LSST Data Management will perform two types of image analyses,
somewhat overlapping in scientific intent:

1. Analysis of difference images, with the goal of detecting and
   characterizing astrophysical phenomena revealed by their time-dependent
   nature. The detection of supernovae superimposed on bright extended
   galaxies is an example of this analysis. The processing will be done
   on a nightly or daily basis and result in Level 1 data products. Level 1
   products will include difference images, catalogs of sources detected in
   difference images (DIASources), astrophysical objects [5] these are
   associated to (DIAObjects), and Solar System objects (SSObjects [6]). The
   catalogs will be entered into the Level 1 database and made available
   in near real time. Notifications ("alerts") about new DIASources will
   be issued using community-accepted standards [7] within 60 seconds of
   observation. Level 1 data products are discussed in §4.

2. Analysis of direct images, with the goal of detecting and characterizing
   astrophysical objects. Detection of faint galaxies on deep coadds
   and their subsequent characterization is an example of this analysis.
   The results are Level 2 data products. These products, generated and
   released annually [8], will include the single-epoch images, deep coadds,
   catalogs of characterized Objects (detected on deep coadds as well as
   individual visits [9]), Sources [10] (detections and measurements on
   individual visits), and ForcedSources (constrained measurement of flux
   on individual visits). It will also include fully reprocessed Level 1 data
   products (see §4.3.5). In contrast to the "living" Level 1 database,
   which is updated in real-time, the Level 2 databases are static and will
   not change after release. Level 2 data products are discussed in §5.

Figure 1: Overview of data products produced by LSST Imaging Processing
Science Pipelines.

[5] The LSST has adopted the nomenclature by which single-epoch detections of
astrophysical objects are called sources. The reader is cautioned that this
nomenclature is not universal: some surveys call detections what LSST calls
sources, and use the term sources for what LSST calls objects.
[6] SSObjects used to be called "Moving Objects" in previous versions of the
LSST Data Products baseline. The name is potentially confusing as high-proper
motion stars are moving objects as well. A more accurate distinction is the
one between objects inside and outside of the Solar System.
[7] For example, VOEvent, see http://ls.st/4tt
[8] Except for the first two data releases, which will be created six months
apart.
[9] The LSST takes two exposures per pointing, nominally 15 seconds in
duration each, called snaps. For the purpose of data processing, that pair of
exposures will typically be coadded and treated as a single exposure, called
a visit.
[10] When written in bold monospace type (i.e., \tt), Objects and Sources
refer to objects and sources detected and measured as a part of Level 2
processing.
The two types of analyses have different requirements on timeliness.
Changes in flux or position of objects may need to be immediately followed
up, lest interesting information be lost. Thus the primary results of
analysis of difference images (discovered and characterized DIASources)
generally need to be broadcast as event alerts within 60 seconds of the end
of visit acquisition. The analysis of science (direct) images is less time
sensitive, and will be done as a part of the annual data release process.
Recognizing the diversity of astronomical community needs, and the need
for specialized processing not part of the automatically generated Level 1
and 2 products, LSST plans to devote 10% of its data management system
capabilities to enabling the creation, use, and federation of Level 3
(user-created) data products. Level 3 capabilities will enable science cases
that greatly benefit from co-location of user processing and/or data within
the LSST Archive Center. The high-level requirement for Level 3 is
established in §3.5 of the LSST SRD. Their details are discussed in §6 of
this document.
Finally, LSST Survey Specifications (§3.4 of LSST SRD) prescribe that
90% of LSST observing time be spent in the so-called "universal cadence"
mode of surveying the sky. These observations will result in Level 1 and 2
data products discussed above. The remaining 10% of observing time will
be devoted to special programs, designed to obtain improved coverage
of interesting regions of observational parameter space. Examples include
very deep (r ~ 26, per exposure) observations, observations with very short
revisit times (~1 minute), and observations of "special" regions such as the
Ecliptic, Galactic plane, and the Large and Small Magellanic Clouds. The
data products for these programs will be generated using the same processing
software and hardware and possess the general characteristics of Level 1 and
2 data products, but may be performed on a somewhat different cadence.
They will be discussed in §7.
3 General Considerations
Most LSST data products will consist of images and/or catalogs. The
catalogs will be stored and offered to the users as relational databases
which they will be able to query. This approach was shown to work well by
prior large surveys, for example the Sloan Digital Sky Survey (SDSS).

Different data products will generally be stored in different databases. For
example, Level 1 data products will be stored in a Level 1 database. Level 2
"universal cadence" products will be stored in a Level 2 database.
The products for special programs may be stored in many different databases,
depending on the nature of the program.

Nevertheless, all these databases will follow certain naming and other
conventions. We discuss these in the subsections to follow.
3.1 Estimator and Naming Conventions
For all catalog data, we will employ a convention where estimates of
standard errors have the suffix Err, while the estimates of inherent widths
of distributions (or functions in general) have the suffix Sigma [11]. The
latter are defined as the square roots of the second moment about the quoted
value of the quantity at hand.
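The distinction between the two suffixes can be illustrated numerically. The sketch below is an assumption-laden illustration, not an LSST algorithm: it takes the quoted value to be the sample mean, computes the width as the square root of the second moment about it, and uses the simple plug-in estimate width / sqrt(N) for the standard error. The function name is hypothetical.

```python
import math


def sigma_and_err(values):
    """Return (mean, Sigma-like width, Err-like standard error of the mean)."""
    n = len(values)
    mean = sum(values) / n
    # Second moment about the quoted value (here: the mean).
    second_moment = sum((v - mean) ** 2 for v in values) / n
    sigma = math.sqrt(second_moment)   # inherent width of the distribution
    err = sigma / math.sqrt(n)         # uncertainty of the quoted value
    return mean, sigma, err


few = [9.0, 11.0] * 2     # N = 4
many = [9.0, 11.0] * 50   # N = 100, same underlying spread

print(sigma_and_err(few))
print(sigma_and_err(many))
```

With more measurements the width stays the same while the standard error shrinks, which is why the two quantities get different suffixes.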
Unless noted otherwise, maximum likelihood values (called likelihood for
simplicity) will be quoted for all fitted parameters (measurements).
Together with covariances, these let the end-user apply whatever prior they
deem appropriate when computing posteriors [12]. Where appropriate, multiple
independent samples from the likelihood may be provided to characterize
departures from Gaussianity.
We will provide values of log likelihood, the χ² for the fitted parameters,
and the number of data points used in the fit. Database functions (or
precomputed columns) will be provided for frequently used combinations of
these quantities (e.g., χ²/dof). These can be used to assess the model fit
quality. Note that, if the errors of measured quantities are normally
distributed, the likelihood is related to the χ² as:

\[ L = \left( \prod_k \frac{1}{\sigma_k \sqrt{2\pi}} \right) \exp\left( -\frac{\chi^2}{2} \right) \tag{1} \]

where the index k runs over all data points included in the fit. For
completeness, χ² is defined as:

\[ \chi^2 = \sum_k \left( \frac{x_k - \bar{x}}{\sigma_k} \right)^2 , \tag{2} \]

where \bar{x} is the mean value of x_k.

[11] Given N measurements, standard errors scale as N^{-1/2}, while widths
remain constant.
[12] There's a tacit assumption that a Gaussian is a reasonably good
description of the likelihood surface around the ML peak.
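As a worked illustration of the relation between the likelihood and χ² for normally distributed errors, the following sketch evaluates χ² about the mean of the data, the corresponding log likelihood, and χ²/dof. The function names are hypothetical, and treating the mean as the single fitted parameter (so dof = N - 1) is an assumption made only for this example.

```python
import math


def chi2(xs, sigmas):
    """Chi-squared of the data about their (unweighted) mean."""
    xbar = sum(xs) / len(xs)
    return sum(((x - xbar) / s) ** 2 for x, s in zip(xs, sigmas))


def log_likelihood(xs, sigmas):
    """ln L = -sum_k ln(sigma_k * sqrt(2 pi)) - chi2 / 2."""
    log_norm = -sum(math.log(s * math.sqrt(2.0 * math.pi)) for s in sigmas)
    return log_norm - 0.5 * chi2(xs, sigmas)


def chi2_per_dof(xs, sigmas, n_params=1):
    """Reduced chi-squared; n_params=1 assumes only the mean was fitted."""
    return chi2(xs, sigmas) / (len(xs) - n_params)


xs = [1.0, 2.0, 3.0]
sigmas = [1.0, 1.0, 1.0]
print(chi2(xs, sigmas), chi2_per_dof(xs, sigmas), log_likelihood(xs, sigmas))
```

Working with the logarithm of the likelihood avoids numerical underflow when the product runs over many data points.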
For fluxes, we recognize that a substantial fraction of astronomers will just
want the posteriors marginalized over all other parameters, trusting the LSST
experts to select an appropriate prior [13]. For example, this is nearly
always the case when constructing color-color or color-magnitude diagrams. We
will support these use cases by providing additional pre-computed columns,
taking care to name them appropriately so as to minimize accidental incorrect
usage. For example, a column named gFlux may be the expectation value
of the g-band flux, while gFluxML may represent the maximum likelihood
value.
3.2 Image Characterization Data
Raw images will be processed to remove instrumental signature and
characterize their properties, including backgrounds (both due to night sky
and astrophysical), the point spread function and its variation, photometric
zero-point model, and the world coordinate system (WCS).

That characterization is crucial for deriving LSST catalogs and
understanding the images. It will be kept and made available to the users.
The exact format used to store these (meta)data will depend on the final
adopted algorithm, and will be chosen in consultation with the scientific
community to ensure the formats in which these data are served are maximally
useful.
Each processed image [14], including the coadds, will record information on
pixel variance (the "variance plane"), as well as per-pixel masks (the "mask
plane"). These will allow the users to determine the validity and usefulness
of each pixel in estimating the flux density recorded in that area of the
sky.

This information will be per-pixel, and potentially unwieldy to use for
certain science cases. We plan to investigate approximate schemes for storing
masks based on geometry (e.g., similar to Mangle or STOMP), in addition
to storing them on a per pixel basis.

[13] It's likely that most cases will require just the expectation value
alone.
[14] It is also frequently referred to as calibrated exposure.
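To make the role of the variance and mask planes concrete, here is a minimal sketch of how a user might combine them. Everything in it is an assumption for illustration: the mask bit names and values are hypothetical (the actual mask plane definitions are not specified in this section), and the images are represented as flat Python lists rather than real image arrays.

```python
from enum import IntFlag


class Mask(IntFlag):
    """Hypothetical per-pixel mask bits (names and values are illustrative)."""
    BAD = 1
    SATURATED = 2
    COSMIC_RAY = 4
    INTERPOLATED = 8


# Bits that, in this sketch, make a pixel unusable for flux estimation.
REJECT = Mask.BAD | Mask.SATURATED | Mask.COSMIC_RAY


def usable(mask_value):
    """True if none of the rejected bits are set for this pixel."""
    return (Mask(mask_value) & REJECT) == 0


def weighted_mean_counts(counts, variances, masks):
    """Inverse-variance weighted mean of the usable pixels, or None."""
    num = 0.0
    den = 0.0
    for c, v, m in zip(counts, variances, masks):
        if usable(m) and v > 0.0:
            num += c / v
            den += 1.0 / v
    return num / den if den > 0.0 else None


counts = [10.0, 1000.0, 12.0]
variances = [1.0, 1.0, 1.0]
masks = [0, int(Mask.SATURATED), int(Mask.INTERPOLATED)]
print(weighted_mean_counts(counts, variances, masks))
```

The saturated pixel is excluded by its mask bit, while the variance plane sets the relative weight of the pixels that remain.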
3.3 Fluxes and Magnitudes
Because flux measurements on difference images (Level 1 data products; §4)
are performed against a template and thus represent a flux difference, the
measured flux of a source on the difference image can be negative. The
flux can also go negative for faint sources in the presence of noise. Negative
fluxes cannot be stored as (Pogson) magnitudes; log of a negative number
is undefined. We therefore prefer to store fluxes, rather than magnitudes, in
database tables [15].
We quote fluxes in units of "maggie". A maggie, as introduced by SDSS,
is a linear measure of flux. It is defined so that an object having a flux of
one maggie (integrated over the bandpass) has an AB magnitude of zero:

\[ m_{AB} = -2.5 \log_{10}(f / \mathrm{maggie}) \tag{3} \]
We chose to use maggies (as opposed to, say, Jansky) to allow the user
to differentiate between two distinct sources of photometric calibration
error: the error in relative (internal) calibration of the survey, and the
error in absolute calibration that depends on the knowledge of absolute flux
of photometric standards.
Nevertheless, we acknowledge that the large majority of users will want
to work with magnitudes. For convenience, we plan to provide columns
with (Pogson) magnitudes [16], where values with negative flux will evaluate
to NULL. Similarly, we will provide columns with flux expressed in Jy (and
its error estimate that includes the relative and absolute calibration error
contributions).

[15] This is a good idea in general. E.g., given multi-epoch observations,
one should always be averaging fluxes, rather than magnitudes.
[16] These will most likely be implemented as "virtual" or "computed"
columns.
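The conversions described in this section can be sketched as follows. The function names are hypothetical, Python's None stands in for a database NULL, and the Jansky conversion uses the standard AB zero point of 3631 Jy, which assumes a perfect absolute calibration (in practice the absolute calibration error would have to be propagated separately).

```python
import math

AB_ZEROPOINT_JY = 3631.0  # flux density of a zero-magnitude AB source


def maggies_to_ab_mag(flux):
    """Pogson AB magnitude from a flux in maggies; None (NULL) if flux <= 0."""
    if flux <= 0.0:
        return None
    return -2.5 * math.log10(flux)


def maggies_to_jansky(flux):
    """Flux in Jy, assuming a perfectly calibrated AB system."""
    return flux * AB_ZEROPOINT_JY


def mean_mag_from_fluxes(fluxes):
    """Average in flux space first, then convert to a magnitude."""
    return maggies_to_ab_mag(sum(fluxes) / len(fluxes))


# Three epochs of a faint source; noise drives one measurement negative.
epochs = [2.0e-10, -1.0e-10, 2.0e-10]
print([maggies_to_ab_mag(f) for f in epochs])
print(mean_mag_from_fluxes(epochs))
```

The negative epoch has no magnitude, yet it still contributes correctly to the flux average, which is the reason for storing fluxes and averaging them rather than magnitudes.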
3.4 Uniqueness of IDs across database versions
To reduce the likelihood of confusion, all IDs shall be unique across
databases and database versions, other than those corresponding to uniquely
identifiable entities (i.e., IDs of exposures).

For example, DR4 and DR5 (or any other) release will share no identical
Object, Source, DIAObject or DIASource IDs (see §4 and §5 for the
definitions of Objects, DIAObjects, etc.).
3.5 Repeatability of Queries
We require that queries executed at a known point in time against any LSST-
delivered database be repeatable at a later date. This promotes the repro-
ducibility of science derived from LSST data. It is of special importance for
Level 1 catalogs (x 4) that will change on a nightly basis as new time domain
data is being processed and added to the catalogs.
The exact implementation of this requirement is left to the LSST Data
Management database team. One possibility may be to make the key tables
(nearly) append-only, with each row having two timestamps, createdTai
and deletedTai, so that queries may be limited by a WHERE clause:

    SELECT * FROM DIASource WHERE 'YYYY-MM-DD-HH-mm-SS' BETWEEN
    createdTai AND deletedTai

or, more generally:

    SELECT * FROM DIASource WHERE "data is valid as of YYYY-MM-DD"

A perhaps less error-prone alternative, if technically feasible, may be to
provide multiple virtual databases that the user would access as:

    CONNECT lsst-dr5-yyyy-mm-dd
    SELECT * FROM DIASource

The latter method would probably be limited to nightly granularity, unless
there's a mechanism to create virtual databases/views on-demand.
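The append-only approach with two timestamps can be tried out with an in-memory SQLite database. This is only a sketch of the idea, not the LSST database design: the table layout beyond createdTai and deletedTai, the ISO-8601 text timestamps, the far-future sentinel for rows that have not been deleted, and the half-open validity interval (used here instead of BETWEEN, so a row is not valid at its own deletion time) are all assumptions.

```python
import sqlite3

FAR_FUTURE = "9999-12-31T23:59:59"  # assumed sentinel: row not yet deleted

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DIASource ("
    "diaSourceId INTEGER, createdTai TEXT, deletedTai TEXT)"
)
conn.executemany(
    "INSERT INTO DIASource VALUES (?, ?, ?)",
    [
        (1, "2022-01-01T03:00:00", FAR_FUTURE),
        (2, "2022-01-01T04:00:00", "2022-01-03T00:00:00"),  # later retired
        (3, "2022-01-02T05:00:00", FAR_FUTURE),
    ],
)


def ids_as_of(connection, tai):
    """IDs of the rows that were valid at the given timestamp."""
    rows = connection.execute(
        "SELECT diaSourceId FROM DIASource "
        "WHERE createdTai <= ? AND ? < deletedTai "
        "ORDER BY diaSourceId",
        (tai, tai),
    )
    return [row[0] for row in rows]


print(ids_as_of(conn, "2022-01-01T12:00:00"))
print(ids_as_of(conn, "2022-01-04T00:00:00"))
```

Because rows are never physically removed, re-running the same query with the same timestamp returns the same rows regardless of later updates.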
4 Level 1 Data Products
4.1 Overview
Level 1 data products are a result of difference image analysis (DIA;
§4.2.1). They include the sources detected in difference images
(DIASources), astrophysical objects that these are associated to
(DIAObjects), identified Solar System objects [17] (SSObject), and related,
broadly defined, metadata (including, e.g., cut-outs [18]).
DIASources are sources detected on difference images (those with the
signal-to-noise ratio S/N > transSNR after correlation with the PSF profile,
with transSNR defined in the SRD and presently set to 5). They represent
changes in flux with respect to a deep template. Physically, a DIASource
may be an observation of a new astrophysical object that was not present
at that position in the template image (for example, an asteroid), or an
observation of flux change in an existing source (for example, a variable
star). Their flux can be negative (e.g., if a source present in the template
image reduced its brightness, or moved away). Their shape can be complex
(e.g., trailed, for a source with proper motion approaching ~ deg/day, or
"dipole-like", if an object's observed position exhibits an offset, true or
apparent, compared to its position on the template). Some DIASources will
be caused by background fluctuations; with transSNR = 5, we expect about
one such false positive per CCD (of the order 200,000 per typical night). The
expected number of false positives due to background fluctuations is a very
strong function of adopted transSNR: a change of transSNR by 0.5 results
in a variation of an order of magnitude, and a change of transSNR by unity
changes the number of false positives by about two orders of magnitude.
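The steepness of this dependence can be checked with a rough estimate. The sketch below assumes the false positive count simply scales with the one-sided Gaussian tail probability above the threshold; it ignores the number of independent resolution elements per image and any correlation introduced by PSF filtering, so only the ratios are meaningful. The function names are hypothetical.

```python
import math


def gaussian_tail(threshold):
    """One-sided probability that unit Gaussian noise exceeds the threshold."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def false_positive_ratio(new_snr, ref_snr=5.0):
    """Expected false positive count at new_snr relative to ref_snr."""
    return gaussian_tail(new_snr) / gaussian_tail(ref_snr)


for snr in (4.0, 4.5, 5.0):
    print(snr, false_positive_ratio(snr))
```

Under this simplified model, lowering the threshold from 5 to 4.5 raises the expected count by roughly an order of magnitude, and lowering it to 4 by roughly two orders, consistent with the scaling quoted above.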
Clusters of DIASources detected on visits taken at di?erent times are
associated with either a DIAObject or an SSObject, to represent the un-
derlying astrophysical phenomenon. The association can be made in two
di?erent ways: by assuming the underlying phenomenon is an object within
the Solar System moving on an orbit around the Sun [19], or by assuming it
to be distant enough to only exhibit small parallactic and proper motion [20].
[17] The SRD considers the Solar System object orbit catalog to be a Level 2 data
product (LSST SRD, Sec 3.5). Nevertheless, to successfully differentiate between
apparitions of known Solar System objects and other types of DIASources, we
consider it functionally a part of Level 1.
[18] Small, 30 × 30, sub-images at the position of a detected source. Also known
as postage stamps.
[19] We don't plan to fit for motion around other Solar System bodies; e.g.,
identifying new satellites of Jupiter is left to the community.
The latter type of association is performed during difference image analysis
right after the image has been acquired. The former is done at daytime by
the Moving Objects Processing Software (MOPS), unless the DIASource is an
apparition of an already known SSObject. In that case, it will be flagged as
such during difference image analysis.
At the end of the difference image analysis of each visit, we will issue time
domain event alerts for all newly detected DIASources [21].
4.2 Level 1 Data Processing
4.2.1 Difference Image Analysis
The following is a high-level description of steps which will occur during
regular nightly difference image analysis (see Figures 3 and 5):
1. A visit is acquired and reduced to a single visit image (cosmic ray
rejection, instrumental signature removal [22], combining of snaps, etc.).
2. The visit image is differenced against the appropriate template and
DIASources are detected. If necessary, deblending will be performed
at this stage. Both the parent blend and the deblended children will be
measured and stored as DIASources (see next item), but only the children
will be matched against DIAObjects and alerted on. Deblended
objects will be flagged as such.
3. The flux and shape [23] of the DIASource are measured on the difference
image. PSF photometry is performed on the visit image at the position
of the DIASource to obtain a measure of the total flux.
[20] Where 'small' is small enough to unambiguously positionally associate
together individual apparitions of the object.
[21] For observations on the Ecliptic near opposition, Solar System objects will
dominate the DIASource counts and (until they're recognized as such) overwhelm
the explosive transient signal. It will therefore be advantageous to quickly
identify the majority of Solar System objects early in the survey.
[22] E.g., subtraction of bias and dark frames, flat fielding, bad pixel/column
interpolation, etc.
[23] The "shape" in this context consists of weighted 2nd moments, as well as
fits to a trailed source model and a dipole model.
4. The Level 1 database (see §4.3) is searched for a DIAObject or an
SSObject with which to positionally associate the newly discovered
DIASource [24]. If no match is found, a new DIAObject is created and
the observed DIASource is associated with it.
5. If the DIASource has been associated with an SSObject (a known Solar
System object), it will be flagged as such and an alert will be issued.
Further processing will occur in daytime (see Section 4.2.2).
6. Otherwise, the associated DIAObject measurements will be updated
using the new data together with data collected during the previous 12 months.
Hence, the computed parameters for DIAObjects have a "memory" of past data that
does not extend beyond this cutoff [25]. All affected columns will be
recomputed, including proper motions, centroids, light curves, etc.
7. The Level 2 database [26] is searched for one or more Objects positionally
close to the DIAObject, out to some maximum radius [27]. The IDs of
these nearest-neighbor Objects are recorded in the DIAObject record
and provided in the issued event alert (see below).
8. An alert is issued that includes: the name of the Level 1 database, the
timestamp of when this database was queried to issue this alert,
the DIASource ID, the SSObject ID or DIAObject ID [28], the name of the
Level 2 database and the IDs of nearby Objects, and the associated
science content (centroid, fluxes, low-order lightcurve moments, periods,
etc.), including the full light curves. See Section 4.5 for a more
complete enumeration.
[24] The association algorithm will guarantee that a DIASource is associated with
not more than one existing DIAObject or SSObject. The algorithm will take into
account the parallax and proper (or Keplerian) motions, as well as the errors in
estimated positions of DIAObject, SSObject, and DIASource, to find the maximally
likely match. Multiple DIASources in the same visit will not be matched to the
same DIAObject.
[25] This restriction is removed when Level 1 processing is rerun during Data
Release production, see §4.3.5.
[26] The Level 2 database is a database resulting from annual data release
processing. See §5 for details.
[27] E.g., a few arcseconds.
[28] We guarantee that a receiver will always be able to regenerate the alert
contents at any later date using the included timestamps and metadata (IDs and
database names).
9. For all DIAObjects overlapping the field of view, including those that
have an associated new DIASource from this visit, forced photometry
will be performed on the difference image (point source photometry
only). Those measurements will be stored as appropriately flagged
DIASources [29]. No alerts will be issued for these DIASources.
10. Within 24 hours of discovery, precovery PSF forced photometry will
be performed on any difference image overlapping the position of new
DIAObjects taken within the past 30 days, and added to the database.
Alerts will not be issued with precovery photometry information.
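The positional association in step 4 can be illustrated with a minimal sketch. It is not the LSST algorithm: the document calls for a maximum-likelihood match that accounts for parallax, proper motion, and positional errors, whereas the code below substitutes a greedy nearest-neighbor match within a fixed radius. The function names and the `max_sep_arcsec` parameter are hypothetical. The sketch does preserve the two stated guarantees: each DIASource is matched to at most one DIAObject, and no two DIASources from the same visit share a DIAObject.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0


def associate(sources, objects, max_sep_arcsec=1.0):
    """Greedy one-to-one association of one visit's DIASources to DIAObjects.

    sources, objects: dicts mapping an ID to (ra, dec) in degrees.
    Returns {source_id: object_id or None}; None means that a new
    DIAObject would have to be created for that source.
    max_sep_arcsec is an illustrative value, not an LSST requirement.
    """
    candidates = []
    for sid, (sra, sdec) in sources.items():
        for oid, (ora, odec) in objects.items():
            sep = angular_sep_arcsec(sra, sdec, ora, odec)
            if sep <= max_sep_arcsec:
                candidates.append((sep, sid, oid))
    candidates.sort()  # closest pairs are claimed first

    matches = {sid: None for sid in sources}
    used_objects = set()
    for sep, sid, oid in candidates:
        if matches[sid] is None and oid not in used_objects:
            matches[sid] = oid
            used_objects.add(oid)
    return matches
```

A source left with `None` corresponds to the "no match is found" branch of step 4. A production implementation would use a spatial index rather than the all-pairs loop shown here.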
In addition to the processing described above, a smaller sample of sources
detected on difference images below the nominal transSNR = 5 threshold
will be measured and stored, in order to enable monitoring of difference image
analysis quality.
Also, the system will have the ability to measure and alert on a limited [30]
number of sources detected below the nominal threshold for which additional
criteria are satisfied. For example, a transSNR = 3 source detection near
a gravitational keyhole [31] may be highly significant in assessing the danger
posed by a potentially hazardous asteroid. The initial set of criteria will be
defined by the start of LSST operations.
[29] For the purposes of this document, we're treating the DIASources generated
by forced photometry or precovery measurements to be the same as DIASources
detected in difference images (but flagged appropriately). In the logical schema,
these may be divided into two separate tables.
[30] It will be sized for no less than ∼10% of the average DIASource per-visit rate.
[31] A gravitational keyhole is a region of space where Earth's gravity would
modify the orbit of a passing asteroid such that the asteroid would collide with
the Earth in the future.
4.2.2 Solar System Object Processing
The following will occur during regular Solar System object processing (in
daytime [32], after a night of observing; see Figure 6):
1. The orbits and physical properties of all SSObjects re-observed on
the previous night are recomputed. External orbit catalogs (or observations)
are also used to improve orbit estimates. Updated data are
entered into the SSObjects table.
[32] Note that there is no strict bound on when daytime Solar System processing
must finish, just that, averaged over some reasonable timescale (e.g., a month),
a night's worth of observing is processed within 24 hours. Nights rich in moving
objects may take longer to process, while nights with fewer will finish more
quickly. In other words, the requirement is on throughput, not latency.
2. All DIASources detected on the previous night that have not been
matched at a high confidence level to a known Object, DIAObject,
SSObject, or an artifact, are analyzed for potential pairs, forming tracklets.
3. The collection of tracklets collected over the past 30 days is searched
for subsets forming tracks consistent with being on the same Keplerian
orbit around the Sun.
4. For those that are, an orbit is fitted and a new SSObject table entry
created. DIASource records are updated to point to the new SSObject
record. DIAObjects "orphaned" by this unlinking are deleted [33].
5. Precovery linking is attempted for all SSObjects whose orbits were
updated in this process. Where successful, SSObjects (orbits) are
recomputed as needed.
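The tracklet-forming step (step 2) can be sketched as follows. This is only an illustration of the idea of pairing same-night detections whose implied sky motion is plausible for a Solar System object; it is not the MOPS algorithm. The rate limit, the minimum time separation, and the function name are all assumptions made for the example.

```python
import math
from itertools import combinations


def make_tracklets(detections, max_rate_deg_per_day=2.0, min_dt_days=0.005):
    """Pair unassociated detections from one night into candidate tracklets.

    detections: list of (dia_source_id, mjd, ra_deg, dec_deg).
    Returns a list of (first_id, second_id, rate_deg_per_day).
    Both thresholds are illustrative values, not LSST parameters.
    """
    ordered = sorted(detections, key=lambda d: d[1])  # sort by time
    tracklets = []
    for a, b in combinations(ordered, 2):
        dt = b[1] - a[1]
        if dt < min_dt_days:
            continue  # same visit or too close in time to measure motion
        # Flat-sky approximation, adequate for the small separations
        # that survive the rate cut below.
        mean_dec = math.radians(0.5 * (a[3] + b[3]))
        d_ra = (b[2] - a[2]) * math.cos(mean_dec)
        d_dec = b[3] - a[3]
        rate = math.hypot(d_ra, d_dec) / dt
        if rate <= max_rate_deg_per_day:
            tracklets.append((a[0], b[0], rate))
    return tracklets
```

Linking tracklets into tracks and fitting Keplerian orbits (steps 3 and 4) requires an orbit-determination code and is outside the scope of this sketch.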
4.3 Level 1 Catalogs
The described alert processing design relies on the "living" Level 1 database
that contains the objects and sources detected on difference images. At the
very least [34], this database will have tables of DIASources, DIAObjects, and
SSObjects, populated in the course of nightly and daily difference image and
Solar System object processing [35]. As these get updated and added to, their
updated contents become visible (query-able) immediately [36].
[33] Some DIAObjects may only be left with forced photometry measurements at
their location (since all DIAObjects are force-photometered on previous and
subsequent visits); these will be kept but flagged as such.
[34] It will also contain exposure and visit metadata, MOPS-specific tables, etc.
These are either standard/uncontroversial, implementation-dependent, or less
directly relevant for science and therefore not discussed in this document.
[35] The latter is also colloquially known as DayMOPS.
[36] No later than the moment of issuance of any event alert that may refer to it.
This database is only loosely coupled to the Level 2 database. All of the
coupling is through positional matches between the DIAObject entries in the
Level 1 database and the Objects in the Level 2 database. There
is no direct DIASource-to-Object match: in general, a time-domain object
is not necessarily the same astrophysical object as a static-sky object, even
if the two are positionally coincident (e.g., an asteroid overlapping a galaxy).
Therefore, the adopted data model emphasizes that having a DIASource be
positionally coincident with an Object does not imply it is physically related to it.
Absent other information, the least presumptuous data model relationship is
one of positional association, not physical identity.
This may seem odd at first: for example, in a simple case of a variable star,
matching individual DIASources to Objects is exactly what an astronomer
would want. That approach, however, fails in the following scenarios:
• A supernova in a galaxy. The matched object in the Object table will
be the galaxy, which is a distinct astrophysical object. We want to keep
the information related to the supernova (e.g., colors, the light curve)
separate from those measurements for the galaxy.
• An asteroid occulting a star. If associated with the star on first
apparition, the association would need to be dissolved when the source is
recognized as an asteroid (perhaps even as early as a day later).
• A supernova on top of a pair of blended galaxies. It is not clear in
general to which galaxy this DIASource would "belong". That in itself
is a research question.
DIASource-to-Object matches can still be emulated via a two-step
relation (DIASource-DIAObject-Object). For ease of use, views or pre-built
tables with these matches will be offered to the end-users.
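The two-step DIASource-DIAObject-Object relation can be emulated with an ordinary SQL view. The sketch below uses Python's built-in sqlite3 module and a deliberately reduced, hypothetical schema: the table names and the `DiaObjectNearbyObj` helper table (one row per entry of the `nearbyObj` array) are assumptions for illustration and are not the as-built LSST schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DiaSource (diaSourceId INTEGER PRIMARY KEY, diaObjectId INTEGER);
CREATE TABLE DiaObject (diaObjectId INTEGER PRIMARY KEY);
-- Hypothetical expansion of the nearbyObj / nearbyObjDist array columns
-- into one row per neighboring Level 2 Object.
CREATE TABLE DiaObjectNearbyObj (
    diaObjectId INTEGER, objectId INTEGER, distArcsec REAL);
-- The two-step relation: DIASource -> DIAObject -> Object.
CREATE VIEW DiaSourceToObject AS
    SELECT s.diaSourceId, n.objectId, n.distArcsec
    FROM DiaSource s
    JOIN DiaObject d ON d.diaObjectId = s.diaObjectId
    JOIN DiaObjectNearbyObj n ON n.diaObjectId = d.diaObjectId;
""")

conn.executemany("INSERT INTO DiaSource VALUES (?, ?)",
                 [(1, 10), (2, 10), (3, 11)])
conn.executemany("INSERT INTO DiaObject VALUES (?)", [(10,), (11,)])
conn.executemany("INSERT INTO DiaObjectNearbyObj VALUES (?, ?, ?)",
                 [(10, 500, 0.3), (10, 501, 1.2)])

rows = conn.execute(
    "SELECT diaSourceId, objectId FROM DiaSourceToObject "
    "ORDER BY diaSourceId, objectId").fetchall()
print(rows)
```

Each returned row is a positional association only. DIASource 3 produces no rows because its DIAObject has no nearby Level 2 Objects, which is consistent with the data model making no claim of physical identity.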
In the sections to follow, we present the conceptual schemas for the most
important Level 1 database tables. These convey what data will be recorded
in each table, rather than the details of how. For example, columns whose
type is an array (e.g., radec) may be expanded to one table column per
element of the array (e.g., ra, decl) once this schema is translated to SQL [37].
Secondly, the tables to be presented are largely normalized (i.e., contain no
redundant information). For example, since the band of observation can
be found by joining a DIASource table to the table with exposure metadata,
there's no column named band in the DIASource table. In the as-built
database, the views presented to the users will be appropriately denormalized
for ease of use.
[37] The SQL realization of this schema can be browsed at http://ls.st/8g4
4.3.1 DIASource Table
This is a table of sources detected at transSNR ≥ 5 on difference images [38]
(DIASources). On average, the LSST SRD expects ∼2000 DIASources per
visit (∼2M per night; 20,000 per deg^2 of the sky per hour).
Some transSNR ≥ 5 sources will not be caused by observed astrophysical
phenomena, but by artifacts (bad columns, diffraction spikes, etc.). The
difference image analysis software will attempt to identify and flag these as
such.
Unless noted otherwise, all DIASource quantities (fluxes, centroids, etc.)
are measured on the difference image.
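Fluxes in the DIASource table are expressed in nanomaggies (nmgy), where one maggie is 3631 Jy and corresponds to an AB magnitude of 0. The following sketch shows the unit conversions implied by that definition. The function names are hypothetical. Because DIASource fluxes are flux differences and can be zero or negative, the magnitude conversion rejects non-positive input rather than returning a meaningless value.

```python
import math


def nmgy_to_ab_mag(flux_nmgy):
    """AB magnitude for a positive flux in nanomaggies.

    m_AB = -2.5 log10(flux in maggies) = 22.5 - 2.5 log10(flux in nmgy)
    """
    if flux_nmgy <= 0:
        raise ValueError("AB magnitude is undefined for non-positive flux")
    return -2.5 * math.log10(flux_nmgy * 1e-9)


def ab_mag_to_nmgy(mag):
    """Flux in nanomaggies for a given AB magnitude."""
    return 10 ** (-0.4 * mag) * 1e9


def nmgy_to_microjansky(flux_nmgy):
    """1 maggie = 3631 Jy, hence 1 nmgy = 3.631 microjansky."""
    return 3.631 * flux_nmgy
```

For example, `nmgy_to_ab_mag(0.158)` evaluates to approximately 24.5.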
Table 1: DIASource Table
Name              Type          Unit           Description
diaSourceId       uint64                       Unique source identifier.
ccdVisitId        uint64                       ID of CCD and visit where this source was measured.
diaObjectId       uint64                       ID of the DIAObject this source was associated
                                               with, if any.
ssObjectId        uint64                       ID of the SSObject this source has been linked
                                               to, if any.
parentSourceId    uint64                       ID of the parent Source this object has been
                                               deblended from, if any.
midPointTai       double        time           Time of mid-exposure for this DIASource [39].
radec             double[2]     degrees        Centroid, (α, δ) [40].
radecCov          float[3]      various        radec covariance matrix.
xy                float[2]      pixels         Column and row of the centroid.
xyCov             float[3]      various        Centroid covariance matrix.
apFlux            float         nmgy           Calibrated aperture flux. Note that this actually
                                               measures the flux difference between the template
                                               and the visit image.
apFluxErr         float         nmgy           Estimated uncertainty of apFlux.
SNR               float                        The signal-to-noise ratio at which this source
                                               was detected in the difference image [41].
psFlux            float         nmgy [42]      Calibrated flux for point source model. Note this
                                               actually measures the flux difference between the
                                               template and the visit image.
psRadec           double[2]     degrees        Centroid for point source model.
psCov             float[6]      various        Covariance matrix for point source model
                                               parameters.
psLnL             float                        Natural log likelihood of the observed data given
                                               the point source model.
psChi2            float                        χ^2 statistic of the model fit.
psNdata           int                          The number of data points (pixels) used to fit
                                               the model.
trailFlux         float         nmgy           Calibrated flux for a trailed source model
                                               [43, 44]. Note this actually measures the flux
                                               difference between the template and the visit
                                               image.
trailRadec        double[2]     degrees        Centroid for trailed source model.
trailLength       float         arcsec         Maximum likelihood fit of trail length [45, 46].
trailAngle        float         degrees        Maximum likelihood fit of the angle between the
                                               meridian through the centroid and the trail
                                               direction (bearing, direction of motion).
trailCov          float[15]     various        Covariance matrix of trailed source model
                                               parameters.
trailLnL          float                        Natural log likelihood of the observed data given
                                               the trailed source model.
trailChi2         float                        χ^2 statistic of the model fit.
trailNdata        int                          The number of data points (pixels) used to fit
                                               the model.
dipMeanFlux       float         nmgy           Maximum likelihood value for the mean absolute
                                               flux of the two lobes for a dipole model [47].
dipFluxDiff       float         nmgy           Maximum likelihood value for the difference of
                                               absolute fluxes of the two lobes for a dipole
                                               model.
dipRadec          double[2]     degrees        Centroid for dipole model.
dipLength         float         arcsec         Maximum likelihood value for the lobe separation
                                               in dipole model.
dipAngle          float         degrees        Maximum likelihood fit of the angle between the
                                               meridian through the centroid and the dipole
                                               direction (bearing, from negative to positive
                                               lobe).
dipCov            float[15]     various        Covariance matrix of dipole model parameters.
dipLnL            float                        Natural log likelihood of the observed data given
                                               the dipole source model.
dipChi2           float                        χ^2 statistic of the model fit.
dipNdata          int                          The number of data points (pixels) used to fit
                                               the model.
totFlux           float         nmgy           Calibrated flux for point source model measured
                                               on the visit image centered at the centroid
                                               measured on the difference image (forced
                                               photometry flux).
totFluxErr        float         nmgy           Estimated uncertainty of totFlux.
diffFlux          float         nmgy           Calibrated flux for point source model centered
                                               on radec but measured on the difference of snaps
                                               comprising this visit [48].
diffFluxErr       float         nmgy           Estimated uncertainty of diffFlux.
fpBkgd            float         nmgy/asec^2    Estimated background at the position (centroid)
                                               of the object in the template image.
fpBkgdErr         float         nmgy/asec^2    Estimated uncertainty of fpBkgd.
Ixx               float         nmgy asec^2    Adaptive second moment of the source intensity.
                                               See Bernstein & Jarvis (2002) for detailed
                                               discussion of all adaptive-moment related
                                               quantities [49].
Iyy               float         nmgy asec^2    Adaptive second moment of the source intensity.
Ixy               float         nmgy asec^2    Adaptive second moment of the source intensity.
Icov              float[6]      nmgy^2 asec^4  Ixx, Iyy, Ixy covariance matrix.
IxxPSF            float         nmgy asec^2    Adaptive second moment for the PSF.
IyyPSF            float         nmgy asec^2    Adaptive second moment for the PSF.
IxyPSF            float         nmgy asec^2    Adaptive second moment for the PSF.
extendedness      float                        A measure of extendedness, computed using a
                                               combination of available moments, or from a
                                               likelihood ratio of point/trailed source models
                                               (exact algorithm TBD). extendedness = 1 implies a
                                               high degree of confidence that the source is
                                               extended. extendedness = 0 implies a high degree
                                               of confidence that the source is point-like.
spuriousness      float                        A measure of spuriousness, computed using
                                               information [50] from the source and image
                                               characterization, as well as the information on
                                               the Telescope and Camera system (e.g., ghost
                                               maps, defect maps, etc.).
flags             bit[64]       bit            Various useful flags.
[38] This requirement is specified in the LSST SRD.
[39] The visit mid-exposure time generally depends on the position of the source
relative to the shutter blade motion trajectory.
[40] The astrometric reference frame will be chosen closer to the start of operations.
[41] This is not necessarily the same as apFlux/apFluxErr, as the flux measurement
algorithm may be more accurate than the detection algorithm.
[42] A "maggie", as introduced by SDSS, is a linear measure of flux in units of
3631 Jy; one maggie has an AB magnitude of 0, m_AB = −2.5 log10(maggie). "nmgy" is
short for a nanomaggie (1 nmgy = 3.631 µJy). For example, a flux of 0.158 nmgy
corresponds to an AB magnitude of 24.5. See §3.3 for details.
[43] A Trailed Source Model attempts to fit a (PSF-convolved) model of a point
source that was trailed by a certain amount in some direction (taking into account
the two-snap nature of the visit, which may lead to a dip in flux around the
mid-point of the trail). Roughly, it's a fit to a PSF-convolved line. The primary
use case is to characterize fast-moving Solar System objects.
[44] This model does not fit for the direction of motion; to recover it, we would
need to fit the model separately to individual snaps of a visit. This adds to
system complexity, and is not clearly justified by increased MOPS performance
given the added information.
[45] Note that we'll likely measure trailRow and trailCol, and transform to
trailLength/trailAngle (or trailRa/trailDec) for storage in the database. A
stretch goal is to retain both.
[46] TBD: Do we need a separate trailCentroid? It's unlikely that we do, but one
may wish to prove it.
[47] A Dipole Model attempts to fit a (PSF-convolved) model of two point sources,
with fluxes of opposite signs, separated by a certain amount in some direction.
The primary use case is to characterize moving stars and problems with image
differencing (e.g., due to astrometric offsets).
[48] This flux can be used to detect sources changing on timescales comparable to
snap exposure length (∼15 sec).
[49] See http://ls.st/5f4 for a brief summary.
[50] The computation of spuriousness will be "prior free" to the extent possible
and not use any information about the astrophysical neighborhood of the source,
whether it has been previously observed or not, etc. The intent is to avoid
introducing a bias against unusual sources or sources discovered in unusual
environments.
Some fast-moving, trailed sources may be due to passages of nearby
asteroids. Their trails may exhibit significant curvature. While we do not
measure the curvature directly, it can be inferred by examining the length
of the trail, the trailed model covariance matrices, and the adaptive shape
measures. Once curvature is suspected, the users may fit curved trail models
to the cutout provided with the alert.
4.3.2 DIAObject Table
Table 2: DIAObject Table
Name              Type          Unit           Description
diaObjectId       uint64                       Unique identifier.
radec             double[2]     degrees        (α, δ) position of the object at time radecTai.
radecCov          float[3]      various        radec covariance matrix.
radecTai          double        time           Time at which the object was at a position radec.
pm                float[2]      mas/yr         Proper motion vector [51].
parallax          float         mas            Trigonometric parallax.
pmParallaxCov     float[6]      various        Proper motion - parallax covariances.
pmParallaxLnL     float                        Natural log of the likelihood of the linear
                                               proper motion-parallax fit [52].
pmParallaxChi2    float                        χ^2 statistic of the model fit.
pmParallaxNdata   int                          The number of data points used to fit the model.
psFluxMean        float[ugrizy] nmgy           Weighted mean of point-source model flux, psFlux.
psFluxMeanErr     float[ugrizy] nmgy           Standard error of psFluxMean.
psFluxSigma       float[ugrizy] nmgy           Standard deviation of the distribution of psFlux.
psFluxChi2        float[ugrizy]                χ^2 statistic for the scatter of psFlux around
                                               psFluxMean.
psFluxNdata       int[ugrizy]                  The number of data points used to compute
                                               psFluxChi2.
fpFluxMean        float[ugrizy] nmgy           Weighted mean of forced photometry flux, fpFlux.
fpFluxMeanErr     float[ugrizy] nmgy           Standard error of fpFluxMean.
fpFluxSigma       float[ugrizy] nmgy           Standard deviation of the distribution of fpFlux.
lcPeriodic        float[6 × 32]                Periodic features extracted from light-curves
                                               using generalized Lomb-Scargle periodogram
                                               (Table 4, Richards et al. 2011) [53].
lcNonPeriodic     float[6 × 20]                Non-periodic features extracted from light-curves
                                               (Table 5, Richards et al. 2011).
nearbyObj         uint64[3]                    Closest Objects in Level 2 database.
nearbyObjDist     float[3]      arcsec         Distances to nearbyObj.
nearbyObjLnP      float[3]                     Natural log of the probability that the observed
                                               DIAObject is the same as the nearby Object [54].
flags             bit[64]       bit            Various useful flags.
[51] High proper-motion or parallax objects will appear as "dipoles" in difference
images. Great care will have to be taken not to misidentify these as subtraction
artifacts.
[52] radec, pm, and parallax will all be simultaneously fitted for.
[53] The exact features in use when LSST begins operations are likely to be
different compared to the baseline described here. This is to be expected given
the rapid pace of research in time domain astronomy. However, the number of
computed features is unlikely to grow beyond the present estimate.
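The per-band summary columns of the DIAObject table (psFluxMean, psFluxMeanErr, psFluxSigma, psFluxChi2, psFluxNdata) can be illustrated with a short sketch. The document specifies a weighted mean but not the weighting scheme; the code below assumes inverse-variance weights and an unweighted sample standard deviation about that mean. Both choices, and the function name, are assumptions for illustration.

```python
import math


def flux_stats(fluxes, errors):
    """Summary statistics for one band of a DIAObject light curve.

    fluxes, errors: equal-length sequences of psFlux values and their
    1-sigma uncertainties (nmgy).
    Returns (mean, mean_err, sigma, chi2, ndata).
    Inverse-variance weighting is an assumption, not an LSST specification.
    """
    n = len(fluxes)
    if n == 0 or n != len(errors):
        raise ValueError("need equal-length, non-empty flux and error lists")
    weights = [1.0 / e ** 2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / wsum
    mean_err = math.sqrt(1.0 / wsum)
    # Sample standard deviation about the weighted mean; undefined for n = 1.
    if n > 1:
        sigma = math.sqrt(sum((f - mean) ** 2 for f in fluxes) / (n - 1))
    else:
        sigma = float("nan")
    # Chi-squared of the scatter around the mean (constant-flux model).
    chi2 = sum(((f - mean) / e) ** 2 for f, e in zip(fluxes, errors))
    return mean, mean_err, sigma, chi2, n
```

A chi2 much larger than ndata − 1 indicates that the source is inconsistent with a constant flux in that band. In the nightly processing these statistics would be computed over the 12-month window described in Section 4.2.1.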
4.3.3 SSObject Table
Table 3: SSObject Table
Name              Type          Unit           Description
ssObjectId        uint64                       Unique identifier.
oe                double[7]     various        Osculating orbital elements at epoch
                                               (q, e, i, Ω, ω, M_0, epoch).
oeCov             double[21]    various        Covariance matrix for oe.
arc               float         days           Arc of observation.
orbFitLnL         float                        Natural log of the likelihood of the orbital
                                               elements fit.
orbFitChi2        float                        χ^2 statistic of the orbital elements fit.
orbFitNdata       int                          The number of data points (observations) used to
                                               fit the orbital elements.
MOID              float[2]      AU             Minimum orbit intersection distances [55].
moidLon           double[2]     degrees        MOID longitudes.
H                 float[6]      mag            Mean absolute magnitude, per band (Muinonen et
                                               al. 2010 magnitude-phase system).
G1                float[6]      mag            G1 slope parameter, per band (Muinonen et al.
                                               2010 magnitude-phase system).
G2                float[6]      mag            G2 slope parameter, per band (Muinonen et al.
                                               2010 magnitude-phase system).
hErr              float[6]      mag            Uncertainty of H estimate.
g1Err             float[6]      mag            Uncertainty of G1 estimate.
g2Err             float[6]      mag            Uncertainty of G2 estimate.
flags             bit[64]       bit            Various useful flags.
[54] This quantity will be computed by marginalizing over the product of position
and proper motion error ellipses of the Object and DIAObject, assuming an
appropriate prior.
[55] http://www2.lowell.edu/users/elgb/moid.html
The G1 and G2 parameters for the large majority of asteroids will not
be well constrained until later in the survey. We may decide not to fit for
them at all over the first few DRs and add them later in Operations, or provide
two-parameter G12 fits. Alternatively, we may fit them using strong priors on
slopes poorly constrained by the data. The design of the data management
system is insensitive to this decision, making it possible to postpone it to
Commissioning to ensure it follows the standard community practice at that
time.
The LSST database will provide functions to compute the phase
(Sun-Asteroid-Earth) angle α for every observation, as well as the reduced, H(α),
and absolute, H, asteroid magnitudes in LSST bands.
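As an illustration of the kind of function described above, the following sketch computes the phase angle from the triangle formed by the Sun, the asteroid, and the Earth, and the reduced magnitude by removing the distance dependence from an apparent magnitude. It uses the standard law-of-cosines and 5 log10(r Δ) relations; the function names and the default Earth-Sun distance of 1 AU are assumptions, and this is not the LSST database API. Deriving the absolute magnitude H additionally requires a phase-function fit (e.g., H, G1, G2), which is not attempted here.

```python
import math


def phase_angle_deg(r_au, delta_au, earth_sun_au=1.0):
    """Sun-asteroid-Earth angle (degrees) from the law of cosines.

    r_au: heliocentric distance of the asteroid (AU).
    delta_au: geocentric distance of the asteroid (AU).
    earth_sun_au: Earth-Sun distance (AU); 1.0 is an approximation.
    """
    cos_alpha = (r_au ** 2 + delta_au ** 2 - earth_sun_au ** 2) / (
        2.0 * r_au * delta_au)
    cos_alpha = max(-1.0, min(1.0, cos_alpha))  # guard against round-off
    return math.degrees(math.acos(cos_alpha))


def reduced_magnitude(apparent_mag, r_au, delta_au):
    """Apparent magnitude normalized to r = delta = 1 AU."""
    return apparent_mag - 5.0 * math.log10(r_au * delta_au)
```

For an asteroid at exact opposition with r = 2 AU and Δ = 1 AU, the phase angle is 0 degrees and the reduced magnitude is about 1.5 mag brighter than the apparent magnitude.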
4.3.4 Precovery Measurements
When a new DIASource is detected, it's useful to perform PSF photometry
at the location of the new source on images taken prior to discovery. These
are colloquially known as precovery measurements [56]. Performing precovery
in real time over all previously acquired visits is too I/O intensive to be
feasible. We therefore plan the following:
1. For all newly discovered objects, perform precovery PSF photometry
on visits taken over the previous 30 days [57].
2. Make available a "precovery service" to request precovery for a limited
number of DIASources across all previous visits, and make the results available
within 24 hours of the request. A web interface and machine-accessible
APIs will be provided.
The former should satisfy the most common use cases (e.g., SNe), while
the latter will provide an opportunity for more extensive yet timely precovery
of targets of special interest.
4.3.5 Reprocessing the Level 1 Data Set
In what we've described so far, the "living" Level 1 database is continually
being added to as new images are taken and DIASources identified.
Every time a new DIASource is associated to an existing DIAObject, the
DIAObject record is updated to incorporate new information brought in by
the DIASource. Once discovered and measured, the DIASources would never
be re-discovered and re-measured at the pixel level.
This would be far from optimal. The instrument will be better understood
with time. Newer versions of LSST pipelines will improve detection and
measurements on older data. Also, precovery photometry should optimally
be performed for all objects, and not just a select few. This argues for
periodic reprocessing of the Level 1 data set.
We plan to reprocess all image differencing-derived data (the Level 1
database), at the same time we perform the annual Level 2 data release
productions. This will include all images taken since the start of survey
operations, to the time when the data release production begins. The images
will be reprocessed using a single version of the image differencing and
measurement software, resulting in a consistent data set.
[56] When Solar System objects are concerned, precovery has a slightly different
meaning: predicting the positions of newly identified SSObjects on previously
acquired visits, and associating with them the DIASources consistent with these
predictions.
[57] We will be maintaining a cache of 30 days of processed images to support
this feature.
There will be three main advantages of the Level 1 database produced during
Data Release processing, compared to the "living" Level 1 database: i) even the
oldest data will be processed with the latest software, ii) astrometric and
photometric calibration will be better, and iii) there will be no 12-month
limit on the width of the data window used to compute associated DIAObject
measurements (proper motions, centroids, light curves, etc.).
Older versions of the Level 1 database produced during Data Release
processing will be archived following the same rules as for the Level 2 databases.
The most recent DR and the penultimate data release will be kept on disk
and loaded into the database. Others will be archived to tape and made
available as bulk downloads. See §5.5 for more detail.
4.4 Level 1 Image Products
4.4.1 Visit Images
Raw and processed visit images will be made available for download no later
than 24 hours from the end of visit acquisition.
The images will remain accessible with low-latency (seconds from request
to start of download) for at least 30 days, with slower access afterwards
(minutes to hours).
4.4.2 Difference Images
Complete difference images will be made available for download no later than
24 hours from the end of visit acquisition.
The images will remain accessible with low-latency (seconds from request
to start of download) for at least 30 days, with slower access afterwards
(minutes to hours).
4.4.3 Image Differencing Templates
Templates for difference image analysis will be created by coadding groups of
visits spanning six months to a year. The coaddition process will take care to
remove transient or fast moving objects (e.g., asteroids) from the templates.
Difference image analysis will use the appropriate template given the time of
observation, airmass, and seeing.
4.5 Alerts to DIASources
4.5.1 Information Contained in Each Alert
For each detected DIASource, LSST will emit an \Event Alert" within 60
seconds of the end of visit (de?ned as the end of image readout from the
LSST Camera). These alerts will be issued in VOEvent format
58
, and should
be readable by VOEvent-compliant clients.
Each alert (a VOEvent packet) will at least include the following:
• alertID: An ID uniquely identifying this alert. It can also be used to
  execute a query against the Level 1 database as it existed when this
  alert was issued
• Level 1 database ID
• Science Data:
  - The DIASource record that triggered the alert
  - The entire DIAObject (or SSObject) record
  - All previous DIASource records
  - A matching DIAObject from the latest Data Release, if it exists,
    and its DIASource records
• Cut-out of the difference image centered on the DIASource (10 bytes/pixel,
  FITS MEF)
• Cut-out of the template image centered on the DIASource (10 bytes/pixel,
  FITS MEF)
The variable-size cutouts will be sized so as to encompass the entire
footprint of the detected source, but be no smaller than 30 × 30 pixels. The
provided images will consist of flux (32-bit float), variance (32-bit float),
and mask (16-bit flags) planes, and include metadata necessary for further
processing (e.g., WCS, zero point, PSF, etc.).
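As a rough check on the cutout sizes implied above, the three planes add up
to the quoted 10 bytes per pixel, and the 30 × 30 minimum sets a floor on the
pixel data per cutout. The sketch below is illustrative only: the function
names are hypothetical, the cutout is assumed to be a rectangle clamped
independently along each axis, and FITS headers and metadata are ignored.

```python
# Bytes per pixel for each image plane described in the text:
# flux (32-bit float), variance (32-bit float), mask (16-bit flags).
PLANE_BYTES = {"flux": 4, "variance": 4, "mask": 2}
MIN_CUTOUT_SIDE = 30  # pixels; cutouts are no smaller than 30 x 30


def bytes_per_pixel():
    """Total storage per pixel summed over the three planes."""
    return sum(PLANE_BYTES.values())


def cutout_pixel_bytes(footprint_width, footprint_height):
    """Pixel-data size of one cutout (headers and metadata excluded).

    Assumption: the cutout is the footprint's bounding box, enlarged to
    at least MIN_CUTOUT_SIDE along each axis.
    """
    width = max(footprint_width, MIN_CUTOUT_SIDE)
    height = max(footprint_height, MIN_CUTOUT_SIDE)
    return width * height * bytes_per_pixel()


print(bytes_per_pixel())           # 10
print(cutout_pixel_bytes(12, 20))  # 9000 (clamped to 30 x 30)
print(cutout_pixel_bytes(45, 38))  # 17100
```

Under these assumptions the smallest cutout carries 9000 bytes of pixel data,
so the difference and template cutouts together add at least 18 kB to each
alert.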
[58] Or some other format that is broadly accepted and used by the community
at the start of LSST commissioning.
The items above are meant to represent the information transmitted with
each alert; the content of the alert packet itself will be formatted to conform
to the VOEvent (or other relevant) standard. Where the existing standard is
inadequate for LSST needs, LSST will propose extensions and work with the
community to reach a common solution.
With each alert, we attempt to include as much information known to
LSST about the DIASource as possible, to minimize the need for follow-up
database queries. This speeds up classification and decision making at the
user end, and relaxes the requirements on the database on the Project end.
4.5.2 Receiving and Filtering the Alerts
Alerts will be transmitted in VOEvent format, using standard IVOA protocols
(e.g., VOEvent Transport Protocol; VTP[59]). As a very high rate of alerts is
expected, approaching ~2 million per night, we plan for public VOEvent
Event Brokers[60] to be the primary end-points of LSST's event streams.
End-users will use these brokers to classify and filter events for subsets
fitting their science goals. End-users will not be able to subscribe to full,
unfiltered, alert streams coming directly from LSST[61].
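The bandwidth argument behind this choice (see footnote 61) is simple
arithmetic, sketched here. The subscriber count and stream rate are the
figures quoted in the footnote; the function name and the count of five
brokers are ours, chosen purely for illustration.

```python
def required_wan_gbps(n_subscribers, stream_mbps):
    """Aggregate outbound bandwidth, in Gbps, if every subscriber
    receives its own copy of the full alert stream."""
    return n_subscribers * stream_mbps / 1000.0


# Figures from footnote 61: 100 end-users, ~100 Mbps peak stream.
print(required_wan_gbps(100, 100))  # 10.0

# Hypothetical: a handful of brokers instead of individual end-users.
print(required_wan_gbps(5, 100))    # 0.5
```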
To directly serve the end-users, LSST will provide a basic, limited-capacity,
alert filtering service. This service will run at the LSST U.S. Archive
Center (at NCSA). It will let astronomers create simple filters that limit what
alerts are ultimately forwarded to them[62]. It will be possible to specify
these user-defined filters using an SQL-like declarative language, or short
snippets of (likely Python) code. For example, here's what a filter may look
like:
# Keep only never-before-seen events within two
# effective radii of a galaxy. This is for illustration
# only; the exact methods/members/APIs may change.
def filter(alert):
    if len(alert.sources) > 1:
        return False
    nn = alert.diaobject.nearest_neighbors[0]
    if not nn.flags.GALAXY:
        return False
    return nn.dist < 2. * nn.Re

[59] VOEvent Transport Protocol is currently an IVOA Note, but we understand
work is under way to finalize and bring it up to full IVOA Recommendation
status.
[60] These brokers are envisioned to be operated as a public service by third
parties who will have signed MOUs with LSST.
[61] This is due to finite network bandwidth available: for example, 100
end-users subscribing to a ~100 Mbps stream (the peak full stream data rate
at the end of the first year of operations) would require a 10 Gbps WAN
connection from the archive center, just to serve the alerts.
[62] More specifically, to their VTP clients. Typically, a user will use the
Science User Interface (the web portal to LSST Archive Center) to set up the
filters, and use their VTP client to receive the filtered VOEvent stream.
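To see how the example filter of § 4.5.2 behaves, it can be run against
stand-in alert objects. This is a minimal sketch: the alert layout (sources,
diaobject.nearest_neighbors, flags.GALAXY, dist, Re) simply mirrors the
illustrative filter and is not a defined LSST interface, and make_alert is a
hypothetical helper. The function is renamed keep_alert so that it does not
shadow Python's built-in filter, and a guard for an empty neighbor list is
added.

```python
from types import SimpleNamespace


def keep_alert(alert):
    """Same logic as the illustrative filter, with one added guard."""
    if len(alert.sources) > 1:
        return False  # the object has been seen before
    neighbors = alert.diaobject.nearest_neighbors
    if not neighbors:
        return False  # added guard: no neighbor to test against
    nn = neighbors[0]
    if not nn.flags.GALAXY:
        return False
    return nn.dist < 2.0 * nn.Re


def make_alert(n_sources, is_galaxy, dist, r_e):
    """Build a stand-in alert; the attribute layout is hypothetical."""
    nn = SimpleNamespace(
        flags=SimpleNamespace(GALAXY=is_galaxy), dist=dist, Re=r_e
    )
    return SimpleNamespace(
        sources=[object()] * n_sources,
        diaobject=SimpleNamespace(nearest_neighbors=[nn]),
    )


alerts = [
    make_alert(1, True, 1.5, 1.0),   # new event, within 2 Re of a galaxy
    make_alert(3, True, 1.5, 1.0),   # previously seen
    make_alert(1, False, 0.2, 1.0),  # nearest neighbor is not a galaxy
    make_alert(1, True, 2.5, 1.0),   # too far from the galaxy
]
print([keep_alert(a) for a in alerts])  # [True, False, False, False]
```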
We emphasize that this LSST-provided capability will be limited, and is
not intended to satisfy the wide variety of use cases that a full-fledged
public Event Broker could. For example, we do not plan to provide any
classification (e.g., "is the light curve consistent with an RR Lyra?", or
"a Type Ia SN?"). No information beyond what is contained in the VOEvent
packet will be available to user-defined filters (e.g., no cross-matches to
other catalogs). The complexity and run time of user-defined filters will be
limited by available resources. Execution latency will not be guaranteed. The
number of VOEvents transmitted to each user will be limited as well (e.g.,
at least up to ~20 per visit per user, dynamically throttled depending on
load). Finally, the total number of simultaneous subscribers is likely to be
limited; in case of overwhelming interest, a TAC-like proposal process may
be instituted.
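A per-user, per-visit cap of the kind described above could be sketched as
follows. The class and method names are hypothetical, the limit of 20 is
taken from the approximate figure in the text, and the dynamic,
load-dependent throttling is not modeled.

```python
from collections import defaultdict


class PerVisitThrottle:
    """Cap the number of alerts forwarded to each user in each visit."""

    def __init__(self, limit_per_visit=20):
        self.limit_per_visit = limit_per_visit
        # (user, visit_id) -> number of alerts forwarded so far
        self._sent = defaultdict(int)

    def allow(self, user, visit_id):
        """Return True and count the alert if the user is under the cap."""
        key = (user, visit_id)
        if self._sent[key] >= self.limit_per_visit:
            return False
        self._sent[key] += 1
        return True


throttle = PerVisitThrottle(limit_per_visit=20)
forwarded = sum(throttle.allow("alice", visit_id=1) for _ in range(25))
print(forwarded)                             # 20
print(throttle.allow("alice", visit_id=2))   # True: counted per visit
print(throttle.allow("bob", visit_id=1))     # True: counted per user
```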
5 Level 2 Data Products
5.1 Overview
Level 2 data products result from direct image[63] analysis. They're designed
to enable static sky science (e.g., studies of galaxy evolution, or weak
lensing), and time-domain science that is not time sensitive (e.g.,
statistical investigations of variability). They include image products
(reduced single-epoch exposures, called calibrated exposures, and coadds),
and catalog products (tables of objects, sources, their measured properties,
and related metadata).
Similarly to Level 1 catalogs of DIAObjects and DIASources, Objects
in the Level 2 catalog represent the astrophysical phenomena (stars,
galaxies, quasars, etc.), while Sources represent their single-epoch
observations. Sources are independently detected and measured in single-epoch
exposures and recorded in the Source table.
The master list of Objects in Level 2 will be generated by associating
and deblending the list of single-epoch DIASource detections and the lists of
sources detected on coadds. We plan to build coadds designed to maximize
depth ("deep coadds") and coadds designed to achieve a good combination of
depth and seeing ("best seeing coadds"), unless algorithms allow these
two to be the same. We will also build a series of short-period (e.g., yearly,
or multi-year) coadds. The flux limit in deep coadds will be significantly
fainter than in individual visits, and the best seeing coadds will help with
deblending the detected sources. The short-period coadds are necessary to
avoid missing faint objects showing long-term variability. These coadds will
be built for all bands, as well as some combining multiple bands ("multi-color
coadds"). Not all of these will be preserved after sources are detected
and measured (see § 5.4.3 for details). We will provide a facility to
regenerate their subsections as Level 3 tasks (§ 6).
The deblender will be run simultaneously on the catalog of peaks[64]
detected in the coadds, the DIAObject catalog from the Level 1 database, and
one or more external catalogs. It will use the knowledge of peak positions,
bands, time, time variability (from Level 1 and the single-epoch Source
detections), inferred motion, Galactic longitude and latitude, and other
available information to produce a master list of deblended Objects. Metadata
on why and how a particular Object was deblended will be kept.
[63] As opposed to difference image, for Level 1.
[64] The source detection algorithm we plan to employ finds regions of
connected pixels above the nominal S/N threshold in the PSF-likelihood image
of the visit (or coadd). These regions are called footprints. Each footprint
may have one or more peaks, and it is these peaks that the deblender will use
to infer the number and positions of objects blended in each footprint.
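The footprint and peak terminology of footnote 64 can be illustrated with a
toy detector. This sketch thresholds a small array directly, whereas the
planned algorithm operates on a PSF-likelihood image; the use of
4-connectivity and the strict local-maximum definition of a peak are
assumptions made here for simplicity, and find_footprints is a hypothetical
name.

```python
def find_footprints(image, threshold):
    """Group above-threshold pixels into 4-connected regions (footprints)
    and list the strict local maxima (peaks) inside each one."""
    n_rows, n_cols = len(image), len(image[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    footprints = []

    def neighbors(r, c):
        return ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))

    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if seen[r0][c0] or image[r0][c0] <= threshold:
                continue
            # Flood fill outward from this seed pixel.
            pixels, stack = [], [(r0, c0)]
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                pixels.append((r, c))
                for rr, cc in neighbors(r, c):
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and not seen[rr][cc]
                            and image[rr][cc] > threshold):
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            members = set(pixels)
            # A peak is brighter than every neighbor in the same footprint.
            peaks = sorted(
                (r, c) for r, c in pixels
                if all(image[r][c] > image[rr][cc]
                       for rr, cc in neighbors(r, c) if (rr, cc) in members)
            )
            footprints.append({"pixels": sorted(pixels), "peaks": peaks})
    return footprints


image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 5, 9, 5, 8, 4, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 7, 0],
]
for fp in find_footprints(image, threshold=3):
    print(len(fp["pixels"]), fp["peaks"])
# 5 [(1, 2), (1, 4)]
# 1 [(3, 5)]
```

The first footprint contains two peaks, which is the situation the deblender
is meant to resolve into separate objects.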
The properties of Objects, including their exact positions, motions,
parallaxes, and shapes, will be characterized by MultiFit-type algorithms[65].
Finally, to enable studies of variability, the fluxes of all Objects will be
measured on individual visits (using both direct and difference images), with
their shape parameters and deblending resolutions kept constant. This
process is known as forced photometry (see § 5.2.4), and the flux measurements
will be stored in the ForcedSource table.
5.2 Level 2 Data Processing
Figures 3 and 4 present a high-level overview of the Level 2 data processing
workflow[66]. Logically[67], the processing begins with single-visit image
reduction and source measurement, followed by global astrometric and
photometric calibration, coadd creation, detection on coadds, association and
deblending, object characterization, and forced photometry measurements.
The following is a high-level description of steps which will occur during
regular Level 2 data processing (bullets 1 and 2 below map to pipeline 1,
Single Visit Processing, in Figure 3, bullet 3 is pipeline 2, Image Coaddition,
bullets 4-6 map to pipeline 3, Coadded Image Analysis, and bullet 7 is pipeline
4, Multi-epoch Object Characterization):
1. Single Visit Processing: Raw exposures are reduced to calibrated visit
   exposures, and Sources are independently detected, deblended, and
   measured on all visits. Their measurements (instrumental fluxes and
   shapes) are stored in the Source table.
2. Relative calibration: The survey is internally calibrated, both
   photometrically and astrometrically. Relative zero point and astrometric
   corrections are computed for every visit. Sufficient data is kept to
   reconstruct the normalized system response function φ_b(λ) (see Eq. 5,
   SRD) at every position in the focal plane at the time of each visit as
   required by § 3.3.4 of the SRD.
[65] "MultiFit algorithms" are those that fit a PSF-convolved model (so-called
"forward modeling") to all multi-epoch observations of an object. This
approach is in contrast to measurement techniques where multi-epoch images
are coadded first, and the properties are measured from the coadded pixels.
[66] Note that some LSST documents refer to Data Release Processing, which
includes both Level 1 reprocessing (see § 4.3.5), and the Level 2 processing
described here.
[67] The actual implementation may parallelize these steps as much as possible.
3. Coadd creation: Deep, seeing-optimized, and short-period per-band
   coadds are created in ugrizy bands, as well as deeper, multi-color,
   coadds[68]. Transient sources (including Solar System objects, explosive
   transients, etc.) will be rejected from the coadds. See § 5.4.3 for
   details.
4. Coadd source detection. Sources will be detected on all coadds generated
   in the previous step. The source detection algorithm will detect
   regions of connected pixels, known as footprints, above the nominal
   S/N threshold in the PSF-likelihood image of the visit. An appropriate
   algorithm will be run to also detect extended low surface brightness
   objects (e.g., binned detection algorithm from SDSS). Each footprint
   may have one or more peaks, and the collection of these peaks (and
   their membership in the footprints) is the output of this stage.
5. Association and deblending. The next stage in the pipeline, which we
   will for simplicity just call the deblender, will synthesize a list of
   unique objects. In doing so it will consider the catalogs of CoaddSources,
   catalogs of DIASources, DIAObjects and SSObjects detected on difference
   images, and objects from external catalogs[69].
   The deblender will make use of all information available at this stage,
   including the knowledge of peak positions, bands, time, time variability
   (from Level 1), Galactic longitude and latitude, etc. The output of this
   stage is a list of uncharacterized Objects[70].
[68] We'll denote the "band" of the multi-color coadd as 'M'.
[69] Note that Sources are not considered when generating the Object list
(given the large number of visits in each band, the false positives close to
the faint end would increase the complexity of association and deblending
algorithms). It is possible for intermittent sources that are detected just
above the faint detection limit of single visits to be undetected in coadds,
and thus to not have a matching Object. To enable easy identification of such
Sources, the nearest Object associated with each Source, if any, will be
recorded.
[70] Depending on the exact implementation of the deblender, this stage may
also attach significant metadata (e.g., deblended footprints and pixel-weight
maps) to each deblended Object record.
6. Coadd object characterization. Object properties such as adaptive
   moments and aperture fluxes will be measured on the coadds. These will
   be stored in additional columns in the Object table. Models fit in
   multi-epoch object characterization will also be fit first to the coadds
   and stored.
7. Multi-epoch object characterization. A set of predefined model fits and
   measurements will be performed on each of the Objects identified in
   the previous step, taking all available multi-epoch data into account.
   Model fits will be performed using MultiFit-type algorithms. Rather
   than coadding a set of images and measuring object characteristics on
   the coadd, MultiFit simultaneously fits PSF-convolved models to the
   object's multiple observations. This reduces systematic errors, improves
   the overall S/N, and allows for fitting of time-dependent quantities
   degenerate with shape on the coadds (for example, the proper motion).
   The models we plan to fit will not allow for flux variability (see the
   next item).
8. Forced Photometry. Source fluxes will be measured on every visit, with
   the position, motion, shape, and the deblending parameters characterized
   in the previous step kept fixed. Measurements will be performed
   on both direct images and difference images. This process of forced
   photometry will result in the characterization of the light-curve for each
   object in the survey. The fluxes will be stored in the ForcedSource
   table.
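The correspondence between the numbered steps above and the pipelines of
Figure 3 can be written down as a small lookup. The data structure and names
are ours; steps 1 to 7 follow the mapping stated in the text, while the text
does not assign step 8 to a pipeline, so it is left unassigned (None) here
rather than guessed.

```python
# (step number, step name, pipeline it maps to)
LEVEL2_STEPS = [
    (1, "Single Visit Processing", "Single Visit Processing"),
    (2, "Relative calibration", "Single Visit Processing"),
    (3, "Coadd creation", "Image Coaddition"),
    (4, "Coadd source detection", "Coadded Image Analysis"),
    (5, "Association and deblending", "Coadded Image Analysis"),
    (6, "Coadd object characterization", "Coadded Image Analysis"),
    (7, "Multi-epoch object characterization",
     "Multi-epoch Object Characterization"),
    (8, "Forced Photometry", None),  # not mapped in the text
]


def steps_by_pipeline(steps):
    """Group step numbers by pipeline, preserving the listed order."""
    grouped = {}
    for number, _name, pipeline in steps:
        grouped.setdefault(pipeline, []).append(number)
    return grouped


for pipeline, numbers in steps_by_pipeline(LEVEL2_STEPS).items():
    print(pipeline, numbers)
```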
5.2.1 Object Characterization Measures
Properties of detected objects will be measured as a part of the object
characterization step described in the previous section and stored in the
Object table. These measurements are designed to enable LSST "static sky"
science. This section discusses at a high level which properties will be
measured and how those measurements will be performed. For a detailed list of
quantities being fit/measured, see the table in § 5.3.1.
All measurements discussed in this section deal with properties of objects,
and will be performed on multi-epoch coadds, or by simultaneously fitting
to all epochs. Measurements of sources in individual visits, independent of
all others, are described in § 5.2.3.
To enable science cases depending on observations of non-variable objects
in the LSST-observed sky, we plan to measure the following using the
MultiFit approach:
• Point source model fit. The observed object is modeled as a point
  source with finite proper motion and parallax and constant flux
  (allowed to be different in each band). This model is a good description
  for non-variable stars and other unresolved sources. Its 11 parameters
  will be simultaneously constrained using information from all available
  observations in all bands[71].
• Bulge-disk model fit. The object is modeled as a sum of a de Vaucouleurs
  (Sersic n = 4) and an exponential (Sersic n = 1) component.
  This model is intended to be a reasonable description of galaxies[72].
  The object is assumed not to move[73]. The components share the same
  ellipticity and center. The model is independently fit to each band.
  There are a total of 8 free parameters, which will be simultaneously
  constrained using information from all available epochs for each band.
  Where there's insufficient data to constrain the likelihood (e.g., small,
  poorly resolved, galaxies, or very few epochs), priors will be adopted
  to limit the range of its sampling.
  We will also explore fitting the galaxy model simultaneously to all
  bands, with some parameters constrained to be the same or related
  via a hierarchical model across bands. As this reduces the number of
  overall model parameters significantly, we could then consider freeing
  up other parameters. One likely scenario is that we would allow the
  bulge and disk ellipticities to differ; another possibility is allowing the
  Sersic indices of one or both components to vary. The ultimate
  determination of which model to use will be driven by empirical tests of
  the robustness and quality of the fits on both low- and high-redshift
  galaxies.
  In addition to the maximum likelihood values of fitted parameters and
  their covariance matrix, a number (currently planned to be ~200, on
  average[74]) of independent samples from the likelihood function will be
  provided. These will enable use-cases sensitive to departures from the
  Gaussian approximation, with shear measurement being the primary
  use case. A permissible descope, in case of insufficient storage, will be
  not to sample the posterior for u and y bands.
[71] The fitting procedure will account for differential chromatic refraction.
[72] We may reconsider this choice if a better suited parametrization is
discovered while LSST is in Construction.
[73] I.e., have zero proper motion and trigonometric parallax.
• Standard colors. Colors of the object in "standard seeing" (for example,
  the third quartile expected survey seeing in the i band, ~0.9 arcsec) will
  be measured. These colors are guaranteed to be seeing-insensitive, suitable
  for estimation of photometric redshifts[75].
• Centroids. Centroids will be computed independently for each band
  using an algorithm similar to that employed by SDSS. Information
  from all[76] epochs will be used to derive the estimate. These centroids
  will be used for adaptive moment, Petrosian, Kron, standard color, and
  aperture measurements.
• Adaptive moments. Adaptive moments will be computed using information
  from all epochs, independently for each band. The moments of
  the PSF realized at the position of the object will be provided as well.
• Petrosian and Kron fluxes. Petrosian and Kron radii and fluxes will be
  measured in standard seeing using self-similar elliptical apertures
  computed from adaptive moments. The apertures will be PSF-corrected
  and homogenized, convolved to a canonical circular PSF[77]. The radii
  will be computed independently for each band. Fluxes will be computed
  in each band, by integrating the light within some multiple of
  the radius measured in the canonical band[78] (most likely the i band).
  Radii enclosing 50% and 90% of light will be provided.
[74] This choice of the number of independent samples will be verified during
Construction.
[75] The problem of optimal determination of photometric redshift is the
subject of intense research. The approach we're taking here is conservative,
following contemporary practices. As new insights develop, we will revisit
the issue.
[76] Whenever we say all, it should be understood that this does not preclude
reasonable data quality cuts to exclude data that would otherwise degrade the
measurement.
[77] This is an attempt to derive a definition of elliptical apertures that
does not depend on seeing. For example, for a large galaxy, the correction to
standard seeing will introduce little change to measured ellipticity.
Corrected apertures for small galaxies will tend to be circular (due to
smearing by the PSF). In the intermediate regime, this method results in
derived apertures that are relatively seeing-independent. Note that this is
only the case for apertures; the measured flux will still be seeing dependent
and it is up to the user to take this into account.
• Aperture surface brightness. Aperture surface brightness will be computed
  in a variable number[79] of concentric, logarithmically spaced,
  PSF-homogenized, elliptical apertures, in standard seeing.
• Variability characterization. Parameters will be provided, designed to
  characterize periodic and aperiodic variability features (Richards et al.
  2011), in each bandpass. We caution that the exact features in use
  when LSST begins operations are likely to differ from the
  baseline described here; this is to be expected given the rapid pace of
  research in time domain astronomy. However, their number is unlikely
  to grow beyond the present estimate.
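The parameter counts quoted in the list above (11 for the point source model,
8 per band for the bulge-disk model) can be tallied as a quick consistency
check. The breakdown into individual parameters is our reading of the model
descriptions, not a statement from the text, and the dictionary keys are
hypothetical labels. The second number printed for each model is the count of
unique elements in a symmetric covariance matrix of that size, n(n + 1)/2.

```python
def n_unique_cov(n_params):
    """Unique elements (variances plus covariances) of a symmetric
    n x n covariance matrix."""
    return n_params * (n_params + 1) // 2


N_BANDS = 6  # u, g, r, i, z, y

# Assumed breakdown: one shared astrometric solution, one flux per band.
point_source = {
    "position": 2, "proper_motion": 2, "parallax": 1, "flux": N_BANDS,
}
# Assumed breakdown for a single band: shared center and ellipticity,
# plus a flux and an effective radius for each of the two components.
bulge_disk_per_band = {
    "position": 2, "ellipticity": 2,
    "flux_bulge": 1, "flux_disk": 1,
    "radius_bulge": 1, "radius_disk": 1,
}

n_ps = sum(point_source.values())
n_bd = sum(bulge_disk_per_band.values())
print(n_ps, n_unique_cov(n_ps))  # 11 66
print(n_bd, n_unique_cov(n_bd))  # 8 36
```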
5.2.2 Supporting Science Cases Requiring Full Posteriors
Science cases sensitive to systematics, departures of likelihood from
Gaussianity, or requiring user-specified priors, demand knowledge of the
shape of the likelihood function beyond a simple Gaussian approximation around
the ML value. The estimate of bulge-disk model parameters and the estimate of
photometric redshifts are two examples where knowledge of the full posterior
is likely to be needed for LSST science cases.
We currently plan to provide this information in two ways: a) by providing
independent samples from the likelihood function (in the case of the
bulge-disk model), and b) by providing parametric estimates of the likelihood
function (for the photometric redshifts). As will be shown in Table 4, the
current allocation is ~200 samples (on average) for the bulge-disk model,
and ~100 parameters for describing the photo-Z likelihood distributions,
per object.
[78] The shape of the aperture in all bands will be set by the profile of the
galaxy in the canonical band alone. This procedure ensures that the color
measured by comparing the flux in different bands is measured through a
consistent aperture. See org/dr7/algorithms/photometry.html for details.
[79] The number will depend on the size of the source.
The methods of storing likelihood functions (or samples thereof) will
continue to be developed and optimized throughout Construction and
Commissioning. The key limitation, the amount of data that needs to be
stored, can be overcome by compression techniques. For example, simply
noticing that not more than ~0.5% accuracy is needed for sample values allows
one to increase the number of samples by a factor of 4. More advanced
techniques, such as PCA of the likelihoods across the entire catalog, may
allow us to store even more, providing a better estimate of the shape of the
likelihood function. In that sense, what is presented in Table 4 should be
thought of as a conservative estimate, which we plan to improve upon as
development continues in Construction.
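One possible reading of the factor-of-4 estimate in § 5.2.2 is sketched below.
The text does not spell out the encoding it has in mind; this example assumes
a 32-bit baseline per sample value and linear quantization to 8 bits over a
known range, and checks that the rounding error stays within 0.5% of that
range. The function names are hypothetical.

```python
def quantize(value, lo, hi, n_bits=8):
    """Map a value in [lo, hi] to an integer code of n_bits resolution."""
    levels = (1 << n_bits) - 1
    return round((value - lo) / (hi - lo) * levels)


def dequantize(code, lo, hi, n_bits=8):
    """Recover the approximate value from its integer code."""
    levels = (1 << n_bits) - 1
    return lo + code / levels * (hi - lo)


lo, hi = 0.0, 1.0
# Worst round-trip error over a fine grid of test values in [lo, hi].
worst = max(
    abs(dequantize(quantize(v / 1000, lo, hi), lo, hi) - v / 1000)
    for v in range(1001)
)
print(worst <= 0.005 * (hi - lo))  # True: within 0.5% of the range
print(32 // 8)                     # 4: samples per 32-bit slot
```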
5.2.3 Source Characterization
Sources will be detected on individual visits as well as the coadds. Sources
detected on coadds will primarily serve as inputs to the construction of the
master object list as described in § 5.2, and may support other LSST science
cases as seen fit by the users (for example, searches for objects whose shapes
vary over time).
The following Source properties are planned to be measured:
• Static point source model fit. The source is modeled as a static point
  source. There are a total of 3 free parameters (α, δ, flux). This model
  is a good description of stars and other unresolved sources.
• Centroids. Centroids will be computed using an algorithm similar to
  that employed by SDSS. These centroids will be used for adaptive
  moment and aperture magnitude measurements.
• Adaptive moments. Adaptive moments will be computed. The moments
  of the PSF realized at the position of the object will be provided
  as well.
• Aperture surface brightness. Aperture surface brightness will be computed
  in a variable number[80] of concentric, logarithmically spaced,
  PSF-homogenized, elliptical apertures.
Note that we do not plan to fit extended source Bulge+Disk models to
individual Sources, nor measure per-visit Petrosian or Kron fluxes. These
are object properties that are not expected to vary in time[81], and will be
better characterized by MultiFit (in the Object table). For example, although
a simple extendedness characterization is present in the Source table,
star-galaxy separation (an estimate of the probability that a source is
resolved, given the PSF) will be better characterized by MultiFit.
[80] The number will depend on the size of the source.
5.2.4 Forced Photometry
Forced Photometry is the measurement of flux in individual visits, given a
fixed position, shape, and the deblending parameters of an object. It enables
the study of time variability of an object's flux, irrespective of whether the
flux in any given individual visit is above or below the single-visit detection
threshold.
Forced photometry will be performed on all visits, for all Objects, using
both direct images and difference images. The measured fluxes will be stored
in the ForcedSource table. Due to space constraints, we only plan to
measure the PSF flux.
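The idea of a PSF flux measured at a fixed position can be illustrated with a
standard linear least-squares estimate, in which the flux is the only free
parameter. This is a minimal sketch and not the LSST implementation: a
circular Gaussian stands in for the real PSF model, deblending is ignored,
and the function names are hypothetical.

```python
import math


def gaussian_psf(size, x0, y0, sigma):
    """Unit-sum circular Gaussian on a size x size pixel grid.
    A Gaussian is only a stand-in for a real PSF model."""
    model = [
        [math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, model))
    return [[value / total for value in row] for row in model]


def forced_psf_flux(image, variance, psf):
    """Least-squares amplitude of a fixed PSF model and its uncertainty.
    Position and shape enter only through psf; they are not fitted."""
    num = den = 0.0
    for img_row, var_row, psf_row in zip(image, variance, psf):
        for d, v, p in zip(img_row, var_row, psf_row):
            num += p * d / v
            den += p * p / v
    return num / den, math.sqrt(1.0 / den)


size, true_flux = 11, 250.0
psf = gaussian_psf(size, x0=5.0, y0=5.0, sigma=1.5)
image = [[true_flux * p for p in row] for row in psf]  # noiseless image
variance = [[4.0] * size for _ in range(size)]
flux, flux_err = forced_psf_flux(image, variance, psf)
print(round(flux, 6))  # 250.0
```

Because the estimate is linear in the pixel values, it is well defined even
when the source is below the single-visit detection threshold; on a blank
image it simply returns a flux consistent with zero.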
5.2.5 Crowded Field Photometry
A fraction of LSST imaging will cover areas of high object (mostly stellar)
density. These include the Galactic plane, the Large and Small Magellanic
Clouds, and a number of globular clusters (among others).
LSST image processing and measurement software, although primarily
designed to operate in non-crowded regions, is expected to perform well in
areas of crowding. The current LSST applications development plan
envisions making the deblender aware of Galactic longitude and latitude, and
permitting it to use that information as a prior when deciding how to
deblend objects. While not guaranteed to reach the accuracy or completeness
of purpose-built crowded field photometry codes, we expect this approach
will yield acceptable results even in areas of moderately high crowding.
Note that this discussion only pertains to processing of direct images.
Crowding is not expected to significantly impact the quality of data products
derived from difference images (i.e., Level 1).
[81] Objects that do change shape with time would, obviously, be of particular
interest. Aperture fluxes provided in the Source table should suffice to
detect these. Further per-visit shape characterization can be performed as a
Level 3 task.
5.3 The Level 2 Catalogs
This section presents the contents of key Level 2 catalog tables. As was
the case for Level 1 (see § 4.3), here we present the conceptual schemas for
the most important Level 2 tables (the Object, Source, and ForcedSource
tables).
These convey what data will be recorded in each table, rather than the
details of how. For example, columns whose type is an array (e.g., radec) may
be expanded to one table column per element of the array (e.g., ra, decl) once
this schema is translated to SQL. Secondly, the tables to be presented are
normalized (i.e., contain no redundant information). For example, since the
band of observation can be found by joining a Source table to the table with
exposure metadata, there's no column named band in the Source table. In
the as-built database, the views presented to the users will be appropriately
denormalized for ease of use.
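The expansion of array-typed conceptual columns into scalar database columns,
as described above, might look like the following sketch. The function and
the underscore-based naming scheme are hypothetical; the text only gives
radec expanding to ra and decl as an example and does not define a naming
rule.

```python
BANDS = "ugrizy"


def expand_column(name, dims):
    """Expand one conceptual array column into scalar column names.

    Each entry of dims is either an int (array length, indexed 0..n-1)
    or a sequence of element labels. The naming scheme is hypothetical.
    """
    names = [name]
    for dim in dims:
        if isinstance(dim, int):
            labels = [str(i) for i in range(dim)]
        else:
            labels = list(dim)
        names = [f"{prefix}_{label}" for prefix in names for label in labels]
    return names


print(expand_column("psPm", [["ra", "decl"]]))
# ['psPm_ra', 'psPm_decl']
print(expand_column("psFlux", [BANDS]))
# ['psFlux_u', 'psFlux_g', 'psFlux_r', 'psFlux_i', 'psFlux_z', 'psFlux_y']
print(len(expand_column("bdEllip", [2, BANDS])))
# 12
```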
5.3.1 The Object Table
Table 4: Level 2 Catalog Object Table
Name | Type | Unit | Description
objectId | uint64 | | Unique object identifier
parentObjectId | uint64 | | ID of the parent Object this object has been deblended from, if any.
radec | double[6][2] | arcsec | Position of the object (centroid), computed independently in each band. The centroid will be computed using an algorithm similar to that employed by SDSS.
radecErr | double[6][2] | arcsec | Uncertainty of radec.
psRadecTai | double | time | Point source model: Time at which the object was at position psRadec.
psRadec | double[2] | degrees | Point source model: (α, δ) position of the object at time psRadecTai.
psPm | float[2] | mas/yr | Point source model: Proper motion vector.
psParallax | float | mas | Point source model: Parallax.
psFlux | float[ugrizy] | nmgy | Point source model fluxes[82].
psCov | float[66] | various | Point-source model covariance matrix[83].
psLnL | float | | Natural log likelihood of the observed data given the point source model.
psChi2 | float | | χ² statistic of the model fit.
psNdata | int | | The number of data points (pixels) used to fit the model.
bdRadec | double[2][ugrizy] | | B+D model[84]: (α, δ) position of the object, in each band.
bdEllip | float[2][ugrizy] | | B+D model: Ellipticity (e1, e2) of the object.
bdFluxB | float[ugrizy] | nmgy | B+D model: Integrated flux of the de Vaucouleurs component.
bdFluxD | float[ugrizy] | nmgy | B+D model: Integrated flux of the exponential component.
[82] Point source model assumes that fluxes are constant in each band. If the
object is variable, psFlux will effectively be some estimate of the average
flux.
[83] Not all elements of the covariance matrix need to be stored with same
precision. While the variances will be stored as 32-bit floats (~seven
significant digits), the covariances may be stored to ~three significant
digits (~1%).
[84] Though we refer to this model as "Bulge plus Disk", we caution the reader
that the decomposition, while physically motivated, should not be taken too
literally.
bdReB | float[ugrizy] | arcsec | B+D model: Effective radius of the de Vaucouleurs profile component.
bdReD | float[ugrizy] | arcsec | B+D model: Effective radius of the exponential profile component.
bdCov | float[36][ugrizy] | | B+D model covariance matrix[85].
bdLnL | float[ugrizy] | | Natural log likelihood of the observed data given the bulge+disk model.
bdChi2 | float[ugrizy] | | χ² statistic of the model fit.
bdNdata | int[ugrizy] | | The number of data points (pixels) used to fit the model.
bdSamples | float16[9][200][ugrizy] | | Independent samples of bulge+disk likelihood surface. All sampled quantities will be stored with at least ~3 significant digits of precision. The number of samples will vary from object to object, depending on how well the object's likelihood function is approximated by a Gaussian.
stdColor | float[5] | mag | Color of the object measured in "standard seeing". While the exact algorithm is yet to be determined, this color is guaranteed to be seeing-independent and suitable for photo-Z determinations.
stdColorErr | float[5] | mag | Uncertainty of stdColor.
Continued on next page
85
See psCov for notes on precision of variances/covariances.
Ixx | float | nmgy asec² | Adaptive second moment of the source intensity. See Bernstein & Jarvis (2002) for detailed discussion of all adaptive-moment related quantities [86].
Iyy | float | nmgy asec² | Adaptive second moment of the source intensity.
Ixy | float | nmgy asec² | Adaptive second moment of the source intensity.
Icov | float[6] | nmgy² asec⁴ | Ixx, Iyy, Ixy covariance matrix.
IxxPSF | float | nmgy asec² | Adaptive second moment for the PSF.
IyyPSF | float | nmgy asec² | Adaptive second moment for the PSF.
IxyPSF | float | nmgy asec² | Adaptive second moment for the PSF.
m4 | float[ugrizy] | | Fourth order adaptive moment.
petroRad | float[ugrizy] | arcsec | Petrosian radius, computed using elliptical apertures defined by the adaptive moments.
petroRadErr | float[ugrizy] | arcsec | Uncertainty of petroRad.
petroBand | int8 | | The band of the canonical petroRad.
petroFlux | float[ugrizy] | nmgy | Petrosian flux within a defined multiple of the canonical petroRad.
petroFluxErr | float[ugrizy] | nmgy | Uncertainty in petroFlux.
petroRad50 | float[ugrizy] | arcsec | Radius containing 50% of Petrosian flux.
petroRad50Err | float[ugrizy] | arcsec | Uncertainty of petroRad50.
[86] See http://ls.st/5f4 for a brief summary.
petroRad90 | float[ugrizy] | arcsec | Radius containing 90% of Petrosian flux.
petroRad90Err | float[ugrizy] | arcsec | Uncertainty of petroRad90.
kronRad | float[ugrizy] | arcsec | Kron radius (computed using elliptical apertures defined by the adaptive moments).
kronRadErr | float[ugrizy] | arcsec | Uncertainty of kronRad.
kronBand | int8 | | The band of the canonical kronRad.
kronFlux | float[ugrizy] | nmgy | Kron flux within a defined multiple of the canonical kronRad.
kronFluxErr | float[ugrizy] | nmgy | Uncertainty in kronFlux.
kronRad50 | float[ugrizy] | arcsec | Radius containing 50% of Kron flux.
kronRad50Err | float[ugrizy] | arcsec | Uncertainty of kronRad50.
kronRad90 | float[ugrizy] | arcsec | Radius containing 90% of Kron flux.
kronRad90Err | float[ugrizy] | arcsec | Uncertainty of kronRad90.
apNann | int8 | | Number of elliptical annuli (see below).
apMeanSb | float[6][apNann] | nmgy/as² | Mean surface brightness within an annulus [87].
apMeanSbSigma | float[6][apNann] | nmgy/as² | Standard deviation of apMeanSb.
[87] A database function will be provided to compute the area of each annulus, to enable
the computation of aperture flux.
extendedness | float | | A measure of extendedness, computed using a combination of available moments, or from a likelihood ratio of point/B+D source models (exact algorithm TBD). extendedness = 1 implies a high degree of confidence that the source is extended. extendedness = 0 implies a high degree of confidence that the source is point-like.
lcPeriodic | float[6 × 32] | | Periodic features extracted from difference image-based light-curves using generalized Lomb-Scargle periodogram (Table 4, Richards et al. 2011).
lcNonPeriodic | float[6 × 20] | | Non-periodic features extracted from difference image-based light-curves (Table 5, Richards et al. 2011).
photoZ | float[2 × 100] | | Photometric redshift likelihood samples, given as pairs of (z, logL), computed using a to-be-determined published and widely accepted algorithm at the time of LSST Commissioning.
flags | bit[128] | bit | Various useful flags.
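Most fluxes in the Object table are quoted in nanomaggies (nmgy). As a minimal sketch, assuming the SDSS convention in which a flux of 1 nmgy corresponds to an AB magnitude of 22.5, a flux and its uncertainty could be converted to a magnitude as shown below. The function name nmgy_to_abmag is hypothetical and is not part of any LSST interface.

```python
import math

# Assumption: 1 nmgy corresponds to magnitude 22.5 (SDSS nanomaggie convention).
ZERO_POINT_MAG = 22.5


def nmgy_to_abmag(flux_nmgy, flux_err_nmgy):
    """Convert a positive flux in nmgy to (magnitude, magnitude error)."""
    if flux_nmgy <= 0:
        raise ValueError("magnitude is undefined for non-positive flux")
    mag = ZERO_POINT_MAG - 2.5 * math.log10(flux_nmgy)
    # First-order error propagation: sigma_m = (2.5 / ln 10) * sigma_f / f
    mag_err = 2.5 / math.log(10.0) * flux_err_nmgy / flux_nmgy
    return mag, mag_err
```

Non-positive fluxes, which are legitimate measurements for faint or absent sources, have no magnitude; this sketch rejects them rather than inventing a value.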
5.3.2 Source Table
Source measurements are performed independently on individual visits. They are
designed to enable relative astrometric and photometric calibration, variability
studies of high signal-to-noise objects, and studies of high SNR objects
that vary in position and/or shape (e.g., comets).
Table 5: Level 2 Catalog Source Table
Name | Type | Unit | Description
sourceId | uint64 | | Unique source identifier [88].
ccdVisitId | uint64 | | ID of CCD and visit where this source was measured.
objectId | uint64 | | ID of the Object this source was associated with, if any.
ssObjectId | uint64 | | ID of the SSObject this source has been linked to, if any.
parentSourceId | uint64 | | ID of the parent Source this source has been deblended from, if any.
xy | float[2] | pixels | Position of the object (centroid), computed using an algorithm similar to that used by SDSS.
xyCov | float[3] | | Covariance matrix for xy.
radec | double[2] | arcsec | Calibrated (α, δ) of the source, transformed from xy.
radecCov | float[3] | arcsec | Covariance matrix for radec.
apFlux | float | nmgy | Calibrated aperture flux.
apFluxErr | float | nmgy | Estimated uncertainty of apFlux.
[88] It would be optimal if the source ID were globally unique across all releases. Whether
that is realized will depend on technological and space constraints.
sky | float | nmgy/asec² | Estimated background (sky) surface brightness at the position (centroid) of the source.
skyErr | float | nmgy/asec² | Estimated uncertainty of sky.
psRadec | double[2] | degrees | Point source model: (α, δ) position of the object.
psFlux | float | nmgy | Calibrated point source model flux.
psCov | float[6] | various | Point-source model covariance matrix [89].
psLnL | float | | Natural log likelihood of the observed data given the point source model.
psChi2 | float | | χ² statistic of the model fit.
psNdata | int | | The number of data points (pixels) used to fit the model.
Ixx | float | nmgy asec² | Adaptive second moment of the source intensity. See Bernstein & Jarvis (2002) for detailed discussion of all adaptive-moment related quantities [90].
Iyy | float | nmgy asec² | Adaptive second moment of the source intensity.
Ixy | float | nmgy asec² | Adaptive second moment of the source intensity.
[89] Not all elements of the covariance matrix will be stored with the same precision. While
the variances will be stored as 32 bit floats (~seven significant digits), the covariances
may be stored to ~three significant digits (~1%).
[90] See http://ls.st/5f4 for a brief summary.
Icov | float[6] | nmgy² asec⁴ | Ixx, Iyy, Ixy covariance matrix.
IxxPSF | float | nmgy asec² | Adaptive second moment for the PSF.
IyyPSF | float | nmgy asec² | Adaptive second moment for the PSF.
IxyPSF | float | nmgy asec² | Adaptive second moment for the PSF.
apNann | int8 | | Number of elliptical annuli (see below).
apMeanSb | float[apNann] | nmgy | Mean surface brightness within an annulus.
apMeanSbSigma | float[apNann] | nmgy | Standard deviation of apMeanSb.
extendedness | float | | A measure of extendedness, computed using a combination of available moments (exact algorithm TBD). extendedness = 1 implies a high degree of confidence that the source is extended. extendedness = 0 implies a high degree of confidence that the source is point-like.
flags | bit[64] | bit | Various useful flags.
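The covariance columns in the Source table are stored in packed form: psCov holds 6 values for 3 fitted parameters and xyCov holds 3 values for 2, that is, the n(n+1)/2 unique elements of a symmetric n × n matrix. The sketch below expands such a vector into the full matrix. The storage order is not defined in this document; the code assumes the row-major upper triangle, and the function name unpack_covariance is hypothetical.

```python
def unpack_covariance(packed):
    """Expand n(n+1)/2 packed values into a symmetric n x n matrix.

    Assumption: values are ordered as the row-major upper triangle,
    i.e. (0,0), (0,1), ..., (0,n-1), (1,1), ... (order not defined here).
    """
    values = [float(v) for v in packed]
    # Solve n(n+1)/2 = len(values) for n.
    n = int(round(((8 * len(values) + 1) ** 0.5 - 1) / 2))
    if n * (n + 1) // 2 != len(values):
        raise ValueError("length is not of the form n(n+1)/2")
    cov = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i, n):
            cov[i][j] = values[k]
            cov[j][i] = values[k]  # mirror into the lower triangle
            k += 1
    return cov
```

The same helper applies to the 66-element psCov of the Object table, which under this assumption expands to an 11 × 11 matrix.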
5.3.3 ForcedSource Table
Table 6: Level 2 Catalog ForcedSource Table
Name | Type | Unit | Description
objectId | uint64 | | Unique object identifier.
ccdVisitId | uint64 | | ID of CCD and visit where this source was measured.
psFlux | float | nmgy | Point source model flux on direct image, if performed.
psFluxErr | float | nmgy | Point source model flux error, stored to 1% precision.
psDiffFlux | float | nmgy | Point source model flux on difference image, if performed.
psDiffFluxErr | float | nmgy | Point source model flux error, stored to 1% precision.
flags | bit[8] | bit | Various useful flags.
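Each ForcedSource row gives one per-visit flux and its uncertainty for an Object, so the rows for a given objectId form a light curve. As an illustration of how a user might summarize such a light curve (this is not a project algorithm), the sketch below computes the inverse-variance weighted mean flux and the χ² of a constant-flux model. The function name constant_flux_fit is hypothetical.

```python
def constant_flux_fit(fluxes, flux_errs):
    """Return (weighted mean flux, its uncertainty, chi2) for a constant model.

    fluxes and flux_errs are equal-length sequences, e.g. the psFlux and
    psFluxErr values of all ForcedSource rows sharing one objectId.
    """
    if len(fluxes) == 0 or len(fluxes) != len(flux_errs):
        raise ValueError("need equal-length, non-empty flux and error lists")
    weights = [1.0 / (err * err) for err in flux_errs]
    weight_sum = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / weight_sum
    mean_err = weight_sum ** -0.5
    # chi2 of the data against the constant (mean) flux
    chi2 = sum(w * (f - mean) ** 2 for w, f in zip(weights, fluxes))
    return mean, mean_err, chi2
```

A χ² much larger than the number of visits minus one would hint at variability, although a real analysis would also need to account for per-band differences and flagged measurements.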
5.4 Level 2 Image Products
5.4.1 Visit Images
Raw exposures, including individual snaps, and processed visit images will be
made available for download as FITS files. They will be downloadable both
through a human-friendly Science User Interface and through machine-friendly
APIs.
Required calibration data, processing metadata, and all necessary image
processing software will be provided to enable the user to generate bitwise
identical processed images from raw images [91].
5.4.2 Calibration Data
All calibration frames (darks, flats, biases, fringe, etc.) will be preserved and
made available for download as FITS files.
All auxiliary telescope data, both raw (images with spectra) and processed
(calibrated spectra, derived atmosphere models), will be preserved
and made available for download.
[91] Assuming identically performing software and hardware configuration.
5.4.3 Coadded Images
In the course of Level 2 processing, numerous coadds of multiple classes will
be created:
• A set of deep coadds. One deep coadd will be created for each of the
ugrizy bands, plus a seventh, deeper, multi-color coadd. These coadds
will be optimized for a reasonable combination of depth (i.e., employ no
PSF matching) and resolution (i.e., visits with significantly degraded
seeing may be omitted). Transient sources (including Solar System
objects, explosive transients, etc.) will be removed. Care will be taken
to preserve the astrophysical backgrounds [92].
The six per-band coadds will be kept indefinitely and made available to
the users. Their primary purpose is to enable the end-users to apply
alternative object characterization algorithms, to perform studies of diffuse
structures, and to support visualization.
• A set of best seeing coadds. One deep coadd will be created for each of
the ugrizy bands, using only the best seeing data (for example, using
only the first quartile of the realized seeing distribution). These will
be built to assist the deblending process. We do not plan to keep these
coadds or make them available. We will retain and provide
sufficient metadata for users to re-create them using Level 3 or other
resources.
• A set of short-period coadds. These will comprise multiple (ugrizyM)
sets of yearly and multi-year coadds. Each of these sets will be created
using only a subset of the data, and otherwise share the characteristics
of the deep coadds described above. These are designed to enable detection
of long-term variable or moving [93] objects that would be "washed
out" (or rejected) in full-depth coadds. We do not plan to keep these
coadds or make them available. We will retain and provide sufficient
metadata for users to re-create them using Level 3 or other resources.
• One (ugrizyM) set of PSF-matched coadds. These will be used to
measure colors and shapes of objects at "standard" seeing. We do
not plan to keep these coadds or make them available. We will
retain and provide sufficient metadata for users to re-create them using
Level 3 or other resources.
[92] For example, using "background matching" techniques; http://ls.st/l9u
[93] For example, nearby high proper motion stars.
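The best seeing coadds described above are built from a subset of visits, for example the first quartile of the realized seeing distribution. A minimal sketch of such a selection follows. The input layout (pairs of visit identifier and seeing in arcseconds) and the function name select_best_seeing are assumptions made for illustration; the actual selection criteria are not specified in this document.

```python
def select_best_seeing(visits, fraction=0.25):
    """Return the visits with the smallest seeing values.

    visits:   iterable of (visit_id, seeing_arcsec) pairs (hypothetical layout).
    fraction: share of visits to keep; 0.25 mimics the "first quartile".
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(visits, key=lambda visit: visit[1])
    if not ranked:
        return []
    # Always keep at least one visit so that a coadd can be built.
    n_keep = max(1, int(len(ranked) * fraction))
    return ranked[:n_keep]
```

With eight visits and the default fraction, the two visits with the best seeing are returned, ordered from best to worst.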
The exact details of which coadds to build, and which ones to keep, can
change during Construction without affecting the processing system design,
as the most expensive operations (raw image input and warping) are constant
in the number of coadds produced. The data management system design is
sensitive to the total number and size of coadds to be kept; these are the
relevant constraining variables.
We reiterate that not all coadds will be kept and served to the
public [94], though sufficient metadata will be provided to users to recreate
them on their own. Some coadds may be entirely "virtual": for example,
the PSF-matched coadds could be implemented as ad-hoc convolutions of
postage stamps when the colors are measured.
We will retain smaller sections of all generated coadds, to support quality
assessment and targeted science. Retained sections may be positioned to
cover areas of the sky of special interest such as overlaps with other surveys,
nearby galaxies, large clusters, etc.
5.5 Data Release Availability and Retention Policies
Over 10 years of operations, LSST will produce eleven data releases: two for
the first year of survey operations, and one every subsequent year. Each data
release will include reprocessing of all data from the start of the survey, up
to the cutoff date for that release.
The contents of data releases are expected to range from a few PB (DR1)
to ~70 PB for DR11 (this includes the raw images, retained coadds, and
catalogs). Given that scale, it is not feasible to keep all data releases loaded
and accessible at all times.
Instead, only the contents of the most recent and the penultimate
data releases will be kept on fast storage and with
catalogs loaded into the database. Statistics collected by prior surveys
(e.g., SDSS) show that users nearly always prefer accessing the most recent
data release, but sometimes may use the penultimate one (this is especially
true just after the publication of a new data release). Older releases are used
rarely.
[94] The coadds are a major cost driver for storage. The LSST Data Management system is
currently sized to keep and serve seven coadds, ugrizyM, over the full footprint of the
survey.
To assist with data quality monitoring and assessment, small, overlapping
samples of data from older releases will be kept loaded in
the database. The sample size is expected to be on the order of ~1-5% of
the data release data, with larger samples kept early on in the survey. The
goal is to allow one to test how the reported characterization of the same
data varies from release to release.
Older releases will be archived to mass storage (tape). The users will
not be able to perform database queries against archived releases.
Archived releases will be made available as bulk downloads in some common
format (for example, FITS binary tables). Database software and data loading scripts
will be provided for users who wish to set up a running copy of an older (or
current) data release database on their systems.
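The retention policy above reduces to a simple rule: the latest and penultimate releases are queryable, while everything older is archived and served as bulk downloads only. The sketch below encodes that rule; the function name release_access and the returned labels are hypothetical, and the small samples of older releases that stay loaded for quality assessment are deliberately ignored.

```python
def release_access(release, latest):
    """Classify data release number `release`, given the latest release number.

    Returns "database" for releases kept on fast storage with catalogs
    loaded, and "archive" for releases available as bulk downloads only.
    """
    if not 1 <= release <= latest:
        raise ValueError("release must be between 1 and the latest release")
    if release >= latest - 1:
        return "database"  # most recent and penultimate releases
    return "archive"       # older releases, archived to mass storage (tape)
```

For example, once DR11 is published, DR11 and DR10 would be queryable while DR9 and earlier would be archived.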
All raw data used to generate any public data product (raw exposures,
calibration frames, telemetry, configuration metadata, etc.) will be kept and
made available for download.
6 Level 3 Data Products and Capabilities
Level 3 capabilities are envisioned to enable science cases that would greatly
benefit from co-location of user processing and/or data within the LSST
Archive Center. The high-level requirement for Level 3 is established in § 3.5
of the LSST SRD.
Level 3 capabilities include three separate deliverables:
1. Level 3 Data Products and associated storage resources
2. Level 3 processing resources, and
3. Level 3 programming environment and framework
Many scientists' work may involve using two or all three of them in concert,
but they can each be used independently. We describe each one of them in
the subsections to follow.
6.1 Level 3 Data Products and Associated Storage Resources
These are data products that are generated by users on any computing re-
sources anywhere that are then brought to an LSST Data Access Center
(DAC) and stored there. The hardware for these capabilities includes the
physical storage and database server resources at the DAC to support them.
For catalog data products, there is an expectation that they can be "fed-
erated" with the Level 1 (L1) and Level 2 (L2) catalogs to enable analyses
combining them. Essentially this means that either the user-supplied tables
include keys from the L1/L2 catalogs that can be used for key-equality-
based joins with them (example: a table of custom photometric redshifts for
galaxies, with a column of object IDs that can be joined to the L2 Object
catalog), or that there are columns that can be used for spatial (or temporal,
or analogous) joins against L1/L2 tables. The latter implies that such an
L3 table's columns must be in the same coordinate system and units as the
corresponding L1/L2 columns.
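The key-equality case described above (a user table of custom photometric redshifts joined to the L2 Object catalog on the object ID) can be sketched as follows. SQLite is used here only as a self-contained stand-in; the table name UserPhotoZ, its columns, and the sample rows are hypothetical, and nothing is assumed about the actual LSST database interface.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Stand-in for a small slice of the L2 Object catalog.
    CREATE TABLE Object (objectId INTEGER PRIMARY KEY, extendedness REAL);
    -- Hypothetical user-supplied Level 3 table keyed by objectId.
    CREATE TABLE UserPhotoZ (objectId INTEGER, z REAL);
    INSERT INTO Object VALUES (1, 0.98), (2, 0.02), (3, 0.95);
    INSERT INTO UserPhotoZ VALUES (1, 0.45), (3, 1.10);
    """
)

# Key-equality join: only Objects with a user-supplied redshift are returned.
rows = conn.execute(
    """
    SELECT o.objectId, o.extendedness, u.z
    FROM Object AS o
    JOIN UserPhotoZ AS u ON u.objectId = o.objectId
    ORDER BY o.objectId
    """
).fetchall()
conn.close()
```

Here rows contains one tuple per Object that has an entry in the user table. A spatial join would instead compare coordinate columns, which is why those columns must share the coordinate system and units of the L1/L2 tables.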
There is no requirement that Level 3 data products (L3DPs) are derived
from L1 or L2 other than that they be joinable with them. For instance,
a user might have a catalog of radio sources that they might want to bring
into federation with the LSST catalogs. That can be thought of as a Level 3
Data Product as long as they have "LSST-ized" it by ensuring compatibility
of coordinate, time, measurement systems, etc. Nevertheless, we do expect
the majority of L3DPs to be derived from processed LSST data.
There could also be L3 image data products; for example, user-generated
coadds with special selection criteria or stacking algorithms (e.g., the so-called
shift & stack algorithm for detecting moving objects).
Any L3DP may have access controls associated with it, restricting read
access to just the owner, to a list of people, to a named group of people, or
allowing open access.
The storage resources for L3DPs come out of the SRD requirement for
10% of LSST data management capabilities to be devoted to user processing.
In general, they are likely to be controlled by some form of a "space
allocation committee" (SAC). Users will probably have some small baseline automatic
allocation, beyond which a SAC proposal is needed. The SAC may take into
account scientific merit, length of time for which the storage is requested,
and openness of the data to others, in setting its priorities.
It is to be decided whether users will be required to provide the code
and/or documentation behind their L3DPs, or whether the SAC may include
the availability of this supporting information in its prioritization. Obviously,
if a user intends to make an L3DP public or publish it to a group, it will be
more important that supporting information be available.
Level 3 data products that are found to be generally useful can be migrated
to Level 2. This is a fairly complex process that ultimately involves
the project taking responsibility for supporting and running LSST-style code
that implements the algorithm necessary to produce the data product (it is
not just relabeling an existing L3DP as L2). The project will provide the
necessary support for such migrations.
6.2 Level 3 Processing Resources
These are project-owned computing resources located at the DACs. They are
available for allocation to all users with LSST data rights. They may be used
for any computation that involves the LSST data and advances LSST-related
science. The distinctive feature of these computing resources is that they are
located with excellent I/O connections to the image and catalog datasets at
Level 1 and Level 2. There may be other co-located but not project-owned,
resources available at the LSST DACs [95]; their use is beyond the scope of this
document, except to note that reasonable provisions will be made to ensure
it is possible to use them to process large quantities of LSST data.
Level 3 processing resources will, at least, include systems that can carry
out traditional batch-style processing, probably similarly configured to those
LSST will be using for the bulk of data release production processing. It is
to be determined whether any other flavors of hardware would be provided,
such as large-memory machines; this is likely to be driven by the project
need (or lack thereof) for such resources.
There will be a time allocation committee (TAC) for these resources.
Every LSST-data-rights user may get a small default allocation (enough to
run test jobs). Substantial allocations will require a scientific justification.
Priorities will be based on the science case and, perhaps, also on whether the
results of the processing will be released to a larger audience. Requests must
specify what special flavors of computing will be needed (e.g., GPUs, large
memory, etc.).
A fairly standard job control environment (like Condor) will be available,
and users will be permitted to work with it at a low, generic level. They
will not be required to use the higher levels of the LSST process control
middleware (but they may; see § 6.3).
These processing resources can be available for use in any clearly LSST-related
scientific work. It is not strictly required that they be used to process
LSST data, in this context. For instance, it could be acceptable to run
special types of cosmological simulations that are in direct support of an
LSST analysis, if the closeness to the data makes the LSST facility uniquely
suitable for such work. The TAC will take into account in its decisions
whether proposed work makes good use of the enhanced I/O bandwidth
available to LSST data on these systems.
6.3 Level 3 Programming Environment and Framework
As a part of the Level 3 Programming Environment and Framework, the
LSST will make available the LSST software stack to users, to aid in the
analyses of LSST data. This includes all code implementing the core processing
algorithms (image characterization and manipulation, building of coadds,
image differencing, object detection, object characterization, moving object
detection, etc.), the middleware necessary to run those codes at large scale,
as well as the LSST database management system.
[95] For example, the U.S. DAC will be located at the National Petascale Facility building
at NCSA, adjacent to the Blue Waters supercomputer.
These analyses could be done on LSST-owned systems (i.e., on the Level
3 processing resources) but also on a variety of supported external systems.
We will aim to support common personal Unix flavors (for example, common
distributions of Linux and Mac OS X) as well as commonly used cluster and
HPC environments. The vision is to enable relatively straightforward use
of major national systems such as XSEDE or Open Science Grid, as well
as some common commercial cloud environments. The decision of which
environments to support will be under configuration control and we will seek
advice from the user community. We cannot commit to too many flavors.
In-kind contributions of customizations for other environments will be welcome
and may provide a role for national labs.
The Level 3 environment is intended, when put to fullest use, to allow
users to run their own production-like runs on bulk image and/or catalog
data, with mechanisms for creating and tracking large groups of jobs in a
batch system.
The Level 3 environment, in asymptopia, has a great deal in common
with the environment that the Project will use to build the Level 2 data
releases. It is distinct, however, as supporting it as a tool meant for the
end-users imposes additional requirements:
• In order to be successful as a user computing environment, it needs to
be easy to use. Experience with prior projects [96] has shown that if the
production computing environment is not envisioned from the start as
being shared with users, it will likely evolve into an experts-only tool
that is too complicated, or too work-hardened, to serve users well.
• While it is desirable for the production computing to be portable to
Grid, cloud, etc. resources, this option might not be exercised in practice
and could atrophy. For the user community, it is a far more central
capability. Early community engagement is therefore key to developing
and maintaining these capabilities.
• Not all the capabilities of the LSST production environment need
necessarily be exported to the users. LSST-specific capabilities associated
with system administration, for instance, are not of interest to end-users.
[96] For example, BaBar.
6.4 Migration of Level 3 data products to Level 2
• For the migration to be considered, the creator of the L3DP will need
to agree to make their data product public to the entire LSST data-rights
community, along with supporting documentation and code. The
code at first need not be in the LSST framework or even in an
LSST-supported language.
• If the original proponent wrote their code in the C++/Python LSST
stack environment (the "Level 3 environment"), it will be easier to migrate
it to Level 2 (though, obviously, using the same languages/frameworks
does not guarantee that the code is of production quality).
• If the original code was written in another language or another data
processing framework, the project may consider rewriting it to required
LSST standards.
• Taking on a new Level 2 DP means that the project is committing to
code maintenance, data quality review, space allocation, and continuing
production of the new L2DP through DR11.
7 Data Products for Special Programs
LSST Survey Specifications (LSST SRD, § 3.4) specify that 90% of LSST
observing time will be spent executing the so-called "universal cadence".
These observations will result in Level 1 and 2 data products described earlier
in this document.
The remaining 10% of observing time will be devoted to special programs,
obtaining improved coverage of interesting regions of observational parameter
space. Examples include very deep (r ~ 26, per exposure) observations,
observations with very short revisit times (~1 minute), and observations of
"special" regions such as the Ecliptic, Galactic plane, and the Large and
Small Magellanic Clouds. A third type of survey, micro-surveys, that would
use about 1% of the time, may also be considered.
The details of these special programs or micro-surveys are not yet defined [97].
Consequently, the specifics of their data products are left undefined
at this time. Instead, we just specify the constraints on these data products,
given the adopted Level 1/2/3 architecture. It is understood that no special
program will be selected that does not fit these constraints [98]. This allows
us to size and construct the data management system, without knowing the
exact definition of these programs this far in advance.
Processing for special programs will make use of the same software stack
and computing capabilities as the processing for universal cadence. The
programs are expected to use no more than ~10% of computational and
storage capacity of the LSST data processing cluster. When special products
include time domain event alerts, their processing shall generally be subject
to the same latency requirements as Level 1 data products.
For simplicity of use and consistency, the data products for special programs
will be stored in databases separate from the "main" (Level 1 and 2)
databases. The system will, however, allow for simple federation with Level
1/2/3 data products (i.e., cross-queries and joins).
As a concrete example, a data product complement for a "deep drilling"
field designed for supernova discovery and characterization may consist of: i)
alerts to events discovered by differencing the science images against a special
deep drilling template, ii) a Level 1-like database, iii) one or more "nightly
co-adds" (co-adds built using the data from the entire night), produced and
made available within ~24 hours, and iv) special deep templates, built using
the best recently acquired seeing data, produced on a fortnightly cadence.
[97] The initial complement is expected to be defined and selected no later than Science
Verification.
[98] Or will come with additional, external, funding, capabilities, and/or expertise.
Note that the data rights and access rules apply just as they would
for Level 1/2/3. For example, while generated event alerts (if any) will be
accessible world-wide, the image and catalog products will be restricted to
users with LSST data rights.
8 Appendix: Conceptual Pipeline Design
A high-level conceptual overview of the LSST image processing science pipelines
is illustrated in Figure 2. The pipeline definitions presented here are driven
by their inputs, outputs and processing steps; they do not describe exact
boundaries in the actual implementation code, execution, or development
responsibilities within the Project. Processing from pipelines marked with 1,
2, and 5-8 is executed every day when new data are taken to produce Level
1 Data Products. Annual Data Release processing includes pipelines 1-6
and 8 (everything except Alert production). The main conceptual steps in
LSST image processing include the following pipelines (enumeration in this
list corresponds to enumeration in Figure 2, but note that these steps can be
interleaved in the actual processing flow):
1. Single Visit Processing pipeline (Figure 3) produces calibrated and
characterized single-visit images from raw snaps. The main processing
steps include instrumental signature removal, background estimation,
source detection, deblending and measurements, point spread function
estimation, and astrometric and photometric calibration.
2. Image Coaddition pipeline (Figure 4) produces coadded images of
different flavors (optimized for depth, seeing, etc.) from an ensemble of
single-visit images.
3. Coadded Image Analysis pipeline (Figure 4) defines the Object list and
performs initial measurements on coadded images.
4. Multi-epoch Object Characterization pipeline (Figure 4) fits a library of
image models to a set of Footprints of an Object family, measures
additional quantities (e.g., the profile surface brightness in a series of annuli)
not captured by those models, and performs Forced Photometry. All
these measurements are performed on single-visit images (direct and
difference images) and for all Objects.
5. Image Differencing pipeline (Figure 5) produces difference images from
single-visit and coadded (template) images.
6. Difference Image Analysis pipeline (Figure 5) updates DIAObject and
SSObject lists with new DIASources detected on the processed difference
image, fits a library of image models to Footprints of these DIASources,
and for all DIAObjects overlapping the difference image it performs
Forced Photometry and recomputes summary quantities. During nightly
Level 1 processing, this pipeline also performs Forced Photometry for
all new DIAObjects on difference images from the last 30 days.
7. Alert Generation and Distribution pipeline (Figure 5) uses updated
DIAObjects and DIASources to generate and distribute Alerts (which
also include postage stamp images of the DIASource in the difference image
and the coadded template image).
8. Moving Object Processing pipeline (MOPS, Figure 6) combines all
unassociated DIASources into plausible SSObjects and estimates their
orbital parameters. The three main pipeline stages are associating
new DIASources with known SSObjects, discovering new SSObjects,
and orbit refinement and management.
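The grouping of the eight conceptual pipelines into daily Level 1 processing
and annual Data Release processing, as described at the start of this
section, can be summarized in a short sketch. This is an illustration only:
the identifiers below (Pipeline, LEVEL1_DAILY, DATA_RELEASE) are hypothetical
names chosen for this example and do not correspond to any LSST software
interface. Only the pipeline numbers and their grouping are taken from the
text.

```python
from enum import IntEnum


class Pipeline(IntEnum):
    """Conceptual pipelines, numbered as in the list above (hypothetical names)."""
    SINGLE_VISIT_PROCESSING = 1
    IMAGE_COADDITION = 2
    COADDED_IMAGE_ANALYSIS = 3
    MULTI_EPOCH_OBJECT_CHARACTERIZATION = 4
    IMAGE_DIFFERENCING = 5
    DIFFERENCE_IMAGE_ANALYSIS = 6
    ALERT_GENERATION_AND_DISTRIBUTION = 7
    MOVING_OBJECT_PROCESSING = 8


# Pipelines 1, 2, and 5-8 are executed every day to produce Level 1 Data Products.
LEVEL1_DAILY = frozenset(Pipeline(n) for n in (1, 2, 5, 6, 7, 8))

# Annual Data Release processing includes pipelines 1-6 and 8.
DATA_RELEASE = frozenset(Pipeline(n) for n in (1, 2, 3, 4, 5, 6, 8))


def names(pipelines):
    """Return pipeline names in enumeration order."""
    return [p.name for p in sorted(pipelines)]


daily_only = LEVEL1_DAILY - DATA_RELEASE    # run daily, not in Data Release
release_only = DATA_RELEASE - LEVEL1_DAILY  # run only in Data Release
shared = LEVEL1_DAILY & DATA_RELEASE        # run in both contexts

if __name__ == "__main__":
    print("Daily only:       ", names(daily_only))
    print("Data Release only:", names(release_only))
    print("Shared:           ", names(shared))
```

Under these assumptions, the set differences make the text's parenthetical
explicit: Alert Generation and Distribution is the only pipeline absent from
Data Release processing, while Coadded Image Analysis and Multi-epoch Object
Characterization are the only ones not executed daily.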
Further details about the pipeline design and implementation are available
from the LSST document LDM-151 (see http://ls.st/LDM-151).
Figure 2: Illustration of the conceptual design of LSST science pipelines for
image processing.
Figure 3: Illustration of the conceptual algorithm design for the Single Visit
Processing pipeline.
Figure 4: Illustration of the conceptual algorithm design for the Image
Coaddition, Coadded Image Analysis, and Multi-epoch Object Characterization
pipelines.
Figure 5: Illustration of the conceptual algorithm design for the Image
Differencing, Difference Image Analysis, and Alert Generation and Distribution
pipelines.
Figure 6: Illustration of the conceptual algorithm design for the Moving
Object Processing Software pipeline.