Large Synoptic Survey Telescope (LSST)
    Data Management Science Pipelines
    Design
    J.D. Swinbank, T. Axelrod, A.C. Becker, J. Becla, E. Bellm,
    J.F. Bosch, H. Chiang, D.R. Ciardi, A.J. Connolly,
    G.P. Dubois-Felsmann, F. Economou, M. Fisher-Levine, M. Graham, Ž.
    Ivezić, M. Jurić, T. Jenness, R.L. Jones, J. Kantor, S. Krughoff,
    K-T. Lim, R.H. Lupton, F. Mueller, D. Petravick, P.A. Price,
    D.J. Reiss, D. Shaw, C. Slater, M. Wood-Vasey, X. Wu, P. Yoachim,
    for the LSST Data Management
    LDM-151
    Latest Revision: 2017-05-19
This LSST document has been approved as a Content-Controlled Document by the LSST DM Technical Control Team. If this document is changed or superseded, the new document will retain the Handle designation shown above. The control is on the most recent digital document with this Handle in the LSST digital archive and not printed versions. Additional information may be found in the corresponding DM RFC.
Abstract

The LSST Science Requirements Document (the LSST SRD) specifies a set of data product guidelines, designed to support science goals envisioned to be enabled by the LSST observing program. Following these guidelines, the details of these data products have been described in the LSST Data Products Definition Document (DPDD), and captured in a formal flow-down from the SRD via the LSST System Requirements (LSR) and the Observatory System Specifications (OSS) to the Data Management System Requirements (DMSR). The LSST Data Management subsystem’s responsibilities include the design, implementation, deployment and execution of software pipelines necessary to generate these data products. This document describes the design of the scientific aspects of those pipelines.
Change Record

Version  Date        Description                                             Owner name
1        2009-03-26  Initial version as Document-7396                        Tim Axelrod et al.
1.2      2009-03-27  Minor edits                                             Tim Axelrod
1.3      2009-04-17  General edits and updates                               Tim Axelrod
1.4      2009-05-08  Explicit reference to multifit added to Section 6.1     Tim Axelrod
1.5      2010-02-11  General edits and updates; generated from SysML model   Jeff Kantor
2        2011-08-04  Elevated to LDM handle; general updates and edits       Tim Axelrod
3        2013-10-07  Updates for consistency with FDR baseline               Mario Juric
         2017-05-08  Major reorganization for DM replan                      Mario Juric
4.0      2017-05-19  Approved in RFC-338 and released                        Mario Juric (approval), Tim Jenness (release)

Document curator: J.D. Swinbank
Contents

1 Preface
2 Introduction
   2.1 LSST Data Management System
   2.2 Data Products
   2.3 Data Units
   2.4 Science Pipelines Organization
3 Alert Production
   3.1 Single Frame Processing Pipeline (WBS 02C.03.01)
      3.1.1 Input Data
      3.1.2 Output Data
      3.1.3 Instrumental Signature Removal
      3.1.4 PSF and background determination
      3.1.5 Source measurement
      3.1.6 Photometric and Astrometric calibration
   3.2 Alert Generation Pipeline (WBS 02C.03.04)
      3.2.1 Input Data
      3.2.2 Output Data
      3.2.3 Template Generation
      3.2.4 Image differencing
      3.2.5 Source Association
   3.3 Alert Distribution Pipeline (WBS 02C.03.03)
      3.3.1 Input Data
      3.3.2 Output Data
      3.3.3 Alert postage stamp generation
      3.3.4 Alert queuing and persistence
   3.4 Precovery and Forced Photometry Pipeline
      3.4.1 Input Data
      3.4.2 Output Data
      3.4.3 Forced Photometry on all DIAObjects
      3.4.4 DIAObject Forced Photometry
   3.5 Moving Object Pipeline (WBS 02C.03.06)
      3.5.1 Input Data
      3.5.2 Output Data
      3.5.3 Tracklet identification
      3.5.4 Precovery and merging of tracklets
      3.5.5 Linking tracklets and orbit fitting
      3.5.6 Global precovery
      3.5.7 Prototype Implementation
4 Calibration Products Production
   4.1 Key Requirements
   4.2 Inputs
      4.2.1 Bias Frames
      4.2.2 Gain Values
      4.2.3 Linearity
      4.2.4 Darks
      4.2.5 Crosstalk
      4.2.6 Defect Map
      4.2.7 Saturation levels
      4.2.8 Broadband Flats
      4.2.9 Monochromatic Flats
      4.2.10 CBP Data
      4.2.11 Filter Transmission
      4.2.12 Atmospheric Characterization
   4.3 Outputs from the Calibration Product Pipelines == Inputs to the AP/DRP Pipelines
      4.3.1 Master Bias
      4.3.2 Master Darks
      4.3.3 Master Linearity
      4.3.4 Master Fringe Frames
      4.3.5 Master Gain Values
      4.3.6 Master Defects
      4.3.7 Saturation Levels
      4.3.8 Crosstalk
      4.3.9 Master Impure Broadband Flats
      4.3.10 Master Impure Monochromatic Flats
      4.3.11 Master Pure Monochromatic Flats
      4.3.12 Master PhotoFlats
      4.3.13 Master Low-resolution narrow-band flats
      4.3.14 Pixel Sizes
      4.3.15 Brighter-Fatter Coefficients
      4.3.16 CTE Measurement
      4.3.17 Filter Transmission
      4.3.18 Ghost catalog
      4.3.19 Spectral Standards
      4.3.20 Spectrophotometric Standards
      4.3.21 Astrometric Standards
   4.4 CBP Control
   4.5 Calibration Telescope Input Calibration Data
   4.6 Calibration Telescope Output Data
      4.6.1 Atmospheric Absorption
      4.6.2 Night Sky Spectrum
   4.7 Photometric calibration walk-through
   4.8 Prototype Implementation
5 Data Release Production
   5.1 Image Characterization and Calibration
      5.1.1 BootstrapImChar
      5.1.2 StandardJointCal
      5.1.3 RefineImChar
      5.1.4 FinalImChar
      5.1.5 FinalJointCal
   5.2 Image Coaddition and Image Differencing
      5.2.1 WarpAndPsfMatch
      5.2.2 BackgroundMatchAndReject
      5.2.3 WarpTemplates
      5.2.4 CoaddTemplates
      5.2.5 DiffIm
      5.2.6 UpdateMasks
      5.2.7 WarpRemaining
      5.2.8 CoaddRemaining
   5.3 Coadd Processing
      5.3.1 DeepDetect
      5.3.2 DeepAssociate
      5.3.3 DeepDeblend
      5.3.4 MeasureCoadds
   5.4 Overlap Resolution
      5.4.1 ResolvePatchOverlaps
      5.4.2 ResolveTractOverlaps
   5.5 Multi-Epoch Object Characterization
      5.5.1 MultiFit
      5.5.2 ForcedPhotometry
   5.6 Postprocessing
      5.6.1 MovingObjectPipeline
      5.6.2 ApplyCalibrations
      5.6.3 MakeSelectionMaps
      5.6.4 Classification
      5.6.5 GatherContributed
6 Algorithmic Components
   6.1 Reference Catalog Construction
      6.1.1 Alert Production Reference Catalogs
      6.1.2 Data Release Production Reference Catalogs
   6.2 Instrument Signature Removal
      6.2.1 ISR for Alert Production
   6.3 Artifact Detection
      6.3.1 Cosmic Ray Identification
      6.3.2 Optical ghosts
      6.3.3 Linear feature detection and removal
      6.3.4 Snap Subtraction
      6.3.5 Warped Image Comparison
   6.4 Artifact Interpolation
   6.5 Source Detection
   6.6 Deblending
      6.6.1 Single Frame Deblending
      6.6.2 Multi-Coadd Deblending
   6.7 Measurement
      6.7.1 Drivers
      6.7.2 Algorithms
      6.7.3 Blended Measurement
   6.8 Spatial Models
   6.9 Background Estimation
      6.9.1 Single-Visit Background Estimation
      6.9.2 Coadd Background Estimation
      6.9.3 Matched Background Estimation
   6.10 Build Background Reference
      6.10.1 Patch Level
      6.10.2 Tract Level
   6.11 PSF Estimation
      6.11.1 Single CCD PSF Estimation
      6.11.2 Wavefront Sensor PSF Estimation
      6.11.3 Full Visit PSF Estimation
   6.12 Aperture Correction
   6.13 Astrometric Fitting
      6.13.1 Single CCD
      6.13.2 Single Visit
      6.13.3 Joint Multi-Visit
   6.14 Photometric Fitting
      6.14.1 Single CCD (for AP)
      6.14.2 Single Visit
      6.14.3 Joint Multi-Visit
      6.14.4 Large-Scale Fitting
   6.15 Retrieve Diffim Template for a Visit
   6.16 PSF Matching
      6.16.1 Image Subtraction
      6.16.2 PSF Homogenization for Coaddition
   6.17 Image Coaddition
   6.18 DCR-Corrected Template Generation
      6.18.1 Generating a DCR Corrected Template
   6.19 Image Decorrelation
      6.19.1 Difference Image Decorrelation
      6.19.2 Coadd Decorrelation
   6.20 Star/Galaxy Classification
      6.20.1 Single Frame S/G
      6.20.2 Multi-Source S/G
      6.20.3 Object Classification
   6.21 Variability Characterization
      6.21.1 Characterization of Periodic Variability
      6.21.2 Characterization of Aperiodic Variability
   6.22 Proper Motion and Parallax from DIASources
   6.23 Association and Matching
      6.23.1 Single CCD to Reference Catalog, Semi-Blind
      6.23.2 Single Visit to Reference Catalog, Semi-Blind
      6.23.3 Multiple Visits to Reference Catalog
      6.23.4 DIAObject Generation
      6.23.5 Object Generation
      6.23.6 Blended Overlap Resolution
   6.24 Raw Measurement Calibration
   6.25 Ephemeris Calculation
   6.26 Make Tracklets
   6.27 Attribution and Precovery
   6.28 Orbit Fitting
   6.29 Orbit Merging
7 Software Primitives
   7.1 Cartesian Geometry
      7.1.1 Points
      7.1.2 Arrays of Points
      7.1.3 Boxes
      7.1.4 Polygons
      7.1.5 Ellipses
   7.2 Spherical Geometry
      7.2.1 Points
      7.2.2 Arrays of Points
      7.2.3 Boxes
      7.2.4 Polygons
      7.2.5 Ellipses
   7.3 Images
      7.3.1 Simple Images
      7.3.2 Masks
      7.3.3 MaskedImages
      7.3.4 Exposure
   7.4 Multi-Type Associative Containers
   7.5 Tables
      7.5.1 Source
      7.5.2 Object
      7.5.3 Exposure
      7.5.4 AmpInfo
      7.5.5 Reference
      7.5.6 Joins
      7.5.7 Queries
      7.5.8 N-Way Matching
   7.6 Footprints
      7.6.1 Pixel Regions
      7.6.2 Functors
      7.6.3 Peaks
      7.6.4 FootprintSets
      7.6.5 HeavyFootprints
      7.6.6 Thresholding
   7.7 Basic Statistics
   7.8 Chromaticity Utilities
      7.8.1 Filters
      7.8.2 SEDs
      7.8.3 Color Terms
   7.9 PhotoCalib
   7.10 Convolution Kernels
   7.11 Coordinate Transformations
   7.12 Numerical Integration
   7.13 Random Number Generation
   7.14 Interpolation and Approximation of 2-D Fields
   7.15 Common Functions and Source Profiles
   7.16 Camera Descriptions
   7.17 Numerical Optimization
   7.18 Monte Carlo Sampling
   7.19 Point-Spread Functions
   7.20 Warping
   7.21 Fourier Transforms
   7.22 Tree Structures
8 Glossary

    Data Management Science Pipelines Design
    1 Preface
The purpose of this document is to describe the design of pipelines belonging to the Applications Layer of the Large Synoptic Survey Telescope (LSST) Data Management system. These include most of the core astronomical data processing software that LSST employs.
The intended audience of this document is LSST software architects and developers. It presents the baseline architecture and algorithmic selections for core DM pipelines, developed to a degree necessary to enable planning and costing of the pipelines assuming an Agile software development framework. The document assumes the reader/developer has the required knowledge of astronomical image processing algorithms, a solid understanding of the state of the art of the field, an understanding of the LSST Project goals and concepts, and has read the LSST Science Requirements Document (SRD) as well as the LSST Data Products Definition Document (DPDD).
Though under strict change control, this is a living document. Firstly, as a consequence of the “rolling wave” LSST software development model, the designs presented in this document will be refined and made more detailed as particular pipeline functionality is about to be implemented. Secondly, the LSST will undergo a period of construction and commissioning lasting no less than seven years, followed by a decade of survey operations. To ensure their continued scientific adequacy, the overall designs and plans for LSST data processing pipelines will be periodically reviewed and updated.
    2 Introduction
    2.1 LSST Data Management System
To carry out this mission, the Data Management System (DMS) performs the following major functions:
• Processes the incoming stream of images generated by the camera system during observing to produce transient alerts and to archive the raw images.
• Roughly once per year, creates and archives a Data Release (“DR”), which is a static self-consistent collection of data products generated from all survey data taken from the date of survey initiation to the cutoff date for the Data Release. The data products (described in detail in the DPDD) include measurements of the properties (shapes, positions, fluxes, motions, etc.) of all detected objects, including those below the single visit sensitivity limit, astrometric and photometric calibration of the full survey object catalog, and limited classification of objects based on both their static properties and time-dependent behavior. Deep coadded images of the full survey area are produced as well.
    • Periodically creates new calibration data products, such as bias frames and flat fields,
    that will be used by the other processing functions, as necessary to enable the creation
    of the data products above.
• Makes all LSST data available through interfaces that utilize, to the maximum possible extent, community-based standards such as those being developed by the Virtual Observatory (“VO”), and facilitates user data analysis and the production of user-defined data products at Data Access Centers (“DAC”) and at external sites.
This document discusses the role of the Science Pipelines software in the first three functions listed above. The fourth is discussed separately in the SUI Conceptual Design Document (SUID).

The overall architecture of the DMS is discussed in more detail in the Data Management System Design (DMSD) document.
    2.2 Data Products
The LSST data products are organized into three groups, based on their intended use and/or origin. The full description is provided in the Data Products Definition Document (DPDD); we summarize the key properties here to provide the necessary context for the discussion to follow.
Level 1 products are intended to support timely detection and follow-up of time-domain events (variable and transient sources). They are generated by near-real-time processing of the stream of data from the camera system during normal observing. Level 1 products are therefore continuously generated and/or updated every observing night. This process is of necessity highly automated, and must proceed with absolutely minimal human interaction. In addition to science data products, a number of related Level 1 “SDQA” (Science Data Quality Analysis) data products are generated to assess quality and to provide feedback to the Observatory Control System (OCS).
Level 2 products are generated as part of a Data Release, generally performed yearly, with an additional data release for the first 6 months of survey data. Level 2 includes data products for which extensive computation is required, often because they combine information from many exposures. Although the steps that generate Level 2 products will be automated, significant human interaction may be required at key points to ensure the quality of the data.
Level 3 products are generated on any computing resources anywhere and then stored in an LSST Data Access Center. Often, but not necessarily, they will be generated by users of LSST using LSST software and/or hardware. LSST DM is required to facilitate the creation of Level 3 data products by providing suitable APIs, software components, and computing infrastructure, but will not by itself create any Level 3 data products. Once created, Level 3 data products may be associated with Level 1 and Level 2 data products through database federation. Where appropriate, the LSST Project, with the agreement of the Level 3 creators, may incorporate user-contributed Level 3 data product pipelines into the DMS production flow, thereby promoting them to Level 1 or 2.
Level 1 and Level 2 data products that have passed quality control tests will be made accessible to the data rights holders on a cadence determined by the operations policy. Additionally,
the source code used to generate these products will be made available to enhance reproducibility and insight into the implementation of the algorithms employed. The LSST will provide documentation and a list of reference platforms on which the software is expected to build and execute.
The pipelines used to produce these public data products will also produce many intermediate data products that may not be made publicly available (generally because they are fully superseded in quality by a public data product). Intermediate products may be important for QA, however, and their specification is an important part of describing the pipelines themselves.
    2.3 Data Units
    In order to describe the components of our processing pipelines, we first need standard
    nomenclature for the units of data the pipeline will process.
The smallest data units are those corresponding to individual astrophysical entities. In keeping with LSST conventions, we use “object” to refer to the astrophysical entity itself (which typically implies aggregation of some sort over all exposures), and “source” to refer to the realization of an object on a particular exposure. In the case of blending, of course, these are just our best attempts to define distinct astrophysical objects, and hence it is also useful to define terms that represent this process. We use “family” to refer to a group of blended objects (or, more rarely, sources), and “child” to refer to a particular deblended object within a family. A “parent” is also created for each family, representing the alternate hypothesis that the blend is actually a single object. Blends may be hierarchical; a child at one level may be a parent at the level below.
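To make this vocabulary concrete, the following minimal Python sketch (with hypothetical names; the production schema is defined in the DPDD) represents a hierarchical blend family:

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class DeblendRecord:
        """One row in an object table, illustrating blend bookkeeping.

        `parent` is None for a top-level detection; a "parent" record
        represents the single-object hypothesis for an entire family.
        """
        record_id: int
        parent: Optional["DeblendRecord"] = None
        children: List["DeblendRecord"] = field(default_factory=list)

        def add_child(self, record_id: int) -> "DeblendRecord":
            child = DeblendRecord(record_id=record_id, parent=self)
            self.children.append(child)
            return child

    # A family: one parent (blend-as-single-object hypothesis) with two
    # deblended children; blends may nest, so a child may itself be a parent.
    parent = DeblendRecord(record_id=1)
    child_a = parent.add_child(2)
    child_b = parent.add_child(3)
    grandchild = child_b.add_child(4)  # hierarchical blend
    assert grandchild.parent.parent is parent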
LSST observations are taken as a pair of 15-second “snaps”; together these constitute a “visit”. Because snaps are typically combined early in the processing (and some special programs and survey modes may take only a single snap), the visit is much more frequently used as a unit for processing and data products. The image data for a visit is a set of 189 “CCD” or “sensor” images. CCD-level data from the camera is further divided across the 16 amplifiers within a CCD, but these are also combined at an early stage, and the 3×3 CCD “rafts” that play an important role in the hardware design are relatively unimportant for the pipeline. This leaves visit and CCD as the main identifiers of most exposure-level data products and pipelines.
    Our convention for defining regions on the sky is deliberately vague; we hope to build a code-
base capable of working with virtually any pixelization or projection scheme (though different schemes may have different performance or storage implications). Our approach involves two region concepts: “tracts” and “patches”. A tract is a large region with a single Cartesian coordinate system; we assume it is larger than the LSST field of view, but its maximum size is essentially set by the point at which distortion in the projection becomes significant enough to affect the processing (by e.g. breaking the assumption that the PSF is well-sampled on the pixel grid). Tracts are divided into patches, all of which share the tract coordinate system. Most image processing is performed at the patch level, and hence patch sizes are chosen largely to ensure that patch-level data products and processing fit in memory. Both tracts and patches are defined such that each region overlaps with its neighbors, and these overlap regions must be large enough that any individual astronomical object is wholly contained in at least one tract and patch. In a patch overlap region, we expect pixel values to be numerically equivalent (i.e. equal up to floating point round-off errors) on both sides; in tract overlaps, this is impossible, but we expect the results to be scientifically consistent. Selecting larger tracts and patches thus reduces the overall fraction of the area that falls in overlap regions and must be processed multiple times, while increasing the computational load for processing individual tracts and patches.
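The following sketch illustrates the tract/patch bookkeeping described above. The sizes, overlap, and flat geometry are illustrative placeholders rather than LSST values, and the real scheme attaches a sky projection to each tract:

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class Patch:
        tract_id: int
        patch_index: Tuple[int, int]
        x0: float                 # inner-region origin in tract coordinates
        y0: float
        inner: float              # inner (non-overlapping) patch size
        overlap: float            # margin shared with neighboring patches

        def contains(self, x: float, y: float) -> bool:
            # True if (x, y) falls inside the patch including its overlap margin.
            return (self.x0 - self.overlap <= x < self.x0 + self.inner + self.overlap
                    and self.y0 - self.overlap <= y < self.y0 + self.inner + self.overlap)

    def patches_containing(x, y, tract_id=0, n=10, inner=4000.0, overlap=100.0):
        # Yield every patch whose padded region contains the point. Points in
        # an overlap margin belong to more than one patch, and are therefore
        # processed more than once, as described in the text.
        for i in range(n):
            for j in range(n):
                p = Patch(tract_id, (i, j), i * inner, j * inner, inner, overlap)
                if p.contains(x, y):
                    yield p

    # A point near a patch boundary lands in two patches:
    print([p.patch_index for p in patches_containing(3950.0, 200.0)])
    # -> [(0, 0), (1, 0)]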
    2.4 Science Pipelines Organization
LSST data processing needs may be broken down into three major activities: Alert Production, Calibration Products Production, and Data Release Production. In sections 3, 4, and 5, respectively, we describe breaking these down into constituent pipelines. In this document, a pipeline is a high-level combination of algorithms that is intrinsically tied to its role in the production in which it is run. For instance, while both Alert Production and Data Release Production will include a pipeline for single-visit processing, these two pipelines are distinct, because the details of their design depend very much on the context in which they are run.
Pipelines are largely composed of Algorithmic Components: mid-level algorithmic code that we expect to reuse (possibly with different configuration) across different productions. These components constitute the bulk of the new code and algorithms to be developed for Alert Production and Data Release Production, and are discussed in section 6. Most algorithmic components are applicable to any sort of astronomical imaging data, but some will be customized for LSST.
    The lowest level in this breakdown is made up of our shared software primitives: libraries
that provide important data structures and low-level algorithms, such as images, tables, coordinate transformations, and nonlinear optimizers. Much (but not all) of this content is astronomy-related, but essentially none of it is specific to LSST, and hence we can and will make use of third-party libraries whenever possible. These primitives will also make it easier to access and process Level 1 and Level 2 data products within the Notebook aspect of the LSST Science Platform and associated computing services, as they constitute the programmatic representation of those data products. Shared software primitives are discussed in section 7.
    3 Alert Production
Alert Production is run each night to produce catalogs and images for sources that have varied or moved relative to a previous observation. The data products produced by Alert Production are given in Table 2.
Name             Availability  Description
DIASource        Stored        Measurements from difference image analysis of individual exposures.
DIAObject        Stored        Aggregate quantities computed by associating spatially colocated DIASources.
DIAForcedSource  Stored        Flux measurements on each difference image at the position of a DIAObject.
SSObject         Stored        Solar system objects derived by associating DIASources and inferring their orbits.
CalExp           Stored        Calibrated exposure images for each CCD/visit (sum of two snaps) and associated metadata (e.g. WCS and estimated background).
TemplateCoadd    Temporary     DCR corrected template coadd.
DiffExp          Stored        Difference between CalExp and PSF-matched template coadd.
VOEvent          Stored        Database of VOEvents as streamed from the Alert Production.
Tracklets        Persisted     Intermediate data product for the generation of SSObjects, generated by linking moving sources within a given night.

Table 2: Table of derived and persisted data products produced during Alert Production. A detailed description of these data products can be found in the Data Products Definition Document [LSE-163].
Alert Production is designed as five separate components: single frame processing, alert generation, alert distribution, precovery photometry, and a moving objects pipeline. The first four of these components run as a single linear pass through the data. The moving objects pipeline is run independently of the rest of the alert production. The flow of information through this system is shown in Figure 1.
    In this document we do not address estimation of the selection function for alert generation
    through the injection of simulated sources. Such a process could be undertaken in batch
    mode as part of the DRP. Source detection thresholds can be estimated through the use of
    sky sources (PSF photometry measurements positioned in areas of blank sky).
Figure 1: The alert production flow of data through the processing pipelines (single frame processing, alert generation, alert distribution, precovery photometry).
    3.1 Single Frame Processing Pipeline (WBS 02C.03.01)
The Single Frame Processing (SFM) Pipeline (see Figure 2) is responsible for reducing raw or camera-corrected image data to calibrated exposures, the detection and measurement of Sources (using the components functionally part of the Object Characterization Pipeline), the characterization of the point-spread function (PSF), and the generation of an astrometric solution for an image. Calibrated exposures produced by the SFM pipeline must possess all information necessary for measurement of source properties by single-epoch Object Characterization algorithms.
Astrometric and photometric calibration requires the detection and measurement of the properties of Sources on a CCD. Accurate centroids and fluxes for these Sources require an estimation of the PSF and background, which in turn requires knowledge of the positions of the Sources on an image. The SFM pipeline will, therefore, iterate over background estimation (see 3.1.4) and source measurement (see 3.1.5).
    The SFM pipeline will be implemented as a flexible framework where new processing steps
    can be added without modifying the stack code (this would include the ability to process non-
    crosstalk corrected images should a network outage between the base and processing center
    result in only the raw data being available). The pipeline, or a subset of the pipeline, should
    be capable of being run at the telescope facility during commissioning and operations.
    3.1.1 Input Data
Raw Camera Images: Amplifier images that have been corrected for crosstalk and bias by the camera software. All images from a visit should be available to the task (including snaps). An approximate WCS is assumed to be available as metadata derived from the Telescope Control System, with an absolute pointing uncertainty (for a full focal plane) of 2 arcseconds (OSS-REQ-0298, absPointErr) and the field rotation known to an accuracy of 32 arcseconds [LTS-206].
Reference Database: A full-sky astrometric and photometric reference catalog of stars derived either from an external dataset (e.g. Gaia) or from the Data Release Processing. Given the current Gaia data release timeline, the initial reference catalog is expected to have an astrometric uncertainty of σ = 1–6 milliarcseconds and a photometric uncertainty of σ = 20 millimag (for an r ∼ 20 G2V star). The expected release of these calibration catalogs is 2018; they will be derived from the Gaia spectrophotometric observations of non-variable sources.

Figure 2: Single frame processing of the nightly data: instrument signature removal, astrometric and photometric calibration, background and PSF estimation from the cross-talk corrected camera images.
Calibration Images: Flat-field calibration images for all passbands and all CCDs, appropriate for the time at which the observations were undertaken. No corrections will be made in the flat-fields for non-uniform pixel sizes; the flat-fields will correct to a common surface brightness. A flat SED will be assumed for all flat field corrections. Fringe frame calibration images will be scaled to an amplitude derived from the sky background (i.e. no sky spectrum will be available).
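The fringe scaling described above amounts to fitting the amplitude of the fringe pattern against the background-subtracted sky. A minimal numpy sketch, assuming a single fringe component and ignoring the source and defect masking that the production algorithm would apply:

    import numpy as np

    def fringe_scale(sky_image: np.ndarray, fringe: np.ndarray) -> float:
        """Least-squares amplitude of a (mean-subtracted) fringe pattern.

        Fits sky_image ~ offset + scale * fringe and returns `scale`.
        """
        f = fringe - fringe.mean()
        s = sky_image - sky_image.mean()
        return float(np.sum(f * s) / np.sum(f * f))

    # Synthetic check: a 0.8x fringe pattern on a flat sky plus noise.
    rng = np.random.default_rng(42)
    fringe = rng.normal(size=(100, 100))
    sky = 200.0 + 0.8 * fringe + rng.normal(scale=0.1, size=(100, 100))
    print(round(fringe_scale(sky, fringe), 2))  # ~0.8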
    Image Metadata:
    List of the positions and extents of CCD defects for all CCDs within the
    focal plane; electronic parameters for all CCDs (saturation limits, readnoise parameters), elec-
    tronic and physical footprint for the CCDs, linearity functions, models for the variation in the
    PSF width with source brightness (brighter-fatter), and parameterized models for a component-
    based WCS (e.g. a series of optical distortion models) as needed.
    3.1.2 Output Data
CalExp Images: A calibrated exposure (CalExp) is an Exposure object. The CalExp contains the image pixel values, a variance image, a bitwise mask, a representation of the PSF, the WCS (possibly decomposed into separable components), a photometric calibration object, and a model for the background. For the alert production, it is not anticipated that a model of the per-pixel covariance will be persisted, but this will be revisited dependent on the performance of image subtraction and anomaly characterization as described in 3.2.
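Schematically, the CalExp bundles these components into a single object; in the LSST stack this is an Exposure with attached components, and the field names in this sketch are illustrative only:

    from dataclasses import dataclass
    from typing import Any
    import numpy as np

    @dataclass
    class CalExpSketch:
        image: np.ndarray     # calibrated pixel values
        variance: np.ndarray  # per-pixel variance
        mask: np.ndarray      # bitwise mask planes
        psf: Any              # PSF representation, evaluable at any pixel
        wcs: Any              # WCS, possibly a composition of separable transforms
        photo_calib: Any      # photometric calibration (zeropoint) object
        background: Any       # model of the background subtracted from `image`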
Source Databases: A catalog of Sources with the measured features described in 3.1.5.
OCS Database: A parameterization of the PSF, WCS, photometric zeropoint, and depth for
    each CCD in a visit. The PSF may be a simplified version (e.g. a single Gaussian) of that derived
    for the Alert production. These data will be made available to the Telescope Control System
    (TCS) to assess the success of each observation. A limited version of nightly SFM could be run
    on the summit to generate this information or the data will be persisted within a database at
    the data center that will be accessible to the TCS.
    3.1.3 Instrumental Signature Removal
    Instrumental Signature Removal characterizes, corrects, interpolates and flags the camera (or
    raw) amplifier images to generate a flat-fielded and corrected full CCD exposure.
    Pipeline Tasks
    • Mask the image defects at the amplifier level based on the CCD defect lists, and the per
    CCD saturation limits
    • Assemble the amplifiers into a single frame (masking missing amplifiers)
• Apply full frame corrections: dark current correction, flat field to preserve surface brightness, fringe corrections. Flat fields will assume a flat spectral energy distribution (SED)
    for the source. Fringe frames will be normalized by fitting to the observed sky back-
    ground.
    • Apply pixel level corrections: apply a correction model for brighter-fatter to homogenize
    the PSF, correct for static pixel size effects based on a model
• Interpolate across defects and saturated pixels assuming a model for the PSF (with a nominal FWHM). An estimate of the PSF will be needed for this operation (from the TCS/OCS), or interpolation may need to be performed at the end of 3.1.4.
• Apply a cosmic ray detection algorithm as described in 6.3.1.
    • Generate a summed and difference image from the individual snaps propagating the
    union of the mask pixels in each snap
    Dependent on the properties of the delivered LSST image quality for 15 second snaps it may be
    required to model any bulk motion between snaps prior to combination (e.g. if dome seeing
    or the ground layer dominate the lower order components of the seeing).
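As a concrete illustration of the flexible-framework requirement, the sketch below strings ISR steps of the kind listed above into a configurable list of image-to-image callables. The Exposure dictionary, function names, and defect format are illustrative assumptions, not the stack's actual Task API:

    import numpy as np

    def mask_defects(exp, defects, sat_level):
        # Flag rectangular defect regions and saturated pixels in the mask.
        bad = np.zeros(exp["image"].shape, dtype=bool)
        for (y0, y1, x0, x1) in defects:       # defect list: rectangles
            bad[y0:y1, x0:x1] = True
        bad |= exp["image"] >= sat_level       # per-CCD saturation limit
        exp["mask"] |= bad
        return exp

    def apply_flat(exp, flat):
        # Divide by a flat normalized to unit median; corrects to a common
        # surface brightness and assumes a flat SED, as described above.
        exp["image"] /= flat / np.median(flat)
        return exp

    def run_isr(exp, defects, sat_level, flat, steps=None):
        steps = steps or [
            lambda e: mask_defects(e, defects, sat_level),
            lambda e: apply_flat(e, flat),
            # ... dark, fringe, brighter-fatter, interpolation, CR rejection
        ]
        for step in steps:                     # new steps can be inserted
            exp = step(exp)                    # without modifying the stack
        return exp

    exp = {"image": np.full((32, 32), 100.0), "mask": np.zeros((32, 32), bool)}
    exp = run_isr(exp, defects=[(0, 2, 0, 32)], sat_level=65000.0,
                  flat=np.ones((32, 32)))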
    3.1.4 PSF and background determination
Given exposures that have been processed through Instrument Signature Removal, Sources must be detected to determine the astrometric and photometric calibration of the images. As noted previously, an iterative procedure will be adopted to generate an estimate of the background and PSF, and to characterize the properties of the detected sources. Convergence criteria for this procedure are not currently defined. The default implementation assumes three iterations.
Pipeline Tasks

The iterative process for PSF and background estimation comprises:
• Background estimation on the scale of a single CCD as described in 6.9, which divides the CCD into subregions and estimates the background using a robust mean from non-source pixels.
• Subtraction of the background and the detection of sources as described in 6.5. The initial detection threshold for source detection will be 5σ, with σ estimated from the variance image plane.
• Measurement of the properties of the detected sources (see 3.1.5). Dependent on the density of sources, it may be necessary to deblend the images as described in 6.6.
• Selection of isolated PSF candidate stars based on a signal-to-noise threshold (default 50σ). This threshold is significantly deeper than the magnitude limit for Gaia astrometric catalogs, but is the threshold at which the astrometric error on the centroid due to photon noise is less than 10 mas and the photometric noise is less than 2%, for the case of the use of a deeper DRP-derived reference catalog.
• Single CCD PSF determination using the techniques described in 6.11.1 and the selected bright sources.
• Masking of source pixels within the CCD (growing the footprint of the Sources to mask the outer regions of the Source profiles will likely be required to exclude contributions to the background from low surface brightness features).
The default expectation is that all tasks within this procedure would iterate until convergence. There may be significant speed optimizations to be gained by excluding the Source detection step after an initial detection if the number of sources does not change significantly with updates to the background model.
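A minimal sketch of this iteration, with placeholder callables standing in for the components referenced above (6.9, 6.5, 3.1.5, 6.11.1) and the source-count stability check used as the early exit:

    def psf_and_background(exposure, estimate_background, detect_sources,
                           measure_sources, fit_psf, max_iter=3):
        """Iterate background estimation, detection, measurement, PSF fitting.

        Convergence criteria are not yet defined, so this sketch uses a
        fixed iteration count (default three) with an early exit once the
        detected source count stops changing. The record layout ("snr")
        is illustrative.
        """
        sources, psf, background = [], None, None
        for _ in range(max_iter):
            background = estimate_background(exposure, sources)
            detections = detect_sources(exposure, background, threshold_sigma=5.0)
            converged = len(detections) == len(sources)
            sources = measure_sources(exposure, detections, psf)
            # PSF candidates: bright (>= 50 sigma), isolated sources only.
            psf = fit_psf([s for s in sources if s["snr"] >= 50.0])
            if converged:
                break
        return background, sources, psf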
    3.1.5 Source measurement
For the Source catalog generated in 3.1.4, source properties are measured using a subset of the features described in 6.7. Source measurement is for all sources within the Source catalog, and not just the bright subset used to calibrate the PSF. We anticipate using the following plugin algorithms within the Source measurement step:
• Centroids based on a static PSF model fit (see 6.7.2)
• Aggregation of pixel flags as described in 6.7.2
• Aperture Photometry as given in 6.7.2 (but only for one or two radii)
• PSF photometry as given in 6.7.2, assuming a static PSF model fit
• An aperture correction estimated assuming a static PSF model and measurement of the curve of growth for detected sources, as given in 6.12
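This plugin-driven structure can be sketched as a simple registry of measurement functions run over each detected source. The registry decorator, record layout, and measurement bodies below are illustrative; the plugin names echo, but do not reproduce, the LSST meas_base naming:

    import numpy as np

    PLUGINS = {}

    def plugin(name):
        def register(func):
            PLUGINS[name] = func
            return func
        return register

    @plugin("base_SdssCentroid")        # centroid from a static PSF model fit
    def centroid(image, record):
        record["x"], record["y"] = record["peak"]

    @plugin("base_PsfFlux")             # PSF photometry with a static PSF model
    def psf_flux(image, record):
        x, y = record["peak"]
        record["psf_flux"] = float(image[int(y), int(x)])  # stand-in for a PSF fit

    @plugin("base_CircularApertureFlux")  # only one or two aperture radii
    def aperture_flux(image, record):
        record["ap_flux"] = float(image.sum())             # stand-in for an aperture sum

    def measure(image, peaks, active=tuple(PLUGINS)):
        records = []
        for peak in peaks:
            record = {"peak": peak}
            for name in active:         # configurable subset of plugins
                PLUGINS[name](image, record)
            records.append(record)
        return records

    print(measure(np.ones((8, 8)), peaks=[(3, 4)]))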
    3.1.6 Photometric and Astrometric calibration
Photometric and astrometric calibration entails a “semi-blind” cross match (because the pointing of the telescope is known to an accuracy of 2 arcseconds) against a reference catalog derived either from the DRP Objects or from an external catalog (see 3.1.1), the generation of a WCS (on the scale of a CCD or full focal plane), and the generation of a photometric zeropoint (on the scale of a CCD). These algorithms must degrade gracefully for the case of larger pointing errors (e.g. during the initial calibration of the system during commissioning) and may need to operate in a “blind” mode where the pointing and orientation of the telescope is not known.
    Pipeline
    The
    Tasks
    photometric and astrometric calibration is expected to be performed at
    the scale of a single CCD. It is possible that the calibration process will need to be extended to
    larger scales (up to a full focal plane) if there is significant structure in the photometric zero
    point, or if astrometric distortions cannot be calibrated at the scale of the CCD with sufficient
    accuracy (i.e. the astrometric distortions do not dominate the false positives in the image
    subtraction). A full focal plane level calibration strategy will introduce synchronization points
    within the processing of the CCDs as the detections on all CCDs will need to be aggregated
    prior to the astrometric fit.
    The procedures used to match and calibrate the data are,
• CCD level source association between the DRP reference catalog (or external catalog) and Sources detected during the PSF and background estimation stage will use a simplified Optimistic B approach described in 6.23.1. Given an astrometric accuracy of ∼0.5 milliarcseconds from external catalogs such as Gaia (for an r = 19 G2V star), or an accuracy of ∼50 milliarcseconds for the DRP catalogs, the search radii for sources will be dominated by the uncertainties in the pointing of the telescope and the rotation angle of the camera.
• Generation of a photometric solution on the scale of a single CCD, as described in 6.14.1
• Fitting of a WCS astrometric model for a single CCD using the algorithms given in 6.13.1. The WCS model is expected to be composed of a sum of transforms or astrometric components (e.g. an optical model for the telescope, a lookup table or model for sensor effects such as tree rings); a minimal sketch of such a composite transform follows this list.
• Persistence of the astrometric, PSF, and photometric solutions for possible use by the Telescope Control System (TCS) (see 3.1.2)
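A minimal sketch of a WCS composed of transform components, as anticipated in the list above; the component models here (a polynomial optical distortion and a tree-ring placeholder) are toy examples, not the pipeline's actual models:

    class CompositeWcs:
        """A WCS assembled from a chain of transform components, applied in
        order; the final component would be the projection to the sky."""
        def __init__(self, transforms):
            self.transforms = transforms

        def pixel_to_sky(self, x, y):
            for transform in self.transforms:
                x, y = transform(x, y)
            return x, y

    def optical_model(x, y):
        # toy low-order radial distortion standing in for the telescope optics
        scale = 1.0 + 1e-9 * (x * x + y * y)
        return x * scale, y * scale

    def tree_ring_model(x, y):
        # placeholder for a per-sensor lookup table of tree-ring displacements
        return x, y

    wcs = CompositeWcs([optical_model, tree_ring_model])
    xi, eta = wcs.pixel_to_sky(1000.0, 1500.0)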
Given the number of stars available on a CCD or the complexity of the astrometric solutions for the LSST (e.g. the decomposition of the WCS into components), it may be necessary that the astrometric and photometric solutions be performed for a full focal plane and not just a CCD. For these cases the algorithms used will be single visit matching (see 6.23.2), single visit photometric solutions (see 6.14.1), and single visit astrometric fits (see 6.13.2). Fitting to a full focal plane introduces a synchronization point in the alert processing where all CCDs must have completed their previous processing steps prior to the astrometric calibration.
Astrometric and photometric solutions within crowded fields will utilize the bright and easily isolated sources within a CCD image. The order of the WCS used in the astrometric fits will, therefore, depend on the number of calibration Sources that are available.
    3.2 Alert Generation Pipeline (WBS 02C.03.04)
The Alert Generation pipeline identifies variable, moving, and transient sources within a calibrated exposure by subtracting a deeper template image (see Figure 3). The DIASources detected on a DiffExp are associated with known DIAObjects and SSObjects (that have been propagated to the date of the CalExp exposure) and their properties measured. The process for image differencing requires the creation or retrieval of a TemplateCoadd, the matching of the astrometry and PSF of the TemplateCoadd to a CalExp, and the subtraction of the template image from the CalExp. Spurious DIASources will be removed using morphological and environment based classification algorithms.
The Alert Generation pipeline is required to difference the images, and to detect and characterize DIASources, within 24s (allowing for multiple cores and multithreading of the processing).
    3.2.1 Input Data
CalExp Images: Calibrated exposures processed through 3.1, with associated WCS, PSF, mask, variance, and background estimation.
Figure 3: Generation of alerts from the nightly data: image differencing and measurement of the properties of the DIASources, identification and filtering of spurious events, association of previously detected DIAObjects and SSObjects with the newly detected DIASources.
Coadd Images: TemplateCoadd images that spatially overlap with the CalExp images processed through 3.1. This coadded image is optimized for image subtraction and is expected to be characterized in terms of a tract/patch/filter. Generation of this template may account for differential chromatic refraction or be generated for a limited range of airmass, seeing, and parallactic angles.
Object Databases: Objects that spatially overlap with the CalExp images processed through 3.1. This Object catalog will provide the source list for determining nearest neighbors to the detected DIASources.
DIAObject Databases: DIAObjects that spatially overlap with the CalExp images processed through 3.1. This DIAObject catalog will provide the association list against which the DIASources will be matched.
SSObject Databases: The SSObject list at the time of the observation. The SSObject positions will be propagated to the date of the CalExp observations and will provide an association list for cross-matching against the detected DIASources to identify known Solar System objects.
Reference classification catalogs: Classification of DIASources based on their morphological features (and possibly estimates of the local density or environment associated with the DIASource) will be undertaken prior to association in order to reduce the number of false positives. The data structures that define these classifications will be required as an input to this spuriousness analysis.
    3.2.2 Output Data
    DiffExp Images:
    Image differences derived by subtracting a TemplateCoadd from a CalExp
    image.
DIASource Databases: DIASources detected and measured from the DiffExps, using the set of parameters described in the DPDD, will be persisted.
DIAObject Databases: DIASources will be associated with existing DIAObjects and persisted. New DIASources (i.e. those not associated) will generate a new instance of a DIAObject.
    3.2.3 Template Generation
Template generation requires the creation or retrieval of a TemplateCoadd (see 6.15) that is matched to the position and spatial extent of the input CalExp. Generation of the TemplateCoadd could be from a persisted Coadd that was generated from CalExp exposures with comparable (within a predefined tolerance) airmass and parallactic angles, or from a model that corrects for the effect of differential chromatic refraction (see 6.18). It is expected that these operations would be undertaken on a CCD level, but for efficiency the TemplateCoadd might be returned for a full focal plane, a tract, or a series of patches.
    Pipeline Tasks
• Query for TemplateCoadd images that are within a given time interval (default 2 years) of the current CCD image, and are within a specified airmass and parallactic angle range.
• (optional) Derive a seeing and DCR corrected TemplateCoadd from a model (see DCR template generation in 6.18). The current prototype approach assumes that the TemplateCoadd will be derived for the zenith and will comprise a data cube with spatial and wavelength dimensions (a low resolution spectrum per pixel). Propagating to the observation will require aligning the DCR correction in the direction of the parallactic angle of the CalExp.
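A toy sketch of this propagation step, assuming the zenith data cube described above; the refraction model and plate scale below are placeholder numbers, and a real implementation would use a proper atmospheric refraction calculation:

    import numpy as np
    from scipy import ndimage

    def template_from_dcr_cube(cube, wavelengths_nm, airmass, parallactic_deg,
                               plate_scale=0.2):
        """Collapse a zenith DCR data cube (nwave, ny, nx) into a template
        matched to an observation: shift each wavelength plane along the
        parallactic angle by its differential refraction, then sum."""
        def dcr_offset_arcsec(wave_nm):
            # toy model: refraction relative to a 500nm fiducial, growing
            # with zenith distance (valid for airmass >= 1)
            return 45.0 * (500.0 / wave_nm - 1.0) * np.sqrt(airmass**2 - 1.0)

        pa = np.deg2rad(parallactic_deg)
        template = np.zeros(cube.shape[1:])
        for plane, wave in zip(cube, wavelengths_nm):
            shift_pix = dcr_offset_arcsec(wave) / plate_scale
            template += ndimage.shift(plane, (shift_pix * np.cos(pa),
                                              shift_pix * np.sin(pa)))
        return template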
    3.2.4 Image differencing
Image differencing incorporates the matching of a TemplateCoadd to a CalExp (astrometrically and in terms of image quality), subtraction of the template image, detection and measurement of DIASources, removal of spurious DIASources, and association of DIASources with the previously identified DIAObjects and SSObjects.
    Pipeline Tasks
    • Determine a relative astrometric solution from the WCS of the TemplateCoadd image
    and CalExp image
• Match the DRP Sources for the TemplateCoadd (see 5.1.4) against Sources from the SFM pipeline (see 3.1) of the raw images.
• Warp or resample the TemplateCoadd using a Lanczos filter (as described in 7.20) to match the astrometry of the CalExp. It is possible that astrometrically matching the TemplateCoadd and CalExp using faint sources will need to be undertaken, dependent on the accuracy of the WCS.
• For CalExp images with an image quality that is better than the TemplateCoadd, preconvolve the CalExp image with the PSF. Use a convolution kernel (see 7.10) that is matched to the source detection kernel. This reduces the need for deconvolution in the PSF matching (see 6.16.1).
• Match the PSF of the CalExp and TemplateCoadd images as described in 6.19.1 and construct a spatial model for the matching kernel. This approach may include matching to a common PSF through homogenization of the PSF (see 6.16.2).
• Apply the matching kernel to the TemplateCoadd and subtract the images to generate a DiffExp (as described in 6.16.1; a simplified sketch of this step follows the list). Dependent on the relative signal-to-noise in the science and template image, decorrelation of the template image due to the convolution of the template with a matching kernel may be necessary (see 6.19.1).
• Detect DIASources on the DiffExp using the algorithms described in 6.5. Convolution with a detection kernel will depend on whether the CalExp was preconvolved in item 4.
• Measurements of the DIASources on the DiffExp will include dipole models and trailed PSF models (see 6.7.2) and parameters described in Table 2 of the DPDD. The specific algorithms used for measurement of DIASources will depend on whether the CalExp image was preconvolved.
• Measurement of the PSF flux on snap difference images for all DIASources.
• The application of spuriousness algorithms, also known as “real-bogus”, may be applied at this time, dependent on whether the number of false positives is less than 50% of the detected sources² (see 6.7.2). DIASources classified as spurious at this stage may not be persisted (dependent on the density of the false positives) [mopsPurityMin, OSS-REQ-0354]. The default technique will be based on a trained random forest classifier. It is likely that the training of this classifier will need to be conditioned on the image quality and airmass of the observations.
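The following sketch illustrates the core convolve-and-subtract step referenced in the list above, under the strong simplifying assumption that the matching kernel is a single Gaussian (appropriate only when the science PSF is a Gaussian broadening of the template PSF); the real pipeline solves for a spatially varying kernel (6.19.1):

    import numpy as np
    from scipy.signal import fftconvolve

    def gaussian_kernel(sigma, size=25):
        """A single-Gaussian matching kernel (a simplifying assumption)."""
        y, x = np.mgrid[:size, :size] - size // 2
        k = np.exp(-0.5 * (x * x + y * y) / sigma**2)
        return k / k.sum()

    def difference_image(science, template, matching_kernel):
        """Convolve the template with the matching kernel and subtract,
        producing a DiffExp-like image."""
        matched = fftconvolve(template, matching_kernel, mode="same")
        return science - matched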
    3.2.5 Source Association
In Source Association, DIASources detected within a given CCD will be cross-matched or associated (see 6.23.4) with the DIAObject table and the SSObjects (whose ephemerides have been generated for the time of the current observation). The association will be probabilistic and account for the uncertainties within the positions. The association may include flux and priors on expected proper motions for the sources. External targets (e.g. well localized transient events from other telescopes or instruments) can be incorporated within this component of the nightly pipeline (essentially treating external sources as additional DIAObjects and associating them with the DIASources), enabling either matching to DIASources or generation of forced photometry at the position of the external source.
² The requirement for a 50% false positive rate is given in the OSS (when discussing Solar System Object requirements) and impacts the sizing model for the alert stream.
    Pipeline Tasks
• Generate the positions of SSObjects that overlap a DiffExp given its observation time by propagating the SSObject orbits (see 6.25)
• As described in 6.23.4, source association will be undertaken for all DIASources. Matching will be to DIAObjects, and to the ephemerides of SSObjects. Positions for DIAObjects will be based on a time windowed (default 30 day) average of the DIASources that make up the DIAObject. A linear motion model for parallax and proper motion will be applied to propagate the DIAObject to the time of the observation. A probabilistic association may need to account for one-to-many and many-to-one associations. In dense regions it may be necessary to generate joint associations across all DIAObjects (and associated DIASources) in the local vicinity of a DIASource to correct for mis-assignment from previous observations. This could include the pruning and reassignment of DIASources between DIAObjects. A baseline approach for nightly processing will be to select based on a maximum a posteriori estimate for the association.
• DIASources will be positionally matched to the nearest 3 stars and 3 galaxies in the DRP Object database. In its simplest case the search algorithm will be a tree-based nearest neighbor search (the default radius for association is not defined; a sketch of such an association follows this list). The matched Objects will be persisted as a measure of local environment.
• DIASources unassociated with a DIAObject will instantiate a new DIAObject.
• The aggregate positions for the DIAObjects will be updated based on a rolling time window (default 30 days).
• Proper motion and parallax of the DIAObject will be updated using a linear model as described in 6.22.
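A minimal flat-sky sketch of the tree-based nearest neighbor association referenced above, using scipy's cKDTree; the match radius, column conventions, and the flat-sky projection are simplifying assumptions, and a production implementation would use spherical geometry and per-source positional uncertainties:

    import numpy as np
    from scipy.spatial import cKDTree

    def associate(dia_sources, dia_objects, radius_arcsec=1.0):
        """Match DIASources to propagated DIAObject positions. Both inputs
        are (n, 2) arrays of (ra, dec) in degrees on a small patch of sky;
        unmatched DIASources (index -1) would seed new DIAObjects."""
        dec0 = np.deg2rad(np.mean(dia_objects[:, 1]))
        # flat-sky projection so the tree metric approximates angular distance
        def proj(radec):
            return np.column_stack([radec[:, 0] * np.cos(dec0), radec[:, 1]])
        tree = cKDTree(proj(dia_objects))
        dist, idx = tree.query(proj(dia_sources),
                               distance_upper_bound=radius_arcsec / 3600.0)
        return np.where(np.isinf(dist), -1, idx)  # -1 marks no match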
    3.3 Alert Distribution Pipeline (WBS 02C.03.03)
The Alert Distribution Pipeline takes the newly discovered DIAObjects (including their associated historical observations) and all related metadata as described in the DPDD, and delivers alert packets in VOEvent format to a variety of endpoints via standard IVOA protocols (e.g., the VOEvent Transport Protocol; VTP). Packaging of the event will include the generation of postage stamp cutouts (30x30 pixels on average) for the difference image and the template image, together with the variance and mask pixels for these cutouts.
The SRD requires that the design of the LSST alert system should be able to handle 10⁷ events per night, which corresponds to 10⁴ alerts per visit or 50 alerts per CCD (with the time between subsequent visits averaging 39 seconds). All alerts (up to 10⁴ per visit) must be transmitted within 60s of the closure of the shutter of the final snap within a visit.
For a nightly event rate of 10⁷, and assuming the schema described in Tables 1 and 2 in the DPDD together with the generation of the postage stamp cutouts, the compressed VOEvents data stream amounts to approximately 600GB of data per night (assuming no filtering of the data). The Alert Distribution pipeline is designed to distribute these alerts with a workflow, including the access point of external event brokers, shown in Figure 4.
In addition to the full data stream, the Alert Distribution Pipeline will provide a basic alert filtering service. This service will run at the LSST U.S. Archive Center (at NCSA). It will enable astronomers to create filters (see 3.3.4) that limit what alerts, and what fields from those alerts, are ultimately forwarded to them. These user defined filters will be configurable with a simplified SQL-like declarative language. Access to this filtering service will require authentication by a user.
VOEvent alerts will be persisted in an alert database as well as distributed through a message queue. The alert database (AlertDB) will be synchronized at least once every 24 hours and will be queryable by external users. The message queue that distributes the alerts is expected to have the capability to replay events for the case of a break in the network connection between the queue and client, but not to support general queries.
Figure 4: Distribution of alerts from the nightly processing: generation of postage stamps around each detected DIASource, distribution of the DIAObjects as VOEvents, simple filtering of the event stream, and persistence of the events in a database.
    3.3.1 Input Data
DIAObject Database: DIAObjects, with new DIASources generated through image differencing, will be used to create alert packets.
Difference Images: The DiffExp will be used to generate postage stamp (cut-out) images of DIASources within the CCD.
Coadd Images: The TemplateCoadd used in image subtraction will be used to generate postage stamp images of the template image for DIAObjects.
    3.3.2 Output Data
VOEvent Database: VOEvents generated from the DIAObjects and cutouts will be persisted within a database (e.g. a noSQL database) or object store.
    3.3.3 Alert postage stamp generation
Creates the associated image cutouts (30x30 pixels on average) for all detected DIAObjects (cutouts are generated from the current observation and not from historical observations).
    The contents of this document are subject to configuration control by the LSST DM Technical Control Team.
    22

    LARGE SYNOPTIC SURVEY TELESCOPE
    Data Management Science PipelinesLDM-151Design
    Latest Revision 2017-05-19
    Pipeline Tasks
• Extract from the DiffExp the cutout of each DIAObject with a DIASource detected within the current observation. Cutout images will be scaled to the size of the DIASource, but on average will be 30x30 pixels. Variance and mask planes, WCS, background model, and associated metadata will be persisted. The prototype implementation assumes that these cutouts will be persisted as FITS images with a projection that is the native projection of the DiffExps.
• Extract from the TemplateCoadd a cutout of each DIAObject with a DIASource detected within the current observation. Cutout images will be identical in size and footprint to those derived from the DiffExp. Variance and mask planes, WCS, and associated metadata will be extracted with the pixel data. The prototype implementation assumes that these cutouts will be persisted as FITS images and that the projection will be that of the DiffExps.
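A sketch of the cutout extraction using astropy's Cutout2D, consistent with (but not taken from) the FITS-based prototype described above; the HDU layout and helper name are assumptions:

    from astropy.io import fits
    from astropy.nddata import Cutout2D
    from astropy.wcs import WCS

    def make_stamp(diffexp_path, ra, dec, size=(30, 30)):
        """Cut a postage stamp around (ra, dec) from a DiffExp FITS file,
        assuming the image lives in extension 1."""
        with fits.open(diffexp_path) as hdul:
            wcs = WCS(hdul[1].header)
            position = wcs.world_to_pixel_values(ra, dec)
            stamp = Cutout2D(hdul[1].data, position, size, wcs=wcs)
        # stamp.wcs is the cutout-relative WCS to persist with the pixels
        return stamp.data, stamp.wcs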
3.3.4 Alert queuing and persistence
The alert queue distributes and persists DIAObjects with new DIASources as VOEvents through a message queue. It includes a limited filtering interface, but persists the full VOEvents in an AlertDB. The event message stream and the AlertDB will be synchronized at least once every 24 hours.
    Pipeline Tasks
• Publish DIAObjects to a caching message queue (e.g. Apache Kafka) through the butler (a minimal publishing sketch follows this list). The prototype implementation assumes a distributed and partitioned messaging system that uses a publication-subscription model for communication between clients and the queue. This model maintains feeds of messages in categories called topics. An example topic would be a DIAObject. Whether a topic would comprise a full DIAObject or a subset of the data remains open (passing subsets of parameters as individual topics would require that the client be able to synchronize and join topics into a full DIAObject). For each of the 189 CCDs, approximately 50 events will be passed as messages to the messaging queuing system. The distribution of the events from a given CCD will not
    be synchronized with other CCDs within the focal plane (alerts from each CCD will be
    independently processed).
• A consumer layer will subscribe to the message queue, package the events as VOEvents, and distribute these events to external users. To allow for network outages between the message queue and the consumer, the message queue must be able to replay previous events.
• The consumer layer will provide a command line API to define simple queries or filters of the events (limited to querying on existing DIAObject fields, or filtering the attributes of the DIAObject). Web-based interfaces to the consumer layer will be developed by SUIT.
• Filtered or the full stream of DIAObjects will be packaged into VOEvents and broadcast to VOEvent clients through the consumer layer
• A full, unfiltered, VOEvent alert stream will be broadcast to the AlertDB using the consumer layer.
• Prior to the start of the subsequent night’s observations, the message queue will be flushed and synchronized with the AlertDB. It is possible to persist the message queue on a longer timescale, but it is a requirement that synchronization be performed within 24 hours of the observations.
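A minimal sketch of the publishing step referenced in the first bullet, using the kafka-python client; the topic name, serialization, and broker address are placeholders, and the real system would publish through the butler as described above:

    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=["broker1:9092"],
        value_serializer=lambda d: json.dumps(d).encode("utf-8"))

    def publish_alerts(dia_objects, topic="diaobject"):
        for obj in dia_objects:        # ~50 events per CCD, 189 CCDs per visit
            producer.send(topic, obj)  # CCDs are published independently
        producer.flush()               # block until the broker has the batch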
To cope with the variation in density of events as a function of position on the sky, and the need for fault tolerance, the message queue will need to be able to partition and replicate data. Given the 600GB of data generated per night from the alert distribution, each full DIAObject stream will require about 0.1Gb/s network capacity (see the consistency check below). Whether the consumer layer will instantiate a new consumer for each filter (or client) or will orchestrate a single subscription to the message queue is an open question that will depend on the expected network topology (internal and external to the data center at NCSA).
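As a consistency check on this capacity figure (assuming, for the estimate only, that the nightly volume is spread uniformly over a roughly 10 hour observing night):

    \frac{600\,\mathrm{GB} \times 8\,\mathrm{bit/byte}}{10\,\mathrm{h} \times 3600\,\mathrm{s/h}} \approx 0.13\,\mathrm{Gb/s},

in agreement with the ∼0.1Gb/s quoted above.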
The AlertDB will have an interface that can be queried (to enable historical searches of events), including searches on fields other than timestamps. It is expected that the AlertDB will be a noSQL datastore (e.g. Cassandra).
    3.4 Precovery and Forced Photometry Pipeline
The precovery and forced photometry pipeline performs two tasks (see Figure 5). First, forced PSF photometry is undertaken for all DIAObjects that have a detected DIASource within a (default 1 year) window of time from the observation. Second, within 24 hours, precovery forced photometry is performed on all unassociated DIASources within an image (i.e. new DIAObjects). For each new DIAObject, forced (PSF) photometry will be measured at the position of the source in each of the preceding 30 days of DiffExps.
Forced photometry is not required prior to alert generation. Completion of the precovery photometry is required within 24 hours of the completion of the observations. Forced and precovery photometry can be undertaken as part of the nightly workflow if they do not impact the time required to distribute the alerts.
Figure 5: Forced photometry for DIAObjects: forced photometry on a night’s DiffExp for all DIAObjects that have detected DIASources within the last year, precovery photometry for the previous 30 days of DiffExps for new DIAObjects.
    3.4.1 Input Data
Difference images: A cache of DiffExps within a finite time interval (default 30 days) of the previous night’s observations (inclusive of the previous night’s data)
DIAObject Database: All DIAObjects with a DIASource detection within the last 12 months, and all unassociated (new) DIAObjects observed within the previous night
    3.4.2 Output Data
DIAForcedSource Databases: Forced PSF photometry at the centroid (from the aggregated individual DIASource centroids) of a DIAObject. The forced photometry is undertaken on the current night’s DiffExp for all DIAObjects with DIASources detected within the last year, and on the previous 30 days of DiffExps for all newly detected DIASources.
3.4.3 Forced Photometry on all DIAObjects
Generate forced (PSF) photometry on the DiffExp for all DIAObjects that overlap with the footprint of the CCD. Forced photometry is only generated for DIAObjects for which there has been a DIASource detection within the last 12 months. The forced photometry is persisted in the forced photometry table in the Level 1 database. Alerts are released prior to the generation of forced photometry, and forced photometry is not released as a part of an alert, which means that this component of the processing is not subject to the 60 second processing requirements for nightly processing.
    Pipeline Tasks
• Extract all DIAObjects within the Level 1 database with a DIASource detected within the last year (including the current night’s observations). This information is available from the DIASource and DIAObject association.
• For the aggregate positions within the DIAObject, undertake a PSF forced measurement as described in section 6.7.1 (a minimal sketch follows this list)
    • Update the forced photometry tables in the Level 1 database.
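A minimal sketch of a forced PSF measurement at a fixed centroid, as referenced in the second bullet; it assumes a constant PSF stamp and a source well inside the image, and the weighted least-squares amplitude below is a standard estimator rather than the 6.7.1 implementation:

    import numpy as np

    def forced_psf_flux(diffexp, variance, psf_image, x, y):
        """Weighted least-squares fit of the PSF amplitude at (x, y), with
        the centroid held fixed at the DIAObject's aggregate position."""
        half = psf_image.shape[0] // 2
        cut = np.s_[int(y) - half:int(y) + half + 1,
                    int(x) - half:int(x) + half + 1]
        data, var = diffexp[cut], variance[cut]
        w = 1.0 / var
        flux = np.sum(w * psf_image * data) / np.sum(w * psf_image**2)
        flux_err = 1.0 / np.sqrt(np.sum(w * psf_image**2))
        return flux, flux_err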
3.4.4 DIAObject Forced Photometry
Updated forced photometry table for all new DIAObjects.
    Pipeline Tasks
• Extract from the Level 1 database all DIAObjects that were unassociated (i.e. new DIASource detections) from the previous night’s reduction. Filtering of the DIAObjects will need to account for cases where new DIASources are observed more than once within a night (where the second or subsequent observations do not result in a new DIAObject).
    • Extract DiffExps within a default 30 day window prior to the observation
• Force photometer the extracted images using a PSF model as described in 6.7.1 and the centroid defined in the DIAObject
    • Update the forced photometry table within the Level 1 database
    3.5 Moving Object Pipeline (WBS 02C.03.06)
The Moving Object Pipeline (MOPS) is responsible for generating and managing the Solar System data products. These are Solar System objects with associated Keplerian orbits, errors, and detected DIASources. Quantitatively, it shall be capable of detecting 95% of all Solar System objects that meet the criteria specified in the OSS (i.e. the observations required to define an orbit) [orbitCompleteness, OSS-REQ-0159]. Each visit within 10 degrees of the Ecliptic will detect approximately 4,000 asteroids.
Components of MOPS are run during and separately from nightly processing (see Figure 6). MOPS for nightly processing is described in 3.2.5 as part of source association. “Day MOPS” processes newly detected DIAObjects to search for candidate asteroid tracks. The procedure for Day-MOPS is to link DIASource detections within a night (called tracklets), to link these tracklets across multiple nights (into tracks), to fit the tracks with an orbital model to identify those tracks that are consistent with an asteroid orbit, to match these new orbits with existing SSObjects, and to update the SSObject table. By its nature this process is iterative, with DIASources being associated and disassociated with SSObjects. It is expected that a frequency of one day for these iterations (i.e. the SSObjects will be updated each day) will be sufficient.
    3.5.1 Input Data
DIAObject Database: Unassociated DIASources from the previous night of observing. This means DIAObjects that were newly created during the previous night because they could not be associated with known DIAObjects. DIASources associated with an SSObject within the night are still passed through the MOPS machinery
    SSObject Database:
    The catalog of known solar system sources
Exposure Metadata: A description of the footprint of the observations, including the positions of bright stars or a model for the detection threshold as a function of position on the sky (including gaps between chips)
    3.5.2 Output Data
SSObject Database: An updated SSObject database with SSObjects both added and pruned as the orbital fits are refined
DIASource Database: An updated DIASource database with DIASources assigned and unassigned to SSObjects
    Tracklet Database:
    A temporary database of tracklets measured during a night. This database
    will be persisted for at least a lunation.
    3.5.3 Tracklet identification
From multiple visits within a night, link unassociated DIASources to form tuples (or n-tuples) of DIASources.
    Pipeline Tasks
• Extract unassociated DIASources from the Level 1 database
• Link DIASources into tracklets assuming a maximum velocity for the moving sources; the maximum velocity will be based on a prior as described in [20]. For each tracklet a velocity vector will be calculated to enable pruning or merging of degenerate tracklets within a data set (a minimal linking sketch follows this list).
• Merge tracklets by clustering in velocity and position (propagated to a common visit time). Tracklets can contain multiple points, and all permutations of the asteroid tuples will be stored. In the process of merging tracklets, DIASources that are not a good fit for the merged tracklet will be removed and their associated tracklets returned to the tracklet database. Moving or trailed sources will incorporate the position angle of the source when linking. Details of the implementation of the DIASource linkage are described in 6.26.
    • Temporarily persist a database of tracklets. This database will be required for at least
    30 days of data but, depending on resources available, may persist for longer.
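A brute-force sketch of the intra-night linking step referenced above; the velocity bound is illustrative, and a real implementation would use a spatial index rather than testing all pairs:

    import numpy as np
    from itertools import combinations

    def make_tracklets(sources, vmax_deg_per_day=2.0):
        """Link unassociated intra-night DIASources into tracklets under a
        maximum-velocity prior. `sources` is an (n, 3) array of
        (ra, dec, mjd); each tracklet carries its velocity vector for the
        later pruning/merging of degenerate tracklets."""
        tracklets = []
        for i, j in combinations(range(len(sources)), 2):
            ra1, dec1, t1 = sources[i]
            ra2, dec2, t2 = sources[j]
            dt = t2 - t1
            if dt <= 0:
                continue
            v_ra = (ra2 - ra1) * np.cos(np.deg2rad(dec1)) / dt
            v_dec = (dec2 - dec1) / dt
            if np.hypot(v_ra, v_dec) <= vmax_deg_per_day:
                tracklets.append(((i, j), (v_ra, v_dec)))
        return tracklets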
    3.5.4 Precovery and merging of tracklets
Tracklets are matched and merged with existing SSObjects and removed from the Tracklet database. This culls any tracklets or DIASources that obviously belong to an existing SSObject from the rest of the processing.
    Pipeline Tasks
    • Return all tracklets identified within a given night of observations
    • Return the footprints of each visit and the time of the observation
• Extract SSObjects from the SSObject database and propagate those orbits to the position and time of a visit. Details of this orbit propagation for precovery are described in 6.25.
• Merge (precovery) the tracklets with the projected SSObject trajectories and refit the SSObject orbit model. DIASources previously associated with an SSObject may no longer fit the updated SSObject orbits. These DIASources will be removed from the SSObject and returned as unassociated DIAObjects to the Level 1 database. All tracklets associated with these DIAObjects will be returned to the tracklet database. Details of this attribution and precovery are described in 6.27.
    3.5.5 Linking tracklets and orbit fitting
Given a database of tracklets constructed from a window (default 30 days) of time, link the tracklets into tracks assuming a quadratic approximation to the trajectory. Fit these tracks with orbital models and update the SSObject database.
    Pipeline Tasks
    • Extract all tracklets from the tracklet database for a specified window in time (default 30
    days)
• Merge tracklets into tracks based on their velocities and accelerations. Candidate tracks are pruned by fitting a quadratic relation to the positions (after applying a topocentric correction to the positions of the sources; a minimal pruning sketch follows this list). Efficiency in this matching procedure is provided by a spatial index such as a kd-tree (see 6.28).
• Fit an orbit to each candidate track using a tool such as OOrb (https://github.com/oorb/oorb) and, for poorly fitting points, return the DIASources and associated tracklets to their respective databases for subsequent reprocessing.
• Merge SSObjects that have similar orbital parameters based on range searches within the six dimensional orbital parameter space. Merged SSObjects will need to be refit, and any poorly fitting DIASources (and associated tracklets) returned to their respective databases for subsequent reprocessing. Details of this procedure are given in 6.29.
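A minimal sketch of the quadratic pruning step referenced in the second bullet; the residual threshold is illustrative, and the topocentric correction is assumed to have been applied already:

    import numpy as np

    def track_is_plausible(times, ra, dec, max_resid_arcsec=0.3):
        """Fit a quadratic in time to each coordinate of a candidate track
        and accept it only if the worst residual is small."""
        t = times - times.mean()
        worst = 0.0
        for coord in (ra, dec):
            coeffs = np.polynomial.polynomial.polyfit(t, coord, deg=2)
            model = np.polynomial.polynomial.polyval(t, coeffs)
            worst = max(worst, np.max(np.abs(coord - model)) * 3600.0)
        return worst <= max_resid_arcsec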
    3.5.6 Global precovery
For all new or updated SSObjects, propagate the orbits to the positions and times of the observations of all tracklets and orphan DIAObjects to “precover” further support for the orbits. This will prune the number of tracklets and DIAObjects that will require merging in subsequent observations.
    Pipeline Tasks
    • Return all tracklets identified within a given night of observations
    • Return the footprints of each visit and the time of the observation
• Extract orbits for all new or updated SSObjects and propagate the positions to the times of the observations for all visits covering the extent of the tracklet database (default 30 days; see 6.25)
• Merge the tracklets with the projected SSObject positions and refit the SSObject orbit model. Poorly fitting DIASources (and associated tracklets) will be removed from the SSObject and returned as unassociated DIAObjects to the Level 1 database (as described in 6.27).
The process for precovery and updating of the SSObject models is naturally iterative (given the pruning of poorly fitting DIAObjects and tracklets). Updates of the SSObjects as part of each night of operations should enable sufficient iterations without requiring Day-MOPS to be rerun multiple times per day. The computationally expensive operations in this pipeline are the orbit propagation and the orbit fitting. Resources required for orbit propagation could be reduced by removing the initial precovery stage, but at the cost of increasing the number of tracklets that would be available for matching into tracks. Orbital trajectories could be pre-calculated and modelled as polynomials to enable fast interpolation during Day-MOPS.
Extending the Global Precovery to include singleton DIASources (i.e. ones that are not merged into tracklets) would enable the identification of asteroids at the edge of the nightly footprint (where an object moves outside of the nightly survey footprint prior to the second visit, or a second visit is not obtained for a given field).
    3.5.7 Prototype Implementation
Prototype MOPS codes are available at https://github.com/lsst/mops_daymops and https://github.com/lsst/mops_nightmops. The current DayMOPS prototype already performs within the computational envelope envisioned for LSST Operations, though it does not yet reach the required completeness.
Figure 6: Detection and orbital modelling of moving sources within the nightly data: tracklet generation from revisits, filtering of tracklets based on known SSObjects, fitting of tracks and orbits to tracklets, pruning of tracklets based on new and updated DIAObjects and SSObjects.
    4 Calibration Products Production
This section details the input data and algorithms required to generate all data products necessary for the photometric calibration of the LSST survey. Details of the application of these products are covered in other sections of this document. The details of the input datasets are given in §4.2 and §4.5, which define the source of these data, i.e. which will be provided by the camera team and which will be measured on the mountain. Finally, sections §4.3 and §4.6 list the various output data products from the Calibration Products Pipeline.
    4.1 Key Requirements
    The work performed in this WBS serves several complementary roles:
• It will enable the production of calibration data products as required by the Level 2 Photometric Calibration Plan (LSE-180) and other planning documents. This includes both characterization of the sensitivity of the LSST system (optics, filters and detector) and the transmissivity of, and emission from, the atmosphere;
    • It will characterize detector anomalies in such a way that they can be corrected either
    by the instrument signature removal routines in the Single Frame Processing Pipeline
    (WBS 02C.03.01) or, if appropriate, elsewhere in the system;
    • It will provide updated values of the crosstalk matrix to the camera DAQ (for AP) and DM
    (for DRP) for correction of the raw data;
    • It will allow for characterization of the optical ghosts and scattered light in the system.
    4.2 Inputs
The following section details the input datasets which will be available to the Calibration Products Pipeline, and which will be acquired by the operations team at some frequency TBD. Some of these will be acquired frequently, e.g. flats, while some will be acquired much less frequently, e.g. the gain and linearity values. It should be noted that these are the raw inputs, and as such, the algorithmic sections for items that are listed as camera team deliverables are shown as “None” as these will have been previously developed. However, many of these
items are re-listed in the outputs section, where the algorithms to recalculate/monitor these on the mountain are defined.
    4.2.1 Bias Frames
    A set of bias frames used for the production of the master bias frame, obtained by acquiring
    many zero second exposures with the shutter remaining closed, taken at the normal LSST
    cadence.
    • Algorithmic component: None - these just need to be taken.
    4.2.2 Gain Values
    Camera Team deliverable
The gain values for all amplifiers in the camera, in e⁻/ADU; note that these are required to high accuracy (0.1%), as they are used in determination of the photometric flats.
    • Algorithmic component: None.
    4.2.3 Linearity
    Camera Team deliverable
    The linearity curve for each amplifier in the camera, as well as the level above which these
    non-linearity curves should be considered unreliable.
    • Algorithmic component: None.
    4.2.4 Darks
Sets of long dark frames (c. 300s), with the actual exposure length optimized for the dark current in the delivered sensors, the delivered read-noise, and considering the trade-off against the integrated cosmic ray flux and radioisotope contamination.
    • Algorithmic component: None - these just need to be taken.
    4.2.5 Crosstalk
    Camera Team deliverable
The crosstalk matrix for every pair of amplifiers in the camera. It is worth noting that this is expected to be very sparse.
    • Algorithmic component: None.
    4.2.6 Defect Map
    Camera Team deliverable
A list of all bad (unusable) pixels in each CCD, as well as a list of possibly suspect pixels, i.e. ones which should be flagged as such during processing.
    • Algorithmic component: None.
    4.2.7 Saturation levels
    Camera Team deliverable
The lowest level (in electrons), for each amplifier, at which charge bleeds into the neighboring pixels. If necessary, they will also provide the level at which the serial register saturates (i.e. if the serial saturates at a lower level than the parallels).
    • Algorithmic component: None.
    4.2.8 Broadband Flats
Sets of flats taken through the standard LSST filters. Flats will be taken at a number of flux levels to measure the “brighter-fatter effect” coefficients and to check linearity, including sets of “superflats” - sets of high-flux flats with many repeats (∼50, possibly ∼100). The superflats taken for “brighter-fatter effect” characterization will not need to be taken regularly, as this effect is not expected to evolve with time.
    • Algorithmic component: None - these just need to be taken.
    4.2.9 Monochromatic Flats
Sets of ‘monochromatic’ (c. 1nm bandwidth and spacing) flat-field screen images taken with no filter/glass in the beam.
    • Algorithmic component: None - these just need to be taken.
    4.2.10 CBP Data
    Sets of images taken with the Collimated Beam Projector (CBP). The proposed resolutions and
    steps in these datasets are preliminary. All CBP data will be processed using the standard
    LSST ISR, except without the application of flat-fielding. Standard LSST aperture photometry
    will then be used to measure the number of counts associated with each CBP spot.
• Algorithmic component: Scripting the CBP/8.4m to take each of these datasets in concert. The scripting/control requirements for the CBP are dealt with separately in §4.4.
CBP dataset 1: Sets of CBP images scanned in wavelength at 1nm resolution³ every 1nm, for a fixed set of spot positions on the camera, and for a fixed footprint on M1. No filter should be in the beam.
CBP dataset 2: Sets of CBP images scanned in wavelength at 20nm bandwidth every 100nm, while rotating the CBP about a pupil to move the spot pattern around the camera, for a fixed footprint on M1. No filter should be in the beam.
CBP dataset 3: Sets of CBP images scanned in wavelength at 20nm resolution every 100nm, for a fixed set of spot positions on the camera, and for a number of footprints on M1; the minimum number of footprints is c. 6 for a 30cm CBP, but in reality the use of more pointings will be explored to test the assumption of azimuthal symmetry. No filter should be in the beam.
³ 1nm ‘resolution’ here denotes the bandwidth of the light source, and can be this width, or any amount lower. It should, however, be noted that the accuracy of the wavelength calibration of the light source needs to be at the 0.1nm level.
CBP dataset 4: Sets of CBP images scanned in wavelength at 1nm resolution every 1nm, for a fixed set of spot positions on the camera, and for a fixed footprint on M1. Repeated for every filter. N.b. the wavelength range for each scan need only cover the range for which the filter transmits appreciable light.
CBP dataset 5: Sets of CBP images scanned in wavelength at 20nm resolution every 20nm, for a fixed set of spot positions on the camera, and for a fixed footprint on M1. Repeated for every filter.
CBP Crosstalk Measurement: Sets of CBP images taken with a suitably-designed sparse
    mask to allow identification and measurement of all ghost images arising from electronic
    crosstalk. The simplest sparse mask would have only a single spot, used to illuminate each
    amplifier in the camera in turn (but less sparse solutions are likely also possible). The wave-
    lengths used are unimportant, and there are no constraints on beam footprints on M1 or filter
    choice. This will be particularly necessary should LSST be operated in a slow-readout mode,
    for example for use with 30s integrations, as crosstalk coefficients would change considerably.
    4.2.11 Filter Transmission
    The transmission curves (transmission as a function of wavelength) for each filter, as a func-
    tion of filter position. This is to be delivered by the filter vendors rather than the camera team,
    but is input data which will not be measured by DM. The required resolution is 1nm or better,
    in keeping with the resolution of the monochromatic flats.
• Algorithmic component: None. We need to check what wavelength resolution and accuracy the vendors are proposing to use for this. Discussion with Steve Ritz at the AHM was very positive about the vendor’s proposal, but we should confirm the plan.
    4.2.12 Atmospheric Characterization
These are the external measurements of the atmospheric parameters, e.g. barometric pressure, ozone and temperature, provided by measurement systems both on and off site.
• Algorithmic component: Interfacing with the site team or parties responsible for the equipment,
to automate obtaining the measurements in a machine-readable form, including the ozone data from satellites.
4.3 Outputs from the Calibration Products Pipeline == Inputs to the AP/DRP Pipelines
This section details the outputs from the Calibration Products Pipeline. Algorithms for the production of each item are defined, including provision for the re-calculation of items previously listed as “camera team deliverables”.
    4.3.1 Master Bias
    A trimmed, overscan subtracted, master bias frame from the entire camera, produced by tak-
    ing the median of several-to-many bias frames for each CCD on the focal plane.
• Algorithmic component: Given LSST’s 2s readout, we do not expect to need to remove cosmic rays explicitly; a robust stacking algorithm should be sufficient (a minimal sketch follows). A prototype construction algorithm currently exists in pipe_drivers. The final version must be configurable to use scalar-, vector- or array-type overscan subtraction.⁴ If there is significant structure in the overscan regions or the bias images themselves, some summary of this will be made and kept as metadata to ensure that the fixed-pattern in each observation is the same.
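A minimal sketch of such a robust stack (here sigma clipping with astropy followed by a mean; a median would serve equally well), illustrating the combination step only and not the pipe_drivers prototype:

    import numpy as np
    from astropy.stats import sigma_clip

    def make_master_bias(bias_frames, sigma=3.0):
        """Per-pixel sigma-clipped combination of overscan-subtracted bias
        frames. `bias_frames` is a (nframe, ny, nx) array."""
        clipped = sigma_clip(np.asarray(bias_frames), sigma=sigma, axis=0)
        return clipped.mean(axis=0).filled(0.0)  # masked pixels -> 0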
    4.3.2 Master Darks
A trimmed, overscan and bias-frame subtracted, master dark frame for each CCD on the focal plane. These are produced by taking the median of several-to-many long (c. 300s) dark exposures, which are subsequently scaled to 1s exposure length.
• Algorithmic component: The individual frames will be run through the standard ISR processing (including cosmic ray removal) before being combined; this combination may be done using the standard LSST image stacking code, and a prototype construction algorithm currently exists in pipe_drivers. The final version must be configurable to use scalar-, vector- or array-type overscan subtraction, and be robust to contamination from cosmic rays when coadding.
⁴ If the readout noise in any channel is too low (relative to the gain) to properly sample the noise distribution, a simple fix is to add N (e.g. 3) sets of bias exposures before creating the stacked image.
    4.3.3 Master Linearity
Linearity curves for each amplifier in the camera; identical to §4.2.3, unless updated during operations.
    • Algorithmic component: An algorithm will need to be written to generate the linearity curves
    from raw data, either from binned flats, CBP data or “ramp frames”. This requires careful
treatment, as the “brighter-fatter effect” can masquerade as non-linearity. We expect to reuse
    the algorithm developed by the Camera Team to supply the initial values, provided it can be
    used to make this measurement to sufficient accuracy. The code to apply the non-linearity
    correction during ISR is currently being implemented by Russell Owen. Care must be taken to
    calculate these after bias subtraction, or be consistent with the way in which they are applied
    during ISR.
    4.3.4 Master Fringe Frames
    Compound (polychromatic) fringe frames, dynamically created to match the emission spec-
    trum of the atmosphere at the time of observation, if necessary. Should it be found that the
    night sky’s emission spectrum is sufficiently stable so as not to change the fringe pattern, the
    first few PCA components of the fringe pattern will be used instead.
• Algorithmic component: Construction of these fringe frames proceeds from monochromatic flats, likely using the existing PCA algorithm in pipe_drivers.
    4.3.5 Master Gain Values
The gain values for all amplifiers in the camera, in e⁻/ADU; identical to §4.2.2, unless updated during operations, though it is thought that this will likely be necessary.
    • Algorithmic component: Whilst highly accurate initial gain measurements will exist as an
    input (to better than 0.1%), monitoring the evolution of the gains to the required accuracy is
    currently an unsolved problem. The algorithm to determine this on the mountain is poten-
    tially tricky and will need to be developed.
It will be possible to monitor the relative gain within a given CCD by demanding that the flat fields be continuous across amplifier boundaries; this is, however, more difficult across device boundaries. Ticket DM-6030 exists to explore the possibility of using cosmic ray muons and the
unavoidable radioisotope contamination inside the camera for this purpose.⁵ If this fails then another method will need to be devised. The necessary accuracy of this measurement should be firmly established.
Two main techniques exist to measure the gain in CCDs: the photon transfer curve (PTC) technique, and illumination of the sensor with ⁵⁵Fe X-rays or those from another similar radioisotope. Both of these techniques need to be applied with care to achieve good results. Given the “brighter-fatter effect”, it is not clear to what accuracy the PTC can be used to measure the gain, though sufficiently large binning of flat-fields can be used to mitigate the majority of this effect; and while radioisotope gain measurement achieves good precision, the ability to illuminate the focal plane in a suitable manner is uncertain. Should the ⁵⁵Fe measurement technique be used, the flux measurement will use the standard stack source-finding and flux-measurement algorithms.
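For reference, a minimal sketch of the PTC gain estimate described above, using pairs of equal-exposure flats; read noise and brighter-fatter covariances are neglected, so in practice the flats would be binned or restricted to low flux as noted:

    import numpy as np

    def ptc_gain(flat_pairs):
        """Photon transfer curve gain estimate. For Poisson-dominated flats,
        var(f1 - f2)/2 = mean/gain in ADU, so the slope of variance against
        mean is 1/gain; differencing the pair cancels fixed-pattern noise."""
        means, variances = [], []
        for f1, f2 in flat_pairs:
            means.append(0.5 * np.mean(f1 + f2))
            variances.append(0.5 * np.var(f1 - f2))
        slope = np.polyfit(means, variances, 1)[0]  # d(var)/d(mean) = 1/gain
        return 1.0 / slope                          # gain in e-/ADU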
    4.3.6 Master Defects
A list of all the bad pixels in each CCD; identical to §4.2.6, unless updated during operations.
    • Algorithmic component: Perform statistical analysis of dark frames, flats and “pocket-pumping”
    exposures to derive an updated defect list. These algorithms should be transferable from the
    Camera and electro-optical test teams.
    4.3.7 Saturation Levels
    The level (in electrons), for each amplifier, at which charge bleeds into a neighboring pixel;
identical to §4.2.7, unless updated during operations.
    • Algorithmic component: This will be measured using CBP spot projections, though these
    levels could also be measured by saturating many stars in long sky exposures. Code will be
    written to detect where saturation is occurring using the shape of the spots, and calculate the
    saturation levels.
    4.3.8 Crosstalk
The crosstalk matrix element for every pair of amplifiers in the camera; identical to §4.2.5, unless updated during operations. The probability that this will need to be updated is high,
⁵ Merlin’s estimate is that the likelihood of failure is moderate-to-high; Robert disagrees.
    as the validity of these values depends on them being measured with the camera in its final
    configuration. This is due to the inter-CCD and inter-raft crosstalk levels being determined by
    the capacitive couplings, which, though supposedly small (especially in the case of the inter-
    raft coupling), depend on the exact physical locations of all the circuit boards and flex cables
    with respect to one another. It is therefore necessary to be able to remeasure this on the
mountain using the CBP crosstalk dataset.
    • Algorithmic component: In the un-multiplexed limit, this involves dithering a single CBP spot
    around the focal plane and measuring the positive and negative crosstalk ghosts, whilst disam-
    biguating these from optical ghosts using the fact that electronic ghosts have fixed focal-plane
    coordinate offsets whereas optical ghosts will move as a function of the CBP pointing. Some
    multiplexing will be possible using a multi-pinhole CBP mask, though the level of this remains
    to be determined, and depends on the final properties of the camera and the optical system.
We baseline for a single-spot mask, a one-spot-per-CCD mask, a one-spot-per-raft mask, and
ideally a one-spot-per-amplifier mask.
    CBP dithering scripts will be written which will involve mask-specific raster scanning routines,
    followed by either performing a camera rotation or by re-raster scanning at a different M1
    position for the previous focal plane positions to differentiate between the crosstalk and op-
    tical ghosts. Code to perform this differentiation will be written, which will then measure the
coupling coefficients. Further reading on crosstalk in LSST CCDs can be found in [23].
Confirmation of the measured crosstalk matrix will be performed using either CBP data,
saturated stars’ bleed trails, or cosmic rays in dark frames [35].
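For reference, applying a measured crosstalk matrix is straightforward; the following
first-order sketch (illustrative array conventions, not project code) subtracts every
amplifier’s scaled contribution from every other amplifier:

    import numpy as np

    def correct_crosstalk(amp_images, coeffs):
        """Remove electronic crosstalk to first order. `amp_images` holds the
        per-amplifier images, all flipped to a common readout-aligned
        orientation; coeffs[i, j] is the fraction of amplifier j's signal
        appearing in amplifier i. Uses the uncorrected images as the source
        estimate, which is adequate for the small couplings expected here."""
        amps = np.asarray(amp_images, dtype=float)
        corrected = amps.copy()
        for i in range(len(amps)):
            for j in range(len(amps)):
                if i != j:
                    corrected[i] -= coeffs[i, j] * amps[j]
        return corrected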
    4.3.9 Master Impure Broadband Flats
    A set of broadband master flats, one per filter, produced by taking the median of a set of
trimmed, bias-, overscan-, and dark-corrected flat-field images for each filter. These flats
will include any ghosted or scattered light, and will be used to monitor the evolution of dust
spots etc. on the optics. A set of broadband flats will be acquired each day and compared to these
    master flats, and if significant change is found, this will prompt the reacquisition of the neces-
sary input data products in § 4.2, and the regeneration of the corresponding outputs.
• Algorithmic component: Construction algorithm exists in pipe_drivers.
    4.3.10 Master Impure Monochromatic Flats
A set of master flats produced by taking the median of a set of ‘monochromatic’ (c. 1 nm)
trimmed, bias-, overscan-, and dark-corrected flat-field images for each filter. These flats
will include any ghosted or scattered light.
• Algorithmic component: Construction algorithm exists in pipe_drivers.
    4.3.11 Master Pure Monochromatic Flats
A set of master flats produced by taking the median of a set of ‘monochromatic’ (c. 1 nm)
trimmed, bias-, overscan-, and dark-corrected flat-field images for each filter. These flats
will exclude any ghosted or scattered light, with the ghost exclusion performed as follows.
• Algorithmic component: Having performed a starflat-like processing⁶ of the CBP data, and
having normalized the results, we will fit a surface through the CBP values, either per-CCD or
for the whole camera. A spline would be a reasonable choice; either the product of two 1-D
splines, or a thin plate spline. RHL would start with the former as they are easier to understand.
The dome-flat is then divided by this surface, giving an estimate of the illumination and chip-to-
chip correction. A curve is then fitted to this correction, and is used to correct the dome screen.
This should be close to the values derived from the CBP data (and can preserve discontinuities
in the QE across chips which the fitted curves have a hard time following). This process is
then iterated a few times, with each iteration resulting in a smaller and smoother correction,
which we are therefore better able to model. This process is then repeated at a suitable set of
wavelengths, chosen so that the variation of these corrections as a function of wavelength is
well captured. We will then know the relative QE for all the pixels in the camera, as a function
of wavelength, in the absence of a filter. Then, using the filter transmission curves, the relative
QE for all the pixels in the camera for each filter can be determined at 1 nm resolution; this
is our monochromatic photometric flatfield. See the LSST plans for Calibrated Photometry for
further reading.
⁶ Some adaptation of the stack’s starflat processing code will likely be necessary to adapt it to processing CBP
data, but this code by-and-large already exists or is independently under development.
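A minimal single-iteration sketch of the comparison step, assuming scipy and illustrative
argument names, is given below; the full procedure would smooth the resulting ratio, apply it
to the dome screen, and iterate as described above:

    import numpy as np
    from scipy.interpolate import SmoothBivariateSpline

    def dome_screen_correction(spot_x, spot_y, spot_flux, dome_flat):
        """Fit a smooth surface through the (sparse) normalized CBP spot
        fluxes, evaluate it at every pixel of the dome flat, and return the
        ratio: an estimate of the dome-screen illumination and chip-to-chip
        correction at this wavelength. Positions are in pixel units."""
        surface = SmoothBivariateSpline(spot_x, spot_y, spot_flux, kx=3, ky=3)
        ny, nx = dome_flat.shape
        yy, xx = np.mgrid[0:ny, 0:nx]
        model = surface.ev(xx.ravel(), yy.ravel()).reshape(ny, nx)
        return dome_flat / model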
    4.3.12 Master PhotoFlats
A set of master flats, each composed of a linear combination of pure monochromatic flats,
weighted by a flat-spectrum source (or other predefined standard SED), absorbed by a standard
atmosphere, and observed through each filter.
Each input flat will be calculated from
    the median of many exposures. This will only be necessary if per-object corrections are not
    being applied, though this product will always be used to flat-field the sky, with the appropri-
    ate sky-spectrum used for the weighting.
• Algorithmic component: The combination of pure monochromatic flats is simple, though
the “standard atmosphere” and “standard SED” remain to be defined.
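As a sketch of that combination (illustrative names; the SED, atmosphere, and filter
transmission are assumed to be sampled on the same wavelength grid as the monochromatic flats):

    import numpy as np

    def make_photoflat(mono_flats, sed, atmosphere, filter_curve):
        """Build a master photoflat as a weighted sum of pure monochromatic
        flats. `mono_flats` has shape (n_wavelengths, ny, nx); the per-
        wavelength weight is the product of the standard SED, the standard
        atmospheric transmission, and the filter transmission."""
        weights = sed * atmosphere * filter_curve
        return np.tensordot(weights, mono_flats, axes=(0, 0)) / weights.sum()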
    4.3.13 Master Low-resolution narrow-band flats
A set of master flats produced by taking the median of a set of low-resolution (both in space
and wavelength) versions of § 4.3.11, used to save memory in the conversion of the photometry
from the flattened data using the current sky colour to the proper flatfield for a given SED.
• Algorithmic component: Scripting to perform the necessary sweeps of the laser light source,
and characterization of its output, since measurements will need to be normalized to the pulse
energy.
    4.3.14 Pixel Sizes
A map of the (effective) pixel-size distortions, Δwidth and Δheight; at worst, this will be a
3-d data-cube of floats. Pixel size distortions include small-scale quasi-random size
variations, mask-stitching/tiling artifacts, tree-rings, and any other effects not dynamical in
nature.
    • Algorithmic component: The algorithm to measure this is currently a (somewhat) unsolved
    problem. It has been claimed by Aaron Roodman, Michael Baumer and Christopher Davis
    that these can be measured from flat-fields, but the problem is under-constrained, and thus
    the stability (nay, validity?) of their measurements is questionable, despite seeming to work.
    Further thought is required to establish whether their method can be used, and if not, devise
    another one. It is not obvious how the problem can be made to be well constrained, but work
    is ongoing in the DESC Sensor Anomalies Working Group (SAWG) to investigate this which
    might help inform future thinking on the matter.
Merlin knows that we are not allowed to rely on DESC work in the project, and hopes that the
last sentence is phrased in such a way that it is useful to inform readers where to look for
work on the subject without making it sound like they’re doing critical project work on which
we will rely.
    4.3.15 Brighter-Fatter Coefficients
    The coefficients needed to model the “brighter-fatter effect”. It is hoped that these are a small
    number of floats per CCD, but this is not yet entirely clear. The input data necessary to calcu-
late these will likely be restricted to superflats at various flux levels, with the possible addition
    of some star fields for verification of the coefficients.
    • Algorithmic component: A number of techniques exist to measure these (mostly developed
    by members of the Princeton LSST/HSC group). Code already exists to estimate the kernel/co-
    efficients, and apply the corrections using a slightly enhanced version of the Astier/Antilogus
    technique.
    4.3.16 CTE Measurement
    Measurement of the charge transfer efficiency for each amplifier/column in the camera. In the
    most simple case, where the dominant trap is close to the amplifier in the serial register and
    thus affects all columns equally, this would be a single number per amplifier. The next level
    of complexity would be a number per column, with the still more complex version involving
    characterizing the specific defects and their locations on the chips, in which case this becomes
    a per-pixel product, though this could be simplified with the use of bounding-boxes as with
    defect maps. The nominal case should likely be considered as per-column or per-amplifier,
    because if the number of columns with significant effects is small, these columns would most
    likely just be masked out rather than corrected.
• Algorithmic component: Measurement of CTE is subtle, though several established methods
exist for doing so. Using the ⁵⁵Fe method may not be possible due to the probable lack of a
radioisotope source in the camera, but the extended pixel edge response (EPER) method and the
    flat-field correlation method would both be possible using the existing input data products.
    We expect to be able to reuse the measurement algorithms from the Camera Team once they
    have been ported to run within the DM framework.
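As an illustration of the EPER approach (hypothetical names; assumes a bias-subtracted flat
that includes its serial overscan, and ignores prescan pixels in the transfer count):

    import numpy as np

    def eper_cte(flat, last_col, n_overscan=10):
        """Serial CTE via the extended pixel edge response: the charge
        deferred into the first overscan columns, divided by the last-column
        signal and by the number of serial transfers, gives the charge
        transfer inefficiency (CTI); CTE = 1 - CTI."""
        signal = np.median(flat[:, last_col])
        deferred = np.median(flat[:, last_col + 1:last_col + 1 + n_overscan].sum(axis=1))
        n_transfers = last_col + 1  # simplification: one transfer per column
        return 1.0 - deferred / (signal * n_transfers)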
    4.3.17 Filter Transmission
Monitoring of in-situ filter transmission. As well as the filter transmission measurement
provided by the camera team/vendor in § 4.2.11, we further baseline the development of a
procedure for monitoring the filter response at 1 nm resolution by making suitable CBP
measurements with and without the filters in the beam, and averaging over the angles.
    Whilst the flat-top portion of the filter pass-band will be monitored, given the small expected
    gradient and minimal ringing, the transmission across the top becomes degenerate with gray
    extinction or mirror degradation and its monitoring is therefore of less importance than that
    of the filter edges. The evolution of the edges of the filter bandpasses will be monitored to
    the best of the ability of the photometric calibration hardware, with the limit likely imposed
    by the laser performance and ability to characterize its output spectrum.
• Algorithmic component: Created from measurements in § 4.2.10. JIRA ticket DM-9046 has
been filed to determine whether, given recent results from DESC showing that 0.1 nm resolution
on the evolution of the filter edges is required for SNIa cosmology with LSST, this will be
added to the SRD and the requirements flowed down to here.
    4.3.18 Ghost catalog
    A catalog of the optical ghosts and glints which is available for use in other parts of the system.
    Detailed characterization of ghosts in the LSST system will only be possible once the system
    is operational. The baseline design therefore calls for this system to be prototyped using
data from precursor instrumentation; we note that ghosts and ghoulies in e.g. HSC are well
known and more significant than are expected in LSST. It is not currently clear where the
    responsibility for characterizing ghosts and glints in the system lies. We assume it is outside
    this WBS. On realising this, RHL instructed that it be noted that this constitutes a possible new
    entry to the risk registry. Merlin has proposed a meeting between himself, Robert, John
    Swinbank, Chuck and anyone else the other attendees think would be advisable to invite, in
    order to discuss the status of and plan for the measurement and correction of ghosts in the
    system. Merlin has heard somewhat differing opinions as to how correctable these are, and
    whether or not we plan to correct for ghosted light for photometry, and during discussions
    it seemed like proposing a meeting with those with deeper knowledge of the subject was
    necessary to get a resolution.
    The plan, as it stands from Robert in an email, was that we would apply an oversize mask to
    glints, assuming they are rare, and “if ghosts are well-characterised and only very bright stars
    matter we would probably subtract them. So basically, until we know what we’re looking at I
    don’t know what we’ll do.”
    4.3.19 Spectral Standards
    A set of standard stars, spectrally characterized above the atmosphere, covering a range of
    colors, and lying within an appropriate magnitude range, with one or more stars per LSST
pointing; the likely source of this data is Gaia. However, should this prove not to be a suitable
source⁷, a catalog will be carefully generated using the survey’s most photometric data,
utilizing an übercal/jointcal type approach.
⁷ This is likely to be the case in u-band, where the SNR for Gaia’s BP spectra falls off rapidly.
    • Algorithmic component: Color transformations need to be constructed from the Gaia mea-
surements, based on assumptions about the objects’ intrinsic SEDs, i.e. not using only color
terms. In the case that Gaia does not provide the catalog, a process similar to the Forward
    Global Calibration Model (FGCM) implemented by Eli Rykoff and David Burke for DES would
    be used. The latter process would likely be able to share atmospheric modelling code with
the reductions performed for the auxiliary/Calypso/calibration telescope.
    4.3.20 Spectrophotometric Standards
    A set of photometrically characterized stars with well known spectra, distributed across the
sky. This will likely comprise DA white dwarfs, i.e. CALSPEC standards, or a larger (fainter)
set of stars which will be bootstrapped from the faint extension to the CALSPEC standards [21].
    • Algorithmic component: Exactly how these stars will be chosen and cataloged remains TBD.
    4.3.21 Astrometric Standards
A set of stars used for the absolute astrometric calibration of each visit, i.e. the
determination of the nominal pointing for each exposure. The likely source of this data is
Gaia; there will be 4 magnitudes of overlap between Gaia’s faintest astrometric sources and
LSST’s brightest unsaturated sources, with the absolute astrometry provided by Gaia on these
objects expected to be ∼450 μas at the faint end. This data will be made available in Gaia
Data Release 2.
    4.4 CBP Control
The procurement of the CBP hardware includes that of the necessary low-level control
drivers/software. T&S TCS own the task of taking the vendor-provided low-level routines and
turning these into real-world usable routines by constructing higher level functions for e.g.
homing, slew-to-position, mask mounting, etc., though it should be noted that this is a
non-exhaustive
    and purely illustrative list of example functions, and not the requirement for the functionality
    that will be provided. T&S will also provide a pointing model for the CBP itself.
    Control scripts for the CBP and interfaces with the OCS will be written, to allow taking all the
    desired measurements, especially as several, if not all of these, require doing so in concert
    with the 8.4m. As well as writing the necessary scripts to acquire the raw data products out-
lined in § 4.2, it will also be necessary to deliver a coordinate transformation package to allow
    the CBP to maintain a fixed position on the focal plane whilst illuminating different portions
    of the pupil, and vice versa.
    4.5 Calibration Telescope Input Calibration Data
This section details the input data required to calibrate the auxiliary/Calypso/calibration
telescope itself. Broadly, this will include most of the ingredients listed in § 4.2, namely:
    • Gain values
    Less accuracy is needed here than for the main camera; a PTC-based measurement
    using flats will likely be sufficient if the data is binned and a quadratic fitted to correct
    for the “brighter-fatter effect”, and will therefore reuse the PTC-based algorithm from
    the 8.4m.
    • Crosstalk matrix
    This will reuse the algorithm designed for the 8.4m assuming we have a previous CBP
    version available, otherwise this will be calculated from saturated stars and/or cosmic
    rays.
    • Linearity curves for each amplifier
    This will reuse the algorithm designed for the 8.4m.
    • Defect map
    This will reuse the algorithm designed for the 8.4m.
    • Saturation levels
    This will reuse the algorithm designed for the 8.4m assuming we have a previous
CBP version available; otherwise it will involve the use of saturated stars.
    • Bias frames
    This will reuse the algorithm designed for the 8.4m.
    • Dark frames
    This will reuse the algorithm designed for the 8.4m.
    • Broadband flat-fields
    This will reuse the algorithm designed for the 8.4m.
• Monochromatic flat-fields⁸
This will reuse the algorithm designed for the 8.4m.
⁸ It is confirmed to be part of the baseline design that there will be both broadband and monochromatic light
sources at the auxiliary/Calypso/calibration telescope.
    • Disperser (grating/grism) transmission
    The baseline specification is for a Ronchi grating to be used as the dispersive element
    in the optical design. Although the transmission of a Ronchi grating is flat in wavelength,
because it will be placed in a non-parallel beam, second order light contamination means
that its effective transmission will not be perfectly flat, and this will need to be corrected
for. Furthermore, a grism or a blazed grating is also being considered for use as the
dispersive element, neither of which has a flat response in wavelength. However, the
smoothly varying nature of their transmission functions will allow these to be fitted for at
the same time as performing the fit to the atmospheric model.
    Should the contents of this document be strictly limited to the current design plan even when
    this has not been finalised? If so I will move the part about grism/blazing transmission to
    a comment to be re-included in the future if/when the aux telescope design is finalised. I
    thought this was probably OK for now though.
    Further to these standard camera calibration data products, an illumination/ghost correction
    will also be required, which will either be derived from star field observations or using the
    final CBP prototype for direct measurement.
    4.6 Calibration Telescope Output Data
This section details the calibrated outputs from the auxiliary/Calypso/calibration telescope,
which, like the items in section § 4.3, are outputs from the Calibration Products Pipelines to
be used during photometric calibration at various levels.
    4.6.1 Atmospheric Absorption
As shown in Figure 7, the determination of the atmospheric transmission starts with two images,
one dispersed, and one direct and unfiltered, acquired back-to-back with the auxiliary/Calypso/
calibration telescope, where the camera rotator will likely be set to align the spec-
    trum along the parallactic angle. Both images are initially bias-subtracted and dark-corrected
    as per the normal image processing, with cosmic rays detected and interpolated over, along
    with defective pixels.
    Both images then have their PSFs measured, and are astrometrically matched in the usual
    way; the direct image is treated exactly as normal, while for the dispersed image only the
    brightest objects in the image will be used, thereby preventing contamination from spurious
    detections due to the spectra.
    The direct image is suitably flux-scaled and warped, and is then subtracted from the dispersed
    image to remove the zeroth-order light, leaving only the spectra. It should be noted that, to
    first order at least, this will remove the sky-background from the dispersed image, and fur-
    thermore, as the stars used for atmospheric characterization will be very bright, any residual
    sky-background is thought to be a negligible contribution.
    Using the astrometry and the nominal dispersion relation given by the optical configuration,
    the regions in which the spectra fall are identified and the spectra are extracted. The strongest
    spectral lines in these crude uncorrected-spectra are identified and used to provide an im-
    proved estimate of the dispersion relation. This is performed by matching spectral features
found in the m = 1 and m = −1 spectra.
    This first approximation of the wavelength solution
    is used to calculate the incident wavelength as function of position on the detector, which is
    then used to construct an appropriate flat-field for the main target object in a narrow strip
    around the source, as a set of parallel stripes perpendicular to the dispersion direction, con-
    structed from the monochromatic flatfield data-cube. If the dispersion is not parallel to the
CCD’s serial or parallel direction, i.e. if we choose to disperse along the parallactic angle as
    above, then the Bresenham algorithm will be used to construct the appropriate flats.
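For reference, the standard Bresenham algorithm, sketched below, enumerates the integer pixel
coordinates along the dispersion trace so that each pixel can be assigned the flat-field value
appropriate to its incident wavelength:

    def bresenham(x0, y0, x1, y1):
        """Integer pixel coordinates along the line from (x0, y0) to (x1, y1)."""
        dx, dy = abs(x1 - x0), -abs(y1 - y0)
        sx = 1 if x0 < x1 else -1
        sy = 1 if y0 < y1 else -1
        err = dx + dy
        pixels = []
        while True:
            pixels.append((x0, y0))
            if x0 == x1 and y0 == y1:
                return pixels
            e2 = 2 * err
            if e2 >= dy:
                err += dy
                x0 += sx
            if e2 <= dx:
                err += dx
                y0 += sy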
    Having applied the flatfield, the 1D spectra are then extracted from the image by fitting a Gaus-
sian profile derived from the initially measured PSF. Whilst a Voigt profile is the correct model
Figure 7: Flowchart depicting the atmospheric absorption measurement pipeline.
to fit here, the fitting is often not stable; this is thought to be due to spurious power in the
wings, which is suppressed by using a Gaussian, and the result is good at the few-percent
level. The extracted spectrum is then flux calibrated and corrected for second order light
contamination⁹. A more precise wavelength calibration is then performed using the spectral lines
    in this corrected spectrum, taking into account the effect of differential chromatic refraction,
resulting in a spectrophotometrically calibrated measurement.
⁹ It should be noted that, strictly speaking, second order light contamination invalidates the flat-fielding method
described above. If the effect is small, a simple QE curve will likely suffice to correct for this effect; otherwise an
iterative approach to the flat-fielding will be taken.
The source’s true SED and the calibrated spectrophotometric observation are then used in
conjunction with the observational meta-data, e.g. the zenith angle, temperature, and baro-
    metric pressure, to derive an empirical measurement of the atmospheric transmission. This
    absorption profile is then fitted to an atmospheric transmission model to improve the deliv-
    ered spectral absorption measurement, as well to provide a parametric description of the
    state of the atmosphere at the time of observation.
    4.6.2 Night Sky Spectrum
The acquisition of a night sky spectrograph is unlikely, as it is not in the baseline design
specification. However, in the eventuality that such an instrument is obtained, we provision for
the determination of the emission spectrum of the night sky near the auxiliary/Calypso/
calibration telescope boresight, with R ∼ 200¹⁰, which will be used to synthesize flat-field
images matching the sky’s SED using the monochromatic dome flats.
¹⁰ It is not entirely clear yet whether these will be taken on the Calypso or the 8.4m boresight.
• Algorithmic component: Assuming we have a sky spectrograph this is simple. In the absence
of a sky spectrograph, an R ∼ 10 spectrum will be acquired using standard/narrowband filters.
Furthermore, if the fringe structures are sufficiently stable, i.e. they are well described by
3 PCA components, we may be able to simply use a classic fringe subtraction.
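A minimal sketch of such a classic, PCA-based fringe subtraction (illustrative only; plain
numpy, with the number of components as a free parameter):

    import numpy as np

    def fringe_subtract(science, fringe_frames, n_components=3):
        """Build a small PCA basis from a stack of background-subtracted
        fringe frames, fit the component amplitudes to the science image by
        least squares, and subtract the resulting fringe model."""
        stack = np.array([f.ravel() - f.mean() for f in fringe_frames])
        _, _, vt = np.linalg.svd(stack, full_matrices=False)
        basis = vt[:n_components]                      # (n_components, npix)
        target = science.ravel() - science.mean()
        amps, *_ = np.linalg.lstsq(basis.T, target, rcond=None)
        return science - (amps @ basis).reshape(science.shape)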
    4.7 Photometric calibration walk-through
    The Calibration Products Production section aims to provide all the ingredients necessary to
    photometrically calibrate the entire LSST survey, visit-by-visit and band-to-band, thus arriving
    at everything except a single photometric zero-point for the survey.
The effective end-to-end instrumental throughput, as a function of wavelength,
focal-plane/pupil position, and time, is known from the ghost-corrected monochromatic
flat-fields and the filter transmission functions.
The atmospheric transmission as a function of wavelength, at the time each observation is
made, is known at some position in the field-of-view by taking the ratio of the
spectrophotometrically calibrated stellar spectrum of a bright (8th–10th magnitude) star, as
measured by the auxiliary/Calypso/calibration telescope, to the BP/RP spectrum as measured
above the atmosphere by Gaia.
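In the idealized case where all three quantities are sampled on a common wavelength grid, this
ratio reduces to the following (a sketch with illustrative names, ignoring resolution matching
between the two spectra):

    import numpy as np

    def atmospheric_transmission(observed, above_atmosphere, instrument):
        """Empirical atmospheric transmission from the ratio described above,
        assuming observed = above_atmosphere * instrument * T_atm, with all
        three inputs as flux densities on a common wavelength grid."""
        with np.errstate(divide="ignore", invalid="ignore"):
            t_atm = observed / (above_atmosphere * instrument)
        return np.clip(np.nan_to_num(t_atm), 0.0, 1.0)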
It should be noted that the effective delivered spectral resolution of both the Gaia
spectrophotometry and the atmospheric absorption can be improved using model fits. The stellar
spectra from Gaia (with R ∼ 40 and 70 for the BP and RP spectra respectively [32]) can be
fitted to standard stellar spectral types, as will be done internally by Gaia for their data
releases [2]. For the atmospheric absorption profile, as described in § 4.6.1, the absorption
features from the atmosphere will be fitted to an atmospheric transmission model (e.g.
MODTRAN etc.), allowing
    us to improve the delivered measurement of the spectral absorption features present at the
    time of observation.
    Images are initially flat-fielded using the color of the sky at the time of observation. This
    ensures that the sky background is correctly flat-fielded, and can therefore be smoothly sub-
    tracted across amplifier and chip boundaries without residual discontinuities. This flat-fielding
is then reversed, and the resulting sky-background-subtracted image is re-flatfielded with some
pre-selected SED¹¹ in order to obtain a first-order estimate of the object’s SED. Later in pro-
    cessing, when an assumed SED has been derived for each object, per-object corrections are
    made to adjust both for the derived SED and for the atmospheric transmission at the time
    of observation. With each object now flat-fielded with the appropriate spectrum, and the at-
    mospheric transmission and system response functions corrected for in each visit, the photo-
metric zero-point for the visit is fitted using the photometric standard star set. This
therefore leaves just the overall flux level unknown, thus bringing us to one global photometric
zero-point for the entire survey.
¹¹ Whether this is a flat SED or e.g. some nominal G-star’s SED remains TBD.
This whole-survey zero-point can then be calculated using an empirical approach to tie it back
to the definition of the Jansky, and two proposals exist for doing so. The first is to use CBP
measurements in conjunction with NIST-calibrated photodiodes in the CBP’s integrating sphere to
    measure the absolute instrumental sensitivity, though this will require integrating over the
    pupil. The second is to use a ‘son-of-StarDICE’ type approach, where precisely calibrated and
stabilized LEDs of known wavelength and luminosity are observed by either LSST or the
auxiliary/Calypso/calibration telescope (as their observations are already tied together), allowing
    the absolute system response to be measured using observations which illuminate the entire
    pupil at a set of wavelengths.
    It should be noted that it is not yet known whether the atmospheric transmission will vary
    significantly across LSST’s field of view, and that this is currently being measured for the first
    time by wide-field cameras such as DECam and HSC. Should it turn out that the atmospheric
    transmission varies on spatio-temporal scales relevant to the survey, we propose to make
    further per-visit corrections by measuring the variation in flux as a function of color/spectral
    classification for all Gaia sources across the field of view. However, should this not be nec-
    essary, measuring this variation anyway will allow the spatial structure of the atmospheric
    transmission to be constrained, providing a convenient quality-assurance null-test to validate
    this choice.
    4.8 Prototype Implementation
While parts of the Calibration Products Pipeline have been prototyped by the LSST Calibration
Group (see LSE-180 for discussion), these have not been written using the LSST Data
Management software framework or coding standards. We therefore expect to transfer the
    know-how, and rewrite the implementation.
    5 Data Release Production
    A Data Release Production is run every year (twice in the first year of operations) to produce
    a set of catalog and image data products derived from all observations from the beginning of
    the survey to the point the production began. This includes running a variant of the difference
    image analysis run in Alert Production, in addition to direct analysis of individual exposures
    and coadded images. The data products produced by a Data Release Production are summa-
rized in Table 3.
Name              Availability  Description
Source            Stored        Measurements from direct analysis of individual exposures.
DIASource         Stored        Measurements from difference image analysis of individual
                                exposures.
Object            Stored        Measurements for a single astrophysical object, derived from
                                all available information, including coadd measurements,
                                simultaneous multi-epoch fitting, and forced photometry.
                                Does not include solar system objects.
DIAObject         Stored        Aggregate quantities computed by associating spatially
                                colocated DIASources.
ForcedSource      Stored        Flux measurements on each direct and difference image at the
                                position of every Object.
SSObject          Stored        Solar system objects derived by associating DIASources and
                                inferring their orbits.
CalExp            Regenerated   Calibrated exposure images for each CCD/visit (sum of two
                                snaps).
DiffExp           Regenerated   Difference between CalExp and PSF-matched template coadd.
DeepCoadd         Stored        Coadd image with a reasonable combination of depth and
                                resolution.
ShortPeriodCoadd  Regenerated   Coadd images that cover only a limited range of epochs.
BestSeeingCoadd   Stored        Coadd image built from only the best-seeing images.
PSFMatchedCoadd   Regenerated   Coadd image with a constant, predetermined PSF.
TemplateCoadd     Stored        Coadd image used for difference imaging.
Table 3: Table of public data products produced during a Data Release Production. A full
description of these data products can be found in the Data Products Definition Document
[LSE-163].
    From a conceptual standpoint, data release production can be split into six groups of pipelines,
    executed in approximately the following order:
    1. We characterize and calibrate each exposure, estimating point-spread functions, back-
    ground models, and astrometric and photometric calibration solutions. This iterates
    between processing individual exposures independently and jointly fitting catalogs de-
rived from multiple overlapping exposures. These steps are described more fully in section 5.1.
Figure 8: Summary of the Data Release Production image processing flow. Processing is split
into multiple pipelines, which are conceptually organized into the groups discussed in sections
5.1–5.5. A final pipeline group, discussed in section 5.6, simply operates on the catalogs and
is not shown here.
2. We alternately combine images and subtract them, using differences to find artifacts and
time-variable sources while building coadds that produce a deeper view of the static sky.
Coaddition and image differencing is described in section 5.2.
3. We process coadds to generate preliminary object catalogs, including detection, deblending,
and the first phase of measurement. This is discussed in section 5.3.
4. We resolve overlap regions in our tiling of the sky, in which the same objects have been
detected and processed multiple times. This is described in section 5.4.
5. We perform more precise measurements of objects by fitting models to visit-level images,
either simultaneously or individually, as discussed in section 5.5.
6. After all image processing is complete, we run additional catalog-only pipelines to fill in
additional object properties. Unlike previous stages, this postprocessing is not localized
on the sky, as it may use statistics computed from the full data release to improve our
characterization of individual objects. This stage is not shown in Figure 8, but
postprocessing pipelines are described in section 5.6.
    This conceptual ordering is an oversimplification of the actual processing flow, however; as
shown in Figures 8 and 9, the first two groups are interleaved.
Each pipeline in the diagram represents a particular piece of code executed in parallel on a
    specific unit of data, but pipelines may contain additional (and more complex) parallelization
    to further subdivide that data unit. The processing flow also includes the possibility of itera-
    tion between pipelines, indicated by cycles in the diagram. The number of iterations in each
    cycle will be determined (via tests on smaller productions) before the start of the production,
    allowing us to remove these cycles simply by duplicating some pipelines a fixed number of
    times. Decisions on the number of iterations must be backed by QA metrics. The final data
    release production processing can thus be described as a directed acyclic graph (DAG) to be
    executed by the orchestration middleware, with pipelines and (intermediate) data products
    as vertices. Most of the graph will be generated by applications code before the production
    begins, using a format and/or API defined by the orchestration middleware. However, some
parts of the graph must be generated on-the-fly; this will be discussed further in section 5.5.1.
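As a toy illustration of this unrolling (not the orchestration middleware’s actual format or
API, which remains to be defined), a cycle can be removed by duplicating its pipelines a
predetermined number of times, yielding a linear schedule from which the DAG edges follow:

    def unroll(pipelines, cycles):
        """`pipelines` lists pipeline names in conceptual order; `cycles` maps
        (start, end) index pairs to a pre-determined iteration count. Each
        cycle's pipelines are duplicated, with an iteration tag appended."""
        schedule, i = [], 0
        while i < len(pipelines):
            for (start, end), n_iter in cycles.items():
                if i == start:
                    for it in range(n_iter):
                        schedule.extend(f"{p}#{it}" for p in pipelines[start:end + 1])
                    i = end + 1
                    break
            else:
                schedule.append(pipelines[i])
                i += 1
        return schedule

For example, unroll(["ISR", "ImChar", "JointCal", "Coadd"], {(1, 2): 2}) duplicates the
ImChar/JointCal cycle twice before the coaddition stage.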
Figure 9: Data flow diagram for the Data Release Production image coaddition and image
differencing pipelines. Processing proceeds roughly counterclockwise, starting from the upper
right with pipelines described in Section 5.1. Each update to a component of the central
CalExp dataset can in theory trigger another iteration of a previous loop, but in practice we
will “unroll” these loops before production begins, yielding an acyclic graph with a series
of incrementally updated CalExp datasets. The nature of this unrolling and the number of
iterations will be determined by future algorithmic research. Numbered steps above are
described more fully in the text.
    5.1 Image Characterization and Calibration
    The first steps in a Data Release Production characterize the properties of individual expo-
    sures, by iterating between pixel-level processing of individual visits (“ImChar”, or “Image Char-
    acterization” steps) and joint fitting of all catalogs overlapping a tract (“JointCal”, or “Joint Cal-
    ibration” steps). All ImChar steps involve fitting the PSF model and measuring Sources (grad-
ually improving these as we iterate), while JointCal steps fit for new astrometric (WCS¹²) and
    photometric solutions while building new reference catalogs for the ImChar steps. Iteration
    is necessary for a few reasons:
    • The PSF and WCS must have a consistent definition of object centroids. Celestial posi-
    tions from a reference catalog are transformed via the WCS to set the positions of stars
    used to build the PSF model, but the PSF model is then used to measure debiased cen-
    troids that feed the WCS fitting.
    • The later stages of photometric calibration and PSF modeling require secure star selec-
    tion and colors to infer their SEDs. Magnitude and morphological measurements from
    ImChar stages that supersede those in the reference catalogs are aggregated and used
    to update it in the subsequent JointCal stage, allowing these colors and classifications to
    be used for PSF modeling in the following ImChar stage.
    The ImChar and JointCal iteration is itself interleaved with background matching and differ-
ence imaging, as described in section 5.2. This allows better backgrounds and masks to
    be defined by comparisons between images before the final Source measurements, image
    characterizations, and calibrations.
    Each ImChar pipeline runs on a single visit, and each JointCal pipeline runs simultaneously on
    all visits within a single tract, allowing tracts to be run entirely independently. Some visits may
overlap multiple tracts, however, and will hence be processed multiple times.
    The final output data products of the ImChar/JointCal iteration are the Source table and the
CalExp (calibrated exposure) images. CalExp is an Exposure, and hence has multiple components
that we will track separately.
¹² This is not limited to FITS standard transformations; see Section 7.11.
    5.1.1 BootstrapImChar
    The BootstrapImChar pipeline is the first thing run on each science exposure in a data release.
    It has the difficult task of bootstrapping multiple quantities (PSF, WCS, background model,
    etc.) that each normally require all of the others to be specified when one is fit. As a result,
    while the algorithmic components to be run in this pipeline are generally clear, their ordering
    and specific requirements are not; algorithms that are run early will have a harder task than
    algorithms that are run later, and some iteration will almost certainly be necessary.
    A plausible (but by no means certain) high-level algorithm for this pipeline is given below in
    pseudocode. Highlighted terms are described in more detail below the pseudocode block.
    def BootstrapImChar(raw, reference, calibrations):
        # Some data product components are visit-wide and some are per-CCD;
        # these imaginary data types let us deal with both.
        # VisitExposure also has components; most are self-explanatory, and
        # {mi} == {image, mask, variance} (for "MaskedImage").
        calexp = VisitExposure()
        sources = VisitCatalog()
        snaps = VisitMaskedImageList()  # holds both snaps, but only {image, mask, variance}
        parallel for ccd in ALL_SENSORS:
            snaps[ccd] = [RunISR(raw[ccd][snap]) for snap in SNAP_NUMBERS]
            snaps[ccd].mask = SubtractSnaps(snaps[ccd])
            calexp[ccd].mi = CombineSnaps(snaps[ccd])
        calexp.psf = FitWavefront(calexp[WAVEFRONT_SENSORS].mi)
        calexp.{image,mask,variance,background} = SubtractBackground(calexp.mi)
        parallel for ccd in ALL_SENSORS:
            sources[ccd] = DetectSources(calexp.{mi,psf})
            sources[ccd] = DeblendSources(sources[ccd], calexp.{mi,psf})
            sources[ccd] = MeasureSources(sources[ccd], calexp.{mi,psf})
        matches = MatchSemiBlind(sources, reference)
        while not converged:
            SelectStars(matches, exposures)
            calexp.wcs = FitWCS(matches, sources, reference)
            calexp.psf = FitPSF(matches, sources, calexp.{mi,wcs})
            WriteDiagnostics(snaps, calexp, sources)
            parallel for ccd in ALL_SENSORS:
                snaps[ccd] = SubtractSnaps(snaps[ccd], calexp[ccd].psf)
                calexp[ccd].mi = CombineSnaps(snaps[ccd])
                calexp[ccd].mi = SubtractStars(calexp[ccd].{mi,psf}, sources[ccd])
            calexp.{mi,background} = SubtractBackground(calexp.mi)
            parallel for ccd in ALL_SENSORS:
                sources[ccd] = DetectSources(calexp.{mi,psf})
                calexp[ccd].mi, sources[ccd] =
                    ReinsertStars(calexp[ccd].{mi,psf}, sources[ccd])
                sources[ccd] = DeblendSources(sources[ccd], calexp.{mi,psf})
                sources[ccd] = MeasureSources(sources[ccd], calexp.{mi,psf})
        matches = MatchNonBlind(sources, reference)
        calexp.psf.apcorr = FitApCorr(matches, sources)
        parallel for ccd in SCIENCE_SENSORS:
            sources[ccd] = ApplyApCorr(sources[ccd], calexp.psf)
        return calexp, sources
    Much of this pipeline is an iteration that incrementally improves detection depth while im-
    proving the PSF model. This loop is probably only necessary in crowded fields, where it will
    be necessary to subtract brighter stars in order to detect fainter ones; we expect most high-
    latitude visits to require only a single iteration. The details of the convergence criteria and
    changes in behavior between iterations will be determined by future algorithm research. It
    is also likely that some of the steps within the loop may be moved out of the loop entirely, if
    they depend only weakly on quantities that change between iterations.
Input Data Product: Raw
Raw amplifier images from science and wavefront CCDs, spread across one or more snaps.
Needed telescope telemetry (seeing estimate, approximate pointing) is assumed to be included
in the raw image metadata.
Input Data Product: Reference
A full-sky catalog of reference stars derived from both external (e.g. Gaia) and LSST data.
The StandardJointCal pipeline will later define a deeper reference catalog derived from this
one and the new data being processed, but the origin and depth of the initial reference
catalog is largely TBD. It will almost certainly include Gaia stars, but it may also include
data from other telescopes, LSST special programs, LSST commissioning observations, and/or the
last LSST data release. Decisions will require some combination of negotiation with the LSST
commissioning team, specification of the special programs, experiments on our ability to
accurately type faint stars using the Gaia catalog, and policy decisions from DM leadership on
the degree to which data releases are required to be independent. Depending on the choices
selected, it could also require a major separate processing effort using modified versions of
the data release production pipelines.
Input Data Product: Calibrations
Calibration frames and metadata from the Calibration Products Pipeline. This may include any
of the data products listed in Section 4.3, though some will probably not be used until later
stages of the production.
Output Data Product: Source
A preliminary version of the Source table. This could contain all of the columns in the DPDD
Source schema if MeasureSources is appropriately configured, but some of these columns are
likely unnecessary in its role as an intermediate data product that feeds StandardJointCal,
and it is likely that other non-DPDD columns will be present for that role.
BootstrapImChar also has the capability to produce even earlier versions of the Source table
for diagnostic purposes (see WriteDiagnostics). These tables are not associated with any
photometric calibration or aperture correction, and some may not have any measurements
besides centroids, and hence are never substitutable for the final Source table.
Output Data Product: CalExp
A preliminary version of the CalExp (calibrated direct exposure). CalExp is an Exposure
object, and hence it has several components; BootstrapImChar creates the first versions of all
of these components (though some, such as the VisitInfo, are merely copied from the raw
images). Some CalExp components are determined at the scale of a full FoV and hence should
probably be persisted at the visit level (PSF, WCS, PhotoCalib, Background), while others are
straightforward CCD-level data products (Image, Mask, Uncertainty).
RunISR
Delegate to the ISR algorithmic component to perform standard detrending as well as
brighter-fatter correction and interpolation for pixel-area variations. It is possible that
these corrections will require a PSF model, and hence must be backed out and recorrected at
a later stage when an improved PSF model is available.
    We assume that the applied flat field is appropriate for background estimation.
SubtractSnaps
Delegate to the Snap Subtraction algorithmic component to mask artifacts in the difference
between snaps. If passed a PSF (as in the iterative stage of BootstrapImChar),
also interpolate them by delegating to the Artifact Interpolation algorithmic component.
    We assume here that the PSF modeled on the combination of the two Snaps is sufficient for in-
    terpolation on the Snaps individually; if this is not true, we can just mask and interpolate both
    Snaps when an artifact appears on either of them (or we could do per-Snap PSF estimation,
    but that’s a lot more work for very little gain).
CombineSnaps
Delegate to the Image Coaddition algorithmic component to combine the two Snaps while
handling masks appropriately.
    We assume there is no warping involved in combining snaps. If this is needed, we should
    instead consider treating each snap as a completely separate visit.
FitWavefront
Delegate to the Wavefront Sensor PSF algorithmic component to generate an approximate PSF
using only data from the wavefront sensors and observational metadata
    (e.g. reported seeing). Note that we expect this algorithmic component to be contributed by
    LSST Systems Engineering, not Data Management. We start with a PSF estimated from the
    wavefront sensors only because these should be able to use bright stars that are saturated in
    the science exposures, mitigating the effect of crowding; in high-latitude fields this step may
    be unnecessary.
    The required quality of this PSF estimate is TBD; setting preliminary requirements will involve
    running a version of BootstrapImChar with at least mature detection and PSF-modeling al-
gorithms on precursor data taken in crowded fields, and final requirements will require
processing full LSST camera data in crowded fields. However, robustness to poor data quality
and crowding is much more important than accuracy; this stage need only provide a good
enough result for subsequent stages to proceed.
SubtractBackground
Delegate to the Single Visit Background Estimation algorithmic component to model and
subtract the background consistently over the full field of view.
    The multiple backgrounds subtracted in BootstrapImChar may or may not be cumulative (i.e.
    we may or may not add the previous background back in before estimating the latest one).
DetectSources
Delegate to the Source Detection algorithmic component to find above-threshold regions
(Footprints) and peaks within them in a PSF-correlated version of the image. We may first
detect on the original image (i.e. without PSF correlation) at a higher threshold to improve
peak identification for bright blended objects.
    In crowded fields, each iteration of detection will decrease the threshold, increasing the num-
    ber of objects detected. Because this will treat fluctuations in the background due to unde-
    tected objects as noise, we may need to extend PSF-correlation to the appropriate filter for
    an image with correlated noise and characterize the noise field from the image itself.
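For concreteness, a minimal matched-filter detection sketch follows (illustrative only; the
production Source Detection component is considerably more sophisticated). Footprints would
be the connected components of the returned mask, with peaks as local maxima:

    import numpy as np
    from scipy.signal import fftconvolve

    def detect_mask(image, variance, psf, nsigma=5.0):
        """Correlate the image with its PSF and threshold in units of the
        correlated image's per-pixel standard deviation; the variance of the
        correlated image is the variance convolved with the squared kernel."""
        kernel = psf / psf.sum()
        filtered = fftconvolve(image, kernel[::-1, ::-1], mode="same")
        filt_var = fftconvolve(variance, (kernel ** 2)[::-1, ::-1], mode="same")
        return filtered > nsigma * np.sqrt(np.maximum(filt_var, 0.0))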
DeblendSources
Delegate to the Single Frame Deblending algorithmic component to split Footprints with
multiple peaks into deblend families, and generate HeavyFootprints that split each pixel’s
values amongst the objects that contribute to it.
MeasureSources
Delegate to the Single Frame Measurement algorithmic component to measure source properties.
In BootstrapImChar, we anticipate using the Neighbor Noise Replacement approach to
deblending, with the following plugin algorithms:
    • Centroids
    • Second-Moment Shapes
    • Pixel Flag Aggregation
    • Aperture Photometry
    • Static Point Source Model Photometry
    These measurements will not be included in the final Source catalog, so they need only include
    algorithms necessary to feed later steps (and we may not measure the full suite of apertures).
MatchSemiBlind
Delegate to the Single Visit Reference Matching algorithmic component
    to match source catalogs to a global reference catalog. This occurs over the full field of view,
    ensuring robust matching even when some CCDs have no matchable stars due to crowding,
    flux limits, or artifacts.
    “Semi-Blind” refers to the fact that the WCS is not yet well known (all we have is what is pro-
    vided by the observatory), so the matching algorithm must account for an unknown (but small)
offset between the WCS-predicted source positions and the reference catalog positions.
    SelectStars
Use reference catalog classifications and source flags to select a clean sample of stars to
use for later stages.
    If we decide not to rely on a pre-existing reference catalog to separate stars from galaxies
    and other objects, we will need a new algorithmic component to select stars based on source
    measurements.
FitWCS
Delegate to the Single Visit Astrometric Fit algorithmic component to determine the WCS of
the image.
    We assume this works by fitting a simple mapping from the visit’s focal plane coordinate sys-
    tem to the sky and composing it with the (presumed fixed) mapping between CCD coordinates
    and focal plane coordinates. This fit will be improved in later pipelines, so it does not need to
be exact; ∼0.05 arcsecond accuracy should be sufficient.
    As we iterate in crowded fields, the number of degrees of freedom in the WCS should be
    allowed to slowly increase.
FitPSF
Delegate to the Full Visit PSF Modeling algorithmic component to construct an improved PSF
model for the image.
    Because we are relying on a reference catalog to select stars, we should be able to use colors
    from the reference catalog to estimate SEDs and include wavelength dependence in the fit.
    If we do not use a reference catalog early in BootstrapImChar, PSF estimation here will not
    be wavelength-dependent. In either case the PSF model will be further improved in later
    pipelines.
    PSF estimation at this stage must include some effort to model the wings of bright stars, even
    if this is tracked and constrained separately from the model for the core of the PSF. This
    aspect of PSF modeling is considerably less developed, and may require significant algorithmic
    research.
    As we iterate in crowded fields, the number of degrees of freedom in the PSF model should
    be allowed to slowly increase.
    WriteDiagnostics
If desired, the current state of the source, calexp, and snaps variables may be persisted here for diagnostic purposes.
    SubtractStars
    Subtract all detected stars above a flux limit from the image, using the PSF
model (including the wings). In crowded fields, this should allow subsequent SubtractBackground and DetectSources steps to push fainter by removing the brightest stars in the image.
    Sources classified as extended are never subtracted.
    ReinsertStars
Add stars removed in SubtractStars back into the image, and merge corresponding Footprints and peaks into the source catalog. Information about the nature of these
    detections will be propagated through the peaks.
    MatchNonBlind
    Match a single-CCD source catalog to a global reference frame, probably
by delegating to the same matching algorithm used in the JointCal pipelines. A separate algorithmic
    component may be needed for efficiency or code maintenance reasons; this is a simple lim-
    iting case of the multi-way JointCal matching problem that may or may not merit a separate
    simpler implementation.
    “Non-Blind” refers to the fact that the WCS is now known well enough that there is no signifi-
    cant offset between WCS-projected source positions and reference catalog positions.
    FitApCorr
Delegate to the Aperture Correction algorithmic component to construct a curve
    of growth from aperture photometry measurements and build an interpolated mapping from
    other fluxes (essentially all flux measurements aside from the suite of fixed apertures) to the
    predicted integrated flux at infinity.
    Additional research may be required to determine the best aperture corrections to apply to
    galaxy fluxes. Our baseline approach is to apply the same correction to galaxies that we apply
    to stars, which is correct for small galaxies and defines a consistent photometric system. This
    is formally incorrect for large galaxies, but there is (to our knowledge) no formally correct
    approach.
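A deliberately simplified, non-spatial sketch of this procedure is shown below. It treats the largest aperture as a proxy for the flux at infinity and derives a single scalar correction for a model flux; the real component would extrapolate the curve of growth into the PSF wings and interpolate the correction across the focal plane:

    import numpy as np

    def fit_aperture_correction(ap_fluxes, psf_fluxes):
        # ap_fluxes: (nstars, nradii) aperture photometry for a clean star
        # sample, ordered by increasing radius; psf_fluxes: (nstars,) model
        # fluxes for the same stars.
        growth = np.median(ap_fluxes / ap_fluxes[:, -1:], axis=0)   # curve of growth
        apcorr = float(np.median(ap_fluxes[:, -1] / psf_fluxes))    # model -> "total" flux
        return growth, apcorr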
    ApplyApCorr
Delegate to the Aperture Correction algorithmic component to apply aperture
    corrections to flux measurements.
    5.1.2 StandardJointCal
In StandardJointCal, we jointly process all of the Source tables produced by running BootstrapImChar on each visit in a tract. There are four steps:
1. We match all sources and the reference catalog, by delegating to JointCalMatching. This is a non-blind search; we assume the WCSs output by BootstrapImChar are good enough
    that we don’t need to fit for any additional offsets between images at this stage. Some
    matches will not include a reference object, as the sources will almost certainly extend
    deeper than the reference catalog.
2. We classify matches to select a clean sample of stars for later steps, delegating to JointCalClassification. The samples for photometric and astrometric calibration may be dif-
    ferent (for instance, we may require low variability only in the photometric fit and no
    proper motion only in the astrometric fit). This uses morphological and possibly color
    information from source measurements as well as reference catalog information (where
    available). This step also assigns an inferred SED to each match from its colors; whether
    this supersedes SEDs or colors in the reference catalog depends on our approach to
    absolute calibration.
3. We fit simultaneously for an improved astrometric solution by requiring each star in a match to have the same position, delegating to the Joint Astrometric Fit algorithmic
    component. This will need to correct (perhaps approximately) for centroid shifts due to
    DCR, proper motion, and parallax; if it does not, it must be robust against these shifts
    (perhaps via outlier rejection). This requires that StandardJointCal have access to the
VisitInfo component of each CalExp, in order to calculate DCR. The models and parame-
    ters to fit must be determined by experimentation on real data (as they depend on the
    number of degrees of freedom in the as-built system on different timescales), and hence
    the algorithm must be flexible enough to fit a wide variety of models. This fit updates
    the WCS component for each CalExp.
4. We fit simultaneously for a per-visit zeropoint and a smooth atmospheric transmission correction by requiring each star in a match to have the same flux after applying the per-epoch smoothed monochromatic flat fields produced by the calibration products pipeline, delegating to the Joint Photometric Fit algorithmic component. This fit should also have
    the ability to fit per-CCD photometric zeropoints for diagnostic purposes. There is a
    small chance this fit will also be used to further constrain those monochromatic flat
    fields. This fit updates the PhotoCalib component for each CalExp.
    In addition to updating the CalExp, WCS, and PhotoCalib, StandardJointCal generates a new
    Reference dataset containing the joint-fit centroids and fluxes for each of its match groups as
    well as their classifications and inferred SEDs. The sources included in the reference catalog
    will be a securely-classified bright subset of the full source catalog.
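As a concrete illustration of the simplest form of step 4, the sketch below solves simultaneously for per-visit zeropoints and per-star magnitudes in the least-squares sense, given matched observations; the atmospheric transmission model, outlier rejection, and per-CCD terms are all omitted:

    import numpy as np

    def fit_zeropoints(star_idx, visit_idx, inst_mags, n_stars, n_visits):
        # Model: inst_mags[k] = mag[star_idx[k]] + zp[visit_idx[k]].
        # One zeropoint is pinned to zero to break the global degeneracy.
        nobs = len(inst_mags)
        A = np.zeros((nobs + 1, n_stars + n_visits))
        A[np.arange(nobs), star_idx] = 1.0
        A[np.arange(nobs), n_stars + np.asarray(visit_idx)] = 1.0
        A[nobs, n_stars] = 1.0                         # constraint row: zp[0] = 0
        b = np.concatenate([inst_mags, [0.0]])
        x, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
        return x[:n_stars], x[n_stars:]                # star magnitudes, visit zeropoints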
StandardJointCal may be iterated with RefineImChar to ensure the PSF and WCS converge on the same centroid definitions. StandardJointCal is always run immediately after BootstrapImChar, but either RefineImChar or StandardJointCal may be the last step in the iteration run before proceeding with WarpAndPsfMatch.
    If the Gaia catalog cannot be used to tie together the photometric calibration between differ-
ent tracts, a larger-scale multi-tract photometric fit must also be run (see Global Photometric Calibration), which would upgrade this step from a tract-level procedure to a larger sequence point. It is unlikely this sequence point would extend to the full survey. It would only be run once, but may happen in either StandardJointCal or FinalJointCal. If the Gaia catalog is sufficient for large-scale photometric calibration, Global Photometric Fitting may instead be run after the data release production is complete, as a form of QA.
    Before LSST’s atmospheric monitoring telescope, the Gaia catalog, and the suite of monochro-
    matic flats are available, photometric calibration will be considerably more difficult, and hence
    pipeline commissioning (on both precursor data and some LSST commissioning data) will re-
quire a more sophisticated global fit (see Interim Wavelength-Dependent Photometric Fitting)
    that uses multiple observations of stars to infer their SEDs and the wavelength-dependent
    transmission of the system as well as their magnitudes and the spatial dependence of the
    transmission.
    5.1.3 RefineImChar
RefineImChar performs an incremental improvement on the PSF model produced by BootstrapImChar, then uses this to produce improved source measurements, assuming the improved reference catalog, WCS, and PhotoCalib produced by StandardJointCal. Its steps are thus a strict subset of those in BootstrapImChar. A pseudocode description of RefineImChar is given below, but all steps refer back to the descriptions in 5.1.1:
    def RefineImChar(calexp, sources, reference):
        matches = MatchNonBlind(sources, reference)
        SelectStars(matches, exposures)
        calexp.psf = FitPSF(matches, sources, calexp.{mi, wcs})
        parallel for ccd in SCIENCE_SENSORS:
            calexp[ccd].mi = SubtractStars(calexp[ccd].{mi, psf}, sources[ccd])
        calexp.{mi, background} = SubtractBackground(calexp.mi)
        parallel for ccd in SCIENCE_SENSORS:
            sources[ccd] = DetectSources(calexp.{mi, psf})
            calexp[ccd].mi, sources[ccd] = ReinsertStars(calexp[ccd].{mi, psf}, sources[ccd])
            sources[ccd] = DeblendSources(sources[ccd], calexp.{mi, psf})
            sources[ccd] = MeasureSources(sources[ccd], calexp.{mi, psf})
        calexp.psf.apcorr = FitApCorr(matches, sources)
        parallel for ccd in SCIENCE_SENSORS:
            sources[ccd] = ApplyApCorr(sources[ccd], calexp.psf)
        return calexp, sources
This is essentially just another iteration of the loop in BootstrapImChar, without the WCS-
    fitting or artifact-handling stages. Previously-extracted wavefront information may again be
    used in PSF modeling, but we do not expect to do any additional processing of the wavefront
    sensors in this pipeline.
Note that RefineImChar does not update the CalExp's WCS, PhotoCalib, or Uncertainty; the WCS and PhotoCalib will have already been better constrained in StandardJointCal, and no changes have been made to the pixels. The Image is only updated to reflect the new back-
    ground, and the Mask is only updated to indicate new detections.
    5.1.4 FinalImChar
    FinalImChar is responsible for producing the final PSF models and source measurements.
While similar to RefineImChar, it is run after at least one iteration of the BackgroundMatchAndReject and possibly UpdateMasks pipelines, which provide it with the final background model and mask.
The steps in FinalImChar are identical to those in RefineImChar, with just a few exceptions:
• The background is not re-estimated and subtracted.
• The suite of plugins run by Single Frame Measurement is expanded to include all algorithms indicated in the first column of Figure 12. This should provide all measurements in the DPDD Source table description.
• We also classify sources by delegating to Single Frame Classification, to fill the final Source table's extendedness field. It is possible this will also be run during RefineImChar and BootstrapImChar for diagnostic purposes.
    5.1.5 FinalJointCal
FinalJointCal is almost identical to StandardJointCal, and the details of the differences will depend on the approach to absolute calibration and the as-built performance of the surrounding pipelines. Because it is responsible for the final photometric calibration, it may need to perform some steps that could be omitted from StandardJointCal because they have no impact on the ImChar pipelines. This could include a role in determining the absolute photometric calibration of the survey, especially if Gaia is relied upon exclusively to tie different tracts together.
    There is no need for FinalJointCal to produce a new or updated Reference dataset (except for
    its own internal use), as subsequent steps do not need one, and the DRP-generated reference
    catalog used by Alert Production will be derived from the Object table. It will produce an
    updated WCS and PhotoCalib for each CalExp, with the PhotoCalib possibly now reflecting
    absolute as well as relative calibration.
As discussed in Section 5.1.2, this pipeline may require a multi-tract sequence point.
    5.2 Image Coaddition and Image Differencing
    The next group of pipelines in a Data Release Production consists of image coaddition and
    image differencing, which we use to separate the static sky from the dynamic sky in terms
    of both astrophysical quantities and observational quantities. This group also includes an
    iteration between pipelines that combine images and pipelines that subtract the combined
    images from each exposure. At each differencing step, we better characterize the features
    that are unique to a single epoch (whether artifacts, background features, or astrophysical
    sources); we use these characterizations to ensure the next round of coadds include only
    features that are common to all epochs. Variable objects will be particularly challenging in
    this context, as our models of their effective coadded PSFs will be incorrect unless variability
    is included in those models.
    The processing flow in this pipeline group again centers around incremental updates to the
CalExp dataset, which are limited here to its Background and Mask components (the Image component is also updated, but only to subtract the updated background). It will also return to the previous pipeline group described in Section 5.1 to update other CalExp components.
    As in the previous pipeline group, tracts are processed independently, and since some visits
    overlap multiple tracts, multiple CalExps (one for each tract) will be produced for the CCDs in
these visits. The data flow between pipelines is shown in Figure 9, with the numbered steps described further below:
1. The first version of the CalExp dataset is produced by running the BootstrapImChar, StandardJointCal, and RefineImChar pipelines, as described in Section 5.1.
2. We generate an updated Background and Mask via the BackgroundMatchAndReject
    pipeline. This produces the final CalExp Background and Image, and possibly the final
    Mask.
3. If the CalExp Mask has been finalized, we run the FinalImChar and FinalJointCal pipelines.
    These produce the final PSF, WCS, and PhotoCal. If the Mask has not been finalized, we
    execute at least one iteration of the next step before this one.
4. We run the WarpTemplates, CoaddTemplates, and DiffIm pipelines to generate the DIASource and DiffExp datasets. We may then be able to generate better CalExp Masks
than we can obtain from BackgroundMatchAndReject by comparing the DiffExp masks across visits in the UpdateMasks pipeline.
5. After all CalExp components have been finalized, we run the WarpRemaining and CoaddRemaining pipelines to build additional coadd data products.
    The baseline ordering of these steps is thus {1,2,3,4,5}, but {1,2,4,3,4,5} is perhaps just as likely,
    and we may ultimately require an ordering that repeats steps 2 or 3. Final decisions on the
ordering and number of iterations will require testing with mature pipelines and a deep dataset
    taken with a realistic cadence; it is possible the configuration could even change between data
    releases as the survey increases in depth. Fortunately, this reconfiguring should not require
    significant new algorithm development.
    This pipeline group is responsible for producing the following final data products:
    CalExp
    See above.
    DiffExp
A CCD-level Exposure that is the difference between the CalExp and a template coadd, in the coordinate system of the CalExp. It may have the same PSF as the CalExp (if traditional PSF matching is used) or its own PSF model (if the difference image is decorrelated¹³ after matching).
    DIASource
A SourceCatalog containing sources detected and measured on the DiffExp images.
    ConstantPSFCoadd
A coadd data product (Exposure or a subclass thereof) with a constant, predefined PSF.
    DeepCoadd
    A coadd data product built to emphasize depth at the possible expense of see-
    ing.
    BestSeeingCoadd
    A coadd data product built to emphasize image quality at the possible ex-
    pense of depth. Depending on the algorithm used, this may be the same as DeepCoadd.
    ShortPeriodCoadd
    A coadd data product built from exposures in a short range of epochs,
    such as a year, rather than the full survey. Aside from the cut on epoch range, this would
    use the same filter as DeepCoadd.
¹³Decorrelated images refer here to a technique for convolving images by the transpose of the PSF, summing or differencing them, and then deconvolving the transpose of the effective PSF of the resulting image. See DMTN-015 for more information.
    LikelihoodCoadd
    A coadd formed by correlating each image with its own PSF before com-
    bining them, used for detection and possibly building other coadds.
    ShortPeriodLikelihoodCoadd
    Short-period likelihood coadds will also be built.
    TemplateCoadd
    A coadd data product used for difference imaging in both DRP and AP. In
    order to produce templates appropriate for the level of DCR in a given science image,
    these coadds may require a third dimension in addition to the usual two image dimen-
    sions (likely either wavelength or a quantity that is a function of airmass).
    The nature of these coadd data products depends critically on whether we are able to develop
    efficient algorithms for optimal coaddition, and whether these coadds are suitable for differ-
    ence imaging. These algorithms are mathematically well-defined but computationally difficult;
see DMTN-015 for more information. We will refer to the coadds produced by these algo-
    rithms as “decorrelated coadds”; a variant with constant PSF (“constant-PSF partially decor-
    related coadd”) is also possible. This choice is also mixed with the question of how we will
    correct for differential chromatic refraction in difference imaging; some algorithms for DCR
    correction involve templates that are the result of inference on input exposures rather than
    coaddition. The alternative strategies for using decorrelated coadds yield five main scenarios:
    A
    We use decorrelated coadds for all final coadd products. DeepCoadd and ShortPeriod-
    Coadd will be standard decorrelated coadds with a spatially-varying PSF, and ConstantPS-
    FCoadd and TemplateCoadd will be constant-PSF partially-decorrelated coadds. The
    BestSeeingCoadd data product will be dropped, as it will be redundant with DeepCoadd.
    This will make coadds more expensive and complex to build, and require more algorithm
    development for coaddition, but will improve coadd-based measurements and make it
    easier to warm-start multi-epoch measurements. Difference imaging may be easier, and
more visits may be usable as inputs to templates due to a softened or eliminated seeing cut.
    B
    We use decorrelated coadds for all coadds but TemplateCoadd. Measurement is still im-
    proved, and the additional computational cost of coaddition is limited to a single pipeline
    that is not run iteratively. Difference imaging may be harder, and the number of visits
    eligible for inclusion in templates may be reduced. In this scenario, we still have two
    options for building templates:
    B1
    Templates will be built as PSF-matched coadds, or a product of PSF-matched coadds.
    B2
    Templates are the result of inference on resampled exposures with no PSF-matching.
    C
    We do not use decorrelated coadds at all. DeepCoadd, BestSeeingCoadd, and ShortPe-
    riodCoadd will be direct coadds, and ConstantPSFCoadd will be a PSF-matched coadd.
    Coaddition will be simpler and faster, but downstream algorithms may require more
    sophistication, coadd measurements may be lower quality, and multi-epoch measure-
    ments may be more difficult to optimize. Here we again have the same two options for
templates as option B:
    C1
    Templates will be built as PSF-matched coadds, or a product of PSF-matched coadds.
    C2
    Templates are the result of inference on resampled exposures with no PSF-matching.
    It is also possible to combine multiple scenarios across different bands. In particular, we
    may not need special templates to handle DCR in most bands, so we may select a simpler
    approach in those bands. The final selection between these options will require experiments
    on LSST data or precursor data with similar DCR and seeing, though decorrelated coaddition
    algorithms and some approaches to DCR correction may be ruled out earlier if preliminary
    algorithm development does not go well.
    Further differences in the pipelines themselves due to the presence or absence of decorre-
    lated coadds will be described in the sections below.
    5.2.1 WarpAndPsfMatch
    This pipeline resamples and then PSF-matches CalExp images from a visit into a single patch-
    level image with a constant PSF. The resampling and PSF-matching can probably be accom-
plished separately by delegating to the Image Warping and PSF Homogenization algorithmic
    components, respectively. These operations can also be performed in the opposite order if
    the matched-to PSF is first transformed to the CalExp coordinate systems (so subsequent re-
    sampling yields a constant PSF in the coadd coordinate system). Doing PSF-matching first may
    be necessary (or at least easier to implement) for undersampled images.
    It is possible these operations will be performed simultaneously by a new algorithmic compo-
    nent; this could potentially yield improved computational performance and make it easier to
    properly track uncertainty. These improvements are unlikely to be necessary for this pipeline,
    because these images and the coadds we build from them will only be used to estimate back-
    grounds and find artifacts, and these operations only require approximate handling of uncer-
    tainty. However, other coaddition pipelines may require building an algorithmic component
    capable of warping and PSF-matching simultaneously, and if that happens, we would proba-
    bly use it here as well. Simultaneously warping and PSF matching could also yield important
    computational performance improvements.
The only output of the WarpAndPsfMatch pipeline is the intermediate MatchedWarp Exposure data product. It contains all of the usual Exposure components, which must be propagated
    through the image operations as well. There is a separate MatchedWarp for each {patch,
    visit} combination, and these can be produced by running WarpAndPsfMatch independently
    on each such combination. However, individual CCD-level CalExps will be required by multi-
    ple patches, so I/O use or data transfer may be improved by running all WarpAndPsfMatch
    instances for a given visit together.
    5.2.2 BackgroundMatchAndReject
    This pipeline is responsible for generating our final estimates of the sky background and up-
dating our artifact masks. It is one of the most algorithmically uncertain parts of Data
    Release Production from the standpoint of large-scale data flow and parallelization, and a
    working prototype has not yet been demonstrated except for SDSS data, for which the drift-
    scan observing strategy makes the problem easier. The algorithm is simple over any patch of
    sky where the set of input images is constant, and we do not anticipate significant difficulty
    in extending this to an algorithm that works across image boundaries. The main challenge is
    likely to be the parallelization and data flow necessary to efficiently ensure consistent back-
grounds over a full tract. Separate tracts are still processed independently, however.
    The steps involved in background matching are described below. All of these operations are
    performed on the MatchedWarp images; these are all in the same coordinate system and
    have the same PSF, so they can be meaningfully added and subtracted with no additional
    processing.
1. We define one of the visits that overlap an area of the sky as the reference image. At least in the naive local specification of the algorithm, this image must be smooth and continuous over the region of interest. This is done by the Build Background Reference pipeline, which must artificially (but reversibly) enforce continuity in a reference image
    that stitches together multiple visits to form a single-epoch-deep full tract image, unless
    we develop an approach for dealing with discontinuity downstream.
2. We subtract the reference image from every other visit image. This must account for any artificial features due to the construction of the reference image.
3. We run Source Detection on the per-visit difference images to find artifacts and transient
    sources. We do not generate a traditional catalog of these detections, as they will only
    be used to generate improved CalExp masks; they will likely be stored as a sequence of
    Footprints.
4. We estimate the background on the per-visit difference images by delegating to the Matched Background Estimation algorithmic component (sketched after this list). This difference background should be easier to model than a direct image background, as the image will be mostly
    free of sources and astrophysical backgrounds. This stage must involve at least some
    communication between patches to ensure that the background is continuous and con-
    sistent in patch overlap regions.
5. We build a PSF-matched coadd by adding all of the visit images (including the reference) and subtracting all of the difference image backgrounds; this yields a coadd that contains only the reference image background, which we then model and subtract via the Coadd Background Estimation algorithmic component. This background estimation must also
    involve communication between patches to ensure consistency. Combining the images
will be performed by the Coaddition algorithmic component, while the Warped Image Comparison component is used to generate new CalExp masks by analyzing the per-
    pixel, multi-visit histograms of image and mask values (e.g. generalized statistical outlier
    rejection) to distinguish transients and artifacts from variable sources.
6. We combine the relevant difference backgrounds with the coadd background and trans-
    form them back to the CalExp coordinate systems to compute new background models
    for each CalExp.
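The core of step 4 can be sketched as a low-order fit to a masked difference image; a real implementation would use the pipeline's background classes, robust statistics, and cross-patch constraints, but the following conveys the idea:

    import numpy as np

    def matched_background(diff_image, detected_mask, order=2):
        # Fit a smooth 2-d polynomial to a (visit - reference) difference
        # image, ignoring pixels flagged as detections or artifacts.
        yy, xx = np.indices(diff_image.shape)
        good = ~detected_mask
        terms = [(xx ** i) * (yy ** j)
                 for i in range(order + 1) for j in range(order + 1 - i)]
        A = np.vstack([t[good] for t in terms]).T
        coeffs, _, _, _ = np.linalg.lstsq(A, diff_image[good], rcond=None)
        return sum(c * t for c, t in zip(coeffs, terms))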
We are assuming in the baseline plan that we can use a matched-to PSF in WarpAndPsfMatch large enough to match all visit images to it without deconvolution. If a large matched-to PSF adversely affects subsequent processing in BackgroundMatchAndReject, we may need to develop an iterative approach in which we apply WarpAndPsfMatch only to better-seeing visits first, using a smaller target PSF, run BackgroundMatchAndReject on these, and then re-match
    everything to a larger target PSF and repeat with a larger set of input visits. However, this prob-
lem would suggest that the DiffIm and UpdateMasks pipelines would be even better at finding
    artifacts, so a more likely mitigation strategy would be to simply defer final Mask generation
to after at least one iteration of those pipelines, as described in the discussion of Figure 9 at the beginning of Section 5.2.
    The outputs of BackgroundMatchAndReject are updated Background and Mask components
    for the CalExp product. Because it is not built with the final photometric and astrometric
    calibration, the PSF-matched coadd built here is discarded.
    5.2.3 WarpTemplates
    This pipeline is responsible for generating the resampled visit-level images (TemplateWarp)
    used to build template coadds for difference imaging. The algorithmic content of this pipeline
    and the nature of its outputs depends on whether we are using decorrelated coadds (option
A at the beginning of Section 5.2), PSF-matched coadds (B1 or C1), or inferring templates (B2 or C2).
If we are using decorrelated coadds (option A), the output is equivalent to the LikelihoodWarp data product produced by the WarpRemaining pipeline (aside from differences due to the state of the input CalExps), and the algorithm to produce it is the same:
• We correlate the image with its own PSF by delegating to the Convolution Kernels software primitive.
• We resample the image by delegating to the Image Warping software primitive.
    Here we should strongly consider developing a single algorithmic component to perform both
operations. These operations must include full propagation of uncertainty.
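A sketch of such a combined component, using generic convolution and resampling routines in place of the pipeline's software primitives, might look as follows; coadd_to_calexp is a hypothetical callable giving the inverse coordinate mapping that resampling requires:

    import numpy as np
    from scipy.ndimage import convolve, map_coordinates

    def make_likelihood_warp(image, psf_kernel, coadd_shape, coadd_to_calexp):
        # Correlation with the PSF is convolution with the flipped kernel.
        likelihood = convolve(image, psf_kernel[::-1, ::-1], mode="constant")
        yy, xx = np.indices(coadd_shape, dtype=float)
        src_y, src_x = coadd_to_calexp(yy, xx)   # coadd pixel -> CalExp pixel
        # Production code would use Lanczos interpolation; a spline is used here.
        return map_coordinates(likelihood, [src_y, src_x], order=3, cval=np.nan)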
If we are not using decorrelated coadds (B1 or C1), the output is equivalent to the MatchedWarp data product, and the algorithm is the same as in the WarpAndPsfMatch pipeline. We cannot reuse existing MatchedWarps simply because we need to utilize updated CalExps.
If we are inferring templates (B2 or C2), this pipeline is only responsible for resampling, producing an output equivalent to the DirectWarp data product produced by the WarpRemaining pipeline. This work is delegated to the Image Warping software primitive.
    5.2.4 CoaddTemplates
    This pipeline generates the TemplateCoadd dataset used as the reference image for differ-
ence imaging. This may not be a simple coadd, at least in u (and possibly g and r); in order to correct for differential chromatic refraction during difference imaging, we may need to add a wavelength or airmass dimension to the usual 2-d image, making it a 3-dimensional quantity. The size of the third dimension will likely be small, however, so it should be safe to generally consider TemplateCoadd to be a small suite of coadds, in which each 2-d image is the result of a different sum of or fit to the usual visit-level images (the TemplateWarp dataset, in this case).
Most of the work is done by the DCR-Corrected Template Generation algorithmic component, but its behavior depends on which of the coaddition scenarios is selected from the list at the beginning of Section 5.2:
A, B1, C1
One or more coadd-like images (corresponding to different wavelengths, airmasses, etc.) are created by delegating to the Coaddition algorithmic component to sum the TemplateWarp images with different weights. A only: the coadded images are then partially decorrelated to constant PSF by delegating to the Coadd Decorrelation algorithmic component.
B2, C2
The template is inferred from the resampled visit images using an inverse algorithm that is yet to be developed.
    5.2.5 DiffIm
    In the DiffIm pipeline, we subtract a warped TemplateCoadd from each CalExp, yielding the
    DiffExp image, where we detect and characterize DIASources. This is quite similar to Alert
Production’s Alert Detection pipeline but may not be identical for several reasons. The AP
    variant must be optimized for low latency, and hence may avoid full-visit processing that is
    perfectly acceptable in DRP. In addition, the input CalExps will have been better characterized
    in DRP, which may make some steps taken in AP unimportant or even counterproductive.
    However, we expect that the algorithmic components utilized in DRP are the same as those
    used by AP.
    The steps taken by DRP DiffIm are:
    1. Retrieve the DiffIm template appropriate for the CalExps to be processed (probably
handling a full visit at a time), delegating to the Template Retrieval algorithmic compo-
    nent. This selects the appropriate region of sky, and if necessary, collapses a higher-
    dimensional template dataset to a 2-d image appropriate for the CalExp’s level of DCR.
2. (optional) Correlate the CalExp with its own PSF, delegating to the Convolution Kernels software primitive. This is the “preconvolution” approach to difference imaging, which
    makes PSF matching easier by performing PSF-correlation for detection first, reducing or
    eliminating the need for deconvolution. This approach is theoretically quite promising
    but still needs development.
3. Resample the template to the coordinate system of the CalExp, by delegating to the Image Warping software primitive.
4. Match the template’s PSF to the CalExp’s PSF and subtract them, by delegating to the Image Subtraction algorithmic component.
5. Run Source Detection on the difference image. We correlate the image with its PSF first using the Convolution Kernels software primitive unless this was done prior to subtraction.
6. (optional) Decorrelate the difference image by delegating to the Difference Image Decorrelation algorithmic component.
7. Run DiffIm Measurement on the difference image to characterize difference sources. If
    preconvolution is used but decorrelation is not, the difference image cannot be mea-
    sured using algorithms applied to standard images; alternate algorithms may be devel-
    oped for some measurements, but perhaps not all.
DiffIm can probably be run entirely independently on each CCD image; this approach will almost certainly be taken in Alert Production. However, joint processing across a full visit may be more
    computationally efficient for at least some parts of template retrieval, and PSF-matching may
    produce better results if a more sophisticated full-visit matching algorithm is developed.
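In the same pseudocode style as the RefineImChar listing in Section 5.1.3, and with the optional steps shown as comments, the DRP DiffIm flow can be summarized as follows (the names are illustrative, not bound interfaces):

    def DiffIm(calexp, templateCoadd):
        template = TemplateRetrieval(templateCoadd, calexp.wcs)                   # step 1
        # step 2 (optional): calexp.mi = ConvolutionKernels(calexp.mi, calexp.psf)
        warped = ImageWarping(template, calexp.wcs)                               # step 3
        diffexp = ImageSubtraction(calexp.mi, warped)                             # step 4
        detections = DetectSources(ConvolutionKernels(diffexp.mi, diffexp.psf))   # step 5
        # step 6 (optional): diffexp = DifferenceImageDecorrelation(diffexp)
        diasources = DiffImMeasurement(detections, diffexp)                       # step 7
        return diffexp, diasources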
    5.2.6 UpdateMasks
    UpdateMasks is an optional pipeline that is only run if DiffExp masks are being used to up-
date CalExp masks. As such, it is not run after the last iteration of DiffIm, and is never run if BackgroundMatchAndReject constructs the final CalExp masks.
Like BackgroundMatchAndReject, UpdateMasks compares the histogram of mask values at a
    particular spatial point to determine which masks correspond to transients (both astrophysi-
    cal sources and artifacts; we want to reject both from coadds) and which correspond to vari-
able objects. This work is delegated to Coaddition.
    5.2.7 WarpRemaining
This pipeline is responsible for the full suite of resampled images used to build coadds in CoaddRemaining, after all CalExp components have been finalized. It produces some combination
    of the following data products, depending on the scenario(s) described at the beginning of
    Section5.2:
    LikelihoodWarp
    CalExp images are correlated with their own PSF, then resampled, via the
Convolution Kernels software primitive and the Image Warping software primitive. LikelihoodWarp is computed in all scenarios, but in option C it may not need to propagate
    uncertainty beyond the variance, as the resulting coadd will be used only for detection.
    MatchedWarp
As in WarpAndPsfMatch, CalExp images are resampled then matched to a common PSF, using Image Warping and PSF Homogenization. MatchedWarp is only produced in option C.
    DirectWarp
CalExp images are simply resampled, with no further processing of the PSF, using Image Warping. DirectWarp is only produced in option C.
    Given that all of these steps involve resampling the image, it would be desirable for computa-
    tional reasons to do the resampling once up front, and then proceed with the PSF processing.
    While this is mathematically possible for all of these cases, it would significantly complicate
    the PSF correlation step required for building LikelihoodWarps.
    5.2.8 CoaddRemaining
    In CoaddRemaining, we build the suite of coadds used for deep detection, deblending, and ob-
    ject characterization. This includes the Likelihood, ShortPeriodLikelihood, Deep, BestSeeing,
    ShortPeriod, and ConstantPSF Coadds.
The algorithms again depend on the scenarios outlined at the beginning of Section 5.2:
A, B
All non-template coadds are built from LikelihoodWarps. We start by building ShortPeriodLikelihoodCoadds by simple coaddition of the LikelihoodWarps, using the Image Coaddition algorithmic component. We decorrelate these using the Coadd Decorrelation algorithmic component to produce ShortPeriodCoadds, then sum the ShortPeriod-
    LikelihoodCoadds to produce the full LikelihoodCoadd. The full LikelihoodCoadd is then
    decorrelated to produce DeepCoadd and ConstantPSFCoadd.
    C
    We generate LikelihoodCoadd and ShortPeriodLikelihoodCoadds using the same approach
    as above (though the accuracy requirements for uncertainty propagation are eased).
    ShortPeriodCoadd, DeepCoadd, and BestSeeingCoadd are then built as different combi-
nations of DirectWarp images, again using the Image Coaddition algorithmic component.
    ConstantPSFCoadds are built by combining MatchedWarps.
    These coadds must propagate uncertainty, PSF models (including aperture corrections), and
    photometric calibration (including spatial- and wavelength-dependent photometric calibra-
    tion), in addition to pixel values.
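For orientation, the core of direct Image Coaddition reduces to an inverse-variance-weighted mean with variance propagation, as in the minimal sketch below; propagating PSF models, aperture corrections, and calibration, as required above, is the harder part and is not shown:

    import numpy as np

    def coadd(warps, variances):
        # Inverse-variance-weighted mean of resampled (float) images; pixels
        # with no valid input come out NaN.
        num = np.zeros_like(warps[0])
        wsum = np.zeros_like(warps[0])
        for im, var in zip(warps, variances):
            good = np.isfinite(im) & np.isfinite(var) & (var > 0)
            w = np.where(good, 1.0 / np.where(good, var, 1.0), 0.0)
            num += np.where(good, w * im, 0.0)
            wsum += w
        with np.errstate(divide="ignore", invalid="ignore"):
            return num / wsum, 1.0 / wsum   # coadd image, coadd variance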
    5.3 Coadd Processing
    In comparison to the previous two pipeline groups, the large-scale processing flow in coadd
    processing is relatively simple. All pipelines operate on individual patches, and there is no
    large-scale iteration between pipelines. These pipelines may individually require complex
    parallelization at a lower level, as they will frequently have memory usage above what can
    be expected to fit on a single core.
Coadd processing begins with the DeepDetect pipeline, which simply finds above-threshold regions and peaks in multiple detection coadds. These are merged in catalog-space in DeepAssociate, then deblended at the pixel level in DeepDeblend. The deblended pixels are measured in MeasureCoadds, which may also fit multiple objects simultaneously using the original undeblended pixels.
    5.3.1 DeepDetect
This pipeline simply runs the Source Detection algorithmic component on combinations of
    LikelihoodCoadds and ShortPeriodLikelihoodCoadds, then optionally performs additional pre-
    liminary characterization on related coadds. These combinations are optimized for detecting
    objects with different SEDs, and there are a few different scenarios for what combinations
    we’ll produce (which are not mutually exclusive):
• We could simply detect on each per-band LikelihoodCoadd separately.
    • We could build a small suite of cross-band LikelihoodCoadds corresponding to simple
    and artificial but approximately spanning SEDs (flat spectra, step functions, etc.).
• We could build a single χ² coadd from the per-band coadds, which is only optimal for objects the color of the sky noise, but may be close enough to optimal to detect a broad range of SEDs (see the sketch below).
    Any of these combinations may also be used to combine ShortPeriodLikelihoodCoadds.
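The χ² option above is simple enough to sketch directly: each per-band image is squared in units of its per-pixel significance and the results are summed, after which thresholds must be set using χ² rather than Gaussian statistics. A minimal version, assuming aligned per-band images and variance planes:

    import numpy as np

    def chisq_coadd(band_images, band_variances):
        # Sum of squared per-pixel significances over bands; threshold the
        # result with chi-squared statistics (ndof = number of bands).
        chisq = np.zeros_like(band_images[0])
        for im, var in zip(band_images, band_variances):
            chisq += im ** 2 / var
        return chisq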
    We may also convolve the images further or bin them to improve our detection efficiency for
    extended objects.
Actual detection on these images may be done with a lower threshold than our final target threshold of 5σ, to account for loss of efficiency due to using the incorrect SED or morphological filter.
The details of the suite of detection images and morphological filters are a subject requiring
    further algorithmic research on precursor data (or LSST/ComCam data) at full LSST depths
    with at least approximately the right filter set.
After detection, CoaddSources may be deblended and characterized by running the Single Frame Deblending, Single Frame Measurement, and Single Frame Classification algorithmic components on DeepCoadd and ShortPeriodCoadd combinations that correspond to the LikelihoodCoadd combinations used for detection. These characterizations (like the rest of the CoaddSource tables) will be discarded after the DeepAssociate pipeline is run, but may be necessary to inform higher-level association algorithms run there. The requirements on characterization processing in this pipeline will be set by the needs of the DeepAssociate pipeline,
    but we do not expect it to involve significant new code beyond what will be used by the various
    ImChar pipelines.
    The only output of DeepDetect is the suite of CoaddSource tables (one for each detection
image) containing Footprints (including their Peaks and any characterizations necessary for
    association).
    5.3.2 DeepAssociate
    In DeepAssociate, we perform a sophisticated spatial match of all CoaddSources and DIA-
    Sources, generating tables of DIAObjects, Object candidates, and a table of unassociated DIA-
Sources that will be used to construct SSObjects in MOPS.
We do not include the Source table in this merge, as virtually all Sources correspond to as-
    trophysical objects better detected elsewhere. Non-moving or slowly-moving astrophysical
    objects (even variable non-transient objects) will be detected at much higher significance in
DeepDetect (as CoaddSources). Transients and fast-moving objects will be detected at similar significance with significantly less blending (and much easier classification) in DiffIm (as DIA-
    Sources). While a small number of transient/moving Sources near the detection limit may not
    be detected in difference images due to extra noise from the template, these will be nearly
    impossible to recover without a large false positive rate from a spatial match of the Source
    table.
The baseline plan for association is to first associate DIASources into DIAObjects using the same approach used in Alert Production (i.e. the DIAObject Generation algorithmic component), then associate DIAObjects with the multiple CoaddSource tables (using the Object Generation algorithmic component). DIASources not associated into DIAObjects will be considered candidates for merging into SSObjects, which will happen in the MovingObjectPipeline pipeline.
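The sketch below caricatures this two-stage procedure as pure spatial matching with a k-d tree; as discussed next, the real components must also use flux, classification, and motion information rather than positions alone:

    import numpy as np
    from scipy.spatial import cKDTree

    def associate_diasources(diasource_xy, radius):
        # Stage 1 (simplified): greedily cluster DIASources into DIAObject
        # candidates by position alone (no proper motion or parallax model).
        tree = cKDTree(diasource_xy)
        labels = np.full(len(diasource_xy), -1)
        next_label = 0
        for i in range(len(diasource_xy)):
            if labels[i] < 0:
                for j in tree.query_ball_point(diasource_xy[i], radius):
                    if labels[j] < 0:
                        labels[j] = next_label
                next_label += 1
        return labels

    def associate_objects(diaobject_xy, coaddsource_xy, radius):
        # Stage 2 (simplified): match DIAObjects against CoaddSources;
        # -1 marks a DIAObject with no CoaddSource counterpart.
        tree = cKDTree(coaddsource_xy)
        dist, idx = tree.query(diaobject_xy, distance_upper_bound=radius)
        return np.where(np.isinf(dist), -1, idx)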
    These association steps must be considerably more sophisticated than simple spatial match-
    ing; they must utilize the limited flux and classification information available from detection
    to decide whether to merge sources detected in different contexts. This will require astro-
    physical models to be included in the matching algorithms at some level; for instance:
    • We must be able to associate the multiple detections that correspond to high proper-
    motion stars into a single Object.
    • We must not associate supernovae with their host galaxies, despite the fact that their
    positions may be essentially the same.
To meet these goals (as well as similar ones which still need to be specified), DeepAssociate will have to generate multiple hypotheses for some blend families. Some of these conflicting hypotheses will be rejected by DeepDeblend, while others may be present in the final Object catalog (flags will be used to indicate different interpretations and our most likely interpretation). This is a generalization of the simple parent/child hierarchy used to describe different blend hypotheses in the SDSS database (see Section 2.3).
    It is possible that associations could be improved by doing both merge steps simultaneously
    (under the hypothesis that CoaddSource presence or absence could be used to improve DI-
    ASource association). This is considered a fallback option if the two-stage association proce-
    dure described above cannot be made to work adequately.
    The output of the DeepAssociate pipeline is the first version of the Object table, containing a
    superset of all Objects that will be characterized in later pipelines.
    5.3.3 DeepDeblend
This pipeline simply delegates to the Multi-Coadd Deblending algorithmic component to de-
    blend all Objects in a particular patch, utilizing all non-likelihood coadds of that patch. This
    yieldsHeavyFootprintscontaining consistent deblended pixels for every object in every (non-
    likelihood) coadd, while rejecting as many deblend hypotheses as possible to reduce the num-
    ber of hypotheses that must be subsequently measured.
    While the pipeline-level code and data flow is simple, the algorithmic component is not. Not
only must deblending deal with arbitrarily complex superpositions of objects with unknown
    morphologies, it must do so consistently across bands and epoch ranges (with different PSFs)
    and ensure proper handling of Objects spawned by DIASources that may not even appear in
    coadds. It must also parallelize this work efficiently over multiple cores; in order to fit patch-
    level images for all coadds in memory, the processing of at least the largest individual blend
    families must themselves be parallelized. This may be done by splitting the largest blend
    families into smaller groups that can be processed in parallel with only a small amount of
    serial iteration; it may also be done by using low-level multithreading over pixels.
    The output of the DeepDeblend pipeline is an update to the Object table, which adds columns
    to indicate the origins of Objects and the decisions taken by the deblender as well as modifying
    the set of rows to reflect the current object definitions. It also includes attaching pixel-level de-
blend information to each Object. If stored directly in the form of HeavyFootprints, this would be a large dataset (comparable to the coadd pixel data). This form must be available at least
to the MeasureCoadds pipeline, but it almost certainly needs to be available to science users
    as well. Depending on the deblender implementation, it may be possible to instead store
analytic models or some other compressed form that would allow the full HeavyFootprints to be reconstructed quickly on the fly, while requiring a relatively small amount of additional
    per-object information. If this compression is lossy, it should probably be applied before the
deblend results are first used in MeasureCoadds so the deblends used there can be exactly
    reconstructed later.
    5.3.4 MeasureCoadds
The MeasureCoadds pipeline delegates to the Multi-Coadd Measurement algorithmic component to jointly measure all Objects on all coadds in a patch.
Like DeepDeblend, this pipeline is itself quite simple, but it delegates to a complex algorithmic component (but a simpler one than Multi-Coadd Deblending). There are three classes of open
    questions in how multi-coadd measurement will proceed:
    • What parameters will be fit jointly across bands, and which will be fit independently? The
    measurement framework for multi-coadd measurement is designed to support joint
fitting, but it is likely that some algorithms will simply be Single Frame Measurement or Forced Measurement plugins run independently on the DeepCoadd
    and/or ConstantPSFCoadd in each band. Making these decisions will require experimen-
    tation on deep precursor and simulated data.
    • How will we measure blended objects? Coadd measurement will at least begin by using
the HeavyFootprints produced by DeepDeblend to use the Neighbor Noise Replacement approach, but we may then use Simultaneous Fitting to generate improved warm-start parameters for MultiFit or to build models we can use as PSF-deconvolved templates to enable the Deblend Template Projection approach in MultiFit and/or ForcedPhotome-
    try. If the deblender utilizes simultaneous fitting internally, we may also be able to use
    the results of those fits directly as measurement outputs or to reduce the amount of
    subsequent fitting that must be done.
• How will we parallelize? As with DeepDeblend, keeping the full suite of coadds in mem-
    ory will require processing at least some blend families using many cores. For algorithms
    that don’t require joint fitting across different coadds, this could be done by measuring
    each coadd independently, but the most expensive algorithms (e.g. galaxy model fitting)
    are likely to be the ones where we’ll want to fit jointly across bands.
    The output of the MeasureCoadds pipeline is an update to the Object table, which adds columns
    containing measured quantities.
    5.4 Overlap Resolution
    The two overlap resolution pipelines are together responsible for finalizing the definitions of
    Objects by merging redundant processing done in tract and patch overlap regions. In most
    cases, object definitions in the overlap region will be the same, making the problem trivial, and
    even when the definitions are different we can frequently resolve the problem using purely
    geometrical arguments. However, some difficult cases will remain, mostly relating to blend
    families that are defined differently on either side.
    We currently assume that overlap resolution actually drops Object rows when it merges them;
this will avoid redundant processing in the performance-critical MultiFit pipeline. A slower but perhaps safer alternative would be to simply flag redundant Objects. This would also allow tract overlap resolution to be moved after the MultiFit and ForcedPhotometry pipelines, which would simplify large-scale parallelization and data flow by moving the first operation requiring more than one tract (ResolveTractOverlaps) until after all image processing is complete.
    5.4.1 ResolvePatchOverlaps
    In patch overlap resolution, all contributing patches to an area (there can be between one and
four; see Figure 10) share the same pixel grid, and we furthermore expect that they will have
    the same coadd pixel values. This should ensure that any above-threshold pixel in one patch
    is also above threshold in all others, which in turn should guarantee that patches agree on
the extent of each blend family (as defined by the parent Footprint).
    A common pixel grid also allows us to define the overlap areas as exact rectangular regions;
    we consider each patch to have an inner region (which directly abuts the inner regions of
    neighboring patches) and an outer region (which extends into the inner regions of neighboring
    patches). If we consider the case of two overlapping patches, blend families in those patches
can fall into the following categories:
    • If the family falls strictly within one patch’s inner region, it is assigned to that patch (and
    the other patch’s version of the family is dropped).
    • If the family crosses the boundary between patch inner regions...
    ...but is strictly within both patches’ outer regions, it is assigned to the patch whose
    inner region includes more of the family’s footprint area.
    ...but is strictly within only one patch’s outer region, it is assigned to that patch.
    ...and is not strictly within either patch’s outer region, the two families must be
    merged at an Object-by-Object level. The algorithm used for this procedure is yet
to be developed, but will be implemented by the Blended Overlap Resolution algorithmic component (the overall two-patch decision logic is sketched below).
    Overlap regions with more than two patches contributing have more possibilities, but are
    qualitatively no different.
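For the two-patch case, the decision logic above reduces to something like the following sketch, in which the inner/outer region objects and their contains and overlap_area methods are hypothetical interfaces:

    def assign_family(footprint, patch_a, patch_b):
        # Returns the owning patch, or None when the two versions of the
        # family must be merged Object-by-Object (Blended Overlap Resolution).
        if patch_a.inner.contains(footprint):
            return patch_a
        if patch_b.inner.contains(footprint):
            return patch_b
        in_a = patch_a.outer.contains(footprint)
        in_b = patch_b.outer.contains(footprint)
        if in_a and in_b:
            # assign to the patch whose inner region covers more of the family
            a_area = patch_a.inner.overlap_area(footprint)
            b_area = patch_b.inner.overlap_area(footprint)
            return patch_a if a_area >= b_area else patch_b
        if in_a:
            return patch_a
        if in_b:
            return patch_b
        return None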
Figure 10: Patch boundaries and overlap regions for a single tract with 3 × 3 patches. Differ-
    ent colors represent different patches; dashed lines show outer patch regions and dotted
    lines show inner patch regions. Light gray regions are processed as part of only one patch,
    medium regions as part of two, and dark regions as part of four.
    If pixel values in patch overlap regions cannot be guaranteed to be identical, patch overlap
    resolution becomes significantly harder (but no harder than tract overlap resolution), because
adjacent patches may disagree on the category to which a family belongs.
    Patch overlap resolution can be run independently on every distinct overlap region that has
    a different set of patches contributing to it; in the limit of many patches per tract, there are
    three times as many overlap regions as patches (each patch has four overlap regions shared
    by two patches, and four overlap regions each shared by four patches).
    5.4.2 ResolveTractOverlaps
    Tract overlap resolution operates under the same principles as patch overlap resolution, but
    the fact that different tracts have different coordinate systems and subtly different pixel val-
    ues makes the problem significantly more complex.
    While we do not attempt to define inner and outer regions for tracts, we can still define dis-
    crete overlap regions in which the set of contributing tracts is constant (though these regions
    must now be defined using spherical geometry). Because tracts may differ on the extent and
    membership of blend families, it will be useful here to define the concept of a “blend chain”:
    within an overlap region a family’s blend chain is the recursive union of all families it overlaps
with in any tract that contributes to that overlap region (see Figure 11). A blend chain is thus
    the maximal cross-tract definition of the extent of a blend family, and hence we can use it to
    categorize blends in tract overlaps:
1. If a blend chain is strictly contained by only one tract, all families within that chain are
assigned to that tract. Note that this can occur even if the blend chain overlaps multiple
tracts, as in Figure 11; region 1 there is wholly contained only by the blue tract even
though it overlaps the green tract.
2. If a blend chain is strictly contained by more than one tract, all families within that chain
are assigned to the tract whose center is closest to the centroid of the blend chain. This
is illustrated by region 2 in Figure 11, which would be assigned to the red tract.
3. If a blend chain is not strictly contained by any tract, all families in the chain must be
merged at an Object-by-Object level. This is done by the BlendedOverlapResolution
algorithmic component, after first transforming all measurements to a new coordinate
system defined to minimize distortion due to projection (such as a tangent projection at
the blend chain’s centroid).
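The “recursive union” that defines a blend chain is the transitive closure of pairwise footprint overlaps, which a union-find structure computes directly. A minimal sketch, assuming only that we can test whether two families’ footprints intersect on the sphere (all names here are illustrative):

def build_blend_chains(families, overlaps):
    """families: list of per-tract blend families in one overlap region.
    overlaps(a, b): predicate, True if the two footprints intersect.
    Returns a list of blend chains (each a list of families)."""
    parent = list(range(len(families)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # The recursive union in the text reduces to merging every
    # overlapping pair; union-find makes the closure transitive.
    for i in range(len(families)):
        for j in range(i + 1, len(families)):
            if overlaps(families[i], families[j]):
                union(i, j)

    chains = {}
    for i, fam in enumerate(families):
        chains.setdefault(find(i), []).append(fam)
    return list(chains.values())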
    ResolveTractOverlaps is the first pipeline in Data Release Production to require access to pro-
    cessed results from more than one tract.
Figure 11: Tract overlap scenarios, corresponding to the enumerated list in the text. Each
region outlined in black is a blend chain; transparent filled regions within these indicate the
contributions from individual tracts. The region labeled 0 is strictly contained by the green
tract and does not touch any others, so it does not participate in tract overlap resolution at
all.
    5.5 Multi-Epoch Object Characterization
The highest quality measurements for the vast majority of LSST objects will be performed
by the MultiFit and ForcedPhotometry pipelines. These measurements include stellar proper
motions and parallaxes, galaxy shapes and fluxes, and light curves for all objects. These super-
sede many (but not all) measurements previously made on coadds and difference images by
using deep, multi-epoch information to constrain models while fitting directly to the original
CalExp (or DiffExp) images.
The difference between the two pipelines is their parallelization axis: an instance of the Multi-
Fit pipeline processes a single Object family at a time, utilizing all of the CalExps that overlap
that family as input, while ForcedPhotometry processes one CalExp or DiffExp at a time, iter-
ating over all Object families within its bounding box. Together these two pipelines must
perform three roles:
• Fit moving point source and galaxy models to all Objects, adding new columns or updat-
ing existing columns in the Object table. This requires access to all images simultane-
ously, so it must be done in MultiFit.
• Fit fixed-position point source models for each object (using the MultiFit-derived posi-
tions) to each DiffExp image separately, populating the ForcedSource table. This differ-
ential forced photometry could conceivably be done in MultiFit, but will probably be more
efficient to do in ForcedPhotometry.
• Fit fixed-position point source models for each object to each CalExp image separately,
also populating the ForcedSource table. This direct forced photometry can easily be done
in either pipeline, but doing it in MultiFit should give us more options for dealing with blend-
ing, and it may decrease I/O costs as well.
    5.5.1 MultiFit
    MultiFit is the single most computationally demanding pipeline in Data Release Production,
and its data flow is essentially orthogonal to that of all previous pipelines. Instead of a process-
ing flow based on data products, each MultiFit job is an Object family covering many distinct
    images, and hence efficient I/O will require the orchestration layer to process these jobs in an
    order that minimizes the number of times each image is loaded.
    From the Science Pipelines side, MultiFit is implemented as two routines, mediated by the
    orchestration layer:
    • The MultiFit “launcher” processes the Object table and defines family-level MultiFit jobs,
    including the region of sky required and the corresponding data IDs and pixel-area re-
    gions (unless the latter two are more efficiently derived from the sky area by the orches-
    tration layer).
    • The MultiFit “fitter” processes a single Object family, accepting all required image data
    from the orchestration layer and returning an Object record (and possibly a table of
related ForcedSources). This is the Multi-Epoch Measurement algorithmic component.
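As an illustration of this division of labor, the following sketch shows plausible signatures for the two routines; the MultiFitJob container and the methods on the catalog and image-index objects are assumptions for illustration, not actual interfaces:

from dataclasses import dataclass

@dataclass
class MultiFitJob:
    family_id: int       # the Object family to fit
    sky_region: object   # region of sky required for the fit
    data_ids: list       # CalExps overlapping that region
    pixel_regions: dict  # per-data-ID pixel bounding boxes

def launch_jobs(object_table, calexp_index):
    """Launcher: walk the Object table and emit one job per family,
    leaving staging and scheduling to the orchestration layer."""
    for family in object_table.group_by_family():
        region = family.bounding_region()
        data_ids = calexp_index.find_overlapping(region)
        yield MultiFitJob(
            family.id, region, data_ids,
            {d: calexp_index.pixel_box(d, region) for d in data_ids})

def fit_family(job, pixel_data):
    """Fitter: given one job and the pixel data staged by orchestration,
    run Multi-Epoch Measurement and return updated Object records
    (and possibly associated ForcedSource records)."""
    ...  # delegate to the Multi-Epoch Measurement algorithmic component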
    This simple picture is complicated by the presence of extremely large blend families, however.
    Some blend families may be large enough that a single MultiFit job could require more mem-
    ory than is available on a full node (or require more cores on a node than can be utilized by
    lower-level parallelization). We see two possibilities for addressing this problem:
    • The fitter could utilize cross-node communication to extend jobs over more nodes. The
    most obvious approach would give each node full responsibility for any processing on a
    group of full CalExps it holds in memory, as well as responsibility for “directing” a num-
    ber of MultiFit jobs. These jobs would delegate pixel processing on CalExps to the nodes
    responsible for them (this constitutes the bulk of the processing). This would require
    low-latency but low-bandwidth communication; the summary information passed be-
    tween the directing jobs and the CalExp-level processing jobs is much smaller than the
    actual CalExps or even the portion of a CalExp used by a particular fitting job, but this
    communication happens within a relatively tight loop (though not the innermost loop).
    This approach will also require structuring the algorithmic code to abstract out commu-
    nication, and may require an alternate mode to run small jobs for testing.
    • The launcher could define a graph of sub-family jobs that correspond to an iterative
    divide-and-conquer approach to large families. This approach will require more flexibil-
    ity in the algorithmic code to handle more combinations of fixed and free parameters
    (to deal with neighboring objects on the edges of the images being considered), more
    tuning and experimentation, and more sophisticated launcher code. Fitting individual
    large objects in this scenario could also require binning images in the orchestration or
    data access layer.
    It is unclear which of these approaches will be more computationally expensive. The first
    option may reduce I/O or total network usage at the expense of sensitivity to network latency.
    The second option may require redundant processing by forcing iterative fitting, but that sort
    of iterative fitting may lead to faster convergence and hence be used even in the first option.
    If direct forced photometry is performed in MultiFit, moving-point source models will simply
    be re-fit with per-epoch amplitudes allowed to vary independently and all other parameters
    held fixed. The same approach could be used to perform differential forced photometry, but
    this would require also passing DiffExp pixel data to MultiFit.
    Significant uncertainty also remains in how MultiFit will handle blending even in small families,
    but this decision will not have larger-scale processing impacts, and will be discussed further
in Section 6.7.3.
    5.5.2 ForcedPhotometry
    In ForcedPhotometry, we simply measure point-source and possibly aperture photometry
    (the baseline is point source photometry, but aperture photometry should be implemented
    for diagnostic use and as a fallback) on individual CalExp or DiffExp images, using positions
    from the Object table.
Aside from querying the Object table for the list of Objects overlapping the image, all work
is delegated to the ForcedMeasurement algorithmic component. The only algorithmic chal-
lenge is how to deal with blending. If only differential forced photometry is performed in this
pipeline, it may be appropriate to simply fit all Objects within each family simultaneously with
point source models. The other alternative is to project templates from MultiFit or possibly
MeasureCoadds and replace neighbors with noise (as described in Section 6.7.3).
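With positions held fixed, point-source forced photometry reduces to a linear least-squares fit of the PSF model’s amplitude at each position. A minimal sketch of that fit follows (this is standard matched-filter photometry, not necessarily the exact ForcedMeasurement implementation):

import numpy as np

def forced_psf_flux(image, variance, psf_model):
    """image, variance: postage-stamp arrays around the Object position.
    psf_model: PSF realization on the same stamp, centered at the
    (fixed) Object position and normalized to unit sum.
    Returns (flux, flux_err) in the image's native units."""
    w = psf_model / variance  # inverse-variance weights
    # Minimizing sum((image - A*psf)^2 / variance) over the amplitude A:
    flux = np.sum(w * image) / np.sum(w * psf_model)
    flux_err = np.sqrt(1.0 / np.sum(psf_model**2 / variance))
    return flux, flux_err

For blended families fit simultaneously, the same normal equations generalize to a small linear system with one amplitude per Object.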
    5.6 Postprocessing
The pipelines in the postprocessing group may be run after nearly all image processing is
complete, and, with the possible exception of MakeSelectionMaps, include no image process-
ing themselves. While we do not expect that these pipelines will require significant new al-
gorithm development, they include some of the least well-defined aspects of Data Release
Production; many of these pipelines are essentially placeholders for work that may ultimately
be split out into multiple new pipelines or included in existing ones. Unlike the rest of DRP,
a more detailed design here is blocked more by the lack of clear requirements and policies
than by a need for algorithmic research.
    5.6.1 MovingObjectPipeline
    The Moving Object Pipeline plays essentially the same role in DRP that it plays in AP: it builds
    the SSObject (Solar System Object) table from DIASources that have not already been associ-
    ated with DIAObjects. We will attempt to make its implementation as similar as possible to
the AP Moving Object Pipeline, but the fact that DRP will run on all DIASources in the survey at
once (instead of incrementally) makes this impossible in detail. The steps in MOPS are (with
some iteration):
• Delegate to the MakeTracklets algorithmic component to combine unassociated DIA-
Sources into tracklets.
• Delegate to the Attribution and Precovery algorithmic component to predict the posi-
tions of known solar system objects and associate them with tracklets. The definition
of a “known” solar system object clearly depends on the input catalog; this may be an
external catalog or a snapshot of the Level 1 SSObject table.
• Delegate to the OrbitFitting algorithmic component to merge unassociated tracklets into
tracks and fit orbits for SSObjects where possible.
The choice of initial catalog largely depends on the false-object rate in the Level 1 SSObject
table; if the only improvements in data release production are slightly improved orbits and/or
new SSObjects, using the Level 1 SSObject table could dramatically speed up processing – but
it may also preclude removing nonexistent objects.
    The DRP Moving Object Pipeline represents a full-survey sequence point in the production,
    but we expect that it will be a relatively easy one to implement, because it operates on rela-
    tively small inputs (unassociated DIASources) and produces a single new table (SSObject) as
    its only major output (though IDs linking DIASources and SSObjects must also be stored in
    either DIASource or a join table). This should mean that it can be run after most other data
    products have already been ingested, while requiring little temporary storage as the rest of
    the processing proceeds tract-by-tract.
    5.6.2 ApplyCalibrations
    The processing described in the previous sections produces six tables that ultimately must
    be ingested into the public database: Source, DIASource, Object, DIAObject, SSObject, and
ForcedSource. The quantities in these tables are either in raw units (e.g. fluxes in counts, posi-
tions in pixels) or pseudo-raw relative units (e.g. coadd-pixel counts or tract pixel coordinates).
These must be transformed into calibrated units via our astrometric and photometric solu-
tions, a process we delegate to the Raw Measurement Calibration algorithmic component.
    For the pseudo-raw relative units used for coadd measurements and multifit results, these
    transformations are exact and hence do not introduce any new uncertainty, but must still be
    applied.
    This is the primary place where the wavelength-dependent photometric calibrations gener-
    ated by the Calibration Product Pipelines are applied. This will require inferring an SED for
    every object (or source) from its measured colors. The families of SEDs and the choice of
    color measurements used are subjects for future algorithmic research, but it should be possi-
    ble to resolve these questions with relatively little effort. The inferred SED must be recorded
    or deterministic, allowing science users to recalibrate as desired with their own preferred SED.
    One possible complication here is that PSF models are also wavelength dependent, and the
    SED for this purpose must be inferred much earlier in the processing. Because it is highly
    desirable that the SEDs used for PSF-dependent measurement be the same as those used
    for photometric calibration, we may need to either infer SEDs early in the processing from
    preliminary color measurements or estimate the response of measurements to changes in
    PSF-evaluation SED so it can be approximately updated later.
    TODO Reference appropriate subsection of CPP section.
    It is currently unclear when and where calibrations will be applied; there are several options:
    • We could apply calibrations to tables before ingesting them into the public database;
    this would logically create new calibrated versions of each table data product.
• We could apply calibrations to tables as we ingest them into the final database.
• We could ingest tables into temporary tables in the database and apply the calibra-
tions within the database.
Regardless of which option is chosen for each public table, the Raw Measurement Calibra-
tion algorithmic component will need to support operation both outside the database on in-
memory table data and within the database (via, e.g., user-defined functions). The former will
be needed to apply calibrations to intermediate data products for diagnostic purposes, while
the latter will be needed to allow Level 3 users to recalibrate objects according to their own
assumed SEDs.
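For concreteness, the out-of-database mode might look like the following sketch, which converts instrumental fluxes to AB magnitudes and nanojansky fluxes given a per-source zero point; the names and the scalar zero-point interface are assumptions (in practice the zero point will depend on position, time, and the assumed SED):

import numpy as np

def calibrate_fluxes(counts, counts_err, zeropoint):
    """counts: instrumental fluxes; zeropoint: AB magnitude of a
    one-count source (may itself depend on the assumed SED)."""
    with np.errstate(invalid="ignore", divide="ignore"):
        mag = zeropoint - 2.5 * np.log10(counts)
        mag_err = 2.5 / np.log(10) * counts_err / counts
    # Equivalent calibrated flux in nanojansky (AB zero point of
    # 3631 Jy corresponds to the constant 31.4 in magnitudes).
    flux_njy = counts * 10.0 ** (-0.4 * (zeropoint - 31.4))
    return mag, mag_err, flux_njy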
    5.6.3 MakeSelectionMaps
The MakeSelectionMaps pipeline is responsible for producing multi-scale maps that describe LSST’s
    depth and efficiency at detecting different classes of object. The details of what metrics will
    be mapped, the format and scale of the maps (e.g. hierarchical pixelizations vs polygons), and
    the way the metrics will be computed are all unknown.
The approach must be extensible at Level 3: science users will need to build additional maps
that large collaborations can utilize as efficiently as DM-produced maps. This will ease
the pressure on DM to provide a large suite of maps, but the details of what DM will provide
still need to be clarified to the community.
One potential major risk here is that the most common way to determine accurate depth
and selection metrics is to add fake sources to the data and reprocess, and this can require
reprocessing each unit of order 100 times. Because the reprocessing does not need to include
all processing steps (assuming the skipped steps can be adequately simulated), this should
not automatically be ruled out – if the pipelines that must be repeated (e.g. DeepDetect)
are significantly faster than the skipped steps (such as MultiFit), the overall impact on processing
could still be negligible. Regardless, the role of DM in this sort of characterization also needs
to be clarified to the community.
    TODO Cite Balrog paper (Suchyta and Huff 2016)
    5.6.4 Classification
    In its simplest realization, this pipeline computes variability summary statistics and probabilis-
    tic and/or discrete classification of each Object as a star or galaxy; this may be extended to
    include other categories (e.g. QSO, supernova).
Variability summary statistics are delegated to the VariabilityCharacterization algorithmic
component. Type classification is delegated to the ObjectClassification algorithmic
component. This may utilize any combination of morphological, color, and variability/motion
information, and may use spatial information such as galactic latitude as a Bayesian prior.
Classifications based on only morphology will also be available.
Both variability and type classification may require “training” on a representative subset of the Ob-
ject and ForcedSource tables and/or similar tables derived from special program data. Rather
than imposing a full-survey sequence point here, we’ll probably use previous data releases or
    results from a small-area validation release.
    5.6.5 GatherContributed
    This pipeline is just a placeholder for any DM work associated with gathering, building, and/or
    validating major community-contributed data products.
    In addition to data products produced by DM, a data release production also includes official
    products (essentially additional Object table columns) produced by the community. These in-
    clude photometric redshifts and dust reddening maps. While DM’s mandate does not extend
    to developing algorithms or code for these quantities, its responsibilities may include valida-
    tion and running user code at scale. The parties responsible for producing these datasets
and their relationship to DM need to be better defined in terms of policy before a system for
including community-contributed data products in a data release can be designed.
    6 Algorithmic Components
This section describes mid-level Algorithmic Components that are used (possibly in multiple
contexts) by the pipelines described in Sections 3, 4, and 5. These in turn depend on the even
lower-level Software Primitives described in Section 7. Algorithmic Components will typically
be implemented as Python classes (such as the Task classes in the codebase as of the time
this was written) that frequently delegate to C++. Unlike the pipelines discussed in previous
sections, which occupy a specific place in a production, Algorithmic Components should be
designed to be reusable in slightly different contexts, even if the baseline design only has
them being used in one place. Many components may require different variants for use in
different contexts, however, and these different variants may or may not require different
classes. These context-specific variants are identified below.
    We expect that these components will form the bulk of the LSST Science Pipelines codebase.
    6.1 Reference Catalog Construction
    6.1.1 Alert Production Reference Catalogs
Alert Production will use a subset of the DRP Object table as a reference catalog. As the DRP
Object table is regenerated on the same schedule as the template images used in Alert Pro-
duction, we should always be able to guarantee that the reference catalog and the template
images are consistent and cover the same area.
    Obtaining this catalog from the Level 2 database should simply be a matter of executing a SQL
    query, though some experimentation and iteration may be necessary to define this query.
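Purely as an illustration of the kind of query involved, a selection like the following could be issued against the Level 2 database; the table and column names and the cuts are hypothetical, and the real query will be defined by the experimentation noted above:

# Hypothetical extraction of an AP reference catalog from the Level 2
# database; schema and selection criteria are illustrative only.
REFERENCE_CATALOG_QUERY = """
SELECT objectId, ra, decl, raErr, declErr,
       psFlux_r, psFluxErr_r, extendedness
FROM   Object
WHERE  ra BETWEEN :raMin AND :raMax
  AND  decl BETWEEN :declMin AND :declMax
  AND  psFlux_r / psFluxErr_r > 10  -- well-measured sources only
"""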
    6.1.2 Data Release Production Reference Catalogs
    The reference catalog used in Data Release Production is expected to be built primarily from
    the Gaia catalog, but it may be augmented by data taken by LSST during commissioning (e.g. a
    short-exposure, full-survey layer). DRP processing will also iteratively update this catalog (uti-
    lizing LSST survey data) in the course of a single production, but it is not yet decided whether
    these changes will be propagated to later data releases.
Constructing the DRP reference catalog is thus more of a one-off activity than a reusable
software component, and much of the work will be testing and research to determine how
much we need to augment the Gaia catalog.
    6.2 Instrument Signature Removal
Two variants of the instrument signature removal (ISR) pipeline will exist for the main camera,
with the difference arising from the real-time processing constraints placed by AP. This section
outlines the baseline design for DRP, with the differences for AP given in § 6.2.1.
• Overscan subtraction: per-amplifier subtraction of the overscan levels, as either a scalar,
vector, or array offset. After cropping out the first one or two overscan rows to avoid any
potential contamination from CTI or transients, a clipped mean or median will be sub-
tracted if using a scalar subtraction (to avoid contamination by cosmic rays or bleed
trails), and a row-wise median subtracted if using a vector subtraction (a sketch of both
options appears after this list). If array subtraction turns out to be necessary (unlikely,
especially given the subtraction of a master bias frame later in the process), some thought
should be given as to how to avoid introducing extra noise to the image.
• Assembly: per-amplifier treatment of each CCD flavor (e2v and ITL sensors assemble
differently) followed by per-CCD / per-raft assembly of the CCDs onto the focal plane.
Application of the EDGE mask-bit to appropriate regions of the CCDs, i.e. around the edges
of both sensor flavors, and around the midline region of e2v sensors due to distortions
from the anti-blooming implant.
• Linearity: apply linearity correction using the master linearity table, marking regions
where the linearity is considered unreliable as SUSPECT.
• Gain correction: applied for CBP measurements where flat-fielding is not performed;
multiply by the absolute gains to convert from ADUs to electrons, and estimate the per-
pixel variance.
• Crosstalk: apply crosstalk correction to the raw data-stream from the DAQ using the
appropriate version of the master crosstalk matrix.
• Mask defects and saturation: application of the master defect list and master saturation
levels to set the BAD/SAT bit(s) in the mask plane.
• Full frame corrections:
– Bias: subtract the master bias frame.
– Dark: subtract an exposure-length multiple of the master dark frame, perhaps
including the slew time, depending on when the array is cleared.
– Flats: divide by the appropriate linear combination of monochromatic master flats.
– Fringe frames: this will involve the subtraction of a fringe frame composed of some
combination of monochromatic flats to match the night sky’s spectrum and the filter in
use at the time of observation, though the plan for how this combination will be derived
remains to be determined.
• Pixel level corrections:
– The “brighter-fatter effect”: apply brighter-fatter correction using the coefficients
from § 4.3.15. Need to add proper section about this and reference it as this is a non-
trivial ISR algorithm. Just not sure where to put it.
– Static pixel size effects: correction of static effects such as tree rings, spider legs
etc. using data from § 4.3.14. As above – needs details and referencing.
• CTE correction: the method used to correct for CTE will depend on what was needed to
fully characterize the charge transfer (see § 4.3.16).
• Interpolation over defects and saturation: interpolate over defects previously identified
using the PSF, and set the INTERP bit in the mask plane.
• Cosmic rays: identification of cosmic rays (see § 6.3.1), interpolation over cosmic rays
using the PSF, and setting of the CR/INTERP bit(s) in the mask plane.
• Generate snap difference: simple pixel-wise differencing of snaps to identify cosmic rays
and fast moving transients for removal is baselined, though a more complex process
could be involved. See also § 5.1.1.
• Snap combination: the baseline design is for a simple pixel-wise addition of snaps to
create the full-depth exposure for the visit. However, provision should be made for a
less simplistic treatment in the event that there is a non-negligible mis-registration of
the snaps arising from either the telescope pointing or atmospheric effects, e.g. in the
dome/ground layer.
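The following is a sketch of the scalar and vector overscan options from the first bullet above, assuming per-amplifier floating-point numpy arrays in which the serial overscan shares rows with the data section; the function name, the two-pixel crop, and the clipping parameters are illustrative, not the production defaults:

import numpy as np

def subtract_overscan(image, overscan, mode="vector", crop=2, clip=3.0):
    """image: (nrows, ncols) amplifier data; overscan: (nrows, nover)
    overscan pixels sharing rows with `image`. Subtracts in place."""
    # Drop the overscan pixels nearest the data section, which may be
    # contaminated by CTI or transients.
    data = overscan[:, crop:]
    if mode == "scalar":
        # Clipped mean: iteratively reject pixels beyond `clip` sigma
        # to avoid contamination by cosmic rays or bleed trails.
        good = np.ones(data.shape, dtype=bool)
        for _ in range(3):
            mean, std = data[good].mean(), data[good].std()
            good = np.abs(data - mean) < clip * std
        image -= data[good].mean()
    else:
        # Vector: one level per row via a row-wise median, which is
        # resistant to isolated bright pixels.
        image -= np.median(data, axis=1, keepdims=True)
    return image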
    6.2.1 ISR for Alert Production
The ISR for the AP pipeline differs slightly from that used in DRP due to the real-time processing
constraint.
Crosstalk correction will be performed inside the DAQ, where the most recent crosstalk matrix
will have been loaded. However, whilst there is therefore no default action for AP, in the event
of network outages where the local data buffer overflows and the crosstalk-corrected data is
lost, crosstalk correction would need to be applied (this assumes that alerts are still being
generated in this eventuality).
Flat fielding will be performed as for DRP, but because the night sky’s spectrum will not be
available to AP, the fringe frame subtracted will either be some nominal fringe frame, or one
taken from an array of pre-computed composite fringe frames with the sky-matching per-
formed using PCA on-the-fly.
Does AP plan on performing “brighter-fatter effect” and tree-ring corrections? There is no
obvious reason why it shouldn’t (and it would likely need to if they were of DECam’s magnitude),
but this remains to be confirmed and should be documented here if it won’t.
    6.3 Artifact Detection
    6.3.1 Cosmic Ray Identification
    The need for a morphological cosmic ray rejection algorithm is motivated on multiple fronts.
Firstly, the science pipelines are explicitly required to run on visits that are taken in the tra-
ditional way, i.e. one single continuous integration, and in a series of snaps where multiple
exposures are taken in series and then aggregated in ISR (OSS-REQ-0288). Even when the pipelines have the
    luxury of having multiple exposures taken in quick succession at the same pointing, we can-
    not simply reject anything in the difference of the snaps as a cosmic ray since we wish to be
    able to utilize measurements on the snap differences to identify very rapidly varying objects.
Given the need for the morphological cosmic ray detection algorithm, we will, as the baseline,
adopt an algorithm similar to that used in the SDSS photo pipeline. The baseline algorithm
requires some modest knowledge of the PSF and looks for features that are sharp relative to
    the PSF. Qualitatively, this works well on a variety of data though does require some tuning
    depending on the inputs.
[33] present an alternative algorithm that does not depend on knowledge of the PSF, but
instead assumes that cosmic ray features will be sharp from pixel to pixel. If an algorithm
different from the baseline is necessary, an algorithm like the one described in [33] would be
an option for exploration.
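To illustrate the idea of PSF-relative sharpness testing, the following simplified sketch flags pixels that are both statistically significant and much brighter than their PSF-smoothed neighborhood predicts; the thresholds are placeholders, and this is a schematic of the approach rather than the SDSS algorithm itself:

import numpy as np
from scipy.ndimage import convolve

def flag_cosmic_rays(image, variance, psf_kernel, nsigma=6.0, contrast=2.0):
    """Return a boolean mask of cosmic-ray candidate pixels.
    psf_kernel: small normalized PSF image used to predict how much
    flux a true (PSF-convolved) source can concentrate in one pixel."""
    smooth = convolve(image, psf_kernel, mode="nearest")
    # Candidate pixels are significant detections...
    significant = image > nsigma * np.sqrt(variance)
    # ...that are too sharp: the pixel exceeds what the PSF-smoothed
    # neighborhood predicts by more than `contrast`.
    sharp = image > contrast * smooth
    return significant & sharp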
    6.3.2 Optical ghosts
    We will have a set of optical ghost models. Some of these will be models of stationary ghosts
    (e.g. pupil ghost). Others will be a set of ghosts produced by point sources as a function
    of source position and brightness. The structure of the stationary ghosts can be measured
    using stacked, dithered star fields. The latter will likely be modeled using raytracing tools or
    measured using projectors.
The stationary ghosts will need to be fit for, since they will depend on the total light through
the pupil rather than on the brightness of a given source, and we do not expect to have the
data necessary to compute the total flux over the focal plane in a single thread in the alert
production processing. Using the fit to the stationary models $G_i$ and the predictions of the
single source ghosts $g_j$, we will construct a ghost image

$I_{\mathrm{ghost}} = \sum_i a_i G_i + \sum_j g_j,$

where $i$ runs over the stationary ghost models and $j$ runs over the sources contributing to
single source ghosts. We can then correct the image by:

$I_{\mathrm{corrected}} = I - I_{\mathrm{ghost}}.$
    It may not be possible to do point source ghost correction in alert production. We will know
    the model of the point source ghosts, but we will not know the location of the bright sources
    in other chips. Since point source ghosts can appear at significant separations, this may be a
    source of spurious detections.
    6.3.3 Linear feature detection and removal
Satellite trails, out of focus airplanes, and meteors all cause long linear features in astronomi-
cal images. The Hough Transform [13] is a common tool used in computer vision applications
to detect linear features. Linear features are parameterized by $r$, the perpendicular distance
to the line from the origin, and $\theta$, the angle of the line with the x-axis. The $(r, \theta)$-space is binned
and each pixel in the image adds its flux to all the bins consistent with that pixel location. For
bright linear features, the bin at the true location of the feature will fill up because more than
one bright pixel is contributing to that location in parameter space. After all pixels have been
polled, the highest bins correspond to the linear features in the image.
    This works very well in high signal-to-noise images, but is very computationally expensive. It
    is also susceptible to bright point sources overwhelming faint linear features.
The Hough transform is the correct tool for finding linear features when the feature has high
signal to noise. Since this is not always true in astronomical images, it’s necessary to use
some form of a modified Hough transform that preferentially boosts the signal to noise of
linear features in high dynamic range data.
One could imagine a variety of ways to do this. For example, [6] first apply an edge detec-
tion algorithm to pull out linear features and then use the result to line up the edges to form
    longer linear features. Another approach suggested by Steve Bickerton (HSC; private commu-
    nication) is to compute the PCA of pixel values in a localized region around each pixel. In linear
    features, there will be a high amount of correlation in a certain direction for the surrounding
    pixels. This effectively boosts the linear feature’s signal to noise in the PCA image and can
    produce a linear feature mask by simply applying a threshold.
    The baseline for the science pipelines will be to use a modified Hough transform to identify
    and mask linear features in visit images.
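A direct implementation of the flux-weighted voting described above is only a few lines; the binning choices here are arbitrary, and a production version would need the signal-to-noise-boosting modifications just discussed:

import numpy as np

def hough_lines(image, n_theta=180, n_r=512):
    """Flux-weighted Hough accumulator over (r, theta) bins, where
    r = x cos(theta) + y sin(theta) parameterizes each line."""
    ny, nx = image.shape
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    r_max = np.hypot(nx, ny)
    accumulator = np.zeros((n_r, n_theta))
    ys, xs = np.nonzero(image > 0)  # consider only positive-flux pixels
    flux = image[ys, xs]
    for j, theta in enumerate(thetas):
        r = xs * np.cos(theta) + ys * np.sin(theta)
        idx = ((r + r_max) * (n_r - 1) / (2 * r_max)).astype(int)
        np.add.at(accumulator[:, j], idx, flux)  # flux-weighted votes
    return accumulator, thetas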
    6.3.4 Snap Subtraction
    Cosmic Rays
When subtracting snaps to form a visit, we will still need to run some sort of
    morphological identifier like the one outlined above to identify cosmic rays. This is because
    there will be real transients and we still only want to pick out the sharp features as CRs. It
    will also help to have less crowding, so we should do CR rejection on the snap difference if we
    have it.
    Ghosts
    Snap differences will not help with ghosting as the ghosts should difference almost
    perfectly.
    Linear features
    Snap differences will provide significant leverage for masking linear fea-
    tures. Since each segment will appear in at most one snap we can mask based on the pixels
    marked as detected in the difference images that are part of the trail. This will help in crowded
    regions. This technique will require running some sort of trail detection algorithm, but the re-
    quirements will be less stringent since the image will be so much less crowded.
    6.3.5 Warped Image Comparison
Additional artifacts will be detected in DRP by comparing multiple visits that have already been
resampled to the same coordinate system. This is conceptually similar to Snap Subtraction,
but will operate quite differently in practice, in that we do not expect to combine this stage
with the morphological detection stages; instead we assume that everything we can detect
morphologically will have already been detected.
    Instead, this stage will examine the full 3-d data cube (two spatial dimensions as well as the
    epoch dimension) for outliers in the epoch dimension that are contiguous in the spatial dimen-
    sions. This is an extension of traditional coadd outlier-rejection, which can cause spurious
    rejections of single pixels (or small groups of pixels) due to noise and differing PSFs. This can
    obviously detect astrophysical transients as well as image artifacts, and this is usually desir-
    able; this stage is responsible for determining which pixels should contribute to our definition
    of the static sky, and we want to reject astrophysical transients from that as well.
The largest challenge for this algorithm is probably handling highly variable astrophysical
sources that are nevertheless present in most epochs. For these, defining the static sky
is more subjective, and we may need to modify our criteria for rejecting a region on a visit as
an outlier.
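A minimal sketch of the core operation, assuming the resampled visits are stacked into an (epoch, y, x) cube: compute a robust per-pixel center and scale along the epoch axis, then keep only spatially contiguous outliers. The thresholds and the contiguity cut are placeholder choices:

import numpy as np
from scipy.ndimage import label

def epoch_outliers(cube, nsigma=5.0, min_pixels=4):
    """cube: (n_epoch, ny, nx) array of PSF-matched, resampled visits.
    Returns a boolean mask of the same shape marking artifact candidates."""
    center = np.nanmedian(cube, axis=0)
    # 1.4826 * MAD robustly approximates a Gaussian sigma.
    sigma = 1.4826 * np.nanmedian(np.abs(cube - center), axis=0)
    outliers = np.abs(cube - center) > nsigma * sigma
    # Keep only spatially contiguous outlier regions in each epoch;
    # isolated single-pixel rejections are likely noise or PSF mismatch.
    for k in range(cube.shape[0]):
        labels, n = label(outliers[k])
        if n:
            sizes = np.bincount(labels.ravel())
            small = sizes[labels] < min_pixels
            outliers[k][small & (labels > 0)] = False
    return outliers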
    6.4 Artifact Interpolation
This component is responsible for interpolating over small (PSF scale or smaller) artifacts such
as cosmic rays. By utilizing the PSF model, this interpolation should be good enough that many
downstream algorithms do not need to worry about masked pixels (especially those that do
not have a built-in formalism for missing data, such as aperture fluxes or second-moment
shapes). Interpolated pixels will also be masked (both as interpolated and with a bit indicating
the reason why).
This will likely use Gaussian processes, but the existing implementation in the stack should be
considered a placeholder, as it only interpolates in one direction (to deal with satellite
trails).
Artifact interpolation will not handle regions significantly larger than the PSF size; these must
either be subtracted or masked.
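As an illustration of the Gaussian-process approach mentioned above, the following sketch conditions a squared-exponential GP on the unmasked pixels of a small, background-subtracted stamp and replaces masked pixels with the posterior mean; the kernel and its scales are placeholders, since the choice of method is not settled:

import numpy as np

def gp_interpolate(image, mask, length_scale=2.0, noise=0.1):
    """Replace pixels where mask is True with the GP posterior mean
    conditioned on the unmasked pixels. Assumes a background-subtracted
    stamp (zero-mean GP); suitable only for small regions."""
    yx_good = np.argwhere(~mask).astype(float)
    yx_bad = np.argwhere(mask).astype(float)
    z = image[~mask]

    def kern(a, b):
        # Squared-exponential covariance between two coordinate sets.
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length_scale**2)

    K = kern(yx_good, yx_good) + noise**2 * np.eye(len(yx_good))
    Ks = kern(yx_bad, yx_good)
    out = image.copy()
    out[mask] = Ks @ np.linalg.solve(K, z)  # GP posterior mean
    return out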
    6.5 Source Detection
Detection is responsible for identifying new sources in a single image, yielding approximate
positions and significance estimates. We expect the same algorithm (or only slightly different algo-
rithms) to be run on single-visit direct images, difference images, and coadds. For difference
images, this must include detection of negative and dipole sources.
The output of detection is a set of Footprints (containing peaks).
In the limit of faint, isolated objects and white noise, detection should be equivalent to a max-
imum likelihood threshold for (at least) point sources, which can be achieved by correlating
an image with its PSF and thresholding (a sketch of this matched-filter approach appears after
the list below). Other approaches may be necessary for different classes of objects, such as:
• In crowded stellar fields, we expect to need to detect iteratively while subtracting the
brightest objects at each iteration (see e.g. Section 5.1.1).
    • Optimal detection of diffuse galaxies may require correlating with kernels broader than
    the PSF.
• When blending or any significant sub-threshold objects are present, the noise proper-
ties may be sufficiently different from the usual assumptions in maximum-likelihood
detection to invalidate those methods, and an alternate approach may be necessary.
    • When processing preconvolved difference images or likelihood coadds, detection will
    need to operate on images that have already been correlated with the desired filter.
    • When operating on non-likelihood coadds and standard difference images, detection
    may need to operate on images with significant correlated noise.
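The matched-filter detection referred to before this list can be sketched in a few lines: correlate the image with the PSF, normalize by the propagated noise, and threshold. The image is assumed background-subtracted with approximately white per-pixel variance, and the threshold is a placeholder:

import numpy as np
from scipy.ndimage import convolve, maximum_filter

def detect(image, variance, psf, threshold=5.0):
    """Return (significance map, peak mask) for point-source detection."""
    # Correlating with the (assumed symmetric) PSF makes each pixel the
    # maximum-likelihood flux estimate for a point source centered there.
    numer = convolve(image, psf, mode="nearest")
    denom = np.sqrt(convolve(variance, psf**2, mode="nearest"))
    snr = numer / denom
    # Above-threshold local maxima become detection peaks; contiguous
    # above-threshold regions would become Footprints.
    peaks = (snr > threshold) & (snr == maximum_filter(snr, size=3))
    return snr, peaks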
In deep DRP processing (Section 5.3), detection is closely tied to the deep association and deep
deblending algorithms, and may change significantly from the baseline plan based on devel-
opments in those algorithms. For example, we may need to adopt a multi-scale approach
to these operations (and background estimation) that essentially merges them into a single
algorithmic component with no well-defined boundaries.
    6.6 Deblending
    Deblending is both one of the most important and one of the most difficult algorithmic chal-
    lenges for LSST, and our plans for deblending algorithms are best described as a research
    project at this stage.
The baseline interface takes as input a set of Footprints (including peaks), possibly merged
from detections on several images (see ObjectGeneration), and a set of images (related to, but
not necessarily identical to, the set of detection images). It returns a tree of HeavyFootprints
that contain the deblended images of objects. The tree may have multiple levels, indicating
a sequence of blend hypotheses that subdivide an image into more and more objects. There
may be different HeavyFootprints for each deblended image (at least one for every band),
making the size of all HeavyFootprints comparable to the size of the image data, at least for
coadds. Depending on the deblending algorithm chosen, a more compact representation of
the deblend results may be possible (one that would allow the full HeavyFootprints to be
regenerated quickly from the image data).
As deblending may involve simultaneous fitting of galaxy and point source models, it may
also output the parameters of these models directly as measurements, in addition to gener-
ating a pixel-level separation of neighboring objects that can be used by other measurement
algorithms via NeighborReplacement.
Deblending large objects is also closely related to background estimation. Some science cases
(focusing on small, rare objects) may prefer aggressive background subtraction that removes
astrophysical backgrounds such as intra-cluster light or galactic cirrus, while other science
    cases obviously care about preserving these structures (as well as the wings of bright galaxies,
    which are frequently difficult to model parametrically). Rather than produce independent
    catalogs with different realizations of the background, it makes more sense to include these
    smaller-scale astrophysical background features in the deblend tree, which already provides
    a way to express multiple interpretations of the sky.
    The baseline approach to deblending involves the following steps:
1. Define a “template” for each potential object in the blend (a model that at least approxi-
mately reproduces the image of the object).
2. Simultaneously fit the amplitudes of all templates in the blend.
3. Remove redundant templates/objects according to some criteria (and loop back to the
first step).
4. Apportion each pixel’s flux to objects according to the value of the object’s amplitude-
scaled template at the position of that pixel divided by the sum of all amplitude-scaled
templates.
    Regardless of the templates used, this approach strictly preserves flux, and it can preserve
    the morphology of even complex objects in the limit that they are widely separated. The
    complexity in this approach is of course in the definition of templates and the procedure for
    dealing with redundancy.
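Step 4 above is simple enough to write down directly; the sketch below apportions pixel flux in proportion to the amplitude-scaled templates, and flux preservation holds by construction wherever the summed template is nonzero:

import numpy as np

def apportion(image, templates, amplitudes, eps=1e-12):
    """image: (ny, nx); templates: (nchild, ny, nx) nonnegative models;
    amplitudes: (nchild,) fitted amplitudes from the simultaneous fit.
    Returns (nchild, ny, nx) deblended child images summing to `image`."""
    scaled = amplitudes[:, None, None] * templates
    total = scaled.sum(axis=0)
    weights = scaled / np.maximum(total, eps)  # flux fraction per child
    return weights * image[None, :, :]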
    The deblender in the SDSS Photo pipeline uses a rotational symmetry ansatz to derive tem-
    plates directly from the images. This approach is probably too underconstrained to work in
    the deeper, more blended regime of LSST, and hence we plan to try at least using various
    parametric models (both PSF-convolved and not). An ansatz that requires each object to have
an approximately uniform color over its image may also be worth exploring, and we may also
    investigate other less-parametric models such as Gaussian mixtures, wavelet decompositions,
    or splines. Hybrid approaches, such as using a symmetry ansatz for the brightest object(s) in
    blends and more constrained models for the rest, will also be explored.
This approach yields only one level of parent/child relationships in the output blend
tree; each peak in a blend generates at most one child, and all peaks have the same parent.
To extend this to the multi-level tree we expect to need to support all science cases, we expect
to repeat this approach at multiple scales – though it is currently unclear exactly how we will
treat each scale differently; some possibilities include multiple detections with different
spatial filters and building a tree of peaks based on their detection significance and location.
    A key facet of any approach to deblending is to utilize the PSF model as a template for any
    children that can be safely identified as unresolved. This provides a way to build a deblender
that can operate in crowded stellar fields as effectively as traditional crowded-field codes:
    as the density of the field increases (either as detected by the algorithm or as a function of
    position on the sky), we can increase the probability with which we identify objects as unre-
    solved. The simultaneous template fit then becomes a simultaneous fit of PSF models, and
    if we iterate this procedure with detection (after subtracting previously-fit stars), we recover
    the traditional crowded-field algorithm.
    A final major challenge in developing an adequate deblender is characterizing its performance.
    Not only do we lack quantitative requirements on the deblender’s performance, we also lack
    metrics that would quantify improvement in the deblender across science cases. Poor de-
    blender performance will clearly impact existing science requirements, but this sort of indi-
    rect testing makes iterative improvement more difficult, and it is certain that some deblender
    failure modes will adversely affect important science cases without affecting any existing re-
    quirements. Deblender development will thus have to also include significant work on char-
    acterizing deblender performance.
    6.6.1 Single Frame Deblending
In single-frame processing (e.g. DRP’s BootstrapImChar and possibly AP’s Single Frame Pro-
cessing pipelines), deblending will be run on individual CCD images, which requires that it
work without any access to color information and in some cases with only a preliminary model
of the PSF (since it may be run before a quality PSF model has been fit).
    Because single-epoch images are shallower than coadds, we expect blending to be less severe
    than in coadds. Combining this with the fact that only a single image is being operated on, it
    is unlikely the single-epoch deblender will be constrained by memory even if run in a single
    thread.
    6.6.2 Multi-Coadd Deblending
    Deep deblending on coadds will require a deblender that can simultaneously process a suite
    of coadds. This will include at least the deep coadds for each band, but it may also include
    short-period coadds (again, for each band) and possibly cross-band coadds. Merely keeping
    all of these in memory together would probably necessitate multithreading to avoid requiring
    more memory/core than most other pipeline algorithms, but we also expect the number of
    objects in blends to be large on average and extreme in the worst case, and memory use by
    the deblender scales with this as well. This will almost certainly require some sort of divide-
    and-conquer approach in addition to some combination of the already-complex algorithmic
    concepts described above.
The outputs of the deep deblender will need to be “projected” to images other than the coadds
actually used by the deblender. This includes at least PSF-matched coadds (which will have
the same pixel grid but different PSFs) and possibly individual epoch images (which will have
different pixel grids and different PSFs) for forced photometry and multi-epoch fitting; see
Section 6.7.3 for more information.
    6.7 Measurement
Source and object measurement involves a suite of algorithmic components at different levels;
it is best thought of as a matrix of drivers and algorithms (see Figure 12). Drivers correspond
to a certain context in which measurement is performed, and are described in Section 6.7.1.
Drivers iterate (possibly in parallel) over all sources or objects in their target image(s), and
execute measurement algorithms on each; each measurement algorithm (see Section 6.7.2)
processes either a single object or a group of blended objects. One of the main tasks of the
drivers is to help the algorithms measure blended objects; while some algorithms may handle
blending internally by simultaneous fitting (Section 6.7.3), most will be given deblended pixels
by the driver, which will utilize deblender outputs and the neighbor-replacement procedure
described in Section 6.7.3 to provide the algorithms with deblended images.
    6.7.1 Drivers
    Measurement is run in several contexts, but always consists of running an ordered list of
    algorithm plugins on either individual objects or families thereof. Each context corresponds to
Figure 12: Matrix showing combinations of measurement variants, algorithms, and deblend-
ing approaches that will be implemented. Variants: Single Visit, Multi-Coadd, Difference Image,
Multi-Epoch, Forced. Algorithms: Centroiders, Second-Moment Shapes, Aperture Photometry,
Static Point Source Models, Petrosian Photometry, Kron Photometry, Galaxy Models, Moving
Point Source Models, Trailed Point Source Models, Dipole Fitting, Spuriousness. Deblending:
Replace Neighbors, Simultaneous Fitting. Legend: cells mark Variant-Algorithm or Variant-
Deblending combinations that are implemented and will be used; the photometry algorithms
are also run in single-visit mode only to calculate their aperture corrections; both deblending
approaches are implemented and compared, and either or both may be used depending on
test results; deblending for some measurement variants will be implemented only if needed
after testing with no deblending.
a different variant of the measurement driver code, and has a different set of plugin algorithms
    and approaches to measuring blended objects.
Single Frame Measurement:
Measure a direct single-visit CCD image, assuming deblend
information already exists and can be used to replace neighbors with noise (see 6.7.3).
Single Frame Measurement is run in both AP’s Single Frame Processing pipeline and DRP’s
BootstrapImChar, RefineImChar, and FinalImChar.
The driver for Single Frame Measurement is passed an input/output SourceCatalog and an Ex-
posure to measure. Plugins take an input/output SourceRecord and an Exposure containing
only the object to be measured.
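Schematically, the plugin contract just described might look like the following; the class and method names are hypothetical stand-ins for the measurement framework’s actual API:

class SingleFramePlugin:
    """Illustrative base class for Single Frame Measurement plugins."""

    def measure(self, record, exposure):
        """record: input/output SourceRecord for one object.
        exposure: Exposure in which neighbors have already been
        replaced with noise, so only this object's pixels remain.
        Implementations fill in their columns of `record`."""
        raise NotImplementedError

class CentroidPlugin(SingleFramePlugin):
    def measure(self, record, exposure):
        # e.g. compute a PSF-corrected centroid from the pixels near
        # the record's Footprint and store it on the record.
        ...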
Multi-Coadd Measurement:
Simultaneously measure a suite of coadds representing differ-
ent bandpasses, epoch ranges, and flavors. This is run only in DRP’s MeasureCoadds pipeline.
The driver for Multi-Coadd Measurement is passed an input/output ObjectCatalog and a dict
of Exposures to be measured. Plugins take an input/output ObjectRecord and a dict of Ex-
posures, each containing only the object to be measured. Some plugins will also support
simultaneous measurement of multiple objects, which requires that they be provided the subset
of the ObjectCatalog to be measured and a dict of Exposures containing just those objects.
Difference Image Measurement:
Measure a difference image, potentially using the asso-
ciated direct image as well. Difference image measurement is run in AP’s AlertDetection
pipeline and DRP’s DiffIm pipeline.
The signatures of difference image measurement’s drivers and algorithms are at least some-
what TBD; they will take at least a difference image Exposure and a SourceCatalog/SourceRe-
cord, but some plugins such as dipole measurement may require access to a direct image
as well. Because difference imaging dramatically reduces blending, difference image mea-
surement may not require any approach to blended measurement (though any use of the
associated direct image would require deblending).
    If preconvolution is used to construct difference images, but they are not subsequently decor-
    related, the algorithms run in difference image measurement cannot be implemented in the
    same way as those run in other measurement variants, and algorithms that cannot be ex-
    pressed as a PSF-convolved model fit (such as second-moment shapes and all aperture fluxes)
    either cannot be implemented or require local decorrelation.
Multi-Epoch Measurement:
Measure multiple direct images simultaneously by fitting the
same WCS-transformed, PSF-convolved model to them. Blended objects in Multi-Epoch Mea-
surement will be handled at least by fitting them simultaneously (6.7.3), which may in turn
require hybrid galaxy/star models (6.7.3). These models may then be used as templates for
deblending and replace-with-noise measurement (6.7.3) if this improves the results.
Because the memory and I/O requirements for multi-epoch measurement of a single object
or blend family are substantial, we will not provide a driver that accepts an ObjectCatalog
and measures all objects within it; instead, the pipeline will submit individual family-level jobs
directly to the orchestration layer. The multi-epoch measurement driver will thus just operate
on one blend family at a time, and manage blending while executing its plugin algorithms.
Multi-epoch measurement for DRP only includes two plugin algorithms, so it is tempting to
simply hard-code these into the driver itself, but this driver will also need to support new
plugins in Level 3.
Multi-epoch measurement will also be responsible for actually performing forced photometry
on direct images, which it can do by holding non-amplitude parameters for moving point-
source models fixed and adding a new amplitude parameter for each observation.
    Forced Measurement:
    Measure photometry on an image using positions and shapes from
    an existing catalog.
    In the baseline plan, we assume that forced measurement will only be run on difference im-
    ages; while forced photometry on direct images will also be performed in DRP, this will be
    done in the course of multi-epoch measurement.
    Because difference imaging reduces blending substantially, forced measurement may not re-
    quire any special handling of blends. If it does, simultaneous fitting (with point-source models)
    should be sufficient.
The driver for Forced Measurement is passed an input/output SourceCatalog, an additional
input ReferenceCatalog, and an Exposure to measure. Plugins take an input/output SourceRe-
cord, an input ReferenceRecord, and an Exposure. If simultaneous fitting is needed to mea-
sure blends, plugins will instead receive subsets of the catalogs passed to the driver rather
than individual records.
Forced measurement is used by the DRP ForcedPhotometry pipeline and numerous pipelines
in AP.
    Add references to specific AP pipelines that will use forced measurement.
    6.7.2 Algorithms
    Centroids
    Centroid measurements are run on single images to measure the position of
    objects. Despite the name, these don’t measure just the raw centroid of the photons that
    correspond to an object; we generally also expect our centroid algorithms to correct for off-
    sets introduced by convolution with the PSF. While they may not be implemented this way,
    centroid algorithms should thus return results that are equivalent to the best-fit position pa-
    rameters of a PSF-convolved symmetric model. This model should be a delta function for
    unresolved objects and something approximately matched to the inferred size of extended
    objects.
When run in the very first stages of processing, a full PSF model will not be available, making
PSF correction impossible; in this case centroid measurements will be expected to yield the
raw centroid of the light. Note that this must still be corrected for any weighting function
used by the algorithm.
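As an illustration of the weight-function correction described above, the following is a minimal numpy sketch (not LSST pipeline code) of an iterative Gaussian-weighted centroid; recentring the weight on each new estimate removes the bias a fixed weight would otherwise introduce for a symmetric source. The function name and defaults are illustrative.

import numpy as np

def weighted_centroid(image, x0, y0, sigma=2.0, n_iter=10, tol=1e-6):
    # Iteratively recentre a Gaussian weight on the current estimate; for a
    # symmetric source this converges to the weight-corrected centroid.
    yy, xx = np.indices(image.shape)
    x, y = float(x0), float(y0)
    for _ in range(n_iter):
        w = np.exp(-0.5 * ((xx - x) ** 2 + (yy - y) ** 2) / sigma ** 2)
        flux = np.sum(w * image)
        x_new = np.sum(w * image * xx) / flux
        y_new = np.sum(w * image * yy) / flux
        if abs(x_new - x) < tol and abs(y_new - y) < tol:
            break
        x, y = x_new, y_new
    return x, y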
Centroids will probably be run independently on each coadd during Multi-Coadd Measure-
ment, to allow for centroid shifts due to proper motion in short-period coadds. Centroid mea-
surements are superseded by Moving Point Source Models and Galaxy Models in MultiFit,
which impose different models for centroid differences between epochs that are consistent
with morphology. Forced Measurement in production will never include centroid measure-
ment, as the goal is explicitly to measure photometry at predetermined positions, but it may
    be useful to have the capability to centroid in forced measurement for diagnostic purposes.
Pixel Flag Aggregation
The pixel flag "measurement algorithm" simply computes summary statistics of masked
pixels in the neighborhood of the source/object. This provides a generic way to identify
objects affected by e.g. saturation or cosmic rays, while allowing other measurement
algorithms to ignore these problems (especially when they have been interpolated over).
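A minimal sketch of the idea, assuming hypothetical mask-plane names and bit assignments (the real pipeline would look these up from the exposure's mask dictionary rather than hard-coding them):

import numpy as np

# Hypothetical mask-plane bit assignments, for illustration only.
MASK_BITS = {"SAT": 0, "CR": 1, "INTRP": 2}

def aggregate_pixel_flags(mask, footprint_mask):
    # Record whether each mask plane is set anywhere in the object's footprint.
    flags = {}
    for name, bit in MASK_BITS.items():
        flags["flag_" + name.lower()] = bool(
            np.any(mask[footprint_mask] & (1 << bit)))
    return flags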
Second-Moment Shapes
Shape measurements here are defined as an estimate of a characteristic ellipse for a source
or object that does not attempt to correct for the effect of the PSF, corresponding to the
second moments of its image. To make this measurement practical
    in the presence of noise, a weight function must be used, and our baseline plan is to use an
    elliptical Gaussian matched (adaptively) to the shape of the image. This may be unstable for
    sources with extremely low SNR, and for these the PSF, a fixed circular Gaussian, or a top-hat
    may be used as the weight function. We may also include regularization that ensures the size
    of the object is no smaller than the size of the PSF.
To enable downstream code to correct shapes for the PSF, the shape algorithm must also
measure the moments of the PSF model at the position of every object or source (though
we expect the best PSF-corrected shape measures for galaxies to come from Galaxy Model
Fitting).
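The adaptive weight-matching iteration can be sketched as follows (illustrative only; production code would also refine the centroid, guard against non-convergence and low SNR, and measure the PSF model's moments as described above):

import numpy as np

def adaptive_moments(image, x0, y0, n_iter=100, tol=1e-6):
    yy, xx = np.indices(image.shape)
    dx, dy = xx - x0, yy - y0
    ixx, iyy, ixy = 2.0, 2.0, 0.0  # start from a circular Gaussian weight
    for _ in range(n_iter):
        det = ixx * iyy - ixy ** 2
        w = np.exp(-0.5 * (iyy * dx**2 - 2.0 * ixy * dx * dy + ixx * dy**2) / det)
        flux = np.sum(w * image)
        mxx = np.sum(w * image * dx ** 2) / flux
        myy = np.sum(w * image * dy ** 2) / flux
        mxy = np.sum(w * image * dx * dy) / flux
        # For a Gaussian source, moments measured with a matched Gaussian
        # weight are half the true moments, so the matched weight is twice
        # the measurement; iterate until the weight stops changing.
        nxx, nyy, nxy = 2.0 * mxx, 2.0 * myy, 2.0 * mxy
        if max(abs(nxx - ixx), abs(nyy - iyy), abs(nxy - ixy)) < tol:
            break
        ixx, iyy, ixy = nxx, nyy, nxy
    return ixx, iyy, ixy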
Aperture Photometry
Aperture photometry here refers to fluxes measured within a sequence of fixed-size (i.e. the
same for all objects) circular or elliptical annuli. The radii of the annuli will be logarithmically
spaced, though fluxes at the largest radii will not be measured for objects significantly
smaller than those radii. Together these aperture fluxes will provide a measurement of the
radial profile of the object.
The baseline plan for LSST is to use circular apertures, but we also plan to investigate using
ellipses, which would provide more meaningful and higher-SNR measurements if problems in
robustly defining per-object (and perhaps per-radius) ellipses for faint objects can be solved.
While aperture fluxes with radii much larger than the pixel size can be measured naively by
simply summing pixel values, smaller apertures will be measured using the sinc interpolation
algorithm of [5], which integrates exactly over sub-pixel regions. To avoid contamination from
bleed trails when measuring heavily saturated objects, we plan to measure fluxes within az-
imuthal segments of annuli instead of full circular regions; the flux within any contaminated
segments can be replaced by the mean of the remaining segments (thus assuming approxi-
mate circular symmetry).
For aperture fluxes with radii close to the PSF size to be scientifically useful, they must be
measured on PSF-matched images. We thus plan to run aperture photometry only
on PSF-matched coadds and visit-level images, with the latter accompanied by a caution that
smaller apertures may not be meaningful without user-level correction for the PSF.
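A naive whole-pixel version of the annular fluxes can be sketched as below; the sinc-interpolation algorithm of [5] replaces the hard pixel membership test for radii near the pixel scale, and is not reproduced here.

import numpy as np

def annular_fluxes(image, x0, y0, radii):
    # Sum whole pixels in each circular annulus; adequate only when the
    # annulus radii are much larger than a pixel.
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - x0, yy - y0)
    return np.array([image[(r >= r_in) & (r < r_out)].sum()
                     for r_in, r_out in zip(radii[:-1], radii[1:])])

# Example: logarithmically spaced aperture radii (values illustrative).
radii = np.geomspace(2.0, 64.0, num=10)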
Static Point Source Photometry
In single-visit, difference image, and forced measurement, PSF fluxes will be measured with
the position held fixed at the value determined by the Centroid algorithm, with only the
amplitude allowed to vary. We will not use per-pixel weights to fit for the amplitude (as these
can lead to bias as a function of magnitude when the PSF model is slightly incorrect), but we
will use per-pixel variances to compute the uncertainty on the flux. PSF fluxes will be
aperture corrected (see Section 6.12).
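The amplitude-only fit described above reduces to a one-parameter linear least-squares problem; a minimal sketch (assuming psf_model is the PSF realized at the fixed centroid, normalized to unit sum, on the same pixel grid as the image):

import numpy as np

def psf_flux(image, variance, psf_model):
    # Unweighted least-squares amplitude fit, per the text: per-pixel
    # weights are deliberately not used in the fit, but per-pixel
    # variances are propagated into the flux uncertainty.
    p = psf_model.ravel()
    norm = np.dot(p, p)
    flux = np.dot(p, image.ravel()) / norm
    flux_err = np.sqrt(np.dot(p ** 2, variance.ravel())) / norm
    return flux, flux_err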
    In multi-coadd measurement we may use either static point source models or moving point
    source models to estimate PSF fluxes; this depends on the number and depth of short-period
    coadds, and hence it is likely we will use static point source models for early data releases
    and moving point source models near the end of the survey. In either case we expect these
measurements to be entirely superseded for science purposes by multi-epoch fitting results
    using a moving point-source model; these measurements on coadds are largely for QA and
    to warm-start multi-epoch fitting.
Kron Photometry
Kron fluxes are aperture fluxes measured with an aperture radius set to some multiple
(usually 2 or 2.5) of the Kron radius, which is defined as:

R_Kron = Σ_r r I(r) / Σ_r I(r)
In our implementation, we use an elliptical aperture (and compute the above radius using
elliptical moments), using the Second-Moment Shape to set the ellipticity.
Measuring the Kron radius itself is difficult in the presence of noise; as with any moment
measurement, pixels far from the center with low SNR are given higher weight than central
pixels with high SNR. In practice, the sums over pixels in the Kron radius definition must be
truncated at some point, and the resulting Kron radius can be sensitive to this choice. Our
current approach truncates the sums at a fixed multiple of the Second-Moment Shape
ellipse. This may be less robust than an adaptive approach, but it more closely matches the
procedure used by SExtractor's MAG_AUTO, which is by far the most popular implementation
of Kron photometry in astronomy.
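A circular-aperture sketch of the computation (production code uses elliptical moments and apertures with the ellipticity from the Second-Moment Shape; the truncation radius and the multiplier k are the choices discussed above):

import numpy as np

def kron_radius_and_flux(image, x0, y0, r_trunc, k=2.5):
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - x0, yy - y0)
    inside = r < r_trunc  # truncate the moment sums, as discussed above
    r_kron = np.sum(r[inside] * image[inside]) / np.sum(image[inside])
    flux = image[r < k * r_kron].sum()  # Kron flux within k * R_Kron
    return r_kron, flux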
Petrosian Photometry
Need to get RHL to write this section.
• Compute the Petrosian radius (a sketch of the conventional definition follows this list).
• Requires taut splines and a more robust measurement of the standard elliptical aperture
suite.
• Compute the flux in elliptical apertures at multiples of the Petrosian radius.
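Pending the full text of this section, the following sketch uses the conventional definition of the Petrosian radius (the radius at which the local surface brightness falls to a fixed fraction η, commonly 0.2, of the mean surface brightness interior to it); the circular apertures and all names here are illustrative, not the eventual LSST algorithm.

import numpy as np

def petrosian_radius(image, x0, y0, radii, eta=0.2):
    # Return the smallest trial radius at which the surface brightness in
    # a thin annulus drops below eta times the mean interior surface
    # brightness (conventional definition; radii must be increasing).
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - x0, yy - y0)
    for r_p in radii:
        sb_local = image[(r > 0.9 * r_p) & (r < 1.1 * r_p)].mean()
        sb_mean = image[r < r_p].sum() / (np.pi * r_p ** 2)
        if sb_local < eta * sb_mean:
            return r_p
    return None  # profile never crossed the threshold within the trial radii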
Galaxy Models
Galaxy models will be fit to all objects in both Multi-Epoch Measurement and Multi-Coadd
Measurement. Coadd fitting may be performed only on deep coadds and used to warm-start
multi-epoch fitting that would supersede it, but it may also be run on PSF-matched coadds
to generate consistent colors (the consistent colors referred to by the DPDD may be derived
from galaxy models fit to PSF-matched coadds or aperture fluxes on PSF-matched coadds,
but they may also be derived from multi-epoch fitting results).
The baseline plan for the galaxy models themselves is a restricted bulge + disk model, in
    which the two components are restricted to the same ellipticity and the ratio of their radii is
    fixed; practically this is more analogous to a single Sersic model with a linearized Sersic index.
    This may be extended to models with more flexible profiles and/or different ellipticity and
    radii for the two components if these additional degrees of freedom can still be fit robustly.
    Bayesian priors and possibly other regularization methods will likely be necessary even with
    the baseline degrees of freedom.
    Designing and constraining priors that provide the right amount of information in the right
    way is a major challenge. One possibility is an empirical prior derived from external datasets
    such as deep HST fields and precursor ground-based surveys, which would almost certainly
    require custom processing of those datasets using the models intended for production use.
Hierarchical modeling – in which the prior is derived from the LSST wide survey itself as individ-
ual objects are fit – is unlikely to be feasible (a naive implementation would either introduce
several full-survey, all-object sequence points in the processing or treat galaxies processed
late differently from those processed early). An empirical prior derived from LSST special pro-
gram data (e.g. deep drilling fields) or previous data releases would be feasible, however,
and should be considered. Even an ideal prior that reflects the true distribution of galaxy pa-
rameters may not be appropriate for galaxy photometry, however; fluxes must be rigorously
defined to be unbiased against changes in observing conditions, and are most useful when
they can be defined in a way that is redshift-independent as well. The "correct" Bayesian prior
explicitly treats galaxies with different radii differently, making both of these properties harder
to guarantee. As a result, the prior we use for fitting may be some compromise between the
statistically appropriate distribution and a regularization that attempts to reconcile Bayesian
modeling with the requirements of traditional maximum-likelihood photometry.
In addition to maximum-posterior fitting, we will draw Monte Carlo samples (nominally 200
samples per object, at least on average) from the posterior in multi-epoch fitting mode. Doing
this within LSST's computational budget will be a serious challenge, requiring new algorithm
development in several areas:
• Evaluating PSF-convolved galaxy models on every epoch at every sample point or opti-
mizer iteration is extremely expensive. Because galaxy models are generally massively
undersampled before convolution with the pixel grid, naive pixel convolution is impos-
sible without considerable subsampling, which generally makes it computationally im-
practical. Fourier-space methods require galaxy models with analytic Fourier transforms
as well as a great deal of care in accounting for the differences between discrete and con-
tinuous Fourier transforms. Multi-Gaussian and multi-Shapelet approximation methods
are only computationally feasible if the PSF can consistently be approximated well by
those functions, which may not be known until relatively late in commissioning. It may
be possible to combine the multi-Gaussian and Fourier-space convolution approaches
by using multi-Gaussian approximations to galaxy models to evaluate them efficiently in
Fourier space. We may also be able to address large residuals from multi-Gaussian/multi-
Shapelet PSF approximations by convolving the residuals themselves with a simple proxy
    for the galaxy model (which could be a delta function for small galaxies) and adding this
    as a correction to the multi-Gaussian/multi-Shapelet convolution.
• Most Monte Carlo methods require many more than 200 draws to converge to a fair
sample (for galaxy-fitting problems, ∼10⁶ is common). We plan to use importance sam-
pling in multi-epoch fitting, starting with samples drawn from the posterior distribution
in coadd fitting (where we can evaluate likelihoods faster by a factor of the number of
exposures in each band, on average). These samples must be self-normalized, which
introduces a bias that may be significant if the number of samples is small, and it is cur-
rently unclear whether this will be a problem in our case. It is also unlikely we will be able
to draw as many as ∼10⁶ samples even in coadd measurement in order to achieve con-
vergence there. Results from fitting with a greedy optimizer first should provide enough
information to allow for fair and efficient sampling with a smaller number of draws, but
devising a sampling algorithm to make use of that information may be challenging. (A
minimal sketch of the self-normalized reweighting follows this list.)
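A minimal sketch of the self-normalized reweighting referred to above (the same machinery would let science users substitute their own prior); the log-densities are assumed to have been evaluated at each coadd-posterior draw:

import numpy as np

def importance_reweight(log_p_target, log_p_proposal):
    # Self-normalized importance weights for samples drawn from the
    # proposal (e.g. the coadd-fitting posterior) and reweighted to the
    # target (e.g. the multi-epoch posterior, or a user-supplied prior).
    log_w = log_p_target - log_p_proposal
    log_w -= log_w.max()          # for numerical stability
    w = np.exp(log_w)
    w /= w.sum()                  # self-normalization (biased for small N)
    n_eff = 1.0 / np.sum(w ** 2)  # effective sample size diagnostic
    return w, n_eff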
    Challenges in galaxy modeling are not limited to sampling; the effective number of galaxy
    model evaluations involved in a typical greedy optimizer fit is also at least 200, if evaluations
needed to estimate derivatives via finite differences are included (and analytic derivatives are
    usually not significantly faster than numerical derivatives). Galaxy model parameterizations
    are intrinsically difficult for most optimizers near the zero radius limit, as this forces the deriva-
    tive of the model with respect to other parameters (such as ellipticity) to approach zero as well.
    Issues with Bayesian priors causing flux biases and the general lack of sufficient information
    to constrain more complex models are also present for optimizer-based fitting. Priors are per-
    haps a larger concern for fitting than sampling, in fact, because users can reweight samples to
    replace a DM-selected prior with a prior of their own choosing, but this is only approximately
    possible for optimizer results.
Galaxy models may be fit simultaneously to multiple objects (see Simultaneous Fitting) as well
as fit to individual objects after replacing neighbors with noise. In simultaneous fitting, it will
sometimes be inappropriate to fit all objects in a blend with galaxy models. Fitting Hybrid Mod-
els that can transition smoothly between a galaxy model and a Moving Point Source Model is
one approach to avoid fitting all permutations of model types to a blend.
Cite Lensfit paper for restricted bulge-disk model. Cite Hogg and Lang, Bosch 2010 for
Gaussian/Shapelet approximation.
Moving Point Source Models
In Multi-Epoch Measurement, all objects will be fit with a moving point source model that
includes proper motion and parallax as free parameters in addition to a positional zeropoint
and per-band amplitudes. This model may be extended to include parameterized variability
or per-epoch amplitudes if this can be done without degrading the astrometric information
that can be extracted from the fit. Moving point source models may be fit in Multi-Coadd
Measurement as well if the suite of coadds contains enough short-period coadds to constrain
the fit, but these results will be fully superseded by the Multi-Epoch Measurement results.
    Bayesian priors may be used in the fit (making this “maximum posterior” instead of “max-
    imum likelihood”), if necessary to ensure robustness when fitting for faint objects or if they
    significantly improve the quality of the results. These will generally be global and relatively un-
    informative (reflecting e.g. the expected distribution of proper motion of stars as a function of
    apparent magnitude), but may be highly informative for stars that can be unambiguously as-
    sociated with the Gaia catalog, if including Gaia and LSST astrometric solutions at the catalog
    level proves inconsistent with this (more rigorous) Bayesian approach to including Gaia data
    at the pixel level. All priors will be reported, but unlike Monte Carlo samples, results from a
    fit with a greedy optimizer cannot be reweighted to change to a user-provided prior except in
    a perturbative, first-order sense. Monte Carlo sampling with moving point source models is
not included in the baseline plan, but will be considered if it proves important for joint fitting
of blended stars and galaxies (see Hybrid Models, below) or for Star/Galaxy Classification, and
if it can be done without significantly affecting the compute budget.
    Cite Lang+Hogg paper that did this in Stripe 82.
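The astrometric part of the model is linear in all of its parameters once the per-epoch parallax factors are known; a minimal sketch of the predicted positions (the parallax factors are assumed to be supplied from an ephemeris, and all names are illustrative):

import numpy as np

def moving_point_source_positions(t, ra0, dec0, pm_ra, pm_dec,
                                  parallax, pfac_ra, pfac_dec):
    # Positional zeropoint (ra0, dec0) at the reference epoch plus proper
    # motion and parallax; t is time since the reference epoch in years,
    # all angles share one unit, and RA quantities are pre-scaled by
    # cos(dec). pfac_* are the per-epoch parallax factors.
    ra = ra0 + pm_ra * t + parallax * pfac_ra
    dec = dec0 + pm_dec * t + parallax * pfac_dec
    return ra, dec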
Trailed Point Source Models
Need to find someone (probably on AP team) to write this section.
• Fit the PSF convolved with a line segment to individual images.
    Dipole Models
    • Fit PSF dipole for separation and flux to a combination of difference image and direct
    image.
    • Deblending on direct image very problematic.
Flux "dipoles" are a common artifact observed in image differences, arising primarily from
slight astrometric alignment or PSF-matching errors between the two images, or from effects
such as differential chromatic aberration. These dipoles will lead to false detections of tran-
sients unless correctly identified and eliminated. Importantly, dipoles will also be observed in
image differences in which a source has moved less than the width of the PSF. Such objects
must be correctly identified and measured as dipoles in order to obtain accurate fluxes and
positions.
Putative dipoles in image differences are identified as a positive and a negative source whose
footprints overlap by at least one pixel. These overlapping footprints are merged, and only
the sources containing exactly one positive and one negative merged footprint are passed
to the dipole modeling task. There is a documented degeneracy [DMTN-007] between dipole
separation and flux, such that dipoles with closely separated lobes of high flux are statistically
indistinguishable from ones with low flux and wider separations. We remove this degeneracy
by using the pre-subtraction images (i.e., the warped, PSF-matched template image and the
pre-convolved science image) to constrain the lobe positions (specifically, to constrain the
centroid of the positive lobe in the science image and of the negative lobe in the template
image). This is done by first fitting and subtracting a second-order 2-D polynomial to the
background within a subimage surrounding each lobe footprint in the pre-subtraction images
to remove any flux from background galaxies (we assume that this gradient, if it exists, is
identical in both pre-subtraction images). Then, a dipole model is fit simultaneously to the
background-subtracted pre-subtraction images and the image difference.
The dipole model consists of positive and negative instances of the PSF in the difference image
at the dipole's location. The six dipole model parameters (positive and negative lobe centroids
and fluxes) are estimated using non-linear weighted least-squares minimization (we currently
use the Levenberg-Marquardt algorithm). The resulting reduced χ² and signal-to-noise
estimates provide a measure by which the source(s) may be classified as a dipole.
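The difference-image part of the fit can be sketched as follows (the production fit additionally constrains the lobe centroids with the two pre-subtraction images, as described above; the psf_at helper, which realizes a unit-sum PSF centered at a given position on the cutout grid, is an assumed input):

import numpy as np
from scipy.optimize import least_squares

def fit_dipole(diff_image, sigma, psf_at, p0):
    # Six parameters: positive and negative lobe centroids and fluxes.
    def residuals(p):
        xp, yp, fp, xn, yn, fn = p
        model = fp * psf_at(xp, yp) - fn * psf_at(xn, yn)
        return ((diff_image - model) / sigma).ravel()

    result = least_squares(residuals, p0, method="lm")  # Levenberg-Marquardt
    dof = diff_image.size - len(p0)
    chi2_reduced = np.sum(result.fun ** 2) / dof  # dipole classification metric
    return result.x, chi2_reduced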
    We have tested the described dipole measurement algorithm on simulated dipoles with a
    variety of fluxes, separations, background gradients, and signal-to-noise. Including the pre-
    subtraction image data clearly improves the accuracy of the measured fluxes and centroids.
    We have yet to thoroughly assess the dipole measurement algorithm performance on crowded
    stellar fields. Such crowded fields may confuse the parameter estimates (both centroids
    and/or fluxes) when using the pre-subtraction images to constrain the fitting procedure, and
    in such situations, we may have to adjust the prior constraint which they impose.
Note that deblending dipole sources is a complicated process and we do not intend to imple-
ment such an algorithm. As with all fitting algorithms, speed may be a concern. We will
optimize the dipole measurement for speed.
Spuriousness
Need to find someone (probably on AP team) to write this section.
• Some per-source measure of the likelihood that the detection is junk (in a difference image).
• May use machine learning on other measurements or pixels.
• May be augmented by spuriousness measures that aren't purely per-source.
    6.7.3 Blended Measurement
Most LSST objects will overlap one or more of their neighbors enough to affect naive measure-
ments of their properties. One of the major challenges in the deep processing pipelines will
be measuring these objects in a way that corrects for and/or characterizes the effects of these
blends.
The measurement algorithms of Section 6.7.2 can be split up broadly into two categories:
• weighted moments (includes Second-Moment Shapes, Aperture Photometry, Kron Pho-
tometry, and Petrosian Photometry);
• forward modeling (includes Galaxy Models, Moving Point Source Models, and Trailed
Point Source Models).
Most measurements that involve the PSF or a PSF-convolved function as a weight function
can be interpreted in either way (this includes all Centroid algorithms and most PSF flux algo-
rithms), though only the weighted-moment interpretation provides a motivation for ignoring
    per-pixel variances, as is necessary to ensure unbiased fluxes in the presence of incorrect
    models.
The statistical framework in which weighted moments make sense assumes that each object is
isolated from its neighbors. As a result, our only option for these measurements is removing
neighbors from the pixel values prior to measurement, which we will discuss further in
Section 6.7.3.
In forward modeling, we convolve a model for the object with our model for the PSF, compare
this model to the data, and either optimize to find best-fit parameters or explore the full likeli-
hood surface in another way (e.g. Monte Carlo sampling). We can use the removing-neighbors
approach for forward fitting, simply by fitting each object separately to the deblended pixels.
However, we can also use simultaneous fitting (Section 6.7.3), in which we optimize or sample
the models for multiple objects jointly.
    Both neighbor-replacement and simultaneous fitting have some advantages and disadvan-
    tages:
    • Neighbor-replacement provides no direct way to characterize the uncertainties in an
    object’s measurements due to neighbors, while these are naturally captured in the full
    likelihood distribution of a simultaneous fit. This likelihood distribution may be very
    high-dimensional in a fit that involves many objects, however, and may be difficult to
    characterize or store.
    • Neighbor-replacement generally allows for more flexible morphologies than the ana-
    lytic models typically used in forward fitting, which is particularly important for nearby
    galaxies and objects blended with them; simultaneous fitting is only statistically well-
    motivated to the extent the models used can reproduce the data.
    • Once neighbor-free pixels are available, fitting objects simultaneously will almost always
    be more computationally expensive than fitting them separately to the deblended pixels.
    At best, simultaneous fitting will have similar performance but still require more complex
    code. And because we will need to deblend pixels to support some measurement algo-
    rithms, we’ll always have to deblend whether we want to subsequently do simultaneous
    fitting or not.
Neighbor Noise Replacement
We do not perform measurements directly on the deblended-pixel HeavyFootprints output
by the Deblender, for two reasons:
• The deblended pixels typically have many zero entries, especially for large blend families
(i.e. many pixels for which a particular object has no contribution). These zero pixels
make the noise properties of a deblended object qualitatively different from those of an
isolated object, which may be problematic for some measurement algorithms.
• Many measurements utilize pixels beyond the blend family's Footprint, and in fact may
extend to pixels that are in another family.
To address these issues, we measure deblended objects using the following procedure:
1. Replace every above-threshold pixel (all Footprints) within the image with randomly gener-
ated noise that matches the background noise in the image.
2. For each blend family:
(a) For each child object in the current blend family:
i. Insert the child's HeavyFootprint into the image, replacing (not adding to) any
pixels it covers.
ii. Run all measurement algorithms to produce child measurements.
iii. Replace the pixels in the child's Footprint region with (the same) random noise
again.
(b) Revert the pixels in the parent Footprint to their original values.
(c) Run all measurement algorithms to produce parent measurements.
(d) Replace the parent Footprint pixels with (the same) random noise again.
This procedure double-counts flux that is not part of a Footprint, but this is considered better
than ignoring this flux, because most measurement algorithms utilize some other procedure
for downweighting the contribution of more distant pixels.
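The procedure can be sketched as below; the family and child objects, their mask attributes, and the measure callback are illustrative stand-ins for the pipeline's Footprint/HeavyFootprint and plugin-driver machinery, not its actual API.

import numpy as np

def measure_with_noise_replacement(image, noise_sigma, families, measure,
                                   seed=1):
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_sigma, image.shape)  # reused throughout
    work = image.copy()
    for parent in families:                    # step 1: noise everywhere
        work[parent.mask] = noise[parent.mask]
    results = []
    for parent in families:                    # step 2
        for child in parent.children:          # step 2(a)
            work[child.mask] = child.deblended_pixels  # i: insert child
            results.append(measure(work, child))       # ii: child measurements
            work[child.mask] = noise[child.mask]       # iii: re-noise child
        work[parent.mask] = image[parent.mask]         # (b): revert parent
        results.append(measure(work, parent))          # (c): parent measurements
        work[parent.mask] = noise[parent.mask]         # (d): re-noise parent
    return results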
Deblend Template Projection
When deblending is performed on one image and measurement occurs in another, the
deblender outputs must be "projected" to the measurement image. In general, this requires
accounting for three potential categories of differences between the images:
• differing coordinate systems;
• differing PSFs;
• differing epochs.
We do not currently have a use case for projecting deblender results between images with
different filters; we expect that we will have deblender results from at least each per-band
coadd, and projection is required for different per-epoch images in Forced Measurement and
Multi-Epoch Measurement. It may be entirely unnecessary if Simultaneous Fitting can be used
to address all blended measurement issues in these contexts.
When variability can be ignored, deblended pixel values can be resampled using the same
algorithms that operate on images, and PSF-matching kernels can be used to account for
PSF differences (though some regularization will be required if this involves a deconvolution).
When variability cannot be ignored, these operations should instead be applied to the deblend
templates, which can then be re-fit to produce new per-epoch deblend results.
These operations are significantly easier if the deblend templates themselves are defined via
analytic models that must be convolved with the PSF to generate the template; the models
can simply be transformed, convolved with the per-epoch PSF, and re-fit.
Simultaneous Fitting
For measurement algorithms that can be fully expressed as a likelihood-based fit using a
model that closely approximates the data, an alternative to Neighbor Noise Replacement
is to fit all objects in a blend simultaneously (either with a greedy optimizer or with Monte
Carlo sampling). This is statistically straightforward: the parameter vectors for individual
per-object models can simply be concatenated, and the pixels to fit are the union of all of
the pixels that would be used in fitting individual objects.
    Simultaneously fitting a group of objects is almost certainly slower than fitting those objects
individually, in essentially every case, but the decrease in performance may be mild. The per-
iteration calculations in optimizer-based methods have a worst-case complexity that is O(N²)
in the number of parameters, but the O(N) evaluation (and convolution) of models typically
completely dominates the O(N²) linear algebra, and this should hold for all but the very largest
blend families. The more important scaling factor is the number of iterations required to
converge to the solution; as this is expected to scale roughly with the volume of the space
that must be searched, it could scale as poorly as exponentially in the number of parameters.
This can be ameliorated by starting the optimizer close to the correct solution; fitting objects
independently to deblended pixel values before simultaneous fitting should provide a
reasonably good guess. Simultaneous fitting in Multi-Epoch Measurement can also be
initialized from the results of Multi-Coadd Measurement, where model evaluations are
significantly faster.
With Monte Carlo methods, a high-dimensional parameter space is less of a problem; Monte
Carlo methods are valued precisely because they scale (on average) better with dimensionality.
We still expect to require warm-start methods for simultaneous Monte Carlo sampling of mul-
tiple objects, such as using importance sampling in Multi-Epoch Measurement to re-weight
samples drawn during independent per-object sampling in Multi-Coadd Measurement.
    For both optimizer-based fitting and Monte Carlo sampling, it should be possible to explore
    the parameter space in a way that does not require evaluating the model for every object
    at every iteration or draw; because objects on opposite sides of a blend should affect each
    other only weakly, optimizer steps and samples that explicitly ignore these weak correlations
    in choosing the next parameter point to evaluate may be more efficient. In Monte Carlo sam-
pling, this would probably utilize techniques from Gibbs sampling; with optimizers, one can
probably ignore the step for any objects whose optimal step size falls below some threshold
(with some extra logic to guarantee these objects are still sometimes updated). These
improvements would likely require significant development effort, and they would almost
certainly make using off-the-shelf optimizers and samplers impossible.
The fact that simultaneous fitting itself may be used to produce templates for deblending –
and that simultaneous fitting may require non-simultaneous fitting, using deblended results,
for a warm start – suggests that the boundary between the Deblending and measurement
algorithmic components may be somewhat vague. This could be represented in software as
an iteration between these algorithmic components, or perhaps as a hierarchy of components
in which the same low-level fitting code is used (and extended) by both algorithmic
components, which can then be run straightforwardly in sequence.
Hybrid Models
In simultaneous fitting, the model used to fit one object can affect the quality of the fit to its
neighbor, making it important that the best model be used for each object. We explicitly fit
both Moving Point Source Models and Galaxy Models to every object, however, precisely
because we do not expect to always be able to classify objects securely enough to know
which model is better (and we certainly do not expect to be able to classify them before
fitting these models). In simultaneous fitting, trying all possibilities leads to a combinatorial
explosion of model-fitting jobs (fitting each of two models to N objects leads to 2^N fits).
Given that both of these models have a static point source as a limit, and classification is hard-
est at this limit, making the right classification for neighbors may not be critical; misclassified
objects would still end up being fit with a model that is broadly appropriate for them. Even
in this case, we would still have 2N fitting jobs when fitting an N-object blend with two
models: every object would be fit twice as the "primary" object (with both models), and then
twice for each of its neighbors. Given that each of these fitting jobs still involves fitting N ob-
jects, this still results in a scaling of approximately 2N² (clever optimizers and samplers could
probably reduce this, with a cost in code complexity and development time).
Another option would be to fit both models simultaneously, by introducing a higher-level
boolean parameter that sets the type. Sampling from this hierarchical model is not signif-
icantly more difficult than sampling from either of the original models if naive samplers are
used, but optimizers and samplers that rely on derivative information will likely have trouble
dealing with the discrete type parameter. It may be possible to define a smooth transition
between the two models through the static point source limit they share, though this would
likely require customization of the optimizer and sampler as well. Sampling with this sort of
hybrid model would naturally produce samples from both models in proportions weighted
by the marginal probability of those models, which is essentially ideal (assuming sampling is
considered useful for Moving Point Source Models). Using an optimizer with hybrid models
would produce a result for just the best-fit model, which is somewhat less desirable.
6.8 Spatial Models
In many areas we will need to represent spatial models (generally over CCDs, visits, or coadd
patches). PSF models, sky backgrounds, aperture corrections, and WCSs all involve spatial
variation on these scales, and at least some of these should share lower-level algorithm code
to fit the spatial models. This will include models fit to sparse and non-uniformly sampled data.
    We will support fitting Chebyshev polynomials and splines. We will also support regression
    techniques like Gaussian Processes.
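As an illustration of one of the named representations, the following sketch fits a 2-D Chebyshev polynomial to sparse, non-uniformly sampled measurements with ordinary least squares (production code would add robust/weighted fitting and scale the coordinates to the Chebyshev domain [-1, 1]):

import numpy as np
from numpy.polynomial import chebyshev

def fit_chebyshev_2d(x, y, z, deg=(3, 3)):
    # Build the 2-D Chebyshev design matrix at the sample points and
    # solve for the coefficients in the least-squares sense.
    vander = chebyshev.chebvander2d(x, y, deg)
    coef, *_ = np.linalg.lstsq(vander, z, rcond=None)
    return coef.reshape(deg[0] + 1, deg[1] + 1)

def eval_chebyshev_2d(coef, x, y):
    # Evaluate the fitted surface, e.g. a background model, at (x, y).
    return chebyshev.chebval2d(x, y, coef)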
    6.9 Background Estimation
    6.9.1 Single-Visit Background Estimation
Background estimation will be done on the largest feasible scale first. In the case of Alert
Production, this may be the size of a chip. In DRP, we expect this to be the full focalplane.
An initial low-order estimate will be made on a large scale. Each chip will be divided into
subregions. For each subregion, a robust average of the non-masked pixels will be computed.
The values for all chips will then be fit by an appropriate function (see § 6.8). This will provide
a low-order background estimate in focal plane coordinates. Note that this can only be done
if the Instrument Signature Removal is of very high fidelity; any sharp discontinuity could
cause problems with fitting a smooth function.
A higher-order background model can be computed per chip. First, the low-order background
is subtracted from the image. The non-masked pixels will again be binned, on a finer grid
that avoids bright objects, and the median in each bin is fit by an appropriate function. In
practice, this process will likely be iterative.
    In the case of Alert Production, there will be no full focalplane model since we expect to pro-
    cess only a single chip in each thread. In this case, we constrain the background with the
    available un-masked pixels without removing a global background first. Note that image dif-
    ferencing is still possible even in the scenario where there are no unmasked pixels in the
    science image. The background can be modeled as a part of the PSF matching process. We
    will want to do background modeling and subtraction in Alert Production when possible be-
    cause we will want to do calibrated photometry. Even though these measurements are not
    persisted for science use, they will be very useful for debugging and QA.
If there are so few un-masked pixels in the entire focalplane that even a low-order global
background is impossible to model, background modeling may need to be iterated with a
procedure that models and subtracts stars (for example, see the BootstrapImChar pipeline
in DRP).
    Requirements include working in crowded fields. I think estimating a full focalplane model
    is the best we can do. If there are no unmasked pixels in the entire FoV, I don’t think there