1. Project Status
2. Current Photos
3. Risk Management
4. Detailed Project Progress and Status
    1. LSST Program Office
    2. University of Washington
    3. Princeton University and University of California, Davis
    4. IPAC / California Institute of Technology
    5. SLAC / Stanford University
    6. NCSA / University of Illinois
    7. NOAO

Data Management Monthly Report

July 2015
 
Project Status

·   The Summer 2015 release continued, with all teams working on the next round of features and tests against the LDM-240 Milestones and Key Performance Metrics. The rate of progress is similar to that reported in the June report, and continues to be consistent with the reduced level of staffing relative to the initial plan.   Significant results include:

o   Aperture corrections from HSC were added to measurement and calibration pipelines and were enabled in the integration tests. 


o   Numerous additions to co-addition processing were ported from HSC
o   Work continued on estimation of galaxy shear fitting performance parameters.  PhoSim was set up and used to generate a library of simulated PSFs which will be used for these measurements. The first measurements were made comparing input to measured shear using the test setup.
o  Qserv distributed database reliability in a multi-node environment now exceeds production-scale requirements, and the query metadata implementation was completed.
o   Built a Science User Interface demo system using IPython notebook and Firefly to make it easier for other developers to use Firefly.
o   We received an offer of $12/metre from Somyl to lay buried fiber cable from Tololo to Pachon along the principal road that connects the two sites. However, it has been decided for now to stay with the original proposal from Telefonica, which runs the fiber on existing power posts following the San Carlos valley and up the side of Cerro Pachon. At some future date, LSST or AURA/LSST may be able to build the buried route as a redundant loop.
·   The contract amendment between NCSA and AURA for purchasing capabilities was finalized at the very end of July.  With the contract in place, in August we will begin making a near-term hardware refresh plan. We will revisit the infrastructure needs specified by the LSST developers (e.g., SUI and Qserv teams), as well as identify development infrastructure needing upgrade or replacement.
·  The Data Management Development Roadmap (LDM-240) has undergone an extensive update covering FY16 - FY18, and we analyzed inter-milestone dependencies and identified additional Key Performance Metrics to add to the plan. A briefing of this activity and its linkage to Earned Value Management was provided to the AMCL on July 17.
·  The DM Project Manager continued development of the Brazil MOA (Networks/PIs). Three of the four Brazilian parties were reported to have signed, and the fourth was expected prior to the end of July, but this was incorrect as only two had signed. The remaining two are now expected to sign around the time of the Bremerton meeting.
·  The DM Project Scientist continued the verification and understanding of NEO detection efficiency and effectiveness (with Lynne Jones). They have looked at the sensitivity of NEO yields to variations in assumptions about the number of paired observations required for detection, as well as other cadence parameters.
·  The DM Project Engineer continued supporting internal DM activities as well as the Technical Operations Working Group (TOWG), and multiple interface activities, including the OCS Middleware Workshop and Wavefront Sensing.
·  The DM SQuaRE team continued supporting the recently deployed Continuous Integration System. Developers are now using the new CI system almost exclusively. Feedback has been overwhelmingly positive.
·  Recruiting and hiring activities continued across all DM institutions. Twenty-seven positions have been filled to date since the MREFC award, while 9 positions are currently open.   One new hire started in Tucson and two more are scheduled to start prior to the LSST 2015 meeting:
o   David Nidever, DM SQuaRE Scientist
o   Jonathan Sick, DM SQuaRE Software Developer/Documentation Specialist
o  Angelo Fausti, DM SQuaRE Software Developer (1 year assignment)
·  NCSA hired a system development and management lead, Jason Alt, whose focus will be specifying the production infrastructure, and hired a post-doc who will work on running the LSST stack on DES data.

 




Current Photos
 
None this month (next month use image from Victor and K-T presentations to LSST 2015 showing HSC data processed with LSST stack).
 
 




Risk Management
 
The DM Risk Register was reviewed in the monthly process. No new risks were added and no significant changes to existing risk exposure were made.

 




Detailed Project Progress and Status


LSST Program Office

DM Project Management and Control

Current accomplishments:

The DM Project Manager:


·  Continued work on the Data Management Development Roadmap (LDM-240), analyzing inter-milestone dependencies and identifying additional Key Performance Metrics to add to the plan. A briefing of this activity and its linkage to Earned Value Management was provided to the AMCL on July 17.
·  Continued the process of developing the FY16 budget so that we can prepare FY16 sub award contract amendments. Analyzed sources of travel expense variance and performed time analysis of Technical/Control Account Manager activities to determine if budget is consistent with actuals. Analyzed sources of current schedule and cost variances for implications to FY16 plan and budget.
·  Completed and executed the amendment to the NCSA agreement for the procurement of hardware.
·  Continued development of the Brazil MOA (Networks/PIs).  Three of the four Brazilian parties were reported to have signed, and the fourth was expected prior to the end of July, but this was incorrect as only two had signed. The remaining two are now expected to sign around the time of the Bremerton meeting.
·  Continued recruiting and hiring activities. Across all DM institutions, 27 positions have been filled to date since the MREFC award, while 9 positions are currently open. One new hire started in Tucson and two more are scheduled to start prior to the LSST 2015 meeting:
o   David Nidever, DM SQuaRE Scientist
o   Jonathan Sick, DM SQuaRE Software Developer/Documentation Specialist
o  Angelo Fausti, DM SQuaRE Software Developer (1 year assignment)
 

Planned activities:

The DM Project Manager will:


·   Brief a subset of the AMCL on the relationship of the JIRA planning and Earned Value.
·   Continue recruiting and hiring, prepare for arrival of new hires.
·   Start work with NCSA on FY16 Acquisition Strategy covering LSST equipment procurements.
·   Bring the Memorandum of Agreement with Brazil to signature and complete its execution.
·   Attend the LSST 2015 Meeting at Bremerton August 17 - 21. 

DM Science

Current accomplishments:
The DM Project Scientist's July activities continued to focus on the study of the efficiency of LSST as a detector of NEOs. Specific work includes:
 


·   Continued the verification and understanding of NEO detection efficiency and
effectiveness (with Lynne Jones). We have looked at the sensitivity of
NEO yields to variations in assumptions about the number of paired
observations required for detection, as well as other cadence parameters.

 

·   To support the analyses above, developed binary packaging capability for
the LSST codes using the Conda package manager. The resulting scripts
are available from https://github.com/mjuric/conda-lsst/

 

·   John Lurie (graduate student at UW) has begun the analysis of Galactic
stellar number distribution profiles using precursor survey data. This
work will lead to a better estimate for the number of stars for the sizing
model.

 

·   Led the June science pipelines sprint review.

 

·   Gave an opening talk at the Mocking the Universe workshop at STScI, on LSST
simulations and data processing.

 
Planned activities:
 
The DM Project Scientist expects to remain focused on the NEO studies and activities related to LSST 2015.
 
DM System Engineering

Current accomplishments:

Activities completed by the DM System Architect include:


·   Technical Operations Working Group
o   Rewrote science-based use cases
o   Presented TOWG work to DM for feedback
·   OCS Middleware workshop
o   Helped drive conclusions on many issues
o   Wrote up telemetry and events sent by DM
o   Wrote up configuration information to be published by DM
·   Discussed wavefront sensor processing with Sandrine
·   Informed Systems Engineering about plans for DM tools
·   Conducted detailed planning discussion with NCSA
·   Worked with NCSA on IT and production systems architecture
·   Published a document on DM community interaction/team culture
·   Did initial review of DM long-range planning
·   Completed agenda for LSST 2015 All Hands Meeting
·   Continued discussing cooperation with Euclid and WFIRST

 

The DM System Interfaces Scientist accomplished:

·   Several communications with Telescope Scientist Sandrine Thomas about revisions to LSE-75 ICD, including increased integration of wavefront and guider pipelines with DM and with calibration data to be produced by DM. Basic architecture outlined in meeting with Sandrine Thomas and K-T Lim at the end of the month

·   Initiated effort with new NCSA team member Matias Carrasco-Kind to refactor and modify the CmdLineTask interface to accommodate Camera data quality monitoring, DM data quality metric analysis, and Level 3 users, and facilitate interface with Firefly. Includes interactions with Applications team, notably Robert Lupton and Jim Bosch, to align this work with their needs as well

·   Supported Firefly team at IPAC in designing a new Python-process-starting capability

·   Attended Camera CD-3 Director's Review, with special attention to camera control system and DAQ issues


Planned activities:

The DM System Architect will:


·   Further work on DM long-range planning including resource loading
·   Define DM requirements for simulations
·   Complete Data Butler notes and turn over to Nate Pease
·   Define end-to-end DM system with diagrams and progression over time
·   Present at and run sessions at DM All Hands meeting
 

The DM System Interfaces Scientist will:

·  Work on revision of LSE-75 ICD
·  Continue to work with the SUI team to interface with the Camera team and support the Camera team in using Firefly.
·  Continue to work with other DM teams (NCSA and Princeton in particular) to revamp the CmdLineTask interface in support of the Firefly server-side extension.
·  Attend SciPy conference

 
DM Science Quality and Reliability Engineering (SQuaRE)
 

Current Accomplishments:

02C.01.02 Science Quality and Reliability Engineering

·   We participated in and assisted with activities relating to the DM Re-planning.

·   Jonathan Sick joined SQuaRE on July 31st. He will lead in documentation and user experience issues.

02C.01.02.03 Science Pipeline Toolkit

·   Support changes to stack build and test infrastructure (RFC-69)

·   This is an ongoing item for S15 ICW with the Architecture Team. Progress included removing the EUPS dependency on SconsUtil (DM-2769)

02C.01.02.04 Continuous Integration System

·   A number of improvements were made by prioritising user requests and bringing up the system to production quality, including backups and ganglia monitoring. Developers are now using the new CI system almost exclusively except when needing to fall back on the old buildbot system for EUPS package publishing. Feedback has been overwhelmingly positive and we hope to add even more capabilities once the RFC-69 work is complete and we have the hardware capacity to do so.

·   We evaluated NCSA's in-house OpenStack deployment against our requirements and provided feedback to NCSA. While uptime is not a major concern at this time, we identified a number of issues that currently prevent us from developing and deploying SQuaRE services on it. Ongoing.

Planned Activities:

02C.01.02 Science Quality and Reliability Engineering

·   LDM-240/DLT transition work

·   Prepare for and attend the All-Hands Meeting

02C.01.02.04 Continuous Integration System

·   Further CI improvements - more advanced monitoring & user requests

·   OpenStack @ NCSA - continue evaluation when blockers are resolved

DM Applications, Middleware, and Infrastructure

Current accomplishments:

·   Refer to the by-institution reports below

Planned activities:

·   Refer to the by-institution reports below


 


University of Washington

Current accomplishments:

02C.03.00 -- Alert Production Management Engineering and Integration

Simon Krughoff (SK) continued with DLP planning.  All milestones and KPMs from FY25 on have been imported.  Russell Owen (RO) and Yusra AlSayyad (YA) participated in weekly meetings.  SK participated in group meetings and DMLT meetings.  YA attended the SciPy conference in Austin.

02C.03.05 -- Application Framework for Exposures

RO worked extensively on incorporating the aperture corrections from the HSC side in the LSST stack.  Aperture corrections were added to measurement and calibration pipelines (DM-435, DM-436).  Aperture corrections were enabled in the integration tests (DM-3114).  Additionally, several fixups and improvements were done (DM-3160, DM-3173, DM-3174, DM-3182).

In addition, RO worked on getting gcc 4.8 installable in the stack to enable switching to gcc 4.8 as the default supported compiler (DM-3126, DM-3140).

RO also worked on various bugs: providing input data for an example command line task, fixing bitrot in obs_cfht, and updating classes to use the new measurement schema (DM-1761, DM-2910, DM-3214).

SK worked to correct an issue resulting in some tests not being run (DM-2929).  This was reported by Tim Jenness as a result of running test coverage suites on the afw package.

Planned activities:

02C.03.00 -- Alert Production Management Engineering and Integration

YA, SK, and RO will attend the all-hands meeting in Bremerton.  The UW team will welcome new member Ian Sullivan.

02C.03.05 -- Application Framework for Exposures

YA, SK and RO will work to bring the HSC code over to LSST.  This work will largely be defined at the meeting in Bremerton.  YA will work on the approximation and interpolation framework.

 


Princeton University and University of California, Davis
This report covers work carried out in FM10 of FY15 in the Data Release Production group (staff at Princeton plus Price and Gee working remotely).
Work carried out in July:
 
02C.04.00 Data Release Production Management Engineering and Integration
 
Velocity was reduced this month by vacations: MacArthur from 29 June to 10 July and Lupton from 20 July to 7 August.
 
Much management effort was spent on continuing the long range planning. Bosch and Swinbank continued to iterate on the milestones and meta-epics defined in the DLP project in JIRA. The DRP plan as described in JIRA has now evolved substantially beyond that described in the final version (33) of the LDM-240 Excel sheet. In addition, the key performance metrics relevant to DRP were described in JIRA and a set of measurement epics was defined to track their value over the construction period.
 
Moving beyond the direct DRP project, dependencies of the data release part of the stack on other WBS elements were identified and recorded and, where possible and appropriate, discussions with the relevant T/CAMs were instigated to ensure the plan forms a coherent whole.
 
Slides were prepared and supplied to Kantor demonstrating a “drill down” through the DLP project from a Science Pipelines/Applications perspective in support of an upcoming presentation to the AMCL.
 
In conjunction with Economou, Jenness and Krughoff, Bosch and Swinbank developed a number of utilities which build upon the basic JIRA functionality to enable more effective visualization, understanding and reporting of the plan as recorded in JIRA. These are available on GitHub at https://github.com/lsst-sqre/sqre-jirakit .
 
Swinbank iterated with Kantor on the budget for DRP through the remainder of construction and produced a draft staffing plan & budget which will be discussed at LSST2015 in Bremerton next month. Producing a final version of the staffing plan will depend upon resource loading of the final development plan produced in JIRA-DLP.
 
As will be described below, technical work this month focused on merging Hyper Suprime-Cam functionality into LSST. Significant logistical effort went into making this possible, particularly from Bosch and Lust. The former acted as the primary point of contact for detailed understanding of both the HSC and LSST stacks, selecting which components require migration and prioritising them appropriately. Lust produced a web-based tool for visualizing which HSC tickets and branches have been merged to LSST, which has been used to support the ongoing work.
 
Hiring:
 
Our advertisement for a scientific software developer specialising in C++ ran on the AAS Job Register this month. We expect to begin reviewing applicants and drawing up a shortlist for interview in early July.
 
Agreed on the budget, statement of work, and sole source justification for Vishal Kasliwal, who will start 2015-09-01 in a joint U. Penn/Princeton position which will be 50% funded by LSST DM.
 
Merlin Fisher-Levine will join the DRP group to work with Lupton on the Calibration Products Pipeline (02C.04.02). Through FY16 he will be based at Brookhaven National Lab and work for LSST at the 50% level. We expect him to move to Princeton and increase his availability to 100% from FY17 onwards.
 
Agreed with Princeton admin staff to advertise for one or more “postdoctoral positions in software” to start in 2016. These will join the group in Princeton and will be allocated work on LSST, Hyper Suprime-Cam, WFIRST, Prime Focus Spectrograph or other software projects as their interests and the availability of resources dictate. These positions will be advertised on the Princeton website starting in August, and on the AAS Job Register in September and October. We expect to begin reviewing candidates in November.
 
02C.04.01 - Application Framework for Catalogs
 
Previous work (DM-2981) supplemented the Exposure and ExposureRecord classes with information about which areas of the Exposure are “valid”, that is, do not contain data which is unusable due to, e.g., vignetting. This information is now taken into account when co-adding point spread functions. [DM-3243]
 
Updated the logic applied when co-adding images such that a masked pixel in a single input image is not sufficient to result in the corresponding output pixel being masked. Instead, we assume the output data is valid as long as some user-specified fraction of the input data is good. [DM-3137]
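The rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the stack's implementation: the function name `combine_masks`, the convention that `True` means "masked", and the parameter name `min_good_fraction` are all assumptions introduced here.

```python
def combine_masks(input_masks, min_good_fraction=0.5):
    """Return the output mask for a stack of per-image masks.

    input_masks: list of 2-D lists of bools, True meaning "pixel is masked".
    An output pixel is masked only when the fraction of good (unmasked)
    inputs falls below min_good_fraction (a hypothetical parameter name).
    """
    n_inputs = len(input_masks)
    n_rows = len(input_masks[0])
    n_cols = len(input_masks[0][0])
    output = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Count the inputs in which this pixel is usable.
            n_good = sum(1 for mask in input_masks if not mask[r][c])
            row.append(n_good / n_inputs < min_good_fraction)
        output.append(row)
    return output


# Three hypothetical 2x2 input masks.
masks = [
    [[False, True], [True, True]],
    [[False, False], [True, True]],
    [[False, False], [False, True]],
]
result = combine_masks(masks, min_good_fraction=0.5)
```

In this toy stack, the top-right pixel is masked in one of three inputs and stays valid in the output, whereas under the old behaviour that single masked input would have masked the output pixel.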
 
A new mask plane, NO_DATA, was added. This makes it possible to distinguish between areas of an exposure which are masked because they simply contain no data, and those which are masked because they are too near the edge to be reliable. [DM-3136]
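As a rough illustration of how separate mask planes let the two cases be told apart, the sketch below models mask planes as bit flags. The bit assignments, the `MaskPlane` class, and the `describe` helper are hypothetical and chosen only for this example; they are not the stack's actual definitions.

```python
from enum import IntFlag


class MaskPlane(IntFlag):
    # Bit assignments are hypothetical, chosen only for this sketch.
    BAD = 1
    EDGE = 2
    NO_DATA = 4


def describe(pixel_mask):
    """Explain why a pixel is masked, based on which bits are set."""
    reasons = []
    if pixel_mask & MaskPlane.NO_DATA:
        reasons.append("no data")
    if pixel_mask & MaskPlane.EDGE:
        reasons.append("too near the edge")
    return reasons or ["not masked for these reasons"]


empty_pixel = MaskPlane.NO_DATA
edge_pixel = MaskPlane.EDGE
both = MaskPlane.NO_DATA | MaskPlane.EDGE
```

With a single combined plane, `empty_pixel` and `edge_pixel` would be indistinguishable; with separate bits, downstream code can treat them differently.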
 
A number of minor bugs in the CoaddPsf code were resolved. [DM-3257]
 
Work is ongoing to convert the HSC parallelization framework [RFC-68, DM-2983]. This framework is particularly vital to the short-term middleware requirements for large scale processing of HSC and other data, but we expect it ultimately to be replaced by work in the 02C.07.01 (Processing Control) WBS.
 
02C.04.03 - PSF Estimation
 
No work was carried out under this WBS during this month.
 
02C.04.06 - Object Characterization Pipeline
 
A new slot was added to describe the flux used in photometric calibration (“CalibFlux”). The default was changed such that calibration is normally performed with a circular aperture flux. The output table field names used for aperture fluxes were changed to directly indicate the aperture size used in the name of the field. [DM-3106, DM-3108]
 
Vestigial table columns were removed from the multi-band processing outputs. [DM-3139]
 
A number of large, ongoing HSC porting efforts absorbed developer time but were not finished and reviewed by the time the month closed. Notably, these include ongoing work to port the psfex measurement extension [DM-2961], deblended HeavyFootprints for forced photometry [DM-1954], “ubercalibration” through the meas_mosaic package [DM-2674] as well as improvements to the CModel measurement code [DM-2977].
 
The major work which did not involve porting from HSC continues to be in the estimation of galaxy shear fitting performance parameters. [DM-1108] Here, PhoSim was set up and used to generate a library of simulated PSFs which will be used for these measurements. This work had been scheduled to be performed by Debbie Bard at SLAC, but her unavailability meant that Gee performed it instead; the end result is that this epic will overrun slightly. In addition, the first measurements were made comparing input to measured shear using the test setup. [DM-2955, DM-2966]
 
Plans for August:
 
Note that the LSST2015 meeting in the middle of August serves both as a focus for ongoing work and as a disruption to overall velocity.
 
02C.04.00 Data Release Production Management Engineering and Integration
 
The priority through August will be to continue iterating on the plan in JIRA-DLP and, in particular, on its links to other WBS. Following from that, the near-term plan for the W16 cycle will be defined and the FY16 budget agreed.
 
02C.04.01 - Application Framework for Catalogs
 
Continued focus on merging work from HSC with particular emphasis on the parallelization middleware and improvements to coaddition.
 
02C.04.03 - PSF Estimation
 
No work is expected to be undertaken under this WBS.
 
02C.04.06 - Object Characterization Pipeline
 
Continued focus on merging work from HSC with particular emphasis on Footprint improvements, meas_mosaic and finishing the psfex extension.


The major aim is to contribute to the goal of running as much as possible of the LSST stack as an “end-to-end” pipeline at the LSST2015 meeting.

 

 


IPAC / California Institute of Technology

Current accomplishments:

02C.05.00  Science User Interface and Analytical Tools Management Engineering and Integration  

·   Summer vacation continues.  Trey: one week;  Tatiana: two weeks.

·   Worked with IPAC IRSA group on collaboration in Firefly development, plan and schedule coordination, common system for issue tracking.

·   Worked on  Jira project DLP, added KPM and meta-epics.

02C.05.01   Basic Archive Access Tools

·   Continued the discussion of data access APIs with the SLAC group as needed

·   John Rector attended the SciPy conference for one week

·   Converted a simple GWT widget into pure JavaScript code. The look and feel changed little, since CSS already controls it.

·   Worked on reading in the FITS binary table with multiple extensions.

·   Developed a simple case using the Firefly external task launcher. It will serve as an example for the UIUC camera team development.

 02C.05.02   Data Analysis and Visualization Tools

·   Built a simple Firefly demo system using IPython notebook to make it easier for other developers to use Firefly.

·   Worked on the server side issues of overlaying masks on the primary image.

·   Supported the UIUC student in the Camera group in their development.

·   Started work on reading FITS cube data on the Firefly server side

02C.05.05 User workspace

·   Started discussion on the user workspace concept within SUI team

02C.02.02  System Engineering

·   Work on revision of LSE-75 ICD

·   Continue to work with the SUI team to interface with the Camera team and support the Camera team in using Firefly.

·   Continue to work with other DM teams (NCSA and Princeton in particular) to revamp the CmdLineTask interface in support of the Firefly server-side extension.

02C.01.02 SDQA 

·   Gregory attended the SciPy conference for one week. There are some opportunities for us to collaborate with the Jupyter development team.

·   Participated in the workspace concept discussion, bringing in thoughts from the DQA and L3 products point of view.

Planned activities:

02C.05.00  Science User Interface and Analytical Tools Management Engineering and Integration

·   Summer vacation continues.  Trey: 8 days; Tatiana: 2 days; Loi Ly: 5 days; Lijun: 5 days; David Ciaridi: 5 days; Xiuqin: 2 days; Gregory: 10 days

·   Work with IPAC IRSA group on collaboration in Firefly development, plan and schedule coordination, common system for issue tracking.

·   Finish transferring the LDM-240 roadmap to the Jira project DLP, adding KPMs and meta-epics.

·   Finish FY16 budget, and work planning for winter and summer 16

·   Everyone will attend the LSST 2015 all hands meeting

02C.05.01   Basic Archive Access Tools

·   Continue the discussion of data access APIs with SLAC group if needed

·   Setup the access to NCSA hosts to test out the access to APIs and database

·   Prepare for and carry out the DM end-to-end exercise.

·   Develop a simple package and deployment of Firefly to make it much easier for developers to start using Firefly.

02C.05.02   Data Analysis and Visualization Tools

·   Continue the development on the JavaScript APIs and Python APIs for Firefly visualization components.

·   Work on the design and implementation of overlay masks on the primary image.

·   Continue to work with Camera group to support their development.

·   Continue to refactor Firefly to simplify the code and make it more readable.

 02C.05.05 User workspace

·   Continue discussion on the user workspace concept with other DM teams and finish the document. 


 


SLAC / Stanford University

Current accomplishments:
Highlights:
· Much time was spent debugging problems with Qserv at scale. All major issues were solved.
02C.06.00 Science Data Archive and Application Services Management Engineering and Integration
· Coordinated July Sprint for the Data Access Team
· Interviews:
o one on-site interview: Augusto Roman
· Organized weekly Qserv and Data Access meetings
· SLAC related
o Brought the new finance person (Christine) up to speed
02C.06.01.01 Catalogs, Alerts and Metadata
· Implemented ingest code (DM-210). In review at the end of the July sprint
02C.06.01.02 Image and File Archive
· Finished the webform, demonstrated at the DataAccess meeting. Will deploy once we have a dedicated server machine at NCSA.
02C.06.02.01 Data Access Client Framework
· Progress with Butler documentation, not finished in July.
02C.06.02.02 Web Services
· DM-2538: Implemented RESTful python client
· DM-1893: Researched and Documented API Versioning
02C.06.02.03 Query Services
· Qserv 2015_07 Release highlights:
o Reliability in multi-node environment greatly improved: critical bug fixes related to problems at scale.
o Added support for ORDER BY
o Completed query metadata
o Miscellaneous code cleanup
· Code improvements:
o DM-3104: Add "ORDER BY" clause to lua SQL query on result table
o DM-1709: Implement result sorting for integration tests
o DM-2885: Improve confusing error message
o DM-3223: Improve czar-worker communication debugging
o DM-3090: Implement test suite for new class SqlTransaction
o DM-3110: qserv code cleanup
o DM-3235: qserv missing direct dependencies
o DM-3091: Remove unused function populateState()
o DM-3238: Add qserv-restart.sh
· New features:
o DM-2805: Complete Query Metadata Implementation
o DM-2966: Design CSS that supports updates
· Bug fixes:
o DM-3261: Fix problems in xrootd discovered in multi-node qserv tests
o DM-3237: Fix problems with no-result queries on multi-node setup
o DM-3102: Resolve segmentation fault in Logging Event destructor
· Documentation improvements:
o DM-2420: Document API for worker management service
o DM-3224: Document setting up multi-node Qserv and running integration test
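To illustrate why the ORDER BY clause is applied to the result table (DM-3104): in a sharded database each worker returns rows only for its own chunks, so a global ordering can only be produced by sorting the merged results. The sketch below imitates this with Python's built-in sqlite3 module. The table name, column names, and the two "worker" result sets are hypothetical and do not reflect Qserv's actual schema or code.

```python
import sqlite3

# Hypothetical per-worker partial results: (objectId, flux).
worker_results = [
    [(3, 1.5), (1, 9.0)],  # rows returned by worker A
    [(2, 4.2), (4, 0.3)],  # rows returned by worker B
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (objectId INTEGER, flux REAL)")

# Merge step: rows arrive chunk by chunk, in no particular global order.
for rows in worker_results:
    conn.executemany("INSERT INTO result VALUES (?, ?)", rows)

# The ORDER BY is applied once, on the merged result table.
ordered = conn.execute(
    "SELECT objectId, flux FROM result ORDER BY flux DESC"
).fetchall()
conn.close()
```

Sorting inside each worker would not be enough here, because rows from different workers interleave in the final order.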
02C.06.02.04 Image Services
· DM-2467: Implement stitching multiple patches across tract boundaries in a coadd v2
02C.06.02.05 Catalog Services
· no change
Planned activities:
Key activities:
· Wrap up large scale tests
· Wrap up Summer 2015 cycle, including documenting it
· Set up Webserv service for SUI tests
02C.06.00 Science Data Archive and Application Services Management Engineering and Integration
· Organize weekly Qserv and Data Access meetings
· Search for candidates for the remaining open position
· FY16 budget planning
· Winter 2016 cycle planning
02C.06.01.01 Catalogs, Alerts and Metadata
· Complete ingest
02C.06.01.02 Image and File Archive
· Finish work on improvements to the form
02C.06.02.01 Data Access Client Framework
· Wrap up butler v2
02C.06.02.02 Web Services
· Contextual error handling
· Add unit tests for webserv
02C.06.02.03 Query Services
· Resolve problems with large scale tests
· Analyze Qserv performance and measure the KPIs
· Finish designing Data Distribution, finish lightweight prototyping of data distribution
· Finish work on Query Management system
· Finish work on Qserv Refactoring (DM-1707)
· Finish work on CSS v2 (key/value mysql-based interface)
02C.06.02.04 Image Services
· All done
02C.06.02.05 Catalog Services
· All done


NCSA / University of Illinois


Current accomplishments:
02C.07.00 Processing Control and Site Infrastructure Management
 
NCSA management made significant progress in July. We hired a system development and management lead, Jason Alt, whose focus will be specifying the production infrastructure. We hired a post-doc who will work on running the LSST stack on DES data. After several more iterations throughout the month, the contract amendment between NCSA and AURA for purchasing capabilities was finalized at the very end of July. We began a review of existing design documents to gain an understanding of the production systems NCSA needs to construct. We also performed the initial conversion of LDM-240 roadmap milestones to tickets in the new JIRA long-term planning project.
 
02C.07.01 Processing Control
 
Technical Operations Working Group
In July we began writing IT use cases for the TOWG. The structure of IT operations that emerged is based on the ITIL four-layer cake model of service design, service transition, service delivery, and ITC (hardware). The structure was well-received by the TOWG and generated a lot of feedback and guidance about how to proceed with this model.
 
Data Management Control System: OCS
The CCS-DAQ-OCS-DM Workshop IV was held at NCSA early in July. Work during the three-day meeting included integration testing of the SAL software. There were issues with getting the Python interface to the C++ SAL software to work using Boost-Python (eventually resolved), but a test using Swig bindings was able to send log events and commands from Python to the appropriate C++ receivers (DM-3117, DM-3118). More detailed notes are available on Confluence here: https://confluence.lsstcorp.org/display/SYSENG/2015+July+08-10+CCS-DAQ-OCS-DM+Workshop+IV and here: https://confluence.lsstcorp.org/display/SYSENG/Integration+Testing+at+Workshop+IV . In preparation for the meeting we created SAL XML descriptions of messages sent to the OCS (DM-3101).
 
Data Management Control System: Alert Production
(DM-2263) - Investigation of reducing Condor job startup time continued in July. Most effort was spent communicating with the HTCondor team, including meeting with Miron Livny during his visit to NCSA at the end of the month. His suspicion is that the problem we are trying to solve (slow-starting jobs not being ready in time) shouldn't be happening at all, and that there is likely a better solution involving a change to our current configuration. Miron requested a plot showing the time from job submission to job start to better characterize the problem (e.g., can we reconfigure Condor, or is there a fundamental problem with Condor itself that the team will need to fix?).
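The quantity behind the requested plot, the delay between job submission and job start, can be computed along the following lines. This is a minimal sketch using only the Python standard library; the timestamps, the record layout, and the function name `startup_latencies` are hypothetical, and real values would have to be taken from the scheduler's own job records.

```python
from datetime import datetime
from statistics import median

# Hypothetical (submit_time, start_time) pairs; real values would come
# from the scheduler's job records.
jobs = [
    ("2015-07-28 10:00:00", "2015-07-28 10:00:12"),
    ("2015-07-28 10:00:05", "2015-07-28 10:00:50"),
    ("2015-07-28 10:00:10", "2015-07-28 10:00:31"),
]

FMT = "%Y-%m-%d %H:%M:%S"


def startup_latencies(records):
    """Return seconds elapsed between submission and start for each job."""
    return [
        (datetime.strptime(start, FMT) - datetime.strptime(submit, FMT)).total_seconds()
        for submit, start in records
    ]


latencies = startup_latencies(jobs)
typical = median(latencies)
# Crude text plot: one bar per job, one '#' per 5 seconds of delay.
bars = ["#" * int(seconds // 5) for seconds in latencies]
```

A distribution with a long tail would suggest occasional scheduling stalls, while uniformly long delays would point more toward configuration, which is the distinction the plot is meant to help draw.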
 
Data Management Control System: Network Emulation for Nightly Processing Testbed
(DM-3193) – In preparation for setting up a network emulation testbed, the network engineer inventoried all the servers and the network labels currently in use and correlated with switchports and VLANs to get a tabular view of the network configuration. The plan when this activity was proposed was to deploy policy-based routing to force traffic between the test base site machines and test-archive site machines through the Linux network emulator, but depending on what we learn from this, we may want to replace the Linux network emulator with a more sophisticated product.
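As an illustration of what the emulator node might apply to traffic routed through it, the sketch below builds a Linux `tc` command that uses the kernel's netem queueing discipline. This is an assumption: the report does not say which Linux emulation tool is in use, and the device name, delay, jitter, and loss values are hypothetical placeholders rather than measured characteristics of the base-to-archive link. The function only constructs the command; it does not execute anything.

```python
def netem_command(device, delay_ms, jitter_ms=0, loss_pct=0.0):
    """Return the argv list for a `tc ... netem` rule (nothing is executed).

    device    -- hypothetical interface facing the emulated long-haul link
    delay_ms  -- one-way delay to add, in milliseconds
    jitter_ms -- optional delay variation, in milliseconds
    loss_pct  -- optional random packet loss, in percent
    """
    if delay_ms < 0 or jitter_ms < 0 or not 0.0 <= loss_pct <= 100.0:
        raise ValueError("delay/jitter must be >= 0 and loss within 0-100%")
    cmd = ["tc", "qdisc", "add", "dev", device, "root", "netem",
           "delay", f"{delay_ms}ms"]
    if jitter_ms:
        cmd.append(f"{jitter_ms}ms")
    if loss_pct:
        cmd += ["loss", f"{loss_pct}%"]
    return cmd


# Placeholder values for illustration only.
example = netem_command("eth1", delay_ms=100, jitter_ms=10, loss_pct=0.1)
print(" ".join(example))
```

Building the argument list separately from running it makes it easy to review or log the exact rule before applying it with root privileges on the emulator host.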
 
Data Management Control System: Specification of Level 1 System
(DM-3230) - In July NCSA began a detailed refinement of the design specifications for the image acquisition (a.k.a. “Level 1”) system. Both functional and physical breakdowns of the system were considered at a high level. Prose and figures were drafted with citations to requirements/details iterated in the design documents. A working draft is available on Confluence here: https://confluence.lsstcorp.org/display/~petravick/Breakdown+of+the+to-be+facility .
 
Data Management Control System: Processing DECam data with the LSST Stack
NCSA’s new post-doc began looking into the obs_decam package and communicating with the package developers (DM-3172). The package has not yet been incorporated into the LSST stack (see RFC, https://jira.lsstcorp.org/browse/RFC-74 ), and the current version neither works with the stack nor has working unit tests. She began learning the LSST stack by reproducing errors in obs_decam (DM-3351), as well as the basic afw packages for handling images, tables, etc. (DM-3352).
 
Pipeline Execution Services: Process Execution Framework
Work continued in July on constructing a PEF prototype, including defining a workflow and making small modifications to a local branch, pipe_base-x (DM-3120). Additionally, a graphical representation of the prototype was created to visualize the execution framework (DM-3121). Notes on this are forthcoming.
 
02C.07.02 Infrastructure Services
 
ISO
In July the ISO reviewed LSE-78 (Observatory Network Design) and discussed it with Ron Lambert. He also met with Iain Goodenow to discuss the observatory site security plan (SCADA enclave) and AUP. He began drafting a document that outlines the workflow for responding to a security incident and a template report for incident response. Finally, he prepared a CFP whitepaper on the LSST security program for the NSF Cyber Security Summit next month.
 
System Administration and Operation Services: Puppet
(DM-2237) – Work continued with testing and configuring Puppet for managing the development servers. Base configurations were updated to include installation of development tools, such as the requested gcc debug tools. New modules (otp, crashplan, nfs_client, selinux, timezone) were set up to finish base configurations and a new firewall module was written and tested.
 
02C.07.03 Environment and Tools
 
Development environment: OpenStack
NCSA’s Nebula OpenStack system came online in July in very friendly user mode. Accounts were set up for a small group of LSST developers who agreed to participate in the commissioning of Nebula. Greg Daues was appointed service manager as liaison to the group, and the remainder of the month was spent testing and debugging issues identified by the LSST team (e.g., DM-3225, DM-3219, DM-3227).
 
(DM-3185) – We also used the ISL OpenStack to investigate starting up VMs and configuring them for use in different scenarios (e.g., processing, build and test, etc.) via Python scripts that work against the OpenStack APIs. We were able to programmatically launch instances on an internal OpenStack network. We discovered potential permission issues with accessing servers through external networks and experimented with configurations to change the firewall behavior.
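A minimal sketch of this kind of scenario-driven launch script follows. It is written against the openstacksdk library, which is an assumption: the report does not say which Python client was used. The scenario table, flavor and image names, cloud name, and network ID are all hypothetical placeholders.

```python
# Hypothetical per-scenario VM definitions; names are placeholders.
SCENARIOS = {
    "processing": {"flavor": "m1.xlarge", "image": "centos-7-stack"},
    "build-test": {"flavor": "m1.medium", "image": "centos-7-build"},
}


def server_request(scenario, index, network_id):
    """Describe one VM for the given scenario as a plain dictionary."""
    spec = SCENARIOS[scenario]
    return {
        "name": f"lsst-{scenario}-{index:02d}",
        "flavor": spec["flavor"],
        "image": spec["image"],
        "networks": [{"uuid": network_id}],
    }


def launch(cloud_name, scenario, count, network_id):
    """Boot `count` VMs on an existing network and wait until each is active."""
    import openstack  # openstacksdk (assumed client), imported only when launching

    conn = openstack.connect(cloud=cloud_name)  # reads clouds.yaml
    servers = []
    for i in range(count):
        req = server_request(scenario, i, network_id)
        image = conn.image.find_image(req["image"], ignore_missing=False)
        flavor = conn.compute.find_flavor(req["flavor"], ignore_missing=False)
        server = conn.compute.create_server(
            name=req["name"],
            image_id=image.id,
            flavor_id=flavor.id,
            networks=req["networks"],
        )
        servers.append(conn.compute.wait_for_server(server))
    return servers


# Inspect what would be requested without contacting any cloud.
print(server_request("processing", 0, "internal-net-id"))
```

Keeping the request description separate from the API calls allows the per-scenario configuration to be checked without access to an OpenStack endpoint.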
 
02C.07.04 Site Infrastructure
 
Purchasing equipment was still blocked until the finalization of the procurement contract amendment at the very end of July, and focus was placed on design specification of the “to-be” production system. Administration of the development cluster at NCSA included adding two new user accounts, installing gcc 4.8 and gcc debugging tools, setting up temporary VMs to support the OCS integration work, and scheduled monthly maintenance.
 
Planned activities:
 
02C.07.00 Processing Control and Site Infrastructure Management
 
Management activities in August will focus on assessing S15 status and specifying a coherent plan for upcoming work in FY16 and beyond. We have several interviews scheduled for the three open positions at NCSA. We will also prepare for and attend the LSST2015 conference.
 
02C.07.01 Processing Control
 
TOWG 
Work in August will follow guidance received from the TOWG. We will consider each site in an attempt to define common IT operations use cases, and identify roles to begin getting a reasonable understanding of needed effort/staffing during operations. We will migrate the notes into Enterprise Architect to organize the use cases.
 
Data Management Control System: Alert Production
We will work on replicating the slow job startup problem for Miron Livny and the HTCondor team and generate the requested plot showing time from job submission to job start to better characterize the problem. The DES group at NCSA has a utility that can retrieve this information from HTCondor logs; we will look into this.
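One way to extract the data for the requested plot, sketched here independently of the DES utility mentioned above, is to pair the submit (000) and execute (001) events in an HTCondor user log. The sketch assumes the classic text log format with `MM/DD HH:MM:SS` timestamps, which omit the year, so a year is supplied as a parameter; the sample log lines and host addresses are made up for illustration.

```python
import re
from datetime import datetime

# Matches event header lines such as:
#   000 (1234.000.000) 07/15 10:00:00 Job submitted from host: <...>
EVENT_RE = re.compile(
    r"^(?P<code>\d{3}) \((?P<cluster>\d+)\.(?P<proc>\d+)\.\d+\) "
    r"(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
)


def startup_delays(log_lines, year=2015):
    """Map (cluster, proc) to seconds between submission and first execution."""
    submitted = {}
    delays = {}
    for line in log_lines:
        match = EVENT_RE.match(line)
        if match is None:
            continue  # detail lines and "..." separators
        job = (int(match.group("cluster")), int(match.group("proc")))
        when = datetime.strptime(
            f"{year}/{match.group('stamp')}", "%Y/%m/%d %H:%M:%S"
        )
        if match.group("code") == "000":  # job submitted
            submitted[job] = when
        elif match.group("code") == "001" and job in submitted and job not in delays:
            delays[job] = (when - submitted[job]).total_seconds()
    return delays


# Made-up sample in the assumed log format.
SAMPLE_LOG = """\
000 (1234.000.000) 07/15 10:00:00 Job submitted from host: <10.0.0.1:9618>
...
000 (1234.001.000) 07/15 10:00:01 Job submitted from host: <10.0.0.1:9618>
...
001 (1234.000.000) 07/15 10:00:42 Job executing on host: <10.0.0.2:9618>
...
001 (1234.001.000) 07/15 10:02:01 Job executing on host: <10.0.0.3:9618>
...
""".splitlines()

for job, seconds in sorted(startup_delays(SAMPLE_LOG).items()):
    print(job, seconds)
```

The resulting per-job delays can be passed to any plotting tool as a histogram or as a time series against submission time.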
 
An action item from the CCS-DAQ-OCS-DM Workshop was to revise LSE-70 (OCS Communication Architecture) and LSE-209 (software component to OCS interface control) to clarify the documents. In August, NCSA will review these changes and contribute comments.
 
Data Management Control System: Specification of Level 1 System
We will continue to refine the functional design specification documents and drawings of the Level 1 system, looking at components of the system in more detail to create a layered design specification. Our new systems lead will be getting up to speed on existing documentation in order to tackle the physical breakdown of the system.
 
Data Management Control System: Processing DECam data with the LSST Stack
Work with the obs_decam package will proceed as it is incorporated into the LSST stack. The plan is to identify DES data sets and start building unit tests of raw and calibrated DES data to run through the pipeline.
 
Pipeline Execution Services: Process Execution Framework
Plans for the prototype execution framework include generating test tasks, creating a process demo using SDSS data, and incorporating the HSC parallelism framework.
 
02C.07.02 Infrastructure Services
 
ISO
In August Alex will attend the NSF Cyber Security Summit as the LSST ISO and give a presentation on the LSST security program and the SCADA enclave.
 
Update Sizing Model
As per the agreement outlined in the finalized contract amendment, the Sizing Model (LDM-143, LDM-144) will be refreshed with new costing. This has not been done since 2013, so we anticipate this initial refresh will take considerable time. However, the technology refresh that will incorporate new/updated technologies into the model will not occur until the next cycle.
 
System Administration and Operation Services: Puppet
We expect the final testing of Puppet with base configuration manifests to be completed in August. The final changes could be released during the August scheduled maintenance.
 
02C.07.03 Environment and Tools
 
Development environment: OpenStack
NCSA will be focused on testing and configuring the Nebula, and we will continue to work with the OpenStack IT group, sys admins, and the LSST developers.
 
02C.07.04 Site Infrastructure
 
With the contract in place, in August we will begin making a near-term hardware refresh plan. We will revisit the infrastructure needs specified by the LSST developers (e.g., SUI and Qserv teams), as well as identify development infrastructure needing upgrade or replacement.

 


NOAO


Current accomplishments:
02C.08.00 International Communications and Base Site Management Engineering and Integration
 
This month approximately 2-3 days were spent on T-Cam preparations and JIRA tickets.
 
02C.08.01 Base Site
 
I participated on the panel for the design selection of the company that will draw up plans for the LSST Base buildings including the Data Center.
 
02C.08.03 Long-Haul Networks
 
Reuna and AURA/LSST held RFI meetings with several DWDM vendors, namely Padtec, Ciena, Infinera, ECI Telecom, Alcatel, Huawei, Coriant, and Adva. It was decided at these meetings that the first official RFP date would be 2 September, when the project will be presented to the vendors. A procurement process was designed and passed to the AURA compliance office for review.
 
We received an offer of $12/metre from Somyl to lay buried fiber cable from Tololo to Pachon on the principal road that connects the two sites. However, it has been decided that for now we will remain with the original proposal from Telefonica for fiber on existing power posts following the San Carlos valley and up the side of Cerro Pachon. At some future date there may exist the possibility of LSST or AURA/LSST building this redundant loop.
 
Time was spent in preparing a document that describes a network design architecture for the Base and Summit computer facilities. This has been forwarded for review by the relevant teams.
 
Planned activities:
 
02C.08.03 Long-Haul Networks
 
In August we will prepare the document for the RFP in September in collaboration with the Compliance offices of AURA.
 
We will attend meetings at Bremerton and give a short presentation on the status of the Chilean and International networks, along with a discussion of the proposed network design for Base and Summit.
