Check out my interview with Jeff Kantor, data management project manager for the Large Synoptic Survey Telescope (LSST), currently under development. (Edit: the story was also picked up by Slashdot.) The telescope itself will be located in Chile, where initial data processing will be done, with each night's data then sent on to the U.S.
The LSST is, in Jeff's words, "a proposed ground-based 6.7 meter effective diameter (8.4 meter primary mirror), 10 square-degree-field telescope that will provide digital imaging of faint astronomical objects across the entire sky, night after night."
When it's fully operational in 2016, the LSST will: "Open a movie-like window on objects that change or move on rapid timescales: exploding supernovae, potentially hazardous near-Earth asteroids, and distant Kuiper Belt Objects.
"The superb images from the LSST will also be used to trace billions of remote galaxies and measure the distortions in their shapes produced by lumps of Dark Matter, providing multiple tests of the mysterious Dark Energy."
Read on for some details about the software development process behind some of the world's most advanced image processing software...
When it's finished, the LSST will be trained on both near-Earth objects and distant galaxies. Its 3.2 gigapixel camera, fed by an 8.4 meter primary mirror, will fire off an exposure every 15 seconds, covering the full sky every three nights and generating 30 terabytes of data per night, for a total database of 150 petabytes.
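Those headline figures hang together on a quick back-of-envelope check. Here's a short Python sketch; the 2 bytes per pixel and the 10-hour observing night are my assumptions, not numbers from the interview:

```python
# Back-of-envelope check of the LSST data rates quoted above.
GIGA = 10**9
TERA = 10**12

pixels_per_exposure = 3.2e9      # 3.2 gigapixel camera
bytes_per_pixel = 2              # assumed 16-bit raw CCD samples

image_bytes = pixels_per_exposure * bytes_per_pixel
print(f"Per exposure: {image_bytes / GIGA:.1f} GB")         # ~6.4 GB

night_seconds = 10 * 3600        # assumed 10-hour observing night
exposures_per_night = night_seconds / 15                    # one every 15 s
raw_bytes = exposures_per_night * image_bytes
print(f"Raw data per night: {raw_bytes / TERA:.1f} TB")     # ~15 TB
```

The raw total comes out around 15 TB, so the quoted 30 TB per night presumably also counts calibration frames and derived data products. Note that the ~6.4 GB per exposure matches the per-image figure in the alert-latency question below.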
I asked Jeff whether the data will be kept "forever", or whether some data is classed as more or less important than the rest and so might be discarded after a certain period: "The LSST will issue a new data release (images, alerts, and catalogs) every year. Between releases, the latest images, catalog updates, and alerts will also be available for those who don’t want to wait. All data releases will be kept for the entire 10-year life of the survey, and presumably, for as long as the data is useful. That could be a very long time, as evidenced by the fact that there are projects currently working on digitizing astronomical images from glass plates taken in a 19th century survey."
In addition to using the ICONIX Process (more on that below): "We also have a formal Software Development Plan, including Configuration Management, Quality Assurance, Verification and Validation plans, as well as Coding and Review standards. Finally, we have a complete develop, build, deploy environment, based on several off-the-shelf tools, customized to LSST needs."
What will the LSST be doing that no other telescope has done before? "No other telescope has the combination of width and depth of field and fast exposure that LSST has. This will permit it to image the entire visible sky every 3 nights, essentially capturing the entire universe to disk and making it accessible via the Internet."
In the Register interview, Jeff explains that they're using the ICONIX Process (as mapped out in my collaboration with Doug Rosenberg, Use Case Driven Object Modeling with UML: Theory and Practice) for their software analysis and design. The ICONIX Process is intrinsically a use case driven process. But as so much of the software will be doing core image processing, what will their use cases look like?
"Most of our use cases on the data reduction side have minimal human user interaction. Instead, there are a great many steps that are driven by configuration (policy). We write our use cases focusing on what the system does in the process of creating, updating, and deleting Domain Objects, the tangible astronomy data products in their intermediate and final states."
How many people/teams are involved? "In the Design and Development phase, there are currently 10 partner institutions and approximately 30 people participating in development of the Data Management System. Most are part-time, so the effective effort is more like 15 full-time equivalents. This number will peak at about 50 developers during the Construction Phase.
"The Enterprise Architect models and the application itself are hosted under Microsoft® Windows Server 2003®, allowing users to log in remotely using Remote Desktop Clients. This approach conveniently accommodates remote Macintosh, Linux and Windows users."
Unit testing must play a large role in a system like this. What procedures do you have in place to ensure images are processed correctly? "Our coding standards and process require unit test development and execution. We are just now evaluating static analyzers and coverage tools to ensure compliance with the standards and process."
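The interview doesn't show their actual tests, but the flavour of a unit test under such standards is easy to sketch; subtract_bias here is a toy stand-in of mine, not an LSST function:

```python
import numpy as np

def subtract_bias(image: np.ndarray, bias: float) -> np.ndarray:
    """Toy reduction step: remove a constant detector bias level."""
    return image - bias

def test_subtract_bias_preserves_shape_and_values():
    raw = np.full((4, 4), 100.0)
    out = subtract_bias(raw, bias=25.0)
    assert out.shape == raw.shape      # geometry untouched
    assert np.allclose(out, 75.0)      # every pixel shifted by exactly the bias

test_subtract_bias_preserves_shape_and_values()
print("ok")
```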
There's a requirement to process each 6.4 GB image within 60 seconds, in order to provide astronomical transient alerts to the community. How will this requirement be guaranteed?
"The Nightly Processing Pipeline, which is responsible for processing the image and issuing the alerts within 60 seconds, will be parallelized across roughly 3000 cores (1/CCD amplifier in the LSST imager). The pipeline middleware will ensure fault-tolerance so that should any process lag or fail, it will be rescheduled and the remaining processes will be unaffected. We prototype the pipelines at a 5% to 20% scale in our annual Data Challenges in our Design and Development phase, to validate that we can achieve the required performance."
Have there been any special challenges? Surprises? "There are many challenges, as documented in our web sites. Examples include handling the data at petabyte volumes, achieving unprecedented levels of photometric and astrometric accuracy across a huge image and many time epochs, and dealing gracefully with hardware and software failures."
More at The Register.