The on-site quality-assurance system for Hyper Suprime-Cam: OSQAH

Abstract

We have developed an automated quick data analysis system for data quality assurance (QA) for Hyper Suprime-Cam (HSC). The system was commissioned in 2012–2014, and has been offered for general observations, including the HSC Subaru Strategic Program, since 2014 March. The system provides observers with data quality information, such as seeing, sky background level, and sky transparency, based on quick analysis as data are acquired. Quick-look images and validation of image focus are also provided through an interactive web application. The system is responsible for the automatic extraction of QA information from acquired raw data into a database, to assist with observation planning, assess the progress of all observing programs, and monitor long-term efficiency variations of the instrument and telescope. Enhancements of the system are being planned to facilitate final data analysis, to improve the HSC archive, and to provide legacy products for astronomical communities.

1 Introduction

The Hyper Suprime-Cam (HSC) is a wide-field optical imaging camera on the Subaru Telescope, which has a 1.°5 field of view (Miyazaki et al. 2012, 2018; Komiyama et al. 2018). The HSC was commissioned in 2012–2014 and has been offered for general observations since 2014 March. While individual, general, PI-type observing programs are executed with the HSC, since 2014 March the Subaru Telescope observatory has also been carrying out the Subaru Strategic Program (SSP), a large-scale multi-waveband survey using 300 nights over five years. The HSC-SSP is an imaging survey driven by weak-lensing science, along with additional analyses of the data set. A detailed survey description can be found in Aihara et al. (2018a). The data products from large surveys are considered to be a legacy shared by astronomical communities, and they are expected to have huge impacts on various scientific fields, as demonstrated by the success of the Sloan Digital Sky Survey (SDSS). In this regard, many world-leading optical imaging surveys have been planned and undertaken, and most of them aim to make the final data products available to the public: the Canada–France–Hawaii Telescope Legacy Survey (CFHTLS; Gwyn 2012), Pan-STARRS (Chambers et al. 2016), the Dark Energy Survey (DES; DES Collaboration 2016), and a planned survey by the Large Synoptic Survey Telescope (LSST; LSST Collaboration 2009). Quality assurance (QA) is a procedure for evaluating and ensuring data quality, and also the operation of performing such an evaluation. The QA of acquired data is one of the key components in making well-calibrated, homogeneous data products that involve data sets from a whole survey period in large surveys, e.g., Kosugi et al. (2000), Shaw et al. (2010), and Furusawa et al. (2011). In CFHTLS, the observatory performs pre-processing using the Elixir system (Magnier & Cuillandre 2004), combined with a monitoring system for sky transparency, SkyProbe (Cuillandre et al. 2002). The European Southern Observatory (ESO) has been leading queue-mode observations, equipped with a function to conduct the quality control of data taken with their facility's instruments (Hanuschik et al. 2002; Hanuschik 2007). All calibration data are processed to assess the completion of programs as well as instrument health, and the results need to be certified by the observatory before being archived.
The DES performs quality checks of data within about a day of the observation, updating an exposure list based on the completion of exposures, which is then input to observation planning (Mohr et al. 2012; Diehl et al. 2016). We conclude that real-time evaluation of data with a consistent configuration across an observing period is essential to achieve reliable data products for the astronomical community. In particular, fast feedback from automated QA processing of observations is desired to obtain uniform data sets within limited allocated times. Fast QA benefits observatory operation both in long-term surveys such as the HSC-SSP and in relatively small, general programs, by making efficient use of telescope time. In the HSC-SSP, we aim to perform automated QA of raw data during observations, immediately after acquisition at the observing site. With results from the QA processing, an observation plan can be modified to follow temporal changes in the sky conditions. We register all derived QA parameters in a database to assess the survey progress and completion. This database is also used to make a list of data images to be processed in a science data production run, by applying a single set of data-quality conditions to all the acquired data. A preliminary version of such a system was presented in Furusawa et al. (2011): an automated quick analysis system for data taken by Suprime-Cam (Miyazaki et al. 2002), which can be considered a prototype of the larger HSC. This preliminary system (SC-RCM) proved the effectiveness of on-site QA during observation, although it had limited functions, deriving only basic characteristics of the data, and was not used for survey management or data production. Based on the experience with SC-RCM, we designed and implemented the on-site QA system for the HSC observations, now applicable to survey management and data production for long-term observing programs. In this paper, we describe the on-site QA system developed for HSC (OSQAH). In section 2, the aims of this system and its requirements are described. We explain the terminology used in this system in section 3. Hardware components are presented in section 4, and data flow in the system is described in section 5. Software components (data analysis software, orchestration software, and the visualization tool) are presented in sections 6 through 8. Section 9 describes the database utilized in the OSQAH system. Finally, we discuss selected results from a three-year operation of the system in section 10, and the summary is presented in section 11.

2 Aims and requirements

The primary objective of the on-site QA system (OSQAH) is to evaluate the data quality in a timely manner and provide immediate feedback to the running observation. This is critically important in order to make the final data products reliable. The information tagged on the exposures is also used in the later off-line data analysis phase to select input data. We also plan to extend the system in the future to allow observers to perform mosaic-stack analysis on the processed data generated by the QA processing, and to provide a way of downloading these processed data to their sites for further scientific analysis.
To meet these goals, we designed the system to implement the following functions: (1) automated and fast on-site QA, with a consistent set of configurations over the survey period; (2) user interfaces to visualize and select QA results and to perform data flagging, with proper access control for multiple observing programs; and (3) database registration of QA results, with traceability of the QA processing, available to science data production. In the following sections, the current status of implementation of the OSQAH system is described.

3 Terminology

In data handling by the OSQAH, we use specific terms to refer to HSC data. These terms are derived from the archival system of the Subaru Telescope and the HSC data analysis software. A data set of an exposure with HSC comprises 112 CCDs, of which 104 CCDs are for science images and the other eight are for assessment of the telescope focus. FRAMEID is an identifier of a single raw CCD image registered in the Subaru data archive, e.g., HSCA00123456. In the HSC data analysis software (section 6), the pair of terms visit and ccd are used to identify a particular CCD image originating from an exposure (or shot). visit is an even number that represents an exposure, and is incremented by 2 every exposure. ccd is a sequential integer number [0..111] referring to a single CCD image in a visit. There is a one-to-one correspondence between (visit, ccd) and FRAMEID. For further information, see subsection 3.1 in Aihara et al. (2018b) and the instrument web page.1 Throughout this paper, we use the term "QA parameters" for individual measurements or estimated values of data characteristics, and any values based on their combinations, which are used for QA. "QA results" is used to represent results derived by QA processing, including the QA parameters, processed files, logs, and flags based on these pieces of information.

4 Hardware components

The hardware components employed in the system are summarized in figure 1. The present OSQAH system is composed of 15 computer nodes, which are located in the base facility at Hilo, Hawaii. All nodes are based on the 64-bit PC architecture with Xeon processors, running CentOS 6.9 x86_64. One node (16 cores, 16 GB RAM) is assigned for master control, which runs a main program to orchestrate an analysis cycle for the data inputs of each visit. This node also serves as a master of a batch job system. As slave computing nodes of the batch system, we allocate eight nodes. Five of them (CCD nodes; 20 cores, 24 GB RAM per node) process QA on a CCD-by-CCD basis, and the other three computing nodes (two mosaic and one user; 20 cores per node, 64 GB RAM for mosaic, and 256 GB RAM for user) conduct analyses of a visit's data set across CCDs.

Fig. 1. Hardware components and data flow in the system. The right-hand panel shown with a rounded box represents the OSQAH system, where rectangular boxes show the constituent computing resources. Within the OSQAH, local area network connections of Gigabit Ethernet and InfiniBand (QDR, 40 Gbps) for inter-node communication, and Fibre Channel (8 Gbps) connections for storage, are shown. Transfer of the raw data is shown with thick arrows in the left-hand box. Observers connect to the web server to monitor results. (Color online)
We have four file servers equipped with Redundant Arrays of Independent Disks (RAID) storage units, each of which has a working area with a 28 to 77 TB capacity. These four nodes are Network File System (NFS) mounted from each of the master, slave computing, and web server nodes. We have two more server nodes. One of them hosts a database management system, which records all raw data and analysis results, and responds to inquiries from other nodes. The last node is dedicated to the web server. This node runs the web application that provides the resultant QA information to users through a graphical user interface. This node also hosts a service for system resource monitoring (MUNIN).2 All the nodes except the database node share a local area network built upon the InfiniBand interconnect (QDR-IB, 40 Gbps) with IP-over-InfiniBand (IPoIB) enabled, which is used for the above NFS mounts. The network provided by the InfiniBand enables fast file reads and writes by analysis processes between the nodes. The Gigabit Ethernet connection serves for general communication, such as exchanging commands, and for the up-link to the external network, including the observing control system.

5 Data flow

The HSC camera acquires raw data comprising 112 CCD images per exposure, or visit, and generates separate FITS files for individual CCDs (Utsumi et al. 2012). The size of the raw data is about 18 MB per CCD, or 2 GB per exposure. In SSP observations, each pointing in the Wide field is covered by four to six visits depending on the filter, and 150 to 200 visits, or 300–400 GB, are obtained in a typical night. As soon as the data acquisition is done, the FITS files of raw data are transferred to the Subaru Observing Control System (Gen2) (Jeschke et al. 2008) at the summit. Figure 1 also shows the flow of raw data. Gen2 dispatches the observing commands given by observers, communicating with the telescope and the instrument. Gen2 is also responsible for conveying the data to the Subaru Telescope Archive System (STARS) (Takata et al. 2002; Winegar 2008). Typically, the whole process of data transfer and registration in STARS, i.e., the time until users can access the archived data, takes more than 10 minutes. The data are further transferred to a mirror archive hosted in the headquarters of the National Astronomical Observatory of Japan (NAOJ) in Tokyo, Japan (MASTARS), which takes from several tens of minutes up to a few hours before the data become retrievable. Thus, access to either STARS or MASTARS is not suitable for quick data evaluation. Gen2 also transfers the same data set to a small-scale data analysis server (DA in figure 1) to provide a quick look for observers. However, this server is not optimized for HSC, with limited CPU and disk resources, and hence is not capable of fast QA processing. Therefore, we add another, direct transfer of the raw data from Gen2 to the OSQAH over Gigabit Ethernet. This transfer enables the OSQAH system to access the raw data as soon as the data acquisition is done, typically within a minute.
To fetch the raw data, the OSQAH communicates with Gen2 via server–client software (datasink) based on the XML-RPC inter-process protocol. This software, written in pure Python, is developed as part of the Gen2 components. The server software runs on the Gen2 side and accepts connections from the OSQAH during an HSC observing run. The client software runs on a file server of the OSQAH at all times. The client is notified by the server when new HSC raw data become available, and then starts a session to receive the files. The new files are first placed in a dedicated directory on a file server, and then relocated by the orchestration software for QA analysis (section 7). For load-balancing purposes, the relocation is executed only when the number of QA analysis jobs queued in the system is small enough (the threshold is currently set to 50). Results of the QA analysis are written into either local disks of computing nodes or NFS areas on a file server, depending on the analysis stage. At present, observers are not allowed to access raw and processed data files directly by logging into any of the OSQAH nodes. We plan to prepare a method of data access through a user node as a future development item. Observers usually check the results through a web application, using an access control based on observing programs (section 8).

6 Data analysis software

The data analysis software is developed based on a modified version of the HSC data analysis pipeline hscPipe (Bosch et al. 2018). This pipeline is built on an analysis framework developed by the LSST collaboration (Ivezić et al. 2008; Axelrod et al. 2010; Jurić et al. 2015). Top-level algorithms are written in Python, combined with C++ code for the data processing parts that must run fast. The base hscPipe version used in the OSQAH applications is 2.12.4d_hsc. This version was used in the first internal data release of HSC-SSP within the collaboration, and is slightly older than that used for the recent HSC-SSP public data release (Aihara et al. 2018b). Nevertheless, this version is capable of the basic image processing and measurements required for QA. The data analysis software comprises five major analysis stages of QA processing: (1) frame analysis, (2) exposure analysis, (3) focus-offset analysis, (4) tile image analysis, and (5) image quality map analysis. Figure 2 shows the relation of the analysis stages and their outputs.

Fig. 2. Analysis stage components processed by OSQAH, and output files from those analysis processes. The arrows represent inputs/outputs of data to/from analysis stages. Frame analysis, tile-image analysis, and focus-offset analysis run independently. The remaining analysis stages are dependent upon other stages, in which outputs from a previous stage need to be input to the next stage.

Configurations for the application processes are given by the orchestration software, which is described in section 7. All processes write their standard output and standard error to log files in an operation directory prepared separately for each night.
6.1 Frame analysis

In the first step of the QA analysis cycle for a visit's data set, the OSQAH performs the frame analysis, which is a CCD-by-CCD process. To extract QA parameters for each CCD, this stage conducts overscan subtraction, flat-fielding, sky subtraction, and basic measurements of sources for the 104 scientific CCDs, excluding the eight off-focus CCDs. The QA parameters include the seeing, overscan levels, sky levels, and sky transparency of each CCD. A single set of DOMEFLAT data are used in flat-fielding for all observing runs, for consistent QA analysis. Their count levels are normalized so that the field center is unity, to ease estimation of the sky transparency across the field of view (FOV). If the data type of the input data, determined by the DATA_TYP keyword of the FITS header, is DARK, DOMEFLAT, or SKYFLAT, this stage only performs overscan subtraction and count-level measurement, and the other QA parameters are not measured. For BIAS data, only the count-level measurement is done, without overscan subtraction. Flat-fielding in the SSP data production is performed with the DOMEFLAT data only, to take advantage of data acquired with stable brightness and color temperature of the illumination sources across the observing runs. Some other observing programs obtain SKYFLAT (twilight flat) data for more uniform illumination. To perform calibration of astrometry and magnitude zero-point, this stage carries out cross-matching of detected sources on a CCD image with external reference catalog sources (SDSS-DR8: Aihara et al. 2011). The external catalog is read by the analysis process through FITS files divided by sky tessellation in the format of astrometry.net (Lang et al. 2010). The cross-matching is done using the pattern-matching algorithm presented by Tabur (2007). Astrometry is then carried out to determine the World Coordinate System (WCS) of each CCD with the SIP convention (TAN-SIP)3 (Shupe et al. 2005). The distortion of HSC due to the optics is about 3% at the field edge (Miyazaki et al. 2018). The TAN-SIP convention is used to model the distortion in a CCD with third-order coefficients for the non-linear terms. Typical fitting residuals are ∼0.15″, which is small enough to perform cross-matching with the reference catalog. Magnitude zero-points are derived using the reference sources in each CCD. Here, the SDSS magnitudes are transformed to the HSC native-band magnitudes by pre-defined color terms. These color terms are determined by convolving spectral energy distributions from the stellar library of Gunn and Stryker (1983) with the response functions of SDSS and HSC, and fitting a quadratic polynomial to the difference between the two systems' magnitudes as a function of the SDSS color of the sources. Thus, the magnitude zero-points, and the source magnitudes based on these zero-points, are determined in the HSC native-band system. Sky transparency is estimated based on the zero-points, as described in sub-subsection 6.1.2. The 104 CCDs are processed by 104 independent parallel Python processes running simultaneously, spread over the five CCD slave nodes (see also section 7). As described in section 9, at the end of the process for each CCD, the extracted QA parameters are registered in the frame table of the hsc_onsite database. At the start and the end of the processes, time stamps of the process execution are also recorded in the file_mng_onsite table of the hscpipe database.
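For illustration, the color-term transformation described above (a quadratic polynomial in the SDSS color added to the SDSS magnitude) might be coded as in the following sketch. The function name and the coefficient values are hypothetical placeholders, not the actual pre-defined color terms.

import numpy as np

def sdss_to_hsc_native(m_sdss, color_sdss, c0, c1, c2):
    """Apply a quadratic color term to convert an SDSS magnitude to the
    HSC native-band system:  m_hsc = m_sdss + c0 + c1*color + c2*color**2.
    (c0, c1, c2) stand in for the pre-defined color terms fitted from the
    Gunn & Stryker library; the values used below are made up."""
    color = np.asarray(color_sdss, dtype=float)
    return np.asarray(m_sdss, dtype=float) + c0 + c1 * color + c2 * color ** 2

# Example with made-up coefficients for an r-like band and a g-r color:
m_hsc = sdss_to_hsc_native(m_sdss=18.2, color_sdss=0.6,
                           c0=0.002, c1=-0.032, c2=-0.005)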
As a result, we obtain processed CCD FITS images with the calibrated WCS and zero-point set in the header, and catalog files in the FITS BINTABLE format. This stage is developed based on the hscProcessCcd task for single-visit processing in hscPipe. Major differences from the formal production in HSC-SSP are as follows: (1) it uses a fixed PSF model (0″.9 in FWHM) throughout image processing and model-fitting measurements, without on-the-fly PSF model determination, for the sake of fast, coarse QA processing, except that an updated seeing measurement algorithm is employed; (2) it introduces a sky transparency measurement; (3) it provides an estimate of the focus offset asynchronously with the other QA processes, since confirmation that images are in focus is essential to keep observing and needs quicker feedback from the system than the other QA; (4) it terminates any procedure that requires an unusually long time, to prevent blockage of other QA analysis processes; and (5) it measures only raw average counts for BIAS, DARK, DOMEFLAT, or SKYFLAT data, without flat-fielding. Two major components of the frame analysis, the estimates of seeing and sky transparency, are described in the following sections.

6.1.1 Seeing measurement

The basic idea of the seeing measurement is derived from the on-site QA system for Suprime-Cam (Furusawa et al. 2011). The code is ported to an hscPipe-based function (sizeMagnitudeMitakaStarSelector), and the parameters are tuned with a range of HSC data in various bands. The seeing is measured for sources detected above 5σ of the sky fluctuation. A combination of their sizes and instrumental magnitudes is used to pick a group of bright point sources. The size is defined as the simple average of the diagonal components of the adaptive moments, (Ixx + Iyy)/2. Based on our experiments, we set a limiting instrumental magnitude to avoid the influence of faint cosmic rays. To determine the limiting magnitude, the code first makes a number count of all sources as a function of magnitude. The number counts are then summed from the bright end toward fainter magnitudes until the sum reaches 15% of the total number of sources (Nsource). Only the sources brighter than this magnitude are considered in the next step. We adjust the magnitude limit in blue bands so that only sources with high signal-to-noise ratio (S/N) are selected, multiplying the above 15% condition by a factor of 2/3 for g and NB0718, 1/2 for NB0468 and NB0515, and 2/5 for NB0387. For data taken with a short exposure time and having a small total number of sources after the magnitude cut, the 50 smallest sources are picked first, and then the 30 brightest sources are chosen from the resultant sample. The other data are screened in the reverse order, i.e., picking the brightest sources first and then the smallest sources. We find that this approach is empirically robust in selecting a group of point sources for data of various qualities. The mean of the source sizes is determined by applying a 3σ clipping of outliers, which gives a decent initial estimate of the seeing. To determine the final seeing measurement, we make a histogram of the source sizes of the sample with a bin step of 0.2 pixels, for a size range of ±0.75 pixels around the above seeing guess. The mode of the histogram is taken as the final seeing. An example of the final seeing measurement is shown in figure 3.
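A minimal numpy sketch of the final two steps described above (the 3σ-clipped initial estimate and the histogram mode) follows. The function is illustrative only; the actual implementation is the hscPipe-based star selector mentioned earlier.

import numpy as np

def final_seeing(sizes, nsigma=3.0, halfwidth=0.75, binstep=0.2):
    """Estimate the seeing (in pixels) from the sizes of candidate point
    sources: a sigma-clipped mean as the initial guess, then the mode of
    a 0.2-pixel histogram within +/-0.75 pixels of that guess."""
    sizes = np.asarray(sizes, dtype=float)

    # Initial guess: mean of the sizes after iterative 3-sigma clipping.
    clipped = sizes.copy()
    for _ in range(5):
        mean, std = clipped.mean(), clipped.std()
        keep = np.abs(clipped - mean) < nsigma * std
        if keep.all():
            break
        clipped = clipped[keep]
    guess = clipped.mean()

    # Final value: the center of the most populated histogram bin.
    bins = np.arange(guess - halfwidth, guess + halfwidth + binstep, binstep)
    counts, edges = np.histogram(sizes, bins=bins)
    imode = int(np.argmax(counts))
    return 0.5 * (edges[imode] + edges[imode + 1])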
Fig. 3. Diagnostic plot of seeing estimation. This is an example for data with visit = 120690 and ccd = 49. In the left-hand panel, all sources detected at 5σ, and those used to estimate a rough seeing range in the first step (after filtering for saturation and the magnitude limit), are plotted with plus and cross symbols, respectively. The filled circles are the final sample used to determine the seeing of the CCD image, which lie within the size range of compact sources (dot–dashed lines). The right-hand panel shows a histogram of the final sample of compact sources, and the mode is taken as the final seeing (dashed line in both panels), which is 3.58 pixels or 0″.6 in this example. (Color online)

6.1.2 Sky transparency estimation

The transparency of the sky is estimated as the ratio of the number of detected photons to that anticipated from known reference stars under a clear-sky condition. In the current system, we use the SDSS-DR8 catalog for the reference stars. The anticipated numbers are converted from the magnitude zero-points (table 1), which are derived from the sensitivities (mag) in each band provided on the instrument page.4 Some of the magnitudes have been slightly modified so as to obtain 100% transparency under a clear sky (see section 10). The measurement of stars is done using fixed-aperture photometry with a 24-pixel (or 4″ at the field center) diameter aperture in each CCD. The derived transparencies of the individual CCDs are collected and averaged over the CCDs with 3σ clipping. The result is returned to observers as the sky transparency representative of the visit.

Table 1. Sensitivity assumed in each band.*

Band  Zero-point  Offset      Band  Zero-point  Offset
g     29.0        −0.061      N387  24.6        —
r     29.0        −0.060      N468  26.0        —
r2    29.0        —           N515  25.8        —
i     28.6        +0.091      N718  25.9        —
i2    28.6        +0.091      N816  25.5        —
z     27.7        +0.16       N921  25.7        +0.26
y     27.4        +0.077      N973  25.1        —

*Magnitude per electron per unit exposure time (mag e−1 s−1) in each band is listed. It is assumed that these zero-points are attained under a clear-sky condition. The offsets in the third column have been added since 2016-03-03 UT, and the zero-point values in the second column are those before the offsets are applied. The zero-points for the r2, i2, N515, and N816 bands are tentative and slightly different from the latest values on the instrument web page. The offsets in these bands need to be updated based on the accumulated data. The other bands with no offset value have been introduced recently, and their offsets will also be determined when sufficient data are collected.
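The conversion from a measured zero-point to a transparency implied by this scheme can be sketched as follows. The function is illustrative; the clear-sky zero-point is taken per band from table 1 (including its offset where defined), and the 3σ-clipped averaging over the CCDs is omitted.

def transparency_from_zeropoint(zp_measured, zp_clear_sky):
    """Transparency as the ratio of detected to anticipated flux.

    zp_measured  : zero-point derived from the reference stars in the
                   frame analysis (mag for 1 e- s-1).
    zp_clear_sky : zero-point expected under a clear sky (cf. table 1).

    The detected flux of a star of magnitude m is 10**(-0.4*(m - zp_measured)),
    and the anticipated flux is 10**(-0.4*(m - zp_clear_sky)); their ratio is
    independent of m.
    """
    return 10.0 ** (-0.4 * (zp_clear_sky - zp_measured))

# e.g., a measured zero-point 0.3 mag shallower than the clear-sky value
# corresponds to ~76% transparency:
print(transparency_from_zeropoint(28.3, 28.6))  # ~0.759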
At present, the OSQAH system cannot provide a transparency for fields that are not covered by SDSS in the regular operation mode. However, we have introduced an engineering mode that uses the Pan-STARRS1 (PS1) catalog (Magnier et al. 2013, 2016), which has recently been made publicly available, as the reference. This new mode, currently under evaluation, will enable us to estimate the transparency for most of the sky areas accessible from the Subaru Telescope in the near future.

6.2 Exposure analysis

The exposure analysis is invoked after all the processes of the frame analysis of a visit are completed. This stage gathers the results from the frame analysis of the 104 CCDs, and derives QA parameters representative of the visit by taking statistics of the QA parameters from the individual CCDs. At the end of the process, the QA parameters for the visit are registered in the exposure table of the hsc_onsite database. Again, time stamps of the execution are recorded in the file_mng_onsite table of the hscpipe database. This analysis is executed as a single process on one of the two mosaic nodes. Since this analysis requires access to output files from the frame analysis of all CCDs, the mosaic nodes connect to the NFS directories on all the slave CCD nodes and file servers.

6.2.1 Global astrometric solution

The exposure analysis is capable of re-determining the astrometric solution of each CCD as an optional function. This function, solvetansip, gathers the cross-matched reference sources from all CCDs and determines a set of WCS coefficients for the entire FOV, with ninth-order TAN-SIP coefficients. This procedure can improve the resultant astrometry of each CCD by using all the cross-matched sources in the FOV. It also sets a consistent WCS across the FOV, in which all CCDs share a single reference projection point. In the regular operation mode, this procedure is disabled, since it requires both additional processing time and disk access, while the single-CCD astrometry done by the frame analysis is acceptable for QA in most observing programs.

6.3 Focus-offset analysis

The OSQAH system provides a rough estimate of the telescope focus offset based on images, as part of the QA analysis cycle. Eight CCDs are dedicated to this purpose; they are placed at the very edges of the FOV with ±200 μm vertical offsets from the focal plane. We implement a code to estimate the extent of defocus of an image in the automated QA cycle. The estimation is done by measuring the sizes of point sources on the eight de-focused CCDs and converting them to the focus offset (H. Miyatake 2014 private communication; see subsection 3.6 in Miyazaki et al. 2018 for details). This conversion is performed assuming the relation
$$\sigma^2_\mathrm{PSF} = \sigma^2_\mathrm{atm} + \sigma^2_\mathrm{opt} + \sigma^2_\mathrm{focus}, \quad (1)$$
where σPSF, σatm, σopt, and σfocus are the overall PSF size of images measured on the de-focused CCDs, and the contributions to the PSF size from the atmosphere, the optics, and the defocus, respectively. Combining measurements on the CCDs with positive and negative de-focus values cancels out the σatm and σopt terms. Thus, we can estimate the defocus term and convert it to the telescope focus offset based on a ray-tracing model.
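To make the cancellation explicit, the following is a short sketch under the simplifying assumption, not stated in the text, that the defocus contribution scales quadratically with the effective defocus distance; in practice the conversion relies on the ray-tracing model mentioned above. Writing d = 200 μm for the built-in CCD offset, δ for the unknown telescope defocus, and k for a proportionality constant,
$$\sigma^2_{+} = \sigma^2_\mathrm{atm} + \sigma^2_\mathrm{opt} + k\,(d+\delta)^2, \qquad \sigma^2_{-} = \sigma^2_\mathrm{atm} + \sigma^2_\mathrm{opt} + k\,(d-\delta)^2,$$
$$\sigma^2_{+} - \sigma^2_{-} = 4\,k\,d\,\delta \quad\Longrightarrow\quad \delta = \frac{\sigma^2_{+} - \sigma^2_{-}}{4\,k\,d},$$
where σ²± are the mean squared PSF sizes measured on the CCDs offset by ±d; the atmospheric and optical terms drop out of the difference, and k would in practice be fixed by the ray-tracing model.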
The resultant focus offset is shown through the web application immediately after the estimation is finished. Observers adjust the telescope focus, typically two to three times a night, based on this function, which is done between exposures. This allows observers to continue scientific exposures while compensating for the telescope focus, without performing a full sequence of focus determination with the camera; the full sequence usually takes 3–5 minutes. This analysis is executed on a dedicated node ("User node" in figure 1), in order to feed the result back to observers as quickly as possible.

6.4 Tile image analysis

The tile image analysis produces a single mosaic-tiled image of a visit for quick-look purposes. The process collects the raw images of all CCDs in a visit and tiles them into a single image covering the entire FOV. In tiling the images, the counts in the overscan regions of each amplifier are subtracted from the images, the overscan regions are trimmed, and 8 × 8 binning is applied. For BIAS data, the overscan regions are simply trimmed without subtraction. The OSQAH system also generates another set of quick-look images for individual CCDs, namely overscan-subtracted images and flat-fielded images, in the course of the frame analysis. The flat-fielded images are also displayed by the web user interface (section 8), in which observers can pan and zoom the tiled flat-fielded images across the entire FOV. The former tiled images are useful for a quick check of the global pattern in the raw images, and the latter help with a close inspection of images at higher spatial resolution. This analysis stage is executed as a single process on one of the two mosaic nodes.

6.5 Image quality map analysis

The frame analysis outputs seeing measurements for each CCD, which include the seeing (FWHM) and the ellipticity of the point sources selected by the seeing measurement algorithm. The image quality map analysis assembles these results from each CCD, and generates maps of the two kinds of image quality (seeing size and ellipticity) and of the number density of the selected sources across the FOV (see sub-subsection 8.2.1). The analysis provides distortion-uncorrected raw maps and distortion-corrected maps per visit for both seeing and ellipticity. The correction is done by assuming an optical model of PSF size variation with radial position in the FOV for the seeing size, and a pre-defined distortion model with ninth-order polynomials for the ellipticity. This analysis stage also runs as a single process per visit on one of the two mosaic nodes.

7 Orchestrating software

The orchestration software manages the data flow and every QA analysis cycle performed in the OSQAH system. The software monitors the completion of raw data transfer from the observing control system Gen2, and invokes a set of analysis cycles for the arrived data of a visit. This part is developed separately from the individual data analysis software. The software is written in pure Python version 2.7 and runs on the master node. Figure 4 shows a schematic view of the analysis cycle managed by the orchestration software, which is described in the following sections.

Fig. 4. Schematic view of the analysis cycle managed by the orchestration software. The thin arrows show the work flow performed by the runOnsite process. (Color online)
7.1 Raw data capturing and registration

The orchestration software runOnsite monitors the arrival of raw data by polling a dedicated directory on the NFS file system of the file server every three seconds. Since STARS manages each CCD image as a unit data frame, the observing control system Gen2 transfers each CCD image to STARS, and also to the OSQAH system, asynchronously, without making any link between the CCDs of a visit. Hence, the OSQAH system first has to tie the arrived CCD data to the originating visit. Completion of the file transfer of each CCD is verified by its file size, to avoid a race condition between the data transfer process and the file capturing process. When a CCD FITS file belonging to a certain visit is received, the runOnsite process awaits the arrival of all the CCD data of that visit. runOnsite is designed to track multiple visits at the same time, by recording which CCDs have already arrived for each visit, because raw data from multiple visits often come in asynchronously, in an arbitrary order. The data registration proceeds as follows: first, runOnsite relocates the raw CCD data under a directory tree specialized for the data analysis pipeline, called the "data repository". In this step, base file names are renamed from the original FRAMEID to HSC-${visit}-${ccd}, in the same manner as for hscPipe (Bosch et al. 2018; Aihara et al. 2018b). Then, basic information about the data, such as visit, ccd, object name, filter, and exposure time, is extracted and registered in a database (hscpipe). This database plays the same role as the raw data registry used in hscPipe, except that an extension is introduced in the OSQAH for recording raw-file locations and time stamps of when each analysis stage starts and ends.

7.2 Analysis cycle integration

When runOnsite has received and registered all 112 CCDs of a visit, or when a waiting timer expires, it triggers a cycle of analysis procedures for the visit. The execution of the analysis procedure is performed through the batch job management system TORQUE (2.5.13).5 TORQUE is an open-source product derived from the PBS project. The TORQUE server and scheduler are hosted by the master node, and client services (MOMs) run on each of the slave computing nodes. A cycle of analysis procedures for a visit is composed of several jobs, including the analysis stages presented in section 6, as follows: (1a) frame analysis (frame-ana), to extract QA parameters from the 104 CCDs; (1b) tile-CCD analysis (tile), to tile the 104 CCDs into a single mosaicked image for a quick look; (1c) focus-offset analysis (foff), to estimate the focus offset; (2a) image quality map analysis (map), to generate a set of plots showing seeing and ellipticity across the FOV, both uncorrected and corrected for the optics distortion; (2b) making symbolic links (mksym), to create, on a working NFS file system shared by all the computing nodes, symbolic links to the data processed by frame-ana that reside on the local disks of the slave nodes; and (3) exposure analysis (exp-ana), to gather the QA parameters from all CCDs of a visit and determine the representative QA parameters of that visit. These analysis stages are submitted and queued to TORQUE as a group of batch jobs. Once the jobs are submitted, the sequence of job execution with the available computing resources, and the monitoring of job completion, are managed by TORQUE.
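The ordering just described can be summarized as a small dependency graph. The sketch below is purely illustrative: it reuses the stage names from the job list above, but the real system submits these as TORQUE batch jobs and lets TORQUE handle the dependencies and resource assignment.

# Illustrative only: the dependency graph of one analysis cycle, using the
# stage names from the job list above.
DEPENDENCIES = {
    "frame-ana": [],            # (1a) runs immediately
    "tile":      [],            # (1b) runs immediately
    "foff":      [],            # (1c) runs immediately
    "map":       ["frame-ana"], # (2a) needs the per-CCD results
    "mksym":     ["frame-ana"], # (2b) links the per-CCD outputs onto NFS
    "exp-ana":   ["mksym"],     # (3)  needs the linked outputs
}

def execution_waves(deps):
    """Group stages into waves that may run concurrently."""
    done, waves = set(), []
    while len(done) < len(deps):
        ready = [s for s, d in deps.items()
                 if s not in done and all(x in done for x in d)]
        if not ready:
            raise ValueError("circular dependency")
        waves.append(sorted(ready))
        done.update(ready)
    return waves

print(execution_waves(DEPENDENCIES))
# [['foff', 'frame-ana', 'tile'], ['map', 'mksym'], ['exp-ana']]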
Dependencies between the jobs are set so that an analysis that requires outputs from another analysis is executed only after all the necessary jobs are completed. The first set of jobs, (1a) through (1c), run asynchronously, where the frame analysis is executed as a TORQUE array job that groups the 104 jobs for all the CCDs of a visit. After frame-ana (1a) is completed, the second set of jobs, (2a) and (2b), are executed. Finally, only when (2b) mksym is done is (3) exp-ana ready to run, and it is then executed by TORQUE. The CPU resources on which each analysis process runs are also determined and assigned by TORQUE. To minimize the processing time of an analysis cycle, the frame analysis is configured to write the major part of its outputs to the local disks of the slave nodes, rather than writing to the NFS disks from about a hundred parallel processes. The mksym job (2b) makes symbolic links to these output files from an NFS directory on a file server, so that the exposure analysis process on a mosaic node can access them. Despite the overhead and complexity of manipulating a considerable number of symbolic links, this staging of the procedure is faster than writing all outputs to the NFS directory. This additional stage can be disabled when we employ a faster and more load-tolerant file system, such as a parallel distributed file system, in the future.

8 Visualizing and interacting tool—OBSLOG

The web application OBSLOG is a tool for visualizing and providing QA results to observers, and serves as the front-end user interface of the OSQAH. This application is designed to provide observers with an interface to perform interactive checks of the results and to keep their own records. OBSLOG lists basic information about the data, i.e., the observing parameters associated with the data (e.g., visit id, observing time, filter, object name, and exposure time), along with quick-look images and the derived QA parameters for every visit. The listing is done in a similar manner to a traditional observation log in paper form. Key functions provided by OBSLOG are described in the following sections.

8.1 Building QA logs

The first task is to build a list of basic information and QA parameters per visit. This list is maintained in an SQLite3 database dedicated to OBSLOG.6 Every 30 seconds, OBSLOG checks the OSQAH's registry database and the output directory where the derived QA parameters and processed files are stored. If OBSLOG detects a new visit registered in the registry, it creates a new entry for the visit in its SQLite3 database. At this stage, values of FITS header keywords are extracted and registered, which include PROP-ID, used to identify the observing program. When OBSLOG detects the completion of a QA analysis cycle for a visit by monitoring the OSQAH's QA results database (section 9), it loads all the QA parameters into the SQLite3 database. The values of FITS header keywords and the QA parameters are stored in JSON format in the SQLite3 database.7
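OBSLOG itself is implemented with Ruby on Rails and JavaScript (subsection 8.3); the following Python sketch merely illustrates the storage pattern described above, i.e., FITS-header values and QA parameters kept as JSON documents keyed by visit in an SQLite3 table. The table layout and the example values are hypothetical.

import json
import sqlite3

# Hypothetical table layout illustrating the JSON-in-SQLite pattern.
conn = sqlite3.connect("obslog.sqlite3")
conn.execute("""CREATE TABLE IF NOT EXISTS qa_log (
                    visit   INTEGER PRIMARY KEY,
                    header  TEXT,   -- FITS header keywords, as JSON
                    qa      TEXT    -- QA parameters, as JSON
                )""")

def register_visit(visit, header_keywords):
    """Create the entry when a new visit appears in the registry."""
    conn.execute("INSERT OR IGNORE INTO qa_log (visit, header) VALUES (?, ?)",
                 (visit, json.dumps(header_keywords)))
    conn.commit()

def load_qa(visit, qa_parameters):
    """Attach the QA parameters once the analysis cycle has completed."""
    conn.execute("UPDATE qa_log SET qa = ? WHERE visit = ?",
                 (json.dumps(qa_parameters), visit))
    conn.commit()

register_visit(120690, {"PROP-ID": "o00000", "FILTER01": "HSC-I"})
load_qa(120690, {"seeing": 0.6, "transparency": 0.95, "skylevel": 1800.0})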
8.2 Viewing and searching interface

8.2.1 QA parameters and observers' notes

OBSLOG displays a summary of the QA analysis on an interactive web page. The basic information shown by default for each visit is: (1) visit id, (2) sequential data id in the night assigned by the instrument, (3) observing date and time, (4) filter name, (5) object (field) name, (6) exposure time, (7) azimuth and elevation of the telescope pointing, (8) instrument rotator angle, (9) position angle of the FOV, (10) seeing, (11) sky level, (12) magnitude zero-point (mag ADU−1 s−1), (13) sky transparency, (14) focus position of the telescope, and (15) users' comments. In addition to the above values, links to the following information are provided on the page: (16) FITS header, (17) ellipticity map of point sources across the FOV, (18) seeing (FWHM) map, (19) number density of detected point sources, and (20) quick-look images. The measured QA parameters listed here are the representative values of each exposure derived by the exposure analysis. Figure 5 is a screenshot of the OBSLOG web page. When the mouse pointer is placed over a dedicated button, the FITS header, the quick-look image, and the image quality maps can be displayed. Figure 6 shows an example of the image quality maps, which are generated by the image quality map analysis stage. Observers can also add or modify columns to show other sets of QA parameters, and the modified configuration can be saved per observer.

Fig. 5. Screenshot of the whole view of the OBSLOG interface. Each row shows the extracted QA parameters of a visit, together with basic information on the data. A quick-look image and a chart for monitoring seeing, sky transparency, and focus offset are shown in the bottom left- and bottom right-hand panels, respectively. Forms for inputting observers' notes are located at the far right-hand side of each visit in this example. Data search commands are accepted in a form at the top of the page. (Color online)

Fig. 6. Example of image quality maps. From left to right: (a) seeing size (FWHM), (b) typical ellipticity and orientation of elongation of PSF-like sources at each position, and (c) the number of PSF-like sources used to derive these two values. (Color online)

This user interface allows observers to search for and list a specific range of visits by giving a selection condition, such as a range of observing dates, filter, exposure time, seeing, or sky transparency. The conditions can be given as a combination of FITS header keyword values and QA parameters in JavaScript notation, which is helpful in performing even complicated data searches. This querying function is used by observers to confirm the completion of exposures and to list usable data sets for science production during and after observation.
The web page also has a form for attaching an observer's note to each visit. These notes can be used for data flagging, together with the QA parameters, to assess the completion of exposures and for data selection in the science data production. The notes are recorded in the SQLite3 database with the account name of the user who wrote them, and observers can edit or delete only their own notes. The list of visits on the web page is updated automatically; only information on updated visits is transferred and appended to the QA list. The resultant QA list can be downloaded as an observation log in CSV, Excel, JSON, or PDF format.

8.2.2 Quick-look images

As described in subsection 6.4, a couple of quick-look images showing the entire FOV are available through OBSLOG. The 8 × 8 binned, overscan-subtracted tiled image is displayed in a pop-up window overlaid on the list of QA parameters. Another tiled image, after flat-fielding, can also be viewed in a separate window; this image has a typical dimension of ∼8000 pixels on a side (figure 7). We use the open-source JavaScript library Leaflet to construct a responsive interactive map, which allows observers to pan and zoom quickly into pixel areas of interest.8

Fig. 7. Example of quick-look images displayed by OBSLOG. The left-hand panel shows the entire field of view of a visit, and a close-up image is shown in the right-hand panel. (Color online)

8.2.3 Variation monitoring

OBSLOG is capable of plotting the temporal variation of selected QA parameters. Observers can specify QA parameters, or an arithmetic combination thereof, to be plotted in a chart, using JavaScript notation. This function is implemented with a JavaScript plotting library for jQuery (Flot).9 In standard observations, seeing, sky transparency, and the estimated focus offsets are monitored with this function (figure 8).

Fig. 8. Time-sequence plot for monitoring QA parameters. In this example, the temporal variations of seeing, sky transparency, and focus offset from visit to visit are plotted. The open circles with error bars are the estimated on-focus positions of the instrument, and those without error bars show the current instrument position. The data point for focus located at around (21:57, 3.85) is from the full focus sequence. The scales attached to the left-hand side are for sky transparency, seeing (arcsec), and estimated on-focus position of the instrument (mm), from left to right. (Color online)

8.3 Server and client components

OBSLOG is composed of server and client software. The server component is developed with the web application framework Ruby on Rails, and runs on the web server node.
This component is responsible for building the QA logs in the dedicated SQLite3 database. Upon receiving a user's request, this component also executes a query for a data search on the SQLite3 database and returns the result to the client component. In the query, observers can specify the format of the result: HTML, JSON, CSV, Excel, or PDF. The server component generates the result in the requested format and transfers it to the client component. The server component also handles user authentication and access control (subsection 8.4). The client component provides the user interface of OBSLOG; it is written in JavaScript and runs in a web browser on the user side. This component decodes the outputs from the server component, displays the resultant QA list on a web page, and delegates the interactive commands given by the observers to the server component. The user interface is built as a single-page application with the JavaScript framework Knockout10 and jQuery UI, so that all functions are accessible without any page transition; this prevents interference with observers' key inputs.

8.4 User authentication and access control

In order to view QA results, observers are required to log in to the OBSLOG interface. User authentication is done through the Lightweight Directory Access Protocol (LDAP) against user accounts of the Subaru archive system (STARS). When a user is authenticated, the OBSLOG server component determines which observing programs the user belongs to, and records the information in another SQLite3 database dedicated to OBSLOG user management. OBSLOG only allows a user to view data whose PROP-ID value in the FITS header matches one of the observing program IDs to which the user has access rights. Since the OBSLOG user interface is the single place where observers monitor the QA results, this access control guarantees proprietary data access rights between observing programs in the OSQAH system.

9 Database

This section describes the structure of the databases used in the OSQAH. The OSQAH databases are managed by an open-source relational database management system, PostgreSQL (version 9.3). We chose PostgreSQL to allow simultaneous read and write access by more than 100 processes. A couple of databases are employed to operate the QA analysis. The database hscpipe is dedicated to managing raw data and DOMEFLAT data (figure 9). This is the same as the data registries used in the HSC data analysis pipeline (Bosch et al. 2018), which provide the data analysis processes with pointers to the necessary raw data and DOMEFLAT data. The other database, hsc_onsite, records all results of the QA analysis. This database shares a table structure for meta-data registration with the catalog database for the HSC-SSP data release (Yamada et al. 2014; T. Takata et al. in preparation).

Fig. 9. Database for the management of raw data and DOMEFLAT data (hscpipe). Table names (underlined) and the columns important for QA analysis are shown in each box. The tables raw and raw_visit store basic information on the raw data in order to identify data to be processed. The file_mng_onsite table holds additional information, including the file location and time stamps of data processing. The flat table manages the DOMEFLAT data. Each set of DOMEFLAT data is associated with a range of dates (validstart and validend) to identify which DOMEFLAT data should be used for given raw data.
The DOMEFLAT data are applied to raw data taken within this period. In all boxes, the column marked with an open star is the primary key of the table, and those marked with an open circle or a rounded rectangle are given a unique-key constraint in the table definitions. The keys connected by a line are used to join tables in database queries.

9.1 Data registry database

Figure 9 shows the tables and important columns managed in the hscpipe database. To identify raw data for the data analysis processes, the raw and raw_visit tables are used, which hold basic information on the raw data for each visit and ccd. The application in the frame analysis also refers to the flat table to determine which DOMEFLAT data should be used for the particular raw data being processed. This determination is based on the period specified by the two columns validstart and validend: a set of DOMEFLAT data is chosen if the observing date of the raw data file being processed falls within this period. The file_mng_onsite table is a special table prepared for the OSQAH operation. This table manages the time stamps of QA analysis execution and the root directory used for the raw data repository.

9.2 QA result database

In the QA result database (hsc_onsite), there are a couple of table categories (figure 10). The operation tables are designed to manage the analysis history and provide the means to apply data flagging. The QA result tables store all the derived QA parameters. HEALPix indices (Górski et al. 2005) mapped to all CCD data are also available, which is helpful in science data production to identify the group of CCD data overlapping a given sky area.

Fig. 10. Tables in the database for QA results (hsc_onsite). There are a couple of categories of tables. The operation tables are prepared for recording the analysis history, the locations of processed files, and the time stamps of the analysis operation. The QA result tables store the analysis results. The frame and exposure tables record the derived QA parameters, and the frame_anaresult and exposure_anaresult tables have flags for data usability. The frame_hpx11 table stores HEALPix indices with Nside = 2048 covering each CCD pixel area, based on the WCS derived in the frame analysis. This information is useful in data production, to identify the CCD data overlapping each pre-defined sky tessellation.
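As an illustration of how such a QA database supports data selection for science production, a query against the exposure table might look like the following psycopg2 sketch. The schema name, column names, connection parameters, and thresholds are hypothetical placeholders; see figure 11 for the actual columns.

import psycopg2

# Connection parameters and column names here are illustrative placeholders.
conn = psycopg2.connect(dbname="hsc_onsite", host="dbserver", user="qa")

def usable_visits(schema, max_seeing, min_transparency):
    """List visits whose representative QA parameters satisfy a single set
    of quality conditions, as used when building the input list for a
    science data production run."""
    query = ("SELECT visit FROM {}.exposure "
             "WHERE seeing < %s AND transparency > %s "
             "ORDER BY visit").format(schema)
    with conn.cursor() as cur:
        cur.execute(query, (max_seeing, min_transparency))
        return [row[0] for row in cur.fetchall()]

# e.g., select visits from one night's rerun with sub-arcsecond seeing
# taken through reasonably transparent sky:
print(usable_visits("ut20170328a_s17a_ssp", 1.0, 0.7))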
Figure 11 summarizes the important columns of each table in the hsc_onsite database. The frame_mng and exp_mng tables maintain the locations of the processed files and the time stamps of the database loading of processed data. The frame and exposure tables are the places where the derived QA parameters are recorded. The frame_anaresult and exposure_anaresult tables are designed to maintain information on the usability of the data. The flags flag_auto, flag_usr, and flag_tag are intended to store 1 (good) or 0 (bad) based on automated evaluation, inputs by an observer, and the final judgment combining the former two flags, respectively. Another set of columns, purpos and datset, is prepared for the management of data sets in the science data production. These keywords are intended to identify which visits are usable in the data production, which ranges of visits are to be grouped, and which calibration data should be applied to the respective groups. These mechanisms of flagging and data set management are under development, and the flags are not yet used in the current operation. The analysis table records the analysis configuration and history, as described in the next section.

Fig. 11. Tables and columns in the hsc_onsite database. The analysis table records a history of analysis sessions, in which each night's operation is assigned an analysis session tied to a set of configurations for the analysis applications. The frame table registers the QA parameters of each CCD separately, and the exposure table maintains representative values determined by combining the QA parameters from all CCDs of a visit. The exposure table also holds the rms of the major QA parameters between CCDs. The entries marked with an open star are the primary keys of each table.

9.3 Analysis tracking

Analysis tracking is a crucial function for QA. To monitor and examine the QA parameters, we have to identify which version and configuration of the software were used for each QA analysis. The software versions, algorithms, or application configurations may sometimes be modified as the QA system is updated through observatory engineering. All the QA results also need to be verified by reprocessing when necessary.
9.3 Analysis tracking

Analysis tracking is a crucial function for QA. To monitor and examine the QA parameters, we have to identify which version and configuration of the software were used for each QA analysis, since the software versions, algorithms, or application configurations may sometimes be modified as the QA system is updated during observatory engineering. All the QA results also need to be verifiable by reprocessing when necessary. In OSQAH, we maintain the analysis configurations used in the QA analysis, together with the analysis history of each observing night, in the analysis table (figure 11). The concept of this traceability management is illustrated in figure 12.

Fig. 12. Tables for maintaining analysis configurations. The analysis table stores the “rerun” and “config” used in each runOnsite session with a unique ana_id. A rerun is allowed to have only a single configuration. Log files from the applications are stored separately by observing date.

A session of the orchestration software runOnsite is started at the beginning of each observing program in a night, and a unique analysis id (ana_id) is assigned to each runOnsite session. A pair of parameters is also assigned to the session. One is “rerun”, the identifier of an observing program in a night, usually composed of the observing date and program name (e.g., ut20170328a_s17a_ssp). The other is “config”, which indicates the configuration directory used in the QA analysis. The runOnsite process submits analysis jobs with the specified rerun and config throughout the session, i.e., throughout an observing program in a night. The analysis table holds the combination of rerun and config used in each runOnsite session under a unique ana_id, where one ana_id is tied to one runOnsite session. Thus, the analysis configuration applied in the QA analysis of each observing program can be traced through the ana_id. The config directory points to an operation directory that contains configuration files for the analysis parameters and the environment variables to be set in the application processes; the config directories and configuration files are prepared in advance of the night operations. The rerun is a term adopted from the data analysis software hscPipe, where it specifies the destination of output files from hscPipe applications. It is also used for a conflict check of analysis configurations: an hscPipe process ensures that a single consistent configuration is used within a rerun, to avoid any untraceable mixture of analysis results with different configurations. At the start of a runOnsite session, OSQAH creates the hsc_onsite tables in a new name space, using a PostgreSQL “schema” dedicated to the rerun; this makes it easy to distinguish the QA results of different observing programs. The schema name is derived from the rerun name. Figure 13 illustrates how the hsc_onsite database maintains tables for different observing programs (i.e., reruns). In this example, the two left-hand groups show the schemas assigned on the night of 2017-03-25 UT, in which two observing programs (s17a_ssp and s17a_qn000) were carried out and two different schemas were assigned to the respective programs. In each schema, a set of operation and QA result tables is created.

Fig. 13. Database schemas used for night operation in the hsc_onsite database. A schema, which is a name space in a PostgreSQL database, is assigned to each night operation. In each schema, a set of tables for managing the QA operation and results is created.
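A minimal sketch of this per-rerun bookkeeping is given below. It is not the actual runOnsite implementation: the one-schema-per-rerun layout and the rerun/config/ana_id bookkeeping follow the description above, while the concrete DDL, the placement of the analysis table inside the per-rerun schema, and the connection details are assumptions made for the example.

```python
# Minimal sketch of the per-rerun bookkeeping described above; this is
# not the actual runOnsite code. The one-schema-per-rerun layout and the
# (ana_id, rerun, config) columns follow the text and figure 12, while
# the concrete DDL is an assumption made for the example.
import psycopg2
from psycopg2 import sql

DDL = """
CREATE TABLE IF NOT EXISTS {schema}.analysis (
    ana_id  serial PRIMARY KEY,
    rerun   text NOT NULL,
    config  text NOT NULL
)
"""

def start_runonsite_session(dsn, rerun, config_dir):
    """Create the per-rerun schema and record the analysis session."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        schema = sql.Identifier(rerun)          # e.g. ut20170328a_s17a_ssp
        cur.execute(sql.SQL("CREATE SCHEMA IF NOT EXISTS {}").format(schema))
        cur.execute(sql.SQL(DDL).format(schema=schema))
        # One ana_id ties a runOnsite session to its rerun and config
        cur.execute(
            sql.SQL("INSERT INTO {}.analysis (rerun, config) "
                    "VALUES (%s, %s) RETURNING ana_id").format(schema),
            (rerun, config_dir))
        return cur.fetchone()[0]

# ana_id = start_runonsite_session("dbname=hsc_onsite",
#                                  "ut20170328a_s17a_ssp",
#                                  "/data/osqah/config/ssp_default")
```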
10 Application results

10.1 Operation

10.1.1 General and SSP observations

Since 2014 March, the OSQAH system has been operated for essentially all general observations, including HSC-SSP, and for some engineering observations. The access control provided by OBSLOG successfully manages the multiple general observing programs in the system. In the HSC-SSP observations, nightly exposure planning is performed based on OSQAH’s QA list together with observers’ notes, and the completion of exposures is determined from the QA results. Observing logs with QA results generated by OBSLOG are distributed to collaborators to share the survey’s progress, and are also provided in the public data release. The data selection in the science data production for the data releases to date has been done using the QA database. Thus, our initial goals for the on-site QA are fulfilled by OSQAH, and the system has become an essential facility for operating the HSC-SSP observations. In addition, several PI-type observing programs that search for transient sources use OSQAH as pre-processing for more sophisticated algorithms such as image subtraction, e.g., Tanaka et al. (2016). This suggests that OSQAH’s automated QA processing can increase the productivity of a wide range of science programs.

10.1.2 Queue observation

Queue-mode observation with HSC has been in commissioning since 2016 March, and the allocation of queue-mode programs has gradually increased. One of the keys to operating queue observations is immediate QA of the data and decision-making on the completion of a program based on the QA results. The operation of HSC queue-mode observation is designed to use OSQAH for this purpose, with a coarse QA (called initial QA) during the night. OSQAH has been enhanced to feed the QA parameters (seeing, transparency, and sky level) for every visit, together with the exposure id, to the Gen2 queue-mode observing system using the XML-RPC protocol, as sketched below. This feed has been crucial for the HSC queue-mode observation, to check whether these QA parameters satisfy a given condition and whether a set of exposures (an observing block) is completed soon after the data acquisition.
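The sketch below illustrates this per-visit feed (written against the Python 3 standard library for brevity). The Gen2 host, port, remote method name, and argument layout are placeholders; the paper specifies only that the seeing, transparency, and sky level of each visit are sent, together with the exposure id, over XML-RPC.

```python
# Illustrative sketch of the per-visit QA feed to the Gen2 queue-mode
# system over XML-RPC. The URL, method name, and argument layout are
# placeholders, not the actual Gen2 interface.
import xmlrpc.client

GEN2_QUEUE_URL = "http://gen2-queue.example.org:8000/"  # placeholder

def feed_initial_qa(exp_id, seeing, transparency, sky_level):
    """Send the coarse (initial) QA parameters of one visit."""
    proxy = xmlrpc.client.ServerProxy(GEN2_QUEUE_URL, allow_none=True)
    # Hypothetical remote method; the real Gen2 interface differs.
    return proxy.update_qa(exp_id, {
        "seeing": seeing,              # arcsec
        "transparency": transparency,  # 1.0 corresponds to a clear sky
        "sky_level": sky_level,        # sky background level
    })

# feed_initial_qa(120690, 0.6, 0.95, 1800.0)
```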
10.2 Performance

The system performance in the case of typical HSC-SSP observations is as follows. The data transfer from HSC to the OSQAH system via Gen2 after the camera shutter closes takes about a minute. The additional overhead for capturing and registering the raw data before the analysis processes are executed is ∼30 s. The frame analysis finishes within two minutes, after which the mksym stage and the exposure analysis are completed in ∼1.5 min. The other analysis processes, including the focus-offset analysis and the tile-image analysis, are usually finished before the exposure analysis completes. Since the end of the exposure analysis triggers OBSLOG to append a new visit to the QA list, observers get access to the new QA parameters slightly under five minutes after the end of an exposure. This timing is acceptable for tracking conditions in most observations. However, when relatively short exposures (≲30 s) are repeated, the input rate exceeds the system capacity and the OSQAH analysis cycle tends to become slow or even stall, failing to process some of the visits because of timed-out waits or failures in the TORQUE scheduling service. This is thought to be related to an instability of NFS under high load, and is an issue to be addressed for more stable operation in the near future.

10.3 Assessments of data

Throughout the three years of HSC operation, all the QA results from OSQAH have been registered in the database together with basic information on the CCD data, such as the telescope pointing and the attitude of the instrument. This database therefore has the potential to monitor the stability of the data characteristics and, to some extent, the condition of the telescope and the instrument through the data quality parameters. Such assessments with the accumulated data are quite important for maintaining the data quality over a long-term survey program and for achieving observations that exploit the expected performance of HSC.

10.3.1 Atmospheric extinction

As an example of data assessment with the database, we discuss the effect of atmospheric extinction at the site of the Subaru Telescope. From the QA result database, we derive the relation between the elevation of the telescope pointing and the magnitude zero-point per unit time of the g-band data over the three years of general observations, including HSC-SSP. Figure 14 shows the result; each data point shows the magnitude zero-point and secz of a visit. We divide the whole data set for 2014-09-17 to 2017-03-07 into eight periods. Data taken before 2014-09-17 are excluded, since the OSQAH operation was not yet regulated then and the resultant QA parameters are less reliable than in the other periods. Since the absolute values of the magnitude zero-points vary between periods, offsets are added so that the data points at secz ≲ 1.3 in all periods coincide with those of the first period (2014-09-17 to 2014-12-27); the offsets applied here range from 0.06 to 0.39 mag. No conversion from the zero-point to the transparency is applied, and therefore the zero-points assumed for photometric conditions listed in table 1 do not affect the result. The temporal change in the absolute values is discussed in the next sub-subsection.

Fig. 14. Magnitude zero-points versus elevation of the telescope pointing in the g band. Each point represents the zero-point of a visit. Representative data points with error bars are also shown in each secz bin. The best-fitting line, with a slope of dZP/secz = 0.12, is superimposed on the data points. (Color online)

We see a clear trend of the magnitude zero-point with elevation (secz): the zero-point decreases as the telescope points to lower elevations, although the scatter of the data points toward lower zero-points is considerably large, owing to the night-to-night variation in the intrinsic sky transparency. To estimate the slope quantitatively, the data points are divided into secz bins with an interval of 0.1.
In each bin, the median and the standard deviation are derived after clipping outliers at 1.5σ three times (filled circles with error bars in the figure). We then fit a linear function to these representative data points (secz < 2.2) by the least-squares method. The slope (magnitude per unit airmass) determined from the best-fitting function (solid line in the figure) is dZP/secz = 0.12. This slope is close to the atmospheric extinction coefficient at the Mauna Kea summit, 0.14 at 4750 Å, reported in the CFHT Bulletin (Bèland et al. 1988) and referred to by various Mauna Kea observatories, e.g., the Gemini Telescope.11 The difference between the two values corresponds to about 2% at secz = 2.0, which is smaller than the typical magnitude error (≳0.05 mag) of the OSQAH measurements. This result supports that OSQAH has been deriving reasonable estimates of the sky transparency, and it also implies that the atmospheric conditions on Mauna Kea may have remained almost stable over these three decades (1988–2017). The same analysis carried out for the other broad bands is presented in the Appendix.
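The binning, clipping, and fitting procedure described above can be sketched as follows. This is not the OSQAH analysis code; it assumes that the per-visit zero-points and airmasses (with the per-period offsets already applied) have been retrieved from the QA result database into numpy arrays.

```python
# Sketch of the extinction-coefficient estimate described above, not the
# OSQAH analysis code itself. Per-visit magnitude zero-points (zp) and
# airmasses (secz) are assumed to have been pulled from the QA result
# database, with the per-period offsets already applied.
import numpy as np

def clipped_median(values, nsigma=1.5, niter=3):
    """Median and scatter after iteratively clipping outliers at nsigma."""
    v = np.asarray(values, dtype=float)
    for _ in range(niter):
        med, std = np.median(v), np.std(v)
        v = v[np.abs(v - med) <= nsigma * std]
    return np.median(v), np.std(v)

def extinction_slope(secz, zp, bin_width=0.1, secz_max=2.2):
    """Fit dZP/secz from binned, clipped representative points."""
    secz, zp = np.asarray(secz), np.asarray(zp)
    edges = np.arange(1.0, secz_max + bin_width, bin_width)
    x, y = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (secz >= lo) & (secz < hi)
        if sel.sum() < 3:
            continue                      # skip sparsely populated bins
        med, _ = clipped_median(zp[sel])
        x.append(0.5 * (lo + hi))
        y.append(med)
    # Straight line ZP = a * secz + b; -a is the extinction coefficient
    a, b = np.polyfit(x, y, 1)
    return -a

# slope = extinction_slope(secz_array, zp_array)  # expected ~0.12 in g
```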
10.3.2 Temporal change in system efficiency

Another interesting quantity that can be monitored through the magnitude zero-points is the system efficiency. Figure 15 shows the relation between the magnitude zero-points and the observing dates, where each of the eight periods is plotted without any added offsets. The absolute values of the zero-points are seen to decrease with time: the latest data points, for the period 2016-12-23 to 2017-03-07, appear to be lower by 0.4–0.45 mag than those in the 2014-09-17 to 2014-12-27 period. If this change is attributed purely to degradation in the system efficiency, it corresponds to a 30%–35% decrease in efficiency over the interval, since a drop of ΔZP mag scales the throughput by a factor 10^(−0.4 ΔZP) ≈ 0.69–0.66.

Fig. 15. Variation in the magnitude zero-points in the g band since 2014 September. Recent measurements of the zero-points in 2017 March appear to be lower than those of 2014 September by 0.4 mag, or ∼30%. This can be explained partly by degradation in the reflectivity of the primary mirror after three years of operation since the last re-coating. (Color online)

Fig. 16. Same as figure 14, but plotted for the r, i, z, and y bands in each panel. (Color online)

Fig. 17. Same as figure 15, but plotted for the r, i, z, and y bands in each panel. The observing periods in all panels are labeled in the same manner as in figure 15: (1) 2014-09-17–2014-12-27, (2) 2015-03-15–2015-08-20, (3) 2015-10-06–2016-02-12, (4) 2016-03-04–2016-04-15, (5) 2016-06-01–2016-07-12, (6) 2016-07-29–2016-09-07, (7) 2016-09-27–2016-12-01, and (8) 2016-12-23–2017-03-07 UT. (Color online)

In response to this result, the observatory made an investigation and found a decrease in the system efficiency of the high-dispersion spectrograph (HDS: Noguchi et al. 2002) of ∼35% at 400 nm to ∼15% at 550 nm over the three years since the last mirror re-coating in summer 2013 (N. Takato 2017, private communication). Direct measurements of the reflectivity of primary-mirror test pieces by the observatory also support this degradation of the mirror reflectivity at blue wavelengths. Our result could therefore be explained in part by the degradation of the mirror reflectivity, which, among the HSC broad bands, affects the g band most severely. The next mirror re-coating has been scheduled for late 2017 and is recognized as a high-priority action by the observatory, based in part on such indications of the degradation. This demonstrates that continued quality monitoring by OSQAH with an invariant configuration can provide hints about the health of the telescope and the instrument, although the interpretation of the QA parameter measurements still carries large uncertainties. Continued QA is therefore quite important for the stable operation of HSC observations.

10.4 Issues and future prospects

New functions and improvements of the system performance are being developed and planned to enhance the usability of the system in observations. To provide a more reliable sky transparency, we have added small offsets to the magnitude zero-points since the night of 2016-03-03 UT, so that the sky transparency derived by the QA analysis becomes unity under the best sky conditions (table 1). These offsets were determined from the SSP data taken during 2014-03-25 to 2015-11-15 UT, by re-scaling the 95th-percentile maximum of the estimated sky transparency to unity. The offsets for relatively new filters such as r2, i2, N515, and N816 still need to be updated based on the accumulated data. Following suggestions from observers, we plan to put a new QA mode using the PS1 catalog into regular operation in the near future; this enhancement extends the sky coverage for the transparency estimation, so that most of the sky accessible from Mauna Kea will be supported. Another suggestion concerns the tolerance of the system to repeated short exposures (≲30 s), for which the current system tends to become unacceptably slow. As OSQAH is a facility crucial to the HSC operation, especially for HSC-SSP and queue-mode observations, it is highly desirable to keep the system fully functional at all times. We would like to improve the system performance by introducing a file system that is more responsive under concurrent access, such as a parallel distributed file system. It would also be useful to support analysis modes that process every few visits, or only a partial area of the FOV, for faster feedback. These improvements will enable OSQAH to accommodate a wider variety of observations, including searches for transient and variable sources that require both wide sky coverage and high system responsiveness. Documentation, FAQs, and issue tracking for trouble-shooting also need to be established for stable operation.
A data flagging interface for observers is also to be implemented; this will enable more efficient survey progress management and data production. The introduction of data set management in the database, which connects a range of science data with the set of calibration data to be applied, is required for both HSC-SSP and queue-mode observations. The OSQAH database will be able to provide these new functions with the addition of a few new tables and slight modifications to the existing ones. Once implemented, these functions will assist observations and enhance the legacy value of the HSC data archive, and an archive associated with the QA information will facilitate data processing and reliable calibration by observers.

11 Summary

We have developed an on-site QA system for HSC (OSQAH) that performs automated quick data processing to evaluate data quality. The OSQAH system was commissioned and has been operating for general observations since 2014 March. The system provides the parameters used for QA, including seeing, sky level, and sky transparency, to observers through a web-based user interface, typically within five minutes of data acquisition. This fast feedback enables exposure planning during a night and efficient survey progress management. Queue-mode observations with HSC also rely on this system for the initial coarse quality check. We have also shown how the QA database is useful for assessing the performance of the telescope and instrument. New features and improvements of the system performance are planned, including extended sky coverage through the PS1 catalog and better responsiveness to short exposures. These upgrades will make the system more robust for various observations, facilitate data analysis, and enhance the valuable HSC data archive.

Acknowledgements

We are grateful to the anonymous referee for their helpful comments that have improved the manuscript. We thank all the observatory staff for their efforts in making the OSQAH system operational, and Drs. Masafumi Yagi, Ichi Tanaka, Shin Oya, and Naruhisa Takato for valuable discussions and comments. This paper is based on data collected at the Subaru Telescope and retrieved from the HSC data archive system, which is operated by the Subaru Telescope and the Astronomy Data Center at the National Astronomical Observatory of Japan. The Hyper Suprime-Cam (HSC) collaboration includes the astronomical communities of Japan and Taiwan, and Princeton University. The HSC instrumentation and software were developed by the National Astronomical Observatory of Japan (NAOJ), the Kavli Institute for the Physics and Mathematics of the Universe (Kavli IPMU), the University of Tokyo, the High Energy Accelerator Research Organization (KEK), the Academia Sinica Institute for Astronomy and Astrophysics in Taiwan (ASIAA), and Princeton University. Funding was contributed by the FIRST program from the Japanese Cabinet Office, the Ministry of Education, Culture, Sports, Science and Technology (MEXT), the Japan Society for the Promotion of Science (JSPS), the Japan Science and Technology Agency (JST), the Toray Science Foundation, NAOJ, Kavli IPMU, KEK, ASIAA, and Princeton University. HM is supported by the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. This work is supported in part by MEXT Grants-in-Aid for Scientific Research on Priority Areas (No. JP18072003) and for Scientific Research on Innovative Areas (No. JP15H05892).
This paper makes use of software developed for the Large Synoptic Survey Telescope. We thank the LSST Project for making their code available as free software at http://dm.lsst.org. The Pan-STARRS1 Surveys (PS1) have been made possible through contributions of the Institute for Astronomy, the University of Hawaii, the Pan-STARRS Project Office, the Max-Planck Society and its participating institutes, the Max Planck Institute for Astronomy, Heidelberg and the Max Planck Institute for Extraterrestrial Physics, Garching, Johns Hopkins University, Durham University, the University of Edinburgh, Queen's University Belfast, the Harvard-Smithsonian Center for Astrophysics, the Las Cumbres Observatory Global Telescope Network Incorporated, the National Central University of Taiwan, the Space Telescope Science Institute, the National Aeronautics and Space Administration under Grant No. NNX08AR22G issued through the Planetary Science Division of the NASA Science Mission Directorate, the National Science Foundation under Grant No. AST-1238877, the University of Maryland, Eotvos Lorand University (ELTE), the Los Alamos National Laboratory, and the Gordon and Betty Moore Foundation.

Appendix. Assessments with the magnitude zero-points in the broad bands

The analysis of the extinction coefficients based on the magnitude zero-points in the QA database, described in sub-subsection 10.3.1, is presented here for the remaining broad bands. Figure 16 shows the results in the r, i, z, and y bands in four panels. The fitting is done for data points at secz < 2.2 for the r and i bands, and at secz < 2.0 for the z and y bands. The derived extinction coefficients are summarized in table 2. The analysis of the temporal change in the system efficiency (sub-subsection 10.3.2) is also performed in these four bands; in figure 17, we see decreases in the efficiency of ∼10% to 20% in all bands, smaller than the decrease in the g band.

Table 2. Optical extinction values.*
Band   dZP/secz (OSQAH)   dZP/secz (CFHT)   Wavelength (nm)
g      0.12               0.14              475
r      0.078              0.11              650
i      0.057              0.07              800
z      0.047              0.05              900
y      0.042              0.04              1000
*The atmospheric extinction coefficients (dZP/secz) estimated by OSQAH are listed, with those reported in the CFHT Bulletin in 1988 for reference. The CFHT coefficients are values at the wavelength (nm) given in the far right column. Further data processed with a fixed set of configurations would be needed to obtain more robust coefficients.

Footnotes
† Based on data collected at Subaru Telescope, which is operated by the National Astronomical Observatory of Japan.
1 ⟨http://www.naoj.org/Observing/Instruments/HSC/⟩
2 ⟨http://munin-monitoring.org/⟩
3 ⟨https://fits.gsfc.nasa.gov/registry/sip.html⟩
4 ⟨http://www.naoj.org/Observing/Instruments/HSC/sensitivity.html⟩
5 ⟨http://www.adaptivecomputing.com/products/open-source/torque/⟩
6 ⟨https://sqlite.org/index.html⟩
7 ⟨http://www.json.org/⟩
8 ⟨http://leafletjs.com/⟩
9 ⟨http://www.flotcharts.org/⟩
10 ⟨http://knockoutjs.com/⟩
11 ⟨https://www.gemini.edu/sciops/telescopes-and-sites/observing-condition-constraints/extinction⟩

References
Aihara H. et al. 2011, ApJS, 193, 29
Aihara H. et al. 2018a, PASJ, 70, S4
Aihara H. et al. 2018b, PASJ, 70, S8
Axelrod T., Kantor J., Lupton R. H., Pierfederici F. 2010, Proc. SPIE, 7740, 774015
Bèland S., Boulade O., Davidge T. 1988, Canada-France-Hawaii Telescope Information Bulletin, No. 19 (Kamuela, HI: CFHT Corporation)
Bosch J. et al. 2018, PASJ, 70, S5
Chambers K. C. et al. 2016, arXiv:1612.05560
Cuillandre J.-C., Magnier E. A., Isani S., Sabin D., Knight W., Kras S., Lai K. 2002, Proc. SPIE, 4844, 501
DES Collaboration 2016, MNRAS, 460, 1270
Diehl H. T. et al. 2016, Proc. SPIE, 9910, 99101D
Furusawa H. et al. 2011, PASJ, 63, S585
Górski K. M., Hivon E., Banday A. J., Wandelt B. D., Hansen F. K., Reinecke M., Bartelmann M. 2005, ApJ, 622, 759
Gunn J. E., Stryker L. L. 1983, ApJS, 52, 121
Gwyn S. D. J. 2012, AJ, 143, 38
Hanuschik R. W. 2007, ASP Conf. Ser., 376, 373
Hanuschik R. W., Hummel W., Sartoretti P., Silve D. 2002, Proc. SPIE, 4844, 139
Ivezic Z. et al. 2008, arXiv:0805.2366
Jeschke E., Bon B., Inagaki T., Streeper S. 2008, Proc. SPIE, 7019, 70190U
Jurić M. et al. 2015, arXiv:1512.07914
Kosugi G. et al. 2000, Proc. SPIE, 4010, 174
Komiyama Y. et al. 2018, PASJ, 70, S2
Lang D., Hogg D. W., Mierle K., Blanton M., Roweis S. 2010, AJ, 139, 1782
LSST Science Collaborations 2009, arXiv:0912.0201
Magnier E. A., Cuillandre J.-C. 2004, PASP, 116, 449
Magnier E. A. et al. 2013, ApJS, 205, 20
Magnier E. A. et al. 2016, arXiv:1612.0542
Miyazaki S. et al. 2002, PASJ, 54, 833
Miyazaki S. et al. 2012, Proc. SPIE, 8446, 84460Z
Miyazaki S. et al. 2018, PASJ, 70, S1
Mohr J. J. et al. 2012, Proc. SPIE, 8451, 84510D
Noguchi K. et al. 2002, PASJ, 54, 855
Shaw R. A., Levine D., Axelrod T., Lahr R. R., Mannings V. G. 2010, Proc. SPIE, 7740, 7740H
Shupe D. L., Mehrdad M., Jing L., Makovoz D., Narron R. 2005, ASP Conf. Ser., 347, 491
Tabur V. 2007, PASA, 24, 189
Takata T., Yagi M., Yasuda N., Ogasawara R. 2002, Proc. SPIE, 4844, 242
Tanaka M. et al. 2016, ApJ, 819, 5
Utsumi Y. et al. 2012, Proc. SPIE, 8446, 844662
Winegar T. 2008, Proc. SPIE, 7016, 70160M
Yamada Y. et al. 2014, Proc. SPIE, 9149, 91492I

© The Author 2017. Published by Oxford University Press on behalf of the Astronomical Society of Japan. All rights reserved. For Permissions, please email: journals.permissions@oup.com

To meet these goals, we designed a system to implement the following functions: automated and fast on-site QA, with a consistent set of configurations over the survey period, user interfaces to visualize and select results from QA, and to perform data flagging, with proper access control for multiple observing programs, and database registration of results from QA, with traceability of the QA processing, and available to science data production. In the next sections, the current status of implementation of the OSQAH system is described. 3 Terminology In data handling by the OSQAH, we use specific terms to refer to HSC data. These terms are derived from the archival system of the Subaru Telescope and HSC data analysis software. A data set of an exposure with HSC comprises 112 CCDs, in which 104 CCDs are for science images and the other eight CCDs are for assessment of a telescope focus. FRAMEID is an identifier of a single raw CCD image registered in the Subaru data archive, e.g., HSCA00123456. In the HSC data analysis software (section 6), the pair of terms visit and ccd are used to identify a particular CCD image originating from an exposure (or shot). visit is an even number that represents an exposure, and is incremented by 2 every exposure. ccd is a sequential integer number [0..111] referring to a single CCD image in a visit. There is one-to-one correspondence between (visit, ccd) and FRAMEID. For further information, see subsection 3.1 in Aihara et al. (2018b) and the instrument web page.1 Throughout this paper, we use the term “QA parameters” for individual measurements or estimated values of data characteristics, and any values based on their combinations, which are used for QA. QA results are used to represent results derived by QA processing, including the QA parameters, processed files, logs, and flags based on these pieces of information. 4 Hardware components The hardware components employed in the system are summarized in figure 1. The present OSQAH system is composed of 15 computer nodes, which are located in the base facility at Hilo, Hawaii. All nodes are based on the 64-bit PC architecture with Xeon processors, running CentOS 6.9 x86_64. One node (16 cores, 16 GB RAM) is assigned for master control, which runs a main program to orchestrate an analysis cycle for data inputs of each visit. This node also serves as a master of a batch job system. As slave computing nodes of the batch system, we allocate eight nodes. Five of them (CCD nodes; 20 cores, 24 GB RAM per node) process QA on a CCD-by-CCD basis, and another three computing nodes (two mosaic and one user; 20 cores per node, 64 GB RAM for mosaic, and 256 GB RAM for user) work to conduct analyses for a visit’s data set across CCDs. Fig. 1. View largeDownload slide Hardware components and data flow in the system. The right-hand panel shown with a rounded box represents the OSQAH system, where rectangular boxes show constituting computing resources. In the OSQAH local area network connections of Gigabit Ethernet and InfiniBand (QDR 40 Gbps) for inter-node communications, and Fibre Channel (8 Gbps) for storages are shown. Transfer of the raw data is shown with thick arrows in the left-hand box. Observers connect to the Web server to monitor results. (Color online) Fig. 1. View largeDownload slide Hardware components and data flow in the system. The right-hand panel shown with a rounded box represents the OSQAH system, where rectangular boxes show constituting computing resources. 
In the OSQAH local area network connections of Gigabit Ethernet and InfiniBand (QDR 40 Gbps) for inter-node communications, and Fibre Channel (8 Gbps) for storages are shown. Transfer of the raw data is shown with thick arrows in the left-hand box. Observers connect to the Web server to monitor results. (Color online) We have four file servers equipped with Redundant Arrays of Independent Disks (RAID) storage units, each of which has a working area with a 28 to 77 TB capacity. These four nodes are Network File System (NFS) mounted from each of the master, slave computing, and web server nodes. We have two more server nodes. One of them hosts a database management system, which records all raw data and analysis results, and responds to inquiries from other nodes. The last node is dedicated for a web server. This web server node runs a user interface of a web application to provide resultant QA information to users through a graphical user interface. This node also hosts a service for system resource monitoring (MUNIN).2 All the nodes except the database node share a local area network built upon the InfiniBand interconnect (QDR-IB 40 Gbps) with IP-over-InfiniBand (IPoIB) enabled, which is used for the above NFS mounts. The network provided by the InfiniBand enables fast file read/write by analysis processes between the nodes. The Gigabit Ethernet connection serves for general communication for exchanging commands, and up-link to the external network including the observing control system. 5 Data flow The HSC camera acquires raw data comprising 112 CCD images of an exposure or a visit, and generates separate FITS files for individual CCDs (Utsumi et al. 2012). The size of raw data is about 18 MB per CCD or 2 GB per exposure. In SSP observations, each pointing in the Wide field is covered by four to six visits depending on filters, and 150 to 200 visits or 300–400 GB are obtained in a typical night. As soon as the data acquisition is done, the FITS files of raw data are transferred to the Subaru Observing Control System (Gen2) (Jeschke et al. 2008) at the summit. Figure 1 also shows the flow of raw data. The Gen2 delegates the observing commands that were given by observers, communicating with the telescope and the instrument. The Gen2 is also responsible for conveying the data to the Subaru Telescope Archive System (STARS) (Takata et al. 2002; Winegar 2008). Typically, the whole process of data transfer through registration in STARS, i.e., the time until users can access the archived data, takes up to more than 10 minutes. The data are further transferred to a mirror archive hosted in the headquarters of the National Astronomical Observatory of Japan (NAOJ) in Tokyo, Japan (MASTARS), which takes from several tens of minutes to up to a few hours, after which the data become retrievable. Thus, access to either STARS or MASTARS is not suitable for quick data evaluation. The Gen2 also transfers the same data set to a small-scale data analysis server (DA in figure 1) to provide a quick look for observers. However, this server is not optimized for HSC, with limited CPU and disk resources, and, hence, is not capable of fast QA processing. Therefore, we add another direct transfer of the raw data from Gen2 to OSQAH by Gigabit Ethernet. This transfer enables the OSQAH system to access the raw data as soon as the data acquisition is done, typically within a minute. 
To fetch the raw data, the OSQAH communicates with Gen2 with server–client-based software (datasink) based on the XML-RPC inter-process protocol. This software, written with pure Python, is developed as part of the Gen2 components. The server software runs in the Gen2 side and enables connection from the OSQAH during an HSC observing run. The client software is running on a file server of the OSQAH at all times. The client is notified by the server when new HSC raw data become available, and then starts a session of receiving files. The new files are first located in a dedicated directory on a file server, and relocated by the orchestration software for QA analysis (section 7). The relocation is executed only when the number of QA analysis jobs queued in the system is small enough (currently set to 50) for a load balancing purpose. Results of the QA analysis are written into either local disks of computing nodes or NFS areas on a file server, depending on the analysis stages. At present, observers are not allowed to access raw and processed data files directly by logging into any of the OSQAH nodes. We plan to prepare a method of data access through a user node as a future development item. Observers usually check the results through a web application, using an access control based on observing programs (section 8). 6 Data analysis software Data analysis software is developed based on a modified version of the HSC data analysis pipeline hscPipe (Bosch et al. 2018). This pipeline is built on an analysis framework developed by the LSST collaboration (Ivezić et al. 2008; Axelrod et al. 2010; Jurić et al. 2015). Top level algorithms are written in Python, combined with C++ codes used in data processing parts that must run fast. The base hscPipe version used in the OSQAH applications is 2.12.4d_hsc. This version was used in the first internal data release of HSC-SSP within a collaboration, and is slightly older than that used for the recent HSC-SSP public data release (Aihara et al. 2018b). Nevertheless, this version is capable of the basic image processing and measurements required for QA. The data analysis software comprises five major analysis stages of QA processing: (1) frame analysis, (2) exposure analysis, (3) focus-offset analysis, (4) tile image analysis, and (5) image quality map analysis. Figure 2 shows the relation of the analysis stages and their outputs. Fig. 2. View largeDownload slide Analysis stage components processed by OSQAH, and output files from those analysis processes. The arrows represent inputs/outputs of data to/from analysis stages. Frame analysis, tile-image analysis, and focus-offset analysis run independently. The remaining analysis stages are dependent upon other stages, in which outputs from a previous stage need to be input to the next stage. Fig. 2. View largeDownload slide Analysis stage components processed by OSQAH, and output files from those analysis processes. The arrows represent inputs/outputs of data to/from analysis stages. Frame analysis, tile-image analysis, and focus-offset analysis run independently. The remaining analysis stages are dependent upon other stages, in which outputs from a previous stage need to be input to the next stage. Configurations to application processes are given by the orchestration software, which is described in section 7. All processes write standard/standard-error outputs to log files in an operation directory that is prepared separately for each night. 
6.1 Frame analysis In the first step of the QA analysis cycle for a visit’s data set, the OSQAH performs a frame analysis, which is a CCD-by CCD-process. To extract QA parameters of each CCD, this stage conducts overscan subtraction, flat-fielding, sky subtraction, and basic measurements of sources for 104 scientific CCDs, excluding eight off-focus CCDs. The QA parameters include the seeing, overscan levels, sky levels, and sky transparency of each CCD. A single set of DOMEFLAT data are used in flat-fielding for all observing runs for consistent QA analysis. Their count levels are normalized so that the field center should be unity to ease estimation of the sky transparency across the field of view (FOV). If a data type of the input data, determined by DATA_TYP keyword of FITS header, is DARK, DOMEFLAT, or SKYFLAT, this stage only performs overscan subtraction and count level measurement, and the other QA parameters are not measured. For BIAS data, only the count level measurement is done without overscan subtraction. Flat-fielding in the SSP data production is performed with the DOMEFLAT data only, to take an advantage of data acquisition with stable brightness and color temperature of the illumination sources across the observing runs. Some other observing programs obtain the SKYFLAT (twilight flat) data, for more uniform illumination. To perform calibration for astrometry and magnitude zero-point, this stage carries out cross-matching of detected sources on a CCD image with external reference catalog sources (SDSS-DR8: Aihara et al. 2011). The external catalog is read by the analysis process through FITS files divided by sky tessellation in a form of astrometry.net (Lang et al. 2010). The cross-matching is done using a pattern-matching algorithm presented by Tabur et al. (2007). Astrometry runs to determine the World Coordinate System (WCS) of each CCD, with the SIP convention (TAN-SIP)3 (Shupe et al. 2005), as carried out. The distortion of HSC due to the optics is about 3% at the field edge (Miyazaki et al. 2018). The TAN-SIP convention is used to model the distortion in a CCD with third-order coefficients for non-linear terms. Typical fitting residuals are ∼0.15″, which is small enough to perform cross-matching with the reference catalog. Magnitude zero-points are derived using the reference sources in each CCD. Here, the SDSS magnitudes are transformed to the HSC native-band magnitudes by pre-defined color terms. These color terms are determined by convolving spectral energy distributions in a stellar library by Gunn and Stryker (1983) with response functions of SDSS and HSC, and fitting a quadratic polynomial of a difference in the two system magnitudes as a function of SDSS color of sources. Thus, the magnitude zero-points and source magnitudes based on these zero-points are determined in the HSC native-band system. Sky transparency is estimated based on the zero-points as described in sub-subsection 6.1.2. The 104 CCDs are processed by 104 independent parallel Python processes simultaneously spreading over the five slave nodes (see also section 7). As described in section 9, at the end of the process for each CCD, extracted QA parameters are registered to a frame table in the database hsc_onsite. At the start and the end of the processes, time stamps of the process execution are also recorded in file_mng_onsite table in the hscpipe database. 
As a result, we have processed CCD FITS images with calibrated WCS and zero-point set in the header, and catalog files in the FITS BINTABLE format. This stage is developed based on a hscProcessCcd task for single-visit processing of hscPipe. Major differences from the formal production in HSC-SSP are as follows: It uses a fixed PSF model (0$${^{\prime\prime}_{.}}$$9 in FWHM) throughout image processing and model-fitting measurements, without on-the-fly PSF model determination for the sake of fast and coarse QA processing, except employing an updated seeing measurement algorithm. It introduces sky transparency measurement. It provides estimation of focus offset asynchronously with the other QA processes, since confirmation of on-focus of images is essential to keep observing and needs quick feedback from the system than other QA. It terminates a procedure that requires an unusually long time, to prevent any blockage of other QA analysis processes. It measures only raw average counts for BIAS, DARK, DOMEFLAT, or SKYFLAT, without flat-fielding. A couple of major components in the frame analysis—estimates of seeing and sky transparency—are described in the following sections. 6.1.1 Seeing measurement The basic idea of seeing measurement is derived from the on-site QA system for Suprime-Cam (Furusawa et al. 2011). The codes are ported to hscPipe-based functions (sizeMagnitudeMitakaStarSelector) and the parameters are tuned up with a range of HSC data in various bands. The seeing is measured for sources detected above 5σ of the sky fluctuation. A combination of their sizes and instrumental magnitudes are used to pick a group of bright point sources. The size is defined as a simple average of the triangular components of the adaptive moments (Ixx + Iyy)/2. Based on our experiments, we set a limiting instrumental magnitude to avoid the influence of faint cosmic rays. To determine the limiting magnitude, the code first makes a number count of all sources as a function of magnitude. Then the number counts are summed up from the bright end toward the fainter magnitudes until the sum reaches 15% of the total number of sources (Nsource). Only the sources brighter than this magnitude are considered in the next step. We adjust the magnitude limit in blue bands so that only sources with high signal-to-noise ratio (S/N) are selected, multiplying the above 15% condition by a factor of 2/3 for g and NB0718, 1/2 for NB0468 and NB0515, and 2/5 for NB0387. Here, for data taken with short exposure time and having a small total number of sources after the magnitude cut, the 50 smallest sources are picked first, and then the 30 brightest sources are chosen from the resultant sample. The other data are screened in a reversed order, i.e., picking the brightest sources first and then the smallest sources. We find that this approach is empirically robust to include a group of point sources for various qualities of data. This way, we obtain a group of point sources. The mean of the source sizes is determined by applying a 3σ clipping of outliers. This gives a decent initial estimate of the seeing. To determine the final seeing measurement, we make a histogram of the source sizes of the sample with a bin step of 0.2 pixels, for a size range of ±0.75 pixels around the above seeing guess. The mode of the histogram is taken as the final seeing. An example of the final seeing measurement is shown in figure 3. Fig. 3. View largeDownload slide Diagnostic plot of seeing estimation. 
This is an example for data of visit = 120690 and ccd = 49. In the left-hand panel, all sources detected with a 5σ and those used to estimate a rough seeing range in the first step (after filtering for saturation and the magnitude limit) are plotted with plus and cross symbols, respectively. The filled circles are the final sample to determine the seeing of a CCD image, which are located within a size range of compact sources (dot–dashed lines). The right-hand panel shows a histogram of the final sample of compact sources, and the mode is determined as the final seeing (dashed line in both panels), which is 3.58 pixels or 0$${^{\prime\prime}_{.}}$$6 in this example. (Color online) Fig. 3. View largeDownload slide Diagnostic plot of seeing estimation. This is an example for data of visit = 120690 and ccd = 49. In the left-hand panel, all sources detected with a 5σ and those used to estimate a rough seeing range in the first step (after filtering for saturation and the magnitude limit) are plotted with plus and cross symbols, respectively. The filled circles are the final sample to determine the seeing of a CCD image, which are located within a size range of compact sources (dot–dashed lines). The right-hand panel shows a histogram of the final sample of compact sources, and the mode is determined as the final seeing (dashed line in both panels), which is 3.58 pixels or 0$${^{\prime\prime}_{.}}$$6 in this example. (Color online) 6.1.2 Sky transparency estimation The transparency of the sky is estimated as the ratio of the number of detected photons and that which is anticipated from known reference stars under a clear-sky condition. In the current system, we use the SDSS-DR8 catalog for reference stars. The anticipated numbers are converted from magnitude zero-points (table 1), which are derived from the sensitivities (mag) in each band provided in the instrument page.4 Some of the magnitudes have been slightly modified to obtain a 100% transparency under a clear sky (see section 10). The measurement of stars is done using a fixed aperture photometry with 24 pixels (or 4″ at the field center) in diameter in each CCD. The derived transparencies in each CCD are collected and averaged over the CCDs with a 3σ clipping. The result is returned to observers as a sky transparency that is representative of the visit. Table 1. Sensitivity assumed in each band.* Band  Zero-point  Offset    Band  Zero-point  Offset  g  29.0  −0.061    N387  24.6  —  r  29.0  −0.060    N468  26.0  —  r2  29.0  —    N515  25.8  —  i  28.6  +0.091    N718  25.9  —  i2  28.6  +0.091    N816  25.5  —  z  27.7  +0.16    N921  25.7  +0.26  y  27.4  +0.077    N973  25.1  —  Band  Zero-point  Offset    Band  Zero-point  Offset  g  29.0  −0.061    N387  24.6  —  r  29.0  −0.060    N468  26.0  —  r2  29.0  —    N515  25.8  —  i  28.6  +0.091    N718  25.9  —  i2  28.6  +0.091    N816  25.5  —  z  27.7  +0.16    N921  25.7  +0.26  y  27.4  +0.077    N973  25.1  —  *Magnitude per electron per unit exposure time (mag e−1 s−1) in each band is listed. It is assumed that these zero-points are anticipated under a clear sky condition. The offsets in the third column have been added since 2016-03-03 UT, and the zero-point values in the second column are before these offsets are applied. The zero-points for the r2, i2, N515, and N816 bands are tentative and slightly different from the latest values on the instrument web page. The offsets in these bands need to be updated based on the accumulated data. 
The other bands with no offset value have been introduced lately, and their offsets will be determined when sufficient data are collected, too. View Large At present, the OSQAH system cannot provide a transparency for fields that are not covered by SDSS in a regular operation mode. However, we have introduced an engineering mode that uses the Pan-STARRS1 (PS1) catalog (Magnier et al. 2013, 2016) for the reference, which has recently been made publicly available. This new mode under evaluation will enable us to perform an estimation of the transparency for most of the sky areas accessible from the Subaru Telescope in the near future. 6.2 Exposure analysis Exposure analysis is invoked after all the processes of the frame analysis of a visit are completed. This stage gathers results from the frame analysis of the 104 CCDs, and derives QA parameters representative of the visit by taking statistics of the QA parameters from individual CCDs. At the end of the process, the QA parameters for the visit are registered in a exposure table in the hsc_onsite database. Again, time stamps of the execution are recorded in file_mng_onsite in the hscpipe database. This analysis is executed as a single process on one of the two mosaic nodes. Since this analysis requires access to output files from the frame analysis in all CCDs, the mosaic nodes connect to the NFS directories on all slave CCD nodes and file servers. 6.2.1 Global astrometric solution The exposure analysis is capable of re-determining the astrometric solution of each CCD as an optional function. This function solvetansip gathers cross-matched reference sources in all CCDs, and determines a set of WCS coefficients for the entire FOV. This procedure could improve a resultant astrometry in each CCD by using all the cross-matched sources that are in the FOV and have ninth-order TAN-SIP coefficients. It sets a consistent WCS across the FOV, in which all CCDs share a single reference projection point, too. In the regular operation mode, this procedure is disabled, since both additional processing time and disk access are required, while astrometry in a single CCD done by the frame analysis is acceptable to QA in most observing programs. 6.3 Focus-offset analysis The OSQAH system provides a rough estimate of the telescope focus offset based on images as part of the QA analysis cycle. Eight CCDs are dedicated to this purpose, and they are placed at the very edges of the FOV with ±200 μm vertical offsets from the focal plane. We implement a code to estimate the extent of the off-focus of an image in the automated QA cycle. The estimation is done by measuring sizes of point sources on the eight de-focused CCDs, converting them to the focus offset (H. Miyatake 2014 private communication; see subsection 3.6 in Miyazaki et al. 2018 for details). This conversion is performed assuming the following relation:   \begin{equation} \sigma ^2_\mathrm{PSF} = \sigma ^2_\mathrm{atm} + \sigma ^2_\mathrm{opt} + \sigma ^2_\mathrm{focus}, \end{equation} (1)where σPSF, σatm, σopt, and σfocus are the overall PSF size of images measured on the de-focused CCDs, contributions to the PSF size by atmospheric scintillation, the optics, and off-focus, respectively. Combining measurements on both CCDs with positive and negative de-focus values cancels out the σatm and σopt terms. Thus, we can estimate the off-focus term and convert it to the telescope focus offset based on a ray-tracing model. 
The resultant focus offset is shown through a web application immediately after the estimation is finished. Observers adjust the telescope focus, typically two to three times a night, based on this function, which is done between exposures. This allows observers to continue scientific exposures while compensating for the telescope focus, without performing a full sequence of focus determination with the camera; the full sequence usually takes 3–5 minutes. This analysis is executed on a dedicated node ("User node" in figure 1), in order to feed the result back to observers as quickly as possible.

6.4 Tile image analysis

Tile image analysis produces a single mosaic-tiled image of a visit for quick-look purposes. The process collects the raw images of all CCDs in a visit and tiles them into a single image covering the entire FOV. In tiling the images, the counts in the overscan regions of each amplifier are subtracted from the images, the overscan regions are trimmed, and an 8 × 8 binning is applied. For BIAS data, the overscan regions are simply trimmed without subtraction. The OSQAH system generates another set of quick-look images for individual CCDs (overscan-subtracted images and flat-fielded images) in the process of the frame analysis. The flat-fielded images are also displayed by a web user interface (section 8), in which observers can pan and zoom the tiled flat-fielded images across the entire FOV. The former tiled images are useful for a quick check of the global pattern in the raw images, and the latter help with a close inspection of images at higher spatial resolution. This analysis stage is executed as a single process on one of the two mosaic nodes.

6.5 Image quality map analysis

The frame analysis outputs seeing measurements in each CCD, which include the seeing (FWHM) and ellipticity of point sources selected by the seeing measurement algorithm. The image quality map analysis assembles these results from each CCD and generates a map of the two kinds of image quality (seeing size and ellipticity) and of the number density of the selected sources across the FOV (see sub-subsection 8.2.1). The analysis provides distortion-uncorrected raw maps and distortion-corrected maps per visit for both seeing and ellipticity. This correction is done by assuming an optical model of PSF size variation against radial position in the FOV for the seeing size, and a pre-defined distortion model of ninth-order polynomials for the ellipticity. This analysis stage also runs as a single process per visit on one of the two mosaic nodes.
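As an illustration of the tiling step in subsection 6.4, the following sketch shows overscan subtraction, trimming, and 8 × 8 binning for a single CCD in Python with NumPy. The amplifier and overscan geometry passed in as slices is hypothetical; the real layout is taken from the HSC FITS headers, and this is not the OSQAH code itself.

import numpy as np


def bin_8x8(image):
    """Block-average an image in 8 x 8 pixel blocks (edge remainders cropped)."""
    ny, nx = (image.shape[0] // 8) * 8, (image.shape[1] // 8) * 8
    return image[:ny, :nx].reshape(ny // 8, 8, nx // 8, 8).mean(axis=(1, 3))


def quicklook_ccd(raw, amp_slices, overscan_slices, subtract_overscan=True):
    """Overscan-subtract and trim each amplifier, then bin for the quick look.

    amp_slices / overscan_slices are assumed lists of 2D slices describing
    the science and overscan regions of each amplifier (hypothetical layout).
    """
    trimmed = []
    for amp, osc in zip(amp_slices, overscan_slices):
        data = raw[amp].astype(float)
        if subtract_overscan:               # skipped for BIAS frames
            data -= np.median(raw[osc])     # one overscan level per amplifier
        trimmed.append(data)
    # Amplifiers are assumed to sit side by side with equal heights.
    return bin_8x8(np.hstack(trimmed))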
7 Orchestrating software

The orchestration software manages the data flow and every QA analysis cycle performed in the OSQAH system. The software monitors the completion of raw data transfer from the observing control system Gen2, and invokes a set of analysis cycles for the arrived data of a visit. This part is developed separately from the individual data analysis software. The software is written in pure Python version 2.7 and runs on the master node. Figure 4 shows a schematic view of the analysis cycle managed by the orchestration software, which is described in the next sections.

Fig. 4. Schematic view of the analysis cycle managed by the orchestration software. The thin arrows show the work flow performed by the runOnsite process. (Color online)

7.1 Raw data capturing and registration

The orchestration software runOnsite monitors the arrival of raw data by polling a dedicated directory on the NFS file system on the file server every three seconds. Since STARS manages each CCD image as a unit data frame, the observing control system Gen2 transfers each CCD image to STARS, and also to the OSQAH system, asynchronously, without making any link between the CCDs in a visit. Hence, the OSQAH system first has to tie the arrived CCD data to the originating visit. Completion of the file transfer of the CCD data is evaluated by their file sizes, to avoid a race condition between the data transfer process and the file capturing process. When a CCD FITS data file belonging to a certain visit is received, the runOnsite process awaits the arrival of all CCD data in the visit. runOnsite is designed to track multiple visits at the same time, by recording which CCDs have already arrived in each visit, because raw data from multiple visits often come in asynchronously, in an arbitrary order. The data registration proceeds as follows: first, runOnsite relocates the raw CCD data under a directory tree specialized for the data analysis pipeline, which is called the “data repository”. In this step, base file names are renamed from the original FRAMEID to HSC-${visit}-${ccd}, in the same manner as for hscPipe (Bosch et al. 2018; Aihara et al. 2018b). Then, basic information on the data, such as visit, ccd, object name, filter, and exposure time, is extracted from the data and registered in a database (hscpipe). This database plays the same role as the raw data registry used in hscPipe, except that an extension is introduced in OSQAH for recording raw file locations and time stamps of when each analysis stage starts and ends.
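The following is a minimal sketch of the arrival-monitoring logic of subsection 7.1: poll the incoming directory every three seconds, treat a file as complete once its size stops changing, and group CCD frames by their originating visit. The paths, the toy filename parser, and the callback are hypothetical, and the waiting-timer fallback described in subsection 7.2 below is omitted.

import os
import time
from collections import defaultdict

POLL_INTERVAL_S = 3.0        # polling interval from the text
N_CCD_PER_VISIT = 112        # 104 science CCDs + 8 focus CCDs


def parse_visit_ccd(filename):
    """Toy parser assuming files named '<visit>_<ccd>.fits'; the real mapping
    is derived from the FRAMEID convention."""
    visit, ccd = os.path.splitext(filename)[0].split("_")
    return int(visit), int(ccd)


def watch(incoming_dir, on_visit_complete):
    last_size = {}
    registered = set()                    # files already tied to a visit
    arrived = defaultdict(set)            # visit -> set of arrived ccd ids
    while True:
        for name in os.listdir(incoming_dir):
            path = os.path.join(incoming_dir, name)
            if path in registered:
                continue
            size = os.path.getsize(path)
            if last_size.get(path) != size:
                last_size[path] = size    # still being transferred; retry later
                continue
            registered.add(path)
            visit, ccd = parse_visit_ccd(name)
            arrived[visit].add(ccd)
            if len(arrived[visit]) == N_CCD_PER_VISIT:
                on_visit_complete(visit)  # trigger an analysis cycle (7.2)
                del arrived[visit]
        time.sleep(POLL_INTERVAL_S)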
7.2 Analysis cycle integration

When runOnsite has received and registered all 112 CCDs of a visit, or when a waiting timer expires, it triggers a cycle of analysis procedures for the visit. The execution of the analysis procedures is performed through the batch job management system TORQUE (2.5.13),5 an open-source product derived from the PBS project. The TORQUE server and scheduler are hosted by the master node, and client services (MOMs) run on each of the slave computing nodes. A cycle of analysis procedures for a visit is composed of several jobs, including the analysis stages presented in section 6, as follows: (1a) frame analysis (frame-ana), to extract QA parameters from the 104 CCDs; (1b) tile-CCD analysis (tile), to tile the 104 CCDs into a single mosaicked image for a quick look; (1c) focus-offset analysis (foff), to estimate the focus offset; (2a) image quality map analysis (map), to generate a set of plots showing seeing and ellipticity across the FOV, both uncorrected and corrected for the optics distortion; (2b) making symbolic links (mksym), to create, on a working NFS file system shared by all the computing nodes, symbolic links to the data processed by frame-ana that are located on the local disks of the slave nodes; and (3) exposure analysis (exp-ana), to gather QA parameters from all CCDs in a visit and determine the representative QA parameters of that visit. These analysis stages are submitted and queued to TORQUE as a group of batch jobs. Once the jobs are submitted, the sequencing of job execution over the available computing resources and the monitoring of job completion are managed by TORQUE. Here, dependencies between the jobs are set so that an analysis that requires outputs from another analysis is executed only after all the necessary jobs are completed. The first set of jobs, (1a) through (1c), run asynchronously, where the frame analysis is executed as a TORQUE array job that groups 104 jobs for all the CCDs in a visit. After the frame-ana (1a) is completed, the second set of jobs, (2a) and (2b), are executed. Finally, only when the mksym (2b) is done is the exposure analysis (3) ready to run and executed by TORQUE. The CPU resources on which each analysis process runs are also determined and assigned by TORQUE. To minimize the processing time of an analysis cycle, the frame analysis is configured to write the major part of its outputs to the local disks of the slave nodes, rather than writing to NFS disks from a hundred parallel processes. The mksym job (2b) makes symbolic links to these output files from an NFS directory on a file server, so that the exposure analysis process on a mosaic node can access these files. Despite the overhead and complexity of manipulating a considerable number of symbolic links, this staging of the procedure is faster than writing all outputs to the NFS directory. This additional stage can be disabled when we employ a faster and more load-tolerant file system, such as a parallel distributed file system, in the future.

8 Visualizing and interacting tool—OBSLOG

The web application OBSLOG is a tool for visualizing and providing QA results to observers, and serves as the front-end user interface of OSQAH. This application is designed to provide observers with an interface for performing interactive checks of the results and making their records. OBSLOG lists basic information representing the observing parameters associated with the data (e.g., visit id, observing time, filter, object name, and exposure time), along with quick-look images and the derived QA parameters for every visit. The listing is done in a manner similar to a traditional observation log in paper form. Key functions provided by OBSLOG are described in the following sections.

8.1 Building QA logs

The first task is to build a list of basic information and QA parameters per visit. This list is maintained in a SQLite3 database dedicated to OBSLOG.6 Every 30 seconds OBSLOG checks the OSQAH registry database and the output directory where the derived QA parameters and processed files are stored. If OBSLOG detects a new visit registered in the registry, it creates a new entry for the visit in its SQLite3 database. At this stage, values of FITS header keywords are extracted and registered, including PROP-ID, which is used to identify an observing program. When OBSLOG detects completion of a QA analysis cycle for a visit, by monitoring the OSQAH QA results database (section 9), it loads all QA parameters into the SQLite3 database. The values of FITS header keywords and QA parameters are structured in JSON format in the SQLite3 database.7
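The following sketch illustrates the log-building pattern of subsection 8.1 only; the actual OBSLOG server component is a Ruby on Rails application (subsection 8.3). The table layout and the fetch_* helper callables are assumptions; only the 30-second polling interval and the JSON storage of header keywords and QA parameters follow the text.

import json
import sqlite3
import time


def build_qa_log(db_path, fetch_new_visits, fetch_qa_parameters):
    """Poll the OSQAH registry and QA databases and mirror them into SQLite3.

    fetch_new_visits / fetch_qa_parameters are hypothetical callables that
    query the hscpipe registry and the hsc_onsite results, respectively.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS obslog ("
        " visit INTEGER PRIMARY KEY,"
        " header TEXT,"    # FITS keywords (PROP-ID, filter, ...) as JSON
        " qa TEXT)"        # QA parameters as JSON, filled when analysis ends
    )
    while True:
        for visit, header in fetch_new_visits():
            con.execute(
                "INSERT OR IGNORE INTO obslog (visit, header) VALUES (?, ?)",
                (visit, json.dumps(header)),
            )
        for visit, qa in fetch_qa_parameters():
            con.execute("UPDATE obslog SET qa = ? WHERE visit = ?",
                        (json.dumps(qa), visit))
        con.commit()
        time.sleep(30)     # 30-second polling interval from the text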
8.2 Viewing and searching interface

8.2.1 QA parameters and observers' notes

OBSLOG displays a summary of the QA analysis on an interactive web page. The basic information shown by default for each visit is (1) visit id, (2) sequential data id in a night assigned by the instrument, (3) observing date and time, (4) filter name, (5) object (field) name, (6) exposure time, (7) azimuth and elevation of the telescope pointing, (8) instrument rotator angle, (9) position angle of the FOV, (10) seeing, (11) sky level, (12) magnitude zero-point (mag ADU−1 s−1), (13) sky transparency, (14) focus position of the telescope, and (15) users' comments. In addition to the above values, links to the following information are provided on the page: (16) FITS header, (17) ellipticity map of point sources across the FOV, (18) seeing (FWHM) map, (19) number density of detected point sources, and (20) quick-look images. The measured QA parameters listed here are the representative values of each exposure derived by the exposure analysis. Figure 5 is a screenshot of the OBSLOG web page. When the mouse pointer hovers over a dedicated button, the FITS header, quick-look image, and image quality maps can be displayed. Figure 6 shows an example of the image quality maps, which are generated by the image quality map analysis stage. Observers can also add or modify columns to show other sets of QA parameters, and can save the modified configuration per observer.

Fig. 5. Screenshot of the whole view of the OBSLOG interface. Each row shows the extracted QA parameters of a visit, with basic information on the data. A quick-look image and a chart for monitoring seeing, sky transparency, and focus offset are shown in the bottom left- and bottom right-hand panels, respectively. Forms to input observers' notes are located on the far right-hand side of each visit in this example. Data search commands are accepted in a form at the top of the page. (Color online)

Fig. 6. Example of image quality maps. From left to right: (a) seeing size (FWHM), (b) typical ellipticity and orientation of elongation of PSF-like sources at each position, and (c) number of PSF-like sources used to derive these two values. (Color online)

This user interface allows observers to search for and list a specific range of visits by giving a selection condition, such as a range of observing dates, filter, exposure time, seeing, or sky transparency. The conditions can be given as a combination of FITS header keyword values and QA parameters in JavaScript notation, which is helpful for performing even complicated data searches. This querying function is used by observers to confirm the completion of exposures and to list usable data sets for science production during and after observation.
The web page also has a form for attaching observers' notes to each visit. These notes can be used for data flagging, together with the QA parameters, to assess the completion of exposures and data selection in the science data production. The notes are recorded in the SQLite3 database with the account name of the user who wrote the note; observers can only edit or delete their own notes. The list of visits on the web page is updated automatically, and only information on updated visits is transferred and appended to the QA list. The resultant QA list can be downloaded as an observation log in CSV, Excel, JSON, or PDF format.

8.2.2 Quick-look images

As described in subsection 6.4, a couple of quick-look images showing the entire FOV are available through OBSLOG. The tiled image with 8 × 8 binning and overscan subtraction is displayed in a pop-up window overlaid on the list of QA parameters. Another tiled image, after flat-fielding, can also be viewed in a separate window; it has a typical dimension of ∼8000 pixels on a side (figure 7). We use the open-source JavaScript library Leaflet to construct a responsive interactive map, which allows observers to pan and zoom very quickly into pixel areas of interest.8

Fig. 7. Example of quick-look images displayed by OBSLOG. The left-hand panel shows the entire field of view of a visit, and a close-up image is shown in the right-hand panel. (Color online)

8.2.3 Variation monitoring

OBSLOG is capable of plotting the temporal variation of selected QA parameters. Observers can specify QA parameters, or an arithmetic combination thereof, to be plotted in a chart, by using JavaScript notation. This function is implemented with a JavaScript plotting library for jQuery (Flot).9 In standard observations, the seeing, sky transparency, and estimated focus offsets are monitored with this function (figure 8).

Fig. 8. Time-sequence plot for monitoring QA parameters. In this example, the temporal variations of seeing, sky transparency, and focus offset from visit to visit are being plotted. The open circles with error bars are the estimated on-focus position of the instrument, and those without error bars show the current instrument position. The data point for focus located at around (21:57, 3.85) is for the full focus sequence. The scales attached to the left-hand side are for sky transparency, seeing (arcsec), and estimated on-focus position of the instrument (mm), from left to right. (Color online)

8.3 Server and client components

OBSLOG is composed of server and client software. The server component is developed with the web application framework Ruby on Rails, and runs on the web server node.
This component is responsible for building the QA logs in the dedicated SQLite3 database. Upon receiving a user's request, this component also executes a query for a data search on the SQLite3 database and returns the result to the client component. In this query, observers can specify the format of the result: HTML, JSON, CSV, Excel, or PDF. The server component generates a file in the given format and transfers it to the client component. The server component also handles user authentication and access control (subsection 8.4). The client component provides the user interface of OBSLOG; it is written in JavaScript and runs in a web browser on the user side. This component decodes the outputs from the server component, displaying the resultant QA list on a web page, and also delegates interactive commands given by the observers to the server component. The user interface is built as a single-page application with the JavaScript framework Knockout10 and jQuery UI, so that all functions are accessible without any page transition. This is to prevent interference with observers' key inputs.

8.4 User authentication and access control

In order to view QA results, observers are requested to log in to the OBSLOG interface. User authentication is done through the Lightweight Directory Access Protocol (LDAP) for user accounts of the Subaru archive system (STARS). When a user is authenticated, the OBSLOG server component determines which observing programs the user belongs to, and records the information in another SQLite3 database that is dedicated to OBSLOG user management. OBSLOG only allows a user to view data whose PROP-ID value in the FITS header matches one of the observing program IDs to which the user has access rights. Since the OBSLOG user interface is the single place where observers monitor the QA results, this access control guarantees proprietary data access rights between observing programs in the OSQAH system.
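The access check of subsection 8.4 amounts to comparing the PROP-ID of each visit with the list of observing programs associated with the authenticated user. The following is a minimal sketch of that comparison; the data structures and the example program IDs are hypothetical stand-ins for the OBSLOG user-management database.

def visible_visits(username, user_programs, visits):
    """Return only the visits whose PROP-ID the user is allowed to see.

    user_programs : dict mapping user name -> set of observing program IDs
    visits        : iterable of dicts holding at least 'visit' and 'PROP-ID'
    """
    allowed = user_programs.get(username, set())
    return [v for v in visits if v["PROP-ID"] in allowed]


# Example with hypothetical program IDs:
programs = {"observer1": {"o17195"}}
rows = [{"visit": 120690, "PROP-ID": "o17195"},
        {"visit": 120692, "PROP-ID": "o16123"}]
print(visible_visits("observer1", programs, rows))   # only visit 120690 remains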
9 Database

This section describes the structure of the databases used in OSQAH. The OSQAH databases are managed by the open-source relational database management system PostgreSQL (version 9.3). We chose PostgreSQL to allow simultaneous read and write access by more than 100 processes. Two database spaces are employed to operate the QA analysis. The database hscpipe is dedicated to managing raw data and DOMEFLAT data (figure 9). This is the same as the data registries used in the HSC data analysis pipeline (Bosch et al. 2018), which provide data analysis processes with pointers to the necessary raw data and DOMEFLAT data. The other database, hsc_onsite, records all results of the QA analysis. This database shares a table structure for meta-data registration with the catalog database for the HSC-SSP data release (Yamada et al. 2014; T. Takata et al. in preparation).

Fig. 9. Database for management of raw data and DOMEFLAT data (hscpipe). Table names (underlined) and the columns important for QA analysis are shown in each box. The tables raw and raw_visit store basic information on raw data in order to identify data to be processed. The file_mng_onsite table holds additional information including the file location and time stamps of data processing. The flat table manages DOMEFLAT data. Each set of DOMEFLAT data is associated with a range of dates (validstart and validend) to identify which DOMEFLAT data should be used for given raw data; the DOMEFLAT data are applied to raw data taken within this period. In all boxes, the column marked with an open star is the primary key of the table, and those with an open circle or a rounded rectangle are given a unique key constraint in the table definitions. The keys connected by a line are used to join the tables in database queries.

9.1 Data registry database

Figure 9 shows the tables and important columns managed in the hscpipe database. To identify raw data, the data analysis processes use the raw and raw_visit tables, which hold basic information on the raw data for each visit and ccd. The application in the frame analysis also refers to the flat table to determine which DOMEFLAT data should be used for the particular raw data being processed. This determination is based on the period specified by the two columns validstart and validend: a set of DOMEFLAT data is chosen if the observing date of the raw data file being processed falls within this period. The file_mng_onsite table is a special table prepared for the OSQAH operation. This table manages the time stamps of QA analysis execution and the root directory used for the raw data repository.

9.2 QA result database

In the QA result database (hsc_onsite), there are two categories of tables (figure 10). The operation tables are designed to manage the analysis history and provide a means of data flagging. The QA result tables store all the derived QA parameters. HEALPix indices (Górski et al. 2005) mapped to all CCD data are available. This is helpful in science data production for identifying the group of CCD data overlapping a given sky area.

Fig. 10. Tables in the database for QA results (hsc_onsite). There are two categories of tables. The operation tables are prepared for recording the analysis history, the locations of processed files, and the time stamps of analysis operations. The QA result tables store analysis results. The frame and exposure tables record the derived QA parameters, and the frame_anaresult and exposure_anaresult tables have flags for data usability. The frame_hpx11 table stores HEALPix indices with Nside = 2048 covering each CCD pixel area, based on the WCS derived in the frame analysis. This information is useful in data production for identifying CCD data overlapping each pre-defined sky tessellation.
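As an illustration of how the Nside = 2048 indices in frame_hpx11 can be used, the sketch below computes the HEALPix pixels covering a small circular region with healpy and indicates the shape of a query against the QA result database. The pixel ordering, the column names, and the join condition are assumptions about the table layout, not the actual OSQAH schema.

import healpy as hp
import numpy as np

NSIDE = 2048   # the "hpx11" indexing corresponds to Nside = 2**11


def healpix_disc(ra_deg, dec_deg, radius_deg):
    """HEALPix pixels (RING ordering assumed) covering a small circular region."""
    vec = hp.ang2vec(ra_deg, dec_deg, lonlat=True)
    return hp.query_disc(NSIDE, vec, np.radians(radius_deg), inclusive=True)


# Frames overlapping the region could then be selected with a query shaped like
#   SELECT DISTINCT f.visit, f.ccd
#     FROM frame f JOIN frame_hpx11 h ON f.frame_id = h.frame_id
#    WHERE h.hpx11 = ANY(%(pixels)s);
# where the column names frame_id and hpx11 are hypothetical.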
Figure 11 summarizes the important columns in each table of the hsc_onsite database. The frame_mng and exp_mng tables maintain the locations of processed files and the time stamps of database loading of processed data. The frame and exposure tables are the places where the derived QA parameters are recorded. The frame_anaresult and exposure_anaresult tables are designed to maintain information on the usability of data. The flags flag_auto, flag_usr, and flag_tag are intended to store 1 (good) or 0 (bad) based on, respectively, automated evaluation, inputs by an observer, and the final judgment combining the former two flags. Another set of columns, purpos and datset, is prepared for the management of data sets in the science data production. These keywords are intended to identify which visits are usable in the data production, which ranges of visits are to be grouped, and which calibration data should be applied to the respective groups. These mechanisms of flagging and data-set management are under development, and the flags are not yet used in the current operation. The analysis table records the analysis configuration and history, as described in the next section.

Fig. 11. Tables and columns in the hsc_onsite database. The analysis table records a history of analysis sessions, in which each night's operation is assigned an analysis session that is tied to a set of configurations for the analysis applications. The frame table registers the QA parameters of each CCD separately, and the exposure table maintains representative values determined by combining the QA parameters from all CCDs in a visit. The exposure table also holds the rms of the major QA parameters between CCDs. The entries marked with an open star are the primary keys in each table.

9.3 Analysis tracking

Analysis tracking is a crucial function for QA. To monitor and examine the QA parameters, we have to identify which version and configuration of the software were used for each QA analysis. The software versions, algorithms, or configurations of the applications may sometimes be modified as the QA system is updated through the observatory's engineering work. All the QA results also need to be verified by reprocessing when necessary.
In OSQAH, we maintain the analysis configurations used in the QA analysis, and the analysis histories from each observing night, using the analysis table (figure 11). The concept of the traceability management of the QA analysis in OSQAH is explained in figure 12.

Fig. 12. Tables for maintaining analysis configurations. The analysis table stores the “rerun” and “config” used in each runOnsite session with a unique ana_id. A rerun is allowed to have only a single configuration. Log files from applications are stored separately by observing date.

A session of the orchestration software runOnsite is started at the beginning of each observing program in a night, and a unique analysis id (ana_id) is assigned to each runOnsite session. A pair of parameters is assigned to a runOnsite session. One parameter is “rerun”, the identifier of an observing program in a night, usually assigned to every observing program as a combination of the observing date and the program name (e.g., ut20170328a_s17a_ssp). The other parameter is “config”, which indicates the configuration directory used in the QA analysis. The runOnsite process submits analysis jobs using the specified rerun and config throughout a runOnsite session, i.e., an observing program in a night. The analysis table holds the combination of rerun and config used in each runOnsite session with a unique ana_id, where one ana_id is tied to one runOnsite session. Thus, we can trace the analysis configuration applied in the QA analysis of each observing program through the ana_id. The config directory points to an operation directory that contains configuration files of analysis parameters and environment variables to be set in the application processes. The config directories and configuration files are prepared in advance of the night operations. The term rerun is derived from the data analysis software hscPipe, in which it specifies the destination of output files of hscPipe applications. It is also used for a conflict check of analysis configurations: an hscPipe process ensures that a single consistent configuration is used in a rerun, to avoid any untraceable mixture of analysis results with different configurations. At the start of a runOnsite session, OSQAH creates the hsc_onsite tables in a new name space, using a PostgreSQL “schema” dedicated to the rerun. This makes it easy to distinguish QA results from different observing programs. The schema name is derived from the rerun name. Figure 13 illustrates how the hsc_onsite database maintains tables for different observing programs (i.e., reruns). In this example, the two left-hand groups show the schemas assigned on the night of 2017-03-25 UT, in which two observing programs (s17a_ssp and s17a_qn000) were carried out, and two different schemas are assigned to the respective programs. In each schema, a set of operation and QA result tables is created.

Fig. 13. Database schema used for night operation in the hsc_onsite database. A schema, which is a name space in a PostgreSQL database, is assigned to each night operation. In each schema, a set of tables for managing QA operation and results is created.
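A minimal sketch of the schema-per-rerun idea is shown below, using the example rerun name from the text and psycopg2 to issue PostgreSQL CREATE SCHEMA and CREATE TABLE statements. The connection parameters and the stripped-down table definition are hypothetical; only the naming of the schema after the rerun follows the paper.

import psycopg2


def prepare_rerun_schema(rerun, dsn="dbname=hsc_onsite"):
    """Create the per-rerun name space and (stripped-down) result tables."""
    schema = rerun.lower()                  # e.g. "ut20170328a_s17a_ssp"
    with psycopg2.connect(dsn) as con, con.cursor() as cur:
        # Schema names cannot be passed as query parameters; a real
        # implementation must validate the rerun string first.
        cur.execute('CREATE SCHEMA IF NOT EXISTS "%s"' % schema)
        cur.execute(
            ('CREATE TABLE IF NOT EXISTS "%s".exposure ('
             ' visit integer PRIMARY KEY,'
             ' seeing real, skylevel real, transparency real)') % schema
        )


# Example, using the rerun name quoted in the text:
# prepare_rerun_schema("ut20170328a_s17a_ssp")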
10 Application results

10.1 Operation

10.1.1 General and SSP observations

Since 2014 March, the OSQAH system has been operating for essentially all general observations, including HSC-SSP, and for some engineering observations. The access control provided by OBSLOG successfully manages the multiple general observations in the system. In the HSC-SSP observations, nightly exposure planning is performed based on the OSQAH QA list with observers' notes, and the completion of exposures is determined from the QA results. Observing logs with QA results generated by OBSLOG are distributed to collaborators to share the survey's progress, and are also provided in the public data release. The data selection in science data production for the data releases to date has been done using the QA database. Thus, our initial goals for the on-site QA are fulfilled by OSQAH, and OSQAH is considered an essential facility for operating the HSC-SSP observations. In addition, several PI-type observing programs aimed at searching for transient sources utilize OSQAH as pre-processing for more sophisticated algorithms such as image subtraction, e.g., Tanaka et al. (2016). This suggests that OSQAH's automated QA processing has the capability to increase the productivity of various science programs.

10.1.2 Queue observation

Queue-mode observation with HSC has been commissioned since 2016 March, and the allocation of queue-mode programs has gradually increased. One of the keys to operating the queue observations is immediate QA of the data and decision-making regarding the completion of a program based on the QA results. The operation of HSC queue-mode observation is designed to use OSQAH for this purpose, with coarse QA (called initial QA) during the night. OSQAH has been enhanced to feed the QA parameters (seeing, transparency, sky level) of every visit, together with the exposure id, to the Gen2 queue-mode observing system using the XML-RPC protocol. This feed has been crucial for the HSC queue-mode observation, to check whether these QA parameters satisfy a given condition and whether a set of exposures (an observing block) is completed soon after the data acquisition.
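The per-visit feed to Gen2 can be pictured as a single remote call carrying the QA parameters and the exposure id. The following sketch uses Python 3's standard xmlrpc.client (the OSQAH software itself is written in Python 2.7); the endpoint URL and the remote method name are hypothetical, since the actual Gen2 interface is not described here.

import xmlrpc.client


def feed_qa_to_gen2(exp_id, seeing, transparency, sky_level,
                    endpoint="http://gen2-host:8000/RPC2"):   # hypothetical URL
    """Send the per-visit QA parameters to the queue-mode observing system."""
    proxy = xmlrpc.client.ServerProxy(endpoint)
    # report_qa is a hypothetical remote method name; Gen2's actual
    # interface may differ.
    return proxy.report_qa({"exp_id": exp_id,
                            "seeing": seeing,
                            "transparency": transparency,
                            "sky_level": sky_level})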
10.2 Performance

The system performance in the case of typical HSC-SSP observations is as follows. The data transfer from HSC to the OSQAH system via Gen2 after the camera shutter closes takes about a minute. The additional overhead for capturing and registering the raw data before executing the analysis processes is ∼30 s. The frame analysis finishes within two minutes. After that, the mksym stage and the exposure analysis are completed in ∼1.5 min. Other analysis processes, including the focus-offset analysis and the tile-image analysis, are usually done before the exposure analysis completes. As the end of the exposure analysis triggers OBSLOG to append a new visit to the QA list, observers obtain access to new QA parameters slightly under five minutes after the end of an exposure. This timing is acceptable for tracking in most observations. However, when relatively short exposures (≲30 s) are repeated, the input exceeds the system capacity and the analysis cycle of OSQAH tends to become slow or even stall, failing to process part of the visits because of timed-out waiting or failures in the TORQUE scheduling service. This is thought to be related to an instability of NFS under high-load conditions, and is an issue to be addressed for more stable operation in the near future.

10.3 Assessments of data

Throughout the three years of HSC operation, all the QA results from OSQAH have been registered in the database together with basic information on the CCD data, such as the telescope pointing and the attitude of the instrument. This database has the potential capability of monitoring the stability of the data characteristics and, to some extent, the condition of the telescope and the instrument, through the data quality parameters. Such assessments with the accumulated data are quite important for maintaining data quality over a long-term survey program, and for achieving successful observations that exploit the expected performance of HSC.

10.3.1 Atmospheric extinction

As an example application of data assessment with the database, we discuss the effect of atmospheric extinction at the site of the Subaru Telescope. From the QA result database, we derive the relation between the elevation of the telescope pointing and the magnitude zero-point per unit time of the g-band data over the three years of general observations including HSC-SSP. Figure 14 shows the result. In the figure, each data point shows the magnitude zero-point and secz of a visit. Here, we divide the whole data set for 2014-09-17 to 2017-03-07 into eight periods. Data taken before 2014-09-17 are excluded, since the OSQAH operation was not regulated then and the resultant QA parameters are less reliable than in the other periods. Since the absolute values of the magnitude zero-points seem to vary between periods, offsets are added so that the data points at secz ≲ 1.3 in all periods coincide with those of the first period (2014-09-17 to 2014-12-27). The offsets applied here range from 0.06 to 0.39 mag. No conversion from the zero-point to the transparency is done, and therefore the assumption of the zero-point under photometric conditions listed in table 1 does not affect the result. We discuss the temporal change in the absolute values in the next section.

Fig. 14. Magnitude zero-points versus elevation of the telescope pointing in the g band. Each point represents the zero-point of a visit. Representative data points with error bars are also shown in each secz bin. The best-fitting line, with a slope of dZP/secz = 0.12, is superimposed on the data points. (Color online)

We see a clear slope in the magnitude zero-point with elevation (secz). The zero-point decreases as the telescope points to lower elevations, although the scatter of the data points toward lower zero-points is considerable, owing to night-to-night variations in the intrinsic sky transparency. To estimate the slope quantitatively, the data points are divided into separate secz bins with an interval of 0.1. In each bin, the median and the standard deviation are derived by performing 1.5σ clipping of outliers three times (filled circles with error bars). We fit a linear function to these representative data points (secz < 2.2) by the least-squares method. The slope (magnitude/airmass) is determined to be dZP/secz = 0.12 from the best-fitting function (solid line in the figure).
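The binning, clipping, and fitting just described can be summarized by the following NumPy sketch, which computes a 1.5σ-clipped median of the zero-points in secz bins of width 0.1 and fits a straight line by least squares. The input arrays, the minimum number of points per bin, and the bin edges are assumptions for illustration.

import numpy as np


def clipped_median(values, nsigma=1.5, n_iter=3):
    """Median after n_iter rounds of nsigma clipping around the median."""
    v = np.asarray(values, dtype=float)
    for _ in range(n_iter):
        med, std = np.median(v), v.std()
        keep = np.abs(v - med) < nsigma * std
        if not keep.any() or keep.all():
            break
        v = v[keep]
    return np.median(v)


def fit_extinction(secz, zeropoint, bin_width=0.1, secz_max=2.2, min_points=3):
    """Clipped medians in secz bins, then a least-squares straight line."""
    secz, zeropoint = np.asarray(secz, float), np.asarray(zeropoint, float)
    edges = np.arange(1.0, secz_max + bin_width, bin_width)
    centers, medians = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (secz >= lo) & (secz < hi)
        if in_bin.sum() < min_points:
            continue
        centers.append(0.5 * (lo + hi))
        medians.append(clipped_median(zeropoint[in_bin]))
    slope, intercept = np.polyfit(centers, medians, 1)
    # The extinction coefficient dZP/secz quoted in the text corresponds to
    # the magnitude of this (negative) slope.
    return slope, intercept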
We find that this slope is close to the atmospheric extinction coefficient at the Mauna Kea summit, 0.14 at 4750 Å, reported in the CFHT Bulletin (Bèland et al. 1988) and referred to by various Mauna Kea observatories, e.g., the Gemini Telescope.11 The difference between the two results corresponds to about 2% at secz = 2.0, which is smaller than a typical magnitude error (≳0.05 mag) of OSQAH. This result supports the conclusion that OSQAH has been deriving reasonable estimates of the sky transparency. It also implies that the atmospheric condition on Mauna Kea may have remained almost stable over these three decades (1988–2017). In the Appendix, the same analysis carried out for the other broad bands is presented.

10.3.2 Temporal change in system efficiency

Another interesting aspect that can be examined by monitoring the magnitude zero-points is the system efficiency. Figure 15 shows the relation between the magnitude zero-points and the observing dates, where each of the eight periods is plotted without any added offsets. The absolute values of the zero-points are seen to decrease with time. The latest data points, for the period 2016-12-23 to 2017-03-07, appear to be lower by 0.4–0.45 mag than those in the 2014-09-17 to 2014-12-27 period. This change corresponds to a 30%–35% decrease in the system efficiency over the interval, if we assume that the change is purely due to degradation of the system efficiency.

Fig. 15. Variation in the magnitude zero-points in the g band since 2014 September. Recent measurements of the zero-points in 2017 March seem to be lower than those of 2014 September by 0.4 mag, or ∼30%. This can be explained partly by degradation in the reflectivity of the primary mirror after three years of operation since the last re-coating. (Color online)

Fig. 16. Same as figure 14, but plotted for the r, i, z, and y bands in each panel. (Color online)

Fig. 17. Same as figure 15, but plotted for the r, i, z, and y bands in each panel. The observing periods in all panels are labeled in the same manner as in figure 15: (1) 2014-09-17–2014-12-27, (2) 2015-03-15–2015-08-20, (3) 2015-10-06–2016-02-12, (4) 2016-03-04–2016-04-15, (5) 2016-06-01–2016-07-12, (6) 2016-07-29–2016-09-07, (7) 2016-09-27–2016-12-01, and (8) 2016-12-23–2017-03-07 UT. (Color online)
In response to this result, the observatory made an investigation and found a decrease in the system efficiency of the High Dispersion Spectrograph (HDS: Noguchi et al. 2002) of ∼35%–15% at 400–550 nm over the three years since the last mirror re-coating in summer 2013 (N. Takato 2017 private communication). Direct measurements of the reflectivity of primary-mirror test pieces by the observatory have also supported this degradation of the mirror reflectivity at blue wavelengths. Therefore, our result could partly be explained by the degradation of the mirror reflectivity, which affects the g band most severely among the HSC broad bands. The next mirror re-coating has been scheduled for late 2017, and is recognized as a high-priority action by the observatory, based in part on the evidence of degradation provided by this monitoring. This result demonstrates that continued quality monitoring by OSQAH with an invariant configuration can provide hints about the health of the telescope and instrument, although there remains a large uncertainty in the interpretation of the QA parameter measurements. Continued QA is quite important for stable operation of HSC observations.

10.4 Issues and future prospects

Developments of new functions and improvements of the system performance are undertaken and being planned, to enhance the usability of the system in observations. To provide a more reliable sky transparency, we have added slight offsets to the magnitude zero-points since the night of 2016-03-03 UT, so that the sky transparency derived by the QA analysis becomes unity under the best sky conditions (table 1). These offsets were determined from the SSP data taken during 2014-03-25 to 2015-11-15 UT, by re-scaling the 95th-percentile maximum of the estimated sky transparency to unity. We need to update the offsets to the zero-points in relatively new filters, such as r2, i2, N515, and N816, based on the accumulated data. Based on suggestions from observers, we plan to make the new QA mode using the PS1 catalog available in regular operation in the near future. This enhancement will extend the sky coverage of the transparency estimation, so that most of the sky accessible from Mauna Kea will be supported. Another suggestion concerns the system's tolerance of repeated short exposures (≲30 s); the current system tends to become unacceptably slow for such fast data inputs. As OSQAH is considered a facility crucial to the HSC operation, especially in the HSC-SSP and queue-mode observations, it is highly desirable to keep the system fully functional at all times. We would like to improve the system performance by introducing a file system that is more responsive under concurrent access, such as a parallel distributed file system. It would also be useful to support analysis modes that process every few visits, or a partial area of the FOV, for faster feedback. These improvements will enable OSQAH to accommodate a wider variety of observations, including searches for transient and variable sources that require both wide sky coverage and high responsiveness of the system. The establishment of documentation, FAQs, and issue tracking for trouble-shooting is also necessary for stable operation.
A data flagging interface for observers is to be implemented; this will enable more efficient survey progress management and data production. The introduction of data-set management with a database, which connects a range of science data with the set of calibration data to be applied, is required in both the HSC-SSP and queue-mode observations. The OSQAH database will be able to provide these new functions with the addition of a few new tables and slight modifications to the existing tables. Once implemented, these functions will assist observations and enhance the legacy value of the HSC data archive. The archive, associated with the QA information, will facilitate data processing and reliable calibration by observers.

11 Summary

We have developed an on-site QA system for HSC (OSQAH) that performs automated quick data processing for evaluating data quality. The OSQAH system was commissioned and has been operating for general observations since 2014 March. The system provides the parameters used for QA, including seeing, sky level, and sky transparency, to observers through a web-based user interface, typically within five minutes of data acquisition. This fast feedback enables exposure planning during a night and efficient survey progress management. Queue-mode observations with HSC also rely on this system for the initial coarse quality check. We have shown how the QA database is useful for assessing the performance of the telescope and instrument. Developments of new features and improvements of the system performance are planned, including extension of the sky coverage with the PS1 catalog and improvement of the system responsiveness for short exposures. These upgrades will make the system more robust for various observations, facilitate data analysis, and enhance the valuable HSC data archive.

Acknowledgements

We are grateful to the anonymous referee for their helpful comments that have improved the manuscript. All the observatory staff are appreciated for their efforts in making the OSQAH system operational. We thank Drs. Masafumi Yagi, Ichi Tanaka, Shin Oya, and Naruhisa Takato for valuable discussions and comments. This paper is based on data collected at the Subaru Telescope and retrieved from the HSC data archive system, which is operated by the Subaru Telescope and the Astronomy Data Center at the National Astronomical Observatory of Japan. The Hyper Suprime-Cam (HSC) collaboration includes the astronomical communities of Japan and Taiwan, and Princeton University. The HSC instrumentation and software were developed by the National Astronomical Observatory of Japan (NAOJ), the Kavli Institute for the Physics and Mathematics of the Universe (Kavli IPMU), the University of Tokyo, the High Energy Accelerator Research Organization (KEK), the Academia Sinica Institute for Astronomy and Astrophysics in Taiwan (ASIAA), and Princeton University. Funding was contributed by the FIRST program from the Japanese Cabinet Office, the Ministry of Education, Culture, Sports, Science and Technology (MEXT), the Japan Society for the Promotion of Science (JSPS), the Japan Science and Technology Agency (JST), the Toray Science Foundation, NAOJ, Kavli IPMU, KEK, ASIAA, and Princeton University. HM is supported by the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. This work is in part supported by MEXT Grants-in-Aid for Scientific Research on Priority Areas (No. JP18072003) and for Scientific Research on Innovative Areas (No. JP15H05892).
This paper makes use of software developed for the Large Synoptic Survey Telescope. We thank the LSST Project for making their code available as free software at http://dm.lsst.org. The Pan-STARRS1 Surveys (PS1) have been made possible through contributions of the Institute for Astronomy, the University of Hawaii, the Pan-STARRS Project Office, the Max-Planck Society and its participating institutes, the Max Planck Institute for Astronomy, Heidelberg and the Max Planck Institute for Extraterrestrial Physics, Garching, Johns Hopkins University, Durham University, the University of Edinburgh, Queen's University Belfast, the Harvard-Smithsonian Center for Astrophysics, the Las Cumbres Observatory Global Telescope Network Incorporated, the National Central University of Taiwan, the Space Telescope Science Institute, the National Aeronautics and Space Administration under Grant No. NNX08AR22G issued through the Planetary Science Division of the NASA Science Mission Directorate, the National Science Foundation under Grant No. AST-1238877, the University of Maryland, Eotvos Lorand University (ELTE), the Los Alamos National Laboratory, and the Gordon and Betty Moore Foundation.

Appendix. Assessments with the magnitude zero-point in the broad bands

The analysis of extinction coefficients based on the magnitude zero-points in the QA database, described in sub-subsection 10.3.1, is presented here for the remaining broad bands. Figure 16 shows the results in the r, i, z, and y bands in four panels. The fitting is done for data points at secz < 2.2 for the r and i bands, and at secz < 2.0 for the z and y bands. The derived extinction coefficients are summarized in table 2. The analysis of the temporal change in the system efficiency (sub-subsection 10.3.2) is also performed in these four bands. In figure 17, we see decreases in the efficiency of ∼10% to 20% in all bands, which are smaller than the decrease in the g band.

Table 2. Optical extinction values.*

  Band  dZP/secz (OSQAH)  dZP/secz (CFHT)  Wavelength (nm)
  g     0.12              0.14             475
  r     0.078             0.11             650
  i     0.057             0.07             800
  z     0.047             0.05             900
  y     0.042             0.04             1000

*The atmospheric extinction coefficients (dZP/secz) estimated by OSQAH are listed, together with those reported in the CFHT Bulletin in 1988 for reference. The CFHT coefficients are values at the given wavelength in the far right column. Collection of further data processed with a fixed set of configurations would be needed to obtain more robust coefficients.

Footnotes

† Based on data collected at the Subaru Telescope, which is operated by the National Astronomical Observatory of Japan.
1 ⟨http://www.naoj.org/Observing/Instruments/HSC/⟩
2 ⟨http://munin-monitoring.org/⟩
3 ⟨https://fits.gsfc.nasa.gov/registry/sip.html⟩
4 ⟨http://www.naoj.org/Observing/Instruments/HSC/sensitivity.html⟩
5 ⟨http://www.adaptivecomputing.com/products/open-source/torque/⟩
6 ⟨https://sqlite.org/index.html⟩
7 ⟨http://www.json.org/⟩
8 ⟨http://leafletjs.com/⟩
9 ⟨http://www.flotcharts.org/⟩
10 ⟨http://knockoutjs.com/⟩
11 ⟨https://www.gemini.edu/sciops/telescopes-and-sites/observing-condition-constraints/extinction⟩

References

Aihara H. et al. 2011, ApJS, 193, 29
Aihara H. et al. 2018a, PASJ, 70, S4
Aihara H. et al. 2018b, PASJ, 70, S8
Axelrod T., Kantor J., Lupton R. H., Pierfederici F. 2010, Proc. SPIE, 7740, 774015
Bèland S., Boulade O., Davidge T. 1988, Canada-France-Hawaii Telescope Information Bulletin, No. 19 (Kamuela, HI: CFHT Corporation)
Bosch J. et al. 2018, PASJ, 70, S5
Chambers K. C. et al. 2016, arXiv:1612.05560
Cuillandre J.-C., Magnier E. A., Isani S., Sabin D., Knight W., Kras S., Lai K. 2002, Proc. SPIE, 4844, 501
DES Collaboration 2016, MNRAS, 460, 1270
Diehl H. T. et al. 2016, Proc. SPIE, 9910, 99101D
Furusawa H. et al. 2011, PASJ, 63, S585
Górski K. M., Hivon E., Banday A. J., Wandelt B. D., Hansen F. K., Reinecke M., Bartelmann M. 2005, ApJ, 622, 759
Gunn J. E., Stryker L. L. 1983, ApJS, 52, 121
Gwyn S. D. J. 2012, AJ, 143, 38
Hanuschik R. W. 2007, ASP Conf. Ser., 376, 373
Hanuschik R. W., Hummel W., Sartoretti P., Silve D. 2002, Proc. SPIE, 4844, 139
Ivezic Z. et al. 2008, arXiv:0805.2366
Jeschke E., Bon B., Inagaki T., Streeper S. 2008, Proc. SPIE, 7019, 70190U
Jurić M. et al. 2015, arXiv:1512.07914
Komiyama Y. et al. 2018, PASJ, 70, S2
Kosugi G. et al. 2000, Proc. SPIE, 4010, 174
Lang D., Hogg D. W., Mierle K., Blanton M., Roweis S. 2010, AJ, 139, 1782
LSST Science Collaborations 2009, arXiv:0912.0201
Magnier E. A., Cuillandre J.-C. 2004, PASP, 116, 449
Magnier E. A. et al. 2013, ApJS, 205, 20
Magnier E. A. et al. 2016, arXiv:1612.0542
Miyazaki S. et al. 2002, PASJ, 54, 833
Miyazaki S. et al. 2012, Proc. SPIE, 8446, 84460Z
Miyazaki S. et al. 2018, PASJ, 70, S1
Mohr J. J. et al. 2012, Proc. SPIE, 8451, 84510D
Noguchi K. et al. 2002, PASJ, 54, 855
Shaw R. A., Levine D., Axelrod T., Lahr R. R., Mannings V. G. 2010, Proc. SPIE, 7740, 7740H
Shupe D. L., Mehrdad M., Jing L., Makovoz D., Narron R. 2005, ASP Conf. Ser., 347, 491
Tabur V. 2007, PASA, 24, 189
Takata T., Yagi M., Yasuda N., Ogasawara R. 2002, Proc. SPIE, 4844, 242
Tanaka M. et al. 2016, ApJ, 819, 5
Utsumi Y. et al. 2012, Proc. SPIE, 8446, 844662
Winegar T. 2008, Proc. SPIE, 7016, 70160M
Yamada Y. et al. 2014, Proc. SPIE, 9149, 91492I

© The Author 2017. Published by Oxford University Press on behalf of the Astronomical Society of Japan. All rights reserved. For Permissions, please email: journals.permissions@oup.com