SAMI Data Reduction
SAMI data reduction divides neatly into two overarching stages: from raw data straight off the telescope to row-stacked spectra (RSS), and from RSS frames to individual galaxy data cubes. RSS frames are two-dimensional arrays containing one-dimensional, wavelength-calibrated spectra from all SAMI fibres for a single exposure. RSS frames represent an intermediate step in the data reduction process and are not included as part of the DR2 release. Data cubes are then formed by extracting all spectra for a given galaxy from each of the contributing RSS frames, then drizzling, combining, and resampling them onto a regular grid. For further details see the following papers: Allen et al. (2015), Sharp et al. (2015), Green et al. (2018), Scott et al. (submitted).
1. Raw data to RSS frames
The following steps are applied by the Two-Degree Field Data Reduction (2dfDR) package to individual raw frames to produce RSS frames:
Bias, dark, and overscan subtraction: Bias and dark frames are subtracted to correct errant CCD pixels for blue arm data taken prior to mid-2014. Both CCDs were upgraded in 2014, making this step largely redundant. An overscan correction is applied, subtracting the bias level in each frame.
Flat-fielding: Each frame is divided by a detector flat generated by averaging (typically >30) fibre flats, for which the spectrograph has been defocussed so that the illumination is relatively uniform. These frames are then filtered to remove large-scale variations, leaving only smaller-scale pixel-to-pixel flat-field variations. Charge spots due to cosmic rays are removed from each individual science frame using a tuned implementation of the LaCosmic routine (see van Dokkum 2001).
Tramline map construction: Fibre locations are traced across the detector, generating a so-called tramline map giving the pixel-by-pixel [x,y] location of each fibre. This is performed using a twilight sky observation. The fibre peaks are identified and fitted approximately using a quadratic fit to the 3 pixels around each peak. Then as a second stage we implement an algorithm that assumes a Gaussian fibre profile (a good approximation to SAMI fibres in AAOmega) and fits five Gaussians (the central one and two either side) to precisely determine both the centre and width of the fibre profile.
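The first-stage peak estimate described above can be sketched as a parabola through the three pixels around each peak. This is a minimal illustration rather than the 2dfDR implementation; the function name `refine_peak` and the synthetic profile are assumptions made for the example.

```python
import numpy as np

def refine_peak(profile, i):
    """Sub-pixel peak position from a parabola through pixels i-1, i, i+1.

    Hypothetical helper for illustration; not a 2dfDR routine.
    """
    y0, y1, y2 = profile[i - 1], profile[i], profile[i + 1]
    curvature = y0 - 2.0 * y1 + y2
    if curvature == 0:
        # Flat top: no sub-pixel information, keep the integer pixel.
        return float(i)
    # Vertex of the parabola passing through the three samples.
    return i + 0.5 * (y0 - y2) / curvature

# Synthetic cut across one fibre, assuming a Gaussian profile.
x = np.arange(20)
true_centre = 9.3
profile = np.exp(-0.5 * ((x - true_centre) / 1.2) ** 2)

peak_pixel = int(np.argmax(profile))
estimate = refine_peak(profile, peak_pixel)
```

For a Gaussian profile the parabola is only approximate, which is consistent with the text using it as a starting point before the Gaussian fit of the second stage.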
Spectral extraction and flat-fielding: Flux from the 2D image is extracted to generate a 1D spectrum for each fibre. An optimal extraction (see Sharp & Birchall 2010) is performed to fit the flux amplitudes perpendicular to the dispersion axis. Gaussian profiles are fit, holding the centre and width constant (i.e., at the values obtained from the tramline and fibre width maps measured above) and fitting all 819 fibres simultaneously. Following extraction, the 1D spectra are divided by an extracted and normalised 1D dome lamp flat-field spectrum, which removes residual fibre-to-fibre variations in spectral response.
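With the centres and widths held fixed, fitting the amplitudes is a linear problem, so all fibres in one detector column can be solved for together. The sketch below shows this for three fibres using an unweighted least-squares solve; the actual optimal extraction of Sharp & Birchall (2010) is more involved, and the names `gaussian_profiles` and `extract_column` are assumptions for illustration only.

```python
import numpy as np

def gaussian_profiles(n_pix, centres, sigmas):
    """Design matrix: one unit-sum Gaussian profile per fibre (columns)."""
    y = np.arange(n_pix)[:, None]
    profiles = np.exp(-0.5 * ((y - centres[None, :]) / sigmas[None, :]) ** 2)
    return profiles / profiles.sum(axis=0)

def extract_column(column, centres, sigmas):
    """Fit the flux of every fibre simultaneously for one detector column.

    Centres and widths are fixed inputs (e.g. from a tramline map), so only
    the amplitudes are free and the fit is linear. Unweighted for simplicity.
    """
    profiles = gaussian_profiles(column.size, centres, sigmas)
    fluxes, *_ = np.linalg.lstsq(profiles, column, rcond=None)
    return fluxes

# Three overlapping synthetic fibres in a 20-pixel column.
centres = np.array([5.0, 9.2, 13.5])
sigmas = np.array([1.1, 1.1, 1.2])
true_flux = np.array([100.0, 50.0, 80.0])
column = gaussian_profiles(20, centres, sigmas) @ true_flux

recovered = extract_column(column, centres, sigmas)
```

Solving for neighbouring fibres together is what allows the overlapping wings of adjacent profiles to be separated; for the full detector the design matrix would have one column per fibre.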
Wavelength calibration: Emission lines in a CuAr arc lamp exposure are identified in the extracted 1D spectra and matched to line lists, and a 3rd-order polynomial is fitted to give the wavelength solution for each fibre. The extracted object spectra are interpolated onto a single, fixed wavelength grid after applying a heliocentric velocity correction.
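As a rough illustration of this step, the sketch below fits a 3rd-order polynomial to arc-line positions for one fibre and then resamples a spectrum onto a common grid. The line positions are synthetic (not real CuAr lines), the function names are hypothetical, linear interpolation stands in for whatever resampling 2dfDR uses, and the sign convention of the velocity correction is an assumption.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def wavelength_solution(line_pixels, line_wavelengths, order=3):
    """Polynomial mapping pixel position to wavelength for one fibre."""
    return np.polynomial.Polynomial.fit(line_pixels, line_wavelengths, order)

def resample(flux, pixel_wavelengths, common_grid, v_helio_kms=0.0):
    """Shift to the heliocentric frame, then interpolate onto a common grid.

    Assumes v_helio_kms is the velocity to add to the observed frame.
    """
    shifted = pixel_wavelengths * (1.0 + v_helio_kms / C_KMS)
    return np.interp(common_grid, shifted, flux)

# Synthetic arc-line positions generated from a known cubic dispersion.
line_pixels = np.array([100.0, 400.0, 800.0, 1200.0, 1600.0, 1900.0])
def true_dispersion(p):
    return 3700.0 + 1.03 * p + 1.0e-5 * p**2 - 2.0e-9 * p**3
line_wavelengths = true_dispersion(line_pixels)

solution = wavelength_solution(line_pixels, line_wavelengths)
pixels = np.arange(2048, dtype=float)
pixel_wavelengths = solution(pixels)

# Resample a flat spectrum onto a fixed grid shared by all fibres.
common_grid = np.linspace(3800.0, 5600.0, 1800)
resampled = resample(np.ones_like(pixels), pixel_wavelengths, common_grid, 12.0)
```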
Fibre throughput correction: Fibre throughputs were calibrated primarily from the relative strength of twilight flat-field frames. If no twilight frame was observed for a given field, a dome flat-field frame was used. These throughput values were then used for subtraction of the night sky spectrum. If the sky residuals after sky subtraction (see below) were large, the fibre throughputs were remeasured using the integrated flux in the night sky lines. If all sky lines were affected by bad pixels (typically only an issue for the blue CCD, which covers only a single sky line), then the mean fibre throughputs, derived from all other frames for this field, were adopted. The sky subtraction was then repeated with the revised throughput values.
Sky subtraction: The sky spectrum is measured from 26 dedicated sky fibres, taking a median spectrum after throughput correction, and subtracted from all spectra.
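The sky subtraction described above can be sketched as follows, assuming the throughput correction is a simple division by a per-fibre scalar. The array layout, the positions of the sky fibres, and the function name `subtract_sky` are assumptions for the example, not details of 2dfDR.

```python
import numpy as np

def subtract_sky(rss, throughput, sky_fibres):
    """Median sky spectrum from the sky fibres, subtracted from every fibre.

    rss        : (n_fibres, n_pix) extracted spectra
    throughput : (n_fibres,) relative fibre throughputs
    sky_fibres : indices of the dedicated sky fibres
    """
    corrected = rss / throughput[:, None]
    sky = np.median(corrected[sky_fibres], axis=0)
    return corrected - sky[None, :], sky

rng = np.random.default_rng(0)
n_fibres, n_pix = 819, 200

# Synthetic sky spectrum and per-fibre throughputs.
true_sky = 10.0 + 5.0 * np.sin(np.linspace(0.0, 20.0, n_pix)) ** 2
throughput = rng.uniform(0.8, 1.2, n_fibres)

# Hypothetical positions of the 26 sky fibres within the frame.
sky_fibres = np.arange(0, n_fibres, 32)

# Constant object flux per fibre, with no object flux in the sky fibres.
object_flux = rng.uniform(0.0, 5.0, (n_fibres, 1)) * np.ones((1, n_pix))
object_flux[sky_fibres] = 0.0
rss = (object_flux + true_sky[None, :]) * throughput[:, None]

sky_subtracted, sky = subtract_sky(rss, throughput, sky_fibres)
```

Taking a median rather than a mean makes the sky estimate robust to a small number of sky fibres contaminated by cosmic rays or faint sources.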
The above steps result in RSS frames consisting of 819 wavelength-calibrated, flat-fielded, one-dimensional spectra.
2. RSS frames to cubes
The 819 fibres in each object RSS frame constitute: 12 hexabundles (61-fibre integral-field units; IFUs) targeted on SAMI Galaxy Survey galaxy targets, 1 hexabundle targeted on a secondary standard star, and 26 sky fibres (which are not used beyond the steps outlined above). Each galaxy field (i.e. set of 12 galaxy targets and one secondary standard star) is observed at least 6 (and typically 7) times. In addition to galaxy object frames, several exposures containing only a single spectrophotometric standard star centred in one hexabundle are also observed throughout the night.
The process of combining and reconstructing these RSS frames into three-dimensional data cubes of individual galaxies is accomplished using the sami data reduction package. In addition, the sami package applies a telluric correction and absolute flux calibration step to each RSS frame and a final flux calibration step to each output data cube. The individual reduction steps are implemented as follows:
Initial flux calibration: Each spectrum is corrected for the large-scale (in wavelength) extinction by the atmosphere at Siding Spring Observatory at the observed airmass.
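The extinction correction follows the usual relation in which a spectrum observed at airmass X is dimmed by k(λ)·X magnitudes, where k(λ) is the extinction curve in magnitudes per unit airmass. A minimal sketch is given below; the extinction values are placeholders chosen for illustration and are not the Siding Spring Observatory curve, and the function name is hypothetical.

```python
import numpy as np

def correct_extinction(flux, extinction_mag_per_airmass, airmass):
    """Undo atmospheric extinction of k(lambda) * airmass magnitudes."""
    return flux * 10.0 ** (0.4 * extinction_mag_per_airmass * airmass)

# Placeholder extinction values (mag per airmass); NOT the Siding Spring curve.
wavelength = np.array([4000.0, 5000.0, 6000.0, 7000.0])
k_placeholder = np.array([0.30, 0.15, 0.10, 0.05])

observed = np.ones_like(wavelength)
corrected = correct_extinction(observed, k_placeholder, airmass=1.2)
```

Because extinction is stronger in the blue, the correction factor grows towards shorter wavelengths and towards higher airmass.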
Primary flux calibration: The spectrum is extracted for each spectrophotometric standard star observed (accounting for light lost between fibres) and compared to the known stellar spectrum to determine the transfer function. Galaxy RSS frames are then multiplied by the transfer function derived from the spectrophotometric observation closest in time to the galaxy frame.
Telluric correction: The spectrum is extracted for the secondary standard star (again accounting for light lost between fibres), selected to be a relatively featureless F-dwarf based on the colours. Using a linear fit across the telluric regions (6850-6960 Å and 7130-7360 Å), the code determines a telluric correction, and divides all galaxy spectra by this correction.
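One way to read the linear fit described above is as a straight-line continuum fitted to the stellar spectrum on either side of each telluric band, with the transmission taken as the ratio of the observed spectrum to that continuum inside the band. The sketch below implements that interpretation; the padding width, the function name, and the interpretation itself are assumptions, and the sami package may differ in detail.

```python
import numpy as np

# Telluric regions quoted in the text, in Angstroms.
TELLURIC_BANDS = [(6850.0, 6960.0), (7130.0, 7360.0)]

def telluric_transmission(wavelength, star_flux, bands=TELLURIC_BANDS, pad=40.0):
    """Estimate atmospheric transmission from a featureless standard star.

    A straight line is fitted to the `pad` Angstroms on each side of a band
    (pad is an illustrative choice) and interpolated across the band.
    """
    transmission = np.ones_like(star_flux, dtype=float)
    for lo, hi in bands:
        in_band = (wavelength >= lo) & (wavelength <= hi)
        side = ((wavelength >= lo - pad) & (wavelength < lo)) | (
            (wavelength > hi) & (wavelength <= hi + pad)
        )
        slope, intercept = np.polyfit(wavelength[side], star_flux[side], 1)
        continuum = slope * wavelength + intercept
        transmission[in_band] = star_flux[in_band] / continuum[in_band]
    return transmission

# Synthetic star: linear continuum with 30% absorption inside both bands.
wavelength = np.arange(6300.0, 7401.0, 1.0)
true_transmission = np.ones_like(wavelength)
for lo, hi in TELLURIC_BANDS:
    true_transmission[(wavelength >= lo) & (wavelength <= hi)] = 0.7
star_flux = (100.0 - 0.01 * (wavelength - 6300.0)) * true_transmission

transmission = telluric_transmission(wavelength, star_flux)

# Galaxy spectra on the same wavelength grid are divided by the correction.
galaxy = 5.0 * true_transmission
galaxy_corrected = galaxy / transmission
```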
Secondary flux calibration: The observed g-band magnitude of the secondary standard star is measured from its extracted spectrum, and compared to the catalogued PSF g-band magnitude for the corresponding star from the Sloan Digital Sky Survey. All galaxy spectra in the frame are scaled by this ratio.
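Since magnitudes are logarithmic, the ratio between the catalogued and observed fluxes follows directly from the magnitude difference. A short worked example is given below; the magnitude zero point and the numerical values are arbitrary and chosen only for illustration.

```python
import numpy as np

def magnitude_scale(observed_mag, catalogue_mag):
    """Multiplicative flux scale that brings observed_mag to catalogue_mag."""
    return 10.0 ** (0.4 * (observed_mag - catalogue_mag))

def synthetic_mag(band_flux, zero_point=0.0):
    """Magnitude of a band-integrated flux (arbitrary zero point)."""
    return -2.5 * np.log10(band_flux) + zero_point

# Star observed 0.1 mag fainter than its catalogue value.
band_flux = 2.0e-3
observed_mag = synthetic_mag(band_flux)
catalogue_mag = observed_mag - 0.1

scale = magnitude_scale(observed_mag, catalogue_mag)
rescaled_mag = synthetic_mag(band_flux * scale)
```

A star observed fainter than its catalogue magnitude gives a scale greater than one, so every spectrum in the frame is brightened by the same factor.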
Centering: The centroid of each of the 12 galaxies in the frame is fit using a two-dimensional Gaussian and a simple empirical model describing the telescope offset and atmospheric refraction. Centering is repeated for all frames for a given galaxy field to measure the variation in centroid for each galaxy from frame to frame. Each individual galaxy is aligned across frames using the measured centroids.
Cube creation: Each frame is drizzled onto a regular 0.5 arc second-square spaxel grid. The flux in each output spaxel is taken to be the mean of the flux in each input fibre, weighted by the fractional spatial overlap of that fibre with the spaxel. To regain some of the spatial resolution that would otherwise be lost in convolving the 1.6 arc second fibres with 0.5 arc second spaxels, the overlaps are calculated using a fibre footprint with a diameter of only 0.8 arc seconds. The centroid of a galaxy varies as a function of wavelength due to atmospheric refraction; this effect was corrected for by re-calculating the drizzle locations when the expected shift due to atmospheric refraction exceeded 1/50$^\mathrm{th}$ of a spaxel. The derived flux cube is then multiplied by a weight cube (see below), such that the output flux cube is in units of $10^{-16}\ \mathrm{erg}\ \mathrm{s}^{-1}\ \mathrm{cm}^{-2}$.
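The overlap weighting can be illustrated for a single wavelength slice. The sketch below estimates the fractional overlap of a circular 0.8 arc second footprint with each 0.5 arc second spaxel by sub-sampling the grid, then forms the overlap-weighted mean of the fibre fluxes. The grid size, oversampling factor, and function names are assumptions for the example, and the sami package computes the overlaps differently in detail.

```python
import numpy as np

def overlap_fractions(fibre_x, fibre_y, n_spaxels=10, spaxel=0.5,
                      footprint_diameter=0.8, oversample=20):
    """Fraction of each spaxel covered by a circular fibre footprint.

    Positions are in arc seconds from the grid corner. The overlap is
    estimated numerically on an oversampled grid (an illustrative choice).
    """
    n = n_spaxels * oversample
    centres = (np.arange(n) + 0.5) * spaxel / oversample
    xx, yy = np.meshgrid(centres, centres)
    inside = (xx - fibre_x) ** 2 + (yy - fibre_y) ** 2 <= (footprint_diameter / 2) ** 2
    # Average the sub-pixels belonging to each spaxel (rows are y, columns x).
    return inside.reshape(n_spaxels, oversample, n_spaxels, oversample).mean(axis=(1, 3))

def drizzle(fibre_xy, fibre_flux, **grid_kwargs):
    """Overlap-weighted mean of fibre fluxes in each output spaxel."""
    weights = np.array([overlap_fractions(x, y, **grid_kwargs) for x, y in fibre_xy])
    weight_sum = weights.sum(axis=0)
    flux = np.asarray(fibre_flux, dtype=float)[:, None, None]
    with np.errstate(invalid="ignore", divide="ignore"):
        # Spaxels with no coverage become NaN.
        image = (weights * flux).sum(axis=0) / weight_sum
    return image, weight_sum

# One fibre: the summed overlap should match the footprint area.
single = overlap_fractions(2.6, 2.4)
footprint_area = single.sum() * 0.5**2

# Two overlapping fibres with different fluxes.
image, weight_sum = drizzle([(2.0, 2.0), (2.6, 2.0)], [1.0, 3.0])
```

Using a footprint smaller than the physical fibre concentrates each fibre's flux into fewer spaxels, which is how the drizzling recovers some spatial resolution at the cost of needing several dithered frames to cover every spaxel.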
Final flux calibration: The spectrum of the secondary standard star is re-extracted from the final data cube using a Moffat profile fit. The g-band magnitude is again calculated and a final scaling relative to the catalogue value is applied to all galaxy cubes from the same field. The total effective seeing of all galaxies in the field is also measured from the Moffat fit.
The above data reduction process is applied simultaneously to both blue and red arm data, with separate blue and red data cubes produced as output. The secondary and final flux calibration scaling values are derived from the blue arm data but are applied to both arms.
3. Changes between DR1 and DR2
Several improvements were made to the data reduction pipeline between the 1st and 2nd data releases. All changes are documented and their effects quantified in Scott et al. (2018), but here we briefly summarise these changes. For DR2, the version of the SAMI Python pipeline identified by Mercurial changeset ID 17EBC0FF0A1C was used in conjunction with 2dfDR v6.65.
Spectral extraction: Twilight frames are now used to derive all tramline maps, and the preliminary scattered light model has been improved, resulting in improved extraction and reduced noise below ~4000 Å.
Fibre flat-fielding: Twilight frames are used to derive fibre throughputs, resulting in improved absolute flux calibration below ~4000 Å.
Wavelength resampling: The wavelength solution is corrected for heliocentric motion, and all galaxies are sampled onto a single, fixed, common wavelength scale, resulting in a modified wavelength range and sampling.