Data analysis is a two-step process that combines a classical first stage with a data-driven second stage. It is described in detail in our second data release paper (Buder et al. 2018).
In the first stage, the SME program (Valenti & Piskunov 1996, Piskunov & Valenti 2017) is used with 1D LTE MARCS model atmospheres (Gustafsson et al. 2008) to analyse a subset of GALAH stars with high-quality spectra and stellar parameters that span the space of the full data set. For each star, stellar parameters (Teff, log(g), [M/H], microturbulent velocity and v sin(i)) are determined simultaneously by synthesis of Fe, Ti and Sc lines, along with H-alpha and H-beta. With those values set, elemental abundances are derived using wavelength windows chosen to be free from blending.
In the second stage, these training set spectra and stellar parameters are fed into The Cannon (Ness et al. 2015), a data-driven generative modelling approach to label determination. The Cannon builds a quadratic model at each pixel (i.e., wavelength step) of the normalised spectrum as a function of the labels. This model is then used to determine the labels for the bulk of the spectra at low computational cost.
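To illustrate the idea of a per-pixel quadratic model, here is a minimal sketch in Python. It is not The Cannon's actual implementation: it assumes unweighted least squares (the real method also uses flux uncertainties and a per-pixel scatter term), and the function names quadratic_design, train, predict and infer_labels are hypothetical, chosen only for this example.

```python
import numpy as np
from itertools import combinations_with_replacement


def quadratic_design(labels):
    """Build the columns [1, l_i, l_i*l_j (i <= j)] from an
    (n_stars, n_labels) array of labels."""
    labels = np.atleast_2d(np.asarray(labels, dtype=float))
    n_stars, n_labels = labels.shape
    cols = [np.ones(n_stars)]
    cols += [labels[:, i] for i in range(n_labels)]
    cols += [labels[:, i] * labels[:, j]
             for i, j in combinations_with_replacement(range(n_labels), 2)]
    return np.column_stack(cols)


def train(labels, flux):
    """Fit one quadratic model per pixel (assumption: unweighted least squares).

    labels: (n_stars, n_labels) training labels
    flux:   (n_stars, n_pixels) normalised training spectra
    Returns coefficients of shape (n_terms, n_pixels).
    """
    design = quadratic_design(labels)
    coeffs, *_ = np.linalg.lstsq(design, flux, rcond=None)
    return coeffs


def predict(coeffs, labels):
    """Generate model spectra for the given labels."""
    return quadratic_design(labels) @ coeffs


def infer_labels(coeffs, flux, start, n_iter=20, eps=1e-4):
    """Estimate the labels of one spectrum by Gauss-Newton iteration,
    using a forward finite-difference Jacobian of the model."""
    x = np.asarray(start, dtype=float).copy()
    for _ in range(n_iter):
        model = predict(coeffs, x)[0]
        resid = flux - model
        jac = np.empty((flux.size, x.size))
        for k in range(x.size):
            step = np.zeros_like(x)
            step[k] = eps
            jac[:, k] = (predict(coeffs, x + step)[0] - model) / eps
        dx, *_ = np.linalg.lstsq(jac, resid, rcond=None)
        x += dx
    return x
```

In this sketch, train would be run once on the training set, and infer_labels would then be called for each survey spectrum, starting from a representative point such as the mean training labels. Rescaling the labels to a comparable range before fitting is likely to help numerical stability.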
The labels are comprehensively flagged. The flags identify cases where the label result may lie too far from the training set, where the χ² between the observed spectrum and the spectrum calculated by The Cannon is too large, or where the spectrum has been classified by t-SNE as having problems (for details on the application of t-SNE to GALAH spectra see Traven et al. 2017).
For work involving stellar parameters and abundances, we recommend using only those values for which flag_cannon=0 and, for each element x of interest, flag_x_fe=0.
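As an illustration of this recommendation, the following Python sketch builds a selection mask from the flag columns. It assumes the catalogue has been loaded into a table-like object that returns a column when indexed by name (a dict of arrays is used here), and that element flag columns follow the flag_x_fe pattern with x in lower case. The function name select_reliable and the toy values are hypothetical.

```python
import numpy as np


def select_reliable(catalogue, elements):
    """Return a boolean mask of rows with flag_cannon == 0 and
    flag_<x>_fe == 0 for every element x in `elements`."""
    mask = np.asarray(catalogue["flag_cannon"]) == 0
    for x in elements:
        # Assumption: the column name uses the lower-case element symbol.
        mask &= np.asarray(catalogue[f"flag_{x.lower()}_fe"]) == 0
    return mask


# Toy catalogue with made-up flag values, for illustration only.
toy = {
    "flag_cannon": np.array([0, 0, 1, 0]),
    "flag_mg_fe": np.array([0, 1, 0, 0]),
    "flag_o_fe": np.array([0, 0, 0, 2]),
}
good = select_reliable(toy, ["Mg", "O"])
```

Applying the mask per element, rather than requiring every element flag to be zero at once, keeps stars whose abundances are reliable for the elements under study even if other elements are flagged.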