Atomic data and spectral modeling constraints from high-resolution X-ray observations of the Perseus cluster with Hitomi

The Hitomi SXS spectrum of the Perseus cluster, with $\sim$5 eV resolution in the 2-9 keV band, offers an unprecedented benchmark of the atomic modeling and database for hot collisional plasmas. It reveals both successes and challenges of the current atomic codes. The latest versions of AtomDB/APEC (3.0.8), SPEX (3.03.00), and CHIANTI (8.0) all provide reasonable fits to the broad-band spectrum, and are in close agreement on best-fit temperature, emission measure, and abundances of a few elements such as Ni. For the Fe abundance, the APEC and SPEX measurements differ by 16%, which is 17 times higher than the statistical uncertainty. This is mostly attributed to the differences in adopted collisional excitation and dielectronic recombination rates of the strongest emission lines. We further investigate and compare the sensitivity of the derived physical parameters to the astrophysical source modeling and instrumental effects. The Hitomi results show that an accurate atomic code is as important as the astrophysical modeling and instrumental calibration aspects. Substantial updates of atomic databases and targeted laboratory measurements are needed to get the current codes ready for the data from the next Hitomi-level mission.


Introduction
Many major achievements in X-ray studies of clusters of galaxies were made possible by the advent of new X-ray spectroscopic instruments. The proportional counters on the Ariel V mission (spectral resolving power R ≡ E/∆E ∼ 6) revealed the highly ionized Fe line emission near 7 keV in the Perseus cluster (Mitchell et al. 1976), establishing the thermal origin of cluster X-rays. The CCDs (R = 10-60) onboard the ASCA satellite further identified line emission from O, Ne, Mg, Si, S, Ar, Ca, and Ni in the hot intracluster medium (ICM: Fukazawa et al. 1994; Mushotzky et al. 1996). The Reflection Grating Spectrometer (RGS: R = 50-100 for spatially extended sources) onboard XMM-Newton (Tamura et al. 2001; Kaastra et al. 2001) discovered the lack of strong cooling flows in cool-core clusters. Most recently, the Soft X-ray Spectrometer (SXS: Kelley et al. 2016) onboard the Hitomi satellite (Takahashi et al. 2016) revealed the low energy density of turbulent motions in the central region of the Perseus cluster with a resolving power of R ∼ 1250 (Hitomi Collaboration et al. 2016). Each iteration of higher resolution spectroscopy enhances our understanding of clusters and other cosmic objects.
As more high-resolution X-ray spectra become available, the X-ray community (including observers, theoreticians, and laboratory scientists) urgently needs accurate and complete atomic data and plasma models. As a first step in achieving this, we compare the current data and models (collectively called "codes" hereafter). The most used plasma codes in X-ray astronomy are AtomDB/APEC (Smith et al. 2001; Foster et al. 2012), SPEX (Kaastra et al. 1996), and CHIANTI (Dere et al. 1997; Del Zanna et al. 2015). The AtomDB code descends from the original work of Raymond & Smith (1977), SPEX started with Mewe (1972), and CHIANTI started with Landini & Monsignori Fossi (1970). All these codes have evolved significantly since their beginnings, often stimulated by the challenges imposed by new generations of instruments. A code comparison is clearly needed to verify the scientific output and to understand systematic uncertainties in the results originating from the codes and atomic databases. However, few code comparisons have been done (e.g., Audard et al. 2003), and in particular, so far there is no comparison based on high-resolution X-ray spectra of galaxy clusters.
The Hitomi X-ray observatory was launched on February 17, 2016. Among the main scientific instruments, the SXS has an unprecedented resolving power of R ∼ 1250 at 6 keV over a 6×6 pixel array (3′×3′). It has a near-Gaussian energy response with FWHM = 4-6 eV over the 0.3-12 keV band (Leutenegger et al. in prep.). The X-ray mirror has an angular resolution with a half-power diameter of 1.′2 (Maeda et al. accepted.). A gate valve, which includes a Be window that absorbs most X-rays below ∼2 keV, was in place for early observations to minimize the risk of contamination from out-gassing of the spacecraft (Tsujimoto et al. 2016). As the SXS is a non-dispersive instrument (unlike gratings), it can be used to observe extended objects without a loss of spectral resolution. This makes the SXS the best instrument for high-resolution spectroscopic studies of galaxy clusters. The Perseus cluster was observed as the first-light target of the SXS, and the first paper showing its spectroscopic capabilities focused on the turbulence in the Perseus cluster (Hitomi Collaboration et al. 2016).
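The relation between the quoted resolving power and the quoted energy resolution can be checked with a few lines of Python. This is only an illustrative sketch of the definition R ≡ E/∆E given in the introduction; the function name and the sampled FWHM values (taken from the 4-6 eV range quoted above) are choices made here, not part of the analysis pipeline.

```python
def resolving_power(energy_ev: float, fwhm_ev: float) -> float:
    """Return R = E / dE for a line at energy_ev observed with resolution fwhm_ev."""
    if fwhm_ev <= 0:
        raise ValueError("FWHM must be positive")
    return energy_ev / fwhm_ev


# Evaluate R at 6 keV for the FWHM range quoted for the SXS (4-6 eV).
for fwhm in (4.0, 5.0, 6.0):
    r = resolving_power(6000.0, fwhm)
    print(f"FWHM = {fwhm:.0f} eV -> R = {r:.0f}")
```

Under this definition, R ∼ 1250 at 6 keV corresponds to a FWHM of about 4.8 eV, which lies inside the quoted 4-6 eV range.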
With these data, we can also measure abundances (Hitomi Collaboration et al. 2017b: hereafter Z paper), temperature structure (Hitomi Collaboration et al. in prep.: T paper), and resonance scattering (Hitomi Collaboration et al. accepted.a: RS paper). These quantities are essential to understand the origin and evolution of galaxy clusters (see review by Böhringer & Werner 2010). Metal abundances trace products of billions of supernova explosions integrated over cosmic time, and the measurements are crucial for understanding the chemical evolution of the ICM as well as the evolution and explosions of progenitor stars (Werner et al. 2008). Temperature structure or anisothermality gives an insight into the thermodynamics of the ICM and is thus important for understanding the heating mechanism that counteracts effective radiative cooling in a dense core region (Peterson & Fabian 2006). Resonance scattering is another, indirect tool to assess turbulence, one of the candidate mechanisms of the ICM heating. The precision required for these quantities depends on the astrophysical objectives: for instance, for the cosmic star-formation history the Ni-to-Fe abundance ratio needs to be measured to ≈10%, and for detection of resonance scattering with the Fe Heα complex the forbidden-to-resonance (z-to-w) line ratio needs to be measured to a few percent (see individual topical papers for details).
In this paper we focus on the atomic physics and modeling aspects of the Perseus spectrum with the Hitomi SXS. We show that this high-resolution spectrum offers a sensitive probe of several important aspects of cluster physics including turbulence, elemental abundance measurements, and structures in temperature and velocity (section 3). We investigate the sensitivity of the related derived physical parameters to various aspects of the spectroscopic codes (section 4) and their underlying atomic data (section 5), spectral (section 6) and astrophysical (sections 7 and 8) modeling, as well as fitting techniques (section 9). By consolidating these systematic factors and by comparing them to statistical uncertainties as well as the systematic factors due to instrumental calibration effects (appendix 3), we can evaluate with what precision the important quantities can be determined. This allows us to be optimally prepared for future high-resolution X-ray missions. We highlight the relative changes to each parameter caused by different atomic models and other modeling choices, rather than the changes in fitting statistics, since the former is more fundamental for understanding the systematic uncertainties in the scientific results. The astrophysical interpretation of our derived parameters is not discussed in this paper, but will be in a series of separate papers focusing in greater detail on the relevant astrophysics, e.g., abundances (Z paper), temperature structure (T paper), resonance scattering (RS paper), velocity structure (Hitomi Collaboration et al. accepted.b: V paper), and the central active galactic nucleus (AGN) of the Perseus cluster (Hitomi Collaboration et al. accepted.c: AGN paper). Also, we do not examine combined effects of different types of systematic factors (e.g., plasma-code dependence in the detailed astrophysical modeling like multitemperature models), which will be separately discussed in the individual topical papers.

Data reduction
In this paper, the cleaned event data in the pipeline products version 03.01.005.005 are analyzed with the Hitomi software version 005a and the calibration database (CALDB) version 005 (Angelini et al. 2016; footnote 1). There are four Hitomi observations of the Perseus cluster (name: sequence number = "Obs 1": 100040010, "Obs 2": 100040020, "Obs 3": 100040030-100040050, and "Obs 4": 100040060). The instrument had nearly reached thermal equilibrium by Obs 4 (Fujimoto et al. 2016), and the calibrations of Obs 2 and Obs 3 can be checked against Obs 4 because of their overlapping fields of view (FOVs), but the FOV of Obs 1 does not overlap the others and the instrument was the most out of equilibrium during that pointing. Hence only Obs 2, 3, and 4 are used in this work.
Events registered at Earth elevation angles below two degrees and during passages of the South Atlantic Anomaly were already excluded by the pipeline processing that created the cleaned events file. Events coincident with the particle veto had also already been rejected. Data were further screened by the criteria described as "recommended screening" in the Hitomi data reduction guide (footnote 2) to remove events with distorted pulse shapes or coincident events in any two pixels, which further reduces the background, though the difference is negligible given the surface brightness of the Perseus cluster. For all three observations (Obs 2-4), only high-resolution primary events (an event with no pulse in the interval 69.2 ms before or after it) were extracted and used. This choice is acceptable because relative ratios are the same between different event types (Seta et al. 2012; Ishisaki et al. 2016).
The line broadening due to the spatial velocity gradient in the ICM is removed, since it is not relevant to the atomic study. To do this, we apply an additional energy scale correction (also used in Hitomi Collaboration et al. 2016, 2017a), forcing the strong Fe-K lines to appear at the same energy in each pixel, aligned to the same redshift as the central AGN (z = 0.01756 or cz = 5264 km s⁻¹: Ferruit et al. 1997). This also removes residual gain errors in the Fe-K band. The effect of the spatial velocity correction on the baseline-model fitting (section 3) is discussed in appendix 3. A recent measurement of NGC 1275 indicates an alternative redshift of 0.017284±0.00005 (V paper). In this paper, we do not refer to the new value, since its impact on the other fitting parameters would be washed out by the non-linear energy-scale correction applied later (appendix 1) or by the redshift component included in the baseline model (section 3).
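As a quick consistency check of the redshift values quoted above, the following Python sketch converts z to cz and evaluates the velocity offset between the adopted and the alternative redshift. The helper name is hypothetical, and the simple product cz is used only because the text itself quotes cz; no relativistic correction is applied.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def cz_km_s(z: float) -> float:
    """Return the recession velocity cz (km/s) for redshift z, without relativistic terms."""
    return z * C_KM_S


z_adopted = 0.01756       # value used for the per-pixel energy alignment
z_alternative = 0.017284  # alternative NGC 1275 value quoted from the V paper

cz_adopted = cz_km_s(z_adopted)
offset_km_s = cz_km_s(z_adopted) - cz_km_s(z_alternative)

print(f"cz (adopted)  = {cz_adopted:.0f} km/s")
print(f"velocity offset between the two redshifts = {offset_km_s:.1f} km/s")
```

The adopted value reproduces the quoted cz = 5264 km s⁻¹, and the two redshifts differ by roughly 80 km s⁻¹ in cz.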
The large-size redistribution matrix files (RMFs) for high-primary events created by sxsmkrmf are used to take into account the main Gaussian component, the low-energy exponential tail, and escape peaks of the line spread function (Leutenegger et al. in prep.). We have also tested two other types of RMFs: the small-size RMFs, which include only the Gaussian core, and the extra-large-size RMFs, which include all components of the large-size RMFs plus the electron-loss continuum. The effect of changing the RMF type is discussed in appendix 3. The ancillary response files (ARFs) are generated separately for the diffuse emission and the point-source component. To enhance the precision of the diffuse ARFs, a background-subtracted Chandra image of the Perseus cluster in the 1.8-9.0 keV band, whose AGN core is replaced with the average value of the surrounding regions, is used to provide the spatial distribution of seed photons. Since the effective area is estimated based on the input image with a radius of 12′, which is larger than the detector FOV (3′×3′), the measured spectral normalization reported in this paper is larger than the actual value. We do not correct for this effect since this paper is focused on the relative uncertainties instead of the absolute values. We have further tested using the point-source ARF for both components, and show the effects in appendix 3.
The non-X-ray background (NXB) of the SXS is much lower than those of the X-ray CCDs thanks to the anti-coincidence screening, which reduces the NXB rate by a factor of ≈10 (Kilbourne et al. accepted.). We extract the NXB spectrum from Earth occultation data with sxsnxbgen, and screen it with the standard NXB criteria and the same additional screening as the source events. The NXB spectrum is taken into account as a SPEX file model in the baseline analysis (section 3). Other background components, which include the cosmic X-ray background and Galactic foreground emission, are negligible for the Perseus data. The relative changes of the baseline parameters for a fit without the NXB are shown in appendix 3.
Footnote 2: See https://heasarc.gsfc.nasa.gov/docs/hitomi/analysis/ .
The main remaining issue in the data analysis is that the planned calibration procedures were not fully available for these early observations. In particular, the contemporaneous calibration of the energy scale (or gain) for the detector array was not yet carried out. The previous Hitomi papers (Hitomi Collaboration et al. 2016, 2017a) focused on a relatively narrow energy range; in this work we study a wide energy band of 1.9-9.5 keV. This forces us to apply two additional corrections, to the energy scale and to the effective area, as described in appendix 1.

Baseline model
The result of a spectral model fit is a list of parameters representing the source. These parameters depend on several factors, like the statistical quality of the data, the instrument calibration, background subtraction method, fitting techniques, spectral model components, physical processes included in the spectral model, and atomic parameters. All of these factors contribute to the final set of source parameters that is derived. Apart from the statistical uncertainties, all other factors act like a kind of systematic uncertainty, and by carefully analyzing each individual contribution we can assess its contribution to the final uncertainty.
We proceed as follows. Below we define our baseline best-fit model and explain why we incorporate each component in the model. We then list the best-fit parameters with their statistical uncertainties. The effects of the different systematic factors are in general not excessively large, and therefore we list their impact by showing by how much the best-fit parameters are increased or decreased due to these factors. Usually the statistical uncertainties on the best-fit parameters are very similar for all investigated cases, so we only list the statistical uncertainties of the baseline model.
We use the SPEX package (Kaastra et al. 1996) to define the baseline model because it allows us to test the system in a straightforward way. The version of SPEX that is being used here is 3.03.00. It calculates all relevant rates, ion concentrations, level populations, and line emissivities on the fly (see section 4.1 for more details).
We use optimally binned spectra (using the SPEX obin command structure; see appendix 1.3) with C-statistics (Cash 1979). This choice will be elaborated later (section 9).
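For readers unfamiliar with the statistic, the sketch below evaluates the C-statistic in the widely used form C = 2 Σ [m − d + d ln(d/m)], where d are the observed counts and m the model-predicted counts per bin. This form adds a data-only term to the expression of Cash (1979) so that a perfect fit gives zero; that the fitting package evaluates exactly this expression is an assumption made here, and the function name is hypothetical.

```python
import math


def cstat(data_counts, model_counts):
    """C-statistic for Poisson data: 2 * sum(m - d + d * ln(d / m)).

    Bins with zero observed counts contribute 2 * m, since d * ln(d / m) -> 0.
    """
    total = 0.0
    for d, m in zip(data_counts, model_counts):
        if m <= 0:
            raise ValueError("model counts must be positive in every bin")
        total += m - d
        if d > 0:
            total += d * math.log(d / m)
    return 2.0 * total


# Toy example with made-up counts (not Hitomi data).
observed = [12, 7, 0, 25]
predicted = [10.5, 8.2, 0.4, 24.0]
print(f"C-stat = {cstat(observed, predicted):.3f}")
```

Unlike χ², this statistic remains valid for bins with few or zero counts, which is why it is a natural choice for finely binned high-resolution spectra.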
All abundances are relative to the Lodders & Palme (2009) proto-solar abundances with free values relative to those abundances for the relevant elements.
The dominant spectral component is a collisionally ionized plasma, with a temperature of about 4 keV (Hitomi Collaboration et al. 2016), modeled with the SPEX cie model. For the ionization balance we choose the Urdampilleta et al. (2017) ionization balance (for more detail see section 5.4). The electron temperature and the abundances of Si, S, Ar, Ca, Cr, Mn, Fe, and Ni are free parameters; the abundances of all other metals (usually with no or very weak lines in the bandpass of the Hitomi SXS) are tied to the Fe abundance. In addition, we leave the turbulent velocity free; the value of this turbulent velocity has been discussed in detail in Hitomi Collaboration et al. (2016). Although in SPEX the magnitude of turbulence is parameterized by a two-dimensional root-mean-square velocity vmic assuming an isotropic velocity distribution, we convert it into a one-dimensional line-of-sight (LOS) velocity dispersion σv (= vmic/√2) and use it throughout this paper to enable direct comparisons to the previous studies (Hitomi Collaboration et al. 2016, 2017a). The Hitomi SXS spectrum of the Perseus cluster shows clear signatures of resonance scattering (RS paper); in addition, we may expect absorption of He-like line emission by Li-like ions (Mehdipour et al. 2015). To account for both effects, we include in our model the absorption from a CIE plasma as modeled by the SPEX hot model. The hot model calculates the continuum and line absorption from a plasma with the temperature, chemical composition, turbulent velocity and outflow velocity as free parameters. This absorption is applied to all emission components from the cluster. Because the FOV of the Hitomi SXS is relatively small compared to the size of the Perseus cluster, the effects of resonance scattering to lowest order imply the removal of photons from the line of sight towards the cluster core; we do not observe the re-emitted photons further away from the nucleus.
A more sophisticated resonance scattering model is discussed by RS paper. In order not to over-constrain the model, we leave only the column density of the hot absorbing gas N_H,hot free, and tie the other parameters (electron temperature, abundances, turbulent and outflow velocities) to the values of the main 4-keV emission component (but see section 7).
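The conversion between the SPEX turbulence parameter and the LOS velocity dispersion used in this paper, σv = vmic/√2, together with the resulting Gaussian Doppler width of a line, can be sketched as follows. The function names and the input value of 200 km/s are purely illustrative assumptions and are not fit results.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def sigma_v_from_vmic(vmic_km_s: float) -> float:
    """Convert the two-dimensional rms velocity vmic to the 1D LOS dispersion."""
    return vmic_km_s / math.sqrt(2.0)


def line_sigma_ev(line_energy_ev: float, sigma_v_km_s: float) -> float:
    """Gaussian Doppler width (sigma, in eV) of a line for a given LOS dispersion."""
    return line_energy_ev * sigma_v_km_s / C_KM_S


vmic_example = 200.0  # km/s, hypothetical value chosen for illustration
sigma_v = sigma_v_from_vmic(vmic_example)
print(f"sigma_v = {sigma_v:.1f} km/s")
print(f"Doppler sigma at 6.7 keV = {line_sigma_ev(6700.0, sigma_v):.2f} eV")
```

For dispersions of order 100 km/s the Doppler width at the Fe-K lines is a few eV, comparable to the SXS resolution, which is why the turbulent velocity is measurable from these data.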
Our spectrum also contains a contribution from the central AGN of NGC 1275. This is modeled by a power law (SPEX component pow) plus two Gaussians (gaus) for the neutral Fe Kα lines. We use a power-law model which has a 2-10-keV luminosity of 2.4×10³⁶ W or a flux of 3.5×10⁻¹⁴ W m⁻², almost one fifth of the total 2-10-keV luminosity of the observed field, and a photon index of 1.91. The Gaussian lines have rest-frame energies of 6.391 keV and 6.404 keV, an intrinsic FWHM of 25 eV and a total luminosity of 5.6×10³³ W or a total flux of 8.0×10⁻¹⁷ W m⁻². We have kept the parameters of the central AGN frozen in our fits to the above values. The above model and parameter values are from the initial evaluation for AGN paper, and have been updated later. Updating the AGN spectrum modeling results in slightly different best-fit values of the baseline model (section 8.3), but the changes are so small that relative differences in the ICM parameters due to other systematic factors are unchanged. Thus we use the original AGN model and parameters throughout this paper except in section 8.3.
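The luminosity and flux pairs quoted for the AGN components can be cross-checked through L = 4πd²F. The sketch below derives the distance implied by each pair; it ignores K-corrections and any band-dependent effects, and the helper name and the parsec-to-metre constant are introduced here for illustration only.

```python
import math

MPC_IN_M = 3.0857e22  # one megaparsec in metres


def implied_distance_mpc(luminosity_w: float, flux_w_m2: float) -> float:
    """Distance (Mpc) for which L = 4 * pi * d**2 * F holds."""
    d_m = math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2))
    return d_m / MPC_IN_M


d_powerlaw = implied_distance_mpc(2.4e36, 3.5e-14)  # power-law component
d_lines = implied_distance_mpc(5.6e33, 8.0e-17)     # Fe K-alpha Gaussians

print(f"power law: d = {d_powerlaw:.1f} Mpc")
print(f"Fe K lines: d = {d_lines:.1f} Mpc")
```

Both pairs imply a distance of roughly 76 Mpc, so the quoted luminosities and fluxes are mutually consistent to within the rounding of the quoted values.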
We further apply the cosmological redshift (SPEX reds component) to the model, but leave it as a free parameter for the baseline model to account for any residual systematic energy scale corrections (either of instrumental or astrophysical origin; this is not important for the present study).
The last spectral component applied to all spectra is another hot component to account for the interstellar absorption from our Galaxy; we have frozen the temperature to 0.5 eV (essentially a neutral plasma), with a column density of 1.38×10²¹ cm⁻², following the argumentation in Hitomi Collaboration et al. (2017a). The abundances are frozen to the proto-solar abundances (Lodders & Palme 2009).
The model further contains a component of pure neutral Be and a correction factor for the effective area (see appendix 1.2); these serve purely as instrumental effective area corrections and are kept frozen for our modeling.
To summarize, the baseline model starts with a thermal ICM and AGN components, self-absorbed, redshifted, absorbed again by the foreground, and corrected for instrumental effects. The free parameters of our model are then the emission measure Y and temperature kT of the hot gas, the turbulent velocity σv of the hot gas, the abundances of Si, S, Ar, Ca, Cr, Mn, Fe, and Ni, the effective absorption column of the hot cluster gas N_H,hot, and the overall redshift of the system z. This baseline model achieves a C-statistic value (Cstat) of 4926 for an expected value of 4876±99.
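The goodness of fit quoted above can be expressed as a deviation from the expected C-statistic in units of its expected rms. The following sketch does this arithmetic and also tallies the free parameters listed in the summary; the variable and function names are hypothetical labels introduced here.

```python
def cstat_deviation(observed: float, expected: float, rms: float) -> float:
    """Deviation of the observed C-statistic from its expectation, in units of the rms."""
    return (observed - expected) / rms


deviation = cstat_deviation(observed=4926.0, expected=4876.0, rms=99.0)

# Free parameters of the baseline model as listed in the text.
free_parameters = [
    "Y", "kT", "sigma_v",
    "Si", "S", "Ar", "Ca", "Cr", "Mn", "Fe", "Ni",
    "N_H_hot", "z",
]

print(f"Cstat deviation = {deviation:.2f} sigma")
print(f"number of free parameters = {len(free_parameters)}")
```

The baseline fit lies about half a standard deviation above the expected value, i.e., it is statistically acceptable, with 13 free parameters.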
The best-fit parameters of our model are given in table 1. It is beyond the scope of this paper to discuss the astrophysical interpretation of the temperature, abundances, and resonance scattering; these are discussed in much greater detail by T, Z, and RS papers, respectively.
In the following sections, that form the core of our paper, we investigate in more detail the systematic effects that affect the best-fit parameters of this baseline model. We do so by showing in table 1 the difference in best-fit C-statistic and the best-fit model parameters, for different assumptions in our modeling.
In several cases we also show the relative difference in the predicted model spectra.
We consider the following systematic effects: the plasma code that is used (section 4), the atomic database in the background (section 5), different choices for details of the plasma modeling (section 6), astrophysical modeling effects (section 7), the role of other spectral components apart from the main hot plasma (section 8), and spectral fitting techniques (section 9). Those due to instrumental calibration aspects are separately examined in appendix 3.

Systematic factors affecting the derived source parameters: plasma code
In addition to SPEX version 3.03 (the baseline plasma model), we consider the following plasma models in this paper: the old SPEX version 2/Mekal model; the latest SPEX version before the launch of Hitomi (hereafter the pre-launch version: SPEX version 3.00); the pre-launch and the latest APEC/AtomDB versions, 3.0.2 and 3.0.8 respectively (Smith et al. 2001; Foster et al. 2012); CHIANTI version 8.0 (Dere et al. 1997; Del Zanna et al. 2015); and Cloudy version 13.04 (Ferland et al. 2013). The best-fit models with these codes, highlighting the Fe and Ni Heα bands, are compared in figure 1. The full-band results as well as the relevant atomic data are compared between these codes in appendices 4 and 5 (see also section 4.2).

SPEX versions 3.00 and 3.03
Version 3.00 of SPEX was released on January 29, 2016 as the pre-launch version for Hitomi data analysis. In SPEX version 2, line powers were calculated using the method of Mewe et al. (1985), i.e., using a temperature-dependent parameterization of the line fluxes with empirical density corrections. This version 3.00 contains fully updated atomic data for the most highly ionized ions, solving directly the balance equations for the ion energy level populations incorporating effects like density and radiation field, and uses these level populations to calculate the line power.
Triggered by the early work on the Hitomi SXS data of the Perseus cluster (Hitomi Collaboration et al. 2016), and the follow-up work as presented in this paper, several updates to version 3.00 were made leading to SPEX version 3.03, released in November 2016, that is used for the present analysis. Below we list the most important updates for the present work relative to version 3.00.
1. For Li-like ions, inner-shell transitions were extended from maximum principal quantum number n = 6 to n = 15 using FAC calculations.
2. A numerical issue with Be-like ions related to metastable levels was resolved, allowing the full use of the new line calculations for these ions.
3. Inner-shell energy levels, Auger rates, and radiative transitions for O-like Fe XIX to Be-like Fe XXIII were added using Palmeri et al. (2003a).
4. A bug in the calculation of trielectronic recombination for Li-like ions was removed; in the dielectronic capture from the He-like 1s2s level to Li-like 2s2p² levels, the relative population of the 1s2s level was ignored, leading to a too high population of these Li-like levels and subsequently to too strong stabilizing radiative transitions from these levels, in disagreement with the Hitomi SXS data.
5. The proper branching ratios for excitation and inner-shell ionization to excited levels that can auto-ionize are now taken into account, leading to improvements for some satellite lines.
To demonstrate the post-launch updates, we present the results of the Hitomi SXS spectral fitting with both versions 3.00 and 3.03 in table 1. The best-fit C-statistic value increases by 2372 from version 3.03 to 3.00, and the latter gives a 7% higher temperature, 8% higher turbulent velocity, and 30% lower Fe abundance than the former one. The other abundances also have 3% to 37% deviations. The effective column density of resonance scattering N_H,hot becomes zero with version 3.00.

Using SPEX version 2 (the Mekal code)
The old Mekal code, or SPEX version 2 (Mewe et al. 1995), contained significantly fewer lines and chemical elements than the present version of SPEX. In addition, the atomic data (e.g., line energies) have been improved in the present SPEX version compared to the old Mekal model. This is evident from table 1, showing that the best-fit C-statistic value increases by 1125 if we replace the new code by the old code. A detailed comparison (figures 23-25 in appendix 4) shows that there are many differences. For instance, contrary to the old model, the new model includes Cr and Mn lines (in the 5-6 keV range). Also, updates in the line energies are visible as a sharp negative residual close to a sharp positive residual.
The old code yields almost the same temperature as the new code, but there are significant changes in the derived turbulent velocity and the abundances. Small wavelength errors can be compensated for by adjusting the line broadening. Abundances are off by 2-4σ or up to 5-15% of the values obtained from the baseline model. This is only one example of a comparison between different models. In appendix 4 (figures 23-25), we show the full Hitomi SXS spectrum in 1.9-9.5 keV with our best-fit baseline model in the upper panels, and the residuals in the lower panels. In these lower panels we also show the relative difference between the baseline model and the best-fit models obtained with various other plasma codes.
The differences between these models can be divided into two classes: wavelength differences, leading to a positive residual next to a negative residual (e.g., the Ca XIX Heβ line near 4.51 keV has a different wavelength in the Mekal code compared to the baseline model), or flux differences, leading to a strictly positive or negative residual in the relative residuals (e.g., the S XV forbidden line near 2.38 keV is stronger in the Mekal model compared to the baseline model).
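The two residual signatures described above can be reproduced with a toy model of a single Gaussian line on a flat continuum. In the sketch below the line position near 4.51 keV follows the Ca XIX Heβ example in the text, while the line width, the 1 eV shift, the 20% flux change, and the continuum level are arbitrary illustrative assumptions.

```python
import math


def gaussian(energy_ev, center_ev, sigma_ev, norm=1.0):
    """Unnormalized Gaussian line profile."""
    return norm * math.exp(-0.5 * ((energy_ev - center_ev) / sigma_ev) ** 2)


def relative_residual(other, reference):
    """Bin-by-bin (other - reference) / reference."""
    return [(o - r) / r for o, r in zip(other, reference)]


energies = [4500.0 + 0.5 * i for i in range(41)]  # 4500-4520 eV grid
CONTINUUM = 0.2  # flat continuum level (arbitrary units)

baseline = [CONTINUUM + gaussian(e, 4510.0, 2.0) for e in energies]
shifted = [CONTINUUM + gaussian(e, 4511.0, 2.0) for e in energies]      # wavelength difference
scaled = [CONTINUUM + gaussian(e, 4510.0, 2.0, 1.2) for e in energies]  # flux difference

res_shift = relative_residual(shifted, baseline)
res_scale = relative_residual(scaled, baseline)

print(f"shifted line: residual range {min(res_shift):+.2f} to {max(res_shift):+.2f}")
print(f"scaled line:  residual range {min(res_scale):+.2f} to {max(res_scale):+.2f}")
```

The shifted line produces residuals of both signs on either side of the line center, whereas the scaled line produces residuals of a single sign, which is how the two classes are told apart in the comparison figures.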
In appendix 5 (tables 10 and 11), we list the line energies of the strongest lines in the spectrum. For comparison, the energies in SPEX are shown together with those in the APEC version 3.0.8 and CHIANTI version 8.0 codes. All the Lyman- and Helium-series transitions with model line emissivities ≥10⁻²⁶ photon m³ s⁻¹ are listed, and for satellite lines of He-like, Li-like, and Be-like ions, the threshold is set to 10⁻²⁵ photon m³ s⁻¹. In addition we show the Einstein coefficients and emissivities used in the three atomic codes.

APEC
APEC runs were conducted for both the pre-launch version, AtomDB version 3.0.2, and the latest version, AtomDB version 3.0.8. Since the launch of Hitomi, several updates have been made to the database to reflect the needs of the Hitomi data. These updates were not made to "fit" the Hitomi SXS data, but instead to reflect the priorities that analysis revealed. These changes were:
1. The ionization and recombination rate calculation was switched from an interpolatable grid to a fit function, which has a few percent effect on several ion populations depending on the temperatures/ion involved.
2. Wavelengths for higher n transitions of the H- and He-like ions were changed to match Ritz values from the NIST Atomic Spectra Database.
3. Wavelengths for valence shell transitions of Li-like ions were changed to match Ritz values from NIST.
4. Fluorescence yields and wavelengths of inner shell lines were updated to the data of Palmeri et al. (2003a, 2003b, 2008, 2010, 2012) and Mendoza et al. (2004).
5. Collisional excitation rates for He-like Fe were changed from an unpublished data set to that of Whiteford et al. (2001).
6. Collisional excitation rates for H-like ions from Al to Ni were changed from FAC calculations to those of Li et al. (2015).
The spectral calculation is done with the BVVAPEC model in Xspec version 12.9.1 (Arnaud 1996), while the rebinning and fitting are carried out with SPEX version 3.03.00. The abundance standard (Lodders & Palme 2009) is applied to the APEC calculations. This allows a direct comparison between APEC and SPEX. The ionization balance calculation in APEC, on the other hand, is based on Bryans et al. (2009), while Urdampilleta et al. (2017) is used in SPEX. This difference is separately discussed in section 5.4.
The run with the pre-launch APEC version 3.0.2 gives a best-fit C-statistic which is larger than the baseline value by 670.
As shown in figure 1 and appendix 4 (figures 23-25), the relative difference between SPEX and APEC is usually within 10%, except for a few lines, including Cr XXIII Heα, Mn XXIV Heα, Fe XXIV satellite lines at 6.42 keV, 6.44 keV, 8.03 keV, and 8.04 keV, Ni XXVII Heα blended with Ni XXVI and Fe XXIV satellite lines, and Fe XXV Heβ to Heη lines. Many differences might be related to the rates used in the level population calculation, e.g., collisional excitation and spontaneous emission rates (see section 5 for details). The line energy data in APEC version 3.0.8 are generally in good agreement with SPEX version 3.03 (see table 10 in appendix 5 for details).
As listed in table 1, the APEC code gives a similar best-fit temperature to the SPEX baseline model. The metal abundances obtained with APEC are lower by 5-10% for Si, S, Ar, Ca, and Ni than the best-fit baseline values, while the Cr abundances obtained with the two codes agree within error bars. The largest difference is with the Fe abundance, which is 16% lower in the latest APEC/AtomDB (version 3.0.8) than in SPEX. The best-fit turbulent velocity in σv (LOS dispersion) derived with the latest APEC code is 16 km s⁻¹ lower than the SPEX result.

CHIANTI
Another atomic code/database widely used in UV and X-ray spectroscopy for optically thin, collisionally dominated plasma is the CHIANTI code. Compared to the APEC and SPEX codes, CHIANTI is more focused on modeling the spectra from relatively cooler plasma in the solar and stellar atmospheres, while in this work, we are testing it in the conditions of hot ICM emission. The latest version 8.0 (Dere et al. 1997; Del Zanna et al. 2015) is used. The current CHIANTI database includes all the relevant H-like and He-like ions except for Cr and Mn, which means that these abundances cannot be estimated. We calculate the collisional ionization equilibrium spectrum using an IDL-version isothermal model, setting the ionization balance to Bryans et al. (2009), and change the solar abundance table to the Lodders & Palme (2009) proto-solar values. To perform the fit to the data, the IDL calculation is implemented as an input to the user model in SPEX, and the fitting engine of SPEX repeatedly triggers the IDL run until a best fit is reached. Since the CHIANTI code does not provide line broadening information, we apply a multiplicative SPEX Gaussian broadening model vgau to the CHIANTI model. This is only a first-order approximation, since the thermal broadening should vary with the atomic number. A detailed comparison of the best-fit spectra shown in appendix 4 (figures 23-25) reveals several differences in emission features from the baseline model, at levels ranging from a few % up to about 20%. Most of these differences are traced back to the different input atomic data, which can be found in appendix 5 (tables 10-11).
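To illustrate why a single Gaussian broadening is only a first-order approximation, the sketch below evaluates the thermal LOS velocity dispersion σ_th = c √(kT / (A m_u c²)) for several elements. It assumes that the ion temperature equals the electron temperature of about 4 keV quoted for the baseline model; the function name and the tabulated standard atomic weights are introduced here for illustration.

```python
import math

C_KM_S = 299_792.458      # speed of light in km/s
AMU_KEV = 931_494.10      # rest energy of one atomic mass unit in keV

# Standard atomic weights (in atomic mass units) for a few elements seen in the spectrum.
ATOMIC_MASS = {"Si": 28.085, "S": 32.06, "Ca": 40.078, "Fe": 55.845, "Ni": 58.693}


def thermal_sigma_km_s(kt_kev: float, mass_amu: float) -> float:
    """Thermal LOS velocity dispersion (km/s) of an ion of given mass at temperature kT."""
    return C_KM_S * math.sqrt(kt_kev / (mass_amu * AMU_KEV))


KT_ION_KEV = 4.0  # assumption: ion temperature equal to the ~4 keV electron temperature

thermal = {el: thermal_sigma_km_s(KT_ION_KEV, a) for el, a in ATOMIC_MASS.items()}
for element, sigma in thermal.items():
    print(f"{element}: sigma_thermal = {sigma:.0f} km/s")
```

Under these assumptions the thermal dispersion is roughly 40% larger for Si than for Fe, so a common broadening applied to all lines cannot match every element at once.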
The C-statistic value increases by 327 when fitting with the CHIANTI code. The best-fit temperature, emission measure, turbulent velocity, and the Fe abundance are roughly consistent with the baseline results, while the remaining abundances differ by 3-19%. The required column density for resonance scattering is reduced by 10% with the CHIANTI model.

Cloudy
The Cloudy code has been developed as a tool to calculate photoionized plasmas and it is principally used for this application. It does, however, have a module for calculating CIE plasma spectra, so we have fitted the Perseus spectrum with the coronal equilibrium model of Cloudy version 13.04. The abundance standard is set to Lodders & Palme (2009). Since the Cloudy code does not provide the thermal and turbulent broadening, we again apply a multiplicative SPEX Gaussian broadening model vgau to the Cloudy calculations. As shown in table 1, the fit with Cloudy yields a large C-statistic. The most significant residuals appear at the Fe XXV He-series and Fe XXVI Lyα lines. The best-fit temperature agrees with the results of the other codes, but the abundance values differ strongly from those derived from the other codes. We again note that modeling of collisional plasmas is not Cloudy's main purpose.

Systematic factors affecting the derived source parameters: atomic data
As shown in table 1, the atomic code uncertainty is the main source of uncertainty for many parameters, such as the Si, S, Ar, Ca, Mn, Fe, and Ni abundances, the hot absorption, and the turbulent velocity. The code uncertainty mainly comes from the input atomic data, for instance, the ionization balance, collisional excitation/de-excitation rates, recombination rates, and transition probabilities. In this section, we explore and describe the discrepancies between the current atomic data used in each code, and estimate the propagated errors on the fitted parameters.

H-like ions
In this section, we address the systematic uncertainties on the collisional excitation rates for H-like ions from the ground to the 2p levels. The radiative relaxation from the 2p levels back to the ground produces the Lyα lines. As shown in table 2, the effective collision strengths of Si XIV and Fe XXVI for a 4-keV plasma often differ by 10-30% among atomic codes, which contributes an important uncertainty in the abundance measurement (table 1). The collision rates used in AtomDB version 3.0.2 and CHIANTI version 8.0 are systematically larger than those in SPEX version 3.03 and AtomDB version 3.0.8, while the latter two are roughly consistent with the calculations by the Flexible Atomic Code (FAC, Gu 2008), version 1.1.1. FAC can calculate both atomic structure and scattering data, and the relativistic effects are fully taken into account by the Dirac-Coulomb Hamiltonian. By solving the configuration-interaction wave functions in the Dirac-Fock-Slater central-field potential, it evaluates the radiative transitions and autoionization rates for the input atomic levels, and computes the effective collision strengths using a distorted-wave approximation. The FAC values shown are based on calculations with a default grid that contains 6 grid points. As a check, a calculation with a grid of 11 points has also been carried out. The values of the 11-point grid are about 5% lower than the values of the default grid calculation. The consistency between FAC and AtomDB version 3.0.8 is expected, since the AtomDB values are essentially taken from a FAC calculation by Li et al. (2015).
The differences in the effective collision strengths depend on the electron temperature. In figure 2, we compare five sets of calculations for Si XIV and Fe XXVI Lyα transitions. For Si XIV, SPEX uses an R-matrix calculation by Aggarwal & Kingston (1992), which is roughly consistent with the AtomDB and CHIANTI values within 8% at $10^{6}$ K, but becomes lower by 30% than the CHIANTI data at $10^{7.7}$ K. This means that even for the simplest H-like ions, the atomic data for the collision process are not sufficiently converged to match the accuracy of the current observations. Since the Si abundance is mostly determined by the Si XIV Lyα for the Hitomi SXS data, the 30% uncertainty in the collision strength calculation indicates a roughly similar error in the abundance measurement.
For Fe XXVI, we compare two representative calculations using an R-matrix method, Ballance et al. (2002) (implemented in CHIANTI version 8.0 and AtomDB version 3.0.2) and Kisielius et al. (1996) (used in SPEX version 3.03), and the FAC calculation in AtomDB version 3.0.8. The three results roughly agree with each other at $10^{6}$ K, while the calculation of Kisielius et al. (1996) is higher than the other two up to $10^{7}$ K, and decreases rapidly beyond this temperature, relative to the others. At the high-temperature end (3×$10^{8}$ K), the difference between the Ballance et al. (2002) and Kisielius et al. (1996) values is about 30% for the 1s ($^{2}$S$_{1/2}$) - 2p ($^{2}$P$_{3/2}$) Lyα1 transition. According to Ballance et al. (2002), the differences at low and high energies are mainly caused by the treatment of radiation damping and the high-energy approximation, respectively. This would contribute a minor part of the uncertainty on the Fe abundance measured with the Hitomi SXS data; the main uncertainty comes from the Heα transitions (section 5.1.2).

He-like ions
We now turn to the He-like Fe-K multiplet as a test case to assess the flux errors on model lines introduced by the input atomic data. First we define the range of related atomic levels and data in figure 3 and table 3. The dominant populating process for the upper levels of the resonance and intercombination transitions is electron-impact excitation from the ground state, and the main loss process is radiative transition back to the ground state. The upper level of the x line (2p $^{3}$P$_{2}$) has an 18% chance to undergo a two-step decay via an intermediate level.
Meanwhile, for a 4-keV plasma, the upper level of the Fe XXV forbidden transition (z) is populated almost equally by: excitation from the ground state; cascades from the 2p ($^{3}$P$_{0}$), 2p ($^{3}$P$_{2}$), and 3p ($^{3}$P$_{2}$) levels; and radiative recombination from the continuum state. In addition, inner-shell ionization of Fe XXIV contributes 8%, and radiative transitions from the 3p ($^{3}$P$_{1}$) and 4p ($^{3}$P$_{2}$) levels each provide 4% of the population. The metastable level can decay to the ground only via radiative transitions.
We compare the atomic data extracted from the SPEX, AtomDB, and CHIANTI databases, as well as the collision data from the Open-ADAS database, and the radiative transition data from the FAC calculation. The effective collision strengths used in SPEX, AtomDB version 3.0.2 and CHIANTI version 8.0, and AtomDB version 3.0.8 are taken from the published data in Zhang & Sampson (1987, distorted wave), Whiteford (2005, R-matrix), and Whiteford et al. (2001, R-matrix), respectively. The data from Open-ADAS are calculated with the distorted-wave approximation. We do not show the FAC results on the collisional excitation, since FAC does not explicitly provide the contributions from resonance excitation channels, which are incorporated in the other calculations.
As shown in table 3, the collision data converge relatively well (<18%) on the ground to $^{1}$P and $^{3}$P level transitions, but differ by up to 42% on the ground to $^{3}$S transition. As shown in figure 4, the effective collision strengths used in CHIANTI version 8.0 / AtomDB version 3.0.2 are systematically larger than those in SPEX version 3.03, by a factor of two at 1 keV, and by about 40% at 10 keV. The values in AtomDB version 3.0.8 lie in the middle, about 10% higher than the SPEX values at 4 keV. It appears that the R-matrix calculations (AtomDB and CHIANTI) are systematically higher, by 10-40%, than the distorted-wave calculations (SPEX and Open-ADAS). Since the forbidden transition from $^{3}$S to the ground gives a line intensity second only to the resonance line for a 4-keV plasma, while the latter is subject to resonance scattering (section 8.1), the uncertainty of the $^{3}$S excitation should contribute a significant portion of the total error of the Fe abundance. The radiative transition data used in different codes agree within 15% for the He-like triplet lines. The transition rates from higher levels (e.g., n =3 and 4) to the ground can have larger uncertainties, up to 30%, which will be discussed in more detail in section 5.2.
Assuming that the deviations between different data sets give a rough measure of the atomic process uncertainties, we carry out a Monte-Carlo simulation to quantify the atomic uncertainties on the He-like triplet line ratios. We generate 1000 sets of collisional excitation rates by randomizing based on the five sets of collision data in table 3. The same is done for the transition probabilities. Then we run the SPEX calculation repeatedly, each time with one set of randomized collision and radiative data, to determine the flux error on each individual line. There are two potential caveats: first, the Monte-Carlo method assumes that all the rate errors are independent, which is not always true for the atomic calculations; second, the differences between SPEX and other codes on the atomic data of the recombination processes and the fluorescent yields, as well as on the atomic structure such as the maximum principal quantum number, are not taken into account in the simulation. Therefore the error obtained in the simulation should be regarded as a lower limit.
The results of the 1000 simulations are shown in figure 5. The simulation predicts that the resonance (w), intercombination (x and y), and forbidden (z) lines have uncertainties of ∼4%, 2% and 8%, and 6%, respectively. The y and z lines have larger atomic uncertainties than the other two, probably caused by the relatively large errors of the collision strengths and the complex formation of the $^{3}$P and $^{3}$S levels. The actual AtomDB version 3.0.8 and SPEX version 3.03 line intensities indicate similar uncertainties. The CHIANTI version 8.0 triplet line fluxes are systematically lower than the simulation results and the other two codes. This could be caused by the fact that CHIANTI has the lowest maximum principal quantum number, and hence possibly the lowest radiative decay contribution to the n =2 levels, among the three atomic codes. When the CHIANTI fluxes are multiplied by a factor of 1.05, they fall well in line with the simulation results.
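The randomization step described above can be illustrated with a minimal sketch. It assumes that each rate is drawn independently from a normal distribution whose mean and standard deviation match the spread of five tabulated values, and that a line flux scales linearly with its excitation rate; both are simplifying assumptions made here, not a description of the actual SPEX runs. The input numbers and the helper `randomized_rates` are hypothetical and are not the table 3 values.

```python
import random
import statistics

def randomized_rates(values, n_draws, rng):
    """Draw rates from a normal distribution matching the mean and
    sample standard deviation of the tabulated values (assumption)."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [rng.gauss(mean, sigma) for _ in range(n_draws)]

# Hypothetical effective collision strengths from five data sets
# (arbitrary units; NOT the values of table 3).
upsilon_w = [1.00, 1.05, 0.97, 1.10, 1.02]   # resonance-line upper level
upsilon_z = [0.20, 0.26, 0.22, 0.28, 0.21]   # forbidden-line upper level

rng = random.Random(42)      # fixed seed for reproducibility
n_draws = 1000               # same number of realizations as in the text

w_draws = randomized_rates(upsilon_w, n_draws, rng)
z_draws = randomized_rates(upsilon_z, n_draws, rng)

# Toy propagation: flux proportional to excitation rate, errors independent.
ratios = [z / w for z, w in zip(z_draws, w_draws)]
rel_scatter = statistics.stdev(ratios) / statistics.mean(ratios)
print(f"relative scatter of the z/w ratio: {rel_scatter:.1%}")
```

Because the draws are independent, the scatter of the ratio is close to the quadrature sum of the two input scatters, which is why correlated rate errors (the first caveat above) would change the result.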

Best fit with adjusted line ratios for the x and y lines
We have tested the sensitivity of our results on the He-like Fe lines further as follows. We made the intensity of the x and y lines relative to the forbidden line a free parameter. Technically, this was achieved by applying two line components to the x and y lines. This model produces the transmission T(E), for in our case an absorption or emission line, as T(E) = exp[−τ0 φ(E)], with φ(E) the Gaussian optical depth profile. We have frozen the line energies of these components to the energies of the x and y lines, respectively, and the width to the width of the emission line (using the best-fit thermal and turbulent broadening from the baseline model). Thus, the only two additional free parameters are the nominal optical depths τ0 of both lines, with positive values indicating lower flux and negative values higher flux. The best-fit parameters are τ0 = 0.035±0.028 for x and τ0 = −0.068±0.025 for y. From this we derive that for the best-fit model the flux of the x line should be lower by 3±3% and that of y should be higher by 8±3% compared to our SPEX plasma model in order to give the best agreement with the observed spectrum (table 1).
Notes to table 3: * Energy-level IDs correspond to the energy levels as denoted in figure 3. † Relative contributions to the total gain or loss term of the level derived with SPEX v3.03. ‡ Relative differences between the codes defined as (maximum−minimum)/maximum.
The atomic uncertainties on the x, y, and z lines are calculated using a Monte-Carlo simulation in section 5.1.2. Based on the simulated data, we further estimate that the errors on the ratios of the x and y lines to the forbidden line are 6.2% and 9.2%, respectively. Hence the best-fit modifications to the x and y lines are well in line with the expected atomic errors.
Besides the radiative transition data for the He-like triplet shown above, here we make a more systematic comparison of the transition probabilities among the atomic codes. The radiative data for selected strong lines are shown in appendix 5 (table 10). In figures 6 and 7, we demonstrate that the Einstein A values for H-like ions are consistent within a few percent among the codes, while for He-like ions, especially for transitions from n =3 or more to the ground, the A values have larger uncertainties, up to 30%. The SPEX A values are systematically higher than those in AtomDB and CHIANTI. Partly owing to the difference in the transition data, the Heβ, Heγ, and Heδ line intensities calculated by SPEX are higher than the AtomDB and CHIANTI lines (see details in table 10). These lines play a minor role in the abundance measurements.
The line energies, radiative transitions, and emissions of the satellite lines for a 4-keV CIE plasma are compared in table 11 in appendix 5. The atomic level-dependent Auger transition rates and radiative-to-total branching ratios are shown in table 4, and the resulting line spectra for Fe are plotted in figure 8. The most noticeable issue is that APEC version 3.0.8 gives higher Fe XXIV fluxes at ∼6.5 keV and 6.545 keV than the other two codes, which is driven by a recent update of APEC incorporating the dielectronic recombination (DR) rates and branching ratios calculated in Palmeri et al. (2003a). This could partially explain the different Fe abundances obtained with SPEX and APEC as shown in table 1.
Figure 9 shows relative ionic fractions of a 4-keV CIE plasma based on the SPEX and APEC calculations. In SPEX the ionization balance mode was set to Urdampilleta et al. (2017), which allows inner-shell ionization contributions to the spectrum to be included, while in APEC the balance from Bryans et al. (2009) was assumed. For He-like, H-like, and bare ions of Si-Cu, these two calculations agree with each other within 5%. For Li-like ions, they agree within 10%. For higher sequences, however, larger differences are seen, as the APEC values are systematically larger, by up to 57%, than the SPEX values.

Ionization equilibrium concentrations
Here we assess the uncertainties on ionization concentration by replacing the baseline Urdampilleta et al. (2017, hereafter U17) balance with historical ones, namely Arnaud & Rothenflug (1985, AR85), Arnaud & Raymond (1992, AR92), and Bryans et al. (2009, B09). It should be noted that the AR85 and AR92 balances do not include trace elements, such as Cr and Mn. As shown in table 1, the fit of the baseline model with the AR85 and AR92 ionization balances becomes much worse, by δCstat of about 100, and the best-fit temperature and abundances change by 1-3%. The B09 balance provides an equally good fit as the U17 one, yielding almost the same parameters except for the Fe abundance, which increases by 4%. The NH of the self-absorption component changes by 6-13% for different balances. By comparing the values from the most commonly used B09 and U17 balances, we find that the systematic uncertainty on abundances from ionization concentration is 1-4%. A related issue is the uncertainty on the He-like to H-like ion ratios. As shown in appendix 4 (figures 23-25), the He- and Ly-series are the dominant line features of the Perseus spectrum, and their ratios largely determine the temperature measurement. Here we examine the He-like to H-like ion ratios as a function of nuclear charge Z, which is expected in theory to be a perfectly smooth function. The calculation is based on SPEX version 3.03. As shown in figure 10, the He-like to H-like ion ratio indeed appears as a nearly linear function in logarithmic space, and the scatter is within 0.5%.

Systematic factors affecting the derived source parameters: plasma modeling
Although it is in principle straightforward to calculate a spectrum from the atomic data, in practice these calculations are based on a range of approximations, and usually include only a limited set of physical processes: the treatment of specific processes is limited or missing entirely. This section explores these technical issues in the plasma modeling and discusses their impacts on the fitted parameters.

Voigt profiles
In our baseline model we have approximated the line profiles using Doppler profiles (Gaussians). This gives a significant increase in speed in obtaining our spectral fits. However, the true profiles are Voigt profiles. We have tested the sensitivity of our results to these intrinsic line profile assumptions. The Lorentzian widths of the Voigt profiles are fixed to the natural widths in SPEX version 3.03. Figure 11 shows our results.
The changes are substantial (5-10%) near the Fe XXV resonance line observed at 6.60 keV. In all other parts of the spectrum the changes are smaller, due to the fact that the lines are weaker.
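To illustrate why the difference between a Doppler and a Voigt profile matters most for the strongest lines, the following minimal sketch builds a Voigt profile by brute-force convolution of a Gaussian with a Lorentzian and compares it with the pure Gaussian. The Gaussian width `sigma_ev` and Lorentzian half width `gamma_ev` are illustrative, hypothetical values and are not taken from the fit; this is not the SPEX implementation.

```python
import math

def gaussian(x, sigma):
    """Unit-area Gaussian (Doppler) profile."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def lorentzian(x, gamma):
    """Unit-area Lorentzian; gamma is the half width at half maximum."""
    return gamma / (math.pi * (x * x + gamma * gamma))

def voigt(x, sigma, gamma, step=0.01, n_sigma=8.0):
    """Voigt profile as the convolution of the two profiles above,
    evaluated with the trapezoidal rule over +/- n_sigma * sigma."""
    n = int(round(2.0 * n_sigma * sigma / step))
    total = 0.0
    for i in range(n + 1):
        t = -n_sigma * sigma + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * gaussian(t, sigma) * lorentzian(x - t, gamma)
    return total * step

sigma_ev = 4.0    # hypothetical Doppler width (eV)
gamma_ev = 0.15   # hypothetical natural half width (eV)

for offset in (0.0, 10.0, 20.0):
    ratio = voigt(offset, sigma_ev, gamma_ev) / gaussian(offset, sigma_ev)
    print(f"offset {offset:5.1f} eV: Voigt/Gaussian = {ratio:.3g}")
```

In this toy case the line core is slightly depressed while the far wings are strongly enhanced relative to the Gaussian, so the redistributed flux is only noticeable for lines that are bright enough for their wings to stand out above the continuum.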

Continuum contributions from heavy elements
Not only do abundant elements like H, He, O, and Fe contribute significantly to the continuum emission, but the contributions of less abundant elements like Cr or Mn are also detectable. We discovered this by accident when we tested our baseline model with the old version of SPEX (version 2). In that old version only the 15 most abundant elements with nuclear charge less than 30 were taken into account in the line emission, yet the model could produce some very crude constraints on the Cr and Mn abundances, while the line emission of both elements was not accounted for by the model. What is the explanation for this? In figure 12 we show the relative contribution of each element to the continuum emission (including here also the AGN continuum). About 90% of the emission is due to H and He, about 10% is due to Fe, and all other elements contribute less than a few percent at most. In particular, for the elements between Si and Mn the smooth two-photon emission bumps and the free-bound edges are clearly visible.
The present spectrum has 487621 counts with a nominal uncertainty of 698 counts. Cr and Mn contribute 78 and 104 counts to the continuum, respectively. Their contribution is therefore small, but if the abundances had been off by a factor of 10, their continuum contribution, with its specific structure as shown in figure 12, would have allowed us to constrain their abundances.
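The arithmetic behind this statement can be checked with a short sketch, assuming that the nominal uncertainty is the Poisson error (the square root of the total counts) and that the continuum contribution scales linearly with abundance. The variable names are ours.

```python
import math

total_counts = 487621
sigma_total = math.sqrt(total_counts)   # Poisson uncertainty on the total counts
print(f"nominal uncertainty: {sigma_total:.0f} counts")

# Continuum counts attributed to Cr and Mn in the text.
continuum_counts = {"Cr": 78, "Mn": 104}

for element, counts in continuum_counts.items():
    nominal = counts / sigma_total        # contribution at the fitted abundance
    boosted = 10 * counts / sigma_total   # contribution if the abundance were 10x off
    print(f"{element}: {nominal:.2f} sigma nominal, {boosted:.2f} sigma for a factor of 10")
```

At the fitted abundances both contributions are well below the total Poisson uncertainty, while a factor of 10 error would bring each above it, consistent with the crude constraints described above.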
In SPEX all contributions to the radiative recombination (free-bound) continuum smaller than a threshold are omitted for computational efficiency (the free-bound continuum calculation takes most of the computing time for high-resolution spectra because of the large number of energy bins and atomic shells that need to be calculated). The threshold is controlled by the parameter "gacc" that can be set by the user. Its default value is $10^{-3}$, but for figure 12 we have set it to $10^{-7}$.
This same value is used in the entry gacc listed in table 1. It can be seen that changing this parameter has only a very minor effect on the fit (improvement of C-statistic only 0.41), but the additional computational burden is heavy.

Maximum principal quantum number n in the calculations
Collisional excitation by thermal electrons mostly populates the inner shells of the atomic structure. Although the emission lines from outer shells are usually rather weak, some of them become visible in the Hitomi SXS spectrum (appendix 5; table 10). Here we test the impact on the obtained spectral parameters by limiting the maximum principal quantum number n in the calculation. As shown in table 1, when excluding the outer shells with n > 5, the fitting with the baseline model gets poorer by δCstat ≈61, and the best-fit metal abundances become slightly larger by a few percent. This is because the outer shell population will also contribute to the inner shell (e.g., Lyα, Lyβ) transitions by radiative cascading. As shown in table 10, the Hitomi SXS data require the plasma code to calculate to at least n = 10 for the Fe XXV lines.

Hyperfine mediated transitions
The isotopic composition of Fe contains approximately 2% of $^{57}$Fe, which has non-zero nuclear spin and thus might be expected to exhibit a hyperfine-mediated transition from 1s2p $^{3}$P$_{0}$ to ground, resulting in a weak third intercombination line. The transition rate has been calculated by Johnson et al. (1997), who find that it is about 6% of the transition rate to the 1s2s $^{3}$S$_{1}$ state, so that the strength of the 1s2p $^{3}$P$_{0}$ transition to ground is negligible for Fe. The low branching ratio to ground can be attributed to the relatively weak magnetic moment of $^{57}$Fe. We caution that all odd-Z elements have non-zero nuclear magnetic moments, and for most of those ions in the Fe group, the hyperfine-mediated decay channel to ground is actually dominant.

Systematic factors affecting the derived source parameters: astrophysical model
The atomic data and plasma code are eventually integrated into the spectral models. To verify the spectral modeling with the Hitomi SXS data, it is important to test it in a proper astrophysical context. In this section, we incorporate several astrophysical effects, such as non-equilibrium ionization and multi-temperature structure, examine their spectral features with the data, and calculate the related uncertainties on the fitted parameters. The physical implications of these effects will be discussed in other Hitomi Collaboration papers.

Ion temperature versus turbulence
The basic assumption made in our earlier paper on turbulence (Hitomi Collaboration et al. 2016) is that the ion temperature of the cluster gas equals the electron temperature. Given the relatively high density in the core of the Perseus cluster, this is a reasonable assumption. In order to test this, we have decoupled the ion temperature from the electron temperature in our model and refitted the spectrum. We get an insignificant improvement of our fit (δCstat = −0.02) with best-fit values of the ion temperature of kTion = 4.1 (−2.3, +3.2) keV and turbulent velocity σv = 156 (−21, +13) km s$^{-1}$. However, there is a strong anti-correlation between both parameters. Without constraints on the ion temperature, σv can be anywhere between 134 and 168 km s$^{-1}$. The best-fit values of these parameters depend on details of the spectral analysis method, although the differences are smaller than the statistical errors. Such systematic effects are separately discussed in V paper. Note that for a fixed ion temperature, the uncertainty on the turbulent velocity is much smaller, i.e., only 3 km s$^{-1}$. We show the (minor) effects of a free ion temperature on the other parameters in table 1.
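The origin of this anti-correlation can be sketched under a simplifying assumption that is ours and not part of the fit itself: that the constraint comes from the Fe lines only, whose observed Gaussian width is the quadrature sum of the turbulent dispersion and the thermal dispersion sqrt(kTion/m) of the Fe ions. Holding that total width fixed, a higher ion temperature then requires a lower turbulent velocity.

```python
import math

C_KM_S = 299792.458   # speed of light (km/s)
AMU_KEV = 931494.1    # atomic mass unit (keV/c^2)
A_FE = 55.845         # standard atomic weight of Fe

def thermal_sigma(kt_ion_kev, atomic_weight=A_FE):
    """One-dimensional thermal velocity dispersion sqrt(kT/m) in km/s."""
    return C_KM_S * math.sqrt(kt_ion_kev / (atomic_weight * AMU_KEV))

def turbulent_sigma(total_sigma, kt_ion_kev):
    """Turbulent dispersion that keeps the total line width fixed."""
    return math.sqrt(total_sigma ** 2 - thermal_sigma(kt_ion_kev) ** 2)

# Total Fe line width implied by the best-fit pair quoted in the text.
total = math.hypot(156.0, thermal_sigma(4.1))

# Ion temperatures at the lower bound, best fit, and upper bound (keV).
for kt in (1.8, 4.1, 7.3):
    print(f"kT_ion = {kt:.1f} keV: thermal = {thermal_sigma(kt):6.1f} km/s, "
          f"turbulent = {turbulent_sigma(total, kt):6.1f} km/s")
```

In this simplified picture the range of turbulent velocities obtained over the allowed ion temperatures is comparable to the interval quoted above; the actual fit uses all lines and the full spectral model.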

Deviations from collisional ionization equilibrium
The core of the Perseus cluster is a very dynamical environment, with a relatively high density and an active galactic nucleus at its center. Therefore, in principle one might expect nonequilibrium ionization effects to play a role. We have tested this as follows.
The simplest test is to decouple the temperature used for the ionization balance calculations, Tbal, from the (electron) temperature Tspec used for the evaluation of the emitted spectrum for the set of ionic abundances obtained using Tbal. This can be achieved within the SPEX package by making the parameter RT ≡ Tbal/Tspec a free parameter. We obtain a best fit for RT = 0.980±0.011, i.e., close to unity, with only a modest improvement in C-statistic of 3.26.
Alternatively, we can replace the basic CIE model by a genuine non-equilibrium ionization (NEI) model in SPEX. This model can mimic a plasma that suddenly changes its electron temperature from a value T1 to a value T2. The spectrum is then evaluated after a time t, related to the measured relaxation timescale U by U = ∫ ne dt, the electron density integrated over time from the instant that the temperature suddenly changes.
Further, we tested a recombining model by inverting the role of T1 and T2 (model labeled "Recombining"). Leaving T1 free, it appears that it reaches a very high value. Therefore we choose to fix T1 to a high value (100 keV) so that we start essentially with a fully ionized plasma. We obtain T2 = 3.933±0.020 keV, U = (2.5±0.2)×$10^{18}$ m$^{-3}$ s, and an improvement in C-statistic of 9.19.
The above may suggest that there are some significant although minor non-equilibrium effects. However, we cannot claim such effects here. First, nominally our fits are very close to equilibrium (RT ≈ 1 or U ≈ $10^{19}$ m$^{-3}$ s). The best-fit value for RT may differ from unity at the 1.9-σ confidence level, but the absolute difference is only 2.0%. It is likely that the systematic uncertainties on the ionization and recombination rates are large enough to account for such a small deviation from equilibrium. For example, when we increase all ionization rates for iron ions arbitrarily by 5%, the peak concentration of Fe XXV for the baseline model would increase from 0.747 to 0.750; a lowering of the temperature by 1% would have the same effect on the Fe XXV concentration.
Another issue is that introducing multi-temperature structure (section 7.4) gives much larger improvements to the fit. Clearly, the Perseus core region contains multiple-temperature components, and at such a level that weak non-equilibrium effects cannot be separated from it.

Effects of the spatial structure of the Perseus cluster
Up to now, we have treated the Perseus spectrum with relatively simple spectral models. In reality, Perseus shows temperature and abundance gradients. How do they affect our analysis? We investigate this through simulation. Our goal here is to estimate the systematic uncertainties on the derived parameters resulting from neglecting the spatial structure of Perseus.
We proceed as follows. We have taken the radial temperature and density profile derived from deprojected Chandra spectra as given by Zhuravleva et al. (2014, extended data figure 1). For the radial abundance profile we have adopted the average profile for a large sample of clusters based on XMM-Newton data (Mernier et al. 2016). We have not chosen their profile derived from the Perseus data alone, because that is noisier than the average profile for the full set of clusters. Mernier et al. (2016) show that in general the radial abundance profiles of individual clusters agree well with this average profile.
Figure 13 (caption, partial): The best-fit isothermal (1T) and two-temperature (2T) models to this DEM are shown with the red and orange histograms, respectively, and the best-fit GDEM model with the blue curve.
We have then integrated these 3D profiles over the line of sight through the projected FOV of the Hitomi SXS for our present observations. We accounted for the different pointing position for Obs 2+3 compared to Obs 4 by weighting with the relative exposure times. This way we have obtained the differential emission measure distribution (DEM) within the FOV of the Hitomi SXS. We have binned it in 0.1-keV wide temperature bins. The total emission measure is 1.003×$10^{73}$ m$^{-3}$. Figure 13 shows this distribution (normalized to integral unity) as well as the average abundance for each temperature bin.
We see that the DEM is strongly peaked towards 3 keV, and decreases rapidly towards higher temperatures. This peak corresponds to the coldest gas in the center of the cluster using the Zhuravleva et al. (2014) parameterization (we have assumed that the temperature remains constant for radii smaller than 10 kpc). The DEM then flattens near 5 keV and turns up again above 6.4 keV. This corresponds to the peak in the radial temperature distribution around 250 kpc. The abundance drops almost continuously from 0.82 in the center (at 3 keV) to 0.47 at 6.5 keV.
Thus, we are faced with an extremely skewed DEM distribution over a range of only a factor of two difference in temperature, combined with a monotonically declining abundance pattern that also differs by a factor of two from low to high temperatures. How does this affect our modeling?
We have taken our baseline model, and replaced the main ∼4-keV emission component with the 36 temperature components shown in figure 13. The abundances for the different temperature components are the ones shown in figure 13. For simplicity we assume that all elements have the same abundance. All other spectral components (absorption, AGN contribution, etc.) are taken to be exactly the same as in our best-fit baseline model. We then simulated this spectrum with the same exposure time as the observed Hitomi SXS spectrum, and fitted this simulated spectrum the same way as our baseline model.
In order to avoid the overhead of having to simulate many different cases, we have turned the random noise in our simulations off. In this way we get with a single simulation the best-fit parameters (and their uncertainties where needed). A perfect fit would then yield a formal C-statistic of 0.
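The statement that a perfect fit to a noise-free simulation yields a C-statistic of 0 can be verified with a minimal sketch. It assumes the commonly used Poisson likelihood-ratio form of the C-statistic, C = 2 Σ [m − d + d ln(d/m)], with d the data and m the model counts per bin; the function `cstat` and the bin values are hypothetical and only serve as an illustration.

```python
import math

def cstat(data, model):
    """C-statistic in its likelihood-ratio form (assumed here):
    2 * sum(m - d + d * ln(d / m)), with the log term dropped for d = 0."""
    total = 0.0
    for d, m in zip(data, model):
        total += m - d
        if d > 0:
            total += d * math.log(d / m)
    return 2.0 * total

# Hypothetical expected counts per bin for the "true" model.
true_model = [120.5, 98.2, 75.0, 40.3]

# With the random noise turned off, the simulated data equal the expected counts.
noise_free_data = list(true_model)

# A slightly wrong model, here simply 5% too bright in every bin.
biased_model = [1.05 * m for m in true_model]

print("C-stat, true model:  ", cstat(noise_free_data, true_model))
print("C-stat, biased model:", cstat(noise_free_data, biased_model))
```

Any non-zero value obtained when fitting the noise-free spectrum therefore measures directly how far the fitted model family is from the input model, which is how the C-statistic values of the 1T, GDEM, and 2T fits are to be read.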
We first fit this simulated spectrum with our baseline model, where the thermal emission is modeled as a single-temperature component (labeled as 1T). The best fit reaches a C-statistic value of 36.37, i.e., the isothermal approximation is poorer by 36.37 compared with the true underlying spectrum. This fit (table 5) shows some clear biases. First, the abundances of Si and S, with lines at the low-energy end of the spectrum, are too high by about 10% compared to the input model (the input model does not have a single abundance, but we list the emission-measure weighted abundance for the input model in table 5). On the other hand, the Fe and Ni abundances are too low by 4%. As a result, the Si/Fe ratio is off by as much as 15%. This bias can be understood from the different temperature dependence of the Si/S lines compared to the Fe/Ni lines. Our model forces these lines to be formed at the same temperature, and the only way to get the line fluxes more or less right is to adjust the abundances.
Interestingly, the Cr and Mn abundances are even lower, by 8-10%. This is due to the fact that the 1T model in the simulation under-predicts the true continuum near the dominant Cr and Mn lines by about 0.3%. As a result, the total simulated flux near these lines can be recovered only by reducing the abundances.
The temperature for this simulated 1T model (3.62 keV) is slightly lower than the temperature for the baseline model (4.05 keV). There may be various reasons for this. First, the spherically symmetric model that we used for the Perseus cluster may be too simplistic. For example, the Chandra intensity map of the Perseus cluster (Zhuravleva et al. 2014, figure 1) shows non-azimuthal fluctuations up to about 50% due to various structures within the Perseus core. Also, there are calibration uncertainties; for instance, for 4-keV plasmas, Schellenberger et al. (2015) show differences between temperatures derived from Chandra and XMM-Newton that can easily reach 10%. It is not implausible that similar differences exist between the Hitomi SXS temperature scale and that of Chandra. Finally, even with fully deprojected spectra, multiple temperature components may co-exist at the same distance from the cluster center due to different cooling or heating histories of different plasma elements (e.g. Kaastra et al. 2004).
We then fit the simulated spectrum with the Gaussian DEM (GDEM) model, where the DEM is log-normally distributed (the blue curve in figure 13). This model gives a much better description of the simulated spectrum (table 5), with a C-statistic of only 3.27. The corresponding DEM is quite different from the DEM of our input model (the black histogram in figure 13) but because it has the same total emission measure, average temperature and variance as the input DEM distribution, the corresponding spectra are very similar. Note that while the model parameter for the temperature of the GDEM model is 3.53 keV, its emission-measure weighted temperature is 3.59 keV, which is very close to the emission-measure weighted temperature of the input model (3.62 keV) or the 1T fit (3.62 keV). There is still a small bias in the derived abundances, but it is less than 4% for all elements.
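The three quantities that the GDEM fit reproduces (total emission measure, average temperature, and variance) are simply the lowest moments of the binned DEM. The sketch below computes them for a hypothetical DEM on the same grid of 0.1-keV bins; the exponentially declining shape is an arbitrary stand-in for the distribution of figure 13, not the actual input model.

```python
import math

# Hypothetical binned DEM: 36 temperature bins from 3.0 to 6.5 keV,
# strongly peaked at the low-temperature end (arbitrary units).
temps = [3.0 + 0.1 * i for i in range(36)]
dem = [math.exp(-(t - 3.0) / 0.6) for t in temps]

# Zeroth moment: total emission measure (used to normalize the weights).
total_em = sum(dem)
weights = [w / total_em for w in dem]

# First moment: emission-measure weighted temperature.
mean_t = sum(w * t for w, t in zip(weights, temps))

# Second central moment: variance of the temperature distribution.
var_t = sum(w * (t - mean_t) ** 2 for w, t in zip(weights, temps))

print(f"EM-weighted temperature: {mean_t:.2f} keV")
print(f"temperature dispersion:  {math.sqrt(var_t):.2f} keV")
```

Two distributions with very different shapes can share these moments, which is why a log-normal DEM can give a spectrum very similar to that of a strongly skewed input DEM.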
The last model we fit to this simulated spectrum is a two-temperature component model (2T) with the abundances of both components tied together. This provides the best fit (table 5), with a C-statistic value of only 2.64 and an abundance bias smaller than 3%.
Finally, we have investigated the properties of the strongest lines in the spectrum. Defining line fluxes can be done in two ways: either taking the "pure" line flux, or also including other weak lines that are blended with the line of interest at the spectral resolution of the instrument. We have chosen the latter approach, and included the flux from all lines within ±2 eV from the line of interest. Figure 14 shows the combined line flux of the four Heα transitions w, x, y, and z of Fe XXV and the sum of both Lyα lines of Fe XXVI. No resonance scattering has been taken into account in these calculations. It is seen that the Fe XXV emission is more concentrated towards lower temperatures (the emission-weighted temperature for this ion is 3.69 keV), while Fe XXVI has a much flatter distribution (average temperature for this ion is 4.39 keV). Also, the ratio of the sum of the x, y, and z line fluxes to the w line flux changes significantly over this temperature range: from 0.79 at 3 keV to 0.63 at 6.5 keV.

Multi-temperature fitting of the Hitomi SXS data
As shown in section 7.3, the central region of the Perseus cluster contains multiple temperature components. To evaluate the impact of the multi-temperature structure on the ICM parameters (e.g., turbulent velocity and abundances) for the real data, we carry out a multi-temperature fit to the Hitomi SXS spectrum. It is known that there is often more than one solution to fit a multi-temperature structure, since models with different combinations of temperatures and abundances might essentially yield a similar spectrum. Exploring these solutions is the focus of T paper. In this paper, we present three basic approximations for the temperature structure, and test them using the Hitomi SXS data.
First we assume that the temperature distribution follows a GDEM form. As shown in section 7.3, the GDEM model provides a proper approximation to the radial temperature profile of the Perseus cluster as derived from Chandra data. In the fit, we adopt the peak temperature, Gaussian width of the DEM, abundances, and turbulent velocity as free parameters, and the remaining components (AGN and resonance scattering) are modeled in the same way as in the baseline model (section 3). The effective-area correction factor (appendix 1.2) is also left free, as the continuum of the GDEM model is slightly different from the single-temperature baseline model. The results of the GDEM fits are shown in table 6. The C-statistic improves by 61 compared to the baseline fit. The best-fit central temperature T is 3.83±0.05 keV, and the Gaussian width is σT = 0.13±0.01, which indicates a significant deviation from isothermality. Note that σT is defined in units of log$_{10}$(T), hence the value of σT corresponds roughly to 35% of T, or 1.3 keV. The GDEM fitting gives lower Si, S, and Ar abundances, a similar Ca abundance, and slightly higher Cr, Mn, Fe, and Ni abundances than the single-temperature run. The abundance changes agree well with the prediction in table 5, indicating that the GDEM results are closer to the real values than the baseline results. The turbulent velocity remains unchanged in the new fit.
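The conversion from a width in log$_{10}$(T) to a fractional width can be checked with a few lines of Python. This is a minimal sketch, assuming that the quoted "35% of T" refers to the upward 1σ offset, $10^{\sigma_T} - 1$; the function name is hypothetical and only the two input numbers come from the text.

```python
def gdem_width_fraction(sigma_log10):
    """Fractional upward 1-sigma temperature offset for a width given in log10(T).

    Assumption: the fractional width is taken as 10**sigma - 1 (upper side only);
    the lower-side offset, 1 - 10**(-sigma), is smaller because the distribution
    is symmetric in log10(T), not in T.
    """
    return 10.0 ** sigma_log10 - 1.0


t_peak_kev = 3.83  # best-fit central temperature quoted in the text
sigma_t = 0.13     # best-fit Gaussian width in log10(T) quoted in the text

frac = gdem_width_fraction(sigma_t)
width_kev = frac * t_peak_kev

print(f"fractional width: {frac:.2f}")    # about 0.35
print(f"width: {width_kev:.1f} keV")      # about 1.3 keV
```

Under this assumption the sketch reproduces the rounded values given above (35% and 1.3 keV).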
As a second approach, we apply a model with two discrete temperatures. First we assume that the two temperature components have the same set of abundances and turbulent velocity, as well as the same foreground absorption with a column density of 1.38×10$^{21}$ cm$^{-2}$. This setting is called 2CIEA. The other spectral components are inserted in the same way as the baseline model, and the effective area fudge factor is left free in the fitting. As shown in table 6, the C-statistic improves by 59 compared to the baseline fit. The best-fit two temperatures are 3.36±0.29 keV and 5.14±0.30 keV, and the abundances and turbulent velocity agree well with those with the GDEM model. The two-temperature fit can be further improved by allowing the Fe abundances and turbulence of the two components to vary freely. This setting is then called 2CIEB. This fitting improves the C-statistic by 126 from the baseline fit. The third component has a temperature of 1.9 keV and best-fit Y = 0.33×10$^{73}$ m$^{-3}$. The values of σv and Fe abundance are tied to those of the ∼3.5-keV component.

Helium abundance
Helium is an interesting element. It does not have line transitions in the X-ray band, yet its continuum contribution relative to hydrogen varies by ∼5% over the Hitomi SXS band, and therefore our results are affected by the adopted He abundance. It has been discussed that the He abundance in cluster cores may be enhanced by a factor of two or more due to sedimentation (Fabian & Pringle 1977;Gilfanov & Syunyaev 1984;Qin & Wu 2000;Chuzhoy & Nusser 2003;Ettori & Fabian 2006;Medvedev et al. 2014;Berlok & Pessah 2016). However, the magnitude of the effect is hard to predict due to the role of the magnetic topology, plasma instabilities, gas mixing by mergers and turbulence, and the formation of a cool core.
We have tested the effects of an enhanced He abundance on our baseline model by enhancing the He abundance to 1.1 times its original value. The effects are shown in table 1. The main effect is an enhancement of the abundances of all metals by 0.02-0.03.

Systematic factors affecting the derived source parameters: spectral components
Besides the (near-)thermal emission from the ICM, the Hitomi SXS spectrum might contain additional spectral components, such as the resonance scattering and the charge exchange between hot and cold matter. Are these components properly modeled in the current atomic codes? We investigate the additional spectral components and calculate the induced uncertainties on the derived properties of the main thermal component. As indicated in section 3, we have included a simple model to account for the absorption of photons through the cluster gas itself. In table 7 we show the transitions with strong line absorptions in the Hitomi SXS band, including the band that would have been observed if the gate valve had been open. The optical depth τ0 at line center is derived by assuming the best-fit baseline parameters of the column density of the hot gas $N_{\rm H,hot}$, abundances, and velocity dispersion σv (table 1). The transitions with optical depths larger than 0.005 are listed. We also list the oscillator strength f and the total transition probability A from the upper level of the line that is used in these calculations (Voigt absorption profiles are used).
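To illustrate how a line-center optical depth follows from an ionic column density, an oscillator strength, and a velocity dispersion, the sketch below evaluates the standard expression for a pure Gaussian (Doppler) profile, $\tau_0 = N_{\rm ion}\,\pi r_e c f / (\sqrt{2\pi}\,\sigma_\nu)$ with $\sigma_\nu = \nu_0 \sigma_v / c$. It is an approximation under stated assumptions: the calculation described above uses Voigt profiles, whose peak is slightly lower, and the input values below are illustrative placeholders, not entries from table 7. The function name and arguments are hypothetical.

```python
import math

# Physical constants (SI units)
R_E = 2.8179403e-15     # classical electron radius, m
C = 2.99792458e8        # speed of light, m/s
EV_TO_HZ = 2.417989e14  # frequency of a 1 eV photon, Hz


def line_center_tau(n_ion, f_osc, energy_ev, sigma_v_kms):
    """Line-center optical depth for a Gaussian line profile.

    n_ion       : ionic column density in m^-2
    f_osc       : absorption oscillator strength
    energy_ev   : line energy in eV
    sigma_v_kms : total line-of-sight velocity dispersion (thermal and
                  turbulent combined) in km/s
    """
    nu0 = energy_ev * EV_TO_HZ
    sigma_nu = nu0 * sigma_v_kms * 1.0e3 / C
    integrated_cross_section = math.pi * R_E * C * f_osc    # m^2 Hz
    peak_profile = 1.0 / (math.sqrt(2.0 * math.pi) * sigma_nu)  # Hz^-1
    return n_ion * integrated_cross_section * peak_profile


# Illustrative inputs only (assumptions, not values taken from table 7).
tau = line_center_tau(n_ion=2.0e20, f_osc=0.7, energy_ev=6700.0, sigma_v_kms=180.0)
print(f"tau_0 = {tau:.2f}")
```

The optical depth scales linearly with the column density and oscillator strength and inversely with the velocity dispersion, which is why the fitted column density is correlated with the assumed line broadening.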

Self-absorption by hot gas
Clearly, the Fe XXV resonance line (Heα w) has the highest optical depth, but we see significant contributions from the other lines of the same Rydberg series, as well as from other ions of Fe and other elements. Also the optical depth of the Fe XXIV lines that block a part of the He-like intercombination line (Mehdipour et al. 2015) is up to 2%, a level that is detectable (for the intercombination line, the statistical uncertainty of the spectrum over one instrumental FWHM of 5 eV is about 3%). In our baseline model, we have coupled the turbulent velocity, the Doppler velocity and the temperature to the corresponding parameters of the dominant thermal emission component. We have also tested a model where we have decoupled these quantities. We obtain an insignificant improvement of our model (see table 1) with a temperature of 3.8±0.6 keV for the absorbing gas, a velocity relative to the hot gas of 10±30 km s$^{-1}$, a LOS turbulent velocity dispersion of 191±35 km s$^{-1}$ and a column density of (20.1±2.2)×10$^{24}$ m$^{-2}$. All these parameters are fully consistent with the parameters of the emission component within the uncertainties of those emission parameters, but obviously we cannot exclude that the properties of the absorbing gas are, on average, within the range indicated by the above uncertainties.
Our model substitutes a simple absorption model for resonance scattering effects. It assumes a common hydrogen-equivalent column density for all the transitions listed in table 7, ignoring the spatial structure of the ICM. The model also ignores the re-emission process after absorption, which possibly results in an underestimate of the optical depths. A more accurate characterization of resonance scattering requires radiative simulations, which will be presented separately in RS paper.

Charge exchange contributions
Charge exchange (CX) happens when a neutral atom collides with a sufficiently charged ion, which recombines with the electron(s) captured from the atom. The product ion often has a highly excited state with large principal quantum number n, and thereafter, the decay of the excited electron(s) will fill the inner-shell vacancies through line emission. Therefore, the most characteristic features of the CX emission in X-rays are the transitions from high-n shells to the ground, which are much stronger than those in the CIE case. The CX spectrum also exhibits higher Lyβ-to-Lyα and forbidden-to-resonance (z-to-w) ratios, although these features can be achieved by other atomic processes (Gu et al. 2016b).
In this section we examine the CX contributions to the ICM emission of the Perseus cluster with the Hitomi SXS spectrum. The CX component adopted here is described in Gu et al. (2016a). It uses velocity-dependent, nlS- (He-like) or n- (H-like) resolved reaction cross sections, based primarily on the multi-channel Landau-Zener calculations (Mullen et al. 2016). The low-energy weight function (equation 4 of Gu et al. 2016a) is applied to the H-like data in which the l-distribution cannot be obtained by the Landau-Zener calculations. For the Li-like and Be-like sequences, it includes velocity- and nl-resolved cross sections which are derived from an empirical scaling relation presented in Gu et al. (2016a). In this model, only the atomic component of the cold gas is considered, although in reality the molecular gas also contributes. The collision velocity is set to 200 km s$^{-1}$ (Conselice et al. 2001), and the ionization temperature and abundances of the CX ions are fixed to the best-fit values of the ICM thermal component. As shown in table 1, a new baseline run including the CX component results in a minor C-statistic improvement (δCstat = 13).
The fitting prefers that the CX lines are more broadened than the CIE lines, with a turbulent velocity vmic ≳ 600 km s$^{-1}$ (or σv ≳ 400 km s$^{-1}$). Since the actual line profile cannot be determined by the current data, we fix the turbulent velocity of the CX component at vmic = 800 km s$^{-1}$, corresponding to σv of 566 km s$^{-1}$, which is the upper limit of the neutral atomic line width of the molecular cloud near NGC 1275 as reported in, e.g., Salomé et al. (2011). The large line width might be caused by a combined effect: it can be partially contributed by the kinematics of the neutral cloud and the ICM, and partially by the atomic uncertainty of the capture state (Gu et al. 2016a), as the CX lines from n ≥ 10 levels are often blended. Changing the turbulent velocity to a larger value (e.g., 1000 km s$^{-1}$) has a negligible effect on the fitting.
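The two velocity conventions used in this paragraph can be related with a short sketch. It assumes $\sigma_v = v_{\rm mic}/\sqrt{2}$, a relation inferred here from the quoted pair (800 and 566 km s$^{-1}$) rather than stated explicitly in the text; the function name is hypothetical.

```python
import math


def sigma_v_from_vmic(vmic_kms):
    """Convert a turbulent velocity vmic to a Gaussian dispersion sigma_v.

    Assumption: sigma_v = vmic / sqrt(2), which reproduces the pair of values
    quoted in the text (vmic = 800 km/s, sigma_v = 566 km/s).
    """
    return vmic_kms / math.sqrt(2.0)


for vmic in (600.0, 800.0, 1000.0):
    print(f"vmic = {vmic:.0f} km/s -> sigma_v = {sigma_v_from_vmic(vmic):.0f} km/s")
```

With this relation the lower limit of 600 km s$^{-1}$ maps to about 424 km s$^{-1}$, consistent with the rounded limit of ∼400 km s$^{-1}$ given above.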
As shown in figures 17 and 18, the CX model predicts that the most promising high-n transitions are seen in the S XVI band, which has been reported by Hitomi Collaboration et al. (2017a), as well as the Fe XXV band. The CX lines contribute to ∼1% of the continuum for S XVI at ∼3.4 keV, and ∼3% for Fe XXV at ∼8.6 keV. To measure the statistical uncertainties, we replace the CX model with two Gaussian lines at the energies of the S XVI and Fe XXV high-n transitions. The Gaussian FWHM is set free for each line. The S XVI and Fe XXV CX lines have 1.6-σ and 2.4-σ significances, respectively. However, it is premature to claim the detection of CX with the current data, since the uncertainty from the effective area/gain calibration is large and energy dependent, as discussed by Hitomi Collaboration et al. (2017a). For the remaining ions, the high-n transitions are negligible, either due to the low abundances, or blending with strong thermal lines.
As shown in table 1, inclusion of the CX component has minor effects on the ICM temperature, emission measure, and turbulent velocity. The Fe and Si abundances are reduced by ≈ 5% and 2%, respectively, and the S, Cr, Mn, and Ni abundances are affected by 1-3%. Since the CX emission has a larger forbidden-to-resonance (z-to-w) ratio than the thermal emission, the equivalent $N_{\rm H,hot}$ for the possible resonance scattering is reduced by about 7%. The effect on the resonance scattering study will be further discussed in RS paper.

AGN contribution
To assess the uncertainty from the AGN flux, here we first consider an extreme condition: the central AGN is quite dim and its power-law emission is negligible. As shown in table 1, the non-AGN run gives a much worse fit (δCstat = 625) than the original baseline fit, and the best-fit temperature shifts by 0.5 keV. The abundances are systematically lower by 0.01-0.21 solar.
Next we examine a more realistic case for possible systematic uncertainty related to the detailed AGN modeling. The AGN spectrum in the baseline model was established in the early study for AGN paper with the PSF photometry. The technique is essentially unchanged in the final analysis but the energy band is extended up to 20 keV with the sxsextend tool. The broader-band spectrum requires a slightly flatter photon index and a ≈20% lower flux in 2-10 keV (see AGN paper for details). Another notable update is the RMF type, which has been changed from the large size (also used in our baseline model) to the extra-large size to include electron-loss continuum. As we examine the effect of using different types of RMF separately (section 3.2), we use the new AGN model derived with the same method as in AGN paper but with the large-size RMF for a straightforward comparison to the baseline model. Therefore, slightly different parameter values from AGN paper are adopted in our test: a photon index of 1.85 and a 2-10-keV flux of 2.9×10$^{-14}$ W m$^{-2}$ (private communication with H. Noda and Y. Fukazawa, 2017).
The new AGN model run gives a slightly poorer fit (δCstat = 11) than the original baseline model. The lower AGN flux requires a significant rise of the ICM continuum by 6%, which results in 3-4% lower abundances. The change in the ICM gas temperature becomes insignificant, unlike the no-AGN case.

Systematic factors affecting the derived source parameters: fitting techniques
In this section we discuss the effects of applying different fitting techniques on the derived parameters of the baseline model.
Fig. 19. Relative differences between a $\chi^2$ fit and a C-statistic fit (blue curve) for the baseline model. The red curve shows the same but with 1 eV bins instead of optimal binning.
It is well known that the use of $\chi^2$ statistics in spectral fitting can bias the estimated parameters (e.g. Nousek & Shue 1989; Mighell 1999). The proper way to resolve this is to use C-statistics (Cash 1979), and we have used that for our baseline model. We use the modification of the C-statistic as proposed by Castor (see the Xspec manual; Arnaud 1996). This modification is the standard in the Xspec and SPEX packages. Our present Hitomi SXS spectrum offers an excellent opportunity to demonstrate the bias that one gets when using $\chi^2$ statistics.
We have taken the baseline model and replaced the C-statistic with the $\chi^2$ statistic in the spectral fit. The best-fit model has $\chi^2$ = 6192 for 5790 degrees of freedom. The value of the C-statistic that corresponds to this $\chi^2$-optimized fit is 88 higher than for the baseline model. We show the relative difference between both models in figure 19. It is seen that the continuum for the $\chi^2$ fits is about 1% lower than for the baseline model, while some of the stronger emission lines have similar fluxes for both cases. This 1% bias is caused by the well-known effect that $\chi^2$ fits tend to give lower fluxes by giving relatively more weight to the data points that by chance have a flux below the expected value than to the data points that have a flux above it. Our present spectrum has typically 100 counts in most continuum bins, and according to Mighell (1999) this would give a bias of about 1 count, in remarkably good agreement with our findings here. Note that for typically 100 counts per bin, the Poissonian error bars are about 10 counts, hence much larger than the differences between the models. This shows that biased fits are easily overlooked if plotted at full resolution. Only rebinning the best-fit drastically (with a factor of at least a hundred or so) would show the bias.
The bias becomes even stronger if in addition to using $\chi^2$ statistics we drop the optimal binning and use 1-eV bins (see the red curve in figure 19). In addition to a lower flux, there is now also a significant bias in the temperature, leading to a different overall slope of the spectrum. The bias is even 6% at the highest energies.
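The size of this bias can be reproduced with a toy simulation. The sketch below is not the fitting procedure used for the SXS data: it fits a constant to simulated Poisson counts with a mean of 100 per bin, and it assumes that the $\chi^2$ variances are estimated from the data ($\sigma_i^2 = n_i$). For a constant model both minimizations have closed-form solutions, the arithmetic mean for the C-statistic and the harmonic mean for this form of $\chi^2$. All function names, the seed, and the number of bins are illustrative choices.

```python
import math
import random


def poisson_sample(rng, mean):
    """Draw one Poisson deviate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def cash_statistic(counts, model):
    """C-statistic, 2 * sum(m - n + n ln(n/m)), for a constant model m."""
    total = 0.0
    for n in counts:
        total += model - n
        if n > 0:
            total += n * math.log(n / model)
    return 2.0 * total


def fit_constant_cstat(counts):
    # The C-statistic for a constant model is minimized by the arithmetic mean.
    return sum(counts) / len(counts)


def fit_constant_chi2(counts):
    # chi^2 with data-based variances (sigma_i^2 = n_i) is minimized by the
    # harmonic mean, which gives extra weight to bins that fluctuated low.
    return len(counts) / sum(1.0 / max(n, 1) for n in counts)


rng = random.Random(12345)
true_mean = 100.0  # typical counts per continuum bin quoted in the text
counts = [poisson_sample(rng, true_mean) for _ in range(20000)]

best_cstat = fit_constant_cstat(counts)
best_chi2 = fit_constant_chi2(counts)
bias = best_chi2 - best_cstat
delta_c = cash_statistic(counts, best_chi2) - cash_statistic(counts, best_cstat)

print(f"C-statistic estimate: {best_cstat:.2f}")
print(f"chi^2 estimate:       {best_chi2:.2f}")
print(f"chi^2 bias:           {bias:.2f} counts")
print(f"C-statistic penalty of the chi^2 solution: {delta_c:.0f}")
```

In this toy case the $\chi^2$ estimate falls about one count (1%) below the C-statistic estimate, far less than the per-bin Poisson error of about 10 counts, yet the C-statistic of the $\chi^2$ solution is clearly worse once the bias is summed over many bins.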

Optimal binning versus other binning
We have also tested how our results depend on the adopted bin size. When we use C-statistics, we find no difference at all for the parameters shown in table 1 when comparing our optimal binning with a uniform binning of 1 eV. This is easily understood by noting that our optimal binning already gives a bin size of 1-2 eV for all bins (see appendix 1.3), and that it is more the order of magnitude of the bins rather than the precise bin size that matters for the sensitivity of statistical tests (Kaastra & Bleeker 2016, see figure C3).
Note that when $\chi^2$ is being used, binning is important, but as we demonstrate in section 9.1, the use of $\chi^2$ statistics should be avoided.

Local versus global fit
Astrophysical spectroscopic analysis in the radio through ultraviolet bands often relies upon precise measurements of selected, strong emission lines (e.g., H I 21 cm, Fe II 1.257 µm, and [O III] 5007 Å) whose atomic and diagnostic properties are well understood. This can also be done with X-ray spectroscopy (Hitomi Collaboration et al. 2016), but both the physics of X-ray-emitting plasmas and the availability of high-resolution spectrometers create significant challenges.
The Hitomi SXS spectrum of Perseus presents a clear combination of emission lines with a continuum, implying it can be completely understood via fits with the sum of a simple continuum plus a series of Gaussian emission lines, with astrophysical parameters derived from positions, widths, and flux ratios of the Gaussian parameters. An advantage of this approach is that it requires a relatively small amount of reliable atomic data, enabling the use of experimentally verified and theoretically understood features. For example, the ratio of the line intensity of the Lyα line to its resolved DR satellites depends critically on the electron temperature; advanced line diagnostics using multiple DR satellite lines even test whether the underlying plasma is in thermal equilibrium (Gabriel & Phillips 1979; Kaastra et al. 2009).
Although elegant, and without doubt useful to obtain an approximate description, this approach will miss details resulting from a self-consistent fit of the full spectrum. Three key problems occur with X-ray spectral analysis via purely local line fits:
1. Unlike other spectral bands, the line and continuum emission arise from the same plasma. Therefore, simplifying the continuum to a spline fit or even bremsstrahlung emission independent of the line components ensures the resulting analysis will miss features. The X-ray continuum, even in strict collisional ionization equilibrium, contains significant contributions from radiative recombination continua and two-photon emission (see e.g., figure 8 in Kaastra et al. 2008). While these components can be included in the fit, e.g., the APEC No-Line model used in Plucinsky et al. (2017), separating line from continuum emission makes finding a self-consistent model all but impossible.
2. X-ray spectrometers, even the SXS, have only limited resolution, while the X-ray bandpass has a plethora of strong lines, making line blending an ongoing problem. Table 10 in appendix 5 shows several instances of lines from different elements separated by less than the instrumental resolution. Worse, the narrow bandpasses of oft-used diagnostic lines such as the Heα complex include a multitude of DR satellite lines together with the strong "triplet" (actually a quartet) lines. Many of the lines have multiple excitation channels, all of which must be known in order to fit the complex reliably. This is especially true of the forbidden line (z), as discussed earlier. For the SXS spectrum, Gaussian lines were used to determine the turbulent motion (Hitomi Collaboration et al. 2016), but a local form of the global fit is required to extract the maximum amount of information even from a relatively small bandwidth. At a minimum, when applying line ratio diagnostics it must be clear both in the model and the data whether these contaminants have been taken into account.
3. Few, if any, sources in the Universe will be in perfect equilibrium, either collisional or photo-ionized. The present spectrum of the Perseus cluster is a good example of such complexities. While dominated by a 4-keV temperature component, the possibility of multi-temperature structure cannot be eliminated based on the data (section 7.4), and is certainly expected theoretically. Depending upon their excitation mechanism, each emission line will be affected differently by these effects, rendering the use of just one or two diagnostic ratios precise but quite inaccurate. Using many lines, including upper limits to non-detections, will avoid this problem, but at some point the distinction between a many-line and a global fit will become blurred.
Despite the above issues, line ratios may be preferred over global fits when the source spectrum is either too complex to be fully understood, or when calibration uncertainties dominate the broad-band spectra. Of course, the accuracy of the physical parameters derived from global fits also relies upon complete and accurate atomic databases. In the case of completeness, global models contain potentially millions of atomic transitions, most of which have not been experimentally verified. While "spot" checks do exist, in most cases the accuracy of the data is not well known, i.e., estimates of uncertainties are determined by comparing results from different theoretical calculations, or by using uncertainties from portions where experimental results do exist.
In our case, we have shown that the calibration and completeness of the spectral models are not perfect but good enough to yield a very good description of the Perseus spectrum. Ultimately, local and global fits must be used in a complementary way. The broad bandwidth coupled with the high spectral resolution of the SXS makes it possible to take advantage of the strengths of both methods, improving the reliability of the derived physical parameters of the source.

An improved model
To fit the Perseus spectrum, we introduced a baseline model (section 3) which mainly consists of a ∼4-keV CIE plasma and an AGN component. Obviously the baseline model is merely a simple approximation (section 7.4), even though it already achieves a satisfactory fit based on the current Hitomi SXS data. Throughout the paper, we have tested a variety of plasma codes, atomic data calculations, plasma and astrophysical modelings, additional spectral components, and instrumental effects, and compared them with the original baseline fit. By properly incorporating some of the atomic and astrophysical effects into the baseline model, we are able to achieve a more advanced physical model of the Perseus spectrum.
We construct an improved model as follows. Following the baseline model, SPEX version 3.03 is used, the abundance standard is set to the Lodders & Palme (2009) proto-solar values, and the ionization balance is set to Urdampilleta et al. (2017). The thermal emission is now modeled by the sum of three CIE components, with temperatures of about 2 keV, 3.5 keV and 5 keV (section 7.4). The three-temperature model is chosen since it gives the best fit of all multi-temperature models (table 6). The three components have the same Si, S, Ar, Ca, Cr, Mn, and Ni abundances, while the Fe abundance and turbulent velocity are left free for the 3.5-keV and 5-keV components. For the 2-keV component, the Fe abundance and turbulent velocity are tied to those of the 3.5-keV component (section 7.4). The AGN contribution, resonance scattering, and the Galactic absorption components are added in the same way as for the baseline model. The possible CX component (section 8.2) is included in the improved model. Following section 6.1, the Voigt function is used to describe the line profiles. We re-fit the effective area correction factor in the same way as described in appendix 1.2, and show the best-fit model in figure 22 (a).
The improved model achieves the best C-statistic so far, 4779 for an expected value of 4876±99, which is significantly better than the baseline fit (C-statistic = 4926). The best-fit model is plotted in appendix 4 (figures 23-25), and the stacked residual diagram is shown in figure 20. The residual diagram is calculated by adding a line component, with the central energy moving from 1.9 keV to 9.5 keV with a step of 3 eV, on the best-fit baseline and improved models. Compared to the baseline run, the residuals at >2σ are greatly suppressed by the improved fit, and the diagram follows well the expected Gaussian distribution. As shown in table 1, the new model essentially reproduces the best-fit results of the three-temperature model (section 7.4). The best-fit temperatures are 1.92±0.21 keV, 3.61±0.33 keV, and 5.43±0.38 keV for the three components. Note that the values are sensitive to the details of the spectral modeling as well as the calibration of the instrumental response (see T paper for further details). The Si, S, Ar, Cr, Mn, and Ni abundances all become roughly 0.8 solar, which is much more uniform than the baseline results. The Ca abundance remains about 0.9 solar. The best-fit Fe abundances are 0.91±0.05 solar and 0.64±0.05 solar for the 3.5-keV and 5-keV components, respectively. The turbulent velocities become σv = 117±11 km s$^{-1}$ for the 3.5-keV component and σv = 223±27 km s$^{-1}$ for the 5-keV component. This may suggest that the cooler ICM tends to have a lower level of turbulence than the hotter one, but the results depend on the assumed temperature structure and are sensitive to the continuum modeling including the effective area calibration. Further details are discussed in V paper. Moreover, the improved model gives a self-absorption column density of (1.05±0.15)×10$^{25}$ m$^{-2}$. The column density of Fe XXV is thus (2.18±0.23)×10$^{20}$ m$^{-2}$, in good agreement with the value that we derive from the simulated spectra in section 7.3 (4.02×10$^{20}$ m$^{-2}$ for the semi-column of a line through the core; 2.64×10$^{20}$ m$^{-2}$ for the semi-column averaged over the Hitomi SXS FOV). Details on the derived resonance scattering are discussed in RS paper.
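The C-statistic values quoted for the improved and baseline fits can be put on a common scale with a short calculation. This is a minimal sketch, assuming that the quoted ±99 is the 1σ rms scatter expected for the C-statistic of a correct model; the function name is hypothetical and only the three numbers come from the text.

```python
def cstat_deviation(observed, expected, rms):
    """Offset of an observed C-statistic from its expected value, in units of the rms."""
    return (observed - expected) / rms


improved_c = 4779.0                      # improved model, from the text
baseline_c = 4926.0                      # baseline model, from the text
expected_c, expected_rms = 4876.0, 99.0  # expected value and scatter, from the text

z_improved = cstat_deviation(improved_c, expected_c, expected_rms)
delta_c = baseline_c - improved_c

print(f"improved model: {z_improved:+.2f} sigma from the expected value")
print(f"improvement over baseline: delta C = {delta_c:.0f}")
```

Under this assumption the improved fit lies about one standard deviation below the expected value, so it is statistically acceptable, while the difference of 147 with respect to the baseline is what makes the improvement significant.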

Important factors
We have shown in this paper the dependencies of several astrophysically interesting parameters, mainly focusing on the plasma modeling, i.e., plasma codes and atomic databases. We have also investigated the dependencies on astrophysical modeling as well as spectral fitting techniques. For a proper astrophysical modeling of the present Perseus cluster spectra, as presented here but discussed in greater detail in a set of other papers (Hitomi Collaboration et al. 2016, 2017a, Z, T, RS, V, and AGN papers), it is crucial to understand the possible systematic biases on the derived parameters. Table 1 provides a comprehensive list of the estimated biases, which enables us to intercompare various aspects of systematic uncertainty. The effects of some of the plasma modeling factors are comparable to or even larger than the statistical or instrumental uncertainty (appendix 3). Because this is the only high-quality high-resolution X-ray spectrum of a spatially extended thermal X-ray source up to now, it is also important for the preparation of future X-ray missions, in the sense that priorities in calibration, astrophysical modeling, or data analysis can be set.

Emission measure
The emission measure Y of the cluster ICM is a good representative of the absolute flux of the hot cluster gas. First, it is clear that we need to use the latest aharfgen to obtain an accurate emission measure. Although the emission measures are uniformly underestimated by 20% for the other cases with the older software, hereafter we ignore the difference and compare the relative values to find out which parameter affects Y .
The main contributor to the systematic uncertainty on Y is the adopted flux of the central AGN. Ignoring the AGN completely would give a 20% higher emission measure as well as 12% higher temperature for the hot gas. In fact, the AGN contribution affects almost all parameters of the hot cluster gas. Because we are not completely ignorant about the AGN flux, the true uncertainties are smaller than described above. One example of more realistic estimation is the difference due to updating the AGN model with the broader-band spectroscopy, which gives a 6% higher emission measure.
Other important factors for Y are the effective area correction (up to 4%), the ionization balance (3%), and the assumption of isothermality (σT free, 3%). The differences between the plasma codes, for which we consider here SPEX version 3.03 and AtomDB version 3.0.8 (SPEX and AtomDB briefly hereafter) to be the most sophisticated, are not very important for the emission measure: differences are less than 1%.

Temperature
Ignoring from now on the AGN contribution uncertainty, the most important factor affecting the temperature of the dominant 4-keV component is the fitting techniques. Using unbinned spectra and χ 2 statistics biases the temperature by 5%. Forcing isothermality (i.e., putting σT to be zero) gives 3.5% bias. Using the official effective area correction based on the on-ground calibration instead of the additional effective area correction using a thinner Be-filter and knak gives 3% bias. Both plasma codes (SPEX and AtomDB) agree relatively well in their derived temperature (better than 2%).

Turbulent velocity
While the temperature agreement between the plasma codes is good, they result in a 10% difference in the derived amount of turbulence in the plasma. This uncertainty is almost as large as the uncertainty introduced by ignoring completely the resonance scattering or ignoring the position-dependent bulk velocity field (both 9%).

Cluster velocity
This bulk velocity field also affects the derived velocity centroid of the cluster (23 km s$^{-1}$). The gain correction is important as well (14 km s$^{-1}$). Finally, the use of SPEX or AtomDB results in a difference of 6 km s$^{-1}$.

Resonance scattering
The plasma codes result in an even bigger difference of 40% in the derived column density of the resonantly scattering plasma, six times larger than the statistical uncertainty on this quantity. This relatively large difference is likely associated with the systematic uncertainties in the line emissivities, because in the comparison we use the same resonance scattering model (the SPEX hot model). However, precise modeling of the temperature structure (see our improved model) is also important: this can produce a difference of 35%.

Abundances
Finally, we discuss here the uncertainties on the abundances. Most striking is the difference in the Fe abundance associated with the plasma code: AtomDB gives a 16% lower abundance than SPEX. This is 17 times higher than the small statistical uncertainty on the Fe abundance. The differences can be attributed mostly to differences in the adopted collisional excitation and DR rates of the strongest spectral lines (sections 5.1 and 5.3). Other factors affecting the Fe abundance are the inclusion of resonance scattering (11%) and CX (5%). On the other hand, the Ni abundance is almost bias-free between the latest SPEX and APEC/AtomDB, at least within its 7% statistical uncertainty; however, the bias between SPEX versions 2 and 3 is still significant (see also Z paper). This is not the case for other elements. The Si and S abundances can be biased by 6-14% depending on each of the following four factors: the plasma code, the isothermality assumption, the gain correction and the fitting method ($\chi^2$ fitting on unbinned data). For Ar and Ca the main systematic uncertainties are associated with the plasma model (6-8%). Finally, for Cr and Mn both the isothermality assumption and the fitting method are the main sources of systematic uncertainty.

Implications to other observations
So far we have reviewed the state-of-the-art knowledge, mostly on the K-shell transitions, for modeling the hot (several keV) tenuous plasma in collisional ionization equilibrium. We caution that the atomic uncertainties derived from the Perseus data cannot simply be copied to observations of other sources, as the accuracy of atomic data depends strongly on the types of transitions (tables 2 and 3), as well as on the plasma conditions, such as electron temperature (figures 2 and 4) and ion charge states (figures 6 and 7). For instance, X-ray emission from a stellar corona (or an elliptical galaxy) is dominated by transitions in the Fe-L complex, which are known to be computationally more intricate than those in the Fe-K (e.g., Bernitt et al. 2012), and hence less accurate (e.g., de Plaa et al. 2012).
A more important issue is how the atomic uncertainties affect individual science cases. The Doppler measurement of line-of-sight velocities would be subject to the reference-wavelength accuracy of the dominant transitions, except in cases with large bulk velocities, for example in young supernova remnants (SNRs). The precise characterization of turbulence velocity structures, i.e., the search for non-Gaussianity, would primarily be limited by the accuracy of the atomic and astrophysical modeling of the RS effect. This could be avoided by making use of local fits of optically thin emission lines, in which case the relative line energies and emissivities of satellite lines, as well as the calibration of the line spread function, are the dominant sources of uncertainty. The detection of a small departure from CIE (e.g., for merger clusters; Inoue et al. 2016) would mainly be limited by the uncertainties in the charge-state distribution calculation and thus in the ionization and recombination rates adopted therein (section 7.2). Revealing the detailed time evolution of NEI plasma by measuring the charge-state distribution (e.g., for recombining plasma in SNRs; Sawada & Koyama 2012) would require an even higher level of accuracy for these transition rates, including multiple ionization due to inner-shell processes followed by Auger ejections. The elemental abundance measurement is affected mostly by the errors in the line excitations and branching ratios for individual transitions, including those for the satellite lines, although only the error in the total emissivity of a line complex (e.g., Heα) would matter for a system with a large intrinsic line broadening (∼100 eV), such as young SNRs where the ion temperature is very high (∼MeV).
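The connection between an ion temperature of order MeV and a line width of order 100 eV can be checked with the standard thermal Doppler-broadening relation $\sigma_E = E\sqrt{kT_{\rm ion}/(m_{\rm ion}c^2)}$. The following is a minimal sketch; the line energy (6.7 keV), the Fe atomic mass, and the 1 MeV ion temperature are illustrative inputs chosen here, not values fitted in this work.

```python
import math

AMU_MEV = 931.494  # atomic mass unit in MeV/c^2
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))  # Gaussian FWHM / sigma


def thermal_sigma_ev(line_energy_ev, ion_kt_mev, mass_amu):
    """Gaussian sigma (eV) of thermal Doppler broadening for a given ion temperature."""
    return line_energy_ev * math.sqrt(ion_kt_mev / (mass_amu * AMU_MEV))


# Illustrative case: Fe (A = 55.845) He-alpha near 6.7 keV with kT_ion = 1 MeV.
sigma = thermal_sigma_ev(6700.0, 1.0, 55.845)
print(f"sigma = {sigma:.1f} eV, FWHM = {FWHM_PER_SIGMA * sigma:.1f} eV")
```

For these inputs the broadening is a few tens of eV in σ (roughly 70 eV FWHM), the same order as the ∼100 eV quoted above and far wider than the separation of individual lines within the Heα complex.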
The atomic uncertainty for each science case can be evaluated by the Monte-Carlo approach introduced in section 5.1.2. Ultimately, the atomic error calculation should be implemented as a standard analysis procedure in the spectral modeling packages. This would require substantial work in the code development by assessing the accuracy of detailed atomic data.
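As an illustration of what such a Monte-Carlo evaluation involves, the sketch below perturbs the emissivities of the lines in a complex and records the scatter in the abundance inferred from a fixed observed flux. It is not the procedure of section 5.1.2: the log-normal perturbation, the assumption that line errors are independent, the 15% fractional uncertainty, and all names are assumptions made here for illustration.

```python
import numpy as np


def mc_abundance_scatter(emissivities, frac_sigma, n_trials=10000, seed=0):
    """Scatter of the inferred abundance when line emissivities are uncertain.

    The observed flux is held fixed, so the inferred abundance scales as
    (reference total emissivity) / (perturbed total emissivity).
    """
    rng = np.random.default_rng(seed)
    emis = np.asarray(emissivities, dtype=float)
    # Independent multiplicative errors per line; log-normal keeps them positive.
    factors = rng.lognormal(mean=0.0, sigma=frac_sigma,
                            size=(n_trials, emis.size))
    perturbed_total = (emis * factors).sum(axis=1)
    ratio = emis.sum() / perturbed_total
    return ratio.mean(), ratio.std()


# One dominant line versus a complex of four equally strong lines,
# each with an assumed 15% emissivity uncertainty.
mean_1, std_1 = mc_abundance_scatter([1.0], 0.15)
mean_4, std_4 = mc_abundance_scatter([1.0, 1.0, 1.0, 1.0], 0.15)
print(f"single line:   abundance scatter = {100 * std_1:.1f}%")
print(f"four lines:    abundance scatter = {100 * std_4:.1f}%")
```

Under these assumptions the scatter for a complex of several comparable lines is smaller than for a single line, because independent errors partially average out; correlated atomic errors would not average out in this way.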

Atomic data needs
As shown throughout this paper, the reliability of a spectral modeling package lies not only in the accuracy and completeness of its atomic data, but also in its ability to properly synthesize the atomic data as a function of physical parameters, i.e., plasma conditions such as temperature and density. Synthesizing the data is tedious and computationally taxing because the databases employed are large, containing millions of data points, including transition energies, excitation and ionization cross sections, resonant (multi-electronic) and non-resonant (radiative) recombination cross sections, and nonthermal processes, such as CX recombination. Different models use atomic databases of varying levels of completeness and accuracy as well as different synthesis methods in their calculations. Estimates of a model's accuracy are often given by comparison to other models. However, a true measure of a model's accuracy can only be determined by comparing to laboratory benchmark measurements.
Benchmark measurements, generally, come in two forms: as isolated experiments where a single ionic species or atomic process is studied, or as integrated experiments, where emission or absorption is measured from several simultaneous ions and atomic processes as a function of temperature or density. Isolated experiments include those conducted at electron beam ion traps, advanced light sources, or storage rings. Integrated experiments include experiments using, for example, tokamaks or laser-produced plasmas. Isolated experiments generally test portions of atomic databases, and integrated experiments test synthesis models. Examples of isolated experiments include measurements of absolute electron-impact excitation cross sections as a function of electron energy, transition energies, natural line widths, and oscillator strengths (Beiersdorfer et al. 1992;Brown et al. 2006;Rudolph et al. 2013). Examples of integrated experiments include the spectral signature of the Heα complex as a function of electron temperature (Bitter et al. 2008;Gu et al. 2012;Rosen et al. 2014;Rice et al. 2015), or full Fe-K and Fe-L shell spectral signatures as a function of temperature and density.
Providing laboratory benchmarks for the atomic database in all physical regimes for all astrophysically relevant ions is not tractable. Hence, models are tested by comparing to measurements where available. Typically, models agree with measurements at the 10-20% level in the cases of excitation and ionization processes. Transition energies, however, are of much higher accuracy. In the case of H- and He-like ions, measurements of the transition energies have tested theory at the level of a few to a few tens of parts per million (Beiersdorfer 2009; Beiersdorfer & Brown 2015). In the case of ions with more bound electrons, i.e., L-shell ions, the accuracy of the models is not as well known, as experimental benchmarks are sparser and agreement with theory varies.
The inability of the standard X-ray astrophysics models to accurately fit a significant fraction of the lines in the SXS Perseus spectrum (Hitomi Collaboration et al. 2016) not only uncovered some of the limits of SPEX and APEC, but also showed the limits of the high-accuracy laboratory measurements. For example, laboratory measurements of relative line intensities in the Fe Heα complex, in particular the strength of the forbidden line (z), still limit our ability to take full advantage of the line complex's diagnostic power. The high-quality SXS Perseus spectrum provides the impetus for more complete and higher-accuracy calculations and systematic measurements of all the processes involved in exciting not only the forbidden line (z) but all of the lines found in the Heα complex, and not only for He-like Fe XXV but also for other astrophysically relevant He-like ions. Measurements such as these will be paramount to interpreting the high-resolution spectra to be returned by future high-resolution X-ray spectroscopy missions (see section 11.3). Because of the large bandwidths, high energy resolution, and large collecting areas of these missions, high-accuracy measurements of a plethora of atomic parameters will be required.
While providing a complete list of required measurements is beyond the scope of this paper, a few necessary measurements, in addition to the studies of Heα, should be mentioned. For example, a more complete study of the excitation cross sections and spectral signatures of CX recombination should be completed. Many CX studies have been completed; however, at present theory has not matured to the point of consistently predicting experimental results, and hence the diagnostic capability of CX emission is limited. Absolute cross-section measurements for electron-impact excitation followed by cascades, especially in the case of high-n transitions, with accuracies on the order of 5-10%, should also be a high priority, as they determine the line strengths and, in turn, relative ion abundances (ionic fractions) and elemental abundances from a variety of celestial sources. High-accuracy measurements of DR-resonance strengths and of ionization cross sections should also be pursued. Similar laboratory measurements of photo-excitation and ionization processes should also be conducted, as these are the basis for determining column densities and scattering effects (RS paper).
One of the most sought-after and challenging integrated laboratory experiments is an accurate measurement of the ion charge balance as a function of electron temperature and density. This is a universal goal throughout plasma physics, spanning nearly all temperature and density regimes. Integrated experiments such as these are challenging because it is hard to determine the systematics of the source plasma to high accuracy, i.e., it is often hard to quantify or experimentally discount gradient and non-uniformity effects. Regardless of these challenges, integrated experiments where the plasma parameters have been independently well diagnosed have been successfully conducted (Rosen et al. 2014).

Prospects for XARM, Athena and other missions
The Hitomi SXS observation of Perseus, with its high-resolution spectrum in the 1.9-9.5 keV band, showed both the strengths and weaknesses of existing plasma codes. Pre-launch versions of both the SPEX and AtomDB codes provided generally plausible fits to the observation, matching the continuum and many of the strong lines well. While neither fit was formally statistically acceptable (see table 1), the two codes agreed (to within ±0.3 keV) on the best-fit temperature, and to within ±0.2 on elemental abundances. At CCD resolution, these discrepancies could easily be understood as calibration issues or inadequacies of the collisional isothermal model; only at the resolution of the SXS were the clear problems with both codes apparent. As described above, many of these disagreements could be addressed by updating wavelengths and cross sections for a few weaker lines and by fixing minor code bugs. As a result, both SPEX and AtomDB are in close agreement about the emission from a 4-keV collisional plasma in the 1.9-9.5 keV band.
That an SXS observation was required to discover and address these problems may seem odd, as gratings on both Chandra and XMM-Newton have provided high-resolution X-ray spectra of point sources since 1999. Unfortunately, most X-ray point sources have intrinsically complex and variable spectra; stellar coronae include plasmas with a broad range of temperatures, while any model of the absorbed photo-ionized spectra of X-ray binaries and AGN must include a range of different geometries and source spectra. The only truly simple point-source spectra are isolated neutron stars or white dwarfs, which have no features in the X-ray band and are therefore used as calibration sources. As a result, few grating observations could be used to test details of the plasma models beyond the strong lines, since any differences in weaker features could reasonably be due to issues in the source model and not the code.
Substantial work remains, therefore, to ensure that current plasma codes will be ready to face the challenges of data from the X-ray Astronomy Recovery Mission (XARM), ESA's Athena mission, and proposed missions such as the Arcus grating spectrometer or the Lynx observatory. These missions will have resolutions similar to or better than the Hitomi SXS, and will observe a large range of sources, including collisional plasmas with temperatures between $10^{4}$ and $10^{9}$ K and photo-ionized plasmas with a similarly broad range of source flux, either in ionization equilibrium or non-equilibrium. These missions will cover a bandpass of ≈0.1-10 keV, a range that includes strong lines from Fe L-shell ions (Fe XVII-Fe XXIV) as well as M-shell lines from many abundant elements.
The Hitomi SXS data have shown that accurate atomic models are just as important as calibration. Preparing for these missions will require a multifaceted approach of plasma-code testing, theoretical calculations, and laboratory measurements. The process will begin with systematic testing of existing atomic models against (1) each other to determine where discrepancies exist, (2) laboratory measurements from electron beam ion traps and synchrotrons, and (3) deep targeted observations with existing observatories. When areas of unresolvable disagreement are identified, new theoretical calculations may be required or targeted laboratory measurements made. The plasma-code community has already begun this work, starting with a set of agreed-upon standard tests developed at a meeting at the Lorentz Center$^{6}$. However, a consistent and continuous effort will be required to ensure that the community is ready for this next generation of high-resolution X-ray spectra.

The science goals of Hitomi were discussed and developed over more than 10 years by the ASTRO-H Science Working Group, all members of which are authors of this manuscript. All the instruments were prepared by joint efforts of the team. The manuscript was subject to an internal collaboration-wide review process. All authors reviewed and approved the final version of the manuscript.

A.1.1 Energy-scale correction
To obtain the energy-dependent residual energy-scale errors, we fit the strongest emission lines in the 1.9-9.5 keV band. For each line, we define an adjacent band with a width of 0.1-0.2 keV, and perform a local fit of the Hitomi SXS spectrum. Table 8 lists the principal lines for the individual bands. A collisional ionization equilibrium (CIE) model affected by redshift is used to fit the astronomical lines, whereas a redshifted double-Gaussian model is used for the instrumental Si Kα lines. For the CIE model, the temperature is fixed to 4 keV, while the abundance, turbulent velocity, and redshift are left free. The redshift obtained from the fit is then compared with the known Perseus redshift (z = 0.01756 or cz = 5264 km s$^{-1}$: Ferruit et al. 1997) to obtain the best-fit energy shifts. The rest-frame reference energies implemented in the CIE model in SPEX version 3.03 are calculated values except for Ar XVII Heα, each retrieved from the references shown in table 8. Some are not the most commonly used calculations or measurements for calibration, but the differences are usually much smaller than the statistical uncertainties in the present analysis and thus do not affect the correction results. Detailed comparisons of the reference energies are given in appendix 2 (table 9). For the instrumental Si lines, the relative normalization of the double Gaussians is fixed at the known value (Scofield 1974), and the obtained redshift is directly converted to the energy shift. As shown in figure 21 and table 8, these shifts appear to be −(1-3) eV below 4 keV and above 7 keV, and +(0-2) eV in 4-7 keV. These differences cannot be justified by an astrophysical model: the ∼2 eV differences between the Si Lyα and Fe Heα lines correspond to 300 km s$^{-1}$, although the two lines are partially formed at similar temperatures. More importantly, at the high-energy side, there is a several eV difference in the Rydberg series of Fe XXV, which is even harder to explain with a realistic astrophysical model. Furthermore, the energy-scale shifts at the instrumental Si Kα lines are in good agreement with the parabolic trend of the astrophysical lines, providing further support for a non-astrophysical explanation. The behavior of these deviations is consistent with calibration issues (Eckart et al. in prep.). As shown in figure 21, we perform an empirical fit using a parabolic function to the observed deviations (δE). The correction to the original energy E (keV) is given as [equation missing in the extracted text]. We caution that this empirical correction is not to be used outside of the range of the fit or trusted at the extremes of that range. Because there is no mechanism for an offset in the energy scale, the error must eventually tend to 0 at the lowest energies.

[Table 8 notes, partially recovered. References: ... Kaufman & Martin (1993); (4) Kelly (1987); (5) Sugar & Corliss (1985); (6) ... (2017). † A vacancy is denoted as a negative index of the electron configurations. ‡ Poor fit. Ignored in derivation of the correction curve. § The energy shift at Fe XXV Heα is assumed to be zero as it is already adjusted by the removal of the spatial velocity gradient. The 1s-3p analog of the 1s-2p dielectronic satellite line j: 2p ($^{2}$P$_{3/2}$) - 1s.2p$^{2}$ ($^{2}$D$_{5/2}$), labeled by Phillips (2008).]
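The empirical parabolic fit to the energy-scale deviations can be sketched as follows. The line energies and deviations below are placeholders generated from an arbitrary parabola, not the table 8 measurements, and the coefficients are not those of the correction adopted in this work; the function name and the range check are likewise illustrative.

```python
import numpy as np

# Arbitrary parabola used only to generate placeholder data:
# deviation (eV) = a*E^2 + b*E + c with E in keV. NOT the published correction.
TRUE_COEFFS = np.array([-0.4, 4.4, -11.1])

# Hypothetical line energies (keV) spanning the fitted band.
line_energies_kev = np.array([2.0, 2.6, 3.1, 3.9, 5.7, 6.7, 7.9, 8.3])
deviations_ev = np.polyval(TRUE_COEFFS, line_energies_kev)

# Least-squares fit of a second-order polynomial to the deviations.
coeffs = np.polyfit(line_energies_kev, deviations_ev, deg=2)


def energy_scale_deviation_ev(energy_kev, coeffs, fit_range=(2.0, 8.3)):
    """Evaluate the fitted parabola, refusing to extrapolate beyond the fit range."""
    e = np.asarray(energy_kev, dtype=float)
    if np.any(e < fit_range[0]) or np.any(e > fit_range[1]):
        raise ValueError("energy outside the fitted range; do not extrapolate")
    return np.polyval(coeffs, e)


print(energy_scale_deviation_ev([3.0, 5.5, 8.0], coeffs))
```

The explicit range check mirrors the caution above: a parabola fitted over a finite band has no physical justification outside it, and in particular does not tend to zero at the lowest energies.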

A.1.2 Effective-area correction factor
To identify and remove any possible residual calibration errors on the effective area, which affect mostly the continuum spectrum, we incorporate two correction functions in the broad-band spectral analysis. One represents the uncertainty in the thickness of the Be window of the gate valve, the other the uncertainty in the effective area of the X-ray mirrors of the SXS. To estimate the size of these factors, the Hitomi SXS spectrum is rebinned into 100-eV bins to enhance the continuum features. We then fit it with the baseline model described later in section 3, incorporating a knak component, which determines the correction function using piece-wise power-laws in energy-correction factor space, together with a neutral-Be absorption model.
By making several iterations between a fit with 100-eV wide bins and a fit with the optimal binning (see appendix 1.3), the best-fit corrections and Be model are determined as shown with the solid curve in figure 22 (a). The fit prefers a negative absorption column of the Be model, which indicates that the actual thickness of the Be window might be slightly lower than the value (262 µm) used in the current calibration. The correction, however, approaches unity with a more realistic spectral modeling of the ICM (an improved model; see section 10), as shown with the dashed curve in figure 22 (a). Therefore the thinner Be window preferred with the baseline model is most likely due to incomplete modeling of the ICM emission.
The best-fit effective-area correction function is consistent with unity at ≤7 keV, and decreases to 0.9 at ∼9 keV. This means that the current calibration might be underestimated by ≤10% at the high-energy end of the standard SXS bandpass. This correction above 7 keV is more significant with an improved model. We discuss the effective-area corrections in more detail in section 3.4. Note that the sharp change at 7 keV is caused by the model grids; changing the grids has a negligible effect on the fitted parameters.
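A correction function built from piece-wise power laws, as described for the knak component, is equivalent to linear interpolation between knots in log-log space. The sketch below illustrates that idea only; it is not the SPEX implementation, and the knot positions are illustrative choices that merely echo the behavior quoted above (unity up to 7 keV, about 0.9 near 9 keV).

```python
import numpy as np


def piecewise_powerlaw(energy_kev, knot_energies_kev, knot_factors):
    """Correction factor from power-law segments joining the given knots.

    A power law is a straight line in log-log space, so the segments are
    obtained by linear interpolation of log(factor) versus log(energy).
    Outside the knot range the end values are held constant.
    """
    log_factor = np.interp(np.log(energy_kev),
                           np.log(knot_energies_kev),
                           np.log(knot_factors))
    return np.exp(log_factor)


# Illustrative knots (assumed): flat at unity up to 7 keV, 0.9 at 9 keV.
knots_e = np.array([1.9, 7.0, 9.0])
knots_f = np.array([1.0, 1.0, 0.9])

for e in (3.0, 7.0, 8.0, 9.0):
    print(f"{e:.1f} keV -> correction {piecewise_powerlaw(e, knots_e, knots_f):.3f}")
```

With a coarse grid of knots, the correction changes slope abruptly at each knot, which is the kind of sharp feature at 7 keV noted above.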

A.1.3 Binning of the data
For the binning of our X-ray data, we follow the optimal-binning approach of Kaastra & Bleeker (2016). The optimal bin size depends on the spectral resolution, the number of resolution bins, and the local intensity of the spectrum, and is different for each energy. It is obtained by issuing the obin command in SPEX. Since we start with a spectrum with 0.5-eV bins, our optimal bin size is a multiple of 0.5 eV. In practice, for most energies below 8.2 keV we use a bin size of 1.5 eV, and for higher energies 2 eV. The exceptions are near the S XV Heα complex, where we use 1 eV, and near the high-n transition lines of Fe XXV, including the Heγ, Heδ, and Heε lines, where we also use 1.5-eV data bins.

Appendix 2 Reference line energies for the energy-scale correction
Table 9 compares the reference energies of emission lines in SPEX used in the SXS energy-scale correction (appendix 1.1) to the NIST Atomic Spectra Database version 5.3 (Kramida et al. 2016) and other available measurements and calculations.
In the case of hydrogenic ions, we use the calculations of Erickson (1977). This is in contrast to Hitomi Collaboration et al. (2016), where for H-like Fe XXVI the calculations of  were used. The up-to-date calculations of Yerokhin & Shabaev (2015) agree well with . Although  are the accepted standard for n =2 to 1 transitions in H-like ions (Lyα) and have been well tested (Beiersdorfer 2009), they do not include transitions from higher Rydberg states with n ≥3. The calculations are in good agreement between Erickson (1977) and  within 0.02 eV for Z ≤20. For Lyα1 of Fe XXVI, Erickson (1977) is 0.11 eV less. For consistency, we use the values from Erickson (1977).
For n = 2 to 1 transitions in He-like ions (Heα) of Cr and Fe, Ca and Ni, and S, we respectively use the calculated values of Shirai et al. (2000), Sugar & Corliss (1985), and Kaufman & Martin (1993). For Ar Heα, we use the measurement of Kelly (1987). These values agree within 0.2 eV with the calculations of Artemyev et al. (2005) as well as Cheng et al. (1994). An exception is the Ni Heα line, whose deviation is +0.47 eV. A recent update by Natarajan & Kadrekar (2013) gives better agreement for Ni Heα with the SPEX value. These calculations have also been compared to many measured values (Beiersdorfer & Brown 2015), and good agreement is found. We also note the work of Drake (1988), which has often been used as a calibration standard.
For the 1s-3p transitions of Fe XXV (Heβ1) and Fe XXIV (j3 satellite in Phillips 2008), we respectively use the calculation of Sugar & Corliss (1985) and a FAC calculation (private communication with A. J. J. Raassen, 2017). Smith et al. (1993) performed both calculations and measurements of these 1s-3p lines of Fe ions. The calculated values agree with those used in SPEX within 0.1 eV. On the other hand, the measured values have relatively large deviations (−0.45 and −0.9 eV, respectively) from the calculations, which may be due to the limited wavelength calibration of the crystal spectrometer, as noted in Smith et al. (1993).
For the other high-n Rydberg series (Ca XIX Heβ and Fe XXV Heβ-δ), we use the calculated values from Sugar & Corliss (1985). The measurements of the high-n lines of He-like Fe XXV have been conducted by Indelicato et al. (1986) and Beiersdorfer et al. (1989), respectively. These agree well with the SPEX values within the measurement errors (±0.15 eV for Heγ1 and ±0.22 eV for Heδ1).

Appendix 3 Systematic factors due to instrumental effects
In this section we discuss the effects of several aspects of the instrumental calibrations on the derived parameters.

A.3.1 Velocity-gradient correction
The line broadening due to the spatial bulk velocity of the ICM is removed by applying an energy-scale correction to each pixel (section 2). Without this correction, the C-statistic obtained with the baseline model increases by δCstat = 62, and the LOS turbulent velocity dispersion becomes larger by 13 km s$^{-1}$ ("No vel. cor." in table 1). The best-fit line center shifts towards shorter wavelength by 23 km s$^{-1}$.
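The δCstat values quoted throughout this appendix are differences in the C-statistic between two fits to the same data. As a reminder of what is being compared, the sketch below evaluates the commonly used Poisson form $C = 2\sum_i [m_i - d_i + d_i \ln(d_i/m_i)]$ for observed counts $d_i$ and model counts $m_i$. It is a minimal illustration under that assumption, with made-up counts, and does not reproduce the SPEX implementation.

```python
import numpy as np


def cstat(data_counts, model_counts):
    """C-statistic 2*sum(m - d + d*ln(d/m)) for Poisson-distributed counts."""
    d = np.asarray(data_counts, dtype=float)
    m = np.asarray(model_counts, dtype=float)
    if np.any(m <= 0):
        raise ValueError("model counts must be positive in every bin")
    # The d*ln(d/m) term vanishes for empty bins (d = 0).
    log_term = np.zeros_like(d)
    filled = d > 0
    log_term[filled] = d[filled] * np.log(d[filled] / m[filled])
    return 2.0 * np.sum(m - d + log_term)


# Made-up counts: the same data compared with two candidate models.
data = np.array([12.0, 30.0, 0.0, 7.0])
model_a = np.array([11.0, 29.0, 0.5, 8.0])
model_b = np.array([15.0, 24.0, 2.0, 4.0])

delta_cstat = cstat(data, model_b) - cstat(data, model_a)
print(f"delta Cstat (model_b - model_a) = {delta_cstat:.2f}")
```

A positive δCstat means that the alternative configuration fits the data worse than the reference, which is the sense in which the values in table 1 are quoted.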

A.3.2 Response matrices
We also test how much the fit changes when using a small RMF with only the Gaussian core component, as well as an extra-large RMF with the electron-loss continuum$^{7}$. As shown in table 1, a small RMF improves the baseline fit by δCstat = 4, while an extra-large RMF (listed as "XL RMF") instead gives a poorer fit with δCstat = 12. The changes in the best-fit temperature and abundances due to the RMF-type selection are 1-2%.

A.3.3 Non-X-ray background
The NXB rate depends on the orbital history of the satellite. Although this effect is already taken into account in sxsnxbgen, the systematic uncertainty could be large if the orbital history is biased. For the Perseus observations (Obs 2-4), this systematic is expected to be small, as the on-source time (≈290 ks) is much longer than the satellite orbital period. Indeed, the estimated NXB rate is 3.0×10$^{-2}$ counts s$^{-1}$ cm$^{-2}$ in 1.0-10 keV, consistent with the orbit-averaged value (Kilbourne et al. accepted.). This corresponds to 0.4% of the total count rate of the source events in 1.9-9.5 keV. Here, we consider an extreme case where we completely ignore the NXB contribution. As shown in table 1, the baseline run without the NXB component gives a larger C-statistic value (δCstat = 9) than the original run, and the impact on the fitted parameters is minor, ≤1% on temperature and abundances.

$^{7}$ The current version of SPEX (3.03.00) is not fully compatible with the extra-large-size RMF because of its complexity. Here we apply a local fix to the incompatibility, which will be publicly available in the next release (version 3.03.01 or later).

A.3.4.1 Point-source ARFs
The spatial extent of the target has an impact on the instrumental response. As shown in table 1 (labeled as "PS ARF"), the use of the point-source ARFs not only on the AGN component but also on the ICM component of the Perseus cluster gives a larger C-statistic value (δCstat =30) than the original baseline fit. The improper ARF would lead to a 1% bias on temperature and up to 5% biases on abundances.

A.3.4.2 No effective-area correction factor
The correction factor (appendix 1.2) is included to remove potential calibration uncertainties on the effective area. As shown in table 1 (labeled as "No cor."), ignoring the correction factor yields a poorer fit (δCstat =38). The temperature shifts by 0.08 keV from the original value, and several abundances are underestimated by 0.03 times solar. The emission measure changes by 1.3%, larger than the statistical error by a factor of 5. This indicates that the correction factor, despite being small, is still needed for the current calibration.

A.3.4.3 Correction factor based on on-ground calibration
The baseline effective-area correction is done with the SPEX model knak. Alternatively, the correction can be achieved by setting auxtransfile=CALDB in the aharfgen run, which applies an additional empirical transmission to the original ARFs. With this correction, the discrepancy in the mirror effective area between on-ground calibration measurements and ray-tracing simulations is removed. We hence re-run the baseline fit by including the new correction factor while turning off the knak and Be-filter fine-tuning (appendix 1.2). The original and new correction factors are respectively shown with the blue and red curves in figure 22 (a). A poorer fit (δCstat = 191; "Ground cor." in table 1) is obtained with the new correction, and the best-fit CIE emission measure changes by 4%. The temperature decreases by 0.12 keV and the abundances change by ≤0.06 solar from the original values.

A.3.4.4 Correction using the Crab observation
The third method to evaluate and correct the systematic uncertainty in the effective area is to use standard candles. The Crab is one of the most widely used reference sources for effective-area calibration. Using a 9.7-ks observation, Tsujimoto et al. (accepted.) found that the SXS spectrum of the Crab showed a systematic deviation from the canonical model. The deviation, defined as the SXS-to-canonical ratio, behaves differently depending on the energy band, but in the 3-9 keV band it decreases monotonically with energy within ±5%. This is reminiscent of our effective-area correction with knak as shown in figure 22 (a). Therefore, we re-run the baseline fit with the Crab ratio as an effective-area correction instead of the original factor in appendix 1.2.
As shown in table 1 ("Crab cor."), the Crab correction does improve the fit from the no-correction case ("No cor.") by δCstat = 25. This accounts for two-thirds of the improvement obtained with the original correction factor (δCstat = 38). The slightly worse fit than the baseline model might be attributable to the different observation configurations between Perseus and Crab: different pixel contributions and event-grade selections due to the different spatial extents of the sources and incoming photon rates.

A.3.4.5 ARF with the latest aharfgen
A newer version (006) of the Hitomi software has been released, including an updated ray-tracing ARF generator aharfgen, in which a bug in the coordinate calculation for an input image is corrected. A comparison of the effective areas from the old and new tools, as well as the new-to-old area ratio, is shown in figure 22 (b). The ratio curve has an almost constant, smooth structure over the fitting range without any line- or edge-like features. The notable difference is rather in the total effective area. The ∼20% lower area results in a comparable increase in the ICM emission measure ("New arfgen" in table 1). Although the new ARF marginally improves the fit, the changes in the best-fit values of the other parameters are less than 0.1%, justifying the use of the old ARF in the baseline model for the current purpose.

A.3.5 Effects of the gain correction factor
The energy-scale correction is crucial for fitting the emission lines. Once it is removed, the fit with the baseline model becomes worse by δCstat = 627, and the Si, S, and Fe abundances are affected by up to 15%. The temperature and line broadening are not affected by the energy-scale correction.

Figures 23-25 show the full-band (1.9-9.5 keV) Hitomi SXS spectrum with the best-fit baseline model using SPEX version 3.03 and the relative differences of the best-fit models obtained with various other plasma models. See section 4 for details.