The distribution of absorption in AGN detected in the XMM-Newton observations of the CDFS

We have used very deep XMM-Newton observations of the Chandra Deep Field-South to examine the spectral properties of the faint active galactic nucleus (AGN) population. Crucially, redshift measurements are available for 84% (259/309) of the XMM-Newton sample. We have calculated the absorption and intrinsic luminosities of the sample using an extensive Monte Carlo technique incorporating the specifics of the XMM-Newton observations. Twenty-three sources are found to have substantial absorption and intrinsic X-ray luminosities greater than 10^44 erg/s, putting them in the "type-2" QSO regime. We compare the redshift, luminosity and absorption distributions of our sample to the predictions of a range of AGN population models. In contrast to recent findings from ultra-deep Chandra surveys, we find that there is little evidence that the absorption distribution is dependent on either redshift or intrinsic X-ray luminosity. The pattern of absorption in our sample is best reproduced by models in which ~75% of the AGN population is heavily absorbed at all luminosities and redshifts.


INTRODUCTION
Recent X-ray spectral studies using XMM-Newton and Chandra (Piconcelli et al. 2003; Civano et al. 2005; Mateos et al. 2005a) have begun to unravel the nature of the faint X-ray sources that make up the bulk of the extragalactic X-ray background (XRB) below 10 keV. Most of these sources are accretion powered AGN (Page et al. 2003). However, the population is far from homogeneous; the AGN exhibit a range of spectral properties, both in X-rays and at other wavelengths. Unified models of AGN (e.g. Antonucci 1993) have been reasonably successful in explaining the observational differences between the various classes of AGN/Seyfert galaxies. The X-ray spectra of the AGN can, in the majority of cases, be adequately described by a power-law model, in many cases with some degree of absorption. The AGN with significant X-ray absorption are predominantly identified with narrow emission line galaxies, and are seen as higher redshift, higher luminosity analogues of nearby Seyfert 2 galaxies. Population synthesis models such as those of Comastri et al. (1995), Gilli, Salvati & Hasinger (2001) and Ueda et al. (2003) use a superposition of faint AGN to reproduce the observed XRB. These models incorporate a large fraction of absorbed AGN in order to reproduce the hard spectrum of the XRB. Gilli, Salvati & Hasinger (2001) found that the XRB could be reproduced using a model in which the absorbed and unabsorbed AGN shared a common intrinsic X-ray luminosity function. A similar AGN population model was found to represent adequately the multi-wavelength properties of AGN in the GOODS dataset (Treister et al. 2004). In our previous work (Dwelly et al. 2005), we compared the X-ray colour distribution of sources detected in the 13^H XMM-Newton deep field with the X-ray colour distributions predicted by a number of model NH distributions.
We found that the best match to the data was made by a NH distribution model in which absorbed and unabsorbed AGN share the same intrinsic luminosity function, and where the number of AGN per unit log NH is proportional to (log NH)^8.
However, the results from other studies based upon deep X-ray surveys (e.g. Piconcelli et al. 2003; Ueda et al. 2003; Cowie et al. 2003; Barger et al. 2003, 2005) appear to contradict some of the foundations of these simple synthesis models. These studies all found that on average, the absorbed AGN which are optically identified lie at lower redshifts, and have lower intrinsic luminosities, than their unabsorbed counterparts. Clearly the relationship between absorption and luminosity in the AGN population requires further examination. Therefore, in order to constrain the relationship between AGN luminosity, redshift and absorption we have undertaken a study of the X-ray properties of AGN in the Chandra Deep Field-South (CDFS). In this study we examine the 500 ks XMM-Newton European Photon Imaging Camera (EPIC) observations of the CDFS (hereafter XMM-CDFS) which provide a superb dataset for measuring the spectral properties of faint X-ray sources. The EPIC imaging reaches to fluxes well below the break in the 2-5 keV source counts, covers around 0.19 deg^2 and contains enough photons to permit broad band X-ray spectral analysis for even the faintest sources. The studies of Streblyanska et al. (2004) and Braito et al. (2005) have also used the XMM-CDFS dataset to investigate the X-ray spectra of a number of the brighter AGN in the field. However, until this work, there has been no investigation of the X-ray properties of the entire source population detected in the XMM-Newton imaging.
The EPIC data are complemented by Chandra observations (Giacconi et al. 2002; Lehmer et al. 2005), which provide sub-arcsecond positions for the majority of the EPIC sources, allowing us to identify uniquely the X-ray sources with optical counterparts. What is more, the entire EPIC field of view is covered by the COMBO-17 survey (Wolf et al. 2004), providing photometric redshift estimates for nearly all R < 24 optical counterparts. We use an extensive Monte Carlo simulation process to recover the NH and intrinsic luminosity of the AGN detected in our sample. We also use the simulations to compare directly the distribution of sources with the predictions of a number of AGN population models.
This paper is laid out as follows. In section 2.1 we describe the reduction of the XMM-Newton data, and the source detection process. In section 3 we detail how we have correlated this with the other datasets available in the CDFS. In section 4 we introduce our Monte Carlo method for calculating the absorption and luminosity of the sample, and we demonstrate its fidelity. In section 5 we present the distribution of absorption, luminosity and redshift in the XMM-CDFS sample and compare it to the predictions of a number of AGN population models. Finally, in section 6 we compare our results to those found by other studies and discuss the implications for AGN population models.
Throughout the paper we use a lambda-dominated flat cosmology with H0 = 70 km s^-1 Mpc^-1, (ΩM, ΩΛ) = (0.3, 0.7). S_{Ea−Eb} denotes the flux of a source in the observed Ea-Eb keV band, corrected for Galactic absorption. L2−10 refers to an object's intrinsic X-ray luminosity (that is, before absorption), in the rest-frame 2-10 keV energy band. NH is the equivalent hydrogen column density in units of cm^-2.
Table 1. Summary of the XMM-Newton observations in the CDFS, showing the observation ID numbers, dates, exposure times, and telescope position angles (PA). Exposure times are given in kiloseconds for the pn and the average of the two MOS detectors, and indicate the length of good time remaining after removing periods of high background.
[...] observations have position angle rotated ∼180° with respect to the July 2001 observations. The XMM-Newton data cover ∼0.19 deg^2 (nearly twice the sky area of the 1Ms Chandra observations), and total around 500 ks. All three EPIC detectors (MOS1, MOS2 and pn) were operated with the 'Thin1' filters and were in full frame mode. The XMM-Newton data were reduced using standard Science Analysis Software (SAS, version 6.0) tasks, following the method described by Loaring et al. (2005). After temporally filtering periods of enhanced particle background from the event lists, we are left with approximately 340 ks of pn, and 395 ks of MOS exposure time. We notice an enhancement of the 0.2-0.5 keV background level for CCD #5 of MOS1, therefore we have discarded all the data from this chip in this energy range. We note that this effect has been reported recently by Pradas & Kerp (2005). We make an image, and an exposure map for each of the EPIC cameras, in each of the 0.2-0.5, 0.5-2, 2-5 and 5-10 keV energy bands, and for each of the eight observations. We also produce out-of-time event images for the pn camera in each of the four energy bands and for each observation which are used later by the background fitting algorithm.
We have tied our coordinate system to the positions of (relatively) bright point-like X-ray sources detected in the 1Ms Chandra imaging of the field, taking into account the (-1.1″, 0.8″) offset between the Chandra positions and optical counterparts (Giacconi et al. 2002). A summary of the XMM-Newton observations is given in table 1. Visual inspection of the 0.2-0.5 and 0.5-2 keV images reveals four regions of large scale (> 1′ diameter) diffuse emission, the locations of which are shown in figure 1. The most likely origin of this emission is from highly ionised gas in galaxy groups or clusters, the study of which is outside the scope of this AGN paper. The background fitting algorithm we have employed is not designed to remove diffuse emission on these scales. This unsubtracted diffuse emission will affect our measurement of the X-ray spectral properties of any AGN lying in these four regions, therefore we have excluded them from further analysis. The total sky area removed amounts to ∼28 arcmin^2 (4% of the total XMM-Newton sky coverage).
Figure. The combined XMM-Newton EPIC MOS+pn 0.5-2 keV exposure map. The peak pn-equivalent exposure time is ∼540 ks. We show the regions covered by the COMBO-17 survey (large black dashed rectangle), the 1Ms Chandra imaging (white polygon), and extended Chandra field (large black polygon), and the approximate area covered by the VVDS (smaller white dashed rectangle).
We have used the iterative background fitting and multi-band source searching method to detect sources in the EPIC images. This process is described in detail by Loaring et al. (2005), but for completeness we give a summary here. Our method uses the SAS source searching routines EBOXDETECT and EMLDETECT, together with an iterative background fitting algorithm to detect sources simultaneously in the multi-band EPIC images. In order to maximise the sensitivity, the final source searching and source characterisation is carried out on composite images (one per energy band), summed over all observations and summed over all three EPIC detectors (MOS1, MOS2 and pn). In order to take account of the different effective areas, exposure times, and chip layouts/gaps of the EPIC detectors, we generate a composite exposure map. The contribution to this map from the MOS detectors was scaled according to the MOS/pn response ratio. The relative sensitivity of the MOS and pn detectors in each energy band was calculated with the XSPEC spectral fitting package (Arnaud 1996) using standard on-axis EPIC pn and MOS response matrices. A similar process was used to determine the absolute pn count-rate to flux conversion factors in each energy band. For these calculations, we assume a power-law spectrum with photon-index 1.7, corrected for Galactic absorption of 8 × 10^19 cm^-2 (Rosati et al. 2002). For the purposes of the final source detection process (in which an off-axis dependent PSF model is used), the position of the optical axis in the combined images is set to be the pn exposure-weighted mean of the pointings of the eight separate observations. For each candidate detection, the source searching routine EMLDETECT reports a multi-band detection likelihood parameter, DET_ML, where DET_ML = −ln P. P is the probability (taking account of all four energy bands), that a random background fluctuation would occur within the detection element with an equal or greater number of source counts than the candidate detection.
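As a small numerical illustration of the likelihood definition above, the following sketch inverts DET_ML = −ln P. The function names are illustrative only and are not part of SAS.

```python
import math

def detml_to_probability(det_ml):
    """Chance probability P implied by DET_ML = -ln(P)."""
    return math.exp(-det_ml)

def probability_to_detml(p):
    """Detection likelihood DET_ML corresponding to a chance probability P."""
    return -math.log(p)

# Example: a likelihood of 5.0 corresponds to P = exp(-5), i.e. below 1 per cent.
p_at_5 = detml_to_probability(5.0)
```

Because the relation is logarithmic, raising the likelihood threshold by a few units lowers the chance probability per detection element by more than an order of magnitude.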
At the default minimum DET_ML level of 5.0, the "raw" EMLDETECT sourcelist contains 435 detections. It is not necessary for these sources to be individually detected in all four energy bands. At this low detection threshold, we expect a number of spurious detections to contaminate the faint end of the XMM-Newton sample; our method for dealing with these is discussed in the next section. A small number of the detections have very poorly determined positions (σpos > 5″), or have poorly determined extent (the 90% error on the measurement of the extent is greater than the extent itself), and so we remove these from our sourcelist. We define three hardness ratios, where R_{Ea−Eb} is the combined MOS+pn source count rate, corrected for vignetting, in the Ea to Eb (keV) energy band.
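The explicit hardness ratio expressions are not reproduced in the text above. The sketch below therefore assumes the commonly used normalised-difference form, (hard − soft)/(hard + soft), applied to adjacent pairs of the four energy bands; both the functional form and the band pairing are assumptions made for illustration.

```python
def hardness_ratio(rate_hard, rate_soft):
    """Normalised difference of two count rates (assumed form)."""
    total = rate_hard + rate_soft
    if total == 0:
        return float("nan")  # undefined when both rates vanish
    return (rate_hard - rate_soft) / total

def three_hardness_ratios(r_02_05, r_05_2, r_2_5, r_5_10):
    """HR1, HR2, HR3 from the 0.2-0.5, 0.5-2, 2-5 and 5-10 keV count rates.

    The pairing of adjacent bands is an assumption, not taken from the text.
    """
    hr1 = hardness_ratio(r_05_2, r_02_05)
    hr2 = hardness_ratio(r_2_5, r_05_2)
    hr3 = hardness_ratio(r_5_10, r_2_5)
    return hr1, hr2, hr3
```

With this form each ratio lies between −1 and +1 for non-negative count rates, with larger values indicating a harder spectrum.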

MATCHING TO Chandra AND OPTICAL CATALOGUES
Cross correlation with Chandra observations of the field
The original 1Ms Chandra imaging of the CDFS covers the central part of the XMM-Newton field of view (FOV) to great depth (Giacconi et al. 2002; Alexander et al. 2003, see fig. 2). Recently, the Chandra sky coverage of the CDFS has been increased by a mosaic of four 250 ks Chandra pointings: the Extended Chandra Deep Field-South (E-CDFS) (Lehmer et al. 2005). We used the higher positional accuracy of the Chandra observations to aid unambiguous optical identification of the XMM-Newton sources. We matched the XMM-CDFS detections to sources in a combined Chandra catalogue constructed from the catalogues of Giacconi et al. (2002) and Lehmer et al. (2005). We have taken into account the (-1.1″, 0.8″) offset between the Chandra positions and optical counterparts in the Giacconi et al. (2002) catalogue. For those Chandra sources which appear in both the Giacconi et al. (2002) and Lehmer et al. (2005) catalogues, we mainly used the positions from the former. The point spread function of the XMM-Newton EPIC detectors is strongly off-axis angle dependent (Gondoin 2000). At large off-axis angles the azimuthal component of the PSF becomes rather extended, whereas the radial component remains relatively constant. Therefore, we match the XMM-CDFS detections to Chandra counterparts using an elliptical region. The semi-major axis of this ellipse is increased from 5″ for sources at the centre of the field, up to a maximum of 10″ for sources at off-axis angles greater than 15′. The semi-minor axis is kept constant at 5″, and is oriented parallel to the line joining the source position and the nominal optical axis of the EPIC-pn detector. In addition, for the few XMM-CDFS detections which EMLDETECT determines to be slightly extended, we increase both semi-axes of the search ellipse by the measured extent. This choice of X-ray position matching criteria is discussed further in Appendix A.
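The matching region described above can be sketched as follows. This is a minimal illustration, assuming positions expressed as tangent-plane offsets in arcseconds and a linear growth of the semi-major axis with off-axis angle; the text states only the end points (5″ on axis, 10″ beyond 15′), so the linear form and all function names are assumptions.

```python
import math

def semi_major_arcsec(off_axis_arcmin, a_min=5.0, a_max=10.0, theta_max=15.0):
    """Semi-major axis of the matching ellipse (assumed linear in off-axis angle)."""
    frac = min(max(off_axis_arcmin / theta_max, 0.0), 1.0)
    return a_min + (a_max - a_min) * frac

def in_matching_ellipse(src_xy, cand_xy, axis_xy=(0.0, 0.0),
                        semi_minor=5.0, extent=0.0):
    """True if a candidate counterpart lies inside the ellipse around a source.

    All positions are (x, y) offsets in arcsec. The semi-minor axis points
    along the line joining the source and the optical axis (radial direction);
    the semi-major axis is perpendicular to it (azimuthal direction).
    """
    rx, ry = src_xy[0] - axis_xy[0], src_xy[1] - axis_xy[1]
    r = math.hypot(rx, ry)
    a = semi_major_arcsec(r / 60.0) + extent
    b = semi_minor + extent
    # Unit vector along the radial direction (arbitrary when exactly on axis,
    # where the ellipse is a circle).
    ux, uy = (rx / r, ry / r) if r > 0 else (1.0, 0.0)
    dx, dy = cand_xy[0] - src_xy[0], cand_xy[1] - src_xy[1]
    radial = dx * ux + dy * uy
    azimuthal = -dx * uy + dy * ux
    return (radial / b) ** 2 + (azimuthal / a) ** 2 <= 1.0
```

For a source 10′ off axis, for example, this sketch accepts an 8″ azimuthal offset but rejects an 8″ radial offset, reflecting the elongation of the PSF in the azimuthal direction.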
Using these positional criteria, we find that 330 of the 431 XMM-CDFS detections are matched to Chandra sources; 185 of these matches are to sources in the 1Ms Chandra catalogue, and 145 are to sources in the E-CDFS catalogue. For the XMM-CDFS sources with no Chandra counterpart, we have manually examined the XMM-Newton and Chandra images. We find that in 15 cases, there is a nearby Chandra counterpart just outside the matching ellipse. We adopt the Chandra positions for these sources. Figure 3 shows the positional offsets between these matched XMM-CDFS and Chandra sources.
The determination of the XMM-Newton detection likelihood limit is a balance between the desire to include as many sources in our sample as possible, against the need to minimise the number of spurious detections. We could simply reject all XMM-Newton detections that are not matched to Chandra sources. However, whilst the headline flux limits achieved by the Chandra observations are fainter than those for the XMM-CDFS, the coverage is not uniform over the XMM-Newton FOV. What is more, the relative sensitivity of the XMM-Newton and Chandra detectors varies with energy. In particular, XMM-Newton EPIC is much more sensitive than Chandra at very high photon energies (> 5 keV), and at very low photon energies (< 0.5 keV), so sources having either very hard or very soft spectra will be preferentially detected with XMM-Newton. The XMM-Newton and Chandra observations of the CDFS span approximately four years, therefore intrinsic variability on timescales of several months to years could also account for sources appearing in some catalogues and not others. For these reasons, and because this study is based primarily upon XMM-Newton data, we curtail our XMM-CDFS sourcelist purely on the basis of the XMM-Newton detection likelihood. However, we set the level of this such that approximately 90% of the XMM-Newton detections have Chandra counterparts. At a detection likelihood threshold of 8.5, we find that there are 335 XMM-Newton detections and 302 (90.1%) of these have at least one Chandra counterpart.
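The threshold selection described above can be illustrated with a short sketch. It assumes each detection is represented by a hypothetical (DET_ML, has_counterpart) pair and searches a list of trial thresholds for the lowest one at which the matched fraction reaches the target; the data structure and function names are assumptions, not the procedure actually coded for this work.

```python
def matched_fraction(detections, threshold):
    """Fraction of detections with DET_ML >= threshold that have a counterpart."""
    kept = [has_match for det_ml, has_match in detections if det_ml >= threshold]
    if not kept:
        return None
    return sum(kept) / len(kept)

def lowest_threshold_reaching(detections, candidate_thresholds, target=0.9):
    """Lowest trial threshold at which the matched fraction reaches the target."""
    for threshold in sorted(candidate_thresholds):
        fraction = matched_fraction(detections, threshold)
        if fraction is not None and fraction >= target:
            return threshold
    return None

# Check of the fraction quoted in the text: 302 matched out of 335 detections.
quoted_percentage = round(100.0 * 302 / 335, 1)
```

Choosing the lowest qualifying threshold keeps as many sources as possible while holding the expected contamination by spurious detections at the chosen level.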
There are 16 cases where an XMM-CDFS detection has more than one Chandra source inside (or very close to) the matching ellipse. In order to determine whether these are genuinely confused sources, we have manually inspected the XMM-Newton images, the 1Ms Chandra images (Alexander et al. 2003), and the E-CDFS images (Lehmer et al. 2005). In one case, the "confusion" appears to be the result of a single real astrophysical source appearing in both the 1Ms and E-CDFS Chandra catalogues. We chose the E-CDFS source in this case. We find that in the other cases, the XMM-Newton detection is matched to a clearly separated pair of Chandra sources. However, for four of these, there is a large brightness contrast between the two Chandra sources, and so we do not consider the XMM-CDFS source to be "confused". We have removed from our sample the remaining eleven truly "confused" XMM-Newton detections because their XMM-Newton determined properties are superpositions of more than one real astrophysical source. The small number of detections that we have had to remove demonstrates that source confusion plays only a small role (∼3% of sources) in (high Galactic latitude) XMM-Newton surveys of several hundred kiloseconds.

Optical counterparts and redshifts
The CDFS has been the target for a number of space and ground based deep optical/infrared imaging campaigns, which cover different parts of the field to different depths (e.g. Arnouts et al. 2001; Wolf et al. 2004; Giavalisco et al. 2004). A large amount of VLT-FORS time has been expended on optical spectroscopy of counterparts to X-ray sources detected in the 1Ms Chandra survey (Szokoly et al. 2004). Zheng et al. (2004) used these spectroscopic identifications together with optical and NIR measurements to estimate redshifts for virtually all of the X-ray sources in the 1Ms Chandra catalogue of Giacconi et al. (2002). Initially, we adopted the Zheng et al. (2004) estimates for all of the 167 XMM-CDFS detections matched to sources in the 1Ms Chandra catalogue. However, as noted by Barger et al. (2005), a number of the Zheng et al. (2004) optical counterparts have relatively large offsets from the X-ray source positions. We have manually examined the Chandra vs optical positions for the sources where the optical position stated by Zheng et al. (2004) is more than 2″ from the 1Ms Chandra position, or where there is more than one COMBO-17 source within 2″ of the 1Ms Chandra position. We have drawn upon the Chandra 1Ms images (Alexander et al. 2003), the E-CDFS images (Lehmer et al. 2005), the COMBO-17 optical images [3] (Wolf et al. 2004), and the GEMS/GOODS ACS images [4] (Caldwell et al. 2006), to choose the most likely optical counterparts. For sixteen cases (i.e. 10% of those sources that were examined), we decided that an alternative optical source was a more likely counterpart to the X-ray source. All of these alternative optical counterparts are closer to the X-ray position than the counterpart chosen by Zheng et al. (2004), and most are optically fainter. The ID numbers of these sources in Zheng et al. (2004) are 3, 17, 23, 25, 36, 61, 64, 70, 97, 99, 213, 517, 528, 548, 591 and 641. Where possible (four cases), we have used the COMBO-17 redshift estimates for our preferred counterpart. Otherwise we consider the X-ray source to be optically unidentified. We note that for another six of the Zheng et al. (2004) sources, the correct spectroscopic redshift is quoted, but an incorrect optical position is stated. These were all cases where the X-ray sources had multiple optical counterparts listed in Szokoly et al. (2004).
Figure 3. Top panel: The differences between the positions determined using XMM-Newton and Chandra as a function of XMM-Newton off-axis angle. The XMM-CDFS sources with Chandra counterparts are shown with circles for matches to Chandra 1Ms sources, and triangles for matches to E-CDFS sources. Those XMM-Newton sources manually matched to a Chandra source are highlighted with boxes. XMM-CDFS sources which do not have Chandra counterparts are shown with small square symbols at offset = -0.5. The "confused" XMM-Newton detections are marked with crosses. Bottom panel: The differences between the X-ray position and the position of the optical counterpart, as a function of XMM-Newton 0.2-10 keV flux. The XMM-CDFS sources which have Chandra counterparts are shown with circles (1Ms matches) and triangles (E-CDFS matches), and those which do not have Chandra counterparts are shown with small square symbols. XMM-CDFS sources having no optical counterpart within the matching region are placed at offset = -0.5.
[3] http://www.mpia-hd.mpg.de/COMBO/combo CDFSpublic.html
[4] ftp://archive.stsci.edu/pub/hlsp/gems/
We note that several of the XMM-CDFS detections matched to 1Ms Chandra sources have faint (R > 25) optical counterparts for which photometric redshifts have been calculated by Mainieri et al. (2005b). However, in each case, the Mainieri et al. (2005b) redshift estimate is in agreement with the Zheng et al. (2004) value within the errors. So for simplicity and consistency, we prefer to adopt the Zheng et al. (2004) redshift estimate.
For the XMM-CDFS detections having a counterpart in the E-CDFS Chandra catalogue, and where an optical counterpart is found, we adopt the optical counterpart position from Lehmer et al. (2005). We use the COMBO-17 photometric redshift if there is an object in the COMBO-17 catalogue within 1″ of the optical position given by Lehmer et al. (2005). We find that in several cases, the optical counterpart has also been spectroscopically identified (i.e. it appears in the VVDS catalogue of Le Fevre et al. 2004, or the Szokoly et al. 2004 field galaxy list), and so for these XMM-CDFS sources we adopt the spectroscopic identifications.
For the 33 XMM-CDFS detections having no Chandra counterpart, we have attempted to assign an optical counterpart from the COMBO-17 catalogue. Our starting point was to choose the optically brightest source inside the variable matching ellipse discussed earlier. We then manually examined the X-ray and optical images to determine if this was the correct choice. For most of these sources, we adopted the initial choice of counterpart. However, for two sources, we chose an alternative optical counterpart because of a co-location with an enhancement in the E-CDFS image. For 13 of the XMM-CDFS sources without Chandra counterparts, the XMM-Newton detection is most likely due to diffuse emission from a group or cluster. That is, there are several galaxies having similar photometric redshifts located close to the XMM-Newton position. We have excluded these detections from our final XMM-CDFS sample, as they are unlikely to be AGN. There is one XMM-CDFS detection located away from the centre of a bright (R ∼ 17) face-on spiral galaxy. The soft X-ray colours of this source, together with its non-detection by Chandra suggest that it is likely to be due to diffuse emission, and so we remove this detection from the sample. Finally, there were two XMM-CDFS sources located at the edge of the XMM-Newton FOV, outside the E-CDFS and COMBO-17 coverage. For simplicity, we have removed these two sources from our sample. The lower panel of figure 3 shows the X-ray-optical position differences for all the XMM-CDFS sources as a function of X-ray flux.
We note that the XMM-CDFS sample contains several low redshift sources with fluxes that imply very low luminosities (L2−10 ∼ 10^40 erg s^-1). It is therefore feasible that our sample contains a small number of ultra luminous X-ray (ULX) sources. Lehmer et al. (2006) have recently used Chandra deep field data to look for low X-ray luminosity sources lying off-axis in low redshift, optically bright galaxies. The Lehmer et al. (2006) sample contains 8 objects within the area covered by the XMM-Newton observations of the CDFS, but only one of these (J033234.73-275533.8 at z = 0.038) is associated with an XMM-CDFS source. We exclude this object from our sample because it is unlikely to be an AGN.
In summary, after applying our XMM-Newton detection likelihood and positional criteria, and after removing the detections which are confused or unlikely to be point sources, there are 309 sources in the XMM-CDFS sample. Of these, 291 have Chandra counterparts, 278 are matched to optical counterparts, and 259 (84%) have optical spectroscopic identifications and/or photo-z estimates. Fifteen of these are associated with Galactic stars, and one with a candidate ULX in a low redshift galaxy. Figure 4 compares the optical magnitudes of the XMM-CDFS sources to their 0.2-10 keV X-ray flux. Note that over a third (109/309) of the XMM-CDFS sources are optically faint (R > 24).

ESTIMATING THE INTRINSIC PROPERTIES OF THE SAMPLE USING X-RAY COLOURS AND MONTE CARLO SIMULATIONS
It has been shown by several authors (e.g. Mainieri et al. 2002; Della Ceca et al. 2004; Perola et al. 2004; Dwelly et al. 2005) that X-ray hardness ratios can be utilised to determine the spectral properties of faint XMM-Newton sources, and in particular, the amount of absorption. This approach relies on the observed source belonging to some assumed family of spectral types, for which the spectral parameters can be deduced. Spectral analyses of relatively bright AGN have shown that nearly all can be broadly described by a spectral model consisting of a primary power-law with slope Γ ∼ 1.9, attenuated with some absorbing column of neutral material (e.g. Piconcelli et al. 2003; Page et al. 2006). A number of additional spectral model components are sometimes required to provide the best fits to the highest signal to noise AGN spectra, although these extra components are generally much less important than the primary power-law component when considering broad band X-ray colours. However, in our previous work (Dwelly et al. 2005), we found that we were able to provide a better match to the X-ray colours of AGN detected in the 13^H deep XMM-Newton field by including an unabsorbed cold reflection component in the AGN spectral model; we use this as our baseline spectral model. The most important effect of the reflection component is to harden the spectrum at high energies (E > 5 keV), and thus it reduces the amount of absorption needed to explain the X-ray spectrum of AGN with hard-sloped spectra.
The traditional approach to estimating the absorption in faint X-ray detected AGN is to fit the observed multi-band X-ray hardness ratios to a model spectrum using a spectral fitting package such as XSPEC. However, there is a large degree of degeneracy in such an approach because of the number of fitted spectral parameters (NH and/or Γ), compared to the limited number of data points (it is typical for authors to use just a single hardness ratio measure between the 0.5-2 and 2-10 keV energy bands). We have devised a novel Monte Carlo approach, in which we deduce the intrinsic properties of sources in the XMM-CDFS sample by comparing them to the "output" properties of a library of simulated AGN. This method allows us to take account of the scatter of sources in multiple HR space (which may be strongly asymmetric, and is dependent on the observations), and so allows a rigorous estimation of confidence intervals. What is more, we can use our simulation method to compare directly the source distributions predicted by various synthesis models, against those seen in our sample. We first summarise the method used to generate a simulated library of sources, and then describe the absorption and intrinsic luminosity estimation processes.
This approach allows us to treat all the identified sources in the XMM-CDFS sample in a consistent, uniform fashion, as opposed to examining the brighter sources using a different method to the fainter sources.

Monte Carlo simulations of the AGN populations
A detailed description of our Monte Carlo simulation process is given in Dwelly et al. (2005). Here we give a short summary. A population of AGN is randomly generated according to a model of the intrinsic X-ray luminosity function (XLF).
To calculate the expected number of input AGN per field, the model XLF is integrated over the ranges 0.015 < z < 5.0, and 10^40 < LX < 10^48 erg s^-1 (extrapolating from published models where necessary). For each of these AGN we assign a random redshift and luminosity, where the probability of a source having a particular value of z and LX is taken from the XLF model. Each AGN is then randomly assigned values of NH and Γ according to the respective model distributions. The z, LX, NH and Γ for each source are converted to multi-band XMM-Newton EPIC count-rates using the spectral model together with the EPIC response matrices. We adopt a spectral model consisting of a primary transmitted component and a component reflected from neutral material. The primary component is modelled as a power-law with an exponential cutoff (E_cutoff = 400 keV), absorbed by a neutral column of material. The reflection component is calculated from the pexrav model of Magdziarz & Zdziarski (1995), and we set the reflecting material to cover π steradians, to be inclined at 30 degrees to the viewing direction, and to have solar abundances. We then simulate how this population would appear if it had been observed in the same manner as for the XMM-Newton observations of the CDFS. We generate simulated images separately for every combination of the four energy bands, three EPIC cameras, and eight observations, totalling 96 images per simulated field. These images incorporate the effects of the EPIC point spread function and detector response, use realistic exposure maps, and include a background at the same level as is observed in the real data. We sum these images over all observations and for the MOS1, MOS2 and pn cameras to produce one image per energy band. We mask out the four regions in the simulated images which are affected by diffuse emission in the real data.
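Converting a simulated redshift and luminosity into observed quantities requires the luminosity distance in the cosmology adopted throughout the paper (flat, H0 = 70 km s^-1 Mpc^-1, ΩM = 0.3, ΩΛ = 0.7). The following is a minimal sketch of that single step, using Simpson's rule; it applies no K-correction, absorption or instrument response, which in the simulations are handled by the spectral model and the EPIC response matrices. Function and constant names are illustrative.

```python
import math

H0 = 70.0            # Hubble constant, km s^-1 Mpc^-1
OMEGA_M = 0.3        # matter density parameter
OMEGA_L = 0.7        # cosmological constant density parameter
C_KM_S = 299792.458  # speed of light, km s^-1
MPC_CM = 3.0857e24   # centimetres per megaparsec

def luminosity_distance_mpc(z, steps=1000):
    """Luminosity distance in Mpc for the flat cosmology above.

    Integrates 1/E(z) with Simpson's rule; `steps` must be even.
    """
    if z <= 0:
        return 0.0

    def inv_e(zp):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + OMEGA_L)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    comoving = (C_KM_S / H0) * total * h / 3.0
    return (1.0 + z) * comoving

def flux_from_luminosity(lum_erg_s, z):
    """Flux in erg s^-1 cm^-2 from L / (4 pi D_L^2), without K-correction."""
    d_cm = luminosity_distance_mpc(z) * MPC_CM
    return lum_erg_s / (4.0 * math.pi * d_cm ** 2)
```

Under these assumptions a source with a luminosity of 10^44 erg s^-1 at z = 1 has a flux of order 10^-14 erg s^-1 cm^-2 before any band or absorption corrections.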
Background maps are calculated independently for each of the 96 images, (as for the real data) and then summed to produce one background map per energy band. We use the tasks EBOXDETECT and EMLDETECT to search the images for sources in the four energy bands simultaneously. The resultant output sourcelists are curtailed using the same detection likelihood and positional accuracy criteria as for the real XMM-Newton dataset. We use the off-axis angle dependent matching ellipse described in section 3 to match output detections to input sources. To exclude output detections which are affected by confusion, we flag all output detections which are matched to two or more input sources having comparable (within a factor of five) full-band count-rates. This step mimics the method we used to find confused sources in the real XMM-CDFS sample.
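The confusion flag applied to the simulated detections can be sketched as below. The sketch assumes that "comparable" means the second-brightest matched input source has a full-band count-rate within a factor of five of the brightest, and that the boundary case counts as confused; both readings, and the function name, are assumptions.

```python
def is_confused(matched_count_rates, factor=5.0):
    """Flag a detection matched to two or more inputs of comparable brightness.

    `matched_count_rates` holds the full-band count-rates of all input
    sources matched to a single output detection.
    """
    if len(matched_count_rates) < 2:
        return False
    rates = sorted(matched_count_rates, reverse=True)
    # Confused if the second-brightest input is within `factor` of the brightest.
    return rates[1] * factor >= rates[0]
```

A detection matched to inputs with count-rates of 10 and 3 (arbitrary units) would be flagged, whereas one matched to inputs of 10 and 1 would not, mirroring the brightness-contrast criterion used for the real sample.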

Constructing the library of simulated sources
We have used the Monte Carlo method to generate a large reference library of simulated sources, and have used the following constituents to the AGN population model. We use the Luminosity Dependent Density Evolution (LDDE) XLF model of Ueda et al. (2003). The latter was fitted only over the 10^41.5 ≤ L2−10 ≤ 10^46.5 erg s^-1 range and to z = 3. In order to cover the range of luminosities expected in the XMM-CDFS sample, we have extrapolated this XLF model down to L2−10 = 10^40 erg s^-1, and out to z = 5. We use a model NH distribution in which the number of AGN per unit log NH is proportional to (log NH)^8, and is independent of luminosity and redshift. To recreate the intrinsic scatter in the spectral slopes of AGN (Piconcelli et al. 2003; Page et al. 2006), we assign a randomly selected Γ to each simulated source. These spectral slopes are randomly chosen from a Gaussian distribution centred on Γ = 1.9, with σΓ = 0.2, with the additional constraint that 1.2 < Γ < 2.6. We adjusted the absolute normalisation of the model XLF such that the sky density of sources with 0.5-2 keV flux ≥ 2 × 10^-15 erg s^-1 cm^-2 is similar to that measured in the XMM-CDFS sample (after removing confirmed non-AGN sources). A total of 2000 fields' worth of simulations were carried out to generate the simulated source library; enough to ensure that redshift, luminosity and HR space is well populated with simulated sources, but not so large as to consume a prohibitive amount of processing time. We have verified that this population model broadly reproduces the source counts of the extragalactic sources in the XMM-CDFS sample.
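The random assignment of Γ and NH described above can be sketched as follows. The Γ draw uses simple rejection sampling from the stated Gaussian and limits. For NH, the text does not give the range of log NH over which the distribution is defined, so the limits of 19 and 25 used here are purely hypothetical placeholders, as are the function names.

```python
import random

def sample_gamma(rng, mean=1.9, sigma=0.2, lo=1.2, hi=2.6):
    """Draw a photon index from a Gaussian, rejecting values outside (lo, hi)."""
    while True:
        gamma = rng.gauss(mean, sigma)
        if lo < gamma < hi:
            return gamma

def sample_log_nh(rng, beta=8, lo=19.0, hi=25.0):
    """Draw x = log10(NH) from p(x) proportional to x**beta on [lo, hi].

    Uses the analytic inverse of the cumulative distribution.
    The range [lo, hi] is an assumption, not taken from the text.
    """
    k = beta + 1
    u = rng.random()
    return (lo ** k + u * (hi ** k - lo ** k)) ** (1.0 / k)

def fraction_above(log_nh_threshold, beta=8, lo=19.0, hi=25.0):
    """Analytic fraction of sources with log10(NH) above a threshold."""
    k = beta + 1
    return (hi ** k - log_nh_threshold ** k) / (hi ** k - lo ** k)

rng = random.Random(42)  # fixed seed so the sketch is reproducible
library_params = [(sample_gamma(rng), sample_log_nh(rng)) for _ in range(1000)]
```

Larger values of the exponent weight the distribution more strongly towards high column densities, so `fraction_above` increases with `beta` for a fixed threshold and range.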

Models of the AGN NH distribution
During this study, we test the predictions of a number of model NH distributions, which are described below.
The "(log NH)^β" NH distribution models: In these models the number of AGN with absorption NH, per unit log NH, is proportional to (log NH)^β, and is not dependent on redshift or luminosity. We have tested three variations by setting the parameter β to 2, 5, and 8. A similar parameterisation of the NH distribution was introduced in the XRB synthesis models of Gandhi & Fabian (2003).
The "T04" NH distribution model: Treister et al. (2004) introduce an NH model in which the number of AGN having a particular value of absorption is derived from an obscuring torus whose density decreases with distance away from its plane. The torus geometry in this model is independent of redshift and luminosity.
The "GSH01A" and "GSH01B" NH distribution models: Gilli, Salvati & Hasinger (2001) investigated the ability of two absorption distribution models to reproduce both the shape of the XRB and the AGN source counts below 10 keV. In both models, the distribution of NH within the absorbed AGN was taken to be the same as that observed in nearby Seyfert 2 galaxies (Risaliti, Maiolino & Salvati 1999). In model "A", the number ratio between AGN with NH ≥ 10^22 cm^-2 and those with NH < 10^22 cm^-2 is fixed to be 4. In model "B" the ratio increases with redshift; at z = 0 the ratio is 4, and at z ≥ 1.32 the ratio is 10.
The "U03" NH distribution model: Ueda et al. (2003) fitted a luminosity dependent model to the distribution of NH in their AGN sample. In this model, the fraction of AGN having NH > 10^22 cm^-2 decreases linearly with luminosity, from ∼0.6 of AGN with L2−10 ≤ 10^43.5 erg s^-1, to ∼0.4 of AGN with L2−10 = 10^45 erg s^-1.
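To make the model definitions concrete, the short sketch below converts the GSH01 number ratios into absorbed fractions and evaluates a U03-style luminosity dependent fraction. It is illustrative only. The function names are hypothetical, the interpolation is assumed to be linear in log L2−10, and holding the fraction constant above 10^45 erg s^-1 is an assumption that the text does not confirm.

```python
def absorbed_fraction_from_ratio(ratio):
    """Convert an absorbed:unabsorbed number ratio into an absorbed fraction."""
    return ratio / (1.0 + ratio)

def u03_absorbed_fraction(log_lx, f_low=0.6, f_high=0.4,
                          log_l_low=43.5, log_l_high=45.0):
    """Absorbed fraction versus log10(L2-10) for a U03-like model.

    ASSUMPTIONS: linear interpolation in log luminosity between the two
    quoted anchor points, and a constant value above log_l_high.
    """
    if log_lx <= log_l_low:
        return f_low
    if log_lx >= log_l_high:
        return f_high
    t = (log_lx - log_l_low) / (log_l_high - log_l_low)
    return f_low + t * (f_high - f_low)

# Ratio 4 (GSH01A, and GSH01B at z = 0) corresponds to 80% absorbed;
# ratio 10 (GSH01B at high redshift) corresponds to about 91% absorbed.
frac_a = absorbed_fraction_from_ratio(4)
frac_b_highz = absorbed_fraction_from_ratio(10)
```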

Absorption estimation technique
The measured properties of the XMM-CDFS sources which we use to estimate the absorption are the redshift (z), the vignetting corrected count rate in the 0.2-10 keV band (R0.2−10), and three hardness ratios (HR1, HR2, and HR3). The absorption and luminosity of each real source are estimated from those simulated library sources which have similar values of z′, R′0.2−10, HR1′, HR2′, and HR3′. The following process is carried out for each optically identified source in the XMM-CDFS sample.
We first select the objects from the simulated library which have a similar redshift (z′ within (1 + z) × 0.1 of z), a similar full band count-rate (0.5 < R′0.2−10/R0.2−10 < 2), and similar hardness ratios to the real XMM-CDFS source. The weaker constraint on HR3 reflects the poorer counting statistics at harder energies. We compensate for the possible influence that the shape of the baseline NH distribution used to generate the simulated library may have on the estimation process. Statistical weights are calculated by counting, for a large number of bins in NH, the numbers of library sources which satisfy the redshift and count-rate criteria. The weight of each selected object is the inverse of the number counted in the NH bin in which it lies. A sliding box technique is then used to estimate the absorption in the real source from the NH values and weights of the selected objects. We choose the NH value where the sum of the statistical weight inside a sliding box of width 0.25 dex is maximised. The confidence interval is taken to be the range of NH about this peak which contains 68% of the statistical weight of the selected objects.
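The selection, weighting and sliding box steps can be sketched as below. This is an illustration under stated assumptions, not the implementation used in this work. The dictionary keys, the hardness ratio tolerances `hr_tol`, the NH bin width and the scan step are all hypothetical, and the 68% interval is built here by adding selected objects in order of distance from the peak, which is one possible reading of the description.

```python
import math
from collections import Counter

def estimate_log_nh(source, library, hr_tol=(0.2, 0.2, 0.4),
                    bin_width=0.1, box=0.25, step=0.01):
    """Return (peak, lower, upper) in log10(NH), or None if nothing matches."""
    # Redshift and count-rate criteria (inclusive redshift bound is assumed).
    def zr_ok(lib):
        return (abs(source["z"] - lib["z"]) <= 0.1 * (1.0 + source["z"])
                and 0.5 < lib["rate"] / source["rate"] < 2.0)
    zr_matched = [lib for lib in library if zr_ok(lib)]
    # Hardness ratio criteria; hr_tol values are HYPOTHETICAL tolerances,
    # with a looser one for HR3.
    selected = [lib for lib in zr_matched
                if all(abs(h - hs) <= tol
                       for h, hs, tol in zip(lib["hr"], source["hr"], hr_tol))]
    if not selected:
        return None
    # Weight = 1 / (number of z- and rate-matched library sources in the
    # same NH bin), to offset the shape of the library NH distribution.
    def nh_bin(x):
        return math.floor(x / bin_width)
    counts = Counter(nh_bin(lib["log_nh"]) for lib in zr_matched)
    weighted = sorted((lib["log_nh"], 1.0 / counts[nh_bin(lib["log_nh"])])
                      for lib in selected)
    # Slide a box of width `box` and keep the position with maximum weight.
    lo = weighted[0][0] - box
    hi = weighted[-1][0]
    best_start, best_sum = lo, -1.0
    for i in range(int(round((hi - lo) / step)) + 1):
        start = lo + i * step
        total = sum(w for x, w in weighted if start <= x < start + box)
        if total > best_sum:
            best_sum, best_start = total, start
    peak = best_start + 0.5 * box
    # 68% interval: add objects in order of distance from the peak (assumed).
    total_weight = sum(w for _, w in weighted)
    acc, lower, upper = 0.0, peak, peak
    for x, w in sorted(weighted, key=lambda xw: abs(xw[0] - peak)):
        acc += w
        lower, upper = min(lower, x), max(upper, x)
        if acc >= 0.68 * total_weight:
            break
    return peak, lower, upper
```

Each library entry is assumed to be a dictionary with keys `z`, `rate`, `hr` (a 3-tuple) and `log_nh`; the real source carries the same keys apart from `log_nh`.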

Intrinsic luminosity estimation technique
We estimate the intrinsic, rest-frame 2-10 keV luminosity (L2−10) of the real sources using a technique similar to that used to estimate absorption. For each XMM-CDFS source, we select the subset of sources from the simulated library which have similar redshifts, count rates, and hardness ratios (using the same criteria as before). We can account for the differences between the z, R0.2−10 of the real XMM-CDFS source, and the z′, R′0.2−10 of each of the selected library sources: the L′2−10 of the library sources are corrected by factors of R0.2−10/R′0.2−10 and d_L^2(z)/d_L^2(z′) (where d_L is the luminosity distance). We then take the median of the corrected L′2−10 of the selected subset of simulated sources as our estimate of the intrinsic luminosity of the real source. The confidence interval is given by the range of corrected L′2−10 about the median value which contains 68% of the simulated subset.
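A minimal sketch of the luminosity correction follows. It assumes a flat ΛCDM cosmology with Ω_m = 0.3, which is not stated in this section; only the ratio of luminosity distances enters, so the Hubble constant cancels. The function names and dictionary keys are hypothetical, and the 68% interval is taken here as the 16th to 84th percentile range, which is an assumption.

```python
import math
import statistics

def luminosity_distance(z, omega_m=0.3, n_steps=1000):
    """d_L in units of c/H0 for flat LambdaCDM (ASSUMED cosmology)."""
    omega_l = 1.0 - omega_m
    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + omega_l)
    dz = z / n_steps
    # Trapezoidal integration of the comoving distance.
    comoving = sum(0.5 * (inv_e(i * dz) + inv_e((i + 1) * dz)) * dz
                   for i in range(n_steps))
    return (1.0 + z) * comoving

def estimate_log_lx(source, selected):
    """Median corrected log10(L2-10) and an approximate 68% range.

    `selected` holds the matched library sources; all redshifts must be > 0.
    """
    d_src = luminosity_distance(source["z"])
    corrected = []
    for lib in selected:
        scale = ((source["rate"] / lib["rate"])
                 * (d_src / luminosity_distance(lib["z"])) ** 2)
        corrected.append(lib["log_lx"] + math.log10(scale))
    corrected.sort()
    n = len(corrected)
    lower = corrected[int(math.floor(0.16 * (n - 1)))]
    upper = corrected[int(math.ceil(0.84 * (n - 1)))]
    return statistics.median(corrected), lower, upper
```

For example, a library source at the same redshift as the real source but with half its count rate has its luminosity scaled up by a factor of two (0.30 dex).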

Fidelity of the NH /LX estimation technique
We have measured the efficacy of our absorption/luminosity estimation technique by quantifying both how well it is able to estimate the NH /LX values of individual sources, as well as how well it can recover an NH /LX distribution of a population of sources.

Ability to recover NH /LX of individual sources
We constructed a test population of simulated AGN using the method described in section 4.2. The equivalent of one hundred XMM-CDFS fields were generated. For each test source, we made an estimate of absorption and intrinsic luminosity using our NH/LX estimation technique, in the same way as we would for the real sources. The estimated NH/LX values for each test source are then compared to the input parameter values. Figure 5 shows the relationship between the estimated and input NH for test sources in a number of redshift ranges. The technique recovers the input NH values very well for test sources having moderate to heavy absorption. However, for low absorbing columns, the scatter increases rapidly. This becomes increasingly apparent at higher redshifts, as more and more of the absorption is shifted out of the EPIC bandpass. We have calculated the level below which less than 68% of test sources have NH estimates within 0.5 dex of the input value. This ranges from 10^21.1 cm^-2 for sources in the 0 < z < 0.5 redshift range, up to 10^22.6 cm^-2 for the 3 < z < 4 range. The estimation technique becomes less accurate at very high levels of absorption. This is not unexpected; for all but the highest redshift AGN having this level of absorption, virtually all the flux has been removed below 5 keV. Thus HR1 and HR2 contain little information, meaning that we have less diagnostic power to determine the amount of absorption. What is more, at these high column densities, the effects of Compton scattering, which are not included in our spectral model, will become significant in the spectra of the real sources. We are therefore cautious about the exact NH for any sources estimated to have an absorbing column of greater than 10^24 cm^-2, but we can be confident that such objects are very heavily absorbed. However, because of X-ray selection effects, we expect our sample to contain rather few of these very heavily absorbed AGN.
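One way to compute the reliability level quoted above is sketched here. It is an assumed procedure rather than a description of the actual calculation: test sources are binned by input log NH, the fraction of estimates within 0.5 dex is found per bin, and the upper edge of the highest failing bin is returned. The function name and the bin width are hypothetical.

```python
import math

def reliability_threshold(pairs, bin_width=0.5, tol=0.5, required=0.68):
    """pairs: iterable of (input_log_nh, estimated_log_nh).

    Returns the upper edge of the highest input-NH bin in which fewer than
    `required` of the estimates lie within `tol` dex of the input value,
    or None if every bin passes.
    """
    bins = {}
    for log_in, log_est in pairs:
        key = math.floor(log_in / bin_width)
        good, total = bins.get(key, (0, 0))
        bins[key] = (good + (abs(log_est - log_in) <= tol), total + 1)
    failing = [k for k, (good, total) in bins.items() if good / total < required]
    if not failing:
        return None
    return (max(failing) + 1) * bin_width
```

In practice this would be evaluated separately for each redshift range.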
In figure 6 we show the relationship between estimated and input intrinsic luminosity for the test sources. The high fidelity of the technique is evidenced by the low scatter of points about the one-to-one relation (less than ±0.2 dex for most of the luminosity range).

Ability to recover NH /LX distributions of a population of sources
We have investigated how well the estimation technique can recover an input model absorption distribution. In addition, we have checked that the initial choice of AGN population model used to generate the simulated source library does not have a major effect on the estimated NH/LX distributions. We extend the method of section 4.6.1 to generate several simulated test populations, each of which is based upon a different input NH model. The NH distribution of each population is then recovered using our NH estimation technique. Figure 7 shows a comparison of the input and recovered absorption distributions for three different input NH models. We show the input distribution of absorption in the test model, as well as the distribution in the sources that are output by the Monte Carlo simulation process. Heavily absorbed sources are less likely to be detected by the XMM-Newton observations than sources with lower absorbing columns; this selection effect results in the differences between the input and output distributions (see section 4.7). Above column densities of ∼10^21 cm^-2, the distribution of absorption recovered by our NH estimation method is a good approximation to the "true" NH distribution in the output test population. There are disparities at low absorbing columns, where the estimation method can provide only weak constraints on the NH values of the test sources. However, the total number of output test sources having estimated NH below 10^21 cm^-2 is consistent with the total number having "true" NH below 10^21 cm^-2.
We have investigated whether the luminosity function used to construct the simulated source library has an effect on the outputs of our NH/LX estimation process. We first generated a test population of AGN distributed in redshift/luminosity space according to the XLF model of Ueda et al. (2003), and with an absorption distribution following the (log NH)^8 model. We then used our NH/LX estimation technique to recover the absorption and luminosity of these test sources. This was carried out twice, firstly using the original simulated source library (generated according to the model XLF of Ueda et al. 2003), and secondly using a new simulated source library in which the sources are distributed according to the "LDDE1" XLF of Miyaji, Hasinger & Schmidt (2000). Note that the "LDDE1" XLF model is defined in the observed 0.5-2 keV band, with no correction of luminosities for absorption. For the purposes of this study we convert the 0.5-2 keV observed frame luminosities of the Miyaji et al. (2000) model to rest frame 2-10 keV luminosities assuming a mean power law spectrum with slope Γ = 1.9. This conversion does assume that the Miyaji et al. (2000) sample is predominantly AGN unabsorbed in the X-rays (see section 3.1 of Miyaji et al. 2000). Figure 8 shows the resultant NH and LX distributions recovered using the two different source libraries. The differences between the recovered NH distributions are relatively small: for accurately measurable absorbing columns (10^21.5 < NH < 10^24.5 cm^-2) they agree to better than 10%. We compare this to the Poisson noise of 15% in the 0.5 dex wide bins of NH measured in the XMM-CDFS sample. We conclude therefore that our NH estimation technique is not strongly dependent on the AGN population model chosen to generate the simulated source library. However, we note that for L2−10 > 10^44 erg s^-1, there is a significant difference between the luminosity distributions recovered by the two simulated source libraries (which have markedly different redshift/luminosity distributions). In order to mitigate this effect, when applying our estimation technique to the real XMM-CDFS sample, we should use a simulated source library in which the sources have a broadly similar luminosity and redshift distribution to the sources in the sample. Therefore, for the remainder of this study, we have used a simulated source library that is generated according to the model XLF of Ueda et al. (2003), as described in section 4.2.

Figure 8. Plots demonstrating the independence of our NH/LX estimation technique from the X-ray luminosity function model used to generate the simulated source library. In the top panel, the solid line shows the distribution of "true" NH for a population of test sources, in this case generated using the (log NH)^8 model together with the model XLF of Ueda et al. (2003). The dotted line is the NH distribution estimated when we use the simulated source library generated from the model XLF of Ueda et al. (2003). The dot-dash line shows the estimated distribution when we use a second source library which was generated from the "LDDE1" model XLF of Miyaji et al. (2000). The lower panel shows the equivalent plot for the input and estimated distributions of intrinsic luminosity.

X-ray completeness of the XMM-CDFS sample
The probability that an AGN in the CDFS will be detected in the XMM-Newton observations depends on the object's redshift, luminosity, absorption, and position in the FOV. In order to quantify the selection function in the XMM-CDFS sample, we have compared the input and output sources in a large simulated population generated using our Monte Carlo process. The X-ray completeness is simply the ratio of the number of output detections to the number of input sources, and is calculated for a number of bins in redshift, luminosity and absorption. Figure 9 shows the regions in luminosity, absorption, and redshift space where at least half of the input sources have output detections. The XMM-CDFS observations are capable of detecting at least half the members of any population of luminous (L2−10 of at least 10^44 erg s^-1) obscured QSOs at z ∼ 2 even if they are absorbed with large column densities (NH ∼ 10^23 cm^-2).
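The completeness calculation amounts to a ratio of counts per bin, as in the sketch below. The dictionary keys, the unit-width bins in z, log L2−10 and log NH, and the function names are hypothetical choices made for illustration.

```python
import math
from collections import Counter

def default_bin(src):
    """HYPOTHETICAL binning: unit-width bins in z, log10(L2-10), log10(NH)."""
    return (math.floor(src["z"]), math.floor(src["log_lx"]),
            math.floor(src["log_nh"]))

def completeness(inputs, detected_ids, bin_of=default_bin):
    """Fraction of input sources with an output detection, per bin."""
    n_in, n_out = Counter(), Counter()
    for src in inputs:
        key = bin_of(src)
        n_in[key] += 1
        if src["id"] in detected_ids:
            n_out[key] += 1
    return {key: n_out[key] / n_in[key] for key in n_in}
```

Bins with a completeness of at least 0.5 would correspond to the regions outlined in a plot such as figure 9.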

Direct comparison of the sample with model AGN populations
We wish to compare directly the NH, LX, z distribution observed in the XMM-CDFS sample with the distributions predicted by various AGN population models. To accomplish this we have used our Monte Carlo process to simulate a number of model AGN populations, then have applied our NH/LX estimation technique to recover the NH/LX distributions of the model populations. In this way we both incorporate the complex X-ray selection effects of the XMM-Newton observations, and account for the limitations of the NH/LX estimation technique. Therefore, the output simulated source distributions from this process can be compared like-with-like to the real XMM-CDFS sample. We have compared the XMM-CDFS sample with the predictions made by the seven different NH model distributions described in section 4.3. Simulated populations are generated for each of these NH models, according to both the LDDE XLF model of Ueda et al. (2003) and the "LDDE1" XLF model of Miyaji et al. (2000). As before, the XLF models are extrapolated to low luminosities (L2−10 = 10^40 erg s^-1), and high redshifts (z = 5). For each of the fourteen combinations of NH model and XLF model, the absolute normalisation of the XLF is adjusted in order that the simulated integral 0.5-2 keV source counts above 2 × 10^-15 erg s^-1 cm^-2 match the integral source counts in the extragalactic XMM-CDFS sample. We simulate 100 fields' worth of sources for each combination of NH model and XLF model. Finally, we apply the NH/LX estimation process to recover the absorption and luminosity distributions of the simulated output populations.
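Because the predicted number of sources scales linearly with the XLF normalisation, the adjustment described above reduces to a single multiplicative factor. The sketch below is a simplified illustration; the function name and arguments are hypothetical, and it ignores any second-order effects such as source confusion changing with source density.

```python
def xlf_rescale_factor(sim_fluxes, n_sim_fields, n_observed, flux_limit=2e-15):
    """Factor by which to multiply the XLF normalisation.

    sim_fluxes   : 0.5-2 keV fluxes (erg s^-1 cm^-2) of simulated detections
    n_sim_fields : number of simulated fields that produced them
    n_observed   : observed number of sources above flux_limit in one field
    """
    n_sim_per_field = sum(1 for s in sim_fluxes if s > flux_limit) / n_sim_fields
    if n_sim_per_field == 0:
        raise ValueError("no simulated sources above the flux limit")
    return n_observed / n_sim_per_field
```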

Applying the NH /LX estimation technique to the XMM-CDFS sample
We find that our technique is able to evaluate the absorption and luminosity in the vast majority of the optically identified extragalactic sources in the XMM-CDFS sample. However, we find that because of their very soft spectra, two AGN are not matched to any objects in the simulated source library. We discuss the properties of these sources in Appendix C.
There is one XMM-CDFS source which we find to have a 2-10 keV luminosity less than 10^40 erg s^-1. For the purposes of all the comparisons made in this section, we have excluded this source because it lies outside the luminosity range simulated in the model AGN populations.

Source counts in the XMM-CDFS
In figure 10 we show the differential 0.5-2.0 keV source counts for the extragalactic (including unidentified) sources in the XMM-CDFS sample, and compare them to the predictions of several simulated model AGN populations. The shape of the predicted source count curves is dependent predominantly on the form of the XLF model rather than on the NH distribution. We see that the XLF model of Miyaji et al. (2000) predicts a 0.5-2 keV source count distribution which is rather steeper than that found in the XMM-CDFS. At 0.5-2 keV fluxes above 10^-15 erg s^-1 cm^-2, the source counts predicted by the Ueda et al. (2003)

We remind the reader that for each model of the AGN population, the XLF normalisation has been adjusted such that the integral source counts in the simulated population match the 0.5-2 keV extragalactic source counts measured in the XMM-CDFS sample (see section 4.8).

and GSH01A models, because rather similar distributions are found in the other NH models. It is clear that the redshift distribution predicted by the XLF of Ueda et al. (2003) is a far closer match to the redshift distribution of the XMM-CDFS sample than the prediction from the XLF model of Miyaji et al. (2000). Figure 11 shows that the same holds true for the luminosity distribution. We remind the reader that the Miyaji et al. (2000) luminosity function is defined in the observed 0.5-2 keV band, and that we have converted to intrinsic rest frame 2-10 keV luminosities assuming an unabsorbed AGN X-ray spectrum (a Γ = 1.9 power law). This is of course a simplification; some of the AGN in the Miyaji et al. (2000) sample may be X-ray absorbed (and hence have hard spectra), and some of the AGN may have very soft X-ray spectra. But this scatter of slopes is unlikely to be a significant issue, and certainly not sufficient to explain the large differences between the redshift and luminosity distributions predicted by the Miyaji et al. (2000) XLF model and those seen in the XMM-CDFS sample.

The redshift and luminosity distributions in the XMM-CDFS sample
In figure 12 we show redshift and luminosity distributions separately for the absorbed and unabsorbed AGN in the XMM-CDFS sample, and compare these to the predictions of the (log NH)^8 and U03 NH models.

The NH distribution in the XMM-CDFS sample
In figure 13 we show the distribution of absorption in the optically identified sources in the XMM-CDFS sample determined using our NH estimation technique. The measured distribution is compared to the predicted distributions from the seven simulated NH models. The high fidelity of our NH estimation process means that the recovered absorption distribution in the XMM-CDFS sample contains more information than just the relative numbers of absorbed and unabsorbed AGN; we can compare the shape of the absorption distribution measured in the XMM-CDFS sample with the shapes of the distributions predicted by the simulated model AGN populations. We have used the Kolmogorov-Smirnov (KS) test to make this comparison, the results of which are shown in table 3. This test clearly discriminates between the models, with the T04 NH distribution being the most strongly rejected.
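For readers who wish to reproduce this kind of comparison, a two-sample KS test can be sketched with the standard library alone. The text does not specify which implementation was used, so this is an assumption: the probability uses the common asymptotic approximation with an effective sample size, and the function name is hypothetical.

```python
import math

def ks_two_sample(a, b):
    """Two-sample KS statistic D and an approximate probability.

    The probability uses the asymptotic Kolmogorov series with the
    effective sample size n_e = n_a * n_b / (n_a + n_b).
    """
    a, b = sorted(a), sorted(b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x.
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    ne = na * nb / (na + nb)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.3:
        # The series converges poorly here; the probability is ~1.
        return d, 1.0
    prob = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                     for k in range(1, 101))
    return d, min(max(prob, 0.0), 1.0)
```

Here `a` would hold the estimated log NH values of the sample and `b` those of a simulated model population.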

The absorbed fraction in the XMM-CDFS and its dependence on luminosity and redshift
In order to characterise the degree of luminosity and/or redshift dependence of the absorption distribution, we have examined the relative numbers of absorbed and unabsorbed AGN in the XMM-CDFS sample, and compared them to the predictions of the simulated model populations. We set the threshold for a source to be considered "absorbed" to be 10^22 cm^-2 for sources at z < 3, and to be 10^22.6 cm^-2 for sources at z > 3. When we tested the reliability of the NH estimation technique (see section 4.6.1), we found that for sources in the 3 < z < 4 range, 10^22.6 cm^-2 was the lowest level of absorption for which the scatter of output NH about input NH was less than 0.5 dex. The total absorbed fraction in the identified XMM-CDFS sample is 0.39 ± 0.03; if the unidentified XMM-CDFS sources are assumed to lie at z = 2, then the absorbed fraction becomes 0.45 ± 0.03. These respective values can be considered as lower and upper bounds on the "true" total absorbed fraction. For comparison, in

Figure 12. As fig. 11, but with the AGN divided into "absorbed" and "unabsorbed" objects. Here we set the threshold for an AGN to be considered "absorbed" to be 10^22 cm^-2 for sources at z < 3, and 10^22.6 cm^-2 for sources at z > 3. In each panel the solid histogram shows the XMM-CDFS sources, and the curves show the distributions predicted by coupling the XLF model of Ueda et al. (2003) with the (log NH)^8 and U03 NH distribution models. The shaded areas in the lower two panels show the luminosity distribution of the optically unidentified sources in the XMM-CDFS sample if they are assumed to lie at z = 2.

Figure 13. The distribution of absorption in the optically identified XMM-CDFS sample (histogram) in comparison to the predicted distributions from the seven NH models (curves). The model NH distributions are for simulated populations generated according to the XLF model of Ueda et al. (2003). The shaded area shows the NH distribution of the optically unidentified sources if they are assumed to lie at z = 2. For clarity, the model distributions are displayed in two groups.

with these bounds. For each NH model, the populations generated using the Miyaji et al. (2000) XLF have a slightly higher absorbed fraction than for the Ueda et al. (2003) XLF. This is because the former XLF model predicts more objects at high redshifts (see fig. 11), where heavily absorbed AGN are selected against less strongly. Figure 14 shows the absorbed fraction as a function of both redshift and intrinsic luminosity compared to the distributions predicted from the Monte Carlo simulations of the NH models. We see later that the optically unidentified sources, which are not included in fig. 14, have on average harder X-ray colours than the identified sources, and so are likely to increase the absorbed fraction.
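The redshift dependent threshold and the absorbed fraction can be expressed compactly, as sketched below. The binomial form of the uncertainty is an assumption (the text does not state how the quoted errors were derived), sources at exactly z = 3 are arbitrarily assigned to the higher threshold, and the function names are hypothetical.

```python
import math

def is_absorbed(z, log_nh):
    """Apply the 10^22 cm^-2 (z < 3) or 10^22.6 cm^-2 (otherwise) threshold."""
    threshold = 22.0 if z < 3.0 else 22.6
    return log_nh >= threshold

def absorbed_fraction(sources):
    """sources: list of (z, log_nh). Returns (fraction, binomial error).

    The binomial error sqrt(f (1 - f) / n) is an ASSUMED error estimate.
    """
    n = len(sources)
    n_abs = sum(1 for z, log_nh in sources if is_absorbed(z, log_nh))
    f = n_abs / n
    return f, math.sqrt(f * (1.0 - f) / n)
```

The same function can be applied to subsets of the sample binned in redshift or luminosity to build a plot such as figure 14.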
For L2−10 < 10^43.5 erg s^-1 there is an apparent positive correlation between intrinsic luminosity and the absorbed fraction in the XMM-CDFS sample, as well as for each of the seven NH models. For the NH models which are not dependent on luminosity, this can only be due to the selection function of the XMM-Newton observations. For the luminosity dependent U03 NH model, we see a levelling out of the predicted absorbed fraction above L2−10 = 10^43.5 erg s^-1. This marks the transition from a regime in which the observed absorbed fraction is determined primarily by the XMM-CDFS selection function, to a regime in which the shape of the underlying NH distribution becomes more important. In table 3 we show the results of χ^2 tests which quantify how well the NH models reproduce the redshift and luminosity dependence of the absorbed fraction in the XMM-CDFS sample. The most stringent test of the AGN population models is to see how well they reproduce the distribution of XMM-CDFS sources simultaneously in absorption, luminosity and redshift space. Figure 15 shows the distribution in LX, z and NH of the XMM-CDFS sources and the predictions from three of the NH models. By using the three-dimensional Kolmogorov-Smirnov test (3D-KS), which requires no binning, we can make a statistical comparison which interrogates the maximum information content of the sample. However, as shown in figure 5, our absorption estimation method only weakly constrains NH for sources with very small absorbing columns. Therefore, in order to reduce the effect on the 3D-KS test of the large scatter at low absorption levels, all sources with absorption below a threshold of NH = 10^21 cm^-2 are taken to lie at this threshold. The conversion from the 3D-KS statistic to a probability is rather dependent on the size of the sample and the correlations within it (Fasano & Franceschini 1987).
This is important for the test we wish to apply here because of the strong correlation between z and L2−10 in the sample. Therefore, we have adapted the method described in Dwelly et al. (2005) in which the conversion from the 3D-KS statistic to a probability is calculated numerically for the actual correlations, and real number of sources in the tested data set. The results of the 3D-KS tests are shown in table 3. We see that only the GSH01A NH model is able to reproduce the distribution of the XMM-CDFS sample in NH , z, L2−10 space with better than 1% probability.
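The idea of calibrating a test statistic numerically, for the real sample size and the correlations present in the model population, can be illustrated generically as follows. This is not the procedure of Dwelly et al. (2005) itself: the function names are hypothetical, the caller supplies the statistic (for example a 3D-KS distance), and the number of trials is arbitrary.

```python
import random

def clamp_log_nh(log_nh, floor=21.0):
    """Place sources with log10(NH) below the threshold at the threshold."""
    return max(log_nh, floor)

def monte_carlo_probability(observed, model, statistic, n_trials=200, seed=0):
    """Fraction of model-drawn samples at least as discrepant as the data.

    observed  : list of data points (e.g. (log_nh, z, log_lx) tuples)
    model     : larger list of simulated points from the model population
    statistic : callable(sample, model) -> float, supplied by the caller
    Drawing fake samples of the same size from the model itself preserves
    the correlations (e.g. between z and L2-10) in the calibration.
    """
    rng = random.Random(seed)
    d_obs = statistic(observed, model)
    n = len(observed)
    exceed = 0
    for _ in range(n_trials):
        fake = rng.sample(model, n)
        if statistic(fake, model) >= d_obs:
            exceed += 1
    return exceed / n_trials
```

A small returned value indicates that samples drawn from the model rarely differ from it as much as the observed sample does.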

DISCUSSION
We have carried out statistical comparisons of the absorption distribution found in the XMM-CDFS sample to the absorption distributions predicted by a number of NH models. A simple measure of the relative importance of absorbed and unabsorbed AGN over a range of redshifts and luminosities is found by measuring the fraction of AGN which are significantly absorbed. This tells us about the relative numbers of absorbed and unabsorbed AGN, and the redshift/luminosity dependence of this distribution. In addition, the high fidelity of our NH estimation process means that an analysis of the shape of the recovered absorption distribution in the XMM-CDFS sample is informative. This is important for XRB synthesis models such as that of Treister & Urry (2005) in which the shape of the NH distribution is closely related to the geometry of some "typical" absorbing torus.

Ability of AGN population models to reproduce the absorption distribution in the XMM-CDFS sample
The total NH distribution of the XMM-CDFS sample (see fig. 13) reveals that there is a wide range of absorbing columns present in the AGN population. The NH models we have tested reproduce the observed distribution with varying degrees of success (see table 3). In section 5.4, the GSH01A NH model was seen to provide the best match to the shape of the observed NH histogram. It is remarkable that the distribution of absorption in local Seyfert-2 galaxies (on which the GSH01A model is based) provides a good match to a sample which reaches to QSO luminosities and to z = 3.7. However, this test only compares the total NH distribution. So if the underlying NH distribution of AGN is actually dependent on redshift and/or luminosity, then the total observed NH distribution will depend on where in redshift/luminosity space the sample lies. Therefore, in section 5.5 we divided the XMM-CDFS sample into several bins, firstly in luminosity, and then in redshift, and tested how well the NH models matched the observed absorbed fraction in each bin. Because it examines the fraction of absorbed sources, this comparison should not depend strongly on differences between the total redshift/luminosity distributions seen in the XMM-CDFS sample and predicted by the population models. We found that there was a marked contrast in the ability of the different NH models to reproduce the redshift and luminosity dependence of the absorbed fraction measured in the XMM-CDFS sample. The (log NH)^2, U03, GSH01B, and T04 NH models are all unable to reproduce the pattern seen in the XMM-CDFS sample. However, the (log NH)^8 and GSH01A NH models provide statistically adequate fits (probabilities of order 0.20) to both the redshift and luminosity dependence of the absorbed fraction in the XMM-CDFS. The (log NH)^5 model also provides a statistically adequate match, but less well than the latter two models.
The (log NH)^2 and T04 NH models are both poor descriptions of the AGN in the XMM-CDFS sample, and are rejected with high confidence in all of the statistical tests. The former significantly under-predicts the number of absorbed AGN, and the latter significantly over-predicts the number.
The strong downturn in the absorbed fraction at high luminosities predicted by the U03 NH model is not seen in the XMM-CDFS sample. In addition, the U03 NH model predicts a total absorbed fraction of 30.1% compared to 38% seen in the XMM-CDFS sample. This implies that if indeed there are relatively fewer absorbed AGN at high luminosities, then the downturn can only be important at higher luminosities (> 10^45 erg s^-1) than are covered by this sample.
We do not see evidence for the increase in the absorbed fraction from z = 0 to z = 1.3 predicted by the GSH01B model. There is a suggestion that the absorbed fraction in the XMM-CDFS sample does increase at much higher redshifts (z > 3). However, as there are only seven XMM-CDFS sources in this redshift range, no definitive conclusion can be drawn from this dataset alone. At such high redshifts it is difficult to measure even large absorbing columns (NH around 10^23 cm^-2) because most of the effects of the absorption are shifted out of the XMM-Newton-EPIC bandpass. Five of the XMM-CDFS objects at z > 3 have been spectroscopically identified by Szokoly et al. (2004); two broad-line AGN (BLAGN), and three "high excitation line" galaxies (HEX). The optical classifications of the high redshift objects tally with the X-ray determinations of their properties; the BLAGN have X-ray colours consistent with little or no absorption, but the three HEX objects are heavily absorbed. The two other XMM-CDFS objects at z > 3 have only photometrically determined redshifts; their X-ray colours indicate they both have significant absorption.

Of the seven NH models examined in this study, the two which incorporate some redshift or luminosity dependence are both strongly rejected. The luminosity dependent U03 NH model predicts that there are few "type-2" QSOs, whereas the redshift dependent GSH01B NH model suggests that nearly all of the accretion power at z > 1.3 is obscured. Neither of these scenarios fits the pattern seen in the XMM-CDFS sources, which is much better described by models in which a similar distribution of absorption is found in the AGN population at all redshifts and luminosities (namely the (log NH)^8 and GSH01A NH models).

Table 3. Statistical comparison of the XMM-CDFS sample with the predictions of the AGN population models. Columns 2 and 3 show the results of two χ^2 tests of the ability of the simulated NH models to reproduce the redshift dependence, and the luminosity dependence of the absorbed fraction measured in the XMM-CDFS sample. The redshift test is computed using the 0 < z < 1, 1 < z < 2, 2 < z < 3, and 3 < z < 4 bins, and the luminosity test is calculated using the 10^40 < L2−10 < 10^42, 10^42 < L2−10 < 10^43, 10^43 < L2−10 < 10^44, and 10^44 < L2−10 < 10^45 erg s^-1 bins. Column 4 shows the Kolmogorov-Smirnov (KS) probabilities that the NH distribution of the XMM-CDFS sample and the NH distribution predicted by each model population follow the same underlying distribution. Column 5 shows the three dimensional KS test probability that the distributions of the sample and model sources in NH, z and L2−10 space follow the same underlying distribution.

Figure 16. The X-ray hardness ratio distribution of the XMM-CDFS sample. The optically identified sources are shown with triangles. Sources without optical identifications are marked with circles. The two sources which do not match any objects in the simulated library are highlighted with boxes. For clarity, we show the median hardness ratio errors for the whole sample rather than the errors for each source. The solid line is the path in hardness ratio space for a model AGN lying at z = 1, with Γ = 1.9 and with absorbing columns ranging from zero to 10^24 cm^-2 (graduations are marked, and labelled where space permits, for absorbing columns of log NH = 24, 23.5, 23, 22.5, 22, 21.5, 21 and 19). The hardness ratios, HR1 and HR2, are defined in the text.

The XMM-CDFS sources without redshift determinations
There are 50 XMM-CDFS sources for which we do not have a spectroscopic or photometric redshift. Here we discuss the nature of these X-ray detections. For "type-1" quasars at high redshift, optical identification is made relatively easy by prominent, broad emission lines in the rest frame UV. However, for absorbed AGN, in which the optical spectrum is primarily that of the host galaxy, determination of redshifts is much more difficult. In particular, the so called "photo-z desert" ( 1.5 < z < 2.5) occurs where typical galaxy spectra have no easily identifiable spectroscopic features in the observed optical band. However, with the addition of NIR data, this problem can be attenuated. For example, by utilising deep NIR photometry, Mainieri et al. (2005b) were able to photometrically identify faint (R > 25) counterparts to 1Ms Chandra sources in the CDFS sample. Indeed, these authors showed that these faint objects lie on average at higher redshift than the optically brighter counterparts. The unidentified objects in our sample are nearly all optically faint; 43/50 of the unidentified sources have R > 24.5 (see fig. 4). The three optically brightest unidentified sources do not have COMBO-17 redshift estimates because they lie close to bright stars. It is reasonable to assume that most of the unidentified sources have no redshift estimates because they lie at z > 1.5, that is, they are beyond the upper redshift limit for galaxies in the COMBO-17 survey (where the 4000Å break has left the reddest band). It is difficult to allow for such a bias against high redshift objects being identified in our XMM-CDFS sample, because the relationship between X-ray and optical properties of absorbed and unabsorbed sources is far from clear cut. The effect is mitigated by the high completeness (84%) of optical identification in our sample. However, figure 16 shows that on average, the unidentified sources have harder X-ray spectra than the identified sources. 
For example, the median HR1 value is 0.44 for the optically identified extragalactic sources, whereas it is 0.70 for the unidentified sources. We have tested the significance of this difference by making a two-dimensional Kolmogorov-Smirnov comparison of the distributions of the identified and unidentified sources in (HR1, HR2) space. The probability that the unidentified and identified extragalactic sources follow the same underlying distribution in (HR1, HR2) is 6 × 10^-5.
As an experiment, we assign the unidentified XMM-CDFS sources a nominal redshift of z = 2 (approximately the middle of the "photo-z desert"), and then use the NH/LX estimation technique in the same way as for the identified sources. We find that 38/50 of the objects have NH > 10^22 cm^-2, and their median intrinsic luminosity is 10^44.0 erg s^-1. Figs. 11 and 13 show the resulting luminosity and NH distributions in the XMM-CDFS sample if we make this assumption. The number of XMM-CDFS sources in the NH > 10^22 cm^-2, L2-10 > 10^44 erg s^-1 regime (a common definition for a "type-2" QSO) is doubled from 23 to 46 if the unidentified objects are placed at z = 2. If in fact the unidentified sources lie at an average redshift greater than 2, then the numbers of absorbed sources and their median luminosity will be higher. With the addition of these luminous, high redshift AGN, the observed redshift/luminosity distribution is closer to that predicted by the XLF of Miyaji et al. (2000). However, the unidentified sources cannot fully account for the numbers of high luminosity, high redshift AGN predicted by the XLF model of Miyaji et al. (2000).

The luminous absorbed AGN population
Rather few luminous absorbed (type-2) QSOs have been found in X-ray surveys to date. There are certainly fewer than predicted by XRB synthesis models in which a large fraction of the XRB is made up of luminous but highly absorbed AGN at redshifts 1.5-2.5 (Gilli, Salvati & Hasinger 2001; Comastri et al. 1995). In response to the lack of type-2 QSOs, models have been devised in which the fraction of AGN with significant absorption declines with increasing luminosity (Barger et al. 2005; Ueda et al. 2003). The physical interpretation is that highly luminous QSOs have the power to remove a significant fraction of the surrounding material, effectively increasing the opening angle of any circumnuclear structure, a so-called "receding torus" model (e.g. Lawrence 1991). The X-ray samples detected in recent Chandra pencil-beam studies do contain large numbers of absorbed AGN, but they lie at lower redshifts, and have lower luminosities than the peak in type-1 QSO activity (LX ∼ 10^44 erg s^-1, z ∼ 1.5-2). It is this low redshift, low luminosity absorbed population which is postulated to take the place of the type-2 QSOs in making up the hard spectrum of the XRB. These findings, if they are taken at face value, raise important questions about the cosmic history of the AGN population. The implication is that, with soft X-ray and optical surveys, we have already detected the majority of intrinsically luminous accretion powered objects in the form of type-1 QSOs. In this scheme, because the absorbed population lies at low redshifts and luminosities, the total energy output that is powered by accretion, integrated over cosmic timescales, is significantly reduced from the predictions of the simplest "unified" schemes.
A few rare examples of type-2 QSOs, selected in X-ray and/or optical surveys, have been reported in the last few years (e.g. Norman et al. 2002; Della Ceca et al. 2003; Gandhi et al. 2004; Mainieri et al. 2005a; Ptak et al. 2006). In fact, two of these, the Norman et al. (2002) and Mainieri et al. (2005a) objects, appear in our XMM-CDFS sample. The type-2 QSOs appear to be far less numerous than the large population predicted by, say, the Gilli, Salvati & Hasinger (2001) XRB synthesis model. However, recent mid-infrared surveys with Spitzer have started to reveal the existence of significant numbers of type-2 QSOs. Surveys in the mid-infrared are sensitive to emission originating in dusty material that is heated by a compact heat source, i.e. a powerful AGN. Martinez-Sansigre et al. (2005) have demonstrated that type-2 QSOs at z > 2 can be selected efficiently by choosing objects bright at 24µm, but faint at near-infrared and radio wavelengths. Using these criteria, they identified a population of luminous obscured QSOs between 1 and 3 times as numerous as the population of luminous type-1 QSOs found by optical surveys (e.g. Croom et al. 2004). The absorbed fraction of 50-75% found by Martinez-Sansigre et al. (2005) is comparable to the absorbed fraction at high luminosities of ∼ 75% predicted by the two NH models (namely the (log NH)^8 and GSH01A NH models) which best match the pattern of absorption in the XMM-CDFS sample.
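The step from a number ratio to an absorbed fraction in the comparison above can be written out explicitly. The symbols r (ratio of obscured to unobscured QSOs) and f_abs (absorbed fraction) are introduced here purely for illustration and are not notation used elsewhere in the text:

```latex
f_{\mathrm{abs}} = \frac{r}{1 + r}, \qquad
r = 1 \;\Rightarrow\; f_{\mathrm{abs}} = 0.50, \qquad
r = 3 \;\Rightarrow\; f_{\mathrm{abs}} = 0.75 .
```

A ratio of 1-3 therefore corresponds to the quoted absorbed fraction of 50-75%, and the upper end coincides with the ∼ 75% of the best matching NH models.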

AGN with complex X-ray spectra
For the purposes of this study, we have considered only a rather simple AGN spectral model: an absorbed or unabsorbed power law (having a range of intrinsic spectral slopes) with a component of reflected radiation. However, high signal-to-noise X-ray spectra of brighter samples (e.g. Piconcelli et al. 2003; Mateos et al. 2005a,b; Page et al. 2006) have revealed that a small fraction of AGN have ionised rather than neutral absorbers, and that many AGN have additional spectral components. In particular, in the sample of Piconcelli et al. (2003), the spectral fits to ∼ 35% of the absorbed AGN were improved by the addition of an extra soft component. The most common hypotheses for the origin of this soft component are that it is either due to strong star formation activity in the host galaxy, or that it is reprocessed emission from the AGN itself.
It is possible that soft X-ray emission powered by intense star formation in the host galaxy could contribute to the spectra of some AGN, causing us to underestimate their absorbing columns. The single most X-ray luminous starburst galaxy in the local Universe, NGC3256, has a 0.5-10 keV luminosity of ∼ 5 × 10^41 erg s^-1 in our assumed cosmology (Lira et al. 2002). If similarly powerful starbursts are common in AGN host galaxies, then we will expect to underestimate NH for some AGN. However, in the XMM-CDFS sample, nearly all (85%) of the optically identified sources have observed 0.5-10 keV fluxes more than 10 times the flux expected from NGC3256 if it were placed at the same redshift (K corrected assuming Γ = 2.5). Any contribution to the X-ray colours from star formation is expected to be dwarfed by the primary AGN component in these luminous objects.
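The comparison with NGC3256 above amounts to redshifting a fixed band luminosity with a power-law K correction. The following is a minimal sketch of that arithmetic, not the code used for the paper. The flat cosmology (H0 = 70 km/s/Mpc, Ωm = 0.3, ΩΛ = 0.7) is an assumption made only for this example, since the adopted cosmology is not restated in this section, and all function names are illustrative.

```python
import math

# ASSUMPTION: flat Lambda-CDM parameters chosen for illustration only.
H0_KM_S_MPC = 70.0
OMEGA_M = 0.3
OMEGA_L = 0.7

C_KM_S = 299792.458      # speed of light in km/s
CM_PER_MPC = 3.0857e24   # centimetres per megaparsec


def luminosity_distance_cm(z, steps=2000):
    """Luminosity distance in cm for a flat universe (trapezoidal rule)."""
    if z <= 0:
        return 0.0
    dz = z / steps
    total = 0.0
    for i in range(steps + 1):
        zi = i * dz
        inv_e = 1.0 / math.sqrt(OMEGA_M * (1.0 + zi) ** 3 + OMEGA_L)
        total += inv_e * (0.5 if i in (0, steps) else 1.0)
    comoving_mpc = (C_KM_S / H0_KM_S_MPC) * total * dz
    return (1.0 + z) * comoving_mpc * CM_PER_MPC


def observed_band_flux(l_band, z, gamma):
    """Observed flux in a band for a source whose rest-frame luminosity in
    the same band is l_band, assuming a power law of photon index gamma."""
    k_corr = (1.0 + z) ** (2.0 - gamma)
    return l_band * k_corr / (4.0 * math.pi * luminosity_distance_cm(z) ** 2)


L_NGC3256 = 5e41        # erg/s, 0.5-10 keV luminosity quoted in the text
GAMMA_STARBURST = 2.5   # photon index used for the K correction in the text


def starburst_flux(z):
    """Expected 0.5-10 keV flux (erg/s/cm^2) of an NGC3256-like starburst."""
    return observed_band_flux(L_NGC3256, z, GAMMA_STARBURST)


for z in (0.5, 1.0, 2.0):
    print(f"z = {z}: expected starburst flux = {starburst_flux(z):.2e}")
```

An AGN at redshift z would then be counted as dominated by its nuclear emission, in the sense used above, if its observed 0.5-10 keV flux exceeds 10 × `starburst_flux(z)`.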
X-ray emission from AGN can reach the observer indirectly via reprocessing in a body of photo-ionised plasma which extends well beyond the obscuring "torus" (e.g. Turner et al. 1997). For AGN where much of the direct X-ray continuum is absorbed, the reprocessed emission may constitute a large fraction of the observed soft X-ray flux. For example, in the archetypal Seyfert-2 galaxy, NGC1068, virtually all of the soft X-ray flux can be attributed to reprocessed emission from the AGN (Kinkhabwala et al. 2002). However, the intensity of the scattered component is generally only a small fraction of the primary power law component (Turner et al. 1997; Page et al. 2006). So unless the AGN is very heavily absorbed (like NGC1068), the direct power law component will still dominate the observed spectrum.
We do include a reflection component in our spectral model. This has the effect of hardening the spectrum at high photon energies (E > 5 keV). Therefore, the addition of a reflection component to the spectral model reduces the absorbing column that is required to reproduce the X-ray spectrum of an absorbed AGN. For example, consider an AGN at high redshift (where the additional reflection component becomes more important), and let us assume that for this AGN we measure a hardness ratio between the 2-5 and 5-10 keV bands of HR3 = −0.5. For a simple power-law model with slope Γ = 1.9, HR3 = −0.5 corresponds to an absorbing column of 10^23.39 cm^-2 at z = 2. In comparison, a slightly smaller column of 10^23.17 cm^-2 is required to get the same HR3 value with the power-law plus reflection spectral model. Our NH/LX estimation technique uses information from all three HR measures, including HR1 and HR2, where the addition of a reflection component in the model spectrum typically makes only a small difference. Thus we do not expect the results of this study to be strongly affected by our particular choice of spectral model.
In summary, we expect these extra spectral components to have only a small influence on our analysis of the NH function in the XMM-CDFS sample. Although additional soft components may be a common feature in absorbed AGN, their amplitudes are small in comparison to the primary power law component. The net effect will be that we underestimate the absorbing columns of some AGN.

Why have other X-ray surveys arrived at different conclusions?
Many authors have reported a lack of X-ray selected absorbed AGN at high redshifts and with high luminosities, and have developed luminosity dependent absorption schemes to explain this phenomenon (e.g. Franceschini et al. 2002;Steffen et al. 2003;Ueda et al. 2003;Barger et al. 2005;La Franca et al. 2005;Lamastra, Perola & Matt 2006). In the following subsections we discuss possible reasons why these studies have arrived at different conclusions to our own. Perhaps because of its excellent point source sensitivity, the majority of recent deep X-ray survey studies have relied on data from Chandra. The AGN population studies which have used XMM-Newton data have typically been limited to relatively bright fluxes (e.g. Piconcelli et al. 2003;Caccianiga et al. 2004;Della Ceca et al. 2004;Mateos et al. 2005a). Therefore here we pay particular attention to the differences between the capabilities of Chandra observations with respect to the XMM-Newton data used in this work.

Spectroscopic incompleteness
The 2-8 keV band luminosity function derived by Barger et al. (2005) relies at its faint end on an X-ray sample which is only 50-60% spectroscopically identified. Photometric redshifts raise their identified fraction, but will systematically miss objects in the 1.5-2.5 redshift interval. We have shown that the 50 sources without redshifts in our sample have harder than average colours, and therefore could be intrinsically luminous but heavily absorbed QSOs. It is possible therefore that the Barger et al. (2005) sample is underestimating the size of the population of such objects. This effect could explain the lack of objects in their sample lying at z > 1, and having observed 2-8 keV luminosities below 10^44 erg s^-1.

X-ray selection function
The faintest sources in the sample of Ueda et al. (2003) are taken from the first 1Ms observations of the Chandra Deep Field North. A 2-8 keV flux limit of 3.0 × 10^-15 erg s^-1 cm^-2 was applied to define the Ueda et al. (2003) sample (much shallower than the limit of the Chandra data). Many of the high redshift, absorbed sources in our sample would not have been selected with this criterion. Of the XMM-CDFS sources with z > 1 and NH > 10^22 cm^-2, most (29/52) have 2-8 keV fluxes below this limit (calculated from their 2-5 keV fluxes, assuming a power law slope of Γ = 1.4). Therefore it is not surprising that by extrapolating the XLF and NH model of Ueda et al. (2003) to fainter flux limits, we are unable to reproduce fully the XMM-CDFS sample.
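The conversion from 2-5 keV to 2-8 keV flux mentioned above follows directly from integrating a power law over each band. A minimal sketch is given below, assuming a simple unabsorbed power law in the observed frame and ignoring any instrumental response; the function names are illustrative and are not taken from the original analysis.

```python
import math


def band_energy_integral(e_lo, e_hi, gamma):
    """Integral of E * E**(-gamma) dE from e_lo to e_hi (keV), i.e. the
    energy flux of a power law with photon index gamma, up to a constant."""
    if abs(gamma - 2.0) < 1e-12:
        return math.log(e_hi / e_lo)
    p = 2.0 - gamma
    return (e_hi ** p - e_lo ** p) / p


def convert_band_flux(flux, band_from, band_to, gamma):
    """Rescale a flux measured in band_from to the flux expected in band_to."""
    return (flux * band_energy_integral(band_to[0], band_to[1], gamma)
            / band_energy_integral(band_from[0], band_from[1], gamma))


GAMMA = 1.4                 # power law slope assumed in the text
UEDA_LIMIT_2_8 = 3.0e-15    # erg/s/cm^2, 2-8 keV flux limit quoted in the text

# Factor by which a 2-5 keV flux is multiplied to estimate the 2-8 keV flux.
factor = convert_band_flux(1.0, (2.0, 5.0), (2.0, 8.0), GAMMA)

# Equivalent 2-5 keV flux below which a source falls under the 2-8 keV limit.
limit_2_5 = UEDA_LIMIT_2_8 / factor

print(f"2-5 -> 2-8 keV factor: {factor:.3f}")
print(f"equivalent 2-5 keV limit: {limit_2_5:.2e} erg/s/cm^2")
```

Under these assumptions the factor is about 1.77, so the 2-8 keV limit corresponds to a 2-5 keV flux of roughly 1.7 × 10^-15 erg s^-1 cm^-2.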

The broad energy range of XMM-Newton EPIC
The combined EPIC detectors have an effective area (mirror plus detector quantum efficiency) of more than 1000 cm^2 at 7 keV compared to only ∼100 cm^2 for ACIS-I. The additional sensitivity of EPIC compared to ACIS-I at hard energies has two important effects for surveys of heavily absorbed AGN. Firstly, absorbed AGN are detectable with EPIC because of its sensitivity to the high-energy unabsorbed part of their spectrum. Secondly, the wide spectral range of EPIC provides better constraints on the spectral shape of sources, and hence allows a better measurement of absorbing column and intrinsic luminosity. XMM-Newton EPIC is also usefully sensitive over a much broader energy range (0.2-10 keV) than Chandra ACIS-I (0.7-7 keV). Section 4.4 describes how we took advantage of the soft X-ray sensitivity of EPIC by including data from the 0.2-0.5 keV band in our NH estimation technique. At low redshifts, we are sensitive to columns of NH > 10^21.1 cm^-2. More importantly, for redshifts up to ∼ 3, we can reliably detect columns of NH > 10^22 cm^-2, which is the traditional dividing line between absorbed and unabsorbed AGN (see section 4.6.1). However, X-ray classification schemes that are based on Chandra hardness ratios may miss considerable absorption in many high-z AGN, where even substantial absorbing columns can be shifted out of the ACIS-I sensitivity range. For example, the X-ray classification scheme of Zheng et al. (2004) uses a fixed Chandra hardness ratio as a dividing line between "type-1" and "type-2" AGN/QSOs. We find that there are nine high redshift (z > 2) absorbed (NH > 10^22 cm^-2) sources in the XMM-CDFS which are classified as "type-1" AGN/QSOs by Zheng et al. (2004). The Chandra XID numbers (see Giacconi et al. 2002) of these sources are 6, 62, 85, 122, 159, 179, 225, 506 and 517.
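The point about absorption being shifted out of the ACIS-I range can be illustrated by mapping each instrument's observed band onto rest-frame energies. This is a minimal sketch using only the band edges quoted above; the function name is illustrative, and no detector response or absorption model is included.

```python
def rest_frame_band(obs_band_kev, z):
    """Rest-frame energy range (keV) sampled by an observed-frame band at
    redshift z, using E_rest = E_obs * (1 + z)."""
    lo, hi = obs_band_kev
    return (lo * (1.0 + z), hi * (1.0 + z))


EPIC_BAND = (0.2, 10.0)    # keV, XMM-Newton EPIC range quoted in the text
ACIS_I_BAND = (0.7, 7.0)   # keV, Chandra ACIS-I range quoted in the text

for z in (0.0, 1.0, 2.0, 3.0):
    epic = rest_frame_band(EPIC_BAND, z)
    acis = rest_frame_band(ACIS_I_BAND, z)
    print(f"z = {z:.0f}: EPIC {epic[0]:.1f}-{epic[1]:.1f} keV, "
          f"ACIS-I {acis[0]:.1f}-{acis[1]:.1f} keV (rest frame)")
```

At z = 2 the softest rest-frame energy sampled is 0.6 keV for EPIC but 2.1 keV for ACIS-I. Because photoelectric absorption mainly suppresses the low-energy part of the spectrum, the lower starting energy leaves EPIC with more leverage on moderate columns at high redshift.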

The effects of large scale clustering
The redshift distribution of X-ray sources in the 1Ms Chandra catalogue in the CDFS is dominated by narrow over-densities at z = 0.67, 0.73 containing 38 sources (Gilli et al. 2003). These over-densities are also seen in our sample with at least 26 XMM-CDFS sources lying in these redshift spikes. We note that large redshift spikes at z ≲ 1 also appear in the 2Ms Chandra Deep Field North catalogue (Gilli et al. 2005). It is not yet clear whether such clustering of AGN at z ≲ 1 is a ubiquitous feature of the Universe, and therefore common over the whole sky. Despite the enhancements at low redshifts, the CDFS field has a total sky density of X-ray sources somewhat lower than other deep X-ray fields (Rosati et al. 2002; Manners et al. 2003; Nandra et al. 2004; Loaring et al. 2005). This suggests that the CDFS is under-dense at higher redshifts. As described in section 4.8, for each of the simulated model AGN populations the normalisation of the XLF model was adjusted so that the simulated source counts matched the observed integral extragalactic source counts above a 0.5-2 keV flux of 2 × 10^-15 erg s^-1 cm^-2. We found that an XLF normalisation 0.7 times that given in Ueda et al. (2003) is required in order for the simulated source counts predicted by their full population model (XLF and U03 NH model) to match the XMM-CDFS extragalactic source counts.
In figure 17 we compare the redshift versus B magnitude distribution of our sample with the distribution of the sources in the 1Ms CDFS catalogue. We see that the CDFS has rather few of the optically bright, high redshift objects typically detected in optically selected quasar surveys (e.g. at z > 2 in the XMM-CDFS sample, and none in the region covered by the Chandra 1Ms observations. The predicted numbers of such objects from the optical QSO luminosity function of Croom et al. (2004) are 3.1 in the ∼ 0.18 deg^2 covered by the XMM-Newton observations, and 1.9 in the ∼ 0.11 deg^2 of the 1Ms Chandra coverage. Given this mean sky density, there is an 82% probability that we would observe more than one BAB < 21 quasar at z > 2 in the XMM-CDFS. This is again consistent with the XMM-CDFS field being an under-dense region of the sky at high redshifts. Another way to investigate whether the CDFS is an under-dense region of the sky is to compare the integrated emission from resolved X-ray sources to that found in other deep fields, and to the mean intensity of the extragalactic X-ray background. We calculate the contribution to the XRB for just the sources detected in the inner 10′ of the XMM-CDFS field, where the XMM-Newton observations are most sensitive. Note that we have excluded from our sample X-ray sources associated with Galactic stars and groups/clusters of galaxies (see section 3.2).
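The 82% probability quoted above for BAB < 21 quasars at z > 2 is what Poisson statistics give for the predicted mean of 3.1 objects. The sketch below assumes the counts are Poisson distributed, which the text does not state explicitly, and the function names are illustrative.

```python
import math


def poisson_pmf(k, mu):
    """Probability of exactly k events for a Poisson mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def prob_more_than(n, mu):
    """Probability of observing strictly more than n events."""
    return 1.0 - sum(poisson_pmf(k, mu) for k in range(n + 1))


MEAN_XMM = 3.1   # predicted number of quasars in the XMM-Newton area (from the text)

p_xmm = prob_more_than(1, MEAN_XMM)
print(f"P(N > 1 | mean = {MEAN_XMM}) = {p_xmm:.3f}")
```

With a mean of 3.1 this gives a probability of about 0.82 of finding more than one such quasar, in agreement with the figure quoted in the text.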
Following Worsley et al. (2004), we use the De Luca & Molendi (2004) 2-10 keV XRB model to compute the expected extragalactic XRB intensity in the 0.5-2, 2-5, and 5-10 keV bands. De Luca & Molendi (2004) found that in the 2-10 keV band, the extragalactic XRB is well described by a power law with slope 1.41 ± 0.06, and total intensity of (2.24 ± 0.16) × 10^-11 erg s^-1 cm^-2 deg^-2, in close agreement with the result of Lumb et al. (2002). We do not consider the 0.2-0.5 keV band here because the extragalactic XRB intensity at these energies is poorly constrained.
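The band intensities described above can be obtained by rescaling the 2-10 keV intensity with the power-law band integrals. The sketch below assumes that the power law with photon index 1.41 is simply extrapolated unchanged down to 0.5 keV. That assumption is made here for illustration, and the values it produces are not taken from table 4.

```python
XRB_SLOPE = 1.41       # photon index of the XRB power law (from the text)
XRB_I_2_10 = 2.24e-11  # erg/s/cm^2/deg^2, 2-10 keV intensity (from the text)


def power_law_band_weight(e_lo, e_hi, gamma=XRB_SLOPE):
    """Energy flux of a power law between e_lo and e_hi (keV), up to a
    constant: the integral of E**(1 - gamma) dE. Valid for gamma != 2."""
    p = 2.0 - gamma
    return (e_hi ** p - e_lo ** p) / p


def xrb_intensity(e_lo, e_hi):
    """Expected XRB intensity in a band, scaled from the 2-10 keV value."""
    return (XRB_I_2_10 * power_law_band_weight(e_lo, e_hi)
            / power_law_band_weight(2.0, 10.0))


for band in ((0.5, 2.0), (2.0, 5.0), (5.0, 10.0)):
    print(f"{band[0]}-{band[1]} keV: {xrb_intensity(*band):.2e} erg/s/cm^2/deg^2")
```

By construction the 2-5 and 5-10 keV intensities sum to the 2-10 keV value, which is a quick consistency check on the scaling.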
The integrated flux of the XMM-CDFS sources and the mean extragalactic XRB intensity are shown in table 4.
Table 4. The contribution of the XMM-CDFS sources to the intensity of the extragalactic XRB. I_XRB is the extragalactic XRB intensity in units of 10^-12 erg s^-1 cm^-2 deg^-2 (see text), ΣS_i/A is the summed flux from the individual XMM-CDFS sources, divided by the geometric area, in units of 10^-12 erg s^-1 cm^-2 deg^-2, and "frac" is the fraction of I_XRB that this constitutes.
For comparison, Worsley et al. (2004) find that the integrated fluxes from their sources are equivalent to 90 ± 6%, 73 ± 7%, 53 ± 7% and 42 ± 7% of the extragalactic XRB intensity in the 0.5-2, 2-4.5, 4.5-7.5 and 7.5-12 keV bands respectively. Although the XMM-Newton observations in the Lockman Hole are somewhat deeper than in the CDFS (680 ks of good pn time compared to 340 ks), the flux limits of the Worsley et al. (2004) sample are similar to those of the XMM-CDFS sample. More importantly, Worsley et al. (2004) calculate their countrate-to-flux conversion factor assuming a power law spectral model with slope Γ = 1.4, rather harder than the Γ = 1.7 used here. Converting from countrate to flux using Γ = 1.4 would change our calculated fluxes by factors of 1.092, 0.978 and 0.962 in the 0.5-2, 2-5 and 5-10 keV bands respectively. Even using Γ = 1.4, the integrated 0.5-2 keV flux of sources in the XMM-CDFS amounts to only ∼ 75% of that reported for the Lockman Hole by Worsley et al. (2004). This result is consistent with the Worsley et al. (2005) study which showed that the integrated 0.5-2 keV emission from sources in the 1Ms Chandra CDFS was significantly lower than that found in the CDFN and Lockman Hole. We also note that the 28 XMM-CDFS objects in the redshift spikes at z = 0.67, 0.73 account for a significant part of the summed flux from the XMM-CDFS sample: around 20% of the summed 0.5-10 keV flux of sources within 10′ of the aim point.
These findings are consistent with the hypothesis that the CDFS is a relatively under-dense region of the sky.
It is important to point out that in our statistical comparison of the models and the XMM-CDFS sample in section 5.5, we first grouped the AGN into a number of bins in redshift/luminosity, and then calculated the fraction (rather than the absolute number) of AGN with significant absorption in each bin. Therefore, even if the AGN in the CDFS field do have a distribution in redshift/luminosity space which is unrepresentative of the Universe at large, we still expect our findings about the AGN absorption distribution to be valid.
Figure 18. Comparison of the NH (left) and luminosity (right) measurements for the sources which appear in both the XMM-CDFS sample and the Tozzi et al. (2006) sample, and for which we agree a redshift. The sources determined to have zero NH by Tozzi et al. (2006) are plotted at NH = 10^19.1 cm^-2. Sources treated by Tozzi et al. (2006) as "Compton Thick", and so fitted with a pure reflection spectrum, are marked with open boxes. Sources determined to have an additional soft component by Tozzi et al. (2006) are marked with triangles. The lower two panels show the equivalent histograms of absorption (left) and luminosity (right) for the sources common to both samples. The sources determined to have zero NH by Tozzi et al. (2006) are placed in the leftmost NH bin.
1Ms Chandra imaging of the CDFS. Here we briefly compare the results of the latter study with our own. After careful correction for selection effects, Tozzi et al. (2006) find no evidence for a correlation between absorption and intrinsic luminosity, in agreement with our findings. The intrinsic (that is, before selection effects) NH distribution derived by Tozzi et al. (2006) is a log-normal distribution broadly similar (at least below 10^24 cm^-2) to the T04 model. However, the T04 model is strongly rejected as a good description of the XMM-CDFS population (see table 3), primarily because it underpredicts the number of unabsorbed sources in the XMM-CDFS and predicts too many sources with NH ∼ 10^22 cm^-2 (see fig. 13).
We now compare the absorption and luminosity measurements for the sources common to the XMM-CDFS sample and the Tozzi et al. (2006) study. Redshifts for some of the Tozzi et al. (2006) sources are taken from Zheng et al. (2004), several of which we have determined to have incorrect optical counterparts. Therefore, for the purposes of this comparison, we consider only 142 sources out of the 158 sources which appear in both the XMM-CDFS and the Tozzi et al. (2006) samples. In the left hand panels of figure 18 we compare the NH measurements of these 142 AGN. It can be seen that there are some differences in the absorbing columns determined by the two studies. The most marked difference is for XMM-CDFS sources which we determine to have effectively zero absorption (NH < 10^21 cm^-2), but which are found to have a high (NH > 10^22 cm^-2) absorbing column by Tozzi et al. (2006). This effect is apparent in both the NH scatter plot and the NH histogram, with the Tozzi et al. (2006) study finding ∼ 2 times as many sources in the 10^22 < NH < 10^23 cm^-2 bin. The Tozzi et al. (2006) spectral fitting has been carried out over the 0.7-7 keV ACIS-I sensitivity range. As discussed by Tozzi et al. (2006), the limited soft energy range of the Chandra data sometimes means that unabsorbed sources are spuriously scattered into the high absorption (NH > 10^22 cm^-2) regime. This effect could explain the differences between the NH distributions measured for objects in common to XMM-CDFS and the Tozzi et al. (2006) study.
The luminosity measurements agree very well, as demonstrated by the right hand panels of figure 18. The exceptions are the few sources which Tozzi et al. (2006) have chosen to fit with a pure reflection model. For such objects, the luminosities we determine are typically lower than those determined by Tozzi et al. (2006).

Obscured black hole growth over cosmic timescales
The absorption distribution of AGN in the XMM-CDFS sample is best matched by population models in which obscured AGN are ∼ 3 times as populous as unobscured AGN at all redshifts and luminosities, implying that most (∼ 75%) supermassive black hole growth was obscured. Various studies have attempted to reconcile the observed accretion powered luminosity density at high redshifts, with the locally observed relic black hole mass function (e.g. Fabian & Iwasawa 1999;Yu & Tremaine 2002;Elvis, Risaliti & Zamorani 2002;Marconi et al. 2004). Marconi et al. (2004) found that the AGN population model of Ueda et al. (2003, XLF and NH function) was consistent with the local black hole mass function if the mean accretion efficiency is ∼ 0.08. However, as we have shown in this study, the observed numbers of luminous, absorbed AGN are substantially higher than the predictions of the U03 NH model. Coupled with the Martinez-Sansigre et al. (2005) findings, the implication is that the mean accretion efficiency is higher than the Marconi et al. (2004) result, or alternatively that the local black hole mass density has been underestimated.

CONCLUSIONS AND SUMMARY
We have analysed the distribution of absorption in a sample of AGN detected in the very deep XMM-Newton observations of the CDFS. Importantly, spectroscopic or photometric redshift determinations are available for 84% of the X-ray sample. We determined the absorption and intrinsic luminosity of each AGN using a novel method which takes advantage of the high photon throughput and broad bandpass of XMM-Newton EPIC. The AGN in the XMM-CDFS sample were compared to the predictions of a number of model AGN populations, using Monte Carlo simulations which allow for the selection function and particulars of the XMM-Newton observations. We find no evidence for the decline, reported by many authors, in the fraction of AGN with significant absorption at high luminosities.
The NH distribution models which most closely match the pattern of absorption in the XMM-CDFS sample are independent of redshift and luminosity. Our sample contains at least 23 heavily absorbed AGN with QSO-like luminosities. We postulate that nearly half of the objects without redshift determinations are also absorbed QSOs; they are without COMBO-17 redshift estimates because they lie in the "photo-z desert" (1.5 < z < 2.5). In order to confirm this hypothesis, the redshifts of these optically faint objects must be determined; photo-zs incorporating near- and mid-infrared photometry should make this possible.
HR1 −0.1, softer than the bulk of the unabsorbed objects in the sample. However, the two unmatched sources have relatively normal HR2 and HR3 values, indicating that their spectral slopes flatten toward higher energies. For the purposes of this study, we assume that these two objects have zero absorption and calculate their rest frame 2-10 keV luminosities from their observed 2-5 keV flux assuming a photon index of 1.9.