Statistical properties of the combined emission of a population of discrete sources: astrophysical implications

We study the statistical properties of the combined emission of a population of discrete sources (e.g. X-ray emission of a galaxy due to its X-ray binaries population). Namely, we consider the dependence of their total luminosity L_tot=SUM(L_k) and of fractional rms_tot of their variability on the number of sources N or, equivalently, on the normalization of the luminosity function. We show that due to small number statistics a regime exists, in which L_tot grows non-linearly with N, in an apparent contradiction with the seemingly obvious prediction=integral(dN/dL*L*dL) ~ N. In this non-linear regime, the rms_tot decreases with N significantly more slowly than expected from the rms ~ 1/sqrt(N) averaging law. For example, for a power law luminosity function with a slope of a=3/2, in the non-linear regime, L_tot ~ N^2 and the rms_tot does not depend at all on the number of sources N. Only in the limit of N>>1 do these quantities behave as intuitively expected, L_tot ~ N and rms_tot ~ 1/sqrt(N). We give exact solutions and derive convenient analytical approximations for L_tot and rms_tot. Using the total X-ray luminosity of a galaxy due to its X-ray binary population as an example, we show that the Lx-SFR and Lx-M* relations predicted from the respective ``universal'' luminosity functions of high and low mass X-ray binaries are in a good agreement with observations. Although caused by small number statistics the non-linear regime in these examples extends as far as SFR<4-5 Msun/yr and log(M*/Msun)<10.0-10.5, respectively.


INTRODUCTION AND QUALITATIVE CONSIDERATION
In many astrophysical situations a problem arises to predict or interpret results of measurements of the total (combined) luminosity of a population of discrete sources.Among many examples are the total luminosity of X-ray binaries in a galaxy, or total flux of background sources detected above the sensitivity limit inside the field of view of a telescope.
In the following discussion we will use high mass X-ray binaries (HMXB) in star forming galaxies as an example.As was shown by Grimm, Gilfanov & Sunyaev (2003), the luminosity distribution of HMXB sources in a galaxy is described to first approximation by a "universal" luminosity function whose shape is the same in all galaxies and whose normalization is proportional to the star formation rate (SFR) of the parent galaxy: In a broad luminosity range, log(LX ) ∼ 35.5 − 40.5, the shape of the HMXB "universal" luminosity function is close to a power law, f (L) = L −α , with the slope of α ≈ 1.6.Importantly, the luminosity of the compact sources in star forming galaxies appears to be restricted by a maximum value of Lcut ∼ few × 10 40 erg/s.This cut-off luminosity can be defined, for example, by the Eddington luminosity limit for the most massive objects associated with the star forming regions.Obviously, on the faint side, the luminosity distribution eq.( 1) must become flatter or have a cut-off as well, in order to keep the total number of sources finite.This low luminosity cut-off may be caused for example by the propeller effect (Illarionov & Sunyaev 1975) as discussed by Shtykovskii & Gilfanov (2004).
The expectation value for the total number of sources in a galaxy equals and, naturally, is directly proportional to its star formation rate.The number of sources actually observed in a given galaxy obeys Poisson distribution pµ(N ) with µ defined by eq.( 2).Apart from effects of counting statistics, the number of HMXB sources found in an arbitrarily chosen galaxy will be close to the above expectation value.
The problem, considered in this paper, is the behavior of the total luminosity of high mass X-ray binaries in a galaxy as a function of its star formation rate.
An apparently obvious expression for the Ltot can be obtained integrating the luminosity distribution (1): Hence, one might expect that the total X-ray luminosity of HMXB sources is also directly proportional to the star formation rate of the host galaxy, as is the total number of sources.This problem, however, involves some subtleties related to the statistical properties of the power law distribution of the sources over luminosity, which appear not to have been recognized previously, at least in astrophysical context (a somewhat related problem has been considered by Kalogera et al. 2001 in connection with estimating the coalescence rate for the NS-NS binaries in the Galaxy).Although eq.( 4) correctly predicts the average X-ray luminosity computed for a large number of galaxies with similar values of star formation rate, it fails to describe the relation between the most probable value of X-ray luminosity of an arbitrarily chosen galaxy and its SFR.The main surprise of the study presented here is that, in the low SFR regime, the relation between SFR of the host galaxy and the total luminosity of its HMXBs is non-linear -with increase of the star formation rate the luminosity appears to grow faster than linear.The relation becomes linear only for sufficiently high star formation rates, when the total number of objects with a luminosity close to the maximum possible value, defined by Lcut, becomes sufficiently large.This can be illustrated by the following simple consideration.For an arbitrarily chosen galaxy, the brightest source is most likely going to have a luminosity Lmax defined by the condition1 For a power law luminosity distribution with slope α and with a cut-off at Lcut, eq.(1), the above expression yields: As might be intuitively expected, at low SFR, the most probable luminosity of the brightest source increases with SFR, until it reaches the maximum value of Lcut.The threshold value of the star formation rate, separating low and high SFR regimes, is defined by the condition N (L ∼ Lcut) ∼ 1, i.e. that there are ∼few sources expected, with luminosity close to the cut-off value Lcut.
The most probable value of the total luminosity, Ltot can be then computed integrating the luminosity function from Lmin to Lmax: which, for 1 < α < 2 and Lmin << Lmax, leads to i.e. is non-linear in the low SFR regime and becomes linear only at high star formation rates.This can be qualitatively understood as follows.For a slope of the luminosity distribution 1 < α < 2, the total luminosity of a galaxy is defined by the brightest sources.The non-linear behavior in the low-SFR limit is caused by the fact that an increase of the SFR leads to non-linear increase of the luminosity of the brightest sources.Therefore their total luminosity grows faster than the star formation rate.This non-linear growth continues until the maximum possible value of the luminosity of the compact sources is achieved.Further increase of the star formation rate leads to a linear increase of the number of the brightest sources in the galaxy, but not of their individual luminosities.Consequently, the LX −SFR relation becomes linear.
In the more formal language of statistics such a behavior is related to the properties of the probability distribution of the collective luminosity p (Ltot).In particular, it can be understood in terms of the difference between the expectation mean and the mode of the probability distribution.The expectation mean is defined as and is given by eq.( 4).The mode of the statistical distribution, Ltot, is defined as the value of the random variable (Ltot in our case) at which the probability distribution p (Ltot) reaches its maximum value.Whereas the expectation mean Ltot describes the result of averaging of the X-ray luminosities of many galaxies having similar values of SFR, it is the mode of the p (Ltot) distribution that predicts the most probable value of the total luminosity of a randomly chosen galaxy.The non-linear behavior in the low SFR regime is caused by the skewness of the probability distribution p (Ltot) resulting in a difference between expectation mean and mode.In the high SFR limit, the p (Ltot) distribution asymptotically approaches the Gaussian distribution, in accord with the Central Limit Theorem.The boundary value of SFR, separating the non-linear and linear regimes of the LX −SFR relation is defined by the parameters of the luminosity function.
Interestingly, the fact of the existence of a linear regime in the LX -SFR relation is a direct consequence of the cut-off in the luminosity function.Only in the presence of a maximum possible luminosity for the sources, Lcut, (for instance Eddington limit for a neutron star) a linear regime can exist, when the total luminosity of a galaxy is defined by a sufficiently large number of bright sources near Lcut, and any subsequent increase of the star formation rate results in linear growth of the total luminosity.
In the above discussion, we used high mass X-ray binary populations in star forming galaxies as an example.Obviously, the effect considered in this paper is of a broader general interest and is at work in many situations related to computing/measuring integrated luminosity of a finite number of discrete sources.One example related to the application of stellar synthesis models to observations of stellar clusters has been independently discovered by Cervino & Luridana (2004).
The paper is structured as follows.In section 2 we consider the statistical properties of the total luminosity, in particular in section 2.1 derive the formulae for the probability distribution p (Ltot), using two different approaches, and present results of numerical calculations in 2.2.The variability of the total emission is studied in section 3.In section 4 we discuss astrophysical applications, including the properties of the total X-ray emission of a galaxy due to its population of high-(section 4.2, 4.3) and low-(section 4.4) mass X-ray binaries.Our results are summarized in section 5.In the Appendixes we derive convenient approximations for Ltot and fractional rms of the total emission and consider their asymptotical behaviour.A casual reader, not interested in the mathematical aspects of the problem, can skip section 2.1 and proceed with section 2.2.

TOTAL LUMINOSITY
We consider a population of sources with a luminosity function (LF): The expectation values for the total number of sources and the total luminosity are: It is assumed that both quantities are well defined and finite.

The probability distribution of the total luminosity
Below we present two methods of computing the probability distribution of the total luminosity.The second of these two methods is somewhat more convenient computationally.

Method I
To compute the probability distribution for the total luminosity, p (Ltot), we divide the L1 − L2 luminosity range into intervals of infinitesimal width δL k and express the combined luminosity of all sources as a sum where L tot,k is the combined luminosity of the sources in the k-th interval, running from L k to L k + δL k .The number of sources in the interval (L k , L k + δL k ) obeys a Poisson distribution with mean For δL k → 0, it is sufficient to consider the occurrence of either zero or one source per interval whose probabilities are, respectively: The probability distribution for the combined luminosity of the sources in the k-th interval is where δ(x − x0) is the delta function.The characteristic function of this distribution is Using the convolution theorem, the characteristic function of the probability distribution of the total luminosity can be computed as a product of characteristic functions of p (L tot,k ) Finally, where n is given by eq.(11).

Method II
The probability distribution of the total number of sources follows a Poisson distribution Pµ(n) = µ n e −µ /n! with mean µ given by eq.( 11).The probability distribution of the total luminosity is: where pn(Ltot) is the probability distribution of the total luminosity of exactly n sources.In the majority of practically L k /n, of n discrete sources with a power law LF, eq.( 25).The lower and upper luminosity cut-offs were fixed at L 1 = 1 and L 2 = 10 3 for all plots.The value of the slope α is indicated in each panel.Each curve is marked according to the number of sources n.The vertical dashed lines show the expectation mean Ltot /n.interesting cases, the total number of sources is sufficiently large, n >> 1, and the Poisson distribution in eq.( 21) can be replaced by the delta-function δ(n − n ): where n = n is given by eq.( 11) The probability distribution of the sum of n random variables, Ltot = k=n k=1 L k , can be calculated using the convolution theorem: where p1(L) is the probability distribution for the luminosity of one source, which equals the luminosity function with appropriate normalization: The probability distribution pn(Ltot), defined by eq.( 23), describes the case when n sources are observed.On the contrary, the distribution p (Ltot), defined by eq.( 20), is parametrized via the normalization of the luminosity function or, equivalently, via the expectation value n , and describes the case when n sources are expected.These two distributions are related via eq.( 21).For n >> 1, which is often the case, they are nearly identical, and it can be assumed that n = n and both quantities are related to the normalization of the luminosity function via eq.( 11).

Illustrative examples
To illustrate the behaviour of the total luminosity we assume a power law luminosity function:   at L1 = 1 and L2 = 10 3 The skewness of the probability distribution pn(Ltot) leads to a deviation of its mode Ltot (the most probable value of the total luminosity) from the expectation mean Ltot indicated in each panel by the vertical dashed line.The effect is most pronounced for α ∼ 3/2, is unimportant for for shallow luminosity functions with α < 1 and gradually diminishes with increasing α at α > 2.
For illustration, we also show in Fig. 1 the probability distributions for a flat LF, α = 0, in which case Ltot exactly equals Ltot .Naturally, for any value of α the pn(Ltot) → Gaussian distribution in the limit of n → ∞, in accord with the Central Limit Theorem.Correspondingly Ltot → Ltot in this limit.

The most probable value of total luminosity
The dependence of the most probable value of the total luminosity Ltot on the total number of sources n for different values of α and the ratio L2/L1 is shown in Fig. 2 and 3. Convenient analytical approximations for the Ltot − n relation are derived in Appendix A and its asymptotical behavior is considered in Appendix A2.
The Ltot − n relation is significantly non-linear for "small" number of sources or, equivalently, for small values of the LF normalization A. For L1 << L2, α > 1, the boundary between the non-linear and linear regimes, expressed in terms of the normalization A, depends only on the LF slope α and its high luminosity cut-off L2 (see Appendix A2).This is due to the fact that the behavior of Ltot is defined by the number of sources near the high luminosity cut-off of the luminosity function, rather than by the total number of sources in the entire L1 − L2 luminosity range, i.e. the condition for the linear regime is: The total number of sources in the L1 −L2 luminosity range, on the contrary, is defined by the low luminosity cut-off L1 (for α > 1).For sufficiently large L2/L1, the non-linear regime can occur for an arbitrarily large total number of sources (cf. the curves corresponding to different values of L2/L1 in Fig. 2).Interestingly, for the LF slope in the range of 1 < α < 2, where the effect is strongest, there is a relatively sharp break separating the non-linear part of the dependence from the linear part.
Although for small n the most probable value of luminosity Ltot can deviate significantly from the expectation mean Ltot (Fig. 1,2), the condition p (Ltot) Ltot dLtot = Ltot is satisfied for any n.Consequently, the average of Ltot over a large number of realizations with the same n always equals Ltot .This equality is achieved due to the existence of outliers, having values of Ltot significantly exceeding both Ltot and Ltot , in accordance with the skewness of the probability distribution pn(Ltot) for small n.This naturally leads to enhanced and asymmetric dispersion of the observed values of Ltot in the non-linear regime as illustrated by the shaded areas in Fig. 3.

The luminosity of the brightest source in a sample
The probability distribution for the luminosity of the brightest source observed in a sample of n sources equals: where p1(L) is the probability distribution for the luminosity of one source (e.g.eq.( 24)) and p1(L < Lmax) denotes the cumulative probability p1(L < Lmax) = Lmax 0 p1(L) dL.The p (Lmax) distribution for α = 1.5 is shown in Fig. 4 and illustrates the intuitively obvious fact that if the number of sources is sufficiently small the brightest sources most likely will not reach the highest possible value of L2.An analytical formula for the most probable value of the luminosity of the brightest source is given by eq.(A1).

VARIABILITY OF THE TOTAL EMISSION
For a population of n sources with luminosities L k and fractional rms of aperiodic variability rms k , the fractional rms of the total emission equals: assuming that variations of the source fluxes are uncorrelated with each other.In the following we also assume for simplicity that all sources have the same value of fractional rms, i.e. rms k = rms0.
In the limit of n → ∞, corresponding to the linear regime in the Ltot − n relation, the sums in the eq.( 28) can be replaced by the respective integrals of the LF: In the linear regime the fractional rms of the collective emission obeys ∝ 1/ √ n averaging law, as expected for uncorrelated variations of individual sources.
In the non-linear regime, however, for a sufficiently flat luminosity function, the total luminosity is defined by a few brightest sources.To first approximation the number of such sources effectively contributing to the sums in eq.( 28) does not depend on the LF normalization.Consequently, the fractional rms of the total emission is constant, independently on the total number of sources or of their total luminosity.This contradicts the intuitive expectation that the fractional rms decreases with the number of sources as rms ∝ 1/ √ n.
To illustrate this behavior we performed a series of the Monte-Carlo simulations for a power law LF.For each set of parameters and a given value of the number of sources, in each run n sources were placed between L1 and L2 tot /rms 2 0 , the shaded areas show the 67% intrinsic dispersion, both obtained from the Monte-Carlo simulations.The thin solid lines were computed using the approximation given by eq.( B3).The dashed line shows rms tot/rms0 ∝ 1/ √ n dependence.
with a power law luminosity distribution, eq.( 25).For each run, the ratio rms 2 tot /rms 2 0 was computed following eq.(28).From results of many runs, the probability distribution for rms 2 tot /rms 2 0 was obtained.The maximum of this distribution defines the most probable value of rms 2 tot and its width characterizes the intrinsic dispersion of this quantity.The examples are shown in Fig. 5 for two values of the LF slope, α = 1.5, 2.5.For α < 1 and α > 3 there is no nonlinear regime and the fractional rms of the total emission obeys ∝ 1/ √ n law for any n.In plotting Fig. 5, we converted rms 2 tot to rms tot, so that the thick solid curves shows the behavior of the square root of the most probable value of the rms 2 tot /rms 2 0 ratio.Approximate formulae for rms 2 tot are derived in Appendix B.

Determining LF parameters from total emission of unresolved sources
The shape of the Ltot − A relation (A is the normalization of the luminosity function) is defined by the parameters of the luminosity function.For 1 < ∼ α < ∼ 2, it has two distinct power law regimes, separated by a break (Fig. 2 and 3): The position of the break between the non-linear and linear regime is defined by the high luminosity cut-off of the LF and its slope as described by eq.( A13).This opens a possibility to determine the LF parameters without actually resolving the individual sources, but studying large samples of objects (e.g.galaxies) and the relation between their total emission and the normalization A. The value of A in many cases can be determined independently from observations at other wavelengths.For example, the normalizations of the luminosity functions of high and low mass X-ray binaries are proportional to the star formation rate (Grimm et al. 2003) and stellar mass (Gilfanov 2004) of the host galaxy respectively.Both quantities can be determined from the conventional indicators, such as radio or near-infrared luminosities.
Of course with the sub-arcsec angular resolution of Chandra the luminosity distribution of point sources in nearby galaxies can be studied directly.However, for more distant galaxies, D > ∼ 30 − 50 Mpc, even Chandra resolution becomes insufficient and only the total luminosity of the galaxy can be measured.Provided that the contaminating contribution of the emission of a central AGN and/or of hot X-ray emitting gas can be constrained or separated, one can study the relation between the total luminosity LX of a (unresolved) galaxy and its star formation rate or stellar mass.The LX −SFR or LX − M * "growth curves" constructed for large samples of galaxies and spanning a broad range of SFR and M * can be used to constrain the XLF parameters of Xray binaries in distant galaxies including those located at intermediate and high redshifts.With this, one can study the influence of a number of factors such as effects of binary evolution, metallicity, regimes of star formation, etc. on the luminosity distribution of X-ray binaries.
4.2 High mass X-ray binaries in star forming galaxies 4.2.1 LX −SFR relation for star forming galaxies The slope of the "universal" luminosity function of HMXBs, α = 1.6, is in the range where the non-linear behavior of the  4).Right: The L X −SFR relation.The open circles are nearby galaxies observed by Chandra, the filled triangles are spatially unresolved nearby galaxies observed by ASCA and BeppoSAX, for which only total luminosity is available, the filled circles are distant star forming galaxies from the HDF-N.The thick grey line is relation between the SFR and the most probable value of the total luminosity, predicted from the "universal" HMXB XLF, the shaded area shows its 67% intrinsic spread, the dashed line is the expectation mean, defined by eq.( 4).The five nearby galaxies, used by Grimm et al. (2003) to derive the "universal" HMXB XLF, are marked as crossed boxes.
Ltot − A relation is most pronounced.In the left panel of Fig. 6 we plot the probability distribution of the total luminosity p (Ltot), computed for different values of the star formation rate.The distribution is strongly asymmetric in the non-linear low-SFR regime which, for the parameters of the "universal" HMXB XLF, corresponds to the formation rate of massive stars below SFR < ∼ 4 − 5 M⊙/yr.Note that a small value of SFR does not necessarily imply a small total number of sources, which is defined by the (unknown) low luminosity cut-off of the HMXB XLF.For example, for SFR=0.2M⊙/yr, when the non-linear effect is very pronounced, the total number of sources might be as large as ∼ 300 (∼ 1200) for the low luminosity cut-off of 10 34 (10 33 ) erg/s.These low values of the star formation rate correspond to the familiar examples of the Milky Way galaxy and the Magellanic Clouds.On the opposite end among the relatively nearby and well known galaxies are the Antennae interacting galaxies which, with the star formation rate of SFR∼ 7 M⊙/yr, have a nearly symmetric p (Ltot) distribution, sufficiently close to the normal distribution.
The predicted LX −SFR relation is shown in the right panel in Fig. 6, along with the measured values of X-ray luminosities and the star formation rates for a number of nearby galaxies and galaxies observed with Chandra at intermediate redshifts, z ∼ 0.2 − 1.3, in the Hubble Deep Field North.The data shown in Fig. 6 are from Grimm et al. (2003), complemented with the local galaxies data from Ranalli, Comastri & Seti (2003).In plotting the latter we removed the duplications and the galaxies likely to be contaminated by the contribution of low mass X-ray binaries, unrelated to the current star formation activity, as discussed by Gilfanov et al. (2004).The luminosities and star formation rates for the Hubble Deep Field North galaxies (Brandt et al. 2001) were computed for the following cosmological parameters: H0 = 70 km/s/Mpc, Ωm = 0.3, Λ = 0.7, as described in Gilfanov et al. (2004).Fig. 6 further illustrates the difference between the mode and the expectation mean of the p (Ltot) probability distribution.The thick solid line in the right panel shows the SFR-dependence of the mode of p (Ltot) and predicts the most probable value of the X-ray luminosity of a randomly chosen galaxy.If observations of many (different) galaxies with similar values of SFR are performed, the obtained values of Ltot will obey the probability distribution depicted in the left panel.The average of the measured values of Ltot will be equal to the expectation mean given by eq.( 4) and shown by the dashed straight lines in the left and right panels.Due to the properties of p (Ltot) these two quantities are not identical in the low-SFR limit when the total luminosity is defined by the small number of the most luminous sources.Only in the large SFR limit when there are sufficiently many sources with luminosities near the cut-off of the luminosity function, log(LX ) ∼ 40, does the LX −SFR relation behave in the intuitively expected way.
Owing to the skewness of the probability distribution p (Ltot) large and asymmetric dispersion among the measured values of Ltot is expected in the non-linear low-SFR regime.This asymmetry is already seen in Fig. 6 -at low SFR values there are more points above the thick solid curve than below.Moreover, the galaxies lying significantly above the solid and dashed curves in Fig. 6 should be expected at low SFR and will inevitably appear as the plot is populated with more objects.Such behavior differs from a typical astrophysical situation and should not be ignored when analyzing The data are the same as in Fig. 6. and fitting the LX -SFR relation in the low SFR regime.In particular, due to non-Gaussianity of the p (Ltot) distribution the standard data analysis techniques -least square and χ 2 fitting -become inadequate.

High luminosity cut-off in the HMXB XLF
The position of the break between the non-linear and linear parts of the LX -SFR relation depends on the LF slope and its cut-off luminosity (Fig. 7, eq.A13): SFR break ∝ L α−1 cut .This allows to constrain the parameters of the luminosity distribution of compact sources using the data of spatially unresolved galaxies as discussed in section 4.1.
Agreement of the predicted LX −SFR relation with the data both in high-and low-SFR regimes confirms the universality of the HMXB luminosity function, derived by Grimm et al. (2003) from significantly fewer galaxies (shown as crossed boxes) than plotted in Figs. 6, 7. It provides an independent confirmation of the existence of a cut-off in the HMXB XLF at log(Lcut) ∼ 40.5, including HMXBs in spatially unresolved high redshift galaxies from the Hubble Deep Field North.This implies, in particular, that ultraluminous X-ray sources (ULX) at redshifts of z ∼ 0.2 − 1.3 were not significantly more luminous than those observed in the nearby galaxies.

Aperiodic variability
X-ray flux from X-ray binaries is known to be variable in a broad range of time scales, from ∼msec to ∼ years.In addition to a number of coherent phenomena and quasi-periodic oscillations, significant continuum aperiodic variability is often observed.The fractional rms of aperiodic variations de-Figure 8. Variability of the total emission of HMXBs in star forming galaxies.Dependence of the ratio rms tot/rms0 on the star formation rate.The thick solid line shows the most probable value of the rms tot/rms0, the shaded area shows its 67% intrinsic dispersion, both obtained from the Monte-Carlo simulations for a power law LF with the slope of α = 1.6 and a cut-off at L 2 = 2 • 10 40 erg/s.The dashed line shows asymptotical behavior at large SFR.The filled circles correspond to HMXB sources in the Milky Way, SMC and the Antennae galaxies, computed from eq.( 28), using the observed luminosities of X-ray sources in these galaxies.
pends on the nature of the binary system and the spectral state of the X-ray source and is usually in the range from a fraction of a per cent to ∼ 20−30 per cent.Correspondingly, the combined emission of X-ray binaries in a galaxy should be also variable in a similarly broad range of time scales.It has been suggested by Grimm, Gilfanov & Sunyaev (2004) that due to a large difference in the characteristic time scales of the accretion flow onto a stellar mass object and onto a supermassive black hole variability of the X-ray emission from a galaxy can be used to distinguish between the combined emission of a population of X-ray binaries and that of an accreting supermassive black hole in the centre of a galaxy (AGN).
From results of section 3 and Appendix B one should expect that in the non-linear low-SFR regime the fractional rms of the total X-ray emission of a star forming galaxy is independent of SFR.This prediction is confirmed by the results of the Monte-Carlo simulations performed as described in section 3 and shown in Fig. 8.For moderate star formation rates, SFR < ∼ 5 − 10 M ⊙ /yr, we predict a rather large aperiodic variability of the total emission of HMXBs at the level of ∼ 1/3 − 1/2 of the fractional rms of individual X-ray binaries.At larger values of SFR, corresponding to the linear regime in the LX −SFR relation, it decreases as rms ∝ 1/ √ SFR, in accord with eq.( B3).Also shown in Fig. 8 are values of the fractional rms reduction, computed directly from eq.( 28) using the luminosities of the observed HMXBs in the Milky Way (Grimm et al. 2002),  27) using the parameters of the "universal" luminosity function of HMXBs.Right: Expected luminosity of the brightest HMXB vs. star formation rate.The thick solid line shows the most probable value of the Lmax (eq.(A1)), the shaded area shows its 67% intrinsic dispersion, obtained from the probability distribution given by eq.( 27).The filled circles show maximum observed luminosity of HMXBs in the Milky Way and several nearby star forming galaxies.SMC (Yokogawa et al. 2000), and in the Antennae galaxies (Zezas et al. 2002).Note, that the predicted rms−SFR relation can be modified by the luminosity dependence of the rms of individual sources.This factor might become especially important at large values of SFR when the total luminosity of a star forming galaxy is dominated by ULXs whose variability properties we know little about.

Luminosity of the brightest source
As the first Chandra observations of compact sources in nearby galaxies became available, it has been noted (e.g.Sarazin et al. 2001;Irwin et al. 2002;Fabbiano & White 2003) that the luminosity of the brightest X-ray binary in a galaxy might depend on its properties.In particular in the case of high mass X-ray binaries it appeared to correlate with the star formation rate of the host galaxy.For example, in the Antennae galaxies, a number of compact sources have been discovered with luminosities of ∼ 10 40 erg/s (Zezas et al. 2002).On the other hand, the luminosities of the brightest HMXB sources in the Milky Way do not exceed < ∼ 10 38 erg/s (Grimm et al. 2002).It has been argued that this might reflect the difference in the intrinsic source properties, related to the difference in the galactic environment and in initial conditions for X-ray binary formation in starburst galaxies and in galaxies with weak and steady star formation.
However, as discussed in section 2.3, the probability distribution for the luminosity of the brightest source in a galaxy, p (Lmax), depends on the LF normalization, i.e. on the SFR of the host galaxy in the case of HMXBs.The luminosity of the brightest source increases with SFR, until it reaches the maximum possible value, defined by the high luminosity cut-off of the LF.The p (Lmax) distributions, computed from eq.( 27) for the parameters of the "universal" HMXB XLF, are shown for different values of SFR in the left panel of Fig. 9.The right panel in Fig. 9 shows the dependence of the most probable value of the luminosity of the brightest HMXB on the SFR of the host galaxy, described by eq.(A1), along with its intrinsic 67% uncertainty.Filled symbols in Fig. 9 are the luminosities of the brightest source observed in star forming galaxies from the sample of Grimm et al. (2003, and references therein).As is clear from the plot, the large difference in the maximum luminosity between low-and high-SFR galaxies, e.g. between the Milky Way and the Antennae galaxies, can be naturally understood in terms of the properties of the probability distribution p (Lmax).No additional physical processes, affecting HMXBs formation in starburst galaxies, need to be invoked.

Intermediate mass black holes
The hypothetical intermediate mass black holes, probably reaching masses of ∼ 10 2−5 M⊙, might be produced, e.g.via black hole merges in dense stellar clusters, and can be associated with extremely high star formation rates.To accrete efficiently they should form close binary systems with normal stars or be located in dense molecular clouds.It is natural to expect, that such objects are significantly less frequent than ∼stellar mass black holes.The transition from ∼stellar mass BH HMXB to intermediate mass BHs should manifest itself as a step in the luminosity distribution of compact sources (Fig. 10, left panel).If the cut-off in the HMXB XLF, observed at log(Lcut) ∼ 40.5 corresponds to the max- As was shown by Gilfanov (2004) the luminosity distribution of low mass X-ray binaries in nearby early type galaxies and bulges of spiral galaxies can be described by a "universal" XLF whose shape is approximately the same in different galaxies and whose normalization is proportional to the stellar mass of a galaxy.The shape of the "universal" LMXB XLF is significantly more complex than that of HMXBs.It appears to follow the L −1 power law at low luminosities, gradually steepens at log(LX ) > ∼ 37.0 − 37.5, and has a rather abrupt cut-off at log(LX ) ∼ 39.0 − 39.5.In the log(LX ) ∼ 37.5 − 38.7 luminosity range, it approximately follows a power law with the differential slope of ≈ 1.8 − 1.9.
Given the shape of the XLF, the total luminosity of LMXB sources in a galaxy is defined by the sources with log(LX ) ∼ 37 − 38, the contribution of the brighter and, especially, weaker sources being less significant.Therefore the non-linear regime in the LX − M * relation, although does exist for log(M * ) < ∼ 10.0 − 10.5, is less important than in the LX −SFR relation for high mass X-ray binaries (see, for example, Fig. 14 in Gilfanov 2004).
Significantly more pronounced is the dependence of the luminosity of the brightest source on the LF normalization, i.e. on the stellar mass of the host galaxy.In order to study this dependence we used the broken power law approximation for the LMXB XLF from Gilfanov (2004) and performed a series of the Monte-Carlo simulations, similar to those described in section 3. The probability distribution of the maximum luminosity p (Lmax), obtained from these simulations, is shown in the left panel in Fig. 11 for different values of the stellar mass of the host galaxy.The right panel shows the dependence of the most probable value of the maximum luminosity and of its 67% intrinsic spread on the stellar mass.Its broken line shape is a result of the broken power law approximation of the LMXB XLF used in the simulations.Solid symbols show the observed values of the maximum luminosity for the number of nearby early type galaxies, bulges of spiral galaxies and for LMXBs in the Milky Way from the sample of Gilfanov (2004, and references therein).
Similarly to HMXBs in star forming galaxies it is obvious from Fig. 11 that the significant difference in the value of the luminosity of the brightest source can be naturally explained by the properties of the luminosity function of LMXBs.The same effect leads to an artificial (unphysical) dependence of the average luminosity of low mass X-ray binaries in a galaxy on its stellar mass (e.g.Fig. 17 in Gilfanov 2004).So far there is no evidence for a significant change of intrinsic properties of low mass X-ray binaries with galactic environment.The difference between the luminosity of the brightest LMXB in massive elliptical galaxies and the bulges of spiral galaxies can be understood based on the probability arguments.

Variability
As in the case of HMXBs, in the linear large-mass limit, log(M * ) > ∼ 10.5, the fractional rms of the aperiodic variability of the combined emission of LMXBs follows the rms tot ∝ 1/ √ M * averaging law.Owing to the shape of the LMXB XLF it decreases rather quickly with the stellar  27), using the parameters of the broken power law approximation to the "universal" luminosity function of LMXBs.Right: Expected luminosity of the brightest LMXB source vs. stellar mass of the host galaxy.The thick solid line shows the most probable value of the Lmax, the shaded area shows its 67% intrinsic dispersion, both obtained from the probability distribution given by eq.( 27).The filled circles show maximum observed luminosities of LMXB sources in the Milky Way and several nearby galaxies, studied with Chandra.The "broken" shape of the predicted dependence is a consequence of the broken power law approximation of the "universal" LMXB XLF, used in the calculations.mass of the galaxy in the non-linear low-mass regime as well (Fig. 12).Consequently, in massive elliptical galaxies, with stellar mass log(M * ) ∼ 11.0 − 11.5, the fractional rms variability of the total emission will be suppressed by a factor of ∼ 10 − 15 with respect to the rms of individual sources.In a galaxy similar to the Milky Way with log(M * ) ∼ 10.5 − 10.7 the suppression factor is ∼ 5. Considerable variability on the level of ∼ 1/4 − 1/2 of that of individual X-ray binaries can be expected only for light bulges of spiral galaxies with masses in the log(M * ) ∼ 9.5 − 10.5 range.
Fig. 13 compares the dependence of the fractional rms on the most probable value of the total luminosity for high and low mass X-ray binaries.In the bright luminosity end, log(LX ) > ∼ 39.5, the X-ray emission from early type galaxies is expected to be significantly, up to a factor of ∼ 7, less variable than from star forming galaxies.

SUMMARY
We studied the statistical properties of the combined emission of a population of discrete sources.Namely, we considered the properties of their total luminosity and its dependence on the number of sources n or, equivalently, on the normalization of the luminosity function (LF).
Using high mass X-ray binaries in star forming galaxies as an example, L k are the luminosities of individual X-ray binaries in a galaxy and Ltot is its total X-ray luminosity due to HMXB population.In this example the normalization of the luminosity function, i.e. the number of HMXBs in the galaxy n, is proportional to its star formation rate.We showed that due to statistical properties of the probability distribution p (Ltot) the result of a measurement of the total luminosity of a randomly chosen galaxy might deviate significantly from the intuitively obvious expression These properties of p (Ltot) can result in a surprising nonlinear dependence of the total luminosity on n.They can also cause anomalous variability of the combined emission in an apparent violation of the rms ∝ 1/ √ n averaging law for uncorrelated variations of individual sources.
Our results can be summarized as follows: (i) The probability distribution p (Ltot) can be computed numerically from eq.( 20) or alternatively eq.( 23).Examples are shown in Fig. 1.
(ii) The relevant characteristics of the p (Ltot) distribution are: (1) its mode Ltot -the value of the random variable Ltot for which p (Ltot) has the maximum and (2) the expectation mean Ltot defined by eq.( 32).
It is the mode of the p (Ltot) distribution that predicts the most probable value of the total luminosity of a randomly chosen galaxy.If many galaxies with ∼same total number Figure 12.Variability of the total emission of low mass X-ray binaries.Dependence of the ratio rms tot/rms0 on the stellar mass of the galaxy.The thick solid line shows the most probable value of the rms tot/rms0, the shaded area shows its 67% intrinsic dispersion, both obtained from the Monte-Carlo simulations for the "universal" luminosity function from Gilfanov (2004).The dashed line shows asymptotical behavior at large M * .The filled circles correspond to LMXBs in the Milky Way and several nearby galaxies, computed from eq.( 28), using the observed luminosities of X-ray sources in these galaxies.
of sources n are observed the measured values of their total X-ray luminosities are distributed according to the p (Ltot) distribution whose shape depends on the chosen value of n (Fig. 6, left panel).The average of the measured values of Ltot is equal to the expectation mean Ltot and is always proportional to n in agreement with eq.( 32).
(iii) For small values of the LF normalization, Ltot and Ltot do not equal each other (Fig. 1) and the Ltot − n relation is non-linear (Fig. 2-3, 6).Only in the limit of n ≫ 1, Ltot = Ltot and Ltot depends on n linearly.The threshold value of n depends on the LF shape and can be arbitrarily large.
(iv) The skewness of the p (Ltot) probability distribution in the non-linear regime (Fig. 1) results in an enhanced and significantly asymmetric intrinsic dispersion of the measured values of Ltot (Fig. 3,6).Its non-Gaussianity precludes the use of the standard fitting techniques in analyzing the Ltot − n relation (e.g.LX −SFR relation for star forming galaxies), such as χ 2 minimization technique.
(v) The variability of the total emission (e.g.aperiodic variability of X-ray emission of a galaxy due to superposition of variabilities of individual sources) in the non-linear regime decreases with the number of sources more slowly than rms ∝ 1/ √ n law for uncorrelated variations of individual sources, resulting in anomalously high variability of the total emission.In the linear regime the rms ∝ 1/ √ n dependence is restored (section 3, Fig. 5).
(vi) The amplitude of the discussed effect depends on the Dependence of the ratio rms tot/rms0 on the total X-ray luminosity of the galaxy due to X-ray binaries.The thick solid lines show the most probable value of the rms tot/rms0, the shaded areas show its 67% intrinsic dispersion, both obtained from the Monte-Carlo simulations for the respective "universal" luminosity functions from Grimm et al. (2003) and Gilfanov (2004).For both curves, the linear parts at high L X follow rms ∝ 1/ √ L X averaging law.
shape of the luminosity function.For a power law LF it is strongest for 1 < α < 2, is unimportant for shallow luminosity functions with α < 1, and gradually diminishes with increase of α above α = 2 (Fig. 1-3).
We illustrate these results using the example of the combined emission of X-ray binaries in galaxies and its dependence on the star formation rate and on the stellar mass of the host galaxy.
(i) For the slope of the HMXB "universal" XLF, α ≈ 1.6, the discussed effects are strongest with a significant nonlinear regime in the LX −SFR relation at SFR < ∼ 4−5 M⊙/yr.The predicted LX −SFR dependence is in good agreement with observations (Fig. 6).Given the shape of the "universal" LMXB XLF no significant non-linearity of the LX − M * relations is expected, also in a good agreement with observations.
(ii) The LX −SFR relation can be used to constrain the XLF parameters of HMXBs in distant unresolved galaxies including the galaxies observed with Chandra in HDF-N at redshifts z ∼ 0.2 − 1.3 (sections 4.2.2,4.3,Figs.7,10).
(iii) Both for high and low mass X-ray binaries a strong dependence of the luminosity of the brightest source on the SFR and stellar mass of the host galaxy is expected.The Lmax−SFR and Lmax −M * dependences predicted from the respective "universal" XLFs explain well the results of Chandra observations of nearby galaxies (Fig. 9 and 11).The significant difference in the luminosity of the brightest LMXB between bulges of spiral galaxies and giant ellipticals or between the brightest HMXB in the Milky Way and in starburst galaxies can be understood based solely on probability arguments.
(iv) We predict enhanced variability of X-ray emission from star forming galaxies due to HMXBs, significantly above the ∝ 1/ √ n averaging law.For SFR < ∼ 5 M⊙/yr the expected fractional rms of variability of the combined emission of HMXBs does not depend on the star formation rate and approximately equals ∼ 1/3 − 1/2 of the rms of individual sources (Fig. 8).On the contrary, variability of Xray emission from early type galaxies due to LMXBs will be significantly suppressed because of the averaging effect (Fig. 12), upto ∼ 3 − 10 times in the ∼ 10 10 − 10 11 M⊙ stellar mass range.For the same total luminosity star forming galaxies are expected to have significantly larger fractional rms, than massive elliptical and S0 galaxies assuming that fractional the rms of individual sources are comparable (Fig. 13).

Figure 1 .
Figure 1.Probability distributions of the average luminosity, n k=1 L k /n, of n discrete sources with a power law LF, eq.(25).The lower and upper luminosity cut-offs were fixed at L 1 = 1 and L 2 = 10 3 for all plots.The value of the slope α is indicated in each panel.Each curve is marked according to the number of sources n.The vertical dashed lines show the expectation mean Ltot /n.

Figure 2 .
Figure2.The ratio of the most probable value of the total luminosity Ltot to the expectation mean Ltot versus number of sources observed in L 1 − L 2 luminosity range for different values of the LF slope α and the ratio L 2 /L 1 .The results of exact calculation using eq.(23) are shown by the solid symbols connected with thick line.The thin solid line shows the approximate relation calculated from eqs.(A7) and (A3), as described in the Appendix A1.

Figure 3 .
Figure3.The most probable value of the total luminosity Ltot and its intrinsic dispersion versus number of observed sources for different values of the LF slope α.The lower and upper cut-off luminosities were fixed at L 1 = 1, L 2 = 10 3 .The results of exact calculation using eq.(23).Behavior of Ltot is shown by the thick solid line.The narrower and broader dashed areas correspond to its 67% and 90% intrinsic dispersion.The thin solid line shows linear relation for the expectation mean Ltot ∝ n.

Figure 4 .
Figure 4. Probability distributions p (Lmax) of the luminosity of the brightest source in a sample of n sources with a power law LF with the slope α = 1.5 and L 1 = 1, L 2 = 10 3 .Each curve is marked according to the number of sources n.

Figure 5 .
Figure5.Variability of the total emission.The ratio of the fractional rms of the combined emission to that of one source, rms 0 , as a function of the number of sources for different values of the LF slope α (indicated in the upper-right corner of each plot) and different values of the ratio of L 2 /L 1 .The thick solid lines show the square root of the most probable value of the rms 2 tot /rms 2 0 , the shaded areas show the 67% intrinsic dispersion, both obtained from the Monte-Carlo simulations.The thin solid lines were computed using the approximation given by eq.(B3).The dashed line shows rms tot/rms0 ∝ 1/ √ n dependence.

Figure 6 .
Figure 6.Left: The probability distributions p(Ltot/SFR) for different values of SFR.The vertical dashed line shows the expectation mean, defined by eq.(4).Right: The L X −SFR relation.The open circles are nearby galaxies observed by Chandra, the filled triangles are spatially unresolved nearby galaxies observed by ASCA and BeppoSAX, for which only total luminosity is available, the filled circles are distant star forming galaxies from the HDF-N.The thick grey line is relation between the SFR and the most probable value of the total luminosity, predicted from the "universal" HMXB XLF, the shaded area shows its 67% intrinsic spread, the dashed line is the expectation mean, defined by eq.(4).The five nearby galaxies, used byGrimm et al. (2003) to derive the "universal" HMXB XLF, are marked as crossed boxes.

Figure 7 .
Figure 7. Dependence of the L X −SFR relation on the XLF cutoff luminosity Lcut.The three curves, corresponding to different values of Lcut coincide in the non-linear low SFR regime but differ in the position of the break between linear and non-linear regimes.The data are the same as in Fig.6.

Figure 9 .
Figure9.Left: The probability distribution of the luminosity of the brightest HMXB source for different values of the star formation rate, computed from eq.(27) using the parameters of the "universal" luminosity function of HMXBs.Right: Expected luminosity of the brightest HMXB vs. star formation rate.The thick solid line shows the most probable value of the Lmax (eq.(A1)), the shaded area shows its 67% intrinsic dispersion, obtained from the probability distribution given by eq.(27).The filled circles show maximum observed luminosity of HMXBs in the Milky Way and several nearby star forming galaxies.

Figure 10 .
Figure 10.Illustration of the effect of hypothetical intermediate mass black holes on the L X −SFR relation.Left: The luminosity function of compact sources at different levels of star formation rate.Right: Corresponding L X −SFR relation.The thin straight line shows the linear dependence.

Figure 11 .
Figure 11.Left: Probability distribution for the luminosity of the brightest LMXB for different values of the stellar mass, computed from eq.(27), using the parameters of the broken power law approximation to the "universal" luminosity function of LMXBs.Right: Expected luminosity of the brightest LMXB source vs. stellar mass of the host galaxy.The thick solid line shows the most probable value of the Lmax, the shaded area shows its 67% intrinsic dispersion, both obtained from the probability distribution given by eq.(27).The filled circles show maximum observed luminosities of LMXB sources in the Milky Way and several nearby galaxies, studied with Chandra.The "broken" shape of the predicted dependence is a consequence of the broken power law approximation of the "universal" LMXB XLF, used in the calculations.

Figure 13 .
Figure13.Comparison of variability of low and high mass Xray binaries.Dependence of the ratio rms tot/rms0 on the total X-ray luminosity of the galaxy due to X-ray binaries.The thick solid lines show the most probable value of the rms tot/rms0, the shaded areas show its 67% intrinsic dispersion, both obtained from the Monte-Carlo simulations for the respective "universal" luminosity functions fromGrimm et al. (2003) andGilfanov (2004).For both curves, the linear parts at high L X follow rms ∝ 1/ √ L X averaging law.