κ monty: a Monte Carlo Compton Scattering code including non-thermal electrons

Low-luminosity active galactic nuclei are strong sources of X-ray emission produced by Compton scattering originating from the accretion flows surrounding their super-massive black holes. The shape and energy of the resulting spectrum depend on the shape of the underlying electron distribution function (DF). In this work, we present an extended version of the grmonty code, called κ monty . The grmonty code previously only included a thermal Maxwell J¨utner electron distribution function. We extend the gromty code with non-thermal electron DFs, namely the κ and power-law DFs, implement Cartesian Kerr-Schild coordinates, accelerate the code with MPI


INTRODUCTION
Active Galactic Nuclei (AGN) are strong sources of radiation over the full range of the electromagnetic spectrum, from radio up to γ-rays.The emission is expected to originate from a relativistic plasma flow close to these galaxies' central supermassive black holes.Low-luminosity AGN (LLAGN) are well-known sources of X-ray emission.Sagittarius A* (SgrA*), the black hole in the centre of the milky way, shows X-ray variability on the time scales of hours (Baganoff et al. 2003;Eckart et al. 2004).Messier 87 (M87), now famous for the first picture of a black hole shadow by the event horizon telescope collaboration (EHT) (EHT Collaboration et al. 2019a), is also active in X-ray emissions (Wilson and Yang 2001;Marshall et al. 2002;Perlman and Wilson 2005;Prieto et al. 2016), and shows X-ray variability on timescales ⋆ E-mail: j.davelaar@columbia.edu of days (Harris et al. 2009).One channel to generate the Xray emission is via inverse Compton (IC) scattering.
To compute synthetic spectra of LLAGN, a variety of Monte Carlo codes have been developed (see e.g.Yao et al. (2005); Stern et al. (1995); Schnittman et al. (2006); Schnittman and Krolik (2009); Laurent and Titarchuk (1999); Böttcher and Liang (2001); Böttcher et al. (2003); Dolence et al. (2009); Ryan et al. (2015); Narayan et al. (2016); Zhang et al. (2019); Mościbrodzka (2020)).A large subset of the code uses the input from general relativistic magnetohydrodynamics (GRMHD) global simulations of weakly radiating accretion flows.In these simulations, the electron energy distribution function is not explicitly computed.GRMHD codes use a fluid approximation that only contains information on the bulk properties of the plasma and no information on the distribution function.A fluid approach also does not intrinsically contain collisionless effects.However, accretion flows in LLAGN like M87* and SgrA* have a mean free path for the electrons that is much large than the actual system size, making them effectively collisionless, and deviations from thermality are, therefore, to be expected.Magnetic reconnection, dissipation of turbulent energy, shocks, and/or other plasma instabilities that influence the shape of the electron distribution function are, in general, poorly resolved, and sub-grid models for electron heating and acceleration have to be invoked.Successful attempts to resolve magnetic reconnection in global twodimensional GRMHD simulation have been performed by Ripperda et al. (2020); Nathanail et al. (2020) and, more recently, in three-dimensional simulations by Ripperda et al. (2022).These models further strengthen the need for nonthermal electron distribution functions.
One of the first attempts to model non-thermal emission from SgrA* with radiatively inefficient accretion flows (RIAF) models were made by Özel et al. (2000).Later works using RIAF models include Yuan et al. (2003), among others.Semi-analytical models of RIAFs, including jets, were developed by Broderick et al. (2015), who included electron acceleration via gap acceleration.Works by (Quataert 2004;Chan et al. 2015b,a) showed that for Sgr A* that X-ray bremsstrahlung emission is non-negligible.More recently, dynamical models based on GRMHD simulations were published by Chan et al. (2009); Ball et al. (2016); Mao et al. (2017); Chael et al. (2017); Davelaar et al. (2018bDavelaar et al. ( , 2019)); Chatterjee et al. (2020); Cruz-Osorio et al. (2022); Fromm et al. (2022).A common conclusion in all these works is that non-thermal electrons enhance the amount of NIR and radio emission.However, most of these works rely on General Relativistic Ray Tracing methods, so X-ray emission generated via IC is neither included nor approximated.
The importance of including X-ray emission in the current models used within the EHT community was shown by Mościbrodzka et al. (2016).For M87, the X-ray emission is a clear discriminator between models.If the electron temperature in the accretion disk is too high, the model easily overproduces the observed X-ray flux.However, the electron distribution function was assumed to be a thermal Maxwell-Jüttner distribution.
The X-ray emission is expected to be produced by Synchrotron Self Comptonisation.This process starts with electrons in the accretion flow that gyrate around magnetic field lines and produce emission via synchrotron emission.The emitted photons are then upscattered by the hot relativistic electrons inside the flow to X-ray and γ-ray energies via Compton scattering.Since the amount of energy that is transferred from the electron to the photon (∆E) depends on the Lorentz factor of the electron (γ) as ∆E ∝ γ 2 , the resulting spectra of the upscattered photons depend on the choice of the distribution function.It is expected that adding accelerated particles will increase the total X-ray luminosity of the source since electrons with large γ factors and photons with larger frequencies (NIR) are present.Observed X-ray flares in AGN, and other astrophysical sources, are indicators of ongoing particle acceleration to rule out potential acceleration mechanisms models that include the generation of X-ray emission based on non-thermal electrons are needed.
In this work, we present κmonty 1 a new flavour of the Monte Carlo code grmonty originally developed by Dolence  (Mościbrodzka 2020) also included polarisation and was extended to include non-thermal electron distribution functions (Mościbrodzka 2022).We made three large adaptations to the original Dolence et al. (2012) code.First, we coupled our code to the non-uniform adaptive mesh refinement (AMR) grid data structure of the GRMHD code BHAC (Porth et al. 2017;Olivares et al. 2019, www.bhac.science), in a similar manner as we described in Davelaar et al. (2019).Second, we implemented the fit formula for the emission and absorption coefficients as obtained by Pandya et al. (2016) for the initial seed photons.Third, we derived and implemented semi-analytical sampling algorithms for the κ and power-law distribution function.The methods described in this work were used to compute the X-ray SEDs of the κ-DF based models in the Event Horizon Telescope results of Sagittarius A* (Event Horizon Telescope Collaboration 2022).
In Section 2, we explain our sampling routine and describe the setup used.In Section 3, we perform a variety of code tests.We discuss and summarise our results in Section 5.

METHODS
In this section, we present the additions we made to the original grmonty code (Dolence et al. 2009).Our new code κmonty includes the κ distribution and power-law distribution to study accelerated particle emission and is mpi and openmp optimised.At first, the superphotons, a packet of photons with weight w, where the weight is w is the number of real photons represented by the superphoton, are initialised by either thermal, κ, or power-law-based emission coefficients.As they propagate through the plasma, the total intensity decreases due to absorption.Scattering events are selected based on the mean free path length.If a photon is selected for scattering, the electron has to be drawn from the relevant distribution function.In this section, we summarise the new features of κmonty, which are new sets of coordinates, new distribution functions, and the coupling to non-uniform data formats.For a complete explanation of the initialisation, integration, and scattering of the superphotons, see the paper by Dolence et al. (2009).In this section, we will give a global summary of the different aspects of the code, and we will explain in detail our modifications.

Geodesic integration
The trajectory of the superphotons is computed by solving the geodesic equation, where Γ α µν are the Christoffel symbols, and λ the affine parameter.The Christoffel symbols depend on derivates of the metric and are given by, (2) The Christoffel symbols are either provided analytically or can be computed by computing the metric derivatives using a second-order finite difference method.

Kerr-Schild coordinates
A rotating black hole is described by the Kerr metric (Kerr 1963).In spherical Kerr-Schild horizon penetrating coordinates the non-zero covariant components of the metric2 gµν are given by for a black hole with unitary mass M = 1 and angular momentum J, where a * = J/(M c) is the dimensionless spin parameter, Σ = r 2 + a 2 * cos 2 θ, and ∆ = r 2 − 2r + a 2 * .The GRMHD code used in this work BHAC primarily uses for the EHT GRMHD library (EHT Collaboration et al. 2019b) Modified Kerr-Schild coordinates where the r and θ coordinates are modified.The coordinates (t, X1, X2, X3) are related to standard KS via This results in a grid that is logarithmic spaced in radius and concentrated towards the midplane in θ, set by the h parameter.Transforming the metric to MKS is done via multiplication of the metric terms with the non-zero elements of the Jacobian,

Cartesian Kerr-Schild coordinates
We extended the code to include Cartesian Kerr-Schild (CKS) coordinates, which relate to spherical Kerr-Schild coordinates via The covariant Cartesian KS metric, gµν , is given by (Kerr 1963) here ηµν is the Minkowski metric and is given by ηµν = diag(−1, 1, 1, 1), and where r is given by and In the limit of R ≫ a * , the radius r → R. The contravariant metric is defined as where l ν is given by

Distribution functions
For the distribution function, we either use a Maxwell-Jüttner (MJ) distribution function (DF), a κ-DF, or a power law-DF.All three distribution functions are isotropic.
The MJ DF is given by where γe is the Lorentz factor of the electrons, ne the number density of electrons, Θe the dimensionless electron temperature, and K2 the modified Bessel function of the second kind.For the thermal DF, the emission coefficients used in κmonty can be found in Leung et al. (2011a).
The κ DF is used to describe the particle population of a variety of space plasma, such as the solar wind (Decker and Krimigis 2003), coronal flares on the Sun (Livadiotis and McComas 2013), turbulent flows (Kunz et al. 2016), and jets (Davelaar et al. 2018b).X-ray spectra generated based on this distribution function could, therefore, be of interest to a broad range of astrophysical problems.The DF in relativistic form (Xiao 2006) is given by, where the κ-parameter sets the power law-index via κ = p + 1, w is the width of the distribution function, and N is a normalisation constant.In the κ case, the normalisation constant N is not known analytically and is, therefore, when needed, computed during run time with a gsl integrator by demanding that The emission coefficients for the κ-DF can be found in Pandya et al. (2016).
Finally, the power-law DF is given by where p is the power-law index.The DF function is nonzero only when γ is between γmin and γmax.The emission coefficients for the power law-DF can be found in Pandya et al. (2016).
All three DFs are shown in figure 1.The dependence of the DF in the code can be found in three places; the emission coefficients, the cross-section for scattering, and the sampling of the DF if a scattering event takes place.In the remainder of this section, we will explain what changes we made to the code for each of these.

Emission coefficients
For the emission coefficients, the code uses fit formulas from Leung et al. (2011b) for the thermal distribution function and κ and power-law from Pandya et al. (2016).The fit formulas for the κ coefficients are only valid for κ < 7.5, they do not recover the thermal DF in the limit of κ → ∞.

Cross-sections
The cross-section for an IC scattering is dependent on the local electron population, both the energy budget as well as the shape of the distribution.The cross-section is given by where σ h is defined as the "hot cross section", where ne is the number density of electrons, dne d 3 p is the electron distribution function, µe is the cosine of the angle between the superphoton momentum and the electron momentum, and βe is the electron speed in the plasma frame, and σ the Klein-Nishina total cross-section.
From the cross-section the code computes the scattering opacity τs, via τs = αsrgh/(mec 2 )∆λ, where αs is the extinction coefficient given by αs = neσ h .The total probability for a scattering event is then given by p = 1 − e −bτs , where b is a bias factor that enhances the scattering probability, as introduced in Dolence et al. (2009).This bias is then counteracted by splitting the scattered superphoton into an upscattered superphoton with weight bw and an unscattered remnant superphoton with weight (1 − b)w.In this work, κmonty only exploits the original grmonty bias function given by b = Θe/⟨Θe⟩, where ⟨Θe⟩ is the volume averaged dimensionless electron temperature.More fine-tuned bias functions, such as the one in igrmonty (Wong et al. 2022), are not explored since the main focus of this work is the numerical algorithms for non-thermal DF sampling.

Sampling routines
The outcome of a Compton scattering event between a superphoton, and an electron depends on the superphoton's wavevector and the electron's four-velocity.The electrons are, in the case of GRMHD models, coupled to the plasma parameters of the protons, which are approximated by a fluid description.We, therefore, only know ensemble averages.To be able to select a single electron, we need a sampling algorithm that, given a set of plasma variables, draws a γ factor based on the chosen DF, in the original grmonty, the procedure from Canfield et al. (1987) is used for the MJ DFs.Therefore, only new samplers for the κ-DF and power law-DF are needed.

A semi-analytical sampling routine for the κ distribution function
For the κ-DF, we will generalise the procedure from Canfield et al. (1987) for the κ-distribution function.The relativistic κ-distribution function as function of velocity β = v c is given by To sample electrons based on this distribution function, we derive a Monte Carlo-based scheme.We first introduce a random variable y that is coupled to γ and w and transform our probability density function (pdf) accordingly Following the procedure by Canfield et al. (1987), we can split our pdf into a series of pdfs after multiplying with to obtain this can be rewritten as We can now identify two different functions that are the core of the sampling routine, a rejection function H3(w, y) and a sampling function G3(w, y) such that fκ(y, w) = G3(w, y)H3(w, y). (26) The sampling function is given by and consists of a series of sample coefficients gj(y) and probability coefficients πj(w).The sampling coefficients are given by gj(y) ≡ where nj is a normalisation constant obtained by integrating Performing these four integrals, we get where Γ(x) is the Gamma function.The analytical solutions for the normalisations are only valid in the case that κ > 2.
The probability coefficients are given by with The rejection function H3(w, y) is then defined as The rejection criterion is then, similarly to Canfield et al. (1987), The rejection criterion can be generalised even more when one also wants to add an exponential cutoff to the κdistribution function, We can contract the exponential cutoff into the rejection criterion.Since if we encounter a large value of γ, it will decrease the likelihood of being accepted by the sampling routine, The procedure for sampling the κ distribution function is, therefore This rejection constraint is, as mentioned in Canfield et al. (1987), very efficient because for large values of w or small values of w, h(w, y) asymptotes to one.
The last step in this derivation of the sampling routine is to find a procedure for the third step, find y according to gj(y).In the case of a thermal distribution function, the gj(y) are χ 2 functions that can be sampled with standard gsl library functions.In the case of the κ-distribution function, this is less straightforward.To sample gj(y), we make use of the fact that the cumulative distribution functions (CDF) belonging to the pdfs gj(y) are monotonically increasing functions between zero and one.These CDFs can be obtained by integrating gj(y) from zero to y, Fj(y) = y 0 gj(y)dy. (43) Performing these four integrals result in where 2F1 is the second-order hypergeometrical function of the first kind.We can then find an y by using an inverse transform sampling method, (i) draw a number u from a uniform distribution [0, 1] (ii) solve such that Fj(y) = u, where Fj(y) is the cumulative distribution function of gj(y) (iii) y is sampled according to gj(y) To solve step two, we implemented a Brent root-finding algorithm.
For the implementation of the algorithm, special attention has to be paid to the hypergeometrical functions encountered in the CDFs.For negative integer values of the arguments of the hypergeometrical function, it is impossible to use the series expansion form, as implemented in the gsl library.We, therefore, pre-computed a table of the hypergeometrical function as a function of κ and y with Mathematica, which is read in by κmonty.The resulting table is then interpolated with a first-order interpolation scheme.

The κ → ∞ limit
In the case that κ → ∞ we expect our derived sampler to recover the original sampler by Canfield et al. (1987).
Our last test is to check the rejection criterion, which is already in the same form and is equal to the one by Canfield et al. (1987) in the case that lim κ→∞ w = Θe, which is the case if w = κ−3 κ Θe.

A numerical sampling routine for the κ distribution function
We also implemented a more mundane rejection sampling method for drawing electrons from the κ distribution.This implementation can be found in the public code igrmonty 3 as well as in κmonty.The expression ∂fκ/∂γ = 0 is solved for γ, hereafter γmax.Fiducial γ are then drawn uniformly in log space between γmin, = MAX(1, 0.01 × Θe) ( 56) where these parameters are chosen to ensure both accuracy and computational efficiency for all Θe.New γ are drawn until the condition γfκ(γ)/γmaxfκ(γmax) > rand, where rand is a uniformly distributed random number in the range [0, 1) and the extra factors of γ arise from drawing fiducial γ uniformly in log space.This procedure generalises to any realistic electron distribution function, including the MJ distribution, but also, e.g., anisotropic DFs, DF based on charged test particles in MHD, or DFs based on first-principle PIC simulations.Compared to the Canfield et al. (1987) prescription for sampling MJ, this rejection sampling approach leads to only modestly (∼ 20 %) slower calculation wallclock times.

Power-law distribution
For the power-law distribution function, the procedure is much more trivial.The cumulative distribution function is, in this case, given by Since this CDF can be inverted analytically, we can use an inverse sampling method.We first pick a random number x between zero and one, and γ can then be found by computing the inverse of eqn.58 and setting

Interface with BHAC
The Black Hole Accretion Code (BHAC) (Porth et al. 2017;Olivares et al. 2019) is a finite volume code that solves the covariant GRMHD equation using a 3+1 split.The code is capable of using AMR grids that, during runtime, based on user-defined criteria, can refine or derefine the grid.BHAC outputs the GRMHD data in an octree structure.We fully interfaced κmonty to this data format similarly to Davelaar et al. (2019), and made κmonty capable of initialising superphotons and performing IC in a non-uniform grid.Only at a low or high Lorentz factor the error increases for the w = 10 and w = 0.1 cases, respectively.This is caused by the MC nature of our sampler, which makes regions with small electron number density harder to sample due to MC noise.

CODE VERIFICATION
To verify our implementation of the described methods, we performed extensive code tests.In this section, we describe the performed tests and discuss the results.

Sampling routines
We first tested the sampling routines by computing κdistribution functions for three values of w = [0.1, 1, 10] and κ = 4.This is done by drawing 10 9 electrons and comparing the resulting distribution with the analytical form.The results of this are shown in figure 2. Overall the relative difference between the exact form and the distribution is close to 0% for all values of w, except for γ values close to one for w = 10, or large γ values in the case of w = 0.1.For these γ values, the distribution functions have a small population of electrons, which are therefore dominated by MC noise.Similarly, for the power-law distribution, we computed distribution for three values of the power-law index, p = [3, 4, 5] and γmin = 3.5, and γmax = 104 .The results of this are shown in figure 3. We find good agreement between the analytical form and the output of our sampler, and large deviations from 0% are only found in regions with low electron number densities, e.g., large γ values.

Uniform sphere test
To test the implementation of both the emission coefficients and the scattering kernel, we designed a simple one-zone model that we will use to compare κmonty with the ray For all three cases we set γ min = 1.0 and γmax = 10 3 .The deviation between the sampler and the exact form is for almost the entire range less than 1%, except for large γ values, due to the steep power-law populating the high end of the DF is affected by MC noise.
tracing codes RAPTOR 4 and ipole-IL and cross-compare the semi-analytical and numerical samplers in κmonty and igrmonty.The one-zone model is an isothermal sphere with a uniform density profile embedded in a uniform vertical magnetic field in flat spacetime.We solve the problem in spherical polar coordinates; the non-zero metric terms are given by gtt = −1 (60) grr = 1 (61) The plasma variables are given by, The solution is specified by constants ρ0, R0, Θe,0, and B0, along with an outer boundary to the domain Rout.ρ0 is given in terms of a characteristic Thomson depth τ0: where L is the code length unit conversion and N is the electron number density unit conversion.B0 is expressed in terms of the plasma β at r = 0, β0: where Pg is the gas pressure.We set the adiabatic index to γadiab = 13/9, and the ratio between the proton and electron temperature is set to be Trat = 3.0.

Comparison with RAPTOR
To check whether the implementation of the emission coefficients is correct, we compute the SED without Compton scattering of the uniform sphere with κmonty and RAP-TOR.RAPTOR (Bronzwaer et al. 2018;Davelaar et al. 2018a;Bronzwaer et al. 2020) is a General Relativistic Ray tracing code that solves the covariant radiation transport equation in curved spacetime.GRRT methods are intrinsically different from MC methods since they use bundles of rays.This makes them ideal for computing synthetic images since only a small portion of the sky is covered by a camera.The camera consists of pixels, and every pixel is assigned an initial wavevector used to solve the geodesic equation backwards in time.Along these geodesics, the unpolarised radiation transport equation is solved.
The setup and code-specific parameters for both κmonty as well as RAPTOR are shown in Table 1.The results of this test can be seen in Figure 4.All three cases show minor discrepancies between the two codes except for the lowfrequency part of the spectrum, which is dominated by MC noise.The thermal case shows deviations at the highest frequencies as well.The source becomes optically thin at these frequencies, and the emission region shrinks.The electrons responsible for this emission are at the exponential tail of the distribution functions, affecting the sampling.

Comparison with ipole-IL
As an additional independent test to validate both κmonty as well as RAPTOR, we also cross-compared the output of κmonty with the ipole-IL ray tracing code (Mościbrodzka and Gammie 2018;Wong et al. 2022).This test is identical to the test with RAPTOR presented in the previous subsection.The result can be seen in Figure 5, and the agreement between the two codes is identical to the test with RAPTOR.

Compton Scattering test
To test the implementation of the non-thermal DFs within the Compton scattering module, we cross-compared a semianalytical implementation with a numerical one in igrmonty.The camera is positioned at a distance of 10 4 L, and we compute spectra for three inclinations 30 • , 60 • , 90 • .The full set of parameters is similar to the ones shown in Table 1, except that N θ = 3,N ϕ = 1.And we use folding around the equator, meaning bins in the half-sphere above and below the equator are averaged to increase statistics.
The results of this comparison can be seen in Figure 6.There is clear consistency between the two methods, with a relative difference of less than 1% for most of the SED.The MC noise grows at high frequencies due to the low probability of double-scattering events.To test the convergence of this test, we performed multiple runs with different amounts of initial superphotons.The convergence can be seen in the right panel of Figure 6.There is a clear √ N convergence visible, as would be expected of a Monte Carlo code.

GRMHD test
A more challenging test is performed by comparing the synchrotron part of the SED between κmonty and RAPTOR computed from a snapshot of a GRMHD simulation.The simulation, in CKS coordinates, is the same as that presented in Davelaar et al. (2019); Olivares et al. (2019).The initial condition of this simulation is a Fishbone and Moncrief (1976) torus with black hole spin parameter a * = 15/16, inner radius 6 GM/c 2 , pressure maximum at 12 GM/c 2 , and adiabatic index γ = 4/3.The initial magnetic field profile is a single poloidal loop that follows isocontours of the density profile.The initial torus is weakly magnetised and set by the ratio between the maximum magnetic pressure Pmag,max and maximum gas pressure Pmax and is set to be Pmax/Pmag,max = 100.Since the GRMHD simulation we used does not include electron thermodynamics, we use a parameterisation for the electron temperature Θe = U (γ − 1)mp ρme(Trat + 1) where U is the internal energy, mp the proton mass, mp the electron mass, and Trat the ratio between the proton to electron temperature which we set to Trat = 3.0.To impose charge neutrality of the plasma, we set the electron number density equal to the proton number density.
The GRMHD simulation is scale-free to convert from code to c.g.s.units.Besides the aforementioned length and time unit, also a mass unit M is needed.The length and time units are given by the black hole length and gravitational timescales, rg = GMBH/c 2 and tg = rg/c, while the mass unit sets the energy content of the simulation and is tightly related to the mass accretion rate via Ṁ = ṀsimM/T .To convert the plasma variable to c.g.s.units the following conversion factors are used: ρ0 = M/L 3 , u0 = ρ0c 2 , and B0 = c √ 4πρ0.The test-specific parameters for the camera, DF, and GRMHD parameters can be seen in Table 1.Only the inner 40 rg of the GRMHD domain is used to limit the field of view needed and speed up the convergence of the MC solution.
The results of this test can be seen in Figure 7.For all three DFs, the error is close to 1% in most frequency bins.For the thermal case at high frequency, the error grows at high frequencies, similar to the uniform sphere test.All three DFs show less agreement at lower frequencies (around 10 10 Hz), due to opacity effects.

CODE PERFORMANCE AND AVAILABILITY
Since the convergence of a Monte-Carlo simulation scales with √ N , with N the amount of superphotons, it is computationally demanding to acquire a fully converged solution.To accelerate the convergence, grmonty was parallelized with OpenMP, allowing it to run on multiple cores on a single node.To improve our code performance even further, we parallelized κmonty with MPI, which allows us to run over many nodes.We identified two potential ways to MPI parallelize our computations.One could either distribute the GRMHD domain over all the available MPI processes and trace superphotons through the domain, this could lead to substantial communication overhead when superphotons leave/enter the domain of a processor, or would require a very labour intensive implementation where batched superphotons are send and received with non-blocking MPI.Alternatively, one could launch independent MPI instances that all have the full domain in memory and at termination, sum all the resulting spectra.The first option has the benefit that it is memory efficient.However, scaling is limited to the IO overhead as the domain per MPI instance gets smaller.The second option is more memory demanding but is trivial to implement and is easily scalable to large numbers of nodes as long as the domain fits within the memory per MPI task.A typical 256 3 simulations takes about five Gigabytes of memory, which allows for this implementation strategy, for high resolution simulations either the first method or a hydro Openmp+MPI implementation should be explored.For κmonty, we opted for the second parallelization strategy.In the remainder of this section, we test our implementation for scalability and performance.
Figure 6.Comparison between the semi-analytical (κmonty) and numerical sampler (igrmonty) at three inclinations, top left: spectra at inclinations of 30 • , 60 • , and 90 • .All three cases show consistent spectra with order 1% differences (bottom left).The MC noise grows at high frequencies due to the low probability of double-scattering events.Right: L 1 convergence of the comparison between the semi-analytical and numerical samplers.As expected of MC methods, a clear √ N trend is visible.The code is publicly available on GitHub5 .To test the performance of κmonty we ran a BHAC MKS GRMHD snapshot with scattering for the κ-DF and thermal-DF.First, we varied the number of initial superphotons to test for the solution's self-convergence.The resulting spectra for N = (106 , 10 7 , 10 8 , 10 9 ) can be seen in the left panel of Figure 8.The right panel shows the convergence rate, which shows a √ N scaling.Secondly, we ran the code on varying amounts of nodes to test our code's scalability.For this test, we used nodes with 128 AMD Rome cores with a total processing power per node of 4.6 teraflops.We compiled the code with the intel compiler and standard intel optimization flags.We varied the number of nodes from 1 to 40 nodes.Again, we use the GRMHD setup with scattering and start the code with N = 10 5 superphotons per core.The scaling is shown in Figure 9, and shows an evident linear scaling.Overall we achieve a speed of a few thousand superphotons per second per core, although note that this highly depends on the problem, stepsize, and scattering opacity.The 40 nodes run achieves a speed of ten million superphotons per second.The difference between the thermal, power-law, and κ-DFs is negligible since the computational bottleneck is the handling of the non-uniform data structure and geodesic integration.
We also tested the performance of the semi-analytical samplers.We provide a stand-alone openmp accelerated code of the sampling routines on GitHub 6 .We ran the samplers on the same architecture as for the κmonty performance test.The κ-DF samples a few hundred thousand electrons per second per core, while the power-law and thermal DF samples around ten million electrons per second per core.Although the κ-DF is orders of magnitude slower, since it does not make use of heavily optimized gsl samplers, the routine κmonty 11 10 10 10 12 10 14 10 16 10 18 10 20 10 22 [Hz] is only called at most one or two times per superphoton, meaning the computational cost is negligible when used by κmonty.

CONCLUSION
We presented our new κmonty code.The code is an extension of grmonty and now includes κ and power-law distribution functions for both the radiative transfer coefficients and sampling routines.The code can also post-process the AMR data format of BHAC.We tested our sampling routines by comparing the numerical output to the analytical DFs.
We used a uniform isothermal sphere to test the implementations of the emission coefficients by comparing them with the ray-tracing code RAPTOR.We tested the full emission and scattering kernels by comparing them to an implementation of the kappa distribution in igrmonty that uses the numerical sampling routine.And finally, test the coupling to BHAC by comparing the synchrotron emission with RAPTOR by using a snapshot of a black hole simulation in Cartesian coordinates which uses AMR.
(i) Draw a random number x1 (ii) If x1 < πj (iii) Find y according to gj(y) (iv) Draw a random number x2 (v) Accepted y when x2 < h(w, y)

Figure 2 .
Figure2.Output of the κ sampler compared to the analytical form for: w = 0.1 (red), w = 1.0 (green) and w = 10 (blue).All three cases use κ = 4.0.The majority of the DF from the sampler shows almost perfect agreement with respect to the exact form.Only at a low or high Lorentz factor the error increases for the w = 10 and w = 0.1 cases, respectively.This is caused by the MC nature of our sampler, which makes regions with small electron number density harder to sample due to MC noise.

Figure 3 .
Figure3.Output of the power-law sampler compared to the analytical form, p = 4.0 (red), p = 3.0 (green), and p = 2.0 (blue).For all three cases we set γ min = 1.0 and γmax = 10 3 .The deviation between the sampler and the exact form is for almost the entire range less than 1%, except for large γ values, due to the steep power-law populating the high end of the DF is affected by MC noise.

Figure 4 .Figure 5 .
Figure4.Comparison between κmonty and RAPTOR for the uniform sphere test.Top panel: spectra for MJ-DF (red), κ-DF (green), and power-law DF (blue) for κmonty, black crosses data point from RAPTOR.Bottom panel: the relative difference between the three DFs.The relative error is around 1% for the majority of the SED.In the thermal case, at high frequency, the error grows.This is caused by a quickly shrinking emission region size, making sampling more difficult.All three models show slightly more noise at low frequencies due to the larger optical thickness.More superphotons are absorbed, which makes convergence more difficult compared to the optically thin part of the spectrum.

Figure 7 .
Figure 7.Comparison between RAPTOR and κmonty for the GRMHD test, left: Thermal DF, middle: κ-DF, right: power-law DF.Consistent with the uniform sphere test, the deviations are of the order of 1% per cent except for the high or low-frequency part of the spectrum.

Figure 8 .Figure 9 .
Figure 8. Self convergence of the spectra based on the GRMHD snapshot including Compton scattering, model uses the κ-DF.Left: spectra for N = (10 5 , 10 6 , 10 7 , 10 8 , 10 9 ) superphotons.Right: self convergence of the solution with respect to the N = 10 9 superphotons run.An evident √ N scaling is visible as expected for MC methods.
.(2012).grmonty is a general relativistic Monte Carlo radiative transport code developed to compute spectra of accreting black holes.A more recent version called RAD-POL 1 Publicly available at: https://github.com/jordydavelaar/kmontyet al

Table 1 .
Code and test parameters for all tests described in this paper.