Particle-in-cell Simulations of the Magnetorotational Instability in Stratified Shearing Boxes

The magnetorotational instability (MRI) plays a crucial role in regulating the accretion efficiency in astrophysical accretion disks. In low-luminosity disks around black holes, such as Sgr A* and M87, Coulomb collisions are infrequent, making the MRI physics effectively collisionless. The collisionless MRI gives rise to kinetic plasma effects that can potentially affect its dynamic and thermodynamic properties. We present 2D and 3D particle-in-cell (PIC) plasma simulations of the collisionless MRI in stratified disks using shearing boxes with net vertical field. We use pair plasmas, with initial $\beta=100$ and concentrate on sub-relativistic plasma temperatures ($k_BT \lesssim mc^2$). Our 2D and 3D runs show disk expansion, particle and magnetic field outflows, and a dynamo-like process. They also produce magnetic pressure dominated disks with (Maxwell stress dominated) viscosity parameter $\alpha \sim 0.5-1$. By the end of the simulations, the dynamo-like magnetic field tends to dominate the magnetic energy and the viscosity in the disks. Our 2D and 3D runs produce fairly similar results, and are also consistent with previous 3D MHD simulations. Our simulations also show nonthermal particle acceleration, approximately characterized by power-law tails with temperature dependent spectral indices $-p$. For temperatures $k_BT \sim 0.05-0.3\, mc^2$, we find $p\approx 2.2-1.9$. The maximum accelerated particle energy depends on the scale separation between MHD and Larmor-scale plasma phenomena in a way consistent with previous PIC results of magnetic reconnection-driven acceleration. Our study constitutes a first step towards modeling from first principles potentially observable stratified MRI effects in low-luminosity accretion disks around black holes.


INTRODUCTION
The primary driver of accretion in astrophysical disks is believed to be the turbulence generated by the magnetorotational instability (MRI; Balbus & Hawley 1991, 1998, which provides the needed outward transport of angular momentum. Most of our knowledge about the nonlinear evolution of the MRI in different disk regimes comes from magnetohydrodynamic (MHD) simulations. However, in the regime where the plasma accretion rate is much lower than the Eddington rate, the Coulomb mean free path of the particles can be much larger than the system size, rendering the disk effectively collisionless and making the MHD approach inapplicable. This collisionless accretion regime is expected, for instance, in the low-hard state of X-ray binaries (Esin et al. 1997) as well as around the central supermassive black holes of most nearby galaxies, including M87 and Sagittarius A* (Sgr A*) in our own Milky Way (Yuan & Narayan 2014).
The collisionless version of the MRI can give rise to several kinetic plasma phenomena, which may in turn affect its dynamics as well as the thermodynamic properties of the accreting plasma. ★ E-mail: astor.sandoval@ug.uchile.cl These kinetic phenomena have been studied mainly via unstratified shearing-box MRI simulations, using either a fluid approach through kinetic-MHD models (Sharma et al. 2006(Sharma et al. , 2007 or particle simulations that employ either the hybrid or the particle-in-cell (PIC) methods (Riquelme et al. 2012;Hoshino 2013Hoshino , 2015Kunz et al. 2016;Inchingolo et al. 2018;Bacchini et al. 2022). One of the relevant kinetic phenomena is the appearance of an anisotropic stress, which is due to the presence of a pressure anisotropy in the accreting turbulence. Previous unstratified shearing-box simulation studies, both based on fluid and particle methods, have found that this anisotropic stress may contribute significantly to the disk viscosity, making the collisionless MRI turbulence more efficient in transporting angular momentum compared to its collisional counterpart.
Another potentially important collisionless phenomenon is the possibly different ion and electron heating rates (e.g., Sharma et al. 2007). However, to date PIC studies have only used an ion to electron mass ratio / = 1 (or close to unity), therefore not capturing the possibly different heating efficiencies of the different species. Plasma energization can also include nonthermal particle acceleration. Studying this phenomenon requires fully kinetic treatments of at least one species, which has been done through PIC and hybrid simulations. Different levels of nonthermal particle acceleration have indeed been found by these types of studies (Riquelme et al. 2012;Hoshino 2013Hoshino , 2015Kunz et al. 2016;Inchingolo et al. 2018;Bacchini et al. 2022), although the conditions under which this acceleration is most efficient and the mechanism(s) underlying this phenomenon remain to be clarified.
An important physical ingredient, so far not included in hybrid or fully kinetic PIC studies of the MRI, is the vertical stratification of the disks. While the unstratified local shearing-box approximation allows us to investigate a disk by focusing on a small vertical section, this approach does not account for potentially important processes in stratified disks, such as outflows, disk expansion and the generation of a corona, among others. Stratified disks have been included in previous MHD shearing-box simulations of the MRI, which have found that stratification can give rise to important phenomena like outflows and dynamo-like processes, which may in turn affect the overall accretion efficiency of the disks (Bai & Stone 2013;Salvesen et al. 2016). Also, a kinetic-MHD study that considers a stratified disk (Hirabayashi & Hoshino 2017) has found that disk stratification may decrease the importance of anisotropic stress significantly compared to unstratified kinetic-MHD results.
To address these possible effects, our study employs 2D and 3D stratified shearing-box PIC simulations to examine the development of the collisionless MRI. We use equal ion and electron masses, = = for computational convenience, and focus on the sub-relativistic temperature regime, relevant for the inner regions of black hole accretion disks ( ≲ 2 , where is the Boltzmann constant, is the plasma temperature and is the speed of light). By comparing with unstratified PIC runs, we show the importance of including stratification to describe phenomena like plasma beta evolution, effective viscosity and particle acceleration. In our 2D runs we pay special attention to the role played by the ratio between the initial cyclotron frequency of the particles and the Keplerian frequency of the disk, ,0 /Ω 0 (hereafter, the scale-separation ratio). In realistic disks, this ratio satisfies ,0 /Ω 0 ≫ 1 and determines the scale separation between mesoscale MHD phenomena and kinetic microphysical processes. Even though most of our analysis is done in 2D, in this paper we take a step forward by conducting the first fully kinetic 3D simulation of the stratified MRI evolution. This preliminary 3D simulation enables us to compare it with our established 2D results and gain valuable insight into the limitations of the 2D approach. This exploration sets the stage for future investigations aimed at fully unraveling the complexities of the 3D scenario.
The paper is organized as follows. In §2 we describe our numerical method and simulation setup. In §3 and §4 we present the general properties of the stratified MRI turbulence in 2D and 3D, respectively. In §5 we quantify the effective viscosity in our runs, and in §6 we analyze the ability of the stratified MRI turbulence to accelerate particles. Finally, we present our conclusions in §7.

SIMULATION SETUP
We use the electromagnetic PIC code TRISTAN-MP (Buneman 1993;Spitkovsky 2005) in 2D and 3D. Our simulations are performed in the local, shearing-box approximation (Hawley et al. 1995), using Cartesian coordinates where the , and axes correspond to the radial, azimuthal (or toroidal) and vertical directions of the disk, respectively. This reference frame rotates with an angular velocity 0 = Ω 0ˆ, corresponding to the Keplerian angular velocity at a radius that coincides with the center of our simulation box. In order to model a stratified disk, we include the vertical component of the gravitational force produced by the central object, − Ω 2 0ˆ, and we initially set up an isothermal disk in hydrostatic equilibrium with a -dependent density profile: where 0 is the plasma density at the disk midplane (considering both species), 0 is the scale height of the disk given by 0 = (2 0 / ) 1/2 /Ω 0 and 0 is the initial plasma temperature, which is given by 0 / 2 = 5 × 10 −3 in all of our runs. Our runs do not include any type of particle cooling, so a gradual increase in the temperature and scale height of the simulated disks is expected due to dissipation of magnetic energy. The whole simulation domain is initially threaded by a vertical, homogeneous magnetic field 0 = 0ˆ, so that the initial plasma parameter in the disk midplane, 0 (= 8 0 0 / 2 0 ), has a value of 0 = 100. These choices for 0 and 0 imply that the initial Alfvén velocity in the disk midplane, ,0 (= 0 /(4 0 ) 1/2 ), is ,0 / = 10 −2 in all of our runs.

Basic Equations
In our rotating frame, the time derivative of particles momentum = ( , , ) is determined by the Lorentz force, the radial and vertical components of gravity, and the Coriolis force: 1 where = /( ) = ( , , ), , and are, respectively, the particle velocity, the particle charge and the electric and magnetic fields. In this non-inertial frame, Maxwell's equations also acquire extra terms, which modify the evolution of the electric field as (Schiff 1939): where is the current density and 0 is the Keplerian rotation velocity of the disk at the center of our simulation box (the evolution of the magnetic field / = − ∇ × is not modified in the rotating frame). As discussed in Riquelme et al. (2012), the terms proportional to 0 in Eq. 3 can in principle be comparable to the displacement current / , but should not change the non-relativistic MHD behavior of the plasma. This is because, in the non-relativistic regime (| 0 | = 0 ≪ ), these extra terms are always much smaller than the first term on the right hand side of Eq. 3 ( ∇ × /4 ). Therefore, the current density should still adjust to satisfy ≈ ∇ × /4 , as assumed in the non-relativistic MHD approach. Thus, as it has been done in all previous PIC and hybrid studies of the MRI, we drop the terms proportional to 0 in Eq. 3 and solve the conventional Maxwell's equations. We are thus implicitly assuming that these (beyond MHD) modifications to the displacement current do not affect considerably the kinetic MRI dynamics. We list the initial parameters of our simulations, which are: the scale-separation ratio ,0 /Ω 0 , where ,0 = | | 0 / is the initial cyclotron frequency of the particles, the box size along the different axes ( , and ) in terms of = 2 ,0 /Ω 0 , the grid spacing Δ (equal in all dimensions) in terms of the initial plasma skin depth, / ,0 = /(4 0 2 / ) 1/2 , the initial number of particles (ions and electrons) per cell, and the speed of light in units of Δ/Δ , where Δ is the simulation time step.

Shearing Coordinates
Simulating the MRI in the shearing-box approximation requires implementing shearing periodic boundary conditions in the radial ( ) direction (e.g., Hawley et al. 1995). We do this by employing shearing coordinates (Riquelme et al. 2012), in which the grid follows the shearing velocity profile within the shearing box, allowing the use of standard periodic boundary conditions in the radial ( ) direction. However, the use of shearing coordinates introduces modifications in the evolution of the electric and magnetic fields, as well as in the evolution of particles momenta and positions. These modifications are described in detail in the Appendix of Riquelme et al. (2012) and, for easy access, are also summarized below.
In the shearing coordinates, the fields evolve as The last terms in these equations, which are proportional to / (hereafter, the -dependent terms), can, however, be neglected in the ,0 / ≪ 1 regime, as it is shown below. This can be seen considering that the size of our shearing-boxes in the direction should be typically a few times the wavelength of the most unstable MRI modes = 2 ,0 /Ω 0 , which means that Ω 0 ∼ ,0 (the fact that is the dominant scale of the MRI turbulence even in its nonlinear stage will be shown in §4.1 and further discussed in §4.2). Also, assuming that the order of magnitude of the time derivative of any field component should satisfy / ∼ Ω 0 , one can calculate the ratios between the magnitudes of the −dependent terms in Eqs. 4 and 5 and the left hand side of these equations, obtaining: for Eq. 4 and for Eq. 5. Since in general | |/| | ≲ 1 (which is verified in §4.2), the right hand side of Eq. 6 is much smaller than unity as long as ,0 / ≪ 1, implying that the -dependent term in Eq. 4 can be safely neglected. The right hand side of Eq. 7, on the other hand, is not necessarily ≪ 1 since its value depends on the precise magnitude of the ratio | |/| |, which makes the −dependent term in Eq. 5 not necessarily negligible. However, using the approximation ∇ ∼ −1 ∼ (Ω 0 / ,0 ) , we can calculate the ratio between the magnitude of this −dependent term and an estimate of the magnitude of the first term on the right hand side of Eq. 5 ( ∇ × ), obtaining: This implies that dropping the −dependent term in Eq. 5 should not change the non-relativistic MHD behavior of the plasma, in which ≈ ∇ × /4 , and is also consistent with our previous choice of ignoring the terms proportional to 0 in Eq. 3. Doing a similar analysis, we find that the ratio between the magnitudes of the third and the first terms on the right hand side of Eq. 5 is so, for consistency, we also neglect the former. Therefore, in our simulations we evolve the fields by solving the equations: In terms of particles evolution, in the shearing coordinates each particle's momentum evolves as (Riquelme et al. 2012): which is valid in the limit Ω 0 ∼ ,0 ≪ and as long as the shear velocity of the plasma within our simulation domain, , is non-relativistic. This last assumption is justified since ∼ Ω 0 , and in our shearing-box is also of the order of a few times = 2 ,0 /Ω 0 . This implies that | | ∼ Ω 0 ∼ ,0 , making Eq. 12 valid in the regime ,0 ≪ . Finally, the evolution of the particles position = ( , , ) is given by: which is obtained combining Eqs. A30 and A35 of Riquelme et al. (2012) also in the limit in which is non-relativistic.
In order to safeguard the numerical stability and accuracy of our simulations, every time the factor (3/2)Ω 0 on the right hand side of Eqs. 10, 11 and 13 equals an integer, we reset these equations to their initial ( = 0) shape. This implies a periodic "unshearing" of our shearing grid that, therefore, requires a remapping of the electric and magnetic fields, as well as of the particles positions. This periodic redefinition of the time origin in our runs means that the factors (3/2)Ω 0 in Eqs. 10, 11 and 13 never surpass unity. Notice also that Eq. 13 implies that relativistic particles may in principle change position in the −direction at a rate close to twice the speed of light. This should not be considered a violation of special relativity, since this equation only describes the update of particle positions in our non-inertial, time-varying shearing coordinates. However, this situation may affect the numerical stability of our method. In order to avoid this possibility, our 3D runs use = 0.225Δ/Δ . Since this is not an issue in our 2D runs, in those cases we use = 0.45Δ/Δ (see Table 1).
We emphasize that, in order to obtain our plasma evolution equations, we have assumed a non-relativistic plasma with ,0 / ≪ 1, which rotates at non-relativistic velocities ( 0 ≪ ). This implies that our work strictly applies to a plasma at radii significantly larger than the gravitational radius of a central black hole. For this reason, in this work we concentrate on a sub-relativistic regime, where the plasma temperature satisfies ≲ 2 . Notice, however, that the treatment of individual particles is relativistic, since (as we see below) a small fraction of them can still be nonthermally accelerated to energies much larger than 2 .

Boundary conditions along
Using shearing coordinates allows the use of periodic boundary conditions both in the (radial) and (toroidal) coordinates. In the coordinate we use open boundary conditions, which allow the existence of field and particle outflows in our stratified setup. Thus, in our runs particles are removed from the simulation box after they cross the vertical boundaries, while the fields are absorbed by these boundaries. This configuration effectively prevents outflowing fields from rebounding into the simulation domain (Cerutti et al. 2015;Belyaev 2015;Sironi et al. 2016). This is done by implementing an absorption layer of width Δ in the vertical boundaries, where the terms are added to the right hand side of Eqs. 10 and 11, respectively. We use Δ = 50 cells and ( ) = (40/Δ )(| − |/Δ ) 3 within the absorption layer ( ( ) = 0 otherwise), where is the inner edge of the absorption layer and Δ is the simulation time step.

Numerical parameters
The simulations presented in this paper and their numerical parameters are listed in Table 1, with all physical quantities in stratified runs corresponding to plasma conditions in the disk midplane. These parameters are the scale-separation ratio ,0 /Ω 0 , where ,0 = | | 0 / is the initial cyclotron frequency of the particles, the box size along the different axes ( , and ) in terms of , the grid spacing Δ in terms of the initial plasma skin depth, / ,0 = /(4 0 2 / ) 1/2 , the initial number of macroparticles (ions and electrons) per cell and the speed of light in units of Δ/Δ , where Δ is the simulation time step. Notice that we ran simulations using several values of Δ, and , and to make sure that our results are numerically converged. Table 1 only includes the simulations used to present our results.

Notation Convention
In this section, we introduce various types of averages denoted by angled brackets with different subscripts, namely ⟨ ⟩ , ⟨ ⟩ − , and ⟨ ⟩ . ⟨ ⟩ denotes the average along the axis at a fixed height for 2D stratified simulations, ⟨ ⟩ − denotes the average over the − plane at a fixed for 3D stratified simulations, and ⟨ ⟩ represents the average taken over the volume of the disk for stratified simulations, while for unstratified simulations it represents the average over the entire simulation domain.
Additionally, we use an overline notation ( ) for quantities that are computed as the ratio of two volume averages. For instance, for the plasma and temperature we define ≡ ⟨8 ⟩ /⟨ 2 ⟩ and ≡ ⟨ ⟩ /⟨ ⟩ . In these expressions, ⟨ ⟩ = ⟨ ∥ ⟩ /3 + 2⟨ ⊥ ⟩ /3, where denotes the isotropic pressure, and ∥ and ⊥ correspond to the pressure parallel and perpendicular to the local magnetic field, respectively.
Since in the stratified runs these averages are calculated in the disk region, we define this region through the condition | | < ( ), where ( ) = (2 / ) 1/2 /Ω 0 denotes the instantaneous scale height of the disk. Notice that the calculation of has to be done in the disk region itself, whose definition depends on through the inequality | | < ( ), implying that the of the disk has to be determined recursively.

2D RESULTS
In this section we describe the stratified MRI turbulence using 2D simulations, paying special attention to the difference between stratified and unstratified simulations and to the role played by the scaleseparation ratio ,0 /Ω 0 . In §3.1 we analyze the properties of the turbulence, and in §3.2 we show the evolution of the plasma properties in the disk.

Turbulence properties in 2D
Figure 1 shows three snapshots of the squared magnetic fluctuations 2 (where = | | and = − 0 ) and of the particle density for the stratified 2D run ST2D-20 ( ,0 /Ω 0 = 20). Panels and show the initial formation of nonlinear channel flows at time = 2.5 [2 /Ω 0 ]. These channel flows appear both in 2 and and are more clearly formed within the disk region (| | < ( )), which is marked by the horizontal dotted lines in all the panels. Panels  . Similarly, panel shows averaged over the − plane for the 3D run ST3D-3.5. and show the same quantities but at = 3.25 [2 /Ω 0 ], when the channel flows have already experienced reconnection, breaking into a turbulent state. At that moment, the disk thickness has increased due to plasma heating and significant particle and magnetic field outflows occur. This turbulent state continues during the entire simulation and is accompanied by a permanent puffing up of the disk, as shown by panels and , corresponding to = 4.5 [2 /Ω 0 ].
Our 2D runs also show the formation of a large scale, preferentially toroidal dynamo-like field, similar to those observed in previous MHD studies (e.g., Bai & Stone 2013;Salvesen et al. 2016). This is seen in panel of Fig. 2, which shows ⟨ ⟩ as a function of time and of the vertical coordinate . We see that a net ⟨ ⟩ is formed, with a maximum amplitude of ∼ 30 − 40 0 during the nonlinear stage of the stratified simulation and with oposite signs inside and outside the disk. The amplitude of ⟨ ⟩ is very close to the one observed by previous equivalent 3D MHD simulations of the stratified MRI with initial 0 = 100 in the disk midplane (Salvesen et al. 2016).
In order to explore the effect of the scale-separation ratio ,0 /Ω 0 on the 2D turbulence structure, Fig. 3 shows 2 and in the nonlinear MRI state ( = 4 [2 /Ω 0 ]) for an analogous run using ,0 /Ω 0 = 3.5 (run ST2D-3.5) instead of ,0 /Ω 0 = 20. By comparing with panels and of Fig. 1, we see that the scale-separation ratio does not appear to produce a qualitative change in the properties of the 2D MRI turbulence, preserving features such as disk thickness increase and the presence of outflows. Run ST2D-3.5 also shows significant dynamo-like activity, as seen in panel of Fig. 2, where a net ⟨ ⟩ field appears similarly to the case of run ST2D-20 in panel .
The weak effect of ,0 /Ω 0 on the field structure of the stratified MRI can also be seen in the magnetic field power spectra, which are shown in Fig. 4 for runs with ,0 /Ω 0 = 7, 10, 14, 20 and 28 (all of them at ≈ 5 [2 /Ω 0 ]). Panel shows the spectra of the poloidal component of the magnetic field, (|˜( )| 2 + |˜( )| 2 )/ ( ) (˜( ) and˜( ) are the Fourier transform of the and components of and is the corresponding wave number), while panel shows the spectra of the toroidal component, |˜( )| 2 / ( ). For all the values of ,0 /Ω 0 , the spectra show similar shapes, with a break at ∼ 1 ( = 1 is marked by the colored dots on each line), where is the typical particle Larmor radius, defined as ≡ (3 / ) 1/2 /| |⟨ 2 ⟩ 1/2 . Their main difference is the location of the break of the spectra, which moves to larger wave numbers (in units of −1 = Ω 0 /2 ,0 ) as ,0 /Ω 0 increases, implying a growing separation between the kinetic ( ) and the MHD ( ) scales. However, apart from this growing separation between scales, increasing ,0 /Ω 0 does not significantly affect the qualitative shape of the power spectra. Panels and also compare (|˜( )| 2 + |˜( )| 2 )/ and |˜( )| 2 / with power-law functions of index (∝ − ) and show that, at sub-Larmor scales ( ≳ 1), the poloidal and toroidal spectra are approximately consistent with ≈ 3. This ≈ 3 behavior is expected for kinetic Alfvén wave turbulence (e.g., Passot & Sulem 2015) and it has also been observed in previous unstratified 2D and 3D kinetic simulations (Kunz et al. 2016;Inchingolo et al. 2018;Bacchini et al. 2022). Above Larmor scales ( < 1) the poloidal spectra show a peak at 2 ,0 /Ω 0 ∼ 1, followed by a power-law region characterized by ≈ 5/3. The toroidal spectra, on the other hand, has a peak at 2 ,0 /Ω 0 ∼ 0.2, followed first by an approximately flat region for 0.2 ≲ 2 ,0 /Ω 0 ≲ 1 and then by a steeper ≈ 2 region for 2 ,0 /Ω 0 ≳ 1. The nearly flat behavior of the toroidal spectra at 0.2 ≲ 2 ,0 /Ω 0 ≲ 1 is significantly affected by the presence of the dynamo-like field. Indeed, panels and of Fig. 4 show the poloidal and toroidal spectra of the "turbulent" part of the magnetic field, B , which is obtained by removing the contribution from the dynamo-like field: where B ≡ ⟨B⟩ . While the turbulent and total spectra of the poloidal field are very similar (see panels and , respectively), the turbulent spectra of the toroidal field (panel ) decrease substantially , 10 (green), 14 (red), 20 (purple) and 28 (blue). Panels and are analogous to panels and , but considering only the turbulent component of the magnetic field, B . In all the panels, the spectra use arbitrary normalization and the wave number is normalized by Ω 0 /2 ,0 .

Disk properties in 2D
In this section we show the evolution of the average disk properties in our 2D runs, paying attention to the way these properties are affected by the presence of stratification and by the scale-separation ratio ,0 /Ω 0 .
In order to assess the effect of stratification, we compare run ST2D-20 with the analogous unstratified run UN2D-20, with the same initial conditions as in the disk midplane of run ST2D-20. Figure 5 shows 2 and for run UN2D-20 at the moment when nonlinear channel flows appear ( = 2.5 [2 /Ω 0 ]) and then when these channel flows have reconnected and broken into turbulence ( = 3 [2 /Ω 0 ]). At first glance, this figure suggests that the evolution of the MRI turbulence in run UN2D-20 is similar to the one in the disk region of run ST2D-20. However, the average plasma properties between these two runs differ substantially, as shown in Fig. 6. Panel of Fig. 6 shows the evolution of ⟨ 2 ⟩ in the disk of run ST2D-20 (solid blue line) and in the whole volume of the analogous unstratified run UN2D-20 (solid red line). In both simulations there is an initial exponential growth regime that transitions to a much slower growth regime at ≈ 3 [2 /Ω 0 ]. Also, the two runs show the lack of a complete magnetic field saturation. However, at ≳ 3 [2 /Ω 0 ], at any given time the unstratified case reaches a ⟨ 2 ⟩ magnitude ∼ 5 − 10 times larger than in the stratified case. This factor ∼ 5 − 10 larger amplification applies similarly to the three components of the magnetic field, as can be seen from panel of Fig. 6.
Interestingly, the component in the unstratified case appears to be dominated by a large scale, dynamo-like component, similarly to what occurs in the stratified runs. This can be seen from panel of Figure 2, which shows ⟨ ⟩ for run UN2D-20. We see that by = 6 [2 /Ω 0 ], ⟨ ⟩ reaches an amplitude ∼ 100 0 , similar to the one of the total component, as shown by the dashed red line in panel of Fig. 6. However, whereas the dynamo activity in the analogous stratified run ST2D-20 (shown in panel of Fig. 2) produces a rather homogeneous ⟨ ⟩ in the disk region (| | < ( ), marked by the dashed black lines), the characteristic wavelength of the large scale field in the unstratified case is ∼ 4 times smaller. We thus interprete the large scale field in the unstratified case as a growth in the wavelength of the MRI modes, being therefore of different nature compared to the larger scale ⟨ ⟩ of the stratified runs.
The time when the stratified simulation significantly slows down its growth ( ≈ 3 [2 /Ω 0 ]) coincides with the moment when stratification effects, such as outflows and disk expansion become important, as can be seen from Fig. 1. Notice that this moment coincides with the time when the disk temperature starts increasing significantly, as we can see from the dashed-blue line in panel of Fig. 6, showing a connection between energy dissipation and disk expansion and outflow generation. Despite the fact that the magnetic field is amplified less in the stratified case, the average cold sigma parameter (≡ ⟨ 2 ⟩ /⟨4 2 ⟩ ) is larger in the nonlinear regime of these runs. This is shown by the solid blue and solid red lines in panel of Fig. 6 for the stratified and unstratified cases, respectively. This can be explained by the decrease in the disk density due to its expansion in the stratified runs. Finally, in both cases the plasma beta (≡ ⟨8 ⟩ /⟨ 2 ⟩ ) reaches a nearly steady state regime for ≳ 3 [2 /Ω 0 ], as shown by the dashed lines in panel of Fig. 6. However, while ∼ 2 in the unstratified case, ∼ 0.4 in the stratified case, which shows that stratification produces a disk that is magnetic-pressure supported, consistently with previous MHD stratified simulations (Bai & Stone 2013;Salvesen et al. 2016).
In order to explore the role of the scale-separation ratio ,0 /Ω 0 in our stratified runs, panels and of Figure 7 show the quantities ⟨ 2 ⟩ , , , and for stratified simulations with ,0 /Ω 0 = 7, 10, 14, 20 and 28. We see that increasing ,0 /Ω 0 produces a slight increase in ⟨ 2 ⟩ and , not yet showing a clear convergence for the highest values of ,0 /Ω 0 . (Note that the time origins of these simulations were slightly adjusted to align their exponential growth temporally, facilitating comparison). The evolutions of and exhibit some variations within a factor of ∼ 2, but without showing a discernible dependence on ,0 /Ω 0 .
As discussed in §3.1, another important feature of the stratified 2D simulations is a dynamo-like action that produces a significant = ⟨ ⟩ field, as shown in panels and of Fig. 2. The magnetic energy in the disk provided by the dynamo-like field B in run ST2D-20 is shown by the solid lines in the panel of Fig. 8, where the red-solid, black-solid and green-solid lines show the contributions by the three components of B : ⟨( ) 2 ⟩ , ⟨( ) 2 ⟩ and ⟨( ) 2 ⟩ , respectively. We see that the dynamo-like action within the disk is indeed dominated by the toroidal component of the magnetic field. Panel of Fig. 8 also shows in dashed lines the contribution to the magnetic energy provided by the three components of the turbulent magnetic field , which are averaged over the disk volume obtaining ⟨( ) 2 ⟩ , ⟨( ) 2 ⟩ and ⟨( ) 2 ⟩ (red-dashed, black-dashed and green-dashed lines, respectively). We see that the turbulent field is dominated by its toroidal component as well and contributes most of the magnetic energy in the disk from the triggering of the MRI turbulence at ≈ 2 [2 /Ω 0 ] until ≈ 3.5 [2 /Ω 0 ]. After that, the toroidal component of the dynamo-like field ⟨⟨ ⟩ 2 ⟩ becomes larger (by a factor of ∼ 2) than the toroidal component of the turbulent field. Panel of Fig. 8 shows the total energies in the dynamo-like field B (blue-solid line) and in the turbulent field B (blue-dashed line) for run ST2D-20 ( ,0 /Ω 0 = 20). We see that after ≈ 4 [2 /Ω 0 ] the energies in the dynamo and turbulent fields are comparable. Thus, in terms of the total magnetic energy, the dynamo-like and turbulent magnetic fields are roughly equally important after the initial period (of ∼ 1 orbit after the triggering of the MRI) in which the turbulent magnetic energy dominates. This trend appears to not be significantly affected by the scale-separation ratio ,0 /Ω 0 . This is shown by the pink-solid and pink-dashed lines in panel of Fig. 8, which show the contributions by, respectively, the turbulent and dynamo-like fields to the magnetic energy in the disk of run ST2D-7 ( ,0 /Ω 0 = 7). We see that in this ,0 /Ω 0 = 7 run there is also an initial period of about ∼ 1 orbit in which the turbulent field energy dominates, followed by a similar contribution to energy by the turbulent and dynamo-like fields.
Thus, we have shown that disk stratification in 2D can change significantly the behavior of the MRI turbulence compared to the unstratified case. Besides producing significant outflows and a puffing up of the disk due to temperature increase, stratification makes the disk turbulence more magnetically dominated (smaller ) compared to what is shown in an analogous 2D unstratified simulation. Stratification also gives rise to a significant large scale dynamo-like activity, which contributes similarly to the magnetic energy in the nonlinear MRI stage as the turbulent field after ∼ 1 orbit from the triggering of the MRI. We also found that increasing the scale-separation ratio produces slightly more magnetized disks, obtaining no complete convergence for the largest values of ,0 /Ω 0 used, consistent with the result obtained by Bacchini et al. (2022) who found that a scale separation ratio ,0 /Ω 0 ≳ 60 is required for a complete convergence in the 2D case.
In the next section we compare these 2D results with a 3D simulation showing that, although some differences appear, most of our 2D results are reasonably well reproduced in the 3D case.

3D MRI TURBULENCE
In this section we present results from a 3D stratified simulation, run ST3D-3.5 ( ,0 /Ω 0 = 3.5) and compare them with the analogous 2D stratified run ST2D-3.5. Figure 9 shows three snapshots of 2 for the stratified 3D run ST3D-3.5 at times = 1.5, 2.5 and 5.5 [2 /Ω 0 ]. At the qualitative level, there are many similarities with the turbulence structure of the 2D runs. At = 1.5 [2 /Ω 0 ], nonlinear channel flows are present in 2 , which look similar to the ones shown in panel of Fig. 1 for run ST2D-20. At = 2.5 [2 /Ω 0 ], the channel flows have already reconnected and broken into turbulence, with a significant increase in the disk thickness, similarly to what was shown for run ST2D-20 in panel of Fig. 1. This trend continues at later times, as can be seen in panel of Fig. 9, which shows 2 at = 5.5 [2 /Ω 0 ]. Figure 10 shows the same snapshots of Fig 9 but for the particle density . At = 1.5 [2 /Ω 0 ] nonlinear channel flows are present in , similarly to what is shown in panels of Fig. 1 for run ST2D-20. At = 2.5 and 4 [2 /Ω 0 ], a much more turbulent and progressively thicker disk is shown, as also shown for run ST2D-20 in panels and of Fig. 1.

Turbulence properties in 2D vs 3D
Our 3D run also shows the action of a dynamo-like mechanism, as can be seen from panel of Fig. 2, which shows averaged over the − plane, ⟨ ⟩ − , as a function of time and of the vertical coordinate . We see that a net ⟨ ⟩ − field is formed, with an amplitude similar to the 2D cases shown in panels and of Fig. 2 (runs ST2D-20 and ST2D-3.5). However, while the dynamo-like field in 2D shows significant time-variability and inhomogeneity along the -coordinate, in 3D this field appears less variable and more homogeneous.
The behavior of the magnetic power spectrum seems to be quite similar in 2D and 3D. Panels and of Fig. 11 compare, respectively, the poloidal and toroidal magnetic spectra of the 2D and 3D runs ST3D-3.5 and ST2D-3.5 (blue and green lines, respectively). These runs share the same ratio ,0 /Ω 0 = 3.5, so that the effect of the scale-separation does not affect significantly the comparison. At sub-Larmor scales ( > 1, where = 1 is marked by the colored dots on each line), we observe a magnetic spectrum with ≈ 3.3 (poloidal case) and ≈ 3.5 (toroidal case), for both types of runs. These ≈ 3.3 and 3.5 spectra are, however, steeper than the ones shown by the 2D runs with higher ,0 /Ω 0 , showing that a minimum scaleseparation ratio is necessary for correctly capturing the behavior of the sub-Larmor part of the spectra. Above Larmor scales ( < 1), both runs show a poloidal and toroidal magnetic field spectra with close to ≈ 5/3 and ≈ 2, respectively. These ≈ 5/3 and ≈ 2 behaviors are maintained when removing the dynamo- like field in both runs (which in 3D is defined as in Eq. 15 but with B ≡ ⟨B⟩ − ). This is shown in panels and of Fig. 11 where we show the poloidal and toroidal spectra of B . The main effect of removing B is to substantially reduce the contribution of 2 ,0 /Ω 0 ≲ 1 to the toroidal part of the 2D and 3D spectra. In this way, the peak of the poloidal and toroidal spectra of B both in 2D and 3D approach 2 ,0 /Ω 0 ∼ 1 (although the toroidal part of the turbulent spectrum in 3D has its peak at wavelengths ∼ 3 times larger than in 2D). This behavior of the B spectra in runs ST3D-3.5 and ST2D-3.5 above Larmor scales is similar to the ones shown in our stratified 2D runs with higher scale-separation ratio (Fig. 4), as well as in previous unstratified MHD (Walker et al. 2016) and kinetic 3D simulations (Kunz et al. 2016;Bacchini et al. 2022).

Validation of assumptions
The fact that the peaks of the poloidal and toroidal spectra of B are close to 2 ,0 /Ω 0 ∼ 1 in 3D is important for the validation of the shearing coordinates approach used in this work. Indeed, since B only depends on and , substracting this quantity from the total field does not change the power spectra of the magnetic field for and ( =ˆ· k and =ˆ· k, where k is the wave vector in Fourier space). Thus, the dominance of 2 ,0 /Ω 0 ∼ 1 for poloidal and toroidal components of B implies that the dominant wavelength of the magnetic fluctuations along the and axis is given by ∼ 2 ,0 /Ω 0 . This is indeed one of the assumptions made in our implementation of the shearing coordinates approach ( §2.2), which, combined with the condition ,0 / ≪ 1, allowed us to drop the −dependent terms in the field evolution equations 4 and 5, as well as to obtain the momentum and particle position evolution equations 12 and 13. Notice that our shearing coordinates approach also assumes that the electric field in the MRI turbulence is either smaller or of the same order of the magnetic field (|E|/|B| ≲ 1). To support this assumption, panels and of Fig. 12 shows as example the distribution of the electric current magnitude |J| and the |E|/|B| ratio for runs ST2D-20 and ST3D-3.5 at time = 5 [2 /Ω 0 ]. We see that in both cases the entire distribution satisfies |E|/|B| ≲ 3, Figure 11. Panels and show the power spectra of the poloidal and toroidal components of the magnetic field, ( |˜( ) | 2 + |˜( ) | 2 )/ ( ) and |˜( ) | 2 / ( ), respectively, for the 2D and 3D stratified runs ST2D-3.5 and ST3D-3.5. Panels and show the same as in panels and , respectively, but considering only the turbulent field B .
including the regions with the largest value of |J|, which are expected to correspond to reconnecting current sheets. This shows that the assumptions made in §2.2 to derive the plasma evolution equations in our shearing coordinates are fully supported by our obtained MRI turbulence behavior.

Disk plasma properties in 2D vs 3D
In this section we compare the disk plasma properties evolution in 2D and 3D. Panel in Fig. 13 shows the evolution of ⟨ 2 ⟩ in the 2D and 3D runs ST2D-3.5 and ST3D-3.5. In both cases there is an initial exponential growth regime that evolves into a nonlinear regime with a much smaller growth rate at ≈ 1.5 [2 /Ω 0 ]. Later, in the time interval ∼ 1.5 − 3 [2 /Ω 0 ], significant differences appear in the 2D and 3D cases, with the 3D run having a ⟨ 2 ⟩ amplitude ∼ 5 times smaller. This significant difference in ⟨ 2 ⟩ produces a similar difference in , as can be seen from the solid blue and green lines in panel of Fig. 13. This implies that in that time interval the disk expansion (and therefore its density) is about the same in the two simulations. This is consistent with the fact that their temperatures reach similar values, as shown by the dashed lines in panel of Fig. 13. Finally, consistently with the behaviors of and , is ∼ 5 times larger in the 3D case during the time period ∼ 1.5 − 3.5 [2 /Ω 0 ]. Later, when ≳ 3 [2 /Ω 0 ] there is a transition towards a state in which the amplitudes of ⟨ 2 ⟩ in 2D and 3D tend to give more similar values, which also tends to produce similar values of . Indeed, for ≳ 4 [2 /Ω 0 ], ≈ 0.5 in both runs while ⟨ 2 ⟩ is only a factor of ∼ 2 larger in the 2D case.
The smaller magnetic amplification shown by the 3D run in the time interval ∼ 1.5−4 [2 /Ω 0 ] is consistent with recent unstratified PIC simulations of the MRI that show that using 3D runs is important to allow efficient reconnection of the toroidal magnetic field component (Bacchini et al. 2022). By the end of the simulations, however, the 2D and 3D magnetic energies only differ by a factor of ∼ 2. This can be explained by the growing importance of the dynamo-like field in the stratified 2D and 3D runs, which evolves very similarly in these two types of runs. The progressively growing importance of the dynamolike field in 2D can be seen from panel of Fig. 13, which shows that in run ST2D-3.5, |B | 2 (solid-blue line) starts smaller than the turbulent part of the magnetic energy density |B | 2 (dotted-blue line) for ≲ 4 [2 /Ω 0 ], but afterwards it becomes comparable to |B | 2 . This is indeed consistent with what was shown for 2D runs with larger scale-separation ratios in Fig. 8. In the 3D run ST3D-3.5 this increase in the dynamo-like field importance is even more significant, since |B | 2 (solid-green line) becomes ∼ 5 times larger than |B | 2 (dotted-green line) at ≳ 4 [2 /Ω 0 ], given that 3D runs dissipate |B | 2 more efficiently via reconnection. Since |B | 2 has essentially the same values in 2D and 3D, it is thus expected that, by the end of the simulations, ⟨ 2 ⟩ only differs by a factor of ∼ 2 between the 2D and 3D cases. In §5.2 we show that the similitude between 2D and 3D by the end of the runs is also reproduced when analyzing the MRI-driven effective viscosity.

EFFECTIVE VISCOSITY
In this section we analyze the effective disk viscosity caused by the MRI turbulence. This viscosity is quantified making used of the parameter (Shakura & Sunyaev 1973), defined as the component of the plasma stress tensor , normalized by the plasma pressure, = / . This stress tensor component has three contributions: the Maxwell stress = − /4 , the Reynolds stress = , where = ( , , ) is the fluid velocity, and the anisotropic stress = −( ⊥ − ∥ ) / 2 , where ⊥ and ∥ are the plasma pressures perpendicular and parallel to the local magnetic field. Notice that, even though in the calculation of we assume non-relativistic fluid velocities, in our simulations individual particles can still acquire relativistic velocities. Thus is calculated as = ⟨ ⟩ / ⟨ ⟩ , where and are the momenta and Lorentz factors of the particles in a given fluid element and ⟨ ⟩ denotes an average over those particles. In this way we ensure that the fluid velocity corresponds to the velocity of the reference frame where the average particles momentum within a fluid element vanishes. Figure 13. Average disk plasma properties as as a function of time for 2D (blue) and 3D (green) stratified runs ST2D-3.5 and ST3D-3.5, respectively. Panel shows ⟨ 2 ⟩ (solid) and (dashed). Panel shows (solid) and (dashed). Panel shows the magnetic energy densities in the turbulent field |B | 2 (dashed) and in the dynamo-like field |B | 2 (solid) in 2D and 3D. (≡ ⟨ ⟩ /⟨ ⟩ ; dashed line), respectively. We see that reaches a saturated value of ∼ 1, which is dominated by the Maxwell stress, with the contributions to following the ordering > > . The fact that ∼ 1 is consistent with the dominance of magnetic pressure compared to particle pressure ( ≲ 1) seen in panel of Fig. 6 for the same run ST2D-20. We compare these results with the ones of the analogous unstratified run UN2D-20, where we find a similar ordering of the contributions to , > > , but with a ∼ 4 times smaller . This difference is consistent with the ∼ 4 times larger obtained in the unstratified run, implying a significant effect of stratification on the disk viscosity in collisionless studies of the MRI. Notably, the importance of in our results seems to contradicts previous unstratified kinetic studies (Kunz et al. 2016;Bacchini et al. 2022), but are in line with the findings from MHD simulations (Bai & Stone 2013;Salvesen et al. 2016).

Effect of stratification and
We also measured the effect of ,0 /Ω 0 on the behavior of , , and the total , which is done in panel of Fig. 15. We see that, although fluctuates by factors of order unity, there is no discernible dependence of this quantity on /Ω 0 , implying that the scale separation used in our 2D runs appears to be large enough to accurately capture the behavior of the MRI-driven viscosity. The blue lines in panel of Fig. 15 also compare the contribution to by the turbulent field B (dashed lines) and dynamo-like fields B (solid lines) in the ST2D-28 run ( ,0 /Ω 0 = 28), which we name and , respectively. We see that dominates until ∼ 4 [2 /Ω 0 ]. After that moment and are comparable, with fluctuating differences of a factor ∼ 2 − 3). A similar behavior is obtained for and in run ST2D-7 ( ,0 /Ω 0 = 7), which are shown by dashed-pink and solid-pink lines, respectively. This implies that no clear effect of the scale-separation ratio of and is observed in our simulations. The fact that these quantities become comparable after ∼ 4 [2 /Ω 0 ] is in line with the behaviors of |B | 2 and |B | 2 for runs ST2D-28 and ST2D-7, which also become comparable in the same time period, as shown in panel of Fig. 8.

Viscosity in 2D vs 3D
Panel of Fig. 16 compares the effective viscosities of the 3D and 2D runs ST3D-3.5 and ST2D-3.5, respectively. We see that for ≳ 1.5 [2 /Ω 0 ], the viscosity of the 3D run has a nearly steady value of ≈ 0.5. For ∼ 1.5 − 3.5 [2 /Ω 0 ], the 3D is ∼ 3 − 4 times smaller than in the 2D case, while for ≳ 3.5 [2 /Ω 0 ] the of the 2D and 3D runs become more similar, differing by a maximum factor of ∼ 2. The time dependence of the difference between the 2D and 3D values of is consistent with the fact that, initially, the 3D is ∼ 3 − 5 times larger than in 2D, with a subsequent period at ≳ 3.5 [2 /Ω 0 ] in which both 's acquire essentially the same value, as shown by the dashed blue (2D) and green (3D) lines in panel of Fig. 13.
These results reinforce the idea that, when the dynamo-like field becomes either dominant (3D) or comparable to the turbulent field (2D) at ≳ 3.5 [2 /Ω 0 ], the 2D and 3D runs produce fairly similar results, which include the value of the (Maxwell stress-dominated) disk viscosity. When that happens, itself is significantly affected by the dynamo-like field. This can be seen from panel of Figure  16, which shows the contributions of the dynamo-like magnetic field (solid) and the turbulent magnetic field (dashed) to the Maxwell stress, and , in the 3D run ST3D-3.5 (green) and the 2D run ST2D-3.5 (blue). At > 3.5 [2 /Ω 0 ], the 3D run exhibits a greater contribution to the Maxwell stress attributed to the dynamo-like field , respectively) are also shown by the dotted, dash-dotted and dashed lines, respectively. Panel shows in blue lines the contribution to by the turbulent field B (dashed lines) and dynamo-like fields B (solid lines) in run ST2D-28 run ( ,0 /Ω 0 = 28), which we name and , respectively. and for run ST2D-7 ( ,0 /Ω 0 = 7) are shown by dashed-pink and solid-pink lines, respectively.
(by a factor ∼ 5). This dominant contribution to the viscosity by the large scale dynamo-like field is in qualitative agreement with the 3D MHD simulations of Bai & Stone (2013) in the case of 0 = 100. Conversely, in the 2D run ST2D-3.5 the dynamo-like contribution to the viscosity becomes comparable to the one of the turbulent field after ∼ 3.5 [2 /Ω 0 ], with some dominance of the former after ∼ 4.5 [2 /Ω 0 ] by a factor of ∼ 2 − 3. This is in line with results shown for the 2D runs ST2D-28 and ST2D-7 ( ,0 /Ω 0 = 28 and 7, respectively), for which and were comparable after ∼ 4 [2 /Ω 0 ], with no discernible dependence on ,0 /Ω 0 .
Both our 2D and 3D runs give rise to an anisotropic stress that is subdominant compared to the Maxwell stress, although the former is larger than the Reynolds stress in the 3D run, which is the contrary to what occurs in 2D, suggesting that 3D effects would tend to suppress the fluid velocities that give rise to the Reynolds stress.

Pressure anisotropy behavior
The very small contribution of to the total effective viscosity in our 2D and 3D stratified runs seems to contradict previous kinetic simulation studies that suggest that the anisotropic stress can be as important as Maxwell stress in collisionless disks (e.g., Kunz et al. 2016). This discrepancy, however, appears to be mainly due to the small regime reached in the nonlinear state of our simulations. To demonstrate this point, panel of Fig. 17 shows the distribution of plasma anisotropy ⊥ / ∥ and ∥ in the disk of run ST2D-20 during a time interval = 3.5 − 4.5 [2 /Ω 0 ], and compares it with a threshold Figure 16. Panel shows in solid lines the evolution of in the 3D run ST3D-3.5 (green) and the 2D run ST2D-3.5 (blue). Their contributions from the Maxwell, Reynolds, and anisotropic stresses ( , and , respectively) are also shown by the dotted, dash-dotted and dashed lines, respectively. Panel shows the contributions of the dynamo-like magnetic field (solid) and the turbulent magnetic field (dashed) to the Maxwell stress, and , in the 3D run ST3D-3.5 (green) and the 2D run ST2D-3.5 (blue).
for the growth of unstable mirror modes (black line) obtained from linear Vlasov theory (Hellinger et al 2006): We see that in most cases ⊥ / ∥ tends to be larger than unity and limited by the mirror threshold. As an estimate of the upper limit for the expected importance of , one can compute the ratio ⟨ / ⟩ assuming that the pressure anisotropy of the plasma is given by Eq. 16. In that case we would have where we have applied Eq. 16 in the limit ∥ ≫ 0.016 ( ∥ ≈ 0.4 for ≳ 3 [2 /Ω 0 ], as can be seen from the ∥ evolution for run ST2D-20 shown in Fig. 6). Thus, using ∥ ≈ 0.4, we obtain ⟨ / ⟩ ≲ 0.3. This upper limit is consistent with the fact that is much smaller (by a factor of ∼ 10) than in run ST2D-20, as shown, respectively, by the dashed blue and dotted blue lines in Fig. 14. Notice that in a hypothetical case in which ∥ ∼ 100 (e.g., as in Kunz et al. 2016), Eq. 17 would predict comparable contributions from the anisotropic and Maxwell stress with ∼ .
Panel of Fig. 17 shows the same as panel but for the 2D run ST2D-3.5. We see that ⊥ / ∥ is somewhat larger in the case of run ST2D-3.5 for a given ∥ . The larger value of ⊥ / ∥ is consistent with the smaller scale-separation ratio, as shown by previous PIC simulation studies of the mirror instability driven by a growing background magnetic field (see, e.g., Ley et al. 2023). However, the distribution of ⊥ / ∥ and ∥ in the disk of run ST2D-3.5 still follows reasonably well the threshold for the growth of mirror modes presented in Eq. 16, consistently with the essentially absent effect of scale-separation on the dominance of in our runs. Panel of Fig. 17 shows the behavior for ⊥ / ∥ and ∥ in the 3D run ST3D-3.5. We see that the pressure anisotropy behaves similarly in the runs ST2D-3.5 and ST3D-3.5, in agreement with the small contribution of to the effective viscosity in the 3D case.
In summary, our 2D and 3D runs give a (Maxwell stress dominated) with values between ∼ 0.5 (3D) and ∼ 1 (2D), with a progressively similar behavior of the 2D and 3D runs as the dynamo-like field becomes dominant ( ≳ 3.5 [2 /Ω 0 ]). In this dynamo-dominated regime, is expected to be mainly produced by the dynamo-like field. Interestingly, this viscosity behavior is very similar to the one obtained from 3D MHD simulations of stratified disk with net vertical field and initial = 100 (Salvesen et al. 2016).

PARTICLE ACCELERATION
Our stratified MRI simulations show significant particle acceleration. In this section, we show that the acceleration efficiency grows as the disk temperature and the scale-separation ratio ,0 /Ω 0 increase. Well developed nonthermal tails are observed mainly in our 2D runs, due to their relatively large scale-separation ratio.

Spectrum evolution in 2D
The evolution of the particle spectrum, / , calculated in the disk of run ST2D-20 is shown in Fig. 18, where is the particle Lorentz factor. The spectra are shown for different values of the disk temperature , instead of at different times. (This allows us to compare spectra from different simulations, removing the fact that different runs may take different times to trigger the MRI and/or to heat the plasma). As the plasma temperature increases, their spectra develop a nonthermal tail that can be approximately described as a power-law with an exponential cut-off, where and c are the corresponding spectral index and cut-off Lorentz factor, respectively. This behavior can be seen in Fig. 18, for instance, in the cases / 2 ≈ 5.2 × 10 −2 and 3.5 × 10 −1 . For the Figure 18. The evolution of the particle energy distribution for our fiducial run ST2D-20. Different colors represent different disk temperatures, and the dotted lines correspond to fits to the nonthermal tails using Eq. 18.
first temperature we fitted Eq. 18 using ≈ 2.2 and c ≈ 4.8 (green dotted line), while for the second temperature we used ≈ 1.9 and c ≈ 35 (pink dotted line). These fits were obtained using a maximum likelihood analysis considering only particles with energy larger than 10 .
This discrepancy in how depends on ,0 /Ω 0 is likely a manifestation of the underlying acceleration mechanism, which appears to be consistent with the expectation from reconnection driven acceleration. Indeed, the dependence on ,0 /Ω 0 for / 2 ≈ 3.5 × 10 −1 is qualitatively consistente with the pair plasma magnetic reconnection results of Werner et al. (2016) in the limit of small system size, . These results show power-laws with supra-exponential cut-offs ( / ∝ − − 2 / 2 c, rec ) with ,rec ≈ 0.1 / 0 , where 0 = 2 / and is the magnitude of the magnetic field in the upstream medium of the reconnecting plasmas. The corresponding value of in our simulations can be estimated from the power spectrum of the − (poloidal) magnetic energy component, (| ( )| 2 + | ( )| 2 )/ ln( ), for runs with different scale-separation ratios shown in panel of Fig. 4 (we use the poloidal magnetic field since this is the component that can experience reconnection in 2D). We see that the poloidal spectra peak at ∼ Ω 0 /2 ,0 fairly independent of the scale-separation ratio. Thus a reasonable estimate for is ∼ 2 / ∼ (2 ) 2 ,0 /Ω 0 . In addition, we can estimate 0 = 2 / ≈ /( ,0 ), where (≡ (⟨ 2 ⟩ ) 1/2 / 0 ) is the root mean square amplification factor of the magnetic field in the disk at a given time. Thus, if reconnection Figure 19. The particle spectra for runs with ,0 /Ω 0 = 7, 10, 14, 20 and 28, and with temperatures / 2 = 5.2 × 10 −2 (panel ) and 3.5 × 10 −1 (panel ). For each spectra, we show in dotted lines a power-law fit with an exponential cut-off (as in Eq. 18). The normalizations of the spectra are arbitrary.
is the main driver of particle acceleration in our runs, should be close to ,rec , which would be given by where we have used that ,0 / = 10 −2 in all our simulations. The value as a function of is shown in dashed lines in Fig. 21 for different values of ,0 /Ω 0 . We see that when / 2 ≈ 3.5 × 10 −1 , ≈ 30, fairly regardless of the scale-separation ratio. This means that the expected ,rec at / 2 = 3.5 × 10 −1 is ,rec ≈ 36 The black line in panel of Fig. 20 shows the case = ,rec , where ,rec is given by Eq. 20. We see that ,rec reproduces well the behavior of in our runs with / 2 = 3.5 × 10 −1 .
The behavior ,rec ≈ 0.1 / 0 expected from reconnection is valid as long as / 0 ≲ 40 (Werner et al. 2016), where corresponds to the cold sigma parameter in the upstream medium of the reconnection simulations. We estimate using ⟨ ⟩ in our runs when / 2 = 3.5 × 10 −1 , which is ⟨ ⟩ ∼ 10 − 50 for the range of ,0 /Ω 0 considered, as shown by the solid lines in Fig. 21. 2 Thus, 2 Since we want to estimate the equivalent of the upstream cold sigma parameter , we are mainly interested in the values of outside the current sheets, where is the largest. Thus, given that in our runs ⟨ ⟩ (= ⟨ 2 /4 2 ⟩ ) > (= ⟨ 2 ⟩ /4 ⟨ ⟩ 2 ), we are using ⟨ ⟩ instead of as our estimate of . Figure 20. The values of and obtained from fitting Eq. 18 to the obtained spectra as a function of ,0 /Ω 0 . Red and blue squares correspond to / 2 = 5.2 × 10 −2 and 3.5 × 10 −1 , respectively.
The values of for / 2 = 5.2 × 10 −2 and 3.5 × 10 −1 seen in panel of Fig. 20 are close to ∼ 2.2 and ∼ 1.9, respectively, and do not show a clear dependence on the scale-separation ratio. This is also consistent with acceleration being driven by reconnection. For instance, for / 0 ≳ 40 and = 3 (a case close to our results with / 2 = 5.2 × 10 −2 , where ⟨ ⟩ ∼ 1 − 2; see

Effect of stratification on the acceleration
The previous discussion underscores the importance of plasma conditions, in particular and , in determining the efficiency of nonthermal particle acceleration. Since these conditions vary significantly between stratified and unstratified simulations (as shown by Fig. 6), we expect the acceleration efficiency in these two types of runs to be different. Fig 22 compares spectra from run ST2D-20 with the equivalent spectra in the unstratified run UN2D-20 at the same values of . We see that the spectra in the unstratified run are always softer than in the stratified run ST2D-20. This is consistent with the fact that, for a given temperature, in run UN2D-20 the value of is smaller than in the run ST2D-20, which favors harder nonthermal acceleration (as seen in panel of Fig. 6)

Acceleration in 2D vs 3D
In Fig. 23 we compare spectra from the 2D and 3D simulations ST2D-3.5 and ST3D-3.5, both with a scale-separation ratio ,0 /Ω 0 = 3.5, for / 2 = 6.8 × 10 −2 , 2.1 × 10 −1 and 3.5 × 10 −1 . As expected from our previous discussion on the dependence of on ,0 /Ω 0 , the 2D run ST2D-3.5 should produce a nonthermal tail of rather short extension, which is what we see in Fig. 23. However, it is still interesting to verify whether its main features are reproduced in the 3D run ST3D-3.5. We see that, although the spectra show somewhat different shapes, they both feature nonthermal tails with similar maximum energies. In particular, when / 2 = 3.5 × 10 −1 , the 2D and 3D spectra look very similar, suggesting that 3D effects maintain the main particle accelerating properties of the MRI turbulence in the stratified setup. Even though the nonthermal particle behavior in our runs suggests a significant role of magnetic reconnection in the acceleration of particles, our simulations may be subject to effects that are not present in previous magnetic reconnection studies. These include particle escape from the disk, stochastic acceleration by the MRI turbulence (e.g., Kimura et al. 2019;Sun & Bai 2021), and the action of various kinetic instabilities that may contribute to field dissipation and/or particle acceleration, including, e.g., the drift kink instability (Zen-  ,0 /Ω 0 = 3.5 (runs ST2D-3.5 and ST3D-3.5., respectively). The different colors represent ⟨ ⟩/ 2 = 6.8 × 10 −2 , 2.1 × 10 −1 and 3.5 × 10 −1 .
itani & Hoshino 2007) and the ion-cyclotron instability (Ley et al. 2019). We thus defer to future research a detailed determination of the dominant acceleration process(es) as well as the role of the scaleseparation ratio by including 2D and 3D runs with larger values of ,0 /Ω 0 .

CONCLUSIONS
In this work we have studied the effect of stratification on the collisionless MRI using 2D and 3D PIC simulations. Comparing 2D stratified and unstratified runs, we found that stratification affects the evolution of the disk conditions, due to the presence of outflows and disk expansion, leading to a decrease in the amplification of magnetic field energy density in the turbulent non-linear MRI regime. However, the expansion of the disk also decreases the plasma pressure and density, resulting in a highly magnetized disk, with smaller and larger cold magnetization parameter compared to the unstratified case. Indeed, in the nonlinear regime the disk is magnetic-pressure supported with ∼ 0.4, which is a factor ∼ 5 smaller than the value reached by its unstratified counterpart. Our stratified simulations do not exhibit a discernible low beta corona separated from the disk. In the disk region, our runs also give rise to a significant large scale and predominantly toroidal dynamo-like field B (≡ ⟨B⟩ in 2D), whose dominant scale length follows the disk scale height. Although a large scale ⟨B⟩ field also appears in the 2D unstratified case, its scale length is ∼ 4 times smaller. The increased magnetization of our 2D stratified runs produces an effective viscosity in the disk that reaches ∼ 1, which is a factor ∼ 5 larger than in the equivalent unstratified case. This viscosity is dominated by the Maxwell stress, , with a small contribution of the anisotropic stress, . This small is consistent with the regulation of pressure anisotropies by kinetic microinstabilities in the low regime.
Even though our 2D and 3D stratified simulations produce similar results, some differences are present. In order to assess them, we compared 2D and 3D runs focusing on a specific case with small scale separation, ,0 /Ω 0 = 3.5. In the early phase of the non-linear MRI stage (i.e., ∼ 1 − 2 orbits after the triggering of the instability), 3D simulations exhibit a significantly lower amplification of the magnetic field energy density compared to their 2D counterpart, consistently with a more efficient reconnection of the toroidal magnetic field component in 3D (as shown in the recent work of Bacchini et al. 2022). This primarily affects the effective viscosity and the plasma , which are, respectively, ∼ 3 − 4 times smaller and larger in the 3D case. However, after this initial stage (at ∼ 4 [2 /Ω 0 ]), our 2D and 3D simulations are more similar, with reaching essentially the same values and the effective viscosity being only ∼ 2 times smaller in 3D (in 3D, ≈ 0.5 during the whole nonlinear MRI stage). This transition at ∼ 4 [2 /Ω 0 ] occurs because of the growing importance of the large scale dynamo-like field B (≡ ⟨B⟩ − in 3D). Indeed, after an initial stage in which the turbulent field B (= B − B ) dominates, the dynamo-like field becomes larger than the turbulent field in the 3D runs while in 2D it reaches values comparable to the turbulent field. Since the dynamo field has almost the same amplitude in 2D and 3D, the total fields in these two types of runs differ by a small amount after ∼ 4 [2 /Ω 0 ]. Also, in this dynamo-dominated period, the 3D viscosity is mainly produced by the dynamo-like field, while in 2D the turbulent and dynamo fields contribute comparably to . In 3D the disk viscosity is also dominated by the Maxwell stress, , with a small contribution from the anisotropic stress, . This is also consistent with the action of pressure anisotropy-driven kinetic microinstabilities in the 3D case, as it occurs in 2D. Our 2D and 3D results in terms of , and dynamo-like field behaviors are reasonably consistent with previous 3D MHD simulations of stratified disks with similar initial conditions (e.g., Bai & Stone 2013;Salvesen et al. 2016).
In terms of particle acceleration, in our 2D runs we find that the particle spectra in the nonlinear MRI stage follow power-laws with exponential cut-offs, with power-law indices ≈ 2.2 − 1.9 for disk temperatures ∼ 0.05 − 0.3 2 . Additionally, depending on the value of during the nonlinear MRI stage, the maximum energy attained by the particles is either proportional to the scale separation ,0 /Ω 0 or fairly independent of this parameter, which appears to be consistent with previous magnetic reconnection studies (Werner et al. 2016). Particle acceleration in our 2D unstratified runs appears to be less efficient than in the analogous stratified case. This is likely due to the smaller cold magnetization parameter attained in the unstratified simulations. Furthermore, the particle acceleration observed in our 2D run with ,0 /Ω 0 = 3.5 is well reproduced by its analogous 3D simulation, suggesting that 3D effects should maintain most of the acceleration properties of the MRI turbulence. However, 3D runs with larger scale-separation ratio are needed to confirm this trend.
In summary, our results suggest that including disk stratification in shearing-box PIC simulations of the MRI is important for studying its saturation, effective viscosity generation and particle acceleration physics. Interestingly, 2D and 3D simulations give quite similar results for the scale-separation ratios used in this work, especially after the magnetic field energy becomes dominated by a large-scale, dynamo-like field (which occurs ∼ 1 − 2 orbits after the triggering of the instability). We leave for future work the clarification of the effect of larger scale-separation ratios in 3D, as well as disentangling the underlying mechanism(s) for particle acceleration. We also note that our results refer to a specific case of initial plasma conditions. Further research is thus needed to clarify the effects of changing the initial and/or temperature in the disk, potentially leading to a more distinct differentiation between an unmagnetized disk and a magnetized corona (e.g., Salvesen et al. 2016). Investigating the effect of more realistic mass ratios on the dynamic and thermodynamic properties of the collisionless MRI is also deferred to future work.