Scaling behavior for electric vehicle chargers and road map to addressing the infrastructure gap

Abstract Enabling widespread electric vehicle (EV) adoption requires a substantial build-out of charging infrastructure in the coming decade. We formulate the charging infrastructure needs as a scaling analysis problem and use it to estimate the EV infrastructure needs of the USA at a county-level resolution. We find that gasoline and EV charging stations scale sub-linearly with their respective vehicle registrations, recovering the sub-linear scaling typical of infrastructure. Surprisingly, we find that EV charging stations scale super-linearly with population size within counties, deviating from the sub-linear scaling of gasoline stations. We discuss how this demonstrates the infancy of both EVs and EV infrastructure while providing a framework for estimating future EV infrastructure demands. By considering the power delivery of existing gasoline stations, and appropriate EV efficiencies, we estimate the EV infrastructure gap at the county level, providing a road map for future EV infrastructure expansion.


Supporting Information Appendix 1. Data Curation
A. EVSE Stations. We tabulated the number of EVSE stations per county using station-level data provided by the National Renewable Energy Laboratory's (NREL) Alternative Fuel Stations API [46]. We then reverse geocoded each station's latitude/longitude coordinates to the bounding county using shapefiles provided by the U.S. Census Bureau [61]. Nearly all stations were reverse geocoded correctly; only 7 of the 47,024 stations in the dataset were not. We manually confirmed the counties for these stations using their street addresses; in all cases, incorrect coordinates were at fault. We used the "Open Date" provided by NREL's dataset to generate time series of EVSE station counts for each county. Effectively, we assumed that no station closed or was removed from the dataset. That is, for any date, we assume the number of stations in a county is the number of stations with an opening date on or before that date. For our dataset, we used the number of EVSE stations open on December 31st, 2020 as the 2020 count of EVSE stations in a county.
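The cumulative-count assumption above can be sketched as follows; the station records and field layout here are hypothetical toy data, not NREL's actual schema:

```python
from datetime import date

# Hypothetical mini-dataset: (county FIPS code, station open date) per EVSE station.
stations = [
    ("06037", date(2019, 5, 1)),
    ("06037", date(2020, 11, 30)),
    ("06037", date(2021, 2, 1)),   # opened after the 2020 cutoff
    ("36061", date(2018, 7, 15)),
]

def station_count(county, as_of):
    """Count stations in a county open on or before `as_of`,
    assuming (as in the text) that no station ever closes."""
    return sum(1 for fips, opened in stations if fips == county and opened <= as_of)

# The 2020 count uses stations open on or before December 31st, 2020.
print(station_count("06037", date(2020, 12, 31)))  # → 2
```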
We estimated the total station power by multiplying the number of Level 1 (1.4 kW), Level 2 (7.2 kW), and DC Fast (50 kW) charging points reported by NREL for each station by the respective power outputs shown in parentheses [2]. Higher power delivery rates are supported by Tesla Superchargers [58] and the CHAdeMO Protocol 3.0 [15]; however, most EVs are limited to drawing at most 50 kW [2]. As such, we may underestimate the maximum station power delivery, especially for stations that mainly provide DC Fast charging ports.
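The power estimate reduces to a weighted sum over port counts; a minimal sketch, with hypothetical field names for the port-count data:

```python
# Nominal per-port power delivery by charger level (kW), as used in the text.
PORT_KW = {"ev_level1": 1.4, "ev_level2": 7.2, "ev_dc_fast": 50.0}

def station_power_kw(ports):
    """Estimate a station's maximum power delivery from its port counts.
    `ports` maps a charger level to the number of charging points
    (the level keys here are hypothetical, not NREL's field names)."""
    return sum(PORT_KW[level] * n for level, n in ports.items())

# A station with 4 Level 2 ports and 2 DC Fast ports:
print(round(station_power_kw({"ev_level2": 4, "ev_dc_fast": 2}), 1))  # → 128.8
```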

B. Gasoline Stations.
Gasoline station counts at the county level were obtained from the United States Bureau of Labor Statistics (BLS) Quarterly Census of Employment and Wages (QCEW) Program for the fourth quarter of 2020 [60]. We used the number of establishments classified as "Gasoline Stations" under the North American Industry Classification System [28]. This sub-sector includes stations with attached convenience stores (NAICS 447110) and "Other Gasoline Stations" (NAICS 447190). NAICS 447190 includes gasoline stations without attached convenience stores, marine service stations, and truck stops [28]. As such, our count of gasoline stations is inflated beyond the number of stations that service passenger vehicles. This issue also affects the Standard Industrial Classification code for gasoline stations (SIC 5541), which is no longer used by the BLS for its QCEW program [48].
C. Passenger Vehicle Registration Counts. Where possible, we tabulated registrations using each state's definition of a passenger vehicle. For example, in Iowa, we used the number of "Automobiles", defined by IA Code §321A.1(42D) as "a motor vehicle designed primarily for carrying nine passengers or less, excluding motorcycles and motorized bicycles." Several states (TX, ID, IL, OK, MD, WV) did not provide a breakdown of registrations by vehicle type; in these cases, we used the total number of vehicle registrations by county. We sought vehicle registration counts representative of the 2020 population counts. We used the 2020 calendar year registration counts for all states but Maryland. Maryland tabulates registration data for the fiscal year; thus, we used the 2020 fiscal year instead of the 2020 calendar year. There are additional variations in temporal reporting methodologies from state to state. For example, California reports the number of vehicles registered within the state on January 1st of each year. Conversely, Oklahoma reports the total number of vehicle registrations issued during the calendar year. As such, a vehicle that was registered in California on January 1st, but then moved to Oklahoma within the year, would be double counted by our dataset. Given the rate of domestic migration [12], we suspect the effect of double counting is negligible.
Vehicles registered out of state were excluded from our analysis, even in cases where we could assign them to particular counties (e.g., California's dataset), as this was not possible for most states. Additional processing steps were required for Alaska and California, as described below.
C.1. Alaska. Alaska registration counts were provided for governmental boundaries (boroughs, cities, municipalities) instead of at the county or county-equivalent level. Thankfully, we were able to uniquely map all provided governmental boundaries to a single census area, allowing us to tabulate registration counts at the county level. We excluded the 16k passenger vehicles registered in "Other Alaska", as we could not precisely attribute them to any particular county. Additionally, we excluded the 16k passenger vehicles registered to Alaska but residing outside the state, as we had insufficient information to attribute them.

C.2. California. California registration data is tabulated by ZIP Code and not by county, as this analysis requires. We converted ZIP code counts to county-level counts using the U.S. Census Bureau's ZIP Code Tabulation Area crosswalk [63], based on the fractional land overlap between ZIP codes and counties, in line with prior work [20]. Additionally, the California dataset was tabulated by vehicle duty (light/heavy). We restricted our dataset to "light-duty" vehicles, as defined by Cal. Code Regs. Tit. 13, §2903: "Light-duty vehicle means any passenger car or light-duty truck that meets the applicable definitions in Title 13, section 1900 and is subject to the certification requirements in Title 13, Division 3, Chapter 1, Article 2. A street-use motorcycle is not a light-duty vehicle."
The California dataset additionally breaks out vehicle registrations by fuel type.We considered using this information to attempt to filter out light-duty trucks (i.e., by excluding diesel vehicles).However, we opted against this as diesel passenger vehicles exist [41] and can be served by most gasoline stations.

D. Electric Vehicle Registration Counts. Electric vehicle registration data for 16 states (CA, CO, CT, FL, MI, MN, MT, NK, NY, OR, TN, TX, VA, VT, WA, and WI) were obtained from the Atlas EV Hub [7]. Registration data was provided at varying intervals (annually, biannually, quarterly, or monthly) and at different levels (ZIP Code or county). We ingested all data as provided and performed the following regularization steps:
1. Map registration data collected at the ZIP code level to counties.
2. Attribute out-of-state registrations to the correct state and resolve differences in reporting intervals.

3. Impute the registration county for entries with partial data.
Once regularized, we selected each state's latest registration counts for 2020. For example, we used the December count for a state reporting monthly, while for a quarterly reporting state, we used the 4th quarter count. This procedure is consistent with our processing of the EV stations dataset.

D.1. Mapping ZIP Code Registration data to Counties.
Consistent with prior works [20], we used the U.S. Census Bureau's ZIP Code Tabulation Area crosswalk [63] to convert registration data tabulated for ZIP Codes to a county-level tabulation. While most ZIP codes are fully contained within a bounding county, many ZIP codes span multiple counties. As such, the remapping process is a many-to-many join, as multiple ZIP codes can make up a single county, and a single ZIP code can span numerous counties. We used the fractional area overlap between ZIP codes and counties to remap registrations onto counties.
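The fractional-area remapping amounts to a weighted many-to-many join; a sketch with illustrative crosswalk fractions and ZIP/FIPS codes (not values from the actual Census crosswalk):

```python
# Hypothetical crosswalk: fraction of each ZIP code's land area in each county.
# One ZIP may span several counties, and several ZIPs may make up one county.
crosswalk = {
    "90210": {"06037": 1.0},                  # fully inside one county
    "89439": {"32031": 0.75, "06061": 0.25},  # spans a county (and state) border
}

def zip_to_county(zip_counts):
    """Apportion ZIP-level registration counts to counties by area fraction."""
    county_counts = {}
    for zc, n in zip_counts.items():
        for fips, frac in crosswalk[zc].items():
            county_counts[fips] = county_counts.get(fips, 0.0) + n * frac
    return county_counts

print(zip_to_county({"90210": 10, "89439": 8}))
# → {'06037': 10.0, '32031': 6.0, '06061': 2.0}
```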

D.2. Out-of-State Registrations.
Out-of-state registrations occur when the location of a vehicle is outside the state to which it is registered. They can arise due to aliasing effects, for example, when a vehicle is registered to a ZIP code spanning state borders, or due to differences between the residency of a vehicle's owner and the vehicle's primary location. Of the 11.3 million entries in the Atlas EV dataset, 37 thousand (0.33%) are out-of-state registrations. To avoid ignoring these vehicles, we implemented the following procedure: We deleted the entries for vehicles located in states we did not have data for (e.g., a car located in Nevada and registered in California), as including the registration would likely drastically underestimate the number of EVs in the out-of-state county. For vehicles located in a state that we did have data for (e.g., a car located in California and registered in Oregon), we recorded a registration at the date of the registration snapshot. For example, we attribute a vehicle located in California but registered in Colorado to California on the date of the Colorado snapshot, regardless of the reporting interval of California. We then linearly interpolated the registration state's data onto the location state's reporting interval. In the Colorado case, if a CA county had 3 CO vehicles in December and 4 in February, we would attribute 3.5 vehicles for January. However, in practice, CO reports more frequently than CA, eliminating the need for any interpolation. Interpolating out-of-state registration data was solely used to attribute registrations from slowly reporting states (e.g., annually) onto faster reporting intervals (e.g., monthly). We did not use interpolation to increase the reporting rate of states.
We considered using alternative interpolation schemes, such as cubic interpolation or regression, instead of linear interpolation. However, these higher-order methods can suffer from oscillations and can return estimates outside the range of observed values [36,54]. In most cases, out-of-state registrations are brief; a TX county may have three vehicles registered in CA one month and none the following month. Using linear interpolation ensures we do not overstate the number of vehicles in a county.
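The interpolation step reduces to ordinary linear interpolation between two reporting snapshots; a sketch reproducing the December/February example from the text, with months encoded as integers for simplicity:

```python
def lerp(x, x0, y0, x1, y1):
    """Linear interpolation between two snapshots (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# 3 vehicles reported in December (month 12), 4 in February (month 14):
print(lerp(13, 12, 3, 14, 4))  # → 3.5 vehicles attributed to January
```

Because the result never leaves the interval [y0, y1], linear interpolation cannot overstate the count between snapshots, unlike cubic or regression-based schemes.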

D.3. Imputing County of Origin.
The county of registration was marked unknown for 28,786 of the 11.3 million (0.25%) entries in the source dataset. To avoid listwise deletion and its associated issues, we used "hot-deck imputation" to impute the missing counties of registration [55,44,5]. This is consistent with Myers' recommendation to use hot-deck imputation when the non-response rate is less than 10%. Other, more powerful methods, such as Expectation Maximization or Regression Imputation, do not apply, as our dataset lacked the requisite data to support them. Similarly, mean substitution does not apply, as our data is reported on a per-vehicle basis and lacks a notion of an "average" county. Imputation was handled at the state level for each reporting snapshot provided to Atlas EV Hub using Impute.jl [29].
In hot-deck imputation, we replace missing values with "donor" values drawn at random from the rest of the dataset [29,5]. This procedure preserves the expected fraction of vehicles in a given county. Removing vehicles with an unknown county of origin would also preserve the fraction of vehicles in a particular county, but would underestimate the total number of cars.
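A minimal sketch of hot-deck imputation on a toy list of county codes (the analysis in the text used Impute.jl at the state level; the records below are illustrative):

```python
import random

# Hypothetical registrations: county FIPS code, or None when the county is unknown.
counties = ["06037", "06037", "06075", None, "06075", "06037", None]

def hot_deck(values, rng):
    """Replace each missing value with a donor drawn at random from the
    observed values, preserving the expected fraction per county."""
    donors = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(donors) for v in values]

imputed = hot_deck(counties, random.Random(0))
print(all(v is not None for v in imputed))  # → True
```

Unlike dropping the incomplete rows, the imputed list keeps the total vehicle count intact while leaving the expected county fractions unchanged.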

Model Fitting
We initially sought to leverage urban scaling to model the relationship between refueling/recharging infrastructure and the overall population. Having identified a difference in the scaling of EVSE stations and gasoline stations, we then curated an additional dataset of EV and passenger vehicle registrations. Using our expanded dataset, we applied urban scaling to model the scaling of vehicle registrations (EV and conventional) with population, and the scaling of recharging/refueling infrastructure with its respective vehicle population. The "Power (Log-Log Transform)" model fits a linear model to the log-log transformed data, similar to prior works [10,9,17,39]. Conversely, "Power (Log-Link)" uses a Generalized Linear Model to predict the log of the expected value (ln E[y]), in line with the recommendation of Leitao et al. [34]. Using a link function instead of log transforming y allows our model to account for y = 0, for example, counties with no charging station.
Further, we fit our Generalized Linear Models using the following probability distributions:
• Normal for all models except "Power (Log Link)"
• Poisson for the "Power (Log Link)" and Null models
• Negative Binomial for the "Power (Log Link)" and Null models

A. Generalized Linear Models. Generalized Linear Models (GLMs) extend linear regression by introducing a non-linear link function g such that the expected response is E[Y | X] = g⁻¹(η(X)), where η(X) is a linear function of the dependent variables X. Additionally, GLMs allow the distribution of Y to be any distribution f(Yi; θi = g(µi), ϕi) from the exponential family, where θi is the natural parameter of the distribution and ϕi is the dispersion parameter. We use i = 1 . . . n to index our n observations and related quantities.
For the power law models, we used the following for g and η(N): g(µ) = ln µ and η(N) = ln Y0 + β ln N (Eq. 1). As the expected value of the model is E[Y | N] = g⁻¹(η(N)), this parameterization results in the desired power law model E[Y | N] = Y0 N^β.

B. Maximum Likelihood Estimation. Maximum likelihood estimation (MLE) estimates a model's most probable parameters given the observed data. In the limit of large sample sizes, the resulting estimates approach the true values of the parameters. For a GLM, the likelihood of the model L is defined by Eq. 2, where f is the probability distribution of the model.
For numerical stability, we maximize the log-likelihood ℓ of the model rather than the likelihood L. We used the Julia package GLM.jl [8] to compute the MLE of our model's parameters using the Iteratively Reweighted Least Squares algorithm [26]. The implementation details of the algorithm are beyond the scope of this document.
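While the IRLS implementation is out of scope, the log-link power-law fit can be sketched with a few Newton-Raphson steps on the Poisson log-likelihood. This is an illustrative stand-in for the GLM.jl fit, run on noise-free synthetic data with known parameters (Y0 = 2, β = 0.8), not the fitting code used in the analysis:

```python
import numpy as np

def fit_poisson_power(N, y, iters=25):
    """MLE of y ~ Poisson(Y0 * N^beta) with a log link, via Newton-Raphson
    on theta = (ln Y0, beta)."""
    X = np.column_stack([np.ones(len(N)), np.log(N)])
    # Start from the log-log OLS fit, then refine with Newton steps.
    theta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    for _ in range(iters):
        mu = np.exp(X @ theta)            # model mean Y0 * N^beta
        grad = X.T @ (y - mu)             # score of the Poisson log-likelihood
        hess = X.T @ (X * mu[:, None])    # Fisher information
        theta = theta + np.linalg.solve(hess, grad)
    return np.exp(theta[0]), theta[1]     # (Y0, beta)

N = np.array([1e3, 5e3, 2e4, 1e5, 5e5])
y = 2.0 * N ** 0.8                        # noise-free counts with Y0 = 2, beta = 0.8
Y0, beta = fit_poisson_power(N, y)
print(round(Y0, 3), round(beta, 3))       # → 2.0 0.8
```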
The canonical link function g of a distribution f is the inverse of b′ and thus transforms the mean of the distribution to θ. Alternative link functions are permissible, but the canonical link enables further simplification of the likelihood function, as shown below [1].
B.1. Poisson Distribution. For the Poisson power-law models Y ∼ Pois(Y0 N^β), we used Eq. 1 for g(µ) and η(N). Substituting into Eq. 5, for a single observation we get ℓi = Yi (ln Y0 + β ln Ni) − Y0 Ni^β − ln(Yi!). This gives the criterion for the maximum likelihood estimate for the Poisson power-law models: maximize Σi [Yi (ln Y0 + β ln Ni) − Y0 Ni^β] over Y0 and β.

B.2. Negative Binomial Distribution. The negative binomial distribution NB(Yi; θi, r) has the probability density function given by Eq. 7, where r is a shape parameter and θi is the expected value of the distribution.
This is a transformation of the NB(Yi; r, p) parameterization, using θ = pr/(1 − p) [33], such that E[Y] = θ and the variance is Var[Y] = θ(1 + θ/r). Additionally, in the limit r → ∞, the negative binomial distribution is equivalent to the Poisson distribution; it thus allows us to extend the Poisson distribution to account for over-dispersion instead of enforcing Var[Y] = E[Y] [33,72]. The negative-binomial power-law models Y ∼ NB(θ = Y0 N^β, r) use Eq. 1 for g(µ) and η(N). Substituting Eq. 7 into Eq. 5 gives the log-likelihood for a single observation: ln NB(Yi; θi, r) = Yi ln θi − Yi ln(θi + r) − r ln(1 + θi/r) − ln(Yi!) + ln Γ(Yi + r) − ln Γ(r). Dropping terms independent of the parameters (Y0 and β), and expanding ln θi = ln Y0 + β ln Ni, gives the following criterion for the maximum likelihood estimate: maximize Σi [Yi ln θi − (Yi + r) ln(θi + r)] over Y0 and β. As the negative binomial is only a GLM for fixed r, we fit the model using Expectation-Maximization, iteratively alternating between MLE of the model parameters (Y0 and β) and of the shape parameter (r) [8], until we converge to an MLE estimate for all three parameters. This procedure is consistent with other statistical packages [72] and results in an MLE estimate for both the model parameters and the dispersion parameter r [33]. Again, model fitting was performed using GLM.jl [8].
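The mean/shape parameterization can be checked numerically with NumPy's NB(n, p) sampler via the mapping p = r/(r + θ); the values θ = 5 and r = 2 below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
theta, r = 5.0, 2.0          # mean and shape of NB(theta, r)
p = r / (r + theta)          # map to NumPy's NB(n, p) parameterization
draws = rng.negative_binomial(r, p, size=1_000_000)

# Sample moments should match E[Y] = theta and Var[Y] = theta * (1 + theta / r):
print(round(draws.mean(), 1))  # ≈ 5.0
print(round(draws.var(), 1))   # ≈ 17.5
```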
C. Ordinary Least Squares. Prior works in scaling analysis have minimized the least-squares criterion of the log-transformed data [10,9,17,39], as shown in Eq. 8: minimize Σi (ln Yi − ln Y0 − β ln Ni)² over Y0 and β. Expanding Eq. 8 by dropping terms that are independent of Y0 and β, and transforming to a maximization problem, results in Eq. 9 and is equivalent to a GLM model of the form ln Y ∼ N(ln Y0 + β ln N, σ²). As noted by Leitao et al., this is an MLE estimate of the model assuming the fluctuations between ln Y and ln Y0 + β ln N are normally distributed and the probability of zero-count data is zero [34]. However, as 43.8% of counties in our dataset have no EVSE stations, P(y = 0) = 0 is not a reasonable approximation. For comparison purposes only, we have computed OLS fits of the log-transformed data for both EVSE and gasoline stations, but have excluded them from our analysis in favor of models that can fully explain the data.

Fits with Population.
For gasoline stations vs. population, our fit had R²McF = 0.903, a likelihood ratio of λLR = 30.9 × 10³, and a BIC score of 3.35 × 10³, based on 3,111 out of 3,143 counties. The fitted scaling exponent was β = 0.77 ± 0.01, showing close agreement with both previous work [10] and the fitted exponent for the NB model (β = 0.76 ± 0.01). These results suggest that approximating the power-law fit with log-transformed data is reasonable for gasoline stations. For EVSE stations vs. population, our fit had R²McF = 0.754, a likelihood ratio of λLR = 12.8 × 10³, and a BIC score of 4.20 × 10³, based on 1,645 out of 3,143 counties. The fitted scaling exponent was β = 0.71 ± 0.03, in contrast to the β = 1.07 ± 0.05 predicted by the NB power model. These results suggest that excluding zero-count data significantly alters the model's fit and is inappropriate for the EVSE dataset.

Fits with Registrations.
For gasoline stations vs. registrations, our fit had R²McF = 0.859, a likelihood ratio of λLR = 12.5 × 10³, and a BIC score of 2.08 × 10³, based on 1,287 out of 1,288 counties. The fitted scaling exponent was β = 0.66 ± 0.02, showing close agreement with the fitted exponent for the NB model (β = 0.68 ± 0.02). For EVSE stations vs. registrations, our fit had R²McF = 0.81, a likelihood ratio of λLR = 5.46 × 10³, and a BIC score of 1.34 × 10³, based on 498 out of 732 counties. The fitted scaling exponent was β = 0.57 ± 0.03, in contrast to the β = 0.70 ± 0.03 predicted by the NB power model.

OLS vs. NB Model. We selected the negative binomial version of the power law model for all models presented in the paper, as it directly accounts for zero-count data and thus supports the largest slice of the dataset.

Statistical Testing
In order to check the quality of the various GLM fits, perform model selection, and check the significance of the fitted parameters, we performed various statistical tests.

A. McFadden's Pseudo-R-Squared. R²McF compares the log-likelihood of the model (ln L) to the log-likelihood of the null model (ln L0): R²McF = 1 − ln L / ln L0.
To compute ln L0, we fit a null model using the same link and distribution but with a constant η and use its log-likelihood for ln L0. As noted by McFadden, R²McF is typically smaller than the R² of linear regression, and a value of 0.2 to 0.4 represents an "excellent fit" [42]. Additionally, as R²McF is a comparison of a model to its null model, it can be used to compare disparate models.
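A sketch of the computation with hypothetical log-likelihood values:

```python
def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo R-squared: 1 - lnL / lnL0.
    Both log-likelihoods are negative; a better model has lnL closer to 0."""
    return 1.0 - loglik_model / loglik_null

# Hypothetical log-likelihoods for a fitted model and its null model:
print(round(mcfadden_r2(-120.0, -400.0), 2))  # → 0.7
```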

B. Root Mean Squared Deviation.
The Root Mean Squared Deviation (RMSD), or Root Mean Squared Error (RMSE), is defined as RMSD = sqrt( (1/n) Σi (Ŷi − Yi)² ), where Ŷi are the predictions generated by the fitted model, Yi are the observed values, n is the number of observations, and i = 1 . . . n indexes the observations.

C. Likelihood Ratio Test. The likelihood ratio test can be used to check if the model is significant relative to the null model. For large sample sizes, the test statistic λLR is chi-squared (χ²) distributed under the null hypothesis that the model is not significantly better than the null model [69], with the alternative hypothesis that the model's fit is significantly different from the null model.
We then compare the test statistic to the critical value of the χ² distribution with k degrees of freedom, where k is the difference in the degrees of freedom between the model and the null model. We found all models to be highly significant, with P < 10⁻⁹⁹.
Alexius Wadell, Matthew Guttenberg, Christopher P. Kempes, and Venkatasubramanian Viswanathan

D. Parameter Significance. To check if the fitted parameters of each GLM are significantly different from zero, we performed a Wald test [67].
The test statistic is W = (X − µ) / SE, where X is the value of the fitted parameter, µ is its value under the null hypothesis, and SE is the standard error of the parameter as computed while fitting the model. We then compare |W| to the critical values of the standard normal distribution N(0, 1). For all fitted models, we found nearly all parameters to be significantly different from zero (P < 0.001). For the quadratic model of the EVSE count data, we failed to reject the null hypothesis that the intercept parameter is zero (P > 0.1). As the other parameters were significantly different from zero, this does not represent a degenerate model.
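A sketch of the two-sided Wald test using only the standard library (the complementary error function gives the normal tail probability); the β and SE values are illustrative:

```python
import math

def wald_p(estimate, se, null_value=0.0):
    """Two-sided Wald test p-value: W = (estimate - null)/SE referred to N(0, 1).
    erfc(|W| / sqrt(2)) equals 2 * (1 - Phi(|W|)), so no SciPy is needed."""
    w = (estimate - null_value) / se
    return math.erfc(abs(w) / math.sqrt(2))

# Is a fitted beta = 0.77 with SE 0.01 significantly different from one?
print(wald_p(0.77, 0.01, null_value=1.0) < 0.001)  # → True
```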
In addition, we repeated the test for the power-law fits to check if β was significantly different from one [67]. Using the above test, we found β to be significantly different from one for all models (P < 0.001).

E. Compare Parameters.
To compare the fitted coefficients between models, we followed the recommendation of Clogg et al. and computed the test statistic z = (βa − βb) / sqrt(s²(βa) + s²(βb)) [16], where βa and βb are the fitted parameters of the two models, and s²(·) is the squared standard error of the corresponding fitted parameter. We then compare the test statistic z to the standard unit normal N(0, 1), testing the null hypothesis that the difference βa − βb is zero against the alternative hypothesis that the difference is non-zero [16].
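A sketch of the test statistic with illustrative parameter estimates and standard errors:

```python
import math

def clogg_z(beta_a, se_a, beta_b, se_b):
    """Clogg et al.'s z statistic for comparing coefficients from two models:
    z = (beta_a - beta_b) / sqrt(se_a^2 + se_b^2), referred to N(0, 1)."""
    return (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Compare two hypothetical estimates of beta from different spatial units:
print(round(clogg_z(0.76, 0.01, 0.83, 0.02), 2))  # → -3.13
```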
We used this statistical test to compare estimates of the scaling parameter β using different spatial units of analysis, as discussed in Section 4.

F. Bayesian Information Criterion.
To compare the predictive power of various models, we computed the Bayesian Information Criterion (BIC) for each of the fitted GLMs. The BIC is based on the log-likelihood (ln L), the number of observations (n), and the number of parameters (k), or complexity, of the model [56]: BIC = k ln n − 2 ln L.
We can then compare two models by using their BIC scores to estimate the Bayes factor B12 that one model is better than the other [34,24]: B12 ≈ exp(∆BIC / 2) (Eq. 14), where ∆BIC = BIC2 − BIC1; thus, if BIC1 < BIC2, then model 1 is better than model 2, and the likelihood ratio of model 1 vs. 2 is given by Eq. 14. As proposed by Leitao et al. [34], we used ∆BIC > 6 as our threshold to declare one model significantly better than the other. This corresponds to a Bayes factor of B12 ≈ 20.1, i.e., one model is at least 20 times more likely to describe the data than the other.
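The BIC comparison can be sketched directly; note that ∆BIC = 6 recovers the B12 ≈ 20.1 threshold quoted in the text:

```python
import math

def bic(loglik, n, k):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2.0 * loglik

def bayes_factor(bic1, bic2):
    """Approximate Bayes factor B12 from a BIC difference: exp(dBIC / 2)."""
    return math.exp((bic2 - bic1) / 2.0)

# A dBIC of 6 corresponds to B12 ~ 20.1:
print(round(bayes_factor(0.0, 6.0), 1))  # → 20.1
```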

Spatial Unit of Analysis
Our decision to use counties as the basis of our analysis was driven primarily by data availability, but also by our desire to provide estimates for the entire United States. As noted in Section 1, most of our data sources were provided at the county level. As such, the use of Urban Areas [62] would entail spatially interpolating our dataset and would be highly susceptible to the modifiable areal unit problem and other resampling artifacts [6]. We note that prior scaling works have used a variety of spatial units for their analysis, including municipalities [35,32,14,34], urban areas [38,34], metropolitan statistical areas (MSA) [11,14,34,10,71,22], core-based statistical areas [37,23,22], and commuting zones [43]. Additionally, numerous prior works examining charging infrastructure have similarly used counties as the basis of their analysis [19,18,45,31].
The United States Office of Management and Budget defines 929 core-based statistical areas (CBSAs), each consisting of a "county or counties associated with at least one core . . . plus adjacent counties that have a high degree of social and economic integration with the core measured through commuting ties" [64]. As such, the use of CBSAs would not introduce resampling artifacts into our dataset. We note that the Office of Management and Budget cautions against the use of CBSAs "as a general-purpose geographic framework" [64]. CBSAs are further divided into Metropolitan Statistical Areas (MSA) and Micropolitan Statistical Areas (µSA), based on the population of the core county. However, data availability and inconsistencies in state-to-state reporting methodologies could introduce significant interpolation and omission artifacts. For example, the New York-Newark-Jersey City, NY-NJ-PA MSA would be excluded from Fig. 1a and 1b in our manuscript due to the lack of electric vehicle registration data for NJ and PA. Similarly, the Lewiston, ID-WA MSA spans Washington, which tabulates monthly vehicle registrations by vehicle type, and Idaho, which provides annual reports and does not differentiate by vehicle type.
A. Model Fits using Core-Based Statistical Areas. We repeated our model fits using CBSAs instead of counties. To do this, we summed the population, EVSE station, gasoline station, EV registration, and vehicle registration counts over the counties composing each CBSA. We excluded CBSAs from the analysis if data were missing for any of the constituent counties. For example, the New York-Newark-Jersey City, NY-NJ-PA MSA was used when fitting EVSE stations vs. population, as we had all the requisite data, but not for EVSE stations vs. EV registrations, as we lack EV registration data for NJ and PA. We followed the same model fitting procedure as described in the Model Fitting section. We consistently found strong evidence to prefer power-law models over alternative models, as measured by their BIC scores. Using the above statistical tests, we confirmed that the CBSA fits were significant and not degenerate. Additionally, we performed two tests to compare the CBSA estimates of β to our county-based estimates. First, we applied Clogg et al.'s recommendation to use a z-test to check if the parameter estimates significantly differed [16]. Second, we checked if the CBSA estimate for β was significantly different from one using a Wald test [67]. The following further discusses our estimates of β using the NB and OLS power law models. Overall, the estimates for β using CBSAs were similar to our estimates using counties. Fig. 2 is a recreation of Fig. 1 using CBSAs, instead of counties, as the spatial unit of analysis.
A.1. Stations vs. Registrations. Both the OLS and NB models provided similar fits for gasoline (β = 0.77 ± 0.04 vs. 0.76 ± 0.04) and EVSE (β = 0.68 ± 0.04 vs. 0.71 ± 0.04) stations. As before, we find the NB model to be the more appropriate choice, as the OLS model is unable to account for the 135 of 392 (34.4%) CBSAs lacking an EVSE station. We failed to reject the null hypothesis that our CBSA and county-based estimates of β for EVSE stations are equal (P = 0.973). Additionally, we found evidence that the CBSA estimate was also sub-linear (P < 0.001). For gasoline stations, we found evidence that the CBSA and county-based estimates of β differed (P < 0.001) and that the CBSA estimate remained sub-linear (P < 0.001).
A.2. Registrations vs. Population. We found that the OLS models performed significantly better, based on the BIC, than the NB models for both gasoline (373 vs. 8996) and EVSE (779 vs. 3841) stations. The OLS model was able to fit all 392 CBSAs for which we had data for the gasoline stations, but only 275 CBSAs for the EVSE stations, due to zero-count data. Both the OLS and NB models provided similar fits for gasoline (β = 1.00 ± 0.03 vs. 0.98 ± 0.03) and EVSE (β = 1.43 ± 0.09 vs. 1.38 ± 0.11) stations. As before, we find the NB model to be the more appropriate choice, as the OLS model is unable to account for the 117 of 392 (29.8%) CBSAs lacking an EVSE station. We failed to reject the null hypothesis that our CBSA and county-based estimates of β for EVSE stations are equal (P = 0.212). Additionally, the CBSA estimate of β remained super-linear (P < 0.001). For gasoline stations, we failed to reject the null hypothesis that our CBSA and county-based estimates of β are equal (P = 0.621). Additionally, we failed to reject the null hypothesis that β = 1 using our CBSA estimate, consistent with our county-based estimate (P = 0.254).
A.3. Stations vs. Population. The OLS and NB models provided similar fits for gasoline (β = 0.83 ± 0.02 vs. 0.83 ± 0.02) and EVSE (β = 0.95 ± 0.05 vs. 1.03 ± 0.07) stations. As before, we find the NB model to be the more appropriate choice, as the OLS model is unable to account for the 161 of 927 (17.4%) CBSAs lacking an EVSE station. Using Clogg's test, we failed to reject the null hypothesis that the CBSA and county-based estimates of β for EVSE stations are equal (P = 0.262). We also failed to reject the null hypothesis that β = 1 using a Wald test (P = 0.407). We note that our estimate of β remains significantly larger for EVSE stations than for gasoline stations, consistent with our assessment of EVSE scaling as being out-of-equilibrium. For gasoline stations, we found evidence that the CBSA estimate of β differed (P < 0.001) from the county estimate. However, using a Wald test, we found that the estimate remained sub-linear (P < 0.001).

We fit models for the following relationships:
• Number of EVSE stations vs. county population
• Number of gasoline stations vs. county population
• Number of gasoline stations vs. passenger vehicle registrations
• Number of EVSE stations vs. electric vehicle registrations
• Passenger vehicle registrations vs. county population
• Electric vehicle registrations vs. county population

Building on the procedure developed by Leitao et al. [34], we fit the following models to predict the above relationships:
• Null: y = c
• Linear: y = ax + c
• Quadratic: y = ax² + bx + c
• Power (Log-Log Transform): ln y = a ln x + c
• Power (Log Link): ln E[y] = a ln x + c