## Abstract

In many procurement auctions, the bidders' unobserved costs depend both on a common shock and on idiosyncratic private information. Assuming a multiplicative structure, I derive sufficient conditions under which the model is identified and propose a non-parametric estimation procedure that results in uniformly consistent estimators of the cost components' distributions. The estimation procedure is applied to data from Michigan highway procurement auctions. Private information is estimated to account for 34% of the variation in bidders' costs. It is shown that accounting for unobserved auction heterogeneity has important implications for the evaluation of the distribution of rents, efficiency, and optimal auction design.

## INTRODUCTION

A researcher analysing the choices of an economic agent often cannot observe some of the inputs that went into the agent's decision problem. Such missing information is typically referred to as unobserved heterogeneity, and addressing it is important in many empirical applications. For example, labour economists and macroeconomists face this issue when they analyse the decision to go to college, since an individual's costs of, or returns to, education are only imperfectly measured in the data. The same problem arises in the empirical analysis of auctions with private information. For example, the results of an empirical evaluation of the distribution of rents, efficiency, or an optimal design of an auction mechanism depend on the researcher's ability to uncover the distribution of bidders' private information, and the auction literature has long emphasized that unobserved variation in the distribution of bidders' private information is likely present in many environments. Several methods have been proposed in the literature to control for unobserved auction heterogeneity. However, it has remained unclear whether private information can be separated from unobserved auction heterogeneity on the basis of auction data.

In a first-price auction environment where bidder valuations are known to them but are their private information, a growing literature started by Laffont, Ossard and Vuong (1995) and Guerre, Perrigne and Vuong (2000) uses the equilibrium relationship between bids and valuations to uncover the distribution of private information. The identification power of the methods proposed by this literature crucially relies on the fact that, after controlling for observed auction characteristics, the remaining variation in bids is generated by variation in private information. More specifically, these methods cannot be directly applied in an environment where a part of the variation in bids may be generated by systematic differences in auctions that are known to bidders but unobserved by the researcher.

This paper studies the first-price auction environment with private information and unobserved auction heterogeneity. It uses insights from a multifactor measurement error literature to develop a non-parametric estimation method to recover distributions of private information and unobserved auction heterogeneity from submitted bids. It also establishes sufficient conditions under which these distributions are identified and shows uniform consistency of the estimators. The estimation method is applied to data from Michigan highway procurement auctions to quantify the importance of private information in this market and to demonstrate the implication of unobserved auction heterogeneity for the evaluation of the distribution of rents, efficiency, and optimal mechanism design.

I assume that an environment with unobserved auction heterogeneity with *n* bidders can be characterized by a set of (*n* + 1) factors. One of the factors, a common cost component, represents information about cost attributes that are available to all bidders. Part of this information may not be observed by the researcher. Other factors, individual cost components, reflect cost attributes privately observed by each bidder. A bidder's costs are given by the product of the common cost component and this bidder's individual cost component. This cost structure implies that the distribution of costs may vary across projects even after all project characteristics known to the researcher are held constant. I allow bidders to be asymmetric, so that the distribution of the individual cost component may vary with the observable bidder characteristics.

The unobserved part of the common component (unobserved auction heterogeneity) generates dependence between bids submitted in the same auction. This dependence can be used to recover the distributions of the unobserved auction heterogeneity and individual bid components. In particular, I show that the distributions of components are identified from the joint distribution of two arbitrary bids submitted in the same auction when the individual cost components are independently distributed across bidders and are independent from the common component. Further, the distributions of individual bid components are used to uncover the distributions of individual cost components. The identification argument suggests a number of tests that can be performed to verify whether assumptions of the multifactor model are satisfied in the data. The paper also demonstrates that the set of bid distributions that can be rationalized by affiliated private values (APV), another informational environment that induces dependence in bids submitted in the same auction, is distinct from the set of bid distributions that can be rationalized by the model with unobserved auction heterogeneity. It proposes a test that can be used in practice to distinguish between these environments.

The estimation procedure proposed in the paper follows the steps of the identification argument. The Monte Carlo simulations confirm that the estimation procedure performs well in samples of moderate size.

I use data from Michigan highway procurement auctions to quantify the importance of accounting for unobserved auction heterogeneity. I estimate the distributions of private information and unobserved auction heterogeneity using the estimation procedure described earlier in the paper. I test the assumptions of the model and find that they are strongly supported by the data. The results of the estimation suggest that variation in private information accounts for only 34.4% of the bid variation. While comparing these results to the results obtained under alternative assumptions of independent private values (IPV) and APV, I find that the bid strategies recovered under alternative assumptions tend to overestimate the markup over bidders' cost relative to the estimates obtained under the assumption of unobserved auction heterogeneity. I also find that the distributions of bidders' costs recovered under alternative assumptions tend to have lower means and higher variances compared to the estimates obtained under unobserved heterogeneity. In particular, I find that the average markup estimated under the unobserved heterogeneity assumption is 8.4%, whereas the average markup is 14% for the APV assumption and 19% for the IPV assumption. The variance of the estimated costs distribution is 18% and 22% higher under these alternative assumptions relative to the variance of the costs distribution estimated under unobserved auction heterogeneity.

I use three sets of estimates to derive an optimal reserve price that minimizes procurement costs. I find that the reserve price chosen on the basis of IPV or APV estimates leads to significantly higher costs of procurement than the reserve price chosen on the basis of the estimates that account for the presence of unobserved auction heterogeneity. This result holds both in the case where the reserve price is derived as a function of a specific realization of unobserved heterogeneity and in the case where a single reserve price is chosen in such a way as to minimize the average cost of procurement where the average is computed with respect to the distribution of unobserved heterogeneity. In the latter case, the average cost of the procurement is about 9%–19% lower than the average cost achieved when the reserve price based on either IPV or APV estimates is used.

This paper contributes to the literature on the estimation of auction models that aims to uncover the distribution of bidders' private information from the submitted bids. In particular, Donald and Paarsch (1993), Donald and Paarsch (1996) and Laffont, Ossard and Vuong (1995) develop parametric methods to recover the distribution of costs from the observed distribution of bids. Elyakime *et al.* (1994), Elyakime *et al.* (1997) propose a non-parametric method to estimate the distribution of costs. Guerre, Perrigne and Vuong (2000) study identification of the first-price auction model with symmetric bidders and propose a uniformly consistent estimation procedure. Li, Perrigne and Vuong (2000), Li, Perrigne and Vuong (2002) extend the result to the APV and the conditionally IPV models. Campo, Perrigne and Vuong (2003) prove identification and develop a uniformly consistent estimation procedure for first-price auctions with asymmetric bidders and APV. These papers rely on the assumption of no unobserved auction heterogeneity, *i.e.* they explicitly use a one-to-one mapping between the distribution of bidders' costs and the distribution of observed bids that arises in such environments.

The paper by Li, Perrigne and Vuong (2000), LPV hereafter, also uses the methods of multifactor measurement error analysis. LPV consider the model with conditionally IPV. They assume that variation in bids is generated by variation in observable factors and private information only, so that their model does not allow for unobserved auction heterogeneity. The innovation in LPV is to allow for bidders' costs to be composed of common and individual factors. Thus, the structure of costs is similar to the one in my paper. However, unlike the environment with unobserved auction heterogeneity studied in my paper, in LPV the common factor is part of the private information of the bidder. Moreover, the bidder himself does not observe the realization of the common factor separately from the entire realization of his costs (his private information). He only knows the draw of his private information that is composed of common and individual factors. This implies that standard methods (that do not distinguish between common and individual factors) are still fully applicable in this environment. Having estimated the distribution of private information, LPV apply the multifactor decomposition, a result from a measurement error literature, in order to understand correlation patterns in bidders' private information.

The few papers that address the issue of unobserved auction heterogeneity include Campo, Perrigne and Vuong (2003), Bajari and Ye (2003), Haile, Hong and Shum (2003), Hong and Shum (2002), Athey and Haile (2000), and Chakraborty and Deltas (1998). The first two papers rely on the assumption that the number of bidders can serve as a sufficient statistic for the unobserved auction heterogeneity. Haile, Hong and Shum (2003) appeal to the instrumental variables approach to control for the variation generated by unobserved factors. More recently, Guerre, Perrigne and Vuong (2009) build on this methodology to identify the model with unobserved heterogeneity based on exclusion restrictions derived from bidders' endogenous participation. Hong and Shum (2002) account for unobserved auction heterogeneity by modelling the median of the bid distribution as a normal random variable with a mean that depends on the number of bidders. Athey and Haile (2000) study identification of auction models with unobserved auction heterogeneity in the context of second-price and English auctions. Chakraborty and Deltas (1998) assume that the distribution of bidders' valuations belongs to a two-parameter distribution family. They use this assumption to derive small-sample estimates for the corresponding parameters of the auction-specific valuation distributions. The estimates are later regressed on observable auction characteristics to determine the percentage of values variation that is due to unobserved auction heterogeneity.

Highway procurement auctions have been extensively studied in the literature. Porter and Zona (1993) find evidence of collusion in Long Island highway procurement auctions. Bajari and Ye (2003) reject the hypothesis of collusive behaviour in procurement auctions conducted in Minnesota, North Dakota, and South Dakota. Jofre-Bonet and Pesendorfer (2003) find evidence of capacity constraints in California highway procurement auctions. Hong and Shum (2002) find some evidence of common values in bidders' costs in the case of New Jersey highway construction auctions. Bajari and Tadelis (2001) and Bajari, Houghton and Tadelis (2004) study the implications of the incompleteness of procurement contracts.

The paper proceeds as follows. Section 2 describes the model. Section 3 discusses identification and testable implications of the model. Section 4 details the estimation procedure and summarizes results of the simulation study. Section 5 presents results of estimation and Section 6 concludes.

## THE MODEL

This section describes the first-price auction model under unobserved auction heterogeneity and summarizes properties of the equilibrium bidding strategies.

The seller offers a single project for sale to *m* bidders. Bidder *i*'s cost is equal to the product of two components: one is common and known to all bidders; the other is individual and the private information of firm *i*. Both the common and the individual cost components are random variables, and they are denoted by the capital letters *Y* and *X*, respectively. The small letters *y* and *x* denote realizations of the common component and the vector of individual components. The two random variables (*Y*, *X*) are distributed on $[\underline{y},\overline{y}]\times[\underline{x},\overline{x}]^{m}$, $\underline{y}>0$, $\underline{x}>0$, according to the joint probability distribution function *F*.

### Asymmetries between bidders

I assume that there are two groups of bidders: *m*_{1} bidders are from Group 1 and *m*_{2} bidders, *m*_{2} = (*m* − *m*_{1}), are from Group 2. Thus, the vector of individual cost components is given by *X* = (*X*_{11},…,*X*_{1m1},*X*_{2(m1 + 1)},…,*X*_{2m}). The model and all the results can easily be extended to the case of more than two groups. I focus on the case of two groups for the sake of expositional clarity. Groups are defined from the observable characteristics of bidders.

Assumptions (*D*_{1})–(*D*_{4}) are maintained throughout the paper.

(*D*_{1}) *Y* and the *X*_{j}'s are mutually independent and distributed according to

$$F(y,x_{1},\dots,x_{m}) = F_{Y}(y)\prod_{j=1}^{m_{1}}F_{X_{1}}(x_{j})\prod_{j=m_{1}+1}^{m}F_{X_{2}}(x_{j}),$$

where *F*_{Y}, *F*_{X1}, and *F*_{X2} are the marginal distribution functions of *Y*, *X*_{1j}, and *X*_{2j}, respectively. The supports of *F*_{Y} and *F*_{Xk} are given by $[\underline{y},\overline{y}]$ and $[\underline{x},\overline{x}]$, *k* = 1,2;

(*D*_{2}) The probability density functions of the individual cost components, *f*_{X1} and *f*_{X2}, are continuously differentiable and strictly positive on the interior of $[\underline{x},\overline{x}]$.

(*D*_{3}) *E*[*X*_{1j}] = 1.

(*D*_{4}) The number of bidders is common knowledge. There is no binding reservation price.

Assumption (*D*_{2}) ensures the existence and uniqueness of equilibrium.^{1} The identification result relies on assumptions (*D*_{1}) and (*D*_{3}). In particular, assumption (*D*_{3}) is used to fix the scale of one of the cost components.^{2} (*D*_{4}) summarizes miscellaneous assumptions about the auction environment.

The auction environment can be described as a collection of auction games indexed by the different values of the common component. An auction game corresponding to the common component equal to *y*, $y\in[\underline{y},\overline{y}]$, is analysed below.

The cost realization of bidder *i* is equal to *x*_{i}×*y*, where *x*_{i} is the realization of the individual cost component. The information set of bidder *i* is given by $\{y,x_{i}\}$. A bidding strategy of bidder *i* is a real-valued function defined on $[\underline{x},\overline{x}]$.

I use a small Greek letter β with subscript *yi* to denote the strategy of bidder *i* as a function of the individual cost components and a small Roman letter *b* to denote the value of this function at a particular realization *x*.

### Expected profit

The profit realization of bidder *i*, π_{yi}(*b*_{i},*b*_{ − i},*x*_{i}), equals (*b*_{i} − *x*_{i}×*y*) if bidder *i* wins the project and zero if he loses. The symbol *b*_{i} denotes the bid submitted by bidder *i*, and the symbol *b*_{ − i} denotes the vector of bids submitted by bidders other than *i*. At the time of bidding, bidder *i* knows *y* and *x*_{i} but not *b*_{ − i}. The bidder who submits the lowest bid wins the project. The interim expected profit of bidder *i* is given by

$$E[\pi_{yi}\mid X_{i}=x_{i},Y=y] = (b_{i}-x_{i}y)\,\Pr\bigl(b_{i}\le b_{j},\ \forall j\ne i\mid Y=y\bigr).$$

A Bayesian Nash equilibrium is then characterized by a vector of functions β_{y} = {β_{y1},…,β_{ym}} such that *b*_{yi} = β_{yi}(*x*_{i}) maximizes *E*[π_{yi}∣*X*_{i} = *x*_{i},*Y* = *y*] when *b*_{j} = β_{yj}(*x*_{j}), *j*≠*i*, *j* = 1,…,*m*, for every *i* = 1,…,*m* and for every realization of *X*_{i}.

McAdams (2003) and others establish that, under assumptions (*D*_{1})–(*D*_{2}), a vector of equilibrium bidding strategies β_{y} = {β_{y1},…,β_{ym}} exists and is unique. The strategies are strictly monotone and differentiable.

Next, I characterize a simple property of the equilibrium bidding strategies.

Proposition 1. If (α_{1}(·),…,α_{m}(·)) is a vector of equilibrium bidding strategies in the game with *y* = 1, then the vector of equilibrium bidding strategies in the game with *y*, $y\in[\underline{y},\overline{y}]$, is given by β_{y} = {β_{y1},…,β_{ym}} such that β_{yi}(*x*_{i}) = *y*α_{i}(*x*_{i}), *i* = 1,…,*m*.

The proposition^{3} shows that the bid function is multiplicatively separable into a common and an individual bid component, where the individual bid component is given by α_{i}(·). The proof of this proposition is based on the comparison of the two sets of first-order conditions and follows immediately from the assumption that costs are multiplicatively separable and that the common component is known to all bidders.
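The scaling argument behind this result can be sketched in two lines. This is an informal illustration rather than the paper's proof: it assumes that rivals bid *y*α_{j}(*x*_{j}) and that bidder *i* considers a bid of the form *b* = *ya*, and it uses the independence of *Y* and *X* so that conditioning on *Y* = *y* leaves the distribution of rivals' individual components unchanged.

```latex
% Sketch only: rivals play y * alpha_j(X_j); bidder i has cost x_i * y and bids b = y * a.
\begin{align*}
E[\pi_{yi}\mid x_i, y]
  &= (y a - x_i y)\,\Pr\bigl(y a \le y\,\alpha_j(X_j)\ \ \forall j \ne i\bigr) \\
  &= y\,(a - x_i)\,\Pr\bigl(a \le \alpha_j(X_j)\ \ \forall j \ne i\bigr).
\end{align*}
```

Because *y* > 0 enters only as a positive multiplicative constant, the value of *a* that maximizes this expression is the same as in the game with *y* = 1, namely *a* = α_{i}(*x*_{i}).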

Next, I characterize the necessary first-order conditions for the set of equilibrium strategies when *y* = 1. Note that α_{i}(·) denotes a strategy of bidder *i* as a function of the individual cost component and *a*_{i} the value of this function for a particular realization of *X*_{i}. The equilibrium inverse individual bid function for a group-*k* bidder is denoted by ξ_{k}. Since the function α_{k}(·) is strictly monotone and differentiable, the function ξ_{k}(·) is well defined and differentiable.

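The probability of winning in this game can be expressed as

$$\Pr\bigl(a\le\alpha_{j}(X_{j}),\ \forall j\ne i\bigr) = \bigl[1-F_{X_{k(i)}}(\xi_{k(i)}(a))\bigr]^{m_{k(i)}-1}\,\bigl[1-F_{X_{-k(i)}}(\xi_{-k(i)}(a))\bigr]^{m_{-k(i)}},$$

where *k*(*i*) denotes bidder *i*'s group and “−*k*(*i*)” denotes the complementary group. The necessary first-order conditions are then given by

$$\frac{1}{a-\xi_{k(i)}(a)} = (m_{k(i)}-1)\,\frac{f_{X_{k(i)}}(\xi_{k(i)}(a))\,\xi_{k(i)}'(a)}{1-F_{X_{k(i)}}(\xi_{k(i)}(a))} + m_{-k(i)}\,\frac{f_{X_{-k(i)}}(\xi_{-k(i)}(a))\,\xi_{-k(i)}'(a)}{1-F_{X_{-k(i)}}(\xi_{-k(i)}(a))}, \qquad (1)$$

where ξ_{k}^{′}(·) denotes the derivative of ξ_{k}(·).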

Equation (1) characterizes the equilibrium inverse individual bid function when *y* = 1. It describes the trade-off the bidder faces when choosing a bid: an increase in the markup over the cost may lead to a higher *ex post* profit if bidder *i* wins, but it reduces the probability of winning. The bid *a* is chosen in such a way that the marginal effects of an infinitesimal change in a bid on the winner's profit and the probability of winning sum to zero.
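The trade-off can be illustrated numerically in a simple special case. The sketch below is not taken from the paper: it assumes symmetric bidders, *y* = 1, and individual costs uniform on [1, 2], for which the textbook equilibrium bid is α(*x*) = *x* + (2 − *x*)/*m*. The names `alpha`, `win_prob`, and `best_response` are hypothetical. A grid search over bids confirms that, when rivals play α, the profit-maximizing bid for a given cost coincides with α evaluated at that cost.

```python
import numpy as np

m = 4  # number of bidders (illustrative assumption)

def alpha(x):
    """Closed-form symmetric equilibrium bid for X ~ U[1, 2] and y = 1 (assumed example)."""
    return x + (2.0 - x) / m

def win_prob(a):
    """Probability that bid a is below all m - 1 rival bids when rivals play alpha."""
    rival_cost = np.clip((m * a - 2.0) / (m - 1.0), 1.0, 2.0)  # inverse bid xi(a), clipped to the support
    return (2.0 - rival_cost) ** (m - 1)                       # (1 - F(xi(a)))^(m - 1) for U[1, 2]

def best_response(x, grid_size=200001):
    """Grid search for the bid that maximizes (markup) x (probability of winning)."""
    grid = np.linspace(x, 2.0, grid_size)
    profit = (grid - x) * win_prob(grid)
    return grid[np.argmax(profit)]

x = 1.4
br = best_response(x)
print(br, alpha(x))
```

Raising the bid above `br` increases the markup but lowers the winning probability by more, and the reverse holds below it, which is the balance the first-order condition expresses.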

The next section uses properties of the equilibrium bidding functions to show how the primitives of the first-price auction model can be recovered from the submitted bids in the presence of unobserved auction heterogeneity.

## IDENTIFICATION AND TESTABLE IMPLICATIONS

The first part of this section formulates an identification problem and provides conditions under which a first-price auction model with unobserved auction heterogeneity is identified. The second part describes the restrictions this model imposes on the data. The third part discusses possible extensions.

### Identification

I assume that the econometrician has access to bid data, based on *n* independent draws from the joint distribution of (*Y*,*X*). The observable data are in the form {*b*_{ij}}, where *i* denotes the identity of the bidder, *i* = 1,…,*m*, and *j* denotes the project, *j* = 1,…,*n*. If the data represent equilibrium outcomes of the model with unobserved auction heterogeneity, then

$$b_{ij} = \beta_{y_{j}i}(x_{ij}), \qquad (2)$$

*i.e.* *b*_{ij} is a value of bidder *i*'s equilibrium bidding strategy corresponding to *y*_{j} evaluated at the point *x*_{ij}.

As was shown in the previous section, *b*_{ij} depends on the realizations of the common and individual components as well as on the joint distribution of the individual cost components. This section examines what properties of available data guarantee that there exists a unique triple {{*x*_{ij}},*F*_{Y},*F*_{X}} that satisfies equation (2), *i.e.* under what conditions the model from the previous section is identified.

Guerre, Perrigne and Vuong (2000) obtain an identification result by transforming the first-order conditions for optimal bids to express a bidder's cost as an explicit function of the submitted bid, the bid probability density function, and the bid distribution function. Under unobserved auction heterogeneity, the necessary first-order condition yields an expression for *x*_{ij}·*y*_{j} as a function of *b*_{ij} and of the bid probability density and distribution functions conditional on *Y* = *y*_{j}. The econometrician does not observe the realization of *Y* and, consequently, does not know the conditional distribution of bids for *Y* = *y*_{j}. Hence, it is not possible to establish identification based on the above first-order conditions.

The idea of my approach is to focus on the joint distributions of bids submitted in the same auction instead of the marginal bid distributions in order to identify the model with unobserved auction heterogeneity.

I use *B*_{i} to denote the random variable that describes the bid of bidder *i* of group *k*(*i*) with distribution function *G*_{Bk(i)}(·) and the associated probability density function *g*_{Bk(i)}(·); *b*_{ij} denotes the realization of this variable in auction *j*. The econometrician observes the joint distribution function of (*B*_{i1},…,*B*_{il}) for all subsets (*i*_{1},…,*i*_{l}) of (1,…,*m*).^{4}

Proposition 1 establishes that

$$b_{ij} = y_{j}\,a_{ij},$$

where *a*_{ij} is a hypothetical bid that would have been submitted by bidder *i* if *y* were equal to one. I use *A*_{i} to denote the random variable with realizations equal to *a*_{ij}. The associated distribution function is denoted by *G*_{Ak(i)}(·) with the probability density function *g*_{Ak(i)}(·). Note that the econometrician does not observe *y*_{j} and neither, therefore, *a*_{ij}. The distribution of *A*_{i} is latent.

The identification result is established in two steps. First, it is shown that the probability density function of *Y* can be uniquely determined from the joint distribution of two bids that share the same common cost component. Further, it is shown that the probability density function of *A*_{k} can also be uniquely determined if the joint distribution is for two bids such that at least one of them corresponds to a bidder of group *k*. Second, monotonicity of the inverse bid function is used to establish identification of the cumulative distribution functions *F*_{X1} and *F*_{X2} from the distributions of the individual bid components, *G*_{A1} and *G*_{A2}.

The following theorem is the main result of this section. It formulates sufficient identification conditions for the model with unobserved heterogeneity.

Theorem 1. If conditions (*D*_{1})–(*D*_{4}) are satisfied, then the probability density function *f*_{Y}(·) is identified from the joint distribution of (*B*_{i1},*B*_{i2}), where (*i*_{1},*i*_{2}) is any pair such that *i*_{1},*i*_{2}∈{1,…,*m*}. Further, *f*_{Xk}(·) is identified from the joint distribution of (*B*_{i1},*B*_{i2}) if either *k*(*i*_{1}) = *k* or *k*(*i*_{2}) = *k* or both.

Theorem 1 states that the probability density functions of the cost components, *f*_{Xk}(·) and *f*_{Y}(·), are identified. The proof of this theorem consists of two steps and is given in Appendix A. In the first step, a statistical result by Kotlarski (1966)^{5} is applied to the log-transformed random variables *B*_{i1} and *B*_{i2} given by

$$\log(B_{i_{1}}) = \log(Y)+\log(A_{i_{1}}), \qquad \log(B_{i_{2}}) = \log(Y)+\log(A_{i_{2}}).$$

This result recovers the characteristic functions of log(*Y*), log(*A*_{i1}), and log(*A*_{i2}) from the joint characteristic function of (log(*B*_{i1}),log(*B*_{i2})). Let Ψ(·,·) and Ψ_{1}(·,·) denote the joint characteristic function of (log(*B*_{i1}),log(*B*_{i2})) and the partial derivative of this characteristic function with respect to the first component, respectively. Also, let Φ_{log(Y)}(·) and Φ_{log(Ak)}(·) denote the characteristic functions of log(*Y*) and log(*A*_{k}). Then,

$$\Phi_{\log(Y)}(t) = \exp\left(\int_{0}^{t}\frac{\Psi_{1}(0,u_{2})}{\Psi(0,u_{2})}\,du_{2} - itE[\log(A_{i_{1}})]\right), \quad \Phi_{\log(A_{i_{1}})}(t) = \frac{\Psi(t,0)}{\Phi_{\log(Y)}(t)}, \quad \Phi_{\log(A_{i_{2}})}(t) = \frac{\Psi(0,t)}{\Phi_{\log(Y)}(t)}. \qquad (3)$$

The characteristic functions of log(*Y*) and log(*A*_{k}) are uniquely determined once *E*[log(*A*_{1})] is fixed. It is convenient to start with the normalization *E*[log(*A*_{1})] = 0 and then adjust the recovered random variables *Y*, *A*_{1}, *A*_{2} to achieve the normalization in (*D*_{3}). Since there is a one-to-one correspondence between the set of characteristic functions and the set of probability density functions, the probability density functions of *Y*, *A*_{i1}, *A*_{i2} can be uniquely deduced from the characteristic functions of log(*Y*), log(*A*_{i1}), and log(*A*_{i2}), since log(·) is a strictly increasing function and α_{k}(·)∈(0,*∞*), *k* = 1,2. Note that the marginal distribution of a single bid per auction may not allow us to identify the distribution functions of *Y*, *A*_{i1}, *A*_{i2} because there is no unique decomposition of the sum (or product) into its components.

The second step in the proof establishes that the distributions of the individual cost components are identified with (possibly) asymmetric bidders and IPV. It is similar to the argument given in Laffont and Vuong (1996). Once the distribution of *X*_{1}, *F*_{X1}, that corresponds to *E*[log(*A*_{1})] = 0 is identified, and given that the expectation of such *X*_{1} is equal to *e*_{1}, then $e_{1}Y$ and $X_{k}/e_{1}$, with $k = 1,2$, are the unique random variables that correspond to the normalization in (*D*_{3}).
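The first step of the argument can be checked on simulated data. The following is a minimal sketch, not the paper's estimator: it assumes log-normal components with arbitrarily chosen variances, uses sample analogues of the joint characteristic function of two log bids and of its derivative with respect to the first argument, and applies the Kotlarski-type formula Φ_{log(Y)}(*t*) = exp(∫_{0}^{t} Ψ_{1}(0,*u*)/Ψ(0,*u*) d*u*) under the normalization *E*[log(*A*_{1})] = 0. All function names, the sample size, and the parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000                                   # number of simulated auctions (assumption)
log_y = rng.normal(0.0, 0.5, n)             # log of the common component
log_a1 = rng.normal(0.0, 0.3, n)            # log individual bid components, mean zero
log_a2 = rng.normal(0.0, 0.3, n)
lb1, lb2 = log_y + log_a1, log_y + log_a2   # the two observed log bids per auction

def psi(t1, t2):
    """Empirical joint characteristic function of (log B1, log B2)."""
    return np.mean(np.exp(1j * (t1 * lb1 + t2 * lb2)))

def psi_1(t1, t2):
    """Empirical derivative of psi with respect to its first argument."""
    return np.mean(1j * lb1 * np.exp(1j * (t1 * lb1 + t2 * lb2)))

def phi_log_y(t, grid_size=200):
    """Kotlarski-type formula: integrate psi_1(0, u) / psi(0, u) over [0, t]."""
    u = np.linspace(0.0, t, grid_size)
    ratio = np.array([psi_1(0.0, v) / psi(0.0, v) for v in u])
    integral = np.sum((ratio[1:] + ratio[:-1]) / 2.0 * np.diff(u))  # trapezoid rule
    return np.exp(integral)

t = 1.5
est_y = phi_log_y(t)                        # recovered CF of log(Y) at t
est_a1 = psi(t, 0.0) / est_y                # recovered CF of log(A1) at t
true_y = np.exp(-0.5 * (0.5 * t) ** 2)      # CF of N(0, 0.5^2) at t
true_a1 = np.exp(-0.5 * (0.3 * t) ** 2)     # CF of N(0, 0.3^2) at t
print(abs(est_y - true_y), abs(est_a1 - true_a1))
```

Only the pair of bids is used as input; the latent draws serve solely to compute the benchmark values. The sketch omits everything a practical procedure would need beyond this step, such as averaging over bidder pairs and smoothing when inverting the characteristic functions.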

A related question concerns identification of specific realizations *x*_{ij} and *y*_{j} corresponding to a particular bid *b*_{ij}. In this case, the answer is negative: *x*_{ij} and *y*_{j} cannot be separately identified. The reason is that we cannot solve for the (*m* + 1) unknowns {*y*_{j},{*a*_{ij}}_{i = 1,…,m}} from the *m* equations constructed on the basis of the *m* bids submitted in a given auction.

Theorem 1 establishes that identification of the model with unobserved auction heterogeneity crucially relies on the assumption of independence of individual components across bidders and from the common cost component. Next, I show how the validity of these assumptions can be evaluated within the framework of the model with unobserved auction heterogeneity.

### Testable implications

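Note that instead of log(*B*_{i1}) and log(*B*_{i2}), Kotlarski's result can be applied to the variables $\log(B_{i_{1}}/B_{i_{3}})$ and $\log(B_{i_{2}}/B_{i_{3}})$ since $\log(B_{i_{1}}/B_{i_{3}}) = \log(A_{i_{1}})-\log(A_{i_{3}})$ and $\log(B_{i_{2}}/B_{i_{3}}) = \log(A_{i_{2}})-\log(A_{i_{3}})$. Here log(*A*_{i3}), entering with a negative sign, plays the role of a common component, whereas log(*A*_{i1}) and log(*A*_{i2}) remain individual components. If the individual cost components *X*_{i1}, *X*_{i2}, and *X*_{i3} are independently distributed, then so are log(*A*_{i1}), log(*A*_{i2}), and log(*A*_{i3}). The characteristic functions of these variables can be computed using the joint characteristic function of ($\log(B_{i_{1}}/B_{i_{3}})$, $\log(B_{i_{2}}/B_{i_{3}})$), which I denote by Θ(·,·), according to a formula similar to equation (3).^{6} Specifically,

$$\Lambda_{\log(A_{i_{3}})}(-t) = \exp\left(\int_{0}^{t}\frac{\Theta_{1}(0,u_{2})}{\Theta(0,u_{2})}\,du_{2} - itE[\log(A_{i_{1}})]\right), \quad \Lambda_{\log(A_{i_{1}})}(t) = \frac{\Theta(t,0)}{\Lambda_{\log(A_{i_{3}})}(-t)}, \quad \Lambda_{\log(A_{i_{2}})}(t) = \frac{\Theta(0,t)}{\Lambda_{\log(A_{i_{3}})}(-t)},$$

where Θ_{1}(·,·) denotes the partial derivative of Θ(·,·) with respect to the first argument.

First, if *B*_{i1} and *B*_{i3} are submitted by bidders of the same group and the assumption about independence of individual components holds, then Λ_{log(Ai3)}(*t*) and Λ_{log(Ai1)}(*t*) should be equal.^{7}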

Second, I have relied only on the functional form and the independence of the individual cost components assumptions to obtain Λ_{log(Aik)}(·). The assumption of the independence of *Y* and *X* then implies that Λ_{log(Ai3)}(·) and Λ_{log(Ai1)}(·) have to coincide with the functions given by equation (3) under normalization *E*[log(*A*_{i1})] = 0.^{8} These observations are summarized by conditions (*W*_{1}) and (*W*_{2}).

(*W*_{1}) For any triple (*i*_{1},*i*_{2},*i*_{3}) such that either *i*_{1},*i*_{3}∈{1,…,*m*_{1}} or *i*_{1},*i*_{3}∈{*m*_{1} + 1,…,*m*}, and *i*_{k}≠*i*_{l} for any *k*,*l*∈{1,2,3}, *k*≠*l*, $\Lambda_{\log(A_{i_{1}})}(t) = \Lambda_{\log(A_{i_{3}})}(t)$ for every *t*∈(−*∞*,*∞*) under normalization *E*[log(*A*_{i1})] = 0.^{9}

(*W*_{2}) For any triple (*i*_{1},*i*_{2},*i*_{3}) such that *i*_{k}≠*i*_{l} for any *k*,*l*∈{1,2,3}, *k*≠*l*, $\Lambda_{\log(A_{i_{k}})}(t) = \Phi_{\log(A_{i_{k}})}(t)$, *k* = 1,3, for every *t*∈(−*∞*,*∞*) under normalization *E*[log(*A*_{i1})] = 0.

Independence of individual cost components further implies condition (*W*_{3}).

(*W*_{3}) For any quadruple (*i*_{1},*i*_{2},*i*_{3},*i*_{4})⊂{1,…,*m*} such that *i*_{k}≠*i*_{l} for any *k*,*l*∈{1,2,3,4}, *k*≠*l*, the ratios $B_{i_{1}}/B_{i_{2}}$ and $B_{i_{3}}/B_{i_{4}}$ are independently distributed.

Proposition 2 describes the implications of the independence assumptions.

Proposition 2. Let bidder *i*'s cost for the project *j* be given by *c*_{ij} = *x*_{ij}*y*_{j}.

(a) If the individual cost components are independent, then (*W*_{1}) has to be satisfied.

(b) If the individual cost components are independent, then (*W*_{3}) has to be satisfied.

(c) Further, if *Y* is independent of *X*, then (*W*_{2}) holds.

Note that (*W*_{1}) and (*W*_{2}) apply to samples with *m* ≥ 3, whereas statement (*W*_{3}) applies only to the samples with *m* ≥ 4. The proof of Proposition 2 is given in Appendix A.
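The ratio property behind the independence restriction on pairs of bid ratios is easy to see in simulation: the common component cancels from any ratio of two bids in the same auction, so ratios built from disjoint pairs of bidders inherit the independence of the individual components. The sketch below is illustrative only. The log-normal distributions, the sample size, and the helper `corr` are assumptions, and a zero sample correlation is a necessary implication of independence, not a formal test of it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000                                    # simulated auctions (assumption)
y = np.exp(rng.normal(0.0, 0.5, n))          # common component, one draw per auction
a = np.exp(rng.normal(0.0, 0.3, (n, 4)))     # four independent individual bid components
b = y[:, None] * a                           # bids b_ij = y_j * a_ij

def corr(u, v):
    """Sample correlation coefficient between two vectors."""
    return np.corrcoef(u, v)[0, 1]

corr_bids = corr(np.log(b[:, 0]), np.log(b[:, 1]))   # bids share y, so they co-move
r12 = np.log(b[:, 0] / b[:, 1])                      # y cancels in each ratio
r34 = np.log(b[:, 2] / b[:, 3])
corr_ratios = corr(r12, r34)
corr_squares = corr(r12 ** 2, r34 ** 2)              # a crude check beyond linear dependence
print(corr_bids, corr_ratios, corr_squares)
```

In this design the log bids are strongly correlated through the common component, while the two ratios are uncorrelated up to sampling noise.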

More generally, the conditions below describe a set of joint restrictions imposed on the data by all the assumptions of the model with unobserved auction heterogeneity.

(*W*_{4}) For every pair (*i*_{l},*i*_{p}), *i*_{l},*i*_{p} = 1,…,*m*, the functions Φ_{log(Y)}(·), Φ_{log(Ail)}(·), and Φ_{log(Aip)}(·) given by equation (3) represent characteristic functions of real-valued variables. In particular, the random variables that correspond to Φ_{log(Ail)}(·) and Φ_{log(Aip)}(·) should have the same support.

(*W*_{5}) The characteristic function Φ_{log(Y)}(·) does not depend on the pair (*i*_{l},*i*_{p}), *i*_{l},*i*_{p} = 1,…,*m*, which is used to derive it, and Φ_{log(Ail)}(·) = Φ_{log(Air)}(·) and Φ_{log(Aip)}(·) = Φ_{log(Aiq)}(·) for (*i*_{r},*i*_{q}) such that *k*(*i*_{r}) = *k*(*i*_{l}) and *k*(*i*_{q}) = *k*(*i*_{p}).

(*W*_{6}) The inverse bid functions

$$\xi_{k}(a) = a - \left[(m_{k}-1)\,\frac{g_{A_{k}}(a)}{1-G_{A_{k}}(a)} + m_{-k}\,\frac{g_{A_{-k}}(a)}{1-G_{A_{-k}}(a)}\right]^{-1}, \qquad k = 1,2,$$

are strictly increasing in *a*.^{10}

Proposition 3 establishes necessary conditions for a given data set to be rationalizable^{11} by the model with unobserved auction heterogeneity.

Proposition 3. If a model with unobserved heterogeneity generated the data, then conditions (*W*_{4})–(*W*_{6}) must hold.
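The inverse bid function that appears in the monotonicity condition maps an individual bid component back to the cost that generated it: the bid minus the reciprocal of a weighted sum of rivals' bid hazard rates, with weight (*m*_{k} − 1) on the bidder's own group and *m*_{−k} on the other group. A minimal sketch of this map follows. The function `inverse_bid` and its argument names are hypothetical, and the check uses an assumed symmetric example (costs uniform on [1, 2], *y* = 1, four bidders split into two identical groups) in which the equilibrium bid has the closed form *x* + (2 − *x*)/*m*.

```python
import numpy as np

def inverse_bid(a, m_own, m_other, g_own, G_own, g_other, G_other):
    """Cost implied by bid component a for a bidder facing (m_own - 1) rivals
    from its own group and m_other rivals from the other group."""
    hazard = ((m_own - 1) * g_own(a) / (1.0 - G_own(a))
              + m_other * g_other(a) / (1.0 - G_other(a)))
    return a - 1.0 / hazard

# Assumed symmetric example: X ~ U[1, 2], so A = alpha(X) is uniform on [(m + 1) / m, 2].
m = 4
lowest_bid = (m + 1) / m
g = lambda a: np.full_like(a, m / (m - 1.0))          # density of A
G = lambda a: (a - lowest_bid) * m / (m - 1.0)        # distribution function of A

x = np.linspace(1.0, 1.9, 10)                         # true costs (kept below 2 so that 1 - G > 0)
a = x + (2.0 - x) / m                                 # closed-form equilibrium bids
recovered = inverse_bid(a, 2, 2, g, G, g, G)
print(np.max(np.abs(recovered - x)))
```

In this example the recovered costs coincide with the true ones and increase with the bid, so the monotonicity condition holds; with estimated densities in place of `g` and `G`, a violation of monotonicity would count as evidence against the model.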

### Distinguishing from the model with APV

Unobserved auction heterogeneity induces dependence between bids submitted in the same auction. Within the private values framework, a similar regularity pertains to models with APV. Interestingly, it can be shown that the set of bid distributions that can be rationalized by a model with APV does not coincide with the set that can be rationalized by a model with unobserved auction heterogeneity.

Indeed, bids generated by a model with unobserved auction heterogeneity are conditionally independent. On the other hand, the distributions of bids generated by models with APV are affiliated. The results related to de Finetti's theorem^{12} establish that the set of affiliated distributions is larger than the set of conditionally independent distributions. Therefore, there must exist bid distributions with dependent bids that could not be generated by a model with unobserved auction heterogeneity.

Further, it is possible to construct a test that distinguishes the APV setting from the unobserved auction heterogeneity setting in the data. As noted above, under unobserved heterogeneity, for every quadruple of bids submitted in the same auction, the pairwise ratios involving distinct bids are independent. This property, however, does not hold for a large class of bid distributions generated by models with APV. The proof of this statement is given in the Technical Appendix posted on the website of the *Review of Economic Studies*. The pairwise ratio independence may hold for a small set of affiliated distributions (the details are in the Technical Appendix). Therefore, this test has no power against some alternatives. The Technical Appendix also provides several examples of widely used affiliated distributions that fail the property of pairwise ratios independence. Among others, it considers a truncated multivariate normal distribution and shows that it fails the test for a large set of parameter values.

## ESTIMATION

This section describes the estimation method and derives properties of the estimators. Some practical issues related to the estimation procedure are discussed in Sections A.2 and A.3.

### Estimation method

The econometrician has data for *n*_{0} auctions. For each auction *j*, (*m*_{j},{*b*_{ij}}_{i = 1}^{*m*_{j}},*z*_{j}) are observed, where *m*_{j} is the number of bidders in auction *j*, with *m*_{j1} bidders of Group 1 and *m*_{j2} bidders of Group 2; {*b*_{ij}}_{i = 1}^{*m*_{j}} is the vector of bids submitted in auction *j*; and *z*_{j} is a vector of auction characteristics. The estimation procedure is described for the case of discrete covariates. It can be extended to the case of continuous *z*_{j}.^{13}

The estimates are obtained conditional on the number of bidders, *m*_{j} = *m*_{0}, *m*_{j1} = *m*_{01}, and *z*_{j} = *z*_{0}. Let *n* denote the number of auctions that satisfy these restrictions. The estimation procedure closely follows the identification argument described in the proof of Theorem 1. It consists of the following steps.^{14}

1. The log transformation of bid data is performed to obtain *LB*_{i_{l}j} = log(*B*_{i_{l}j}) and *LB*_{i_{p}j} = log(*B*_{i_{p}j}), where *i*_{l} = 1,…,*m*_{01} and *i*_{p} = *m*_{01} + 1,…,*m*_{0}.
2. The joint characteristic function of an arbitrary pair (*LB*_{i_{l}}, *LB*_{i_{p}}), Ψ(·,·), and its derivative with respect to the first argument, Ψ_{1}(·,·), are estimated. I average over all possible pairs to enhance efficiency.
3. The characteristic functions of the logs of the individual bid components, *LA*_{k}, *k* = 1,2, and of the log of the common cost component, *LY*, are estimated. I first use the normalization *E*[log(*A*_{1})] = 0.
4. The inversion formula is used to estimate the densities of *LA*_{k}, *k* = 1,2, and of *LY*, where *T* is a smoothing parameter.
5. The densities of *A*_{k} and *Y* are obtained from the densities of their logs.^{15}
6. The individual inverse bid function at a point *a* is estimated, using an estimate of the lower bound of the support of *g*_{Ak}(·) that corresponds to the normalization *E*[log(*A*_{1})] = 0 (see Section A.2 for discussion of the support estimation).
7. The individual bid function for group *k* is estimated.
8. The cumulative distribution function of the individual cost component is estimated by substituting the corresponding estimated bid function into the estimated cumulative distribution function of the individual bid component.
9. To arrive at the normalization in (*D*_{3}), the required adjustment is computed and applied to the estimates.
10. I have also constructed an estimate of the total cost density function. An expected inverse bid function^{16} is also estimated.
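The characteristic-function steps of the procedure can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a standard Kotlarski-type construction in which Ψ(*t*_{1},*t*_{2}) factors as Φ_{LY}(*t*_{1} + *t*_{2})Φ_{LA1}(*t*_{1})Φ_{LA2}(*t*_{2}), so that under *E*[log(*A*_{1})] = 0 the ratio Ψ_{1}(0,*u*)/Ψ(0,*u*) integrates to log Φ_{LY}. The function names, the trapezoid rule, and the simulated normal log components are all choices made for the example.

```python
import cmath
import math
import random

def joint_ecf(pairs, t1, t2):
    """Empirical joint characteristic function of (LB_l, LB_p) at (t1, t2)."""
    return sum(cmath.exp(1j * (t1 * l + t2 * p)) for l, p in pairs) / len(pairs)

def joint_ecf_d1(pairs, t1, t2):
    """Empirical derivative of the joint characteristic function w.r.t. t1."""
    return sum(1j * l * cmath.exp(1j * (t1 * l + t2 * p)) for l, p in pairs) / len(pairs)

def cf_common(pairs, t, steps=50):
    """Phi_LY(t) = exp(int_0^t Psi_1(0,u) / Psi(0,u) du), assuming E[LA_1] = 0."""
    h = t / steps
    vals = [joint_ecf_d1(pairs, 0.0, k * h) / joint_ecf(pairs, 0.0, k * h)
            for k in range(steps + 1)]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoid rule
    return cmath.exp(integral)

def cf_individual_1(pairs, t, steps=50):
    """Phi_LA1(t) = Psi(t, 0) / Phi_LY(t)."""
    return joint_ecf(pairs, t, 0.0) / cf_common(pairs, t, steps)

def density_by_inversion(cf, u, T, steps=40):
    """Truncated inversion: g(u) = (1/pi) * int_0^T Re[exp(-i t u) cf(t)] dt.

    T plays the role of the smoothing parameter; cf(-t) = conj(cf(t)) is used
    to fold the integral over [-T, T] onto [0, T].
    """
    h = T / steps
    vals = [(cmath.exp(-1j * k * h * u) * cf(k * h)).real for k in range(steps + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1])) / math.pi

# Simulated (hypothetical) log bids: one Group 1 and one Group 2 bid per auction.
rng = random.Random(1)
pairs = []
for _ in range(2000):
    ly = rng.gauss(0.3, 0.5)                      # log common component
    pairs.append((ly + rng.gauss(0.0, 0.3),       # Group 1, E[LA_1] = 0
                  ly + rng.gauss(0.1, 0.3)))      # Group 2

phi_ly = cf_common(pairs, 1.0)
phi_la1 = cf_individual_1(pairs, 1.0)
print(phi_ly, phi_la1)
```

In the simulation the true values are known (Φ_{LY}(1) = exp(0.3*i* − 0.125) and Φ_{LA1}(1) = exp(−0.045)), so the output gives a quick check that the construction recovers the components. Passing `lambda t: cf_common(pairs, t)` to `density_by_inversion` would give the corresponding density estimate at a point, at the cost of many more characteristic-function evaluations.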

### Properties of the estimator

This subsection shows that the estimation procedure yields uniformly consistent estimators of the relevant distributions. This result is derived under the following restrictions on the tail behaviour of characteristic functions.

(*W*_{5}) The characteristic functions Φ_{LY}(·) and Φ_{LAk}(·) are ordinary smooth^{17} with ϰ > 1.

This property holds, *e.g.* when cumulative probability functions of cost components admit up to *R*, *R* > 1, continuous derivatives on the support interior such that *M* of them, 1 ≤ *M* ≤ *R*, can be continuously extended to the real line. The uniform consistency of bid component estimators is used to establish the uniform consistency of the cumulative distribution function estimator for the individual cost component.

Proposition 4 summarizes properties of the estimator.

If conditions (*D*_{1})*–*(*D*_{5}) are satisfied, then the estimators of *F*_{Y}(·) and *F*_{Xk}(·), *k* = 1,2, obtained from the procedure above are uniformly consistent.

The proof of Proposition 4 is given in Appendix A. Confidence intervals for the estimates are obtained through a bootstrap procedure.

### Monte Carlo study

In this section, I present and discuss the results from the simulation study, which analyses the performance of the estimator in small samples.

The simulated data sets are generated as follows. The cost of bidder *i* is set equal to the product of the common and individual cost components, *c*_{i} = *y**x*_{i}. The data are generated using random draws from distributions that are similar in shape to the estimated distributions of cost components. To create a typical data set describing *n* procurement auctions with *k*_{1} and *k*_{2} bidders from Groups 1 and 2, respectively, I take *k*_{1}×*n* and *k*_{2}×*n* independent draws from the distributions of the individual cost component for Groups 1 and 2 and combine them with *n* draws from the distribution of the common cost component such that *c*_{ij} = *y*_{j}*x*_{ij}.

I set *k*_{1} = 2 and *k*_{2} = 2, similar to the configuration in the data, and consider data sets of progressively smaller sizes with *n* = 250, 200, and 150. Therefore, the individual cost components of each group are estimated using 500, 400, and 300 bids, respectively.
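The data-generating step can be sketched as follows. This is a minimal illustration of the cost draws only: the lognormal shapes and their parameters are assumptions made for the example (the text says only that the distributions resemble the estimated ones), and the mapping from costs to equilibrium bids is not reproduced here.

```python
import random

def simulate_costs(n, k1=2, k2=2, seed=0):
    """Draw costs c_ij = y_j * x_ij for n auctions with k1 + k2 bidders each.

    Lognormal components are an assumption made for illustration.
    """
    rng = random.Random(seed)
    auctions = []
    for _ in range(n):
        y = rng.lognormvariate(0.0, 0.2)                          # common component
        x1 = [rng.lognormvariate(0.0, 0.14) for _ in range(k1)]   # Group 1 draws
        x2 = [rng.lognormvariate(0.05, 0.13) for _ in range(k2)]  # Group 2 draws
        auctions.append({
            "y": y,
            "x_group1": x1,
            "x_group2": x2,
            "costs_group1": [y * x for x in x1],
            "costs_group2": [y * x for x in x2],
        })
    return auctions

for n in (250, 200, 150):
    data = simulate_costs(n)
    n_group1 = sum(len(a["costs_group1"]) for a in data)
    n_group2 = sum(len(a["costs_group2"]) for a in data)
    print(n, n_group1, n_group2)
```

With two bidders per group, each group contributes 2*n* cost draws per data set, which matches the bid counts quoted above.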

The results of this study are summarized in Figures 1, 2, and 3. Figure 1 presents results for the common component, while Figures 2 and 3 describe the performance of the estimators for the cumulative distribution and probability density functions of individual components. These figures depict the original distributions of the individual and common cost components used to generate the simulated data as well as the 5% and 95% quantiles of the estimators.

Figures 1, 2, and 3 demonstrate that estimators perform well except for the smallest data set where the quantile range becomes quite wide and does not contain small parts of the underlying distribution functions.

## MICHIGAN HIGHWAY PROCUREMENT AUCTIONS

This section describes characteristics of the Michigan highway procurement auctions. Sections 5.1 and 5.2 present the data and report some descriptive statistics. Section 5.2 also presents the results of specification tests. Section 5.3 describes the estimation results for the model with unobserved auction heterogeneity, compares them to the estimates obtained under the assumption of IPV and APV, performs reserve price analysis under alternative specifications, and summarizes the tests' outcomes for the assumptions of the model with unobserved auction heterogeneity.

### Market description

The Michigan Department of Transportation (DoT) is responsible for construction and maintenance of most roads within Michigan. The DoT identifies work that has to be done and allocates it to companies in the form of projects through a first-price sealed bid auction. The project usually involves a small number of tasks, such as resurfacing, replacing the base, or filling in cracks.

**Letting process**. The DoT advertises projects 4–10 weeks prior to the letting date. Advertisement usually consists of a short description of the project, including the location, completion time, and a short list of the tasks involved. Companies interested in the project can obtain a detailed description from the DoT.

**Estimated cost.** The DoT constructs a cost estimate for every project that is based on the engineer's assessment of the work required to perform each task and prices derived from the winning bids for similar projects let in the past. The costs are then adjusted through a price deflator.

Federal law requires that the winning bid should be lower than 110% of the engineer's estimate. If a state decides to accept a bid that is higher than this threshold, it has to justify this action in writing. In this case, the engineer's estimate has to be revised and verified for any possible mistake. In my data set, I observe a number of bids higher than 110% of the engineer's estimate. On multiple occasions, the winning bid is higher than this threshold. These facts suggest that bidders consider the probability of an event when this restriction comes into effect to be rather small. The assumption of no reserve price is justified in this environment.

**Number of bidders**. It is unclear whether the auction participants have a good idea about the number of their competitors. The existing literature on highway procurement auctions tends to argue that this is a small market where participants are well informed about each other and can accurately predict the identities of auction participants.^{18} I follow this tradition and assume that the number of actual bidders is known to auction participants.

### Descriptive statistics

I use data for the highway procurement auctions held by the Michigan DoT between February 1997 and December 2003. In particular, I focus on highway maintenance projects with bituminous resurfacing as the main task. The data set consists of a total of 3947 projects. My information includes the letting date, the completion time, the location, the tasks involved, the identity of all the bidders, their bids, and an engineer's estimate.

My choice of the projects' type is motivated by two objectives. First, I want to ensure that the auction environment is characterized by private rather than common values. Second, I am looking for an environment that is likely to have unobserved auction heterogeneity. Highway maintenance projects are usually precisely specified and relatively simple. It is likely that bidders can predict their own costs for the project quite well. The existing variation in bids is, therefore, associated with variation in costs across firms, which is consistent with the private values environment. This variation is generated by differences in opportunity costs and input prices faced by different firms. Further, although highway maintenance projects are rather simple, their costs can be substantially affected by local conditions such as elevation and curvature of the road, traffic intensity, and age and quality of the existing surface. Information about these features may not be available to the researcher. On the other hand, firms' representatives usually travel to the project site and, therefore, are likely to collect this information and incorporate it into their bids. Hence, I expect to find unobserved auction heterogeneity.

The paving companies participating in the maintenance auctions differ mostly by their size (employment, number of locations). The differences in size may imply cost differences if economies of scale are present. For example, larger companies are likely to own their equipment instead of renting it, which may reduce cost. Since size is observable to all market participants, it is important to allow for the possibility that market participants have different beliefs about the distribution of costs for groups of companies that differ by size. Therefore, I allow for asymmetries between bidders. In particular, I distinguish between two types of bidders: regular (large) bidders and fringe bidders. The set of regular bidders is defined to include companies that consistently won at least $10 million in projects during each year in my data set and have at least 100 employees.^{19}

In my data, the number of bidders per project varies between 1 and 11. More than 85% of projects attracted between two and six bidders, with the mean number of bidders equaling 3.4 and a standard deviation of 1.3. About 75% of the projects have an engineer's estimate ranging between $100,000 and $1,000,000; 5% are below $100,000; and 20% are above $1,000,000.

Table 1 provides summary statistics of several important variables by the number of bidders. It shows that the mean of the engineer's estimate does not change significantly across groups of projects that attracted different numbers of bidders. The tabulation of the winning bid indicates that the difference between the engineer's estimate and the winning bid is positive and that it increases with the number of bidders. An important statistic of the data is “money left on the table” as represented by the difference between the lowest and second-to-lowest bid normalized by the engineer's estimate. This variable is usually taken to indicate the extent of uncertainty present in the market. “Money left on the table” is, on average, equal to 7% of the engineer's estimate and decreases with the number of bidders. The magnitude of the “money left on the table” variable is similar to the findings of other studies.^{20} It indicates that cost uncertainty may be substantial. Table 1 also shows that the number of regular bidders is usually between 1 and 3 and increases only slightly with the total number of bidders.

| Number of bidders | Overall | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Number of observations | 3947 | 71 | 673 | 1126 | 1026 | 365 | 192 |
| Engineer's estimate (hundreds of thousands of dollars) | | | | | | | |
| Mean | 1.175 | 12.80 | 10.27 | 12.60 | 13.90 | 12.90 | 16.40 |
| Standard deviation | 4.660 | 2.35 | 1.41 | 3.02 | 2.26 | 1.79 | 3.39 |
| Winning bid (hundreds of thousands of dollars) | | | | | | | |
| Mean | 1.175 | 11.10 | 10.00 | 11.80 | 12.90 | 11.80 | 15.20 |
| Standard deviation | 4.660 | 2.32 | 1.50 | 2.89 | 2.25 | 1.66 | 3.35 |
| Money left on the table | | | | | | | |
| Mean | 1.175 | 0.07 | 0.11 | 0.08 | 0.07 | 0.05 | 0.04 |
| Standard deviation | 4.660 | 0.05 | 0.08 | 0.06 | 0.06 | 0.05 | 0.04 |
| Number of regular bidders | | | | | | | |
| Mean | 1.175 | 1.92 | 1.43 | 1.65 | 2.07 | 2.16 | 2.29 |
| Standard deviation | 1.175 | 1.06 | 0.62 | 0.72 | 0.98 | 1.21 | 1.32 |

Next, I explore whether there is scope for unobserved auction heterogeneity in my data. I implement the specification tests outlined in Section 3.4. More specifically, I test for (1) conditional independence of a pair of bids submitted in the same auction (*H*_{0}: IPV vs. *H*_{1}: unobserved auction heterogeneity) and (2) conditional independence of two ratios of bids using four different bids submitted in the same auction (*H*_{0}: unobserved auction heterogeneity vs. *H*_{1}: APV). I condition on a linear index of observable auction characteristics such as the engineer's estimate and time to complete the project (duration), type of highway, year and month dummies, district dummies, and total number of tasks. The index is estimated through an Ordinary Least Squares regression. The tests are performed conditional on the main task of the project and the number of bidders. The testing procedure I use is explained in Appendix A. For bituminous resurfacing projects with four regular bidders, the *p*-value for the first test statistic is equal to 0.03 and the *p*-value for the second test statistic is 0.52.^{21} Therefore, the null hypothesis of IPV can be rejected against the alternative of unobserved auction heterogeneity at the 5% significance level. At the same time, the null hypothesis of unobserved auction heterogeneity cannot be rejected against the alternative of APV.

I interpret the correlation between bids submitted in the same auction as evidence of unobserved auction heterogeneity. It is possible, however, that the correlation between bids is generated through some other mechanism. For example, it may arise if the auction environment has common values features. It may also arise if participating companies are systematically engaged in collusive behaviour. I deal with the first issue by restricting my attention to maintenance projects that are unlikely to have any project-related uncertainty that could lead to a common values effect. It is much harder to reject a possibility of collusion since all the tests proposed in the literature depend on the particular collusion scheme employed. I use the test proposed by Porter and Zona (1993), which is based on the assumption that if there is a collusion scheme, then only the winning bid corresponds to a real cost realization and all other bids are “phony”, *i.e.* unsubstantiated by any cost realization. I use a procedure described in Athey and Haile (2002) to recover the distribution of regular and fringe bids from the distribution of the winning bid. I then compare these distributions to the ones estimated from the losing bids. Distributions estimated through these two procedures appear to be similar, which gives me confidence that the data do not reflect the outcome of collusive behaviour.

Thus, I find evidence in favour of unobserved auction heterogeneity in Michigan highway procurement auctions. I estimate the distributions of cost components using the estimation method outlined in Section 4 to evaluate the relative importance of different cost components.

### Estimation results

The estimation results presented below correspond to the set of projects with an engineer's estimate between $300,000 and $580,000 and the time to completion between 3 and 6 months that attracted two regular and two fringe bidders. This set consists of 226 projects. The results for different values of engineer's estimate, duration, and the number of bidders are qualitatively similar.

The projects used in estimation are quite similar. However, the data set still contains some residual variation in observable auction characteristics. I use a homogenization procedure to eliminate the variation in observable factors. To arrive at homogenized bids, I estimate the mean of log(bid) as a linear function of observable characteristics, eliminate the estimated mean from the bids, and use the residuals in the estimation; I add the estimated mean back in when evaluating the importance of private information and for comparison to alternative models.^{22} The mean of log(bid) is assumed to be a linear function of the engineer's estimate, duration, type of highway, year and month dummies, district dummies, and total number of tasks.
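The homogenization step amounts to an ordinary least squares regression of log(bid) on observable characteristics, followed by taking residuals. The sketch below is a minimal illustration with a small hand-written solver; the function names, the two covariates, and the numbers in the usage example are hypothetical and stand in for the full set of regressors listed above.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y.

    X is a list of rows that already includes a constant column.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def homogenize(log_bids, covariates):
    """Return (coefficients, fitted means, residual log bids)."""
    X = [[1.0] + list(z) for z in covariates]
    beta = ols(X, log_bids)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    residuals = [lb - f for lb, f in zip(log_bids, fitted)]
    return beta, fitted, residuals

# Hypothetical data: (log engineer's estimate, duration in months) per bid.
covariates = [(12.8, 3.0), (13.1, 5.0), (12.9, 4.0), (13.2, 6.0), (12.7, 3.0), (13.0, 4.0)]
log_bids = [12.75, 13.12, 12.96, 13.15, 12.71, 12.94]

beta, fitted, residuals = homogenize(log_bids, covariates)
# Residuals feed the estimation; fitted means are added back afterwards.
restored = [f + r for f, r in zip(fitted, residuals)]
print(beta, residuals)
```

Keeping the fitted means alongside the residuals makes it straightforward to add the estimated mean back when results are reported in original units.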

In the estimation, the mean of the regular type is normalized to be equal to one. Figure 4 presents estimated distributions of the unobserved auction heterogeneity component and individual cost components. The common cost component is a product of the common observable component (extracted through the homogenization procedure) and the unobserved heterogeneity component. The recovered distribution of the unobserved heterogeneity component has a mean of 0.98 and a standard deviation of 0.204, whereas the mean and standard deviation of the common component are equal to $392,000 and $78,890, respectively. The recovered distributions of individual components for regular and fringe groups are similar. The individual cost component of the fringe type has a higher mean but lower variance than the individual cost component of the regular type. The mean of the fringe type distribution is 1.06. Standard deviations of the regular and fringe type distributions are 0.14 and 0.13, respectively. I also perform a test of the equality of individual cost component distributions.^{23} The distribution of the test statistic is computed both through subsampling^{24} and bootstrap procedures. The *p*-value of the test statistic is 0.69. Therefore, I cannot formally reject equality at the 10% significance level.

#### Variance decomposition.

Recall that bidder *i*'s cost for project *j* is given by *c*_{ij} = *y*_{j}×*x*_{ij}. A Taylor approximation applied to *C*(·,·) as a function of *X* and *Y* allows us to approximate the variance of *C* in the following way:

$$\mathrm{Var}(C) \approx (E\,Y)^{2}\,\mathrm{Var}(X) + (E\,X)^{2}\,\mathrm{Var}(Y).$$

If (*E* *Y*)^{2}Var(*X*) and (*E* *X*)^{2}Var(*Y*) are taken to represent the parts of the cost variation generated by the variation in the individual cost and unobserved heterogeneity components, respectively, then it can be calculated that the individual cost component accounts for almost 31% of variation in the homogenized costs.^{25,26,27}
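The arithmetic behind this share can be checked directly. The sketch below treats (*E* *Y*)^{2}Var(*X*) and (*E* *X*)^{2}Var(*Y*) as the two parts of the cost variation and plugs in the moments reported above for the regular type (mean 1, standard deviation 0.14) and for the unobserved heterogeneity component (mean 0.98, standard deviation 0.204). That these are exactly the inputs behind the reported figure is an assumption; the function name is illustrative.

```python
def individual_share(mean_x, sd_x, mean_y, sd_y):
    """Share of the approximate cost variance attributed to the individual component."""
    part_individual = mean_y ** 2 * sd_x ** 2   # (E Y)^2 Var(X)
    part_common = mean_x ** 2 * sd_y ** 2       # (E X)^2 Var(Y)
    return part_individual / (part_individual + part_common)

share_regular = individual_share(mean_x=1.0, sd_x=0.14, mean_y=0.98, sd_y=0.204)
print(round(100 * share_regular, 1))
```

With these inputs the share comes out at roughly 31%, in line with the figure quoted for homogenized costs.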

#### Markups over the bidders' costs.

The estimated inverse bid functions are used to compute markups over the bidders' costs. The normalized markup, evaluated at *x* = ξ(*a*), ranges from 0.1% to 25% and, on average, is equal to 8.4% for the regular bidder. Markups for the fringe type bidders range between 0.1% and 18% and, on average, are equal to 6.1%.

#### Inefficient outcomes.

When bidders are asymmetric, it is possible that the project is not awarded to the lowest cost bidder, *i.e.* the auction outcome is not efficient. To compute the probability of such an event for the selected set of projects, I use the estimated distributions of cost components to create a pseudo-sample of bidders' costs for a set of 250 auctions with three bidders each. Then, for each cost draw, I calculate the bid value from the estimated bid function. Finally, the fraction of the auctions in which the lowest bid does not correspond to the lowest cost is computed. This exercise is repeated 1000 times. I find that the estimated probability of an inefficient outcome is, on average, equal to 5% with a 95% quantile range given by [3.6, 6.2]. This corresponds to an estimated 2% increase in the cost of the procurement; the 95% quantile range is given by [1.3, 2.8].
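The simulation logic can be sketched as follows. This is an illustration only: the bid functions and cost distributions below are hypothetical stand-ins (constant proportional markups and lognormal costs), not the estimated objects, and the bidder composition is an assumption.

```python
import random

def prob_inefficient(bid_fns, cost_draws, n_auctions=250, seed=0):
    """Share of simulated auctions in which the lowest bid is not the lowest cost."""
    rng = random.Random(seed)
    inefficient = 0
    for _ in range(n_auctions):
        costs = [draw(rng) for draw in cost_draws]
        bids = [fn(c) for fn, c in zip(bid_fns, costs)]
        winner = min(range(len(bids)), key=bids.__getitem__)
        lowest_cost = min(range(len(costs)), key=costs.__getitem__)
        if winner != lowest_cost:
            inefficient += 1
    return inefficient / n_auctions

# Hypothetical stand-ins for the estimated bid functions and cost distributions.
def regular_bid(c):
    return 1.084 * c

def fringe_bid(c):
    return 1.061 * c

def regular_cost(rng):
    return rng.lognormvariate(0.0, 0.14)

def fringe_cost(rng):
    return rng.lognormvariate(0.05, 0.13)

# Repeat the exercise to obtain a distribution of the estimated probability.
draws = sorted(
    prob_inefficient([regular_bid, regular_bid, fringe_bid],
                     [regular_cost, regular_cost, fringe_cost],
                     n_auctions=250, seed=s)
    for s in range(200)
)
mean_prob = sum(draws) / len(draws)
quantile_range = (draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))])
print(mean_prob, quantile_range)
```

Inefficiency arises here only because the two groups apply different markups; with a common strictly increasing bid function the lowest-cost bidder always wins and the probability is zero.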

#### Comparison to alternative auction models.

Figure 5 compares the average bid function estimated under the assumption of unobserved auction heterogeneity to the bid function recovered under the APV and IPV assumptions, respectively.^{28} Both the IPV and APV procedures estimate total costs that are substantially lower than the average costs estimated under the unobserved auction heterogeneity assumptions for both regular and fringe bidders. In particular, the model with unobserved auction heterogeneity implies an average markup over the bidders' costs of 8.4% (6.1% for fringe bidders), whereas the model with APV predicts a markup of 14% (12.2%) and the model with IPV predicts a markup of 19% (16.5%). In each case, confidence intervals for the IPV and APV estimates intersect the confidence interval constructed under the unobserved heterogeneity assumption only for a very small part near the upper end of the support.

Figure 6 compares the average density function of the cost distribution estimated under the assumption of unobserved auction heterogeneity to the cost density functions recovered under APV and IPV assumptions. The estimated density functions for the IPV and APV models are flatter than the density function estimated under the assumption of unobserved auction heterogeneity. In both cases, confidence intervals for the IPV and APV estimates intersect the confidence interval constructed under the assumption of unobserved auction heterogeneity only for a very small part of the support. The variance of the cost distribution estimated under the assumption of unobserved auction heterogeneity is about 18% lower than the variance of the cost distribution estimated under the assumption of APV and 22% lower than the variance of the cost distribution estimated under the assumption of IPV.

**Reserve price**. I use the results of estimation to compute the optimal reserve price in the environment with unobserved auction heterogeneity. I compare the performance of this reserve price and of reserve prices derived from the estimates based on alternative assumptions. To avoid theoretical complications unrelated to the subject of this paper, I restrict my attention to the symmetric case in this section.

The government chooses a reserve price to minimize the expected cost of procurement, which consists of two parts: the expected cost of not allocating the job today and the expected cost at which work can be completed today given the reserve price *r*. Let us denote the first component *c*_{0}. It represents the sum of the cost of waiting another period and the expected cost at which the project can be completed in the future. I do not have data on the magnitude of *c*_{0}. Therefore, I consider a range of possible values for *c*_{0} and derive an optimal reserve price for each of them.
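For intuition about this trade-off, consider the textbook symmetric IPV case with a known cost distribution *F* and density *f*. By revenue equivalence, the expected cost of procurement at reserve price *r* with *n* bidders can be written as *c*_{0}(1 − *F*(*r*))^{n} plus *n* times the integral of (*x* + *F*(*x*)/*f*(*x*))*f*(*x*)(1 − *F*(*x*))^{n−1} up to *r*, and the minimizer satisfies *r* + *F*(*r*)/*f*(*r*) = *c*_{0}. The sketch below minimizes this expression on a grid for costs uniform on [0, 1]. The uniform distribution, the values of *c*_{0} and *n*, and the function names are assumptions for illustration; this is not the estimated distribution or the computation used for the results reported below.

```python
def expected_cost(r, c0, n, cdf, pdf, lo, grid=400):
    """c0 * P(no bidder has cost below r) + expected payment to the winner.

    The payment term uses the virtual-cost representation from revenue
    equivalence: n * int_lo^r (x f(x) + F(x)) (1 - F(x))^(n-1) dx.
    """
    if r <= lo:
        return c0  # nobody can bid, so the outside cost is always incurred
    h = (r - lo) / grid

    def integrand(x):
        F, f = cdf(x), pdf(x)
        return (x * f + F) * (1.0 - F) ** (n - 1)

    vals = [integrand(lo + k * h) for k in range(grid + 1)]
    payment = n * h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoid rule
    return c0 * (1.0 - cdf(r)) ** n + payment

def best_reserve(c0, n, cdf, pdf, lo, hi, points=200):
    """Grid search for the reserve price that minimizes the expected cost."""
    candidates = [lo + (hi - lo) * k / points for k in range(points + 1)]
    return min(candidates, key=lambda r: expected_cost(r, c0, n, cdf, pdf, lo))

# Hypothetical example: costs uniform on [0, 1], two bidders, c0 = 1.2.
def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)

def uniform_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

r_star = best_reserve(1.2, 2, uniform_cdf, uniform_pdf, lo=0.0, hi=1.0)
print(r_star, expected_cost(r_star, 1.2, 2, uniform_cdf, uniform_pdf, 0.0))
```

For the uniform case the first-order condition reduces to 2*r* = *c*_{0}, so the grid search should land near *c*_{0}/2. Averaging `expected_cost` over draws of a common scale factor before minimizing would be one way to mimic a reserve price chosen without knowledge of the realization of unobserved heterogeneity.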

I compute a reserve price under four assumptions: (a) unobserved auction heterogeneity (realization of unobserved heterogeneity is known to the government); (b) unobserved auction heterogeneity (realization of unobserved heterogeneity is unknown to the government); (c) IPV; and (d) APV. In (b) the reserve price is derived to minimize the average cost of procurement, where the average is taken with respect to the distribution of unobserved auction heterogeneity. While assumption (a) describes the benchmark case, it may not be implementable in practice if the government does not know the realization of unobserved auction heterogeneity. In this case, the reserve price derived in (b) can be used. I compare the performance of these four reserve prices on the basis of an average cost of procurement^{29} achieved for a given reserve price. To perform these computations, I use the results of the estimation for regular bidders only.

The results of the analysis are summarized in Table 4. The table records for every reserve price candidate (1) an average probability with which a bid is submitted; (2) the average cost of procurement as a percent of *c*_{0}; and (3) the average cost of procurement as a percent of the benchmark expected costs.

The results of the computation show that the reserve price computed from the cost distribution estimated under the IPV or APV assumption fares considerably worse in comparison to the benchmark case and to the reserve price derived from the average cost function in (b). In particular, the average expected cost achieved through the reserve price based on IPV estimates is 9%–20% of *c*_{0} higher than the benchmark cost, whereas the reserve price derived in (b) is only about 1% of *c*_{0} higher. The results are even more drastic if we express expected costs as a percent of benchmark costs. Then the reserve price in (b) produces still only a 1%–2% increase in costs relative to the benchmark case, whereas the IPV reserve price leads to a 10%–35% increase in costs. The disparity is smallest when *c*_{0} is very close to the mean cost, which is not very likely to happen in reality. In realistic cases of *c*_{0} equal to at least 150% of the mean costs, the gain from using the cost distribution estimated under the assumption of unobserved heterogeneity constitutes at least 16% of the benchmark costs. This is a significant effect, especially since the bidders' markup in this environment constitutes only about 6%–8% of the costs. The discrepancy is much higher when the reserve price is derived on the basis of APV estimates. Also, IPV and APV results imply a lower than optimal probability to submit a bid.

**Evaluating assumptions of the model.** The identification and estimation of the model with unobserved auction heterogeneity relies on the assumption that individual cost components are independent from each other and from the common cost component. Proposition 2 from the identification section allows us to evaluate the validity of these assumptions in the data.

Part (2) of Proposition 2 suggests a test of independence of individual components. Implementation of this test is discussed in Section 5.2. The results of the test are reported in Table 3. They strongly suggest that the null hypothesis cannot be rejected.

Part (3) of Proposition 2 allows us to test the assumption that the common component is independent from the individual components. This test is performed as a test of the equality of two functions. Both functions are estimated from the data. The testing procedure is described in Appendix A. The *p*-value of the test statistic is 0.81. The null hypothesis, therefore, cannot be rejected at any reasonable significance level.

I have also performed the test from part (1) of Proposition 2 following the same procedure as above. The *p*-value of the test statistic is 0.63. It is, therefore, in line with the results of the tests presented earlier.

**Robustness check.** The model of bidding behaviour that I take to the data assumes that firms' bidding decisions are independent across auctions. This assumption may be violated if bidders' decisions are affected by dynamic considerations. In particular, when a company is capacity constrained, it has to take into account the effect of winning the project today on its ability to explore profitable opportunities tomorrow. If dynamic links between auctions are substantial in magnitude, our estimates of the characteristic function of the joint distribution of two bids submitted in the same auction may be biased, which in turn would lead to biased estimates for the distributions of cost components. To evaluate the effect of dynamic links on the performance of the estimation procedure, I re-estimate the model for the subset of projects such that all regular firms bidding for the projects in this subset have their backlog variable between 30% and 75% of the maximum of the backlog variable for the firms observed in the data. Even though this exercise substantially reduces the number of available projects and, therefore, leads to less precise estimates, they imply similar results for the variance decomposition and the biases from misspecification.

## CONCLUSION

This paper proposes a non-parametric procedure to recover the distribution of bidders' private information when unobserved auction heterogeneity is present. It derives sufficient conditions under which the model is identified and shows that the estimation procedure produces uniformly consistent estimators of the distributions in question. The paper describes a number of testable restrictions implied by the model with unobserved heterogeneity. It also provides guidance on the practical implementation of the testing procedures that correspond to these restrictions.

This methodology is applied to the data for highway maintenance projects collected by Michigan DoT. For this data set, private information is estimated to explain only about 34.4% of the variation in a project's costs. This estimate is obtained while conditioning on the number of bidders, on the type of the project as defined by the main task, and on the size and duration bracket. Results of the estimation reveal that the estimation procedures that account for unobserved auction heterogeneity tend to estimate higher average costs, lower variance of the cost distribution, and lower markups relative to the estimates obtained under the assumption of IPV or APV. Additionally, the reserve price chosen on the basis of IPV or APV estimates leads to significantly higher costs of procurement than the reserve price chosen on the basis of the estimates for the unobserved auction heterogeneity model. This result holds both in the case where the reserve price is derived as a function of a specific realization of unobserved heterogeneity and in the case where a single reserve price is chosen in such a way as to minimize the average cost of procurement where the average is taken with respect to the distribution of unobserved heterogeneity. In the latter case, the average cost of the procurement is 9%–19% lower than the average cost achieved when the reserve price based on either IPV or APV estimates is used.

The methodology in this paper is developed for the case where a bidder's cost of completing the project equals the product of the common cost component and the individual cost component. A somewhat more general model that allows for the common component to have distinct effects on the mean and variance of the cost distribution is analysed in Krasnokutskaya (2009).

| Variables | Test subsample | Estimation subsample |
|---|---|---|
| Constant | 0.375 | 0.327 |
| | (0.010) | (0.012) |
| Engineer's estimate | 0.8413 | 0.8113 |
| | (0.028) | (0.025) |
| Duration | 0.0011 | 0.0015 |
| | (0.0013) | (0.0011) |
| Tasks | 0.0014 | 0.0012 |
| | (0.0007) | (0.0009) |
| (*N*_{regular}, *N*_{fringe}) | (4,0) | (2,2) |
| Number of projects | 370 | 226 |
| *R*^{2} | 33% | 17% |

| | Test | Method | p-Value |
|---|---|---|---|
| 1. | Conditional independence | | 0.52 |
| 2. | *X*_{i}⊥*Y* | Bootstrap | 0.81 |
| | | Subsampling | 0.75 |
| 3. | *X*_{i}⊥*X*_{j} | Bootstrap | 0.43 |
| | | Subsampling | 0.63 |

Medium projects: engineer's estimate = 4.0 (’00,000)

| *c*_{0} | Statistic | Unobserved heterogeneity | Unobserved heterogeneity (expected) | IPV | APV |
|---|---|---|---|---|---|
| *c*_{0} = 5 | 1. Probability of submitting a bid (expected) | 0.29 (0.26, 0.31) | 0.28 (0.25, 0.31) | 0.07 (0.05, 0.09) | 0.03 (0.02, 0.06) |
| | 2. Expected cost of procurement (as % of *c*_{0}) | 85.2 (83, 87) | 86.5 (85, 88) | 93.5 (91, 95) | 96.8 (95, 98) |
| | 3. Expected cost of procurement (as % of unh) | | 101.4 | 109.7 | 113.6 |
| *c*_{0} = 7 | 1. Probability of submitting a bid (expected) | 0.45 (0.43, 0.46) | 0.42 (0.40, 0.43) | 0.20 (0.18, 0.21) | 0.04 (0.03, 0.05) |
| | 2. Expected cost of procurement (as % of *c*_{0}) | 72.1 (71.2, 73.5) | 73.2 (72.3, 75.1) | 83.9 (81.2, 84.7) | 95.4 (94.3, 96.2) |
| | 3. Expected cost of procurement (as % of unh) | | 101.6 | 116.4 | 132.4 |
| *c*_{0} = 10 | 1. Probability of submitting a bid (expected) | 0.58 (0.56, 0.59) | 0.55 (0.54, 0.56) | 0.20 (0.18, 0.21) | 0.037 (0.03, 0.04) |
| | 2. Expected cost of procurement (as % of *c*_{0}) | 58.1 (56.2, 59.5) | 59.1 (57.8, 60.1) | 78.7 (76.5, 79.7) | 94.7 (94.0, 95.6) |
| | 3. Expected cost of procurement (as % of unh) | | 101.6 | 135.5 | 163.0 |

### A.1. Proofs of theoretical results

*Proof of Proposition 1.* The vector of equilibrium strategies in the game with *y* = 1 satisfies the system of differential equations […] […]_{k}(*x*) = *d*_{0}. Define (β_{1y},β_{2y}) such that […]. Then β_{1y} and β_{2y} satisfy the first-order conditions for the game indexed by *y*. They also satisfy the corresponding boundary conditions by definition. Therefore, the vector (β_{1y},β_{2y}) constitutes the set of equilibrium functions. ∥

*Proof of Theorem 1.* (a) I start by establishing a statistical result that I use to prove Theorem 1. Namely,

**Lemma A1.** Let *X* be a random variable with the probability density function *f*(·) and support […]; then the characteristic function of *X*, φ_{X}(*t*), is non-vanishing, *i.e.* for every *T* > 0 there is *t* such that |*t*| > *T* and φ_{X}(*t*) ≠ 0.

*Proof.* The idea of the proof is to consider the extension of the characteristic function to the complex domain. In particular, I consider the function […] defined as […] at an arbitrary complex point *z*. It is straightforward to show that […] is an entire function, *i.e.* it is infinitely complex differentiable at every finite point of the complex plane. Therefore, it can only be equal to zero in a countable number of points. Thus, the number of points where φ_{X}(*t*) is equal to zero cannot be more than countable, which means that φ_{X}(*t*) is non-vanishing.

Finally, […] is an entire function because […] *k*, is well defined due to the boundedness of *X*'s support. That concludes the proof of Lemma A1. ∥
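For readers who want the expressions in this argument spelled out, the standard construction can be sketched as follows. The notation is assumed here for illustration and may differ from the original: $\tilde{\varphi}_X$ denotes the complex extension of the characteristic function and $[\underline{x},\overline{x}]$ denotes the bounded support of *X*.

```latex
\tilde{\varphi}_X(z) = \int_{\underline{x}}^{\overline{x}} e^{izx} f(x)\,dx,
\qquad z \in \mathbb{C},
\qquad
\frac{d^{k}}{dz^{k}}\tilde{\varphi}_X(z)
  = \int_{\underline{x}}^{\overline{x}} (ix)^{k} e^{izx} f(x)\,dx,
\qquad
\left|\frac{d^{k}}{dz^{k}}\tilde{\varphi}_X(z)\right|
  \le M^{k} e^{M|\operatorname{Im} z|},
\quad M = \max\{|\underline{x}|,|\overline{x}|\}.
```

Under this reading, every derivative is finite at every finite *z* because the integration range is bounded. Since $\tilde{\varphi}_X(0) = 1$, the function is not identically zero, so its zeros are isolated and hence at most countable; restricting *z* to the real line gives the statement about φ_{X}(*t*).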

(b) Random variables *Y*, *A*_{i}, log(*Y*), and log(*A*_{i}) have bounded supports and, therefore, have non-vanishing characteristic functions. The identification result follows from a theorem by Kotlarski (1966) and results established by Laffont and Vuong (1996) as described in Section 3.1. ∥
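The constructive content of Kotlarski's result can be illustrated numerically. The sketch below simulates two log bids that share a common component, `b1 = y + a1` and `b2 = y + a2`, and recovers the characteristic function of the common component from the joint characteristic function of the pair under the normalization `E[a1] = 0`. The uniform distributions, the sample size, the grid, and the helper name `kotlarski_cf_common` are assumptions made for illustration; this is not the paper's estimation code.

```python
import numpy as np

def kotlarski_cf_common(b1, b2, t_grid):
    """Estimate the c.f. of the common component Y from pairs (b1, b2).

    Uses log phi_Y(t) = int_0^t [d psi(0, u) / d u1] / psi(0, u) du, where psi
    is the joint c.f. of (b1, b2). The formula requires E[a1] = 0 and
    psi(0, u) != 0 on the grid. t_grid must start at 0 and be increasing.
    """
    e = np.exp(1j * np.outer(t_grid, b2))        # exp(i u b2), shape (T, n)
    numer = (1j * b1 * e).mean(axis=1)           # sample analogue of d psi / d u1 at (0, u)
    denom = e.mean(axis=1)                       # sample analogue of psi(0, u)
    ratio = numer / denom
    steps = 0.5 * (ratio[1:] + ratio[:-1]) * np.diff(t_grid)   # trapezoid rule
    log_phi = np.concatenate([[0.0], np.cumsum(steps)])
    return np.exp(log_phi)

# Simulated log components (assumed distributions, for illustration only).
rng = np.random.default_rng(0)
n = 20_000
y = rng.uniform(0.0, 1.0, n)       # log common component
a1 = rng.uniform(-0.5, 0.5, n)     # log individual component, mean zero
a2 = rng.uniform(-0.5, 0.5, n)

t = np.linspace(0.0, 2.0, 41)
phi_hat = kotlarski_cf_common(y + a1, y + a2, t)

# True c.f. of Uniform(0, 1), for comparison.
phi_true = np.ones_like(t, dtype=complex)
phi_true[1:] = (np.exp(1j * t[1:]) - 1.0) / (1j * t[1:])
max_error = np.max(np.abs(phi_hat - phi_true))
```

The grid is kept short on purpose: the ratio is only well behaved where the denominator stays away from zero, which is the practical counterpart of the non-vanishing condition discussed above.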

*Proof of Proposition 2.* (1) If *X*_{ik}'s are independent, then so are log(*X*_{ik}). The structure of the bidder's cost, *c*_{i} = *y**x*_{i}, implies that […] and […]. Then, by Kotlarski (1966) theorem, the characteristic function of log(*A*_{i3}) is given by […] and that of log(*A*_{i1}) by […]. If *i*_{1} and *i*_{3} are from the same group, then the characteristic functions of log(*A*_{i1}) and log(*A*_{i3}) should be the same up to a multiplicative factor determined by the difference in means induced by normalization. Let us consider normalization *E*[log(*A*_{i1})] = 0. Then, equation (A.3) implies that […] and − log(*A*_{i3}) should have the same mean. However, *k*(*i*_{1}) = *k*(*i*_{3}); therefore, […] 0. This implies that *E*[log(*A*_{i3})] = *E*[log(*A*_{i1})] = 0, and hence […].

(2) If *Z*_{1} and *Z*_{2} are independent, then so are *f*(*Z*_{1}) and *f*(*Z*_{2}), for any function *f*(·).

(3) If *Y* and *X*_{i}'s are independent and the cost structure is given by *c*_{ij} = *y*_{j}*x*_{ij}, then Kotlarski (1966) theorem applied to (log(*B*_{i1}),log(*B*_{i2})) implies that the characteristic function of log(*A*_{i1}) is given by the function Φ_{log(Ai1)}(*t*) defined by equation (3). Kotlarski (1966) theorem applied to […] implies that the characteristic function of log(*A*_{i1}) is given by Λ_{log(Ai1)}(*t*) defined by equation (4). Thus, under normalization *E*[log(*A*_{i1})] = 0, the following equality has to hold: […]. The same has to hold for *i*_{3}. This is obvious if *i*_{1} and *i*_{3} belong to the same group. If they do not, then we have to make sure that normalization does not induce a shift of the random variable that corresponds to Λ_{log(Ai3)}(*t*) relative to the random variable that corresponds to Φ_{log(Ai3)}(*t*). It is easy to see, however, that the former represents a characteristic function of a random variable with a mean equal to *E*[log(*B*_{i3})] − *E*[log(*B*_{i1})]. The same is true of the latter. ∥

*Proof of Proposition 3.* The “only-if” direction is a straightforward corollary of the identification argument and the properties of the bidding strategies. ∥

### Estimation

I start by describing how the supports of the distributions of the individual bid and the common cost components can be estimated. Then, I proceed to the proof of Proposition 4.

**Estimation of the support bounds.** Strictly speaking, bounds of the support are recovered during the inversion procedure when the density function of the distribution in question is computed. According to the inversion formula, the density function recovered from the theoretical characteristic function should approach zero as the smoothing parameter *T* approaches infinity at every point outside of the support. Therefore, the upper and lower bounds of the support are, respectively, defined as lower and upper limits of the points where the density function is equal to zero. In estimation, the density function recovered from the estimated characteristic function does not, in general, equal zero outside of the support. An econometrician, therefore, has to choose cut-off points that correspond to sufficiently low values of the estimated density function. Unfortunately, econometric theory does not provide us with guidelines on how to choose such cut-off points which is why I use a different approach in this paper. I estimate bounds of the supports for the distributions of interest using restrictions imposed by the model with unobserved auction heterogeneity. If the data are generated by the model with unobserved auction heterogeneity, then this approach leads to consistent estimators of the support bounds. The proof of this statement and the derivation of the rate of convergence are given together with the proof of Proposition 6. Below I describe a procedure to estimate the support bounds of the distributions of the individual bid and the common cost components.

Denote the support of the log of the common component by […] and the supports of the log of the individual bid components by […]. Then the support of the log of bids for Group 1 is given by […] and the support of the differences in the log of bids is given by […]. Additionally, I start with the normalization *E*[log(*A*_{1})] = 0. Since the bounds of the supports can be estimated as [min(log(*b*_{1lj})), max(log(*b*_{1lj}))] and [min(log(*b*_{1lj}) − log(*b*_{1pj})), max(log(*b*_{1lj}) − log(*b*_{1pj}))], I arrive at the system of equations […]

[…] *g*_{LA}(*a*) ≥ 0 and […], it must follow that […] ≤ 0 and […] or […] −*U*_{3}. Third, it is easy to show that […] is strictly increasing on (−*U*_{3}, 0) since *g*_{LA}(·) is positive on the interior of the log(*A*) support. Indeed, *f*^{′}(*z*) = (*U*_{3} + *z*)*g*_{LA}(*U*_{3} + *z*) − *z**g*_{LA}(*z*) > 0. If […], then *z**g*_{LA}(*z*) = 0; at the same time, […] and, therefore, (*U*_{3} + *z*)*g*_{LA}(*U*_{3} + *z*) > 0. The argument is similar when […]. We have two cases when […] and (b) […]. In (a), *f*^{′}(*z*) > 0 follows immediately. In (b), we have […]. Finally, *f*(−*U*_{3}) < 0, whereas *f*(0) > 0. Therefore, the solution to […] must exist and be unique.
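The first step of the procedure above, estimating support bounds from the extremes of log bids and of within-auction log-bid differences, can be sketched as follows. The simulated data, the two-bidder layout, and the function name `support_widths` are illustrative assumptions. The sketch recovers only the widths of the two supports, using the fact that under independence the log-bid support is the sum of the two component supports and the support of same-group differences is symmetric around zero; it does not reproduce the system of equations solved in the paper.

```python
import numpy as np

def support_widths(log_bids):
    """Estimate support widths from log bids of two same-group bidders.

    log_bids has shape (n_auctions, 2); each row shares one common component.
    """
    bid_lo, bid_hi = log_bids.min(), log_bids.max()
    diff = log_bids[:, 0] - log_bids[:, 1]       # the common component cancels
    width_a = np.abs(diff).max()                 # a_hi - a_lo
    width_y = (bid_hi - bid_lo) - width_a        # y_hi - y_lo
    return {"bid_bounds": (bid_lo, bid_hi), "width_a": width_a, "width_y": width_y}

# Simulated auctions (assumed distributions, for illustration only).
rng = np.random.default_rng(0)
n_auctions = 5_000
log_y = rng.uniform(0.0, 1.0, size=(n_auctions, 1))     # log common component
log_a = rng.uniform(-0.5, 0.5, size=(n_auctions, 2))    # log individual bid components
est = support_widths(log_y + log_a)
```

Because the estimates are sample extremes, they approach the true bounds from inside the support, so both widths are slightly underestimated in finite samples.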

*Proof of Proposition 5.* The proof consists of several steps.

(1) First, I establish that the distribution functions and the probability density functions of the individual bid components inherit the properties of the distribution functions and the probability density functions of the individual cost components. Namely, given *D*_{1}–*D*_{5}, the distribution functions *G*_{Ak}(·) satisfy the following:

(i) Their supports *S*(*G*_{Ak}) are given by […] with […] and […];

(ii) *G*_{Ak} is continuously differentiable on the interior of *S*(*G*_{Ak});

(iii) For every closed subset of the interior of *S*(*G*_{Ak}), there exists *c*_{g} > 0 such that |*g*_{Ak}(*a*)| ≥ *c*_{g} > 0 on this subset;

(iv) For every closed subset of the interior of *S*(*G*_{Ak}), there exists *c*_{G} > 0 such that 1 − *G*_{Ak}(*a*) ≥ *c*_{G} > 0 on this subset.

*Proof.* Point (i) is established in Section 2. To show that points (ii), (iii), and (iv) hold, I use the relationship between the distribution functions of the individual bid components and the distribution functions of the individual cost components. Namely, *G*_{Ak}(*a*) = *F*_{Xk}(ξ_{k}(*a*)), where ξ_{k}(·) is the inverse individual bid function of the bidder of group *k*. Then, *g*_{Ak}(*a*) = *f*_{Xk}(ξ_{k}(*a*))ξ_{k}^{′}(*a*). From (*D*_{2}): *f*_{Xk}(·) is continuously differentiable and for every closed subset of *S*(*G*_{Ak}), there exists *c*_{f} > 0 such that |*f*_{Xk}(·)| ≥ *c*_{f}; from the equilibrium characterization: ξ_{k}(·) is continuously differentiable and strictly increasing on *S*(*G*_{Ak}); therefore, for every closed subset of *S*(*G*_{Ak}), there exists *c*_{0} > 0 such that |ξ_{k}^{′}(·)| ≥ *c*_{0}. This implies (ii) and (iii), where *c*_{g} is equal to the product of the corresponding *c*_{f} and *c*_{0}. Finally, (iii) implies that *G*_{Ak}(*a*) < 1 for any closed subset of *S*(*G*_{Ak}), which obtains (iv). ∥

(2) If the probability density functions of the cost components are ordinarily smooth of order ϰ > 1, then Theorems 3.1–3.2 in Li and Vuong (1998) apply; these theorems establish the uniform consistency of the first-stage estimators. In particular, they establish that […]. Since […] and *a* ∈ […], then […].

(3) Next, I establish the uniform convergence of the individual bid function following the logic of Proposition 3 and Theorem 3 of Guerre, Perrigne and Vuong (2000).

(a) First, I derive the rate of convergence for the support bounds […] and […]. Recall that the bounds of supports have been derived in several steps. First, supports of the distributions of *LB*_{1i} and (*LB*_{1i1} − *LB*_{1i2}) have been estimated as […]. These are maximum likelihood estimators for the support bounds of the corresponding densities. We know that they converge to the true value of the support bounds at the rate of *n*. The usual results for extremum estimators apply. Note that […] at the same rate as […] converges to *g*_{LA1}. Let us denote this rate by *d*_{n}. All the standard conditions for the convergence of extremum estimators hold; therefore, […] converges to […] uniformly at the rate *d*_{n}. Since […], […], and […] are linear combinations of […], […], and […], they converge uniformly to […], correspondingly, at the rate *d*_{n}. The bounds of supports for *A*_{k} are estimated as […], respectively. The smoothness of the exponential function ensures the consistency of these estimators. The delta method can be used to show that the rate of convergence remains equal to *d*_{n}.

(b) The rate of convergence for *ĝ*_{Ak} is established in Li and Vuong (1998). Recall that here we denote it *d*_{n}. Now, we derive a rate of convergence for *Ĝ*_{Ak}. The estimator for *G*_{Ak} is defined as […]. To establish consistency we consider […]. Since *g*_{Ak} is a continuous function with bounded support, according to Lemma 6.1(ii), then *g*_{Ak} is a bounded function. For large enough *n*, *ĝ*_{Ak} is also bounded a.s. due to the uniform convergence of *ĝ*_{Ak} to *g*_{Ak}. Then, part (b) implies that the first summand converges to zero at the rate *d*_{n}. The second summand also converges to zero at the rate *d*_{n} since the support of *g*_{Ak} is bounded. Therefore, *Ĝ*_{Ak} converges to *G*_{Ak} at the rate *d*_{n}.

(c) Next, I prove the uniform consistency of the estimator for the individual inverse bid function. The following argument holds for every closed subset of […]. Note that for every *a* ∈ […], the corresponding […] is finite. It follows immediately since *ĝ*_{Ak}(*a*) and 1 − *Ĝ*_{Ak}(*a*) are positive on the interior of the support. Note that *ĝ*_{Ak}(*a*) ≥ *c*_{g} > 0 and (1 − *Ĝ*_{Ak}(*a*)) ≥ *c*_{G} > 0 for some *c*_{g} and *c*_{G} since *ĝ*_{Ak} and *Ĝ*_{Ak} uniformly converge to *g*_{Ak} and *G*_{Ak}, respectively, and by (iii) and (iv) of Lemma 6.1. Let us denote […]. Then, […], where […], and […]. The constants […] and […] are well defined because *g*_{Ak}(·) and *G*_{Ak}(·) are continuous functions and *S*(*A*_{k}) is a compact set. Pointwise application of the delta method and uniform convergence of *ĝ*_{Ak} and *Ĝ*_{Ak} to *g*_{Ak} and *G*_{Ak}, correspondingly, allow us to conclude that […].

(d) Next, I establish the uniform convergence of the individual bid function estimator. For a given […], let us denote *a*_{0} = α_{k}(*x*) and […]. Here, *a*_{0} is some number from […] and *a*_{n} is a random variable with realizations in […] for large *n*. For every realization of *a*_{n}, there is a number *a*^{*} such that […], since ξ_{k}(·) is continuously differentiable on the compact. Let us also denote by *a*_{n}^{*} a random variable with realizations as above. Note that if *a*_{0}, *a*_{n} always belong to the interior of *S*(*A*_{k}), then *a*_{n}^{*} also always belongs to the interior of *S*(*A*_{k}). Since ξ_{k}(·) is strictly increasing on the compact, then ∥ξ_{k}^{′}(*a*_{n}^{*})∥ ≥ *c*_{ξ} > 0, and therefore, […]. On the other hand, […]. Since, as I have shown above, […] converges uniformly to ξ_{k}, then […] and […].

(e) Finally, I establish the uniform convergence of […]. Uniform convergence of […] and […] and continuous differentiability of *G*_{Ak}(·) obtain […] a.s. ∥

*Practical issues.* As noted by Diggle and Hall (1993) and Li, Perrigne and Vuong (2000), the estimators for […] and […], which are obtained by truncated inverse Fourier transformation, may have fluctuating tails. This feature can be alleviated by adding a damping factor to the integrals in […] and […]. Following Diggle and Hall (1993) and Li, Perrigne and Vuong (2000), I introduce a damping factor defined as […].

The smoothing parameter *T* should be chosen to diverge slowly as *n* → ∞, so as to ensure the uniform consistency of the estimators. However, the actual choice of *T* in finite samples has not yet been addressed in the literature. I choose *T* through a data-driven criterion. In particular, I use the bid data to obtain estimates of the means and variances for the distributions of *LY* and *LA*, […]. These estimates are then used to choose a value of *T*. Specifically, I try different values of *T* and obtain estimates of *f*_{LY}(·) and *g*_{LA}(·). From each estimated density I compute the means and variances […], respectively. This gives a goodness-of-fit criterion […] for *LY* and similarly for *LA*_{k}. The value of *T* that I choose minimizes the sum of these errors in a percentage of […] and […]. In the estimation, the optimal *T* equals 50.
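The two practical steps described here, damped inversion of a characteristic function and a moment-matching choice of *T*, can be sketched as follows. Several things are assumed for illustration: the damping factor is taken to be triangular, max(1 − |*t*|/*T*, 0), which is a common choice in this literature but is not confirmed by the text above; the empirical characteristic function of a directly observed sample stands in for the estimated characteristic functions of *LY* and *LA*; and the function names, grids, and candidate values of *T* are hypothetical.

```python
import numpy as np

def damped_density(sample, x_grid, T, n_t=401):
    """Density from a truncated, damped inverse Fourier transform on [-T, T]."""
    t = np.linspace(-T, T, n_t)
    h = t[1] - t[0]
    damp = 1.0 - np.abs(t) / T                            # assumed triangular damping
    cf = np.exp(1j * np.outer(t, sample)).mean(axis=1)    # empirical c.f.
    kernel = np.exp(-1j * np.outer(x_grid, t))            # e^{-itx}
    # The damping factor is zero at both endpoints, so the plain sum below
    # coincides with the trapezoid rule on the t grid.
    return (kernel @ (damp * cf)).real * h / (2.0 * np.pi)

def moments(dens, x_grid):
    """Mass, mean, and variance implied by a density on a uniform grid."""
    dx = x_grid[1] - x_grid[0]
    mass = dens.sum() * dx
    mean = (x_grid * dens).sum() * dx / mass
    var = ((x_grid - mean) ** 2 * dens).sum() * dx / mass
    return mass, mean, var

def choose_T(sample, x_grid, candidates):
    """Pick the T whose implied mean and variance best match the sample's."""
    m0, v0 = sample.mean(), sample.var()

    def gap(T):
        _, m, v = moments(damped_density(sample, x_grid, T), x_grid)
        return abs(m - m0) / abs(m0) + abs(v - v0) / v0   # sum of percentage errors

    return min(candidates, key=gap)

# Illustrative data (assumed): a bounded-support variable with non-zero mean.
rng = np.random.default_rng(0)
sample = rng.uniform(1.0, 2.0, 2_000)
x_grid = np.linspace(-1.0, 4.0, 251)
candidates = [5.0, 10.0, 20.0, 50.0]

mass, mean, var = moments(damped_density(sample, x_grid, T=20.0), x_grid)
best_T = choose_T(sample, x_grid, candidates)
```

With triangular damping the implied smoothing kernel is non-negative, which removes the negative oscillations in the tails, at the cost of tails that decay slowly and therefore inflate the implied variance when *T* is small.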

Finally, similar to Horowitz and Markatou (1996), I find that the bias correction technique described in their paper improves the performance of the estimator in small samples.

### Summary of testing procedures

Point (a) describes the procedure to test the conditional independence of *Z*_{1} and *Z*_{2} conditional on linear index variable *X*; point (b) outlines the procedure I use to test the equality of two functions.

(a) Test of conditional independence. The conditioning variable is assumed to be given by a single index of the observable covariates, λ_{θ}(*X*). The test statistic is based on the monotonic transform of λ_{θ}(*X*), *U* = *F*_{0}(λ_{θ}(*X*)), and Rosenblatt's transforms of *Z*_{1} and *Z*_{2}, […]. Here, *X* denotes the vector of project characteristics, θ is a vector of parameters, and *F*_{0}(·) is the cumulative distribution function of λ_{θ}(*X*). The hypothesis tested is […]. The test statistic is given by […], where […], and where […], […], and […] denote the empirical distribution function of […] and kernel estimators of *G*_{Z1∣U,i}(·∣·) and *G*_{Z2∣U,i}(·∣·). All three objects are estimated with the omission of the *i*-th data point. The test statistic above converges to a Gaussian process as *n* → ∞. For more details, see Song (2009). I compute the distribution of the test statistics via a wild bootstrap procedure.

(b) Test of the equality of two functions. The null hypothesis is […]. The test statistic is defined as […], where {*t*_{i}}_{i = 1,…,N} is a finite set of points from the real line. The asymptotic distribution of this test statistic is unknown. Therefore, it is not clear whether a bootstrap procedure can be used to compute the distribution of the test statistic. Instead, a subsampling procedure can be used since the rate of convergence is known. To ensure the power of the test, I use re-centred test statistics following Hall and Horowitz (1996): […], where […] is computed from a simulated sample and […] is computed from the data.
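The subsampling logic for a test of equality of two functions can be sketched in a simplified setting. Everything specific below is an assumption made for illustration: the two functions are the empirical characteristic functions of the two columns of a paired sample, the statistic is the largest absolute gap over a finite set of points, the rate of convergence is taken to be the square root of the sample size, and the re-centring subtracts the full-sample gap rather than a quantity computed from a simulated sample. The names `cf_gap` and `subsampling_pvalue` are hypothetical.

```python
import numpy as np

def cf_gap(data, t_points):
    """Gap between the empirical c.f.s of the two columns at t_points."""
    cf0 = np.exp(1j * np.outer(t_points, data[:, 0])).mean(axis=1)
    cf1 = np.exp(1j * np.outer(t_points, data[:, 1])).mean(axis=1)
    return cf0 - cf1

def subsampling_pvalue(data, t_points, b, n_sub=300, seed=0):
    """Re-centred subsampling p-value for H0: the two c.f.s coincide.

    Assumes a sqrt(n) rate of convergence (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    full_gap = cf_gap(data, t_points)
    stat = np.sqrt(n) * np.abs(full_gap).max()
    sub_stats = np.empty(n_sub)
    for s in range(n_sub):
        rows = rng.choice(n, size=b, replace=False)     # subsample without replacement
        sub_gap = cf_gap(data[rows], t_points)
        # Re-centring at the full-sample gap keeps the reference distribution
        # close to the null distribution even when H0 is false.
        sub_stats[s] = np.sqrt(b) * np.abs(sub_gap - full_gap).max()
    return float((sub_stats >= stat).mean())

# Illustrative paired samples (assumed): equal and shifted distributions.
rng = np.random.default_rng(1)
t_points = np.array([0.5, 1.0, 1.5, 2.0])
same = rng.normal(0.0, 1.0, size=(2_000, 2))
shifted = same + np.array([0.0, 1.0])

p_same = subsampling_pvalue(same, t_points, b=200)
p_shifted = subsampling_pvalue(shifted, t_points, b=200)
```

Without the re-centring step the subsample statistics would grow with the violation of the null, which is the loss of power that the re-centred statistic is meant to avoid.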

I would like to thank my advisors Martin Pesendorfer, Steve Berry, and Hanming Fang for their help and encouragement. I also thank Don Andrews, Han Hong, Philip Haile, Steven Matthews, Katsumi Shimotsu, and Kevin Song for helpful conversations. I am grateful to the Michigan DoT for providing data for my research and to Mr. Wayne Roe in particular for answering my questions and helping me to understand the industry better.

## References

[…] *e.g.* *E*[*Y*] = 1 is one of them. My choice of normalization is motivated by application.

[…]_{1}(·,·) denotes the partial derivative of Θ(·,·) with respect to the first argument.

[…]_{log(Ai1)}(*t*) and Λ_{log(Ai3)}(*t*) can potentially differ by a multiplicative factor due to normalization. Please read the proof of Proposition 2 for careful analysis of this detail.

[…] *z*_{j} requires smoothing over *z*_{j}.

[…] *E*[log(*B*_{ip}) − log(*B*_{il})]), *i*_{l} = 1,…,*m*_{01} and *i*_{p} = *m*_{01} + 1,…,*m*_{0}.

[…] *y* are not consistent with a given *b* due to finite supports of *Y* and *X*_{k}.

**Definition 1** The distribution of random variable *Z* is *ordinary-smooth of order* ϰ if its characteristic function Φ_{z}(*t*) satisfies […] as *t* → ∞ for some positive constants *d*_{0}, *d*_{1}, and ϰ.

[…] *e.g.* Bajari and Ye (2003).

[…] *e.g.* Jofre-Bonet and Pesendorfer (2003).

[…] *c*_{ij} = exp(*z*_{j}α)*y*_{j}*x*_{ij}. Here *z*_{j} denotes the vector of project *j*'s observable characteristics. The homogenization procedure is used by Haile, Hong and Shum (2003) and Bajari, Houghton and Tadelis (2004). The coefficients of the observable “scaling factor” are reported in Table 2.

[…] *e.g.* Kolmogorov–Smirnov tests). Therefore, I perform this test as a test for the equality of two functions. The description of the test procedure is given in Appendix A.

[…] *EY*)^{2}Var(*X*) accounts for 31.3% of Var(*c*) computed according to the formula above.

[…] *c*_{ij} = exp(*z*_{j}α)*y*_{j}*x*_{ij}. It can be calculated that (*E*[exp(*Z*α)*Y*])^{2}Var(*X*) accounts for 34.4% of the variation in total costs.

[…] *E*[log(*A*_{1})] = 0. The distributions are later adjusted to satisfy normalization in (*D*_{3}).

[…] *LY*, *LA*_{1}, and *LA*_{2} can be obtained as follows: […], […], […], and […].