Alessandra Fogli, Laura Veldkamp, Germs, Social Networks, and Growth, The Review of Economic Studies, Volume 88, Issue 3, May 2021, Pages 1074–1100, https://doi.org/10.1093/restud/rdab008
Abstract
Does the pattern of social connections between individuals matter for macroeconomic outcomes? If so, where do differences in these patterns come from and how large are their effects? Using network analysis tools, we explore how different social network structures affect technology diffusion and thereby a country’s rate of growth. The correlation between high-diffusion networks and income is strongly positive. But when we use a model to isolate the effect of a change in social networks on growth, the effect can be positive, negative, or zero. The reason is that networks diffuse both ideas and disease. Low-diffusion networks have evolved in countries where disease is prevalent because limited connectivity protects residents from epidemics. But a low-diffusion network in a low-disease environment compromises the diffusion of good ideas. In general, social networks have evolved to fit their economic and epidemiological environment. Trying to change networks in one country to mimic those in a higher-income country may well be counterproductive.
1. Introduction
How does a country’s culture affect its income? Many papers in macroeconomics have tackled this question by modelling various aspects of culture and measuring its economic consequences.1 This article explores one aspect of a country’s culture: its pattern of social connections. Using tools from network analysis, we explore how and to what extent different social network structures might affect a country’s rate of technological progress. Our network model explains why societies might adopt growth-inhibiting structures and allows us to quantify the effect of networks on income.
Measuring the speed of information or technology diffusion within various kinds of networks has a long history (Granovetter, 2005). To measure the macroeconomic effects of networks, we could take a simple production economy, embed within it fast-diffusion and slow-diffusion networks, calibrate it, and measure the difference in output growth. This simple measurement exercise faces two key challenges. The first challenge is determining what network to use for a whole country. Although researchers have mapped complete social networks in schools or online communities, mapping the exact social network structure for a large cross section of countries is not feasible. Instead, we propose three aggregate features of networks that regulate the network’s diffusion speed and have measurable counterparts in cross-country data. We then measure a country’s social network by rating the network along those three dimensions. The second challenge is that there are multiple channels of endogeneity. First, income affects social network structure. Second, networks have long been known to affect disease transmission, which in turn affects productivity and income. Finally, technology diffusion reduces disease prevalence, and lower death rates alter how social networks evolve. All of these channels compromise the direct measurement of social networks’ effects. In response, we build a model incorporating each of these various sources of endogeneity. Then, we calibrate the model, assess its performance with overidentifying moments, and use the model to measure the effect of the network alone.
Our theory for how networks evolve endogenously revolves around the idea that communicable diseases and technologies spread in similar ways—through human contact. We explore an evolutionary model in which networks that are stable, local, and have fewer connections reduce the risk of infection, allowing the participants to live longer. But such low-diffusion networks also restrict the group’s exposure to new technologies. In countries where communicable diseases are inherently more prevalent, the high risk of infection makes nodes with many, unstable, or distant linkages more likely to die out. A network that inhibits the spread of disease and technology will emerge. In countries where communicable diseases are less prevalent, nodes with few, stable, and local connections will be less economically and reproductively successful and will die off in the long run.
Section 2 begins by building an evolutionary model of social networks where the networks govern disease and technology diffusion, but diseases and technology also govern network evolution. Section 3 proves theoretical results about the effect of each network feature on technology and disease diffusion. It also explores the reverse effects: how technology and disease affect the types of networks that emerge. Specifically, disease prevalence creates the conditions that are conducive to growth-inhibiting networks. Section 4 describes our measures of pathogen prevalence, social networks, and technology diffusion and how we use them to calibrate the model. Section 5 then uses the calibrated model to investigate how much an exogenous change in network structure would affect technology and output, and explores why this answer depends on the country. Finally, we estimate the effect of social networks on technology diffusion in the data, using the difference in communicable and non-communicable diseases as an instrument. Since diseases spread by humans are systematically more likely to infect people who are more central to their social network, human-to-human diseases are widely thought to affect the evolution of social networks more than other diseases do. The model and data agree that changes in the social network can raise a country’s productivity and output growth by 50–100%. But when applied to an economy with high disease prevalence, the same change can undermine output by propagating disease.
1.1. Related literature
The article contributes to four growing literatures. Primarily, the article is about a technology diffusion process. It is perhaps most closely related to economic history work by de la Croix et al. (2017) on the role that guilds and extended families played in technology diffusion in pre-industrial economies. Recent work by Lucas and Moll (2014) and Perla and Tonetti (2011) uses a search model framework in which every agent who searches is equally likely to encounter any other agent and acquire the agent’s technology. Greenwood et al. (2005) model innovations that are known to all but are adopted when the user’s income becomes sufficiently high. In Comin et al. (2013), innovations diffuse spatially. What sets our article apart is its assumption that agents encounter only those in their own network. Our main results all arise from this focus on the network topology. Many recent papers use networks to represent the input/output structure of the economy instead of social connections.2 Our focus on social networks creates new measurement challenges and leads us to examine different forms of networks. For example, Oberfield (2013) models firms that optimally choose a single firm to connect to, which is appropriate for his question but precludes thinking about the network features we examine.
The article also contributes to the literature on culture and its macroeconomic effects (e.g. Bisin and Verdier, 2001; Doepke and Tertilt, 2009; Greenwood and Guner, 2010). Gorodnichenko and Roland (2017) focus on the psychological or preference aspects of collectivism, one of the four network measures we use as well. They use collectivism to proxy for individuals’ innovation preferences and consider the effects of these preferences on income. In contrast, we use collectivism as one of many dimensions of a social network and assess the effect of those relationships on the speed of technology diffusion. Similarly, most work on culture and macroeconomics regards culture as an aspect of preferences.3 Greif (1994) argues that preferences and social networks are intertwined because culture is an important determinant of a society’s network structure. Although this may be true, we examine a different determinant of networks, pathogen prevalence, that is easily measurable for an entire country. Our evolutionary-sociological approach lends itself to quantifying the aggregate effects of social networks on economic outcomes.
Our empirical methodology clearly draws much of its inspiration from work on the role of political institutions by Acemoglu et al. (2002), Acemoglu and Johnson (2005) and the role of social infrastructure by Hall and Jones (1999). But instead of examining institutions or infrastructure, which are not about the pattern of social connections between individuals, we study an equally important but distinct type of social organization, the social network structure.
There is a micro literature that considers the effects of social networks on economic outcomes (e.g. see Rauch and Casella, 2001; Granovetter, 2005). In contrast, this article takes a more macro approach and studies the types of social networks that are adopted throughout a country’s economy and how those networks affect technology diffusion economy-wide. Ashraf and Galor (2012) and Spolaore and Wacziarg (2009) also take a macro perspective but measure social distance with genetic distance. Our network theory and findings complement this work by offering an endogenous mechanism to explain the origins of social distance and why it might be related to the diffusion of new ideas.
Finally, many papers argue that disease prevalence is growth inhibiting.4 To this discussion, our article adds a new, social-networks-based channel through which health affects economic outcomes.
2. A Network Diffusion Model
The model is a framework for measurement, which allows us to quantify the effect of networks on growth. While our model is surely simpler than reality, our quantitative exercises give us a sense of the order of magnitude of the macroeconomic effect. The concept of social network structure is a fungible one. We pick particular aspects of networks on which to anchor our analysis. In doing this, we do not exclude the possibility that other aspects of social or cultural institutions are important for technology diffusion and income. We focus on dimensions we can measure. The model teaches us about how three different aspects of social networks facilitate technology diffusion. It also explains why disease that spreads from human to human might influence a society’s social network in a persistent way.
A key feature of our model linking social networks to technological progress is that technologies are spread by human contact. This feature is not obvious, since new ideas could be spread by print or electronic media. However, Ryan and Gross (1943) first established the importance of social contacts in the spread of technology. Since then, a large literature in sociology and consumer research, starting with Rogers (1976), equates a diffusion model with innovations that spread through a social system.5 In his 1969 American Economic Association presidential address, Kenneth Arrow remarked,
While mass media play a major role in alerting individuals to the possibility of an innovation, it seems to be personal contact that is most relevant in leading to its adoption. Thus, the diffusion of an innovation becomes a process formally akin to the spread of an infectious disease. (Arrow, 1969, p. 33)
While the idea that certain network properties facilitate information diffusion is well known, the origin of such social networks is less-explored territory. Since a key problem with determining the economic effect of social networks in the data is the endogeneity of networks, it is important for our model to have endogenous social networks as well. Therefore, a second key feature of our model is an evolutionary process for social networks, in which economically successful agents pass on their pattern of social connections. This evolutionary model also explains why growth-inhibiting social networks emerge in response to disease. The idea that social circles might evolve based on disease avoidance is based on economic and biological evidence (Birchenall, 2014).6 Motivated by this evidence, we propose the following model.
2.1. Economic environment
Time, denoted by |$t \in \{1,\ldots, T\}$|, is discrete and finite. At any given time |$t$|, there are |$J$| agents, indexed by their location |$j \in \{1,2,\ldots,J\}$| on a circle. Agents have social connections that we describe with a network.
2.1.1. Social networks
Each person |$i$| is socially connected to others. If two people have a social network connection, we call them “friends.” Friends could also be family, coworkers or any pair of people with close and repeated social contact. Let |$\eta_{jk}=\eta_{kj}=1$| if person |$j$| and person |$k$| are friends and |$=0$| otherwise. To capture the idea that a person cannot self-infect in the following period, we set all diagonal elements (|$\eta_{jj}$|) to zero. Let the symmetric network of all connections be denoted |$N$|.
In a large economy, it is not possible to measure the complete pattern of linkages between every agent. Instead, we categorize agents, or nodes, into types whose linkages follow particular patterns. This approach allows us to characterize a large, aggregate network by its fraction of nodes of each type. This characterization enables comparisons with survey data that will enable us to measure the properties of social networks across countries.
At each date |$t$|, each person |$j$| has a type |$\tau_j(t)$|. That type is a three-dimensional state. The three dimensions to node type are: a binary collectivist type |$\theta_j$|, a degree type |$n_{fj}$|, and a mobility type |$m_j$|, where |$n_{fj}$| and |$m_j$| come from a discrete, finite set of possible types. Note that node type does not include technology.
When we think about node types governing social connections, we face an inevitable dilemma: what happens if |$i$|’s type dictates that he should be friends with |$j$|, but |$j$|’s type dictates that he should not be friends with |$i$|? This dilemma arises because social connections are inherently bidirectional. To resolve it, we assume that each agent’s type dictates his links in one direction but not the other: we define node types by the patterns of links that nodes form to one side (their left, if we order nodes clockwise). These definitions do not rule out other links. In fact, every node will have additional links on the other side. But the nature of those (right-side) links is governed by other nodes’ types.
We explore three aspects of social networks because they are important determinants of diffusion speed and we have cross-country data measuring them. Of course, this means that we hold fixed many other aspects of networks that may also differ across countries. Measuring these other aspects of social networks and understanding their effects on economic growth would be useful topics for further research.
Network feature 1: degree. The degree of a node is the number of connections that node has to other nodes. In the context of a social network, degree is the number of friends a person has. In our model, the degree type of a node regulates the number of links it has on one side.
If agent |$j$| has degree |$n_{fj}$| (even), he is connected to |$n_{fj}/2$| other nodes to his left.
The number of links one creates on one side is given by the degree divided by two because that ensures that if every node has degree-type |$n_f$|, every node is connected to |$n_f$| others.
Network feature 2: collectivism/individualism. Collectivism and its opposite, individualism, define a feature of social networks that has been studied extensively by sociologists. Mutual friendships and interdependence are hallmarks of collectivist societies. Individualists have friendships, but those friends are less likely to know one another. To measure collectivism, we can ask: if |$i$| is friends with |$j$| and with |$k$|, how often are |$j$| and |$k$| also friends? We refer to a structure in which |$i$|, |$j$|, and |$k$| are all connected to each other as a collective.
A measure of the extent of shared friendships, and thus the degree of collectivism, is the number of such collectives. The number of collectives is related to a common measure of network clustering: divide the number of collectives by the number of possible collectives in the network to get the overall clustering measure (Jackson, 2008). A fully collectivist network is illustrated in Figure 1.

Figure 1. Slower diffusion in the collectivist network (|$\theta_j =1 \;\forall j$|, top) than in the individualist network (|$\theta_j =0 \;\forall j$|, bottom). Black dots represent nodes that have acquired a new technology that was discovered by the top node.
If agent |$j$| is collectivist (which we denote by |$\theta_j=1$|), then he is connected to the |$n_{fj}/2$| closest nodes to his left: |$\eta_{jk}=1$| for |$k \in \{j+1, \ldots, j+n_{fj}/2 \}$|, modulo the size of the network.
All nodes are connected to their immediate neighbours. This is what constitutes a neighbour in this model. The modulo phrase in the definition simply means that, for example, if |$j+n_{fj}/2 =J+1$|, then since there is no |$(J+1)$|st node, the link simply wraps around the other side of the circle and connects with node 1. Therefore, when a node |$j$| is connected to |$j+1$| and |$j+2$|, that will always form a collective, since those neighbours are always themselves connected. Thus, collectivist nodes are ones whose ties complete a collective. The opposite of collectivist is individualist: a node that is individualist is connected to some nodes that are not themselves connected.
If agent |$j$| is an individualist (not collectivist: |$\theta_j=0$|) with degree |$n_{fj}$|, then she is connected to the closest node |$j+1$| and |$n_{fj}/2-1$| nodes that are all at least |$\Delta_j>2$| spaces away: |$\eta_{jk}=1$| for |$k \in \{j+1,\, j+\Delta_j, \ldots , j + \Delta_j + n_{fj}/2 -2 \}$|, modulo the size of the network.
Let |$\bar{\theta}\equiv \int \theta_i di$| denote the fraction of collectivists in a network.
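The link definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names (`left_links`, `build_network`, `count_collectives`), the 0-based node indexing, and the values J = 12 and Δ = 3 are all hypothetical choices.

```python
def left_links(j, n_f, collectivist, delta, J):
    """Nodes that j links to on its own (left) side, indices taken modulo J."""
    half = n_f // 2
    if collectivist:
        offsets = list(range(1, half + 1))                   # j+1, ..., j+n_f/2
    else:
        # j+1, then n_f/2 - 1 consecutive nodes starting delta spaces away
        offsets = [1] + list(range(delta, delta + half - 1))
    return [(j + o) % J for o in offsets]


def build_network(types, J):
    """types[j] = (n_f, collectivist, delta). Returns a symmetric 0/1 matrix eta."""
    eta = [[0] * J for _ in range(J)]
    for j, (n_f, collectivist, delta) in enumerate(types):
        for k in left_links(j, n_f, collectivist, delta, J):
            if k != j:                                       # diagonal stays zero
                eta[j][k] = eta[k][j] = 1                    # links are bidirectional
    return eta


def count_collectives(eta):
    """Number of triples of nodes that are all connected to each other."""
    J = len(eta)
    return sum(
        1
        for i in range(J) for j in range(i + 1, J) for k in range(j + 1, J)
        if eta[i][j] and eta[j][k] and eta[i][k]
    )


J = 12                                                       # assumed network size
coll = build_network([(4, True, 3)] * J, J)                  # all collectivist, degree 4
indiv = build_network([(4, False, 3)] * J, J)                # all individualist, Delta = 3
print(count_collectives(coll), count_collectives(indiv))
```

In this small example every node ends up with four friends in both networks, but only the collectivist ring contains collectives, which is the distinction the definitions are meant to capture.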
Network feature 3: mobility. The third network feature we introduce is shortcut links that span the network. These long links represent the long-distance social ties that arise when some agents in a society are mobile. Their frequent travels bring them in close social contact with others who are not in their social neighbourhood. We model this as a small probability of a long link to a randomly chosen node in the network.
If an agent |$j$| has mobility |$m_j$|, then with probability |$m_j$|, one of |$j$|’s existing links is broken and reassigned to any other node with which |$j$| is unconnected, with equal probability. This long link endures until the next social change shock arrives.
This type of ring network, with random long links, is a small-world network (Watts and Strogatz, 1998). Sociologists frequently use small-world networks as an approximation to large social networks because of their high degree of collectivism and small average path length, both pervasive features of real social networks.
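A minimal sketch of the mobility shock may help fix ideas. It assumes that the link to be broken is chosen uniformly among the agent's existing links, which the text does not specify. The helper names `ring` and `rewire` are hypothetical, and the sketch ignores the timing rule that a long link lasts until the next social change shock.

```python
import random


def ring(J, half):
    """Ring network in which every node links to its `half` nearest left neighbours."""
    eta = [[0] * J for _ in range(J)]
    for j in range(J):
        for o in range(1, half + 1):
            k = (j + o) % J
            eta[j][k] = eta[k][j] = 1
    return eta


def rewire(eta, mobility, rng):
    """Apply one round of mobility shocks in place; mobility[j] is m_j."""
    J = len(eta)
    for j in range(J):
        if rng.random() >= mobility[j]:
            continue                                  # no shock for this agent
        friends = [k for k in range(J) if eta[j][k]]
        strangers = [k for k in range(J) if k != j and not eta[j][k]]
        if not friends or not strangers:
            continue
        old = rng.choice(friends)                     # broken link: uniform choice (assumption)
        new = rng.choice(strangers)                   # new partner: uniform over unconnected nodes
        eta[j][old] = eta[old][j] = 0
        eta[j][new] = eta[new][j] = 1
    return eta


rng = random.Random(0)
eta = rewire(ring(20, 2), [0.1] * 20, rng)
```

Because each shock removes one link and adds one, the total number of links in the network is unchanged; only their reach across the circle changes.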
These are the three network features that we will map to data to assess the economic importance of social networks. A node can be any combination of the three features. However, some combinations do create tension. When we construct the network, we first construct collectivist or individualist links with the right degree. Then, we randomly draw mobility shocks and rewire some social linkages. A mobile node may end up connecting with a low-degree or collectivist node that now has additional long links. These random connections do not change the type of the node connected to. They do highlight that our categorizations of social relationships are only aggregate and approximate characteristics of the network.
2.1.2. Technology and output
Technological progress occurs when someone improves on an existing technology. To make this improvement, the person needs to know about the existing technology. Thus, if a person is producing with technology |$A_j(t)$|, she will invent the next technology with an i.i.d. probability |$\lambda_t$| each period. If she invents the new technology, |$\ln (A_j(t+1)) = \ln (A_j(t))+\delta$|. In other words, a new invention results in a |$(\delta\cdot 100)$|% increase in productivity.
People can also learn from others in their network. If person |$j$| is friends with person |$k$| and |$A_k(t)>A_j(t)$|, then with an i.i.d. probability |$\phi$|, |$j$| can produce with |$k$|’s technology in the following period: |$A_j(t+1) = A_k(t)$|. With multiple friends, each offers an independent chance of technology transmission. If multiple new technologies are transmitted to the same node, the best technology is adopted.
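The invention and learning rules can be written as one period's update. This is a sketch rather than the authors' implementation: `technology_step` and `lam` (standing in for |$\lambda_t$|) are hypothetical names, and the way an invention and a transmitted technology combine within the same period (here, the node keeps the better of the two) is an assumption the text does not settle.

```python
import math
import random


def technology_step(A, eta, lam, phi, delta, rng):
    """One period of technology dynamics for productivity levels A[j]."""
    J = len(A)
    new_A = list(A)
    for j in range(J):
        best = A[j]
        # Learning: each friend with a better technology transmits independently.
        for k in range(J):
            if eta[j][k] and A[k] > A[j] and rng.random() < phi:
                best = max(best, A[k])                # the best transmitted technology wins
        # Invention: ln A rises by delta relative to the technology j produces with.
        if rng.random() < lam:
            best = max(best, A[j] * math.exp(delta))  # combining rule is an assumption
        new_A[j] = best
    return new_A


# Four agents on a ring; agent 2 starts with the better technology.
eta = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
A_next = technology_step([1.0, 1.0, 2.0, 1.0], eta, 0.0, 1.0, 0.1, random.Random(0))
```

With certain transmission and no invention, the two friends of agent 2 adopt its technology after one period while agent 0, who is not linked to agent 2, does not.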
2.1.3. Network evolution
A node has an opportunity to change type when a social change shock arrives. This shock, |$\xi_{j}(t) \in \{0,1\}$|, is drawn independently across individuals and time. The social change probability |$p_\xi$| is the probability that |$\xi_{j}(t)=1$|. When |$\xi_{j}(t)=1$|, the agent at node |$j$| adopts the type of the most successful node that |$j$| is connected to. In other words, if the person at node |$j$| is socially connected to nodes |$\{k: \eta_{jk}(t)=1\}$| and gets hit with a social change shock at time |$t$|, then at time |$t+1$| he adopts the type of |$k^{\ast} = \arg\max_{\{k: \eta_{jk}(t)=1\}} A_{k}(t)$| (i.e. the friend with the highest time-|$t$| technology). Then the time-|$(t+1)$| type of person |$j$| is the same as the time-|$t$| type of person |$k^\ast$|: |$\tau_j(t+1) = \tau_{k^\ast}(t)$|.7 Whether or not one’s type changes, when social change arrives (|$\xi_{j}(t)=1$|), agent |$j$| takes a new, independent draw of social mobility. Whether they have a long link and which node that link connects them to can change.
The idea behind this process is that more successful types are passed on more frequently. Evolutionary models often have this feature. At the same time, we want to retain the network-based idea that one’s traits are shaped by one’s community. Therefore, in our model, the process by which one inherits types is shaped by one’s community, by the social network, and by the relative success (relative income) of the people in that local community.
A social change shock does not alter the technology diffusion process. It only changes the network, which affects technology diffusion indirectly.
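The type-adoption rule can be sketched as follows. The function name `evolve_types`, the string labels used for types, and the tie-breaking rule (the lowest-index friend wins) are assumptions made for illustration. The fresh mobility draw that accompanies a social change shock is omitted.

```python
def evolve_types(types, A, eta, shock):
    """Nodes hit by a social change shock copy the type of their most productive friend."""
    J = len(types)
    new_types = list(types)                  # time-(t+1) types are built from time-t types
    for j in range(J):
        if not shock[j]:
            continue                         # xi_j(t) = 0: type is unchanged
        friends = [k for k in range(J) if eta[j][k]]
        if friends:
            # k* = argmax of A_k(t) over friends; ties go to the first friend (assumption)
            k_star = max(friends, key=lambda k: A[k])
            new_types[j] = types[k_star]
    return new_types


eta = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
types = ["a", "b", "c", "d"]                 # placeholder labels for tau_j(t)
A = [1.0, 5.0, 2.0, 3.0]
next_types = evolve_types(types, A, eta, [True, False, True, False])
```

In this example agents 0 and 2 are both hit by the shock and both copy agent 1, their most productive friend, so agent 1's type spreads to three of the four locations.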
2.1.4. Disease, death, and renewal
Each infected person transmits the disease to each of his friends with probability |$\pi_t$|. The transmission to each friend is an independent event. Thus, if |${\dot{n}_d}$| friends are diseased at time |$t-1$|, the probability of being healthy at time |$t$| is |$(1-\pi_t)^{\dot{n}_d}$|. If no friends have a disease at time |$t-1$|, then the probability of contracting the disease at time |$t$| is zero.
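As a small numerical check of the transmission rule, the sketch below computes the probability of staying healthy and simulates one period of contagion. The function names are hypothetical, and the simulation deliberately leaves out death and renewal.

```python
import random


def prob_healthy(pi, n_diseased):
    """Probability of escaping infection from n_diseased sick friends."""
    return (1.0 - pi) ** n_diseased


def disease_step(d, eta, pi, rng):
    """One period of contagion; d[j] = 1 if agent j is sick."""
    J = len(d)
    new_d = list(d)
    for j in range(J):
        if d[j]:
            continue                           # already sick
        for k in range(J):
            # Each sick friend is an independent transmission attempt.
            if eta[j][k] and d[k] and rng.random() < pi:
                new_d[j] = 1
                break
    return new_d


# Six agents on a ring, each linked to the two adjacent agents.
eta = [[1 if (j - k) % 6 in (1, 5) else 0 for k in range(6)] for j in range(6)]
print(prob_healthy(0.5, 2))                    # two sick friends, pi = 0.5
print(disease_step([1, 0, 0, 0, 0, 0], eta, 1.0, random.Random(0)))
```

With no sick friends the probability of staying healthy is one, which matches the statement that an agent with no diseased friends cannot contract the disease.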
Let |$d_j(t) =1$| if the person in location |$j$| acquires a transmittable disease (is sick) in period |$t$| and |$=0$| otherwise. An agent |$j$| who acquires a disease is sick and loses the ability to produce for the remainder of her life span (|$A_j(s)=0,\; \forall s \in [t, t_r]$|). The date at which life ends and a new node appears, |$t_r$|, is governed by the social change shock |$\xi_{j}(t)$|. When a social change shock hits (|$\xi_{j}(t)=1$|), the agent |$j$| is replaced by a new, healthy person in the same location |$j$|. Just like when a healthy node is renewed, the new agent |$j$| inherits the technology and network type of the most productive node that the parent was connected to.
We link the rebirth and social change process because social change only propagates when it affects productive nodes that can pass on their technology and type to others. Therefore, when we allow a node to adjust its type, we also cure it of disease if it is sick. It is as if the node dies and is reborn, healthy, with a type dictated by its community. This assumption facilitates the process of social change.
2.1.5. Feedbacks between disease and technology
The challenge in measuring the effect of networks on income is endogeneity. Diseases affect not only networks but also technological innovation. Similarly, technology, which obviously raises income, also helps to eradicate disease, which in turn, influences social networks. Therefore, we build these effects into our model, calibrate them, and then account for them when we do model experiments that gauge the effect of a change in a social network.
3. Theoretical Results
Before using the model to measure network effects, we briefly explain the model’s basic properties. We begin by exploring how each network characteristic affects a network summary statistic known as average path length, and how that path length is related to both the expected time to infection and the expected rate of technological progress. These results formally break down the logic for why some types of social networks can increase aggregate income. We then explore the long-run convergence of networks. The proofs of all results are in Supplementary Appendix A.
3.1. Diffusion speed, infections, and innovations
Average path length is important because it governs the mean infection time from a disease and the mean discovery time for a new technological innovation. Let |$\bar{L}$| be the average healthy lifetime of an agent in network |$N$| (specifically, it is the number of consecutive healthy periods of a node |$j$|). Similarly, let |$\bar{\alpha}$| be the average number of periods it takes for a new idea to reach a given agent in network |$N$|. We call |$\bar{\alpha}$| the average adoption lag.
If |$\pi=1$| and |$\sum_j d_j(0) =1$|, then the average healthy lifetime |$\bar{L}(N)$| is monotonically increasing in the average path length of the network.
If |$\phi =1$|, then the average technology adoption lag |$\bar{\alpha}(N)$| is monotonically increasing in the average path length of a network.
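Average path length can be computed with a breadth-first search. When transmission is certain, an idea that starts at one node reaches every other node after a number of periods equal to the shortest-path distance, so in that special case the mean distance is also the mean adoption lag. The sketch below uses hypothetical helper names, assumes a connected network, and takes J = 12 with long links starting three spaces away as illustrative values.

```python
from collections import deque


def ring(J, offsets):
    """Ring in which every node j links to j + o (mod J) for each offset o."""
    eta = [[0] * J for _ in range(J)]
    for j in range(J):
        for o in offsets:
            k = (j + o) % J
            eta[j][k] = eta[k][j] = 1
    return eta


def distances_from(eta, source):
    """Shortest-path distance from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        j = queue.popleft()
        for k, linked in enumerate(eta[j]):
            if linked and k not in dist:
                dist[k] = dist[j] + 1
                queue.append(k)
    return dist


def average_path_length(eta):
    """Mean shortest-path distance over ordered pairs of distinct nodes."""
    J = len(eta)
    total = sum(d for s in range(J) for d in distances_from(eta, s).values())
    return total / (J * (J - 1))


apl_collectivist = average_path_length(ring(12, (1, 2)))    # degree 4, neighbours only
apl_individualist = average_path_length(ring(12, (1, 3)))   # degree 4, one longer link
apl_high_degree = average_path_length(ring(12, (1, 2, 3)))  # degree 6
print(apl_collectivist, apl_individualist, apl_high_degree)
```

In this example the individualist ring and the higher-degree ring both have a shorter average path length than the degree-4 collectivist ring.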
Of course, diffusion is not the same as innovation. Diffusion accelerates technology growth because when idea diffusion is faster, redundant innovations are less frequent. Thus, more of the innovations end up advancing the technological frontier.
Next, we explore how the three characteristics of nodes in our network regulate the average path length of the network. The next result shows that higher degree, individualism, and greater mobility all have the effect of reducing path lengths, on average.
The following network features reduce the average path length of the network:
Higher degree: The average path length in a ring network is a decreasing function of |$n_f$|.
Individualism: |$\theta_j=1, \, \forall j$| is a network with a longer path length than |$\theta_j=0, \; \forall j$|, for a given degree, no mobility, and a network of size |$J > \bar{J}$|.
Greater mobility: The expected average path length of the network is a decreasing function of the mobility probability |$m_j\; \forall j$|.
A social network with a higher degree has a lower average path length. With more connections, it requires fewer steps to reach other nodes.
For individualism, the logic is slightly different. Disease and technology spread more slowly in the collectivist network because each contiguous group of friends is connected to, at most, four non-group members. Those are the two people adjacent to the group, on either side. Since there are few links with outsiders, the probability that a disease within the group is passed to someone outside the group is small. Likewise, ideas disseminate slowly. Something invented in one location takes a long time to travel to a faraway location. In the meantime, someone else may have reinvented the same technology level, rather than building on existing knowledge and advancing technology to the next level. Such redundant innovations slow the rate of technological progress and lower average consumption.
Figure 1 illustrates the smaller path length and faster diffusion process in a purely individualist network, compared with a purely collectivist one, where both networks have degree 4 and no mobility. In this simple case, in which the probability of transmission is 1, each frame shows the transmission of an idea or a disease introduced to one node at time 0. The “infected” person transmits that technology to all the individuals to whom she is connected. In period 1, four new people use the new technology, in both networks. But by period 2, nine people are using the technology in the collectivist network and twelve are using it in the individualist network. In each case, an adopter of the technology transmits the technology to four others each period. But in the collectivist network, many of those four people already have the technology. The technology transmission is redundant, so diffusion is slowed.8
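The counting argument can be reproduced with a short simulation in which transmission is certain. The network size (J = 30) and the individualist gap (Δ = 3) are assumptions, as are the helper names. The collectivist counts follow from the link definition alone, but the individualist count depends on the assumed Δ and J, so it need not equal the number reported for the figure.

```python
def ring(J, offsets):
    """Ring in which every node j links to j + o (mod J) for each offset o."""
    eta = [[0] * J for _ in range(J)]
    for j in range(J):
        for o in offsets:
            k = (j + o) % J
            eta[j][k] = eta[k][j] = 1
    return eta


def adopters_over_time(eta, source, periods):
    """Number of nodes using the new technology in each period, with phi = 1."""
    J = len(eta)
    have = {source}
    counts = [len(have)]
    for _ in range(periods):
        # Every current user passes the technology to all of her friends.
        have = have | {k for j in have for k in range(J) if eta[j][k]}
        counts.append(len(have))
    return counts


collectivist = adopters_over_time(ring(30, (1, 2)), 0, 2)    # degree 4, theta = 1
individualist = adopters_over_time(ring(30, (1, 3)), 0, 2)   # degree 4, Delta = 3 (assumed)
print(collectivist, individualist)
```

Both networks gain four new users in period 1, but by period 2 the individualist ring has more users because fewer of its transmissions reach someone who already has the technology.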
For mobility, a greater probability of forming long links (higher mobility) is similar to the individualist’s long links. It decreases the average path length between nodes that would have otherwise been far apart. All the results in this section are about symmetric networks simply because there are too many possible asymmetric networks to consider them all. However, the results in Table 2 show how each of these network features speeds diffusion, even in a network that is asymmetric.
Taken together, these results explain why ideas and germs spread more quickly in low-path-length networks, why fast diffusion might imply faster technological progress and output growth, and what evolutionary advantages each type of network might offer its adopters.
3.2. Understanding network evolution
Why do some societies end up with networks that inhibit growth? If disease prevalence can permanently alter social structure, then diseases that were prevalent long ago may have created networks that are no longer ideal. This rationalizes differences in social structures that persist even after diseases have been eliminated. It also motivates our use of historical disease data, in the next section. The first result shows that eventually, the economy always converges to a uniform-type network.
Networks converge to a uniform type: With probability 1, the network becomes homogeneous: |$\exists T$| s.t. |$\tau_j(t)=\tau_k(t) \;\forall k$| and |$\forall t>T$|.
In other words, after some finite date |$T$|, everyone will have the same type forever after. They might all be individualist or all be collectivist. But everyone will be the same. Traits are inherited from neighbours, so when a trait dies out, it never returns. The state in which all individuals have the same trait is an absorbing state. The result relies on there being a finite network and a finite type space.
Similarly, having zero infected people is an absorbing state. Since that state is always reachable from any other state with positive probability, it is the unique steady state.9
Disease dies out: With probability 1, |$\exists T$| s.t. |$d_j(t)=0 \;\forall j$| and |$\forall t > T$|.
These results show us that which network type will prevail is largely dependent on which dies out first: the disease-prone trait (individualism, high degree, mobility) or the disease. If disease is very prevalent, it kills off all disease-prone types. The society is left with a disease-resistant but diffusion-inhibiting network forever after. If disease is not very prevalent, its transmission rate is low, or by good luck it just dies out quickly, then the disease-prone will survive. Since they are more economically successful, the economy is more likely to converge to a disease-prone, high-diffusion network. The key consequence is that networks can persist long after the conditions to which they were adapted have changed.
4. Data, Measurement, and Calibration
After providing a qualitative analysis of the forces at work in the model, we bring the model to the data and use it to quantify the macroeconomic effects of social networks. We will do this by simulating economies with different initial disease prevalence. This gives rise to different social networks and economic outcomes. This laboratory will allow us to disentangle the network effects, feedback effects, and the direct effect of health on output.
Quantifying the model involves assembling an assortment of data that includes pathogen prevalence, social networks, and technology diffusion. We assembled a dataset that contains each of these variables for 71 countries. For calibration purposes, we use mostly data from the U.S. because that is our benchmark economy, from which we explore parameter deviations. We explore alternative parameterizations in Supplementary Appendix B. The full dataset is featured in Section 5, where an instrumental variables estimation supports the model’s predictions.
4.1. Mapping network data to model variables
We calibrate the model parameters to replicate features of the U.S. economy. The U.S. is characterized by low disease prevalence and fast-diffusion networks. Network degree |$(n_f)$| and mobility |$(m_j)$| allow the model to match the modal number of friends and the probability of moving. To keep the number of node types limited, we allow a person to be either high or low degree (|$n_f=\{ 4,6 \}$|) and either high or low mobility (probability of forming new links |$m_j = \{ 0, 0.1\}$|). We choose these high and low values because they straddle the variation in our data. Each node is linked to its immediate neighbours by definition. Because |$n_f$| is an even number, the minimum degree that differentiates collectivists from individualists is 4. On the other hand, the maximum degree in the General Social Survey (GSS) dataset for the U.S. is 6.
The spirit of the exercise is to simulate an economy that starts with a small fraction of high-diffusion network types. In different epidemiological environments, we calibrate the parameters to match a low-disease economy and generate time series for networks, output, and technology over 500 periods. We then change the initial disease prevalence and compare the economic outcomes with the benchmark economy of low disease. We use the U.S. as our benchmark economy and set the initial fraction of individualist (long-link type) to 30%.10 We use the same initial fraction for the high mobility type. Since in the General Social Survey, the annual interstate migration rate in the U.S. in 2000 is 3%, we set the high-mobility type to have a probability of moving equal to 10%. The migration rate informs our choice of the high-mobility value and the fraction of high-mobility types.
The length |$\Delta_j$| of the individualists’ long link governs the average path length in an individualist network. Seminal research by Travers and Milgram (1969) on average U.S. social network path length found that a letter dropped in the middle of the U.S. typically found its recipient after being passed on five to six times. Thus, we set |$\Delta_j$| so that, after many periods, when disease is extinct and high-diffusion nodes dominate, that “recent” U.S. network looks like recent data. The average length of the path between any two nodes in that network is |$5.1$|.
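To make the role of the long link concrete, the following sketch computes the average shortest-path length on a ring by breadth-first search. The layout is an assumption made for illustration: every node links to its two immediate neighbours, and its second pair of links goes two positions away for a collectivist or `delta` positions away for an individualist. The helper names `build_ring` and `average_path_length` are hypothetical, and the sketch omits degree heterogeneity and mobility, so it should not be expected to reproduce the calibrated value of 5.1.

```python
from collections import deque

def build_ring(n_nodes, individualist, delta=7):
    """Undirected ring: links to +/-1, plus +/-2 (collectivist) or +/-delta (individualist)."""
    adj = [set() for _ in range(n_nodes)]
    for j in range(n_nodes):
        reach = delta if individualist[j] else 2
        for step in (1, reach):
            for k in ((j + step) % n_nodes, (j - step) % n_nodes):
                adj[j].add(k)
                adj[k].add(j)
    return adj

def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs, by BFS from every node."""
    n = len(adj)
    total = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))

n = 400
collectivist_net = build_ring(n, [False] * n)
individualist_net = build_ring(n, [True] * n)
print(average_path_length(collectivist_net), average_path_length(individualist_net))
```

Under these assumptions the all-individualist ring has the shorter average path, which is the sense in which a longer |$\Delta_j$| speeds diffusion.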
4.2. Mapping data to disease parameters
For the initial disease prevalence rate (|$d_h(0)$|), we set the low germs prevalence to 0.05% since in the U.S., infectious diseases have been almost entirely eradicated.11 In our experiment, we set the high disease prevalence to 18%, which matches the malaria prevalence in Ghana in 2006.12
To calibrate the probability of disease transmission |$(\pi)$|, a natural target is the steady state rate of infection. But, as we have shown, the only steady state infection rate is zero. Therefore, we set the transmission rate so that, on average, the disease disappears in 150 years in the high germs economy (Ghana).
Finally, our social change probability |$p_\xi$| regulates the rate of network evolution. When we calibrate |$p_\xi$| to |$0.2$|, it means that on average, a social change shock arrives once every 5 years. Although the node can change type every 5 years, most of the time, the highest-productivity node among one’s friends has the same type, so no change of type occurs. This |$p_\xi$| value is set such that each generation (25 years), 5% of the population changes type.
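The arithmetic behind the five-year interpretation is that of a geometric waiting time. The short sketch below assumes one period equals one year and that social change shocks are independent across periods; it does not model the imitation step, so it says nothing about the 5% type-change target itself.

```python
p_xi = 0.2  # per-period probability of a social change shock

expected_wait = 1 / p_xi            # mean of a geometric waiting time, in years
shocks_per_generation = 25 * p_xi   # expected number of shocks in 25 years
prob_no_shock = (1 - p_xi) ** 25    # chance a node gets no shock in a generation

print(expected_wait, shocks_per_generation, prob_no_shock)
```

Almost every node therefore receives at least one shock per generation, so the much smaller share that actually changes type reflects how often the most productive friend already has the same type.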
4.3. Calibrating technology diffusion parameters
We use technology measures derived from the cross-country historical adoption of the technology dataset developed by Comin and Hobijn (2010). The data cover the diffusion of about 115 technologies in over 150 countries during the last 200 years. There are two margins of technology adoption: the extensive margin (whether or not a technology is adopted at all) and the intensive margin (how quickly a technology diffuses, given that it is adopted). If the technology was introduced to the country late, a country can be behind in a technology even though it is adopting it quickly.
To calibrate our model, we use the extensive measure, called the adoption lag. Comin and Hobijn (2010) define country |$x$|’s adoption lag to be the number of years in between invention and the date when the first adopter in country |$x$| adopts the technology. We average over the various technologies to arrive at the country’s average adoption lag.
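As a minimal sketch of this definition, the snippet below averages the gap between invention and first adoption across technologies for one country. The records are made-up placeholders, not values from the Comin and Hobijn (2010) dataset, and the function name `average_adoption_lag` is hypothetical.

```python
from statistics import mean

# Hypothetical records: (technology, invention_year, first_adoption_year_in_country)
records = [
    ("tech_a", 1900, 1910),
    ("tech_b", 1950, 1970),
    ("tech_c", 1960, 1990),
]

def average_adoption_lag(rows):
    """Average, over technologies, of years between invention and first adoption."""
    return mean(first_adoption - invented for _, invented, first_adoption in rows)

print(average_adoption_lag(records))
```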
Every node/person starts with a technology level of 0. Each period, any given person may discover a new technology that raises his productivity with probability |$\lambda$|. The rate of arrival |$(\lambda)$| is calibrated so that the average time between advances in the technology frontier is 21 years.
The magnitude of the increase in productivity from adopting a new technology (|$\delta$|) is calibrated to match the U.S. GDP growth rate of 2.6% per year. The probability of transmitting a new technology to each friend (|$\phi$|) is chosen to match the fact that for the average household technology, the time between invention and diffusion to half of the population is 40.18 years (Greenwood et al., 2005).13
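The half-diffusion moment can be illustrated with a stripped-down simulation. The sketch below assumes a fixed ring in which every node has degree 4 (two friends on each side), a single initial adopter, and synchronous updating in which each adopter passes the technology to each non-adopting friend with probability `phi` every period. Disease, mobility, and type change are all omitted, so the number it returns is not comparable to the model moment in Table 1; the function name and defaults are assumptions.

```python
import random

def half_diffusion_time(n_nodes=400, half_degree=2, phi=0.12, seed=0,
                        max_periods=10_000):
    """Periods until at least half of the nodes have adopted a new technology."""
    rng = random.Random(seed)
    adopted = [False] * n_nodes
    adopted[0] = True  # a single inventor
    for period in range(1, max_periods + 1):
        new_adopters = set()
        for j in range(n_nodes):
            if not adopted[j]:
                continue
            for step in range(1, half_degree + 1):
                for k in ((j + step) % n_nodes, (j - step) % n_nodes):
                    # each friend learns the idea with probability phi
                    if not adopted[k] and rng.random() < phi:
                        new_adopters.add(k)
        for k in new_adopters:
            adopted[k] = True
        if sum(adopted) >= n_nodes / 2:
            return period
    return None  # half-diffusion not reached within max_periods

print(half_diffusion_time())
```

With certain transmission (`phi=1.0`) the technology advances two nodes per side per period, which gives a lower bound on the half-diffusion time for this ring.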
Disease-technology feedbacks. Section 2.1.5 introduced two feedback effects: an endogenous innovation rate that falls as the infection rate rises and an endogenous disease infection probability that falls as technology improves. Each of these feedbacks introduces one new parameter to calibrate. For endogenous innovation, we choose the parameter |$\kappa_T = 5.56$| (from equation (1)) to match the average difference in adoption lags between the U.S. and Ghana, which is 32 years. The new parameter that governs endogenous infection rates |$\kappa_G = 400$| (from equation (2)) targets a complete extinction of germs in all U.S. simulations. Notice that the innovation probability evolves as a function of disease prevalence, and when the germs are completely eradicated, it is equal to the U.S. innovation probability (|$\lambda_0$|). Given the disease prevalence in Ghana, this implies an initial value of innovation probability of 0.03%, which evolves over time as the germs disappear. In the low-disease economy, the U.S., the disease transmission probability (|$\pi$|) starts at 5% and decreases faster, because productivity increases more rapidly. At the end of the simulation period, the germs transmission probability approaches 0.
These parameters are summarized in Table 1.
| Description | Parameter | Value | Target (data) | Model moments |
|---|---|---|---|---|
| Degree, low | |$n_f(L)$| | 4 | General Social Survey (GSS) | |
| Degree, high | |$n_f(H)$| | 6 | | |
| Mobility, low | |$m_j(L)$| | 0 | | |
| Mobility, high | |$m_j(H)$| | 0.1 | Interstate migration rates | |
| Disease transmission probability | |$\pi_0$| | 5% | Disease disappears in 150 years | 142 |
| Innovation productivity increase | |$\delta$| | 80 | 2.6% growth rate in low germ country | 3.0% |
| Technology transfer probability | |$\phi$| | 12% | Half-diffusion in 40.18 years (Greenwood et al., 2005) in low germ | 42.35 |
| Number of nodes to furthest friend of an individualist | |$\Delta$| | 7 | Average path length of 5 (Travers and Milgram, 1969) in U.S. | 5.1 |
| Technology arrival rate | |$\lambda_0$| | 0.08% | U.S. technology adoption lag of 21 years (Comin and Hobijn, 2010), low germ | 21.5 |
| Social change probability | |$p_\xi$| | 20% | 5% population type change every 25 years, low germ | 8.7% |
| Tech feedback | |$\kappa_T$| | 5.56 | Difference of adoption lag 32 years | 34 |
| Germs feedback | |$\kappa_G$| | 400 | Disease disappears in all low-germs simulations (0 survival of germs) | 0 |
| Initial conditions for endogenous variables: | | | | |
| Fraction of individualist | |$f_i$| | 30% | | |
| Fraction of high degree | |$f_d$| | 20% | | |
| Fraction of high mobility | |$f_m$| | 70% | | |
5. Main Results: How Do Networks Affect Income?
Our objective is to better understand how social networks affect technology diffusion and economic development. The difficulty is that economic development also changes social network structures, both directly and through disease prevalence. Using the quantified model, we can separate these two effects by exogenously changing the network structure and observing the predicted effect on technology diffusion and output. To do this, we proceed in three stages. First, we look at the model-implied correlations between networks and income/productivity measures. This tells us how much variation in cross-country income our model, with all its mechanisms, can generate. Second, we use the model to isolate the network effect. Holding all other features fixed, we compare outcomes of an economy in a high-diffusion social network with those from a low-diffusion social network. The main finding is that, while high-diffusion social networks are a strong statistical predictor of output, this does not imply that changing a country’s social structure will boost incomes. If a country’s social network is low-diffusion, replacing it with a higher-diffusion pattern of social connection can impoverish the country. Third, we explore why the effect of network changes depends on disease prevalence. If a country has a low-diffusion network, that network evolved in order to protect the country from disease epidemics. Replacing the low-diffusion network with a high-diffusion one can be disastrous for the economy and the residents because the new network is maladapted to their environment.
5.1. The covariance between networks and income
How much of the variation in income can differences in disease and networks jointly explain? Here, we show that the forces of the model can jointly explain large differences in income across countries. Of course, just as in the data, the model’s networks are endogenous. This result does not imply that changing networks alone can produce this variation in cross-country income. The causal effect comes in the next subsection.
To explore the relationship between networks and income, the model needs some variation in networks. But networks are endogenous outcomes of different disease environments (and random chance). Therefore, we construct a set of economies with different initial disease prevalence, simulate them, and measure the network and incomes at the end of the simulation. Each initial disease level is meant to represent a different country that evolved to have a different social network. Specifically, we consider 10 equally spaced levels of initial disease prevalence between 0.05% and 18%. These end points correspond to the lowest and highest levels of disease prevalence in our data. For each initial level of disease, we run 200 simulations, where each simulation is computed on a 400-person network, over 500 periods. At the end of the 500 periods, we use the fraction of individualists, high-degree, and high-mobility nodes to construct the network index (defined in (4), using weights from model output). Figure 2 shows that a 0.1 standard deviation change in the network index (0.15 points) correlates with a 0.25 log point (25%) higher end-of-simulation income.
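The experimental design described above can be laid out as a simple grid. The sketch below only enumerates the simulated economies; the constant names are illustrative, and the simulation itself (which would take each prevalence and seed as inputs) is left out.

```python
N_LEVELS, N_SIMS, N_NODES, N_PERIODS = 10, 200, 400, 500
LOW, HIGH = 0.0005, 0.18  # 0.05% and 18% initial disease prevalence

# Ten equally spaced initial prevalence levels, end points included
step = (HIGH - LOW) / (N_LEVELS - 1)
prevalence_grid = [LOW + i * step for i in range(N_LEVELS)]

# One (initial prevalence, seed) pair per simulated economy
design = [(d0, seed) for d0 in prevalence_grid for seed in range(N_SIMS)]

print(len(prevalence_grid), len(design))
```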

Figure 2. Networks and income: covariance.
Notes: The graph reports the end-of-simulation income against the end-of-simulation network index. The points on the left of the |$x$|-axis come from a high-disease economy that starts with a maximum disease prevalence of 18% and evolves to have low-diffusion networks (low network index). The points on the right come from low-disease economies (minimum is 0.05% prevalence), where high-diffusion networks emerged (high network index).
In large part, networks and income are correlated simply because both are correlated with disease. At the same time, this correlation is a useful starting point because the real world also has joint and reverse causality. It makes clear that, later, the difference in the causal results arises not simply because the model differs from the real world. The causal results and correlations differ greatly even inside the model. The model is valuable, in part, because it explains why the true causal effects might deviate from the correlation that the data alone would suggest.
5.2. How exogenous network changes affect technology and income
To measure the causal effect of social networks on technology and output in the model, we change the network and observe the result. Surprisingly, we find that the effect of high-diffusion networks can be positive, negative, or zero. In other words, while the correlation between networks and income is strongly positive, the causal effect may not be. The result depends on the environment the network is in. The reason is that high-diffusion networks can expose disease-prone countries to income-reducing disease epidemics.
We begin by looking at social network effects in a low-disease environment. We start with our network calibrated to the U.S. and then vary each network feature one at a time: for individualism, degree, and mobility in turn, we double the fraction of nodes of the high-diffusion type. Specifically, nodes have degree |$n_f \in \{4, 6\}$|, and high-degree networks have a higher fraction of degree 6 nodes. For mobility, we consider annual moving probabilities |$p \in \{0, 0.1\}$|, and high-mobility networks have more of the nodes that have a 1/10th chance of moving each year. Finally, nodes are either individualist or collectivist, and individualist networks have a higher fraction of individualist nodes. When we simulate a new economy with a different degree or mobility parameter, we change the probability that every node in the network has a particular characteristic, in a symmetric way.
The top of Table 2 describes the effect of higher-diffusion networks in a low-disease economy. This is the result one would expect: network features that facilitate diffusion speed the diffusion of new technologies and boost income. The greatest gains come from mobility and degree, where a doubling of more mobile or connected agents raises growth by 0.23–0.24 percentage points. Cumulated over 200 years, this network change raises incomes by 56–58%. This is a more modest effect than the correlations suggest, but it is directionally consistent.
| | | Benchmark | Double individualists | Double high-degree | Double high-mobility |
|---|---|---|---|---|---|
| Low-disease economy | Growth | 3.35% | 3.44% | 3.59% | 3.58% |
| (0.05% initial prevalence) | Log(income) | 6.70 | 6.88 | 7.18 | 7.16 |
| | Tech diffusion | 1/25.2 | 1/24.1 | 1/22.5 | 1/23.2 |
| | Disease extinction | 59.4 | 58.5 | 96.3 | 51.2 |
| High-disease economy | Growth | 1.33% | 1.29% | 0.12% | 1.46% |
| (18% initial prevalence) | Log(income) | 2.66 | 2.58 | 0.24 | 2.92 |
| | Tech diffusion | 1/73.4 | 1/75.8 | 1/177.8 | 1/69.1 |
| | Disease extinction | 196.3 | 196.9 | 201.0 | 196.0 |
Notes: Entries report average annual growth, |$\ln(A_{t+1}/A_t)$|, the ending log level of output per capita, the number of periods it takes for half-diffusion of a new technology (inverted), and the average number of periods until disease extinction. All are averaged across 200 simulations, each of which runs 200 periods. Social networks are held constant. The network is either the U.S. social network or the U.S. benchmark network with one of the following doubled: the proportion of nodes that are individualist (from 44.5% to 89%), the proportion of high-degree |$n_f=6$| nodes (from 60% to 100%), or the proportion of high-mobility |$m_j= 0.1$| nodes (from 31.2% to 62.5%). All other model features and parameters are constant and reported in Table 1.
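The growth and log-income rows of Table 2 are linked by simple arithmetic, which the following sketch checks. It assumes that log output starts at zero, so that the ending log level equals average growth multiplied by the 200 periods; that starting value is an inference from the reported numbers rather than something stated in the notes.

```python
PERIODS = 200

# Values copied from Table 2 (benchmark, double individualists,
# double high-degree, double high-mobility)
growth_pct = {"low": [3.35, 3.44, 3.59, 3.58], "high": [1.33, 1.29, 0.12, 1.46]}
log_income = {"low": [6.70, 6.88, 7.18, 7.16], "high": [2.66, 2.58, 0.24, 2.92]}

# Cumulated growth over the simulation, in log points
implied = {
    economy: [round(g / 100 * PERIODS, 2) for g in rates]
    for economy, rates in growth_pct.items()
}

print(implied)
```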
The bottom of Table 2 paints a different picture for a high-disease economy. In most of these experiments, altering the social network to facilitate faster diffusion lowers income. The effect is most drastic for degree. For high-disease economies, doubling degree causes output to fall by 90%. Individualism likewise impairs growth. The reason is that both perpetuate disease, which compromises growth. Only mobility facilitates economic development for economies with both high- and low-disease prevalence. Of course, mobile agents can also introduce disease to new communities. But ideas are more prevalent than disease. Mobile agents spread ideas more than germs. Because of the feedback from technology to the infection rate, the faster technology diffusion from mobile agents speeds up disease extinction and promotes economic growth.
The bottom line is that the way in which networks affect economic growth depends on the disease environment. High-diffusion networks in high-disease economies spread more disease. Because that is the dominant effect, such networks—especially high-degree networks—can impoverish high-disease countries. These same networks also transmit more disease in low-disease countries. We can see that effect in a longer time to extinction. The difference is that when new ideas are prevalent and disease is rare, the net effect of spreading ideas and disease is positive. Thus, the same networks that impoverish poor countries can facilitate growth in rich ones where epidemics are rare. To thrive, each country needs a social network that is well adapted to its environment.
These causal effects can be substantially different from what the correlations suggest. Of course, depending on when we measure, we could get small or arbitrarily large differences in income levels. However, reporting income after 200 periods facilitates comparison with the previous results. For high-disease economies, the effect of high-diffusion networks is the opposite of what the correlations would predict. For low-disease economies, the results look more similar. After 200 periods, the log incomes in the low-disease economy range from 6.7 to about 7.2. This translates into income levels (not in logs) ranging from 812 to 1339, a roughly 50% difference in income per capita. This is similar to the correlation results in Figure 2.
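The conversions between log income and income levels quoted in this section can be reproduced directly. The sketch below uses only figures reported in the text and in Table 2; the variable names are illustrative. It shows that the gap between 6.7 and 7.2 is 0.5 log points, which corresponds to a ratio of |$e^{0.5}$| in levels, and that the fall in log income from 2.66 to 0.24 in the high-disease, double high-degree experiment is consistent with the roughly 90% drop in output.

```python
import math

# Low-disease economy: range of log incomes after 200 periods
log_low, log_high = 6.7, 7.2
level_low, level_high = math.exp(log_low), math.exp(log_high)
log_gap = log_high - log_low          # gap in log points
level_ratio = level_high / level_low  # same gap expressed as a ratio of levels

# High-disease economy: benchmark versus doubled high-degree share (Table 2)
output_fall = 1 - math.exp(0.24 - 2.66)

print(round(level_low), round(level_high), round(level_ratio, 2), round(output_fall, 2))
```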
5.2.1. Parameter sensitivity
The qualitative nature of the underlying data on social structure makes this model a tricky one to calibrate, which makes exploring alternative parameter values particularly important. For both low- and high-disease economies, we explore the robustness of our results to changes in model parameters. In particular, we explore doubling and halving the values of the probability of social change |$p_\xi$|, the technology transmission probability |$\phi$|, and the probability of disease diffusion |$\pi$|. The effect of networks on technology and output is remarkably stable. It varies by less than 20% across all of these different specifications. Table B.5 in Supplementary Appendix B reports results for each exercise individually.
Instead of doubling, we could have halved the prevalence of each of our network features. Table B.6 in Supplementary Appendix B shows that most of the results carry over, in the sense that, if doubling was detrimental to growth, halving is beneficial.
The role of disease-technology feedbacks. The disease-technology feedbacks help the model to generate more nuanced results. In particular, when disease transmission rates fall as technology improves, that force helps to prevent the extinction of high-diffusion types, who are more susceptible to disease.
Turning off the feedbacks primarily affects high-disease economies. Turning off feedback 1, so that the innovation rate does not depend on disease, makes high-disease economies grow faster. It also makes high-diffusion networks more economically beneficial: if innovation does not depend on disease, then when a high-diffusion network spreads disease, this has a less negative effect on output. When we turn off feedback 2, so that the infection rate does not depend on technology, growth in high-disease economies falls because these economies cannot innovate their way out of epidemics. This is particularly costly for economies with high-diffusion networks because the benefits of their technology diffusion are smaller, while the disease-promotive costs remain.
Details of these results and the calibration are in Table B.7 of Supplementary Appendix B. The message from this exercise is that the feedbacks strengthen our main conclusions but are not the sole drivers of the interaction between networks, technology, and disease.
5.3. How social networks adapt to their environment
Our main question is about the effect, not the origin, of networks. But to understand where and why changing the structure of social networks might be harmful, it is useful to understand how a country’s social network emerges.
In the model, what creates variation in networks is the initial level of disease and random chance. Therefore, we simulate two economies that vary in their initial infection rates. In Figure 3, the results labelled “low disease” have an initial disease prevalence of 0.05%. The results labelled “high disease” have an initial disease prevalence of 18%. All other parameters are the same across the two sets of simulations. While disease rates evolve endogenously through the model’s contagion process, the initial differences create persistent differences in social network structure and thus in output.

Figure 3. Initial disease, network evolution, extinction, and output.
Notes: These figures illustrate the persistent effects of initial disease on social networks and output. The top two graphs report the fraction of the model’s economies of individualists, high-degree agents, and high-mobility agents at each date |$t$|. The top left panel is a low-disease economy that starts with a disease prevalence of 0.05%. The top right panel is a high-disease economy that starts with 18% prevalence. The bottom two graphs plot the rate of disease extinction and output for both the low-disease (low germs) and high-disease (high germs) economies.
Figure 3 illustrates how social networks adapt to the disease environment. A low-disease environment has an increasing number of individualist nodes with high degree and high mobility. Such nodes thrive because they get new ideas faster than low-diffusion types. New ideas boost their output. When other nodes change type, they adopt the type of the most economically successful nodes, which are the high-diffusion nodes. Thus, in low-disease environments, high-diffusion network characteristics thrive.
But in high-disease environments (top right panel), high-diffusion nodes get sick quickly. They may also get new ideas. But, if a node is sick, it is unproductive, regardless of its technology. One has to be alive and well to be productive. Therefore, when nodes change type, the most economically successful types are the low-diffusion types because those agents are less likely to be sick. In the high-disease environment, most nodes retain low-diffusion characteristics.
Since networks adapt to the disease environment, changing the network without changing the disease environment can be disastrous. A high-diffusion network, in a place where disease is prevalent, is a recipe for epidemics and humanitarian crisis. At the same time, we also see that even after disease is eradicated, the network persists. Table 2 reports that on average (80% of the simulations), diseases are eradicated within 200 years. Yet, we see social network features persisting because they are passed down, generation by generation. So, networks are adapted to their environment. But, when the environment changes, especially if the change is not life-threatening, social networks may be slow to respond.
6. An Alternative Approach: Estimation with Instruments
So far, we learned that although the correlation between social networks and income is strong, the causal evidence is more mixed: some economies benefit from high-diffusion networks, and others do not. Before ending, we bring one additional perspective on the question of how social networks matter for aggregate outcomes. This last approach uses instrumental variables to try to identify a causal relationship in a non-structural way. Both approaches are imperfect. In an environment with equilibrium effects everywhere, no instrument will be exogenous to everything. Every model is also surely wrong in some ways. We briefly pursue this second approach to measuring the importance of networks in order to compare the two answers. We find that the model-based approach and the identification-based approach deliver broadly consistent answers to the following question: for the median country that has mostly eradicated communicable disease, how much do social networks matter for the macroeconomy?
6.1. Measuring social networks
Our model features three dimensions of social network connections: individualism, degree, and mobility.
6.1.1. Measuring individualism
In our model, collectivism is defined as a social pattern of closely linked or interdependent individuals. Individualism is its opposite. What distinguishes collectives from sets of people with random ties to each other is that in collectives, two friends often have a third friend in common. This is the sense in which they are interdependent. Hofstede defines collectivism as
the degree to which individuals are integrated into groups. On the individualist side we find societies in which the ties between individuals are loose: everyone is expected to look after him/herself and his/her immediate family. On the collectivist side, we find societies in which people from birth onwards are integrated into strong, cohesive in-groups, often extended families (with uncles, aunts and grandparents) which continue protecting them in exchange for unquestioning loyalty. (Hofstede, 2003, pp. 9–10)
As a proxy for individualism, we use country |$i$|’s Hofstede individualism index. This index was constructed from data collected during an employee attitude survey program conducted by a large multinational organization (IBM) within its subsidiaries in 72 countries. The survey took place in two waves, in 1969 and 1972, and included questions about demographics, satisfaction, and goals. The individual answers were aggregated at the country level after matching respondents by occupation, age, and gender. The countries’ mean scores for 14 questions were then analysed using factor analysis, which resulted in the identification of two factors of equal strength that together explained 46% of the variance.14 The individualism factor is mapped onto a scale from 1 to 100 to create an index for each country.
6.1.2. Measuring mobility
Most people form their strongest social ties through repeated social contact with neighbours. Long-distance ties are likely to be with one’s former neighbours. Thus, mobility governs the quantity of social ties in far-off locations. For the United States, the census tells us the annual probability of a resident moving across state lines. We do not have a large cross-country panel of mobility data for other countries, but we do have extensive data on mobility for U.S. residents, including immigrants. The data come from the Annual Social and Economic Supplement (ASEC) of the Current Population Survey (CPS), from 1994 to 2013. We select individuals aged 18 to 35 and use the birthplace variable (BPL) to assign country of origin. To measure mobility, we use the variable MIGRATE1, which indicates whether the respondent changed residence in the past year. Movers are those who reported moving across state borders. Our measure of mobility is the fraction of first-generation U.S. immigrants from each country who move in a given year.
The underlying assumption is that people who move to the United States from countries with higher degree/mobility maintain higher degree/mobility and pass these social network preferences on to their children. This approach of using data for U.S. residents follows Fernández and Fogli (2009) and helps to control for institutional differences that might otherwise explain different behaviour across countries. At the same time, immigrants are a selected sample who likely have a higher propensity for mobility and have been partly assimilated into American culture. As such, our measure likely underestimates the differences in social network structure across countries.
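As an illustration of the mobility measure described in this subsection, the following minimal Python sketch computes the fraction of movers by country of birth. The records, the field layout, and the function name are hypothetical. The actual CPS extract, its coding of BPL and MIGRATE1, and any survey weights are not reproduced here.

```python
from collections import defaultdict

# Hypothetical records: (birthplace, moved_across_state_lines) for
# first-generation immigrants aged 18 to 35. Values are illustrative only.
records = [
    ("Italy", True), ("Italy", False), ("Italy", False), ("Italy", False),
    ("Ghana", False), ("Ghana", False),
    ("India", True), ("India", True), ("India", False), ("India", False),
]


def mobility_by_origin(rows):
    """Fraction of respondents from each origin who moved across states."""
    movers = defaultdict(int)
    totals = defaultdict(int)
    for origin, moved in rows:
        totals[origin] += 1
        movers[origin] += int(moved)
    return {origin: movers[origin] / totals[origin] for origin in totals}


mobility = mobility_by_origin(records)
```

The sketch treats every respondent equally. A version closer to survey practice would presumably weight respondents, which is an extension not shown here.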
6.1.3. Measuring network degree
A network’s degree is the average number of connections of a node in the network, |$n_f$| in the model. Our empirical proxy for network degree in each country is the average number of close friends reported by U.S. residents that report having ancestors coming from that country. Our data come from the General Social Survey (GSS).15 The variable numgiven asks, “From time to time, most people discuss important matters with other people. Looking back over the last six months—who are the people with whom you discussed matters important to you? Just tell me their first names or initials.” Based on this variable, we select respondents that report having ancestors coming from another country and average their responses to construct an index for network degree for each country in our sample.
The GSS dataset covers only 28 countries. For the remaining countries, we impute degree by estimating |$\alpha$|, |$\beta_1$|, and |$\beta_2$| in the regression |${degree}_i = \alpha+ \beta_{1} {hofstede}_i + \beta_{2} {mobility}_i + \epsilon_i$|. We then use the predicted values from that regression to fill in the missing observations.
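The imputation step can be sketched as follows: fit the regression on countries with an observed degree, then use fitted values only where degree is missing. This is a minimal pure-Python sketch under stated assumptions. The country rows are invented for illustration, and the helper names `solve` and `ols` are hypothetical rather than part of any estimation code used for the paper.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical rows: (hofstede, mobility, degree or None when missing).
countries = {
    "A": (20.0, 0.02, 1.6),
    "B": (40.0, 0.03, 2.1),
    "C": (60.0, 0.02, 2.4),
    "D": (80.0, 0.05, 3.1),
    "E": (50.0, 0.04, None),  # not covered by the survey: to be imputed
}

observed = [v for v in countries.values() if v[2] is not None]
X = [[1.0, hof, mob] for hof, mob, _ in observed]
y = [deg for _, _, deg in observed]
alpha, beta1, beta2 = ols(X, y)

# Keep reported values; fill gaps with fitted values from the regression.
degree = {
    name: deg if deg is not None else alpha + beta1 * hof + beta2 * mob
    for name, (hof, mob, deg) in countries.items()
}
```

Observed values are left untouched, so the imputation affects only the countries without survey coverage.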
6.1.4. Network diffusion index
Our three network features (collectivism/individualism, degree, and mobility) all accelerate the diffusion of new technologies. Summarizing their effects jointly facilitates graphical representation and later avoids econometric identification problems. Therefore, we combine our network measures into a single network index. The index we construct is the first principal component of the three measures (see equation (3)). The last column of Table 3 lists the linear weights.
| | Individualism (1) | Mobility (2) | Degree (3) | Index loadings |
|---|---|---|---|---|
| Individualism | 1.000 | | | 0.575 |
| Mobility | 0.364 | 1.000 | | 0.540 |
| Degree | 0.545 | 0.471 | 1.000 | 0.615 |
Notes: The table reports correlations of the three measures of social network structure described in Section 6.1 and the loadings of each measure on the network index.
One might be concerned that all three measures capture the same variation. Table 3 also describes the cross-correlation of our three measures of social networks. While the measures are not uncorrelated, there is also some independent variation between them.
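To make the construction of the index concrete, the sketch below recovers the first principal component from the correlations reported in Table 3, using power iteration in plain Python. It assumes the correlation matrix is read in the row order of the table (individualism, mobility, degree) and that the index applies the loadings to standardized measures. The function names are illustrative, not the authors' code.

```python
import math

# Correlations reported in Table 3, ordered (individualism, mobility, degree).
R = [
    [1.000, 0.364, 0.545],
    [0.364, 1.000, 0.471],
    [0.545, 0.471, 1.000],
]


def first_principal_component(corr, iterations=200):
    """Leading unit-norm eigenvector of a correlation matrix (power iteration)."""
    v = [1.0] * len(corr)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in corr]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


loadings = first_principal_component(R)


def network_index(z_individualism, z_mobility, z_degree):
    """Index for one country from standardized (z-scored) measures (assumed form)."""
    z = (z_individualism, z_mobility, z_degree)
    return sum(w * x for w, x in zip(loadings, z))
```

Under these assumptions, the computed loadings land close to the values in the last column of Table 3 (0.575, 0.540, and 0.615), which suggests that the reported weights are the leading eigenvector of this correlation matrix.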
6.1.5. Comparing correlations in the model and data
Before we continue describing how we identify the effect of networks, it is useful to get some sense of the covariances. Figure 4 illustrates the network features of each country in our dataset. The last panel reports the disease prevalence measure, as well as the combination of all the network features into a single network index.

Social network features, income, and disease (data).
Notes: The first three panels are each of our three network measures (degree, collectivism/individualism, and mobility), ordered by log of income per capita. The last panel illustrates the relationship between disease (pathogen) prevalence and the network index |$\tilde{N}$|.
| Dependent variable: GDP per capita | Data | Model | Data | Model |
|---|---|---|---|---|
| Network index | 0.713*** | 1.591*** | | |
| | (0.103) | (0.067) | | |
| Germs sum | | | -0.640*** | -3.129*** |
| | | | (0.0758) | (0.352) |
| Observations | 71 | 10 | 71 | 10 |
| $R^2$ | 0.409 | 0.986 | 0.508 | 0.908 |
Notes: Entries are Ordinary Least Squares (OLS) estimated coefficients of equations (5) or (6). Output in the data is real output per capita log(|$Y/L$|). In the model, output per capita is |$A(t)$|. Disease prevalence in the data is the disease prevalence index, defined in Section 6.1. In the model, disease prevalence is the initial fraction of sick agents, which lies on a grid of 10 equally spaced points between 0.05% and 18%. Standard errors are in parentheses.
The strong positive correlation between network diffusion and income comports with what we see in the data. This finding is reassuring, since the model was not calibrated to match this correlation or any closely related moments. In calibration parlance, these are “overidentifying moments.” The similar orders of magnitude in Table 2 suggest that the model is a reasonable framework for quantifying network effects on income.
How large is this effect economically? Suppose every person in a given country had one more friend. How much additional income would these estimates suggest? Since the standard deviation of degree is |$0.30$|, the adjusted |$\widetilde{degree}$| measure changes by |$1/0.30 \approx 3.3$|. According to equation (4), this increases the Network Index by |$0.61 \times 3.3=2.01$|. Multiplying this by the model’s estimated coefficient on the Network Index delivers an increase in average output per capita of |$2.01 \times 1.59 = 3.20$|. How large is that increase? Since the U.S. has average income (log|$Y/L$|) of |$10.5$|, this represents a 30% increase in U.S. GDP per capita.
Of course, that was a three standard deviation change in a network feature, a rather large change. Suppose that we change the fraction of individualists by |$0.24$|, an amount that corresponds to one standard deviation of individualism in the data. This increases the Network Index by |$0.61$|. When multiplied by the coefficient on Network Index estimated from model output, the predicted effect on average output is |$0.61 \times 1.59 = 0.97$|. That represents a more modest 9.2% of U.S. GDP. A similar calculation implies that a 6% change in the probability of moving (also one standard deviation in the data) would predict a change in GDP equal to 8.5% of U.S. income. The bottom line is that the estimation coefficients of the model and data reveal that, statistically, networks can predict lots of variation in cross-country income.
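The arithmetic in the two preceding paragraphs can be retraced with a short Python sketch. The inputs (the 0.61 weight, the model coefficient of 1.59, and U.S. log income of 10.5) are the figures quoted in the text. The function and variable names are illustrative. The percentages are computed as in the text, as the predicted change divided by the U.S. log-income level.

```python
# Inputs quoted in the text; names are illustrative.
LOADING = 0.61            # weight of the network feature in the index
MODEL_COEFFICIENT = 1.59  # model-estimated coefficient on the network index
US_LOG_INCOME = 10.5      # U.S. log(Y/L)


def implied_effect(change_in_sd):
    """Return (index change, output change, share of U.S. log income)."""
    index_change = LOADING * change_in_sd
    output_change = index_change * MODEL_COEFFICIENT
    return index_change, output_change, output_change / US_LOG_INCOME


# One more friend: 1 / 0.30 standard deviations, rounded to 3.3 as in the text.
one_more_friend = implied_effect(3.3)

# One standard deviation of individualism.
one_sd_individualism = implied_effect(1.0)
```

With these inputs, the first call gives an index change of about 2.01 and an output change of about 3.20, and the second gives an output change of about 0.97, in line with the figures above.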
6.1.6. Disease difference instrument
Our theory suggests that an instrument with power to predict social network structure |$\tilde{N}$| is total disease prevalence. But this instrument is not likely to be valid, both because technology affects disease (vaccines are a technology, for example) and because poor health reduces productivity and diminishes one’s capacity for invention.
However, according to sociobiologists, the effect of disease on social networks depends on how the disease is transmitted.16 Sociobiologists often classify infectious diseases by reservoir. The reservoir is any person, animal, plant, soil, or substance in which an infectious agent normally lives and multiplies. From the reservoir, the disease is transmitted to humans. Pathogens that use humans as their reservoir (perhaps in addition to other reservoirs) have the potential to affect social networks because they are passed on to more connected individuals more frequently. Zoonotic pathogens are those not carried by people, only by other animals. Their prevalence is less likely to affect social networks in any systematic way.
We restricted the coefficients on |$d_h$| and |$d_z$| to be the same, meaning that human disease prevalence and zoonotic disease prevalence have the same effect on technology. Hence, the total effect on technology is determined by the sum |$d_h + d_z$|. This is orthogonal to the composition of the effect between the two types of disease, |$d_h - d_z$|, which has no direct effect on |$A$|. Therefore, since the diseases have different effects on networks |$\tilde{N}$| and similar effects on the speed of technology diffusion |$A$|, the instrument |$(d_h - d_z )$| can be a powerful and valid instrument.18
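The orthogonality of the sum and the difference follows from the bilinearity of covariance. The short derivation below uses only the symbols |$d_h$| and |$d_z$| from the text and shows why the equal-variance condition noted in the footnotes is what makes the two exactly uncorrelated.

```latex
\begin{aligned}
\operatorname{Cov}(d_h - d_z,\; d_h + d_z)
  &= \operatorname{Var}(d_h) + \operatorname{Cov}(d_h, d_z)
     - \operatorname{Cov}(d_z, d_h) - \operatorname{Var}(d_z) \\
  &= \operatorname{Var}(d_h) - \operatorname{Var}(d_z).
\end{aligned}
```

The covariance is therefore zero exactly when |$var(d_h) = var(d_z)$|. If the variances differ, the difference and the sum remain correlated in proportion to that gap.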
To measure the prevalence of both human and animal-transmitted diseases, we use historical data. In recent times, disease prevalence dropped drastically with the medical advances of the 20th century. Yet, our model predicts that disease patterns generations ago might still affect social networks today. The oldest comprehensive cross-country disease data come from the 1930s and include the following nine life-threatening diseases: leishmanias, leprosy, trypanosomes, malaria, schistosomes, filariae, dengue, typhus, and tuberculosis. Most of the prevalence data come from Murray and Schaller (2010),19 who use a four-point coding scheme: |$0=$| completely absent or never reported, |$1=$| rarely reported, |$2=$| sporadically or moderately reported, and |$3 =$| present at severe levels or epidemic levels at least once.
We have prevalence of all nine diseases in 160 countries. For each country, we add up the prevalence score from each disease to get that country’s disease prevalence index.
Supplementary Appendix C contains more information about the diseases and their characteristics.
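The construction of the disease prevalence index described above amounts to summing nine coded scores per country. A minimal Python sketch follows. The example country and its scores are hypothetical, and the function name is illustrative.

```python
# Four-point codes: 0 absent, 1 rare, 2 sporadic or moderate, 3 severe or epidemic.
DISEASES = (
    "leishmanias", "leprosy", "trypanosomes", "malaria", "schistosomes",
    "filariae", "dengue", "typhus", "tuberculosis",
)


def disease_prevalence_index(scores):
    """Sum the 0-3 prevalence codes over the nine diseases for one country."""
    missing = [d for d in DISEASES if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    if any(scores[d] not in (0, 1, 2, 3) for d in DISEASES):
        raise ValueError("scores must be coded 0, 1, 2, or 3")
    return sum(scores[d] for d in DISEASES)


# Hypothetical country, not taken from the data.
example = {d: 1 for d in DISEASES}
example["malaria"] = 3
example["typhus"] = 0
index = disease_prevalence_index(example)
```

With nine diseases each coded from 0 to 3, the index can range from 0 to 27.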
6.1.7. What do linear estimates teach us?
Table 5 reveals that more prevalent human-to-human disease is associated with a lower-diffusion social network. It also teaches us that higher-diffusion social networks boost GDP per capita. As expected, disease prevalence lowers output. All of these results are qualitatively consistent with the model. The bigger question is whether social networks matter quantitatively.
| Dependent variable: | First stage: network index | First stage: network index | Second stage: GDP per capita | Second stage: GDP per capita |
|---|---|---|---|---|
| Instruments: | $\Delta$germ | $\Delta$germ and English | $\Delta$germ | $\Delta$germ and English |
| $\Delta$germ | -0.422*** | -0.448*** | | |
| | (0.123) | (0.118) | | |
| germs_sum | -0.503*** | -0.424*** | -0.297* | -0.361*** |
| | (0.0706) | (0.0736) | (0.171) | (0.138) |
| English | | 1.446*** | | |
| | | (0.537) | | |
| Network index | | | 0.684** | 0.556** |
| | | | (0.307) | (0.236) |
| Constant | 1.973*** | 1.561*** | 9.849*** | 10.10*** |
| | (0.299) | (0.325) | (0.672) | (0.549) |
| Observations | 71 | 71 | 71 | 71 |
| F-stat | 31.12 | 25.07 | 38.17 | 41.09 |
| $R^2$ | 0.478 | 0.529 | 0.517 | 0.551 |
| Sargan-p | | | | 0.520 |
Notes: Columns report the |$\beta_2$| coefficient from an IV estimation of log|$(Y/L)=\beta_1 + \beta_2 \tilde{N} + \epsilon$|. |$\tilde{N}$| is estimated from a first-stage regression |$Network\ Index = \beta_3 + \beta_4 \Delta germs + \epsilon_2 $|. The Network Index is defined in equation (4). The instrument |$\Delta$|germ is defined in equation (7). English, which is 1 if the country is English-speaking, allows for a Sargan test for validity of the |$\Delta germ$| instrument. Standard errors are in parentheses. *** |$p<$|0.01, ** |$p<$|0.05, * |$p<$|0.1.
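The two-stage procedure in these notes can be illustrated with a minimal sketch for a single endogenous regressor and a single instrument. The data are synthetic and the function names are hypothetical. Unlike the specification in the table, the sketch omits the germs_sum control and the English instrument.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance (the normalisation cancels in the ratios below)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z."""
    # First stage: x = b3 + b4 * z + e2
    b4 = cov(z, x) / cov(z, z)
    b3 = mean(x) - b4 * mean(z)
    x_hat = [b3 + b4 * zi for zi in z]
    # Second stage: y = b1 + b2 * x_hat + e
    b2 = cov(x_hat, y) / cov(x_hat, x_hat)
    b1 = mean(y) - b2 * mean(x_hat)
    return b1, b2


# Synthetic data: u shifts both x and y, so OLS of y on x is biased,
# while z moves x but is uncorrelated with u by construction.
z = [-1.0, -1.0, 1.0, 1.0]
u = [-1.0, 1.0, -1.0, 1.0]
x = [2.0 * zi + ui for zi, ui in zip(z, u)]
y = [10.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]

b1, b2 = two_stage_least_squares(y, x, z)
ols_slope = cov(x, y) / cov(x, x)
```

In this constructed example the true slope is 0.5. The two-stage estimate recovers it, while the OLS slope is pulled away from it by the common shock `u`. The standard errors from a naive second stage would need correction, which the sketch does not attempt.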
The magnitude of the coefficient on the network index in Table 5 implies that a one standard deviation increase in the network index (1.45 units) increases log output per worker by |$1.45 \times 0.684=0.99$|, which represents a 99% increase in real GDP per capita. Like the correlations from both the data and the model, that is a large effect. These large estimates suggest that social network structures might be relevant for macroeconomists and that policy makers might want to craft policies to alter these networks to promote growth.
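The magnitude calculation above can be reproduced in a few lines of Python. The two inputs are the figures quoted in the text. The paragraph reads the resulting log difference directly as a percentage. As an additional reference point that is not in the original, the sketch also reports the exact ratio of levels implied by a log difference of that size.

```python
import math

SD_NETWORK_INDEX = 1.45   # one standard deviation of the index, from the text
IV_COEFFICIENT = 0.684    # second-stage coefficient, from the text

# Change in log output per worker for a one standard deviation increase.
log_change = SD_NETWORK_INDEX * IV_COEFFICIENT

# Exact ratio of output levels implied by that log difference.
level_ratio = math.exp(log_change)
```

The log approximation and the exact ratio diverge at changes of this size, so the choice of convention matters when the effect is expressed in levels.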
However, Table 6 suggests a more subtle message that echoes the results of the model. The positive effect of social networks only appears for the low-disease prevalence countries. For countries with high disease prevalence, the estimates of the effect of higher-diffusion networks on income are negative and insignificant. When we include other controls, such as an English-language dummy variable and total disease prevalence, the pattern of significant effects is robust only for low-disease countries. These results echo the finding of the model that changing social networks can be highly beneficial in some countries. However, installing a high-diffusion social network in countries where such social structures are maladapted shows no benefit at best and could be destructive at worst.
| Dependent variable: GDP per capita | High disease | Low disease |
|---|---|---|
| Network index | -0.371 | 0.882*** |
| | (1.682) | (0.327) |
| Constant | 7.455*** | 7.001** |
| | (1.438) | (2.881) |
| Observations | 34 | 26 |
Notes: Columns report the |$\beta_2$| coefficient from an IV estimation of log|$(Y/L)=\beta_1 + \beta_2 \tilde{N} + \epsilon$|. Variables and first-stage regression are described in Table 5. High (low) disease are countries with a disease score above 14 (below 12). Standard errors are in parentheses. *** |$p<$|0.01, ** |$p< $|0.05, * |$p< $|0.1.
6.1.8. Robustness
One reason the difference in the prevalence of human and zoonotic diseases might be an invalid instrument is that human diseases could be much more virulent, in which case the difference would predict severe disability and thus reduced output. Another question is whether social networks are simply a proxy for some other economic variable. Supplementary Appendix C explores both questions.
7. Conclusions
Measuring the effect of social network structure on the economic development of countries is a challenging task. Networks are difficult to measure and susceptible to problems with reverse causality. We use a theory of social network evolution to identify properties of social networks that can be matched with data and to select promising instrumental variables that can predict network structure. The theory predicts that societies with higher disease prevalence are more likely to adopt low-diffusion social networks. Such networks inhibit disease transmission, but they also inhibit idea transmission. This model reveals which social features should speed or slow diffusion. It also suggests that disease prevalence might be a useful instrument for a social network because it affects how social networks evolve.
Quantifying the model reveals that small initial differences in the epidemiological environment can give rise to large differences in network structure that persist. Over time, these persistent network differences can generate substantial divergence in technology diffusion and output. We find evidence of this social network effect in the data. Exploiting the differential mode of transmission of germs, we are able to identify a significant effect of social network structure on technology diffusion and income. Specifically, we find that a one standard deviation change in social network structure can increase the growth of output per worker by 1/2% per year.
More broadly, the article’s contribution is to offer a theory of the origins of social institutions, propose one way in which these institutions might interact with the macroeconomy and show how to quantify and test this relationship.
The editor in charge of this paper was Michele Tertilt.
Acknowledgments
We thank participants at the Minnesota Workshop in Macroeconomic Theory, NBER EF&G meetings, SED meetings, the Conference on the Economics of Interactions and Culture and Einaudi Institute, the Munich conference on Cultural Change and Economic Growth, SITE, NBER Macro across Time and Space, and NBER growth meetings and seminar participants at Bocconi, Brown, USC, Stanford, Chicago, Western Ontario, Minnesota, Penn State, George Washington, and NYU for their comments and suggestions. We thank Corey Fincher and Damian Murray for help with the pathogen data, Diego Comin, Pascaline Dupas, Chad Jones, and Marti Mestieri for useful comments, and Isaac Baley, Callum Jones, Hyunju Lee, David Low, Amanda Michaud, and Arnav Sood for invaluable research assistance. Laura Veldkamp thanks the Hoover Institution for its hospitality and financial support through the National Fellows Program.
Supplementary Data
Supplementary data are available at Review of Economic Studies online.
Footnotes
See, for example, Bisin and Verdier (2001), Tertilt (2005), Doepke and Tertilt (2009), Greenwood and Guner (2010), or Doepke and Zilibotti (2014) for a literature review.
See, for example, Chaney (2014) or Kelley et al. (2013).
See, for example, Tabellini (2010) and Algan and Cahuc (2007). Brock and Durlauf (2006) review work on social influence in macroeconomics but bemoan the lack of work that incorporates social network interactions.
Bloom et al. (2004) summarize this literature, which typically uses aggregate economic data and aggregate health data, as well as standard controls, and estimates that a one-year increase in life expectancy raises output by 4%. In contrast, other authors such as Weil (2007) use micro-evidence of health effects on individual output and then simulate the implied aggregate effect on Gross Domestic Product (GDP).
In economics, Foster and Rosenzweig (1995) spawned a branch of the growth and development literatures that focuses on the role of personal contact in technology diffusion; see Conley and Udry (2010) or Young (2009) for a review.
Animal behaviour researchers have long known that many species form social connections that depend on the group’s health status (Hamilton and Zuk, 1982). For primates in particular, mating strategies, group sizes, social avoidance, and barriers between groups are all influenced by the presence of socially transmissible pathogens (Loehle, 1995). Thornhill et al. (2010) document this effect in human societies. An alternative to the evolutionary approach would be to work with a network choice model. But equilibria in such models often do not exist, and when they do, they are typically not unique.
Another logical specification for the social change shock is to link it to disease, so that someone who gets sick dies and then is reborn with, potentially, a different type. The problem with this formulation of the model is that epidemics prompt rapid social change. Since this is counterfactual, our model makes social change independent of disease.
What if technology diffusion is not a process “formally akin to the spread of an infectious disease” (Arrow, 1969)? Suppose instead that a technology is adopted only when a person comes in contact with multiple other adopters. This idea is called “complex contagion.” While Centola and Macy (2007) demonstrate that collectives can theoretically facilitate technology adoption, they admit, “We know of no empirical studies that have directly tested the need for wide bridges [collectives] in the spread of complex contagions.” In other words, the theoretical possibility lacks empirical support. In contrast, the idea that technology is adopted when information about the success of the technology arrives from a single social contact is a well-documented phenomenon (see e.g. Foster and Rosenzweig, 1995; Munshi, 2004; Conley and Udry, 2010).
Epidemiological models with an infinite number of agents often have a second steady state with a positive infection rate. For example, in Goenka and Liu (2015), any positive fraction of infected agents is still an infinite number of infections. Since the probability that none of the infinite infections is passed on to another node is zero, the infinite-agent model never reaches extinction. Our model predicts extinction because we have a finite number of agents.
This fraction should be small enough to leave room for endogenous diffusion of individualists. But at the same time, it needs to be large enough so that the individualists survive initial disease prevalence.
From Mali et al. (2008), in 2006 malaria cases were 1,564 and from the U.S. Census (U.S. Census Bureau, 2006), the population was 294 million. This gives a malaria prevalence of 0.0005%.
We chose Ghana because this was one of the highest disease prevalences in our data. According to the Ghana Health Service (Ghana Health Service, 2010), in 2006 there were 3.9 million malaria cases in Ghana. With a population of 21.9 million in the same year, this gives a malaria prevalence of 18%.
We calculate this half-diffusion from the Greenwood et al. (2005) dataset by averaging the number of years from introduction until a 50% adoption rate for 13 of their 15 technologies. We exclude the vacuum and washer because their adoption rates were more than 30% in the year they first appeared in the data.
Of course, IBM employees are not representative of country residents as a whole. Subsequent studies involving commercial airline pilots and students (23 countries), civil service managers (14 countries), and consumers (15 countries) have validated Hofstede’s results. Supplementary Appendix C describes these additional studies, as well as some of Hofstede’s questions. The Supplementary Appendix also summarizes sociological theories that link these questions to network structure. Finally, it documents studies that map out partial social networks for small geographic areas. Taken together, these network mapping studies reveal that highly collectivist countries, according to Hofstede, have a higher average prevalence of network collectives.
We use the variable ethnic, which asks the country of origin of ancestors, to associate individuals with different countries.
See, for example, Smith et al. (2007) or Thornhill et al. (2010). Also, Birchenall (2014) and Murray and Schaller (2010) argue that human-to-human transmitted diseases have a disproportionate effect on the pattern of social relationships.
When |$var(d_h) = var(d_z)$|, the difference |$(d_h - d_z)$| is orthogonal to the sum |$(d_h + d_z)$|.
We do not need to know all the determinants of social structure. Rather, any subset of the determining variables can serve as a valid instrument for |$\tilde{N}$|. Similarly, we do not need to observe |$\tilde{N}$| exactly. A proxy variable with random measurement noise is sufficient for an unbiased instrumental variables estimate of the coefficient |$\beta_2$|. To ensure that this approach is also model consistent, Supplementary Appendix B.5 verifies that in the model, this disease difference is a valid instrument. Supplementary Appendix C.9 discusses some additional endogeneity concerns and the ways in which we address them.
The Murray and Schaller (2010) data are based primarily on epidemiological maps provided in Rodenwaldt and Jusatz (1961) and Simmons et al. (1945), and originally collected by the Medical Intelligence Division of the United States Army. The one exception is tuberculosis, which comes from the National Geographic Society’s (2005) NatGeo Atlas of the World. For each region, they coded the prevalence of tuberculosis on a three-point scale: |$1=3-49$|, |$2=50-99$|, |$3=100$| or more per 100,000 people.
References
GHS HEADQUARTERS (
NATIONAL GEOGRAPHIC SOCIETY (
U.S. CENSUS BUREAU (