Assessing Human Genome-wide Variation in the Massim Region of Papua New Guinea and Implications for the Kula Trading Tradition

Abstract The Massim, a cultural region that includes the southeastern tip of mainland Papua New Guinea (PNG) and nearby PNG offshore islands, is renowned for a trading network called Kula, in which different valuable items circulate in different directions among some of the islands. Although the Massim has been a focus of anthropological investigation since the pioneering work of Malinowski in 1922, the genetic background of its inhabitants remains relatively unexplored. To characterize the Massim genomically, we generated genome-wide SNP data from 192 individuals from 15 groups spanning the entire region. Analyzing these together with comparative data, we found that all Massim individuals have variable Papuan-related (indigenous) and Austronesian-related (arriving ∼3,000 years ago) ancestries. Individuals from Rossel Island in southern Massim, speaking an isolate Papuan language, have the highest amount of a distinct Papuan ancestry. We also investigated the recent contact via sharing of identical by descent (IBD) genomic segments and found that Austronesian-related IBD tracts are widely distributed geographically, but Papuan-related tracts are shared exclusively between the PNG mainland and Massim, and between the Bismarck and Solomon Archipelagoes. Moreover, the Kula-practicing groups of the Massim show higher IBD sharing among themselves than do groups that do not participate in Kula. This higher sharing predates the formation of Kula, suggesting that extensive contact between these groups since the Austronesian settlement may have facilitated the formation of Kula. Our study provides the first comprehensive genome-wide assessment of Massim inhabitants and new insights into the fascinating Kula system.


Introduction
New Guinea has a long history of human occupation and harbors extensive ethnolinguistic diversity. Modern humans first colonized Sahul (the connected Australia-New Guinea land mass) at least 47 thousand years ago (kya), and perhaps as long ago as 65 kya (Roberts et al. 1990;Summerhayes et al. 2010;O'Connell and Allen 2015;Clarkson et al. 2017), while ∼3 kya, the Austronesian expansion/settlement brought a second wave of human migration into the region (Bellwood 2007;Kirch 2017). This long-term isolation of human occupation and different episodes of human migration promoted diverse regional culture developments, including an independent invention of farming in the New Guinea highlands at least 4-5 kya (Denham 2004;Shaw, Field, et al. 2020) and the Lapita culture, associated with the expansion of Austronesians from Near Oceania into the Pacific (Spriggs 1995;Kirch 2017).
Geographically, the Massim encompasses the southeastern tip of mainland Papua New Guinea (PNG) and nearby offshore islands (fig. 1A) and is considered by anthropologists to be a culturally defined region (Seligmann 1910;Shaw 2019). The Massim region is well known for the Kula ring tradition (or Kula), a network trading system in which two types of unique necklaces circulate among the islands in opposite directions, facilitating inter-island social and economic relationships (Malinowski 1922;Ziegler 2017;Irwin et al. 2018;Kuehling 2021). The Kula connects the islands of the northern and the western Massim (including the eastern tip of Mainland PNG) and also includes Misima, the most northern island of the southern Massim, while other islands of the southern Massim are not involved in the Kula ( fig. 1A). Due to seasonal wind conditions, traders on Kula voyages often spend long periods of time on different islands (Malinowski 1922;Irwin et al. 2018), which might MBE further facilitate interactions, including genetic (Schiefenhövel 2004), between the groups participating in Kula.
Since the initial description of Kula by Malinowski in 1922(Malinowski 1922, the region has attracted the attention of anthropologists, archeologists, and linguists, and chronologies of Massim have been proposed (Shaw 2016;Shaw, Coxe, Kewibu, et al. 2020). The earliest evidence for human occupation of the Massim is dated to 17.3 kya, on Paneati Island in the Louisiades Archipelago ; undated obsidian and stone artifacts that are similar to those dated to the Late Pleistocene or Early Holocene elsewhere in New Guinea further support human occupation around this time, when lowered sea levels connected much of the Massim to the PNG mainland ; Shaw 2017). After settlement by Lapita people ∼2.5-3 kya, the northern and southern Massim exhibited separate cultural developments, until ∼500-200 years ago with the formation of Kula (Shaw 2016). Linguistically, almost all Massim inhabitants speak languages belonging to the Austronesian language family, except for a few groups living on mainland PNG who speak Papuan languages belonging to the Trans-New Guinea language family (Eberhard et al. 2021), and the Rossel Islanders on the most eastern tip of the southern Massim. Rossel Islanders speak a Papuan language that has been classified either as a language isolate (Levinson and Majid 2013;Stebbins et al. 2017) or as belonging to the Yele-West New Britain language family, linking Rossel linguistically with the Bismarck Archipelago (Ross 2005;Eberhard et al. 2021).
While the Massim region has been well studied by archeologists, anthropologists, and linguists, genetic variation in the Massim remains largely unexplored, even though it occupies a key position in connecting the northern and southern coasts of New Guinea as well as connecting New Guinea with the neighboring Solomon Islands and other regions of Oceania. The most comprehensive human genetic study of the Massim to date analyzed mitochondrial DNA (mtDNA) and Y-chromosome variation, and found regional genetic-geographic population structure for mtDNA but not for the Y-chromosome (van Oven et al. 2014). This was interpreted as a potential signature of the Kula, as the travel between islands to perform the trading of the goods is mostly mediated by males, which could reduce interisland genetic differences for the Y-chromosome but not for mtDNA. However, studies of genome-wide data provide much richer information concerning admixture and population history, as shown by two recent genome-wide studies of PNG (Bergström et al. 2017;Brucato et al. 2021), but genome-wide studies of the Massim are lacking as of yet. Here, we report the results of comprehensive genome-wide analyses of the Massim region with resulting insights into the genetic history of Oceania in general and implications for the Kula tradition in particular.

Overall Genetic Variation of Austronesian and Papuan Ancestries in the Massim
We generated genome-wide single nucleotide polymorphism (SNP) array data on the Illumina Infinium Multi-Ethnic Global Array (MEGA; ∼1.5 million SNPs) for 192 individuals from 15 groups spanning the entire Massim region including northern, southern, and western Massim as well as the mainland parts from Collingwood Bay ( fig. 1A). We merged our data with published array and whole-genome sequencing (WGS) data encompassing East Asia and Oceania and additionally included Africans and Europeans as more distant comparative groups ( fig.  1A; supplementary fig. S1 and table S1, Supplementary Material online). To obtain an overview of the genetic variation and genetic-geographic population structure revealed with our compiled genomic data set, we first performed principal component analysis (PCA). In a PCA plot focusing on the East Asian and Oceanian groups ( fig. 1B; supplementary fig. S2, Supplementary Material online), we observed a striking cline with East Asians at one pole and PNG highlanders at the other; the Massim groups fall in between together with groups from the Central Province of PNG, the Bismarck Archipelago (in short, Bismarcks), and the Solomon Archipelago (i.e., Bougainville and the Solomon Islands; in short, Solomons). Groups from the PNG lowlands are placed close to PNG highlanders, except for the Central group, which comprises more Austronesian speakers. Bellona, Renell, and Tikopia are Polynesian outliers in the Solomons, that is, groups that migrated back to the Solomons from Polynesia, which explains their placement further toward East Asians than any other Near Oceanian group studied. The northern Massim groups are closer to the East Asian pole, while the southern Massim group from Rossel is closest to the PNG highlander pole, and other southern Massim as well as all western Massim and Collingwood Bay groups fall in between. Notably, there is a division in the PCA within southern Massim, with Rossel being closest to the PNG mainland and highland groups and Sudest, which geographically is located next to Rossel, being close to Rossel, while the other southern Massim groups fall together with northern and western Massim groups and those from Collingwood Bay.
Since the PCA suggests mixed East Asian-Papuan ancestry in the Massim groups, we explored this further in an ADMIXTURE analysis. At K = 2, one component ( . 1B), namely that Massim groups harbor both Austronesian and Papuan ancestry with different proportions depending on their locations within the Massim region. As before, northern Massim groups have more Austronesian ancestry while Rossel Islanders from southern Massim have more Papuan ancestry. As in the PCA plot, the Massim groups are distributed along a linear cline between PNG highlanders with high amounts of Papuan ancestry at one pole, and Polynesian outliers in the Solomons with high amounts of Austronesian ancestry at the other pole. Moreover, the shape of the f3 plot suggests a single major admixture episode rather than continuous gene flow or a tree-like history (Bergstrom et al. 2020).
In addition to the drift/allele-sharing analyses, a haplotype-based method (GLOBETROTTER) also suggests a single pulse of admixture for all the Massim groups except for Rossel (supplementary fig. S5, Supplementary Material online); Rossel exhibits an unclear admixture signal probably due to the low amount of Austronesian ancestry. Moreover, it is likely that the incoming population was already admixed (supplementary fig. S5, Supplementary Material online), that is, had both Austronesian and Papuan ancestry. Consistently, the percentage of modeled Austronesian-related ancestry is highest in the northern Massim groups (∼42%), and the Papuan-related ancestry is highest in the Rossel group from southern Massim (∼73%; supplementary table S2, Supplementary Material online). GLOBETROTTER also infers admixture dates of ∼1-3 kya for the Massim groups, and similar dates are inferred with another method, MALDER (supplementary fig. S6, Supplementary Material online); additionally, MALDER does not find evidence for more than one admixture event. The inferred dates are not correlated with the amount of Austronesian ancestry (supplementary fig. S7, Supplementary Material online), suggesting that the higher amounts of Austronesian ancestry in some Massim groups cannot be explained by a longer period of contact, but instead must reflect other social circumstances.  The blue and pink components detected at K = 8 (fig. 2) overlap largely, but not completely, with the results for the Oceanian groups observed at K = 2 (supplementary fig. S3, Supplementary Material online). All of the groups exhibiting any of the four components newly detected with K = 8 (light green, black, dark green, and peach) in the Oceanians have more Papuan than Austronesian ancestry detected at K = 2 (supplementary fig. S3, Supplementary Material online) except for the Polynesian outliers (peach), which suggests substantial variation in Papuan ancestry as well as a drifted Austronesian ancestry in the Polynesian outliers. Similar results were obtained with a more strict threshold for kinship quality control, suggesting that this observed pattern is not due to cryptically related individuals (supplementary figs. S8B and S9, Supplementary Material online). Hence, we further investigated the Papuan and Austronesian ancestry-specific variation in the Massim.
We first calculated f4-statistics of the form f4(Test groups, East Asian Austronesians; Chimbu/Bougainville/Rossel, Southern Highlanders) to test for different affinities of each test group (all other Oceanian groups) with Chimbu, Bougainville, or Rossel-related ancestry when compared with Southern Highlanders. The results of this f4 statistic will be influenced by both differences in Papuan-related ancestry and different amounts of Austronesian-related ancestry in the test groups. Therefore, to control for the latter, we plotted this f4 statistic against an f4 statistic of the form f4(Test groups, Southern Highlanders; East Asian Austronesians, Australians). This f4 statistic was previously shown mathematically to provide a measure of Austronesian-related ancestry in the test groups (Lipson et al. 2020). Groups exhibiting significant deviations from the regression line in each plot would indicate differential Papuan ancestries in those groups. The results for the com- Material online), again several PNG groups exhibit stronger affinities with Southern Highlanders, along with the Polynesian Outliers from the Solomons. Interestingly, there is variation in genomic ancestry among the Massim groups: southern Massim groups share more ancestry with Rossel; Collingwood Bay groups (from mainland PNG) share more ancestry with the Southern Highlanders; and the remaining Massim groups do not share excess ancestry with either Rossell or Southern Highlanders.
We then carried out a haplotype-based method of analysis (ChromoPainter) in which we restricted Massim groups to be recipients and not donors (with the exception of Rossel, in order to have a source of the putative distinct Papuan ancestry in the southern Massim). The results (supplementary fig. S11, Supplementary Material online) are consistent with the f4 analyses and strongly support the existence of more than one distinct Papuan-related ancestry in the Massim groups (one related to Rossel, and at least one more related to other Papuan groups).
Switching Australians to non-Austronesian East Asians in the f4(Test groups, Southern Highlanders; East Asian Austronesians, Australians) further confirmed that the MBE non-Papuan ancestry in the Massim groups is associated with Austronesian ancestry (supplementary fig. S10D-E, Supplementary Material online). We next used f4-statistics to investigate variation in Austronesian-related ancestry in the Massim groups and found that Massim groups are equally related to the most differentiated Austronesian-related ancestries, East Asian Austronesians versus Polynesian outliers (supplementary fig. S12, Supplementary Material online), suggesting no significant differences in the source(s) of Austronesian-related ancestry among the Massim groups.
To study the ancestry-specific variation we detected in the Massim in more detail, we applied local ancestry inference via RFMix and extracted Austronesian and Papuan ancestry-specific segments. The amount of Austronesian-related ancestry inferred by RFMix is slightly lower than that inferred by the ADMIXTURE and GLOBETROTTER analyses, while similar proportions of Papuan ancestries are inferred by all three methods (supplementary table S2, Supplementary Material online). This suggests that the Papuan ancestry-specific segments are identified with high confidence, but there is more uncertainty in inferring Austronesian-related ancestry segments, probably due to a poorer proxy for this source. A PCA based on Papuan ancestry-specific segments shows three poles, consisting of the Bismarck and Solomon groups, Rossel, and Papuan highlanders ( The Bismarck and Solomon groups are separated from Rossel and Papuan highlanders at PC1, suggesting the Papuan ancestry in Rossel is closer to Papuan highlanders than to the Papuan ancestry of Bismarck and Solomon groups. This is further supported by Papuan ancestry-specific ADMIXTURE and TreeMix results (supplementary figs. S15 and S16, Supplementary Material online). In contrast to the Papuan ancestryspecific PCA, in the Austronesian ancestry-specific PCA all of the Oceanian groups fall together in a cluster quite distinct from East Asian groups (

Recent Contact Involving Massim Groups and Implications for the Kula Tradition
To investigate in more detail genetic contacts involving the Massim groups in the recent past, we analyzed sharing   In the range of 5-10 cM, roughly corresponding to ∼675 ya, the strong connection between mainland PNG and the offshore islands in Massim is no longer there, but the connection between northern and southern Massim remains. In the range of over 10 cM or roughly 225 ya, strong sharing is only seen within some parts of Austronesian ancestry-specific IBD blocks; and (C ) shared Papuan ancestry-specific IBD blocks. Identified IBD blocks are in the range of 1-5, 5-10 cM, and over 10 cM, with the mean number of shared IBD blocks equal to 0.5 or more. The group points are colored according to region. The heat plot segments are proportional to the average of the summed IBD length (cM).

MBE
the Papuan highlands, northern Massim, and southern Massim, respectively. To further test the idea that the extensive IBD sharing in the range of 1-5 cM reflects the Austronesian settlement, we extracted and analyzed Austronesian ancestry-specific and Papuan ancestryspecific IBD segments. The results show that in the range of 1-5 cM, Austronesian ancestry-specific IBD segments account for most of the sharing between regions, while Papuan ancestry-specific IBD sharing shows a clear distinction between the PNG mainland and Massim groups, versus the Bismarcks and Solomons ( fig. 4B and C; supplementary figs. S19 and S20, Supplementary Material online). In particular, the strong sharing between many Massim groups and the Central Province group in the ChromoPainter analysis (supplementary fig. S11, Supplementary Material online) can be attributed to Austronesian-related ancestry, as this accounts for the IBD sharing between them (supplementary figs. S18 and S19, Supplementary Material online). And, the closer relationship in Papuan-related ancestry between the PNG mainland and Massim groups, versus either of these and the Bismarck/Solomon groups, is consistent with the PCA/ADMIXTURE/TreeMix based on Papuan-related ancestry ( fig. 3A; supplementary figs. S15 and S16, Supplementary Material online).
Finally, to investigate the potential relationship between the Kula tradition and genomic variation, we compared the IBD sharing between Kula-practicing versus non-Kula-practicing groups (in short, Kula vs. non-Kula). We measured IBD similarities among Kula/non-Kula taking the differences in IBD sharing within groups into account by dividing the mean IBD sharing between groups by the mean IBD sharing within groups. A plot of IBD similarity versus geographic distance, for different lengths of IBD segments (corresponding to different time periods) shows that Kula groups exhibit higher IBD similarities than non-Kula groups separated by the same geographic distance (supplementary fig. S21, Supplementary Material online). To test for statistical significance and further controlling for background sharing through time by additionally including groups outside Massim, we then computed a relative IBD similarity statistic that takes into account the overall IBD sharing among all Oceanians for different IBD segment size bins. We find that relative IBD similarity statistics are significantly positive for all Massim groups (i.e., both Kula and non-Kula groups), indicating that the Massim region overall shows evidence of more contact among groups than does the rest of Oceania. Moreover, Kula groups show constantly and significantly higher relative IBD similarities than non-Kula groups ( fig. 5; supplementary fig. S22, Supplementary Material online), even before the formation of the Kula ring tradition ∼500 ya (Shaw 2016;Irwin et al. 2018).

Discussion
Our analyses of genome-wide data from an extensive sampling of individuals from across the Massim, together with data from other regions, indicate that all Massim groups share both Austronesian-related and Papuan-related ancestry, in agreement with the previous study of uniparental markers (van Oven et al. 2014). However, we also find important regional distinctions within the Massim with respect to various aspects of these ancestries.
First, there is more Austronesian-related ancestry in the northern Massim (average based on ADMIXTURE of 52%; supplementary fig. S3 and table S2, Supplementary Material online) than in the other regions (average of 34%, with Rossel having the lowest amount, 20%). In general, there is also more Austronesian-related ancestry for   3B; supplementary figs. S12 and S17, Supplementary Material online). These results therefore suggest differences in the contact relationships between the indigenous Papuan groups and the incoming Austronesians in the northern Massim versus elsewhere in the Massim, with Rossel being the least-impacted by the Austronesians, possibly due to its more remote location. It is also notable that while all of the Massim groups studied (with the exception of Rossel and 5 individuals from groups on mainland PNG) speak Austronesian languages, Austronesian ancestry is in the minority in most of the groups, and reaches a maximum of 52% in the northern Massim. Second, the Papuan-related ancestry of Papuan-speaking Rossel Islanders is distinct from Papuan-related ancestries identified previously and elsewhere in Near Oceania. Previous studies of genome-wide data have shown a major distinction between Papuan ancestries in the PNG Highlands versus Bougainville (Pugach et al. 2018), and in the western versus eastern regions of the PNG Highlands (Bergström et al. 2017;Brucato et al. 2021). We find support for both of these distinctions, as well as for a distinct Papuan-related ancestry on Rossel (figs. 2 and 3; supplementary figs. S10C and S15, Supplementary Material online) that is also present in lower amounts in other Massim groups, especially from the southern Massim. In addition to this Rossel-related Papuan ancestry, Massim groups (except Rossel) also have, at frequencies of 1-63%, another Papuan-related ancestry that is at the highest frequency in the western PNG Highlands and the southern PNG lowlands ( fig. 2;  supplementary fig. S15, Supplementary Material online). Whether this ancestry predates, is associated with, or postdates the arrival of the Austronesians cannot be determined from our data. The existence of a distinct Papuan-related ancestry on Rossel could reflect initial colonization by people with this ancestry, subsequent isolation and genetic drift, or both. Further attesting to its isolation is the fact that the unique Papuan-related (non-Austronesian) language, Yélî Dnye, is spoken on Rossel; Yélî Dnye has been variously classified as either a language isolate (Levinson and Majid 2013;Stebbins et al. 2017) or possibly related to Anêm and Ata, two languages of West New Britain in the Bismarck Archipelago (Ross 2005;Eberhard et al. 2021). Moreover, pottery was introduced relatively late on Rossel, around 500-550 years ago, compared with around 2.8 kya elsewhere in the southern Massim . The very high amount of sharing of IBD segments within Rossel (supplementary fig. S15, Supplementary Material online) further supports isolation and genetic drift as responsible for the development of the distinct Papuan-related ancestry on Rossel. We further note that the genetic isolation of Rossel in comparison to other Massim groups was not as apparent in a previous study of mtDNA and Y-chromosome variation in these same samples (van Oven et al. 2014), attesting to the value of genome-wide data for studies of human population history.
Third, there is a striking contrast in patterns of sharing of IBD segments of Austronesian versus Papuan ancestry between groups. There is extensive sharing of short IBD segments (1-5 cM) across the studied Massim region ( fig. 4A; supplementary fig. S18, Supplementary Material online), and Austronesian segments are widely shared among the PNG lowland and island groups but not with the highlands (fig. 4B), in keeping with previous observations of a lack of Austronesian-associated ancestry in the PNG highlands (Stoneking et al. 1990;Bergström et al. 2017;Pugach et al. 2018;Brucato et al. 2021). However, sharing of Papuan segments is strictly either among mainland PNG and Massim groups or among groups from the Bismarcks and Solomons; remarkably, there is no sharing of Papuan-related IBD segments between any mainland PNG or Massim group and any group from the Bismarcks or Solomons (fig. 4C). The Papuan ancestryspecific PCA/ADMIXTURE/TreeMix results also suggest a closer relationship between the mainland PNG and Massim groups compared with the Bismarcks and Solomons ( fig. 3A; supplementary figs. S15 and S16, Supplementary Material online). The extensive sharing of Austronesian-related IBD segments of 1-5 cM, which roughly corresponds to a time of ∼2.7 kya (Al-Asadi et al. 2019), is in keeping with the large impact of the Austronesian expansion, which spread rapidly across the lowland and island regions of Near Oceania shortly after its arrival around 3 kya (Spriggs 1995;Shaw 2016;Kirch 2017). The distinct geographic pattern in the sharing of Papuan-related ancestry-which, to the best of our knowledge, has not been noted previously-must have a different explanation; it cannot reflect the spread of people with both Austronesian and Papuan ancestry as then there should not be a distinction between patterns of IBD sharing for Austronesian-related versus Papuan-related segments (although we cannot rule out that the spread of Austronesian-related ancestry included a small amount of Papuan-related ancestry). We speculate that the arrival of the Austronesians may have impacted the indigenous Papuan societies, as documented in a recent archeological study (Shaw et al. 2022), resulting in enhanced movement of Papuans within these two geographic regions MBE (mainland/Massim, and Bismarck/Solomon Archipelagos). However, IBD sharing of segments of 1-5 cM reflects a time span of a few thousand years, and so the results in figure 4C could also reflect the movement of people prior to the arrival of the Austronesians, or movement unrelated to the arrival of the Austronesians (Brucato et al. 2021). For example, the flooding of the shallow continental shelf east of New Guinea, following the Last Glacial Maximum, appears to have led to depopulation of the current islands of the Massim region, with subsequent recolonization after ∼5 kya ; perhaps the IBD sharing results reflect these movements.
Finally, we found increased sharing of IBD segments between Massim groups practicing Kula versus Massim groups that do not participate in this specific trading exchange ritual in the Massim ( fig. 5; supplementary figs. S21 and S22, Supplementary Material online). Kula, made famous by Malinowski in his classic work Argonauts of the Western Pacific (1922), is an example of "… gift exchange with delayed reciprocity. Two kinds of objects are passed between a chain of partners in a large maritime region (the 'Massim'), providing strong networks of support, a competitive element between the participants and the thrill of adventure" (Kuehling 2021). The objects in question consist of decorated shell armbands (mwali) and decorated shell necklaces (bagi or souvlava) which travel in opposite directions through the participating islands (Malinowski 1922;Kuehling 2021); long-distance travel between islands is largely restricted to Kula-related voyages, leading to the expectation that Kula would have an impact on patterns of gene flow between islands. Indeed, a previous study of mtDNA and Y-chromosome variation in the same samples studied here found less genetic structure between islands for the Y-chromosome than for mtDNA, suggesting a potential impact of male-mediated Kula voyages (van Oven et al. 2014). The increased sharing of IBD segments for Kula versus non-Kula groups in the Massim is further evidence for the impact of this cultural practice on patterns of gene flow. However, we note that the increased sharing of IBD segments for Kula versus non-Kula groups occurs across all size classes of IBD segments ( fig. 5), and thus has an approximate associated time depth of a few thousand years (Ralph and Coop 2013;Al-Asadi et al. 2019), whereas archeological evidence suggests a time depth for Kula of ∼500 years (Shaw 2016;Irwin et al. 2018), as well as the existence of other interisland exchange networks that preceded the formation of Kula (Shaw 2016;Shaw and Langley 2017) and from which Kula may have developed. Thus, both archeological and genetic evidence suggest that Kula should be viewed as arising out of a previous history of enhanced contact among the islands involved, rather than necessarily introducing novel avenues of contact that did not exist before. We observed a decline in the relative amount of IBD sharing with longer IBD segments ( fig. 5; supplementary figs. S21 and S22, Supplementary Material online) and no apparent change in IBD sharing at the time of Kula ring formation. This may reflect either that the development of Kula was not associated with any increase in the intensity of contact between participating islands, or a lack of power to detect any such increase due to the nature of IBD sharing being restricted to fewer individuals in more recent times (Ralph and Coop 2013;Al-Asadi et al. 2019). Nonetheless, we identified some strong sharing of long IBD segments within northern and southern Massim, suggesting extensive regional interactions in the last few hundred years.
In conclusion, our results concerning the Massim region of PNG fill an important lacuna in genetic studies of Near Oceania. Austronesian-related ancestry varies across the region but overall is in the minority, even though Austronesian languages are in the majority. We demonstrate the existence of a distinct Papuan-related ancestry that is associated with Rossel Island and probably arose as a consequence of its isolation. In addition to the expected signal of a rapid and widespread dispersion of Austronesian-related ancestry across coastal and island Near Oceania, we also found an unexpected signal of substantial movement of people with Papuan-related ancestry that was geographically restricted, occurring exclusively between mainland PNG and the Massim region, and between the Bismarck and Solomon Archipelagoes. We speculate that these movements of people with Papuan-related ancestry may reflect a disruptive impact of the arrival of Austronesian people. Finally, we document the effect of a cultural trait, the Kula, on relative amounts of IBD sharing among participating versus nonparticipating groups; the Kula thus joins other examples of cultural traits, such as residence pattern (Seielstad et al. 1998;Oota et al. 2001) and social stratification (Bamshad et al. 1998), that can influence human genetic diversity. . Written informed consent was obtained from each donor, after the project was explained and all questions answered to the satisfaction of the donor. Genetic work within this study was additionally approved by the Ethics Commission of the University of Leipzig Medical Faculty. MtDNA and Y-chromosome data were published in a previous study (van Oven et al. 2014). Here, we generated genome-wide data (∼1.6 million SNPs) for 255 (192 from Massim, 33 from Gulf Province, and 30 from Central Province) individuals on the Illumina Infinium Multi-Ethnic Global Array (MEGA); genotyping was carried out by the Laboratorio de Servicios Genómicos at the Laboratorio Nacional de Genómica para la Biodiversidad, Irapuato, Mexico, on our request. We first merged our newly generated data with published data from Papuan populations on the MBE same SNP array (Bergström et al. 2017), and then with published WGS data from Oceanians, East Asians, Europeans, and Africans (Mallick et al. 2016;Vernot et al. 2016;Choin et al. 2021). For the WGS data, we obtained the jointly re-called genotypes from Choin et al. (2021). To avoid batch effects between the array and WGS data, we extracted the overlapping sites (including both monomorphic and polymorphic sites) and removed sites whose reference versus alternative alleles were inconsistent between the array and WGS data. For sites with more than two alleles, we first flipped the WGS data and then removed those flipped sites that still had more than two alleles. Merging was done using PLINK v1.9 (Purcell et al. 2007). For quality control, we first excluded sites with more than 5% missing data in the entire data set and sites with more than 50% missing data and/or Hardy-Weinberg equilibrium P values below 0.00005 within a population group (except for groups with only one individual). Then, we removed individuals with more than 5% missing data or with parents speaking different languages or coming from different locations. We also filtered out individuals to exclude up to first-degree kinship pairs. Data missingness and Hardy-Weinberg equilibrium were calculated using PLINK v1.9 while the individuals to be removed to avoid first-degree relatedness (kinship coefficient ≥0.177) were inferred using KING (Manichaikul et al. 2010), as implemented in PLINK v2 (Chang et al. 2015). We used KING to remove second-degree relatedness (kinship coefficient ≥0.0884) to further confirm that the components seen in ADMIXTURE results were not affected by cryptic kinship. There are 776 individuals and 1,408,767 SNPs remaining after quality control (supplementary table S1, Supplementary Material online indicates the individuals that were removed and why).

Population Structure Analyses
For population structure analyses, variants were pruned beforehand for linkage disequilibrium using PLINK v1.9, excluding one variant from pairs with r 2 > 0.4 within windows of 200 variants and a step size of 25 variants. There were 366,193 SNPs left after pruning. PCA was done with smartpca v16,000 (Patterson et al. 2006). Two individuals (papuan6278328 and papuan6278360) from Bergström et al. (2017) were identified as PCA outliers and excluded from this study. We then ran ADMIXTURE v1.3.0 (Alexander et al. 2009) for K = 2 to K = 15 with 100 replicates for each K with random seeds. We used pong v1.4.7 (Behr et al. 2016) to visualize the 20 ADMIXTURE replicates with the highest likelihoods for the major mode at each K. We also plotted K = 2 (to visualize the variation of East Asian-and Papuan-related ancestries) and K = 8

Data Phasing
We used consHap (Al Bkhetan et al. 2019) to obtain a consensus of phasing from the results of SHAPEIT v2 (Delaneau et al. 2013), BEAGLE v5.1 (Browning, Zhou, et al. 2018), and EAGLE v2 (Loh et al. 2016) using the genetic map from the 1000 Genomes Project Phase3 (Genomes Project Consortium et al. 2015). In general, phasing accuracy can be increased by increasing the number of iterations and conditioning states on which haplotype estimation is based (Browning and Browning 2011).

MBE
We therefore ran SHAPEIT v2 with options -burn 10,-prune 10, and -main 30 for iteration number with 500 conditioning states, leaving other parameters as default; BEAGLE v5.1 with options burnin = 12, iterations = 24, phase-states = 24, and ne = 800,000 (this is smaller than the default value, but is recommended by the authors for populations with a smaller effective population size, as expected for island populations in Oceania); EAGLE v2 with -Kpbwt 40,000 (which determines the number of conditioning haplotypes). We did not use a reference panel for phasing as 1,000 Genomes or other available WGS data sets lack representative cohorts for Oceanian populations.

IBD Analyses
We identified shared IBD blocks between each pair of individuals and homozygous-by-descent (HBD; same as ROH) blocks within each individual using refinedIBD (Browning and Browning 2013). Both identified IBD and HBD blocks are considered as IBD blocks in our analyses, which is analogous to pairwise shared coalescence (PSC) segments in a previous study (Al-Asadi et al. 2019). The IBD blocks within a 0.6 cM gap were merged using the program merge-ibd-segments from BEAGLE utilities (https:// faculty.washington.edu/browning/refined-ibd), allowing only one inconsistent genotype between the gap and block regions. We used IBD blocks at least 2 cM in length shared by individuals within a population to investigate the demography of each population group. Then, we used IBD blocks in 1-5, 5-10 cM, and over 10 cM to investigate the sharing between individuals from different populations for different time periods (Ralph and Coop 2013;Al-Asadi et al. 2019). For network visualization of the sharing between populations, the pairs with average sharing ≥ 0.5 (i.e., on average at least half of the pairs share IBD blocks) were kept to reduce noise and false positives. We summarize the patterns of shared IBD length within a population by averaging over all comparisons between individuals, that is, we define the average of summed IBD length L as where n is the number of individuals in population X and ibd(X i, X j ) is the length of IBD shared between individuals X i, and X j . For the number of blocks, an analogous equation applies. Similarly, for sharing between two populations X, Y, we define the average of summed IBD length L as where m is the number of individuals in population Y. In order to compare IBD sharing between Massim groups (Kula versus non-Kula), we use a statistic motivated by FST (Bhatia et al. 2013). In particular, we define the similarity statistic S as S (X, Y) = 2 L (X, Y) L (X) + L (Y) This statistic will be zero if there is no IBD sharing between X and Y (since L(X, Y ) is zero), and it will be one if IBD sharing is independent of population structure, that is, L(X ) = L(Y ) = L(X, Y ). For our analysis, we compute a matrix of pairwise S-statistics for all pairs of populations. In order to test whether there is more recent gene flow within the Kula/non-Kula group relative to the background, we use the relative similarity statistic R as where S within and S all are the average pairwise S-statistics within Kula/non-Kula populations and all Oceanian populations, respectively. If the IBD sharing among the Kula/ non-Kula groups and among all Oceanians were equal then R will be zero, and the more shared migration the Kula/non-Kula groups have, the larger R will be. We calculate the standard error of the R-statistic by jackknife resampling the chromosomes. The significance of the R-statistic between Kula-versus non-Kula groups was estimated by comparing the observed value to the distribution of simulated values generated by randomly assigning Massim groups to either Kula-or non-Kula groups and calculating the difference in relative IBD similarity. Scripts made for analyzing IBD results are available from https://github.com/dangliu/Massim_project.

ChromoPainter and GLOBETROTTER Analyses
To study haplotyple sharing, ChromoPainter v2 (Lawson et al. 2012) was run on the phased data set with sample sizes for each group randomly down-sampled to 5 (all individuals were used for groups with size <5). We began with 10 iterations of the EM (expectation maximization) process to estimate the switch rate and global mutation probability, using chromosomes 1, 5, 10, 15, and 20. With the estimated switch and global mutation rates, we ran the chromosomal painting process for all chromosomes, which then gave the output for downstream analyses. We first attempted to paint the Massim chromosomes using them only as recipients and all of the other individuals as both donors and recipients. The EM estimation of switch rate and global mutation probability were ∼142.27 and ∼0.0002, respectively, which were then used as the starting values for these parameters for all donors in the painting process. To investigate differential affinities of Massim groups to PNG highlanders versus Rossel, we also performed another run using the Massim (except for Rossel) only as recipients and all of the other individuals (including Rossel) as both donors and recipients. The EM estimation of switch rate and global mutation MBE probability for this analysis were ∼142.21 and ∼0.0002, respectively.
To investigate the admixture of Austronesian-and Papuan-related ancestries, GLOBETROTTER (Hellenthal et al. 2014) was run on the ChromoPainter output with all Massim as recipients and East Asians (including both Austronesian and non-Austronesians to increase power) and Papuan Highlanders (who showed no more than 0.0001 East Asian-related ancestry in the ADMIXTURE results for K = 2) as surrogates. We first tested the certainty and potential waves of admixture events, and then estimated the major and minor sources as well as the dates of admixture. The Rossel group showed an unclear signal for admixture inference, while a single pulse of admixture was inferred for the other groups. The distributions of admixture dates were estimated via 100 bootstraps.

MALDER Admixture Dating
We used MALDER (https://github.com/joepickrell/malder; last accessed on May 25, 2022) with default settings, using East Asian Austronesian and Southern Highland groups as reference sources, to date the time of admixture of Austronesian and Papuan ancestries in Massim groups.
Local Ancestry Inference and Ancestry-Specific Analyses MBE code EGAS00001006010. Scripts written/modified for ancestry-specific analyses and IBD analyses are available from https://github.com/dangliu/Massim_project.