Sam Fields, Camille Lyans Cole, Catherine Oei, Annie T Chen, Using named entity recognition and network analysis to distinguish personal networks from the social milieu in nineteenth-century Ottoman–Iraqi personal diaries, Digital Scholarship in the Humanities, Volume 38, Issue 1, April 2023, Pages 66–86, https://doi.org/10.1093/llc/fqac047
Abstract
The diaries of Joseph Mathia Svoboda capture over 40 years of trade on the Tigris, describing his daily life and regular journeys as a steamboat purser during the late nineteenth and early twentieth centuries, specifically between the cities of Basra and Baghdad. They offer a unique perspective on daily life, community structure, and social relations. However, with over 600 pages of transcribed material and many more diaries still in the process of being transcribed, it is difficult to track patterns and changes in Joseph Svoboda’s social relationships and daily life by way of reading and inference alone. This article employs natural language processing (NLP) and network analysis to facilitate study of Svoboda’s social interactions, as well as his observations of his broader social milieu. Inspection of the networks and accompanying visualizations showed that Svoboda’s close interactions were primarily with kin, but his position as a steamship purser gave him a unique vantage point to encounter a wide range of persons of diverse backgrounds. Additionally, decomposing networks by time illustrated how significant life events facilitated change in social interactions.
1 Introduction
According to the official Ottoman count, in 1898/1899 the population of Baghdad was 80% Muslim, 18% Jewish, and 2% Christian (Bağdad vilayet-i celilesine mahsus salname, 1316 (1898/1899)). The rest of the Ottoman Empire was similarly multi-ethnic and multi-religious, though the specific population makeup varied from province to province and city to city. But what did numbers like that really mean for people’s lives? While popular narratives about the Middle East have perpetuated stereotypes about the region as characterized by ancient and unchanging enmities, scholars have long been interested in how people lived together—or not—in mixed cities like turn-of-the-century Baghdad.
The historiography of intercommunal relations in the Ottoman Empire has two main threads. On the one hand, scholars have shown how members of different communities embraced an explicitly multi-communal Ottoman imperial identity, especially after the second Constitutional Revolution of 1908 (Bashkin, 2010; Campos, 2011; Makdisi, 2019). On the other hand, historians have traced the emergence of sectarianism as a recognizable modern phenomenon, beginning in the 1860s but continuing in the early years of the twentieth century and beyond (Makdisi, 2000; Der Matossian, 2014; Bet-Shlimon, 2020). For all their differences, both of these historiographies are largely political in nature, focusing on often-dramatic events and public discourses.
It has been much more difficult to explore how intercommunal relations affected the contours of daily life. Many memoirs, written after the often-forced cultural homogenization of the twentieth century, are tinged with nostalgia for a remembered pre-sectarian/pre-nationalist utopia (Campos, 2021, p. 166). And, as Julia Clancy-Smith has pointed out, sources dealing with daily life tend to be fragmentary, with street conflict and communal quarrels generally much better documented than mundane social exchanges (Clancy-Smith, 2011, p. 57). Clancy-Smith’s work on nineteenth-century Tunis works around these constraints to address how Tunisians lived in close proximity to and interacted with ‘others’ – in this case, mostly Southern European immigrants. In another approach, Michelle Campos uses a GIS analysis of census data from Ottoman Jerusalem to show that neighborhoods were simultaneously mixed and deeply segregated, with shared spaces facilitating certain kinds of cross-communal social interactions and relationships (Campos, 2021).
One reason histories of daily life in the Ottoman Empire are so scarce is the lack of sources like diaries which historians of Europe and North America have used to reconstruct the ordinary and day-to-day. This is not universally true: scholars have used the ‘almanacs’ of Said Bey in Istanbul (Dumont, 1993) and the notebooks of Na‘um Bakhkhash in Aleppo (Masters, 2001; Grehan, 2019) to outline the daily economic and cultural practices of their authors. But for the most part, we do not know in any detail how people lived in mixed cities like Ottoman Baghdad, or what forms of community and difference actually structured daily life. Social interactions were not shaped only by the communal identities that dominate the literature; they involved ‘people who were also neighbors, workers, competitors, or illicit lovers’ (Clancy-Smith, 2011, p. 11). Who did people interact with on a daily basis? What kind of interactions were they? Where did they happen?
To begin to answer these questions, this article turns to the diaries of Joseph Mathia Svoboda, an Austro-Hungarian subject and native of Baghdad who worked as a steamship purser on the Tigris River. Forty-eight of Svoboda’s original sixty-one diaries, spanning the years 1865–1908, are extant, and they contain exceptionally detailed daily accounts of the shipping business, local and international political developments, and his interactions with friends, family members, officials, and others. Particularly because of their consistency and level of detail, the Svoboda diaries offer the kind of insight into quotidian social relationships missing from many other regional sources.
The same features which make the diaries so valuable—namely, their consistency, longevity, and detail—can make working with them unwieldy. To address the analytical challenges posed by such an extensive corpus, this article proposes an approach that combines close and distant reading. Specifically, we combine named entity recognition (NER) with social network analysis (SNA) to characterize the persons that Joseph mentions over time in a subset of his diaries.
1.1 NER, network analysis, and distant reading
Our combination of close- and distant-reading techniques draws on earlier work by both historians and literary scholars faced with large bodies of work like the Svoboda diaries. Distant reading is an approach that focuses on the analysis of large-scale, quantitative characteristics of corpora (Elwert, 2016), while close reading is a more classical humanities approach to the detail of selected works. After a backlash against early proponents of distant reading, recent work has argued for an ‘oscillation’ between close and distant or machine reading (Elford, 2016; Hesselberth et al., 2018).
Distant reading is not a single technique, but can refer to any ‘method of literary research and interpretation that draw[s] upon computational analysis to move beyond the human limitations of vision, memory, and attention’ (Houston, cited in Elford, 2016, p. 199). Here, we employ natural language processing (NLP) techniques to automatically identify people mentioned in the diaries, and then combine these techniques with SNA to visualize and analyze the relationships between those people. Previous work has also combined NER and SNA to study textual corpora, from the date-books of the Italian president (Tebaldi et al., 2019) to the correspondence of the Portuguese Empire (Błoch et al., 2020).
NER, an NLP technique, refers to the automatic identification of named entities, defined as persons, organizations, place names, and numeric expressions such as dates or amounts, in texts (Nadeau and Sekine, 2007; Desmet and Hoste, 2014). From the persons that NER extracts from the diaries, we render networks, which we use as a starting point for an iterative process of close-reading the diaries. SNA is a method used to study relationships among individuals, groups, organizations, and other social units (Günay, 2012).
Given our interest in everyday interactions between persons of various backgrounds, relevant applications include the use of SNA to study contention between neighboring villages (Barkey and Van Rossem, 1997), as well as the use of narrative network analysis to study the evolution of political discussions and roles of actors in public life (Lansdall-Welfare et al., 2017; Padgett et al., 2020). As the current dataset is derived from a single source, we also draw upon research using quantitative analyses of literary works, such as Prado et al.’s temporal analysis of Alice in Wonderland and La Chanson de Roland (Prado et al., 2016). In network analysis of literary works, connections between actors can be inferred in a variety of ways, including through speech acts, characters who participate in the same event, and characters who appear in the same location (Kubis, 2021). Community detection algorithms can then be used to study the interactions and relationships between the characters (Zhang et al., 2021). In the corpus that we examine, the diary’s natural division into daily entries made NER, combined with network analysis and subcommunity detection, a promising approach.
1.2 Data and aims
Joseph Svoboda, who wrote the diaries which form the base of our study, worked as a steamship purser for the Euphrates and Tigris Steam Navigation Company (ETSN), a British firm that operated ships between Baghdad and Basra on the Tigris River beginning in the 1860s (Fig. 1). ETSN, which was owned by the trading company Lynch Brothers, played a central role in the complex political and economic relationship between Britain and the Ottoman Empire in Iraq (Cole, 2018), as the mutual precarity of steamship navigation and of the British-Ottoman relationship reinforced one another (Cole, 2016).

Fig. 1. Map of cities along the Tigris and Euphrates rivers, rendered in QGIS. Basemap: Natural Earth II with Shaded Relief.
In addition to his unique personal perspective on the political, economic, and environmental histories of British steamshipping on the Tigris, Svoboda recorded observations about Ottoman steamships (Cole, 2021), environmental conditions, and local political developments. Above all, he wrote about people: from the family and friends he saw regularly; to co-workers and members of his community; wealthy and important grandees, officials, and merchants; steamship passengers and laborers; and everyone in between. In this article, we focus on the social aspects of Svoboda’s writing and life.
While the corpus of the Svoboda diaries spans more than 40 years, just three diaries (47, 48, and 49) have been fully digitized and published. We developed a pipeline that involves extraction of person mentions from the text of these three diaries using NER and then subsequent visualization using SNA and other techniques.
We use network graphs to study Svoboda’s daily life and interactions in two ways. First, we investigate the role of family and kinship in his life by visualizing kinship within the social network and characterizing the prominence of kin interactions through network metrics. The history of the family is complicated everywhere, but is particularly fraught in the historiography of the Middle East. Often, the Middle Eastern (or Muslim, or Arab) family has served as an explanation for the persistence of ‘traditional’ society in the region (Moumtaz, 2019). This one-dimensional vision of the social significance of family is belied by the variety of family and household forms that have existed historically across the region, and even within the confines of a single city (Doumani, 2003). Recently, scholars have used legal sources to show how people in the Ottoman Middle East used practices like inheritance to constitute families in different ways (Doumani, 2017). Moreover, because of the relational nature of family as something embedded within broader social relations, scholars in other contexts have argued for the utility of SNA as a way to conceptualize the meaning and extent of kinship relations (Wetherell et al., 1994).
In asking about kinship, we are approaching the network with pre-existing ideas about what might be important. In contrast, our second mode of analysis employs a community detection algorithm, which identifies community structures that may exist in complex systems (Chaudhary and Singh, 2020). Communities may be considered collections of nodes (in this case, persons) that have closer connections to one another than to nodes outside of their own community. We employ a modularity optimization algorithm, which optimizes a function that compares the density of edges within communities to the density of edges between communities (Blondel et al., 2008). Other historians have worked with modularity optimization algorithms to, for example, detect distinct communities of news and information within the media landscape of early-modern Europe (Ryan, 2019). In this case, we employ a subcommunity detection algorithm to characterize the persons that Joseph describes around him as he goes about his daily life. We take the groups generated by the algorithm back to the diaries, moving between close and distant reading to better understand Svoboda’s social relations, and, more broadly, the social milieu of late Ottoman Iraq.
2 Corpus Construction and Annotation
Between the 1860s and 1908, Joseph kept sixty-one diaries. Today, nineteen diaries are available in the form of scanned, digital images of the original diary pages, and twenty-nine are available in the form of scanned, digital images of typed diary transcriptions. In this article, we focus on Diaries 47–49, for which transcriptions of the scanned, digital images are available, and which chronicle Svoboda’s life between November 1897 and October 1899. Table 1 presents an overview of the corpus.
Table 1. Overview of the corpus.

Diary | Number of entries | Total number of tokens | Mean tokens per entry | Minimum tokens per entry | Maximum tokens per entry | Tokens per entry: first quartile | Tokens per entry: median | Tokens per entry: third quartile |
---|---|---|---|---|---|---|---|---|
47 | 273 | 51,503 | 188.66 | 54 | 630 | 117 | 173 | 247 |
48 | 210 | 48,555 | 231.21 | 46 | 1,055 | 138 | 205.5 | 284.25 |
49 | 226 | 46,297 | 204.85 | 59 | 661 | 134.75 | 189 | 254.25 |
The methodical manner in which Joseph Svoboda dated his diary entries allowed us to decompose the text of his diaries into a collection of dated entries. This segmentation in turn allowed us to identify people who appeared together in the same dated entry, which became the basis for defining a relationship between entities in our rendered social networks.
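The segmentation step can be illustrated with a minimal Python sketch. The article does not show the diaries’ date format, so the pattern below (an ISO-style date alone on a line) and the sample text are invented for illustration; only the idea of splitting on date headers comes from the description above.

```python
import re

# ASSUMPTION: each entry starts with a date alone on a line, e.g. "1898-08-03".
# The real diaries' date format is not given in the article.
DATE_HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2})[ \t]*$", re.MULTILINE)


def split_into_entries(text):
    """Return {date: entry_text} for every dated entry found in `text`."""
    # With one capturing group, re.split yields
    # [preamble, date1, body1, date2, body2, ...].
    parts = DATE_HEADER.split(text)
    entries = {}
    for date, body in zip(parts[1::2], parts[2::2]):
        entries[date] = body.strip()
    return entries


# Invented sample text, not a quotation from the diaries.
sample = """1898-08-03
Invented entry text mentioning Eliza and Alexander.
1898-08-04
Invented entry text mentioning Jeboory Asfar.
"""

entries = split_into_entries(sample)
for date, body in entries.items():
    print(date, "->", body)
```

Keying the result by date keeps each entry addressable later, when persons found in the same entry are linked to one another.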
To perform NER, it is necessary to have a training set, or a corpus in which the set of entities that one wants to identify is known. Two annotators manually annotated all of the persons mentioned in Diaries 47 and 48. A salient characteristic of this corpus is that Joseph often identifies individuals not only by name, but by descriptors that may designate racial, ethnic, or religious background, occupation (‘inspector’, ‘merchant’, ‘waly’), or relationship (‘niece’, ‘cousin’) to him or others. In some instances, Joseph omits the name entirely and includes only the descriptor. To enable richer descriptions of persons, annotators captured all possible descriptors pertaining to a referenced individual as a single entity in their annotations. We also captured individuals referenced by other non-named means, like ethnicity or occupation, if the entities were capitalized. In this analysis, our annotation scheme only included a single class, ‘Person’, and we used the BILUO tagging schema employed by spaCy (Data Formats · Spacy API Documentation, 2022).
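To make the tagging scheme concrete, the following sketch assigns BILUO tags (Begin, Inside, Last, Unit, Outside) to a tokenized sentence in plain Python. It does not call spaCy; the function name, the sample sentence, and the token-index span format are assumptions made for illustration. The example treats a name together with its descriptor as one entity, as the annotators did.

```python
def biluo_tags(tokens, spans, label="Person"):
    """Tag tokens with the BILUO scheme.

    tokens: list of token strings.
    spans:  list of (start, end) token indices, end exclusive, one per entity.
    """
    tags = ["O"] * len(tokens)  # O = outside any entity
    for start, end in spans:
        if end - start == 1:
            tags[start] = f"U-{label}"  # U = single-token (unit) entity
        else:
            tags[start] = f"B-{label}"  # B = first token of the entity
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"  # I = inner token
            tags[end - 1] = f"L-{label}"  # L = last token of the entity
    return tags


# Invented sentence; 'Terooza wife of Antone' is annotated as a single entity.
tokens = ["Met", "Terooza", "wife", "of", "Antone", "and", "Eliza"]
spans = [(1, 5), (6, 7)]
tags = biluo_tags(tokens, spans)
for token, tag in zip(tokens, tags):
    print(f"{token:10s}{tag}")
```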
We identified all disagreements in annotations, such as an annotator’s omission of an entity or a difference in the span of an entity that both annotators identified. These disagreements were resolved according to our annotation guidelines and their resolutions included within our annotated collection of Diary 47. The inter-annotator agreement on Diary 47 was 91%. We deemed this to be acceptable, and thus, the two annotators divided up the entries in Diary 48 and annotated those to serve as the remaining entries in our reference standard.
3 Identifying Person Mentions Using NER
To identify the persons mentioned in Joseph’s diaries, we began by employing our annotations from Diary 47 to train a model using the NER module for SpaCy, a popular NLP software library (Vychegzhanin and Kotelnikov, 2019) [v3.1.1]. We evaluated two different types of agreement: Identical and Overlap. The identical agreement type requires that the span of the entity identified by the model and the entity recorded in the annotation data set be entirely the same. An overlap agreement is more lenient. If any part of the guessed entity overlaps with the entity recorded in the annotation data set, it would be considered an agreement. For example, in Diary 48, ‘Mr. Felix Faure’ appears. Whereas the annotators marked only the name, ‘Mr. Felix Faure’, there were instances in which the model identified ‘Mr. Felix Faure President of the French Republic’ or some variant as the name. There were also times in which the span identified by the model was less complete than that identified by the human annotators. In either situation, this would be considered overlapping, but not identical, agreement.
We evaluated the model’s performance using precision, recall, and F-measure. Precision and recall are complementary measures of a model’s performance. Recall (R) is the proportion of the instances identified by the annotators that the system also identifies. Precision (P) is the proportion of the instances identified by the system that are correct. In this study, we employ an adjustment for precision suggested by Liu et al. (2015) to account for issues in calculating precision involving partial matches. F-measure is the harmonic mean of precision and recall (Batbaatar and Ryu, 2019). As such, the formulas we used were:

$$P = \frac{TP}{NS}, \qquad R = \frac{TP}{NG}, \qquad F = \frac{2PR}{P + R}$$

where TP = the number of entities correctly identified, NS = the number of entities identified by the system, and NG = the number of entities identified by the annotators.
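The two agreement types and the three metrics can be sketched as follows. This is an illustration, not the evaluation code used in the study: entities are represented as hypothetical (start, end) character offsets, and each system entity is allowed to match at most one annotated entity. That one-to-one rule is a simplifying assumption that keeps precision at or below 1; it is not necessarily the adjustment of Liu et al. (2015).

```python
def spans_overlap(a, b):
    """True if half-open spans a and b share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def count_matches(system, gold, mode):
    """Greedy one-to-one matching of system spans to gold spans."""
    unused = list(system)
    tp = 0
    for g in gold:
        for s in unused:
            hit = (s == g) if mode == "identical" else spans_overlap(s, g)
            if hit:
                unused.remove(s)  # each system span is used at most once
                tp += 1
                break
    return tp


def evaluate(system, gold, mode="identical"):
    """Return (precision, recall, F) with P = TP/NS and R = TP/NG."""
    tp = count_matches(system, gold, mode)
    precision = tp / len(system) if system else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Invented offsets: the first system span runs longer than the annotated one,
# as when a title is appended to a name; the last system span is spurious.
gold = [(0, 15), (40, 52), (80, 86)]
system = [(0, 47), (40, 52), (100, 104)]

identical = evaluate(system, gold, "identical")
overlap = evaluate(system, gold, "overlap")
print("identical:", identical)
print("overlap:  ", overlap)
```

On this toy input the overlap scores are higher than the identical scores, because the over-long first span counts as a match only under the lenient criterion.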
Performance metrics for NER are shown in Table 2. To estimate the performance of our algorithm on Diary 49, we annotated a random sample of 10% of the entries and calculated performance metrics on this set.
Table 2. Performance metrics for NER on Diaries 48 and 49.

Method | Agreement type | Diary 48 Prec. | Diary 48 Recall | Diary 48 F | Diary 49 Prec. | Diary 49 Recall | Diary 49 F |
---|---|---|---|---|---|---|---|
SpaCy (a) | Identical | 0.69 | 0.70 | 0.70 | 0.74 | 0.67 | 0.70 |
SpaCy (a) | Overlap | 0.86 | 0.87 | 0.86 | 0.91 | 0.82 | 0.86 |
Human-in-the-loop bootstrap | Identical | 0.68 | 0.77 | 0.72 | 0.79 | 0.79 | 0.79 |
Human-in-the-loop bootstrap | Overlap | 0.84 | 0.94 | 0.89 | 0.93 | 0.94 | 0.93 |

(a) We performed this method five times and averaged the performance results to obtain a more representative estimate of the algorithm’s performance.
We observed that some entities were being missed in the automated extraction, even though the same entities were being captured in other parts of the text; there were also entities that were identified that were not persons (e.g. ‘ghee’, ‘public debts’). We developed a human-in-the-loop method to address these issues. This method leverages entities that had previously been captured using NER on the diaries and uses these to extract entities from the diary in question.
Our human-in-the-loop pipeline enables the analyst to examine the results, make adjustments, and run the pipeline again until they are satisfied with the entities. Our pipeline performs named entity linking (NEL) concurrently with NER (Fig. 2). There are three main points at which analysts make determinations. In the first, the system identifies entities with three or more tokens separated by spaces, which sometimes suggests that two entities have been captured in one, or reflects an entity in which Joseph has included other descriptors of the person besides their name. The analyst can then specify appropriate shorter designations for these ‘long entities’, or suggest that they be broken up into two or more entities. The second interim point identifies terms that are similar to one another in terms of being transposable within a certain percentage of characters (‘close entities’) and prompts the analyst to indicate whether the terms should be resolved to the same unique entity. The third interim point involves identification of extracted entities that are consecutive and then presenting these to the analyst to confirm or reject combining the consecutive entities into a single entity.
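The first two review points can be sketched in Python as below. The article does not specify how similarity between ‘close entities’ is measured, so the use of `difflib.SequenceMatcher`, the 0.85 threshold, the function names, and the misspelled variant ‘Rufail Sayeg’ are all assumptions made for illustration. The sketch only produces candidates; the decision to split or merge stays with the analyst.

```python
from difflib import SequenceMatcher
from itertools import combinations


def long_entities(entities, min_tokens=3):
    """Entities with three or more space-separated tokens, flagged for review."""
    return [e for e in entities if len(e.split()) >= min_tokens]


def close_entities(entities, threshold=0.85):
    """Pairs of distinct entities whose character similarity meets the threshold.

    ASSUMPTION: similarity is SequenceMatcher's ratio on lowercased strings;
    the study's own measure is not described in detail.
    """
    pairs = []
    for a, b in combinations(sorted(set(entities)), 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs


extracted = [
    "Rufail Sayegh",
    "Rufail Sayeg",  # invented spelling variant
    "Terooza wife of Nassoory Andrea",
    "Jeboory Asfar",
    "Eliza",
]

to_shorten_or_split = long_entities(extracted)
to_confirm_as_same = close_entities(extracted)
print(to_shorten_or_split)
print(to_confirm_as_same)
```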

There are three files critical to this process: a key file containing aliases for persons, which is commonly employed in NER to disambiguate persons; a list of ‘non-specific’ referents, such as ‘waly’, ‘Jew’, and ‘carpenter’, which were used by Joseph to describe persons he mentions; and a list of words/phrases to exclude. These files enable resolution of acceptable variants to a single entity, entity aggregation, and exclusion as appropriate.
Aside from the three points involving close inspection by analysts, there is a rule-based component that involves aggregation rules, in which relational terms such as ‘wife’, ‘wives’, and ‘family’ were collapsed to generic identifiers, with a few exceptions (e.g. Joseph’s wife was resolved to ‘Eliza’). A few place and organization names that also happened to be people’s names were also excluded using these rules (examples shown in Fig. 2).
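A minimal sketch of how the three files and the aggregation rules might interact is given below. The dictionary contents reuse examples mentioned in the text, but the data structures, the function name, and the order in which the rules are applied are assumptions, not a description of the actual pipeline.

```python
# Hypothetical in-memory stand-ins for the three files described above.
ALIASES = {  # key file: variant -> canonical person
    "my wife": "Eliza",
    "rufail": "Rufail Sayegh",
}
GENERIC = {  # relational terms collapsed to generic identifiers
    "wife": "wife",
    "wives": "wife",  # ASSUMPTION: plural collapses to the singular node
    "family": "family",
}
EXCLUDE = {"ghee", "public debts"}  # extracted strings that are not persons


def resolve(mention):
    """Map a raw extracted mention to a node label, or None to drop it."""
    text = " ".join(mention.split())  # normalise internal whitespace
    lowered = text.lower()
    if lowered in EXCLUDE:
        return None
    if lowered in ALIASES:  # specific aliases take priority over generic rules
        return ALIASES[lowered]
    if lowered in GENERIC:
        return GENERIC[lowered]
    return text  # unknown mentions pass through unchanged


for raw in ["my wife", "wives", "ghee", "Jeboory  Asfar"]:
    print(repr(raw), "->", resolve(raw))
```

Checking aliases before generic rules is what lets ‘my wife’ resolve to a named person while a bare ‘wife’ stays a generic relational node.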
First names posed somewhat of a challenge in disambiguation. In the case of individuals who were mentioned more frequently, first names were usually mapped to the most frequently occurring individual with that first name. For example, ‘Rufail’ maps to ‘Rufail Sayegh’. If a given first name could plausibly map to more than one individual (i.e., two people of the same first name were equally prominent in the three texts), then the first name was left unresolved (e.g. ‘Yousif’). There were two individuals who shared the same first and last name that we know of, Joseph’s brother Alexander, and his son Alexander. These two were mapped to the same name, but almost all mentions are of the son. The mappings were reviewed by three of the authors along with close examination of the text.
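The first-name rule described above can be sketched as follows. The counts and the surnames ‘Alpha’ and ‘Beta’ are invented placeholders, and treating ‘equally prominent’ as an exact tie in counts is a simplifying assumption; in the study the mappings were also reviewed by hand.

```python
def first_name_map(full_name_counts):
    """Map each first name to its most frequent bearer, unless there is a tie.

    full_name_counts: {full name: number of mentions}.
    Returns {first name: full name}; ambiguous first names are left out.
    """
    by_first = {}
    for full, count in full_name_counts.items():
        by_first.setdefault(full.split()[0], []).append((count, full))

    mapping = {}
    for first, candidates in by_first.items():
        candidates.sort(reverse=True)  # most frequent first
        unique_leader = (
            len(candidates) == 1 or candidates[0][0] > candidates[1][0]
        )
        if unique_leader:
            mapping[first] = candidates[0][1]
    return mapping


# Invented counts; 'Alpha' and 'Beta' are placeholder surnames.
counts = {
    "Rufail Sayegh": 40,
    "Rufail Alpha": 3,
    "Yousif Alpha": 10,
    "Yousif Beta": 10,
}
mapping = first_name_map(counts)
print(mapping)
```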
We can see that this method improves recall over the original method using SpaCy alone on Diaries 48 and 49. In terms of entities that were identified through the human-in-the-loop method but not in the reference standard for Diary 48, the causes included inconsistent annotation of telegrams, which were often in other languages (the example in Fig. 2, where ‘Asfar’ and ‘Svoboda’ appear next to one another separated by a space, is due to formatting of the telegram), completely unseen entities, and different spellings of the same entities, suggesting that additional functionality to account for transliteration of names could be helpful.
We performed a retrospective analysis of coreference resolution by reviewing the named entities and their linkages for a random sample of 10% of the days in each diary and making corrections as needed to produce a key. We then employed the key and the system predictions with the CoVal package (Moosavi and Strube, 2016) to calculate common coreference evaluation metrics (Table 3). Performance on the two diaries in the evaluation was comparable for all metrics, with those for MUC being slightly higher, but MUC is known for being the least discriminative metric. In reviewing the errors, though there were a few missed mentions, errors in spans resulting in over-identification of entities were more common. For example, ‘Terooza wife of Nassoory Andrea’ was identified as ‘Terooza’ and ‘Nassoory Andrea’ rather than a single entity. One of the challenging aspects of working with this corpus was the extent to which Joseph used descriptors to identify persons, resulting in longer entities being detected as two separate entities.
Table 3. Coreference evaluation metrics for Diaries 48 and 49.

Metric | Diary 48 Prec. | Diary 48 Recall | Diary 48 F | Diary 49 Prec. | Diary 49 Recall | Diary 49 F |
---|---|---|---|---|---|---|
MUC | 0.97 | 0.97 | 0.97 | 0.93 | 0.97 | 0.95 |
B³ | 0.94 | 0.93 | 0.93 | 0.90 | 0.94 | 0.92 |
CEAF | 0.91 | 0.89 | 0.90 | 0.91 | 0.92 | 0.91 |
LEA | 0.92 | 0.92 | 0.92 | 0.89 | 0.93 | 0.91 |
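To give a sense of what one of these metrics measures, the sketch below computes B³ precision, recall, and F for a toy example. It is not the CoVal implementation used in the study, and it assumes that the key and the system output contain exactly the same mentions; the mention identifiers are invented.

```python
def b_cubed(key_clusters, response_clusters):
    """B-cubed precision, recall and F over clusters of mention ids.

    ASSUMPTION: both clusterings cover exactly the same set of mentions.
    """
    key_of = {m: set(c) for c in key_clusters for m in c}
    resp_of = {m: set(c) for c in response_clusters for m in c}
    if key_of.keys() != resp_of.keys():
        raise ValueError("this sketch assumes identical mention sets")

    precisions, recalls = [], []
    for mention in key_of:
        shared = len(key_of[mention] & resp_of[mention])
        precisions.append(shared / len(resp_of[mention]))
        recalls.append(shared / len(key_of[mention]))

    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Key: mentions m1-m3 are one person and m4-m5 another.
key = [{"m1", "m2", "m3"}, {"m4", "m5"}]
# System output wrongly attaches m3 to the second person.
response = [{"m1", "m2"}, {"m3", "m4", "m5"}]

p, r, f = b_cubed(key, response)
print(round(p, 3), round(r, 3), round(f, 3))
```

A single misplaced mention lowers both precision and recall, because it makes one predicted cluster impure and another incomplete.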
4 Exploring Joseph’s Social Relations
Having mapped the entities we identified in the diaries to a set of unique persons, we proceeded to explore patterns in Joseph’s social interactions. First, we noted the number of persons appearing in each diary and the frequency with which persons appeared, in terms of days mentioned (Table 4). About half of the persons Svoboda mentions in a given diary appear on only one day in that diary.
Table 4. Persons mentioned in each diary.

Diary | Dates | Persons | Persons appearing once | Persons per day (M/SD) |
---|---|---|---|---|
47 | 4 November 1897 to 3 August 1898 (∼9 months) | 474 | 251 (53.0%) | 7.9 (5.4) |
48 | 3 August 1898 to 28 February 1899 (∼7 months) | 348 | 165 (47.4%) | 7.8 (5.9) |
49 | 28 February 1899 to 11 October 1899 (∼7 months) | 380 | 183 (48.2%) | 7.2 (4.8) |
We also took note of kin, non-kin, and non-specific persons that Svoboda mentioned. For each named person, we determined if they were kin of Svoboda. We identified as kin all of Svoboda’s blood relations, as well as relations-by-marriage to one degree. So, for example, we counted Svoboda’s brother-in-law Antone Marine as kin, but Antone’s wife and children as non-kin. We generated horizontal bar plots illustrating Joseph’s most frequent mentions by filtering out individuals who appear on fewer than five days in the diaries. His most frequent social contacts are kin (orange bars in Fig. 3), and he mentions his wife Eliza and son Alexander much more frequently in Diaries 47 and 48 than in Diary 49. There are a few non-kin with whom he interacts relatively frequently (see Fig. 3), though the only non-kin individual who appears prominently in all three diaries is Jeboory Asfar.

Fig. 3. Frequency of prominent persons (left) and distribution of persons mentioned (right)
In addition, terms referring to ethnic and religious groups (e.g. ‘Jews’, ‘Christians’, and ‘Arabs’) appear frequently in the charts of prominent persons, particularly in Diary 49, in which Joseph seems to focus less on his family and devote more of the diary to describing the passengers on the ship. Relational terms such as ‘sister’, ‘wife’, and ‘son’ appear frequently in all three diaries. Joseph often describes persons in relational terms (e.g. ‘Terooza wife of Antone’), in which case the descriptor ‘wife’ can be mentioned separately from the name itself (and thus be detected as a separate entity), or Joseph may not mention the name at all (e.g. ‘wife of Antone’). The word ‘wife’ was used in various contexts: when Joseph referred to his own wife, when he referred to other people’s wives whom he saw socially, and in reference to women travelling by steamship. In this visualization as well as the network visualizations, references to ‘my wife’ were resolved to ‘Eliza’ (Joseph’s wife), while other references to ‘wife’ were generally kept as part of the generic relational node ‘wife’. As the horizontal bar plots only depict individuals who appear on five or more days in any given diary, for each of the diaries there is a much longer tail of individuals, primarily non-kin, whom he mentions in passing but does not elaborate upon at length. These individuals account for the most substantial portion of the charts showing the distribution of persons mentioned.
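The per-diary counts and the five-day filter behind the bar plots can be sketched as follows. The entries are invented toy data, the variable names are hypothetical, and the use of the sample standard deviation is an assumption, since the article does not say which form of SD was reported.

```python
from collections import Counter
from statistics import mean, stdev

# Invented toy data: one set of resolved persons per dated entry.
entry_persons = {
    "day 1": {"Eliza", "Alexander", "Jeboory Asfar"},
    "day 2": {"Eliza", "Alexander"},
    "day 3": {"Eliza", "Rufail Sayegh"},
}

# Number of days on which each person is mentioned.
days_mentioned = Counter(
    person for persons in entry_persons.values() for person in persons
)

n_persons = len(days_mentioned)
appearing_once = sorted(p for p, d in days_mentioned.items() if d == 1)
share_once = len(appearing_once) / n_persons

# Persons per day: mean and (sample) standard deviation.
per_day = [len(persons) for persons in entry_persons.values()]
mean_per_day, sd_per_day = mean(per_day), stdev(per_day)


def prominent(days, min_days=5):
    """Persons mentioned on at least `min_days` days (five in the article)."""
    return {p: d for p, d in days.items() if d >= min_days}


# The toy corpus has only three days, so a lower threshold is used here.
frequent = prominent(days_mentioned, min_days=2)
print(n_persons, appearing_once, f"{share_once:.0%}")
print(round(mean_per_day, 2), round(sd_per_day, 2), frequent)
```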
4.1 Using SNA to characterize social interactions
To characterize Joseph Svoboda’s interactions with others, we defined a network using persons that Svoboda mentioned in his diaries. The nodes of the network were the persons identified through NER; edges, or relations, between two nodes were considered to exist if both people appeared together in the same diary entry. We set the size of each node to the node’s frequency (the number of times persons appeared within the diary), and set the weight of each edge to the number of times two nodes or persons appeared together in the same diary entry. Being mentioned on the same day thus served as a proxy for a social connection, such that the more often two people were mentioned together, the stronger their social connection was presumed to be. When visualized, the edge weight might be considered the width of the edge.
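The network definition above can be illustrated with a minimal sketch. It assumes the NER output has already been resolved to one list of person names per diary entry; the function name, the sample entries, and the choice to count every mention toward node size are illustrative assumptions, not the project's actual code.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(entries):
    """Build node sizes and edge weights from diary entries.

    `entries` is an iterable of lists of person names, one list per
    diary entry (hypothetical input format).
    """
    node_size = Counter()    # how often each person is mentioned
    edge_weight = Counter()  # number of entries in which a pair co-occurs
    for persons in entries:
        node_size.update(persons)
        # Each unordered pair is counted once per entry, however many
        # times either name is repeated within that entry.
        for a, b in combinations(sorted(set(persons)), 2):
            edge_weight[(a, b)] += 1
    return node_size, edge_weight


# Invented sample entries, for illustration only.
entries = [
    ["Eliza", "Alexander", "Henry"],
    ["Eliza", "Alexander"],
    ["Jeboory Asfar", "Henry", "Henry"],
]
node_size, edge_weight = build_cooccurrence_network(entries)
```

Sorting each pair before counting keeps the edge undirected, so that (‘Alexander’, ‘Eliza’) and (‘Eliza’, ‘Alexander’) accumulate into the same weight.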
We visualized the networks using Gephi (Bastian et al., 2009), an open-source network analysis software, with the ForceAtlas2 layout (Jacomy et al., 2014), a network layout algorithm that leverages energy attraction and repulsion in the network to approximate relation strength, and thus, closer social interaction, between person nodes. After running ForceAtlas2, we employed the LabelAdjust algorithm, which adjusts the spacing between the nodes to make the labels (person names) more readable. This dampened the sense of apparent closeness, but was necessary for the purposes of readability. To identify subcommunities, or persons who might share similarities with one another, we employed the Louvain method (Blondel et al., 2008) and colorized the nodes by modularity grouping. Figure 4 shows a network visualization rendered using this method, for Diary 49.
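The Louvain method searches for a partition of the nodes that maximizes modularity, a score comparing the edge weight inside each group with what would be expected if edges were placed at random. The sketch below does not implement Louvain itself or reproduce Gephi's behavior; it only computes the modularity of a given partition, assuming the standard definition with a resolution of 1. The node labels and edges are invented.

```python
from collections import defaultdict


def modularity(edge_weight, community_of):
    """Weighted modularity Q of a partition of an undirected graph.

    edge_weight:  dict mapping (node_a, node_b) -> weight
    community_of: dict mapping node -> community label
    """
    m = sum(edge_weight.values())   # total edge weight
    internal = defaultdict(float)   # weight of edges inside each community
    degree = defaultdict(float)     # summed weighted degree per community
    for (a, b), w in edge_weight.items():
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            internal[community_of[a]] += w
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two invented triangles joined by a single bridging edge (c-d).
edges = {
    ("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
    ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
    ("c", "d"): 1,
}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in split}

q_split = modularity(edges, split)
q_single = modularity(edges, single)
```

Placing each triangle in its own group scores higher than lumping all six nodes together, which is the kind of grouping the colorized nodes are meant to reveal.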

To characterize the modularity groupings, we studied the groupings in conjunction with the texts. We saw that person nodes of the same modularity grouping were often clustered together in diary entries. Many of the nodes in the green grouping were persons in Svoboda’s social circle that resided in Baghdad, and many of the nodes in the blue grouping represented persons who resided in Basra (Fig. 5a and 5b, respectively).

These first two modularity groupings constituted the second and third largest constituent parts of Svoboda’s network, with the Baghdad constituent comprising 32.9% of his relations, and Basra, 17.6%. Another modularity group consisted largely of people and groups associated with Joseph’s work on the ship (41.1%, noted in purple). Most of the individuals in this group are passengers (Fig. 5c), but it also includes individuals whom Joseph encounters at intermediate destinations. Joseph’s brother Henry, who was employed on the second Lynch steamship and with whom he often exchanged letters while in transit, also appears in this group.
Aside from these three primary groupings, we see that in this network visualization, a few other groups can be discerned, reflecting diary subplots. For example, the orange group (7.9%) is based on diary entries involving the passage of a Persian Shahzada (prince) on the ship. From Joseph’s description, we learn some aspects of the social interaction, including who visits the Shahzada and the extent to which he and the persons accompanying him do and do not interact with the ship’s crew and passengers.
In addition to reflecting Joseph’s social interactions with individuals in different spaces and contexts, the modularity groupings offer some insight into how and where he encountered different social groups, perhaps pointing to the social role of steamships, in his life and in general. For example, ‘Jews’ appears as a relatively large node in the passenger modularity grouping, reflecting the fact that Joseph mainly interacted with Jewish people as passengers. The node is particularly large because of the importance of the annual Jewish pilgrimage to Azair for the steamship business. Because large numbers of Jewish passengers traveled on deck during the pilgrimage season, Joseph did not identify them individually, but only in terms of their group identity. He sometimes identifies Muslim pilgrims in the same way (e.g. ‘our passengers are mostly pilgrims returning from Mecca Persians & some Arabs’) but neither the hajj nor the Shi‘a pilgrimage to Najaf and Karbala was as important to the steamship business as the Jewish pilgrimage to Azair.
The prominence of the ‘Jews’ node in the ‘passengers’ modularity grouping, together with the fact that this group includes the largest proportion of individual Arab Muslims, speaks to the steamship as a space for some kinds of social mixing, albeit still a hierarchical one (Johnson, 2013, pp. 129–41; Reinhardt, 2018, pp. 148–78). While both the ‘Baghdad’ and ‘Basra’ groupings include some references, both generic and specific, to people who are neither European nor Christian, they are dominated by individuals in Svoboda’s immediate communities. In contrast, the ‘passengers’ grouping includes people across a spectrum of identities and origins, whether explicitly noted by Svoboda or not. However, we should not interpret this as an indication that people mixed freely across boundaries of class, gender, or ethnicity on the ships. Part of the reason the passengers appear connected is that Joseph frequently lists all passengers when they board the ship. Because we have used mentions on the same day as a proxy for connection, these people appear connected, whether they interacted or not. At the same time, Joseph’s diaries do describe a certain amount of mixing among people of different ethnicities, within the bounds of class. We can see this in his mentions of people who ‘mess’ or eat with the captain and ship’s officers, including Joseph himself (Fig. 5d).
On this journey, a mixed group of Ottomans, including people who would likely identify as Arabs today as well as officials of indeterminate origin—it is not entirely clear whether Joseph uses the word ‘Turk’ to indicate ethnicity, government employment, or some combination of the two—dined with Joseph and the ship’s European officers. It is not clear whether the final cabin passenger, Rezooki Serkis, who was a member of Joseph’s close community in Baghdad, dined with them or not. At other points in Diary 49, Joseph indicates that individuals including an accountant in the Ottoman Public Debt administration, a Jew named Meneshi Mathalon, the director of the Baghdad Customs House, and some American archaeologists dined with the ship’s officers. It is possible, but not visible in the diaries, that deck passengers likewise mixed across ethnic and religious lines, but within the bounds of class.
In addition to the extent of social mixing in Joseph’s life, his use of generic descriptors offers insight into how he viewed the world around him. He most often uses ethnic, religious, and national descriptors as adjectives to characterize specific people (‘German Consul’, ‘Persian Shahzada’, ‘Tilkefly girl’). Of these descriptors, he uses some to describe groups as well as individuals, and four (Arab, Christian, European, Jew) primarily to describe groups. While many of these descriptions are clear, Joseph’s use of words we would today consider straightforward ethnonyms, especially ‘Arab’ and ‘Turk’, is inconsistent. This may be a result of Joseph’s social positionality between European and local populations. While Europeans regularly used ‘Turk’ and ‘Arab’ as ethnic descriptors in the nineteenth century, these terms were much less common, and more fluid in meaning, among Ottoman authors (Herzog, 2002, pp. 323–6). For Joseph, the words ‘Turk’ and ‘Arab’ are not linguistically parallel. So, for example, he never identifies the wealthy Baghdadis and Basrawis who travel in first- or second-class as Arabs. Most of the groups he describes as ‘Arabs’ are either foreigners (‘Arabs of Bahrein and Hassa’) or local ‘tribes’ (‘the Beni Sudd Arabs’). This usage suggests a sort of conflation between a modern conception of ethnicity and an older Arabic-language usage to refer to rural and especially nomadic people. In contrast, Joseph most often uses ‘Turk’ or ‘Turkish’ to refer to individuals, and usually to Ottoman officials. In most cases, he seems to use ‘Turk’ largely as a category of nationality rather than ethnicity, parallel to how he uses ‘German’ or ‘American’. At the same time, it is not always clear why he chooses one descriptor over another. For example, in Diary 48, he identifies a passenger as, ‘a Mahomedan Mahmood Effendi Clerk in the Serai of Basreh.’ Why clarify that this person, an Ottoman official, is a Muslim?
Going forward, we plan to study Joseph’s linguistic usage more closely, to better characterize how his language reflects his views of society.
In considering the divisions between the modularity groupings, it is important to remember that the network reflects Joseph’s own personal experience and the structure of his diary. So while we have not rendered the network as an ego-network (i.e., Joseph does not appear in the network), the individuals and relationships it depicts are those Joseph both observed and recorded. There are a few ways we can observe this. For example, while Hamdi Pasha, the waly (governor) of Basra, appears in a different modularity grouping and has no recorded connection to either of the governors of Baghdad (Atta Allah Pasha and Namik Pasha), from another perspective Hamdi Pasha would likely appear closely connected to the other officials. But we can only see what Joseph saw. Joseph’s perspective likely plays a role in making the steamships appear as a space of social mixing to the extent they do. His work entailed recording every passenger who boarded, so more than anyone else, he interacted at least superficially with every person who came on board.
4.2 Considering shifts across diaries
Considering the network visualizations holistically, we see that there are three main modularity groupings that tend to be observed, consisting primarily of persons residing in Baghdad, persons residing in Basra, and passengers that Svoboda encountered en route. A composite network, composed of all three diaries, appears in Fig. 6. We observe that the Baghdad grouping (green) tends to have fewer nodes interspersed, which may make sense since his home was in Baghdad. Although Joseph spent substantial time in Basra, it might be most usefully understood as a stop, albeit the largest stop, on his journey back and forth between Baghdad and Basra, and thus his observations of his social circle there (blue) and the passengers (purple) were mixed together.

Composite network: Diaries 47–49. A few peripheral nodes are not shown.
Other than the three main modularity groupings, subcommunity detection using Louvain modularity on Diary 47 (Fig. 7) identified a modularity grouping in which Alexander appears prominently (which we will discuss further with Diary 48), as well as a few other small modularity groupings. As in Diary 49, these smaller groupings appear to be derived from diary entries that are relatively disconnected from his other entries. For example, the orange modularity grouping includes Joseph’s remarks about the replacement of the Waly of Basra and the passing of Seyd Selman Effendi. The fact that they are largely absent from the composite (Fig. 6) demonstrates how the combined network smooths over small or transient events, primarily reflecting the more persistent social groupings in Joseph’s life.

As we turn our attention to Diary 48, we observe that the Alexander node has become even more salient (Fig. 8). In much of the diaries, Joseph Svoboda writes about his frustrations with his son Alexander. In Diary 48, Alexander, who was living in Paris at the time, expresses the desire to stay in France and to marry a woman there. Though this also occurs in Diary 47, in Diary 48 Joseph expends a significant amount of effort writing telegrams to Alexander and to others, including Pere Pierre and Monseigneur Altmayer, who appear prominently around the Alexander node. Some persons who in the other network visualizations appear in the Baghdad or Basra modularity groupings have now been assigned to the group containing Alexander, accentuating Joseph’s efforts to appeal for help to individuals in several of his social circles.

We plotted the differences in the number of times each person was mentioned in the diaries to study the extent of variability (Fig. 9). In doing so, we observe that Alexander exhibited a markedly different pattern as compared to all other persons. The color sequence for Alexander, magenta-peach-blue, indicates that Alexander was mentioned least in Diary 49, moderately (for him) in Diary 47, and most frequently in Diary 48. This increase in mentions in Diary 48 can also be seen with Pere Pierre and telegrams (represented by ‘Svoboda’ as addressee), which were part of Joseph’s efforts to deal with the situation with Alexander. His mentions of persons related to his duties as a purser (e.g. ‘Passengers’, ‘Jews’) also reach their lowest points in Diary 48, as illustrated by the blue points on the left-hand side of the associated persons.

Shifts in the frequency of person mentions in Diaries 47–49. Shifts in frequency of five or fewer have been omitted.
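The filtering behind a plot of this kind can be sketched as follows. The article does not say exactly how the differences were computed, so the sketch assumes a person's shift is the gap between their highest and lowest per-diary mention counts. The function name and all counts are invented for illustration and are not taken from the diaries.

```python
def mention_shifts(freq_by_diary, min_shift=5):
    """Return per-diary counts for persons whose counts vary by more than min_shift.

    freq_by_diary: dict mapping diary label -> {person: mention count}
    """
    persons = set().union(*freq_by_diary.values())
    shifts = {}
    for person in persons:
        # A person absent from a diary is treated as having zero mentions.
        counts = {d: freqs.get(person, 0) for d, freqs in freq_by_diary.items()}
        spread = max(counts.values()) - min(counts.values())
        if spread > min_shift:  # shifts of five or fewer are omitted
            shifts[person] = counts
    return shifts


# Invented counts, for illustration only.
freq_by_diary = {
    "47": {"Alexander": 40, "Eliza": 30, "Henry": 22},
    "48": {"Alexander": 75, "Eliza": 28, "Henry": 20},
    "49": {"Alexander": 12, "Eliza": 27, "Henry": 24},
}
shifts = mention_shifts(freq_by_diary)
```

With these made-up numbers only ‘Alexander’ survives the filter, mirroring the observation that most connections stay relatively stable across diaries.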
These shifts illustrate that, as the drama surrounding Alexander increasingly troubles Joseph, his social interactions shift toward certain individuals and he expends a greater part of his mental effort and energy on the issue. In Diary 49, as communications with and about Alexander decrease, Joseph may have a bit more capacity for, and interest in, renewing his attention to the world around him. We then observe that other interactions increase in prominence in the diaries, such as mentions of local groups and political figures. Figure 9 also illustrates that over the period of these three diaries, even though Joseph interacts with and/or encounters a large number of people, the frequency of these interactions remains relatively consistent for all but a small number of his connections.
5 Considering Joseph’s Social Milieu with Hierarchical Clustering
Last, we consider how the diaries vary in terms of characteristics of the person network depicted. Frequency refers to the number of times that a person was mentioned in a given diary. In terms of network metrics, degree centrality refers to the number of relations that a node has, and betweenness centrality refers to the number of times that a node acts as a bridge on the shortest path between nodes (Yoo, 2021). These characteristics can provide additional insight into the different roles that people played in Joseph’s life.
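To make the two network metrics concrete, the sketch below computes degree and unnormalized betweenness for a small unweighted, undirected graph, using Brandes' algorithm for the shortest-path counting. It is an illustration only: the edges are invented, edge weights are ignored, and the values reported in the article may have been computed with different settings.

```python
from collections import defaultdict, deque


def degree_and_betweenness(edges):
    """Degree and unnormalized betweenness for an undirected, unweighted graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    adj = dict(neighbors)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    betweenness = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search, counting shortest paths from `source`.
        order, preds = [], defaultdict(list)
        n_paths = dict.fromkeys(adj, 0)
        n_paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate dependencies, visiting the farthest nodes first.
        dependency = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                betweenness[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return degree, {v: b / 2 for v, b in betweenness.items()}


# Invented edges, for illustration only.
edges = [("Eliza", "Alexander"), ("Alexander", "Henry"),
         ("Henry", "Passengers"), ("Henry", "Jews")]
degree, betweenness = degree_and_betweenness(edges)
```

In this toy graph ‘Henry’ has both the highest degree and the highest betweenness, because every shortest path between the ship-related nodes and the family nodes passes through him.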
To better understand these roles, we performed hierarchical clustering, an exploratory data analysis technique that facilitates discovery of groups of similar objects, which is often used to visualize biomedical data (Škuta et al., 2014). The result of the clustered data is often displayed as a matrix visualization, or clustered heatmap, in which the colors of each cell correspond to its value, and a dendrogram, or tree structure, showing the relationships between the clusters. Rows/columns that are more similar to one another are ordered closer to each other. We clustered the persons that Joseph mentions in all three diaries by these three variables, using standardized Euclidean distance as the measure of similarity. We performed the clustering and rendered the results using the matplotlib and seaborn packages.
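As a rough illustration of the similarity measure, the sketch below computes the standardized Euclidean distance between persons described by (frequency, degree, betweenness) triples, dividing each squared difference by that variable's sample variance. It covers only the distance step, not the agglomerative clustering or the heatmap rendering. The use of sample variance is an assumption, and the data are invented.

```python
from itertools import combinations
from math import sqrt
from statistics import variance


def standardized_euclidean(rows):
    """Pairwise standardized Euclidean distances.

    rows: dict mapping person -> tuple of feature values, all the same length.
    Assumes every feature has non-zero sample variance.
    """
    names = list(rows)
    n_features = len(rows[names[0]])
    # Sample variance of each feature across all persons.
    var = [variance([rows[n][j] for n in names]) for j in range(n_features)]
    return {
        (a, b): sqrt(sum((rows[a][j] - rows[b][j]) ** 2 / var[j]
                         for j in range(n_features)))
        for a, b in combinations(names, 2)
    }


# Invented (frequency, degree, betweenness) values, for illustration only.
rows = {
    "Alexander": (60, 20, 150.0),
    "Eliza": (55, 22, 90.0),
    "Passenger X": (1, 3, 0.0),
    "Passenger Y": (2, 3, 0.0),
}
distances = standardized_euclidean(rows)
closest_pair = min(distances, key=distances.get)
```

Scaling by variance keeps a variable with a large numeric range, such as betweenness, from dominating the distance, so persons mentioned only in passing end up close to one another regardless of the raw scales.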
Figure 10 depicts the result of this clustering. At the top, we can see the entire clustering result, with all of the person referents. Each person’s degree, frequency, and betweenness centrality is depicted as a cell that is colored depending on its value, with black as 0, proceeding to red in the middle, and ending up with white as 1. Almost all of the persons have extremely small values on each of these dimensions; thus, the entire clustering result is almost completely black. However, we can see that at either end of the heatmap visualization, there are persons whose characteristics differ from the rest; we zoom in on these two ends in panes A and B.

Hierarchical clustering of persons using network and other metrics
The specific persons exhibiting high-intensity colors in Pane A, including Joseph’s son Alexander, wife Eliza, and other relatives, represent Joseph’s close network. Though many of these are individuals that he interacts with in person, there are those with a presence in his life in other ways. In particular, Alexander and Joseph’s brother Henry exhibit a higher betweenness centrality relative to their frequencies of mention. In the case of the former, this seemed to be partly because Joseph constantly worried about Alexander, writing to and about him, and in the case of Henry, because Joseph often communicated with him by letter and telegram. Thus, references to the two might occur on any day, irrespective of who else might appear on that day in Joseph’s diary.
In Pane B, we see other members of Joseph’s social circle, like his sister Eliza and Jeboory Asfar, who do not appear as frequently, but nevertheless exhibit a relatively high degree, suggesting that Joseph’s interactions with them went beyond a particular social circle. Thus, these metrics can be used to infer the importance and social role that a person plays, with some primarily interacting with him in person, and others not physically present but more or less constantly on Joseph’s mind (Alexander), or interacting with him in a constant stream of correspondence (Henry).
These were some of the prominent individuals in Joseph’s social relations; what of those he encountered in his professional life? The high-intensity hues of passengers, Jews, and Captain Cowley in Pane A are evidence of their day-to-day significance in Joseph’s working life. Based on the appearance of professions such as Inspector, Doctor, and Waly (meaning ‘governor’), we also understand that Joseph often included professions in his references to persons. Racial, ethnic, and religious descriptors are also apparent in Panes A and B, and their degree intensities perhaps serve as some relative indicator of their prominence in Joseph’s observations.
Similar to ‘Alexander’, the term ‘wife’ exhibits a pattern of relatively high degree and betweenness centrality as compared to its frequency, but for a different reason. Because we generally resolved referents to ‘wife of X’ to ‘wife’ irrespective of the identity of X, the ‘wife’ node includes people from different parts of Joseph’s social milieu. So rather than reflecting the significance of one specific person or class of people, this pattern speaks to how Joseph approached the world and understood people—and perhaps especially women—in terms of their relation to others.
6 Discussion
In this article, we illustrate how NER and SNA can be combined to facilitate an exploration of social interactions in the Svoboda diaries. We employed a multi-step process involving extraction of person mentions from the text, mapping of these mentions to unique persons, classification of those persons by characteristics of interest (i.e., kin, non-kin, and non-specific referents), and rendering of person networks with subcommunity detection. In doing so, we observed that network analysis, along with other visualizations, could be used to achieve a more nuanced analysis of Joseph’s social interactions by differentiating those he was close to or interacted with often, from those he merely observed or interacted with in passing in his role as a steamship purser.
With regard to the specific context of Ottoman Iraq, our research shows that while Joseph interacted regularly with people of many different ethnic and religious backgrounds, as well as both men and women, social mixing across ethnic, confessional, and, to a lesser extent, class lines was mostly confined to his professional life and the space of the steamship. Svoboda’s job may have offered him more opportunities than other Ottoman Iraqis to meet and interact with people of different backgrounds in his daily life, but it is also possible that other people moved in different kinds of mixed spaces. At the same time, Svoboda’s family remained dominant in his social networks in both Baghdad and Basra. While the literature on family in the Ottoman Middle East has often focused on households or on extended families in a single city, our research suggests that steamships enabled the spatial extension of close familial relations.
There are ways in which our approach differs from extant literature on ego networks, or networks made up of a certain individual and all of their ties (Arnaboldi et al., 2012). In some ways, a diary naturally lends itself to being conceptualized as an ego network. In this case, we employed a different approach, rendering networks that incorporated everyone except the central individual, giving us a panorama of those Joseph encountered. This perhaps led to a richer picture of Joseph’s social milieu, since it included not only those he interacted with, but also those that he did not. Moreover, this analysis facilitated the subtler differentiation of individuals in his two main social circles from those he had less interaction with. The modularity groupings invite the reader to examine the text more deeply to see how certain social groupings appear in the text (e.g. as passengers on the ship, in news reports he hears, and in casual conversation with friends), facilitating a deeper understanding of social interactions. Moreover, Svoboda’s language, and to a lesser extent, his reports of news from others, reflects differences in his and others’ perspectives towards those with whom he interacted less or not at all. His use of familial, national, racial, ethnic, gender, and age-related descriptors also suggests how he viewed those he interacted with or observed around him. The salience of relational descriptors suggests that he saw family relationships as important, and that he often located people in terms of their families. However, our decision to collapse relational referents into generic nodes like ‘wife’ and ‘sister’, while illustrating the prominence of family and relationships in Joseph’s worldview, also prevented us from viewing the network at a more granular, individual level.
This work has other methodological implications. First, we observed that NER could be used to extract person entities from personal diaries, and that subsequent network visualization and subcommunity detection effectively separated persons Joseph saw socially from those he described as ship passengers. The person entities could then be tied back to their respective passages and presented to the analyst, facilitating closer examination of passages pertinent to Joseph’s social interactions as opposed to steamship interactions, as the research questions called for. This supports close reading through keyword-in-context or text juxtaposed with visualizations, both common visualization approaches in digital humanities research (Jänicke et al., 2015, 2017).
The human-in-the-loop method facilitated named entity linking and improved the NER performance over the trained model considerably, but there are tradeoffs to be considered. On the one hand, the human-in-the-loop method increased the amount of effort on the part of the analyst. On the other, it provided a structure that enables the analyst to perform person identification and mapping more systematically. Similar to previous work in other domains, the introduction of human-in-the-loop procedures also had the inadvertent benefit of enabling us to perform data quality assurance of our project’s transcription work (Grønsund and Aanestad, 2020). For example, we were able to discern that there were different causes for the variation in the ways that persons were referred to: (1) situations in which Svoboda spelled persons’ names differently; (2) situations in which spelling variants were introduced during the transcription process; and (3) different English transliterations of proper names.
There are various limitations of our work. First, in this article, we employed a relatively coarse proxy for social interaction: that two persons were mentioned on the same day. It is certainly the case that two people might be mentioned on the same day and have nothing to do with each other, as Svoboda may have interacted with them at different times, or perhaps one was simply a passenger on the ship with whom Svoboda never interacted in any meaningful way. Moreover, in any given account of the passengers on a ship, Joseph may mention passengers holding different class tickets who may or may not actually interact with one another. Relatedly, our method did not differentiate between different forms of interaction. For example, most of the time that Svoboda mentioned his son Alexander and his brother Henry, they were not physically present, but for different reasons: Alexander, because Svoboda was often worried about him and/or writing telegrams to address situations involving him, and Henry, because Svoboda often received letters from him. In future work, it might be useful to employ more granular distinctions in the modeling of relations, such as distinguishing between classes on the ship and different forms of interaction (e.g. letter, telegram, informal conversation). However, it is interesting to note that the modularity algorithm was partially successful in distinguishing different types of interactions, and that singular events were also reflected. One additional limitation is the use of subcommunity detection on consecutive diaries. In the future, we plan to work with more diaries as they become available, and the methods demonstrated here could be useful in distinguishing changes over time.
This article has demonstrated an approach to combining NER with network analysis and subcommunity detection to study social interactions. Our exploration of social interactions using this unique source has also generated questions for future exploration. In performing NER, we realized Svoboda used descriptors connoting aspects of people’s backgrounds in different ways. Though we aimed to demonstrate the prominence and interactivity of these descriptors through network analysis, a deeper analysis of his language use could result in greater insights about social interactions and the perceptions of the time. Last, our current research leverages the diary structure to take snapshots of Joseph’s social interactions over broad swaths of time. Another approach might be to employ dynamic network visualizations taking into consideration more granular timeline data (Bruns, 2012), to facilitate a more nuanced understanding of shifts in Joseph’s social network.
Acknowledgements
We would like to thank the members of the Svoboda Diaries Project for their support, and in particular team members Yadi Wang and Daniel P. Saelid, for their assistance with reviewing the code and transcriptions that were instrumental parts of this work.
References
Elford (