The Great Parchment Book of the Honourable the Irish Society is a major surviving historical record of the estates of the county of Londonderry (in modern day Northern Ireland). It contains key data about landholding and population in the Irish province of Ulster and the city of Londonderry and its environs in the mid-17th century, at a time of social, religious, and political upheaval. Compiled in 1639, it was severely damaged in a fire in 1786, and due to the fragile state of the parchment, its contents have been mostly inaccessible since. We describe here a long-term, interdisciplinary, international partnership involving conservators, archivists, computer scientists, and digital humanists that developed a low-cost pipeline for conserving, digitizing, 3D-reconstructing, and virtually flattening the fire-damaged, buckled parchment, enabling new readings and understanding of the text to be created. For the first time, this article presents a complete overview of the project, detailing the conservation, digital acquisition, and digital reconstruction methods used, resulting in a new transcription and digital edition of the text in time for the 400th anniversary celebrations of the building of Londonderry’s city walls in 2013. We concentrate on the digital reconstruction pipeline that will be of interest to custodians of similarly fire-damaged historical parchment, whilst highlighting how working together on this project has produced an online resource that has focussed community reflection upon an important, but previously inaccessible, historical text.
We present here a major, groundbreaking partnership that physically conserved and digitally reconstructed a severely damaged parchment document of significant importance as a source for the City of London’s role in the Protestant colonization and administration of the Irish province of Ulster (now in Northern Ireland). Badly damaged in a fire, the Great Parchment Book was stored under restricted access, unavailable for handling or reading by historians. Established conservation methods for flattening were unsuitable due to the parchment’s fragility. Standard two-dimensional digitization techniques would not capture the physicality of the text adequately, and existing three-dimensional (3D) scanning methods for documentary material were unsuitable for this particular application. This paper presents a lightweight pipeline for the digital acquisition and generation of a 3D reconstruction of a severely fire-damaged document that exhibits complex geometry. The reconstruction allows a new, 3D surrogate of the document to be navigated, and a globally flattened image of the text to be created, enhancing its legibility. We also introduce a novel quality metric, relating our work to standard best practice in digitization for archival materials, to gauge the effective digitization resolution of our reconstruction approach. In addition, our system addresses the issue of provenance in 3D reconstruction: aiming to document and make transparent the capture and processing elements so that historians can trust whether a feature is present in the original text or whether it is an artefact of the reconstruction pipeline. This digital reconstruction, combined with transcription and online publishing of the resulting digital surrogates, has made the contents of the Great Parchment Book of the Honourable the Irish Society available for the first time in over 200 years.1
Our method is particularly relevant to libraries and archives holding similarly damaged historical documents that cannot be fully restored by conventional physical or digital means. Such documents—which are all too common—generally have restricted access due to their fragile nature, giving opportunities for digital representations to allow people to read contents remotely, without physical handling of the originals. Additionally, digital representations can allow for virtual reconstruction, including removal of geometric distortions, and corrections of discolouration.
We demonstrate how our reconstruction work has contributed to the Great Parchment Book project overall, assisting in the creation of a digital edition of the text in time for the major local and national commemoration of the 400th anniversary of the Plantation of Ulster. Our project demonstrates how advanced computational methods, when developed to encompass the needs of conservators, archivists, and palaeographers, can be used to enhance our understanding of historic texts, and create a resource to benefit a wide community.
2 The Great Parchment Book
The Great Parchment Book of the Honourable the Irish Society (hereafter ‘the Irish Society’) is a major survey of all the estates in the county of Derry managed by the City of London through the Irish Society and the London Livery Companies in the early 17th century. The Great Parchment Book was compiled in 1639 by a Commission instituted by Charles I to survey the lands which would now fall under his control. Given the relative paucity of archival records for early modern Ireland, the manuscript contains key data about landholding and population in a part of Ulster at this time, as well as information about the province’s relationship with London. However, the manuscript was severely damaged in a fire in 1786 and only limited information about its contents has been available since. In this section, we detail the history of the Great Parchment Book, explain its importance, and describe its current physical state which made the contents of the manuscript all but inaccessible before the undertaking of our project.
2.1 The Honourable the Irish Society
The histories of the City of London,2 the Irish Society, and the Great Parchment Book are very much intertwined (Bond and Vandercom, 1842). The Nine Years’ War (1594–1603), where the forces of Gaelic Irish chieftains fought against English rule in Ireland, ended in Gaelic defeat and exile for their leaders, leaving northwest Ulster open to colonization (Canny, 2001; Curley, 2004). From 1609, King James VI of Scotland and I of England and Ireland (1566–1625) had a policy of settling Ulster with English and Scottish Protestants,3 known as the ‘plantation’, with the City of London Corporation4 and the London Livery Companies5 being reluctantly compelled to administer the plantation of the county of Derry (Curl, 2000). Originally established by the City of London's Court of Common Council on 30 January 1610, The Irish Society6 was formally incorporated by royal charter on 29 March 1613 (Bond and Vandercom, 1842, p. xii) which also gave to the Society grants of lands and privileges. That year The Irish Society was placed in direct control of the City of Derry, which was renamed Londonderry, and the newly constituted county and estates surrounding it, with the county of Coleraine planted and converted into Londonderry, with parts of Donegal, Antrim, and Tyrone also part of the wider plantation (Moody, 1939; Curl, 2000).7 Although much has changed since (including the loss of all its estates), the Irish Society continues to operate to this day, and has always had its administrative centre (the Irish Chamber) at or near Guildhall8 at the heart of the Square Mile in London, maintaining ‘a tradition of care of its administrative records’ (The National Archives n. d. (a)).
2.2 Overview of the Great Parchment Book
The Great Parchment Book is the major surviving early record of the Irish Society relating to Irish estate management.9 The Great Parchment Book was compiled in 1639 after Charles I (1600–49) claimed as forfeit the estates constituting the entire county of Londonderry (which London was administering through the Irish Society and the Livery Companies) after a politically motivated case in the Star Chamber ruled that the Londoners had not fulfilled their obligations of plantation.10 Charles I then commissioned a survey intended to gather, in one single volume, full details of all the contracts and rented lands in Londonderry and Coleraine that he had successfully claimed:
This Commission authorised Sir Ralph Whitfield, Serjeant-at-law, Thomas Fotherley, gentleman of the King's Privy Chamber, Bramhall, Bishop of Derry and Sir William Parsons, Surveyor General of Ireland to collect and receive all sums due to the King in Londonderry, to seize on the King's behalf all castles, manors, lands and tenements lately belonging to the Londoners, and to conclude, on terms most profitable to the King, new contracts for leases of estates of inheritance with the existing tenants and others. Whitfield and Fotherley were the active members of the Commission, arriving in Ireland in April, and finishing about the beginning of October 1639. Attended by a clerk and two assistants, they toured the estates, and copies of all the contracts were entered in a ‘great parchment Booke’, thereafter presented to the King. (The National Archives, n. d. (b))11
Bond and Vandercom (1842) describe the book thus: ‘The various grants and agreements were engrossed on vellum, and signed by the respective parties, and were preserved, (bound up,) amongst the records of the Irish Society’ (p. 60): after the survey, the book made its way back to London to the Irish Chamber at Guildhall.
The Great Parchment Book then represents an important information source for the City of London’s role in the colonization and administration of Ulster, containing key data about landholding and population in 17th-century Londonderry: it is commonly described as the Domesday Book of Ulster, or Derry (Worshipful Company of Vintners, 2011; Santry, 2012; Reisz, 2013).
2.3 Physical condition: barriers to interpretation
In February 1786 a fire broke out amongst building works on the north side of Guildhall which ‘notwithstanding speedy assistance, burnt so furiously for some time, that the whole of that office was destroyed, together with all the books of accounts, several bonds, and considerable sum in bank-notes’ (Lambert, 1806, p. 292). A great many of the Irish Society's archives were incinerated or (like the Great Parchment Book) badly damaged: very few 17th-century records of the Irish Society remain as a result (Aldous, 1985). After the fire, and in spite of its parlous state, the surviving leaves of the Great Parchment Book were carefully preserved because of its importance to the Irish Society and significance to the history of Ulster:
the fragments […] which remain, are valuable and interesting, as they elucidate the titles of the twelve chief Companies to their manorial town lands, which are described by name. And they set forth the estimated quantity of land contained in the respective denominations of town lands, granted or demised by the Commissioners. (Bond and Vandercom, 1842, p. 60)
In 1939, however, T. W. Moody described the book as ‘a mass of scorched and dirty fragments, most of them mere crusts. The details of Whitfield’s and Fotherley’s work are completely irrecoverable’ (see Fig. 1) (p. 400).
The extant manuscript consists of 165 separate parchment membranes, all damaged in the fire. It is unknown how many folios were in the original book, and the initial ordering of the sheets is also unknown: the grouping by livery company presented prior to this project is conjectural.12 The remaining sheets defy reading under normal conditions. Parchment consists of an irregular structure of organic fibres that are sensitive to their environment and can shrink, swell, and buckle if exposed to heat or humidity, creating dramatic and irregular geometric distortions, precisely what has happened in this instance (Chahine, 2000; Giurginca et al., 2009). This uneven shrinkage, warping, intorsion, and distortion has rendered much of the text of the Great Parchment Book illegible, although it is usually still visible.
The physical state of the text (at the Corporation of London Records Office13 it became affectionately known as the ‘poppadom’ book) combined with its fragility and distorted writing meant that, although it remained part of the City of London’s collections held at London Metropolitan Archives (LMA),14 the contents of the Great Parchment Book had been, prior to this project, unavailable to interested parties for over 200 years (Curl, 2000) (see Fig. 2). The motivation for revisiting physical conservation and digital restoration methods came with the approaching 400th anniversary of the building of the Londonderry city walls in 2013, and a planned programme of public engagement and commemoration. The ultimate aim was to make the document available as a central point in an exhibition in Derry Guildhall in 2013 during Derry∼Londonderry’s year as UK City of Culture.15
We knew from the project’s outset that this was an undertaking without a certain result as we were committed to exploring new techniques and technologies, both in physical conservation and digital imaging. Each element of the project was a major piece of work in its own right and different funders were approached to support them. A full conservation assessment was first undertaken, to define a measure of the damage to each folio and to establish methods that could make subsequent digital acquisition and processing more effective. A partnership between LMA, UCL Department of Computer Science,16 and UCL Centre for Digital Humanities17 established a 4-year Engineering Doctorate (EngD) in the UCL Virtual Environments, Imaging and Visualization18 programme beginning in September 2010 (jointly funded by the Engineering and Physical Sciences Research Council19 and LMA) with the intention of designing a digital reconstruction workflow and software to capture, process, and display the writing on the parchment. The aim was to make the text legible, by digitally undoing part of the damage and deformation without having to resort to checking the newly conserved book, and ideally to reconstitute the manuscript digitally.
The research component of the EngD began in September 2011. Conservation of the parchment sheets was carried out by September 2012, and transcription of the manuscript was undertaken alongside conservation and digital restoration until Spring 2013, with the launch of the website scheduled for the end of May 2013 to coincide with the opening of the Derry Guildhall exhibition (although the website could be updated beyond this). Further interactive geometry work was done in collaboration with colleagues in the Interactive Geometry Lab20 within the Institute of Visual Computing21 at ETH Zurich22 throughout 2013. This was an aggressive schedule which required conservators, archivists, historians, digital humanists, and computer scientists to work in tandem on physically conserving, digitally reconstructing, and reading the Great Parchment Book. We provide here a brief overview of the physical conservation methods used, before detailing the digital acquisition and reconstruction approach pursued.
3.1 Physical conservation
The treatment of such a degraded and fragile manuscript was challenging: traditional conservation alone would not produce sufficient results to make the manuscript accessible or suitable for exhibition.23 The parchment itself was too shrivelled to be returned to a readable state, although there had clearly been at least one attempt in the past to do so. No documentation regarding this has been found. A detailed condition assessment was carried out to establish the possible risks to the document’s integrity during storage and handling. The types of damage were identified and a condition rating system devised to establish the overall extent of the damage, which included: planar distortion of the surface caused by fire; deep and stiff creases from heavy gelatinization and denaturation of the parchment; the presence of large and small tears on the edges as a result of shrinking; the presence of a thick layer of calcite on the surface caused by strong dehydration of the skin during the fire; damage to the inks which after testing were revealed to be metallogallic (in some areas the ink had flaked off and in other parts it was completely lost, leaving a lighter colour on the parchment), and severe surface dirt present on many of the sheets.
The results confirmed that the parchment sheets were too damaged to be handled safely even after extensive conservation treatment. Forced flattening of the entire sheet would have facilitated the digitization process, but damaged the parchment irreversibly. Much of the text is visible but distorted. It was decided to humidify the parchment sheets as far as was appropriate to their fragile state24 to try to release only the creases that were obscuring the text. The ultimate aim was to gain legibility and enable the best access during the digitization phase of the project.
The practical conservation of the membranes was the essential first step. Different treatment options were tested on samples to determine the best way to humidify and dry the parchment under tension (see Woods, 1995 for an overview of this technique, also Clarkson, 1992; Singer, 1992; Hassel, 1999). Conservation work on the membranes included cleaning, humidification, and tension drying, using magnets placed on top of the parchment above a metal sheet to hold creases open during the drying process25 (see Smither n. d. for an overview) (see Fig. 3). The aim was to introduce as little moisture as possible to the parchment and tension it locally taking into account the constraints of daily working hours and the need to make the treated sheets available for digitization within the agreed time-frame. This approach opened out areas of the parchment so that as much of the text as possible could be accessed during the digital acquisition process.
Dry cleaning of the sheets was performed by means of a soft brush and a chemical sponge only where the inks were not flaking off. Repairs were carried out only on areas where handling during digitization would have compromised the integrity of the sheets (Avery et al., 2013; De Stefani, 2012, 2014). The digitization started as soon as the treatments on each sheet were completed. The treated sheets needed to be rehoused because their existing packaging was not of conservation standard and was damaging the sheets further. New packaging was provided for safe and long-term storage by means of an archival box and Tyvek sheets to interleave the parchment sheets (London Metropolitan Archives, 2015).
3.2 Existing digital approaches to flattening manuscripts
Before undertaking any digital acquisition and reconstruction work ourselves, it was first necessary to establish if there were any existing approaches that could be potentially of use, surveying previous work in advanced digitization of historical documents that have complex geometry. When digitizing flat documents, a single top-down image is generally viewed as sufficient to meet scholarly requirements.26 In the case of our document, a single image would be insufficient to produce a high-quality digital surrogate since folds would occlude regions of the folio, and some raised areas would be imaged with foreshortening effects. For these reasons, each folio needed to be three-dimensionally reconstructed to produce a digitization of sufficient quality. We now provide an overview of prior work in attempting to virtually flatten documents, including an explanation of why existing approaches could not be adopted for our task.
Previous work addressing the problem of computationally flattening documents typically deals with single images that exhibit small geometric distortions and a specific type of deformation, such as rectifying printed text captured by a flat-bed scanner or a camera, with the aim of reconstructing the shape of modern printed documents to improve the performance of Optical Character Recognition (OCR) algorithms. This approach assumes it is possible to rectify the deformation of a document by observation: extracting features that are used to estimate the underlying changes in the text. For example, Wada et al. (1997) and Zhang et al. (2004) assume the physical deformations to be caused by the spine of a book, and both propose methods to reconstruct the surface of bound documents captured with a flatbed scanner by using the shading cues in the image. Wu and Agam (2002) present a simple method for rectifying a warped image based on tracing lines of text and then using these to generate a deformation mesh. These line tracing methods work well on documents with clear text and good image contrast. Schneider et al. (2007) and Tian and Narasimhan (2011) reconstruct the shape of smoothly folded pages by detecting horizontal line directions and vertical strokes in the text, de-warping images of textual documents by estimating the 3D surface of the document. However, this relies on the text being printed on a light background in a regular font so that individual letters and strokes can be detected. The assumptions made by these approaches (that text is printed on a clean white page, with regular font and line spacing, and that the page has not undergone any non-isometric deformation such as warping or buckling) mean that such methods are not robust when applied to our damaged historical parchment since we cannot make such strong assumptions about shading and textual cues from our twisted, handwritten text.27
Brown and Seales (2001), Sun et al. (2005), Brown et al. (2007), and Bianco et al. (2010) approach the problem of virtually flattening arbitrarily warped documents, with fewer initial assumptions made about their shape or content. They acquire a 3D reconstruction of the document using a structured-light scanner, where the document surface is illuminated with a known pattern, and the distortion of the pattern as seen from a camera is used to compute the 3D shape and then flatten the resulting triangle mesh using a mass-spring simulation.28 The mesh is allowed to fall into a planar configuration under a gravity force while spring forces maintain its structure. We observe that this mimics the physical conservation approach of softening the parchments and then stretching them out. Brown et al. (2007) also add a photometric correction step to remove baked-in shading from the reconstruction’s texture. This type of approach produces impressive results considering its simplicity. However, the capture process suffers from problems with self-occluding objects since both the camera and projector must be able to see a point on the object to reconstruct it: if there are any regions of the page which cannot be seen by both the camera and the projector, they will not be included. Complex objects would require a number of separate scans to be performed and the resulting meshes registered with each other, which is not trivial. The mass-spring approach could also cause fold-overs when applied to documents with high levels of physical distortion, introducing unwanted overlaps. Finally, the isometry assumption built into the mass-spring system (i.e. that the document deforms uniformly in all directions) is not appropriate for completely arbitrarily deformed manuscripts which contain regions that have both shrunk and expanded, as is the case with denatured parchment, which is heavily deformed with pronounced folds and creases.29
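To make the mass-spring idea concrete, the following is a minimal illustrative sketch in Python. It is not the implementation used in any of the works cited above: the function names (`folded_strip`, `flatten_mass_spring`), the toy mesh, and all parameter values are assumptions chosen only for illustration. A small strip folded into a tent shape is allowed to fall under gravity onto a frictionless plane at z = 0, while springs along the mesh edges pull each edge back towards its original length.

```python
import math


def folded_strip(columns=5, rows=2):
    """Toy mesh (an assumption for illustration): a strip creased into a tent."""
    c = math.sqrt(0.5)                     # 45-degree slopes, unit edge length
    peak = (columns - 1) / 2
    verts = [(k * c, float(r), (peak - abs(k - peak)) * c)
             for r in range(rows) for k in range(columns)]
    edges = []
    for r in range(rows):
        for k in range(columns):
            i = r * columns + k
            if k < columns - 1:
                edges.append((i, i + 1))                 # along the strip
            if r < rows - 1:
                edges.append((i, i + columns))           # across the strip
            if k < columns - 1 and r < rows - 1:
                edges.append((i, i + columns + 1))       # diagonals resist shear
                edges.append((i + 1, i + columns))
    return verts, edges


def flatten_mass_spring(vertices, edges, steps=4000, dt=0.02,
                        stiffness=50.0, gravity=1.0, damping=0.9):
    """Let the mesh fall onto the plane z = 0 while springs keep edge lengths."""
    pos = [list(v) for v in vertices]
    vel = [[0.0, 0.0, 0.0] for _ in vertices]
    rest = [math.dist(vertices[i], vertices[j]) for i, j in edges]
    for _ in range(steps):
        force = [[0.0, 0.0, -gravity] for _ in pos]      # unit masses
        for (i, j), r in zip(edges, rest):
            d = [pos[j][a] - pos[i][a] for a in range(3)]
            length = math.sqrt(sum(x * x for x in d))
            f = stiffness * (length - r) / length        # Hooke's law
            for a in range(3):
                force[i][a] += f * d[a]
                force[j][a] -= f * d[a]
        for p, v, f in zip(pos, vel, force):
            for a in range(3):
                v[a] = damping * (v[a] + dt * f[a])      # damped explicit step
                p[a] += dt * v[a]
            if p[2] < 0.0:                               # frictionless floor
                p[2], v[2] = 0.0, 0.0
    return pos, rest


verts, edges = folded_strip()
flat, rest = flatten_mass_spring(verts, edges)
max_height = max(p[2] for p in flat)
max_strain = max(abs(math.dist(flat[i], flat[j]) - r) / r
                 for (i, j), r in zip(edges, rest))
```

The toy strip is developable, so a planar configuration exists in which every spring is at its rest length. Parchment that has shrunk in some regions and expanded in others has no such configuration, which is the limitation of the isometry assumption described above.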
Ulges et al. (2004), Lampert et al. (2005), and Koo et al. (2009) capture stereo images of a document from which a 3D surface can be reconstructed. These methods are demonstrated on open books, and assume there is no self-occlusion in the pages. Global conformal mapping is used in Brown and Pisula (2005) to unfold a document; however, this approach still makes strong assumptions about the type of deformations present. Samko et al. (2011, 2014) present a method for scanning and virtually unrolling scrolled historical documents written on parchment. The scrolls are scanned using X-ray tomography to produce 3D volumetric data. These data are treated as a set of volumetric slices. However, this is also unsuitable for our problem because it assumes that the deformation to the document is equal throughout, where parchment (as previously discussed) is likely to buckle and twist in non-isometric patterns.
In addition to the virtual unfolding of documents, a number of techniques have been developed to correct aspects of historical document degradation other than geometric distortion, namely, fading of the text as the ink degrades, and bleed-through of the text as the writing support deteriorates. These include multispectral imaging (Easton et al., 2003; Giacometti et al., 2015; Kim et al., 2010; Klein et al., 2008). However, preliminary experiments with this approach showed that there would be no improvement in visible text using multispectral imaging with the Great Parchment Book (MacDonald, n. d.). Ink bleed removal (correcting images from ink seeping through from the other side of the folio; Huang et al., 2010; Hanasusanto et al., 2010; Chan and Vese, 2001) is also not a primary concern for us; likewise, the use of intelligent interpretation support systems that propagate likely readings of the text is beyond this project’s scope (Terras, 2005; Roued-Olsen et al., 2009).
We therefore came to the conclusion that existing methods previously proposed for complex document acquisition suffered from occlusions (narrow areas which could not be captured due to the physicality of the document) or required complex manual alignment of partial scans. Developing our own pipeline for digitization and processing of documentary material was therefore necessary and would prove more efficient than adopting others’ systems. However, there are existing, wider approaches in cultural heritage digitization and in computational 3D graphics which have proved useful, allowing us to build on their techniques and tools, and which have informed and enhanced our work.
3.3 Adopting related 3D reconstruction approaches
Digitally flattening a document requires that we first capture a digital surrogate, to which flattening and restoration algorithms can be applied. Although existing flattening methods for documents did not hold solutions for us, there is much previous work dealing with the general topic of 3D reconstruction of objects, and many different established pipelines for undertaking such work. Contact approaches (which acquire the shape of an object by probing it with sensors) are obviously unsuitable for use on fragile historical artefacts. Non-contact methods exist which can be categorized into active methods (emitting some form of light or non-visible radiation, then detecting how the light interacts with the object to recover its shape, although conservators can have concerns about the use of lasers and such like in proximity to delicate objects) and passive methods (which analyse reflected ambient radiation). Both of these approaches have been much used in the digitization of a range of cultural and heritage objects.30 The types of 3D representations produced include point clouds (Snavely et al., 2006; Wu, 2011; Furukawa and Ponce, 2010), volumetric models (Kutulakos and Seitz, 2000; Furukawa and Ponce, 2010), and triangle meshes (Vergauwen and Gool, 2006; Autodesk, 2012). We therefore had a range of approaches to choose from, and our initial phases of development involved investigating different existing approaches to reconstruction and their suitability for our problem. We initially discussed the construction of a bespoke laser scanner, but during that phase determined that the parchment exhibits sufficient visual texture, given its non-uniformity, to allow structure-from-motion31 algorithms to perform well. We therefore turned our attention to multi-view stereo, a technique which is now commonly used in computer graphics ‘to reconstruct a complete 3D image of a model from a collection of images taken from known camera-viewpoints’ (Seitz et al., 2006, p. 519). 
A range of high-quality multi-view algorithms have been developed with various computational approaches being used, which can reach ‘remarkable’ accuracy to generate complete 3D object models, if there are enough 2D images available on which they can be based (ibid., p. 24). Multi-view stereo has been extensively used to record archaeological sites, architectural ruins, and museum objects, usually at larger scale (for example, see Pollefeys et al., 2001; Remondino, 2011; Wenzel et al., 2013) and is now a standard approach in computer vision (see Hartley and Zisserman, 2003 for an overview).
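The geometric principle behind recovering 3D structure from known camera viewpoints can be illustrated with a minimal Python sketch. This is a simplified stand-in for illustration only, not the algorithm of any cited system: the function name `triangulate_midpoint` and the example camera positions are assumptions. Given two camera centres and the viewing rays through a matched image feature, the sketch returns the midpoint of the shortest segment joining the two rays as the estimated 3D point.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(origin_a, dir_a, origin_b, dir_b):
    """Estimate a 3D point from two viewing rays (midpoint method)."""
    w = [p - q for p, q in zip(origin_a, origin_b)]
    a, b, c = dot(dir_a, dir_a), dot(dir_a, dir_b), dot(dir_b, dir_b)
    d, e = dot(dir_a, w), dot(dir_b, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom            # distance parameter along ray A
    t = (a * e - b * d) / denom            # distance parameter along ray B
    on_a = [o + s * k for o, k in zip(origin_a, dir_a)]
    on_b = [o + t * k for o, k in zip(origin_b, dir_b)]
    return tuple((p + q) / 2 for p, q in zip(on_a, on_b))


# Hypothetical example: two hand-held viewpoints observing the same ink stroke.
surface_point = (0.2, -0.1, 1.5)
camera_a, camera_b = (-0.5, 0.0, 0.0), (0.5, 0.1, 0.0)
ray_a = tuple(p - o for p, o in zip(surface_point, camera_a))
ray_b = tuple(p - o for p, o in zip(surface_point, camera_b))
estimate = triangulate_midpoint(camera_a, ray_a, camera_b, ray_b)
```

With noise-free rays the estimate coincides with the true surface point. With real images the rays rarely intersect exactly, which is one reason why many overlapping views of the same region improve the reconstruction.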
The multi-view stereo approach is very well suited for practical, manual acquisition of our deformed parchment, as it allows a user to freely choose viewpoints to reach all parts of the wrinkled surface, capturing a series of 2D digital images that we can then use to generate a 3D model. Previous approaches such as top-down cameras or structured-light scanners would not be able to cope with the self-occlusions in the pages and would produce incomplete reconstructions. Using a hand-held camera allows us to adapt the acquisition process to the highly varying shapes of the parchment, and guarantees full coverage of the parchment surface. The fact that we only use commonly available hardware makes this approach more accessible: archives and museums are also unlikely to have access to specialized scanning equipment or the expertise required to use it. In addition, there are already existing computational algorithms for multi-view stereo that we can adopt and adapt to fit our task.
There are now free end-to-end web services which compute textured 3D models from an uncalibrated set of 2D images: both ARC 3D32 (Vergauwen and Gool, 2006; Tingdahl and Van Gool, 2011) and Autodesk 123D Catch33 (Autodesk, 2012) are popular in cultural heritage digitization34 and other services are also available.35 However, these are ‘black-box’ approaches where it is not clear how the model was generated from the input 2D images (Nguyen et al., 2012), and it is imperative that we understand how data move through our pipeline to generate surrogate models that we can trust (Terras, 2011). An alternative multi-view stereo approach that is fully documented, open-source, supported by multiple operating systems, and allows users to trace each individual pixel through the pipeline was available in Wu’s VisualSFM (2011) software which uses Wu et al.’s Graphics Processing Unit (GPU) implementation of scale-invariant feature transform (Lowe, 2004; Wu, 2007), and their multi-core bundle adjustment algorithm (Wu et al., 2011) to generate a sparse 3D reconstruction using structure from motion. This is a widely used36 dense multi-view stereo reconstruction workflow, performing very well on even highly unstructured image sets containing variants in lighting, image exposure, and lens type (Remondino et al., 2012). VisualSFM thus provided us with an approach and flexible tools upon which to build our digital reconstruction pipeline.
4 The Great Parchment Book Digital Reconstruction Pipeline
We first capture a set of high-resolution images and then perform two pre-processing steps: reconstruction, and computation of texture maps. We developed a viewer that allows a user to navigate the surface of this model, and to generate flattened representations of specific, local areas of the document, to aid interpretation. Additional, advanced mesh parameterization was then developed to compute a map that flattens the 3D surface into the 2D plane while introducing as little distortion as possible to produce images of the whole document, virtually recovered to an extent not possible with physical restoration methods. We detail here the capture, reconstruction, and computation of texture map phases. We also discuss how we can assess the quality of our reconstructions by relating them to established digitization standards in the cultural and heritage sector. Our digital models can be viewed and shared, allowing the contents of the book to be accessed more easily and without further handling of the original document. We describe both our interactive viewer and our global flattening approach, presenting results of our pipeline at every stage.
First, it should be noted that throughout the capture phase, it was imperative that the Great Parchment Book did not come to any harm. The studio setting at LMA was discussed with conservators, and a volunteer, a qualified conservator, assisted with handling the document. We evaluated the use of appropriate lights and supports for the parchment. As noted above, digitization happened soon after conservation, with the folios returning to store in improved housing: the conservation and digitization elements of the project are closely intertwined.
The first step of the acquisition process was to capture a set of overlapping 2D images that cover the entirety of the parchment (see Fig. 4). The folios were imaged using a hand-held Digital Single-Lens Reflex (DSLR) camera (Canon 5D Mark III). Each parchment was placed on a table covered with a black velvet cloth (to provide a matt background) surrounded by three large, evenly spaced diffuse lights to provide uniform illumination and minimize the amount of shade cast on the parchment due to self-shadowing. A ‘ColorChecker Color Rendition Chart’37 was placed on the table next to the parchment, providing a measure of scale and colour calibration.
For each folio, we first took a set of images (typically between eight and ten) in a circular formation so that the entire parchment was visible in each image. We then took many more close-up images, making sure to cover the entire surface of the folio thoroughly. For highly distorted areas of the parchment where the text had shrunk to a very small size, we used a macro lens to obtain extreme close-up images. Although algorithms exist for automatically selecting optimal camera viewpoints (Ahmadabadian et al., 2014), the selection of camera views in our reconstruction workflow depended on the judgment and experience of the human operator (see Figs 5 and 6).
In total, there were 305 folio sides to be captured (fewer than twice the 165 folios, because some folios have a blank side), with 260 of these requiring macro capture. We captured 14,941 images. Typically, an image set for an individual folio contains between fifty and sixty 22MP images, but it can be as large as eighty or more images for extremely deformed folios, or as small as twenty or thirty for relatively flat ones. The largest image set contained eighty-nine images and the smallest eleven, with an average of forty-nine images per set: this spread indicates the variation in the shape of the folios. It took 10–15 min to capture each folio side, meaning twelve folios could be captured in 1 day. The entire capture phase took 24 days of work, although these were not consecutive but were dependent both on access to facilities at LMA, and on fitting in with the conservation procedures.
As described above, we process the image sets with Wu’s VisualSFM (Wu, 2011) software to generate a sparse 3D reconstruction using structure from motion. We then apply Furukawa and Ponce’s Patch-based Multi View Stereo (PMVS) algorithm (Furukawa and Ponce, 2010) to generate a dense point reconstruction, examples of which are shown in Fig. 7 (top). This process also computes, along with the point reconstruction, calibration parameters for each input image (such as focal length and camera rotation) which allows us to determine the camera viewing direction of each input image.
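To illustrate how a viewing direction follows from the recovered calibration parameters, the short Python sketch below derives a camera centre and viewing direction from a rotation matrix and translation vector. It is a minimal sketch, not the VisualSFM or PMVS code: the function names are hypothetical, and it assumes the common convention that a world point is mapped to camera coordinates by x_cam = R x_world + t with the camera looking along its own +z axis. Other tools adopt different sign conventions, which would need to be checked before reuse.

```python
def transpose(m):
    """Return the transpose of a 3x3 matrix given as nested lists."""
    return [list(row) for row in zip(*m)]


def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def camera_centre(R, t):
    """Camera centre in world coordinates.

    Assumes x_cam = R @ x_world + t, so the centre is C = -R^T t.
    """
    return [-c for c in matvec(transpose(R), t)]


def viewing_direction(R):
    """World-space viewing direction.

    Assumes the camera looks along +z in its own coordinate frame,
    so the direction is R^T applied to (0, 0, 1).
    """
    return matvec(transpose(R), [0.0, 0.0, 1.0])


# Hypothetical example: an unrotated camera translated along z.
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(camera_centre(R_identity, [0.0, 0.0, 5.0]))
print(viewing_direction(R_identity))
```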
The reconstruction is computed up to an arbitrary scale, so the distances in the resulting object space do not correspond to the true distances in real-life space. To correct for this, we allow a user to mark points on the ColorChecker which are a known distance apart, and we then triangulate their positions in object space to compute a scaling factor to allow distances in the model to match those in real-life space.
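The scale correction amounts to a single ratio, which the following Python sketch makes explicit. It is an illustration only: the function names and the example coordinates are hypothetical, and it assumes that the two marked ColorChecker points have already been triangulated into model space.

```python
import math


def scale_factor(p_model, q_model, true_distance_mm):
    """Factor converting model units to millimetres.

    p_model and q_model are the triangulated positions of two marked
    points whose real separation (true_distance_mm) is known.
    """
    model_distance = math.dist(p_model, q_model)
    if model_distance == 0:
        raise ValueError("reference points coincide in model space")
    return true_distance_mm / model_distance


def rescale(points, factor):
    """Apply the scale factor to every point of the reconstruction."""
    return [tuple(factor * c for c in p) for p in points]


# Hypothetical model-space positions of two chart points 100 mm apart.
a = (0.0, 0.25, 0.0)
b = (0.0, 0.75, 0.0)
s = scale_factor(a, b, true_distance_mm=100.0)
print(s)
print(rescale([(1.0, 2.0, 3.0)], s))
```

Marking more than one pair of points and averaging the resulting factors would reduce the influence of a single badly placed mark, although the text above does not state whether this was done.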
As can be seen in Fig. 7 (top), the point clouds contain holes in certain areas. A meshing process smoothly interpolates a surface over holes.38 The next step in our pipeline is therefore to compute a triangle mesh from the dense point cloud, for which we use Kazhdan et al.’s (2006) Poisson Surface Reconstruction algorithm. This algorithm requires very little parameter tuning (we use the exact same parameters for every reconstruction), is resilient to noisy data, and is designed to interpolate holes in the point reconstruction. The algorithm makes use of the normal vectors associated with each point, which are generated as part of the PMVS output, making it a natural choice to follow PMVS. We use the Poisson Surface Reconstruction implementation provided in MeshLab (Cignoni et al., 2008). Examples of our reconstructed meshes are shown in Fig. 7 (middle).
4.1.3 Computation of texture maps
The final part of our reconstruction pipeline is to generate texture maps for the triangle meshes. We use the same texture-atlas generation method as Esteban and Schmitt (2004), originally proposed by Schmitt and Yemez (1999), since it is simple to implement and avoids having to compute a texture parameterization for the mesh.39 This step maps the appropriate texture images onto the mesh model (Fig. 7, bottom). The resulting 3D model contains 100–170MB of data per folio.
4.1.4 Assessing reconstruction quality
Professional archival standards for document digitization describe minimum resolution for raster images (Federal Agencies Digitization Initiative, 2010). In the case of planar 2D artefacts that are imaged with flatbed scanners or in a fronto-parallel camera image, this minimum resolution is usually expressed in dots per inch (DPI): a measure of the sampling frequency of the document being imaged which gives the number of samples (i.e. image pixels) that are taken in the space of one linear inch on the surface of the document. In the case of the Great Parchment Book, the folios should ideally be imaged at 600 DPI, and at a minimum of 300 DPI (FADI, 2010). However, our generated models are not 2D images. How can we effectively assess their quality in relation to archival standards?
Measuring DPI is simple when digitizing a single flat object from a front-facing viewpoint. In our case, however, with a 3D reconstruction texture generated by blending many different images from different viewing distances and viewing angles, the effective sampling density of the reconstructed parchment varies across the surrogate surface, dependent on the acquisition conditions of the images contributing to each surface point. Therefore, assigning a single DPI quality label would not sufficiently characterize the data set. Also, since we cannot guarantee that every point on the manuscript surface is imaged from a fronto-parallel viewpoint, there will inevitably be a degree of anisotropy (or stretch) in the sampling. Instead we want to assess the ‘effective’ DPI of our model: that is, a measure of the frequency at which details on the parchment surface are sampled by the acquisition and reconstruction process. We did this by computing a distribution (or histogram) of effective DPI over the model (see Fig. 8; for details of how this was computed, see Pal, 2015, pp. 84–5).
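The dependence of sampling density on viewing distance and angle can be illustrated with a simple pinhole-camera estimate. The Python sketch below is a simplified stand-in and not the computation described in Pal (2015): it assumes a focal length expressed in pixels, model coordinates in millimetres, and it reports the density along the most foreshortened direction, which is reduced by the cosine of the tilt between the surface normal and the direction to the camera. All names and values are hypothetical.

```python
import math

MM_PER_INCH = 25.4


def effective_dpi(point, normal, cam_centre, focal_px):
    """Approximate samples per inch at a surface point for one view.

    Pinhole assumption: a fronto-parallel surface at distance d (mm)
    is sampled at focal_px / d pixels per mm. Tilting the surface by
    an angle theta reduces this by cos(theta) along the foreshortened
    direction, which is the value returned here.
    """
    to_cam = [c - p for c, p in zip(cam_centre, point)]
    distance = math.sqrt(sum(x * x for x in to_cam))
    normal_len = math.sqrt(sum(x * x for x in normal))
    if distance == 0 or normal_len == 0:
        raise ValueError("degenerate camera position or normal")
    cos_tilt = sum(a * b for a, b in zip(to_cam, normal)) / (distance * normal_len)
    if cos_tilt <= 0:
        return 0.0  # the surface faces away from this camera
    return MM_PER_INCH * focal_px / distance * cos_tilt


# Hypothetical camera with a 6000-pixel focal length, 254 mm away.
print(effective_dpi((0, 0, 0), (0, 0, 1), (0, 0, 254.0), 6000.0))
# The same camera moved sideways, so the surface is seen at 45 degrees.
print(effective_dpi((0, 0, 0), (0, 0, 1), (254.0, 0, 254.0), 6000.0))
```

In this sketch the oblique view samples the surface at half the density of the frontal one, because both the increased distance and the tilt contribute a factor of 1/√2.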
The histogram of effective DPI shows that the majority of the mesh vertices are sampled at over 600 DPI. It also shows that the distribution is bi-modal, with a small second cluster of vertices sampled at around 200 DPI.40 We can analyse exactly where on the manuscript surface these variations in sampling occur: low-DPI vertices are mostly on the edges of the folio, which were most likely imaged less thoroughly due to the absence of text or other important features. We argue that this analysis provides an effective way to gauge the quality of a data set in terms easily communicated to archivists and conservators, and demonstrates that our models are of high-enough quality to be of use to historians and palaeographers who are used to relying on digital images of manuscripts of similar spatial quality.
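A histogram of this kind, and the share of vertices meeting an archival threshold, can be produced with a few lines of Python. The sketch below is illustrative only: the per-vertex values are invented for the example and are not measurements from the Great Parchment Book, and the bin width is an arbitrary choice.

```python
from collections import Counter


def dpi_histogram(dpis, bin_width=100):
    """Count values per bin; keys are the lower edge of each bin."""
    counts = Counter(int(d // bin_width) * bin_width for d in dpis)
    return dict(sorted(counts.items()))


def share_at_least(dpis, threshold):
    """Fraction of values that meet or exceed the threshold."""
    return sum(d >= threshold for d in dpis) / len(dpis)


# Hypothetical per-vertex effective DPI values (not project data).
sample = [180, 210, 220, 610, 640, 655, 700, 720, 750, 810]
print(dpi_histogram(sample))
print(share_at_least(sample, 600))
print(share_at_least(sample, 300))
```

Reporting the share of vertices above both the ideal (600 DPI) and minimum (300 DPI) thresholds gives a compact summary alongside the full distribution.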
4.2 Interactive document exploration
With the quality of our reconstructions assured, we could now proceed with exploring and exploiting the models to improve access to the document. Our next innovation was to create an interactive system41 that allows a user to navigate the surface of the 3D reconstruction of the Great Parchment Book, virtually flattening specific areas of interest when required. Previously, we discussed how standard metric-preserving surface parameterization algorithms are often not suitable for flattening parchment, because of the way parchment deforms when exposed to heat and/or moisture. We circumvented this difficulty by using an interactive viewer which presents a locally flattened view of the region of text that the user is currently focussed on, by undistorting local subsets of the mesh. This system aims to improve the accessibility and legibility of text in highly distorted documents, in a manner which does not require a global parameterization: since areas are being independently flattened, reconstruction artefacts elsewhere in the mesh will not affect them. This approach is inspired by the observation that, when transcribing a text, a palaeographer will only ever inspect a small section of a folio at any given time (Youtie, 1963) and is analogous to avoiding the distortions of large map projections when attempting to flatten a globe onto a 2D plane (Snyder, 1993). It is, therefore, unnecessary to un-distort the entire folio at once if the primary goal is to simply expose the content in a form that can be read. Instead, if a user looks at a particular region, it should be displayed in a way that is optimal in terms of its readability. Text should be visible, should not be distorted, and lines of text should be rectified so that they run horizontally from left to right, as is to be expected for this document. The interface provides the capability of visualizing the text in ways which are impossible with the physical document.
Local flattening was accomplished by using two modes: a local-affine mode renders the mesh in 3D and transforms it so that the target region is oriented to face the camera; and local-flattening mode allows the target region to be flattened into 2D independently from the rest of the mesh.42 The user can pan over the surface of the document, pausing in places of interest which would benefit from local flattening.
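The geometric idea behind the local-affine mode can be sketched as a rotation that turns the average normal of the target region towards the viewer. The Python code below is a minimal illustration under stated assumptions, not the viewer's implementation: the helper names are hypothetical, the region normal is taken as the area-weighted sum of triangle normals, and the viewer is assumed to look down the -z axis so that a region facing +z faces the camera.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalise(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def region_normal(vertices, triangles):
    """Area-weighted normal of a patch (sum of triangle cross products)."""
    total = [0.0, 0.0, 0.0]
    for i, j, k in triangles:
        e1 = [b - a for a, b in zip(vertices[i], vertices[j])]
        e2 = [b - a for a, b in zip(vertices[i], vertices[k])]
        total = [t + c for t, c in zip(total, cross(e1, e2))]
    return normalise(total)


def rotation_aligning(src, dst):
    """Rotation matrix taking direction src onto dst (Rodrigues form)."""
    a, b = normalise(src), normalise(dst)
    c = dot(a, b)
    if c < -1.0 + 1e-12:
        # Opposite directions: turn 180 degrees about a perpendicular axis.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        x, y, z = normalise(cross(a, helper))
        return [[2 * x * x - 1, 2 * x * y, 2 * x * z],
                [2 * x * y, 2 * y * y - 1, 2 * y * z],
                [2 * x * z, 2 * y * z, 2 * z * z - 1]]
    vx, vy, vz = cross(a, b)
    k = 1.0 / (1.0 + c)
    return [[1 - k * (vy * vy + vz * vz), -vz + k * vx * vy, vy + k * vx * vz],
            [vz + k * vx * vy, 1 - k * (vx * vx + vz * vz), -vx + k * vy * vz],
            [-vy + k * vx * vz, vx + k * vy * vz, 1 - k * (vx * vx + vy * vy)]]


def rotate(R, points):
    return [[dot(row, p) for row in R] for p in points]


# Hypothetical patch: one triangle lying in the y-z plane (normal along +x).
verts = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tris = [(0, 1, 2)]
R = rotation_aligning(region_normal(verts, tris), [0.0, 0.0, 1.0])
print(rotate(R, verts))
```

After the rotation every vertex of the example patch has the same z coordinate, so the patch lies parallel to the image plane. The local-flattening mode described above goes further by removing curvature within the region, which this rigid rotation does not attempt.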
We can see the results of this system in a selection of flattened sections of parchment from different folios of the Great Parchment Book (see Figs 9 and 10).
Our system also addresses the issue of provenance. For historians studying the text through a digital representation, it is important to be able to judge whether a feature present in the surrogate was also present in the original text or whether it is an artefact of the reconstruction pipeline. Terras (2011) discusses this issue at length, focussing mainly on imaging artefacts, and the London Charter for the Computer-based Visualization of Cultural Heritage (Denard, 2012) stresses the importance of storing paradata which documents the process of generating visualizations of cultural heritage. In our case the most likely source of error is the 3D reconstruction process. We therefore document the provenance of the reconstruction by providing the user with smart access to the original image collection: for a given 3D view, the system displays the portion of an original image that best depicts the currently observed part of the parchment. By comparing the 3D reconstruction with the original images the user can better assess the content of the text in areas of the 3D reconstruction which seem to contain errors.
This system was used by the palaeographer who compiled the transcription of the Great Parchment Book, using our system alongside access to the original folios to gain a fuller understanding of the text contained within the parchment. The provenance feature was demonstrably useful in resolving any ambiguities in the model, and access to the local-flattening tool helped in the interpretation of the text.
4.3 Global flattening
With a system successfully in place to allow navigation and local flattening of areas of parchment, we were able to return to the difficult (and optional, for our purposes) issue of whether we could globally flatten the text with as little distortion as possible, to produce useful 2D images of the unfolded manuscript. Mesh parameterization computes a map that flattens a 3D surface into the 2D plane by defining some geometric measure of distortion which the algorithm attempts to minimize in the mapping (see Sheffer et al., 2006 for an overview of this technique). Our task is to estimate the complex deformation of the parchment and invert it, thus restoring the original shape of the parchment.43
There are various constraints that help us in this approach (as the previous research in this area showed, it is important to understand documentary features to be able to address them). Before being damaged, the text in the documents was written in a uniform glyph size, in equally spaced horizontal lines, and with strict vertical page margins. We can therefore attempt to find a scaling field which captures the degree of shrinking or stretching of the text at each point, and an orientation field which captures the text direction on the document surface. We use a scale constraint based on identifying a sparse set of single characters (those without ascenders or descenders) and use template-matching-based optical character recognition to compute the bounding box, or x-height, of the text.44 The relative sizes of characters in different regions indicate the amount of shrinking or expansion that has occurred in that region, given that the original text was written in approximately the same size throughout. We found that detecting the letter ‘a’ worked well because it has a distinctive shape and is very common, so we can detect a sufficient number of instances in each folio. We then use a semi-automatic approach to line detection (which is complicated by distortions, discolouration, fading, and ascending and descending glyphs): once a user begins to manually trace a line on the folio, the system continuously proposes a suggestion for the next section of the line. The user can refine line identification until they are happy with the resulting transformation, and we use these measures of line and scale to invert the deformation of the text (see Figs 11, 12 and 13).45
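The scale constraint can be illustrated by turning a sparse set of measured x-heights into relative scale values and spreading them over the surface. The Python sketch below is a simplified illustration rather than the method used in our pipeline: it assumes the median x-height approximates the undistorted glyph size, and it uses inverse-distance weighting to interpolate between detections. Both choices, together with the function names and the example values, are assumptions.

```python
import math
from statistics import median


def relative_scales(detections):
    """Convert ((u, v), x_height) detections to ((u, v), scale) samples.

    A scale below 1 marks a region where the text has shrunk relative
    to the reference size (here assumed to be the median x-height).
    """
    reference = median(height for _, height in detections)
    return [(pos, height / reference) for pos, height in detections]


def scale_at(query, samples, power=2.0):
    """Inverse-distance-weighted estimate of the scale at a 2D position."""
    numerator = 0.0
    denominator = 0.0
    for pos, scale in samples:
        d = math.dist(query, pos)
        if d == 0:
            return scale  # the query coincides with a detection
        weight = 1.0 / d ** power
        numerator += weight * scale
        denominator += weight
    return numerator / denominator


# Hypothetical detections of the letter 'a': position (mm) and x-height (mm).
detections = [((0, 0), 2.0), ((10, 0), 1.0), ((20, 0), 2.0)]
samples = relative_scales(detections)
print(samples)
print(scale_at((5, 0), samples))
```

In this example the middle detection is half the reference size, so the interpolated scale between it and an undamaged neighbour falls between 0.5 and 1.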
Even after successful inverse distortion, the texture of the document still exhibits intensity and colour variations which convey the false impression that the document is still distorted and not flat. These variations are a combination of shading baked into the texture at the time of acquisition, and the genuine discolouration of the parchment which has taken place in the course of the damage. While preserving these observed appearance variations is a useful feature to study the rectified text in the context of the original damage, which mitigates the risk of misinterpreting potential artefacts introduced by over-processing the content (Terras, 2011; Bentkowska-Kafel, 2013), many readers will prefer a cleaned-up colour appearance in addition to the unwarped geometry. Therefore, we optionally remove colour variations by normalizing the parchment texture’s appearance. This is achieved by independently scaling each colour channel by a spatially varying factor, so that all ink-free regions of the parchment roughly match the same colour (see Fig. 14).46 Video 3 (which is available in the online version of this paper) walks the user through the deformation, and optional colour correction, as presented in Pal et al. (2014). An overview of the processing pipeline, and a selection of successfully generated flattened images, is presented in Fig. 12.
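The per-channel scaling can be sketched on a small image held as nested lists. The Python code below is illustrative only and is not our implementation: it assumes an ink mask is already available, estimates the local parchment colour as the mean of ink-free pixels in a small window, and scales each channel so that this local colour maps to a chosen target. The function name, window radius, and target colour are hypothetical.

```python
def normalise_colour(image, ink_mask, target, radius=1):
    """Scale each channel by a spatially varying gain.

    image:    rows of (r, g, b) tuples with values in 0..255
    ink_mask: rows of booleans, True where a pixel is ink
    target:   (r, g, b) colour that ink-free parchment should match
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            sums = [0.0, 0.0, 0.0]
            count = 0
            # Average the ink-free pixels in the surrounding window.
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    if not ink_mask[yy][xx]:
                        for c in range(3):
                            sums[c] += image[yy][xx][c]
                        count += 1
            if count == 0:
                row.append(image[y][x])  # no background estimate available
                continue
            pixel = tuple(
                min(255, round(image[y][x][c] * target[c] / max(sums[c] / count, 1e-6)))
                for c in range(3)
            )
            row.append(pixel)
        result.append(row)
    return result


# Hypothetical 3x3 patch: darkened parchment with one ink pixel in the centre.
parchment = (115, 86, 74)
image = [[parchment] * 3 for _ in range(3)]
image[1][1] = (20, 15, 10)
mask = [[False] * 3 for _ in range(3)]
mask[1][1] = True
for row in normalise_colour(image, mask, target=(230, 172, 148)):
    print(row)
```

Because the gain is derived only from ink-free pixels, the ink is brightened by the same factor as the surrounding parchment and so keeps its contrast against the background.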
5 Outputs and Impact
There are various ways in which this project has produced outputs that will have lasting impact. The focus of this article has been on the acquisition and restoration methods that have enabled the contents of the Great Parchment Book to be accessed by researchers more easily and without further handling of the original, fragile folios, assisting the production of the new transcript of the text. Originally tested on a small subset of six pages of parchment (Pal et al., 2014), our capture, restoration, and flattening process has now been used to virtually restore all remaining pages of the Great Parchment Book. The project worked with web-designers Headscape47 to develop a website that both kept the user community informed by means of a project blog, and now hosts a readable and exploitable version of the text, comprising a scholarly digital edition which features a searchable transcription as well as a glossary of the manuscript contents. The aim of the website is that it should be accessible and useful to a wide range of people: academic researchers and local and family historians alike. Our new flattened images were integrated into the website of the Great Parchment Book project alongside images of the folio before conservation treatment; a new scholarly transcription of the original text generated after conservation and digital reconstruction (which has revealed significantly more information on practically every folio, providing a rich, new resource on the history of Ulster for historians); and a version of this new transcription more suitable for non-scholarly audiences.48 The texts, encoded in TEI-compliant XML,49 are fully searchable at http://www.greatparchmentbook.org/folios/. A video providing an overview of all aspects of the conservation, acquisition, digital reconstruction, transcription, and encoding is available in the online version of this paper.50 Used with permission, LMA, The City of London Corporation. All rights reserved.
The virtually reconstructed Great Parchment Book is the centrepiece of an exhibition in Derry Guildhall which opened in 2013 to commemorate the 400th anniversary of the building of Londonderry’s city walls in 1613. An original, newly conserved folio from the book was also displayed during the first ten months of this exhibition (a rotating schedule includes various archival objects to ensure renewed interest). Both the Museum and Visitor Service of Derry City and Strabane District Council51 and LMA have used the document in their interpretation and outreach programmes, developing resources for schools and colleges based on the information it contains, with a particular school programme associated with the Great Parchment Book taking place in Derry~Londonderry during the time the exhibition has been open (Stewart, 2013a, b). Both the website and the exhibition have been very well received: the exhibition had nearly 270,000 physical visitors in its first year and has had over 864,000 visitors at time of writing. Overall visitor feedback from the exhibition has been very positive, noting that it gives a balanced overview of the plantation for a variety of international audiences who may be learning about it for the first time, with high praise for the audio-visual and original artefact and document material (McConnell, 2015). That the exhibition has been so positively received by the Derry~Londonderry community has been a huge success, leading to discussion and debate around a sensitive history of conflict. The website has received over 110,000 page views since it was launched in May 2013, and there is considerable interest in the project from academics who study the period. The newly created resource is used in undergraduate teaching in the School of English and History at the University of Ulster and is proving to be a ‘vital postgraduate and post-doctoral research tool’ given that this document can ‘revisit a contentious historical legacy’ (Kelly, 2016, p. 2).
Interactions with the research community interested in the Great Parchment Book also occurred regularly in the form of workshops and presentations throughout the duration of this project: these are described as part of the project blog.
As well as this public engagement success, and the creation of a new scholarly resource, both the conservation and computation approaches in this project have led to further information regarding method and process that will benefit others. The research done on parchment degradation, treatments, sample preparation, trial procedures, results of the tests, and the methodology applied to repair the Great Parchment Book were recorded through photographic and written documentation. Regular updates were shared on the Great Parchment Book blog and are now referred to extensively by conservators, providing a resource for others attempting to conserve similarly damaged parchment, who are often in direct contact with LMA regarding their approach. LMA will offer training courses by the project conservator in the future, following the interest in this from the sector (Smith, 2016).
From a computational point of view, the project has had numerous successes. We have developed a pipeline for low-cost acquisition of highly detailed 3D models of a fire-damaged parchment, adapting previously available techniques mostly used for large-scale historic and architectural structures, to provide an accurate representation of a document. We have developed a technique to navigate the surface of this resulting model, allowing further close-reading and analysis: although we have used it to analyse the text we are most interested in here, the viewer can also explore arbitrary 3D models, allowing the user to inspect interesting surface details of objects for forensic or palaeographic examination. Our viewer allows local flattening and manipulation of the underlying mesh to support the type of physical manipulation a historian or palaeographer may wish, but be unable to do, with fragile historical texts or artefacts. Our understanding of the techniques of production of the Great Parchment Book has allowed us to generate high-quality globally flattened images of each folio, virtually smoothing and restoring the text, increasing its legibility for both general and specialist audiences. We have done so whilst considering palaeographic best practice and bearing data provenance in mind. The fact that our system allows users to interact with both the generated models, and continually to check the veracity of these models by comparing features in the 2D-captured images of the folios, builds trust in our approach. Additionally, we have developed a way in which our approach may be compared to established archival standards for the creation of digital surrogates, demonstrating that our resulting models are of similar spatial quality to 2D images acquired through more normative digitization procedures. 
Although we have not carried out detailed user testing of our software with a wide library and archive user community, the member of the project team who relied upon our software to aid in transcribing the Great Parchment Book was wholly positive about their experience in navigating and interrogating the surface of the document via the manipulation of its 3D surrogate, and we have had much positive feedback from the research community attending our events and workshops. There is now ample potential for taking this software out to a wider user community, and the joy of our approach to the digital acquisition of the text (using a relatively affordable DSLR camera for capture, without the need for specialist equipment) should mean that this technique can be adopted by others in the library and archive community.
UCL has enabled free access to the digital restoration pipeline through a stand-alone version of our software,52 providing guidance on acquisition through (interactive) processing via our dedicated viewer, based on best-practice computational approaches. The open-sourcing of UCL’s platform should enable other institutions to access the acquisition and restoration process themselves. Meanwhile LMA is exploring the possibility of developing their role as a centre of expertise for the conservation, imaging, and digital restoration of distorted parchments, working in tandem with UCL to maintain the trajectory we have built up working on this project together.
The project has also led to further understanding of the structure of the Great Parchment Book itself, aiding in reconstructing the original ordering of its folios. Prior to conservation, and in the absence of any previous knowledge of the original make-up of the book, the 165 remaining folios had been arranged in a conjectural order. Because of gaps in the volume caused by entirely missing folios, the order of the Companies’ sections within the original bound Book can never be recovered beyond doubt. However, in the course of the project described here, where a closer reading of the text was possible and fragments were positively identified, we can be confident that our reordering of the surviving folios (as presented on the website, with many of the folios having new folio numbers) within each of the Companies is correct, given the numbering of the charters, their arrangement, and the way the text now reads (while still taking into account the existence of missing folios or sections of text within charters). The imaging and reading of individual folios in this project has led to a greater understanding of the document as a whole.
These various successes have meant the project has attracted significant public attention. LMA and UCL were very pleased to receive a Commendation of Merit in the European Succeed Awards 2014, which promote the take up and validation of research results in mass digitization (Succeed Project, 2014).53 The project has been featured in a range of newspaper and magazine articles (for example, Reisz, 2013; Davis, 2015). In June 2016, The Great Parchment Book was inscribed to the UK register of the UNESCO Memory of the World, recognizing the document’s importance (Great Parchment Book, 2016). Summarizing the successes of the project, the First Minister of Northern Ireland, the Rt Hon Peter D Robinson MLA, wrote in a guest blog post on the Great Parchment Book website:
I cannot praise the work of the LMA & UCL highly enough. In completing this mammoth project they have succeeded in opening a veritable treasure trove of information relating to a most significant period in the history of Ulster; and illustrating as never before the central role played by the London Guilds in the creation and preservation of the city of Londonderry and its environs. (Robinson, 2013).
The conservation, digital reconstruction, and resulting transcription of the Great Parchment Book have provided a lasting resource for historians researching the Plantation of Ulster in local, national, and international contexts. Our work on the computational approach to model, navigate, flatten, and ultimately read the damaged parchment will be applicable to similarly damaged material held elsewhere, as we believe we are developing best-practice computational approaches to digitizing the highly distorted, fire-damaged historical documents which are all too common in library and archive collections. We have demonstrated how other existing approaches to this problem have proved inadequate, and adopted a semi-automatic approach that enables an expert user, such as a palaeographer or historian, to guide the virtual restoration of texts, generating useful, accurate representations of the denatured parchment that make the text more legible. The digital image outputs from our system are of high quality, and the techniques used to generate them are transparent: they can be trusted by those accessing them. The digital images can also be integrated easily into scholarly digital editions when the presence of this new evidence has assisted in the text’s transcription.
The work described here is not theoretical: the conservation activities, digitization and restoration pipeline developed, and resulting online resource created, were accomplished within a societal context in time to provide a focus for a community celebration of the founding of Londonderry and the 400th anniversary of the building of its city walls, and a means of reflection on the lasting legacy of the Plantation of Ulster. We see here, in this project, how a regional museum and metropolitan archive can work together, whilst also interacting with university research mechanisms, to develop a process that helps our interpretation of primary historical texts, presenting online materials of benefit to a wide range of interested individuals, and engaging in community activities to respond to, and reflect upon, an important local and national anniversary. It is also important to stress that the pipeline we describe here is not entirely a digital one: we were dependent on the expert work of the conservators and archivists, and the historical and linguistic expertise of the palaeographer who carried out the transcription, and so this project also demonstrates a successful international, interdisciplinary project where aspects of conservation, computational science, and digital humanities research come together to benefit our understanding of an archival object to move forward the interpretation of our cultural inheritance.
We now encourage the scholarly community to make use of the texts and images available at http://www.greatparchmentbook.org/. This resource will be maintained by LMA to provide lasting access to a document of significant historical importance, the contents of which were not available until we undertook this novel and groundbreaking interdisciplinary work to conserve, image, and virtually recover the Great Parchment Book of the Honourable the Irish Society.
This work was financially supported by the UCL EngD VEIV Centre for Doctoral Training; The Engineering and Physical Sciences Research Council (grant EP/G037159/1); The European Research Council (grant iModel StG-2012-306877); Adobe Research; Derry Heritage and Museums Service (DHMS, now Museum and Visitor Service, Derry City and Strabane District Council), the UK’s National Manuscripts Conservation Trust; the Marc Fitch Fund; The Honourable The Irish Society; the City of London Corporation, London Metropolitan Archives, and a number of London livery companies: Clothworkers’ Company, Drapers’ Company, Fishmongers’ Company, Goldsmiths’ Company, Ironmongers’ Company, Mercers’ Company, Merchant Taylors’ Company, Skinners’ Company. Advice and support were provided by Professor James Stevens Curl, The British Library, The National Archives, and The Trustees of Lambeth Palace Library.