Cohort Profile: The Oxford Biobank

Why was the cohort set up?
Major progress has been made over the past decade in understanding the genetic background to chronic metabolic diseases such as type 2 diabetes (T2D) and atherosclerotic cardiovascular disease (CVD). These disorders show a significant degree of heritability, and their pathogenesis relies on the combination of a multitude of unfavourable genotypes on which over-nutrition, lack of physical exercise, obesity and smoking augment the phenotype. The number of common genetic variants robustly associated with CVD and T2D is increasing with the size of discovery cohorts; for CVD, the number now exceeds 50 variants 1-3 and for T2D and glycaemic traits, the corresponding number is about 75. 4,5 Combining several genome-wide association study (GWAS) datasets that include information on highly relevant intermediate phenotypes has potentially helped in the discovery and replication of several disease loci and the identification of novel pathways and pleiotropic genes. However, little is known about the functional consequences of most of the identified gene variants. The use of well-characterized bioresources, in which investigations into intermediate phenotypes can be performed, will be invaluable in providing mechanistic insight into these poorly characterized genes and thus promoting translational research.
To this end the Oxford Biobank (OBB) was set up with the primary goal of establishing a local cohort accessible for genomic translational research. The resource is built to enable studies on physiological consequences of genetic mechanisms of disease. A leading principle has been to seek informed consent from participants to be reapproached for future discrete projects. Therefore, based on the information gathered during a baseline visit, 'recruit-by-genotype' (RbG) and 'recruit-by-phenotype' (RbP) projects allow for detailed investigations of associations between genotypes and biomarkers, or monitoring of more detailed physiological processes. The OBB serves as a resource for researchers to investigate mechanisms leading to increased T2D and CVD susceptibility and to explore novel therapeutic targets in the prevention and treatment of chronic non-communicable diseases.

Who is in the cohort?
The OBB is a random, population-based sample of healthy participants aged 30 to 50 years, recruited from the Oxfordshire general population (approximately 800 000 inhabitants). Individuals with a previous diagnosis of myocardial infarction or heart failure currently on treatment, untreated malignancies or other ongoing systemic disease, as well as pregnant women, were excluded from participation. Recruitment began in 1999, and the OBB included 7640 individuals (4316 women and 3324 men) as of October 2016, with the aim of building a local cohort of 10 000 people from whom participants can be recalled. This sample size is based on the ability to identify an average of 25 people who are homozygous for what are normally considered common genetic variants (minor allele frequency greater than 0.05). To reach even larger populations and allow for recruitment of carriers of rare gene variants or phenotypes, the Oxford Biobank is a partner of the National Institute for Health Research (NIHR) BioResource, which currently reaches ~100 000 people. Baseline demographics of the OBB participants are provided in Table 1.

© The Author 2017. Published by Oxford University Press on behalf of the International Epidemiological Association
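The sample-size rationale above can be checked with a short calculation. The sketch below assumes Hardy-Weinberg equilibrium, which the text does not state explicitly; under that assumption the expected number of minor-allele homozygotes is the cohort size multiplied by the square of the minor allele frequency. The function name is illustrative only.

```python
def expected_homozygotes(cohort_size: int, minor_allele_frequency: float) -> float:
    """Expected count of minor-allele homozygotes.

    Assumes Hardy-Weinberg equilibrium (an assumption of this sketch),
    so the homozygote frequency is the allele frequency squared.
    """
    if not 0.0 <= minor_allele_frequency <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    return cohort_size * minor_allele_frequency ** 2


# Target cohort of 10 000 at the lower bound for a common variant (MAF 0.05)
target_cohort = expected_homozygotes(10_000, 0.05)

# Cohort size reported for October 2016
current_cohort = expected_homozygotes(7_640, 0.05)

print(round(target_cohort, 1), round(current_cohort, 1))
```

At the 0.05 threshold the target cohort yields about 25 expected homozygotes, in line with the figure quoted in the text; more common variants yield proportionally more.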

Recruitment
The OBB includes a randomized, age-stratified sample obtained from Oxfordshire and the Thames Valley. The Thames Valley Primary Care Agency has enabled random recruitment by providing lists of Oxfordshire residents aged 30-50 years who are registered with a local general practitioner. An invitation letter, along with the study information and a response sheet, was sent to all prospective participants. Those who expressed willingness to enrol in the OBB were contacted by telephone or e-mail by trained research nurses, who gave a brief overview of the study aims and objectives. Possible exclusions for active disease or previous history of T2D or CVD were confirmed during this contact, and only eligible participants were scheduled to visit the Clinical Research Unit at the Oxford Centre for Diabetes, Endocrinology and Metabolism for a baseline investigation. Exclusion criteria were type 1 and type 2 diabetes, established CVD, cancer, known autoimmune or severe inflammatory conditions, and substance abuse or a psychiatric condition making participation in Stage 2 (see later) unlikely. The OBB protocol is approved by the Oxfordshire Clinical Research Ethics Committee (08/H0606/107+5) and all participants have provided informed consent.
How often have they been followed up?
All participants have a detailed baseline characterization (Stage 1). Subsequently, selected volunteers are invited for a second visit (recall) to comply with a specific research protocol (Stage 2). Who is selected for such recall studies is determined by the research question and the information available from the Stage 1 visit. Such recalls can be either 'recall-by-genotype' or 'recall-by-phenotype'.
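The two recall routes can be illustrated with a minimal sketch. Everything in it is hypothetical: the record layout, the field names, the participant identifiers and the variant label `variant_x` are invented for illustration and do not describe the actual OBB database or its selection procedure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stage1Record:
    """Hypothetical Stage 1 record; not the real OBB schema."""
    participant_id: str
    genotypes: dict = field(default_factory=dict)   # variant label -> minor-allele count (0, 1 or 2)
    phenotypes: dict = field(default_factory=dict)  # trait name -> measured value


def recall_by_genotype(records, variant, allele_count):
    """IDs of participants carrying exactly `allele_count` minor alleles of `variant`."""
    return [r.participant_id for r in records
            if r.genotypes.get(variant) == allele_count]


def recall_by_phenotype(records, trait, lower, upper):
    """IDs of participants whose `trait` value lies within [lower, upper]."""
    selected = []
    for r in records:
        value = r.phenotypes.get(trait)
        if value is not None and lower <= value <= upper:
            selected.append(r.participant_id)
    return selected


# Invented example data
records = [
    Stage1Record("P001", {"variant_x": 2}, {"bmi": 31.4}),
    Stage1Record("P002", {"variant_x": 0}, {"bmi": 23.9}),
    Stage1Record("P003", {"variant_x": 2}, {"bmi": 27.0}),
]

homozygous_carriers = recall_by_genotype(records, "variant_x", 2)
high_bmi = recall_by_phenotype(records, "bmi", 30.0, 40.0)
print(homozygous_carriers, high_bmi)
```

In practice the selection rule would be set by the research question of each Stage 2 protocol, as described above.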

What has been measured?
The OBB has collected a broad range of metabolic-, CVD- and obesity-related phenotypes based on blood plasma phenotyping, genetic biomarkers, questionnaires, anthropometric measurements and body composition assessment using dual-energy X-ray absorptiometry (DXA). A brief description of variables collected at baseline is provided below.
Anthropometry. This included height, weight, waist and hip circumference (WC and HC) measurements, and calliper-measured skinfold thickness of the upper arm (over biceps and triceps), subscapular, abdominal and thigh regions.
Questionnaire-based assessments. Information on potential risk exposures or confounders in disease pathology, such as physical activity, smoking and alcohol intake, was obtained using validated questionnaires. The OBB participants were also interviewed by trained nurses, using instruments such as the 'Rose' questionnaire for angina pectoris, and asked about family history of any chronic disease, given that family history is a well-known predictor of CVD and T2D. The questionnaires were all adopted from previous studies and have not been internally validated.
Blood pressure. An automatic pulse-detecting sphygmomanometer (Omron M3) was used to record systolic and diastolic blood pressure, using a standard protocol involving four sequential measurements after 10 min in the semirecumbent position. The average of the last three measurements was used.
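As a minimal sketch of the averaging rule described above (four sequential readings, the first discarded, the last three averaged), the following could be used. The function name and the example readings are invented for illustration.

```python
from statistics import mean


def summarise_blood_pressure(readings):
    """Average the last three of four sequential (systolic, diastolic) readings.

    `readings` must be in the order the measurements were taken;
    the first reading is discarded, as in the protocol described in the text.
    """
    if len(readings) != 4:
        raise ValueError("expected exactly four sequential readings")
    last_three = readings[1:]
    systolic = mean(s for s, _ in last_three)
    diastolic = mean(d for _, d in last_three)
    return systolic, diastolic


# Invented readings in mmHg
example = [(130, 85), (124, 80), (122, 78), (120, 76)]
print(summarise_blood_pressure(example))
```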
Biochemistry. Venous antecubital blood was drawn after an overnight fast and immediately put on ice. Plasma was separated within 60 min, frozen at −20 °C within 120 min and transferred to −80 °C within 4 h. Plasma samples have been analysed for glucose, lipids/lipoproteins (cholesterol, triglycerides, high-density lipoprotein (HDL) cholesterol, apolipoprotein-B (ApoB), apolipoprotein A1), C-reactive protein (CRP), insulin, total non-esterified fatty acids (NEFA), glycerol, 3-hydroxybutyrate and lactate. A subset of samples has been analysed for insulin-like growth factor (IGF-1) and insulin-like growth factor binding protein-1 (IGFBP-1) (n ≈ 2200). Details of the platforms used for biochemical analysis are provided in Table 2. Adiponectin is currently being analysed in all participants. A biorepository of aliquots (10-15 × 0.5 ml of both EDTA- and heparin-anticoagulated plasma as well as serum) is stored for future use.
Metabolomics. NMR-based metabolomics profiling has been performed on ~7100 Oxford Biobank plasma samples, yielding data on ~230 metabolites. 6 Additionally, data from the mass spectrometry-based Metabolon® technology are available for a select set of 2250 samples for which detailed DXA-acquired body composition data exist, to study the association between specific fat depots and the metabolome.
Genomics. For each OBB participant, 3 × 5-ml aliquots of whole blood are collected and frozen at −80 °C for isolation of genomic DNA. Single nucleotide polymorphism (SNP) array data have been generated using the Illumina Infinium Human Exome Beadchip 12v1 array platform for the first consecutive 5900 DNA samples, and the Affymetrix UK Biobank Axiom Array chip for the first consecutive 7500 participants. Beyond this, high-throughput custom genotyping is facilitated by DNA being plated into 384-well format for typing on an Applied Biosystems 7900HT analyser using Applied Biosystems TaqMan® SNP genotyping chemistries, or by LGC Genomics KASP™ custom assays using KASP genotyping chemistry.
Body composition and bone mineral density assessment. Body composition is assessed using a GE Lunar iDXA and all data are analysed with Encore software (version 11.0; GE Medical Systems, Madison, WI, USA), which beyond regional body composition also includes an algorithm for quantification of visceral adipose tissue (VAT).

What has it found? Key findings and publications
The specific feature of the OBB is that all participants have provided informed consent to be re-contacted for follow-up studies. The cohort has therefore been used for both cross-sectional analyses and dedicated follow-up studies.
Findings from cross-sectional studies from the baseline data. The 7640 participants recruited so far in the OBB have a wide range of phenotypes that allow specific disease characteristics to be studied in relation to both genotype and phenotype. The percentages of various phenotypes identified at baseline, such as impaired fasting glucose (IFG), insulin resistance (IR), undiagnosed T2D and hypertension, overweight and obesity, are provided in Table 3. Results from various study designs are summarized below.

[Table footnote: IGF-1, insulin-like growth factor-1; IGFBP-1, insulin-like growth factor binding protein-1; hs-CRP, highly sensitive C-reactive protein; BMD, bone mineral density. All data are presented as median (interquartile range) or frequency (percentage).]
Cross-sectional observational studies. The paradoxical association of upper-body android and lower-body gluteofemoral fat with CVD and T2D traits was shown using precise DXA-based estimates of fat depots among 3399 individuals. 25 Using other imaging techniques such as ultrasound, the partitioning of subcutaneous abdominal tissue (SAT) into deep and superficial layers, and the functional differences between them, have been reported. 26 Studies involving postmenopausal women showed that abdominal obesity was characterized by increased CVD risk factors such as VLDL1-TG and apoB production, hepatic fat and non-HDL cholesterol, which has important implications for CVD risk in this group. 27

Recruit-by-phenotype (RbP) studies. Given the wealth of data within the baseline OBB characterization, participants can be selected on the basis of pre-defined phenotypic traits (Table 1) for investigations of complex intermediary phenotypes. These include both in vivo physiological studies and case-control studies. Several in vivo studies using the OBB have aimed at understanding adipose tissue biology, including investigations into the T2D- and CVD-protective properties of gluteofemoral fat and into fatty acid trafficking. Participants have been selected to take part in complex protocols to study the metabolic physiology of the femoral adipose tissue depot. 28 Using stable isotope-labelled metabolic tracers combined with arterio-venous sampling techniques, it has been found that: (i) muscle and adipose tissue handle fatty acid uptake very differently; 29 and (ii) gluteofemoral adipose depots exhibit lower lipolytic activity 30 and, in relative terms, greater extraction of lipids from ectopic fat deposition. This could explain some of the CVD- and T2D-protective effects seen with expansion of this fat depot. 31-34 Deep physiological characterization of patients with rare genetic conditions requires access to carefully matched healthy controls, for which OBB participants have been used. Examples include familial combined hyperlipidaemia (FCHL), 35 Chuvash polycythaemia, 36 PTEN mutations 37 and extreme high bone mass. 38 Equally, in common disorders where pair-matching is essential for study design, OBB participants have been recruited as controls for studies of polycystic ovary syndrome 39,40 and insulin resistance. 41

Recruit-by-genotype (RbG) studies. The first use of the OBB for RbG studies was the in vivo physiological characterization of adipose tissue function according to PPARG Pro12Ala carrier status among 42 age- and BMI-matched individuals. The matching for BMI was done to isolate the effect of the PPARG genotype on metabolic phenotype from a potential adiposity effect. Obese individuals carrying the T2D-protective Ala12 variant have higher adipose tissue blood flow than Pro12 carriers. 42 The apolipoprotein-E (APOE) epsilon 4 variant, a risk variant for Alzheimer's disease, has been investigated for its effect on brain blood flow in relation to memory testing in age- and sex-matched participants from the OBB. 43 The physiological consequences of a PPP1R3A gene variant, identified in relation to digenic inheritance of partial lipodystrophy, 44 were tested using the RbG concept. 45 Besides metabolic disorders, the availability of large-scale genotype data has also enabled the use of the OBB in the investigation of other diseases. Using the RbG approach, we recently showed a protective effect against autoimmune diseases among homozygous carriers of a tyrosine kinase-2 (TYK2) variant. 46 An updated list of publications from the OBB is available at [https://scholar.google.co.uk/citations?hl=en&user=xPs_QwMAAAAJ].
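The age and BMI matching used in recall studies such as the PPARG investigation can be illustrated with a simple greedy pairing. This is a sketch under stated assumptions, not the procedure used in the cited studies: the caliper widths (3 years, 1.5 kg/m²), the preference for the closest BMI and all participant data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volunteer:
    """Hypothetical volunteer record used only for this illustration."""
    participant_id: str
    age: float
    bmi: float


def match_pairs(carriers, non_carriers, max_age_gap=3.0, max_bmi_gap=1.5):
    """Greedy 1:1 matching of carriers to non-carriers within age and BMI calipers.

    Each non-carrier is used at most once; carriers with no eligible
    partner are left unmatched. Caliper defaults are illustrative.
    """
    available = list(non_carriers)
    pairs = []
    for carrier in carriers:
        candidates = [c for c in available
                      if abs(c.age - carrier.age) <= max_age_gap
                      and abs(c.bmi - carrier.bmi) <= max_bmi_gap]
        if not candidates:
            continue
        # Prefer the closest BMI, breaking ties on the closest age
        best = min(candidates,
                   key=lambda c: (abs(c.bmi - carrier.bmi), abs(c.age - carrier.age)))
        available.remove(best)
        pairs.append((carrier, best))
    return pairs


# Invented example data
carriers = [Volunteer("A1", 40, 31.0), Volunteer("A2", 35, 24.0)]
non_carriers = [Volunteer("P1", 41, 30.5), Volunteer("P2", 36, 27.0), Volunteer("P3", 42, 31.2)]

pairs = match_pairs(carriers, non_carriers)
matched_ids = [(c.participant_id, m.participant_id) for c, m in pairs]
print(matched_ids)
```

A greedy pass is easy to follow but depends on the order in which carriers are processed; an optimal assignment method may be preferable when eligible controls are scarce.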
What are the main strengths and weaknesses?
The strength of the cohort is in the triumvirate of detailed baseline characterization of a large random healthy population, the density of the genomic characterization and the recall capability. The cohort is not designed as a prospective follow-up cohort, and the phenotypic baseline characterization is dominated by metabolic measurements. The age range is limited to 30-50 years, and people with overt disease are excluded. We acknowledge that exclusion of T2D and CVD cases enriched for genotypes of interest may introduce spurious associations due to collider effects and selection bias, particularly in genetic association studies and GWAS. 47,48 Care should be taken to use appropriate statistical methods to account for such bias. However, these effects are likely to be reasonably small given that the upper age limit in the cohort is 50 years.
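The selection effect described above can be made concrete with a toy simulation. It is only a sketch: the risk genotype, the second risk factor, their frequencies and the disease probabilities are invented and deliberately exaggerated, so the size of the distortion says nothing about its magnitude in the OBB. Genotype and exposure are generated independently, both raise disease risk, and excluding diseased individuals induces an inverse association between them.

```python
import random


def odds_ratio(pairs):
    """Odds ratio between two binary variables given (genotype, exposure) pairs."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for genotype, exposure in pairs:
        counts[(genotype, exposure)] += 1
    return (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])


def simulate(n=200_000, seed=1):
    """Return (odds ratio in everyone, odds ratio in disease-free individuals)."""
    rng = random.Random(seed)
    everyone, disease_free = [], []
    for _ in range(n):
        genotype = int(rng.random() < 0.3)   # independent of exposure by construction
        exposure = int(rng.random() < 0.3)
        # Illustrative, exaggerated disease risk raised by both factors
        diseased = rng.random() < 0.05 + 0.3 * genotype + 0.3 * exposure
        everyone.append((genotype, exposure))
        if not diseased:
            disease_free.append((genotype, exposure))
    return odds_ratio(everyone), odds_ratio(disease_free)


full_or, selected_or = simulate()
print(f"all individuals: {full_or:.2f}; disease-free only: {selected_or:.2f}")
```

In the full simulated population the odds ratio is close to 1, whereas in the disease-free subset it falls below 1, which is the kind of spurious association that selection on disease status can produce.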
Can I get hold of the data? Where can I find out more?
The OBB is open for collaborative studies with academic and commercial partners after research protocols have been accepted by the OBB steering committee. Rules of engagement and contact with the OBB team can be found on the website [www.oxfordbiobank.org.uk].

OBB in a nutshell
• The Oxford Biobank is a population-based repository of biological material and health-related information on ~8000 healthy participants, men and women aged 30-50 years, from Oxfordshire, UK.
• The bioresource includes a broad range of cardiovascular- and obesity-related phenotypes including biochemical and genetic biomarkers, anthropometric measurements and body composition assessed using dual-energy X-ray absorptiometry.
• The cohort is specifically designed to allow future dedicated recall studies based on baseline phenotype and genotype.
• With that capacity, the Oxford Biobank is a resource for mechanistic research of genetic and phenotypic traits in a broad range of chronic diseases such as cardiovascular disease, type 2 diabetes and obesity complications.
• Researchers interested in using the cohort should go through the online portal [www.oxfordbiobank.org.uk].