Furahini Tluway, Godfred Agongo, Vukosi Baloyi, Palwende Romuald Boua, Isaac Kisiangani, Moussa Lingani, Reneilwe Given Mashaba, Shukri F Mohamed, Engelbert A Nonterah, Cairo Bruce Ntimana, Toussaint Rouamba, Theophilous Mathema, Siyanda Madala, Dylan G Maghini, Ananyo Choudhury, Nigel J Crowther, Scott Hazelhurst, Dhriti Sengupta, Patrick Ansah, Solomon Simon Rampai Choma, Cornelius Debpuur, F Xavier Gómez-Olivé, Kathleen Kahn, Lisa K Micklesfield, Shane A Norris, Abraham R Oduro, Hermann Sorgho, Paulina Tindana, Halidou Tinto, Stephen Tollman, Alisha Wade, Michèle Ramsay, as members of AWI-Gen and the H3Africa Consortium , Cohort Profile: Africa Wits-INDEPTH partnership for Genomic studies (AWI-Gen) in four sub-Saharan African countries, International Journal of Epidemiology, Volume 54, Issue 1, February 2025, dyae173, https://doi.org/10.1093/ije/dyae173
Africa Wits-INDEPTH partnership for Genomic studies (AWI-Gen) was established to examine genomic, environmental and behavioural factors influencing body composition and cardiometabolic diseases and traits in African populations.
Population-based longitudinal cohort, involving four sub-Saharan African countries representing rural and urban settings: West Africa [Nanoro (Burkina Faso) and Navrongo (Ghana)]; East Africa [Nairobi (Kenya)]; and South Africa (Agincourt, Dikgale/DIMAMO and Soweto).
Baseline data collected from 2013 to 2017, enrolling 12 032 adults [55.1% women; mean (SD) 51.9 (8.3) years, range 37–82], with follow-up ∼5 years later and a retention rate of almost 60%. In Wave 2 an additional 579 individuals were enrolled (n = 7804; 55.9% women; 57.0 (7.9), 39–98).
Main categories of data collected at two time points include sociodemographic characteristics, history of chronic diseases and lifestyle behaviours, with spirometry, cognition and frailty added in the second data collection wave. Measurements at both time points include anthropometry, blood pressure, carotid intima–media thickness and body fat distribution. Blood and urine samples are collected to measure biomarkers for diabetes, HIV, dyslipidaemia and kidney disease, and stool samples are collected in a women’s sub-sample for gut microbiome analyses. Genome-wide genotyping is available for all participants and whole genome sequences for a subset.
We encourage collaboration, and data are accessible through the AWI-Gen Principal Investigator [[email protected]] in consultation with the steering committee or the H3Africa Data and Biospecimen Access Committee.
Why was the cohort set up?
Cardiometabolic diseases (CMDs) are on the rise worldwide and are leading causes of premature mortality,1 a trend also observed in low- and middle-income countries (LMICs) including those in Africa.2 There is a paucity of data from African countries on CMD prevalence, particularly on the genetic predisposition to common diseases, despite the recognition that African populations have high genetic diversity and unique population substructure.3–5 To address this gap, we set up the Africa Wits-INDEPTH partnership for Genomic studies (AWI-Gen).
The AWI-Gen population cohort, initiated in 2012, was funded by the National Institutes of Health (NIH) as part of the Human Heredity and Health in Africa (H3Africa) Consortium.6 AWI-Gen is a collaboration between the University of the Witwatersrand (Wits) in Johannesburg and the International Network for the Demographic Evaluation of Populations and Their Health (INDEPTH). Five of the six study sites reside within existing health and demographic surveillance systems (HDSS). The remaining site is a Wits research unit in urban Soweto, Johannesburg.
AWI-Gen’s main goal is to provide a research resource to investigate the prevalence of CMD and associated risk factors, and to explore gene–gene and gene–environment interactions.7,8 Additional aims are to strengthen the capacity for genomic research in Africa and to understand the key ethical, legal and sociocultural issues that would affect implementation.
In 2012, a meeting was held with ethics committee members to consider appropriate models for informed consent, data sharing and community benefit.9 Community sensitization was followed by context-relevant stepwise permissions and individualized consent before entry into the study. The approach of broad consent for the future use of data and biological samples was agreed on by the research team and was also the preferred model adopted by H3Africa. The AWI-Gen process was documented at all sites, highlighting community engagement experiences and lessons learned.10,11 General health awareness and education were provided and point-of-care tests performed (glucose, lipids and HIV, the latter with supportive trained counsellors). Participants with health concerns (e.g. high blood pressure and high fasting glucose levels) were referred to the local health facilities with explanatory notes. Appropriate feedback was provided in reasonable time frames, and ongoing engagement and awareness of research progress were shared with communities.
The longitudinal nature of the cohort permits assessment of changes in measurements and behaviour over about a 5-year period, and their impact on cardiometabolic endpoints. Knowledge gained from the AWI-Gen study has the potential for translational impact, including policy briefs and development of strategies for precision medicine applications in participating communities.
Who is in the cohort?
AWI-Gen study participants are from five INDEPTH member HDSS centres and one urban research unit in Africa, with diverse representation of populations from rural and urban environments in West, East and South Africa.7,8 The HDSS centres are Nanoro (Burkina Faso),12 Navrongo (Ghana),13 Nairobi (Kenya),14 Agincourt (South Africa)15 and Dikgale (later renamed DIMAMO) (South Africa);16 the research unit is the SAMRC/Wits Developmental Pathways for Health Research Unit (DPHRU) (Soweto, South Africa).17 The geographical location, number and biological sex distribution of participants from each study centre, and a timeline for training, community engagement and recruitment, are shown in Figure 1, with study participant characteristics in Table 1.

Figure 1. Description of the AWI-Gen study across two waves of data collection. (A) Location of the six study sites, comprising five health and demographic surveillance systems (HDSS) and the Developmental Pathways for Health Research Unit (DPHRU), showing the number of participants recruited in Wave 1 and the number recalled in Wave 2. Note that the DIMAMO cohort was enhanced in Wave 2 due to the lower recruitment numbers in Wave 1. (B) Number of men and women per study centre, with the augmentation in DIMAMO shown with cross-hatching. (C) Timeline showing community engagement, which is still ongoing, and recruitment during Wave 1 and Wave 2. Recruitment started earlier in Soweto since we onboarded women from a study on menopause which was already on the files from 2012. Six PhD students graduated through the Africa Wits-INDEPTH partnership for Genomic studies (AWI-Gen) study and a further three are in progress. AWI-Gen meetings coincided with some of the Human Heredity and Health in Africa (H3Africa) consortium meetings, and training workshops were held to ensure that data were collected according to the same protocols and standard operating procedures. The colour coding per centre is consistent across all figures.
Table 1. Africa Wits-INDEPTH partnership for Genomic studies (AWI-Gen) study site and study characteristics

| | Nanoro | Navrongo | Nairobi | Dikgale/DIMAMO | Agincourt | Soweto |
|---|---|---|---|---|---|---|
| Country | Burkina Faso | Ghana | Kenya | South Africa | South Africa | South Africa |
| Study site characteristics | | | | | | |
| HDSS catchment area, km² | 594 | 1675 | 6 | 545 | 420 | 200 |
| HDSS catchment area population, thousands | 63 | 156 | 75 | 36 | 115 | 1200 |
| Population density per km² | 105 | 91 | 14 833 | 113 | 274 | 6357 |
| Map coordinates | 12.57N, 1.92W | 10.89N, 1.09W | 1.25S, 36.89E | 23.65S, 29.65E | 24.50S, 31.08E | 26.24S, 27.84E |
| | 12.72N, 2.31W | | 1.31S, 36.87E | 23.90S, 29.85E | 24.56S, 31.25E | |
| Altitude, m | 313 | 196 | 1790 | 1250 | 400–600 | 1632 |
| AWI-Gen study characteristics | | | | | | |
| Wave 1 (n = 12 032) | | | | | | |
| Participants (n) | 2097 | 2016 | 2003 | 1399 | 2486 | 2031 |
| Female (%) | 49.6 | 54.2 | 53.9 | 69.4 | 57.8 | 49.3 |
| Mean (SD) age (years) | 49.8 (5.9) | 51.1 (5.8) | 49.1 (6.1) | 52.2 (8.3) | 58.7 (11.0) | 49.3 (5.9) |
| Age range (years) | 40–62 | 41–65 | 38–67 | 37–82 | 41–81 | 40–69 |
| Wave 2 (n = 7804) | | | | | | |
| Participants (n) | 1498 | 1208 | 1189 | 1239ᵇ | 1255 | 1415 |
| Female (%) | 52.6 | 57.4 | 58.4 | 57.6 | 61.1 | 49.8 |
| Mean (SD) age (years) | 55.0 (5.8) | 57.0 (5.7) | 54.3 (6.1) | 58.4 (9.4) | 62.6 (10.5) | 55.3 (6.0) |
| Age range (years) | 45–67 | 47–71 | 43–73 | 39–98 | 46–86 | 44–73 |
| Retention in Wave 2: n (%) | 599 (28.6) | 808 (40.1) | 814 (40.6) | 739 (52.8) | 1231 (49.5) | 616 (30.0)ᵃ |
HDSS, health and demographic surveillance system; SD, standard deviation.
ᵃ For the Soweto cohort, recall was finalized when 70% of participants had been recalled into Wave 2. Not all participants were recontacted.
ᵇ For the Dikgale/DIMAMO cohort an additional 579 participants were recruited in Wave 2.
At baseline (Wave 1), AWI-Gen was a population-based cross-sectional survey that aimed to enrol 2000 participants aged 40–60 years, with approximately equal numbers of men and women, from each of the six study centres. Random sampling was used to select participants from existing populations based at the HDSS. Each study centre used mechanisms to minimize selection bias. Exclusion criteria were close relatedness to another participant, pregnancy and residence in the community for <10 years. Closely related individuals were defined as first-degree relatives; despite this exclusion, genetic analysis revealed significant relatedness among participants in some centres. Ethics approval was obtained from the Wits Human Research Ethics Committee (Medical) and participating institutions and/or national ethics committees. All participants gave informed consent before taking part in the study.
We enrolled 12 032 participants, with 88.1% between 40 and 60 years of age. Agincourt had 447 individuals >60 years of age due to harmonization with the Health and Aging in Africa: A Longitudinal Study in South Africa (HAALSI) cohort.8 Participants were selected using purposeful sampling based on existing population sample frames from the respective study sites; details are provided in Ali et al.8 and in the Supplementary Material (available as Supplementary data at IJE online).
How often have they been followed up?
Data and sample collection was conducted at two time points (Figure 1A and Table 1). During Wave 1 (2013–17), 12 032 participants were enrolled. Wave 2 of data collection took place from 2018 to 2022 as a closed cohort, with one exception: the Dikgale/DIMAMO study centre was augmented with 579 new participants. A 60% retention of the original cohort was achieved (range 47–72% across centres), and data and samples were collected from 7804 participants (7225 recalled and 579 new participants) in Wave 2 (Table 1). Those remaining in the cohort were on average younger [50.0 (34–82) years vs 5229–81] and more often female (57.1% vs 52.0% women).
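As a rough check on the retention figures quoted above, the following Python sketch recomputes per-centre and overall retention from the participant counts in Table 1. The function and variable names are illustrative assumptions and are not part of any AWI-Gen tooling; the only inputs are the Wave 1 and Wave 2 counts and the 579 participants newly recruited at Dikgale/DIMAMO. The counts printed in the last row of Table 1 equal Wave 1 enrolment minus the number recalled (for example, 2097 − 1498 = 599 for Nanoro), so the sketch reports both quantities.

```python
# Participant counts per study centre, copied from Table 1.
WAVE1 = {"Nanoro": 2097, "Navrongo": 2016, "Nairobi": 2003,
         "Dikgale/DIMAMO": 1399, "Agincourt": 2486, "Soweto": 2031}
WAVE2 = {"Nanoro": 1498, "Navrongo": 1208, "Nairobi": 1189,
         "Dikgale/DIMAMO": 1239, "Agincourt": 1255, "Soweto": 1415}
# Participants first recruited in Wave 2 (not part of the original cohort).
NEW_IN_WAVE2 = {"Dikgale/DIMAMO": 579}


def retention_by_site(wave1, wave2, new_in_wave2):
    """Return {site: (recalled, not_recalled, retention_percent)}."""
    result = {}
    for site, enrolled in wave1.items():
        recalled = wave2[site] - new_in_wave2.get(site, 0)
        not_recalled = enrolled - recalled
        percent = round(100 * recalled / enrolled, 1)
        result[site] = (recalled, not_recalled, percent)
    return result


by_site = retention_by_site(WAVE1, WAVE2, NEW_IN_WAVE2)
total_recalled = sum(recalled for recalled, _, _ in by_site.values())
overall_percent = round(100 * total_recalled / sum(WAVE1.values()), 1)

for site, (recalled, not_recalled, percent) in by_site.items():
    print(f"{site}: recalled {recalled}, not recalled {not_recalled}, "
          f"retention {percent}%")
print(f"Overall: {total_recalled} recalled, retention {overall_percent}%")
```

Under these inputs the per-centre values span roughly 47% (Dikgale/DIMAMO) to 71% (Nanoro), and the recalled total is 7225 of 12 032, in line with the figures given in the text.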
What has been measured?
The AWI-Gen cohort variables, collection instruments and protocols for Wave 1 are described in detail in Ali et al.8 and include sociodemographic characteristics, lifestyle behaviour, medical history of chronic and infectious diseases, anthropometry, body fat distribution, blood pressure and blood and urine biomarkers. In Wave 2, additional variables and measurements were included (Table 2).
Table 2. Data and variables collected during Africa Wits-INDEPTH partnership for Genomic studies (AWI-Gen) Wave 1 (baseline) and Wave 2

| Data domain | Variables | Wave 1 | Wave 2 |
|---|---|---|---|
| Phenotype data collected | | | |
| Sociodemographic characteristics | Age at enrolment, sex, ethno-linguistics, country, level of education, employment status, marital status | Yes | Yes |
| Family composition and household attributes | Number of siblings and children, household size and other socioeconomic characteristics | Yes | Yes |
| Reproductive history and menopause status (women) | Number of pregnancies and live births, menstrual and family planning history, date of final menstrual period | Yes | Yes |
| Lifestyle behaviours and exposure to pesticides | Diet, physical activity, sleep, tobacco and alcohol use, exposure to pesticides | Yes | Yes |
| Infectious diseases | Tuberculosis, HIV and malaria | Yes | Yes |
| History of chronic diseases and risk factors | History of diabetes, hypertension, cardiac diseases, chronic kidney disease, dyslipidaemia, stroke, cancers | Yes | Yes |
| Cognition and frailty | Immediate and delayed recall, orientation, grip strength, chair rise sit to stand, gait speed | No | Yes |
| Trauma | History of assault, injuries, serious illness, domestic violence, bereavement, loss of employment | No | Yes |
| Respiratory healthᵃ | History of upper and lower respiratory infections, asthma | No | Yes |
| Microbiomeᵃ | History of antibiotic use, diarrhoea, intestinal worms, probiotic use | No | Yes |
| Measurements | | | |
| Anthropometry | Weight, height, waist and hip circumferences | Yes | Yes |
| Vital signs | Blood pressure, pulse | Yes | Yes |
| Ultrasound measurements | Visceral and subcutaneous adipose tissue, left and right intima–media thickness | Yes | Yes |
| Spirometryᵃ | Spirometry test | No | Yes |
| Biomarkers | | | |
| Blood | Serum HDL-C, LDL-C and triglycerides, serum creatinine, insulin and fasting blood glucose, sex hormone measurementsᵃ | Yes | Yes |
| Urine | Albumin, creatinine and total protein | Yes | Yes |
| Stoolᵃ | Gut microbiome characterization | Yes | Yes |
HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol.
ᵃ These variables are available for a subset of the AWI-Gen cohort.
Genomic data: about 11 000 AWI-Gen participants were genotyped on the H3Africa Consortium single nucleotide variant (SNV) genotyping array, and additional genotypes imputed using the African Genome Reference panel at the Sanger Imputation Service,18 with a final dataset of 10 603 participants and 13.9 million SNVs. About 100 South African participants and 60 from Burkina Faso and Ghana have whole genome sequence (WGS) data (30x coverage), and eight have PacBio WGS data (10–20x). A further ∼600 South African participants have WGS (4x) data and matching transcriptome data from RNA extracted from whole blood (study performed in collaboration with Variant Bio). There are 16S and shotgun sequence data from gut microbiome samples of 169 participants at two study sites from Wave 1, and shotgun sequence data from 1824 participants at six study sites (overlap of 76) from Wave 2.
What has it found?
Demographic characterization and key variables at baseline (Wave 1) and at follow-up roughly 5 years later (Wave 2) are shown in Figure 2 and Table 3. The age distributions are shown across the two waves of data collection (Figure 2A) and the genetically informed population sub-structure is demonstrated in a plot based on principal component analysis (Figure 2B). The percentage of individuals with specific risk factors [body mass index (BMI); Figure 2C] and CMD outcomes (obesity and overweight, diabetes, chronic kidney disease and hypertension; Figure 2D) increased over the 5-year period, as would be expected for participants in this age group.

Figure 2. Outcomes from the longitudinal AWI-Gen study highlighting genetic diversity and key cardiometabolic disease risk factors and endpoints. (A) Distribution of ages of participants in Wave 1 and Wave 2. (B) Principal component (PC) analysis using genome-wide genotyping data from the H3Africa single nucleotide polymorphism (SNP) array, showing the clustering of participants from the South, East and West Africa study centres. It highlights the population sub-structure and emphasizes the need to adjust for ancestry in genome-wide association studies. (C) Distribution of body mass index (BMI) in women (F) and men (M) in Wave 1 and Wave 2, noting the higher BMI in women and, overall, in the East and South African cohorts. Note that outliers and individuals with BMI >70 (n = 2) were excluded from this visualization. Lower and upper hinges correspond to the first and third quartiles, and whiskers indicate the largest and smallest values within 1.5 × the interquartile range. (D) Prevalence (%) in the combined cohort of four key cardiometabolic outcomes (diabetes, obesity and overweight, chronic kidney disease and hypertension) at two time points (Wave 1 and Wave 2). Almost all conditions have a higher prevalence at Wave 2, when participants are about 5 years older. Note the different scales for the percentages.
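The box-plot convention described for panel C (hinges at the first and third quartiles, whiskers at the most extreme values within 1.5 times the interquartile range of the hinges) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used to draw the figure: the quartile interpolation method (`method="inclusive"`) is an assumption, the function name `box_stats` is hypothetical, and the BMI values are made-up numbers, not AWI-Gen data.

```python
from statistics import quantiles


def box_stats(values):
    """Return hinges, whisker ends and outliers for a box plot."""
    data = sorted(values)
    # Quartiles; the interpolation method is an assumption of this sketch.
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "lower_hinge": q1,
        "upper_hinge": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": outliers,
    }


# Hypothetical BMI values (kg/m^2), for illustration only.
example_bmi = [18.0, 20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 32.0, 60.0]
stats = box_stats(example_bmi)
print(stats)
```

In this made-up sample the value 60.0 lies beyond the upper fence, so it would be classed as an outlier and the upper whisker would stop at 32.0.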
Table 3. Demographic characteristics and key outcome variables for AWI-Gen at baseline (Wave 1) and follow-up (Wave 2), roughly 5 years later, comparing men and women

| Key variable | Wave 1: Total n (%) | Wave 1: Male n (%) | Wave 1: Female n (%) | Wave 2: Total n (%) | Wave 2: Male n (%) | Wave 2: Female n (%) |
|---|---|---|---|---|---|---|
| Age groups in years (n) | 12 032 | | | 7807 | | |
| <50 | 5164 (42.9) | 2365 (43.8) | 2799 (42.2) | 1441 (18.5) | 702 (20.4) | 739 (16.9) |
| 50–60 | 5535 (46.0) | 2431 (45.0) | 3104 (46.8) | 3884 (49.8) | 1625 (47.2) | 2259 (51.8) |
| >60 | 1333 (11.1) | 609 (11.3) | 724 (10.9) | 2482 (31.8) | 1116 (32.4) | 1366 (31.3) |
| Age in years (n) | 12 032 | | | 7807 | | |
| Mean (SD) | 51.9 (8.3) | 51.8 (8.4) | 52.0 (8.3) | 57.0 (7.9) | 57.0 (8.2) | 57.1 (7.7) |
| Body mass index (n) | 11 945 | | | 7715 | | |
| Underweight | 1317 (11.0) | 744 (13.9) | 573 (8.7) | 984 (12.7) | 555 (16.2) | 429 (10.0) |
| Normal | 5706 (47.8) | 3246 (60.5) | 2460 (37.4) | 3338 (43.3) | 1863 (54.4) | 1475 (34.4) |
| Overweight | 2371 (19.9) | 971 (18.1) | 1400 (21.3) | 1574 (20.4) | 690 (20.2) | 884 (20.6) |
| Obese | 2551 (21.4) | 409 (7.6) | 2142 (32.6) | 1819 (23.6) | 314 (9.2) | 1505 (35.1) |
| Hypertension (n) | 12 032 | | | 7734 | | |
| No | 7494 (62.3) | 3513 (65.0) | 3981 (60.1) | 4158 (53.8) | 1879 (54.7) | 2279 (53.0) |
| Yes | 4538 (37.7) | 1892 (35.0) | 2646 (39.9) | 3576 (46.2) | 1553 (45.3) | 2023 (47.0) |
| Diabetes (n) | 12 032 | | | 7748 | | |
| No | 11 240 (93.4) | 5100 (94.4) | 6140 (92.7) | 6814 (87.9) | 3068 (89.2) | 3746 (86.9) |
| Yes | 792 (6.6) | 305 (5.6) | 487 (7.4) | 934 (12.1) | 372 (10.8) | 562 (13.1) |
| Chronic kidney disease (n) | 9836 | | | 6862 | | |
| No | 8648 (87.9) | 4295 (88.9) | 4353 (87.0) | 5454 (79.5) | 2482 (80.6) | 2972 (78.6) |
| Yes | 1188 (12.1) | 539 (11.1) | 649 (13.0) | 1408 (20.5) | 599 (19.4) | 809 (21.4) |
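To make the change between waves concrete, the short Python sketch below recomputes the prevalence of four outcomes from the total counts in Table 3 and reports the difference in percentage points. The dictionary layout and the function name `prevalence_change` are illustrative assumptions. The result is a crude comparison of two cross-sectional prevalences: the Wave 2 sample differs from the Wave 1 sample through attrition and the newly recruited Dikgale/DIMAMO participants, so the differences are not incidence estimates.

```python
# (cases, participants with data) for the combined cohort, from Table 3.
TABLE3_TOTALS = {
    "Obesity": {"wave1": (2551, 11945), "wave2": (1819, 7715)},
    "Hypertension": {"wave1": (4538, 12032), "wave2": (3576, 7734)},
    "Diabetes": {"wave1": (792, 12032), "wave2": (934, 7748)},
    "Chronic kidney disease": {"wave1": (1188, 9836), "wave2": (1408, 6862)},
}


def prevalence_change(totals):
    """Return {outcome: (wave1 %, wave2 %, difference in percentage points)}."""
    change = {}
    for outcome, waves in totals.items():
        cases1, n1 = waves["wave1"]
        cases2, n2 = waves["wave2"]
        p1 = 100 * cases1 / n1
        p2 = 100 * cases2 / n2
        change[outcome] = (round(p1, 1), round(p2, 1), round(p2 - p1, 1))
    return change


changes = prevalence_change(TABLE3_TOTALS)
for outcome, (p1, p2, diff) in changes.items():
    print(f"{outcome}: {p1}% -> {p2}% ({diff:+} percentage points)")
```

With these counts, every outcome shows a higher prevalence at Wave 2, with the largest differences for hypertension and chronic kidney disease.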
. | Wave 1 . | Wave 2 . | ||||
---|---|---|---|---|---|---|
Key variable . | Total n (%) . | Male n (%) . | Female n (%) . | Total n (%) . | Male n (%) . | Female n (%) . |
Age groups in years | 12 032 | 7807 | ||||
<50 | 5164 (42.9) | 2365 (43.8) | 2799 (42.2) | 1441 (18.5) | 702 (20.4) | 739 (16.9) |
50–60 | 5535 (46.0) | 2431 (45.0) | 3104 (46.8) | 3884 (49.8) | 1625 (47.2) | 2259 (51.8) |
>60 | 1333 (11.1) | 609 (11.3) | 724 (10.9) | 2482 (31.8) | 1116 (32.4) | 1366 (31.3) |
Age in years | 12 032 | 7807 | ||||
Mean (SD) | 51.9 (8.3) | 51.8 (8.4) | 52.0 (8.3) | 57.0 (7.9) | 57.0 (8.2) | 57.1 (7.7) |
Body mass index | 11 945 | 7715 | ||||
Underweight | 1317 (11.0) | 744 (13.9) | 573 (8.7) | 984 (12.7) | 555 (16.2) | 429 (10.0) |
Normal | 5706 (47.8) | 3246 (60.5) | 2460 (37.4) | 3338 (43.3) | 1863 (54.4) | 1475 (34.4) |
Overweight | 2371 (19.9) | 971 (18.1) | 1400 (21.3) | 1574 (20.4) | 690 (20.2) | 884 (20.6) |
Obese | 2551 (21.4) | 409 (7.6) | 2142 (32.6) | 1819 (23.6) | 314 (9.2) | 1505 (35.1) |
Hypertension | 12 032 | 7734 | ||||
No | 7494 (62.3) | 3513 (65.0) | 3981 (60.1) | 4158 (53.8) | 879 (54.7) | 2279 (53.0) |
Yes | 4538 (37.7) | 1892 (35.0) | 2646 (39.9) | 3576 (46.2) | 1553 (45.3) | 2023 (47.0) |
Diabetes | 12 032 | 7748 | ||||
No | 11240 (93.4) | 5100 (94.4) | 6140 (92.7) | 6814 (87.9) | 3068 (89.2) | 3746 (86.9) |
Yes | 792 (6.6) | 305 (5.6) | 487 (7.4) | 934 (12.1) | 372 (10.8) | 562 (13.1) |
Chronic kidney disease | 9836 | 6862 | ||||
No | 8648 (87.9) | 4295 (88.9) | 4353 (87.0) | 5454 (79.5) | 2482 (80.6) | 2972 (78.6) |
Yes | 1188 (12.1) | 539 (11.1) | 649 (13.0) | 1408 (20.5) | 599 (19.4) | 809 (21.4) |
Demographic characteristics and key outcome variables for AWI-Gen at baseline (Wave 1) and follow-up (Wave 2), roughly 5 years later, comparing men and women
At baseline, we observed the highest prevalence of obesity (BMI ≥30 kg/m²) at the three South African study sites (42.3–66.6% in women and 2.81–17.5% in men) and the lowest in West Africa (1.2–4.2% in women and 1.19–2.20% in men).19 The South African study sites had a mean prevalence of hypertension of over 40.0%, followed by Nairobi (25.6%) and Navrongo (24.5%) [hypertension was defined as self-report and/or systolic blood pressure (SBP) ≥140 mmHg and/or diastolic blood pressure (DBP) ≥90 mmHg].20 Indications of kidney damage [estimated glomerular filtration rate (eGFR) <60 mL/min per 1.73 m² and/or albuminuria (urine albumin:creatinine ratio >3 mg/mmol)] were present in 10.7% of participants,21 and diabetes prevalence was 5.5% (95% CI 4.4% to 6.5%) (defined as self-report and/or fasting plasma glucose ≥7 mmol/L and/or random plasma glucose ≥11.1 mmol/L).22 A cross-sectional comparison of CMD risk factors between pre- and postmenopausal women in Wave 1 highlighted that differences were more prominent in the West African cohorts despite, or perhaps because of, the higher prevalence of CMD risk factors at all ages in the South and East African cohorts.23 Multimorbidity was common across the AWI-Gen study sites, with an overall prevalence of 47.2% in women and 35.0% in men.24 Commonly used 10-year CVD risk calculators were applied to all AWI-Gen participants; estimated risk varied from 2.6% to 6.5% and differed between study sites, with the highest risk in South Africa.25 These data highlighted the inappropriateness of current CVD risk calculators for use in an African setting. Each of the AWI-Gen publications highlights differences between the study sites, with the South African sites having a higher prevalence of CMD indicators as they are further along the health epidemiological transition, and the West African sites having the lowest levels.
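The case definitions quoted above can be written down as simple threshold rules. The following is a minimal sketch in Python, not the AWI-Gen analysis code: the `Participant` record and its field names are hypothetical, and the handling of missing measurements (a missing value never triggers a definition) is an assumption. The kidney threshold of 3 mg/mmol is applied here to the urine albumin:creatinine ratio.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # Field names are hypothetical; they are not AWI-Gen variable names.
    bmi: float                                # kg/m^2
    sbp: float                                # systolic blood pressure, mmHg
    dbp: float                                # diastolic blood pressure, mmHg
    self_reported_hypertension: bool = False
    self_reported_diabetes: bool = False
    fasting_glucose: Optional[float] = None   # plasma glucose, mmol/L
    random_glucose: Optional[float] = None    # plasma glucose, mmol/L
    egfr: Optional[float] = None              # mL/min per 1.73 m^2
    acr: Optional[float] = None               # urine albumin:creatinine, mg/mmol


def is_obese(p: Participant) -> bool:
    """Obesity: BMI >= 30."""
    return p.bmi >= 30


def has_hypertension(p: Participant) -> bool:
    """Self-report and/or SBP >= 140 mmHg and/or DBP >= 90 mmHg."""
    return p.self_reported_hypertension or p.sbp >= 140 or p.dbp >= 90


def has_diabetes(p: Participant) -> bool:
    """Self-report and/or fasting glucose >= 7 or random glucose >= 11.1 mmol/L."""
    fasting = p.fasting_glucose is not None and p.fasting_glucose >= 7.0
    random = p.random_glucose is not None and p.random_glucose >= 11.1
    return p.self_reported_diabetes or fasting or random


def has_kidney_damage_indicator(p: Participant) -> bool:
    """eGFR < 60 mL/min per 1.73 m^2 and/or albuminuria (ratio > 3 mg/mmol)."""
    low_egfr = p.egfr is not None and p.egfr < 60
    albuminuria = p.acr is not None and p.acr > 3
    return low_egfr or albuminuria


# Illustrative participant with made-up values (not study data).
example = Participant(bmi=31.0, sbp=128, dbp=92,
                      fasting_glucose=6.2, egfr=75, acr=4.5)
flags = {
    "obese": is_obese(example),
    "hypertension": has_hypertension(example),
    "diabetes": has_diabetes(example),
    "kidney_damage_indicator": has_kidney_damage_indicator(example),
}
print(flags)
```

Because each definition combines self-report with measured values through "and/or", a participant meets a definition as soon as any one criterion is satisfied.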
A high prevalence of HIV infection was observed at the South (14–34%) and East African (12%) study sites, with low levels in West Africa (<1%). Interactions with HIV and the impact of HIV on CMD were examined in the South and East African study sites.19
A pilot gut microbiome study in Agincourt (n = 118) and Soweto (n = 51) in Wave 1 showed important differences between the two sites, and revealed previously undescribed gut microbial diversity.26 The larger metagenomic study of 1820 female participants in Wave 2, across all six study sites, revealed unique taxonomic patterns and distributions and many novel species.27
Patterns of tobacco and alcohol use showed important regional and sex-specific differences, with 68.4% of men and 33.3% of women being current alcohol users, and 34.5% of men but only 2.1% of women being current smokers.28 We examined gene–smoking interactions and their effect on carotid intima–media thickness (CIMT) and detected sex differences.29 The analysis implicated the RCBTB1 region in gene–smoking interactions and identified novel associations. Genetic association analyses identified eight independent variants associated with either smoking initiation or cessation.30
A key objective of AWI-Gen was to perform genome-wide association studies (GWASs) with different indicators of CMD and specific outcomes. Because the cohort includes participants from West, East and Southern Africa and shows clear population structure (Figure 2B), we used two approaches to adjust for population structure during the analyses: a ‘mega-analysis’, which includes the full dataset and uses principal component-based correction for population structure; and independent GWASs for each geographical region followed by a meta-analysis. Our lipid trait GWAS compared results from the two approaches and found the outcomes to be largely similar. Our studies replicated many well-known genetic associations, suggesting global transferability of loci associated with several cardiometabolic traits, but the identification of novel associations highlighted the value of working with data from African populations. Harnessing the increased genetic diversity and the generally lower linkage disequilibrium in African populations allowed us to fine-map some well-known associated loci. The underperformance of European-based polygenic scores (PGSs) in our dataset emphasizes the need for more PGS-based research on ancestrally diverse populations. GWASs on carotid intima–media thickness (CIMT),31 lipid traits,32 hypertension and blood pressure traits33 and urinary albumin:creatinine ratio as a marker of kidney function34 have been published, and other GWASs are in the publication pipeline.
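To illustrate the second approach, the sketch below pools per-region effect estimates for a single variant using fixed-effect inverse-variance weighting. This is one standard way to meta-analyse regional GWAS results and is shown as an assumption; the text above does not state which meta-analysis model the AWI-Gen GWASs used. The function name and the example numbers are hypothetical.

```python
import math
from typing import List, Tuple


def fixed_effect_meta(estimates: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Pool (beta, standard error) pairs from independent regional GWASs.

    Each region is weighted by the inverse of its variance (1 / se^2), so
    more precise regional estimates contribute more to the pooled effect.
    Returns (pooled beta, pooled standard error, z-score).
    """
    if not estimates:
        raise ValueError("at least one regional estimate is required")
    if any(se <= 0 for _, se in estimates):
        raise ValueError("standard errors must be positive")

    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Made-up per-region estimates for one variant (not study results):
# (effect size, standard error) for West, East and South.
regional = [(0.10, 0.05), (0.20, 0.10), (0.12, 0.04)]
beta, se, z = fixed_effect_meta(regional)
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}")
```

In a mega-analysis, by contrast, all individuals are analysed in a single regression with principal components included as covariates, so no pooling step of this kind is needed.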
We explored runs of homozygosity as a marker of inbreeding depression, to assess their effect on CMD-associated traits and to examine the effect of different levels of urbanization (using night-time luminosity as a proxy marker).35 Sex effects were evident, with inbreeding having a decreasing effect in men and an increasing effect in women for BMI, subcutaneous adipose tissue, low-density lipoprotein and total cholesterol levels, with more intense effects in men.
In the South African subset of AWI-Gen, the genetic data revealed population structure that aligns with language sub-families and geography, which is largely driven by differential gene flow from the Khoe-San hunter-gatherers.5 We call for a shift away from the practice of combining all South-Eastern Bantu speakers from South Africa into a single group in health-related studies, in order to recognize the genetic differences between the groups and their potential impact on the phenotype. AWI-Gen played a key role in leading the data analysis and interpretation for the pan-African H3Africa publication in Nature in 2020 and contributed data from the West African study sites.4 Over 60 AWI-Gen papers have been published over the past 12 years.
What are the main strengths and weaknesses?
A major strength of AWI-Gen is that it capitalized on the unique characteristics of existing HDSS centres that offer established infrastructure for surveillance, research and longitudinal follow-up, and sustained contact with their communities. This provided access to participants, ensuring geographical spread with highly diverse demographic and phenotype data in addition to genome-wide genetic data. Another strength is that these populations are largely treatment naïve and, therefore, measurements reflect base physiological characteristics. Importantly, data were collected according to harmonized standard operating procedures, measurements were done using the same equipment sourced from a single supplier, laboratory assays were performed in a single laboratory, and training was conducted to ensure uniform approaches by fieldworkers and laboratory staff across the sites. A broad consent model was followed, and participants were asked for specific consent for the AWI-Gen study and broad consent for future research studies, provided these studies had ethics approval from the Wits HREC (Medical).
The two waves of data collection allow genomic association studies with disease incidence and with rates of change of key parameters that may affect onset and progression of disease. AWI-Gen and its members supported and were part of the Pan-African Bioinformatics Network for H3Africa (H3ABioNet),36,37 which contributed significantly to bioinformatics capacity and infrastructure developed to support genetics and genomic research in Africa.
Despite the significant strengths, AWI-Gen is limited to a specific age group (middle-aged adults), specific geographical locations and a ∼60% cohort retention, and therefore findings may not be generalizable to all African populations.
Can I get hold of the data? Where can I find out more?
AWI-Gen datasets from data collection Wave 1 have been submitted to the European Genome-phenome Archive (EGA) [https://www.ebi.ac.uk/ega/studies/EGAS000010024820] and can be accessed through the H3Africa Data and Biospecimen Access Committee (DBAC). Non-human microbiome data have been submitted to the European Nucleotide Archive (ENA) data repository [https://www.ebi.ac.uk/ena/browser/home] and can be accessed without restriction. Wave 2 data will also be available through the EGA. The DNA samples have been submitted to the H3Africa Biorepository in South Africa. For new collaborations and data and sample access, contact the AWI-Gen Principal Investigator Michèle Ramsay at [[email protected]] or the H3Africa DBAC [https://catalog.h3africa.org].
Ethics approval
Human Research Ethics Committee (Medical) of the University of the Witwatersrand (M121029, M16021, M170880, M2210108), with additional ethics approvals from the participating institutions and/or national ethics committees.
Data availability
See ‘Can I get hold of the data?’ above.
Supplementary data
Supplementary data are available at IJE online.
Author contributions
Implementation of field activities and supervision of data collection and project administration: G.A., P.R.B., V.B., I.K., M.L., R.G.M., S.F.M., E.A.N. and C.B.N. Data management, quality control of preliminary analyses, tables and figure generation: D.G.M., T.M., S.M. and T.R. Site principal investigators (PIs) and co-PI, project leads, senior scientists: P.A., A.C., S.S.R.C., N.J.C., S.H., F.X.G.-O., K.K., L.K.M., S.A.N., A.R.O., C.P., D.S., H.S., H.T., P.T., S.T. and A.W. The senior project manager (F.T.) and the principal investigator (M.R.) were responsible for drafting the initial version of this manuscript. All authors had the opportunity to revise and critically review the manuscript, and each accepts accountability for the accuracy and integrity of the manuscript.
Use of artificial intelligence (AI) tools
No AI tools were used in collecting and/or analysing data, in producing images or graphical elements, or in writing the paper.
Funding
The AWI-Gen Collaborative Centre is funded by the National Human Genome Research Institute (NHGRI) and the National Institute of Environmental Health Sciences (NIEHS) of the National Institutes of Health (NIH) under award number U54HG006938 and its supplements, as part of the H3Africa Consortium, and by the Department of Science and Innovation, South Africa, award number DST/CON 0056/2014. D.G.M. was supported by the NIH Fogarty Global Health Equity Scholars Program (NIH FIC D43TW010540).
Acknowledgements
We acknowledge the funders, the AWI-Gen participants, the H3Africa Consortium and the extraordinary study coordinators, field teams and investigators at all six study centres across the duration of the study.
Conflict of interest
None declared.