Cohort Profile: The Vukuzazi (‘Wake Up and Know Yourself’ in isiZulu) population science programme

and use; quality anthropometric measurements and blood pressure readings; blood sample and chest

sub-Saharan Africa. 5 In response to the changing scientific and public health priorities posed by the intersection of the HIV, TB and NCDs epidemics, the Africa Health Research Institute (AHRI) expanded the Africa Centre Demographic Information System to form the Population Intervention Programme (PIP) in 2017. 6 In 2018, Vukuzazi ('Wake up and know yourself' in isiZulu) was established as a nested closed cohort of PIP. Vukuzazi offered community-based health phenotyping and comprehensive bio-sampling to all resident adult (15 years) members of the southern part of the PIP study area, while building upon the existing wealth of demographic and HIV information collected over the previous 20 years. The objective of the programme was to determine the prevalence and overlap of infectious diseases and NCDs in the population 20 years into the HIV epidemic. Additionally, the programme aimed to create a data, image and bio-sample repository that could be used to understand the host, pathogen, social and environmental determinants of specific states of health and disease in the population, and provide the basis for follow-up intervention studies.
The Vukuzazi programme was conducted in the southern part of the PIP area, 6 (Figure 1) which corresponds with the original Africa Centre Demographic Information System surveillance area, near the market town of Mtubatuba. The area covers 438 km 2 and includes a population of approximately 12 000 households with 65 000 residents (people who spend most of their nights in the surveillance area) and 30 000 non-residents (people who are household members but spend most of their nights outside the surveillance area). About 40% of resident members are younger than 15 years. Households in the southern PIP have been under continuous demographic surveillance since 2000. 1 In 2003, annual HIV serosurveys were added to describe the demographic social and health impacts of the rapidly progressing HIV epidemic. 1 The PIP area has 11 peripheral primary health care clinics (of which seven are in the southern part), and one district hospital lies outside the study area. The location of the area covered by the Vukuzazi programme, sites where the mobile camp enrolled participants and other features of the area are shown in Figure 1.
The scientific objectives and proposed activities of the Vukuzazi programme were conceptualized and developed in consultation with the AHRI Community Advisory Board (CAB) and other local stakeholders, including elected and traditional leaders and representatives of the local Department of Health (DoH).

Who is in the cohort?
Ethical approval for the study was obtained from the Ethics Committees of the University of KwaZulu-Natal (BE560/17), London School of Hygiene & Tropical Medicine (#14722), the Partners Institutional Review Board (2018P001802), and from the University of Alabama at Birmingham (#300007237).
The eligible study population consisted of all resident members of the southern PIP area who were aged 15 years during the baseline data collection (May 2018-March 2020). Non-resident members were excluded from participation.
Participants were recruited in a two-stage process ( Figure 2): (i) a household visit during which all eligible participants were invited to participate (the invitation process); and (ii) a formal informed consent and enrolment process at the Vukuzazi health camp (Figure 3).

Invitation process at the participant's homestead
This component was scheduled approximately 1 week before the Vukuzazi mobile health camp visited an area. We

Key Features
• The Vukuzazi programme was established to address public health responses and scientific priorities in light of the convergence of the HIV, tuberculosis and non-communicable disease epidemics in South Africa.
• The programme is a nested closed cohort of the Population Intervention Programme and performed communitybased health phenotyping in the uMkhanyakude district of rural KwaZulu-Natal, South Africa, while simultaneously creating a data, image and bio-sample repository.
• The community-based phenotyping consisted of: a nurse-administered health questionnaire on personal history of HIV, tuberculosis, diabetes, hypertension, cancer, tobacco and alcohol use; quality of life; anthropometric measurements and blood pressure readings; venous blood sample for clinical testing and bio-banking; digital chest radiograph, sputum collection and sputum mycobacterial tests; as well as rectal swab collection for biobanking.
• Data for secondary analysis, access to the biological samples and imaging data are available through request on this link: [https://data.ahri.org].  used the demographic surveillance data to populate an eligibility list, compiled by fieldworkers at the most recent household survey but no more than 1 year before the Vukuzazi visit. Based on this list, the community-based study team visited the household and invited all eligible members to the Vukuzazi health camp. The team explained the rationale for the study, using an information brochure written in isiZulu which was left with the household. Additionally, the team left personalized invitation cards for each eligible member which included the individual's unique PIP identification number, a quick response (QR) barcode, appointment date and the location of the Vukuzazi mobile health camp. If no residents were home when a household was visited, three visit attempts were conducted before a household was considered non-contactable.

Informed consent and enrolment at the Vukuzazi mobile health camp
During the 22-month study data collection period, the Vukuzazi mobile health camp moved through the PIP area with the goal of conducting procedures within two kilometres of all eligible homesteads. Vukuzazi camp sites were selected based on accessibility, sufficient space and mobile connectivity. Permission was sought from the local traditional authorities and leaders before selection of the final sites. The mobile health camp stayed on average 2 days at each location. In total, the camp was set up at 78 sites over the observation period ( Figure 1). Participants presented to the health camp with their unique invitation card. After verification of their identity, they proceeded to the next station for the informed consent process and an explanation of the study components, including the risks and benefits of participation, as well as information about storage of samples for genetic analyses and future research. Those who consented were given a barcoded wristband which functioned as a unique identifier during their Vukuzazi camp interaction, checking their identifiers at each station.

Recruitment
About 39 000 individuals were eligible 1 year before data collection. At the moment of data collection, about 3000 of them had died or moved out of the study area ( Figure  4). This resulted in 36 097 individuals eligible for enrolment, of whom approximately three-quarters were contacted and accepted the invitation to visit the Vukuzazi mobile camp (about one-quarter were unable to be contacted, and of the contacted individuals only 2% refused the invitation). Among the 26 795 individuals who accepted the invitation, however, approximately one-quarter of eligible participants did not appear at the Vukuzazi camp, resulting in 18 041 individuals (50.0% of the eligible population) who enrolled in the Vukuzazi programme.
The Vukuzazi programme enrolled participants across the age range of the eligible population, with slight under-representation of the younger ages (Table 1). Females outnumbered males in the underlying population structure ( Figure 5) and enrolled in higher proportions than males ( Figure 6). Most participants resided in a rural (64%) or peri-urban (31%) area of the study area and had very high rates of unemployment (57%, Table 1). They were less highly educated, had lower rates of employment, were less likely to have out-migrated during the past 5 years and had a lower socioeconomic score compared with eligible non-participants. They were also more likely to have visited a clinic in the past year and to have ever consented to home-based HIV counselling and testing during the individual surveys (Table 1).

How often have they been followed up?
Vukuzazi is nested within the AHRI PIP, 6 which will allow for both demographic and health system follow-up of participants. Participants will be observed prospectively three times per year through the PIP demographic surveillance activities, which will record changes to their household residence, demographic and vital status. In addition, participants' prospective clinical data will be collected through clinic and hospital data systems established through PIP. 6,7 In 2017, AHRI implemented the ClinicLink system 6,7 to collect the date of and self-reported reasons for all visits by individuals who attend one of the 11 clinics in the PIP area (whether self-referred or referred to care after screening tests). In this ClinicLink system, data are electronically captured and individuals are linked to their PIP identification number at the time of the visit. Data on all admissions to Hlabisa District Hospital (the local referral hospital) are collected through the Hospital Information System. Successful referrals or clinic attendance were monitored through ClinicLink. Successful referrals from Vukuzazi were entered into the relevant DoH care pathway by an AHRI professional nurse based at the health facility.
Finally, in accordance with ongoing PIP activities, verbal autopsy will be performed for all deaths occurring in the population. 6 These follow-up mechanisms allow Vukuzazi's 2018-20 data to serve as a baseline assessment of a longitudinal cohort that includes clinical outcome assessment. Additional nested disease-and health behaviour-specific protocols will provide additional specific followup information. 8 Long-term plans for the cohort will be driven both by scientific priorities and funding; currently we are considering repeating clinical measures every 3 to 5 years.
What has been measured?
The majority of measurements and samples were collected at the Vukuzazi mobile health camp; participants with abnormal measurements or results had a follow-up home visit with additional data collected at that time. The participant flow through the Vukuzazi mobile camp is summarized in Figure 3 and the data types collected during the two visits are described in Table 2. The mobile health camp consisted of module elements that could be used in parallel to accommodate up to 100 participants per day ( Figure 3).
After completing the informed consent procedure, participants submitted two self-administered rectal swabs, and women of childbearing age submitted a urine sample to ascertain pregnancy status. An enrolled nurse then administered a health questionnaire (see Supplementary material, available as Supplementary data at IJE online) on personal history of HIV, tuberculosis, diabetes, hypertension, cancer, tobacco and alcohol use, 9 symptoms of TB (cough, fever, night sweats of any duration or unintended weight loss), 10 and quality of life. 11 Next, a research nurse performed anthropometric measurements ( Table 2) and blood pressure readings following the World Health Organization STEPwise protocol. 9 After 30 min of inactivity, three blood pressure measurements, with 5 min resting time in between, were collected using portable electronic devices (OMRON Healthcare, Model M6). Participants then moved to the blood station where 38.5 ml of blood was collected. All participants except for pregnant women underwent a digital chest X-ray (Canon CXDI-NE), performed in a mobile X-ray unit also equipped with an artificial intelligence computer-aided detection for TB (CAD4TB) score system (Version 5), which detects lung field abnormalities in real time. 12 Any participant with one of the four cardinal TB symptoms, or a CAD4TB score above a pre-defined threshold or who was pregnant, was asked to submit a sputum sample. After completion of the chest X-ray and/or sputum collection, the visit to the health camp was completed at the exit station, where data were checked for completeness, and participants received a food parcel and reimbursement voucher (of the value of R 100equivalent to approximately USD$6).
All chest X-rays were interpreted by CAD4TBv5 and, within 14 days, underwent a standardized interpretation by an experienced radiologist. The study physician reviewed all data from questionnaires, anthropomorphic and blood pressure measurements and laboratory and radiological testing, and initiated appropriate clinical management.
Participants with normal or expected findings (e.g. known HIV-positive on antiretroviral therapy with a suppressed viral load) received an SMS informing them that their results were normal. Participants who shared a phone or who had invalid test results received a phone call to ensure that the correct participant was contacted. Participants with abnormal findings (elevated blood pressure in someone not on antihypertensive treatment, or an X-ray suggestive of TB) or testing results (new HIV diagnosis, positive GeneXpert Mycobacterium tuberculosis/Rifampin (Xpert for MTB), glycated haemoglobin (HbA1c) >6.5%) were booked for an appointment for a home-based follow-up visit by a study nurse.
At the follow-up visits, the following measurements were made ( Table 2): three blood pressure measurements (performed as described above) for people with elevated blood pressure at the Vukuzazi health camp; and point-ofcare fasting blood glucose for people with HbA1c >6.5%. On the basis of the Vukuzazi health camp results and, when appropriate, the repeated measurements, referrals were made for further care at one of the local primary health care clinics or hospital.
In addition to the data collected through the Vukuzazi health camp, the following data on Vukuzazi participants are available through linkage with other data systems ( Table  2): detailed demographic data (including education, socioeconomic status, marital status), sexual behaviour, household composition, structure and familial relationships (including family trees), household location (including geospatial coordinates), self-reported hospital attendance and diagnosis data, verbal autopsy data collected through the PIP surveillance, 6 TB-and antiretroviral therapy-related data through Three Interlinked Electronic Register (TIER.net) 13 and Covid-19-related data through the Covid-19 Surveillance protocol. 8 What has it found?
The data generated from the Vukuzazi study will create a clinical research platform for future longitudinal studies of HIV, TB, NCDs and multimorbidity. Use of mobile clinics and mobile X-ray units for multi-disease health screening enabled the study to enrol people within reach of their place of residence, thereby increasing participation rates. The study demonstrated the usefulness of digital health tools (such as mobile data collection tablets, digital chest X-rays) in collecting clinical data without need for paper records, thereby minimizing data entry and transfer errors and challenges associated with paper-based data collection. A key advantage of electronic data collection is speed of accessibility of data from multiple sources. We were able to return test results and conduct clinical review of all  participants, leading to linkage to care for participants with abnormal results. The combination of decentralized mobile data collection and efficient participant-friendly data systems resulted in high rates of successful completion of study components. The focus on multimorbidity was also a departure from many previous biomedical studies in the region, which often have a limited scope related to HIV and/or tuberculosis. The broad range of conditions considered might have helped reduce stigma associated with participation in the study. 14 Acceptance of community health screening and of the biobank platform A qualitative sub-study of participant and non-participant experiences in Vukuzazi found that most participants had a positive impression of the research and their participation, although they emphasized the health screening components over the research components and had limited understanding of genetics and biobanking. 14 Nonparticipants largely cited scheduling conflicts as their reason for not participating.

Prevalence of infectious and non-communicable disease
Among the four conditions assessed, HIV had the highest overall population prevalence (34.2%). Previously undiagnosed active TB was prevalent in 1.0%, lifetime TB in 21.8%, elevated blood pressure in 23.0% and elevated blood glucose in 8.5% of the population. Appropriate treatment and resulting control of disease was highest for HIV (76.3%) and lower for elevated blood pressure (40.0%), active tuberculosis (31.3%) and elevated blood glucose (6.9%). 15 Distribution of the four diseases was heterogeneously distributed between sexes, across the life course and geospatially. Rates of multimorbidity (more than one of the two diseases) were high and increased with age. Among the disease risk factors assessed, obesity had the highest overall prevalence (30.1%).

Health-related quality of life
The contrast between controlled HIV and uncontrolled non-communicable diseases also appeared to have effects on health-related quality of life. The presence of controlled HIV was unexpectedly associated with better overall selfreported health (manuscript under review). This was in contrast to having more non-infectious conditions (whether controlled or not), which was associated with worse overall self-reported health. In South Africa and similar settings, where intersecting non-communicable and infectious disease epidemics are prevalent, incorporating multimorbidity into health care strategies and improving the integration of care may improve overall well-being.

Asymptomatic TB
During the baseline survey we found that 1.0% of the screened population had microbiologically proven TB, and that 80% of these individuals were asymptomatic. Individuals newly diagnosed with active TB in communitybased screening have fewer symptoms, compared with people diagnosed at health facilities (manuscript under review). Asymptomatic TB eludes traditional symptom-based TB screening algorithms, but in most cases it is identifiable by chest X-ray abnormality. 16 In addition to often being asymptomatic, TB detected through community-based screening may have its own spectrum of radiological features compared with TB diagnosed in clinics and hospitals, making it a distinct use case for automated imaging algorithms.

Screening for hypertension and diabetes
We found low positive predictive values of single screening measurements of blood pressure and blood glucose to detect hypertension and diabetes when compared with sequential testing on separate days. Single measurement screening for hypertension and diabetes resulted in approximately a 50% rate of potentially unnecessary or premature referrals into the health sector for these conditions (manuscript under review). These findings suggest that repeated testing for NCDs should be considered during community-based screenings, particularly in settings when there is concern for over-burdening health systems with potentially unnecessary referrals.
What are the main strength and weaknesses?

Strengths
The Vukuzazi programme has demonstrated that demographic surveillance programmes can be effectively used to support multi-disease health characterization of populations within a demographic surveillance area. The PIP surveillance provided a comprehensive sampling frame, enabling us to determine and quantify participation and disease prevalence rates. Data collected in Vukuzazi can be linked to previous and prospective demographic, socioeconomic, health and behavioural data routinely collected as part of the annual surveillance. The large size and completeness of the linked data, image and bio-sample repositories create the possibility for further detailed studies into the biological and social determinants of health and disease in the population, and interventions to improve outcomes.

Weaknesses
We only enrolled half of the eligible participants. Contact rates were low among participants who were unavailable due to work, school and other social commitments. Various strategies such as varying recruitment times and working on weekends were employed, without significantly improving enrolment rates of difficult-to-enrol demographic sub-populations (e.g. working men). Nonparticipation could be non-random which needs to be considered in the interpretation of disease prevalence estimates. However, the existing wealth of information that has been collected on the eligible population since 2000 allows us to examine the difference between Vukuzazi participants and non-participants using statistical techniques to take into account non-response. This can be explored in future analyses and follow-up studies.
Can I get hold of the data? And the biological samples or imaging data? Where can I find out more?
Researchers who would like to use the data for secondary analyses, or use the biological samples or imaging data, must complete a letter of intent accessible on this link: [https://data.ahri.org/index.php/home], in which they express their interest and outline the intended use of the data or biological samples. The Vukuzazi Scientific Steering Committee will review requests on an ongoing basis and invite proposals for collaboration. Vukuzazi data are accessible through the AHRI Data Repository after approval and completion of a data use agreement. Data documentation including questionnaires used for data collection and a full data dictionary are available on the AHRI Data Repository.

Supplementary Data
Supplementary data are available at IJE online