Impact evaluation of a maternal and neonatal health training intervention in private Ugandan facilities

Abstract Global and country-specific targets for reductions in maternal and neonatal mortality in low-resource settings will not be achieved without improvements in the quality of facility-based obstetric and newborn care. This global call includes the private sector, which increasingly serves pregnant women in low-resource settings. The primary aim of this study was to estimate the impact of a clinical and management-training programme, delivered by a non-governmental organization [LifeNet International] that partners with clinics, on adherence to global standards of clinical quality during labour and delivery in rural Uganda. The secondary aim was to describe the effect of the LifeNet training on pre-discharge neonatal and maternal mortality. The LifeNet programme delivered maternal and neonatal clinical trainings over a 10-month period in 2017–18. Direct clinical observations of obstetric deliveries were conducted at baseline (n = 263 pre-intervention) and endline (n = 321 post-intervention) at six faith-based, not-for-profit primary healthcare facilities in the greater Masaka area of Uganda. Direct observation covered the entire delivery process, from initial client assessment to discharge, and included emergency management (e.g. postpartum haemorrhage and neonatal resuscitation). Data were supplemented by daily facility-based assessments of infrastructure during the study periods. Results showed positive and clinically meaningful increases between baseline and endline in observed handwashing, observed delayed cord clamping, partograph use documentation and observed 1- and/or 5-minute APGAR assessments (a rapid scoring system for assessing the clinical status of the newborn).
High-quality intrapartum facility-based care is critical for reducing maternal and early neonatal mortality, and this evaluation of the LifeNet intervention indicates that their clinical training programme improved the practice of quality maternal and neonatal healthcare at all six primary care clinics in Uganda, at least over a relatively short-term period. However, for several of these quality indicators, the adherence rates, although improved, were still far from 100% and could benefit from further improvement via refresher trainings and/or a closer examination of the barriers to adherence.


Introduction
Global and country-specific targets for reductions in maternal and neonatal mortality in low-resource settings will not be achieved without significant improvements in optimal facility-based obstetric and newborn care (Koblinsky et al., 2016; Lawn et al., 2014; Campbell et al., 2016). In Uganda, the maternal mortality ratio is 336 maternal deaths per 100 000 live births, and the neonatal mortality rate is 27 deaths within the first month of life per 1000 live births (Uganda Bureau of Statistics (UBOS) and ICF, 2018). These statistics represent only modest gains over the last decade (DHS, 2016). While rates of facility-based deliveries have increased in Uganda, and the proportion of those deliveries attended by skilled providers has also increased, mortality data indicate that progress on maternal and neonatal health (MNH) outcomes has been stagnant (Uganda Bureau of Statistics (UBOS) and Macro International Inc., 2007; Uganda Bureau of Statistics (UBOS) and ICF International Inc., 2012; Uganda Bureau of Statistics (UBOS) and ICF, 2018; Republic of Uganda, 2016).
Both the Every Newborn Action Plan and the World Health Organization (WHO) Strategy for Ending Preventable Maternal Mortality prioritize actions to improve quality of care (QoC) at birth, going beyond simply increasing coverage (WHO, 2015a; 2014a). There is increasing global attention to closing the gap between coverage and QoC for facility-based deliveries. The WHO framework for Maternal, Newborn and Child Health QoC highlights that evidence-based clinical care is only one of eight domains of QoC that must be addressed in comprehensive high-quality health services (Tunçalp et al., 2015).

Key Messages
• The LifeNet International training intervention improved the majority of the pre-specified maternal and neonatal quality of care (QoC) indicators at rural primary care clinics in Uganda.
• However, there are still pronounced clinical quality gaps that should be addressed, including routine provider handwashing, maintaining sterility of gloves prior to use, timely use of the partograph, APGAR assessment and delayed cord clamping.
• Direct observations of clinical practice are essential for monitoring QoC and highlighting areas for improvement.

There are also calls for developing better and more standardized measures of QoC for obstetric and newborn care, with wide recognition that direct clinical observations (DCOs) are still the gold standard for monitoring adherence to clinical evidence-based best practices (WHO, 2014b; Semrau et al., 2017). The evidence to support effective QoC interventions addressing maternal and neonatal mortality in low-resource settings via strengthened facility-based intrapartum care is thus far mixed. A review of strategies for improving provider performance revealed that multi-faceted approaches, such as training plus strengthened infrastructure and management support, were more likely to have a stronger evidence base (Rowe et al., 2018). However, findings often indicate that changes in essential birth practices do not have an impact on health outcomes, as in the BetterBirth Trial in India (Semrau et al., 2017; Kara et al., 2017). While that trial showed that providers were implementing components of the WHO Safe Childbirth Checklist, there was no significant effect on maternal or perinatal mortality. In Uganda and Zambia, a multi-donor health systems initiative called 'Saving Mothers, Giving Life' was designed to reduce deaths related to pregnancy and childbirth. Results revealed that while the initiative documented a significant reduction in facility-based maternal mortality, it found no significant change in pre-discharge neonatal mortality rates and no significant differences in women's reports of receiving evidence-based clinical services during delivery (Conlon et al., 2019; Kruk et al., 2016). The authors highlight that more direct observations and refined measurement of labour and delivery processes could have improved their QoC assessments.
Compounding the QoC evidence gap, many evaluations focus exclusively on public sector services (Evans et al., 2018); yet, increasingly, the private sector is serving pregnant women in low-resource settings and is a critical component of national health systems (McPake and Hanson, 2016). In Uganda, ∼50% of health services are provided by faith-based organizations (Olivier et al., 2015), and 22% of Ugandan women give birth in private sector health facilities, with wide variation by region (DHS, 2016). While urban areas have more women with higher socioeconomic status accessing private facilities, in some rural areas the closest health facility is a faith-based private facility (DHS, 2016). Clinical social franchising is a growing private sector intervention in low-resource settings that purports to increase the utilization of quality services, yet we cannot assume that the private sector offers higher technical quality in the absence of evidence (Montagu et al., 2016; Hirose et al., 2018).
The primary aim of this study was to estimate the impact of LifeNet International's clinical and management-training programme on adherence to global standards of clinical quality during labour and delivery by health providers in rural, private sector health facilities in the greater Masaka region of Uganda. Secondary aims included describing the effect of LifeNet's clinical training modules on pre-discharge neonatal mortality and maternal mortality. A detailed analysis of QoC practices can better highlight the challenges and opportunities for strengthening high-quality obstetric and newborn care and its potential impact on maternal and neonatal outcomes in low-resource settings.

Materials and methods
This quasi-experimental study used a pretest-posttest observational design to estimate the effects of a clinical and management training intervention, delivered by an international non-governmental organization (NGO) [LifeNet International], on improvements in QoC for maternal and neonatal healthcare in six LifeNet-affiliated health facilities in the greater Masaka district area, Uganda. The training intervention period was August 1, 2017 to May 29, 2018. Baseline data were collected prior to initiation of the LifeNet training programme (May 15 to July 17, 2017) and post-intervention endline data were collected immediately after the completion of the training programme (May 29 to August 12, 2018).

Intervention
LifeNet International, registered as a 501(c)(3) nonprofit organization in the USA and as an international NGO in Uganda, had previously developed and implemented an integrated health training package that requires 2 years of engagement with selected partner facilities in Uganda. For the purposes of this study, a modified training intervention was designed so that, for our study sites, the first 10 months were front-loaded with training modules relevant to MNH. The intervention used on-site monthly staff training, addressed team-based behaviour, incorporated quality assurance activities and focused on both medical (e.g. evidence-based clinical care) and management (e.g. record keeping, essential medicine monitoring and management) knowledge and practice tools to support implementation of high-quality care. The MNH modules were evidence-based best practices that had been validated and endorsed by the international medical community to address the leading causes of mortality and morbidity for mothers and newborns in Uganda (American Academy of Pediatrics (AAP), 2016; American Congress of Obstetricians and Gynecologists (ACOG), 2015; WHO, 2011; WHO, 2012a,b; WHO, 2015b,c; WHO, 2017a). The trainings were delivered at each facility on a rotating monthly basis. The primary clinical trainer held a bachelor's degree in public health and a diploma in nursing, while the primary management trainer had a bachelor's degree in accounting. Trainers delivered these in-facility, ∼2-hour didactic and hands-on modules to the majority of the clinic staff at a designated time and then followed up with an additional training for any clinicians who could not attend because they were running normal operations during the training. Training materials given to staff included partographs and clinical best practice sheets that summarized the training content. The trainings were a mix of lectures, videos, demonstrations and practice (e.g. 
assembling a condom tamponade to control postpartum haemorrhage (PPH), neonatal resuscitation practice on a dummy, etc.). Each element of LifeNet's MNH Package aligned with the Ugandan government's 2016-2020 RMNCAH Sharpened Plan, which emphasizes evidence-based, high-impact health solutions (Republic of Uganda, 2016).
There were 13 MNH-related training modules required for the clinical staff, and non-clinical administrative staff were required to attend at least sessions 1, 3 and 4, as noted below. Modules included (1) documentation and record keeping, (2) partograph and delivery records, (3) basic patient assessment, (4) infection prevention, (5) intravenous (IV) usage, (6) antenatal care, (7) hypertension and pre-eclampsia, (8) first trimester high-risk pregnancies, (9) second and third trimester high-risk pregnancies, (10) normal (uncomplicated) deliveries, (11) PPH, (12) first 5 minutes, APGAR (a rapid scoring system for assessing the clinical status of the newborn) and neonatal assessment and (13) neonatal resuscitation. There was significant variability between individual providers and between facilities in session attendance, ranging from 56 to 100% attendance at individual sessions among relevant staff who should have attended. If new staff joined the clinics mid-evaluation, LifeNet tried to on-board these new staff and briefly 'catch them up' on missed trainings. To the best of our knowledge, there were no other QoC interventions happening at the clinics during our study period.

Study sites
Six rural, faith-based health facilities in the greater Masaka area, new to partnering with LifeNet International, participated in the study. Study facilities were selected based on their proximity to Masaka town (for research supervision purposes) and sufficient obstetric delivery volume (16+ deliveries per month). All facility managers were accredited by the Roman Catholic Diocese of Masaka. Maternal delivery fees averaged ∼30 000 UGX ($8.50 USD). The number of clinical employees per facility ranged from three to nine for all services, with two to six health providers for obstetric services. Clinical providers observed during the study included midwives, comprehensive nurses, clinical officers and doctors. All facilities were designated as able to serve pregnant women with uncomplicated deliveries, including women living with human immunodeficiency virus (HIV), per government guidelines indicating they must be able to deliver basic emergency obstetric care. One facility was a level IV referral facility capable of performing surgeries, including C-sections; the remaining five were level III facilities, meaning they had no surgical capacity and thus no C-section services (WHO, 2017b). Additional details about study site selection and data collection procedures, including online access to data collection forms, have been published elsewhere (Egger et al., 2020).

Data collection instruments and procedures
During both the pre- and post-intervention study periods, QoC data were collected using three methods:
• Direct clinical observation (DCO) of childbirth deliveries, documented on a checklist of the clinical encounter.
• Medical record extraction to document the recorded QoC practices and health outcomes (the source of data for neonatal and maternal pre-discharge mortality) ['medical records' were most often non-standardized ledgers at baseline, with very little information routinely captured, and LifeNet-developed medical chart forms at endline], including information from a supplemental 28-day follow-up phone call to consented participants to capture post-discharge neonatal and maternal mortality.
• Facility checklists used as a daily audit to track necessary supplies and medicines at each health facility.
Research assistants (RAs), all fluent in English and Luganda, were trained by Duke University, which led the external evaluation, to collect data via direct observation of clinical encounters, review of medical records and daily facility checklists. Ten RAs were deployed for baseline (pre-intervention) data collection and 10 RAs conducted endline (post-intervention) data collection. About half were hired for both timepoints, and the RAs reflected a mix of clinical (licensed nurses) and non-clinical health research backgrounds. Study training for the RAs at each timepoint included a 5-day review of LifeNet's quality improvement training programme delivered by LifeNet staff, followed by a 5-day training in study procedures and research ethics by the university-based study investigators. One to three RAs were assigned per study facility based on delivery volume. RAs were 'on call' to respond to all deliveries for observation at their clinics. A shift schedule ensured that nearly all consecutive maternal deliveries were observed during the study periods. Deliveries not observed were assumed to be missing completely at random.
The DCO form was informed by USAID's (United States Agency for International Development) Maternal and Child Health Integrated Program (MCHIP) Maternal and Newborn QoC Survey for Labor and Delivery (2013). Study RAs assessed the extent to which providers adhered to best practice standards of care for all stages of maternal deliveries: initial client assessment, first stage of labour, second and third stages of labour, immediate newborn and postpartum care and medical information documentation, including emergency procedures of newborn resuscitation and PPH management.
Data were recorded on paper-based DCO forms and then entered electronically into a Research Electronic Data Capture (REDCap) form. More than one RA could observe a single delivery if RAs changed shifts during a delivery. As such, the RA noted the sections of the delivery process that she observed. Several of the RAs had at least as much clinical training as the providers they observed. To maintain objectivity, the RAs were trained to intervene in the delivery only if they felt that they were critically needed, and if they believed either the life of the mother or the child was in danger. If the RA intervened in any way, this was recorded in the relevant notes section of the DCO form and the respective data were excluded from primary analyses.
Data from medical records were largely from the post-intervention timepoint only and were typically extracted and entered directly into REDCap after the DCO data were entered. Part of the intervention was that clinics were trained to use a more comprehensive two-page medical record form developed by LifeNet in order to document clinical care and health outcomes in a standardized way across sites. While most clinics had access to individual partograph sheets from the Ministry of Health, the partograph was embedded in the LifeNet medical record form so that each patient had one in her record. During the discharge process, women were asked to provide a contact phone number to be included in their medical chart for the purposes of a brief 28-day follow-up check-in by the RAs or clinic staff.
The facility checklist was also completed daily for each facility. The checklist was informed by the USAID's MCHIP Facility Inventory Quality of Care Tool (2013a). Study RAs documented availability of resources, support systems and facility infrastructure elements necessary to provide a level of service intended to meet national or international standards. The checklist was completed on paper first and then entered into REDCap.

QoC indicators
Priority indicators of clinical quality for all deliveries were determined based on the global literature, as noted above in the modules, and were identified prior to data collection to align with each stage of maternal delivery: initial client assessment, first, second and third stages of labour, immediate newborn and postpartum care and medical information documentation. We also collected observational data on newborn resuscitation and PPH management as the circumstances arose. Adherence to each QoC indicator was defined as the proportion of eligible deliveries where the provider adhered to the recommended best practice, and each was measured as a binary response (0/1). For each indicator, the numerator represented the number of deliveries where the provider adhered to best practices and the denominator represented all eligible deliveries. As a result, the denominator (N) differed for each indicator. Our study indicators reflect both 'sensitive' indicators, those that require direct observation for confirmation and often have a critical time element to reflect optimal best practice (e.g. maintaining glove sterility and timing of cord clamping), and non-sensitive, or 'crude', indicators that could, in theory, have been measured using standard techniques like medical chart abstraction or provider interviews. We believe that sensitive measures may be a truer measure of clinical quality, and thus better illuminate potential effects on health impact, than crude measures. Therefore, we created multiple indicators per domain to reflect varying levels of sensitivity.
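As a concrete illustration of how these binary indicators behave, the sketch below computes adherence with an indicator-specific denominator. The records and field names are hypothetical (not the study's actual data structure); deliveries that were ineligible or unobserved for a given practice are treated as missing and dropped from that indicator's denominator.

```python
# Hypothetical delivery records: 1 = adhered, 0 = did not adhere,
# None = delivery not eligible/observable for that indicator
deliveries = [
    {"handwash_any": 1, "delayed_clamp": 0},
    {"handwash_any": 0, "delayed_clamp": 1},
    {"handwash_any": 1, "delayed_clamp": None},  # e.g. cord care not observed
]

def adherence(records, indicator):
    """Proportion of eligible deliveries where the provider adhered;
    the denominator (N) differs by indicator."""
    eligible = [r[indicator] for r in records if r[indicator] is not None]
    return sum(eligible) / len(eligible), len(eligible)

rate, n = adherence(deliveries, "delayed_clamp")  # 0.5 with N = 2
```

This denominator handling is why the reported Ns vary across the 17 indicators in the tables.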
There are a total of 17 QoC indicators relevant to all women presenting for delivery. For handwashing, we have three indicators: provider washes hands at least once during labour and delivery; provider washes hands only once, but for clean-up after birth; and provider washes hands at three important timepoints: right before the initial vaginal examination, again during first stage labour and then in preparation for delivery. For sterile glove use, we documented three indicators: use of any type of gloves, use of surgical gloves specifically and use of surgical gloves without compromising their sterility prior to use. Sterile cord cutting and clamping had three indicators: use of a sterile cord clamp or sterile string to tie off the umbilical cord, use of a sterile blade or sterile scissors to cut the cord, and delayed cord clamping, recorded as more than one minute after delivery.
For partograph use, we have two indicators: any partograph use, either observed or documented at any time, and partograph used in real time to monitor labour. To prevent PPH, administration of a uterotonic during the second and third stages of labour was documented, and as part of best practice during initial assessment, testing the woman's urine for the presence of protein was observed. Appropriate use and timing of the APGAR is best practice, and we documented its use at 1 and/or 5 minutes post-delivery via medical record documentation and/or observation.

Sample size and power calculations
Our study was powered on the primary aim of estimating a difference in the prevalence of the QoC indicator, 'real-time proper use of the partograph', comparing the baseline and endline periods. With an expected 155 maternal deliveries observed in each of the two time periods, we estimated achieving greater than 80% power to detect a difference as small as 14 percentage points in the proportion of those encounters where the partograph was used properly, when the baseline (i.e. 'pre-') proportion is estimated to be 20% (20 vs 34%, based on a Chi-squared test of independence, assuming a two-sided alpha level of 0.05). Our power was estimated to reach 90% if the difference in the two proportions was as large as or larger than 16.5 percentage points. We expected similar, if not higher, power on most of our other indicators. If the baseline proportion of the partograph indicator was lower than 20%, then our estimates of power were conservative. The study exceeded our target sample size requirements and achieved a priori statistical power to meet all of our primary research aims.
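The stated power figures can be checked against the standard normal-approximation formula for comparing two independent proportions. The snippet below is an illustrative re-derivation (our own helper function, not the study's calculation code), which lands near 80% power for 20 vs 34% and near 90% for 20 vs 36.5% with 155 deliveries per period:

```python
from scipy.stats import norm

def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided chi-squared (two-proportion z) test
    with equal group sizes, using the normal approximation."""
    p_bar = (p1 + p2) / 2
    se_null = (2 * p_bar * (1 - p_bar) / n_per_group) ** 0.5   # SE under H0
    se_alt = (p1 * (1 - p1) / n_per_group
              + p2 * (1 - p2) / n_per_group) ** 0.5            # SE under H1
    z_crit = norm.ppf(1 - alpha / 2)
    z = (abs(p2 - p1) - z_crit * se_null) / se_alt
    return norm.cdf(z)

# 20% vs 34% with 155 deliveries per period -> roughly 80% power
power = power_two_proportions(0.20, 0.34, 155)
```

A chi-squared test of independence on a 2x2 table is equivalent to this two-proportion z-test, which is why the normal approximation reproduces the reported figures.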

Data management and analysis
Data were analysed using Stata 16 software (StataCorp, College Station, TX). Analysis of study data focused on the change in the prevalence (i.e. the adherence) from baseline (pre-intervention) to endline (post-intervention). Data were analysed separately at the individual (i.e. patient-provider encounter) and clinic cluster levels and results were compared. Individual-level analysis utilized a generalized estimating equation (GEE) framework to account for clustering in the response by clinic and assumed an exchangeable working correlation structure. Models assuming an independent working correlation structure were fit as a sensitivity analysis. The GEE model was fit to prevalence data with a log-link function using a modified Poisson regression approach (Zou, 2004). Models were fit in Stata using the xtgeebcv command, which allows for a finite sample bias correction due to the small number of clusters in our study (Gallis et al., 2020). We used the bias-corrected variance method proposed by Kauermann and Carroll (2001). Models included an indicator parameter for time (i.e. baseline vs endline), and exponentiation of this parameter estimated the population-averaged prevalence ratio (PR). A PR greater than 1 can be interpreted as an average increase in adherence over the study period; a PR less than 1 can be interpreted as an average decrease in adherence over the study period. Cluster-level analyses were performed in MS Excel using methods for pair-matched clusters described in Hayes and Moulton (2017).
Ethical approvals were obtained from [Duke University U.S. Academic Institution], The AIDS Support Organization in Uganda [not affiliated with authors], and the Uganda National Council of Science and Technology. All maternity clients (or her self-designated proxy) gave written informed consent to participate in the study and were provided with a copy of the consent form with contact information. Consent was obtained at admission for childbirth; however, our study team also confirmed consent again post-delivery. No participant rescinded their consent.

Results
To appropriately assess clinical quality practices, facility infrastructure and resources were assessed daily at each study clinic during the pre- and post-intervention direct observation periods (Table 1). There were 364 observed facility days (6 clinics, ∼60 days each) at baseline and 424 observed facility days (6 clinics, ∼70 days each) at endline. Overall, facility infrastructure was relatively strong, with little variation in the availability of equipment and supplies, although there were some notable improvements from baseline to endline. One clinic in particular drove some of the lower levels of available supplies and equipment at baseline, moving from 0 to 100% on a few indicators (e.g. oral thermometer and IV equipment). Other notable changes include the percentage of clinic days with sterile cloths nearly doubling over time (17 to 31%), though still low, and the availability of condom tamponade packages during a third of observed facility days at endline (35%), compared to almost none at baseline.
For the baseline (pre-intervention) assessment, 263 deliveries were consented and observed (representing 94% of all women presenting for delivery) and 228 women (87%) were followed up 28 days later with questions on maternal and neonatal mortality. For the endline (post-intervention) follow-up assessment, 321 deliveries were consented and observed (representing 90% of all women presenting for delivery) and 304 women (95%) were followed up 28 days later for mortality data.
Results of the individual-level analysis, estimating the PR for each of the 17 clinical quality indicators and comparing adherence pre- and post-intervention, are presented in Table 2. For all 15 indicators where we were able to estimate a model-based PR, we observed a PR greater than 1.0, indicating a positive trend toward improved adherence. We observed a significant increase in the prevalence of adherence over the study period for three indicators (Table 2). These increases were large and clinically meaningful; however, there was considerable variation in adherence by clinic for many indicators, as indicated by the wide confidence intervals. Overall, for the indicators 'handwashing at least once during initial assessment, labour and/or in preparation for delivery', 'delayed cord clamping' and 'any partograph documentation', we observed PRs indicating that the average provider was 4.79, 2.48 and 7.99 times as likely, respectively, to perform these three procedures post-implementation of the LifeNet training.
Adherence data for each of the 17 clinical quality indicators, stratified by clinic and time period, are presented in Table 3. There was considerable variation in adherence across clinics for each of the indicators. Due to zero cell counts (e.g. the provider adhered to the indicator 0 times in a clinic at a specific time point), we were only able to estimate the cluster-level PR for 6 of the 17 indicators.

Notes to Table 2: (a) Training modules that covered key QoC indicators (although messages could be re-emphasized in subsequent modules): handwashing and sterile gloves, modules 4, 10 and 11; sterile cord cutting and clamping, module 12; partograph use, modules 2 and 10; uterotonic use for prevention of PPH, modules 10 and 11; urine testing, module 7; APGAR, modules 10 and 12. (b) GEE-based log-risk models were fit to data using a variance correction to account for the small number of clinics in the data; due to convergence issues, all models are unadjusted. (c) Due to zero cell totals indicating complete lack of adherence, estimated variance is very high and these results should be interpreted with caution. (d) Inestimable due to model non-convergence.

Due to the expected rarity of mortality, the study was not powered to detect significant differences in neonatal and maternal mortality; nonetheless, we report the deaths that were observed as valuable descriptive data, including information on stillbirths, health status at discharge and health status up to 28 days postpartum (Table 4). The data indicate a decreased pre-discharge neonatal mortality prevalence at the conclusion of the LifeNet training intervention. During the pre-intervention study period, there were seven pre-discharge neonatal deaths out of 263 observed live births, indicating a pre-discharge neonatal mortality prevalence of 0.0268 (95% CI (exact): 0.011, 0.054) at baseline. During the post-intervention study period, there were three pre-discharge neonatal deaths out of 321 observed live births, indicating a pre-discharge neonatal mortality prevalence of 0.009 (95% CI (exact): 0.002, 0.027) at endline. The 28-day neonatal mortality at baseline was 0.042 (95% CI (exact): 0.021, 0.074) (n = 11) and at endline it was 0.016 (95% CI (exact): 0.005, 0.036) (n = 5). During the baseline period, there were zero pre-discharge maternal deaths out of 263 observed deliveries. During the endline period, there was one maternal death out of 321 observed deliveries, but it was not documented pre-discharge: the labouring woman was discharged from the LifeNet partner clinic in poor condition and died upon arrival at the next referral hospital to which she was transferred (it was a complicated twin delivery).
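The exact confidence intervals reported for these mortality prevalences are standard Clopper-Pearson binomial intervals, which can be reproduced in a few lines. This is a minimal sketch (the helper function name and layout are ours, not the study's):

```python
from scipy.stats import beta

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI at level 1 - alpha for k events in n trials,
    from quantiles of the beta distribution."""
    lower = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lower, upper

# Baseline: 7 pre-discharge neonatal deaths among 263 observed live births,
# which should reproduce roughly (0.011, 0.054)
lo, hi = clopper_pearson(7, 263)
```

The same call with k = 3 and n = 321 reproduces the endline interval of roughly (0.002, 0.027).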
During the baseline period, 19 of 24 (79%) cases of attempted neonatal resuscitation were successful. During the endline period, 13 of 14 (93%) attempted resuscitations were successful. The vast majority of cases at both baseline and endline were observed to conduct appropriate resuscitation techniques, including suctioning the airway, rubbing the back, positioning the head, ventilating, etc.
Data collectors observed and documented QoC for managing PPH if it occurred. At baseline, there were eight recorded observations of PPH. In all eight instances, bleeding was monitored, uterine massage was performed, uterotonic was given and the provider performed an abdominal exam for uterine contraction and examined the vagina and perineum for lacerations or cervical tear. Three providers examined the placenta for completeness, six started IV fluids, three performed uterine exploration, none used mechanical evacuation or manual removal of the placenta, two performed aortic compression, none used balloon or condom tamponade, two used uterine sutures, five gave antibiotics and seven raised the woman's legs at some point after the PPH began. PPH causes listed in the medical records included coagulopathy, laceration, atonic uterus, incomplete expulsion of placenta and retained placenta. At endline, there were five recorded observations of PPH. In all five instances, bleeding was monitored, uterine massage was performed, uterotonic was given and the provider performed an abdominal exam for uterine contraction and examined the vagina and perineum for lacerations or cervical tear. Three providers examined the placenta for completeness, two started IV fluids, none performed uterine exploration, two used mechanical evacuation or manual removal of the placenta, one performed aortic compression, none used balloon or condom tamponade, none used uterine sutures, one gave antibiotics and one raised the woman's legs at some point after the PPH began. PPH causes listed in the medical records included incomplete expulsion of placenta, atonic uterus and retained placenta.

Discussion
The global calls for improved QoC for MNH align with recognition that service coverage is not enough for an impact on health outcomes (Countdown to 2030 Collaboration, 2018; WHO, OECD & International Bank for Reconstruction and Development, 2018). High-quality intrapartum facility-based care is critical for reducing maternal and early neonatal mortality, and accountability and action are guiding purposes of quality measurement (Kruk et al., 2018). This evaluation indicates that the LifeNet training intervention significantly improved maternal and neonatal healthcare quality on several key indicators and supported general positive trends across all indicators, at all six primary care clinics in Uganda, at least over a relatively short-term period. The QoC indicators reflect gold standard measurement via direct observation and are robust; however, for several of these quality indicators, the adherence rates were far from 100% and could still benefit from further improvement.
Clinical social franchising is a rapidly expanding private-sector intervention for improving QoC and increasing utilization of quality services via branded, quality-assured services, yet its impact has been mixed (Montagu et al., 2016; Beyeler et al., 2013). However, external evaluations of these organizations, such as LifeNet International, are critical for objectively documenting impact, given the sustainability pressures (from both the franchisor and franchisee perspectives) that these private-sector efforts face. There are many clinical QoC training initiatives but fewer models that focus on sustained engagement with partner clinics and have evidence of impact over time (Rowe et al., 2018).
The study results, while largely positive and clinically meaningful, also highlight substantial shortcomings in QoC. LifeNet International's earliest training modules focus on infection control and handwashing as a core best practice. Despite the lessons and reviews, only two of the six clinics were observed to have providers who washed their hands at least once before and/or during deliveries the majority of the time at endline. Qualitative insights gathered during the study on this suboptimal practice included provider confidence in and reliance upon gloves (without understanding how glove sterility could be compromised) and difficulties drying hands prior to putting on gloves. These nuanced issues could be addressed via a refresher training; however, if structural issues, such as a lack of towels for drying hands, are not addressed, training alone will not succeed. Furthermore, if the evaluation had relied on provider self-report for handwashing, adherence estimates would have been seriously biased: most providers did wash their hands, but only after delivery for clean-up purposes, which has no effect on the sterility of the delivery.
We observed a large and clinically meaningful increase in real-time partograph use over the study period; however, this government-endorsed best practice is still underutilized. The evidence on the efficacy of the partograph in reducing maternal and neonatal deaths is mixed (Lavender et al., 2018); but if the government of Uganda and the WHO continue to recommend its use and LifeNet trains on it, further work should be done to understand why providers fail to utilize it during deliveries. A review of barriers to real-time use of the partograph in low-resource settings revealed many reasons, including a lack of graphing skills, stockouts, an organizational acceptance of retrospective documentation and providers not internalizing its function and value (Ollerhead and Osrin, 2014). We expect that the LifeNet training could have addressed multiple of these barriers, and future monitoring by LifeNet could include open-ended questions to providers to better understand uptake, or the lack thereof. Additional qualitative data more broadly could have helped us better understand why some indicators showed more dramatic improvements than others.
The issue of completing medical documentation during, rather than after, the provision of clinical services is an important one. Most of our study clinics had only one or two clinical staff available to provide maternity services, and time was precious. In addition, as noted earlier, half of the deliveries occurred at night and power losses were common. Real-time, appropriate medical documentation was likely impeded by both physical constraints and provider prioritization issues. We received informal qualitative feedback from our local study team that there were strong indications that the LifeNet medical record itself improved clinical practice. Prior to our study, clinics typically wrote minimal information about each delivery in a ledger. For the purposes of the study, we needed to standardize the medical records, and in the process, we enhanced the type and quality of information required for documentation. For example, the record asked what was used to cut the umbilical cord, how many minutes after birth the cord was clamped and whether breastfeeding was initiated within 1 hour, none of which would typically have been recorded in a ledger. That said, we have evidence that the medical records were often not valid measures of clinical quality, as they frequently did not reflect directly observed practice and tended to overestimate good clinical practice (Kim et al., 2021). Even with these limitations, we recommend utilization of these more detailed, checklist-type medical records, as they also appear to serve as prompts for high-quality practices (Rowe et al., 2018; Tolu et al., 2020). Because our study collected data using this medical chart during both the baseline and endline periods, our measured effect of the LifeNet intervention (i.e. the prevalence ratios, PRs) is very likely conservative, since it does not account for the effect of the medical chart itself in improving clinical quality.
Our effect estimates should also be interpreted more as measures of effectiveness than of efficacy, since we did not attempt to control providers' compliance with the intervention. Staff turnover at the clinics, particularly at the provider and in-charge levels, may have minimized or negated the effect of LifeNet's clinical training on some indicators. Between baseline and endline, our study team was tracking implementation of the training modules against staff attendance and realized that LifeNet did not have a routinely monitored onboarding process for catching new staff up on past training lessons. Near the end of the 10-month intervention period, LifeNet course-corrected by strengthening their monitoring system to properly track whether newly hired staff were individually caught up on missed content. Likewise, poor attendance at LifeNet's trainings by key clinic staff, such as the in-charge, may have limited buy-in to the trainings and the trainings' overall effects. Potential under-compliance with the LifeNet trainings by medical providers likely led to effect estimates that are conservatively biased towards the null.

Study limitations
This study had several limitations. First, because we were unable to include a randomly assigned control group over the same time period as the intervention (e.g. a randomized controlled trial [RCT]), we cannot infer causality with respect to the intervention's effect on improving QoC. For example, while the LifeNet staff and our study team monitored for other maternal and neonatal interventions during the study period and did not identify any, there is still a chance that something could have been implemented in the study clinics during our study period that affected healthcare quality. Furthermore, while most providers were assessed at both baseline and endline, some new providers were included at endline who were not included at baseline. Any differences in provider-level characteristics (e.g. length of relevant experience) that may have varied across study periods and which are prognostic of the outcome (QoC measures) may have confounded the effect of the intervention. We attempted to control for the potential confounding effect of study clinic (i.e. the distribution of clinical encounters varied by both clinic and time period); however, due to many zero cell sizes, our adjusted models did not converge, and so we present unadjusted effect estimates only. However, we separately present results of cluster-level analyses that match on clinic, which eliminates any confounding of the PR by clinic. Since these cluster-level PR estimates were very similar to our individual-level estimates, we do not believe confounding by clinic is of significant concern to the validity of our estimates.
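To make the distinction between the crude (unadjusted) PR and a clinic-matched estimate concrete, the sketch below uses entirely hypothetical adherence counts (not the study's data) and a Mantel-Haenszel risk-ratio estimator, which is one common way to stratify by cluster; the study's actual cluster-level matching procedure may differ.

```python
def prevalence_ratio(a_end, n_end, a_base, n_base):
    """Crude PR: endline adherence proportion divided by baseline proportion."""
    return (a_end / n_end) / (a_base / n_base)

def mantel_haenszel_pr(strata):
    """Clinic-stratified (Mantel-Haenszel) PR.

    strata: list of (a_end, n_end, a_base, n_base) tuples, one per clinic,
    where a_* are adherent deliveries and n_* are observed deliveries.
    """
    num = sum(a_e * n_b / (n_e + n_b) for a_e, n_e, a_b, n_b in strata)
    den = sum(a_b * n_e / (n_e + n_b) for a_e, n_e, a_b, n_b in strata)
    return num / den

# Hypothetical counts for three clinics (endline vs baseline):
clinics = [(40, 60, 20, 50), (30, 55, 15, 45), (25, 40, 10, 40)]

# Pool across clinics for the crude estimate.
a_end = sum(s[0] for s in clinics); n_end = sum(s[1] for s in clinics)
a_base = sum(s[2] for s in clinics); n_base = sum(s[3] for s in clinics)

print(round(prevalence_ratio(a_end, n_end, a_base, n_base), 2))  # crude PR
print(round(mantel_haenszel_pr(clinics), 2))                     # clinic-matched PR
```

When the two estimates are close, as in the study's reported comparison of cluster-level and individual-level results, confounding by clinic is unlikely to be a major threat to validity.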
Although these factors affect our causal inference, the timing of changes in outcomes and the positive trends and significance of the changes across many indicators specific to LifeNet's training, considered alongside the monitoring of information on the (lack of) other interventions that may have occurred at study clinics, lead us to believe the general magnitude and direction of our estimates, and any measurable improvements in adherence to these indicators are likely attributable to the LifeNet training.
Second, our team had to review and select priority indicators without accepted global guidance on which core indicators best represent QoC. There are now efforts underway to validate and standardize selected observational measures of quality for newborn and maternal facility-based care, which may better reflect indicators that influence health impact (Day et al., 2019). In addition, we did not capture an important domain of QoC, respectful maternity care (RMC), which is now clearly part of new WHO guidelines on a positive childbirth experience (WHO, 2018). LifeNet did not have an RMC-specific module at the time of the study (they did address what they called patient-centred, compassionate care), and thus, we did not measure it. As an increasingly important aspect of high QoC that could have implications for other MNH indicators, we highly recommend that it be included in future trainings, based on new guidance coming from WHO and associated monitoring indicators. One hint about the clinical environment that we noticed was that patients were much more willing to share their phone numbers (for assessing 28-day health outcomes) with our RAs, with whom they had built some rapport during the direct observation of their delivery, than with the providers. This could reflect a patient-provider communication gap that warrants further attention.
Finally, as noted earlier, this study was not powered for mortality outcomes. Even though we documented a reduction in the pre-discharge risk of death between the baseline and endline periods, our estimates are not precise, and thus this difference may be real or may be due to random chance.

Study strengths
The most notable strength of the study was our use of direct observation. By not relying on medical records or provider self-report, we have reduced the potential bias of overestimating impact. For example, in a companion article from this study, we estimated the average specificity of our QoC indicators at only 34% when using medical records instead of direct observation (Kim et al., 2021). In addition, the study was designed as a 'real-world' impact evaluation, and accordingly, we note that the study clinics experienced some clinical staff turnover that is typical in a health system. Any impact from the training intervention should reflect the impact under a realistic clinical context. The study QoC indicators were also practice-oriented and did not focus on provider knowledge. This was intentional as increases in knowledge often do not translate into sustained changes in practice and therefore would not be impactful for the study clinics (Bang et al., 2016;Rowe et al., 2018).

Future research or policy implications
Both public and private sector clinical QoC training interventions need a robust evidence base in order to scale and achieve impact in low-resource settings. As the MNH field struggles with the knowledge-practice gap, more research is needed to better understand whether more sensitive measures of clinical quality, captured by intensive data collection procedures such as direct clinical observation, lead to better health outcomes when programmes have access to higher-quality clinical practice data. This study indicates that a social franchise-based clinical quality training, like the one delivered by LifeNet International, provides clinically significant improvements in maternal and neonatal healthcare, at least over the short term. Future evaluations should focus on the durability of these effects over longer time periods, including impacts on morbidity and mortality, taking into full consideration issues related to training compliance, staff turnover, continuing medical education and refresher trainings.