The Role of Program Evaluation in Keeping Army Health “Army Strong”: Translating Lessons Learned Into Best Practices

ABSTRACT Service Members and military beneficiaries face complex and ill-structured challenges, including suicide, sexual violence, increasing health care costs, and the evolving coronavirus pandemic. Military and other government practitioners must identify effective programs, policies, and initiatives to preserve the health and ensure the readiness of our Force. Both research and program evaluation are critical to identify interventions best positioned to prevent disease, protect the public’s health, and promote health and well-being within our ranks to retain a medically ready force and reduce the global burden of disease. While military and medical leaders are typically well versed in research and understand the role of research in evidence-informed decisions, they may be less aware of program evaluation. Program evaluation is the systematic application of scientific methods to assess the design, implementation, improvement, or outcomes of a program, policy, or initiative. Although program evaluators commonly utilize scientific or research methods to answer evaluation questions, evaluation ultimately differs from research in its intent. Several recently published federal and Department of Defense policies specifically reference program evaluation, emphasizing its importance to the military and government as a whole. The Army is uniquely positioned to conduct medical and public health evaluation activities and there are several Army organizations and entities that routinely perform this work. For example, the United States Army Public Health Center (APHC) is among recognized military experts in public health assessment and program evaluation. Given the breadth of our work, the APHC understands the challenges to conducting evaluation studies in the Army and we have thoughtfully examined the conditions common to successful evaluation studies. 
In this commentary, we share our lessons learned to assist military colleagues, potential partners, and others in successfully evaluating the programs, policies, and initiatives necessary to keep our Service Members and beneficiaries healthy and ready. There are several challenges to executing evaluation studies in the Army that may be relevant across all Services. These include but are not limited to frequent Army leadership transitions, urgency to report study results, lack of program documentation and adequate planning for evaluation, expectation management to ensure stakeholders are well-informed about the evaluation process, and a disorganized data landscape. These challenges may hinder the successful execution of evaluation studies, or prevent them from being attempted in the first place, depriving Army leaders of quality, actionable information to make evidence-informed decisions. Despite the aforementioned challenges, we have identified a number of best practices to overcome these challenges and conduct successful evaluation studies. These facilitators of successful evaluations can be summarized as: collaboration with engaged stakeholders who understand the value of evaluation, evaluation studies aligned with larger strategic priorities, agile methodology, thoughtful evaluation planning, and effective communication with stakeholders. We wholeheartedly recommend and encourage program evaluation at every opportunity, and we anticipate the call for evaluation and evidence-informed decisions to continually increase. Our hope is that others – to include partners and stakeholders within and external to the military – will be able to leverage and apply this information, especially the identified best practices, in their evaluation efforts to ensure success.

Military and other government practitioners must identify effective actions to address vital public health and medical concerns for service members and military beneficiaries. Public health crises, such as the evolving coronavirus pandemic, further emphasize the need for evidence-informed public health and medical practices. Both research and program evaluation are critical to identify interventions best positioned to prevent disease, protect the public's health, and promote health and well-being within our ranks to retain a medically ready force and reduce the global burden of disease.
While military and medical leaders are typically well versed in research (the U.S. Army recently established the 4-star Army Futures Command to centralize its research and development activities) and understand the role of research in evidence-informed decisions, they may be less aware of program evaluation and the integral role it plays in identifying and assessing program, policy, and initiative effectiveness and impact. Military and medical leaders' awareness of program evaluation is essential and can help them consider critical practice-based evidence, results, and recommendations that may improve the health and readiness of service members and their families.

PROGRAM EVALUATION SCOPE AND PURPOSE
Program evaluation is the systematic application of scientific methods to assess the design, implementation, improvement, or outcomes of a program. [1] Importantly, in this context, a program is defined as a systematic implementation of a set of activities with identified target audiences, utilizing clear objectives, goals, and outcomes. As a key component of one of the 10 essential public health services, [2] the purpose of program evaluation is to demonstrate the need for or measure the effectiveness of programs, policies, or initiatives; examine their strengths and weaknesses; identify best practices; and improve their design, implementation, and impact.
Although program evaluators commonly utilize scientific or research methods to answer evaluation questions, evaluation ultimately differs from research in its purpose and intent. [3] Generally speaking, research aims to test a theory or produce new or generalizable knowledge and can involve manipulation of the participants' situation. Program evaluation, on the other hand, commonly focuses on assessing the effectiveness and impact of a new or existing program or initiative for a specific stakeholder or group of stakeholders, measuring processes and effects in "real-world" circumstances, and proactively identifying and fixing issues of program quality or resource alignment to poise the program or initiative for success (see Fig. 1). [4][5][6]

PROGRAM EVALUATION IN THE U.S. ARMY
Several recently published federal and DoD policies specifically reference program evaluation (see Table I), emphasizing its importance to the government. Any structured Army activity can potentially be evaluated – a program, policy, or initiative. In many ways, the Army is uniquely positioned to conduct medical and public health evaluation activities and there are several Army organizations and entities that routinely perform this work. For example, the U.S. APHC mission, in part, is to assure the quality and effectiveness of the Army Public Health Enterprise, and several divisions within the Center routinely execute program evaluation studies. Specifically, the Public Health Assessment Division (PHAD) advocates for, builds capacity for, and provides comprehensive program evaluation services to inform evidence-driven public health decision-making within the Army Public Health Enterprise and improve programs, policies, and environments for the Total Army Family. Since its inception in 2009, the PHAD has executed more than 100 evaluation studies in collaboration with key stakeholders and partners to examine the implementation and effectiveness of a variety of public health and prevention activities. These include tobacco control policies; embedded and virtual behavioral health services; Army Wellness Centers; public health accreditation and quality improvement efforts; Commander's Ready and Resilient Councils; sleep, activity, and nutrition education programs; and various Command Health Assessments.
FIGURE 1. Differences between program evaluation and research. The distinctions between evaluation and research exist in the methods and analysis steps, as well as in the focus, goal, setting, and values related to each. [4][5][6]
TABLE I. Considerations, with example resources and references relevant to each consideration
Consideration 1. Military missions are largely driven by regulatory authority. Understanding the regulatory authority that exists to support the evaluation of whatever is being assessed is crucial. It is critical to be aware of these regulatory authorities and review and approval processes prior to commencing any evaluation study.
Resource 1. Public Law No: 115-435, Foundations for Evidence-Based Policymaking Act of 2018 (January 14, 2019) calls for each federal agency to "designate a senior employee as Evaluation Officer to coordinate evidence-building activities and…to advise on statistical policy, techniques, and procedures."
Consideration 2. There has been an increased focus on evaluation in the last 5-10 years in the military, often in the context of making resource decisions. This can be advantageous in that senior leaders are now asking for evaluations to be conducted. Additionally, program, policy, or initiative stakeholders are seeking out evaluation to show evidence of effectiveness to garner continued or increased resourcing and to increasingly demonstrate a commitment to program improvement and accountability. It can be challenging in that leaders may want to make decisions using limited information and in that program, policy, or initiative stakeholders may be resistant to being evaluated for the fear of losing resources if results are unfavorable.
Resource 2(a). DoD Instruction 1342.22 (Military Family Readiness, April 11, 2017) states, "The impact of family readiness services shall be measured through program evaluation that uses valid and reliable outcome, customer satisfaction, cost, and process measures that are linked to specific and measurable performance goals… to inform decisions regarding sustainment, modification or termination of family readiness services" (p. 27).
Resource 2(b). Multiple Army Regulations including AR 40-5 (Army Public Health Program, May 12, 2020) and AR 600-63 (Army Health Promotion, April 1, 2015) reference program evaluation and the need for the information and evidence it provides. Recently published Department of the Army Pamphlet 40-11, to accompany AR 40-5, provides specific guidance and information relevant to program evaluation in Chapter 2 (Quality Management and Accreditation) and Appendix B (Public Health Program Evaluation).
Consideration 3. There are important authorities and approval processes for collecting primary evaluation data from service members. Several regulatory authorities and approval processes define the types of information that can be collected from military audiences (i.e., active duty service members, reservists and Guardsmen, women, and family members). Any entity collecting data needs to recognize and mitigate potential human subjects concerns (e.g., over-surveying and undue command influence).
Resource 3(a).
Consideration 4. Not all work is publicly accessible and there are parameters on what can be communicated, to whom, and in what sequence. Some evaluation work is considered privileged information or sensitive in nature and may not be available through publicly released reports or papers. There may be strict guidance relevant to who receives what information, as well as how and when they receive it. When possible, agreements for review and release of results should be made upfront, so valuable lessons learned can be shared with other military organizations, at a minimum.
Resource 4(a). The DoD asks contributors to use a document classification system outlined in DoD Instruction 5230.24, Distribution Statements on Technical Documents, dated August 23, 2012, to indicate how broadly documents should be distributed based on defined criteria.
Resource 4(b). Further considerations on the security of DoD information can be found in DoD Regulation 5200.1 R "Information Security Program".
Consideration 5. It is critical to understand and be able to operate within the military environment when executing evaluation studies. The military, and each branch of service, has a culture of its own with established norms, language, culture, and traditions. Trying to execute an evaluation study as one would in a civilian or academic setting will not only make the evaluation team appear culturally incompetent but also will be an exercise in futility. Leveraging chains of command, understanding the "mission first" mentality, and knowing and using military terms relevant to evaluation (e.g., MOPs – measures of performance; MOEs – measures of effectiveness; AARs – After-Action Reviews) are vital to evaluation success.
Resource 5(a).
Through this work and more, the APHC is among recognized military experts in public health assessment and program evaluation. Given the breadth of our work, we understand the challenges to conducting evaluation studies in the Army and have thoughtfully examined the conditions common to successful evaluation studies. From our perspective, key characteristics of successful evaluations include:
• execution according to a planned timeline,
• anticipation and reduction of points of friction,
• collaboration with stakeholders to address unforeseen challenges to the satisfaction of all parties, and
• application of findings in stakeholder decision-making.
In this commentary, we share our lessons learned to assist military colleagues, potential partners, and others in successfully evaluating the programs, policies, and initiatives necessary to keep our service members and beneficiaries healthy and ready.

CHALLENGES OF EXECUTING EVALUATION STUDIES IN THE ARMY
Despite laws, regulations, and scientific consensus regarding the value of program evaluation, the Army and military at large can experience challenges in conducting evaluation studies. These challenges include any factors that make planning for or completing an evaluation especially difficult and often result in the need for additional resources (e.g., time, personnel, and funding). There are several challenges to executing evaluation studies in the Army, including frequent Army leadership transitions, urgency to report study results, lack of program documentation and planning, expectation management, and a disorganized data landscape. These challenges may hinder the successful execution of evaluation studies or prevent them from being attempted in the first place, depriving Army leaders of quality, actionable information to make evidence-informed decisions. The following paragraphs describe each challenge and its implications.
The Army command cycle is generally 2-4 years, meaning that Army leadership, often the decision-authority or champion for program evaluation, transitions frequently. These transitions cause a disruption in stakeholder continuity, which can be especially challenging when one leader who supports evaluation is replaced with another leader who has different strategic priorities. The timing of when Army leaders transition during the evaluation process will have different implications. For example, leadership transitions that occur at the beginning of the evaluation process may change the scope of the evaluation or reduce the priority for the study, which may delay or prevent execution. Leadership transitions that occur at the end of the evaluation process may affect how the results are interpreted and actioned.
One direct ramification of the short command cycle in the Army setting is the urgency to conduct evaluation studies and report results quickly for rapid decision-making. Depending on leaders' priorities for evaluation and data-informed action, the evaluation process may need to be accelerated to report key findings and recommendations to leadership within a command cycle. An urgent evaluation process is especially challenging because of the compromise between the desired scope of information and the application of rigorous methods to meet the need for immediate data-informed action. An accelerated timeline may limit or broaden the scope of the evaluation; limited scope makes it difficult to provide all of the information necessary for thorough data-informed decision-making, and a broadened scope may lack methods with appropriate precision, risking the quality of the information provided.
Similar to those within the civilian sector, some existing Army programs, policies, and initiatives have been implemented to address an assumed immediate need or leadership priority without data to justify a true gap in services or population need. The urgency for program execution often leads to a lack of program documentation and adequate planning for evaluation. Examples of program documentation that may be missing include a logic model, specified programmatic goals and objectives, program implementation guide, or evaluation plan. This lack of program documentation and evaluation planning can result in a different understanding between program stakeholders, leaders, and evaluators of the intended outcomes and what constitutes program success; an inadequate explanation of why selected activities/interventions are expected to result in desired outcomes; a lack of baseline or foundational data to assist in measuring outcomes and changes over time; and frustration among all parties due to slow timelines and effort needed to create shared understanding.
Expertise is diverse across the Army, so it can be challenging to manage expectations about program evaluation to ensure that stakeholders are well-informed about the evaluation process. Managing stakeholders' expectations occurs at many different levels, including: distinguishing between monitoring and evaluation, differentiating between strategies aimed at directly influencing health behaviors or outcomes versus systems-focused strategies, and understanding that program evaluation is not intended to be punitive but instead should be used to inform program improvement. First, many Army programs focus data collection on monitoring (e.g., service utilization and program participation) because it provides support for program fidelity; however, this limits opportunities for evaluation studies aimed at assessing the achievement of program outcomes, which can be challenging when outcomes are highly desired information. Second, it can take years to understand the pathways to outcomes, which can be especially challenging for Army programs that are implemented as systems-focused strategies. In this case, evaluation methods are best targeted at the efficacy of those systems-level outcomes rather than individual-level outcomes. Third, the perception that evaluation is punitive may hinder stakeholders' engagement in the evaluation process.
For any given program and its associated evaluation, data sources can include DoD-level personnel data, Defense Health Agency health records, Army-level data on physical fitness tests or environmental indicators, program monitoring data, and any primary data collected for the evaluation itself. Most of the platforms where data are stored are siloed with varying requirements for access, causing a disorganized data landscape and additional roadblocks to answering evaluation questions. The ramifications of disorganized data include distrust in Army data platforms because of issues with data quality and unnecessary participant burden via redundant data collection. There are current efforts to integrate data systems in the Army, such as the Person-Event Data Environment, now called cPeople [7][8]; however, many data systems are not closely tracked or monitored, which causes quality issues that either prevent data collection or extraction or require additional data collection efforts. For example, one recently planned evaluation study could not be conducted because the required data system lacked rosters for participant identification; the evaluation team was unable to identify who actually participated in the program. When this occurs, additional data can often be collected outside of these data systems, although frequent and redundant data collection can burden Soldiers.

FACILITATORS FOR SUCCESSFUL EVALUATION STUDIES IN THE ARMY
Despite the aforementioned challenges, we have identified a number of best practices to overcome these challenges and execute successful evaluation studies. These facilitators can be summarized as: stakeholder factors, alignment with strategy, agile methodology, thoughtful evaluation planning, and effective communication.
Collaborating with engaged stakeholders who understand the value of evaluation has resulted in more effective studies and use of findings compared to studies where stakeholders (e.g., program managers) have been directed (i.e., "voluntold") to participate in the evaluation process. In the ideal case, the enthusiasm of the stakeholder is balanced by their trust in the evaluation process and team; a stakeholder who is too passionate may delay the study through ongoing schedule coordination and adjustment, while stakeholders who lack engagement are difficult to reach for decision-making. Stakeholders who value evaluation understand that quality work takes time and remain engaged throughout the process – including through taking action on recommendations. Evaluation success has been stunted on occasions when stakeholders are difficult to reach when collecting data, formulating recommendations, or disseminating results.
Evaluation studies in the Army range from discrete analyses linking program variables and health outcomes to long-term efforts, which include multiple iterations of pilot programs. Regardless of the scope of the evaluation, the studies aligned with larger strategic priorities have been most successfully executed. When the broader organization (e.g., the unit, the command, and the Army) prioritizes the work, there is greater stakeholder buy-in at all levels with fewer barriers impeding completion.
As mentioned previously, working within the parameters of a military organization comes with its challenges and bounds; however, successful evaluation teams find ways to be agile in their planning, execution, and reporting. Evaluations that can be tailored to stakeholders' needs are critical in this space. Tailoring includes not only ensuring that the data collection tools are focused on gathering the right information but also remaining attentive and flexible to stakeholders' information needs, preferences for involvement, and the intent and priority of certain deliverables. Successful studies often include discrete phases and interim deliverables; a framework with explicit decision points allows the team to shift focus in response to important results or adapt to a changing backdrop without disrupting an evaluation plan already in motion. Other successful studies use established access to data systems or seek to augment the current landscape by connecting new sources; the ability to gather information without always relying on primary, self-reported data has been beneficial to our work and appreciated by stakeholders. Additionally, we have recently modified our suite of evaluation deliverables to offer options beyond a standard technical report (e.g., briefing slides and evaluation findings summaries with infographics). These deliverables embed agility into the evaluation process by providing space for conversations with stakeholders on how they would like to present their findings and which audiences they would like to reach. These conversations can streamline deliverable development and provide a shared understanding of when certain types of evaluation results and recommendations are needed and for whom.
Including evaluation from the conception phase of a program facilitates the entire evaluation process. It is far more straightforward to embed tools and mechanisms to collect pilot, monitoring, and outcome data when the development of the program and evaluation plan are symbiotically integrated. Even in cases where evaluation planning occurs late in program development or after program inception, it seems that the earlier the evaluation plan is incorporated, the more successful and smooth the evaluation. When evaluation teams work with program developers in the early stages of program development, it is easier to understand the rationale behind the program (rather than sifting through incomplete program documentation later) and recommend touch points to collect different types of data to best answer their questions (rather than attempting to fit methods to the program later).
As mentioned above, stakeholders' hesitation to evaluate a program may stem from fears of negative or inconclusive results leading to reduced funding or termination; thus, it is worth stating explicitly: success is greatly enhanced when effective communication practices with stakeholders are embedded throughout the evaluation process. The right mode and frequency of communication can look different for each study; for an extremely engaged and enthusiastic stakeholder, this might mean downshifting from weekly (or daily!) phone calls to planned meetings every other week. For a stakeholder with a great deal of trust in the evaluation process and a slower-paced longitudinal study, this might mean a monthly check-in via email. There is an understanding, of course, that any meeting schedule is subject to change when critical milestones are forthcoming, for example, needing more availability from the stakeholder during active data collection, less engagement during analysis, and additional time to discuss findings, implications, and recommendations as the study is nearing completion. Neither the stakeholder nor the evaluator can operate in a silo, and planning for critical communication is one way to ensure consistent dialogue.

ADDITIONAL CONSIDERATIONS
In addition to the challenges and facilitators of program evaluation within the Army, there are several important considerations that are relevant to this type of work in the military more broadly (see Table I). These considerations are neither challenges nor facilitators, neither positives nor negatives, but instead are contextual factors that may impact an evaluation's success or failure and the application of its results. Alongside these considerations are relevant references or resources, as applicable, which might help others understand and appreciate the context and incorporate it into evaluation planning and execution.

CONCLUSIONS
As summarized throughout this commentary, there are challenges to program evaluation in the Army, and many of these may be common across military services. Despite these challenges, there are numerous facilitators that enhance the likelihood of evaluation success and the ability to collect, analyze, and report information that is useful, scientifically sound, and – perhaps most importantly – actioned by stakeholders, partners, and decision-makers. Our lessons learned in conducting program evaluation studies yielded the challenges and facilitators that we shared in this commentary; we believe the facilitators for evaluation success should be considered best practices, and we now strive to incorporate them in as many of our studies as possible.
Importantly, every evaluation study has its challenges; however, the best practices shared in this commentary may help mitigate or even eliminate some of these challenges. In our experience, a study that is devoid of the facilitators identified, or one without a healthy balance of challenges and facilitators, is likely to prove frustrating or infeasible. For that reason, we recommend a thorough and thoughtful look at the context and situation surrounding a study prior to its inception, with a particular emphasis on examining the extent to which facilitators are present or can be implemented within the proposed evaluation.

IMPLICATIONS
Taken together, the challenges, facilitators, and considerations presented here provide an overview of our salient lessons learned as we have evaluated numerous health-related policies, programs, and practices in the Army. We wholeheartedly recommend and encourage program evaluation at every opportunity, and we anticipate the call for evaluation and evidence-informed decisions to continually increase. Our hope is that others – to include partners and stakeholders within and external to the military – will be able to leverage and apply this information, especially the identified best practices, in their evaluation efforts to ensure success.
Our service men, women, and leaders are held to the highest standards. It is, therefore, our duty as medical and health professionals to ensure the programs, services, policies, initiatives, and interventions offered to our service members and their families are similarly held to the highest standard. Today's military is continually faced with complex and ill-structured challenges like suicide, sexual violence, increasing health care costs, and pandemics. Thoughtful and successful program evaluation, combined with the cutting-edge research for which the military is well known, will provide the necessary evidence we need to take decisive actions to preserve the health and ensure the readiness of our Force.