Abstract

Objective

Whether pre-injury baseline computerized neurocognitive assessments are necessary, or whether post-concussion outcomes can instead be compared to manufacturer-provided normative data, remains unclear. Manufacturer-provided norms may not be equivalent to institution-specific norms, which poses a risk of misclassifying impairment when individual post-concussion performance is compared to manufacturer-provided norms. The objective of this cohort study was to compare institutionally derived normative data to the normative values provided by ImPACT® Applications, Inc.

Method

National Collegiate Athletic Association Division I university student athletes (n = 952; aged 19.2 ± 1.4 years, 42.5% female) from one university participated in this study by completing pre-injury baseline Immediate Post-Concussion Assessment and Cognitive Test (ImPACT) assessments. Participants were separated into four groups based on ImPACT’s age and sex norms: males ≤18 years old (n = 186), females ≤18 years old (n = 165), males ≥19 years old (n = 361), and females ≥19 years old (n = 240). Comparisons were made between manufacturer-provided norms and institutionally derived normative data for each of ImPACT’s clinical composite scores: Verbal (VEM) and Visual (VIM) Memory, Visual Motor Speed (VMS), and Reaction Time (RT). Outcome scores were compared for all groups using chi-squared goodness-of-fit analyses.

Results

Institutionally derived normative data indicated above average performance for VEM, VIM, and VMS, and slightly below average performance for RT compared to the manufacturer-provided data (χ2 ≥ 20.867; p < 0.001).

Conclusions

Differences between manufacturer- and institution-based normative value distributions were observed. These differences imply an increased risk of misclassifying impairment following a concussion when manufacturer-provided norms are used in place of a baseline assessment, and therefore support the use of baseline testing when feasible or, otherwise, comparison to institutionally derived norms rather than manufacturer-provided norms.

Introduction

Neurocognitive testing has been deemed a “cornerstone” of sport-related concussion (SRC) management and is recommended as part of a multimodal concussion assessment paradigm (Broglio et al., 2014; Meehan et al., 2012). Currently, computerized neurocognitive tests (CNTs) are used in multiple settings (Lynall et al., 2013). More specifically, the Immediate Post-Concussion Assessment and Cognitive Test (ImPACT) has been reported as the most widely used CNT at the secondary school and collegiate levels of sport (Buckley et al., 2015; Lynall et al., 2013; Meehan et al., 2012). Reports also indicate that the majority of surveyed athletic trainers are administering pre-injury (baseline) and post-injury CNT assessments to some or all of their student athletes (Buckley et al., 2015; Meehan et al., 2012). Although assessment with CNTs has been reported to add meaningful information to the multimodal assessment paradigm as an adjunct to the clinical exam, there is no consensus regarding the efficacy of performing baseline assessments compared to solely conducting post-injury assessments (Broglio et al., 2007b; Echemendia et al., 2013; Elbin et al., 2013; Lynall et al., 2013; Moser et al., 2015; Resch et al., 2013b; Resch et al., 2016; Schatz & Robertshaw, 2014).

Baseline testing is purported to improve the information obtained from CNTs in the clinical decision-making process by providing a unique, patient-specific comparison for measuring and monitoring post-concussion impairment, especially in cases where ADHD, learning disability (LD), and/or very high or very low neurocognitive performance abilities are present (Broglio et al., 2014; Cottle et al., 2017; Echemendia et al., 2013; Elbin et al., 2013; Moser et al., 2015; Resch et al., 2013b; Resch et al., 2016; Schmidt et al., 2012). Reasons cited for the lack of clinical implementation of baseline testing include athletes’ limited access to healthcare professionals; high demands on time, personnel, and monetary resources; the training required for proper administration and interpretation of the test; the need to assess outcome validity; and the possibility of practice effects in cases where re-testing is indicated (Broglio et al., 2014; Buckley et al., 2015; Echemendia et al., 2013; Moser et al., 2015; Register-Mihalik et al., 2012; Resch et al., 2013b). Resource limitations and sources of random and systematic error affect the ability of baseline assessment performance to adequately represent an individual’s true neurocognitive ability. Additionally, CNTs have demonstrated variable and sub-optimal test–retest reliability across varying intervals (Brett et al., 2016; Broglio et al., 2007a; Elbin et al., 2011; Nakayama et al., 2014; Iverson et al., 2003; Register-Mihalik et al., 2012; Resch et al., 2013a; Schatz & Ferris, 2013). Sub-optimal test–retest reliability may be influenced by random and systematic error and may limit the ability of a CNT to detect clinically meaningful change following an injury (i.e., change not due to measurement error, test-taker motivation, etc.; Alsalaheen et al., 2016; Echemendia et al., 2012; Rabinowitz et al., 2015; Randolph, 2011; Resch et al., 2013b; Schatz & Robertshaw, 2014). In a related study, 88% of participants who had potentially underperformed on initial baseline assessments improved their scores on subsequent assessments (Walton et al., 2018). To account for the measurement instability associated with individual test-to-test comparisons, comparing post-injury assessments to normative data has been suggested as an alternative to conducting baseline assessments (Echemendia et al., 2012; Schmidt et al., 2012).

The manufacturers of ImPACT provide normative data for each of its outcome scores that are thought to be representative of athletes in the United States (Lovell, 2004). Recent research suggests that ImPACT’s normative data may not adequately represent all test-takers, although this literature is still developing (Broglio et al., 2014; Cottle et al., 2017; Echemendia et al., 2013; Elbin et al., 2013; Moser et al., 2015; Schatz & Robertshaw, 2014; Schmidt et al., 2012). In a study by Schatz and Robertshaw (2014), more athletes were correctly identified as impaired following SRC when compared to their individual baseline scores than when compared to normative data. The authors also reported that athletes with “above average” baseline ImPACT outcome scores were more frequently misclassified as having no impairment post-injury when compared to manufacturer-provided normative data than when compared to their own baseline performances. To determine the clinical utility of the normative data comparison method, it is pertinent to know how well institutionally based normative data compare to manufacturer-provided normative data.

Therefore, the primary purpose of our study was to examine how the distribution of one university’s institutionally based normative ImPACT data compared to that of ImPACT’s manufacturer-provided normative data. We hypothesized that the institution’s student athlete population would differ from the manufacturer-provided normative sample. This hypothesis is based on a previous report suggesting that ~40% of student athletes would be expected to score below the 16th percentile when compared to manufacturer-provided normative data (Iverson & Schatz, 2015), whereas in a separate but related study we observed that only ~15% of student athletes from the current institution met this criterion (Walton et al., 2018). Specifically, we hypothesized that the institutionally based normative data would indicate better performance than the manufacturer-provided normative data for all outcome scores.

Materials and Methods

Participants

This sample consisted of all student athletes from a National Collegiate Athletic Association (NCAA) Division I public university who completed pre-injury baseline ImPACT assessments as part of their institution’s SRC management protocol. This university competes in a Power 5 conference and is considered an academically selective institution. The current study was approved by the institutional review board, and each participant provided informed consent prior to their baseline assessment.

Procedures

All student athletes were administered the ImPACT individually or in pairs as part of routine baseline testing. The test was administered to participants seated at a desktop or laptop computer running the Windows 7 operating system. Each computer was equipped with an external mouse and keyboard. Tests were taken in a quiet room with limited environmental distractions, and tests were proctored by a certified athletic trainer who had been trained in the administration of the ImPACT. Student athletes were provided supplemental verbal instructions for any or all of the sub-tests upon request and were allowed to ask questions freely. Supplemental instructions did not deviate from the written instructions provided by the manufacturer. Testing took approximately 25 min to complete. Validity of each assessment was determined by the ImPACT’s automated validity criteria.

The ImPACT Test

The ImPACT test (Version 2.1; ImPACT® Applications, Inc., Pittsburgh, PA) consists of six sub-tests that contribute to four neurocognitive outcome scores: Verbal Memory (VEM), Visual Memory (VIM), Visual Motor Speed (VMS) and Reaction Time (RT; Lovell, 2004). Each of these outcomes is presented as a raw score, and a percentile rank is given based on that raw score compared to the manufacturer-provided normative data for the test-taker’s age and sex.

Data Analysis

Analyses for each outcome score were completed separately for each of the age and sex subgroups defined by the ImPACT normative data set (Table 1; Lovell, 2004). Institutionally derived normative baseline data were compared to the manufacturer-provided normative data using chi-squared goodness-of-fit (χ2) tests. A total of 16 χ2 tests were performed: one for each of the four age and sex subgroups on each of ImPACT’s four neurocognitive outcome scores. The manufacturer-provided normative data were placed into five performance categories (bins) derived from the “classification ranges” presented in the ImPACT user manual (Table 2; Lovell, 2004). ImPACT’s classification ranges represent performance capabilities based on a normal distribution, from “severely impaired” (<0.13th percentile) to “very superior” (≥98th percentile). The five bins used for our analyses combined the lowest four classification ranges (everything below the 10th percentile) and the highest two classification ranges (everything above the 90th percentile) while maintaining the middle three classifications. This structure was chosen to reflect the manufacturer’s preset interpretation guidelines of below normal, within normal range, and above normal performance. The lowest ranges were combined so that only whole-number percentiles would represent below average performance, and the upper ranges were combined to maintain symmetry. This structure also ensured that enough data fell in each bin for statistical analysis.
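
To make the binning concrete, the following sketch (in Python; the helper name and its use are our illustration, not the study’s analysis code) maps a manufacturer-provided percentile rank to the five bins defined in Table 2.

```python
# Illustrative, hypothetical helper: assign an ImPACT percentile rank (0-100)
# to one of the five chi-squared bins defined in Table 2.
def percentile_to_bin(pct: float) -> str:
    if pct < 10:
        return "Below average"   # combines the lowest four classification ranges
    elif pct < 25:
        return "Low average"     # 10th-24th percentile
    elif pct <= 75:
        return "Average"         # 25th-75th percentile
    elif pct <= 90:
        return "High average"    # 76th-90th percentile
    else:
        return "Above average"   # combines "superior" and "very superior"

assert percentile_to_bin(50.0) == "Average"
assert percentile_to_bin(95.0) == "Above average"
```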

Table 1

Mean ImPACT outcome composite scores for age and sex groups

Group | N | VEM mean (SD) | VIM mean (SD) | VMS mean (SD) | RT mean (SD)
Men ≤18 years | 186 | 88.1 (9.60) | 81.3 (8.20) | 42.9 (6.19) | 0.54 (0.05)
Women ≤18 years | 165 | 90.8 (8.18) | 81.3 (10.72) | 44.0 (5.45) | 0.55 (0.06)
Men ≥19 years | 361 | 88.4 (8.64) | 79.8 (11.94) | 42.4 (6.08) | 0.54 (0.06)
Women ≥19 years | 240 | 89.4 (8.94) | 77.6 (12.75) | 43.3 (5.74) | 0.55 (0.06)

Note: ImPACT outcome scores: VEM = Verbal Memory Composite; VIM = Visual Memory Composite; VMS = Visual Motor Speed Composite; RT = Reaction Time Composite.


Table 2

ImPACT normative data classifications and Chi-squared (χ2) bin structure for the current study

ImPACT norm percentile ranks | ImPACT classification range | χ2 bin for the current study | χ2 cumulative probability | χ2 expected bin probability
<0.13 | Severely impaired | Below average | 9% | 9%
0.13–0.35 | Moderately impaired | | |
0.38–1.9 | Mildly impaired | | |
2–9 | Borderline | | |
10–24 | Low average | Low average | 24% | 15%
25–75 | Average | Average | 75% | 51%
76–90 | High average | High average | 90% | 15%
91–97 | Superior | Above average | 100% | 10%
≥98 | Very superior | | |

Note: ImPACT normative percentile ranks and classification ranges are from the ImPACT Clinical User’s Manual (Lovell, 2004).


The expected values of each bin within χ2 analyses were calculated by multiplying the number of individuals in each group by the expected probability associated with that particular bin (Table 2). For example, the “average” performance bin for VEM in the group of males ≥19 years of age was expected to contain 51% of that group, so this expected probability was multiplied by the group size:
$$0.51 \times 361 \approx 184 \ \textrm{expected participants.}$$

The observed number of participants from the institutionally based data who fell into each bin was then compared to the expected number derived from the manufacturer-provided data using a χ2 distribution with 4 degrees of freedom. Cramer’s V was calculated as a measure of effect size for the difference between each institutionally based distribution and the corresponding manufacturer-provided distribution (Cramer, 1946), and values were interpreted as small (V ≤ 0.3), medium (0.31 ≤ V ≤ 0.49), or large (V ≥ 0.5; Cohen, 1988). All analyses were performed with α = 0.05; Type I error due to multiple comparisons was controlled with a Bonferroni correction, yielding a corrected α of 0.003 (0.05/16 ≈ 0.003). χ2 and Cramer’s V were calculated using R (R Foundation for Statistical Computing, Vienna, Austria).
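
As a worked sketch of this procedure (the study used R; the Python version below is ours, and the observed counts are hypothetical placeholders rather than study data), the following code computes the χ2 goodness-of-fit statistic and Cramer’s V for one subgroup of n = 361 using the expected bin probabilities from Table 2.

```python
# Sketch of the goodness-of-fit comparison for one subgroup on one outcome.
import numpy as np
from scipy.stats import chisquare

n = 361                                            # e.g., men >= 19 years
expected_probs = np.array([0.09, 0.15, 0.51, 0.15, 0.10])  # Table 2 bins
expected = expected_probs * n                      # 0.51 * 361 ~ 184 for "Average"

observed = np.array([10, 25, 160, 90, 76])         # placeholder counts (sum = 361)

chi2_stat, p_value = chisquare(f_obs=observed, f_exp=expected)  # df = 5 - 1 = 4

# Cramer's V for a goodness-of-fit test with k bins: sqrt(chi2 / (n * (k - 1)))
k = len(observed)
cramers_v = np.sqrt(chi2_stat / (n * (k - 1)))

alpha = 0.05 / 16                                  # Bonferroni-corrected alpha ~ 0.003
print(f"chi2 = {chi2_stat:.2f}, p = {p_value:.5f}, V = {cramers_v:.2f}, "
      f"significant: {p_value <= alpha}")
```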

Results

Participants were between 17 and 29 years of age (mean ± standard deviation = 19.2 ± 1.4 years) and represented all varsity sports at the institution between the 2014–2015 and 2016–2017 academic years. The sample was 42.5% female. Of the 961 student athletes who were administered the ImPACT test, 9 (1%) had invalid performances (designated as “Baseline ++” on their individual outcome report) according to the ImPACT’s automated validity criteria and were removed from our analyses. The remaining 952 participants were included in our analyses.

Institutionally derived normative values were significantly different from manufacturer-provided normative values (χ2 ≥ 20.87; p ≤ 0.001) for all outcomes in each subgroup (Fig. 1). Cramer’s V effect sizes were predominantly large (0.50–1.42) to medium (0.34–0.49), with only one small effect (V = 0.24), found for VMS in males ≥19 years of age. The institutionally derived distributions reflected better performance on the VEM, VIM, and VMS outcome scores than the manufacturer-provided values. For RT, the largest proportion of institutionally based scores fell in the low average bin and the smallest proportion in the above average bin.

Fig. 1

Institutionally based normative data compared to ImPACT’s manufacturer-provided normative data. The figure shows the proportion of observed scores for each subgroup, on each ImPACT outcome, in each performance category (χ2 bin). Performance categories are defined by manufacturer-provided percentile ranks: Below Average = bottom 9% of scores; Low Average = 10th–24th percentile; Average = 25th–75th percentile; High Average = 76th–90th percentile; Above Average = 91st–100th percentile.

Discussion

General Findings

This study sought to determine how well institutionally derived normative composite scores of the ImPACT test compare to manufacturer-provided normative values. Generally, the institutionally based norms indicated superior performance compared to manufacturer-provided norms for the VEM, VIM, and VMS composite scores as the observed proportions in the “high average” and “above average” bins were substantially greater than expected. The RT composite scores of the institutionally based data were also different from the manufacturer-provided distribution, as evidenced by an increased percentage of “low average” observations compared to the manufacturer-provided expected percentages.

These observed differences indicate that, in the absence of individual baseline assessments, institutionally derived normative data may provide more clinically relevant information than manufacturer-provided values because they better reflect a given institution’s student athlete population. Therefore, caution is needed when using manufacturer-provided data to make post-injury comparisons without a valid baseline assessment, due to the inherent risks of “false-positive” and “false-negative” determinations of impairment (Echemendia et al., 2013; Resch et al., 2013b; Schatz & Robertshaw, 2014; Schmidt et al., 2012). Hypothetically, a false-positive could occur in the current population when comparing RT to manufacturer-provided data: a slight decrease from baseline performance could appear below normal against manufacturer-provided values even though it would not necessarily indicate a meaningful change from baseline. Conversely, a false-negative could occur when an above average performer produces post-injury VEM, VIM, or VMS scores in the average or low average range; this would be interpreted as normal against manufacturer-provided normative values, even though it could represent a meaningful decline from the individual’s baseline performance.

Individual Institutions May Vary

In the current study, the institutionally derived data might be assumed to be comparable to those of other institutions; however, the testing paradigm used at this institution is highly controlled and may lack external validity. Interestingly, the average outcome scores of the current sample were higher than the majority of those reported elsewhere in the literature for collegiate student athletes (Table 3; Broglio et al., 2007a, 2007b; Cottle et al., 2017; Echemendia et al., 2012; Henry & Sandel, 2015; Nakayama et al., 2014; Register-Mihalik et al., 2012; Resch et al., 2013a; Resch et al., 2016). It is possible that the high level of performance observed in this study’s participants reflects the environmental control (e.g., a quiet environment and limited test takers) and proctoring practices used at this institution. These concepts have been discussed in multiple reviews and have been recommended as “best practices” for performing baseline assessments (Echemendia et al., 2013; Moser et al., 2015; Resch et al., 2013b). In addition to environmental control, it is possible that the pattern of high and above average performance on the memory tasks, combined with low average reaction time, represents a trade-off of speed for accuracy in these particular student athletes. This trade-off pattern likely does not represent all athletes at all post-secondary institutions, and future research is therefore needed to address the fit of institutionally based normative data from a variety of institutions against manufacturer-provided norms. Scholastic Aptitude Test (SAT) scores have previously been shown to relate to CNT performance on the Automated Neuropsychological Assessment Metrics and may therefore also relate to ImPACT performance (Brown et al., 2007). As such, schools whose student populations achieve higher or lower SAT scores may have institution-specific ImPACT norms that reflect those scores. This relationship has not yet been reported and warrants further study.

Table 3

Comparison of average reported values for ImPACT outcome scores

ImPACT outcome score | Current study: range of 50th percentiles (weighted mean) | Other college^a: weighted mean | MP 50th percentile: range of 50th percentiles
VEM | 89–92 (89.0) | 86.6 | 84–87
VIM | 77–82 (79.8) | 75.8 | 73–76
VMS | 42.55–44.6 (43.00) | 40.25 | 39.34–41.04
RT | 0.54–0.56 (0.54) | 0.57 | 0.56–0.58

Notes: This table shows institutionally based values for all age and sex groups in the current study compared to previously reported average values in collegiate samples^a and to the ImPACT manufacturer-provided (MP) 50th percentile values (Lovell, 2004). Ranges are observed 50th percentiles for each of the age and sex groups. Weighted means are average scores for each age and sex group weighted by their proportion of the total sample. VEM = Verbal Memory Composite; VIM = Visual Memory Composite; VMS = Visual Motor Speed Composite; RT = Reaction Time Composite. ^a(Broglio et al., 2007a, 2007b; Cottle et al., 2017; Echemendia et al., 2012; Henry & Sandel, 2015; Nakayama et al., 2014; Register-Mihalik et al., 2012; Resch et al., 2013a; Resch et al., 2016).


The Case for Baseline Assessments

Previous literature has suggested that individuals in the middle of a normative distribution would not be at risk for misclassification when normative data are used in lieu of individual baseline assessments (Resch et al., 2013b; Schatz & Robertshaw, 2014). Although this concept is theoretically sound, individual baseline assessments are needed to determine how well individuals fit the normative data; without knowing how an individual compares to normative data at baseline, the risk of misclassification following SRC cannot be properly assessed. Therefore, because the current data suggest institutionally based data may be distributed differently from manufacturer-provided data, a proctored individual baseline assessment in a well-controlled environment is recommended for use in clinical decision-making following a concussion in student athletes. In the absence of an individual baseline assessment, institutionally based norms derived from previously collected pre-injury baselines at that institution could be used as the comparator. These strategies may improve ImPACT’s contribution to the sensitivity of the multimodal assessment battery, as current evidence suggests the neurocognitive assessment alone is less sensitive (76%–79%) than when symptoms are also included (up to 95% sensitivity; Broglio et al., 2007b; Resch et al., 2016).

Limitations

The current study is not without limitations. Our results are based on a single institution, and their generalizability is likely limited to similar institution types (i.e., academically selective public universities competing in an NCAA Division I Power 5 conference). Post-injury comparisons were not performed, so we cannot corroborate previous findings regarding misclassification risks. The baseline assessment methodology of proctoring the ImPACT individually or in pairs in a controlled environment may lack external validity for institutions where this methodology is not feasible due to constraints on money, space, and/or personnel. Similarly, the differences between the institutionally based norms and the manufacturer-provided norms may reflect testing methods (e.g., testing in large groups) rather than individual performance capabilities. Unfortunately, the chosen study design did not allow us to discern between the effects of our testing methodology and the individual abilities of the current sample.

Conclusions

Individual institutions that compare post-injury assessments to manufacturer-provided normative data are encouraged to explore whether their student athletes’ institutionally based normative data profile fits with the ImPACT manufacturer-provided norms. If institutionally derived norms are unknown or are not an adequate fit to manufacturer-provided norms, institutions should strongly consider the benefit of using individual baseline assessments, rather than manufacturer-provided norms, for comparison of post-SRC scores. Regardless of the methodology used for neurocognitive assessment, a multimodal approach including other domains of function (symptoms, balance, etc.) is warranted given the limitations of neurocognitive assessment alone. When baseline testing is not feasible, institutionally based norms may be a better comparator than manufacturer-provided norms following SRC.

Funding

This work was not funded.

Conflict of Interest

None declared.

Acknowledgements

The authors would like to thank the university’s Sports Medicine staff and the faculty in the Department of Kinesiology for their assistance and critical feedback that led to the successful completion of this project.

References

Alsalaheen, B., Stockdale, K., Pechumer, D., & Broglio, S. P. (2016). Measurement error in the immediate postconcussion assessment and cognitive testing (ImPACT): Systematic review. The Journal of Head Trauma Rehabilitation, 31(4), 242–251. doi: https://doi.org/10.1097/HTR.0000000000000175

Brett, B. L., Smyk, N., Solomon, G., Baughman, B. C., & Schatz, P. (2016). Long-term stability and reliability of baseline cognitive assessments in high school athletes using ImPACT at 1-, 2-, and 3-year test–retest intervals. Archives of Clinical Neuropsychology, 31(8), 904–914. doi: https://doi.org/10.1093/arclin/acw055

Broglio, S., Cantu, R. C., Gioia, G. A., et al. (2014). National Athletic Trainers’ Association position statement: Management of sport concussion. Journal of Athletic Training, 49(2), 245–265.

Broglio, S., Ferrara, M. S., Macciocchi, S., Baumgartner, T., & Elliot, R. (2007a). Test–retest reliability of computerized concussion assessment programs. Journal of Athletic Training, 42(4), 509–514.

Broglio, S. P., Macciocchi, S. N., & Ferrara, M. S. (2007b). Sensitivity of the concussion assessment battery. Neurosurgery, 60(6), 1050–1057, discussion 1057–1058. doi: https://doi.org/10.1227/01.neu.0000255479.90999.c0

Brown, C. N., Guskiewicz, K. M., & Bleiberg, J. (2007). Athlete characteristics and outcome scores for computerized neuropsychological assessment: A preliminary analysis. Journal of Athletic Training, 42(4), 515–523.

Buckley, T. A., Burdette, G., & Kelly, K. (2015). Concussion-management practice patterns of National Collegiate Athletic Association Division II and III athletic trainers: How the other half lives. Journal of Athletic Training, 50(8), 879–888. doi: https://doi.org/10.4085/1062-6050-50.7.04

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: L. Erlbaum Associates.

Cottle, J. E., Hall, E. E., Patel, K., Barnes, K. P., & Ketcham, C. J. (2017). Concussion baseline testing: Preexisting factors, symptoms, and neurocognitive performance. Journal of Athletic Training, 52(2), 77–81. doi: https://doi.org/10.4085/1062-6050-51.12.21

Cramer, H. (1946). Mathematical methods of statistics. Princeton, NJ: Princeton University Press.

Echemendia, R. J., Bruce, J. M., Bailey, C. M., Sanders, J. F., Arnett, P., & Vargas, G. (2012). The utility of post-concussion neuropsychological data in identifying cognitive change following sports-related MTBI in the absence of baseline data. The Clinical Neuropsychologist, 26(7), 1077–1091. doi: https://doi.org/10.1080/13854046.2012.721006

Echemendia, R. J., Iverson, G. L., McCrea, M., Macciocchi, S. N., Gioia, G. A., Putukian, M., et al. (2013). Advances in neuropsychological assessment of sport-related concussion. British Journal of Sports Medicine, 47(5), 294–298. doi: https://doi.org/10.1136/bjsports-2013-092186

Elbin, R. J., Kontos, A. P., Kegel, N., Johnson, E., Burkhart, S., & Schatz, P. (2013). Individual and combined effects of LD and ADHD on computerized neurocognitive concussion test performance: Evidence for separate norms. Archives of Clinical Neuropsychology, 28(5), 476–484. doi: https://doi.org/10.1093/arclin/act024

Elbin, R. J., Schatz, P., & Covassin, T. (2011). One-year test–retest reliability of the online version of ImPACT in high school athletes. The American Journal of Sports Medicine, 39(11), 2319–2324. doi: https://doi.org/10.1177/0363546511417173

Henry, L. C., & Sandel, N. (2015). Adolescent subtest norms for the ImPACT neurocognitive battery. Applied Neuropsychology: Child, 4(4), 266–276. doi: https://doi.org/10.1080/21622965.2014.911094

Iverson, G. L., Lovell, M. R., & Collins, M. W. (2003). Interpreting change on ImPACT following sport concussion. The Clinical Neuropsychologist, 17(4), 460–467.

Iverson, G. L., & Schatz, P. (2015). Advanced topics in neuropsychological assessment following sport-related concussion. Brain Injury, 29(2), 263–275. doi: https://doi.org/10.3109/02699052.2014.965214

Lovell, M. (2004). ImPACT Version 2.0 Clinical User’s Manual.

Lynall, R. C., Laudner, K. G., Mihalik, J. P., & Stanek, J. M. (2013). Concussion-assessment and -management techniques used by athletic trainers. Journal of Athletic Training, 48(6), 844–850. doi: https://doi.org/10.4085/1062-6050-48.6.04

Meehan, W. P., d’Hemecourt, P., Collins, C. L., Taylor, A. M., & Comstock, R. D. (2012). Computerized neurocognitive testing for the management of sport-related concussions. Pediatrics, 129(1), 38–44. doi: https://doi.org/10.1542/peds.2011-1972

Moser, R. S., Schatz, P., & Lichtenstein, J. D. (2015). The importance of proper administration and interpretation of neuropsychological baseline and postconcussion computerized testing. Applied Neuropsychology: Child, 4(1), 41–48. doi: https://doi.org/10.1080/21622965.2013.791825

Nakayama, Y., Covassin, T., Schatz, P., Nogle, S., & Kovan, J. (2014). Examination of the test–retest reliability of a computerized neurocognitive test battery. The American Journal of Sports Medicine, 42(8), 2000–2005. doi: https://doi.org/10.1177/0363546514535901

Rabinowitz, A. R., Merritt, V. C., & Arnett, P. A. (2015). The return-to-play incentive and the effect of motivation on neuropsychological test performance: Implications for baseline concussion testing. Developmental Neuropsychology, 40(1), 29–33. doi: https://doi.org/10.1080/87565641.2014.1001066

Randolph, C. (2011). Baseline neuropsychological testing in managing sport-related concussion: Does it modify risk? Current Sports Medicine Reports, 10(1), 21–26.

Register-Mihalik, J. K., Kontos, D. L., Guskiewicz, K. M., Mihalik, J. P., Conder, R., & Shields, E. W. (2012). Age-related differences and reliability on computerized and paper-and-pencil neurocognitive assessment batteries. Journal of Athletic Training, 47(3), 297–305. doi: https://doi.org/10.4085/1062-6050-47.3.13

Resch, J., Driscoll, A., McCaffrey, N., Brown, C., Ferrara, M. S., Macciocchi, S., et al. (2013a). ImPact test–retest reliability: Reliably unreliable? Journal of Athletic Training, 48(4), 506–511. doi: https://doi.org/10.4085/1062-6050-48.3.09

Resch, J. E., Brown, C. N., Schmidt, J., Macciocchi, S. N., Blueitt, D., Cullum, C. M., et al. (2016). The sensitivity and specificity of clinical measures of sport concussion: Three tests are better than one. BMJ Open Sport & Exercise Medicine, 2(1), e000012. doi: https://doi.org/10.1136/bmjsem-2015-000012

Resch, J. E., McCrea, M. A., & Cullum, C. M. (2013b). Computerized neurocognitive testing in the management of sport-related concussion: An update. Neuropsychology Review, 23(4), 335–349. doi: https://doi.org/10.1007/s11065-013-9242-5

Schatz, P., & Ferris, C. S. (2013). One-month test–retest reliability of the ImPACT test battery. Archives of Clinical Neuropsychology, 28, 499–504.

Schatz, P., & Robertshaw, S. (2014). Comparing post-concussive neurocognitive test data to normative data presents risks for under-classifying “above average” athletes. Archives of Clinical Neuropsychology, 29(7), 625–632. doi: https://doi.org/10.1093/arclin/acu041

Schmidt, J. D., Register-Mihalik, J. K., Mihalik, J. P., Kerr, Z. Y., & Guskiewicz, K. M. (2012). Identifying impairments after concussion: Normative data versus individualized baselines. Medicine and Science in Sports and Exercise, 44(9), 1621–1628. doi: https://doi.org/10.1249/MSS.0b013e318258a9fb

Walton, S. R., Broshek, D. K., Freeman, J. R., Cullum, C. M., & Resch, J. E. (2018). Valid but invalid: Suboptimal ImPACT baseline performance in university athletes. Medicine and Science in Sports and Exercise.
