Dietary Assessment Methods to Estimate (Poly)phenol Intake in Epidemiological Studies: A Systematic Review

Abstract Nutritional epidemiological studies have frequently reported associations between higher (poly)phenol intake and a decrease in the risk or incidence of noncommunicable diseases. However, the assessment methods that have been used to quantify the intakes of these compounds in large-population samples are highly variable. This systematic review aims to characterize the methods used to assess dietary (poly)phenol intake in observational studies, report the validation status of the methods, and give recommendations on method selection and data reporting. Three databases were searched for publications that have used dietary assessment methods to measure (poly)phenol intake, and 549 eligible full texts were identified. Food-frequency questionnaires were found to be the most commonly used tool to assess dietary (poly)phenol intake (73%). Published data from peer-reviewed journals were the major source of (poly)phenol content data (25%). An increasing number of studies have used open-access databases such as Phenol-Explorer and the USDA databases on flavonoid content since their inception; these accounted for 11% and 23% of the data sources, respectively. Only 16% of the studies reported a method that had been validated for measuring the target (poly)phenols. For future research we recommend: 1) selecting a validated dietary assessment tool according to the target compounds and target period of measurement; 2) applying and combining comprehensive (poly)phenol content databases such as USDA and Phenol-Explorer; 3) detailing the methods used to assess (poly)phenol intake, including the dietary assessment method and the (poly)phenol content data source; 4) following the Strengthening the Reporting of Observational Studies in Epidemiology-Nutritional Epidemiology (STROBE-nut) framework; and 5) complementing dietary intake assessment based on questionnaires with measurement of (poly)phenols in biofluids using appropriate and validated analytical methods.


Introduction
Diet is one of the most important modifiable factors for the prevention and management of noncommunicable diseases (1,2). In recent decades, the understanding of diet has evolved from what was believed to be a limited combination of 150 identified nutrients into a much wider range of components including non-nutrients and potentially bioactive compounds such as phytochemicals (3). The development [...] nutritional intervention trials (10). Several large prospective studies such as the Nurses' Health Study (11,12), the Health Professionals Follow-Up Study (13), and the European Prospective Investigation into Cancer and Nutrition (EPIC) (14) have reported that higher intake of (poly)phenols is associated with a lower risk of cancer and cardiovascular disease. However, the results of epidemiological studies are based on the assumption that the assessment of the exposure of interest is reliable and accurate. While dietary assessments of various nutrients (macronutrients, fibers, minerals, and vitamins) are well established through routine nutrient database assessment and validated assay methods (15), the assessment of novel bioactives such as (poly)phenols in free-living population groups is still in its infancy (Figure 1). Remaining challenges include unknown errors from self-reporting, varied study designs and tools, unstandardized data coding and processing, and limited sources of food content data.

FIGURE 1 Assessment methods of (poly)phenol intake and key points to notice. Dietary assessment and biomarkers are 2 approaches to estimate dietary (poly)phenol intake. In the dietary assessment approach, dietary intakes via food sources of (poly)phenols are estimated by dietary assessment tools such as FFQs, food records, or 24-h recalls. Food content data of (poly)phenols can be obtained from Phenol-Explorer, USDA, some country-based databases, or self-analyzed data. Food intake data are matched with available (poly)phenol content data by individual items and multiplied to calculate (poly)phenol intakes (mg/d). Key points to notice from each step are also listed in the corresponding boxes. FFQ, food-frequency questionnaire; STROBE-nut, Strengthening the Reporting of Observational Studies in Epidemiology-Nutritional Epidemiology.
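The matching-and-multiplying step described in the Figure 1 caption can be illustrated with a minimal sketch. The names `FoodIntake` and `polyphenol_intake_mg_per_day` are hypothetical, the content values are invented for illustration rather than taken from any database, and content is assumed to be expressed in mg per 100 g of food.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoodIntake:
    food: str
    grams_per_day: float


def polyphenol_intake_mg_per_day(intakes, content_mg_per_100g):
    """Match each food to its content value and sum intake in mg/d.

    Returns the total and the list of foods with no content data, so that
    unmatched items can be reported rather than silently counted as zero.
    """
    total = 0.0
    unmatched = []
    for item in intakes:
        content = content_mg_per_100g.get(item.food)
        if content is None:
            unmatched.append(item.food)
            continue
        total += item.grams_per_day * content / 100.0
    return total, unmatched


# Invented example values (not real composition data).
diet = [
    FoodIntake("black tea", 500.0),
    FoodIntake("apple", 150.0),
    FoodIntake("rye bread", 80.0),
]
content = {"black tea": 100.0, "apple": 200.0}

total_mg, unmatched_foods = polyphenol_intake_mg_per_day(diet, content)
print(total_mg, unmatched_foods)
```

Returning the unmatched foods alongside the total is a design choice of this sketch: it keeps missing content data visible, which matters because missing values lead to underestimated intake.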
To better understand the health benefits of (poly)phenols, accurate and reliable methods to measure (poly)phenol intake are required. Given the increasing reporting of (poly)phenol intake in nutritional epidemiology, there is an urgent need to understand the strengths and limitations of the methods currently used in published studies. Previous systematic reviews investigating the relation between polyphenol intake and health outcomes (16-19) have reported significant heterogeneity across studies, which could largely come from the different assessment methods used. To date, no study has described and compared the performance of different tools for estimating (poly)phenol intake.
This systematic review aims to 1) characterize the observational studies reporting (poly)phenol intake, 2) report current validation status of the assessment methods of (poly)phenol intake, and 3) provide recommendations on choosing the right tools and framework on reporting (poly)phenol intake in nutritional epidemiological studies.

Information extraction and synthesis
An information extraction tool was first developed and tested on pilot data of 3 full texts to refine the tool. Reviewers read the full texts of studies that met the inclusion criteria and retrieved information using a standard database in Microsoft Excel (Microsoft Corporation). The following information was extracted: 1) first and corresponding author's name; 2) year of publication; 3) country or region, study name, study design, and number and characteristics of subjects; 4) dietary assessment methods (including validation status of the method); 5) (poly)phenol content database; and 6) adjustments made in reporting (poly)phenols.
A narrative approach was taken in the synthesis of the results. The included papers from the same study or cohort were grouped. Qualitative analyses were conducted to determine the frequency of different dietary assessment methods and (poly)phenol content databases used in the included papers. For studies that had reported using a validated method to measure (poly)phenol intake, additional information on 1) reference methods, 2) statistical analysis method, and 3) validity of the method was extracted. For papers reporting both dietary intake and biomarker concentrations, the analytical methods and correlations between the 2 measurements were also extracted.

Results
The study selection process of the systematic review is presented in Figure 2. Among a total of 7882 records obtained from searching, 5386 unique records were screened by title and 1567 were screened by abstract. Then, 729 full texts were examined further and 182 papers were excluded for the following reasons: no (poly)phenol assessment conducted (n = 25), (poly)phenol assessment based only on biomarkers (n = 46), data not available as a full text (n = 61), intervention conducted (n = 14), duplicate of an included paper (n = 23), review (n = 11), and not relevant (n = 2). Two papers were included through hand-search. In the end, 549 papers were included in the qualitative synthesis of data. Characteristics of the included studies are shown in Supplemental Table 2. Quality of the included papers based on the 6 questions was as follows: 33% were ranked good, 60.5% were fair, and 6.5% were poor (Table 1).
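The counts reported in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Numbers taken from the study selection paragraph.
full_texts_examined = 729
exclusions = {
    "no (poly)phenol assessment conducted": 25,
    "assessment based only on biomarkers": 46,
    "data not available as a full text": 61,
    "intervention conducted": 14,
    "duplicate of an included paper": 23,
    "review": 11,
    "not relevant": 2,
}
hand_searched = 2

excluded_total = sum(exclusions.values())
included = full_texts_examined - excluded_total + hand_searched

# Quality rankings (percent of included papers).
quality_percent = {"good": 33.0, "fair": 60.5, "poor": 6.5}
quality_total = sum(quality_percent.values())

print(excluded_total, included, quality_total)
```

The exclusion reasons sum to the stated 182 papers, the included total matches the stated 549, and the quality rankings sum to 100%.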

Discussion
The credibility of nutritional epidemiological research relies on the use of valid and reliable tools to measure dietary exposures. To our knowledge, this is the first systematic review that has characterized and critically evaluated the methods used to measure dietary (poly)phenol intake in epidemiological studies.
A multistage process is used to estimate dietary (poly)phenol intake in nutritional epidemiological studies, as detailed in Figure 1. Dietary assessment requires the recording of food and beverage intake by participants; however, methods of collection differ in their level of detail. Different dietary assessment tools, such as FFQs, food diaries, and 24-h recalls, vary in their ability to capture the food sources of dietary (poly)phenols according to their design and method of validation (Table 3). In this study we found that FFQs are the most popular dietary assessment tools used to measure food sources of (poly)phenol intake. This is likely due to the low burden of the method on participants and researchers alike, and its ability to measure long-term exposure to dietary factors (205). However, compared with dietary recalls and records, FFQs have limited ability to cover the wide range of food sources of (poly)phenols and to differentiate food items, because of the predefined list of food groups covered in the questionnaire. Moreover, the structure and food groups included in FFQs can differ between studies depending on the research questions. For example, if an FFQ is used to measure total flavonoid intake and intake of flavonoid subclasses, the list should cover important sources of flavonoids such as tea, fruits and vegetables, soy products, legumes and beans, cocoa products, and red wine (206). At the same time, each FFQ item should cover only 1 type of food with a distinct (poly)phenol content profile, and all items should be listed separately (207). In many FFQs the potential to measure subclasses of polyphenols is hampered by the combining of items in FFQ categories, for example, red and white wine (67) and apples and pears (12,67). Unlike FFQs, which consist of a predefined list of food groups and frequencies of intake, dietary recalls or food records are not restricted and allow matching of individual food items with (poly)phenol content data. However, repeat measurements are needed to enable the dietary data to represent the time period of estimation, especially for 24-h dietary recalls (208). For example, 24-h recalls should be repeated 3 times during a 7-d period, including 2 weekdays and 1 weekend day, to represent habitual dietary intake (134,199,211). Food records should be conducted in different seasons to be able to represent yearly intake (46,212). In this review we found that ∼15% of the studies used 24-h/48-h recalls or food diaries to measure food sources of (poly)phenols, which is much lower than the proportion of studies using FFQs. This may be due to the higher burden on participants and researchers when using dietary recalls or records (209). Clear instructions on completion and photos of portion sizes (45,47,213-215) are recommended to support the participants, while standardized coding protocols and trained coders are needed to interpret the questionnaires consistently and to a high standard (209). The strengths and limitations of different methods in measuring (poly)phenol intakes are listed in Table 4.

Challenges (with recommendations/resources needed)

Challenge: Dietary assessment tool not designed to capture (poly)phenol diet sources and variabilities
1) Choose a tool that covers the food sources of target compounds and differentiates foods with different (poly)phenol profiles
2) Consider the frequency and timing of measurement to make sure the target time period is represented
3) Use multiple measurements of dietary records rather than FFQs if possible

Challenge: Dietary assessment methods not validated/insufficiently validated to measure (poly)phenol intakes
1) Validate the tool specifically for measuring the intake of target (poly)phenols
2) Use other well-established dietary assessments and established biomarkers as reference methods
3) Conduct multiple statistical analyses to reflect validation status: correlation coefficients, cross-classification (Cohen's κ), Bland-Altman
4) Provide evidence of validity and reproducibility

Challenge: Limited data on (poly)phenol content in foods
1) Choose a database that covers the content data of all food sources of the target compound; combine different sources of data to make up for the limitations of single databases
2) Choose databases of high quality: with reliable analytical methods and data sources, and consistent data between multiple sources; use data from comparable analytical methods if values need to be summed into a total
3) Choose the data that can match up with the food item in the measured diet, in terms of food origin and species; apply food-processing yield factors if applicable
4) Check the updates of the database and search for newly published data if possible
5) Use standard recipes that can reflect the diet in the target population

Challenge: Insufficient reporting on methods
1) Follow the STROBE-nut framework (21)
2) Describe the dietary assessment methods used in detail: food groups and number of items measured, whether similar foods are distinguished in items; how the assessment was conducted, time range measured, and validation of the methods
3) Report clearly whether the dietary assessment method is validated for targeted (poly)phenols; if it is validated, describe the reference method used including sample size and characteristics of the population, how the reference method was conducted, statistical analysis methods used, and validity/reproducibility results; if biomarkers are used to validate the dietary assessment, report details of the biomarkers and analytical methods applied
4) Report the name of the database used or cite the reference paper; describe the analytical method used to get the food content data and whether compounds were measured individually or as aglycones; report the retention factors used
5) Report how food items were matched, how missing items and missing compound values were analyzed, and the adjustment made on the intake amount

FFQ, food-frequency questionnaire; STROBE-nut, Strengthening the Reporting of Observational Studies in Epidemiology-Nutritional Epidemiology.
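The three validation statistics named in the recommendations table above (correlation coefficients, cross-classification with Cohen's κ, and Bland-Altman analysis) can be sketched in plain Python. This is a minimal illustration under several assumptions: the intake values are invented, κ is the unweighted form, cross-classification uses rank-based tertiles, and the limits of agreement use the conventional mean difference ± 1.96 SD. None of these choices is prescribed by the review.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def tertiles(values):
    """Assign each value to tertile 0, 1, or 2 according to its rank."""
    n = len(values)
    return [min(int((r - 1) * 3 / n), 2) for r in average_ranks(values)]


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two categorical classifications."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


def bland_altman(x, y):
    """Return mean difference (bias) and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented intakes (mg/d) from a test FFQ and a reference food record.
ffq = [120, 340, 560, 210, 800, 450]
records = [100, 300, 600, 250, 700, 650]

rho = spearman(ffq, records)
kappa = cohen_kappa(tertiles(ffq), tertiles(records))
bias, lower_loa, upper_loa = bland_altman(ffq, records)
print(round(rho, 3), round(kappa, 3), round(bias, 1))
```

Reporting all three together addresses the concern that correlation alone shows ranking agreement but not agreement in absolute intake, which the Bland-Altman bias and limits of agreement make visible.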
In terms of (poly)phenol content data source, we found that open-access databases are becoming the most widely used resources for estimating (poly)phenol content of foods in the studies we reviewed. The development of the USDA databases in 1999 (216) and Phenol-Explorer in 2010 (217) has led to a growing number of researchers using these comprehensive databases in their studies over the last 20 y. Many papers combined different sources of (poly)phenol data to serve the purpose and scope of the individual studies. For example, many studies applied both USDA and Phenol-Explorer databases to cover the wide range of food items measured in the dietary assessment. Meanwhile, some other studies combined data from domestic databases to match up with the diet of the local population, such as Chinese food (218-221), Korean food (222-225), and UK food (81,182,226,227). Data from published papers are also commonly applied to cover the food sources of (poly)phenols that do not appear in the databases. A systematic review that included 157 studies published between 2004 and 2014 reporting food-composition tools for (poly)phenol intake assessment (228) found that 60% of studies used publicly accessible databases (including USDA, Phenol-Explorer, country-based databases, and other public databases according to the groupings in the current study), and 33% of the literature applied >1 database. This result is in accordance with our findings, where 49% of studies used publicly accessible databases and 20% of studies used >1 data source of (poly)phenol contents. The Phenol-Explorer database and USDA database are the 2 most comprehensive databases on (poly)phenol content in foods. The Phenol-Explorer database retrieves all classes and subclasses of (poly)phenol content data in foods published in scientific papers, books, and reviews and includes critical evaluations of experimental details on sampling, (poly)phenol extraction, and analytical methods (217).
Mean values of each (poly)phenol content are provided in different categories according to the analytical method used, such as chromatography, chromatography after hydrolysis, and the Folin assay method (217). In addition, retention factors of compounds after food processing are also available (229). The USDA database for flavonoid content is mainly focused on a specific number of flavonoid compounds, which are retrieved from published papers and evaluated for quality using a standardized procedure and scoring system developed by the Nutrient Data Laboratory of the USDA (87). Flavonoid content data from the United States and other countries are included in the database. Only the data generated by acceptable analytical methods that can result in good separation of the target compounds, such as HPLC, capillary zone electrophoresis, and micellar electrokinetic capillary chromatography, are included (87). Unlike Phenol-Explorer, which shows content data from different methods separately, the USDA content data are measured as glucosides and converted into aglycones to be comparable and consistent across the database. These 2 databases are free to access for the public, include data from relatively acceptable analytical methods, and integrate different sources to provide reliable (poly)phenol content data.
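The conversion of glycoside values into aglycone equivalents mentioned above can be illustrated as a scaling by the ratio of molar masses. This is a sketch under the assumption that the conversion is a simple molar-mass ratio; the function name is hypothetical, the molar masses are approximate, and the example amount is invented.

```python
# Approximate molar masses in g/mol (illustrative values).
MOLAR_MASS = {
    "quercetin": 302.24,
    "quercetin 3-O-glucoside": 464.38,
}


def to_aglycone_equivalents(glycoside_mg, glycoside, aglycone, molar_mass=MOLAR_MASS):
    """Express a glycoside amount as the mass of its aglycone moiety."""
    return glycoside_mg * molar_mass[aglycone] / molar_mass[glycoside]


# 10 mg of the glucoside corresponds to roughly 6.5 mg of quercetin aglycone.
aglycone_mg = to_aglycone_equivalents(10.0, "quercetin 3-O-glucoside", "quercetin")
print(round(aglycone_mg, 2))
```

Because the aglycone is only part of the glycoside molecule, intakes expressed as aglycones are numerically lower than the same intakes expressed as glycosides, so totals from sources using different conventions are not directly comparable.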
The currently available databases have limitations that may hinder the accuracy of (poly)phenol measurement. First, many foods and compounds are missing from the databases due to the lack of analytical data, which would lead to underestimation of the dietary intake of less-studied compounds and foods. In both the Phenol-Explorer and USDA databases, content data are frequently available for only a small number of phenolic compounds in a given food item. Therefore, underestimation of intake can occur when calculating total (poly)phenol intake by summing the intakes of individual classes and subclasses of compounds. Second, the analytical methods that have been used to measure (poly)phenols in food are not consistent in accuracy. Some of the food content data are only available from spectrophotometric methods such as the Folin-Ciocalteu method (230). The Folin method is a colorimetric method measuring levels of total antioxidant capacity rather than total phenolics (231). Data from these spectrophotometric methods are highly inaccurate compared with the content data from analytical methods that can quantify the compounds individually, such as HPLC. In addition, many (poly)phenols are quantified with standards of their parent compounds (e.g., resveratrol glucosides quantified with resveratrol) or similar compounds (e.g., tyrosol quantified with hydroxytyrosol) (232). Even though this is common practice, especially when authentic standards are not commercially available, quantifying compounds with other standards can lead to inaccurate results (233). In addition, the content data may not be reliable if they are derived from a small number of studies, because of interlaboratory variability. Furthermore, the databases are usually updated only at long intervals; therefore, there is a time lag between newly published values and database updates.
Last, the information can lack details on the multiple factors influencing the polyphenol content of food, such as origin, species, storage, and processing procedures. As with nutrients, the food contents of phytochemicals can be highly variable under the influence of the above factors (207). Domestic data may be more accurate than data from other countries; however, country-based databases cover a limited number of compounds (234,235) because of the huge expense and difficulties in analysis. Phenol-Explorer has been updated with yield factors related to cooking in recent years (229); however, the data available are still limited. Although more data and improvements in data quality are needed, the establishment of these databases is a very useful step towards more accurate analysis.
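Combining several content data sources and reporting which foods remain uncovered can be sketched as a prioritized lookup. The function names, the priority order (domestic data first), and all values below are assumptions made for illustration; they do not describe how any specific study or database works.

```python
def lookup_content(food, databases):
    """Return (mg per 100 g, source name) from the first database listing the food."""
    for name, table in databases:
        if food in table:
            return table[food], name
    return None, None


def coverage_report(foods, databases):
    """Resolve each food against the databases and report unmatched items."""
    matched = {}
    missing = []
    for food in foods:
        value, source = lookup_content(food, databases)
        if source is None:
            missing.append(food)
        else:
            matched[food] = (value, source)
    coverage = len(matched) / len(foods) if foods else 0.0
    return matched, missing, coverage


# Invented tables standing in for a domestic database and two public ones.
databases = [
    ("domestic", {"green tea": 90.0}),
    ("USDA-like", {"green tea": 85.0, "apple": 30.0}),
    ("Phenol-Explorer-like", {"apple": 28.0, "rye bread": 5.0}),
]
foods = ["green tea", "apple", "rye bread", "durian"]

matched, missing, coverage = coverage_report(foods, databases)
print(matched, missing, coverage)
```

Recording the source of each value and the list of unmatched foods supports the reporting items discussed in this review, and it flags where values from different analytical methods may have been mixed.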
While many studies used a tool validated for measuring nutrient intake, most of these tools were not validated for the target (poly)phenols. This limitation may introduce an unknown amount of systematic error in the estimation. The validity of measuring (poly)phenol intake could differ from the validity of measuring other nutrients or foods, especially considering the challenges in dietary assessment tools and food content databases mentioned previously. In addition to the low number of validated studies, we found the quality of the validation studies to be low, with 50% of the studies ranked as "fair" and 13% as "poor." We identified the following concerns: 1) most of the validation evidence was provided only by correlation coefficients with estimations derived from other dietary assessment methods, 2) no evidence of reproducibility was provided in most studies, and 3) the validation study design and results were insufficiently reported. The poor validation and reporting of (poly)phenol assessment restrict the evaluation of the existing evidence in meta-analysis.
The last data extraction of this study was conducted in May 2020. At the time of writing the manuscript, further papers reporting dietary assessment of (poly)phenol intakes had been published (236-239). In agreement with our findings, most of the papers (236-238) used FFQs to estimate (poly)phenol intake. Phenol-Explorer (236,239) and USDA databases (236-238) were used as (poly)phenol composition data sources. Yue et al. (236) reported moderate to high validity (Spearman's rank correlation coefficients were 0.4-0.7 or ≥0.7) and high reproducibility (rank intraclass correlation coefficients were ∼0.8) of an FFQ for reporting flavonoids compared with two 7-d weighed dietary records, using both the Phenol-Explorer database and a Harvard database that was mainly based on the USDA database.
Outside the remit of this review, it is important to mention that another approach to estimate (poly)phenol intake in epidemiological studies is the use of biomarkers of (poly)phenol intake in biofluids. This approach is considered to be more objective as it directly reflects "bioavailable" (poly)phenol exposure levels and does not depend on self-reported data and inaccuracies of tools and databases. The dietary assessment and polyphenol database method is simple and easy to conduct, although it is prone to errors resulting from misreporting (240) and limited information in the current databases (241). Biomarkers of (poly)phenol intake can be used to validate or calibrate the dietary assessment approach. Therefore, the integration of (poly)phenol biomarkers into the dietary assessment can provide a more robust result, especially when linking (poly)phenol intakes to health outcomes (242). However, the biomarker method requires access to specialized analytical techniques such as LC and MS, which are less accessible than dietary assessment. The accuracy of the analytical methods depends largely on the availability of authentic chemical standards, and validation of the methods is also needed. In addition, the short half-life of many (poly)phenol metabolites could hamper their potential to represent habitual diet (242). Despite the fast development in this field, there are very few validated, efficient, and accessible methods that are available for use in epidemiological studies (210). In this study we found a limited number of studies (n = 57, 10%) that reported both dietary intake and biomarker concentrations of (poly)phenols and, of these, only 43 (75%) reported the correlation coefficients between the 2 measurements. The correlation coefficients varied widely between different samples, compounds, analytical methods used to measure biomarkers, and dietary assessment methods.
Interestingly, in a few studies (44,45,62,67,68), dietary intake of (poly)phenols correlated better with (poly)phenol biomarkers when intake was estimated from food diaries or recalls than when it was estimated from FFQs, which indicates an advantage of food records or recalls. In future studies that measure dietary intake of (poly)phenols, measurement of biomarkers should be taken into consideration. Also, more efforts are needed in the development of analytical methods that are validated for measuring (poly)phenol biomarkers and, at the same time, are suitable (fast, high-throughput) for use in large epidemiological studies.
There has been an exponential increase in nutritional epidemiology studies reporting associations between (poly)phenol intake and health outcomes (17-19). However, it remains a challenge to advise the public on the likely intake level that is beneficial to health, because of the methodological issues in measuring (poly)phenol intake identified in this review, including limited ability and validity of the dietary assessment tools, limited food content data of (poly)phenols, and insufficient reporting of the results (Table 4). To strengthen the quality of evidence on (poly)phenol intake and health, our recommendations on choosing dietary assessment methods are summarized in Table 4. The first step is to describe clearly the scope of the estimation: identify a target compound or group of (poly)phenols and define a target time period of measurement according to the research question. When choosing the dietary assessment tool, careful consideration should be given to selecting one that can cover the food sources of the target compounds and represent the diet in the target time range. The dietary assessment tool should be validated for the target compounds against other, more robust dietary assessment tools or, ideally, against biomarkers of (poly)phenol intake. If possible, the use of multiple measurements of dietary records to collect dietary intake data is recommended. The chosen food content database of (poly)phenols should cover the content data of food sources of the target compounds. The combination of USDA and Phenol-Explorer databases is the most comprehensive approach at the moment. The use of domestic databases and recipes, if available, to match the diet of the population is also recommended. The reporting of observational studies estimating (poly)phenol intake should follow the STROBE-nut framework (21), including additional details that are specific to (poly)phenol analysis as described in Table 4.
In summary, the findings of this systematic review suggest that further research is needed to develop tools that are specifically designed to measure (poly)phenol intake. Improvements in current food content databases are also essential to provide more reliable, detailed, and up-to-date data. International collaborations on setting up standards and guidance on food content analysis regarding phytochemical compounds are also needed. Validation of the tools, especially combining the biomarker or metabolomics approach to validate or calibrate the dietary assessment methods, could provide more reliable evidence on relations between (poly)phenol intake and health outcomes. Future research should complement the dietary intake data with quantification of biomarkers of (poly)phenol intake. Therefore, development of fast, high-throughput, sensitive, and accurate analytical methods to measure concentrations of phenolic metabolites in biofluids is also needed. Understanding the different methods of measurement and their strengths and limitations, as set out in this review, is an important step towards developing a standardized approach to measuring and reporting dietary (poly)phenol intake. This will enable comparison between studies and future pooling of results in systematic reviews to strengthen the evidence base.