Background: The efficacy of breast self-examination in helping to reduce mortality from breast cancer has not been rigorously demonstrated. Purpose: To assess efficacy, a large, randomized trial was initiated in Shanghai, China. Methods: From October 1989 to October 1991, 267040 current and retired female employees associated with 520 factories in the Shanghai Textile Industry Bureau were randomly assigned on the basis of factory to either a self-examination instruction group (133375 women) or a control group (133665 women). The women were born within the period from 1925 through 1958. Women in the instruction group were given intensive training in breast self-examination, including the use of silicone breast models and personalized instruction, plus two subsequent reinforcement sessions and multiple reminders to practice the technique. Women in the control group were asked to attend training sessions on the prevention of low back pain. All women have been followed for the development of breast diseases and for death from breast cancer. Results: A high level of participation during the first 4–5 years of the trial was documented among women in the instruction group. Randomly sampled women in this group demonstrated greater proficiency in detecting lumps in breast models than did randomly sampled women in the control group. Approximately equal numbers of breast cancers were detected in the two groups (331 in the instruction group and 322 in the control group) through 1994, which is the last year for which case-finding efforts have been completed. The breast cancers detected in the instruction group were not diagnosed at an appreciably earlier stage or smaller size than those in the control group. More benign breast lesions were detected in the instruction group than in the control group (1457 versus 623, respectively), suggesting a higher index of suspicion for women who received training. 
Cumulative breast cancer mortality rates through 5 years from entry into the study were nearly equivalent for the two groups. Conclusions: Breast self-examination has not led to a reduction in mortality from breast cancer in this study cohort in the first several years since the trial began. A shift toward the diagnosis of disease at a less advanced stage in women given instruction has also not been demonstrated. Longer follow-up of participants in this trial is required before final assessment can be made of the efficacy of breast self-examination. Implications: At this time, there is insufficient evidence to recommend for or against the teaching of breast self-examination. [J Natl Cancer Inst 1997;89:355-65]
Secondary prevention, the detection and treatment of early disease, is the only method currently available to reduce mortality from breast cancer ( 1 – 8 ). Although mammography, with and without clinical breast examination by a physician or nurse, has been shown to be effective in reducing mortality from breast cancer in women who are older than 50 years ( 1 – 3 ), it requires considerable expenditure of funds; therefore, it is not practical for use in some populations. Breast self-examination (BSE) requires less monetary expense and is a procedure that can be performed by every woman on a regular basis. Theoretically, this procedure should be of value in detecting interval cancers that become palpable between annual mammographic screenings, and it may be more sensitive than a clinical breast examination because women may be able to detect subtle changes in their own breasts that might not be discernible on manual palpation by others.
Most studies have shown that women who reported regular practice of BSE presented with smaller tumors ( 9 – 16 ) and neoplasms that had less frequently spread to the axillary lymph nodes ( 9 – 15 , 17 ) than women who did not practice BSE. More favorable survival rates have also been observed among women with breast cancer who reported having practiced BSE than among women who did not ( 18 ), but the extent to which this is due to lead time bias is unknown, and one U.S. study ( 19 ) showed that women who regularly practiced this procedure tended to have other characteristics associated with favorable disease outcome. One case-control study ( 20 ) showed that the frequency of BSE did not differ between women with advanced breast cancer and unaffected control subjects. However, among women reporting a thorough self-examination, BSE was associated with a decrease in the occurrence of advanced breast cancer, regardless of the frequency with which it was performed, suggesting some benefit from the procedure if it is performed regularly and correctly. In Finland ( 21 ), women who returned a calendar indicating that they had practiced BSE (largely monthly) in the 1-year period following BSE instruction had lower mortality rates from breast cancer in years 3 through 6 of follow-up than women in the general population. However, mortality rates from all causes were also lower in the study group, suggesting a “healthy volunteer” effect.
In a nonrandomized trial in the U.K. ( 22 ), 45- to 64-year-old women in two study centers were identified from general practitioners' lists in 1979–1981 and were invited to attend BSE classes. Compared with women in four comparison centers, those in the center with the lower attendance rate (31%) had a breast cancer standardized mortality ratio (SMR) of 0.78, but those in the center with the higher attendance rate (53%) had a breast cancer SMR of 1.14. Although these results provide no evidence that BSE is efficacious in reducing deaths from breast cancer, the study had limited power to detect any true effect that is not large because of the low level of attendance.
The only previous randomized trial of BSE was conducted in Leningrad (now St. Petersburg) and Moscow, but only results from the Leningrad portion of the trial are available ( 23 , 24 ). More than 120000 40- to 84-year-old women who attended 28 polyclinics were randomly assigned on the basis of clinic to a BSE instruction or a control group. After 8 years of follow-up, 81 breast cancer deaths occurred among the 60 221 women in the BSE instruction group compared with 74 deaths among the 60 089 women in the control group ( 24 ), indicating no beneficial effect of BSE. However, compliance with BSE instruction fell precipitously so that, by the 4th year after training, just 18% of the women reported monthly practice ( 23 ). This study, therefore, has not provided an adequate evaluation of the efficacy of BSE.
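As a rough check on the Leningrad figures quoted above, the crude cumulative breast cancer mortality in each arm and the ratio between the arms can be computed directly from the reported counts. The sketch below is illustrative only: the function name is hypothetical, and the calculation ignores the clinic-level (cluster) randomization, so it gives no information about statistical precision.

```python
def crude_mortality_ratio(deaths_a, n_a, deaths_b, n_b):
    """Return (rate_a, rate_b, rate_a / rate_b) as cumulative deaths per woman."""
    rate_a = deaths_a / n_a
    rate_b = deaths_b / n_b
    return rate_a, rate_b, rate_a / rate_b


# Counts as quoted in the text: 81 deaths among 60 221 women (BSE instruction)
# and 74 deaths among 60 089 women (control) after 8 years of follow-up.
instruction_rate, control_rate, ratio = crude_mortality_ratio(81, 60221, 74, 60089)

print(f"instruction: {instruction_rate * 1e5:.1f} deaths per 100 000 women")
print(f"control:     {control_rate * 1e5:.1f} deaths per 100 000 women")
print(f"crude ratio (instruction / control): {ratio:.3f}")
```

A ratio slightly above 1 is consistent with the reading that the Leningrad data show no beneficial effect of BSE.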
The efficacy of BSE in reducing mortality from breast cancer thus remains unproved, and in 1995 the U.S. Preventive Services Task Force concluded that there is insufficient evidence to recommend for or against the teaching of BSE ( 25 ). The need to rigorously evaluate BSE for efficacy in reducing mortality from breast cancer has been recognized for more than a decade. A 1983 World Health Organization consultation on “Self-examination in Breast Cancer Early Detection Programs” ( 26 ) and a 1989 workshop of the International Union Against Cancer Project on Evaluation of Screening for Cancer ( 27 ) both recommended randomized trials of BSE in multiple settings. A clear demonstration of the efficacy of BSE could be an important motivational factor in convincing women to practice this procedure, and it would justify the expenditure of funds to train women in proper techniques and encourage regular practice. Conversely, well-conducted trials showing no evidence of efficacy would justify utilization of limited funds for other purposes.
We report here preliminary results from the only randomized trial of which we are aware that is likely to yield additional valid information on the efficacy of BSE in reducing mortality from breast cancer.
Subjects and Methods
The Setting
This study is being conducted in the Shanghai Textile Industry Bureau (STIB) in Shanghai, China. In 1989, when the study was initiated, the STIB was composed of 520 work units ranging in size from fewer than 100 to more than 10 000 female employees. Although some work units are educational, medical, administrative, and management facilities, most are individual factories, and work units and factories are used synonymously in this article. Under the Chinese socialist system, women upon entering the work force were assigned to a work unit. Most women also lived in housing allocated to them by their work unit and received their primary medical services in clinics located in their assigned factories. They also were provided shower facilities in their factories. Few women transferred from one factory to another, and, upon retirement, women continued to receive their primary medical care and housing as well as their pensions through their factory. Delivery of all benefits for retired workers was coordinated through a Retired Workers Committee in each factory. This committee maintained regular contact with retired workers and recorded dates of all deaths. About two thirds of the persons requiring medical care beyond the primary level were referred to one of three hospitals operated by the STIB, and one third were referred to other hospitals having health care contracts with individual factories. Mammography has not been available as a screening method in the STIB. This setting provided an ideal environment in which to initiate a randomized trial of BSE: Large numbers of women could be recruited and contacted periodically for purposes of BSE instruction and regular practice of BSE, for monitoring of adherence to study procedures, and for follow-up to identify breast diseases and mortality. In addition, the efficacy of BSE could be assessed in the absence of mammography.
The study is implemented in Shanghai through the STIB Station for Prevention and Treatment of Cancer, under the direction of the head of the Department of Epidemiology (D. L. Gao), who also serves as secretary of an advisory committee to the study, known as a “Leading Group.” The Chair of the Leading Group, the Deputy Director of the STIB Department of Education and Health, is responsible for all factory medical clinics. Other committee members include the directors and deputy directors of the three STIB hospitals. This group serves as the local decision-making body for the trial and assists in gaining the necessary cooperation from administrative and management units, factory managers, and hospital personnel.
The Leading Group also functions as a local institutional review board (IRB). All study procedures are reviewed by both this group and the IRB of the Fred Hutchinson Cancer Research Center. Because of the innocuous nature of all study procedures and the low level of education of some textile workers, the study subjects were asked to give verbal rather than written consent to participate.
The primary links to the study subjects are through 34 nurses or former STIB factory medical workers, referred to as “BSE workers,” who work with approximately 5000 factory medical workers to jointly conduct all field operations. Ten additional individuals plan and monitor all field activities, operate a tumor and death registry for the STIB, collect histologic slides and abstract information from medical records on all women who had breast lesion biopsies, ascertain the cause of death in women suspected of dying of breast cancer, process data, and send data and slides to Seattle.
All study instruments and instruction manuals are developed in Seattle in consultation with colleagues in Shanghai, translated into Chinese by native Shanghainese-speaking individuals, pilot tested in one to seven pilot study factories, and modified if necessary before implementation. Most data are either optically scanned or key entered in Shanghai and transmitted to Seattle on diskettes. A data-processing unit in Seattle developed all key entry screens and programs for the optical scanner, reviews and edits all data from Shanghai, incorporates data into a computerized database, and prepares files for analysis.
Factories were stratified by total numbers of workers as obtained from factory records (five strata) and hospital of affiliation (four strata, one for each of the STIB hospitals and one for all others), and factories in each stratum were randomly allocated to the BSE instruction or control group. All eligible women in each factory were assigned to the study arm of their factory. Implementation of baseline study activities took about 2 years. Women in the factories in each study arm who were recruited during the 1st year were designated as belonging to study group 1; the remaining women were designated as belonging to study group 2. In all subsequent cycles of study activities, this grouping has been retained, so that in any one year, study procedures are administered to half of the women in each arm of the trial.
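The allocation described above, with factories grouped by size stratum and hospital of affiliation and then randomized within each stratum, can be sketched as follows. This is a minimal illustration and not the procedure actually used in the trial: the function name, the seed, the toy factory list, and in particular the assumption that each stratum is split as evenly as possible between the two arms are all hypothetical, since the text states only that factories in each stratum were randomly allocated.

```python
import random
from collections import defaultdict


def allocate_factories(factories, seed=0):
    """Randomly split factories between two arms within each stratum.

    `factories` is a list of (factory_id, size_stratum, hospital_stratum).
    Returns a dict mapping factory_id to "instruction" or "control".
    Assumption: strata are split as evenly as possible between the arms.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for factory_id, size_stratum, hospital_stratum in factories:
        strata[(size_stratum, hospital_stratum)].append(factory_id)

    assignment = {}
    for key in sorted(strata):
        members = sorted(strata[key])
        rng.shuffle(members)
        half = len(members) // 2
        # With an odd count, a coin flip decides which arm gets the extra factory.
        if len(members) % 2 == 1 and rng.random() < 0.5:
            half += 1
        for position, factory_id in enumerate(members):
            assignment[factory_id] = "instruction" if position < half else "control"
    return assignment


# Toy data (hypothetical): 40 factories spread over 5 size strata and
# 4 hospital strata, giving 20 strata of 2 factories each.
demo_factories = [(f"F{i:03d}", i % 5, i % 4) for i in range(40)]
demo_assignment = allocate_factories(demo_factories, seed=1)

# Every woman inherits the arm of her factory (cluster randomization).
n_instruction = sum(1 for arm in demo_assignment.values() if arm == "instruction")
print(n_instruction, len(demo_assignment) - n_instruction)
```

Because assignment is made at the factory level, any analysis that treats women as independently randomized would overstate precision; the sketch only illustrates the allocation step.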
All women born during the period from 1925 through 1958 who were permanent residents of Shanghai and either current or retired employees of the STIB were eligible to participate in the study. In 1988, the BSE workers visited each factory and, with the assistance of the medical workers, identified from factory records each woman meeting the eligibility criteria. Each eligible woman's name in Chinese characters was recorded on one line of a bilingual form, along with her worker identification number, year and month of birth, whether she was currently working or retired, and (if currently working) her workshop within the factory and whether she was a laborer or cadre (managerial level). These listings were sent to Seattle, where all data (but not the Chinese names) were key entered. The original forms with the Chinese names, plus computerized listings of women in the same order as these listings on the page facing the names, were assembled into notebooks for each factory. Each factory was assigned a study number, and the factory and notebook page and line number provided a unique identifier for each woman. These notebooks were sent to the medical clinic in each factory and provided the means by which study workers could identify the name associated with any study number, or vice versa.
Baseline activities were conducted during the period from October 1989 to October 1991. Medical workers in each factory were instructed to attempt to administer a four-page, optically scannable baseline questionnaire to all women listed in the factory notebook. In addition, they were asked to interview any other eligible women who were not listed in the notebook and to add the names of these women and other identifying information to blank lines in the notebook. The medical workers were also instructed to complete the identifying portion of the questionnaire for women who were not interviewed in order to provide a means to measure response rates. The questionnaire was designed to obtain information on the major recognized and suspected risk factors for breast cancer, plus smoking history, alcohol use, contraceptive practices, prior breast cancer, and prior screening for breast cancer.
Line 1 of Table 1 includes all women in the original census plus women who were added to the study at the time of the baseline activities. These additions included some truly eligible women who had begun work in the factories since the census was taken. They also included ineligible women who were erroneously added to the rosters. These latter women plus women in the census who had left the STIB or who had otherwise become ineligible by the time of the baseline activities are shown in line 2 of the table. The remaining 273 779 women, shown in line 3, were those eligible for inclusion in the trial. Of these, only 5410 women failed to complete the baseline questionnaire, for a response rate of 98%. Theoretically, the women who were not interviewed should have been included in the cohort because they were excluded after randomization assignments were revealed to the study workers. However, few of these women participated in further study activities, and they would have been difficult to follow. Since they represent a small proportion of the total and are distributed equally between the two arms of the study, their exclusion could not appreciably alter the results of the trial. A history of breast cancer was reported on the baseline questionnaire by 0.5% of the women in each group, and these women have also been excluded from the analyses presented in this article, leaving a final cohort size of 267 040, with nearly equal numbers of women in the two arms of the study.
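The cohort arithmetic in the preceding paragraph can be retraced from the three counts quoted in the text. The sketch below is only a consistency check: the variable names are hypothetical, and the number of women excluded for prior breast cancer is derived on the assumption that this was the only exclusion applied after the baseline questionnaire.

```python
# Counts quoted in the text.
eligible = 273_779         # women eligible for inclusion (line 3 of Table 1)
not_interviewed = 5_410    # eligible women without a baseline questionnaire
final_cohort = 267_040     # women retained for the analyses in this article

interviewed = eligible - not_interviewed
response_rate = interviewed / eligible

# Assumption: the gap between interviewed women and the final cohort is
# entirely due to women reporting a history of breast cancer at baseline.
excluded_prior_cancer = interviewed - final_cohort
prior_cancer_fraction = excluded_prior_cancer / interviewed

print(f"interviewed: {interviewed}")
print(f"response rate: {response_rate:.1%}")
print(f"excluded for prior breast cancer: {excluded_prior_cancer} "
      f"({prior_cancer_fraction:.2%} of interviewed women)")
```

Under that assumption, the derived fraction rounds to the 0.5% reported for each group.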
The factory-level randomization procedure successfully divided the study population into groups at equal risk of breast cancer. The two treatment groups were virtually identical with respect to age (60% under age 50 years) and known and suspected risk factors for breast cancer, as ascertained from the baseline questionnaire ( Table 2 ). In addition, at baseline, nearly equal proportions of women in the two groups had ever had their breasts examined by a practitioner within the past year.
BSE Instruction, Reinforcement, and Compliance Monitoring
Quality BSE includes the use of proper techniques (massaging search pattern, use of finger pads, and firm pressure) and complete coverage of the breast with proper positioning of the body ( 28 – 32 ). Motivation to continue BSE is dependent on providing personalized instruction ( 21 , 33 , 34 ) and periodic reinforcement ( 29 , 35 ). Accordingly, the training involved an initial BSE instruction class, which included individual practice, two waves of structured reinforcement activities, regular BSE practice observed by medical workers, handouts for factory medical workers, and reminder posters. This approach was designed to instill performance of proper BSE technique, to inform women about the need to practice BSE, to identify barriers to practice and how to overcome them, and to encourage women to consider BSE a routine health practice.
At the time of baseline questionnaire administration, the medical workers in the factories in the BSE instruction group scheduled women to attend BSE classes that were taught by two BSE workers. Working with groups of about 10 women and using a variety of visual aids, the BSE workers conveyed information about normal breast anatomy, breast cancer, and correct BSE technique, and they gave a demonstration of BSE and proper palpation technique using silicone breast models. They taught a three-step technique, which consisted of inspection in the mirror to look for asymmetry and dimpling, in addition to palpation in both standing and lying positions with the ipsilateral arm above the head. Palpation instruction emphasized using a circular motion with the pads of the three middle fingers while pressing firmly, systematically covering the entire breast and the axilla, and squeezing the nipple to detect discharge. Instruction also included a discussion of perceived barriers to regular practice. This group instruction was followed by individual instruction and practice for each participant on both breast models and themselves. BSE workers and medical workers observed each woman practice BSE and, if necessary, corrected her technique.
Measures of BSE practice in prior studies have been based on self-report ( 36 ), which may not be accurate ( 37 ). Measures of compliance in this study are based on observed behavior. During the year following initial BSE instruction, medical workers asked current workers to come to factory medical clinics for supervised BSE practice 1 month, 3 months, 6 months, 9 months, and 12 months after initial BSE instruction; retired workers were asked to participate in supervised BSE practice 1 month, 6 months, and 12 months after initial BSE instruction. In all years after the first, supervised BSE practice was scheduled every 6 months for all women. Supervised BSE ended in July 1995. During these sessions, medical workers observed the women practice BSE, corrected and reinforced their BSE technique, and recorded attendance on an optically scanned form.
One year after the initial BSE instruction, medical workers in each factory scheduled women in groups of approximately 10 to attend a reinforcement session. After taking attendance, the medical workers showed a video (developed by the study team) that emphasized the importance of BSE and reviewed BSE technique. They then discussed the importance of regular BSE practice, using as a focus a reminder poster entitled “Protect your own health with your own hands.” This slogan also appears in the video and has become the slogan for the study. The reinforcement session concluded with supervised BSE practice.
A second wave of BSE reinforcement activities was carried out from 1993 to 1995. A video entitled “BSE Right or Wrong” illustrated examples of correct and incorrect techniques. BSE workers showed a segment that included a mistake in BSE technique and then stopped the tape. They asked the women to identify the error in the segment. After the error was identified, the group discussed the correct technique. The BSE worker then showed the next segment of the tape, which illustrated the correct technique. The BSE workers proceeded through the video in this manner. After seeing the video, the women participated in supervised BSE practice. Attendance at video sessions was also recorded on optically scanned forms.
Medical workers in each BSE instruction factory were also encouraged to develop additional methods to remind women to practice BSE and to attend supervised BSE sessions. These workers were surveyed in 1993 and 1994 to obtain information on the methods used. At least one additional method was used in virtually all factories. In 1993, the most common methods reported for current and retired workers, respectively, were individual contact (95.0% and 90.6% of all factories), workshop rounds (65.3% and 9.4%), factory broadcasts (39.3% and 7.7%), individual letters (24.8% and 74.5%), home visits (28.9% and 68.9%), and reminding women when they came to various meetings for other purposes (43.4% and 23.8%). In 1994, the main methods reported for current workers were personal contact (73.2%) and workshop rounds (20.6%). For retired workers, the main methods were reminding women when they came to the factory for special occasions (94.3%) or to pick up their pension payments (42.4%), in addition to individual contact (37.5%), letters (37.5%), or asking a friend or a relative to remind the women (37.1%); home visits and reminders at meetings fell to 10.3% and 7.6% of factories, respectively.
Analogue to BSE Reinforcement for Control Factories
Simultaneously with the second round of BSE reinforcement sessions, study participants in control factories were asked to attend educational sessions on the prevention of low back pain. These sessions were similar to the BSE reinforcement sessions, except that a video on low back pain was shown. The purposes of this program were to meet requests from control factory personnel for health information (which was important for maintaining cooperation) and to provide a mechanism for collecting vital status and other data on women in control factories in a manner comparable to that in the BSE instruction work units.
Assessing BSE Competency
Random samples of 10 intervention and 10 control factories, stratified on number of study women per factory, were selected before, immediately after, and 1 year after each reinforcement video. This sampling pattern was selected to provide information on the immediate and long-term (1-year) effects of the reinforcement procedures on BSE competency. No video was shown in control factories during the first round of reinforcement; the timing of the assessment activities in control factories in relation to the first video was thus related to the time of the video in the corresponding instruction group factories. In each selected factory, 25 current and 25 retired workers were randomly selected. If selected women could not be found or refused to participate, attempts were made to recruit replacements from a list of 20 additional randomly selected women. In BSE instruction work units, the women were asked about BSE practice; in control work units, they were asked whether they ever examined their breasts themselves. The women in both instruction and control work units were then asked to demonstrate their lump-finding ability on three breast models randomly selected from a set of six models. Each woman was given a maximum of 4 minutes to palpate each of the three models. The particular sizes and arrangements of lumps in the series of six models were developed specifically for evaluation purposes ( 38 ). In instruction work units, after palpating the three models, the women were asked to demonstrate their BSE technique on themselves. BSE workers observed the women perform BSE, assessed both the areas of the breast covered and the BSE technique, and recorded the results on a standardized form.
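The within-factory sampling for the competency assessments (25 current and 25 retired workers, a list of 20 additional women as replacements, and three of six breast models per woman) can be sketched as below. This is an illustration under stated assumptions rather than the study's actual procedure: the function names and toy identifiers are hypothetical, and the text does not say whether the replacement list was drawn separately for current and retired workers, so a single pooled list is assumed here.

```python
import random


def sample_factory(current_ids, retired_ids, rng, per_group=25, n_replacements=20):
    """Draw 25 current and 25 retired workers plus 20 replacements.

    Assumption: replacements come from one pooled list of the women
    not already selected, regardless of working status.
    """
    current = rng.sample(current_ids, per_group)
    retired = rng.sample(retired_ids, per_group)
    chosen = set(current) | set(retired)
    remaining = [w for w in current_ids + retired_ids if w not in chosen]
    replacements = rng.sample(remaining, n_replacements)
    return current, retired, replacements


def assign_models(rng, n_models=6, n_tested=3):
    """Pick three distinct breast models (numbered 1 to 6) for one woman."""
    return sorted(rng.sample(range(1, n_models + 1), n_tested))


# Toy factory roster (hypothetical identifiers).
rng = random.Random(1)
current_ids = [f"C{i}" for i in range(200)]
retired_ids = [f"R{i}" for i in range(100)]

current, retired, replacements = sample_factory(current_ids, retired_ids, rng)
models_by_woman = {woman: assign_models(rng) for woman in current + retired}
print(len(current), len(retired), len(replacements))
```

Note that `rng.sample` raises `ValueError` when a factory has fewer women than requested, which mirrors the situation in which a selected factory has too few women meeting the sampling criteria.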
Economic reforms in China since 1994 have resulted in closures of some factories, the laying off of unproductive workers, early retirements, and individuals leaving the STIB to work in the private sector. In addition, beginning in 1995, factories within 4.5 km of the central business district of Shanghai are being moved to special textile industry zones that have been established in several surrounding counties. These changes have made follow-up of women in the cohort more difficult.
Some factories have also been merged. To minimize the level of cross-contamination when factories in different arms of the study are merged, the factory-wide study activities that are implemented following a merger are those of the study arm of the larger of the two merged factories. In all statistical analyses, however, individual women are retained in their original study arm. Thus, factory mergers somewhat reduce the level of reinforcement in the instruction group and enhance awareness of BSE among some women in the control group.
Follow-up of the Study Cohort
Follow-up of the cohort was aided by the development of computer software that provided the capability to key enter and to subsequently print Chinese characters. During 1992, the names of all women having a baseline questionnaire were key entered and incorporated into the study database.
Active follow-up of the entire cohort for vital status and continued residence in the Shanghai area has been performed periodically. In 1990–1991, 1991–1992, and 1992–1993, lists of women in the order in which they appeared in the factory notebooks were printed onto follow-up forms. These lists were taken to each factory and corrected and updated by the BSE workers, with the assistance of factory medical workers, from factory or medical clinic records or by direct contact with the study subjects. In addition, attendance at the two BSE reinforcement sessions and the low back pain sessions was taken, providing active follow-up of the entire cohort through 1994–1995.
In addition to these periodic surveys of the entire cohort, the BSE workers visit each factory every 1–2 months and inquire, at the medical clinic, the payroll office, and the retirement committee office, about deaths. Also, the medical workers in each factory are instructed to record on two different standardized forms women's changes of status within the factory (i.e., from active working to retirement status) and transfers out of the factory to either STIB or non-STIB work units; these forms are collected by the BSE workers during their visits to the factories. Finally, individual reports on each current and retired worker who dies are sent to an STIB tumor and death registry, which is operated by the Shanghai co-investigator on this project. This registry also receives annual summaries from each factory of all current and retired workers who have died in the preceding year. These reports are reviewed by the study staff to identify deceased members of the study cohort.
Surveillance for Breast Diseases
When a current or retired factory worker develops a breast problem or finds an asymptomatic lump, she reports to her factory's medical clinic, where it is evaluated by a medical worker. If the medical worker agrees that a potential lesion is present, she refers the woman to a surgeon for further evaluation. Medical workers have been encouraged to refer women to one of the three STIB hospitals, where breast clinics have been established to evaluate women in this study. Some women, however, are referred to other hospitals that have contracts with some factories to provide medical care to their employees. If a breast cancer is found, the usual treatment in Shanghai is initiated, without further involvement of study personnel. Factory medical workers record the identity of each woman who had a lump biopsy, how the lump was detected, and the intervals between initial lump discovery and first medical consultation (patient delay) and between initial consultation and diagnosis (system delay) ( 19 ). This information is abstracted onto standardized forms by the BSE workers during their factory visits, which constitutes the primary means of active case finding. This information is supplemented by periodic visits to the three STIB hospitals and visits to other hospitals as needed.
To detect breast cancers that may be missed by this active surveillance system, two additional case-finding mechanisms are utilized. The STIB tumor and death registry receives annual reports from each factory of all current and retired employees who developed or died of any neoplasm in the preceding year, and these reports are reviewed manually. In addition, the records of the Shanghai cancer registry, which is operated by the Shanghai Cancer Institute, are similarly reviewed each year.
Study personnel review the medical records of all women found to have a histologically confirmed benign or malignant breast lesion. The tumor size, its location within the breast, and its histologic classification are recorded, as are the stage of disease ( see below ) and the details of treatment for all cancers. Before being sent to Seattle for storage, histologic slides of all benign and malignant tumors are reviewed by local pathologists for quality and histologic diagnosis. Samples of slides are read by a reference pathologist in Seattle for diagnostic confirmation.
Deaths Due to Breast Cancer
The primary end point in this study is death due to breast cancer. Multiple methods have therefore been used to ensure that a high proportion of such deaths are identified: All known deaths are matched to the pathology data to identify women with any breast disease who have died; BSE workers attempt to determine annually the vital status of all patients with breast cancer by visiting each one at home; as part of a nested case-control study of breast cancer and induced abortion (conducted in 1995), an attempt was made to locate all cases of breast cancer occurring in the study cohort, and the vital status of each case patient not interviewed was ascertained; and reports of all deaths due to breast cancer are ascertained annually from STIB tumor and death registry reports.
To ascertain the cause of death, for each death identified by any of these means, a retired physician reviews clinical and hospital records and interviews family members if necessary. Without knowing whether the deceased individual was in the instruction or control group, the physician then makes a judgment as to whether the woman's death was “very likely,” “probably,” “probably not,” or “very likely not” due to breast cancer. She records this conclusion along with the evidence used in making her decision on a form that is sent to Seattle, where it is reviewed by a Chinese physician, translated into English, and reviewed again by the principal investigator. If a consensus is not reached, additional information is sought from Shanghai.
As Table 3 shows, a high proportion of all women attended baseline instruction and the first round of reinforcement sessions. The level of attendance at all sessions was somewhat higher among current workers than among retired workers. Data for reinforcement session 2 are complete only for the women in study group 1; their overall attendance fell from 96.7% at baseline activities (mostly in 1990) to 96.1% at reinforcement session 1 (mostly in 1991) to 89.1% at reinforcement session 2 (mostly in 1994). The level of attendance at reinforcement session 2 will probably be lower for the women in study group 2 (mostly conducted in 1995). This trend reflects the increasing difficulty in conducting the study as a result of China's economic reforms. However, the proportion of women in study group 1 who attended all three instructional sessions, shown in the last column of Table 3 , represents a highly successful effort. Since BSE practice was a part of all three sessions, these figures not only represent success in delivering instruction and reinforcement but also provide one measure of compliance: 83.6% of the women in study group 1 practiced BSE under careful supervision on three separate occasions from 1989 through 1994.
Another measure of compliance is attendance at the individual BSE practice sessions that were supervised by factory medical workers. From 1989 through the first half of 1995, of the nearly 1.3 million individual sessions that were scheduled, women attended 83.2% of them. The percentage of appointments that were kept was more than 90% through 1992, 86.3% in 1993, 69.5% in 1994, and 41.3% in the first half of 1995. The drop in the rate of attendance, which began in mid-1993, reflects the difficulty in maintaining active participation in the study caused by the changes associated with economic reforms. Again, however, there was an acceptably high level of participation during the first 5 years of the trial.
The ability of randomly selected women to find various types of lumps in breast models is shown in Table 4. The numbers of factories sampled (and women tested) that are represented in the table were as follows: 10 factories (427 women), five factories (202 women), and 10 factories (432 women) for the instruction group before, immediately after, and 1 year after video 1; 10 factories (420 women), five factories (232 women), and five factories (277 women) for the instruction group in relation to video 2; 10 factories (427 women), five factories (234 women), and 10 factories (473 women) for the control group in relation to video 1; and 10 factories (459 women), five factories (226 women), and five factories (244 women) for the control group in relation to video 2. Ten factories are not represented in all cells of the table because data collection and processing are not completed. The tested women were fewer than the targeted numbers of 500 (from 10 factories) or 250 (from five factories) because of women refusing to participate or not being located or because of insufficient numbers of women meeting the sampling criteria in the selected factories. The total numbers of lumps of each type that could have been detected by the tested women (Table 4) were calculated by summing the number of lumps of each type in the three models on which each woman was tested and then summing these numbers for all tested women. The percentages of these lumps that were found by the women are shown in the table. Women in the instruction group consistently found a higher percentage of the lumps than did women in the control group. This was true for both easily palpated lumps (10 mm in diameter, hard, and superficially placed) and those that were more difficult to feel (3 mm, soft, and deep).
As expected, among women in the instruction group, lump-detecting ability was greater in women assessed immediately after the videos than in women assessed before the videos, and sensitivity fell to about the pre-video level among women assessed 1 year later. Among women in the control group, no trends in relation to video-viewing time were evident. Women in the instruction group also demonstrated greater specificity in lump-finding ability with the models than control women. The percentages of women in the instruction and control groups who erroneously reported finding one or more lumps that were not there were 25% and 40%, respectively, when tested on models with no lumps and ranged from 17% to 28% and 26% to 44%, respectively, when tested on models with from one to five lumps.
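The two measures used above can be made concrete with a short Python sketch. It pools lumps over the models each woman was tested on, as described for Table 4, and converts the reported false-positive percentages on lump-free models into specificities. The test records and the helper name `percent` are hypothetical and serve only as illustration; the 25% and 40% figures are the ones reported in the text.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Hypothetical records, one per tested woman:
# (lumps present in her three models, lumps she found)
records = [(6, 4), (5, 5), (7, 3)]

# Sensitivity: lumps found divided by lumps that could have been found,
# summed first within each woman and then over all tested women.
total_lumps = sum(present for present, _ in records)
found_lumps = sum(found for _, found in records)
sensitivity = percent(found_lumps, total_lumps)

# Specificity on lump-free models is the complement of the percentage of
# women who reported a lump that was not there (values from the text).
false_positive = {"instruction": 25.0, "control": 40.0}
specificity = {arm: 100.0 - fp for arm, fp in false_positive.items()}
```

On this reading, specificity on the lump-free models was 75% in the instruction group and 60% in the control group.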
The proportion of sampled women in the instruction group who correctly demonstrated various features of BSE practice on themselves was also higher for women assessed directly after a video than for women assessed before the video, and competency tended to decline after 1 year. Some examples of features that appeared to improve and then decline 1 year after the video include palpation of the peripheral and periareolar areas of the breast, squeezing the nipple for discharge, and use of a firm circular motion. However, a high proportion of women in the pre-video and 1-year post-video samples correctly demonstrated most features of BSE practice. Well over 90% of the women in all samples correctly palpated all but the peripheral and periareolar areas of the breast, and over two thirds in all samples used a firm, circular motion. In most of the samples, these percentages were considerably higher, often in the 80%–90% range. Women in the control group were not asked to demonstrate BSE.
Completeness of Follow-up
A total of 653 cases of breast cancer were detected in the study cohort through 1994, which is the last year for which case-finding efforts have been completed. Based on age-specific incidence rates from the Shanghai cancer registry for 1983–1987 ( 39 ) and assuming a 7% increase in incidence from that time to the midpoint of the 1990–1994 follow-up period ( 40 ), 605 cases would have been expected to occur, suggesting that few, if any, cases have been missed. Since virtually all cases have been successfully followed, it can also be assumed that few deaths due to breast cancer have been missed. All analyses presented are based on cases identified through 1994.
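The completeness check above amounts to comparing an observed count with an expected count. A minimal Python sketch follows; the counts 653 and 605 are from the text, while the interval is an assumption added for illustration. It treats the observed count as a Poisson variable and ignores both the uncertainty in the expected count and the clustering by factory.

```python
import math

observed = 653   # cases detected through 1994 (from the text)
expected = 605   # cases expected from registry rates (from the text)

# Observed-to-expected ratio; a value near or above 1 suggests few missed cases.
ratio = observed / expected

# Rough 95% interval under the Poisson assumption stated above.
half_width = 1.96 * math.sqrt(observed) / expected
lower, upper = ratio - half_width, ratio + half_width

print(f"O/E = {ratio:.2f} (approximate 95% interval {lower:.2f} to {upper:.2f})")
```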
As expected, nearly equal numbers of breast cancers occurred in the instruction and control groups (331 and 322, respectively), but many more benign lesions were detected in the instruction group than in the control group (1457 versus 623), suggesting a higher index of suspicion in the women who received BSE training than in those who did not. The numbers of breast cancer cases that occurred in each year through 1994 are shown in Table 5 . There is little indication that more cases were detected in the instruction group than in the control group during the early years of the trial when BSE-training activities were being initiated.
Cases of breast cancer that were diagnosed within 6 months of the baseline activities in a woman's factory are arbitrarily considered prevalent cases; all others are considered incident cases. As shown in Table 6, the proportions of patients who were diagnosed with disease confined to the breast (stage 1) or small tumors (≤2 cm) were not consistently greater for the women in the instruction group than for those in the control group. Higher percentages of the incident tumors in women 50 years of age or older in the instruction group than in the control group were diagnosed when their disease was at stage 1 or 2 cm or less, but the differences between the two groups were small.
In the instruction group, no trends in tumor size or stage at diagnosis in relation to the time since the last supervised BSE were observed. Information on how a woman first found her lump was available for 311 of the cases in the instruction group: 257 (82.6%) women reported having found their lump while practicing BSE, and 53 women found it by other means. The proportion of tumors that were diagnosed at stage 1 was somewhat higher for those who reported detecting their tumor during BSE (58.2%) than for those who did not (50.9%), but the proportions reporting tumors diagnosed at 2 cm or less were similar in these two groups of women (46.7% and 48.1%, respectively).
Information on the time from the initial discovery of a breast problem to a consultation with a factory medical worker was available for 309 women in the instruction group and for 292 women in the control group. Slightly more women in the former group (71.5%) than in the latter group (64.4%) presented themselves for evaluation within 1 month. The time from the initial evaluation to treatment was similar for the two groups.
Information on the first course of therapy is currently available for 309 case patients in the instruction group and for 296 case patients in the control group who presented with stage 1 or 2 disease. Among these women, similar proportions in the two arms of the trial had various types of surgical procedures and received tamoxifen, radiation therapy, chemotherapy, and traditional Chinese herbal treatment. Too few women presented with in situ disease (17 cases) or distant metastasis (five cases) for meaningful analysis on the basis of treatment.
A total of 1436 (1.1%) of the 133 375 women in the instruction group and a total of 1648 (1.2%) of the 133 665 women in the control group are currently known to have died by the end of 1994. The similarity in the numbers of deaths in the two groups is further evidence of the validity of the randomization procedure and indicates that comparable follow-up procedures are being implemented in the two groups.
Fifty-seven of the 653 women who developed breast cancer before the end of 1994 died by the end of that year. Fifty of these deceased women were judged to have died of their breast cancer. Seventeen of the deaths in the instruction group and 14 in the control group were judged as “very likely” caused by breast cancer, and eight deaths in the instruction group and 11 in the control group were classified as “probably” caused by breast cancer. Table 5 shows the numbers of deaths due to breast cancer in each year of the study. As expected, the numbers of deaths due to breast cancer initially increased over time during the early years of the trial as more women were recruited and as women with breast cancer that developed during and following baseline activities went on to die. Equal numbers of women died of breast cancer in the instruction and control groups.
Table 7 shows the breast cancer mortality rates through 5 years since entry into the study. The relatively lower numbers of woman-years in the latest time periods reflect inclusion of data accrual only through 1994 (right censoring). There were no significant differences between the two groups in the rates of breast cancer deaths. As shown in Fig. 1 , the cumulative breast cancer mortality rates through year 5, as estimated by standard life table procedures, were only slightly lower in the instruction group than in the control group (30.9 and 32.7 per 100000, respectively).
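For readers unfamiliar with life table estimates, the following Python sketch shows one simple way a cumulative mortality rate can be built up from yearly deaths and woman-years. It is an illustration under stated assumptions, not the procedure used in this trial: the function name is hypothetical, the yearly rate is used as the conditional probability of death in that year, and the input numbers are invented round figures rather than the data of Table 7.

```python
def cumulative_mortality_per_100k(deaths, woman_years):
    """Cumulative mortality per 100 000 from yearly deaths and woman-years.

    Each year's rate (deaths / woman-years) is treated as the conditional
    probability of dying of breast cancer in that year, and the yearly
    survival probabilities are multiplied together.
    """
    surviving = 1.0
    for d, wy in zip(deaths, woman_years):
        surviving *= 1.0 - d / wy
    return (1.0 - surviving) * 100_000

# Invented example: 1 to 5 deaths per year, 100 000 woman-years each year.
example = cumulative_mortality_per_100k([1, 2, 3, 4, 5], [100_000] * 5)
print(f"{example:.2f} per 100 000")
```

Because breast cancer death is rare in each interval, the result is very close to the sum of the yearly rates; right censoring enters only through the smaller woman-year denominators in later years.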
After 5 years of follow-up, the cumulative breast cancer mortality rate was not appreciably lower for women who received BSE instruction than for those who did not. This finding is not unexpected. For women of comparable age in the Swedish randomized trials of mammography ( 3 ), for example, a difference in cumulative breast cancer mortality rates between the screened and unscreened groups did not begin to emerge until the 5th year after randomization. Although the breast cancer mortality rates in the 5th year after entry into this trial were slightly lower in the instruction group than in the control group, these rates were based on small numbers of breast cancer deaths and represented the experience of only the subset of study participants who had entered the trial in the 1st year of its implementation. It would therefore be premature to conclude that evidence for a beneficial effect of BSE may be emerging. The treatment received by women in the two arms of this trial was comparable, so it is unlikely that a difference in treatment obscured a true effect on survival of early detection by BSE.
In the mammography trials, malignant tumors that developed in the screened groups tended to be detected when they were smaller and at a less advanced stage than those diagnosed in women in the control group ( 7 ). A similar finding was not observed in the present study. Therefore, before the largely negative result based on the primary end point of breast cancer death is dismissed as a consequence of insufficient duration of follow-up, alternative explanations that would also reduce differences between the two comparison groups in the extent of disease at diagnosis must be considered. These explanations include failure of the randomization process, poor levels of compliance and competency in practicing BSE, screening in the control group, and imprecise information on the extent of disease.
The randomization procedure used in this trial successfully divided the study population into two groups (instruction and control) of virtually equal size and risk of breast cancer. They were comparable in distribution with respect to age and the major risk factors for breast cancer, and nearly equal proportions of the women in the two groups reported a history of breast cancer and developed breast cancer following entry into the trial. The two groups also have experienced similar rates of mortality from all causes. In addition, the time from initial medical contact to definitive diagnosis and initiation of treatment was similar for women with breast cancer in the two arms of the study. The two groups are therefore not likely to differ with respect to factors influencing tumor progression or delays in receiving treatment, which could affect stage and tumor size at diagnosis.
Poor compliance in the intervention group is also an unlikely explanation of our findings. Of 311 breast cancer case patients in the instruction group with relevant information available, 287 (92.3%) reported ever having attended a BSE training class, and 257 (82.6%) reported having found their lump while practicing BSE. Also, rates of compliance with supervised BSE-screening schedules and levels of attendance at reinforcement sessions were high for the first 5 years of the study. Although the proportion of women who practiced BSE outside supervised sessions is unknown, the enthusiasm for the program demonstrated by the high participation rates at all program activities and the reminders in showers and clinics, along with the availability of facilities for BSE practice in all factories, would suggest that the overall level of BSE activity in the women in the instruction group was as high as one could reasonably expect a mass program such as ours to achieve.
Lump-finding ability in silicone breast models was used as a surrogate measure of BSE proficiency. Compared with sampled women in the control group, those sampled in the instruction group demonstrated greater skill (both in sensitivity and specificity) in finding lumps in breast models, and the reinforcement sessions, which were attended by a high proportion of women in the instruction group, were shown to improve lump-finding ability in the models. The only prior experience that the sampled women in the instruction group had with breast models was at the time of initial BSE instruction. Breast models were not used during the subsequent reinforcement sessions or in the videos, both of which emphasized proper technique and practice on the woman's own breasts. It is therefore unlikely that the level of skill observed for women in the instruction group is specific for finding lumps in silicone models; it more likely represents true BSE proficiency. Also, sampled women in the instruction group were generally able to demonstrate correct BSE techniques on themselves. Failure to enhance the level of competence in practicing BSE and finding breast lumps is thus not a likely explanation for the results to date.
There is also considerable evidence that the results are not due to screening in the control group. We have estimated (details not presented) that no more than 5% of the women in the control group factories transferred to instruction group factories early enough to have received appreciable instruction in BSE. Also, in a survey conducted in 1995, medical workers in only 8% of the factories reported that manual breast palpation was performed in conjunction with routine gynecologic cancer screening, and this type of screening had been performed in an equal percentage of instruction and control group factories. No screening mammography is available in any of the work units. Finally, of the 301 case patients with breast cancer in the control group for whom information was available on the means of detection, only five (1.7%) reported ever having received training on how to examine their breasts.
Another possible explanation for the similarities in tumor size and stage at diagnosis between the breast cancers in the instruction and the control groups is imprecise information on these features of the tumors. Tumor size and stage are recorded from pathology reports and hospital records from multiple facilities. We have no way of standardizing how lumps are measured and have recorded either clinical estimates of size or measurements made by pathologists, depending on which is available. Also, we have been able to obtain stage with some degree of confidence only in four broad categories (in situ [stage 0], limited to the breast [stage 1], regional spread [stage 2], and distant metastasis [stage 3]). Attempts are also made to collect information on TNM stage (41), but in many cases this information is not sufficiently detailed to allow use of this system. In addition, there is some tendency to underestimate the percentage of women with distant metastasis at presentation. Of 534 women initially classified as having stage 1 or 2 disease and who were independently reviewed for metastasis, nine additional women with distant metastasis at diagnosis were found. We have no reason to suspect that the level of misclassification of tumor size or stage would differ according to study arm. The net effect of this likely misclassification therefore would be to obscure any differences in size or stage of the tumor at diagnosis between the cases in the instruction and control groups. It should also be noted, however, that the absence of an appreciable increase in the total number of breast cancer cases detected in the instruction group during the early years of the trial argues against women having brought forward the time that their breast cancer was diagnosed by practicing BSE.
Although no differences in tumor size or stage between the breast cancers in the two study arms were large or statistically significant, it is noteworthy that the differences in tumor size and stage were both in the expected direction only for incident cases in women aged 50 years or older, which is the same group for whom mammography has been shown to be efficacious. This difference may have been attenuated by misclassification of tumor size and stage. This observation, the higher index of suspicion in the instruction group than in the control group (as evidenced by the larger numbers of benign lesions detected by women in the instruction group), and the evidence that the findings to date are not likely due to inadequacies in the implementation of the trial all support the contention that continuation of this study is both warranted and needed to determine whether BSE is efficacious.
Given the high level of BSE activities in the instruction group during the first 5 years of the trial, this study should provide valid information on the efficacy of BSE in reducing mortality from breast cancer. The primary test for differences between randomization groups will be performed using a weighted logrank statistic with the time scale being the time from randomization. The test will be two-sided with a type 1 error rate (size) of 0.05. The weight function used will be linear and will put relatively little weight on any differences in the first few years after randomization because such differences are expected to be small. The test will be stratified on the basis of hospital affiliation of the factory (STIB hospital 1, 2, 3, or other), as analysis of baseline data has indicated that such stratification eliminates aggregation of breast cancer incidence rates among factories.
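The following Python sketch illustrates how a two-group weighted logrank statistic of the kind described above might be computed. It is a minimal illustration, not the trial's analysis code: the weight w(t) = t is one assumed form of a linear weight function, the function name and toy data are hypothetical, and the stratification by hospital affiliation and the randomization by factory are both ignored.

```python
import math

def weighted_logrank(times, events, groups, weight=lambda t: t):
    """Two-group weighted logrank test; returns (U, V, Z, two-sided p).

    times  : time from randomization for each woman
    events : 1 if she died of breast cancer at that time, 0 if censored
    groups : 1 for the instruction arm, 0 for the control arm
    weight : weight at each death time (assumed linear here, w(t) = t)
    """
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    u = 0.0  # weighted sum of observed minus expected deaths in arm 1
    v = 0.0  # variance of u under the null hypothesis
    for t in death_times:
        n = sum(1 for x in times if x >= t)                      # at risk
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e == 1 and g == 1)
        w = weight(t)
        u += w * (d1 - d * n1 / n)
        if n > 1:  # hypergeometric variance term
            v += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    z = u / math.sqrt(v) if v > 0 else 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return u, v, z, p

# Toy data in which the two arms have identical experience.
u, v, z, p = weighted_logrank(
    times=[1, 2, 3, 1, 2, 3],
    events=[1, 1, 1, 1, 1, 1],
    groups=[1, 1, 1, 0, 0, 0],
)
```

A negative value of U indicates fewer weighted deaths than expected in the instruction arm. Because the weight grows with time, a difference in deaths shortly after randomization contributes less to U than the same difference several years later.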
If we observe an average of 25 breast cancer deaths in each of the 6 additional years of follow-up, for a total trial duration of 10 years, then the study will have 80% power to detect a true reduction in mortality from breast cancer as small as 30%. Although the annual number of breast cancer deaths to date is somewhat lower than 25, the number is expected to increase considerably as the population ages and as the number of years following diagnosis of the incident cases increases.
If there is no true reduction in breast cancer mortality rates achievable by efforts such as ours, then the large size of this trial will minimize the likelihood of spuriously observing a beneficial effect by chance. If we do not demonstrate a reduction in breast cancer mortality rates, we will be able to indicate the level of BSE practice activity that would have to be exceeded in order for a program to have a chance of improving breast cancer mortality (if BSE truly is efficacious). This level is presumably not culturally dependent (although the means to achieve it certainly would be). Since this study will have shown that this level is quite high, a negative result would caution individuals responsible for breast cancer control activities not to include BSE instruction in their programs unless they can confidently expect (and document) levels of competency and compliance greater than we were able to achieve.
Until results of this trial, or comparable efforts by others, are available, there will be no reason to modify the conclusion of the 1995 U.S. Preventive Services Task Force that there is insufficient evidence to recommend for or against the teaching of BSE ( 25 ).