U.S. public perceptions of the sensitivity of brain data

Abstract As we approach an era of potentially widespread consumer neurotechnology, scholars and organizations worldwide have started to raise concerns about the data privacy issues these devices will present. Notably absent in these discussions is empirical evidence about how the public perceives that same information. This article presents the results of a nationwide survey on public perceptions of brain data, to inform discussions of law and policy regarding brain data governance. The survey reveals that the public may perceive certain brain data as less sensitive than other ‘private’ information, like social security numbers, but more sensitive than some ‘public’ information, like media preferences. The findings also reveal that not all inferences about mental experiences may be perceived as equally sensitive, and perhaps not all data should be treated alike in ethical and policy discussions. An enhanced understanding of public perceptions of brain data could advance the development of ethical and legal norms concerning consumer neurotechnology.


I. INTRODUCTION
In the age of rapidly advancing technologies, we are witnessing an era of 'big data', characterized by the expansive use of the Internet and sophisticated artificial intelligence systems to amass large data sets. 1 The healthcare sector, in particular, stands to benefit immensely from this trend, given the wide-ranging nature and potential applications of data in both biomedical research and commercial settings. 2 This paper zeroes in on 'brain data' and aims to shed light on public perceptions of the sensitivity of this data, especially compared to other types of personal data.
Over the past decade, direct-to-consumer neurotechnologies have become increasingly available and more popular. 3 Portable non-invasive electroencephalogram (EEG) devices and electromyography (EMG) devices can now be directly purchased by consumers for a wide range of applications. 4 These devices enable the monitoring of attention, 5 the measurement of stress, 6 and seamless interaction with the virtual world 7 through the recording and analysis of brainwaves and electrical impulses from muscles. Brain-computer interfaces (BCIs) are also on the rise, catering to individuals with disabilities as well as everyday users in the gaming and entertainment sectors. 8 Neurotechnology also has the ability to collect personal data beyond biological brain function, such as individuals' emotional, 9 mental, and affective states, and even human thoughts. 10 More sensitive, private information, like whether an individual has a substance abuse disorder or specific numeric combinations a person is thinking of (ie an untold PIN), 11 can also be extracted using brainwave data. 12 In the not-too-distant future, these devices may become one of the primary ways in which we interact with the rest of our technology, including how we drive our cars, 13 play video games, 14 and interact with our environment. 15
With these advances come potential privacy risks associated with the collection and use of neural data. The potential for misuse is not just theoretical, but already being realized, in settings from the workplace to government use of neurotechnologies for criminal interrogation. 16
Neupane et al. demonstrate how Hemorrhage (their term for an attack on brainwave privacy conducted by machine-learning software) analyzed publicly available EEG datasets and predicted the presence of alcohol usage disorder with 96% precision. 17 While this technique was carried out under controlled settings, real-life consequences of Hemorrhage can include blackmail, targeted medical or insurance scams, and serious career implications. 18 AlterEgo, a headphone that detects silent speech using neuromuscular signals, has an interface with 92% accuracy for digit recognition, which could lead to password leaks. 19 Finally, as Kellmeyer and others have remarked, there is the potential for de-anonymization of brain data. 20 MRI data can be used to reconstruct an individual's face, which could ultimately lead to re-identification. 21 While 'big brain data' collected and sold by EEG device manufacturers is anonymized, subject re-identification becomes more likely when the personalized features of brain data are cross-referenced with other types of data. 22
When such data is de-anonymized, it could expose individuals to the risk of identity theft or criminal acts 23 and be utilized for commercial and marketing purposes. 24 In light of recent neural data leaks from corporations such as DeepMind and Facebook, there is a need for closer monitoring of the private sector's use of brain data.
In recent decades, there has been a significant increase in the literature addressing the challenges of collecting various forms of neural information using consumer neurotech devices. On one side of the debate, numerous studies sound the alarm about the processing of sensitive neural data and the risks of human rights violations. Other scholars cite the sensitivity of the data itself, the location of the data within corporations, and the risk of inadvertent exposure of this information as motivating concerns. 26
Professor Farahany has described mental privacy as the 'final frontier of privacy', 27 while Marcello Ienca describes 'brain information' as 'the most intimate and private of all information', 28 and Donald Kennedy, the former editor-in-chief of Science, describes brain data as constituting his 'most intimate identity'. 29 On its own, brain activity data may be able to reveal whether one is attentive or distracted, 30 as well as drowsy or alert. 31 In research settings, neurotech devices have shown the potential to measure emotional responses, 32 identify likes and dislikes, and even assist in the diagnosis of depression 33 and ADHD. 34 On a larger scale, the raw data collected by EEGs may even be capable of augmenting already existing algorithms that use personal data like browsing history to target advertisements, calculate insurance premiums, or match potential partners. 35
A prevailing concern among scholars is the potential for neurotech users to unwittingly overshare. They might not fully grasp the myriad ways in which their data can be harnessed. Farahany reveals numerous instances in which consumers have already started to do so. 36 Ienca et al. argue that many users may underestimate the informational richness and versatility of brain data, leaving them vulnerable to unintentionally surrendering their data privacy rights. 37 Consumers may unknowingly authorize a company to use their data to gain deep insights into their personal profiles. 38 This enhances the risk of de-anonymization and public exposure. 39
Brain data is already being commodified by consumer neurotechnology companies, creating a risk of exposure without users' consent. Farahany describes how one major neurotech company, Entertech, has already entered into partnerships with other companies to share the brain data they have aggregated through use of their consumer EEG devices. 40 Kellmeyer describes the act of 'neurohacking', which he defines as the 'illicit access to a neurotechnological device or a software program that processes neural data'. 41 A hacker could hypothetically take over a BCI system connected to a robotic prosthetic and cause grave physical harm to the user and others. 42 Ienca et al. point to the Cambridge Analytica scandal that engulfed Facebook in 2018 as an example of the dangers associated with loosely regulated data collection. 43 According to them, the scandal, in which the personal data of 87 million Facebook users was shared without their consent, showed that online providers such as Facebook often lack the willingness and capability to restrict data collection, which enables third parties to access data without authorization. 44 While there is a push to treat neural data with the same legal protections afforded to other sensitive data such as health data, achieving this has proven challenging due to the inherent uncertainty in its processing, as discussed by Hallinan et al. 45 Coaxes and Wexler assert that the attention on maintaining privacy distracts us from more pressing issues such as misleading advertising for EEG devices. 46 Another dimension of this issue was examined by Gerber et al. in a survey in which participants were asked to rank different risk scenarios in terms of severity and probability of data leaks. 47 The findings showed that abstract, hypothetical scenarios involving their health were considered less alarming than specific scenarios. 48 This further suggests that the public is not fully informed about data privacy risks and that perceptions vary based on how imminent a threat appears, emphasizing the need for greater public awareness. 49 People may willingly share their data with organizations such as insurance providers, employers, and law enforcement agencies that may have a deliberate interest in monitoring people. 50

On the other side of the debate, there is also the argument that, with the development of data regulations and safeguards, the benefits of large-scale brain data sharing and consumption outweigh the concerns. Jwa et al. state that it may be unrealistic to completely eliminate the risk of neural data breaches and more efforts should be placed on prevention measures than safeguarding privacy, which ultimately limits the rewards of neuroscience research. 51 The potential benefits of large-scale brain data extend to both medical and non-medical sectors. Such data has the capacity to drive advancements in therapeutic medical solutions and the quality of lifesaving neurotechnological devices. Amidst these promising possibilities, concerns exist regarding the effectiveness of consumer neural devices. Wexler and Reiner have argued that there is a lack of scientific consensus on the efficacy of such devices. 52 Rather than enforcing more restrictions, they encourage more comprehensive consumer evaluations and testing to foster the growth and increased reliability of neurotechnology. 53

I.B. Industry Perceptions of Consumer Neurotech Data
The perception of brain data collection risks is not uniform even among industry leaders in neurotechnology. A study by Minielly et al. sought to understand these leaders' views on scholarly privacy concerns. 54 Minielly et al. interviewed senior executives from prominent neurowearable firms and distilled their insights into four primary themes: data collection and management, ethical principles, the unique nature of brain data vis-à-vis international policies and regulations, and prevailing standards. The findings regarding data collection and management were particularly illustrative of the challenges inherent in finding a balance between gathering user data for commercial purposes and upholding data privacy safeguards. 55 While some interviewees saw privacy as the primary concern, pointing to their own policies of not selling data and implementing 'very strict privacy controls', others had no qualms admitting that 'companies need to own the data in order to monetize [it] . . . and not being able to [do so] would hinder [a] company's ability to exist'. 56
The interviews were also quite revealing when it came to how the industry views the user's consent and the 'exceptionalism' of brain data. While some of the interviewees acknowledged that 'people are a little more personal about brain data', there was substantial pushback against the idea that brain data could reveal something that would otherwise be indeterminable through existing methods, like tracking online behavior. 57 Instead, as one interviewee acknowledged, 'most companies . . . don't have data privacy policies and don't put them in place purposefully to try and not bring [the issues surrounding data privacy] to light because there is no winning that debate'.
Ultimately, the utilization of personal data holds significance for the individuals who supply this information. Moon emphasizes the importance of consumer perceptions in shaping health data privacy and usage policies by combining insights from studies investigating consumer preferences for health data sharing. 59 Mirroring the principles of trust and ethical practice in healthcare delivery, Moon describes consumers as the 'key stakeholders' in the policy framework governing health informatics. 60 Human-centric data governance and decision-making start with individuals' experiences and perceptions of the data they share. 61
Notably absent in these discussions is concrete empirical data about how the public perceives the sensitivity of their own brain data. Does the public see their brain data as something uniquely sensitive that can reveal their 'most intimate identity', as Donald Kennedy describes? 62 Do they recognize the unique risks of using consumer neurotechnology? Do they understand the inferences that can be drawn from their brain data? Or, based on their current understanding of neurotechnology, do they regard their brain data as merely another data point, like their online browsing history or birthday?
Here, we present the findings of a nationwide survey conducted in the USA, examining the public's perception of brain data in comparison to other personal information. While the survey was conducted several years ago, there has not yet been a mass market consumer neurotechnology product launched, although there are several major product launches believed to be on the near-term horizon. 63 These results inform discussions of law and policy regarding the governance of the data collected by neurotechnology and help to establish a baseline before major product launches in this field. This research offers an empirical grounding of public perceptions and attitudes with respect to the sensitivity of brain data, and their perceptions of unique risks and benefits associated with consumer neurotechnology to further guide ethical, policy, and regulatory discussions on the same.

II. STUDY DESIGN
Prior studies have explored public perceptions of neural and health data, including data sharing and privacy, 64 neuroimaging data, 65 and data collected using BCI. 66 The Pew Research Center, a nonpartisan think tank that conducts public opinion polling and analysis, shared a similar but broader objective. In 2014, they conducted a national study that explored Americans' privacy behaviors and attitudes toward different types of personal information. 67 Participants were asked to rate the sensitivity of 16 different personal information items, as either 'not at all sensitive', 'not too sensitive', 'somewhat sensitive', or 'very sensitive'. The study revealed, unsurprisingly, that Americans found certain types of data to be more sensitive than others. Specifically, social security numbers were widely regarded as the most sensitive piece of information, while media tastes and purchasing habits were among the least sensitive pieces of information. 68
Martin and Nissenbaum explored individuals' perspectives on data collection and the possible invasion of privacy in their 2017 study. 69 They cited the Pew Research Center's survey for its high standing in public discourse and its consistent rating of information sensitivity along a scale. 70
In addition to adopting the Pew study's methodology of gauging the public's perspectives on various types of data, the goal of our study was to identify where brain data lies among the varying degrees of data sensitivity. This study builds upon prior research to understand how information that can currently or could one day be decoded from the data collected by consumer neurotechnology (thoughts in one's mind, concentration levels, and one's emotions and feelings) fits into public perceptions of privacy.

II.A. Methods
We designed a survey modeled on the 2014 Pew Research Center study and included 17 additional information items, detailed in Table 1. We retained all 16 data categories from the Pew study to provide a foundation for understanding public perceptions of data sensitivity and contrast this with the newly introduced items. Given that the Pew study had only one category related to personal health, our questions explored more diverse facets of neural and health data. Items varied from credit score history to alcohol consumption. We defined data as 'neural' or 'brain' if it directly stemmed from brain function or neurological activity.
A brief explanation of the study's purpose, to collect opinions on privacy in the digital age, was given as contextual information to participants prior to the start of the survey. The survey began with general questions on participants' feelings about privacy in the digital age. This was followed by three privacy pragmatist measurement questions developed by Alan Westin, whose contributions have had a lasting effect on the concept of privacy in research and policy. Despite criticisms of his framework, Westin's three-tiered privacy categories (fundamentalists, pragmatists, and unconcerned) are frequently employed in privacy research, including the Pew and Martin and Nissenbaum studies. 71 Subsequent sections transitioned to questions on data sensitivity, where participants gauged the sensitivity of various data types using a four-point scale: (i) 'very sensitive', (ii) 'somewhat sensitive', (iii) 'not too sensitive', and (iv) 'not at all sensitive'. Six demographic questions were included to ensure diverse representation. Question order was randomized in each section to minimize order biases.

II.A.1. Recruitment and Sampling
Survey Sampling International, LLC (SSI), a data collection service that specializes in conducting surveys using nationally representative sampling, facilitated recruitment. Their approach, much like GfK Group's KnowledgePanel from the 2014 Pew study, leveraged online panels, advertisements, social media, and direct emails. While SSI mainly focuses on online recruitment, their methods align with other bioethics studies using internet-only participants. 72,73,74

II.A.2. Pilot Testing
Before the full survey, pilot tests were undertaken via Amazon's Mechanical Turk (MTurk). 75 These pilots assessed clarity, word choice, and survey duration, laying the groundwork for the main survey's structure and guiding sample size determination. Rather than a conventional power analysis, our sample size determination was driven by the data from the pilot.

II.A.3. Data Collection
Between Dec. 1 and 7, 2017, 2576 participants were surveyed via SSI. The survey, accessible across various digital devices, was hosted on Qualtrics. SSI remained unaware of the specific data. Each survey took about 10 min, and all responses were mandatory. Of the total, 1126 responses were discarded for various reasons, leaving 1450 responses for analysis. Demographic details of participants are in Table 2.
Compensation was set at $0.50 per participant, regardless of response inclusion. This compensation model was adopted from MTurk and approved by our Institutional Review Board. We acknowledge ethical considerations concerning payment levels, especially given MTurk's historical payment trends. Funding for this study was provided by Duke University's Bass Connections program and the Duke Science, Law, and Policy Lab (SLAP Lab).

II.A.4. Data Analysis
All reported analyses were conducted in R (4.3.1; R Core Team, 2023), 76 with the associated scripts being publicly available. 77 To discern the underlying variables impacting perceived data sensitivity, an exploratory factor analysis with a 'promax' rotation was conducted using the 'psych' package (Revelle, 2020). 78 Horn's parallel analysis, run with the 'paran' package (Dinno, 2018), 79 helped determine the number of factors in the exploratory factor analysis. For each individual, we calculated factor scores for three factors: mental/affective, public, and private information. The factor score represents an individual's perceived sensitivity for one of the three broad categories of information rather than a particular information item. To obtain a factor score for each factor, we only considered information items whose factor loadings were greater than 0.4 (or smaller than −0.4 for negative loadings); factor scores were then computed as the weighted average of those information items. This approach ensured that the factor scores were in the same range as the original ratings (1-4) and allowed for comparisons across factors. To compare the perceived sensitivity of these three information categories (ie factor scores), as well as to investigate its relationship to socioeconomic status, linear mixed-effects models (LMMs) were fitted using the 'lme4' package (Bates et al., 2015), 80 and significance for fixed effects was assessed using Satterthwaite approximations to degrees of freedom. In all LMMs, continuous variables were standardized, categorical predictors were coded using deviation coding, and participants were included as random intercepts. The 'emmeans' package (Lenth, 2020) 81 was used to probe significant interactions between predictors. All 95% confidence intervals around estimates were computed using parametric bootstrapping with 1000 simulations.
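To make the factor-score step concrete, the following is a minimal Python sketch of a loading-weighted average restricted to items whose loadings exceed the 0.4 cutoff in absolute value. It is an illustration only, not the authors' R script: the item names and loading values are hypothetical, and the use of absolute loadings as weights with reverse-coding of negatively loaded items is an assumption made here so that the score stays within the 1-4 rating range.

```python
# Hypothetical loadings for a single factor; the item names and values are
# illustrative only and are not the estimates reported in the study.
LOADINGS = {
    "thoughts": 0.72,
    "emotions": 0.65,
    "concentration": 0.58,
    "credit_score": 0.21,  # below the 0.4 cutoff, so it is ignored
}


def factor_score(ratings, loadings, threshold=0.4, scale_min=1, scale_max=4):
    """Loading-weighted average of one respondent's ratings for one factor.

    Only items with |loading| > threshold contribute. Assumption (not stated
    in the source): negatively loaded items are reverse-coded and weighted by
    the absolute loading, which keeps the score inside [scale_min, scale_max].
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for item, loading in loadings.items():
        if abs(loading) <= threshold:
            continue
        rating = ratings[item]
        if loading < 0:
            rating = scale_min + scale_max - rating  # reverse-code
        weighted_sum += abs(loading) * rating
        weight_total += abs(loading)
    if weight_total == 0:
        raise ValueError("no item exceeds the loading threshold")
    return weighted_sum / weight_total


# One hypothetical respondent's ratings on the 1-4 scale.
respondent = {"thoughts": 4, "emotions": 3, "concentration": 2, "credit_score": 4}
score = factor_score(respondent, LOADINGS)
```

Because the weights are normalized by their sum, the resulting score is a convex combination of the retained ratings, which is why it can be compared directly across factors on the original 1-4 scale.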

III.A. Sensitivity Rankings Mirrored the Pew Study
The top three items from the 16 featured in the Pew survey, rated as 'very sensitive', aligned with our study's findings. Specifically, social security number was identified as the most sensitive, with 89.2% of participants indicating so. Meanwhile, political views, media preferences, and basic purchasing habits were seen as the least sensitive in both studies. Figure 1 illustrates the sensitivity rankings for all 33 information items, while Table 3 presents the mean and standard deviation for these items.
We conducted an exploratory factor analysis of the participants' perceived sensitivity of all 33 different information items to determine which information items asked about in the survey had statistically similar patterns of perceived sensitivity. Based on this analysis, we grouped items into three categories. Figure 2 shows the factor loadings for each item. Using a threshold of 0.4, we identified and subsequently analyzed 27 items, excluding the remaining six (friends, substance use, genetic information, relationship history, health, and credit score).
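The item-level summaries described above (the share of respondents choosing 'very sensitive', and each item's mean and standard deviation) could be computed along the following lines. This is a hedged Python sketch rather than the study's code: the numeric coding (4 = 'very sensitive'), the use of the sample standard deviation, and the toy responses are all assumptions introduced for illustration.

```python
from statistics import mean, stdev

# Assumed numeric coding of the four response options (4 = most sensitive).
SCALE = {
    "not at all sensitive": 1,
    "not too sensitive": 2,
    "somewhat sensitive": 3,
    "very sensitive": 4,
}


def summarize_item(responses):
    """Summarize one information item from a list of response labels."""
    codes = [SCALE[label] for label in responses]
    pct_very = 100 * sum(code == 4 for code in codes) / len(codes)
    return {
        "pct_very_sensitive": pct_very,
        "mean": mean(codes),
        "sd": stdev(codes),  # sample standard deviation (an assumption)
    }


# Toy data for a single hypothetical item: ten made-up responses.
toy_responses = ["very sensitive"] * 8 + ["somewhat sensitive", "not too sensitive"]
summary = summarize_item(toy_responses)
```

Sorting items by `pct_very_sensitive` or by `mean` would then give a ranking of the kind shown in Figure 1 and Table 3.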

III.E. Tech Adoption and Sensitivity
Lastly, we wanted to see if an individual's willingness to adopt technology early correlated with their perceived sensitivity.We fitted an LMM to assess whether an

IV. DISCUSSION
In 2017, the same year this study was conducted, an article described the human brain as carrying terabytes of valuable biological and clinical data. 82 This conceptualization underscored the gravity of harnessing brain data in the digital age, especially as neurotechnology advancements burgeon.

Subsequent literature surrounding neurotechnologies and data privacy has amplified the clarion call for addressing disparities in data governance and protection measures. 83 For instance, to define brain data, scholars propose: 'Human brain data are quantitative data about human brain structure, activity, and function'. 84 This vast realm ranges from neurobiological metrics, like EEG and fMRI, to rich sociopsychological contexts, especially when merged with non-neural data such as smartphone usage patterns. 85
This study sought to better understand how the public perceives the sensitivity of brain data. We compared opinions on specific brain data variables ('mental/affective' column in Table 4) against other non-neural variables of information ('private' and 'public' columns in Table 4) to better guide the ethical and legal conversations in this domain.
Contrasting with findings from Schmitt et al., where the US public apparently made little distinction across diverse data categories, 86 our insights align more with the 2014 Pew study.For example, while 'thoughts in one's mind' was regarded with heightened sensitivity, other facets of brain data like 'concentration', 'focus', 'alertness', and 'drowsiness' were nestled lower in the hierarchy, even trailing behind items like 'relationship history'.

IV.A. Making Sense of Public Perceptions
We observed an intriguing order of preference among the three data categories; 'mental/affective' data, which included our brain data targets, was perceived as more sensitive than 'public' information but ranked below 'private' data.Significantly, both our findings and the 2014 Pew study ascertained that an individual's social security number outpaces all forms of mental/affective data in terms of perceived sensitivity.
We believe that there are several potential explanations for the dissonance between public and expert perspectives on the sensitivity of brain data. It is possible that consumers are less familiar with the potential uses of data collected from consumer neurotech devices and, therefore, are less wary of the risks associated with that information being inadvertently exposed. Our survey distribution took place before highly publicized data breaches like the 2018 Cambridge Analytica scandal, which had lasting impacts on the public perceptions of data collection and security. A qualitative study conducted during the aftermath of the Cambridge Analytica scandal asked participants about their understanding of online privacy and personal data. 87 Not only did their results show that there was a general lack of understanding of how personal data is utilized, but more surprisingly, individuals considered themselves immune to the consequences of breaches affecting their personal data. 88 One can assume that as neurotechnology becomes more prevalent in society (as exemplified by Elon Musk's mainstream innovation Neuralink), consumer awareness will also rise with it. However, there remains an evident gap in consumers' understanding of their brain data. More deliberate efforts in educating the public will enable consumers to better discern the implications of data that neurotechnology can and will capture.
Societal conditioning undeniably shapes our perceptions of 'sensitive' information. As observed in the Pew study and ours, social security numbers are universally viewed as highly sensitive, 89 likely due to the longstanding fears and repercussions associated with identity theft. 90 By contrast, the ramifications of misused brain data, although potentially calamitous, might not be as ingrained in the public psyche. Neupane et al.'s prediction about tech behemoths like Facebook or Neuralink pioneering thought-to-type technology further underscores the impending challenges to our conventional privacy norms. 91
Our findings also hint that certain kinds of brain data might intrinsically be viewed as less sensitive than others. For example, generalized affective states like attention and anxiety may be perceived as less intimate than content-rich information like personal thoughts or mental imagery. Such distinctions are pivotal to future studies. If brain data indeed spans a sensitivity spectrum, similar to other personal data types, policies could be crafted to address this spectrum, balancing the promotion of innovation with the safeguarding of individuals' cognitive liberty.
Lastly, perceptions of brain data's utility, especially for altruistic purposes such as advancing brain research, might temper public apprehensions at least with respect to some actors. Schmitt et al.'s insights reveal a preference among the US public for data deployment for 'the common good' such as public health research at nonprofit organizations, rather than profit-driven purposes such as private organizations or economies. 92 These findings, albeit not entirely aligned with current US data protection laws, spotlight the need for recalibrating legal safeguards. 93 Future studies could dive more deeply into these motivations to better inform policy that balances data types and their intended applications.

V. CONCLUSION
Neurotechnology is rapidly evolving, prompting a need to reconsider our understanding and handling of brain data. Our study has explored public perceptions of different types of brain data and found that sensitivity varies depending on the inferences drawn from the data. However, this study underscores a notable gap: the nature and implications of the raw data itself.
While our analysis largely revolved around inferred data (insights generated from raw brain signals), it hints at a spectrum of data sensitivity, from 'mental/affective' data to more direct cognitive processes, suggesting a nuanced approach to data protection. However, raw brain data, which captures basic electrical patterns (like in EMG and EEG), presents distinct challenges. 94 As technology progresses, such raw data holds potential for diverse personal insights, perhaps beyond the current public understanding.
Several frameworks, including those advanced by Farahany, 95 the Neurotechnology Ethics Taskforce (NET), 96 and Ienca et al., 97 guide the treatment of brain data, addressing its storage, use, and dissemination. Grounded in public perceptions, these guidelines can be fine-tuned to be more effective. The urgency of brain data protection is gaining global traction. Chile's recent legal amendments concerning brain data, 98 and the Organization for Economic Cooperation and Development's (OECD) 2019 recommendations underscore this momentum. 99 UNESCO's recent report 100 and initiatives 101 further spotlight the pressing nature of this issue. The focus, however, needs to squarely address the inferences that can be drawn from brain data as well as the nature of the data itself.
If consumer neurotechnology trends toward edge computing and storage, our findings highlight the importance of policies targeting permissible inferences from brain data. Conversely, if edge computing is not the future, there exists a potential disconnect between public perceptions of brain data and the protections required for individual mental privacy.
Our findings emphasize the crucial need to delve deeper into public perceptions of raw brain data and its latent implications. A nuanced approach to regulation is essential, distinguishing between raw and inferred data, with each carrying its own set of immediate and potential implications.
In conclusion, addressing this subject requires an integrated perspective, combining insights from neuroscientists, ethicists, policymakers, data analysts, the public, and a broad and diverse set of stakeholders. While our study provides important insights on public perceptions about inferences that can be drawn from brain data, it also underscores the need for expanded research, especially concerning the less-explored territory of public perception and understanding of the nature of raw brain data.

Figure 1. Sensitivity of all information questions.

Figure 3. Perceived sensitivity of different types of information.

Figure 4. Perceived sensitivity of mental/affective information by socioeconomic status.

Figure 5. Early tech adoption has no correlation with perceived sensitivity of information.

Table 2. Demographic information of the survey respondents and corresponding US averages
