This paper defends an orthodox model of the linguistic intuitions which form a central source of evidence for generative grammars. According to this orthodox conception, linguistic intuitions are the upshot of a system of grammatical competence as it interacts with performance systems for perceiving and articulating language. So conceived, probing speakers’ linguistic intuitions allows us to investigate the competence–performance distinction empirically, so as to determine the grammars that speakers are competent in. This model has been attacked by Michael Devitt in his recent book and a series of papers. In its place, Devitt advances a model of linguistic intuitions whereby they are speakers’ theory-laden judgements about the properties of languages. In this paper, I try to make clear the rationale behind the orthodox model and the inadequacies of Devitt's model.
Intuitions as Evidence
An example: intuitions about binding
Acceptability and interpretability
The Orthodox Model: Linguistic Intuitions as Data for Psychological Theories
How do intuitions bear on competence theories?
Intuitions and judgements
Linguistic intuitions and visual impressions
Are linguistic intuitions the ‘voice of competence’?
Are linguistic intuitions and visual reports disanalogous?
Devitt's Model: Linguistic Intuitions as Theory-laden Judgements
Devitt's model and belief-independence
Devitt's model and folk theory
A modification to Devitt's model
Devitt's alternative view of the evidence
In order to determine what the grammar of a speaker's language is, generative grammarians draw upon a distinction between what is licensed by a speaker's grammatical competence—roughly, a system of linguistic information pairing sounds and linguistic forms over an unbounded range—and what is an effect of extraneous performance factors engaged in putting that competence to use (see, in particular, Chomsky , pp. 3–62). This paper examines how generative grammarians get a grip on this distinction empirically, what sort of evidence is brought to bear in generative grammar, and what sorts of hypotheses it is brought to bear on. The primary aim of the paper is to defend what I take to be an orthodox model of linguistic intuitions as they form a central source of evidence for generative grammars. According to this orthodox view, linguistic intuitions can be used to investigate the structure of a dedicated system of grammatical competence as it interacts with performance systems for perceiving and articulating language. So conceived, evidence from speakers’ linguistic intuitions allows the grammarian to investigate the competence–performance distinction empirically and thereby determine the grammatical structures that speakers are competent with. This orthodox model has been attacked by Michael Devitt. In its place, he advances a model of linguistic intuitions whereby they are speakers’ theory-laden judgements about the properties of languages (Devitt [2006a], [2006b], [2006c]). I aim to make clear the rationale behind the orthodox model and the inadequacy of Devitt's proposed alternative.1
Intuitions as Evidence
Both psychologists and philosophers are interested in people's intuitions. But the locus of their interest may differ. Psychologists are interested in gathering data on subjects’ intuitions, sometimes in elaborately designed experiments, because these intuitions reflect the workings of the psychological systems of the subjects that have them. Philosophers may also be interested in people's intuitions because of what they reveal about the psychological states of the people that have them. But philosophers sometimes seem to be interested in people's intuitions because they are revelatory of a non-psychological domain of facts. This interest is well motivated where there is good reason to think that the subject is well positioned with respect to the facts in question, perhaps because he has some special knowledge.
Generative grammarians draw upon the intuitions of competent speakers. We all have such intuitions. If I say to you ‘John posted the letter to Bill’, you immediately recognize that as a part of your language. If I were to ask you whether it was OK, a perfectly good sentence of your language, then, no doubt, you would say that it was. However, if I say to you ‘to posted Bill the John letter’ you are likely to recognize the words as part of your language but also recognize that there is something amiss in the way they have been put together. In fact we have very intricate intuitions about the linguistic forms of our language and their meanings. Grammarians use these intuitions to investigate grammatical structure.
Chomsky's teacher Harris held that linguists cannot investigate the structure of languages by examining speakers’ intuitions:
We do not ask a speaker whether his language contains certain elements or whether they have certain dependencies or substitutabilities [… the speaker's habits] are not sufficiently close to all the distributional details, nor is the speaker sufficiently aware of them. Hence we cannot directly investigate the rules of the ‘language’ via some system of habits or some neurological machine that generates all the utterances of the language. (Harris , p. 45)
Harris surmised that, rather than investigate speakers’ linguistic judgements, the grammarian has to investigate ‘some actual corpus of utterances’ from which we derive ‘such regularities as would have generated those utterances’ (Harris , p. 45). But current scientific practice suggests that Harris was wrong about what could be learnt from speakers’ judgements. Though the questions are not framed in the metalinguistic way that Harris considered, grammarians do investigate grammatical structure by probing native speakers’ intuitions. The orthodox view is that these intuitions are yielded by special cognitive systems responsible for recognizing and shaping grammatical categories in speakers’ utterances.
If the aim of linguistic inquiry is a theory of the grammatical competence system as it is situated within linguistic cognition then it is natural to seek data from the subjects whose cognitive capacities are the domain of inquiry, just as in other areas of psychology. The central issue that Devitt has raised is whether generative grammarians, like psychologists, are interested in speakers’ intuitions because they are data for theories about speakers, and more particularly their grammatical system (see Devitt [2006a], pp. 95–125). Devitt suggests, to the contrary, that grammarians are concerned with these intuitions because they are revelatory of a domain of non-psychological facts to which speakers may have access through empirical reflection. According to the orthodoxy, grammatical principles are part of an explanation of a range of data concerning the proclivities of speakers. The data are that they find certain forms acceptable and that certain interpretations are available to them. On Devitt's view, the data bear primarily on the properties of the presented sounds and marks rather than on the cognitive states of the speakers that intuit them.
The fact that intuitions may have a different significance to psychologists and philosophers, and the fact that there are competing accounts of how linguistic intuitions serve as evidence, suggest that the nature of intuition as a general category is not well enough understood to provide a model of linguistic intuition. The term ‘intuition’ has been applied across a range of domains and cognitive abilities, and there may not be the kind of singular phenomenon that the term suggests. As Fiengo remarks, we might do well to focus for present purposes on the question of ‘what linguistic intuitions must be like if they are to be the data of Linguistics’ (Fiengo , p. 255).
An example: intuitions about binding
Although a wide variety of intuitive judgements are brought to bear on generative grammars, I’ll focus on one illustrative example of how intuitions have been exploited to support theoretical hypotheses. It concerns the analysis of sentences containing reflexives like myself (this presentation is adapted from Adger , pp. 116–20). Consider the following examples:
(1) I shaved myself.
(2) *Myself shaved me.
English-speakers intuitively judge (1) to be a perfectly good sentence, but immediately recognize that there is something amiss with (2). As English-speakers, we would replace (2) with (1). Generative grammarians draw upon the intuited difference between (1) and (2), and related constructions, as evidence for a theory of the structures containing reflexives that are sentences of human languages.
It turns out that the relevant intuitions about permissible reflexive constructions and their possible meanings can be explained in terms of the notion of c-command (constituent-command).2 Linguistic structures can be represented by tree diagrams, and the relations between the nodes in the tree can be described in familial terms. Hence, if node Y is directly above node X in the tree then Y is X's parent, and if Z is directly above Y in the tree then Z is X's grandparent and so on. If Y is also directly above W in the tree, then X and W are sisters. And we can say that the node Z contains all those nodes that are descended from it as children, grandchildren and so forth. In tree diagrammatic terms, a node X c-commands a node Y if, and only if, X's sister either: (i) is Y, or (ii) contains Y. Reflexive constructions, like (1) and (2), are an example of c-command in action. In order for a reflexive to be part of an acceptable sentence like (1), it has to enter into special relationships with other constituents in the structure in which it occurs. For a reflexive, like myself in (1), to be a part of a good sentence it must be bound: it must have its interpretation fixed by an antecedent in that sentence. Myself in (1) is bound by I. Linguists aim to discover how it is determined when a potential antecedent can bind a reflexive; for all seems to go well with (1) but not with (2). A well-known hypothesis (Principle A of Binding Theory) is that a reflexive must be bound by an antecedent that c-commands it.3 Evidence from speakers’ intuitions can be marshalled in support of this hypothesis as follows.
What could explain the contrasting intuitive judgements English-speakers make about (1) and (2)? It seems that in (1) the reflexive is co-referential with another expression in the sentence, in this case I. So we might form the following generalization: a reflexive must be co-referential with another expression in the sentence. However, this generalization would not explain the intuited difference between (1) and (2) because the myself and me in (2) could co-refer, and yet (2) would still be a recognizably poor sentence, though (1) is not. We might instead look at some of the properties of the different lexical items in (1) and (2). Lexical features pertaining to number, person, and gender are called Φ-features. We might hypothesize that a reflexive and its antecedent must bear the same Φ-features. We could then frame a new hypothesis about reflexives: that they must be co-referential with another expression in the sentence that shares the same Φ-feature specification. But again this wouldn't explain the intuited difference between (1) and (2). In both (1) and (2) we have two expressions that share singular, first-person Φ-features, one of which is a reflexive. So this hypothesis falsely predicts that speakers should intuitively judge (1) and (2) to be equally good.
An aspect of (1) that we need to capture is that myself doesn't merely co-refer with I but is actually bound by I. Speakers must always interpret it to refer to the same individual that I does. If, in addition to this dependency, we add the notion of c-command, we can drastically improve upon our previous generalizations: a reflexive must be bound by an antecedent that c-commands it. I will assume, unrealistically but for ease of exposition, that simple sentences like (1) are composed of NP-VP as in (1a):
(1a) [s [NP I][VP [V shaved][NP myself]]]4
The NP I c-commands the NP myself in (1). On this analysis, in simple sentences like (1), the object of a sentence will be c-commanded by the subject of a sentence, since the object is contained in the subject's sister and not vice-versa, as marked in (1a). In (2) the reflexive is not bound by a c-commanding antecedent because even if myself is interpreted as co-referential with me it is not c-commanded by me. Principle A, making use of c-command relations, thus explains the intuited contrast between (1) and (2).
Drawing on further intuitions data, we can test Principle A, which appeals to binding and c-command, against competitors such as the principle that a reflexive must be bound by a preceding expression. This competing explanation does not extend to further data such as our intuitive judgements about (3).
(3) *[The man I saw] shaved myself.
In (3) the pronoun I, which is a potential antecedent for myself, precedes the reflexive but (3) is not a good sentence. As indicated by the square brackets, The man I saw is a constituent of sentence (3), according to standard tests for constituency5, and contains I. But there is further structure to The man I saw, so I will be at a position hierarchically below The man I saw. I doesn't c-command myself as myself is not a sister of I and there is a node above I that does not contain the reflexive. Hence, I does not c-command myself though it does precede it. The notion we need is not precedence but c-command.
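The definition of c-command and the contrast between c-command and precedence can be made concrete with a minimal sketch. It is an illustration only, not part of any grammatical theory's formal apparatus: the class `Node` and the functions `contains`, `c_commands`, `leaves`, and `precedes` are hypothetical names introduced here, the trees use the simplified NP-VP analysis of (1a), and the internal structure assigned to The man I saw in (3) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by label
class Node:
    label: str                                    # category, e.g. "S", "NP", "VP"
    word: Optional[str] = None                    # set only on leaf nodes
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def contains(ancestor: Node, node: Node) -> bool:
    """True if `node` is a child, grandchild, ... of `ancestor`."""
    current = node.parent
    while current is not None:
        if current is ancestor:
            return True
        current = current.parent
    return False


def c_commands(x: Node, y: Node) -> bool:
    """X c-commands Y iff a sister of X either is Y or contains Y."""
    if x.parent is None:
        return False
    sisters = [n for n in x.parent.children if n is not x]
    return any(s is y or contains(s, y) for s in sisters)


def leaves(root: Node) -> List[Node]:
    """Leaf nodes in left-to-right order."""
    if not root.children:
        return [root]
    return [leaf for child in root.children for leaf in leaves(child)]


def precedes(root: Node, x: Node, y: Node) -> bool:
    """True if leaf `x` comes before leaf `y` in the string."""
    position = {id(n): i for i, n in enumerate(leaves(root))}
    return position[id(x)] < position[id(y)]


# (1a) [S [NP I] [VP [V shaved] [NP myself]]]
i_1, myself_1 = Node("NP", "I"), Node("NP", "myself")
s1 = Node("S", children=[i_1, Node("VP", children=[Node("V", "shaved"), myself_1])])

# (2) *Myself shaved me (same skeleton, reflexive in subject position)
myself_2, me_2 = Node("NP", "myself"), Node("NP", "me")
s2 = Node("S", children=[myself_2, Node("VP", children=[Node("V", "shaved"), me_2])])

# (3) *[The man I saw] shaved myself (relative-clause structure is assumed)
i_3, myself_3 = Node("NP", "I"), Node("NP", "myself")
relative = Node("S", children=[i_3, Node("VP", children=[Node("V", "saw")])])
subject = Node("NP", children=[Node("Det", "The"), Node("N", "man"), relative])
s3 = Node("S", children=[subject, Node("VP", children=[Node("V", "shaved"), myself_3])])

for name, root, antecedent, reflexive in [
    ("(1)", s1, i_1, myself_1),
    ("(2)", s2, me_2, myself_2),
    ("(3)", s3, i_3, myself_3),
]:
    print(name,
          "c-command:", c_commands(antecedent, reflexive),
          "precedence:", precedes(root, antecedent, reflexive))
```

On these toy trees the candidate antecedent c-commands the reflexive only in (1). In (3) the antecedent precedes the reflexive without c-commanding it, which is the configuration that separates Principle A from the precedence-based competitor.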
Linguists might unearth further evidence from speakers’ intuitions that overturns Principle A. There are some apparent counterexamples to Principle A in English, such as ‘Jane saw Stuart's picture of herself’. These ‘picture reflexives’ are of borderline acceptability to English-speakers. There are further cases of contrastives like ‘Bill can't imagine why Mary would want anyone other than himself’. Speakers’ intuitive judgements about these apparent counterexamples are marginal: some speakers think they are OK but nearly all speakers judge that the sentences sound better with ‘him’ than ‘himself’. Whether these sentences constitute genuine counterexamples to Principle A depends upon their proper analyses. (See Kayne  for further discussion and see Boeckx , pp. 105–9, for a discussion of reconstruction effects that once seemed to violate Principle A.)6 But the important point is that such sentences are marginal for everyone; for the linguist and non-linguist alike.7
That's how evidence from intuitions can issue in such theoretical principles as Principle A. Principle A is supported by the intuitions data to the extent that it is the best explanation of those data. An aim of grammatical theory is to explain these intuitive judgements of acceptability and unacceptability, and the structural interpretations that speakers judge sentences to have.
Acceptability and interpretability
In classifying the data, the term acceptable is used to refer to utterances that are relatively natural and easy to comprehend without any paper-and-pen analysis. Acceptability is a matter of degree. The unacceptable structures are relatively more difficult though they may be grammatical for all that. Chomsky gives the following characterization of acceptability:
[L]et us use the term ‘acceptable’ to refer to utterances that are perfectly natural and immediately comprehensible […] Obviously acceptability will be a matter of degree, along various dimensions […] The more acceptable sentences are those that are more likely to be produced, more easily understood, less clumsy, and in some sense more natural. The unacceptable sentences one would tend to avoid and replace by more acceptable variants, wherever possible. (Chomsky , pp. 10–1)
We can use the term interpretable to refer to the fact that a string has a natural interpretation, though it may have more than one such interpretation. Where a string is associated with more than one structural interpretation, we have structural ambiguity.
Intuitions about acceptability and interpretability can dissociate where, for example, speakers find a string acceptable in principle but uninterpretable as in (4):
(4) Colourless green ideas sleep furiously.
Although speakers struggle to assign (4) an interpretation and find it odd, they recognize it is unlike (5):
(5) *Ideas colourless furiously green sleep.
The string presented in (5) is ‘word salad’. Examples like (4) suggest that we can sometimes prise apart speakers’ grammatical sensitivities from their ability to find a literal meaning for a string. Speakers’ intuitions about (4) suggest that while they do not have direct awareness of its underlying structural properties, they do have an immediate sense of whether what they are confronted with has the structure of a sentence of their language and what that broad structure is.
Acceptability and interpretability as data sources are to be distinguished from the theoretical notion of grammaticality, and what is generated by a grammar. Speakers have no intuitions about what a grammar mandates, in the theoretical sense of a grammar that concerns linguists.8 This is reflected in the distinction between grammatical competence and linguistic performance. A speaker's grammatical competence system is just one component amongst an ensemble of systems responsible for their intuitions about acceptability and interpretability. Acceptability elicits or classifies intuitions but it is not something that can get a full explanation from linguistic theory as it looks to involve a range of factors beyond grammar. These include processing factors, semantic and pragmatic factors, as well as commonsense knowledge and contextual factors. If I say ‘I called the man who wrote the book that you told me about up’, this might seem rather unwieldy. But it is grammatical, where this means that the best generative grammars assign it a structural interpretation in just the way they do my less unwieldy utterance of ‘I called up the man who wrote the book that you told me about’. As a set of rules, the grammar will generate a set of structures. But how that set bears on what we do and don't find acceptable is a highly theoretical matter.
At deeper levels of explanation, where grammarians are concerned with very abstract principles and aiming for greater generality, they are not merely checking principles off against the observed intuitions. To do so would trivially achieve descriptive adequacy: that the theory assigns a structural description to every sentence of the language indicating how it is understood. But it would serve merely to recapitulate the data. The grammarian is always concerned with the explanatory adequacy of his theory: determining the actual grammar of the language a speaker has acquired from amongst the possible descriptively adequate grammars. This is a more stringent condition and confers a deeper level of justification. At deeper levels of explanation, theoretical virtues like the generality and simplicity of the grammatical principles will be more to the fore. But speakers’ intuitions will still play a guiding role in the investigation because linguists are concerned to explore the languages that speakers are actually competent in and not just to come up with simpler and more powerful grammars.
Though intuitions have a central evidential role in generative grammar, this does not suggest that other forms of evidence are irrelevant in principle or in practice. A central component of what Fodor calls The Right View of generative grammar is that there is no proprietary body of data such that we can tell a priori what evidence might bear on grammatical hypotheses (see Fodor , pp. 147–51). According to The Right View, not only speakers’ intuitions but also facts about language use, grammar acquisition, the neurology of speaker-hearers,9 ‘or, for that matter, the weather on Mars’ could, in principle, bear on grammatical hypotheses. The alternative to The Right View of generative grammar Fodor calls The Wrong View, according to which we can stipulate in advance what evidence counts as relevant to grammatical theory. Fodor finds The Wrong View implausible in light of the way that science is really conducted; to adhere to it, Fodor claims, would be to take exception to the methodological principles that characterize the more mature sciences.
Katz defends a species of what Fodor calls The Wrong View. For Katz, the evidence from speakers’ linguistic intuitions has precedence over all the other sorts of evidence. Katz labels such evidence ‘direct’, or ‘linguistic’, evidence and contrasts it with other forms of evidence, such as that from grammar acquisition or psychological experiment, which only constitute ‘indirect’ or ‘psychological’ evidence. On Katz's view, a linguist may get clues about grammar, when the ‘linguistic evidence gives out, by discovering psychological or neurological facts about speakers’. But, according to Katz, ‘indirect evidence depends on direct evidence for its legitimization as a relevant source of facts and direct evidence has a prior claim over indirect evidence.’ (Katz , p. 71). Katz thinks that other evidence can never compel us to revise or abandon a grammatical hypothesis that is supported ‘on the basis of unchallenged direct evidence.’ (Katz , p. 83).
In contrast, according to The Right View, there is no distinction between direct and indirect evidence for grammatical theories. There are just different sources of evidence that may be more or less useful in our current state of knowledge. In principle, an experiment in speech perception or a piece of neurological evidence might be relevant to working out the form of a speaker's language, just as a speaker's intuitive judgement may be. In our current state of knowledge, evidence from intuitions is more readily available than neurological evidence for grammatical hypotheses. But there is no principled reason that other forms of evidence could not lead us to revise particular hypotheses that have been supported by intuitions data.
If one is a Platonist, like Katz, then one may be unmoved by methodological morals about general scientific practice because one denies that linguistics is like the other sciences that draw on empirical evidence. If linguistics is a part of mathematics or logic concerned with abstract objects, and in these mathematical sciences one can choose what is of interest, then one can stipulate that only a certain range of data are to be included amongst the ‘linguistic’ data. Platonists can then focus on the mathematical problem of formally specifying a grammar that predicts a certain range of data, such as speakers’ intuitions. However, there is no particular reason why Platonists should attend to just those grammatical properties of the languages speakers actually acquire.
According to The Right View, generative grammarians are interested in explaining and predicting, inter alia, speakers’ intuitions. The difference is that, on The Right View, grammarians are interested in intuitions because they hypothesize, rather than stipulate, that intuitions are revealing of the target of inquiry: the grammars speakers acquire. On The Right View, the intuitions evidence has no privileged status. So conceived, the grammarian wants to know what sort of grammars can be acquired and how, how speech is understood, how language interacts with other areas of cognition, what aphasics and schizophrenics reveal about language, what we have that animals lack: ‘in short, all that stuff that got people interested in studying languages in the first place’ (Fodor , p. 60). Fodor warns proponents of The Wrong View that while they are free to adopt a proprietary, or a priori, conception of the ‘linguistic’ evidence and pursue such an inquiry, ‘all the action is at the other end of town’.
To take a schematic example of the sort of evidence that could be useful to grammarians, consider evidence concerning speech processing and how it could be used to help filter out the effects of the parser, which exhibits a different organization from grammatical competence. Let's suppose we had two differently structured grammars, G and G′, that hitherto could both explain a speaker's intuitions, and a theory M of the organization of short-term memory in human adults that has received some independent confirmation. If the conjunction of M and G predicted that triply self-embedded sentences are not construable by human adults, whereas the conjunction of M and G′ predicted the contrary, then we would have evidence for preferring G to G′, though the two grammars might make the same predictions about the intuitions of speaker-hearers independently of the evidence from short-term memory.
Linguists draw on evidence from grammar acquisition in meeting explanatory adequacy, evidence from pragmatics in discerning what falls within the core language faculty and what falls outside it, evidence from pathological cases and evidence from work on the brain. (See Pettito , pp. 97–8, for a discussion of evidence from cases of brain impairments that may support Chomsky's postulation of a level of phonological representation common across hearers and signers.) But one might wonder why, if linguists are really interested in this broad array of data, they seem to ignore a lot of readily available psychological evidence from speech processing and production.
The relation between evidence from speech processing and generative grammars is delicate. To take an illustrative example, Quine once argued that the phrase boundaries that grammarians posit are just artefacts of their theories, as he thought they would be with formal languages, rather than a reflection of anything real (see Quine ). Quine claimed that for formal languages, there is no ‘right’ syntax; one can arbitrarily pick one that generates the right theorems, and by analogy, that generative grammarians can just pick a grammar because the only thing that is real is the set of strings that the rules generate.10 Quine argued that it was wrong to assume that there is a true answer to the question of where the phrase boundary is in strings of the form ABC. He thought it could be between B and C or between A and B as one liked, so long as the same strings are preserved.
But later some psychological experiments in speech processing, the so-called ‘click’ experiments, were carried out, and these led Quine to change his mind. In the click experiments subjects were presented with sentences like (6) and (7). With the bracketed material included, we get different readings of the non-bracketed material and seem to process the non-bracketed material differently. (These examples are taken from Collins .)
(6) [Your] eagerness to win the horse is quite immature.
(7) [In its] eagerness to win the horse is quite immature.
In (6) we leave a main break between horse and is, whilst in (7) the main break comes after win. If click noises are placed in the same objective positions (between, say, the and horse) in the acoustical stream as each of these sentences is uttered, subjects re-position the clicks in different places to reflect the main phrase breaks. After the click experiments were devised, Quine changed his mind about phrase boundaries and said that they are real because the click experiments show how you could get evidence to decide between the competing rule systems that generate them (see Quine ).
Chomsky thinks that this is a serious misinterpretation of what the experiments establish. As Chomsky (, pp. 125–7) sees it, the work on clicks serves only to test an experiment and not to test for phrase structure. The work on clicks can test whether clicks are displaced in a way that accords with phrase boundaries. But if the click experiments had been out of step with phrase structure in clear cases then it would not have suggested that phrase structure be revised to fit with click displacement. It might equally well suggest that the experiment was poorly designed as an indicator of phrase structure. One would not, for example, hypothesize that phrase boundaries come in the middle of a word on the basis of click displacements being heard in the middle of a word. Chomsky thinks that it would suggest instead that the experiment is not fit for purpose because the displacements suggest the wrong structures in clear cases. We make robust intuitive judgements that provide evidence for where the phrase boundaries are in our language and if the click experiments do not gel with these judgements then the grammarian may have a reason to reject the connection between click displacement and phrase boundaries. So it is not so straightforward to determine what is suggested about the structure of a speaker's grammatical system from experiments that target the way we process speech.
The Orthodox Model: Linguistic Intuitions as Data for Psychological Theories
How do intuitions bear on competence theories?
Psychologists elicit intuitive responses to presented material in such diverse areas as ‘Theory of Mind’ and developmental research11, reasoning12, moral cognition13 and throughout vision science. To focus on the visual case, just as reports of visual impressions constitute data for theories of the visual system that processes visual information, so, on the orthodox model, linguistic intuitions constitute data for theories of the grammatical competence system that constrains the linguistic forms that a speaker finds acceptable and how they can be interpreted. On this model, intuitions data are brought to bear on a theory about a core component of speakers’ internal linguistic organization: grammatical competence. The character of a native speaker's intuitions leads us to ask:
What must her internalized grammar be like […] for her to find these arrangements of words acceptable but not those; for her to be able to interpret a sentence in this way but not in that. To arrive at specific hypotheses about the internalized grammar we reason counterfactually: had the grammar been different, had it not respected a particular constraint then it would have been possible to hear certain utterances differently. (Smith , p. 959)
One of Devitt's charges ([2006b]) against this orthodox model is that there is currently ‘no account of how the rules embodied in the language faculty could provide intuitions about syntactic facts’. In one sense, Devitt is correct. The orthodox model provides only a very partial explanation of how grammatical competence could issue in intuitive judgements of acceptability and interpretability. There is currently no complete explanation of how intuitions are produced, only a partial explanation of the character of those intuitions in terms of the structure of an underlying system of grammatical information and systems for putting that information to use. As Chomsky (, p. 270) has pointed out, ‘we do not, of course, have a clear account, or any account at all, of why certain elements of our knowledge are accessible to consciousness whereas others are not, or of how knowledge, conscious or unconscious, is manifested in actual behaviour.’
The structure of the competence system provides some explanation of the form and character of the intuitions, and the performance mechanisms are intended to provide some explanation of how the linguistic forms licensed by the competence system are employed in speaking and understanding, and the more off-line judgements made on the basis of our comprehension of linguistic material. But Devitt is right that there is no complete theory of how competence and the language mechanisms issue in linguistic intuitions. The explanation of this is that, for the reasons outlined, intuitive judgements of acceptability and interpretability (and indeed, conscious awareness) are not phenomena that can get a full explanation from a theory of grammatical competence. The broader empirical challenge is to try and understand all the different factors involved in linguistic judgements, in the same way that we might try to understand the factors that shape judgements in other areas of psychology. After all, there is currently no account of how the computations of the visual system issue in conscious intuitive judgements about the properties of a presented scene. So, it's not clear that there is a special problem with language.
Speakers’ linguistic intuitions are far more discriminating than one might expect. For instance, sentence (8) is ambiguous between a reading on which duck and swallow are both nouns and one on which they are both verbs.
(8) I saw her duck and swallow.
Interestingly, it is two ways and not four ways ambiguous in natural language. We can hear (8) as containing the two verbs duck and swallow. We can also hear it so that duck and swallow is an NP containing the nouns duck and swallow. But no speakers hear it as having mixed readings on which duck is a verb and swallow is a noun, or vice-versa, and it is never uttered with this meaning. It is logically possible that the sentence should have the mixed readings. And we could artificially stipulate that the sentence was to be understood in such ways. This would be to create a piece of artificial language, since no one naturally acquires such a language. The fact that English-speakers don't intuit these mixed readings can be taken as evidence concerning the organization of their grammatical competence.
The explanation of the relevant intuitions is that the competence system is structured according to a co-ordination constraint. The constraint determines that we can only conjoin constituents of the same grammatical category. This hypothesis about grammatical competence, supported by the evidence from intuitive judgements about forms like (8), explains why speakers are unable to achieve such logically possible mixed readings. The interpretations that speakers can consciously hear, and then judge, such expressions to have, are crucial evidence for, or against, this hypothesis about their grammatical competence. I only hear (8) two ways, and I can only consciously hear, or attend to, one of those interpretations at a time. Once I recognize that (8) has these two readings, I can consciously switch my attention back and forth between these intuited structural properties.
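The filtering effect of such a constraint can be illustrated with a small sketch. It assumes a hypothetical two-word toy lexicon and treats the co-ordination constraint as nothing more than a same-category check. The names LEXICON, logically_possible and licensed_readings are introduced here for illustration only and are not drawn from any grammatical formalism or parser.

```python
from itertools import product

# Hypothetical toy lexicon: each word is listed with the categories it can bear.
LEXICON = {"duck": {"N", "V"}, "swallow": {"N", "V"}}


def logically_possible(left, right, lexicon=LEXICON):
    """Every pairing of categories for the two conjuncts, constraint ignored."""
    return list(product(sorted(lexicon[left]), sorted(lexicon[right])))


def licensed_readings(left, right, lexicon=LEXICON):
    """Pairings that survive a same-category co-ordination constraint."""
    return [(a, b) for a, b in logically_possible(left, right, lexicon) if a == b]


print(logically_possible("duck", "swallow"))  # four pairings, two of them mixed
print(licensed_readings("duck", "swallow"))   # only the noun-noun and verb-verb pairings
```

On this toy picture the mixed pairings are generated as logical possibilities but never licensed, which mirrors the pattern in speakers' reported readings of (8). The sketch says nothing about how the constraint is actually realized in the competence system.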
Intuitions and judgements
Though grammarians do not tend to distinguish explicitly between intuitions, judgements, and intuitive judgements, it may be that intuition and judgement are picking out distinct aspects of speakers’ engagement with language. The term ‘intuition’ seems to refer to the unreflective take or awareness that the speaker has of linguistic form, whilst ‘judgement’ seems to refer to the formation of a report on the basis of that intuitive take or impression.14
It also seems that nothing very intellectualized is meant by intuitive judgement in this context. Intuitive judgement might suggest that the data take the form of a speaker-hearer judging that a linguistic form is grammatical, ambiguous, and so on. Indeed, Devitt ([2006a], p. 95) thinks that the relevant sense of judgement is ‘metalinguistic judgements about acceptability, grammaticality, ambiguity, coreference/binding, and the like’. But in proffering their linguistic judgements speakers are not generally required to have linguistic concepts with which to express the status or structural interpretation they have assigned to linguistic material. As Collins notes:
We are interested in how speaker/hearers interpret strings, either their own or those of others. This covers a panoply of different attitudes. Most often, the data are simply that speaker/hearers find a string unacceptable. Period […] Other times, we might be after a more explicit judgement, and so we ask, ‘How many ways ambiguous is the sentence, I had the book stolen?’. Other times we might ask, ‘Who is fixing the car in the sentences Bill told Sam to fix the car and Bill promised Sam he would fix the car.’ (Collins , pp. 7–8).
To be capable of interpreting linguistic material, speakers need not have any metalinguistic concepts with which to categorize the material or any special expertise beyond competence in their language. No expertise is required, only an honest report of how things strike one. In this respect, the linguistic intuitions data is analogous to the data for other psychological theories, where ‘there is no relevant expertise about the data beyond the authority of the subject's own perceptions.’ (Slezak [unpublished], pp. 33–4).
On the orthodox model, a speaker's intuitions are simply cognitive data to be explained, and eliciting those intuitions does not require attributing to the speaker any of the theoretical concepts that animate grammatical theory. If a linguist says that a speaker has the intuition that a reflexive must be locally bound, this is just a shorthand way of saying that the speaker has linguistic intuitions that can be explained on the basis of his possessing a grammatical competence organized according to principles involving reflexives, locality, and binding.
Linguistic intuitions and visual impressions
Much of the evidence for computational theories of vision has come from subjects’ responses to presented material, either in the form of reports on the way that things appear or seem to them, or their use of such appearances to carry out visual tasks. Chomsky suggests a comparison between the way that speakers’ intuitive responses to linguistic material are brought to bear on generative grammars and the way that subjects’ reports in visual experiments are brought to bear on theories of vision:
A generative grammar attempts to specify what the speaker knows, not what he may report about his knowledge. Similarly, a theory of visual perception would attempt to account for what a person actually sees and the mechanisms that determine this rather than his statements about what he sees and why, though these statements may provide useful, in fact, compelling evidence for such a theory. (Chomsky , pp. 8–9)
Chomsky (, p. 125) takes the study of the computational operations of grammatical competence, on which the intuitions evidence bears, to be the study of ‘mental representations and computations, much like the inquiry into how the image of a rotating cube in space is determined from retinal stimulations.’
One similarity between the experimental investigation of vision and the investigation of speakers’ linguistic intuition is that the intuitive takes speakers have on linguistic material are pre-doxastic in a way that compares with visual appearances. And the pre-doxastic nature of linguistic intuitions and visual appearances is of interest to the grammarian and vision scientist, respectively. Upon presentation of a Kanizsa triangle (Figure 1), subjects report an impression of an equilateral triangle with its corners in the circular (pacman-like) elements of the presentation.
This impression of a triangle exhibits belief-independence because it can be had even by subjects who do not believe that there is a triangle there and who have seen how the illusion is created by comparing the two boxes in Figure 2.
There is a large number and variety of visual illusions, such as the Necker cube and the Müller–Lyer lines, which can be enjoyed or suffered even by those who do not believe in the veridicality of the appearances. They provide important evidence about how the visual system fills in and processes the information that is input to it. These visual seemings or impressions that are generated in the course of visual processing clearly encode more information than is given to the senses, and are of particular interest for precisely that reason. They are sometimes called percepts to highlight that they are impressions or seemings, and to distinguish them from genuine perceptions.
Such mental capacities as vision, which exhibit independence from belief and general intelligence, are said to be encapsulated. As Fodor () originally employed this notion, informational encapsulation meant that the computations that a system carries out are defined over a restricted base of information and not penetrated by central cognitive processes, such as those involved in belief and theory-formation.
Linguists’ interest in speakers’ intuitions is comparable along this dimension. With strings like (9), an impression of a complete structure can persist despite our coming to believe that (9) does not have such a structure:
(9) Many more people have been to France than I have.
If we try to fill in the structural ellipsis, we see that the sentence makes no sense: Many more people have been to France than I have (been to France?). Strings like (9), particularly if read at sufficient speed, can strike us over and over as having a full structural interpretation even once careful inspection has revealed to us that they have none, or we have come to believe as much on the basis of testimony. Our initial linguistic judgements about the properties of (9), and our revision of those judgements upon closer inspection, suggest as much.
An inspection of speakers’ intuitions about sentences such as (10) and (11) suggests hypotheses about the grammatical information they possess, and a cleavage between this information and speakers’ abilities to use the information in real time. These sentences can strike me, again and again, as lacking a full structure even once prompting, extended attention, or the testimony of others has issued in contrary beliefs:
(10) The man the cat the dog bit scratched died.
(11) The horse raced past the barn fell.
(11a) The horse raced past the barn.
Given some concentration, time, or prompting, speakers can come to recognize the structure of (10) and judge that it is a sentence of their language, though they tend to find it unacceptable at first blush and would never use that structure. Once I recognize that (10) is a double centre-embedding then I can pair off the embeddings and the structure becomes apparent: the dog bit the cat that scratched the man that died. We can see this if we start with the sentence the man died and then embed the clause the cat scratched, which describes the man, to give us the man the cat scratched died. Then we can embed the dog bit, which describes the cat, into the embedded clause yielding (10). Running through that procedure with (10), I can keep its grammatical structure firmly in view and judge its broad structural properties. But as soon as my attention lapses, I lose that structure and (10) again strikes me as lacking it.
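The embedding procedure just described can be written out as a short sketch. It is an illustration only: the function centre_embed and its argument format are hypothetical, and it simply stacks the noun phrases and then unwinds their verbs in reverse order. It makes no claim about how the parser or the competence system actually computes such structures.

```python
def centre_embed(subject, main_verb, relatives):
    """Build a centre-embedded string.

    relatives is a list of (noun_phrase, verb) pairs, outermost clause first.
    """
    noun_phrases = [subject] + [np for np, _ in relatives]
    # Verbs surface in the reverse order of their noun phrases.
    verbs = [verb for _, verb in reversed(relatives)] + [main_verb]
    words = " ".join(noun_phrases + verbs)
    return words[0].upper() + words[1:] + "."


# Hypothetical step list reproducing the build-up to (10).
steps = [("the cat", "scratched"), ("the dog", "bit")]
for depth in range(len(steps) + 1):
    print(centre_embed("the man", "died", steps[:depth]))
```

Each pass through the loop adds one level of embedding, so the final line printed is (10). The same rule applies at every depth, which is the sense in which the difficulty of (10) need not be laid at the door of the grammar.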
What is stopping me recognizing the sentence as part of my language is the lack of attention and other resources required to process the embedding. In the case of (10), and a vast range of other cases, it is not the language I know that rules out the structure but the extraneous factors involved in using such grammatical information, in this instance to repeatedly centre-embed. The explanation on offer is that, when I am presented with (10), something masks my standing competence with centre-embedded structure. The distinction between competence and performance is suggested by a wide range of phenomena concerning what one can immediately parse and what one can come to recognize with added performance resources such as time, extended attention, or bracketing. And linguists need both sorts of principles (the recursive operation of a grammar and the rather different organization of the performance systems) to explain these data.
‘Garden path’ sentences, like (11), strike many English-speakers as leaving the verb ‘fell’ dangling off the end of an otherwise good sentence (11a). The intuitive consideration of (11) via which a speaker comes to structure it so that the dangling verb is the main verb and raced past the barn is an embedded clause, may forever erase this impression that fell is dangling. But it needn’t. The intuition can be robust and a construal of (11) on which The horse raced past the barn is a sentence rather than a determiner phrase can continue to suggest itself. Such structures ‘lead us up the garden path’ and there is a residual impression of unacceptability. Though a parsing explanation of this phenomenon is available on which speakers first find the tensed phrase (11a) and so do not structure (11) such that fell is the main verb, the explanation may in fact be partly grammatical. Compare (12) and (12a):
(12) The paint daubed on the wall stank.
(12a) *The paint daubed on the wall.
Sentence (12a) is not structurally ambiguous in the way that (11a) is. The paint can't daub whereas the horse can race. The subject's grammatical proclivities can be probed in this way by varying the presented material and seeing how the immediate intuitive take varies. Such examples suggest that linguistic intuitions provide evidence for investigating an encapsulated grammatical system and a distinction between a system of grammatical competence and integrated performance systems.
As with visual experimentation, there can be priming effects. If an ambiguous sentence such as (13) is presented in a certain context, the hearer may take it in a unique way and fail to see the ambiguity.
(13) Flying planes can be dangerous.
In such instances speakers may even reject the second proposed interpretation as unnatural or contrived. Nevertheless, the speaker's ‘intuitive knowledge is clearly such that both interpretations are assigned to the sentence by the grammar he has internalized in some form.’ (Chomsky , p. 21). This knowledge can be drawn out, sometimes in quite subtle ways to determine the actual form of the underlying competence. We can see this by taking a less transparent ambiguity like (14):
(14) I had a book stolen.
Few hearers will notice the fact that this structure is three ways ambiguous. But the fact that their internalized grammar provides three structural descriptions for the sentence (corresponding to my having the book stolen from me or for me, or my stealing the book myself) can be brought out by providing elaborations on (14) and gathering intuitive judgements:
(14a) I had a book stolen from my car when I stupidly left the window open.
(14b) I had a book stolen from the library by a professional thief who I hired to do the job.
(14c) I almost had a book stolen but they caught me leaving the library with it.
In bringing out the three-way ambiguity of (14), we do not have to present the speaker with any new information about his language; we only need to arrange linguistic material in such a way that the structures his grammatical competence affords him become available. He then judges accordingly.
Linguists have clever ways of controlling for pragmatic effects on linguistic judgements. Consider ‘minimal pair’ experiments (see Crain and Thornton ). Speakers in these experiments are presented with strings that are hypothesized to differ only in that one fails a certain grammatical constraint. The speakers are asked, simply, which is a worse sentence of their language. Naturally, such controls do not eliminate the intrusion of pragmatic factors, but rather aim to marginalize them. They reflect the fact that the grammarian is not so much concerned with what might be conveyed or implied by using a string in a particular communicative context. The ‘minimal pair’ experimental setting serves to strip away some of that context and leave the speaker to make a report revealing of the structural materials that are immediately available to him on the basis of the linguistic material alone.
There is evidence that the orthodox model I’ve outlined is precisely the model of linguistic intuitions as psychological data, analogous to visual reports, which Chomsky has in mind:
A grammar is a system of rules that generates an infinite class of ‘potential percepts’ […] In short, we can begin by asking ‘what is perceived’ and move from there to the study of perception. (Chomsky , p. 168)
The comparison has been noted by others. Slezak thinks that the familiar perceptual phenomena involving Kanizsa illusory contours and the like, where visual percepts are used to investigate perceptual constancies, are just like the intuitions reported on in linguistic judgement. He remarks that:
The two interpretations of the Necker cube known intuitively to a ‘visual virtuouso’ are closely analogous to the two meanings of an ambiguous sentence known as the percepts of the native speaker. (Slezak [unpublished], p. 34)
Longworth has developed the same theme, comparing visual reports with reports of one's intuitive take on linguistic material. He compares ‘quasi-perceptual’ grammatical appearances with the role of perceptual appearances in vision science. Longworth considers visual experiments where subjects are presented with various patterns of printed marks and asked what they can make of those marks, whether some seem closer together than others, and so on. The key point is that the reports that subjects are requested to make are not reports on the properties that they believe the marks to have, for:
One may very well know that the marks are equally well spaced on the page. What one is asked for are reports about how the marks strike one, or how they seem to one, where how they seem to one is typically impervious to how one believes them to be. (Longworth [unpublished], p. 11)
The intuitive reports are reports on one's experience. They serve as mental meter readings.
Are linguistic intuitions the ‘voice of competence’?
Devitt's main bone of contention with this orthodox model is highlighted by the name he gives it: the ‘voice of competence’ view. Devitt thinks that the orthodox model is committed to speakers having direct access to the grammatical properties and principles that organize their grammatical competence. He calls this ‘Cartesian access’, comparing it to the sort of direct access Descartes thought we had to the contents of our own minds. Devitt then wonders why, if linguistic intuitions are the voice of our grammatical competence, we cannot read off the properties of the grammars speakers are competent in from their intuitions. As he puts it, ‘if competence really spoke to us why would it not use its own language and why would it say so little?’ (Devitt [2006a], pp. 100–3). Devitt thinks that if the source of linguistic intuition were our grammatical competence then we should have intuitions that give articulation to the very properties that characterize our competence. But our intuitions do not seem to give articulation to those grammatical properties that are only revealed by the theoretical inquiry into grammar.
Devitt is correct that we do not have the kind of direct awareness of the underlying grammatical properties licensed by our competence system that he thinks the orthodox model appeals to. Fiengo's attitude to the idea that we have such direct awareness seems to me to be representative:
[I]t goes without saying that we have no such awareness. If one is in any doubt, all one need do is reflect on the fact that syntactic proposals for even the simplest sentences are often in debate [For if we had such awareness] much that is debated in Linguistics could be settled by appeal to the intuitions of speakers. We could ask them what the structures of sentences should be, and they could tell us. (Fiengo , p. 258)
But there is a way to answer Devitt's question about why linguistic intuitions do not ‘use their own voice’ and why they ‘say so little’ that is suggested by the orthodox model. Grammatical competence does not ‘use its own voice’ insofar as the properties of the sub-personal competence system are not available to mere personal-level reflection. We have to make a theoretical inference from a speaker's judgements of acceptability and interpretability to the structure of the underlying competence and its place within wider performance systems. The competence ‘says so little’ because grammatical competence is only one factor involved in linguistic judgement that engages systems of linguistic performance and more besides. Grammatical competence is not all that speakers bring to bear on presented strings. This has always been Chomsky's view:
The unacceptable grammatical sentences often cannot be used, for reasons having to do not with grammar, but rather with memory limitations, intonational and stylistic factors, ‘iconic’ elements of discourse (for example, a tendency to place logical subject and object early rather than late) and so on […] we cannot formulate particular rules of grammar in such a way as to exclude them. (Chomsky , p. 11)
We don't know how conscious judgements are derived, or the mechanics of the role the linguistic systems play in issuing in these judgements. Linguists infer that a structured grammatical competence system shapes these intuitions, but are well aware that linguistic intuitions are not an unproblematic reflection of the underlying competence.
In that sense, Devitt is absolutely correct to argue that intuitions are not the ‘voice of competence’. But then he is wrong to claim that this is a commitment of the orthodox model. If grammarians do routinely think that linguistic intuitions are the ‘voice of competence’, then it is apparent that they must think the voice is a very muffled one. The relation between the intuitive judgements and the structure of competence is not transparent; it is a highly theoretical matter to determine what it is.15 As Fodor notes, in offering their intuitive takes on strings, subjects have access only to the upshot of their linguistic systems including the grammatical system and the performance systems. So their intuitive judgements will not give voice to the internal organization of those systems. The internal organization of the competence and performance systems, the yields of the systems taken individually and their manner of interaction will all be ‘completely opaque’ to speakers as they respond to linguistic material (Fodor , p. 60). So there is no reason, on the orthodox model, to expect a speaker's judgements to give ‘voice’ to their competence in the way Devitt suggests. The intuitive judgements target the very broad properties of the acceptability and possible meanings of particular pieces of linguistic material, so do not ‘say’ anything about the deeper, general, and highly intricate properties of the competence system involved in their aetiology. We may conclude, as Fiengo suggests, that the fine-grained grammatical properties are not accessible to conscious intuition, whilst keeping distinct ‘intuitions, which are conscious states, and those processes of which we are unconscious that perhaps underlie our intuitions.’ (Fiengo , p. 257).
Devitt cites a number of passages from linguists and philosophers that he claims support the attribution of a ‘voice of competence’ view (Devitt [2006a], pp. 96–7). There are two things I think worth noting about the textual evidence Devitt adduces. The first is that what scientists say about the nature of their investigation and the actual nature of their investigation might come apart. Perhaps some linguists do mistakenly attribute the evidential significance of the linguistic intuitions they examine to their being the ‘voice of competence’. The second is that it is unclear that any of the textual evidence Devitt cites actually supports the attribution. Space permits me to consider only an illustrative example. Devitt cites Chomsky's claim that in some cases ‘conscious knowledge … follows by computations similar to straight deduction’ from the principles that organize the competence (Chomsky , p. 270). Devitt claims that this is sufficient to attribute to Chomsky the view that linguistic intuitions give voice to the underlying facts about the grammars speakers are competent in. But it is not. In the passage in question, Chomsky is trying to answer Dummett's worry about how unconscious knowledge issues in conscious knowledge. Chomsky argues that aspects of our conscious knowledge of language, such as our knowledge that ‘John’ needn't mean ‘him’ in ‘John shaved him’, can be deduced from our unconscious knowledge of binding theory. These particularized pieces of conscious knowledge are a consequence of the principles of UG. But none of this, Chomsky claims, neither the UG possessed nor the way its consequences are computed, is accessible to the speaker. So it is unclear why Chomsky would think that it might be voiced in their intuitions. At no point does he claim that this conscious knowledge gives voice to grammatical information or the principles of the grammatical competence system.
More generally, the breadth of evidence that grammarians appeal to (indicated in Section 2.3) might itself lead one to doubt the attribution of the ‘voice of competence’ view. If speakers’ linguistic intuitions voiced their competence then why would grammarians be so keen to unearth a range of other evidence in theorizing about the competence system? Moreover, the comparison with the case of visual psychology suggests that there is nothing peculiar going on in the linguistics case that merits the ascription of the ‘voice of competence’ view. Subjects in visual experiments do not give voice to the content and organization of their visual system. We could equally well appeal to the other examples of psychologists investigating subjects’ intuitive responses, cited above, to show that nothing in the employment of intuitions data is suggestive of special access to the properties of an underlying cognitive system.
Are linguistic intuitions and visual reports disanalogous?
Devitt claims that ‘Perceptual judgements are not good analogues of linguistic intuitions.’ (Devitt [2006a], p. 112). Devitt argues that there is an important disanalogy between linguistic intuitions and the perceptual reports drawn upon in vision science. He thinks that what the visual module delivers to the central processor is the impression on which a judgement about what is seen can be formed, whilst what is delivered to the central processor by the linguistic systems is an impression of what is said.16 The important difference, according to Devitt, is that whereas judgements about what is seen are the ones of interest to the vision scientist, judgements of what is said are not, at least on his model, the topic of discussion when considering grammatical intuitions. The grammarian, Devitt claims, is interested in the grammatical properties of expressions and to this end is interested in speakers’ intuitions about grammatical properties. So, Devitt denies that the relevant linguistic intuitions are derived in a way analogous to the way perceptual judgements are derived from the outputs of the visual module. Such intuitions about what is said are not the topic of discussion on his view.
This argument against the orthodox model is unconvincing for two reasons. First, and fundamentally, it is unclear why the only intuitive materials made available to judgement by the linguistic systems are intuitions about what is said rather than intuitions about the acceptability of linguistic forms and their possible structural interpretations. The examples I’ve considered in outlining the orthodox model are all suggestive of a contrary view. Speakers have a sense of broad structural properties of pieces of language, as evidenced by examples (4), (5), (8), (13), and (14), and which are the acceptable forms, as evidenced by examples (1), (2), (3), (10), and (11).
Second, as Devitt himself has pointed out, intuitions about what is said are of interest to the grammarian. These intuitions are revealing of linguistic form, because linguistic form acts as a constraint on speakers’ understanding of what is said. Though these intuitions of what is said are informed by more besides, in particular by semantic and pragmatic information, linguistic structure is an important determinant. The sorts of intuitions drawn upon by grammarians and pragmatists are not sharply discontinuous. The speaker has intuitions about what is said on a given occasion that are partially determined by his immediate recognition of the structure of the expressions of his language. Devitt does not think that such intuitions are the topic of the theory of grammatical intuitions because they do not involve speakers’ having intuitions about theoretical properties like c-command and binding, in the sense of making explicit mention of these theoretical properties. But, on the orthodox view, if the intuitions are being used as evidence about the internal structure of grammatical competence and how competence is organized in terms of such properties, then the intuitions are of obvious relevance though they do not involve speakers’ overtly considering such theoretical properties. The informant need have no way of describing sentences; he need only associate various first-order meanings with sentences. As structure constrains interpretation, it is then a theoretical matter to determine what reflects grammatical competence as opposed to other competences and performance factors. The structure of the grammatical competence that is targeted on the orthodox model is part of the explanation of such comprehension of what is said. That is why judgements about what is said, associations of first-order meanings without metalinguistic categorizations, are important to grammatical theory and determining the contours of speakers’ languages.
Devitt's Model: Linguistic Intuitions as Theory-laden Judgements
Devitt claims that by ‘linguistic intuitions’ linguists mean ‘fairly immediate unreflective judgements about the syntactic and semantic properties of linguistic expressions, metalinguistic judgements about acceptability, grammaticality, ambiguity, coreference/binding, and the like.’ (Devitt [2006a], p. 95). And Devitt's model for these linguistic intuitions is that they are ‘opinions resulting from ordinary empirical investigation, theory-laden in the way all such opinions are.’ (Devitt [2006a], p. 98). On Devitt's view, speakers’ linguistic intuitions are central processor responses to linguistic phenomena, like utterances. Linguistic intuitions, as he claims of intuitions generally, differ from other empirical, central processor responses in being fairly quick and unreflective.
Devitt lists as the third major conclusion of his book, ‘Speakers’ linguistic intuitions do not reflect information supplied by the language faculty. They are immediate and fairly unreflective empirical central-processor responses to linguistic phenomena. They are not the main evidence for grammars.’ (see the glossary of Devitt [2006a]). Devitt argues that speakers’ linguistic intuitions are not the upshot of a dedicated system of grammatical competence interacting with linguistic performance systems. Rather, on his account, linguistic intuitions are fairly unreflective or ‘low level’, theory-laden judgements about the grammatical properties of languages. They are ‘low level’ in that speakers do not typically enter into much serious reflection upon the properties of their language or have knowledge of any scientific linguistics. They are theory-laden in the sense that they involve central processing, or general intelligence, in working out the properties of external linguistic stimuli, albeit relatively immediately. A consequence of Devitt's model is that we should trust a speaker's intuitions ‘to the degree that we have confidence in her empirically based expertise about the kinds under investigation.’ (Devitt [2006a], p. 104).
Devitt asks us to consider a comparison with a palaeontologist. The palaeontologist might be in the field searching for fossils. If she notices a piece of white stone protruding from a grey rock, she might form an immediate and unreflective judgement that the white stone is a piece of a pig's jawbone. We might trust the palaeontologist's judgement much more than we would trust an ordinary observer's judgement if the palaeontologist has spent years in the field studying, and so has a great deal of experience of old bones. In short she is ‘a reliable indicator of the properties of fossils.’ (Devitt [2006a], p. 104).
On Devitt's view, this account of theory-laden intuitive judgements ‘does not need to be modified’ where we are investigating the products of cognitive systems (Devitt [2006a], p. 106). Just as a palaeontologist reflects on the old bones and other objects they have been surrounded with, so a speaker might reflect on the language they and their speech community produce, and form linguistic concepts and opinions (which is not to say that they will become an expert). In virtue of producing and being surrounded by many utterances, speakers are ‘in a position to have well-based opinions about language by reflecting on these tokens.’ (Devitt [2006a], p. 109). Though they may not in fact reflect on the nature of their language, Devitt claims that such ‘intuitive opinions’ as they do have are empirical central-processor responses, the result of ‘education and reflection’. Hence, Devitt rejects the theoretical inference from the character of our linguistic intuitions to a competence system organized according to grammatical principles. He claims that his own explanation is more ‘modest’ in appealing only to the generation of intuitive judgements by central processing. And he points out that everyone should be committed to the existence of central processing and its role in forming judgements.
Devitt's model and belief-independence
But Devitt's view that linguistic intuitions are theory-laden judgements, derivative of central processor responses to external stimuli, is inconsistent with the pre-doxastic nature of linguistic intuition and its encapsulation. The view that linguistic intuitions are amongst our theoretically integrated judgements seems unable to accommodate the persistence of impressions of grammaticality and ungrammaticality through contrary beliefs. So Devitt would have to try and explain these phenomena away somehow. Devitt would also have to explain why these intuitive impressions seem to be more than just ‘relatively unreflective’. They seem to be mandatory. We can't help but hear the sounds of our language as structured and meaningful, forming an intuitive take on their form and interpretation independently of our choosing to reflect upon them.17 Further, our linguistic intuitions evidence special hierarchical and recursive principles that are highly language-specific. Devitt's view that these intuitions are central processor responses would have to accommodate these facts and compete with explanations that appeal to a dedicated competence system.18 I’ll argue (Sections 4.3–4.5) that further failures of Devitt's account reinforce the orthodox inference to the best explanation to the properties of the dedicated grammatical competence system.
Devitt's model and folk theory
Devitt's view of intuitions is ‘based on a view of intuitions in general’ (Devitt [2006a], p. 10): that they are conditioned by empirical theory. Intuitions, on Devitt's model, generally differ from other such theory-laden judgements ‘only in being fairly immediate and unreflective.’ (Devitt [2006a], p. 10). Consequently, for Devitt, the grammarian is a good source of intuitions because he has spent a lot of time reflecting on language and has more theoretical knowledge:
If the person is a linguist then she will of course deploy her concepts from her linguistic theory […] I think we should generally prefer the intuitions of linguists to those of the folk in seeking evidence. (Devitt [2006b], fn. 22)19
Ordinary informants are not such good sources of data, on Devitt's model, because they don't possess scientific theories involving concepts like c-command and binding; perhaps having only a little knowledge of verbs, nouns, and the like. This is in stark contrast to the orthodox model, according to which speakers are not being asked for their opinions about such properties of linguistic material at all. They are being asked only to respond to linguistic material in terms of such broad categories as how acceptable and intelligible they find it, which interpretations of it they come to, and how difficult it is for them to achieve certain readings. On the orthodox model, the intuitions gathered by linguists are just data to be explained rather than assessed according to their credentials as bits of theory or opinion. In contrast, for Devitt, a speaker's linguistic intuitions are amongst their theory-laden judgements and these intuitive judgements constitute a less powerful theory than the linguist’s.
To explain how ordinary speakers’ theory-laden judgements could count as evidence for the science of language, Devitt once drew an analogy with our intuitions about physical reality:
Just as physical intuitions […] can be produced by central processor responses to appropriate phenomena, so also can linguistic intuitions. These linguistic phenomena are not to be discovered by looking inward at our own competence but by looking outward at the social role that symbols play in our lives. When linguists do this now, they do not start from scratch. People have been thinking about these matters for millennia. The result of this central processor activity is folk, or otherwise primitive theory: the linguistic wisdom of the ages. The wisdom will be a good albeit not infallible guide to the nature of linguistic symbols. (Devitt and Sterelny , p. 522)
The analogy is unhelpful for Devitt. If it were good then our ordinary beliefs about physical reality could play an important evidential role in physics. But in physics one does not expect the folk's opinions to inform scientific theory, and there is no reason to assume that the concepts and constructs of ordinary thinkers carry over to scientific debate. Equally, there's no reason to expect folk opinion to constrain linguistics. As Neil Smith remarks:
In physics one does not expect folk views to inform the expert's theory construction, and while ethnoscience is itself an interesting field of inquiry, there is no reason to assume a priori that the concepts and constructs of pre-scientific debate should carry over unchanged into formal theories of I-language. (Smith , p. xv)
It is therefore a consequence of Devitt's model of linguistic intuitions that native speakers’ intuitions should not be afforded the central evidential role that they are afforded.
Devitt might try to soften this result by maintaining that speakers’ theory-laden judgements about grammar are largely correct. But if this were true then much that is debated by grammarians could be settled by appeal to speakers’ intuitions, requiring little scientific theorizing. Despite Devitt's commitment to ‘the linguistic wisdom of the ages’, he recognizes that his model would require some revision of existing methodology:
Where the judgements are those of the ordinary speaker, the theory will be folk linguistics. We do not generally take theory-laden folk judgements as primary data for a scientific theory. So we should not do so in linguistics. (Devitt [2006a], p. 102)
As Devitt agrees, it would be irresponsible to attribute so much significance to the folk's theory-laden judgements.
A modification to Devitt's model
Devitt has modified his view that linguistic intuitions are like theory-laden judgements about other aspects of the world in two ways. First, he now stresses that they are most comparable to intuitions we have about the outputs of other human competences such as ‘touch-typing and thinking’ (Devitt [2006a], pp. 593–4). Second, he now allows a role for grammatical competence in linguistic intuition. On Devitt's modified model, a speaker asked about a string of words first simulates the behaviour of attempting to produce or comprehend the string, and in doing so engages their grammatical competence. There is then some quick central processor reflection upon this experience in which speakers employ their theoretical grammatical concepts to arrive at a judgement.
But even this modified version of Devitt's model is inadequate. Smith brings out the problem that remains with Devitt's model using the following example (Smith [unpublished], p. 37):
(15) Bill believes that Bush is dangerous.
(16) Bill believes Bush is dangerous.
If we were to ask a speaker, presented with cases like (15) and (16), to do some quick reflection and say whether they believed the ‘that’ in sentences containing ‘believes’ is optional, they would probably say that it was entirely optional. But it is clear that this reflection is not what grammarians are targeting in probing a speaker's linguistic intuitions. When we elicit speakers’ intuitive responses to strings like (17), we get an intuitive judgement that reveals their grammar but is indifferent to such central processor, theory-laden judgements.
(17) *Bill believes that Hilary to be intelligent.
It is the language that speakers are immediately cognitively sensitive to, data of the latter sort, and not the irrelevant theory-laden reflections that the generative grammarian is targeting.20 In plumbing the speaker's intuitions, we want to find out what the speaker can immediately recognize as part of his language, and what structured interpretations he can get. As Longworth puts it:
The subject may be especially well placed to report on how things seem to them, but should not be taken to be authoritative about whether apparent properties are determined by their language systems […] In short, the linguist for the most part aims to treat subjects as objects of inquiry, rather than fellow inquirers. (Longworth [unpublished], p. 11)
It is Devitt's commitment, not shared by the orthodox model, that linguistic intuitions are a speaker's theory-laden reflections on grammatical matters that causes much of his consternation with the orthodox view. He asks us to compare the linguistic case to other cases of cognitive capacities where a set of rules is somehow encoded in us such as thinking and typing. As Devitt rightly points out, there is no path from the embodiment of these rules in a subject to that subject's having correct beliefs about these rules (Devitt [2006a], p. 118). Devitt correctly infers that there is no such path from the grammatical rules encoded in the competence system to theory-laden beliefs expressed in linguistic judgements. But as Longworth rightly notes, the orthodox model does not treat speakers as making authoritative theory-laden judgements about grammar because it does not treat them as theoreticians, the grammarian's ‘fellow inquirers’, at all. The commitment to a theory-laden conception of linguistic intuition is Devitt's own: not one the orthodox model shares. To be clear, proponents of the orthodox model should agree with Devitt that there is no path from the encoding of the deep principles of grammatical competence in a speaker to their having correct beliefs about those rules. Correct beliefs about the principles of grammatical competence are what grammatical theory aims for, not what speakers are taken to provide for the grammarian. Therefore, the orthodox view creates no mysterious access to the principles that characterize speakers’ competences.
Devitt's alternative view of the evidence
Ultimately, Devitt's own model of linguistic intuitions leads him to the following conclusion: ‘we do not generally take theory-laden folk judgements as primary data for a scientific theory. So we should not do so in linguistics’ (Devitt [2006a], p. 102). Devitt thinks that ‘Linguists greatly exaggerate the evidential role of the intuitive judgements of ordinary speakers.’ (Devitt [2006a], p. 120). He argues that we should not give the linguistic judgements of ordinary native speakers a central evidential role in grammatical theory. Rather he claims we should seek evidence primarily from a combination of corpuses, what speakers would say and understand in linguistic contexts, and the intuitions of linguists. He says:
The main evidence for grammars is not found in the intuitions of ordinary speakers but rather in a combination of the corpus, the evidence of what we would say and understand, and the intuitions of linguists. (Devitt [2006a], p. 100)21
On my view the orthodox model is left unscathed by Devitt's criticisms and it does not require revision of linguists’ methodology. But it is worth considering the alternative conception of the evidence that Devitt is proposing as I will argue that it has some unwelcome consequences.
Although linguists, like other scientists, have theoretical hunches (a sense of the sort of explanation certain phenomena might receive) this is not what they are interested in when probing their native knowledge of language. Theoretical hunches, whatever role they do play in theory construction, are not treated as evidence. As Fiengo notes:
[W]e say, perhaps of a linguist, that the linguist has the theoretical intuition that that is the analysis which should be given of the sentence in question. The term ‘intuition’, in this case, has a sense rather like that of ‘hunch’. Linguists say they have such intuitions or hunches, but they never constitute the data of Linguistics, rather they apparently occur among linguists during the practice of Linguistics, as they do among physicists during the practice of physics […] And on the other hand, my intuition that the sentence ‘Flying planes can be dangerous’ is ambiguous is nothing like a hunch. (Fiengo , p. 256)
So, the ‘hunches’ or theoretical intuitions of linguists will not form a central source of evidence and such hunches pick out a different phenomenon from their apprehension of the properties of their native language. As theorists’ intuitions of this sort do not constitute part of the evidence, I would claim, against Devitt, that this leaves him with corpuses and what we would say and understand upon presentation of linguistic material as the main sources of evidence.
Moreover, an investigation of the ‘evidence of what we would say and understand’ upon presentation of linguistic material is part of the linguistic intuitions evidence on the orthodox model. As I argued in Section 3, gathering evidence about what speakers would say or understand upon the presentation of linguistic material just is part of probing their linguistic intuitions, via their comprehension of speech. This is what linguists count as the linguistic intuitions of ordinary native speakers (see Sections 2 and 3). Hence, again, contrary to Devitt's view, I would argue that these cannot be part of his package for the main evidence for grammars if that package is to exclude what linguists normally call the linguistic intuition or intuitive judgement of ordinary speakers. Though Devitt would not class this source of evidence as the relevant linguistic intuitions, and would deny they are the main topic of discussion, I have argued that he is wrong on this point. Hence, I would suggest, against Devitt, that we are not here dealing with a proposed combination of evidence but instead one main source of evidence, since the considerations I’ve raised eliminate the two other sources from his combination. Note that this is not the view that Devitt propounds but the view he is left with, given my claims about the nature of linguistic intuition. So I’m going to focus on the remaining part of the evidential package that Devitt suggests: namely, that the corpus should be a central source of evidence for generative grammars.
If corpuses were the main evidence for grammars, this would re-orientate the grammarian's attention away from speakers’ proclivities and towards the properties of external outputs as the locus of grammatical inquiry. But there are major problems if one endorses the view that the corpus can play the central role in grammatical theory that has been played by the intuitions data.
The problems arise from focusing narrowly on performance events in order to determine the structures of a speaker's language. It is because of such problems that speakers’ intuitive judgements have been considered crucial data for generative grammars. Chomsky claims that ‘Linguistics is characterised by attention to certain kinds of evidence … largely, the judgements of native speakers.’ (Chomsky , p. 36). He thinks we cannot determine the grammatical structure of languages on the basis of gathering corpuses of performance events. Why does Chomsky think that the intuitions of speakers are so central to investigating the grammars of languages?
Rather than drawing on corpus data in isolation from speakers’ intuitions, corpus data are employed as part of a complementary package with speakers’ intuitions. Depending on the hypothesis that one is investigating, one might, for example, want to examine a corpus to see if a certain construction ever occurs. Or, in investigating language acquisition, one might want evidence about what is available to children in their primary linguistic data and what sorts of mistakes children make in acquiring their target grammar. As Collins suggests:
For example, take the hypothesis that children don't make ‘errors’ of a certain kind, say, ‘Children don't move auxiliary verbs from relative clauses in the attempt to form interrogatives’ […] one can look at databases of child speech to test this. One can also look at adult speech to see how common certain constructions are, or whether children receive ‘negative data’. (Collins , p. 7)
But it is crucial to hypotheses about grammatical structure that one uses pieces of a corpus in tandem with linguistic judgements, so as to work out how the expressions are actually structured rather than simply whether certain strings occur or not. The mere occurrence of an expression by itself doesn't tell you about its grammatical properties. If one wants to know how it is structured, that requires speakers making judgements about its acceptability and interpretation. To this end, the grammarian's use of corpuses involves him, or his native informant, making intuitive judgements too. Any model that ultimately suggests a preponderant role for corpus data over speakers’ intuitions is mistaken.
Devitt rightly points out that linguistics textbooks are full of sample strings described as unacceptable, ambiguous, and so forth. These notions have a cognitive ring to them but Devitt thinks they are best interpreted as marking properties of the written strings. When linguists say that English-speakers have the intuition that a string is ambiguous, Devitt thinks that this is being employed as evidence that the string has the property of structural ambiguity. But as ordinary speakers are a fallible guide to grammatical properties on Devitt's model, one might think that we should go straight to the corpus: straight to the uttered and written strings, as the primary focus of grammatical inquiry, and survey their properties rather than the properties of speakers’ intuitions. On the orthodox view, there is much more to be said for the importance of linguistic intuitions.
A corpus is a list of strings that have been uttered or written down.22 This description of corpuses might be challenged on the grounds that it is unnecessarily austere. Perhaps we should think of corpuses as imbued with all sorts of other interesting information about grammatical structure. But it is difficult to see how linguists could so imbue corpuses without drawing on evidence from the judgements of native speakers. Grammatical structures involve special hierarchical dependencies amongst constituents. We can't determine these special structures of speakers’ languages simply by enumerating the strings that they produce, where the latter are flat lists of words. This is one reason why intuitive judgements are so important in gathering evidence for generative grammars: because they can be used to determine the way that speakers structure linguistic material. Consider (18) and (19) (the example is from Collins , p. 8).
(18) Mary expected to leave by herself.
(19) Bill wondered who Mary expected to leave by herself.
The individual who is leaving can differ between the cases, as we can tell by the acceptability of substituting ‘himself’ for ‘herself’ in (19) but not in (18). But the fact that the material shared by the two strings has different structural articulation in (18) and (19) is not obvious from looking at the strings themselves without bringing such judgements about their interpretation to bear. From looking at this mini-corpus, one might think that (18) is simply embedded as the wh-complement in (19) and retains its structure. Examining a corpus may not suffice for determining the difference in the structures. The difference in structure between the two occurrences of the shared material is usually explained in terms of the difference in the empty categories, where a copy of Mary is the subject of the infinitival clause in (18), whilst a copy of who fills that position in (19). Linguists determined this by investigating the different interpretations speakers give of these strings.
And crucially, an uttered string that occurs in a corpus may have been an ‘error’, an ungrammatical utterance. If the linguist were to count these strings as part of the language then they would be counting in too much in constructing their grammar. So linguists and their informants make judgements to discern amongst the produced utterances. As Stanley observes: ‘Ordinary discourse often involves the use of complex expressions which would be counted as ungrammatical even by the utterer's own lights.’ (Stanley , p. 408). Corpora contain no explicit information about which are the ungrammatical utterances, and such information is crucial to developing grammatical theories. The same is true in principle of written strings. We might have reason to think that people are more careful about what they write than what they say. Yet we don't want them to be too careful. There are lots of things that speakers easily recognize as linguistic forms but wouldn't write down. Contractions of ‘want to’ to ‘wanna’ may be one such example. We would miss these forms if we relied solely on written corpus data. So corpuses, in the absence of intuitive judgements, are not a perspicuous guide to acceptability.
The point is that a corpus contains a great deal of what is, for the purposes of investigating speakers’ grammars, ‘linguistic debris’. This includes ungrammatical but interpretable utterances, false starts, mistakes, slips of the tongue, half-expressed thoughts, unfinished sentences, interruptions, and utterances affected by deficits in memory, attention and motivation. So corpuses taken in isolation don't provide a perspicuous guide to the linguistic forms that speakers recognize or the construal that they put upon them. Chomsky's suggestion is that getting some explanatory perspective on such a record of performance events requires not only the further evidence of speakers making judgements but also a distinction between speakers’ grammatical competence and the other factors that enter into these linguistic performances.
The utterances and inscriptions that make up corpora are performance events: interaction effects amongst which only one factor is grammar. I have argued that Devitt's claims about grammatical intuitions have a consequence which he neither desires nor anticipates. Whereas he thinks that a combination of corpuses, theoretical intuitions, and what we would say and understand constitutes the evidence independent of grammatical intuitions, I have argued that theoretical intuitions are not part of the evidence and that evidence from what we would say and understand is the relevant intuitions evidence. Contrary to the view Devitt espouses, these arguments remove these sources of evidence from his proposed alternative combination. His alternative will then run up against the following problem. Generative grammarians look to separate out the different factors that contribute to performance events rather than study the properties of the corpus as they result from the motley. That is the point of the theoretical distinction between competence and performance: to try to determine the grammar of the language that a speaker has actually acquired. The broad target is the system responsible for our sensitivity to linguistic form. This system can be explored by probing speakers’ intuitive responses to linguistic material as described in my account of the orthodox model.
It would be a waste of time to wait until the strings of theoretical interest happened to just turn up in written or spoken corpuses when the linguist can construct them himself or enlist the judgements of a native informant when the language is not his own. The crucial cases to test a theory may not be in the corpus. Chomsky recognized these problems with relying on corpuses in the absence of judgements early on, saying:
A corpus may contain examples of deviant or ungrammatical sentences, and any rational linguist will recognise the problem and try to assign to observed examples their proper status […] insofar as a corpus is used as a source of illustrative examples, we rely on the same intuitive judgements to select examples as we do in devising relevant examples with the aid of an informant or ourselves. (Chomsky , pp. 198–9)
Further, an uttered or written string, like ‘Flying planes can be dangerous’, that turns up in a corpus, may instantiate more than one grammatical structure of a language. Sentences are structured objects but ‘do not wear their structures on their sleeves, so it can easily happen that distinct structures sound the same.’ (Fiengo , p. 255). In such cases, the linguist limited to inspecting corpuses may miss out on structures that are part of the language.
To summarize the major problems with Devitt's model of linguistic intuition: (i) It is unable to accommodate the pre-doxastic nature of linguistic intuition (Section 4.2); (ii) Such folk theory-laden opinions would not be afforded an evidential role in a science (Section 4.3); (iii) In practice, linguists do not target speakers’ reflective judgements (Section 4.4); (iv) The model has the (perhaps unnoticed) consequence that a central evidential role should be afforded to corpus data, which it cannot bear. Corpus data works as a complementary package with intuitive judgements (Section 4.5).
In contrast, if a major source of evidence for generative grammars is native speakers’ pre-doxastic linguistic intuition, then a major source of evidence bears on the grammars that speakers have internalized in a system of grammatical competence. The orthodox model of linguistic intuitions and the evidential role they play looks in good shape.
I am very grateful to two anonymous referees for this journal who helped me make significant improvements to an earlier version of the paper. My special thanks go to John Collins, Guy Longworth, and Simon Riches for extensive discussion, and to Barry Smith for drawing my attention to many important examples of speakers’ intuitive sense of linguistic structure. My thanks also go to Richard Breheny, Craig French, Mark Kalderon, Will McNeill, Gabe Segal, Lee Walters, and the audience at the London Philosophy of Linguistics Seminar Series.