Unit 2: Research methods in pragmatics

Studying what speakers mean, and what hearers understand, requires us ideally to see into people’s minds. Since this is not possible, we have to take a slightly indirect approach to the subject. The question that we raised in Unit A2 was what counts as reliable evidence for our claims.

There are many kinds of data that can be used as evidence for claims about pragmatics, including what people say in natural situations, how other people react to what they say (participant response), and what people claim they would say in hypothetical situations. Observing natural conversation is subject to the ‘observer’s paradox’, which is that people do not always behave naturally when they are being observed. Nevertheless, it is possible to minimize this effect in several ways.

First, we can simply attempt to ‘overhear’ as discreetly as possible what is going on around us – in the home, in shops, at school or on public transport, and note down what we have heard. It has to be remembered, however, that the researcher may not be able to overhear a wide variety of voices, and any collected data will necessarily be limited to the kind of contacts that he or she has. It has also been found that the researcher’s memory of what has been said is not always accurate.

For greater verbatim accuracy we can set up recording sessions, preferably in environments familiar to the participants, and hope that they will soon forget about the recording and converse naturally. These conversations can then be transcribed and stored electronically as a corpus, and later the transcriptions can be searched using special software. Recording in the speakers’ natural environment, however, often involves extraneous noise such as clattering cups and plates, interruptions such as telephones, and overlapping speech that is difficult to understand. Here, too, it is likely that the people willing to be recorded are members of the researcher’s circle of family and friends, and as such are not necessarily representative of the wider community.

Finally, we can observe conversations in the broadcast media – radio and television, both ‘real’ conversations, such as interviews, and fictional conversations, such as television or radio drama. These have the advantage of being of good technical quality, and as they are already in the public domain there is no need to ask for permission from the participants to use them. They have the disadvantage that the speakers are not conducting the kind of private conversations that might be of interest, but are by definition performing for a wider audience in addition to responding to their interlocutors.

Here we provide you with practice in three kinds of data analysis:

  • using corpus data: analysing a concordance and extracting lexical data from a corpus
  • observing speakers in action
  • using fictional conversation (TV broadcasts)

The purpose of each exercise is twofold: first of all to gain practice in the particular method of data collection and secondly, and very importantly, to begin to understand the complexities of working with naturally occurring data. ‘Naturally occurring’ data is the kind that is not created especially for the researcher, in contrast for example to data elicited in Discourse Completion Tasks, or questionnaires, about what you would say in a hypothetical situation. Such elicited data is also valuable, but these exercises will focus on the other kind – real language in use.

2.1. Using corpus data

2.1.1. Investigating a mini concordance

Concordance programmes allow the researcher to search a large electronic corpus of text automatically, and extract all the examples of a given word, phrase or other sequence. The British National Corpus (BNC) contains 100 million words of English. The spoken part constitutes 10 percent of the corpus and therefore contains 10 million words of transcribed speech, including informal conversations between many different speakers who represent a variety of ages, regions and social classes. It also contains spoken language in a range of different contexts such as business meetings and radio phone-ins. If we search for very common words or strings of words in such a large corpus we are likely to find hundreds or thousands of examples, far too many to study in detail. In such cases we sometimes take a manageable subset of occurrences, selecting a number at random, and focus on these.
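The idea of extracting every example of a phrase and then studying a random subset can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular concordance programme works: the function names `concordance` and `random_subset` and the toy text are invented for the example, and a real corpus would be far larger and would come with speaker and context information.

```python
import random
import re

def concordance(text, phrase, width=30):
    """Return one keyword-in-context line for every occurrence of phrase."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters on either side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the search phrase lines up.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

def random_subset(hits, n, seed=None):
    """Pick n hits at random, or all of them if there are fewer than n."""
    rng = random.Random(seed)
    return rng.sample(hits, min(n, len(hits)))

# Toy stand-in for a corpus (invented for this sketch).
sample_text = (
    "What a mess. I know what a skipper does. "
    "What a pity! She asked what a concordance is. What a relief."
)

hits = concordance(sample_text, "what a")
for line in random_subset(hits, 3, seed=1):
    print(line)
```

Fixing the `seed` makes the random selection repeatable, which is useful if two people want to analyse the same subset.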


Here is a random selection of 28 instances of ‘what a’ (from a total of 4,543 occurrences in the BNC). They are not all used in the same way.

  • Group together those that seem to have the same or a similar function.
  1. What a coincidence, I’m going up there too,’ Tumbleweed told us.
  2. As the English might say: what a load of bollocks!
  3. What a relief this will be.
  4. He joined us from a Derbyshire side near his native Tibshelf and made his Southern League debut at Bristol Rovers on 7 December — and what a debut it turned out to be!
  5. What a fellow member of its same species sees of it, is quite different from what the individual feels about itself.
  6. What a lovely surprise.’
  7. The question of diet generally for elderly people has already been mentioned in Chapter 3, and all who are trying to ensure that an elderly parent (particularly one who is living alone) is eating sensibly know what a difficult problem this is.
  8. They’d know what a baby he was.
  9. So (v) what a person says, using the first person singular, present tense, of a psychological verb, is true or false precisely in so far as it is an expression of what he has inwardly observed.
  10. In this way, a system is expected to evolve in which teachers, employers, further and higher education, as well as pupils, can have confidence in what a given award actually means: it will indicate, clearly, what a pupil has been able to achieve.
  11. Different cultures have had different views of what a healthy diet is.
  12. What a mess.’
  13. What a pity!’ he sighed.
  14. What a horrible idea!
  15. What a waste of time.
  16. ‘I know what a delivery skipper does, Miss Levington,’ Nathan Bryce said curtly.
  17. She realised what a relief it was to have formulated that simple statement and how, having done so, the power of it to horrify her was already lessening.
  18. What a terrible experience for him.’
  19. The method works by making explicit the connections between the strategy of reading by remembering what a word looks like and the strategy of reading by analysing the sounds in words.
  20. What a lovely thought!’
  21. What a bad night.
  22. Now I'd often pushed my nose against the window and thought what a classy joint it looked.
  23. He was telling them you know what a good club it could be.
  24. My first reaction was what you know, what a gorgeous little boy.
  25. What a laugh though!
  26. What a good boy.
  27. Yeah I thought what a fool, I mean what did you expect me to think?
  28. I’m still not entirely clear what a microprocessor is.

2.1.2. Collecting corpus data: a search for examples of ‘well’

In Unit A7 you can read about pragmatic markers, which include little words like ‘well’ and ‘so’. These are used very frequently in English conversation but their meaning is not so easy to explain. ‘Well’, for example, can function as an adverb (‘you did that very well’) and also as a pragmatic marker, and may form the preface to an utterance (‘Well in that case, I don’t think I will.’). The meaning of these little words can only be understood in context, and so it is important to practise looking at different examples to see how they are being used in each case. As we explained above, a large corpus of conversation will contain far too many examples to analyse, and so it is better to use a search method that selects a small number at random.


If you search for all the examples of ‘well’ in the BNC, you will find 142,354 occurrences. To limit the number of examples (known as ‘hits’), enter ‘well’ as your search term in this web-based server for the BNC: http://www.natcorp.ox.ac.uk/. This will bring up 50 random occurrences of ‘well’ from the total of 142,354.

  • Determine how many of your hits are used as pragmatic markers.
  • In what other way(s) is ‘well’ used?
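If you have saved your hits as plain text, a very crude pre-sort can help you organize them before you read each one in context. The sketch below assumes, purely for illustration, that a hit beginning with the word is a candidate pragmatic marker and that everything else has to be checked by hand. The function name `presort_well` is invented, and the rule will certainly misfile some cases, so it is a starting point and not a classification.

```python
import re

def presort_well(hit):
    """Crude pre-sort: flag hits that begin with 'well' as candidate markers."""
    # Allow optional leading spaces and an opening quotation mark.
    if re.match(r"\s*[‘'\"]?well\b", hit, re.IGNORECASE):
        return "candidate marker"
    return "check by hand"

# The two example uses of the word quoted in the text above.
hits = [
    "Well in that case, I don't think I will.",
    "You did that very well.",
]
for hit in hits:
    print(presort_well(hit), "|", hit)
```

Every hit still needs to be read in context, because position alone cannot tell you how the word is being used.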


  • Baker, P. (2006) Using Corpora in Discourse Analysis. London and New York: Continuum (especially the chapter on ‘Concordances’, pp71–93, which includes a useful ‘Step-by-step guide to concordance analysis’, p92)
  • Biber, D., S. Conrad and R. Reppen (1998) Corpus Linguistics: investigating language structure and use. Cambridge: Cambridge University Press (especially the ‘Introduction: goals and methods of the corpus-based approach’ pp1–18, and ‘Investigating lexicographic issues’ pp21–54)

2.2. Collecting observational data: speech acts as a case study

To find out what people say when apologizing, thanking, greeting or requesting you could search for certain words in a corpus, such as ‘thank you’, ‘hello’, ‘sorry’ and ‘please’. However, this search would only identify some of the cases in the corpus – there may be other examples of performing the same speech acts using different words. How do we find out all the possible ways of apologizing or thanking? One way is simply to observe as many people as possible performing a specific speech act and make a note of how they do it. Once you have a list of words or phrases you can return to the corpus and find out how common the different ways are, who uses them and in which situations.
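The last step, returning to a corpus with a list of observed expressions to see how common each one is, might look something like the following sketch. The list of thanking expressions and the four utterances are invented for the example (only ‘Thanks mate’ matches a note shown later in this unit), and `count_expressions` is a hypothetical helper, not part of any corpus tool.

```python
import re
from collections import Counter

def count_expressions(utterances, expressions):
    """Count whole-word occurrences of each expression across the utterances."""
    counts = Counter({expression: 0 for expression in expressions})
    for utterance in utterances:
        for expression in expressions:
            pattern = r"\b" + re.escape(expression) + r"\b"
            counts[expression] += len(
                re.findall(pattern, utterance, re.IGNORECASE)
            )
    return counts

# Expressions noted in the field (hypothetical list).
observed_thanks = ["thank you", "thanks", "cheers", "ta"]

# Toy stand-in for the spoken part of a corpus (invented utterances).
utterances = [
    "Thanks mate",
    "Oh thank you so much",
    "Cheers, see you later",
    "Thanks, that's great",
]

counts = count_expressions(utterances, observed_thanks)
for expression, n in counts.most_common():
    print(expression, n)
```

A count of this kind only tells you how often a string occurs. To find out who uses an expression and in which situations, and whether each occurrence really performs the speech act, you still need to look at the individual examples.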

Observing people ‘in the field’, as it is called, is not as easy as it sounds. It can be very time-consuming, and you have to take care not to be too intrusive. Following someone around with a notebook is not likely to get very good results. On the other hand, if you just rely on hearing examples as you go about your normal business, you may not collect enough to analyse successfully. An alternative is to choose a particular place where you expect people to thank, greet or request – it could be a railway ticket office, a small shop, a travel agent or an information desk. You may be able to collect more examples if you stand nearby for a while, although the fixed situation will mean that some expressions will not occur at all. In this kind of research you should always aim to collect the best data possible, while balancing that aim against what is practical, legal and ethical.


Choose one of the following speech acts:

  • Apologizing
  • Thanking
  • Greeting
  • Requesting

Decide on the best place to collect your data, then take a notebook with you for a day or two and listen to what is being said around you. Make a note of the examples you hear, if possible the exact words, and, as far as you can tell, the context in which they were uttered. Persuade a friend to do the same, so that you can compare results. If you live in a country where English is not spoken, you should do the exercise in the local language. What matters here is not to find out about English, but to find out about the research method.

Your notes (if you are observing English) might look like this:

(Apology) I’m so sorry (woman, ca 40–50 yrs, to other woman, ca 20–30 yrs, after bumping into her in a supermarket)
(Request) Single to Manchester please (railway ticket office – man ca 50 yrs to salesperson)
(Greeting) Hi (young woman to other young woman without stopping to talk. Spoken in a high pitch and as if on two different musical notes)
(Thanks) Thanks mate (young man in paper shop – spoken quickly to shopkeeper after buying a newspaper)

When you have collected what you can, compare notes with your friend. How many different expressions did you hear? Are there any that you know are in common use but did not occur in your data? How useful do you think this exercise would be if you wanted to find out as many ways as possible of expressing the different speech acts? Do you think your way of collecting the examples led to a particular selection while excluding others? Would this be a good method for language learners to find out how to say these things?
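If you and your friend type up your notes, the comparison can be made a little more systematic. The sketch below is one possible way of doing it, under the assumption that each note records the speech act, the exact words and the context. The `Observation` record and the friend's notes are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    speech_act: str   # e.g. "Apology", "Thanks"
    expression: str   # the exact words heard
    context: str      # who said it to whom, and where

def expressions_by_act(observations):
    """Group the distinct expressions heard under each speech act."""
    grouped = defaultdict(set)
    for obs in observations:
        grouped[obs.speech_act].add(obs.expression.lower())
    return grouped

my_notes = [
    Observation("Apology", "I'm so sorry", "supermarket, after bumping into someone"),
    Observation("Thanks", "Thanks mate", "paper shop, after buying a newspaper"),
    Observation("Greeting", "Hi", "young woman to another, without stopping to talk"),
]
# Invented notes standing in for a friend's observations.
friend_notes = [
    Observation("Thanks", "Thanks mate", "bus, passenger to driver"),
    Observation("Thanks", "Cheers", "cafe, customer to waiter"),
    Observation("Greeting", "Hi", "office corridor"),
]

mine = expressions_by_act(my_notes)
theirs = expressions_by_act(friend_notes)
for act in sorted(set(mine) | set(theirs)):
    print(act)
    print("  heard by both:", sorted(mine[act] & theirs[act]))
    print("  only in my notes:", sorted(mine[act] - theirs[act]))
    print("  only in friend's notes:", sorted(theirs[act] - mine[act]))
```

Expressions that appear in only one set of notes are a useful prompt for asking whether the place or method of collection favoured some expressions and excluded others.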


  • Cameron, D. (2001) Working with Spoken Discourse. London: Sage (especially Chapter 2, ‘Collecting data’, pp19–30)
  • Thomas, J. (1995) Meaning in Interaction. London: Longman

2.3. Using broadcast/TV data

Collecting examples of language use in the way suggested above is time-consuming, and can in addition lead to ethical problems. A much quicker way is to analyse recordings of broadcast speech. These are already in the public domain, and there is no need to ask permission from the participants. However, activities such as interviews, phone-in programmes and discussions do not necessarily mirror the kinds of activities that are common in everyday interaction. If you want to study language use that is closer to everyday casual conversation you need to turn to fictional programmes, such as plays and popular drama series. These can be a very useful source of data, but of course they are NOT real life and it is important to examine the ways in which fictional and real-life conversations differ.


Listen to an episode of a popular television drama, and note down as many examples of one or more of the following speech acts as you can:

  • Apologizing
  • Thanking
  • Greeting
  • Requesting

  • How useful is this exercise for finding out the range of possibilities of expression?

Carry out the exercise with a friend and see if you arrive at the same results.

  • Are the results likely to be different from eavesdropping in public? Are there fewer disfluency features, for example?

Think about what you would say to a learner who watched the programme to learn English.

  • What would they need to know in order to use the expressions appropriately themselves?


  • Quaglio, P. (2009) Television Dialogue: The sitcom Friends vs natural conversation. Amsterdam: Benjamins

Note: The television drama Friends is available in transcription on the web. See: http://www.friendscafe.org/scripts/