An investigation into Corpus-based learning about language in the primary-school: CLLIP
The classroom-based fieldwork

Background
Renewed interest in children learning about language (NLS)
ESRC funded project (Economic and Social Research Council)
Potential of a corpus-based approach for children's evidence-based investigation

The study of English helps pupils understand how language works by looking at its patterns, structures and origins
The National Curriculum for English (DfEE 1999)

investigative, independent and evidence-based learning
National Literacy Strategy (DfEE 1998)

Aim of Project
To investigate the potential of a corpus-based approach for children's evidence-based investigation into patterns in the English language

Research questions (1)
How do primary school pupils respond to corpus-based teaching and learning activities?
What kinds of metalinguistic knowledge, understanding or misconceptions are the children prompted to articulate by the presentation of texts in a corpus format, such as concordance lines?

Research questions (2)

What features of the corpus-learner interface do the children find useful in learning to recognise and identify patterned features of vocabulary and grammar?
As the children become familiar with the approach, do they begin to posit metalinguistic questions and hypotheses of their own, as older learners have been found to do?

Research questions (3)
Does the evidence generated by the research suggest any constructive modifications to the prescriptions of the English National Curriculum, the Initial Teacher Training National Curriculum for Primary English, or the National Literacy Strategy?

What is a corpus?
an electronically stored databank of authentic language
a collection of pieces of language, selected and ordered according to explicit criteria in order to be used as a sample of the language

What is the CLLIP corpus?
40 texts written for a child audience
extracted from the British National Corpus
stories, history books, Brownie annual etc.
approximately 800,000 words

What can be done with a corpus?
Identify patterns in language
  Grammar
  Vocabulary
For example
  collect, classify and order sets of words to identify shades of meaning
  explain the differences between synonyms, e.g. angry, irritated, frustrated, upset
  NLS Objective Year 5 Term 1

Fieldwork
2 Primary Schools
Phase 1: 6 Y4 + 6 Y5 children
Phase 2: 8 Y5 children
9 x 40 minute sessions in each school
Recording: 2 video cameras, 2 MiniDisc recorders

Activities
First session
  introducing the concept of a corpus
  simple set of concordance lines, focusing on centre of concordance lines first
Later sessions
  Phase 1: Mainly paper-based
  Phase 2: Computer-based
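
A concordance line shows one occurrence of a search word in the centre of the line, with the text before and after it on either side. The following is a minimal Python sketch of how such lines could be generated and sorted. It is an illustration only, not the software used in the project: the function names `concordance` and `format_line`, the wildcard handling and the sample sentences are all assumptions made for this example, and the sample text is not taken from the CLLIP corpus.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical sample text, invented for this sketch (not from the CLLIP corpus).
SAMPLE = (
    "The knight rode through the night. She had eight apples and the "
    "light was bright. The weight of the box gave him a fright."
)


def concordance(text, pattern, width=30):
    """Return (left, word, right) tuples for each word matching a wildcard.

    `pattern` uses shell-style wildcards, so "*ight" matches any word
    that ends in "ight". Matching ignores case.
    """
    text = " ".join(text.split())  # collapse line breaks and repeated spaces
    hits = []
    for match in re.finditer(r"[A-Za-z']+", text):
        word = match.group()
        if fnmatchcase(word.lower(), pattern.lower()):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            hits.append((left, word, right))
    return hits


def format_line(left, word, right, width=30):
    """Lay out one hit so that the search word sits in a central column."""
    return f"{left:>{width}}{word}{right:<{width}}"


hits = concordance(SAMPLE, "*ight")

# Print the lines in text order, then again sorted by the centre word.
for hit in hits:
    print(format_line(*hit))

by_word = sorted(hits, key=lambda hit: hit[1].lower())
for hit in by_word:
    print(format_line(*hit))
```

Sorting on a different element of each tuple (the left or the right context) reorders the lines so that recurring neighbouring words are grouped together, which is one way patterns become visible on the page.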

Objective (NLS) Y4T2W
to spell words with the common endings: -ight, etc.

This worksheet is based on concordance lines generated by a search for *ight. The task was to identify which of these words rhyme with bite and which with ate. The objective was to familiarise the children with the idea and appearance of concordance lines.

Objective (NLS) Y5T3S
to search for, identify and classify a range of prepositions
Understand and use the term preposition

This worksheet was produced by searching for a verb of motion followed by "the" after the next word, generating a set of concordance lines centring on prepositions.

It is very busy in appearance, because all the part of speech / word class categories are colour coded. Later versions of the interface allowed us to show colour in a much more selective way.

Sample data: Extract 1
This discussion took place during the session when the children in School A were introduced to the colour-coded output for the first time. (See previous slide.) They found the sheets intriguing and speculated about why the words had been coloured in the way they had.

Alison: right so we've noticed quite a few patterns in these colours has anyone got any idea why the colours might be different why the words are in different colours
GA1: ooh
Alison: have a think about it and see if you can work out what the reason might be for them all being printed in different colours L____?
GA1: is some is maybe some of them
GA2: BA2: like nouns in one colour one adjectives and the other one one verbs and oh adverbs

GA1 = Girl 1 at School A
BA2 = Boy 2 at School A

Sample data: Extract 2
Alison: GA2: Alison: BA4: BA1: Alison: BA2: BA1:
adjectives nouns verbs and adverbs but we've found about nine ten or eleven different colours so does does that mean that the colours couldn't be anything to do with parts of speech or they could they could yeah could but # how would they be then
I found eleven because you could have them like because they could show description on them if they wanted and stuff because they could just have some certain colours as nouns adjectives and things and any others could just be different things

Sample data: Extract 3
Alison: BA1: Alison: BA1: Alison: BA1: Alison: BA1: Alison: BA1: BA4: Alison: BA4: Alison: BA1: BA3: Alison:
try and say it once more because people were talking when you were talking well just tell me what you've noticed
the blue sort of thi-, ones are like things yes because there's like mouth Bobby Peter and stuff and there's a shower room there's right Karen there's # all right A___ do you want to add something to this and I think they're things they're nouns they're nouns how wo-, how do you know so sure you're very sure because they're things okay [laughter]
things yeah they're things because there's the the book so blue might be the colour for nouns

Objective (NLS) Y5T1W
to use adverbs to qualify verbs in writing dialogue

The task here was to discuss in a small group which adverb would fit best into each set of three concordance lines. The children completed a copy of the sheet when all in the group agreed, having given reasons for their decisions.

The corpus interface

Name of programme: Wordsmith Tools
Features
  Colour chooser
  Grouping by set
  Sorting the lines by left, right, by set

Initial findings
Children were familiar from previous work with some language categories and patterns
They readily assimilated the idea of a corpus and concordance lines
The corpus-based activities helped to consolidate and extend their understanding of language categories and patterns
They initiated uses of the analysis software which we hadn't foreseen

Contact
We would be very pleased to hear from anyone who is interested in the project and would like to know more. Please see the main project page for details of where we are presenting the research. We can be contacted at the University of Reading (address on SLALS home page), or via email: [email protected] [email protected]

