Wednesday, March 14, 2007
BEYOND CONCORDANCE LINES: USING CONCORDANCES TO INVESTIGATE LANGUAGE DEVELOPMENT.
Concordance software such as WordSmith, MonoConc Pro and Microconcord helps in the tedious task of analyzing language data and greatly extends the potential of a corpus in language pedagogy. Such software makes it possible to examine concordance lines to discover how words and grammatical constructions are used. This article investigates language development based on data in the EMAS corpus, using language production and lexical variety as indicators of development.
The EMAS corpus was collected in 2002 and consists of close to half a million words. It is an untagged and unedited learner corpus that contains written data from about 800 students in Year 5 (primary school) and in Form 1 and Form 4 (secondary school), all of whom are considered above average in English language proficiency. The major criterion in selecting the essay topics was the amount of language each topic could elicit. To elicit a large amount of language data, the topic “the happiest day of my life” was used for one of the essays, as almost every respondent was thought to be able to write on it and hence produce the required amount of language data.
Numerous language acquisition studies focus on specific target structures and examine the acquisition of these structures over a period of time. The data available in the EMAS corpus, by contrast, are cross-sectional and were elicited from three groups. A basic assumption made in this article is therefore that developmental patterns can be inferred by comparing the language use of the three age groups.
Productivity in this article is indicated by the number of sentences per essay and the number of words per sentence. Based on the corpus data, we can detect a gradual increase in the number of sentences, sentences per essay and words per sentence from Year 5 to Form 4.
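
The two productivity indicators described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used on the EMAS corpus: the function names are hypothetical, sentences are split naively on full stops, question marks and exclamation marks, and a word is any run of letters or apostrophes. Unedited learner writing often has missing or irregular punctuation, so real counts would need more careful segmentation.

```python
import re


def count_sentences_and_words(essay):
    """Return (sentence_count, word_count) for one essay.

    Assumption: a sentence ends at '.', '!' or '?', and a word is a run
    of letters or apostrophes. Both rules are simplifications.
    """
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = re.findall(r"[A-Za-z']+", essay)
    return len(sentences), len(words)


def group_productivity(essays):
    """Compute sentences per essay and words per sentence for one age group."""
    if not essays:
        raise ValueError("at least one essay is required")
    total_sentences = 0
    total_words = 0
    for essay in essays:
        n_sentences, n_words = count_sentences_and_words(essay)
        total_sentences += n_sentences
        total_words += n_words
    return {
        "sentences_per_essay": total_sentences / len(essays),
        "words_per_sentence": (
            total_words / total_sentences if total_sentences else 0.0
        ),
    }


# Hypothetical toy data, not taken from the EMAS corpus.
sample_group = ["I was happy. We went home!", "It rained."]
result = group_productivity(sample_group)
```

Running the same function on the essays of each age group and comparing the two values would give the kind of cross-sectional comparison the article relies on.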
The diversity of the vocabulary used in a corpus is often measured by the type to token ratio, which is calculated by dividing the number of distinct words (types) in a text by the total number of words (tokens) in the text. The type to token ratio gradually increases from the lower to the higher age groups, which signifies that the older respondents use a wider range of vocabulary in their essays.
The average number of types per respondent was estimated in three steps. First, the words with a frequency higher than or equal to the number of respondents in each age level were counted. Second, this number was deducted from the total number of types found for that age level, and the result was divided by the number of respondents in the age group and multiplied by three. Finally, the result of this operation was added to the number of high-frequency words counted in the first step to provide an estimate of the average number of word types per student.
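
As a rough illustration of the two calculations described above, the following Python sketch computes a type to token ratio and the estimated average number of types per student. It assumes a simple lowercase tokenizer and hypothetical function names; it follows the steps as worded in this article and is not the original analysis script. The multiplier of three is taken directly from the description and is exposed as a parameter.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters or apostrophes.

    Assumption: this simple rule stands in for whatever word
    definition the concordance software applies.
    """
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Number of distinct words (types) divided by total words (tokens)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def estimated_types_per_student(essays, multiplier=3):
    """Estimate the average number of word types per student in one group.

    Assumes one essay per respondent, so len(essays) is the number
    of respondents in the age group.
    """
    n_respondents = len(essays)
    if n_respondents == 0:
        raise ValueError("at least one essay is required")
    counts = Counter(tok for essay in essays for tok in tokenize(essay))
    # Step 1: types whose frequency is at least the number of respondents.
    high_freq_types = sum(1 for c in counts.values() if c >= n_respondents)
    # Step 2: remaining types, divided by respondents and scaled.
    remaining_types = len(counts) - high_freq_types
    scaled_remainder = remaining_types / n_respondents * multiplier
    # Step 3: add the high-frequency types back in.
    return high_freq_types + scaled_remainder


# Hypothetical toy data, not taken from the EMAS corpus.
ratio = type_token_ratio("the cat saw the dog")
estimate = estimated_types_per_student(["the cat sat", "the dog ran"])
```

Note that the type to token ratio is sensitive to text length, so comparisons between age groups are most meaningful when the samples are of similar size.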