MOVEMENTS IN LANGUAGE TESTING: From Grammar-Based to Communicative Language Testing

I Made Sujana [English Education Department, Fac. of Education, University of Mataram]

Abstract. The shift in language teaching principles from grammar-based to communicative has had consequences for language testing. Language testing has moved from testing language elements to testing the use of language in real-life situations. Testing procedures have also changed from indirect to direct, discrete-point to integrative, norm-referenced to criterion-referenced, and objective to subjective testing. However, at the level of application, these movements raise a number of problems in terms of practicality, validity, and reliability.


 It is a long-standing concern that the results of second/foreign language teaching are unsatisfactory or even disheartening. In spite of long training in the target language (English, for example), people generally feel unhappy with learners’ language acquisition. A frequent lament, whether from learners, teachers, educational institutions, funding bodies, parents, or the public, is that although learners have spent several years in language training (e.g. six years for senior high school (SMU) graduates in Indonesia), they have never really learnt the target language; in other words, they are still incapable of understanding or expressing themselves in English.

            This common complaint has led to a reassessment of language teaching theories and has prompted those involved in this area to seek more effective ways of teaching a foreign/second language. The adoption of the communicative approach in second/foreign language teaching has brought meaningful changes: learners now commonly spend a good deal of time using language for the purpose of meaningful social interaction rather than just manipulating language forms.

            Innovation in language teaching has also led to a reassessment of language testing. Language testing professionals try to find effective and appropriate ways of testing that are based on the current language teaching principles, namely the communicative approach. According to Brindley (1986), the adoption of the communicative approach in language teaching has changed the tradition of language testing and evaluation from testing the learner’s mastery of particular language items to testing the learner’s ability to use language effectively in real situations. In other words, language testing has now entered the era of “communicative language testing”.

            This paper aims at discussing and reviewing the movements of language testing program from traditional (Grammar-Based) language testing to communicative language testing. First, the historical movements in language testing from pre-scientific to integrative-sociolinguistic will be reviewed, followed by the discussion of the differences between  communicative language testing (CLT) and Grammar-Based language testing. The remainder of the paper will discuss the current trend in language testing procedures and problems of application of the communicative language testing.


  1. Historical Movements in Language Testing: From Pre-scientific to Integrative-Sociolinguistic Era

Developments in theoretical views on language teaching and linguistics have an impact on the development of language testing, that is, on how the teaching and learning process is to be assessed or evaluated. Both Spolsky and Hinofotis (cited in Brown, 1996; see also Weir, 1990) agree that second language testing can be divided into three trends or periods of development: the Pre-scientific period, the Psychometric-Structuralist period, and the Integrative-Sociolinguistic period. Regarding terminology, Brown (1996) prefers the term movements to periods because they sometimes overlap chronologically and tend to co-exist today in different places.

            Pre-scientific Movement (until World War II). The pre-scientific movement in language testing is closely related to the grammar translation approaches to language teaching. This movement is characterized by translation and free composition tests developed by classroom teachers rather than by language testing experts. A problem with such tests is that they are difficult to score objectively. As a result, the tests show little concern with making fair, consistent, and correct decisions about the lives of the students involved; in other words, they lack objectivity, reliability, and validity (Brown, 1996). There was also little concern with the application of statistical techniques. Language testing in this era was regarded as an art rather than a science.

            Psychometric-Structuralist Movement (beginning in 1950s). The movement from pre-scientific to psychometric-structuralist arose from dissatisfaction with the language testing procedures of the time in terms of objectivity, reliability, and validity. This movement was also inspired by the application of Audio-lingual and other related methods, whose outcomes were normally measured by discrete structure point tests (Brown, 1996). In addition, collaboration between linguists and specialists in psychological and educational measurement led language testing to follow more scientific principles as applied in the fields of psychology and education. This era was characterized by the application of psychometric measures and structural linguistic principles. Statistical analysis was introduced, and the first carefully designed and standardized tests were published, such as the Test of English as a Foreign Language or TOEFL (1963), the Michigan Tests of English Language Proficiency: Form A (1961), and the Modern Language Association Foreign Language Proficiency Test for Teachers and Advanced Students (1968). Tests of this kind usually use objective formats, so they are easy to administer and score, and they are carefully constructed to meet the requirements of objectivity, reliability, and validity.

            Integrative-Sociolinguistic Movement. Along with the development of language teaching principles and linguistics, language testing has also changed. The adoption of the communicative approach has led to a movement from testing language aspects separately to testing how language is used in real-life communication. In this era, language testing professionals began to question whether language competence (the speaker’s underlying knowledge of the language rules, as distinct from what s/he actually does when using language in actual communication) or language performance (the realization of the user’s underlying communicative competence; see Brindley, 1986) is the more important to take into account in language testing.

            The integrative-sociolinguistic movement was inspired by the work of sociolinguists (Dell Hymes, to mention one) who believe that the development of communicative competence depends on more than simple grammatical control of the language and, thus, cannot be measured simply by discrete aspects of the language. In addition, communicative competence also involves knowledge of the language appropriate for different situations (Brown, 1996). Moreover, it has long been recognized that having grammatical competence does not guarantee that one is automatically capable of using the target language. Grammatical competence, along with other aspects of communication, needs to be activated through practice in using the language. Language testing in the integrative-sociolinguistic era moves from separate linguistic skills toward practical communication by combining and contextualizing several aspects of language skills. The test is expected to cover real-life communicative tasks as well as cultural and sociolinguistic aspects. Two types of tests, dictation and cloze procedures, were preferred because they are capable of assessing students’ ability to manipulate language contextually and, thus, involve integrated skills.

            From the discussion above, each movement has its own principles for deciding what needs to be included in assessing one’s language proficiency. The psychometric-structuralists emphasize the importance of testing the language elements separately as a representation of one’s language ability. They believe that by mastering the grammatical and other language elements, one will automatically be able to use the language in communication. The integrative-sociolinguists, on the other hand, emphasize the importance of testing the language elements contextually by including more than one language element at a time. The proponents of the integrative-sociolinguistic movement believe that the use of language in real-life situations involves both linguistic and extra-linguistic elements. Knowing separate grammatical aspects alone does not guarantee that a learner is capable of using language in real-life situations unless s/he practices it integratively and contextually.

            However, because of a number of drawbacks such as time constraints, practicality, the cost of test construction and administration, and other related problems, it is difficult to apply the principles offered in the integrative-sociolinguistic era. At the level of application, there is an overlap in what is included in language testing. Even in the integrative-sociolinguistic era, there are tests or items constructed using psychometric-structuralist principles, that is, items that assess a single language element at a time (discrete-point testing). This is due to the difficulty of formulating language tests that meet the expectations of communicative language teaching principles. As a result, the application of communicative language testing lags far behind that of communicative language teaching.

  2. Movements from Grammar-Based to Communicative Language Testing

The term “Grammar-Based Language Testing” (GBLT), which refers to non-communicative (traditional) language testing in this paper, is adopted from Richard-Amato’s (1988) term for differentiating non-communicative (traditional) approaches from a communicative approach. In relation to the three movements mentioned above, the first two eras (pre-scientific and psychometric-structuralist) belong to GBLT and the integrative-sociolinguistic era belongs to CLT.

            GBLT was adopted from traditional approaches such as Grammar Translation, Audiolingual, Cognitive, and Direct Method, which focus on grammar as content and expose learners to input in the target language that concentrates on one aspect of the grammar system at a time. GBLT claims that since language is built of sounds, morphemes, sentences, and rules, the specific elements of the language should be taught and tested separately. The followers of the Grammar-Based theory assumed that grammatical or structural aspects of the language are the most useful to impart and that the ability to use language (functional ability) will automatically arise from grammatical knowledge (Krahnke, 1987). This idea is in accordance with Widdowson’s (1988) comment on the importance of teaching separate language aspects. He argues that in language teaching what the learner needs is a basic knowledge of the language system and of lexical and grammatical forms capable of constituting a core linguistic competence. This competence will later provide learners with the essential basis for communicative behavior when the learner finds himself in a situation that requires him to use the language to communicate. The consequence of this belief is that what needs to be taught in language teaching is a knowledge of the language system, and its exploitation for communicative purposes can be left to the learner.

            The belief that the learner has to be taught a knowledge of the language system has an impact on what knowledge needs to be assessed in language testing. Language testing in this era tended to focus on testing separate language aspects, that is, testing one element of the language at a time, and used indirect testing, that is, testing the abilities that underlie language performance (skills), such as testing writing through grammar. Objective testing was preferred in GBLT because it made it possible to cover representative samples and so maintain the validity (especially content validity) and reliability of the test. In addition, such tests are quantifiable; hence, it is easy to calculate the validity and reliability of the test. These two issues (validity and reliability) became the main concern in this era.

            Along with the development in language teaching principles, the Grammar-Based language testing was criticized. The result of discrete-point testing (testing one language element at a time), indirect testing (testing ability underlying language skills) and objective testing did not reflect one’s ability in using language and had negative washback effects on both teaching and learning. This criticism has led to the application of more communicative testing, which is adopted from communicative language teaching approaches.

            As noted earlier, the communicative approach was born as a reaction to the failure of traditional language approaches. Traditional approaches failed to bring learners to acquire the target language, and teaching language aspects separately does not guarantee the ability to use the language in real-life contexts. Along with the adoption of the communicative approach in language teaching, the method of testing has also changed: it has moved from testing separate language aspects to testing the ability to use language for communication. Communicative Language Testing (CLT) emphasizes the assessment of the learner’s proficiency, that is, potential success in the use of language in some general sense (Morrow cited in Weir, 1990). It is based on the theories underlying current language teaching. It is not enough to equip students with grammatical knowledge alone. What needs to be improved in language teaching using the communicative approach is the ability to use the language for communication. Regarding this, Widdowson (1988: 292) comments that:

“… a learner needs to learn appropriate behavior during his course since one cannot count on him learning it later simply by reference to this linguistic knowledge. The belief here is that communicative competence needs to be expressly taught: the learner cannot be left to his own devices in developing an ability to communicate.”

            Savignon, in line with Chaplan (cited in Weir, 1990), criticized isolated language tests, such as tests of language elements, as insufficient predictors of communicative skills. Assessing the ability to apply knowledge of the components of language in real-life situations requires new and appropriate methods (Morrow cited in Weir, 1990). Bachman (1991) argues that communicative ability can be assessed by constructing tests that fulfill at least four criteria, namely: (1) tests should create an information gap; (2) the tasks of the test should be dependent on one another; (3) there should be an integration between the test tasks and the test content; and (4) the tests should measure a broader range of language ability (such as cohesion, function, and sociolinguistic appropriateness).

            In addition, Weir (1990:38) tries to elaborate the expected characteristics of communicative language testing as follows:

  1. Realistic context: the test should be regarded as appropriate to the candidate’s situation.
  2. Relevant information gap: candidates should have to process new information as they might in real situations.
  3. Inter-subjectivity: tasks should involve candidates both as receivers and producers.
  4. Scope for development of activity by candidates: tasks should give candidates chances to assert their communicative independence, and allowance should also be made for the unpredictability of communication in the task and in the marking schemes that are applied.
  5. Allowance of self-monitoring by candidates: the tasks should allow candidates to use their discourse processing strategies to evaluate their communicative effectiveness and make any necessary adjustments in the course of events.
  6. Processing of appropriately sized input: the size and scope of the activities should be such that candidates process the kind of input they would normally be expected to process.
  7. Normal time constraint operative: the task should be accomplished under normal time constraints.

            From the discussion above, it is clear that what needs to be included in the language testing depends on the principles of language teaching. The GBLT believes that what needs to be tested is learner’s language competence (language element) that underlies communicative ability. Having such competence, the learners will be able to use the language for communication. The CLT, on the other hand, believes that the communicative competence needs to be tested directly in order to know whether or not the learner is able to use language in real situations. The communicative ability depends on the acquisition of grammatical aspects and on how to use the aspects in real-life communication. Therefore, language testing should be designed to assess learners’ ability to use the language in communication.

            The development in language teaching approaches from the traditional ones to the communicative one has also had consequences for language testing procedures. Testing tradition has moved along a continuum from (1) discrete-point to integrative testing; (2) indirect to direct testing; (3) norm-referenced to criterion-referenced testing; and (4) objective to subjective testing. These movements are discussed below.

            Discrete-Point versus Integrative Language Testing. One obvious change in the development from GBLT to CLT is the transition from discrete-point testing to integrative testing. A discrete-point test is based on single, independent skills (decontextualized), while an integrative language test combines language skills to assess the learner’s performance in using language (contextualized). Examples are, respectively, the Test of English as a Foreign Language (TOEFL), in which one item has no relation to other items, and the IELTS test, which tries to contextualize a number of items.

            According to Oller (1979), the concept of the integrative language test was born in contrast with that of the discrete-point test; that is to say, discrete-point tests separate language skills into parts, whereas integrative language tests put the parts together. Furthermore, he said:

 “Whereas discrete items attempt to test knowledge of language one bit at a time, integrative tests attempt to assess learner’s capacity to use many bits all at the same time, and possibly while exercising several presumed components of a grammatical system, and perhaps more than one of the traditionally recognised skills or aspects of skills”. (Oller, 1979: 37).

             In addition, Rea (cited in Weir, 1990) states that discrete-point tests may result in artificial, sterile, and irrelevant types of items which have no relation to the use of language in real-life situations. Similarly, Morrow (cited in Weir, 1990) suggests that, rather than testing knowledge of language elements alone, it is useful to combine the discrete knowledge of the elements in appropriate contexts or situations. However, the advantages of discrete-point tests are that the data are quantifiable and can cover a wide range of representative samples (Weir, 1990: 2).

             Directness and Indirectness in Language Testing. This trend is characterized by the transition from indirect language tests (in which testees are presented with tasks that do not directly measure their ability in a given skill, such as testing writing ability through multiple-choice items) to direct/authentic language tests (which provide situations that are more or less close to what the testees will encounter in real language use).

            Communicative language tasks should be as direct as possible and should also involve realistic discourse processing. A language test is regarded as direct if it requires the testee to perform precisely the skill that the tester wants to measure. If the tester, for example, wants to know the testee’s ability to speak, the task must get him to speak.

            According to Brindley (1986: 13; cf. Weir, 1990; Skehan, 1988), the only valid way of assessing communicative performance is by using procedures that replicate as closely as possible the actual circumstances under which the learner will have to use the language. In relation to that, Rea (1985: 26) states that direct language testing requires the integration of linguistic, situational, cultural, and affective constraints which interact in the process of communication.

            Direct language testing has a number of advantages: (1) it has high face validity, that is, it looks as if it tests what it is supposed to test; and (2) the test can be steered to draw out specific aspects of language behavior. For example, the testee’s mastery of the third person “s” can be elicited by asking him to talk about a member of his family (Brindley, 1986). Moreover, Hughes (1992: 15) observes that direct language testing has several attractions: (1) it provides a clear description of the ability to be assessed, because it is relatively straightforward to create conditions that elicit the behavior on which to base our judgement; (2) for productive skills, the assessment and interpretation of the testee’s performance is also straightforward; and (3) it can produce a beneficial washback effect, since in preparing for the test the testee practices the language skills the teacher wants to foster.

            Indirect language testing, on the other hand, seeks to measure the abilities which underlie the skills in which the tester is interested. This type of test can be seen in Section II of the TOEFL test, which measures the testee’s writing ability indirectly by testing structure and written expression. According to Hughes (ibid), the real advantage of indirect language testing is that it allows the possibility of testing a representative sample of a finite number of abilities that underlie a potentially large number of manifestations. Direct language testing, by contrast, allows only a limited sample of tasks. However, the crucial problem of indirect language testing is that the relationship between the sample taken and the prediction of real performance is often very weak and uncertain.

             Norm-Referenced and Criterion-Referenced Assessment. Another development in second/foreign language learning assessment is characterized by the transition from Norm-Referenced Assessments (NRAs) to Criterion-Referenced Assessments (CRAs). In NRAs the testee is assessed according to how well s/he achieves in relation to other testees, whereas in CRAs the testee is assessed according to what s/he can do or achieve.

            In communicative language testing, CRAs need to be taken into consideration. Since the purpose of language testing is to describe a learner’s language proficiency, the tester should obtain clear evidence of what the testee is able to do rather than just providing test scores (Brindley, 1991: 139). The purpose of CRAs is to classify the testee according to whether or not he is able to perform a task or a set of tasks satisfactorily, regardless of whether all learners pass or all of them fail. The positive aspects of CRAs, according to Hughes (1992), are, among others, that they set meaningful standards in relation to what students can do without being influenced by other students’ scores, and that they motivate the learners to achieve the defined standards. Therefore, in testing language proficiency there must be a clearly defined standard of the expected language ability. Brindley (1991) suggests that the easiest way to define criteria and descriptors for language assessment is to use already existing criteria such as rating scales, band scales, and performance descriptors.
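            The contrast between the two kinds of assessment can be illustrated with a minimal sketch. The scores, the cut score of 70, and the function names below are hypothetical and chosen only for illustration; using a z-score as the norm-referenced result is likewise an assumption, since the sources cited above do not prescribe a particular statistic.

```python
from statistics import mean, pstdev

def norm_referenced(scores):
    """Report each testee's standing relative to the group (z-score)."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every testee has the same score, so nobody stands out.
        return {name: 0.0 for name in scores}
    return {name: (s - mu) / sigma for name, s in scores.items()}

def criterion_referenced(scores, cut_score):
    """Report whether each testee meets a fixed, predefined standard."""
    return {name: s >= cut_score for name, s in scores.items()}

# Hypothetical raw scores for three testees and a hypothetical cut score.
scores = {"A": 55, "B": 60, "C": 65}
cut_score = 70

nra_result = norm_referenced(scores)
cra_result = criterion_referenced(scores, cut_score)
```

            With these invented figures, testee C ranks above the group mean in the norm-referenced result, yet no testee reaches the criterion. This is the situation described above, in which a CRA classifies learners against the standard regardless of whether all of them pass or all of them fail.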

            Objective versus Subjective Testing. The other movement in language testing procedure is from objective testing in traditional language testing to subjective testing in communicative language testing. According to Hughes (1993), the difference between objective and subjective testing lies in the level of judgement needed on the part of the scorer. If the scorer does not need to exercise judgement in scoring, the test is objective; if judgement is called for, the test is said to be subjective. Objective testing was preferred in the psychometric-structuralist era, in which reliability (consistency of the score), validity (the representativeness of the sample), and objectivity (of test format) were the main concern. However, along with the development of language teaching and testing principles, subjective testing has come to be preferred, especially if the purpose of testing is to determine one’s ability in communication. Subjective testing has beneficial washback effects on teaching and learning, but it may raise problems of reliability and validity.


 As discussed above, communicative language testing should closely resemble real-life situations and apply direct, integrative, criterion-referenced, and subjective testing. However, this expectation raises a number of problems in both constructing and scoring the test. These problems concern, among others, the practicality, economy, validity, and reliability of the test.

            Direct testing is impractical, time-consuming, and uneconomical, particularly for large groups. Testing students’ communication using a direct test in SLTP/SMU, for example, is practically impossible because of the teacher-student ratio in a school, the budget needed to administer the test, and time constraints. Reliability is another issue that needs to be taken into account in the application of direct, integrative, and subjective testing. Problems of reliability may arise from the test itself and from the raters; the latter raise the more obvious problems, in terms of both intra-rater consistency (consistency of the same rater from one occasion to another) and inter-rater consistency (agreement among different raters on the same trait). A good example of rater inconsistency can be seen in Jaworski’s (1994) research report on marking the simple answer “Fine, thanks” to the question “How are you (doing)?” The tester sometimes gives 3 and sometimes 4 for the same answer. The point of this simple case is that it is not always easy to assess even a very simple response (such as the two-word answer “Fine, thanks”), let alone a complex task such as speaking or writing.
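            Rater consistency of the kind discussed above can be quantified. The sketch below computes simple percent agreement and Cohen’s kappa, a standard chance-corrected agreement statistic that is introduced here as an illustration and is not drawn from the sources cited in this paper. The two lists of marks are invented; they are not Jaworski’s data. The same functions could be applied to two marking occasions by one rater (intra-rater) or to two different raters (inter-rater).

```python
from collections import Counter

def percent_agreement(marks_1, marks_2):
    """Proportion of answers that received the same mark in both sets."""
    same = sum(a == b for a, b in zip(marks_1, marks_2))
    return same / len(marks_1)

def cohens_kappa(marks_1, marks_2):
    """Agreement corrected for the agreement expected by chance."""
    n = len(marks_1)
    p_observed = percent_agreement(marks_1, marks_2)
    count_1, count_2 = Counter(marks_1), Counter(marks_2)
    categories = set(count_1) | set(count_2)
    # Chance agreement: product of the two marginal proportions per mark.
    p_expected = sum(count_1[c] * count_2[c] for c in categories) / (n * n)
    if p_expected == 1:
        # Both sets use one identical mark throughout; agreement is total.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical marks (on the 3-or-4 pattern described above) given to the
# same six answers by two raters, or by one rater on two occasions.
marks_first = [3, 4, 3, 4, 3, 4]
marks_second = [3, 4, 4, 4, 3, 3]

agreement = percent_agreement(marks_first, marks_second)
kappa = cohens_kappa(marks_first, marks_second)
```

            In this invented case the two sets of marks agree on four of the six answers, but because half of that agreement would be expected by chance, the kappa value is considerably lower than the raw agreement rate. This shows why raw agreement alone can overstate the reliability of subjective scoring.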

            Validity is another problem. Direct, integrative, and subjective testing has high face validity, since it looks as if it measures what it is supposed to measure; however, in terms of content validity (i.e. how well the test samples the materials to be tested) and predictive validity (i.e. the ability of the test to predict future performance), the test is open to question: To what extent can a representative sample of the teaching materials be achieved in direct and subjective testing? To what extent can the test correctly predict the performance of test-takers in a given future context?

            The complexity of the aspects involved in communicative language testing makes it difficult to overcome these problems completely. It is impossible for test constructors to always and fully replicate real-life situations as expected by communicative language testing. In relation to these problems, Brindley (1986) offers a number of general comments, such as not insisting too strictly on directness to real-life situations in language assessment, because directness does not always guarantee the prediction of performance. Furthermore, he comments that “… there are other direct methods of assessment such as self-assessment by learners of their own performance” (p.15), which avoid spending a lot of time and resources. He also suggests using indirect or semi-direct procedures to assess language proficiency.

            In sum, language test designers and practitioners still need to work hard to formulate “communicativeness” in language testing so as to provide testing principles that are in accordance with (or at least come close to fulfilling) communicative language teaching principles, so that the existing gap between communicative language testing and communicative language teaching can be minimized. In other words, the application of communicative language testing should not be left too far behind the application of communicative language teaching.


