Multipurpose Indian Language Evaluation System [MILES]

PREPARATION OF TESTS OF LANGUAGE PROFICIENCY

APPENDIX - I

RECOMMENDED GUIDELINES*

(Proceedings of the Workshop on Review & Developing Guidelines for the Preparation of Proficiency Test Items, Nov. 25-29, 1985, ERLC, Bhubaneswar)

The National Workshop on 'Preparation of Tests in Indian Languages' was held under the auspices of the Central Institute of Indian Languages, Mysore, at the Eastern Regional Language Centre, Bhubaneswar, from November 25 to 29, 1985.

The objective of the Workshop was to evolve a consensus about the nature and preparation of language tests in selected Indian languages. For the purpose of this Workshop, six languages, namely Bengali, Marathi, Punjabi, Tamil, Telugu and Urdu, were taken up. However, it is recognized that the objectives, scope and methodology of test construction as outlined here would apply uniformly to all other languages. Thus, it is envisaged that these guidelines would provide a general model for all those who seek to prepare standardized language tests at the national level.

A. RATIONALE, SCOPE AND OBJECTIVES

B. DIMENSIONS OF LANGUAGE PROFICIENCY

C. SUBSKILLS AND SAMPLE ITEMS UNDER EACH DOMAIN

D. TRYOUT, ITEM ANALYSIS AND STANDARDIZATION

A. RATIONALE, SCOPE AND OBJECTIVES

A review of the existing instruments available for measurement of language skills shows a clear lack of general, standardized measures of language proficiency at the Higher Secondary level that can be administered on a group basis regardless of the varying curriculum content across institutions providing instruction in the respective regional languages.

It is also felt that a distinction needs to be made between language proficiency and language achievement measures. Language achievement tests, as used in the country, are usually measures of the outcomes of specific instruction in a language, tied to certain syllabi. Usually they are content-based, teacher-made and non-standardized tests meant for certification purposes. The skills captured through such achievement measures may not necessarily have generalized validity for assessment of general and content-free communicative skills. On the other hand, it is felt that language proficiency measures can be constructed and used for predicting how well an individual, after a certain number of years of normal education, can communicate in a content-free and context-free manner across a wide variety of situations. Such a measure would test the knowledge of language in use, abstracted from content-based instruction.

It is proposed that the current guidelines pertain to the construction and use of tests of language proficiency and not of language achievement. More specifically, the proposed tests are intended for use with high school graduates, or those who have gone through the equivalent of 10 years of language instruction, as the target population. No a priori judgment is made about the applicability of these tests to second language learners (albeit with a different set of norms); this issue can be empirically verified later.

It is suggested that the test may be useful to any agency seeking to assess the language proficiency of a large number of individuals for such purposes as admission to higher education, job selection, training, and evaluation of various language instruction programmes in the country. The test is meant primarily for assessment, not for diagnostic purposes. However, the test may be used for screening, which may be followed up with more elaborate diagnostic tools.

The proposed battery of tests seeks to evaluate each individual's proficiency relative to the group norm and, as such, permits norm-referenced interpretations.

B. DIMENSIONS OF LANGUAGE PROFICIENCY

As a result of elaborate deliberations among language experts and test specialists, general language proficiency was categorized into the following seven domains:

  • Reading Comprehension
  • Lexical Skills
  • Structure of Language
  • Writing and Composition
  • Listening Comprehension
  • Speaking
  • General Language-related Information

It is recognized that there is some overlap between domains and that it may not be possible to measure each domain exhaustively. There are also other practical considerations limiting the measurement of each of these domains.

For purposes of standardization, objectivity in scoring, and establishing the reliability and validity of the tests, it was decided that items should be in multiple choice format. Within this format, it was decided that items be written by the respective language experts, tapping the hierarchical levels of skills (knowledge, comprehension, application, analysis, synthesis and evaluation) for each domain of language proficiency outlined above.

Each of the domains is explained below in detail, along with subskills and examples in multiple choice format for each of these subskills. The subskills covered under each category are only representative and not exhaustive, as are the domains of language proficiency listed here. The instructions for the tests have to be formulated accordingly.

C. SUBSKILLS AND SAMPLE ITEMS UNDER EACH DOMAIN

1. Reading Comprehension

This part will test the candidate's ability to read and understand what is stated or implied in a written passage, and then to answer questions based on it. It was decided that each passage would contain 350-450 words, followed by approximately 10-12 multiple choice questions at the end (see Note 2). Six types of comprehension questions can be asked:

  • Main idea questions seeking information on the central theme of the passage.
  • Detail questions asking for information directly stated in the text.
  • Sequence questions to assess the knowledge of events in the order of their occurrence.
  • Cause-and-effect questions asking for either the cause or the effect.
  • Inference questions asking for information that is implied but not directly stated in the passage.
  • Vocabulary questions asking for the meaning of words or phrases in the context of the passage.

Example :

Long, long ago, there was a king who had a great fascination for new and unique dresses and who needed dresses for every hour of the day. One day two weavers came to the capital. They claimed that they could weave thread too fine for the normal human eye to see. The two expert weavers said, “Only fools cannot see the threads”. The king ordered them to weave for him some out-of-the-world dresses. The cunning cheats pretended to weave a fine piece of cloth and enacted the movements of a loom. The king asked the Prime Minister to find out about the progress of the weavers. The Prime Minister could not see the dress at all. He was worried that he was going to be branded as a fool; so he reported to the king that the royal dress was extremely beautiful. The king himself came to see the weaving of the dress. Near the loom, he thought, “What! I see nothing; am I really a fool?” He paused a little and then said, “Indeed, it is very beautiful”. All other persons in the royal party said in one voice, “Unique and beautiful dress”. The king arranged a ceremony for putting on the new dress. The two weavers pretended to fit the robe on the king. Ministers, citizens and all others were unanimous in their praise for the beautiful royal dress. Amid the crowd, a little boy said, “The king is stark naked”.

Main idea Questions

Everybody in the crowd was full of praise for the unique dress because:

  • They were simply following the king.
  • They did not know what a beautiful dress means.
  • They did not want to be identified as fools.
  • They were afraid of the king.

Detail Questions

The king wanted a new dress

  • every year
  • every hour
  • every month
  • every day

Sequence Questions

When the king asked the weavers to weave a fine dress, they

  • pretended to weave and enacted the movements of a loom.
  • wove so fine a dress that no one could see it.
  • themselves were wearing no dress.
  • brought all other people except the boy under their magic spell.

Cause and Effect Questions

The king asked the weavers to weave a new dress for him because :

  • he never liked dirty clothes.
  • he wanted to look like a king in his royal robe.
  • the weavers claimed that no one could see the dress.
  • he was very fond of new dresses.

Inference Questions

The young boy said, “The king is stark naked”, because:

  • he was a fool
  • he was not under the magic spell of the weavers.
  • he was too innocent to pretend.
  • he was too young to see the fine piece of dress.

Vocabulary Questions

In this passage, 'out-of-the-world' means:

  • something that does not exist.
  • something that is extraordinarily beautiful.
  • something that no other weaver can weave.
  • something that no person in the world can appreciate.

2. Lexical Skills

This part will test the candidate's knowledge of vocabulary and ability to use it appropriately. Lexical skills would be tapped by asking candidates to demonstrate their knowledge of word meanings, meanings of words or phrases in the context of sentences, and synonyms and antonyms. Examples of items meant to tap the subskills in this area are given below. Instructions for each subtest should be given in the test booklet.

a. Word Meanings

The meaning of 'alleviate' is

  • to relieve
  • to shorten
  • to remediate
  • to redress

b. Contextual Meanings

His roommate's 'sharp reply' made him angry.

  • repeal
  • retort
  • report
  • receipt

c. Synonyms

'Industrious'

  • affluent
  • cogent
  • diligent
  • extinct

d. Antonyms

He is not a good man. He is more likely to be

  • wicked
  • intelligent
  • dull
  • unimportant

3. Structure of Language

This part will test the candidate's knowledge of the structure of language and written expression, and the ability to make functional use of such knowledge, appropriate to different linguistic contexts. These skills will be measured through tests like (a) Sentence Completion, (b) Error Detection, (c) Sentence Comprehension, (d) Transformation and (e) Formal Grammar. Examples of items are given below:

a. Sentence Completion

1. Staying in a hotel costs ------------------------ renting a room in a dormitory for one night.

  1. twice more than
  2. twice as much as
  3. as much twice as
  4. as much as twice

2. It is the first time that he has been abroad, ------------------------ ?

  1. is not he
  2. has not he
  3. is not it
  4. has not it

b. Error Detection

1. Whoever turned in the last test did not put their name on the paper.

A B C D

  a. A
  b. B
  c. C
  d. D

2. The excuse that he gave us was not sufficient enough and we will not accept it.

A B C D

  a. A
  b. B
  c. C
  d. D

c. Sentence Comprehension

He fell short of words to express his gratitude.

  1. He was not verbally fluent to express his gratitude.
  2. He was overwhelmed with emotion while expressing his gratitude.
  3. He forgot a few words because he had to express gratitude.
  4. He was so grateful that he could not find appropriate words to match his level of gratitude.

d. Transformation

The mother gave the baby to the servant.

  1. The servant was given the baby by the mother.
  2. The servant took the baby away from mother.
  3. The baby was given the servant by the mother.
  4. The mother was given the baby by the servant.

e. Formal Grammar

The high mountain is covered with tall trees and thick bushes. The adjectives in this expression are

  1. high, covered, thick
  2. high, tall, thick
  3. high, mountain, tall, thick
  4. high, covered, tall, thick

4. Writing and Composition

This part will test the candidate's ability to communicate through writing and to organize such writing in a logical sequence appropriate to the communicative intent. The skills in this area will be measured through tests like (a) Spelling, (b) Idioms and Proverbs, (c) Precis Writing, (d) Text Organization, and (e) Letter Writing and Composition (see Note 3). Examples:

a. Spelling

Which one of the following four words is correctly spelt ?

  1. perceive
  2. recieve
  3. decieve
  4. concieve

b. Idioms and Proverbs

The expression 'strike the iron while it is hot' means

  1. iron has to be hot to be struck.
  2. take advantage of the first opportunity.
  3. do the right thing at the right time.
  4. work is to be done when it is hot.

c. Precis Writing

She was hungry. So she wanted some food. She went from door to door requesting people for some food. A little boy was taking some food. He told her, “You can have some of the food that I have”.

The theme of the above sentences can be precisely expressed as

  1. feeling hungry she begged for some food until a boy let her share some of his food.
  2. in spite of her request, nobody except the boy gave her food.
  3. she begged from door to door and nobody gave the hungry woman her due except the boy.
  4. finally a boy came to the rescue of the hungry woman who did not get food otherwise.

d. Text Organization

    1. The train was on time and I missed it.
    2. I realized that I should have left home at least 15 minutes earlier.
    3. I purchased a ticket two days before the date of my journey to Banaras.
    4. I left my residence for the station at 10 P.M.
    5. I was invited to attend a meeting.

The above 5 sentences can be organized into a meaningful text in which of the following orders?

  1. 4,5,3,1,2
  2. 5,3,4,1,2
  3. 5,4,3,1,2
  4. 4,5,3,2,1

e. Letter Writing and Composition

Which of the following expressions is most likely to occur in a formal application to the authorities for the construction of a village road ?

  1. the work of the village road should be taken up soon.
  2. the sooner the construction of the village road is taken up, the better.
  3. you must do the construction work forthwith.
  4. it would be highly appreciated if the construction work is taken up at the earliest.

5. Listening Comprehension

This part of the test will tap the candidate's ability to listen to and understand spoken discourse, particularly at the level of a relatively lengthy and formal communication, such as a lecture, narration or speech.

The text, spoken clearly by a native speaker and recorded on cassette tape, would be played through an efficient public address system to a group of testees (preferably not more than 50 in a room). The comprehension questions will also be on the tape. The alternative answers will be printed in the test booklet. The types of comprehension questions will be the same as those outlined for the subtest on 'Reading Comprehension'. However, the text would ideally be shorter than that for Reading Comprehension and would last about 2-3 minutes. The interval between questions should be sufficient to permit the testees to mark the correct answers in the booklet. For examples of listening comprehension questions, refer to the description under Reading Comprehension.

*6 & 7. Speaking and General Information

As outlined earlier, neither of these domains is to be specifically assessed in this test through multiple choice questions.

The speaking skills as they occur in natural contexts have not been included for specific assessment in this test, in consideration of practical difficulties and cost-benefit analysis in view of the large-scale group testing involved. Experts interested in a specific assessment of speaking skills may, however, opt for elaborate individual testing in a form (Form B; see Note 3) supplementary to the present one. It may, however, be mentioned that many of the skills involved in speaking overlap in large measure with the ones sought to be assessed through this test. As such, a part of the speaking skills is tapped through the present test.

Considering the variation in the nature and range of general information (as a part of language proficiency), such as knowledge of culture, people, customs, traditions, literary forms and history of language, it was decided not to include specific items in this domain in the test. However, it is suggested that items on vocabulary, idioms and proverbs, and sentence structure be designed to tap specific knowledge of the cultural aspects of a language. In other words, these items should be culturally oriented.


TABLE 1

Domains of Language Proficiency, and Nature and Number of Items for Preliminary Tryout and Final Version of the Test

[Each item will be a multiple choice question with 4 alternatives]

(Total time: 1 hour 45 minutes)

Domains | Nature of items | No. of items to be written for preliminary tryout | No. of items for the final test | Time limit for the final version (minutes)
1. Reading Comprehension | a. Written passages (350-450 words) | 5 passages with 10-12 comprehension questions, each being a multiple choice question with 4 alternatives | 2 passages, each having 7-8 comprehension questions at the end | 20
2. Lexical Skills (Vocabulary) | a. Word meaning; b. Contextual meaning; c. Synonyms - 17; d. Antonyms - 18 | | | 20
3. Structure of Language | a. Sentence completion; b. Error detection; c. Sentence comprehension; d. Transformation; e. Formal grammar | | | 20
4. Writing and Composition | a. Spelling; b. Idioms & Proverbs; c. Precis writing; d. Text organization; e. Letter writing and composition | | | 25
5. Listening Comprehension | a. Spoken passages (each around 300 words) | 7 passages with 8-10 questions at the end | 3 passages with 5 questions after each passage | 20
*6. Speaking | Left to the choice of the item writers depending on the level & time availability.
*7. General Information | Left to the choice of the item writers depending on the level & time availability.

D. TRYOUT, ITEM ANALYSIS AND STANDARDIZATION

The proficiency test in each of the languages will be developed through the following steps:

    1. Pilot testing
    2. Preliminary tryout and item analysis
    3. Standardization (establishing norms)
    4. Establishing reliability and validity

1. Pilot Testing

The items, once prepared and judged by experts with respect to their content validity, can be put into test format. The test can be administered to a sample of 50-100 fresh high school graduates. The test administration may be distributed over more than one session as required.

The purpose of pilot testing would be to obtain some preliminary information in respect of

  1. the time length for each subtest as well as for the total test,
  2. the clarity of instructions under each subtest,
  3. the suitability of the test items for the testees,
  4. the appropriateness of the distractors of the multiple choice type questions,
  5. organization of the subtests in a test format, and
  6. any other observation that may be useful for refining the test for the preliminary tryout (for example, splitting the test into 2 or 3 parts to be administered in more than one session).

2. Preliminary Tryout

The purpose of this step would be to obtain information about the item characteristics (item difficulty, item discrimination, etc.) on the basis of which items can be selected for the final version of the test.

The test may be administered in 2 or 3 sessions, depending upon the duration of the total test established on the basis of pilot testing. The test should be administered to a sample of 300-500 students (see Note 4) drawn by a stratified random sampling procedure from the population for whom the test is meant. The stratification variables have to be specified in advance, keeping in view those variables likely to influence the language proficiency of subjects within a native language group.
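A stratified random draw of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions: the stratification variable (a hypothetical `area` field), the proportional allocation rule, the helper name `stratified_sample` and the population figures are all invented for the example and are not part of the Workshop's recommendations. The sample size of 450 follows the suggestion in Note 4 of one and a half times roughly 300 items.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Draw a proportionally allocated stratified random sample.

    population  : list of records (any type)
    stratum_of  : function mapping a record to its stratum label
    sample_size : total number of records wanted
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for record in population:
        strata[stratum_of(record)].append(record)

    sample = []
    for label, members in sorted(strata.items()):
        # Proportional allocation, rounded; at least one record per stratum,
        # and never more than the stratum contains.
        quota = max(1, round(sample_size * len(members) / len(population)))
        quota = min(quota, len(members))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical population: 6000 urban and 4000 rural students.
population = [{"id": i, "area": "urban" if i < 6000 else "rural"}
              for i in range(10000)]
tryout = stratified_sample(population, lambda r: r["area"], sample_size=450)
```

Because each stratum's quota is rounded separately, the total can differ from `sample_size` by a few records when the proportions do not divide evenly; in practice several stratification variables would usually be crossed.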

Once the data are collected, the next step is item analysis, that is, finding the item difficulty and discrimination indices and, if possible, the item validity index (in the sense of criterion-related validity). Items with a moderate difficulty level (at or around 0.5) and good discrimination indices (correlation of item score with total score) should be retained for the final version of the test, keeping in view the expected number of items suggested in Table-1.
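The two indices named above can be computed as in the following minimal sketch, which takes difficulty as the proportion of testees answering an item correctly and discrimination as the (uncorrected) Pearson correlation between item score and total score. The function names and the retention thresholds (difficulty between 0.3 and 0.7, discrimination of at least 0.2) are illustrative assumptions; the guidelines themselves only ask for difficulty "at or around 0.5" and "good" discrimination.

```python
from statistics import mean, pstdev

def item_analysis(responses):
    """responses: list of per-testee lists of 0/1 item scores (1 = correct).

    Returns one (difficulty, discrimination) pair per item.
    difficulty     = proportion of testees answering correctly
    discrimination = Pearson correlation of item score with total score
    """
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    sd_total = pstdev(totals)
    results = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        p = mean(item)
        sd_item = pstdev(item)
        if sd_item == 0 or sd_total == 0:
            r = 0.0  # no variance: correlation undefined, treat as non-discriminating
        else:
            cov = mean(x * t for x, t in zip(item, totals)) - p * mean(totals)
            r = cov / (sd_item * sd_total)
        results.append((p, r))
    return results

def select_items(results, p_low=0.3, p_high=0.7, r_min=0.2):
    """Indices of items with moderate difficulty and adequate discrimination.

    The default thresholds are assumptions made for this example.
    """
    return [j for j, (p, r) in enumerate(results)
            if p_low <= p <= p_high and r >= r_min]

# Hypothetical tryout data: 4 testees x 3 items.
responses = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 0, 0]]
retained = select_items(item_analysis(responses))
```

In this toy data set the first item is answered correctly by everyone (difficulty 1.0) and the third by only one testee in four (difficulty 0.25), so only the second item falls inside the assumed difficulty band.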

3. Standardization

Once the items are selected for the final version and organized in a test format, the next step is to obtain norms for the desired population.

The size of the standardization sample should be in the range of 2000-3000 subjects drawn following a stratified random sampling procedure from the population for whom the test is meant. On the basis of distribution of scores, norms can be established for the full test. The norms may be obtained in the form of percentile ranks, standardized or normalized scores. The norm table so established would be used to evaluate the language proficiency of a subject relative to performance of the group.
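As an illustration of such a norm table, the sketch below maps each observed raw score to a percentile rank and a linear standard score. The mid-rank convention for percentile ranks, the choice of T-scores (mean 50, standard deviation 10), the function name and the sample scores are assumptions made for the example; area-normalized scores, which the guidelines also permit, are not shown.

```python
from statistics import mean, pstdev

def build_norm_table(scores):
    """Map each observed raw score to (percentile rank, z-score, T-score).

    Percentile rank uses the mid-rank convention:
    100 * (number below + half the number equal) / N.
    T-score = 50 + 10 * z (a linear standard score).
    """
    n = len(scores)
    mu, sigma = mean(scores), pstdev(scores)
    table = {}
    for raw in sorted(set(scores)):
        below = sum(s < raw for s in scores)
        equal = sum(s == raw for s in scores)
        pr = 100.0 * (below + 0.5 * equal) / n
        z = (raw - mu) / sigma if sigma else 0.0
        table[raw] = (round(pr, 1), round(z, 2), round(50 + 10 * z, 1))
    return table

# Hypothetical raw scores from a very small standardization sample.
norms = build_norm_table([40, 50, 50, 60, 70, 70, 70, 80, 90, 100])
```

A candidate's raw score is then looked up in `norms` to read off the standing relative to the group; with a real standardization sample of 2000-3000 subjects the table would cover the full score range.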

To empirically test the appropriateness of the test for second language learners, it can be administered to a group of second language learners and the norms can be obtained. From an examination of the content validity of the items for the second language learners, and the norms of performance of this group, one could examine the suitability of the test for the second language learners.

4. Reliability and Validity

Test-retest reliability and internal-consistency estimates of reliability can be obtained for the subtests as well as the full test. The size of the sample for establishing reliability should be in the range of 200-300 subjects. The interval for test-retest reliability may be 4-6 weeks.
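Both kinds of estimate can be sketched briefly. The example below uses the Kuder-Richardson formula 20 (KR-20) as the internal-consistency estimate, which suits items scored 0/1 such as multiple choice questions, and a Pearson correlation between two administrations as the test-retest estimate. The choice of KR-20, the function names and all data are assumptions made for illustration; the guidelines do not prescribe a particular coefficient.

```python
from statistics import mean, pstdev, pvariance

def kr20(responses):
    """Kuder-Richardson 20 internal-consistency estimate for 0/1 item scores.

    responses: list of per-testee lists of item scores.
    Assumes at least two items and non-zero variance in total scores.
    """
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    pq_sum = 0.0
    for j in range(k):
        p = mean(row[j] for row in responses)  # proportion correct on item j
        pq_sum += p * (1 - p)                  # item variance for a 0/1 item
    return (k / (k - 1)) * (1 - pq_sum / var_total)

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))

# Hypothetical data: 4 testees x 3 items, and total scores of 4 testees
# on two administrations some weeks apart.
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
consistency = kr20(responses)
retest_r = pearson([30, 42, 55, 61], [33, 40, 57, 60])
```

The same `pearson` helper could be applied to each subtest separately as well as to the full test.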

The content validity of the test has been partly established at the time of planning the test, defining the domains and writing the items. The criterion-related validity indices may be obtained by correlating the total language proficiency score on this test with high school language examination marks, college language examination marks, teachers' ratings of students' language proficiency, etc. The construct validity of the test could be established through factorial designs and factor analytic procedures.

Once the final form of the test is available in print for the users, it may be necessary to develop parallel forms of the test or preferably to have item banks for each of the sub-skills calibrated with respect to their difficulty and discrimination indices. The items for parallel forms or item banks can be developed by following the guidelines suggested here. Item banks are preferred because they offer the advantage of flexibility in pulling out tests with varying test length, varying difficulty level and other important test characteristics.
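The flexibility of an item bank can be illustrated with a small sketch: each calibrated item carries its sub-skill, difficulty and discrimination, and a test of any desired length and difficulty level is pulled out by a simple selection rule. The `BankItem` fields, the `assemble_test` helper, the selection rule (closest to a target difficulty, ties broken by higher discrimination) and the sample bank are hypothetical and shown only to make the idea concrete.

```python
from dataclasses import dataclass

@dataclass
class BankItem:
    item_id: str
    subskill: str
    difficulty: float       # proportion correct in the calibration sample
    discrimination: float   # item-total correlation

def assemble_test(bank, subskill, length, target_difficulty):
    """Pick `length` items for one sub-skill, closest to the target difficulty.

    Ties in distance are broken in favour of higher discrimination.
    """
    pool = [it for it in bank if it.subskill == subskill]
    if len(pool) < length:
        raise ValueError("not enough calibrated items for this sub-skill")
    pool.sort(key=lambda it: (abs(it.difficulty - target_difficulty),
                              -it.discrimination))
    return pool[:length]

# Hypothetical calibrated bank.
bank = [
    BankItem("S1", "synonyms", 0.5, 0.3),
    BankItem("S2", "synonyms", 0.25, 0.5),
    BankItem("S3", "synonyms", 0.75, 0.6),
    BankItem("A1", "antonyms", 0.5, 0.4),
]
form = assemble_test(bank, "synonyms", length=2, target_difficulty=0.5)
```

Calling `assemble_test` with a different `length` or `target_difficulty` yields a shorter, longer, easier or harder form from the same bank, which is the advantage over a fixed parallel form.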

NOTES :

  1. *Compilation of proceedings: Dr. A.K. Srivastava is the Professor-cum-Deputy Director at the CIIL, Mysore-6. Dr. A.K. Mohanty is a Professor of Psychology at the Centre of Advanced Study in Psychology, Utkal University, Bhubaneswar. Dr. U.N. Dash is a Reader at the Centre of Advanced Study in Psychology, Utkal University, Bhubaneswar. Dr. T. Ramaswamy is the Joint Director, UPSC, Delhi.
  2. For each subtest of language proficiency, Table-1 presents the number of items to be written for preliminary tryout, the number of items to be retained for the final version of the test, and other specifications as would be helpful for preparation and development of the test.
  3. It is recognized that all aspects of the writing and composition skills cannot be tested through multiple choice type of questions. It may be necessary to have a second form to this subtest (Form B) to tap natural production of writing and composition skills through open-ended questions.
  4. As the number of items in the preliminary version will be approximately 300, the number of subjects in the tryout sample should be more than 300, and preferably one and a half times the number of items.
  5. The answer key for every sample question is shown in italics.



Copyright © 2008 Central Institute of Indian languages. All rights reserved worldwide.