Test analysis is one of the tools for finding out the students’ ability and the teacher’s success in teaching. Giving a test means evaluating the students. Through evaluation, the teacher can learn about the students’ progress and problems during or after the educational experience and can judge the efficiency of the methods and strategies used, because evaluation plays an important role in many aspects of the school program.
The test is given to the students on the basis of the stated objectives and the learning process. Analyzing the test has an impact on the process of education during the educational experience. A test is constructed to measure the students’ achievement in the teaching and learning activities. It cannot be ignored, because a test is an important way to get information about the educational program.
Specifically, testing grammar is very important for improving language skills. The first and most important problem in writing and speaking is the acquisition of grammar; thus, if someone has a good command of the rules of the language (grammar), he or she will find it easy to write and to communicate appropriately.
1. What Is to Be Analyzed
Test analysis is very important for measuring the students’ understanding in the learning process. Item analysis has several benefits. First, it provides useful information for class discussion of the test. For example, easy items can be skipped over or treated lightly, answers to difficult items can be explained more fully, and defective items can be pointed out to students rather than defended as fair. Second, item analysis provides data that help students improve their learning. The frequency with which each incorrect answer is chosen reveals common errors and misconceptions, which provide a focus for remedial work. Third, item analysis provides insights and skills that lead to the preparation of better tests in the future. The process helps us become more aware of defective items and how to correct them.
1.3 The Test and What Is to Be Analyzed
The test analyzed here is a multiple-choice test. The most common type of multiple-choice structure item presents a context in which one or more words are missing, followed by several alternative completions.
Another type does away with the item stem altogether and simply presents several sentences from which the examinee chooses the acceptable one.
The test analyzed in this paper is the Structure One test administered to second-semester English students at UNISMA.
The test has 32 (thirty-two) items. For the analysis, ten students are taken as the upper group and ten as the lower group. In this paper, the writer analyzes norm-referenced test items, which are designed to discriminate among students of different ability. The item analysis procedure provides three kinds of information.
(1) The index of item difficulty is estimated by adding the number of students in the upper group (U) and in the lower group (L) who selected the correct answer, dividing the sum by the total number of students in the two groups (10 + 10 = 20), and multiplying by 100%, as in the following formula:
Index of Item Difficulty (FV) = (U + L) / 20 × 100%
The interpretation of this computation is listed as follows:
Interpretation      FV (%)
Very Difficult      0 – 49
Difficult           50 – 54
Moderate            55 – 69
Easy                70 – 79
Very Easy           80 – 100
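As an illustration only, the computation and the bands above can be sketched in Python. The function names, the default group size of 10, and the sample counts (8 correct in the upper group, 4 in the lower group) are assumptions made for this example and are not taken from the test data. The listed bands leave small gaps (for example between 49 and 50); the sketch treats the lower bound of each band as the cutoff, which is also an assumption. With 20 students the index is always a multiple of 5, so the gaps never come into play.

```python
def difficulty_index(upper_correct, lower_correct, group_size=10):
    """Return the index of item difficulty (FV) as a percentage."""
    if not (0 <= upper_correct <= group_size and 0 <= lower_correct <= group_size):
        raise ValueError("correct counts must lie between 0 and group_size")
    total = 2 * group_size  # upper group + lower group
    return 100 * (upper_correct + lower_correct) / total


def interpret_difficulty(fv):
    """Map an FV percentage onto the bands listed above.

    Assumption: the lower bound of each band is used as the cutoff.
    """
    if fv >= 80:
        return "Very Easy"
    if fv >= 70:
        return "Easy"
    if fv >= 55:
        return "Moderate"
    if fv >= 50:
        return "Difficult"
    return "Very Difficult"


# Hypothetical item: 8 of 10 upper-group and 4 of 10 lower-group students are correct.
fv = difficulty_index(upper_correct=8, lower_correct=4)
print(fv, interpret_difficulty(fv))  # 60.0 Moderate
```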
(2) The item discriminating power is estimated by subtracting the number of students in the lower group who selected the right answer (L) from the number of students in the upper group who selected it (U), dividing the difference by one half of the total number of students included in the item analysis (1/2 T = 10), and multiplying by 100%, as in the following formula:
Index of Discriminating Power (D) = (U - L) / 10 × 100%
The interpretation of this computation, with D expressed as a proportion (for example, 40% = 0.40), is as follows:
Result of Computation    Item Discriminating Power    Item Quality
1.00                     Perfect                      Retained
0.99 – 0.71              High                         Retained
0.70 – 0.40              Moderate                     Retained
0.39 – 0.00              Low                          Revised
(Negative)               Poor                         Revised
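The same kind of sketch can be given for the discriminating power. It is a minimal illustration, not the procedure actually used for the analysis: the function names, the default group size of 10, and the sample counts are assumptions. The index is returned as a proportion so that it can be compared directly with the bands above, and the second value returned states whether the table keeps the item or marks it for revision.

```python
from typing import Tuple


def discrimination_index(upper_correct: int, lower_correct: int, group_size: int = 10) -> float:
    """Return D = (U - L) / (T / 2) as a proportion between -1.0 and 1.0."""
    if not (0 <= upper_correct <= group_size and 0 <= lower_correct <= group_size):
        raise ValueError("correct counts must lie between 0 and group_size")
    return (upper_correct - lower_correct) / group_size


def interpret_discrimination(d: float) -> Tuple[str, str]:
    """Return (discriminating power, action) following the bands listed above."""
    if d < 0:
        return ("Poor", "revise")
    if d < 0.40:
        return ("Low", "revise")
    if d <= 0.70:
        return ("Moderate", "keep")
    if d < 1.00:
        return ("High", "keep")
    return ("Perfect", "keep")


# Hypothetical item: 9 of 10 upper-group and 3 of 10 lower-group students are correct.
d = discrimination_index(upper_correct=9, lower_correct=3)
print(d, interpret_discrimination(d))  # 0.6 ('Moderate', 'keep')
```

A negative value means that more lower-group than upper-group students answered correctly, which is why the sketch checks for it first.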
(3) The effectiveness of the distracters is estimated through the following stages:
• For each item, count the number of students in the upper group who selected each alternative. Make the same count for the lower group.
• The interpretation of this count gives four possible results:
1. A distracter is considered effective when more students from the lower group than from the upper group are distracted.
2. A distracter is considered ineffective when more students from the upper group than from the lower group are distracted.
3. A distracter is considered ineffective when the number of distracted students in the upper and lower groups is the same.
4. A distracter is considered poor when no students from either the upper or the lower group are distracted.
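The counting stages and the four results above can be sketched as follows. This is an illustration under stated assumptions: the function name, the four alternatives A to D, and the sample answers are hypothetical. Because a distracter chosen by nobody also has equal counts in both groups, the sketch checks result 4 before result 3.

```python
from collections import Counter


def distracter_effectiveness(upper_choices, lower_choices, key, options="ABCD"):
    """Classify every wrong alternative of one item by the four results above.

    upper_choices / lower_choices: the alternative chosen by each student.
    key: the correct alternative. options: assumed set of alternatives.
    """
    upper = Counter(upper_choices)
    lower = Counter(lower_choices)
    result = {}
    for option in options:
        if option == key:
            continue  # the correct answer is not a distracter
        n_upper, n_lower = upper[option], lower[option]
        if n_upper == 0 and n_lower == 0:
            result[option] = "poor"          # result 4: nobody is distracted
        elif n_lower > n_upper:
            result[option] = "effective"     # result 1
        else:
            result[option] = "ineffective"   # results 2 and 3
    return result


# Hypothetical answers of ten upper-group and ten lower-group students; the key is A.
upper_answers = list("AAAAAAABBC")
lower_answers = list("AAAABBCCCC")
print(distracter_effectiveness(upper_answers, lower_answers, key="A"))
# {'B': 'ineffective', 'C': 'effective', 'D': 'poor'}
```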
For the testing of grammar, test writers sometimes use the device of the scrambled sentence, in which the examinee rearranges a jumbled series of elements so as to form an acceptable sentence.
The multiple-choice test is an objective test, and most schools and other educational institutions conduct this kind of test. It is popular and has become dominant because its items can be used to measure students’ progress in knowledge or their ability to engage in higher-level thinking.
The validity of a test is the extent to which it measures what it is supposed to measure and nothing else. Every test, whether it is a short, informal classroom test or a public examination, should be as valid as the constructor can make it. The test must aim to provide a true measure of the particular skill which it is intended to measure: to the extent that it measures external knowledge and other skills at the same time, it will not be a valid test.
The American Psychological Association (1985, in Bachman, 1990:237) suggests that validity is a unitary concept. Although evidence may be accumulated in many ways, validity always refers to the degree to which that evidence supports the inferences that are made from the scores. The inferences regarding specific uses of a test are validated, not the test itself.
Content validity should be examined while a test is being developed; it should not wait until the test is already being used.
The focus of content validity, then, is on the adequacy of the sample and not simply on the appearance of the test. The test is examined to determine the subject matter content covered and the responses students are intended to make to the content, and this is compared with the domain of achievement to be measured. Content validity requires that the ability being measured be appropriate to the kind of test being used. This appropriateness concerns the kind of ability actually required to do the test more than the kind of ability the measurement aims at.
A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.
Content validity is important for two reasons. First, the greater a test’s content validity, the more likely it is to be an accurate measure of what it is supposed to measure. Second, a test that lacks content validity is likely to have a harmful backwash effect: areas that are not tested are likely to become areas ignored in teaching and learning (Hughes, 2003:26).
Hughes (2003:27) argues that content validity is important. To be valid, a test must accurately measure what it is supposed to measure, and to be accurate its content must be representative. Tests that are not representative tend to have a harmful backwash effect. Sometimes the content of a test is determined by what is easy to test rather than by what is important to test. The solution is to write a test specification and to ensure that the test content is a fair reflection of it.