Reliability Coefficient
Reliability refers to the amount of confidence you can have in a test score, a concept that Chapter 2, “The Language of Assessment,” and Chapter 10, “Establishing Evidence of Reliability and Validity,” discuss at length. The reliability coefficient for our sample in Table 11.1 is reported as alpha, and its value is 0.754.
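For readers who want to see how an alpha value like this is obtained, the following is a minimal Python sketch of Cronbach's alpha, assuming item scores are available as one row per examinee. The function name `cronbach_alpha` and the tiny data set are illustrative assumptions, not the data behind Table 11.1. When every item is scored 0 or 1, this calculation gives the same result as KR-20.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a table of item scores.

    scores: list of rows, one per examinee; each row lists that
    examinee's score on every item (for example 0 = wrong, 1 = right).
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every examinee needs a score for every item")

    # Variance of each item across examinees.
    item_variances = [
        pvariance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Variance of the examinees' total scores.
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four examinees, three dichotomous items.
demo_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
demo_alpha = cronbach_alpha(demo_scores)
print(round(demo_alpha, 3))  # prints 0.75
```

Population variances are used for both the items and the totals; sample variances would give the same alpha because the correction factor cancels in the ratio.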
What reliability coefficient should you expect from the results of your classroom exams? The answer to this question varies, but it relates directly to the level of confidence you must have in the decisions made based on the test results. High-stakes decisions require measurement results with high reliability. In other words, the results of a test that decides whether or not a student graduates from a program of study would require a high level of reliability. For this reason, you should never base such a serious decision on just one classroom exam.
Miller, Linn, and Gronlund (2009) agree that the degree of reliability you must require for the results of a classroom test depends largely on the decision to be made based on the test results. Consider the importance of the decision and whether the decision can be reversed. If the reliability coefficient of a test’s results is low, make sure that you make tentative decisions; obtain additional data; and, most important, are willing to reverse your decision.
Miller et al. (2009) report that the reliability coefficients of teacher-made tests usually vary between 0.60 and 0.85. Kehoe (1995) maintains that the results of tests of more than 50 items should have reliability coefficients greater than 0.80. Frisbie (1988) asserts that teacher-made test results should yield reliability coefficients that average about 0.50, and that 0.85 is the generally acceptable minimum reliability standard when decisions are being made about individuals based on a single test score. Frisbie also states that reliability coefficients of about 0.50 for the results of teacher-made tests can be tolerated when the scores are combined with other scores to assign a grade. In that case, you should be concerned with the reliability of the combined score rather than with that of any single test.
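One way to estimate the reliability of a combined score is Mosier's formula for a composite, sketched below in Python. This is an illustration rather than a procedure taken from the sources cited above: it assumes the component scores are simply added with equal weights, that their measurement errors are uncorrelated, and that you know each component's reliability and the covariances among the components. The function name and all of the numbers are hypothetical.

```python
def composite_reliability(covariances, reliabilities):
    """Estimate the reliability of an equally weighted sum of scores.

    covariances: square matrix (list of lists); the diagonal holds each
        component's score variance, the off-diagonal cells hold the
        covariances between components.
    reliabilities: one reliability coefficient per component.
    """
    n = len(covariances)
    if len(reliabilities) != n or any(len(row) != n for row in covariances):
        raise ValueError("matrix and reliabilities must have matching sizes")

    # Variance of the summed score is the sum of every cell in the matrix.
    composite_variance = sum(
        covariances[i][j] for i in range(n) for j in range(n)
    )
    if composite_variance <= 0:
        raise ValueError("composite variance must be positive")

    # Error variance of each component is variance * (1 - reliability);
    # with uncorrelated errors these simply add up.
    error_variance = sum(
        covariances[i][i] * (1 - reliabilities[i]) for i in range(n)
    )
    return 1 - error_variance / composite_variance


# Hypothetical course: four exams, each with score variance 100,
# reliability 0.50, and covariance 50 with every other exam.
demo_cov = [[100 if i == j else 50 for j in range(4)] for i in range(4)]
demo_composite = composite_reliability(demo_cov, [0.50] * 4)
print(round(demo_composite, 3))  # prints 0.8
```

In this made-up case, four scores that each have a reliability of 0.50 combine into a total with an estimated reliability of about 0.80, which illustrates why modest reliability on individual classroom tests can be tolerable when grades rest on several scores.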
Our sample’s reliability coefficient of 0.754 looks respectable at first glance, according to these standards. This value should not be considered in isolation, however. The factors that affect the reliability coefficient of a test must be taken into account. These factors are discussed in detail in Chapter 10, “Establishing Evidence of Reliability and Validity,” and include the following:
Quality of the test items
Item difficulty
Item discrimination
Homogeneity of the test content
Homogeneity of the test group
Test length
Number of examinees
Speed
Test design, administration, and scoring
When reviewing the reliability coefficients of your classroom test results, consider all these factors. If you have a class that consists of a homogeneous group of high-achieving students, you might get a low reliability coefficient on a test of difficult, well-written, heterogeneous items that follow all the guidelines outlined in this text. It is also possible that a low reliability coefficient indicates that the items are either too difficult or too easy for the group of students. On the other hand, you could obtain a high reliability coefficient for a speeded test with a large number of items on narrowly defined content that is administered to a large heterogeneous group of students. Also remember that the testing conditions, quality of teaching, and number of questions and/or examinees are all factors that can affect the reliability of test scores. Low reliability coefficients are most often due to an excess of very easy or very hard items, poorly written items that do not discriminate, or test items that do not represent a unified body of content (Kehoe, 1995).
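The effect of test length can be made concrete with the Spearman-Brown prophecy formula, which is not presented in this section but is a standard way to project reliability when a test is lengthened or shortened. The Python sketch below assumes that any added items are comparable in quality and content to the existing ones; the function names are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Project reliability after changing test length.

    length_factor: new length divided by current length
    (2 means the test is doubled, 0.5 means it is halved).
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (
        1 + (length_factor - 1) * reliability
    )


def length_factor_needed(reliability, target):
    """Return the length factor projected to reach a target reliability."""
    if not (0 < reliability < 1 and 0 < target < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return (target * (1 - reliability)) / (reliability * (1 - target))


# Starting from the sample's alpha of 0.754:
doubled = spearman_brown(0.754, 2)
print(round(doubled, 2))  # prints 0.86

# How much longer would the test need to be to reach 0.85?
factor = length_factor_needed(0.754, 0.85)
print(round(factor, 2))  # roughly 1.85 times the current length
```

The projection is only as good as its assumption: adding poorly written or off-topic items will not raise reliability the way the formula predicts.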
You must consider all influencing factors when interpreting a reliability coefficient for the results of a test. A test that has a low reliability coefficient could be providing reliable results. Your judgment is a very important part of the equation. As Mark Twain said, “There are three kinds of lies: lies, damned lies, and statistics.” Statistical findings are meaningless in themselves, and they can be distorted to fit erroneous interpretations. It is your informed interpretation of the data that adds the ingredient of fairness to your grade assignments. Refer to Chapter 10, “Establishing Evidence of Reliability and Validity,” for a detailed discussion related to reliability estimates of classroom exams.