Psychology of Assessment - Week 4
Reliability
Definition
- Reliability refers to measurement with
little error
- Reflects the accuracy and stability of a
test score
- Can be measured statistically using the
correlation coefficient
Measurement Issues
- Tests attempt to measure some psychological
trait, using a "rubber yardstick" which bends and stretches around
sources of error
- Goal: Find someone's true score on
some measure and reduce measurement error
- The standard error of measurement tells us
how much, on average, a person's score varies from the true
score
Determining Reliability
Test-Retest
- Determines if scores vary over time by
testing a person at two points in time, and correlating these
scores
- Problem: Practice effects influence the
scores
- As a result, this type of procedure tends to
overestimate reliability
- The test-retest correlation should be about
.90 (.80 would be acceptable)
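As a rough illustration of the test-retest procedure, the sketch below correlates scores from two testing sessions. The scores and the helper name `pearson_r` are made up for this example and are not from the course material.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for five people tested at two points in time
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]

r_test_retest = pearson_r(time1, time2)
print(round(r_test_retest, 3))
```

With these invented scores the correlation is well above .90, so the test would meet the guideline above.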
Determining Reliability
Alternate Form
- Uses two different forms of the same test.
Both forms are given to everyone in the same session in a
counterbalanced fashion
- If the forms are not given on the same day, the
procedure is called delayed alternate forms
- The two scores for each person are
correlated for the reliability estimate
Determining Reliability
Split-Half
- The whole test is given at one time, then
scores on one half of the test are correlated with scores on the
other half (e.g., even and odd items)
- Use the Spearman-Brown formula for this
estimate (r_nn). It intentionally adjusts the correlation
coefficient upward because each half is based on fewer items (half
the original), so the raw half-test correlation underestimates the
reliability of the full-length test
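The split-half steps above can be sketched as follows. This is a minimal illustration that assumes an odd/even split and the standard Spearman-Brown formula r_full = 2r / (1 + r). The response data and the function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def spearman_brown(r, factor=2.0):
    """Projected reliability when test length is multiplied by `factor`."""
    return factor * r / (1 + (factor - 1) * r)

def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per person."""
    odd_totals = [sum(person[0::2]) for person in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(person[1::2]) for person in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    # Step the half-test correlation up to the full test length
    return r_half, spearman_brown(r_half)

# Hypothetical right/wrong (1/0) answers: 5 people x 6 items
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

r_half, r_full = split_half_reliability(responses)
print(round(r_half, 3), round(r_full, 3))
```

The corrected value is always larger than a positive half-test correlation, which is the upward adjustment described in the notes.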
Determining Reliability
Kuder-Richardson
- A statistical measure of internal
consistency, this formula estimates what the average reliability
would be if you did all possible split-half
administrations
- Only works well when a test is fairly
single-minded (homogeneous), i.e., its items all measure one trait
- A special type is the KR-20, which does the
same thing for tests with dichotomous items (e.g., items scored
right or wrong)
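For a concrete feel for KR-20, here is a minimal sketch using the usual textbook form, KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores). It assumes the population variance (dividing by N) for the total scores; some texts divide by N - 1 instead. The response data and the function name `kr20` are hypothetical.

```python
def kr20(item_scores):
    """KR-20 for 0/1 item scores; one list of item scores per person."""
    n_people = len(item_scores)
    k = len(item_scores[0])  # number of items

    # p = proportion of people passing each item, q = 1 - p
    p = [sum(person[j] for person in item_scores) / n_people for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)

    # Population variance of the total scores (assumption: divide by N)
    totals = [sum(person) for person in item_scores]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical right/wrong (1/0) answers: 5 people x 6 items
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

print(round(kr20(responses), 3))
```

For this made-up data the sum of p*q is 1.2 and the total-score variance is 3.2, so the estimate works out to 0.75.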
Standard Error of Measurement
- A statistic that expresses
reliability in the units of the test score
- Tells us what influence reliability has on
the interpretation of test scores
- The smaller the error, the higher the
reliability
- $\sigma_{\text{meas}} = SD\sqrt{1 - R}$
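A short sketch of the standard error of measurement, SD times the square root of (1 - R), where SD is the standard deviation of the test scores and R is the reliability coefficient. The SD of 15 and the two reliability values are invented for illustration, as is the function name.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - R), with R between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)

# Hypothetical test with SD = 15, at two levels of reliability
sem_high = standard_error_of_measurement(15, 0.91)
sem_low = standard_error_of_measurement(15, 0.75)

print(round(sem_high, 2), round(sem_low, 2))
```

The more reliable version of the test has the smaller standard error (about 4.5 points versus 7.5 points here), so an observed score on it can be interpreted with a narrower margin around the true score.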
Factors that Affect Reliability
- Number of items - adding items will
increase reliability because you are sampling a
larger segment of the domain
- The Spearman-Brown formula can be used to
estimate how many items you will need to reach the desired level
of reliability
- Good vs. Bad items - do an item analysis,
which is basically a correlation between any one item and the
total score on the test, or how they load on a factor associated
with the test
- Attenuation - the amount of decrease in the
correlation due to measurement error
- To estimate the correlation without any
measurement error (i.e., the true correlation between the
variables), use a statistic called the correction for
attenuation
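Two of the points above lend themselves to a quick calculation. The sketch below assumes the standard forms of both formulas: the Spearman-Brown formula solved for the lengthening factor, n = R_desired(1 - R_current) / (R_current(1 - R_desired)), and the correction for attenuation, r_true = r_xy / sqrt(r_xx * r_yy). All numbers and function names are hypothetical.

```python
from math import ceil, sqrt

def items_needed(current_items, current_r, desired_r):
    """Spearman-Brown: number of comparable items needed to reach desired_r."""
    factor = (desired_r * (1 - current_r)) / (current_r * (1 - desired_r))
    return ceil(current_items * factor)

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between x and y if both were measured without error."""
    return r_xy / sqrt(r_xx * r_yy)

# Hypothetical: a 20-item test with reliability .70, aiming for .90
print(items_needed(20, 0.70, 0.90))

# Hypothetical: observed correlation .40, reliabilities .80 and .50
print(round(correct_for_attenuation(0.40, 0.80, 0.50), 3))
```

In the first example the test would need to be nearly four times as long, which assumes the added items are of the same quality as the existing ones. In the second, the corrected correlation (about .63) is noticeably higher than the observed .40, showing how much measurement error attenuated it.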