A Criterion-Referenced Test is NOT a Mastery Test
This week’s entry is inspired by a figure in a widely publicized book by Sharon A. Shrock and William Coscarelli called Criterion-Referenced Test Development: Technical and Legal Guidelines for Corporate Training. I have not finished reading this book, but I find the authors’ perspective to be an interesting departure from the traditional testing paradigms (which equals 20 cents for those of you keeping score). I would like to thank them for stretching my mind and inspiring me to think differently about testing in corporate settings.
I would like to take this opportunity to continue the stretching by expanding on a topic they discuss in chapter two, which defines both criterion-referenced testing (CRT) and norm-referenced testing (NRT). A norm-referenced test (NRT) is one in which test outcomes (e.g., grades or pass/fail) are determined by each examinee’s score relative to the other examinees. Although this practice is uncommon (and arguably unethical), it is occasionally still used today. For example, some of the state Bar exams are norm-referenced. Typically, the top X percent of examinees are awarded a passing mark, regardless of how competent or incompetent the group that took the exam together was. In other words, if a prospective lawyer were to take an exam along with the most competent group of graduates, then (s)he would have less chance of achieving a passing mark than if (s)he took the exam alongside a group of bottom feeders. Does it surprise you that the legal profession would endorse something outdated, scientifically unsupported and arguably unethical?
On the other hand, a criterion-referenced test (CRT) is a test composed of specific objectives, or competency statements. This type of test is common in licensure and certification. The passing rates for CRTs vary with each test cycle since examinees are evaluated based on their competency relative to a criterion-referenced passing standard (aka cutscore). There are many other attributes of these two types of tests beyond their scoring methodology, and I’ll leave it up to future posts to expand upon these.
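To make the difference in scoring methodology concrete, here is a minimal sketch in Python. The function names, the cohorts, the 50 percent pass quota and the cut score of 80 are all made up for illustration; they do not come from the book or from any real exam. The same examinee (Cal, with a score of 85) sits the test with two different groups.

```python
def norm_referenced_pass(scores, top_fraction):
    """Pass the top `top_fraction` of examinees, whatever their raw scores.

    Hypothetical helper: ties at the boundary are ignored for simplicity.
    """
    n_pass = round(len(scores) * top_fraction)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {name for name, _ in ranked[:n_pass]}


def criterion_referenced_pass(scores, cut_score):
    """Pass every examinee whose score meets or exceeds the cut score."""
    return {name for name, score in scores.items() if score >= cut_score}


# Made-up cohorts: Cal scores 85 in both, only the company changes.
strong_cohort = {"Ann": 92, "Ben": 88, "Cal": 85, "Dee": 81}
weak_cohort = {"Cal": 85, "Eve": 64, "Fay": 58, "Hal": 49}

for label, cohort in (("strong", strong_cohort), ("weak", weak_cohort)):
    nrt = norm_referenced_pass(cohort, top_fraction=0.5)
    crt = criterion_referenced_pass(cohort, cut_score=80)
    print(label, "NRT:", sorted(nrt), "CRT:", sorted(crt))
```

Under the norm-referenced rule Cal fails in the strong group and passes in the weak one, even though the score is identical. Under the criterion-referenced rule Cal passes both times, and it is the passing rate that moves from one group to the next.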
One other type of test that Shrock and Coscarelli refer to is a mastery test, a test where most examinees answer the vast majority of the content correctly. K-12 classroom tests are commonly designed this way. The distribution of scores for a mastery test looks similar to this (Insert distribution). I think that it is important to point out that mastery tests are a form of criterion-referenced test. In other words, every mastery test is a criterion-referenced test, but a criterion-referenced test is NOT necessarily a mastery test. See below for a visual representation of this.
So, what do we call a non-mastery CRT? To be honest, I don’t know. I have heard people refer to them as non-mastery tests or non-mastery, criterion-referenced tests.
Mastery tests are useful in the corporate training world, where the content domains are small (typically measured in class hours) and the shelf life of the training programs and tests is generally short (measured in months or years). However, they are NOT optimal for certification (corporate or non-corporate).
Why should a corporation build a non-mastery, criterion-referenced test? There are two primary reasons.
- If constructed properly, non-mastery, criterion-referenced tests provide more information than a simple pass/fail result. Non-mastery, criterion-referenced tests are competency measurement instruments. Just as a ruler measures the length of an object, a non-mastery, criterion-referenced test can measure the competency of an individual. This ruler can be used to measure the competency of individuals or the difficulty of the test questions, either of which can provide valuable feedback to the training program or corporation.
- When the level of mastery changes, it is much easier to change the level of competency required to achieve mastery than it is to write new content or a whole new exam.
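Both reasons can be sketched in a few lines of Python. Everything here is hypothetical: the `score_report` helper, the scaled scores and the two cut scores are invented for illustration, and a real program would set its cut score through a standard-setting study rather than picking a number. The point is only that the same measurements can be reported as distances from the standard, and re-read against a new standard, without touching the exam itself.

```python
def score_report(scaled_scores, cut_score):
    """Return each examinee's decision and margin relative to the cut score.

    The margin (score minus cut score) is the extra information a simple
    pass/fail result throws away.
    """
    report = {}
    for name, score in scaled_scores.items():
        decision = "pass" if score >= cut_score else "fail"
        report[name] = (decision, score - cut_score)
    return report


# Made-up scaled scores from one administration of the same exam.
scores = {"Ann": 540, "Ben": 505, "Cal": 480, "Dee": 450}

original = score_report(scores, cut_score=475)  # current standard
raised = score_report(scores, cut_score=500)    # standard after the job changes

for name in scores:
    print(name, original[name], raised[name])
```

Raising the standard changes Cal's outcome from pass to fail, yet no item was rewritten and no examinee was retested. With a mastery test, where nearly everyone scores near the top, there is little room on the ruler to move the standard in this way.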
May 16th, 2009 at 6:59 am
Their definition of mastery test just sounds like a criterion-referenced test with too many easy items. CAT to the rescue!
Matt
May 18th, 2009 at 7:46 am
Thanks BB — you continue to enlighten us all.
… would it be accurate to say that a mastery test attempts to be a more comprehensive evaluation of subject matter in a specific area, with a cut-score tied to gross “mastery” of the content area (e.g., specified by some % between 70-100%); and conversely that a CRT is prioritized by desired competencies within the subject matter (i.e., sampled), and the cut-score operates as a professional judgement about the attainment of those competencies?
Just wondering…bob
May 18th, 2009 at 4:07 pm
Yes, a Shrock and Coscarelli Mastery Test does contain a lot of easy items relative to the average examinee (which is in line with the common use of the term Mastery Test). I agree with you, Matt, that this would be less than optimal for certification exams, where content domains are large. But in corporate training, where content domains are small, I would agree with Sharon and Bill that we should design the training and exam content around specific and clearly definable criteria. When executed properly, this should yield a mastery test distribution. The difference between the two scenarios is the size of the content domain and the need to sample a test taker’s performance from that domain. Bob Hunt touched on that in his comment. I would encourage Bob to use a different term than CRT, though. As I stated in the initial post, we really don’t have a good term in the industry, but this is really a non-mastery CRT.
October 15th, 2009 at 12:51 pm
I just found this thread and wanted to make a few comments: 1. The criteria in criterion should be objective and, in a corporate world, job based. 2. Mastery comes in when you expect/want most to succeed–like training an airline pilot. The score for mastery can be determined in a systematic manner using any or all of the three general approaches to standard setting–rarely done, I might add, in most companies and, I think, all schools. The tendency is to rely on passing scores that evolve from somewhere in our school days–70% for me at Central Catholic, but way too low for the local surgeon, I think. AND the organizations that seek mastery usually rely on systems approaches to the design of training, e.g. Gagne’s Events of Instruction.
Thanks for the interest!!