If We Must Use Grades, Let's Make Them Reliable
by Dr. Kathie F. Nunley
My daughter brought home her report card this week. It wasn't a
bad report card, but it wasn't the one most parents dream about
- the one with all A’s running down the column for 2nd
quarter. Hers was a bit of a mix.
"How
can you keep getting a B in Spanish?" was my first exclamation.
Yes,
now that I’ve had time to ponder my reaction, the teacher
and psychologist in me realizes I should have first commented
on all the courses that she excelled in and how proud I was
of the effort she was making, etc., etc. before pouncing on
her for what I considered the negative aspects of the report
card. But sometimes my Mother brain section overrides my teacher/psychologist
brain section.
"Mom,
I'm trying, really!" came her standard reply.
"Kahlia,
Spanish is your easiest class. There’s no reason not
to be making an A in there. Why do you keep getting B's? I
thought we discussed all this last term?"
"Mom, be glad I’m not like Gina" (her best friend). "She
got a C and she speaks Spanish fluently. She
never does anything in that class. The only reason she is
even passing is that she speaks it fluently, so she aces all
the tests. I just forget to do my journals
and that brings my grade down."
"I see you got an A in World Studies. That's great. I
appreciate the effort you must have put forth for that."
(OK, the reasoning, empathetic psychologist brain kicked
in here).
My daughter's reply: "Well, not really. It’s an
easy class. I do all the projects and they count for most
of the grade. But, hey, I got the A."
Grades.
They are the end result of a student's journey through a class.
But they are more than just a mark on a report card. Grades
are the liaison between schools and the American public. Grades
are the measure by which parents and the community outside our
walls assume we are doing our job inside the walls.
I contend, though, that the subjectivity of our grading system
is not just a small flaw in our educational system; it is a
gaping wound, oozing forth most of the pus which shocks and
disappoints the public as they view their public schools. The
community and political arenas assume that school grades are
valid and reliable predictors of learning and ability. And the
dark secret we educators never talk about is that they really
are neither reliable nor valid. Would anyone like to defend
our educational grading system to the American public?
Validity
One
of the first things you learn in statistics is that a measuring
device has no intrinsic validity. The validity is determined
by its use. In other words, an ACT test alone carries no validity.
The validity comes when you look at what it is used to measure
or predict. Is the ACT a valid predictor of intelligence? Is
it a valid predictor of college success? Is the ACT a predictor
of your ability to drive a car? Is it valid in terms of how
well you can raise pigs? You can see that the validity for the
ACT would vary wildly in these situations. The ACT may be somewhat
valid in its ability to predict college success, but not valid
at all in its ability to predict successful pig farming.
Yet,
the American public assumes that school grades are a valid predictor
of learning and ability. They mistakenly believe that grades
measure these things somewhat accurately. What disappoints them
most is that they see so many blatant examples that indicate
otherwise. They see too many students making good, or at least
passing grades who actually appear to have learned very little.
They see too many students who have the ability to learn and
do complex thinking who have failed classes and dropped out
of school.
Reliability
This
is not to say that students who make high marks or grades in
school are not learning anything or are not gifted young people.
Many, if not most, are. But the system is not reliable.
Reliability is the other measure of a test. Something is reliable if it is
consistent. A test is reliable if it gives you a fairly similar
score each time. We can view reliability with some simple examples.
If I put a full bag of flour on my scale every day for a week
and it always weighs 9 pounds, then I declare my scale to be
reliable. If I’m weighing a 10-pound bag of flour, I may
not find my scale very valid, but it is, nevertheless, reliable.
(To be a valid tool for measuring flour, the scale would have
to measure it at its true weight of 10 pounds.) I use this same
thinking with my kitchen oven. It is reliable, but not valid.
It consistently heats 25 degrees hotter than I set it. Because
it is consistent, it is not a problem as I know to just set
the temperature 25 degrees lower than I want it.
You
can easily see how you can have reliability without validity,
but you cannot have validity without reliability. If my scale
weighs the flour at 7 pounds one day, 9 pounds the next day,
10 pounds on day three and 8 pounds on day four, it is not reliable,
nor could we consider it a valid tool for measuring bags of
flour. So if you don’t have reliability, then you surely
cannot achieve validity.
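The flour-scale example can be put into numbers. What follows is only a rough sketch, not a formal psychometric procedure: the function name `describe_scale` and the half-pound `tolerance` are assumptions made for illustration, while the readings and the 10-pound true weight come from the examples above.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 10.0  # pounds; the bag's actual weight in the example


def describe_scale(readings, true_value=TRUE_WEIGHT, tolerance=0.5):
    """Summarize repeated readings of the same bag of flour.

    tolerance is an assumed cutoff (in pounds) chosen for illustration.
    """
    spread = pstdev(readings)           # how much the readings disagree
    bias = mean(readings) - true_value  # how far off they are on average
    reliable = spread <= tolerance
    # A scale can only be valid if it is reliable AND close to the truth.
    valid = reliable and abs(bias) <= tolerance
    return {"spread": spread, "bias": bias,
            "reliable": reliable, "valid": valid}


steady_scale = [9, 9, 9, 9, 9, 9, 9]  # 9 pounds every day for a week
erratic_scale = [7, 9, 10, 8]         # a different answer each day

print(describe_scale(steady_scale))   # reliable, but not valid
print(describe_scale(erratic_scale))  # neither reliable nor valid
```

The steady scale has no spread and a constant one-pound bias, so, like the oven that runs 25 degrees hot, it can be corrected by a fixed adjustment. The erratic scale offers nothing to correct against.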
So
which is more important, reliability or validity? Perhaps reliability
is, because without that, you lose both.
Are
grades in school reliable predictors of learning and ability?
Does an “A” always mean a student has learned more
than could be expected of most students and has an ability in
the top sector of his or her school? Always? The vast majority
of the time? Most of the time? Sometimes? Maybe?
Take
any department at your school and pull out the students who
achieved an “A” this term on their grade report.
Now pull out those students who received a “C” on
their grade report. Can you put all the students with “A's”
in one group and know for sure, without exception, that they
are the brightest of the bright? They all learned more than
any of the other students in the English department or the Science
or Math department? Can you say with certainty that those “C”
students, without exception, are of lesser quality? Are you
assured that they actually know less, learned less and have
a lesser ability than any of the “A” students? It
would be the rare school that could make this claim. For this
is the dark secret that we hesitate to share with the public.
Our grading system seriously lacks reliability and without that
we cannot even hope for validity.
Currently
the assigning of grades is often helter-skelter within departments
and between schools and can often be a bit of luck-of-the-draw.
A student can barely pass the exams in Mr. Jones’ geometry
class yet do absolutely all the assigned class and homework
(possibly by questionable means) and loads of available extra
credit. Because Mr. Jones weights tests
rather lightly compared to classwork and always offers lots
of extra credit and make-up work, this student can end
up with a “B” in geometry. If, however, this same
student just happened to have been scheduled into Ms. Richards’
class, he would have earned a “D”, as Ms. Richards
offers no extra credit and heavily weights exams. Mr. Jones
and Ms. Richards teach the same subject in the same school, which
issues the same transcripts to the same universities that use
high school grades as one of the factors in acceptance.
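To see how the same work can produce two different letters, consider this small sketch. Every number in it is an assumption chosen for illustration: the weights, the extra-credit caps, the student's scores and the 90/80/70/60 cutoffs are hypothetical, not taken from any real classroom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GradingPolicy:
    classwork_weight: int  # percent of the grade from class and homework
    exam_weight: int       # percent of the grade from exams
    extra_credit_cap: int  # most bonus points the teacher will award


def final_score(policy, classwork, exams, extra_credit):
    """Weighted average of classwork and exams, plus capped extra credit."""
    base = (policy.classwork_weight * classwork
            + policy.exam_weight * exams) / 100
    bonus = min(extra_credit, policy.extra_credit_cap)
    return min(base + bonus, 100)


def letter(score):
    """Map a percentage to a letter using assumed 90/80/70/60 cutoffs."""
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return grade
    return "F"


# Hypothetical policies: Jones favors classwork, Richards favors exams.
jones = GradingPolicy(classwork_weight=60, exam_weight=40, extra_credit_cap=3)
richards = GradingPolicy(classwork_weight=20, exam_weight=80, extra_credit_cap=0)

# One student: all classwork done, exams barely passed, extra credit turned in.
student = {"classwork": 100, "exams": 60, "extra_credit": 5}

print("Jones:", letter(final_score(jones, **student)))        # B
print("Richards:", letter(final_score(richards, **student)))  # D
```

Nothing about the student changes between the two lines of output. Only the teacher's policy does.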
The
system has been steeped in subjectivity for so long that it
will be a lot of work to change it, but certainly not impossible.
We need to start by being honest with ourselves, the public
and the politicians about our grading scheme. The system we
use to publicly declare a young person's success in a course
is extremely subjective and varies widely among teachers, departments,
schools and districts.
Once
we acknowledge this, we can start to address the problem and
look for solutions. We must come up with some type of standardization
within our schools for evaluating student performance and then
form an operational definition for grades that can be shared
with the public. Our goal is to produce a “key”
to the grades on a report card. Can you operationally define
what an A means at your school? Can you take steps toward improving
the reliability of that definition? Given the fact that grades
are the most important interface we have with our constituents
(students, parents, school boards, colleges), it is critical
that we look for ways to make them highly reliable.
Several
years ago, I started to address a solution to this issue with
the Layered Curriculum model of high school instruction. One
of the key components to Layered Curriculum is that student
grades are indicators of the depth or level of study rather
than subjective marks determined by individual teachers. Layered
Curriculum classrooms divide the study of a subject into 3 layers,
based on Bloom’s taxonomy: basic knowledge, application
of new knowledge to previous knowledge, and critical leadership
evaluation in that topic. Grades now are attached to those layers
as such:
C:
This student has added to their bank of general knowledge to
a level deemed acceptable by the teacher. (Standards may be established
through departments as to demonstrated recall.)
B:
This student added to their bank of general knowledge as above,
plus demonstrated his or her ability to apply that knowledge
in a different field or compare it to a different arena. The
student demonstrated an ability to use and manipulate the new
knowledge in addition to storing it for recall.
A:
This student added to their general knowledge bank, and applied
or demonstrated use of that knowledge as above plus was able
to critically evaluate an issue in the real world which required
their ability to combine knowledge with ethics, values, morality
and/or sense of global responsibility.
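Because each layer builds on the one below it, the key to these grades can be written down as a simple rule. The sketch below is one possible reading of the three definitions above; the function name `layered_grade` is hypothetical, and since the definitions do not say what happens when the C layer is not met, that case is left undefined here.

```python
def layered_grade(met_c_layer, met_b_layer, met_a_layer):
    """Return the letter grade implied by the layers a student has mastered.

    Layers are cumulative: B-layer work only counts on top of the C layer,
    and A-layer work only counts on top of both.
    """
    if not met_c_layer:
        return None  # below C is not defined by the scheme described above
    if not met_b_layer:
        return "C"   # general knowledge only
    if not met_a_layer:
        return "B"   # knowledge plus application
    return "A"       # knowledge, application and critical evaluation


print(layered_grade(True, False, False))  # C
print(layered_grade(True, True, False))   # B
print(layered_grade(True, True, True))    # A
```

Whoever reads the transcript can run the rule backwards: a "B" always means the first two layers were met and the third was not.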
The idea of this grading scheme is to operationally define what
a grade means by requiring a particular thought process at each
layer. Student grades are determined by the complexity of thinking,
not just rote knowledge and recall. Now there is some standardization
to grades and a way for them to be consistently interpreted
by parents, institutions and businesses outside of our secondary
school system.
Is this a valid measure of learning? That all depends on whether you
agree that more complex thinking is an indicator of learning
and ability and is to be valued. It may be valid if we in fact
judge learning by a student's ability to use or generalize new
knowledge to other areas and by their ability to debate serious
topics and form opinions and make decisions as a leader or adult
voter.
You
may feel that there are better indicators of learning or at
least additional indicators, and you may be correct. But what
is most important here is not whether or not it is valid, but
that it could at least be reliable. Once we get reliable, then
we can start to tackle the next step, validity. But we have
to start with something reliable.
The
current system is seriously flawed. We must start the repair
by starting with the reliability issue of grades. Find some
operational definition for grades within each department, ideally,
within each school, and then build your teaching instruction
around those definitions.
If
you want to use the simple Layered Curriculum model, just start
by having teachers break down each instructional unit into basic
knowledge, application/manipulation, and critical debate issues.
For their C layer, teachers decide what basic information students
need to learn. How can they measure that? What standards will
they use to determine successful completion of that C layer?
For their B layer, teachers decide what types of assignments or
assignment choices they can offer to allow students to play
around with that new learning. Find ways to have students connect
new learning to previous knowledge. Some interdisciplinary activities
would work very well in this layer, as do projects, displays and
problem-solving labs. Teachers need to establish the criteria or standards
for determining mastery of this B layer.
Finally,
for their A layer, teachers need to identify current issues
pertaining to their topic for which there is research to support
more than one view. This is simply a matter of thinking about
issues in the news that pertain to this subject where there
are no right or wrong answers. What issues are leaders and
voters currently dealing with? Have teachers offer students
the opportunity to research these issues and then form an opinion.
Establish criteria for mastery of this A layer.
For
parents of children in Layered Curriculum high schools, grades
now have predictable meaning. Now the mother can say to the
daughter, “I see you have only applied what you learned
in Spanish class. Why did you not take the time to involve yourself
in a critical thinking issue?” Or, “I see you have
gathered quite a bit of basic information and skill in math,
but how can we help you take on the application issues in order
to bring up your grade?”
At
least it is reliable. It is predictable. It is the same across
the board. A university looking at a transcript would understand
the meaning of an A or a B. A counselor or future employer would
be able to really know what a person was capable of by looking
at a transcript. Once we get some reliability, we can then take
on that real sticky issue, validity. Are these valid measures
of intellect, ability and learning?
Until
then, I will join the other parents and ask my daughter, “Can't
you just do some extra credit?”
Kathie
F. Nunley is an educational psychologist, author, researcher
and speaker living in southern New Hampshire. Developer of the
Layered Curriculum® method of instruction, Dr. Nunley has
authored several books and articles on teaching in mixed-ability
classrooms and other problems facing today's teachers. Full
references and additional teaching and parental tips are available
at: http://Help4Teachers.com Email her:
Kathie (at) brains.org