A Puff of Absurdity

"When we remember that we are all mad, the mysteries disappear and life stands explained." Twain / "I write to keep from going mad from the contradictions I find among mankind - and to work some of these contradictions out for myself." Montaigne / "I write because I have found no other means of getting rid of my thoughts." Nietzsche / "Writing is an integral part of the process of understanding." Arendt / "Writing, to me, is simply thinking through my fingers." Asimov.

Wednesday, March 13, 2019

Are Grades Harmful to Students?

A bold claim was made to me recently that giving students grades on assignments and tests actually impedes their ability to self-assess their work. It's a big deal when an educator insists that what you've done for years is actually harming your students' ability to achieve their potential. The winds have shifted again, and there's another movement coming, this time to restrict feedback on student work to descriptive comments alone. I think a number or letter grade is actually important to student success, and that initial claim requires some scrutiny.

Student Self-Assessment

One goal in teaching anything is to get the learner to a point where they can recognize whether or not they are achieving with excellence. Absolutely! In some areas, excellence is easier to see than in others. If you're learning to swim, then excellence at a specific level might be measured by the ability to swim one length without touching the bottom of the pool. That's a marker that's easy for the novice to recognize just by the feel of whether or not their feet touched bottom. But other learning is more difficult to assess as clearly. If a student is learning to dance, or learning a new language, or learning to argue a philosophical position, the student can easily feel like they've mastered the new skill, yet be completely mistaken. This is what makes So You Think You Can Dance so entertaining (or just sad).

These students need an expert to let them know where they stand against a standard of achievement. It takes a certain amount of mastery in some subjects before students can be expected to self-assess. That might not happen during high school. It can take years of study and guidance to focus in the right places and notice the subtle differences between excellent and abysmal. It's often the level of precision demanded in a field that takes so long to get students to notice and then apply to themselves.

In more subjective fields, where there's an element of the aesthetic and we've moved beyond the basic foundational skill level, reaching a high mark is not always a matter of assessing how closely our work matches an exemplar, but of assessing how closely the teacher thinks our work aligns with their particular expectations. It's learning how to do it the way the teacher wants it done that can make the difference between getting into university/college and going straight to low-level employment. When I was taking philosophy courses in university, I had a professor who always gave me a B+ on every assignment. I had a different prof read my work to try to figure out what I was doing wrong, and he said, "If you want to do a master's, then stop taking courses with that prof." He was very clear that different instructors mark differently. It's not a secret, and we're doing nobody any favours pretending it's not true. But without grades on early assignments, students aren't able to shift their writing style to align with the specific teacher's version of excellence.

Descriptive feedback is definitely useful, but a grade lets the students know if the feedback is encouraging them to reach perfection with just a bit more precision work, or encouraging them to reach a pass and then maybe get out of that line of study. Descriptive feedback alone doesn't let the student know how close they are to the pinnacle in that field. I'm reminded of reading about a distance swimmer who, after exhaustedly reaching out to the boat to give up crossing a lake, was later interviewed and asked, "How does it feel to know you quit just half a mile from shore?" And she responded, "I was just half a mile from shore?!!" Knowing how far we are from a set standard is valuable information that enables us to reevaluate how much more effort we need to muster, and it allows us to decide if this is an area we want to pursue.

High school students can't self-assess in a vacuum. Teachers are experts in the field, or should be, and they see many, many examples of similar work, which enables them to tell excellent from good, fair, or limited. Those standards we measure against are somewhat arbitrary, but they are a marker that we've generally agreed upon. Sometimes standards are a measure of what people have achieved in the past, provided so people can aim for the top, like trying to beat Olympic times. But standards are also there to ensure we're not falling behind because of some learning challenge that hasn't yet been identified. We look to standards when we watch for a toddler to hit each milestone successfully. Without these specific markers, assessed by experts, we might find students slipping through the cracks more frequently.

So, it's not a matter that the teacher is judge and jury in the fate of a student, but that the teacher is better able to calibrate the nearness to perfection of a student's demonstration of ability in a course. This does affect their future, but that's because each of our abilities and limitations dramatically affect our future plans.

Grades are a type of feedback that provides useful information to students, parents, and administrators, particularly if they are clearly tied to a rubric of clear knowledge and skills that can be developed and improved over the course when specific gaps in learning are noted. However, students (and parents) are encouraged to remember that grades are nothing more than an indication of the level of understanding that's been demonstrated on a particular assignment relative to a standard set by the curriculum. A grade is in no way a reflection of intelligence or ability in general, or of the student's particular wonderfulness, and students (and parents) should never base their self-worth on external sources like academic performance. Still, it can be really useful to understand limitations in ability if the student is finding it difficult to demonstrate understanding of a subject area with excellence.

Our Mark-Focused World

If we didn't have marks at the end of each course, and if our marks didn't have a huge impact on the students' futures, then I would get on the no-marks bandwagon in a second. I would love to teach classes where the focus was entirely on exploring the world, where students didn't have to prove to me how much they understand, but could be driven entirely by their own curiosity and their absolute joy of learning. I would love for universities and colleges to have an entrance exam specific to each discipline and for high schools to use a simple pass/fail determination in each course.

But marks do matter.

The attitude of mere curiosity and joyful learning, in a classroom of students entirely intrinsically motivated, can't be fostered in a competitive atmosphere where marks can make the difference between stability and struggle for the rest of their lives. For students who can't pay for university or college, a 2% difference in a grade that leads to a scholarship can make a difference in a decision to further their own education. Unfortunately, we have to educate students within this antagonistic framework. We can't merely will away the importance of marks to our students by no longer providing them. Removing marks during a course, and then providing a calculated mark on the final report card, doesn't improve intrinsic motivation or decrease the competitiveness of the school system the way a full-on pass/fail system has been shown to do. It's a half-measures approach that might result in the worst of both worlds.

The Studies Driving This Trend

The slides I was shown, which come to the dubious conclusion that marks are harmful, weren't cited, but they seem to come primarily from John Hattie, a New Zealand educator who wrote a book, Visible Learning, which has lots of numbers (ironically) that indicate which teaching strategies have the most effect on students' achievement. One of the effects at the tip-top of his scale, which is being used as the primary backing for the claim, is "Student Self-Reported Grades." It's implied that students do better if they self-report their grades.

First, the methodology. Hattie is doing what he calls a meta-meta-analysis. A meta-analysis with just one 'meta' is a compilation of several studies on one topic. But, as students in social science learn, a meta-analysis can be great in that it increases the sample size, but it will be flawed in totality if just one of the studies is flawed in any way. It's much more reliable to do one excellent and far-reaching study with a large sample size. Hattie's meta-meta-analysis is a collection of collections of studies. It just takes one crappy study in one of the collections of studies in the bunch for the results to come into question. As Robert Slavin explains, "Hattie is profoundly wrong. He is merely shoveling meta-analyses containing massive bias into meta-meta-analyses that reflect the same biases."

Secondly, Hattie has spoken to this particular line in his list of effects. He says that he didn't actually mean that students should make an educated guess at what their grade is based on comments alone, at all. He says he should have named the factor "Student Expectations" to avoid all the crazy misunderstandings. What he says DOES work is to have students explicitly predict their expected grade on an assignment, then be encouraged to work beyond that expectation. According to Hattie, it's this performance beyond their own expectations that develops needed confidence in learning ability. Self-assessment helps them focus on how well their work matches a rubric, but any self-assessment must be evaluated by an expert in the area (the teacher) in order for the student to develop assessment skills.

But, before you get students to predict their grades before starting each assignment, there's a problem with this conclusion as well. It isn't remotely what the research suggests; the studies Hattie used were not even about influencing academic success. A problem with the student expectation method is that research shows that generally people overestimate their ability from the outset (cf. prior dance video). Also see David Didau on the topic of self-reporting and grade prediction, who says,

"Most students are novices – they don’t yet know much about the subject they’re studying. Not only do they not know much, they’re unlikely to know the value of what they do know or have much of an idea about the extent of their ignorance. As such they’re likely to suffer from the Dunning-Kruger effect and over-estimate the extent of their expertise. All of this creates a sense of familiarity with subject content which leads to the illusion of knowledge. The reason tests are so good at building students’ knowledge is because they revealing surprising information about what is actually known as opposed to what we think we know. Added to that, our ability to accurately self-report on anything is weak at best."

Another teacher predicts that Hattie's list will be popular as it convinces teachers that they don't have to mark any more, not providing a grade until the very end of the course! Researcher Scott Eacott explains that gurus like Hattie are able to get attention for their work because administrators love data. Teaching and any work with human beings is an art, not a science, but management craves proof that strategies work. Eacott says,

"the uncritical acceptance and proliferation of this cult is a tragedy for Australian school leadership. . . . no amount of research will be able to tell us, in any definitive way, “what works” in different (and always somewhat idiosyncratic) contexts . . . The pressure on individual students, teachers, school leaders, systemic authorities, politicians to improve outcomes is unrelenting. Similarly, the need for evidence to support these improvements, or illuminate weaknesses, is ever increasing. Hattie has provided a means of making decisions. . . . This also relies on an acceptance of the sample of a mega-analysis being stable and of equal value – which methodological critique has highlighted may not be the case. . . . The uncritical acceptance of his work as the definitive word on what works in schooling, particularly by large professional associations such as ACEL, is highly problematic . . . What the Australian school leadership community arguably needs is more rigorous and robust work and more significantly, dialogue and debate."

BUT, hold the phone, there actually IS a study that suggests that grades harm achievement, but only if they come without praise.

Anastasiya Lipnevich and Jeffrey Smith (2008) gave college students feedback on an essay in twelve different conditions and then tested their improvement on the next essay. The conditions were no feedback at all, no feedback except a grade, detailed feedback, detailed feedback with a grade, detailed feedback by a robot, and robot feedback with a grade, each divided further into with-praise and without-praise groups. With 464 students in the study, that means there were only about 38 students in each treatment group, which is a small sample size. An attempt at replication would be useful. They found that student feedback without a grade showed the highest improvement, except if the grade came with praise. In other words, student feedback with a grade and praise had one of the highest effects. In real-world terms, it means telling the kids what they did wrong specifically, and telling them the number they have compared to a standard, but also encouraging further improvement by praising their effort:

"The analysis revealed a significant disordinal interaction between grade and praise . . . under the grade condition scores were higher when praise was presented than when praise was not presented. For the no-grade condition, scores were higher when praise was not presented than when praise was presented. . . . Receipt of a grade led to a substantial decline in performance for students who thought the grade had come from the instructor [not when perceived to come from a robot], but a praise statement from the instructor appeared to ameliorate that effect."

So grades aren't harmful provided they're accompanied by praise. An example of praise that made all the difference in the study is, "Name, you made an excellent start with this essay! I still see room for improvement, so take some time and make it really great." Who doesn't add at least one line of encouragement on their rubrics?
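
For anyone who wants to check the group-size arithmetic mentioned above, here's a minimal sketch in Python. It uses only the figures quoted in this post (464 students, six feedback types, each with and without praise) and it assumes the students were split evenly across the groups, which is my assumption for illustration and not something taken from the study itself. The variable names are made up for this example.

```python
# Rough check of the per-group sample size, using only figures quoted in the post.
# ASSUMPTION: students were divided evenly across conditions (not confirmed by the study).
total_students = 464   # participants mentioned in the post
feedback_types = 6     # none, grade only, detailed, detailed + grade, robot, robot + grade
praise_options = 2     # with praise, without praise

conditions = feedback_types * praise_options
per_group, leftover = divmod(total_students, conditions)

print(f"{conditions} conditions, roughly {total_students / conditions:.1f} students each")
print(f"An even split gives {per_group} per group with {leftover} students left over")
```

Under that even-split assumption, each group lands somewhere around 38 or 39 students, which is why the sample in any single condition is on the small side.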

As such, I'm not prepared to give up grading papers yet.

And another thing! I still give grades out of 100 instead of levels out of 4. Shocking, I know. I discovered that when I used levels, students just translated them to get a number out of 100. If a level 4 is about a 90%, then why not just say 90%? We seem to have a better intuitive sense of what a 67% means than what a 2+ means. It saves students from trying to translate and calculate marks to determine how well they're doing in the class. Furthermore, while there's not much difference between a 77 and a 79 (both level 3+), there IS a big difference between a 91 and a 100, which are both level 4+. Students should be allowed to know if their work is reaching near the best possible exemplar and in need of a bit more work, or if it's actually at the pinnacle of achievement.

But I'm a dinosaur. I'll be out of here in the next few years, and the next round of students will learn to manage the anxiety caused by not having a clue where they stand relative to expectations. Not knowing where they stand is a new frustration for these students. But this too shall pass.

ETA: Bergeron's How to Engage in Pseudoscience with Real Data

ETA: On the problem with learning style - while we're at it!
