Photo courtesy of M. Scott Brauer / original article
Les Perelman has written a program that can mimic a perfect-scoring SAT essay. Unfortunately, it’s gibberish.
His software, appropriately called “Babel” after the famous biblical tower, can generate an SAT-length essay in under a second. It relies on a combination of length and “SAT words” to earn high marks from automated essay-grading software. Content is not a criterion; the essay does not need to make sense. Perelman does, however, pick a “topic” for the computer to write on, much as the standardized test offers a writing prompt for a personal essay.
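To make the idea concrete, here is a minimal sketch of a grader that looks only at surface features. It is not Babel and not any real grading engine; the word list, the weights, the 1-to-6 scale, and the name `toy_score` are all invented for illustration. It shows only how a scorer that rewards length and vocabulary can rate nonsense above a sensible sentence.

```python
# Toy illustration only: NOT Babel and NOT any real essay-grading engine.
# The word list, weights, and 1-to-6 scale are assumptions made for this sketch.
FANCY_WORDS = {
    "plethora", "myriad", "egregious",
    "paradigm", "quintessential", "ubiquitous",
}


def toy_score(essay: str, max_score: int = 6) -> int:
    """Score an essay on word count and 'fancy' vocabulary, ignoring meaning."""
    # Lowercase each token and strip surrounding punctuation.
    words = [w.strip(".,;:!?\"'").lower() for w in essay.split()]
    words = [w for w in words if w]
    if not words:
        return 1  # lowest score for an empty essay

    # Up to 3 points for length, reached at 300 words.
    length_points = min(len(words) / 100, 3.0)

    # Up to 3 points for vocabulary, reached when 30% of words are "fancy".
    fancy_ratio = sum(w in FANCY_WORDS for w in words) / len(words)
    vocab_points = min(fancy_ratio * 10, 3.0)

    return max(1, min(max_score, round(length_points + vocab_points)))


sensible = "Teachers should read essays carefully because meaning matters."
gibberish = " ".join(
    ["egregious plethora of quintessential paradigm myriad ubiquitous"] * 60
)

print(toy_score(sensible))   # short and plain: scores at the bottom of the scale
print(toy_score(gibberish))  # long and full of big words: scores at the top
```

Because nothing in the sketch checks whether the sentences mean anything, repeating a string of impressive words is enough to reach the top score.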
The College Board, which owns standardized tests such as the SAT, SAT-II, and PSAT, uses a combination of human and computer graders to reach a composite score.
Other researchers dispute Perelman’s findings, arguing that their comparisons of machine grading with human grading show computers to be reliable proxies. Perelman counters that human graders are given so little time to score each essay that they cannot meaningfully assess its content; in his view, both people and computers end up rewarding big words and length, which explains why their scores correlate. Perelman’s argument is supported by anecdotal evidence (for example, http://www.latimes.com/opinion/la-op-sat3apr03-story.html), but it has not been confirmed by formal research.
- Writing Instructor, Skeptical of Automated Grading, Pits Machine vs. Machine
- Essay-grading by software flawed, ‘essentially impossible,’ expert says (pro)
- Computers ‘dramatically more reliable’ than teachers in marking Alberta diploma-exam essays: study (anti)