Is Detecting Chatbot-Generated Content Possible?

Welcome to part 3 of Crowdmark’s ‘Rise of ChatGPT’ series, which explores the impact of chatbots on the education sector.

Last time, we covered the rise of ChatGPT and AI-driven chatbot technology and their impact on grading in higher education. This time, we’ll dive into some specific use cases and the million-dollar question: whether it’s possible to detect chatbot-generated content in student work.

Protecting academic integrity while supporting student learning has been a joint priority for academic communities far longer than chatbots have been in the zeitgeist. On some level, cheating has existed for as long as there have been students.

For example, in My Word! Plagiarism and College Culture, Susan D. Blum cites the Chinese keju exam for entrance into the country’s civil service as one test with a 1,400-year history of students trying to circumvent its requirements.

In 2007, long before chatbots were on anyone’s radar, Maclean’s published a lengthy report claiming over fifty percent of Canadian university students cheat when submitting written work.

Given the existing challenges with encouraging academic integrity in student populations, it’s no surprise that the rise of chatbot technology is a growing concern on academic campuses. Earlier this year, we talked to the Crowdmark community about AI and student work.

We heard:

  • “I don’t know enough about AI, but I want to learn more.”
  • “Students know how to find AI tools and use them; we have no choice but to address it.”
  • “I don’t know how to stay ahead of my students and their knowledge and use of AI.”
  • “Students don’t know what’s allowed and how they are permitted to use AI.”
  • “In some ways, AI has less bias (e.g., doesn’t care if handwriting is neat), but there’s so much potential for deeper bias when using generative AI to create more relatable questions or discussion prompts: Its cultural, social, racial, gender perspective is limited to dominant Internet narratives that reflect the datasets their learning language modules were fed.”

Where have Crowdmark instructors allowed chatbots to be used in their classes?

  • “I’ve used generative AI (ex: ChatGPT) to make questions more relatable to my students’ interests.”
  • “We’ve entered learning outcomes and asked generative AI to write a real-world prompt for the discussion board.”
  • “I’ve used generative AI to make learning outcomes clearer for my students.”


Chatbots: An endlessly patient tutoring assistant  

Chatbots do have the potential to help instructors plan lessons, update material, or extend concepts for students who are struggling.

In an article by Claire Bryan for The Seattle Times, Min Sun, a University of Washington Education professor, described chatbots as potential lesson planning assistants. That help could range from recommending “different levels of math problems for students with different mastery of the concept,” to asking a chatbot to provide a student with tailored assessments to help them catch up to their classroom peers.

Sun also touches on the possible advantages of allowing students to enter a dialogue with the chatbot, learning in conversation rather than being spoon-fed an answer. It’s perhaps not what Socrates envisioned when devising the Socratic method, but such is the state of learning in 2023.

How this work will overlap with the labour of teaching assistants and human tutors remains to be seen. If you’re an instructor, it’s safe to assume that students are willing to experiment with chatbots.

Can you tell when a chatbot has been used in student work?

Unfortunately, the short answer is no.

Several companies in the education technology space offer chatbot detectors. These tools process text to determine if it was written by an AI, but false positives are very common.

“I’ve spent time playing with chatbot tools,” says Paul Mitchell, Crowdmark product designer. “On one level, I’m shocked at how good they are. If you ask a chatbot to write text for an app in a user-experience style, for example, it’s very good at delivering short, punchy text. When you ask it to write for an academic audience, things get trickier.”

Through his role, Mitchell considers the downstream effects of technology choices. “You have to think about the dangers that present themselves when a detection service delivers a wrong answer,” he says. “A false positive could mean a student is penalized even if they didn’t cheat, and that challenge partly explains why some institutions won’t implement automated detection. Trust is a two-way street and unreliable detection tools make it very easy to overstep that line.”

Most notably, OpenAI, the maker of ChatGPT, quietly sunset its own detection service earlier this year. “They pulled it down in July,” says Mitchell. “If the market leader doesn’t trust its ability to deliver accurate results, presumably with access to a huge amount of data, that tells you something. For now, detecting the work of chatbots is a challenging problem to crack.”

What’s the future for chatbots in higher education?

While the speed at which ChatGPT is changing how we work with technology feels mind-boggling at times, it’s equally clear that AI-driven automation tools aren’t going away.

“I like to think it’s going to become something like Wikipedia,” says Mitchell. “It’s collaborative, there’s a lot of input from online crowds, but you take what it does with a grain of salt. Wikipedia is a great jumping off point when I’m researching a new topic, but I don’t go there for news about recent events, especially when they’re highly politicized.”

For now, those jumping-off points may include asking chatbots for feedback on:

  • Your project outline or lesson plan: What are you missing?
  • Where to iterate: What other examples could you cite to refresh or extend your thinking?
  • Spelling, grammar, or logic gaps: What’s missing in your work?
  • Brainstorming new directions: Where else could you take a topic?  

In the meantime, Mitchell sees Crowdmark’s instructor base breaking coursework into smaller, bite-size chunks rather than one big assignment, with many of those smaller assignments being done on paper in class.


“I remember taking high-stakes exams where my mark was predominantly dictated by one day’s performance,” he says. “Tests like that don’t let you evaluate learning along the way. We’re seeing more instructors break up their assignment work to allow students to focus on specific skills and knowledge. Many of these assignments are given and completed in class, making it harder to use chatbots.”

“The flip side,” he continues, “is that marking then becomes more intensive. There’s more work to give feedback on, which is exciting from Crowdmark’s perspective since it allows for more dialogue between students and instructors. This type of marking approach is less vulnerable to chatbot use.”

For now, it’s safe to assume that your student population is experimenting with these tools and that clear guidelines and policies will create less ambiguity about how chatbots fit into your classroom.

Interested in learning more about Crowdmark? Get in touch for a free trial.

About Crowdmark

Crowdmark is the world’s premier online grading and analytics platform, allowing educators to evaluate student assessments more effectively and securely than ever before. On average, educators experience up to a 75% productivity gain, providing students with prompt and formative feedback. This significantly enriches the learning and teaching experience for students and educators by transforming assessment into a dialogue for improvement.