  • If you want to add an additional level of realism, first lay them in front of a subject-matter expert (substituting for the flagger) to identify possible use of ChatGPT, then give those results to a mod for a second check. I'd imagine your approach would still lead to vastly more incorrect suspensions than what's actually happening; when I flag in the tags I follow, I'm 100% confident it's not a human. Commented Jun 7, 2023 at 22:02
  • I second the non-moderator test. I'm pretty sure there are many users who have seen enough GPT answers to have a fairly low error rate in detecting them. I would be quite curious to see how well I do. Commented Jun 7, 2023 at 22:12
  • @ErikA To be honest, I'm not an SME in most of the answers I'm identifying as coming from ChatGPT. I've just seen thousands of them personally, and I feel I'm fairly good at pattern recognition (obviously a biased statement, but I'm willing to put it to the test). And when I say thousands: today I added my 2,000th answer to my "ChatGPT Output" Saves list on SO, and I have another 144 on a similar list on Ask Ubuntu. But I absolutely agree that for code answers, you must be an SME to identify the issues. Commented Jun 7, 2023 at 22:26
  • Even if we misidentify some posts as AI-generated, the damage of allowing them is much larger than the damage of removing them. It's better to lower the punishment: start with just a warning, and put the user on a watchlist (which times out) for future offenses. Commented Jun 8, 2023 at 0:14
  • A more realistic scenario would be to have groups of roughly three answers from the same user, some groups from real users and some from ChatGPT, then ask the reviewer to differentiate which are which. Commented Jun 8, 2023 at 4:27
  • A better test would have only a small fraction of the 100 posts be ChatGPT-generated, and the mod should not know how many there actually are (see the sketch after this thread for why that base rate matters). Commented Jun 8, 2023 at 15:19
  • I proposed a similar test (in moderator-only space), though using pre-GPT answers that score highly on Huggingface (or whatever) versus GPT-generated answers. This makes the test considerably more difficult, of course, but otherwise you have the problem that the vast majority of human answers do not look GPT-generated at all, so the GPT answers stand out. That wouldn't reflect a collection of answers flagged as machine-generated in practical usage. Commented Jun 8, 2023 at 16:16
  • All good scenarios as well. I'm just wondering why nothing like this seems to have been done, given that SE claims this is an "evidence-backed" assumption. Commented Jun 8, 2023 at 18:20
  • Unfortunately, most of these scenarios appear to overlook several of the input signals that various mods have said they take into account (at least in anything but the most glaringly obvious cases), such as prior posting history and style, to pick one example (but by no means the only one). Commented Jun 9, 2023 at 1:22
  • ^ That. I, for one, rely heavily on user history to find GPT plagiarists: it starts with the post smelling fishy, and ends with confirmation from the user history. Commented Jun 9, 2023 at 9:45
  • @JoelAelwyn Absolutely, the chances of us finding GPT usage are even higher when we have more data to go on. The point here is simply that some of us feel we can identify it at a very high rate based on the unedited GPT output alone. Commented Jun 9, 2023 at 10:43
  • @NotTheDr01ds Fair enough, just wanted to make sure it wasn't being overlooked. Commented Jun 9, 2023 at 16:22
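
The base-rate point raised in the thread is easy to quantify. Below is a minimal, hypothetical simulation; all of the parameters (pool size, base rate, reviewer sensitivity and specificity) are illustrative assumptions, not figures from this discussion. It shows why a reviewer who is right the vast majority of the time per post can still produce a noticeable share of wrong flags when GPT posts are a small fraction of the pool:

```python
import random

# Hypothetical parameters -- chosen for illustration, not taken from the thread.
N_POSTS = 10_000       # size of the answer pool being reviewed
BASE_RATE = 0.05       # fraction of posts that are actually GPT-generated
SENSITIVITY = 0.95     # P(flag | GPT): reviewer catches most GPT posts
SPECIFICITY = 0.98     # P(pass | human): reviewer rarely flags humans

random.seed(0)

tp = fp = fn = tn = 0
for _ in range(N_POSTS):
    is_gpt = random.random() < BASE_RATE
    if is_gpt:
        flagged = random.random() < SENSITIVITY
        tp += flagged
        fn += not flagged
    else:
        flagged = random.random() > SPECIFICITY
        fp += flagged
        tn += not flagged

precision = tp / (tp + fp)
print(f"GPT posts correctly flagged: {tp}")
print(f"human posts wrongly flagged: {fp}")
print(f"precision of a flag: {precision:.1%}")
```

With these illustrative numbers, roughly 475 of the 500 GPT posts are caught, but about 190 of the 9,500 human posts are wrongly flagged, so nearly three in ten flags would land on a human author despite 98% per-post specificity. That is exactly why the commenters above want the reviewer kept blind to how many GPT posts the test set actually contains.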