Edit - Meta Stack Exchange


66

I'm deeply unimpressed with the idea that "number of drafts per answer" is a substantially more accurate measurement of GPT use than "number of answers deleted for suspected use of GPT." I simply find it incredible that the CM team is even considering the former as a more accurate metric than the latter.

– Kevin

2023-06-08 00:57:32 +00:00
Commented Jun 8, 2023 at 0:57
2

@Kevin If my (unverified) theory is correct, it should've been a good predictor before ChatGPT was released, back when people were using document-based interfaces like Write With Transformer. Now that we're seeing chatbot-style generators being used in earnest, I expect the metric to have become decoupled.

– wizzwizz4

2023-06-08 01:04:52 +00:00
Commented Jun 8, 2023 at 1:04
13

@Kevin I think what Philippe is driving at, but not stating plainly, is that they are uncomfortable because they cannot independently verify whether the mod's action was defensible. If they could find a reliable ChatGPT detector or other method to verify whether the mod did what they were supposed to do, they could pretty much automate the handling of appeals (but then of course they might as well deploy those as automated checks on the site itself in the first place).

– tripleee

2023-06-08 04:44:25 +00:00
Commented Jun 8, 2023 at 4:44
35

@tripleee: Unfortunately, life is like that sometimes. You can't possibly always know who's right and who's wrong. If their internal processes have no room for shades of gray, the problem is with those processes, not the moderators.

– Kevin

2023-06-08 04:56:01 +00:00
Commented Jun 8, 2023 at 4:56
3

No disagreement. It's more of me thinking out loud.

– tripleee

2023-06-08 04:57:24 +00:00
Commented Jun 8, 2023 at 4:57
10

@tripleee There is a small possibility that this is indeed an overblown issue and that the company was, with no ill intent, trying to find a more verifiable workflow for suspensions, maybe because someone actually invoked the GDPR rule about fully automated actions. Yet even if that was the case, they made every possible mistake one could make while trying to do something that could have been achieved sitting in a bar drinking tea together, if they had asked the right way: public shaming in the media, hidden information, no room for discussion, all while advertising AI on the blog, etc.

– ꓢPArcheon

2023-06-08 15:10:41 +00:00
Commented Jun 8, 2023 at 15:10
"I have not seen any proof of this assumption" To be fair, there is seldom proof of anything that leaves no reasonable doubt in such matters. I guess one also cannot prove the opposite, so in the end it comes down to trust and weighing the evidence. And even a blind test is not perfect; for example, the test setup used could differ from reality.

– NoDataDumpNoContribution

2023-06-09 10:41:28 +00:00
Commented Jun 9, 2023 at 10:41
Further, some platforms don't support drafts; iOS 12, for instance.

– Harper - Reinstate Monica

2023-06-12 01:53:10 +00:00
Commented Jun 12, 2023 at 1:53
6

@tripleee Philippe is pretty clear: Participation is falling at a rate that poses an existential threat to the site and the company. Management and staff are in red alarm mode, and rightly so. One reason answering is going down is an incredibly high suspension rate of frequent answerers (at around 10%/month, an order of magnitude higher than baseline), which did not subside in lockstep with the number of actual GPT answers estimated through other channels. Conclusion: most valuable users are being driven off the site in high numbers, contributing to the existential threat.

– Peter - Reinstate Monica

2023-06-12 12:21:19 +00:00
Commented Jun 12, 2023 at 12:21
7

@tripleee The compounding factor here is that AIs have mastered a subsection or specialized version of the Turing test: They are indistinguishable from idiots when they are discussing technical issues. I fully believe that Rory is convinced that each and every suspension was justified. It is just that this may be compatible with human authorship.

– Peter - Reinstate Monica

2023-06-12 12:28:09 +00:00
Commented Jun 12, 2023 at 12:28
Without access to the actual suspension data, this is hard to assess. A reasonable null hypothesis is that these were not valuable, long-term answerers who were actually capable of bringing value to the site in the first place.

– tripleee

2023-06-12 12:31:55 +00:00
Commented Jun 12, 2023 at 12:31
13

I'm not a regular asker on SO, but last week I had my first GPT answer. It was well formatted and tidy, if tonally a little heavy on the obsequious politeness of ChatGPT, but the real giveaway was that it referenced methods and properties that simply don't exist, and the code it gave was useless. If the company wants a way to drive users away, bad answers that look good are a great way to do that. In almost every non-trivial case, that is what an LLM will offer.

– glenatron

2023-06-13 11:08:11 +00:00
Commented Jun 13, 2023 at 11:08
"in the Community Management industry, it is a well-known fact that removing a person from a community, even for a short time, has an outsize impact on the contributor community." and what is the impact on the contributor community of repeatedly overruling, denigrating and undervaluing your volunteer moderation staff, who as elected representatives of the community can be assumed to have a lot of community support and respect?

– Ty Hayes

2023-06-16 07:40:59 +00:00
Commented Jun 16, 2023 at 7:40
1

@Peter-ReinstateMonica It seems to me that the users driven off the site by the LLM ban were mostly spammers who want to rep farm, while the users driven off the site by prohibiting all LLM moderation are power users, SMEs and moderators who want to curate a high-quality repository of accurate information about programming. The company believes the value proposition of uninhibited LLM spam beats the 15 years of curation and expertise that built the site up to this point.

– ggorlen

2023-06-20 20:12:44 +00:00
Commented Jun 20, 2023 at 20:12
2

"What makes you so confident that your methods of identifying GPT posts are more accurate than how moderators have been identifying them?" - This is a silly question. Overzealous mods were suspending almost 7% of the site's active users every three-week window. That's both unsustainable and prima facie inaccurate. What evidence have moderators provided that their methods for detecting GPT posts are accurate? They're the ones suspending people; they're the ones who have to prove they're doing it justifiably.

– aroth

2023-06-26 03:50:26 +00:00
Commented Jun 26, 2023 at 3:50
