Codewashing

I have little understanding for people using large language models to generate slop: words and images that nobody asked for.

I have more understanding for people using large language models to generate code. Code isn’t the thing in the same way that words or images are; code is the thing that gets you to the thing.

And if a large language model hallucinates some code, you’ll find out soon enough:

With code you get a powerful form of fact checking for free. Run the code, see if it works.

But I want to push back on one justification I see repeatedly about using large language models to write code. Here’s Craig:

There are many moral and ethical issues with using LLMs, but building software feels like one of the few truly ethically “clean”(er) uses (trained on open source code, etc.)

That’s not how this works. Yes, the large language models are trained on lots of code (most of it open source), but they’re not only trained on that. That’s on top of everything else; all the stolen books, all the unpaid creative work of others.

Even Robin Sloan, who first says:

I think the case of code is especially clear, and, for me, basically settled.

…goes on to acknowledge:

But, again, it’s important to say: the code only works because of Everything. Take that data away, train a model using GitHub alone, and you’ll get a far less useful tool.

When large language models are trained on domain-specific data, it’s always in addition to the mahoosive amount of content they’ve already stolen. It’s that mahoosive amount of content—not the domain-specific data—that enables them to parse your instructions.

(Note that I’m being very deliberate in saying “parse”, not “understand.” Though make no mistake, I’m astonished at how good these tools are at parsing instructions. I say that as someone who tried to write natural language parsers for text-only adventure games back in the 1980s.)

So, sure, go ahead and use large language models to write code. But don’t fool yourself into thinking that it’s somehow ethical.

What I said here applies to code too:

If you’re going to use generative tools powered by large language models, don’t pretend you don’t know how your sausage is made.



