Today I want to focus on a different question: what quality of writing are LLMs converging upon? It seems to me there are two possibilities:
- As LLMs improve, they will continually become better and better writers, until eventually they surpass the abilities of all human writers.
- As LLMs improve, they will more closely mimic the aggregate of all writers, and thus will not necessarily perform better than strong human writers.
If you take the Kevin Drum view that AI by definition will be able to do anything a human can do, but better, then you probably think the end game is door number one. Use chess engines as your template. As the engines improved, they got better and better at playing chess, until eventually they surpassed the capacities of even the best human players. The same thing will eventually happen with writing.
But there's another possibility. Unlike chess, writing has no objective end-goal that a machine can orient itself toward. LLMs, as I understand them, are (and I concede this is an oversimplification) souped-up text prediction programs. They take in a mountain of data in the form of pre-existing text and use it to answer the question "what is the most likely way that text would be generated in response to this prompt?"
"Most likely" is a different approach than "best". A chess engine that decided its moves based on what the aggregate community of chess players was most likely to play would be pretty good at chess -- considerably better than average, in fact, because of the wisdom of crowds. But it probably would not be better than the best chess players. (We actually got to see a version of this in the "Kasparov vs. the World" match, which was pretty cool especially given how it only could have happened in that narrow window when the internet was active but chess engines were still below human capacities. But even there -- where "the world" was actually a subset of highly engaged chess players and the inputs were guided by human experts -- Kasparov squeaked out a victory).
I saw somewhere that LLMs are facing a crisis at the moment because the training data they're going to draw from increasingly will be ... LLM-generated content, creating not quite a death spiral but certainly a strong likelihood of stagnation. But even if the training data were all human-created, you're still getting a lot of bitter with the sweet, and the result is that the models should by design not surpass high-level human writers. When I've looked at ChatGPT 4 answers to various essay prompts, I've been increasingly impressed with them in the sense that they're topical, grammatically coherent, clearly written, and so on. But they never have flair or creativity -- they are invariably generic.
Now, this doesn't mean that LLMs won't be hugely disruptive. They will be. As I wrote before, the best analogy for LLMs may be to mass production -- it's not that they produce the highest-quality writing, it's that they dramatically lower the cost of adequate writing. The vast majority of writing does not need to be especially inspired or creative, and LLMs can do that work basically for free. But at least in their current paradigm, and assuming I understand LLMs correctly, they're not going to replace top-level creative writing in the immediate term, because even if they "improve", their improvement will only go in the direction of converging on the median.
I think this post arrives at an approximately correct expectation, while missing some key nuances.
Reinforcement learning can train a program to pursue any goal, so long as that goal can be efficiently operationalized.
For games with a mechanistically defined win condition, this is easy to do at scale. A computer—a regular piece of human-written software, not another trained process—can determine who has won a game of chess, and so for training a chess-playing program you can just tell it "play yourself a trillion times, calling this hard-coded algorithm to determine which of you won, and reinforce the winning behaviors." (Even this operationalization is a bit inefficient—my understanding is that usually the trainers implement a chess-state scoring system and tell the software to reinforce behaviors that drive its score up.)
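To make that concrete, here's a minimal sketch of that loop's shape, using the python-chess library as the hard-coded algorithm that says who won. The `policy` and `reinforce` functions are hypothetical stand-ins for the trained model and its update step, not any real system's API:

```python
import random
import chess  # the python-chess library -- ordinary, human-written software

def policy(board: chess.Board) -> chess.Move:
    # Placeholder policy; the real trainee would pick moves itself.
    return random.choice(list(board.legal_moves))

def play_one_game():
    board = chess.Board()
    moves = []
    while not board.is_game_over():
        move = policy(board)
        board.push(move)
        moves.append(move)
    return moves, board.outcome()  # hard-coded rules decide the result

def reinforce(moves, winner):
    pass  # hypothetical: strengthen whatever the winning side did

for _ in range(1000):  # "play yourself a trillion times", scaled down
    moves, outcome = play_one_game()
    if outcome.winner is not None:  # ignore draws in this toy version
        reinforce(moves, outcome.winner)
```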
With natural language, of course, there is no mechanical process for scoring it. So the first step was "train a program to generate a sequence of text that might plausibly belong to this corpus." This is the "autocomplete" concept you describe: "given this sequence of text, what are the likely next chunks?" "okay, add one of those likely next chunks. Given this new sequence, what comes next?" etc. etc.
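Here's a toy version of that loop, with a hypothetical `next_token_probs` standing in for the actual model (a real LLM assigns a probability to every token in a large vocabulary; this placeholder just hard-codes a few continuations):

```python
def next_token_probs(sequence: str) -> dict[str, float]:
    # Hypothetical stand-in for the model: real LLMs score every token
    # in their vocabulary, conditioned on the sequence so far.
    return {" world": 0.6, " there": 0.3, "!": 0.1}

def generate(prompt: str, n_tokens: int) -> str:
    text = prompt
    for _ in range(n_tokens):
        probs = next_token_probs(text)     # "given this sequence, what comes next?"
        chunk = max(probs, key=probs.get)  # pick a likely next chunk
        text += chunk                      # "okay, add it", then repeat
    return text

print(generate("hello", 3))  # "hello world world world" with this toy model
```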
A key concept here is "temperature". You can tell a program of this type "always pick the likeliest single output" or "choose at random from all possible outputs" or anything in between, e.g., "choose among the top likeliest outputs in proportion to their likelihood". A temperature of zero means "always pick the single most likely next bit": it is fully deterministic and also typically very boring (or can get stuck in a loop, e.g., "the man of the man of the man of the man" etc.). A very high temperature (the top of whatever range you're given) would produce total chaos.
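In code, temperature is just a rescaling of the model's scores before sampling. A sketch (the logits here are made up, not from any real model):

```python
import math
import random

def sample_with_temperature(logits: dict[str, float], temperature: float) -> str:
    if temperature == 0:
        # Degenerate case: always take the single likeliest token.
        return max(logits, key=logits.get)
    # Dividing by the temperature sharpens (T < 1) or flattens (T > 1)
    # the distribution before the softmax turns scores into probabilities.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    z = sum(math.exp(s) for s in scaled.values())
    probs = {tok: math.exp(s) / z for tok, s in scaled.items()}
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights)[0]

logits = {" the": 2.0, " a": 1.0, " zebra": -1.0}
print(sample_with_temperature(logits, 0))     # deterministic: always " the"
print(sample_with_temperature(logits, 1.0))   # usually " the", sometimes " a"
print(sample_with_temperature(logits, 10.0))  # nearly a three-way coin flip
```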
Okay. Once you have a program that's pretty good at generating plausible text, you can train it AGAIN, for some secondary desideratum. This could be anything! For example, you could train an LLM to try to complete sentences while using the letter "e" as little as possible, or by only using words that have a prime number of letters, or whatever. Those would be pretty easy, since you could very easily operationalize the desideratum. Unfortunately we mostly want to do secondary training for some other, more complicated thing. For example, we want to train the LLM to have "helpful, harmless" output.
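Before getting to the hard case, here's what an easily operationalized desideratum looks like: a sketch of the avoid-the-letter-"e" reward, which a plain function can score with no human in the loop (`reinforce` is again a hypothetical placeholder):

```python
def e_avoidance_reward(text: str) -> float:
    # Mechanically scoreable: higher reward for completions using "e" less.
    if not text:
        return 0.0
    return 1.0 - text.lower().count("e") / len(text)

def reinforce(completion: str, reward: float) -> None:
    pass  # hypothetical: nudge the model toward high-reward completions

for completion in ["the sheep were here", "a ghost sang solo"]:
    reinforce(completion, e_avoidance_reward(completion))
```

The "helpful, harmless" case is harder precisely because no such function exists.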
The way we do this is kind of involved and has a bunch of layers (there's some detail in this post: https://nostalgebraist.tumblr.com/post/721395846638436352/tweet-twitter) but basically this is where the underpaid Kenyan workers come in: you have a lot of humans review a lot of text and give it some sort of score, which is then used to train the algorithm to output the sorts of things that are likely to score well.
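A rough sketch of how those human scores get stretched: they train a separate "reward model" to imitate the raters, and that model then scores the millions of samples the LLM generates during its reinforcement phase. Every name below is a hypothetical placeholder, not any real lab's pipeline:

```python
# Hypothetical human-labeled data: (text, score from a paid rater).
human_labels = [
    ("Sure, here is a draft cover letter for the role...", 0.9),
    ("haha get fucked", 0.0),
]

def train_reward_model(labels):
    # Hypothetical: fit a model that maps text -> predicted human score.
    # Real reward models are neural nets; this toy just memorizes a rule.
    def reward_model(text: str) -> float:
        return 0.0 if "get fucked" in text else 0.5
    return reward_model

reward_model = train_reward_model(human_labels)

# The expensive human judgments now scale: the reward model can score
# fresh samples by the million, and the LLM is reinforced toward outputs
# it predicts humans would rate highly.
print(reward_model("haha get fucked"))  # 0.0 -- reinforce away from this
```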
1/
So LLMs aren't stuck converging to the median of their corpus. (ChatGPT certainly doesn't say "haha get fucked" a MEDIAN amount for an algorithm trained on reddit!) Instead, LLMs can be trained toward any target.
But that target has to be:
1. Possible to train towards
2. Possible to hit with an LLM
3. Someone has to actually do the training
It's probably possible to train towards better essay writing. The corpus of all college essays and their associated grades would likely allow you to build a much better homework machine.
It's not clear that the target of "good writing" or "great writing" can actually be hit with an LLM. Some things that humans can do are just really, really, really hard. Compare self-driving cars, which climbed in capability and then plateaued for a good long while, somewhere that's miraculous by pre-digital standards and still inadequate by human standards. As far as I can tell this is because the task is actually super, super hard. It's possible that great writing is this sort of task.
Finally, for an LLM targeting some behavior to come into being, someone has to actually train it. Right now there's a lot of incentive for AI companies to train LLMs that produce text that corporations find useful, since corporations have all the money. And training a specialized LLM is unlikely to become arbitrarily cheap, since (unless the desideratum can be mechanically operationalized) you need a bunch of human work-hours to tell the machine what to do. So there may never be an attempt to train an LLM to get really good at writing undergraduate theses—there's just no market there.
All of this does rely on the idea that LLMs are not, ultimately, capable of recursive "takeoff"—that is, they don't represent a process that can feed off its own inputs to become arbitrarily good at something. (I think Ted Chiang's discussion of this theme here is useful: https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web) So far this seems true. As for what comes next, who knows.
Good luck with grading!
2/2