We’ve talked a little in recent months about some of the bold claims made for large language models (LLMs) and their potential to revolutionize the process of writing. (LLMs are colloquially known as “artificial intelligence,” or AI, even though they are no such thing by any meaningful standard.) Even an LLM skeptic like me can admit that the technology has its uses. AI can be handy for generating paraphrases or automating repetitive, predictable writing jobs, for instance.
Credulous tech dweebs, though, insist on imbuing LLMs with almost supernatural qualities that they manifestly do not possess. This superstitious regard for chatbots as all-knowing oracles is both hilarious and a little pitiful. Less blatantly, but just as insidiously, industry hucksters are still trying to sell LLMs as a solution for creative writing, such as a way to jumpstart a fiction project or analyze the quality of your prose. These are tasks for which chatbots are spectacularly unsuited.
To unpack this, let me tell the true story of the English paper that was, in my ten years on the job, the absolute worst academic project I’ve ever proofread.
The client’s stated goal was to present a quantitative analysis of the works of two nineteenth-century English poets, Robert Browning and Alfred, Lord Tennyson. His method was to take five poems by each poet, count up all the instances of consonance, assonance, alliteration, and onomatopoeia as they occurred, and compare the totals.
The savvy reader will realize that this method cannot, in fact, be meaningfully said to constitute an “analysis” of any kind. Alas, that did not stop the client from tossing those numbers around as if they meant something. The entire project (all 28 pages!) was a prolonged exercise in begging the question, that is, assuming the conclusion in the formulation of the hypothesis. His assertions essentially amounted to, “By identifying which poet uses alliteration more frequently, we will show who is the more alliterative poet.”
Worse, the project didn’t even accomplish what it set out to do, because it looked only at raw numbers. That is, the client gave no consideration to the relative length of the poems he was ostensibly analyzing. By chance or design, the selections by Browning averaged around 30 to 40 lines, while Tennyson’s were mostly shorter, one being only six lines.
So, a claim like “Robert Browning used alliteration 200 times in the course of his five poems, whereas Tennyson used it 180 times in his. Therefore, the quantity of alliteration in Browning’s work is greater than in Tennyson’s” is already a meaningless conclusion, and it is rendered doubly meaningless by the lack of context. If you’re looking at roughly twice as many lines by Browning as by Tennyson, then Tennyson is actually employing alliteration more frequently than Browning on a per-line basis.
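For the numerically inclined, here is a minimal sketch of that per-line arithmetic in Python. The raw totals (200 and 180) are the illustrative figures from the paragraph above; the line totals (180 and 90) are purely hypothetical stand-ins for “roughly twice as many lines,” not figures from the client’s paper, and the function name `per_line_rate` is my own invention.

```python
def per_line_rate(device_count: int, line_count: int) -> float:
    """Return how many times a device occurs per line of verse."""
    if line_count <= 0:
        raise ValueError("line_count must be positive")
    return device_count / line_count


# Illustrative alliteration totals from the paragraph above.
raw_counts = {"Browning": 200, "Tennyson": 180}

# HYPOTHETICAL line totals, chosen only so that Browning has
# roughly twice as many lines as Tennyson.
line_totals = {"Browning": 180, "Tennyson": 90}

# Normalize each raw total by the number of lines examined.
rates = {
    poet: per_line_rate(raw_counts[poet], line_totals[poet])
    for poet in raw_counts
}

# The "winner" depends on whether you normalize.
most_raw = max(raw_counts, key=raw_counts.get)
most_per_line = max(rates, key=rates.get)

for poet, rate in rates.items():
    print(f"{poet}: {raw_counts[poet]} total, {rate:.2f} per line")
print(f"Higher raw total: {most_raw}; higher per-line rate: {most_per_line}")
```

Under these assumed line totals, the ranking flips once the counts are normalized, which is the whole point: the raw totals say nothing until you know how many lines they came from.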
In any case, there was no attempt to attach any significance to these numbers, no effort to draw any conclusion or find any deeper truth. The paper failed even as an exercise in bean-counting, because the bags of beans were of unequal sizes, and some were fava beans and some were lentils. It was that rarest of papers, one that not only failed to inform but actually left you understanding less about the subject than when you started reading.
I stood helpless before it; there was nothing I could do to improve its substance. All I could do was clean up the language, leave a couple of notes pointing out (as gently as I could) the ways in which the client’s premise was fundamentally flawed, and suggest that he start over from scratch.
Now, this paper hit my desktop several years ago, and I’m 100 percent certain the author was a flesh-and-blood person. I mention it here because, in retrospect, the result of his analytical approach looks precisely like what you’d get from LLMs’ combination of brute-force computation and large-scale pattern recognition: data presented in a vacuum, betraying no interest in or understanding of any larger context, and words strung together in ways that sound plausible but ultimately form a semantic black hole, sucking in all surrounding meaning and annihilating it.
There are worthwhile questions those data could help answer. If one poet used alliteration comparatively rarely and another used it a lot, for instance, what does that tell us about what kind of poet each one was, about the respective literary traditions in which they worked, or about the different effects and purposes toward which they employed these various poetic devices? But a purely quantitative analysis, which is a truly mechanical approach, is not suited to grappling with these questions, whether it’s calculated by a human or an LLM.
So, what is the lesson here? Match the task to the strengths of the tool. Let the machines do the mechanical work; if you’re human, you’ll get the best results by thinking (and editing) like a human.
Jack F.