Date: 2026-03-08

On a Tuesday morning, you open your laptop to respond to your professor.

“Thank you for reaching out.”

You did not decide to write that. Instead, Gmail floated the phrase into your email, ghost-grey and waiting, and you tapped the tab key without thinking. The system wrote your email, and you merely approved it.

On your phone, your keyboard proposes whole words before you finish typing them. In Google Docs or Word, an AI assistant offers to “polish” your rough notes. In ChatGPT, you paste a messy paragraph and ask it to make you sound “more professional.”

Our digital lives are being overtaken, one small convenience at a time. The result is writing that is efficient, grammatically perfect and indistinguishable from everyone else’s. Over the last few years, researchers have mapped this phenomenon, and the emerging picture is not of some overnight collapse of language, but of a slow and comfortable erosion. Across predictive keyboards, smart replies and Large Language Models (LLMs), AI systems are pulling human text toward what is short, common and safe. They are squeezing our vocabulary, flattening our style and quietly tilting the world’s writing toward a single cultural default.

When your phone quietly trims your thoughts

Long before ChatGPT, there was the predictive keyboard. As part of a 2020 study, published as “Predictive Text Encourages Predictable Writing,” computer scientist Kenneth Arnold asked people to write simple one-sentence photo captions, sometimes with a predictive keyboard and sometimes without.

Captions written with suggestions were shorter, more predictable, and used fewer words that the system did not predict. People did not just type faster. They described less.

When the predictive text offered the default option, like suggesting “man” after “a,” writers routinely took it, bypassing more specific words like “hitter” or “baseball player.” The algorithm’s statistical preference for the generic gently guided users away from colorful detail. While the study was done in a controlled setting, the pattern is a familiar one: we sacrifice nuance for efficiency.

The conclusion from Arnold is a sobering one: predictive text fundamentally alters what we write, not just how fast.

Co-writing with language models: essays on rails

If a smartphone keyboard can trim our sentences, what happens when an LLM helps us write an entire essay? We often assume AI expands our creative horizons, but research from New York University suggests it might actually be narrowing them.

In a controlled experiment published at ICLR 2024, researchers Vishakh Padmakumar and He He set out to answer this question.

To measure the “AI effect,” the researchers recruited participants to write argumentative essays on open-ended topics, such as “Should Schools Teach Mindfulness?” or “Is It Selfish to Pursue Risky Sports?” Participants were divided into three distinct groups, writing their essays either 1) without any digital assistance, 2) with the help of GPT-3 or 3) with the help of InstructGPT, a model fine-tuned on human feedback which functions as the polite, helpful ancestor of ChatGPT. The researchers then analyzed the essays for “homogenization,” measuring whether the writers were sounding more like themselves or more like each other.

The results revealed a stark trade-off between assistance and variety. Essays co-written with InstructGPT (the feedback-tuned model optimized to follow instructions and be “safe”) were statistically less diverse and more homogenized than those written by the other groups. Surprisingly, the raw GPT-3 model did not have this effect. The “raw” model left the writers’ diversity intact, roughly on par with the solo writers.

Did the writers become lazy? Did seeing a perfect AI suggestion make them suppress their own unique voices? The data says no. When Padmakumar and He decomposed the essays, they found that the InstructGPT model tended to recycle the same “safe,” repetitive phrases across different users, creating a high degree of overlap. For example, while solo writers used a wide scatter of phrases, the InstructGPT-assisted essays frequently converged on identical 5-word sequences like “keep up with the news” or “students should learn in school.” The “human component” stayed varied; the “machine component” was nearly identical from one essay to the next.
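To make the idea of overlapping 5-word sequences concrete, here is a minimal sketch in Python. It is not the metric Padmakumar and He used; the function names and the three sample “essays” are invented for illustration, and the tokenization is deliberately crude (lowercase, split on whitespace).

```python
from collections import Counter


def ngrams(text, n=5):
    """Return the set of word n-grams in a text (lowercased, whitespace-split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(essays, n=5):
    """Return the n-grams that appear in at least two different essays."""
    counts = Counter()
    for essay in essays:
        # Each essay contributes a given n-gram at most once.
        counts.update(ngrams(essay, n))
    return {gram for gram, c in counts.items() if c >= 2}


# Hypothetical essays, made up for this example.
essays = [
    "i think it is important to keep up with the news every day",
    "students need to keep up with the news to stay informed",
    "risky sports are thrilling but dangerous",
]

shared = shared_ngrams(essays)
for gram in sorted(shared):
    print(" ".join(gram))
```

The more essays share the same 5-word sequences, the larger this set grows, which is one rough way to see homogenization in a pile of texts.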

This study uncovered a hidden tax on AI-assisted writing. The process of “aligning” models appears to reduce the entropy of their output, but in doing so it makes that consistent output repetitive.

When millions of users rely on the same models to draft emails, essays and articles, we face the risk of “algorithmic monoculture.” The AI doesn’t stop us from being creative, but it dilutes our unique contributions by flooding the page with statistically average, highly predictable filler words.

(Graphic: REBECCA BYERS/The Stanford Daily)

Shrinking vocabularies and smooth, anonymous prose

Zoom in further, down to the individual words, and another troubling question emerges: do these models actually use a narrower vocabulary than humans? A group led by Pedro Reviriego tackled this in Playing with Words: Comparing the Vocabulary and Lexical Diversity of ChatGPT and Humans. They compared ChatGPT’s answers to human responses on identical tasks and measured lexical diversity, which is essentially how deep into the dictionary a writer is willing to dig. While ChatGPT-3.5, the standard version of the model, was indeed stuck in a “vocabulary rut,” the newer ChatGPT-4 effectively closed that gap, displaying a lexical richness that rivaled, and occasionally even exceeded, human writers.

However, capability is not the same as behavior. As the researchers discovered in a follow-up study, Beware of Words, a model’s “default mode” is often to play it safe. By systematically varying the model’s configuration settings, the team found that lexical diversity is highly sensitive to parameters like “presence penalty,” which discourages repetition. Without active tuning, the models tend to recycle familiar words rather than reach for novel ones. The authors warn that this creates a potential feedback loop: if models rarely use specific words, those words may effectively go extinct in the digital environment, disappearing from the linguistic diet of future models and humans alike.
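Lexical diversity, the quantity both studies measure, can be approximated in several ways. One of the simplest is the type-token ratio: the number of distinct words divided by the total number of words. The sketch below is only an illustration and makes no claim to match the metrics used in the Reviriego papers; the function name and the two sample sentences are invented here.

```python
def type_token_ratio(text):
    """Distinct words divided by total words; 0.0 for empty input."""
    # Strip common punctuation and lowercase so "Study." and "study" match.
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    return len(set(words)) / len(words) if words else 0.0


# Hypothetical sentences, made up for this example.
repetitive = "The study is important. The study is very important."
varied = "Researchers compared essays, captions and abstracts across decades."

print(round(type_token_ratio(repetitive), 3))
print(round(type_token_ratio(varied), 3))
```

One caveat: the type-token ratio falls as texts get longer, so it is only meaningful when comparing passages of similar length.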

This linguistic flattening becomes even more obvious when you step back to look at entire essays. In a large-scale 2023 study published in Scientific Reports, Steffen Herbold and his colleagues compared hundreds of student essays against those generated by ChatGPT. The results revealed a paradox: while the AI essays were rated higher for quality and logical flow, they possessed a rigid, almost robotic, signature. The AI writing relied on repetitive structural scaffolding, almost always using standard transition phrases and identical “In conclusion” endings, while stripping away opinions and attitudes, the messy, distinct textures that make human writing unique.

So yes, models can be lexically rich on paper. But as soon as you let them write for you at scale, their preferences start to dominate. They like certain words, certain rhythms, certain turns of phrase. Those become the water we all swim in.

When AI makes you sound more American

The flattening of language isn’t just stylistic; it’s cultural. In a 2025 study titled AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances, Dhruv Agarwal and colleagues at Cornell University recruited 118 participants from India and the United States and asked them to write about deeply personal, culturally grounded topics: their favorite foods, holidays, and local celebrities.

When participants wrote on their own, the cultural distinctions were clear. But when they wrote with an AI assistant, the lines began to blur. The assistant, trained on Western-centric data, defaulted to American norms. In the “favorite food” task, the AI suggested “pizza” more readily than local dishes; for holidays, it nudged users toward “Christmas” regardless of their background. The friction was even more obvious with famous figures: when Indian participants typed “S” to write about the Bollywood icon Shah Rukh Khan, the AI would frequently attempt to autocomplete the name as “Shaquille O’Neal” or “Scarlett Johansson.”

The study found that while Americans enjoyed a straightforward efficiency boost, the experience was different for Indian users. Surprisingly, Indian participants actually accepted more of the AI’s suggestions than the Americans did — 25% versus 19% — but they had to do significantly more work to fix them. They were forced to constantly edit the AI’s defaults to fit their own reality, yet the final essays still drifted toward Western phrasing and structure.

The result is a digital path of least resistance that bends toward the West. Billions of users are being quietly nudged toward Standard American English, not because they choose it, but because it is the only option the dropdown menu offers.

When AI starts talking to itself: model collapse

So far, we have looked at how AI tools change the way humans write. A separate line of work asks what happens once the models themselves start learning mostly from their own output, a scenario that seems likely because forecasting teams now expect frontier LLMs to fully use up the stock of high-quality public human text sometime between 2026 and 2032. As this bottleneck approaches, both researchers and companies are turning to synthetic data generated by earlier models.

It sounds like a sci-fi paradox, but it is a statistical reality. In The Curse of Recursion: Training on Generated Data Makes Models Forget, Ilia Shumailov and coauthors model what happens when each new generation of a system is trained not on the messy and creative chaos that is human text, but on a mix of human and model-generated data.

The researchers found that as AI-generated data is fed back into the training loop, the model begins to suffer from irreversible defects, losing access to the tails of the original distribution.

In plain English? The model stops understanding the edges. Rare phrases, unique dialects, and out-of-the-way patterns get under-represented in the synthetic data. Consequently, the next generation of the model under-represents them even further. Shumailov calls this degenerative process Model Collapse.
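The mechanism can be seen in a toy simulation. This is not the experimental setup from Shumailov’s paper: it is a minimal sketch in which each “generation” is simply a word-frequency table re-estimated from a finite sample of the previous one. The vocabulary, probabilities, sample size and function names are all invented for illustration.

```python
import random
from collections import Counter


def next_generation(dist, sample_size, rng):
    """Sample words from the current distribution, then re-estimate it."""
    words = list(dist)
    weights = [dist[w] for w in words]
    sample = rng.choices(words, weights=weights, k=sample_size)
    counts = Counter(sample)
    # Words that were never sampled drop out and can never come back.
    return {w: c / sample_size for w, c in counts.items()}


def simulate(dist, generations=20, sample_size=200, seed=0):
    """Return the vocabulary size after each generation."""
    rng = random.Random(seed)
    sizes = [len(dist)]
    for _ in range(generations):
        dist = next_generation(dist, sample_size, rng)
        sizes.append(len(dist))
    return sizes


# Hypothetical "human" vocabulary: 5 common words and 50 rare ones.
human = {f"common{i}": 0.18 for i in range(5)}
human.update({f"rare{i}": 0.002 for i in range(50)})

sizes = simulate(human)
print(sizes[0], "->", sizes[-1])
```

Because a word that goes unsampled in one generation has zero probability in the next, the vocabulary can only shrink, and the rare words are the first to go.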

It is a kind of statistical echo chamber. If future LLMs are trained on text that previous LLMs already smoothed out, we can expect fewer surprising turns of phrase, fewer idiosyncratic constructions, less linguistic weirdness. And that is before we even ask what happens to smaller languages or dialects that were barely represented to begin with.

The long-term risk is that we end up with language models that are very good at producing decent, generic English, and gradually incapable of anything else — losing the unique spark that made the original human data valuable in the first place.



(Graphic: REBECCA BYERS/The Stanford Daily)

We really are starting to “sound like ChatGPT”

If this still feels abstract — a theoretical problem for future historians — consider the world outside the lab. Synthetic data is no longer just training the models; it is already intermixed with human language, reshaping not only how we write but also how we speak.

In June 2025, The Verge ran a feature titled “You sound like ChatGPT,” reporting on research from the Max Planck Institute for Human Development in Berlin.

The researchers analyzed nearly 280,000 academic YouTube videos and tracked word usage before and after the public release of ChatGPT. In the 18 months after launch, speakers used words that ChatGPT particularly favors, such as “meticulous,” “delve,” “realm,” and “adept,” up to 51% more often than in the three previous years.

The speakers in those videos were not reading from a script generated by ChatGPT. They were just talking. Yet their language shifted in the direction of the model’s lexicon.

Hiromu Yakura, the lead author, told The Verge that “we internalize this virtual vocabulary into daily communication.” The article notes that “delve” has become a kind of academic shibboleth, a subtle watermark for AI influence in speech.

In written media, this drift is even more aggressive. In a massive study of 15 million biomedical abstracts, Dmitry Kobak and colleagues tracked what they call “excess vocabulary”: words that appear far more often than historical trends can explain. While major world events usually spike “content” words (like “pandemic” or “lockdown” during COVID-19), the post-ChatGPT era has seen an unprecedented explosion of “style” words. The usage of “delves” skyrocketed, showing a 28-fold increase in frequency ratio. Even common words like “potential” and “crucial” are being deployed with a frequency that suggests at least 13.5% of all 2024 biomedical abstracts were processed by LLMs.
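The arithmetic behind an “excess” word is simple: compare how often a word actually appears with how often past trends say it should. The sketch below assumes the expected frequency is already known and does not reproduce how Kobak’s team projected it. The function name and the counts are invented, chosen only so that the result comes out to a 28-fold ratio.

```python
def excess_usage(observed_count, total_documents, expected_freq):
    """Return (ratio, gap) between observed and expected word frequency."""
    observed_freq = observed_count / total_documents
    ratio = observed_freq / expected_freq   # how many times more often
    gap = observed_freq - expected_freq     # extra share of documents
    return ratio, gap


# Hypothetical numbers: a word expected in 0.1% of abstracts
# turns up in 280 out of 10,000.
ratio, gap = excess_usage(280, 10_000, 0.001)
print(f"ratio: {ratio:.1f}x, gap: {gap:.3f}")
```

The ratio flags rare words that suddenly surge, while the gap is more useful for common words like “potential,” where even a small relative rise touches a large share of documents.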

The use of LLMs has become more prevalent in certain scientific fields. In a broader analysis of over 1.1 million papers, Weixin Liang and his team mapped this saturation across different disciplines. Their data reveals that computer science papers are leading the charge, with the estimated proportion of LLM-modified sentences in abstracts hitting 22.5% by September 2024. In contrast, fields with perhaps more rigid stylistic norms or more rigorous gatekeeping, such as mathematics and journals in the Nature portfolio, have shown more resistance, with adoption rates hovering closer to 8% or 9%. Not surprisingly, the tools are being adopted fastest by the very people building them.

Yet the broadest shifts are happening outside the ivory tower. In a separate systematic analysis, Liang and colleagues examined over 680,000 consumer complaints filed with the Consumer Financial Protection Bureau (CFPB). They found that following the release of ChatGPT, LLM-assisted writing surged, stabilizing at roughly 18% of complaint text by late 2024.

Intriguingly, areas with lower educational attainment showed higher adoption, suggesting these tools may be democratizing access to polished, formal prose. For consumers fighting bank fees or credit errors, AI has become a quiet collaborator.

That same synthetic fluency now echoes through boardrooms and embassies. The same analysis found that up to 24% of corporate press releases show signs of AI modification. Young startups are automating up to 15% of their job postings. And in the high-stakes world of diplomacy, an analysis of nearly 16,000 United Nations press releases reveals that 14% of content bears AI’s fingerprint.

Can we keep the tools without losing the “Wind of Freedom”?

For a university deeply entrenched in the development of these very technologies, this research poses a specific challenge. Stanford’s motto, “Die Luft der Freiheit weht” (“The wind of freedom blows”), suggests an environment of intellectual openness and distinct, unbound thought. But the research is clear: if we glide along with AI defaults, the wind stops blowing. The air becomes still and recycled.

The upside is that there are several levers we can pull, and this campus is the place to pull them.

At the model and data level, companies and researchers can decide who their “default” user is and whose language counts as data. Agarwal’s study makes it clear that a supposedly neutral English assistant is in practice a Western, often American assistant. That argues for genuinely localized models trained on region-specific corpora, as well as for transparent “data statements” and “datasheets” that document which languages and dialects are represented and which are not. It also argues for treating lexical diversity itself as a first-class objective. If labs here and elsewhere tune only for “helpfulness” and user satisfaction, they will quietly optimize toward safe, repetitive phrasing; if, however, they optimize for lexical and cultural diversity as well, the models can be pushed to keep the language space wide.

There is also a deeper stewardship question: whose writing will future models learn from? In the face of “model collapse,” institutions like Stanford should help preserve large, clean corpora of human-authored text and insist that commercial systems keep training on a meaningful share of real human language, not only on the increasing slop of synthetic text.

At the interface level, designers are not powerless either. Arnold’s caption-writing experiment suggests that always-visible, low-friction suggestions are exactly the ones that compress language most aggressively. Decision-support research on automation bias adds that people are inclined to over-trust automated suggestions simply because they come from a system. Taken together, that points toward interfaces where full-sentence completions require a deliberate action, where AI-inserted text is clearly highlighted, and where writers are nudged to revise or choose among multiple options rather than accept the first fluent continuation. Co-writing studies also suggest that higher-level planning and reviewing assistance can boost quality while preserving authors’ sense of ownership, as long as the system does not silently overwrite their voice.

For a university, though, the most powerful levers are institutional. AI literacy is not just about knowing which button to click; it is about understanding when using a model undermines learning or expression. Stanford’s own Teaching Commons frames “understanding AI literacy” as a mix of functional, ethical and rhetorical skills, including the ability to decide when to resist automation. The MLA–CCCC task force similarly calls on colleges to build a “culture for generative AI literacy” that helps students retain their agency and know when not to use these tools. And a recent EDUCAUSE framework urges universities to treat AI literacy as a core competency, emphasizing vigilance about bias and misuse.

Policy is part of that literacy. Journals from Science to Nature now forbid listing AI tools as authors and require authors to disclose substantive use of generative systems in writing and figures. A BMJ bibliometric analysis finds that many major publishers have already introduced explicit AI guidance, though coverage is uneven. Stanford can mirror and extend these norms in its own classrooms and journals, insisting that the intellectual work of framing questions, making arguments and choosing words is done by people and that any AI assistance is openly labeled rather than smuggled in. That protects what Nature recently reminded its readers of in a blunt editorial title: “Writing is thinking.”

And on an individual level, it helps to think of these tools as collaborators, not ghostwriters. Ask for multiple options but then splice them together. Use AI to brainstorm but then write the final draft yourself. Refuse the smart reply and type the messy, human sentence instead.

The stakes are not merely aesthetic. As Mor Naaman, one of the authors of the Cornell study, told The Verge, we risk losing signals of humanity, effort and ability when we outsource our wording. The tidy, correct sentence is often the one that hides how much of you went into it.

If AI is compressing language, it is also compressing the space in which you decide what you meant to say. The machines are very good at averaging everyone’s language into something competent. The challenge for us is to ensure our own voices are not averaged away in the process.