If you’ve spent any time reading AI-generated text, you’ve noticed it: em dashes everywhere. A single paragraph from ChatGPT, Claude, or Gemini can contain three or four of them. It has become such a reliable signal that readers now half-jokingly treat the em dash as a watermark for “a robot wrote this.” So why does it happen?
It learned from us — but unevenly
Large language models learn punctuation by absorbing enormous amounts of text: books, articles, web pages, forums. The em dash is common in published, edited prose; journalists and authors love it because it’s flexible and almost never ungrammatical. Models pick up on that and reach for it whenever a sentence could use a pause, an aside, or a connector.
The catch is that models don’t reproduce human frequency; they amplify patterns that carry little risk. Because the em dash almost never produces a grammatical error, it’s a “safe” choice. Where a human writer might agonise over whether to use a comma, a colon, or a new sentence, a model can drop in an em dash and move on. Multiply that across millions of sentences and you get text that is statistically over-dashed.
Markdown bleeds into prose
There’s a second, more technical reason. Many models are trained and fine-tuned on text that mixes Markdown formatting with ordinary writing. Dash-like characters (strictly speaking, hyphens) appear constantly in Markdown as bullet markers, horizontal rules, and table separators. Some of that “dash habit” leaks into normal prose generation, nudging the model toward dashes even when plain punctuation would read more naturally.
Reinforcement toward a “polished” voice
Models are also tuned with human feedback to sound articulate and confident. The em dash carries a certain rhetorical flourish — it feels emphatic and writerly. Training that rewards “good writing” can inadvertently reward the surface features of good writing, including a heavier hand with dramatic punctuation. The result is prose that looks sophisticated at a glance but reads as formulaic once you notice the pattern.
Is the em dash bad? No.
It’s worth saying clearly: the em dash is excellent punctuation. Emily Dickinson built a whole style on it. The problem isn’t the mark — it’s the density. Human writers use an em dash occasionally, for effect. AI text uses it as a default connector, and that uniformity is what makes writing feel machine-made.
What to do about it
If you want AI-assisted text to read as your own, you don’t need to eliminate every em dash. You need to rebalance them:
- Some should become commas — for light, in-sentence pauses.
- Some should become periods — when two independent clauses are joined by a dash, a comma would create a splice, so a full stop (or semicolon) is correct.
- Some should become colons — when the dash introduces an explanation or list.
- A few should stay — used sparingly, they’re effective.
- The occasional one is really an en dash — a range like “May—September” that got the wrong character.
Doing this by hand is tedious, and a naive “replace all em dashes with commas” makes things worse by creating comma splices and run-ons. That’s the whole reason we built a grammar-aware em dash remover: it looks at the words around each dash and picks the replacement that fits, so your text stays grammatical instead of just dash-free.
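To make the rebalancing rules concrete, here is a minimal Python sketch. It is not the tool described above, only an illustration under loud assumptions: the function name `rebalance_dashes`, the word lists, and the heuristics are all hypothetical, and a real grammar-aware tool would need proper parsing rather than peeking at one word on either side of the dash.

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"

# Hypothetical word lists, for illustration only.
MONTHS = {
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
}
CLAUSE_STARTERS = {
    "i", "he", "she", "it", "we", "they",
    "i'm", "it's", "he's", "she's", "we're", "they're", "that's", "there's",
}

WORD = re.compile(r"[\w'\u2019]+")
DASH = re.compile(r"\s*" + EM_DASH + r"\s*")


def _norm(word):
    """Lowercase a word and straighten curly apostrophes."""
    return word.lower().replace("\u2019", "'")


def rebalance_dashes(text):
    """Replace each em dash using rough one-word-of-context heuristics."""
    parts = DASH.split(text)
    out = parts[0]
    for right in parts[1:]:
        left_words = WORD.findall(out)
        right_words = WORD.findall(right)
        left = _norm(left_words[-1]) if left_words else ""
        first = _norm(right_words[0]) if right_words else ""
        # The rest of the current sentence, used to spot a list.
        rest = re.split(r"[.!?]", right, maxsplit=1)[0]

        if (left.isdigit() and first.isdigit()) or (left in MONTHS and first in MONTHS):
            # A range such as 1990-1995 or May-September: use an en dash.
            out += EN_DASH + right
        elif first in CLAUSE_STARTERS:
            # Likely a new independent clause: full stop and a capital letter.
            out += ". " + right[0].upper() + right[1:]
        elif "," in rest and re.search(r"\b(and|or)\b", rest):
            # Looks like the dash introduces a list: use a colon.
            out += ": " + right
        else:
            # Default: treat the dash as a light pause.
            out += ", " + right
    return out


print(rebalance_dashes("She paused \u2014 briefly."))  # She paused, briefly.
```

Note that this sketch replaces every em dash it finds. Deciding which few to keep for effect is still a judgement call for the writer, and the one-word heuristics will misfire on plenty of real sentences.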
The bigger picture
The em dash is only the most visible AI tell. The same overuse pattern shows up in clichéd phrases (“it’s not just X, it’s Y”), filler vocabulary (“delve”, “tapestry”, “leverage”), and invisible formatting characters. Cleaning the dashes is a great first step — and once you start noticing the patterns, you’ll write (and edit) more deliberately, with or without an AI assistant.