Blogger (affiliation unknown)
Her blog post on em-dash usage in AI models, and on digitization as a driver of the phenomenon, is referenced and discussed.
How media typically covers Maria Sukhareva
Research or work cited
Language models use em-dashes at unusually high frequencies compared to human writing, but the underlying cause remains unexplained despite common theories about training data, tokenization efficiency, and RLHF practices.