Just recently read the paper "Delving into ChatGPT usage in academic writing through excess vocabulary" by Kobak et al. Their premise (from the abstract) is that the [models] can produce inaccurate information, reinforce existing biases, and can easily be misused. So the authors analyse PubMed abstracts for vocabulary changes and identify certain words that have become more common post-LLM. They find that words such as "delves", "showcasing", "underscores", "intricate", "excel", "pivotal", "encompassing", and "enhancing" all show increased usage, and are hence suspect.
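
To make the method concrete, here's a minimal sketch of that kind of analysis, assuming two hypothetical corpora of abstracts (a pre-LLM baseline and a post-LLM sample). The `load_abstracts` helper and file names are placeholders, and the paper's actual analysis is more careful about baseline frequency trends; a simple post/pre frequency ratio just captures the idea:

```python
from collections import Counter
import re

# Candidate "excess vocabulary" words highlighted in the paper.
CANDIDATES = ["delves", "showcasing", "underscores", "intricate",
              "excel", "pivotal", "encompassing", "enhancing"]

def word_frequencies(abstracts):
    """Return per-word relative frequency across a list of abstract strings."""
    counts = Counter()
    total = 0
    for text in abstracts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(tokens)
        total += len(tokens)
    return {w: c / total for w, c in counts.items()}

def excess_usage(pre_abstracts, post_abstracts, words=CANDIDATES):
    """Simplified excess-usage score: post/pre frequency ratio per word."""
    pre_freq = word_frequencies(pre_abstracts)
    post_freq = word_frequencies(post_abstracts)
    eps = 1e-12  # avoid division by zero for words unseen in one corpus
    return {w: (post_freq.get(w, 0.0) + eps) / (pre_freq.get(w, 0.0) + eps)
            for w in words}

# Hypothetical usage (load_abstracts and file names are assumptions):
# pre = load_abstracts("pubmed_2021.txt")    # pre-LLM baseline
# post = load_abstracts("pubmed_2024.txt")   # post-LLM sample
# for word, ratio in sorted(excess_usage(pre, post).items(),
#                           key=lambda kv: -kv[1]):
#     print(f"{word}: {ratio:.1f}x")
```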

While this data is indeed interesting, I wonder why LLMs tend to use these words in the first place. Aren't LLM outputs supposed to reflect the data they were trained on? Surely that would mean these words are more common in some training data set than we'd expect?