Large Language Models (LLMs) are trained on vast amounts of text from diverse sources, which is what enables them to excel at language-related tasks. However, the unsupervised nature of this training also allows these models to internalize and propagate the societal biases embedded in that text. And since the biased associations an LLM learns are ultimately derived from the presence of those same associations in its training texts, the people who write and publish those texts deserve scrutiny for their role in seeding these biases.
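
To make "biased associations" a bit more concrete before we dive in, here is a minimal sketch of one common way to probe them: scoring a minimally different sentence pair with an off-the-shelf causal LM and comparing the log-likelihoods (a CrowS-Pairs-style comparison). The model choice (`gpt2`) and the example pair are illustrative assumptions, not a claim about any particular study's setup.

```python
# Minimal sketch (illustrative assumptions): probe a causal LM for a biased
# association by comparing how it scores two minimally different sentences.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_log_likelihood(sentence: str) -> float:
    """Total log-probability the model assigns to the sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # With labels supplied, the model returns the mean cross-entropy
        # over the predicted (shifted) tokens.
        outputs = model(**inputs, labels=inputs["input_ids"])
    num_scored_tokens = inputs["input_ids"].shape[1] - 1
    return -outputs.loss.item() * num_scored_tokens

# Hypothetical stereotypical vs. anti-stereotypical pair.
pair = (
    "The nurse said she would be late.",
    "The nurse said he would be late.",
)
for sentence in pair:
    print(f"{sentence_log_likelihood(sentence):8.2f}  {sentence}")
```

A single pair proves nothing on its own; the signal comes from the model consistently preferring one variant across many such pairs, which suggests it has absorbed that association from its training data.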

Enjoy the read.