Researchers at EPFL have discovered that large language models, predominantly trained on English text, appear to internally utilize English, even when prompted in a different language. This finding could have significant implications for linguistic and cultural bias as AI continues to permeate our daily lives.

Large language models (LLMs) such as OpenAI’s ChatGPT and Google’s Gemini have captivated the world with their ability to understand and respond in seemingly natural language.

These LLMs can interact in any language, but they are trained predominantly on English text, with parameters numbering in the hundreds of billions. It has been speculated that the models do most of their internal processing in English and translate into the target language only at the very last moment. Until now, however, this hypothesis lacked concrete evidence.

Probing Llama

Researchers from the Data Science Laboratory (DLAB) at EPFL’s School of Computer and Communication Sciences investigated the open-source LLM, Llama-2, to ascertain which languages were employed at various stages of the computational process.

“Large language models are designed to predict the subsequent word. They achieve this by associating each word with a numerical vector, essentially a multi-dimensional data point. For instance, the word ‘the’ will always correspond to the same fixed numerical coordinates,” elaborated Professor Robert West, the head of DLAB.
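
To make this concrete, here is a minimal sketch in Python (using PyTorch). The vocabulary and vector size are invented for illustration; a real model such as Llama-2 uses a vocabulary of tens of thousands of tokens and vectors with several thousand dimensions, but the principle is the same: a given token always maps to the same fixed vector.

```python
import torch

# Toy illustration (not Llama-2 itself): an embedding table that maps each
# word in a tiny, invented vocabulary to a fixed vector of coordinates.
torch.manual_seed(0)
vocab = {"the": 0, "cat": 1, "sat": 2}
embedding = torch.nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

# Looking up "the" returns the same 8-dimensional vector every time.
the_vector = embedding(torch.tensor([vocab["the"]]))
print(the_vector)
```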

“These models link together around 80 layers of identical computational blocks, each transforming one word-representing vector into another. At the end of these 80 transformations, the output is a vector representing the next word. The number of computations is determined by the number of computational block layers—the more computations performed, the more potent the model and the higher the likelihood of the next word being accurate.”
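
The layered structure West describes can likewise be sketched as a loop over identical blocks. In the toy version below the blocks are plain linear layers rather than real transformer layers (which combine attention and feed-forward sublayers), and all sizes are invented; the point is only the shape of the computation: a word vector goes in, is transformed roughly 80 times, and the final vector is read out as scores over the vocabulary.

```python
import torch
import torch.nn as nn

DIM, N_LAYERS, VOCAB_SIZE = 8, 80, 3   # invented sizes for illustration

# A stack of identical blocks, each turning one word-representing vector
# into another (real blocks are transformer layers, simplified here).
blocks = nn.ModuleList(nn.Linear(DIM, DIM) for _ in range(N_LAYERS))
unembed = nn.Linear(DIM, VOCAB_SIZE)   # maps the final vector to word scores

hidden = torch.randn(DIM)              # vector for the current word
for block in blocks:                   # ~80 successive transformations
    hidden = torch.tanh(block(hidden))
next_word_scores = unembed(hidden)     # highest score = predicted next word
print(next_word_scores.argmax().item())
```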

In their paper, “Do Llamas Work in English? On the Latent Language of Multilingual Transformers,” available on the pre-print server arXiv, West and his team forced the model to respond after each layer while predicting the next word, instead of allowing it to complete its 80-layer calculations. This approach enabled them to see the word the model would predict at that stage. They set up various tasks, such as asking the model to translate a series of French words into Chinese.

“We provided a French word, followed by its Chinese translation, another French word and its Chinese translation, and so on, so that the model knew it was supposed to translate the French word into Chinese. Ideally, the model should assign a 100% probability to the Chinese word, but when we forced it to make predictions before the final layer, we found that it most often predicted the English translation of the French word, even though English was not involved in this task. It was only in the last four to five layers that Chinese became more likely than English,” West stated.
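
In code, this layer-by-layer probing can be sketched roughly as follows. This is a hypothetical illustration rather than the authors’ released code: it assumes the Hugging Face transformers library and access to the gated Llama-2 weights, builds a few-shot French-to-Chinese prompt in the spirit of the experiment (the word pairs here are chosen only for illustration), and decodes every intermediate layer’s hidden state with the model’s own output head, a “logit lens”-style readout, to see which word the model would predict at that depth.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes access to the gated Llama-2 checkpoint on the Hugging Face Hub;
# in practice you would load it in reduced precision and/or on a GPU.
model_name = "meta-llama/Llama-2-7b-hf"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Few-shot prompt: French words paired with their Chinese translations,
# ending with a word whose translation the model must predict.
prompt = (
    'Français: "fleur" - 中文: "花"\n'
    'Français: "montagne" - 中文: "山"\n'
    'Français: "eau" - 中文: "'
)
inputs = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Instead of waiting for the final layer, decode the last position of every
# layer's hidden state with the model's final norm and unembedding matrix.
for layer, hidden in enumerate(out.hidden_states):
    logits = model.lm_head(model.model.norm(hidden[:, -1, :]))
    print(f"layer {layer:2d} -> {tok.decode(logits.argmax(dim=-1))!r}")
```

In the paper’s experiments, it is in these intermediate readouts that the English translation tends to dominate, with the correct Chinese token overtaking it only in the last few layers.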

From Words to Ideas

A straightforward hypothesis might suggest that the model translates the entire input into English and only translates into the target language at the end. However, the researchers’ data analysis led to a more intriguing theory.

In the first phase of the calculations, neither the Chinese word nor its English counterpart receives any probability, which the researchers interpret as the model working out what the input is asking of it. In the second phase, where English dominates, they propose that the model is operating in an abstract semantic space: it is reasoning not about individual words but about other, more concept-like representations that are universal across languages and more descriptive of the world. This matters because accurately predicting the next word requires extensive world knowledge, and a concept-level representation is how the model can encode it.

“We hypothesize that this world representation in terms of concepts is biased towards English, which is logical considering these models were exposed to approximately 90% English training data. They map input words from a superficial word space into a deeper concept space where there are representations of how these concepts relate to each other in the world—and the concepts are represented similarly to English words, rather than the corresponding words in the actual input language,” West explained.

Monoculture and Bias

An essential question that emerges from this English dominance is: does it matter? The researchers believe it does. Extensive research indicates that the structures present in a language shape how its speakers perceive reality, and that the words we use are deeply intertwined with our worldview. West suggests that we need to start studying the psychology of language models, treating them as we would human subjects and putting them through behavioral tests and bias assessments in different languages.

“This research has struck a chord as people are becoming increasingly concerned about potential monoculture issues. Given that the models perform better in English, many researchers are now exploring the idea of inputting English content and translating it back to the desired language. While this might work from an engineering perspective, I would argue that we lose a lot of subtlety because what cannot be expressed in English will not be expressed,” West concluded.

6 COMMENTS

  1. This research is a fascinating look into the inner workings of large language models (LLMs). It’s interesting to see how these models, despite being able to interact in any language, seem to process information internally in English. This could be because their training data is predominantly English text. The implications of this finding are significant, especially when considering linguistic and cultural bias in AI.

  2. While the research is indeed intriguing, it also raises some concerns. If these models are processing information internally in English, what does this mean for non-English languages? Could this lead to a loss of linguistic diversity and nuance in AI responses? It’s crucial to consider these questions as AI continues to permeate our daily lives.

  3. The methodology used in this research is quite innovative. By forcing the model to respond after each layer while predicting the next word, the researchers were able to gain insights into the model’s internal processes. This approach could pave the way for more in-depth studies into the functioning of LLMs.

  4. The findings of this research highlight the importance of considering linguistic and cultural diversity in AI development. If we continue to train these models predominantly on English text, we risk creating AI systems that are biased towards English, potentially marginalizing other languages and cultures.

  5. It’s worth noting that the researchers are not suggesting that these models are consciously “thinking” in English. Instead, they propose that the models operate in an abstract semantic space during the second phase of calculations, where they reason about concept-oriented representations that are universal across languages. This is a crucial distinction that adds a layer of complexity to the discussion.

  6. The question raised by the researchers, “does it matter?”, is indeed essential. If the structures present in language shape our perception of reality, then the dominance of English in these models could have far-reaching implications. It’s crucial that we continue to explore this issue and work towards creating AI systems that respect and represent linguistic and cultural diversity.