A surprising amount of content available online is generated by artificial intelligence
Does your company use machine translation to communicate with its international audiences? Then this article is for you. A recent study revealed that a majority of the content found online is translated or generated by artificial intelligence. This raises concerns about the reliability and future quality of that content if AI models are in fact learning from content they created themselves. The result appears to be a closed learning cycle that feeds on itself, with AI-generated content serving as training data for AI. This self-feeding of data can lead to a progressive degradation in the quality of machine translations and AI-produced content.
M21Global is a translation company committed to delivering quality translations and focused on helping companies reach new markets as they expand their operations across borders. It positions itself as a strategic partner that helps companies communicate with their international partners and clients.
The study, conducted by researchers at the Amazon Web Services Artificial Intelligence Laboratory in collaboration with the University of California, Santa Barbara, analyzed more than six billion sentences on the internet. It concluded that more than half of that content had passed through one or more iterations of translation (i.e. content translated into one language, and then from that language into another), and that a large part of it was of low translation quality. Furthermore, the study shows that quality deteriorated significantly as these translations went through further iterations. The researchers found that content translated into many languages was of significantly lower quality than content translated from one language into just one other, which is more likely to be human translation.
These findings highlight the challenges of using machine translation, namely the accuracy, fluency and reliability of content generated by artificial intelligence systems. The problem is amplified when AI-generated content is continually reintroduced into the system as training data. The quality of the content produced then risks degrading over time, particularly if the model perpetuates or amplifies errors in training data that it previously created itself.
A comparable example can be found in recommendation systems, such as those used by streaming platforms like Netflix. These systems recommend content based on users’ past preferences. If they fail to collect up-to-date feedback on how users’ tastes or interests have changed, the self-feeding cycle leads to less and less relevant recommendations: as users interact less with the recommendations provided, the system has less fresh data to learn from and drifts further from users’ current preferences.
M21Global is a translation company focused on supporting companies in their internationalisation by offering high-quality specialised technical and legal translations. We help your company overcome the challenges of global communication, with the mission of ensuring accuracy and preserving cultural and linguistic diversity, and so contribute to your company’s international success.