Transformer: The Idea That Gave A Solid Boost To Artificial Intelligence

What is a transformer, and how did Google Brain’s 2017 paper, “Attention Is All You Need”, revolutionize the world of artificial intelligence and generative models, with OpenAI beating everyone to the punch? Transformers are a neural network architecture used in artificial intelligence for natural language processing (NLP) and for other tasks that involve recognizing patterns in sequential data.

This architecture was introduced in the 2017 paper “Attention Is All You Need”, written and presented by researchers at Google Brain. The researchers wanted to introduce a new approach to processing data sequences, overcoming some limitations of previous architectures, such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs).

Transformers Are The Attention Mechanism

The main feature of transformers is the use of attention mechanisms (the word “attention” appears in the title of the Google Brain paper) to manage and process input data sequences. Attention is a fundamental concept in natural language processing and in machine learning in general. It refers to the model’s ability to give more weight to specific parts of the input during processing, based on their relevance or importance.

Attention allows models to focus on specific parts of the input during learning and so make better decisions. Earlier, we mentioned sequential data: sets of data organized in sequence, where the order of the data units is essential and carries a specific meaning. Sequential data is structured so that each element is linked to its predecessor and its successor in the sequence. This is the case in natural language, music, video, and other areas, where the order of the units is essential to understanding the overall meaning.

As noted, the transformer can assign different weights to different parts of the input, focusing on those most relevant to text generation. In this way, it can capture long-range relationships between the words the user provides as input, improving its ability to generate coherent text that stays in line with the context.
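The weighting described above can be illustrated with a minimal sketch of scaled dot-product attention, the operation at the core of the 2017 paper. This is a simplification written in plain Python under stated assumptions: the three token vectors are made-up numbers, the function names softmax and attention are our own, and the same vectors serve as queries, keys, and values, whereas a real transformer first passes the input through learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))  # square root of the key dimension
    outputs, all_weights = [], []
    for q in queries:
        # Relevance of every key to this query: dot product, then scaling.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Output for this position: the values averaged with those weights.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional vectors standing in for a three-token sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: the sequence attends to itself (no learned projections here).
outputs, weights = attention(tokens, tokens, tokens)

for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sequence, including itself. Because every position is compared directly with every other position, distance within the sequence does not limit which relationships the model can pick up.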

The Revolution Introduced By OpenAI’s Generative Models

OpenAI presented the ChatGPT chatbot in November 2022 and, with it, the generative models that power it. The most recent and powerful to date is GPT-4, and GPT-5 could already be on the horizon, although the company’s co-founder and CEO, Sam Altman, has vehemently denied it.

That event proved to be a real earthquake for the market and for the world of information technology in general: OpenAI had put artificial intelligence (AI) into the hands of all users, building on the 2017 paper “Attention Is All You Need”. What Google Brain had put in black and white was turned into an enormously successful commercial project by OpenAI.

It was one of the few situations in which Google felt overtaken by the competition, sensing only then the revolutionary impact that AI would have, for example, on online search. Shortly afterwards, Google took action by intensifying its investments and efforts in its own artificial intelligence solutions and, more recently, by merging the Brain team with DeepMind. Bard was only a first “taste”: from now on, the company wants to “get serious” and quickly recover lost ground.

Google Is Lagging Behind OpenAI: Why

Until the presentation of ChatGPT and the GPT (Generative Pre-trained Transformer) models, the various players in the sector had moved with proverbial leaden feet, that is, with great caution. The reasons are many and varied: among them is the difficulty of developing models that are less influenced by the so-called biases in the training data.

Biases are prejudices or distortions within the training data used to create AI models. Imbalanced or distorted representations typically affect models negatively: the models can, in turn, treat these biases as “reliable” and reproduce them in the results they generate. Safeguards can also be circumvented: the DAN (“Do Anything Now”) role-playing prompt bypassed the safeguards imposed by OpenAI, exposing a ChatGPT chatbot without filters or censorship.

Aside from these “special cases”, OpenAI was the first to achieve what others had not yet managed to create, and it built a profitable business on top of it. In short, while Google Brain was ahead of everyone with a revolutionary idea like the one described in “Attention Is All You Need”, the parent company does not seem to have had the strength to invest in it with enough conviction to develop a successful commercial product, as OpenAI did.

What Happened To The Researchers Of The Attention Is All You Need Document?

In short, for once (even if the Mountain View company has made several missteps throughout its history, as is normal), Google failed to notice the gold mine it had had among its assets since 2017. Suffice it to say that the 2017 study “Attention Is All You Need” has been cited more than 80,000 times in articles written by other researchers. But what happened to the eight researchers who studied and developed the transformer concept, giving birth to a historic document that will remain a milestone in modern information technology? None of them works for Google anymore.
