2021 in Review: What Just Happened in the World of Artificial Intelligence?

Infectious research ideas, game-changing applications and four awkward moments…

Eleni Nisioti
9 min read · Jan 4, 2022


Sipping a warm cup of tea and zoning out to candy-coated thoughts? Hiding your 2021 resolution list under a glass of champagne? Trying to make a summary of what happened in the world of AI out of a long and vague chain of events?

You’re not alone!

To write this post we shook the internet upside down for industry news and research breakthroughs and settled on the following five themes to wrap up 2021 in a neat bow:

🤖 Transformers taking the AI world by storm

💡 Reinforcement learning rethinking its practices

👩‍💻 A new programming framework is born

🏭 Industry steals the AI spotlight

😅 Four awkward moments for AI

Packing a full year of exciting AI events into a single post is not easy. At the very least, we hope that by reading this list you can cross out “Learning about the state of AI in 2021” from your resolution list 😉.

🤖 Transformers taking the AI world by storm

The family of artificial neural networks (ANNs) saw a new member being born in 2017, the Transformer. Initially introduced for Natural Language Processing (NLP) applications like translation, this type of network was used in both Google’s BERT and OpenAI’s GPT-2 and GPT-3.

What makes the Transformer architecture special? Before we answer this, we may want to attack a more fundamental question: what makes any architecture special?

You’ve probably heard of three different architectures widely used in machine learning: feedforward, convolutional and recurrent ANNs. Feedforward networks live in the here-and-now: they only draw conclusions about the example and label you just fed them. Easy-going and widely used, there are cases where these networks lose to their more focused cousins.

Convolutional networks are a particular type of feedforward neural network with additional assumptions about spatial correlations, which makes them appropriate for vision. Recurrent ANNs ditch the feedforward requirement and acknowledge temporal dependencies, making them the architecture of choice for speech and text.

So how do Transformers work?

The strength of Transformers lies in their attention mechanism, inspired by the human skill of focusing on important information and ignoring what is irrelevant to a given task. Transformers can choose to focus on what matters in a given task, be it temporal or spatial correlations.
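To make the idea concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside a Transformer (a toy illustration, not the full multi-head machinery):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max before exponentiating for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each query scores all keys,
    and the resulting weights mix the corresponding values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # how relevant each key is to each query
    weights = softmax(scores, axis=-1)   # rows sum to 1: where to "look"
    return weights @ V                   # weighted mix of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 query positions, dimension 8
K = rng.normal(size=(6, 8))  # 6 key positions
V = rng.normal(size=(6, 8))  # one value vector per key
out = attention(Q, K, V)
print(out.shape)  # one mixed value vector per query
```

Because the attention weights are computed from the data itself, the network decides at run time which positions matter, rather than having that decision baked into the architecture as in CNNs or RNNs.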

Transformers have revolutionised both the research and industry world. Protein folding, document summarisation and playing chess are just a few of the applications that we have seen in previous years.

In 2021, the following were added to the ever-growing list of Transformer applications.

Vision Transformer

The Vision Transformer is Google’s extension of Transformers to visual data. This is hugely exciting for computer vision, where CNNs have held first place until now.

The Vision Transformer made as few changes as possible to the original architecture introduced for text data. Source

Self-supervision

Self-supervision is a deep learning technique that could compete with Transformers for the most influential discovery of the past years. What happens when you combine the two? According to research by Meta, self-supervised Vision Transformers are great at image and video classification without requiring immense amounts of labelled data.

The self-supervised Vision Transformer learned how to identify objects without labelled information. Source

Is the Transformer the meteorite that will cause the extinction of all other architectures? Does the success of Transformers lie in their attention mechanism or in the attention they are getting from the AI community? Only future research will tell.

💡 Reinforcement learning: rethinking past practices

MuZero, Deepmind’s program that mastered multiple board games without anyone explaining the rules to it, drew the curtain on last year’s reinforcement learning scene. Agents living in simulations have improved impressively over the years. But at what cost?

Statistical significance

The answer is a concept that the deep learning community has been shoving under the carpet for a while now: statistical significance. The immense computational complexity of recent algorithms has forced their creators to train them only a handful of times, in many cases just once. ML models are, however, statistical in nature, which means that their average performance may be very different from the performance observed in a specific training run. But what does this mean in practice?

With increased complexity comes decreased statistical significance Source

Think of the performance of an ML model as a die. Having MuZero train once and measuring its performance would be like rolling the die once and noting the side it lands on. You probably suspect that the die lands on each side with equal probability, giving an expected performance of 3.5. But can you be sure of that if you can only roll it once? Not at all! Statistics say that to be 98% sure that the die is fair (to within 2%), you need to roll it 766 times.
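You can watch this effect in a small simulation, with a fair die standing in for a noisy training run (a toy illustration; the number of repetitions and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)
sigma = np.std([1, 2, 3, 4, 5, 6])  # spread of a single roll, about 1.71

spreads = {}
for n_rolls in (1, 10, 766):
    # Repeat the whole experiment 10,000 times to see how far the
    # estimated average tends to land from the true mean of 3.5.
    estimates = rng.integers(1, 7, size=(10_000, n_rolls)).mean(axis=1)
    spreads[n_rolls] = estimates.std()
    print(f"{n_rolls:4d} rolls -> estimate varies by about ±{spreads[n_rolls]:.2f}")
```

A single roll tells you almost nothing about the average; the uncertainty only shrinks with the square root of the number of repetitions, which is exactly why a model trained once gives such a weak estimate of its true performance.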

Whilst MuZero was trained only once, the task it achieved was very complex, so it is highly improbable that it solved it by chance. But perhaps repeating the experiment would give a very different result, below human-level performance or that of competing algorithms. Ultimately this lack of reproducibility is very bad news for a community that wants to build upon existing solutions.

Generalisation

Related to this realisation is the emergence of a new buzzword in RL: generalisation. We say that an agent generalises when it can solve tasks that differ from those it solved during training.

Realising how narrow the RL agents of the past were can be overwhelming. Take, for example, this agent that impressed us with its ability to find clever strategies in Pong in the classical setting. When the paddle was moved slightly up and to the right, the same agent failed catastrophically.

Deep Q-Networks managed to solve the game on the left but failed when the paddle was moved slightly up and to the right. This means that our human intuition of what “similar task” means is not always correct in AI

Following up on this quest for generalisation, Deepmind recently introduced XLand, arguably the most impressive simulation environment the community has seen so far. To get a picture of how big it is, try to think of any great testbed you have recently seen in RL: it’s probably already in XLand. The dynamics of the engine allow the instantiation of a variety of agents, objects and terrains, where one can define any manipulation task described in natural language.

XLand is a recent simulation environment with a vast number of tasks

👩‍💻 A new programming framework is born

First presented at the NeurIPS 2020 conference, JAX is a numerical computing framework developed at Google that has already become the first choice in Deepmind’s research projects.

Deepmind introducing JAX at NeurIPS 2020

In case you have not guessed it already, JAX aims at unprecedented performance, enabling the seamless acceleration of deep learning algorithms. Offering a NumPy-like API for high-performance numerical computing, it supports large-scale data parallelism and just-in-time compilation, allowing code to scale across GPUs and TPUs at will. Deepmind has already provided specialisations for reinforcement learning (rlax) and graph neural networks (jraph).
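For a small taste of the JAX style, here is a sketch of its two signature transformations, `jax.grad` and `jax.jit`, applied to a tiny linear-regression loss (assuming `jax` is installed; the data and learning rate are arbitrary):

```python
import jax
import jax.numpy as jnp

def loss(w, x, y):
    # Ordinary numpy-style code: mean squared error of a linear model
    return jnp.mean((x @ w - y) ** 2)

grad_loss = jax.grad(loss)      # automatic differentiation w.r.t. w
fast_grad = jax.jit(grad_loss)  # XLA-compile it for CPU/GPU/TPU

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 3))
y = x @ jnp.array([1.0, -2.0, 0.5])  # targets from known weights
w = jnp.zeros(3)

g = fast_grad(w, x, y)  # compiled on first call, fast thereafter
print(g.shape)          # gradient has the same shape as w
```

The appeal is that `grad`, `jit` (and `vmap` for vectorisation) compose freely: you write plain numerical Python once and layer the transformations on top, rather than rewriting the model for each backend.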

What does this mean for deep learning practitioners?

While you certainly don’t need to throw away your Pytorch and Tensorflow skills (yet), you may want to get a taste of JAX and follow its adoption by the community.

🏭 Industry steals the AI spotlight

This year may have felt a bit underwhelming for someone following the headlines of the big AI research centres. No world champion beaten at their own game, no new deep learning technique as fabulous as Transformers and Generative Adversarial Networks. Pay close attention though, or you may miss the trick.

Almost 3-fold increase in enterprise value creation in the last 12 months, with US taking the lion’s share and China accelerating Source

The trick is happening in industry. As the State of AI report for 2021 shows through data analysis from various sources, the maturity reached by certain AI-enabled technologies is leading to adoption levels that amount to a digital transformation.

AI in Pharma

The recent breakthrough of AlphaFold in protein structure prediction made the world dream about the possibilities of adopting AI in the pharmaceutical industry. This year, a British company designed the first AI drugs to be tested in clinical trials with humans, and a new Alphabet spin-off, Isomorphic Labs, announced that it will build upon the work of AlphaFold to predict how drugs interact with the body.

Why are all eyes on pharma? One only needs to place this industry’s hard computational problems and high operational costs alongside deep learning’s recent ability to automate and leverage unlabelled data to see that deep learning has found an effective, lucrative and vital application.

AI in Utilities and Farming

An industry with equally major societal implications is the utilities sector, where public and private operators can see big cuts in cost by improving the accuracy of their prediction models. In 2021, according to the UK National Grid ESO, the use of Transformers halved the error in demand forecasting. Exemplifying the potential of combining AI with Internet of Things technology, dairy cow farms reportedly monitored and improved the health of their livestock.

At the same time, areas such as computer vision and language processing continue to contribute to the rate at which start-ups and applications are sprouting.

😅 Four awkward moments for AI in 2021

Sitting in its academic chair and frequenting research conferences, AI was, until recently, protected from the surprises and awkward moments that the real world often brings to your door.

However, with the impressive performance of AI-enabled vision applications and the ethical implications of AI technology knocking on the doors of institutions and individuals, AI is slowly facing reality.

Here are some moments of the past year that made us ponder:

AI claims a patent

The patent applicant’s name is DABUS and it is an AI system that, according to its developer Stephen Thaler, invented “a food container and devices for attracting enhanced attention”. DABUS’ application was initially accepted by South Africa and Australia, but rejected by the United States and European Union. Even in these cases, however, the court’s response seemed uncertain: “It’s ambiguous. The wording indicates the legislators were not thinking about this possibility.”

Our take: If AI can play games, compose music and write software, what is stopping it from being an inventor?

AI startup reanimates your past relatives

Deep Nostalgia is a tool introduced by MyHeritage that “lets you experience your family history in a whole new way!”. Building upon years of progress in computer vision, where Generative Adversarial Networks have revolutionised the quality of artificially produced images, Deep Nostalgia managed to create videos out of a single picture that can be shockingly realistic.

Our take: This product has been seen as creepy and manipulative, but the fact that it has had 50 million uses suggests that, for many, the fun outweighs the concerns. Perhaps this is a message for future applications?

Google fires prominent AI ethics researcher

Dr. Timnit Gebru is an AI researcher most famous for her work on the ethical implications of AI technology. The story of her dismissal from Google sparked long and convoluted discussions on social media and major technology magazines. According to her account, Google refused to publicise one of her works and, when she reacted, fired her. The specific paper contained analysis indicating that state-of-the-art techniques for reducing bias in language models (techniques also used by Google) were not effective.

Our take: there are many bits and pieces to this story, but perhaps the most pressing question coming out of it is: Can research on the ethical implications of AI techniques come from companies that employ those techniques?

Microsoft patents chatbots modelling people

“… a past or present entity … such as a friend, a relative, an acquaintance, a celebrity, a fictional character, a historical figure”. These are the kind of people you will probably be able to chat with once Microsoft’s chatbot goes into production.

Our take: This application is not surprising from a technological perspective: by scraping data from social media or records people are willing to provide, one can build a model of someone’s personality that, combined with the latest achievements in NLP, can be a better conversational version of yourself than you are on a Monday morning. Because it crosses ethical limits, however, it will probably remain just a patent for years to come.

How did 2021 feel for you? Was there a major event or trend that our post missed? Let us know in the comments! 👇

ADSP is a London based consultancy that implements end-to-end data science solutions for businesses, delivering measurable value. If you’re looking to do more with your data, please get in touch via our website. Follow us on LinkedIn for more AI and data science stories!


Eleni Nisioti

PhD student in AI. Deep learning is not just for machines. I like my coffee like I like my code. Without bugs.