AI Predictions for 2024

Dr. Santanu Bhattacharya
Published in DataDrivenInvestor
8 min read · Feb 10, 2024


Photo by Kenny Eliason on Unsplash

Decades from now, the two years of the post-COVID world, 2020 and 2021, will be seen as watershed moments in Artificial Intelligence (AI). In the unprecedented years that COVID-19 ravaged the world, enterprises accelerated their journey to data and digital: finding ways to creatively, broadly, and boldly apply AI to emerge stronger in the short term and survive in the long term.

A clear pattern has emerged globally over the past five years. In 2018–19, AI experimentation within enterprises matured. In 2020, adoption began in earnest as COVID-19 suddenly gave business leaders the opportunity and impetus to push automation and AI. In 2021, the fallout from the second and third waves of COVID-19 became clear, starting with the rapid decline of many traditional, not-so-digital businesses. In 2022, finally out of the shadow of COVID-19, human-focused AI applications proliferated. And 2023 was all about Generative AI, forcing the C-suites of most Fortune 500 companies to take notice of the long-term impact of AI. According to McKinsey & Company, AI applications have the potential to contribute between $2.6 trillion and $4.4 trillion annually to the global economy.

Following are the trends to expect in 2024:

Customization of GenAI deployment will be a top corporate strategy

It feels like an eon ago that ChatGPT took the world by storm, promising to revolutionize everything from education and business to healthcare and commerce. A natural question arises:

How does an enterprise such as a bank, healthcare provider, telecommunications company, or retailer decide whether to implement such systems in its business?

Specifically, should it use a closed Large Language Model (LLM) like ChatGPT or build the infrastructure internally? How costly are closed models like ChatGPT compared with open-source LLMs? What are the tradeoffs in terms of performance, ongoing costs, risks, and so on?

Before implementing language models like ChatGPT, industries must carefully consider several risks related to data privacy, AI model bias, observability, and customer trust, particularly when handling private customer data. These concerns are critical in maintaining compliance and protecting sensitive information.

In a paper in mid-2023, I argued that there are cases where it makes sense to use a closed model like ChatGPT, such as summarizing a company’s product offerings for chatbot queries. There is a usage threshold below which ChatGPT is economical: at roughly 1,000 requests per day, ChatGPT is cheaper than open-source LLMs deployed on AWS. However, as request volume escalates to millions per day, the economics shift, and deploying open-source models on AWS becomes the more affordable option, at least under the 2023 pricing structures for both ChatGPT and AWS. Note: that paper was written before the advent of GPT-4, but the generalized framework it lays out applies to GPT-4 and beyond.
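The economics above reduce to a simple break-even calculation: a closed API’s cost grows linearly with request volume, while self-hosting is roughly a fixed daily infrastructure cost. The sketch below illustrates the shape of the tradeoff; all prices are assumptions for illustration, not actual ChatGPT or AWS rates.

```python
# Sketch of a break-even analysis between a closed, API-priced model and a
# self-hosted open-source LLM. All figures below are illustrative
# assumptions, not actual 2023 ChatGPT or AWS prices.

API_COST_PER_REQUEST = 0.004   # assumed closed-model cost per request, in $
HOSTING_COST_PER_DAY = 150.0   # assumed daily cost of a GPU instance, in $

def daily_cost_api(requests_per_day: int) -> float:
    """Closed model: cost scales linearly with usage."""
    return requests_per_day * API_COST_PER_REQUEST

def daily_cost_self_hosted(requests_per_day: int) -> float:
    """Self-hosted model: roughly a fixed infrastructure cost."""
    return HOSTING_COST_PER_DAY

def break_even_requests() -> float:
    """Request volume at which the two options cost the same."""
    return HOSTING_COST_PER_DAY / API_COST_PER_REQUEST

for volume in (1_000, 100_000, 1_000_000):
    api, hosted = daily_cost_api(volume), daily_cost_self_hosted(volume)
    cheaper = "closed API" if api < hosted else "self-hosted"
    print(f"{volume:>9,} req/day: API ${api:>8,.0f} vs hosted ${hosted:,.0f} -> {cheaper}")
```

Under these assumed numbers, the closed API wins at low volume and self-hosting wins at high volume; in practice, the crossover point moves with token lengths, instance choice, and model quality requirements.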

Enterprises will continue to grapple with these use cases and their tradeoffs in performance, ongoing costs, risks, and so on in 2024 to arrive at an optimal scenario for deploying AI applications. However, this will remain a moving target — largely because the tussle between the performance of open-source and closed models will continue, at least through 2024.

Closed models will outperform Open ones, at least through 2024

In the debate over open-source versus closed-source models, most leading AI model developers — OpenAI, Google DeepMind, Anthropic, Cohere, et al. — keep their advanced models proprietary. However, a few companies, including Meta (disclosure: my former employer) and the startup Mistral, have chosen to make their state-of-the-art models publicly available.

As of this writing, the highest-performing foundation models, such as OpenAI’s GPT-4, are closed-source. However, the performance gap between closed and open models is shrinking, and many in the open-source community say open models are on track to overtake closed models in performance, perhaps by next year.

However, this will take a lot of work, for the following reason: the investment required to develop new models is enormous and will only continue to balloon for the near future.

The current estimate is that OpenAI will spend around $2–2.5 billion to develop GPT-5.

While Meta continues to support the development of Llama 3, I am not sure that, as a public company, it can continue to spend $2–2.5 billion, or even more, without a clear revenue path.

Startups like Mistral face an even more daunting challenge. For instance, it was considerably riskier for OpenAI to build GPT-4 using a mixture-of-experts architecture, when this approach had not previously been shown to work at that scale, than it was for Mistral to follow several months later with its own mixture-of-experts model. Doing so, however, makes Mistral a fast follower rather than an inventor.

Photo by BoliviaInteligente on Unsplash

Cambrian Explosion of Generative Models — SLM, LAM, VLA and more

LLMs took the world by storm with the dramatic release of ChatGPT, catching the public’s imagination and making it a household name in 2023. Most of today’s leading generative AI models incorporate text, images, 3-D, audio, video, music, physical action, and more in their training data, and can produce such outputs; they are far more than just language models. Yet many of today’s problems do not need that breadth. If someone is building a special-purpose language model to analyze or create marketing documents, for example, it is unlikely to need training to write poetry in Shakespeare’s style or to opine on the potential for finding alien life in another galaxy. Such models can therefore be trained on smaller datasets, making them Small Language Models (SLMs).

Or consider robotics, where an efficient model needs to be integrated with action. Large Action Models (LAMs) take this a step forward by enhancing LLMs into “agents”: software units capable of running tasks by themselves. Instead of answering user queries as LLMs do, LAMs work toward a goal, combining the language fluency of an LLM with the capacity to complete tasks and make decisions autonomously, which is a substantial change.

Getting even more specific, consider generative models that combine visual and language input with general internet-scale knowledge to take actions via a robotic arm. A richer term than “language model” should and will exist for such highly specific models; in this particular case, Vision-Language-Action (VLA) model is one phrase that researchers have used.

Alternatives to the Transformer Architecture will see adoption — enter Hyena and Liquid Neural Networks

Every major generative AI model in existence — GPT-4, Midjourney, GitHub Copilot, and so on — is built using transformers, introduced in the seminal 2017 Google paper “Attention Is All You Need.” The challenge with a transformer is that it scales quadratically with sequence length, requiring a huge amount of computing power and electricity. For example, to build the next generation of Llama and assorted AI products, Mark Zuckerberg announced that Meta will buy 350,000 NVIDIA H100 GPUs, at an estimated street price of $9–10 billion.

Add to this the looming environmental crisis that the continuation of current trends in AI capacity and adoption is set to cause. NVIDIA will likely ship 1.5 million AI server units annually by 2027, which, running at full capacity, would consume at least 85.4 terawatt-hours of electricity annually — more than what many small countries use in a year, according to an assessment by Alex de Vries, a data scientist at the central bank of the Netherlands and a Ph.D. candidate at Vrije Universiteit Amsterdam. Clearly, this is not a sustainable path, given the looming climate disaster the planet is already facing.
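The 85.4 TWh figure is consistent with simple back-of-the-envelope arithmetic, sketched below. The per-server power draw of about 6.5 kW is an assumption of this sketch (roughly a DGX-class AI server at full capacity), not a figure stated in the article.

```python
# Back-of-the-envelope arithmetic behind the ~85.4 TWh annual estimate.
# The per-server draw is an assumption of this sketch.

SERVERS_PER_YEAR = 1_500_000   # projected annual AI server shipments by 2027
POWER_PER_SERVER_KW = 6.5      # assumed continuous draw per server, in kW
HOURS_PER_YEAR = 8_760

annual_kwh = SERVERS_PER_YEAR * POWER_PER_SERVER_KW * HOURS_PER_YEAR
annual_twh = annual_kwh / 1e9  # 1 TWh = 1e9 kWh

print(f"Estimated annual consumption: {annual_twh:.1f} TWh")  # ~85.4 TWh
```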

Photo by Jeffery Ho on Unsplash

Within the AI community, a few researchers are developing next-generation AI architectures that are different from, and could potentially supersede, transformers. At Chris Ré’s lab at Stanford, the team is building new model architectures that scale sub-quadratically with sequence length rather than quadratically, as transformers do. Sub-quadratic scaling enables AI models that are (1) computationally less intensive and (2) better able to process long sequences than transformers. Well-known sub-quadratic model architectures from Ré’s lab in recent years include Monarch Mixer, Hyena, and, most recently, Mamba, which some researchers are christening the “end of transformers.”

Another alternative to the transformer architecture is liquid neural networks, developed at MIT. In 2024, I expect some of this research will start seeing adoption, transitioning from a lab novelty to a credible technology used in production.

2024 will be the year of Generative AI for Music and Video

2023 was the year of AI image generation; 2024 will be the year of music and video. Runway Gen-2 kicked off AI video generation ten months back, with Pika 1.0 and Stable Video Diffusion joining recently. These tools can generate short, low-resolution video clips, which is great as a novelty but does not yet qualify for production-quality videos, or even YouTube clips.

By the end of 2024, AI videos will be lengthy (10 minutes plus), high-definition, and faithful to prompts in detail. By 2025, expect Hollywood-animation-level quality, and AI features permeating professional studio productions the same way computer graphics now dominates special effects and animation.

Wildcard entry: The first AI-enabled $10M “bank heist” will happen in 2024

This prediction does not require much explanation. AI-generated scam calls using cloned voices of family members in fake emergencies started in 2023. In 2024, the first “bank heist” will happen: enabled by emails, GenAI-generated synthetic voice and video calls, and possibly a video conference, fraudsters will convince a bank employee to transfer over $10 million.

Photo by Eduardo Soares on Unsplash

Embracing the future

These trends and predictions are not just technological forecasts: they serve as a roadmap for innovation, societal growth, and advancement worldwide. While a technologist like me is often engrossed with the latest technological “shiny object,” it is also important to think about business impacts as well as global ones. Having lived in India more and more of late, I often think about how AI will shape emerging markets, where an entire nation, even one with a ~$4 trillion economy, uses only a few thousand H100 GPUs, whereas a single company like Meta will have 500,000–600,000 of them by the end of 2025, 50 to 100 times India’s computing firepower. Will this early “land grab” worsen global inequality, especially in Africa, parts of Asia, and parts of Latin America?

So by staying informed, adaptable, and proactive, perhaps the human race can help harness AI to build a more efficient, innovative, and inclusive future.



Chief Technologist at NatWest, Prof/Scholar at IISc & MIT, worked for NASA, Facebook & Airtel, built start-ups, and future settler for Mars & Tatooine