A mechanical heart with cables coming out of it

Large Language Models will become the heart of all software

Large Language Models (LLMs) like ChatGPT are starting to replace the logic sections of some software. The LLM is hidden from the user and only interacts with the application. I think this is the start of a trend that will affect all software in the future.

15th April 2023

Every fortnight I cover three pieces of AI news and information on the same topic. I'll show how trends in these areas are forming, help you understand how these technologies work, and draw some conclusions from the three related pieces.

  1. News: A new paper from Stanford and Google shows how Large Language Models (LLMs) like ChatGPT can be leveraged to create game characters that can plan and influence each other through highly believable conversations.
  2. Learn: AutoGPT and BabyAGI are goal-completing agents that try to complete tasks by themselves. How do autonomous agents work, and how can you build something yourself?
  3. News: Stability AI has released an open-source language model called StableLM. The LLM model is available for commercial and research purposes and can generate text and code. It's clear that most future software will leverage models like this for user communication, planning and customisation.

Part 1: Generative Agents Paper

There has been a lot of news about autonomous agents recently. These programs ask the user for a goal, then prompt an LLM (What is a Large Language Model?) to break that goal down into tasks. The program then takes each task and iteratively prompts the LLM to either complete it or break it down into more manageable subtasks, continuing until everything is done, autonomously and without human intervention. These task-completing programs are called agents.
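To make the first step of that loop concrete, here is a minimal sketch of goal decomposition using the openai Python library (the 2023-era ChatCompletion API). The prompt and the parsing are illustrative, not taken from any particular project:

```python
import openai  # pip install openai

def decompose(goal: str) -> list[str]:
    """Ask the LLM to break a goal into a numbered list of tasks."""
    reply = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"Break this goal into a short numbered list "
                       f"of concrete tasks:\n{goal}",
        }],
    )
    text = reply["choices"][0]["message"]["content"]
    # Keep only lines that look like numbered tasks, e.g. "1. Book flights"
    return [line.split(".", 1)[1].strip()
            for line in text.splitlines()
            if line.strip()[:1].isdigit() and "." in line]

tasks = decompose("Plan a weekend trip to Lisbon")
# An agent would now loop over `tasks`, executing or re-decomposing each one.
```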

The paper, Generative Agents: Interactive Simulacra of Human Behavior, is an amazing example of how complex human-like behaviour can be simulated by using an LLM to plan tasks, form memories and conduct conversations. Normally, code like this would be exceptionally complex to write and manage, but Park et al. show that the work can be offloaded to an LLM, with better and more believable results.

A screenshot of a 2D computer game
A screenshot from the demonstration game created for the paper

In the above screenshot you can see a number of game characters who move around the map and complete daily tasks in a simulation, similar to a game like 'The Sims'. Where it differs is that the characters here can remember and react to previous conversations, understand their relationships with other characters and even form new relationships:

John: Good morning, Eddy. Did you sleep well?
Eddy: Good morning, Dad. Yeah, I slept great.
John: That’s good. What are you working on today?
Eddy: I’m working on a new music composition for my class. It’s due this week, so I’m trying to get it finished. But I’m having so much fun with it!
John: That sounds great!

Soon after Eddy heads out, Mei wakes up and joins John. Mei asks about her son, and John recalls the conversation that they just had:

Mei: Did Eddy already leave for school?
John: Yes, he just left. He’s working on a music composition for his class.
Mei: Oh, that’s wonderful! Tell me about it.
John: I think he’s really enjoying it! He said he’s having a lot of fun with it.
Mei: That’s great! I’m so proud of him.

This novel architecture makes it possible for generative agents to remember and reflect when interacting with other agents. In the conversation above, Mei immediately asks after her son because he is important to her, despite having no recent memories of him to prompt the conversation.

How does this work?

The agents do this by storing the events that happen to them as plain-text descriptions in what the paper calls the memory stream. This is a brilliant idea, as it allows memories to be passed directly to the LLM as part of the prompt. But the memory stream alone is not enough: each character continually retrieves and reflects on its memories, prioritising them by recency, importance and relevance. This reduced, prioritised list is then used in conversations with other characters.

A diagram showing a character's memory stream and the weighting given to each event in the stream
Memories are filtered by importance, recency and relevance
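As a toy illustration of that scoring step, the sketch below ranks memories by a sum of the three signals. The decay rate, weighting and embedding function here are my own placeholders, not the paper's exact values:

```python
import math
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    created_at: float   # game-time hours
    importance: float   # 1-10, scored once by the LLM when the memory is formed

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(memories, query, embed, now, k=5, decay=0.99):
    """Return the k memories that best balance recency, importance and relevance."""
    query_vec = embed(query)
    def score(m: Memory) -> float:
        recency = decay ** (now - m.created_at)          # recent events score higher
        relevance = cosine_similarity(embed(m.text), query_vec)
        return recency + m.importance / 10 + relevance   # each term roughly in [0, 1]
    return sorted(memories, key=score, reverse=True)[:k]
```

Only the handful of top-scoring memories make it into the prompt, which is what keeps the context window manageable as the stream grows.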

What is truly impressive is that information passed on in a conversation with one character can be remembered and then passed on to other characters in the future.

If you have a spare ten minutes, try the online demo too. It is sadly just a replay of the day's events, but it is quite fun to click on each of the characters and watch them go about their day and hold conversations.

Thoughts

While this paper is exceptional work that could have interesting implications for AI in games, the bigger picture is the use of an LLM as the computational heart of non-gaming applications. The authors have shown that an LLM can replace code for planning, user conversations, reputation tracking and more. The LLM here isn't writing code for them; it's replacing the need for the logic parts of the code altogether.

Part 2: Autonomy is the new black

In a similar vein, several projects have been circulating on AI Twitter and causing a stir. They are all proofs of concept, published as open source, but like the Generative Agents paper they use an LLM such as ChatGPT or Meta's LLaMA to obviate the need to write planning code.

BabyAGI

This is a pared-down version of a previously more complicated project called Task-driven Autonomous Agent. The actual output of BabyAGI is unimpressive at first glance, but its value lies in the blueprint it provides for how an autonomous agent might work. The code can be downloaded from the BabyAGI GitHub project and run on any computer. The video below is cued up to the demo at 4m 34s, but the entire tutorial is only 6 minutes.
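If you would rather read than watch, the blueprint boils down to something like the toy loop below. This is paraphrased from the project's description rather than copied from its code, and `ask_llm` stands in for whatever chat-completion call you prefer:

```python
from collections import deque

def run_agent(objective: str, ask_llm, max_steps: int = 10):
    """A BabyAGI-style loop: execute a task, then generate follow-up tasks."""
    tasks = deque(["Make a todo list for the objective"])
    results = []
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()
        # 1. Execute the current task in the context of the objective.
        result = ask_llm(f"Objective: {objective}\nTask: {task}\nComplete the task.")
        results.append((task, result))
        # 2. Ask the LLM for follow-up tasks based on the result.
        new_tasks = ask_llm(
            f"Objective: {objective}\nLast result: {result}\n"
            "List any new tasks needed, one per line, or reply DONE."
        )
        if new_tasks.strip() == "DONE":
            break
        tasks.extend(t.strip() for t in new_tasks.splitlines() if t.strip())
        # 3. (The real project also re-prioritises the queue with a third prompt.)
    return results
```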

AutoGPT

Released at the same time, give or take a few days, AutoGPT is another open-source project designed to run independently of the user. You can download the code at AutoGPT on GitHub. Driven by GPT-4, it chains together LLM "thoughts" to autonomously achieve whatever goal the user sets. What is different here is that the bot can search Google and extract information from the websites it navigates to. The video below is cued up to the live demo at 8m 32s; the entire video tutorial is 14 minutes.
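The interesting design choice is the reply format: the model is asked to respond with structured JSON containing its reasoning and a single command to execute next, which the app parses and runs before feeding the result back in. The schema below is a simplification of AutoGPT's actual format, and the two command handlers are stubs:

```python
import json

# Stub command handlers; the real project wires these up to Google search,
# a browser, file I/O and so on.
COMMANDS = {
    "google": lambda args: f"(search results for {args['query']!r})",
    "browse": lambda args: f"(text content of {args['url']})",
}

def step(llm_reply: str) -> str:
    """Parse one structured LLM reply and execute the command it requests."""
    reply = json.loads(llm_reply)
    print("Thought:", reply["thoughts"]["text"])  # the model's reasoning
    command = reply["command"]                    # what it wants to do next
    return COMMANDS[command["name"]](command["args"])
```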

HyperWrite ‘Personal Assistant’ enters early access

Finally, HyperWrite is another interesting project in the field of automation. Billed as “web browsing with a Personal Assistant”, it builds on HyperWrite's previous product, a Chrome extension that uses GPT-4 to fill in web form fields for you.

In the short demo, the personal assistant is prompted to ‘order a large plain pizza from Dominos to One Vanderbilt’. It then proceeds to search for Dominos on Google, navigate to the Domino’s Pizza website and begin placing an order, looking up an address and zip code to complete the transaction.

Thoughts

These agents are examples of what have been described as 'scaffolded' LLMs. The application queries the LLM in natural language, much as a human user would, and the surrounding code acts as 'scaffolding': a go-between that passes goals from the user to the LLM and carries the LLM's answers back out into actions.

Part 3: Stability AI releases open-source LLM called StableLM

The new LLM is available for free on GitHub. Here is the statement from their website:

Today, Stability AI released a new open-source language model, StableLM. The Alpha version of the model is available in 3 billion and 7 billion parameters, with 15 billion to 65 billion parameter models to follow. Developers can freely inspect, use, and adapt our StableLM base models for commercial or research purposes, subject to the terms of the CC BY-SA-4.0 license.
In 2022, Stability AI drove the public release of Stable Diffusion, a revolutionary image model that represents a transparent, open, and scalable alternative to proprietary AI. With the launch of the StableLM suite of models, Stability AI is continuing to make foundational AI technology accessible to all. Our StableLM models can generate text and code and will power a range of downstream applications. They demonstrate how small and efficient models can deliver high performance with appropriate training.

Thoughts

Obviously, another language model being released for free is a big deal, but this is just the tip of the iceberg. Apple moved very quickly to enable the previous release of Stable Diffusion to run efficiently on its M-series chips, and I think it is highly likely we will see this model pared down to run on an iPhone. This is important because right now it takes a few seconds to get answers back from ChatGPT. When that time is near-instantaneous, it will enable voice-in-your-ear applications that can advise, translate and inform in real time.
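If you want to try the release yourself, the Alpha checkpoints were published on the Hugging Face Hub and load with the standard transformers API. The checkpoint id below is the 7B base model as I understand it was published; verify the current names on the Hub, and note that a 7B model needs a recent GPU or a lot of patience:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-base-alpha-7b"  # assumed Hub id; verify
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The future of software is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```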

Further down the release page, this quote stuck out to me:

Language models will form the backbone of our digital economy, and we want everyone to have a voice in their design.

The projects I've linked here all use LLMs as the foundational, beating heart of the application. Crucially, they are not used by humans to answer questions or write chunks of code, as ChatGPT is; they are used by programs to replace entire logic modules. I think it's clear that we are seeing the start of a new class of software.

✉️
This post is from a free newsletter that I publish every now and again. Hit the subscribe button to sign up and please forward it to others if you think they may like it.
