- LLMs take a text prompt, from a user or another program, and generate text in response.
- They are probabilistic, which means their output can vary from one run to the next and is not always accurate.
- When an LLM produces an inaccurate response, it is said to be 'hallucinating'.
One minute overview
A large language model (LLM) is a type of artificial intelligence (AI) that can understand and generate human language. It works by using a neural network, which is a type of computer program designed to learn from data. The network uses statistical methods to estimate the probability of a given sequence of words.
In practice, LLMs predict how likely each possible next word is, given the words so far. This means that LLMs are probabilistic in nature: they don’t always generate the same output for a given input. Instead, they can generate a range of possible outputs, each with a different probability of being produced.
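To make the idea of next-word prediction concrete, here is a minimal Python sketch. The prompt, the candidate words, and their probabilities are all invented for illustration; a real LLM computes these probabilities with a neural network over a vocabulary of many thousands of tokens. The function names are hypothetical as well.

```python
import random

# Hypothetical next-word probabilities after the prompt "The cat sat on the".
# These numbers are made up; a real LLM would compute them with a neural network.
NEXT_WORD_PROBS = {"mat": 0.6, "sofa": 0.25, "roof": 0.1, "moon": 0.05}


def sample_next_word(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


def most_likely_word(probs):
    """Always pick the single most probable word (no randomness)."""
    return max(probs, key=probs.get)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_next_word(NEXT_WORD_PROBS, rng) for _ in range(10)]

print(samples)                            # a mix of words, mostly the likelier ones
print(most_likely_word(NEXT_WORD_PROBS))  # mat
```

Sampling is what makes the output vary between runs, while always taking the most likely word gives the same answer every time. Neither approach guarantees the chosen word is factually correct, only that it is statistically plausible.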
The probabilistic nature of LLMs is both a strength and a weakness. On the one hand, it allows LLMs to generate text that is similar to what a human might write, even when the input is ambiguous or incomplete. On the other hand, it means that LLMs can generate text that is incorrect or misleading, especially when the input is complex or unusual.
The neural network is trained on a massive amount of text data, such as books, articles, and websites. It learns to understand the structure of language by analyzing statistical patterns in the data. Once the neural network has been trained, it can generate text that is similar to what a human might write.
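As a rough illustration of learning statistical patterns from text, the sketch below counts which word follows which in a tiny made-up corpus and turns those counts into probabilities. This is a simple counting model (often called a bigram model), not a neural network, and the corpus and function names are assumptions chosen for the example. A real LLM learns far richer patterns from vastly more data.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follower counts for `word` into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the word never appeared (or was never followed by anything)
    return {w: c / total for w, c in followers.items()}


# A tiny invented corpus standing in for "books, articles, and websites".
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."

model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "the")

# Print the followers of "the", most probable first.
for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{word}: {p:.2f}")
```

In this corpus "the" is followed by "cat" twice and by four other words once each, so "cat" comes out as the most probable next word. A word the model has never seen gets no prediction at all, which hints at why the amount and variety of training data matters.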
It is actually possible to create a language model yourself. The process will take maybe eight hours of your time, and both the tools and the lessons online are free.
Creating your own Language Model
Andrej Karpathy, who is now at OpenAI, has a series of free videos on YouTube that show you how to build your own neural network language model. The video below is part one of five. It runs for about two hours, and you can follow along quite easily. It will give you an understanding of how these models work. A basic understanding of coding and algebra will help.
Learning how to create Neural Networks from scratch
If you want to learn how all of this works and be able to code your own neural networks, that is absolutely possible. What you need:
- Basic programming - but you can copy and paste the code from the sessions.
- Simple algebra and calculus - you can copy the working from the sessions if you don't fully understand it.
- Dedicated time each week to learn. A couple of evenings a week will get you a long way in just a month.
- Access to the internet - the tools and sessions are all free.