LLM
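Also known as: Large Language Model. Often loosely called a foundation model, a broader term that also covers models trained on images, audio, or other data.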
A large language model (LLM) is a neural network, almost always a transformer, trained on massive text corpora to predict the next token in a sequence. That simple objective, applied across trillions of tokens, is what gives rise to its ability to generate text, answer questions, translate, summarize, reason over language, and write code, powering tools like ChatGPT, Claude, and Gemini.
What It Is
An LLM learns by repeatedly guessing the next token in its training data and adjusting its internal weights to shrink the gap between its guess and the token that actually follows. After training on trillions of tokens, those weights encode statistical relationships between words, concepts, and code, so the model can continue almost any prompt in a way that is fluent and usually relevant.
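The guess-and-adjust loop described above can be sketched with a deliberately tiny model. This is an illustration under strong assumptions, not how a production LLM is built: the "model" is a bigram table (one row of scores per previous word) instead of a transformer, the corpus is a single sentence, and the names `train_bigram`, `predict_next`, and `continue_prompt` are hypothetical. What it does share with real training is the objective: predict the next token, measure the error with cross-entropy, and nudge the weights to reduce it.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bigram(tokens, vocab_size, epochs=200, lr=0.5):
    """Toy next-token predictor: weights[prev][nxt] scores `nxt` following `prev`."""
    weights = [[0.0] * vocab_size for _ in range(vocab_size)]
    pairs = list(zip(tokens, tokens[1:]))  # (current token, token that follows)
    losses = []
    for _ in range(epochs):
        total_loss = 0.0
        for prev, nxt in pairs:
            probs = softmax(weights[prev])      # the model's guess
            total_loss += -math.log(probs[nxt])  # cross-entropy for this guess
            # Gradient of cross-entropy w.r.t. the logits is probs - one_hot(target).
            for k in range(vocab_size):
                grad = probs[k] - (1.0 if k == nxt else 0.0)
                weights[prev][k] -= lr * grad
        losses.append(total_loss / len(pairs))
    return weights, losses


# Assumed toy corpus; a real model sees trillions of subword tokens.
text = "the cat sat on the mat the cat sat".split()
vocab = sorted(set(text))
ids = {word: i for i, word in enumerate(vocab)}
tokens = [ids[word] for word in text]

weights, losses = train_bigram(tokens, len(vocab))


def predict_next(word):
    """Return the most probable next word under the trained table."""
    probs = softmax(weights[ids[word]])
    best = max(range(len(vocab)), key=probs.__getitem__)
    return vocab[best]


def continue_prompt(word, steps):
    """Greedily extend a one-word prompt, one predicted token at a time."""
    out = [word]
    for _ in range(steps):
        out.append(predict_next(out[-1]))
    return out


print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print(" ".join(continue_prompt("cat", 3)))
```

The average loss falls as training proceeds, and the trained table continues a prompt by repeatedly picking the most likely next word. A real LLM replaces the lookup table with a transformer that conditions on the whole context, and usually samples from the predicted distribution rather than always taking the top choice.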
What makes modern LLMs general-purpose is scale combined with the transformer architecture. Attention lets the model weigh each token in its context window against the other tokens (in the decoder-only models behind most chat systems, against the tokens that precede it), so it can track meaning across long passages. The same trained model can then be adapted to chat, coding, retrieval, or agent workflows through prompting or fine-tuning rather than retraining from scratch, which is why a single LLM can underpin products as different as a chatbot, a code assistant, and a search engine.
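The weighing step that attention performs can be sketched as scaled dot-product attention. This is a minimal single-head version under simplifying assumptions: the token vectors are hand-picked, the same vectors serve as queries, keys, and values (a real transformer derives each from learned projection matrices), and the function name `attention` and its `causal` flag are illustrative rather than any library's API.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of equal-length vectors.

    With causal=True, token i may only attend to tokens 0..i,
    as in decoder-only language models.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # mask out future tokens
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d))
        weights = softmax(scores)  # how much token i attends to each token j
        all_weights.append(weights)
        # Output for token i is the attention-weighted average of the values.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs, all_weights


# Three assumed 2-dimensional token vectors.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

full_out, full_weights = attention(x, x, x)
causal_out, causal_weights = attention(x, x, x, causal=True)

for row in causal_weights:
    print([round(w, 3) for w in row])
```

Each row of weights sums to 1 and says how strongly one token draws on the others. In the causal case the first token can only attend to itself, and later tokens spread their weight over everything that came before.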
One Sentence to Remember
An LLM is a next-token predictor scaled up until prediction turns into something that looks like understanding, with no built-in guarantee that any individual output is correct.