AI glossary - terms explained clearly

What this is about

Anyone who wants to work with AI can't get around a handful of English terms. They look unwieldy - behind each one sits a simple idea.

This glossary explains the six terms you'll hear most often, in two paragraphs each. An appendix follows with further terms that matter less often but still crop up occasionally. As of: 2026.

One

LLM - Large Language Model.

A large language model is the technology behind ChatGPT, Claude, Gemini and similar services. At its core it's a program that predicts which word would plausibly come next - based on the vast amounts of text it was trained on. When you type "The capital of Germany is..." it answers: Berlin. Not because it knows Berlin, but because Berlin followed this opening millions of times in its training.

That isn't real intelligence in the philosophical sense - but it's an astonishingly useful imitation. From this simple basic idea comes everything you experience as AI today: writing texts, answering questions, generating code, translating language.
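The prediction idea can be shown with a toy sketch. It is not how a real LLM is built (real models use neural networks and far more text); it only counts which word followed which three words in a tiny, made-up training text. All names and the training text are assumptions for illustration.

```python
from collections import Counter, defaultdict
from typing import Optional

# Tiny made-up "training text"; a real model learns from billions of pages.
TRAINING_TEXT = (
    "the capital of germany is berlin . "
    "the capital of germany is berlin . "
    "the capital of france is paris . "
)

def train(text: str, context_size: int = 3) -> dict:
    """Count which word follows each run of `context_size` words."""
    words = text.split()
    counts = defaultdict(Counter)
    for i in range(len(words) - context_size):
        context = tuple(words[i:i + context_size])
        counts[context][words[i + context_size]] += 1
    return counts

def predict_next(counts: dict, prompt: str, context_size: int = 3) -> Optional[str]:
    """Return the word seen most often after the last words of the prompt."""
    context = tuple(prompt.lower().split()[-context_size:])
    followers = counts.get(context)
    if not followers:
        return None  # this opening never appeared in training
    return followers.most_common(1)[0][0]

model = train(TRAINING_TEXT)
print(predict_next(model, "The capital of Germany is"))  # prints "berlin"
```

The sketch answers "berlin" only because that word followed the same opening in its training text, and it returns nothing for an opening it has never seen.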

Two

RAG - Retrieval Augmented Generation.

A normal LLM only knows what was known at training time. It knows nothing about your company, your customers, your contracts. RAG is the technique that gives an LLM access to your own documents - so that it draws on them when answering.

In concrete terms: you put all your documents into a searchable collection. With every question the system first pulls out the relevant documents and hands them to the LLM to answer. The result: answers based on your real data - not on the model's general knowledge. Almost every sensible AI application in mid-sized companies uses some form of RAG today.
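The two steps (find the relevant documents, then hand them to the LLM) can be sketched as follows. This is a simplified illustration: the documents are invented, the search is plain word overlap instead of embeddings, and the final call to an LLM is left out because it depends on the provider.

```python
import re

# Invented example documents; in practice these come from your own files.
DOCUMENTS = {
    "returns.txt": "Customers may return goods within 30 days of delivery.",
    "shipping.txt": "Standard shipping takes three working days within Germany.",
    "warranty.txt": "All devices carry a two year warranty.",
}

def words(text: str) -> set:
    """Lower-case a text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, documents: dict, top_k: int = 1) -> list:
    """Step 1: rank documents by how many words they share with the question."""
    q = words(question)
    ranked = sorted(
        documents.items(),
        key=lambda item: len(q & words(item[1])),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question: str, hits: list) -> str:
    """Step 2: put the retrieved documents in front of the question."""
    context = "\n".join(f"[{name}] {text}" for name, text in hits)
    return f"Answer using only these documents:\n{context}\n\nQuestion: {question}"

question = "How many days do customers have to return goods?"
hits = retrieve(question, DOCUMENTS)
print(build_prompt(question, hits))
```

The printed prompt is what would be sent to the LLM: the model answers from the returns policy it was handed, not from its general knowledge.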

Three

Agent.

An AI agent is an LLM that doesn't just answer but acts. It can operate programs, send emails, look things up in databases, create orders - in steps it plans itself.

Example: you say "Reply to all enquiries from the last week that are still unanswered with a standard response." An agent checks which enquiries are open, drafts the replies, sends them. That's the next generation of AI application - but it's also the more dangerous one. Agents need clear limits on what they're allowed to do, and ideally a second person who signs off on anything with consequences (the four-eyes principle).
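What "clear limits" can mean in practice is sketched below, under the assumption that every step an agent plans passes through one checkpoint before it runs. The action names are hypothetical and nothing is actually sent; the sketch only shows the decision.

```python
from typing import Optional

# Hypothetical action names: what the agent may do at all ...
ALLOWED_ACTIONS = {"list_open_enquiries", "draft_reply", "send_reply"}
# ... and which of those additionally need a human sign-off.
NEEDS_APPROVAL = {"send_reply"}

def run_step(action: str, approved_by: Optional[str] = None) -> str:
    """Decide whether a planned step may run; nothing is executed here."""
    if action not in ALLOWED_ACTIONS:
        return "blocked"
    if action in NEEDS_APPROVAL and approved_by is None:
        return "waiting for approval"
    return "done"

# A plan as the agent might produce it, including one step it must not take.
plan = ["list_open_enquiries", "draft_reply", "send_reply", "delete_mailbox"]
results = [run_step(action) for action in plan]
print(results)
```

Reading and drafting go through, sending waits until a person approves it, and an action that is not on the list is refused outright.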

Four

Hallucination.

When an LLM gives an answer that sounds plausible but is factually wrong - that's called a hallucination. It isn't a bug in the narrow sense, but a consequence of how it works: the model predicts what probably fits - not what's true.

Hallucinations are among the biggest risks when using AI. They're especially dangerous because they sound convincing. A made-up phone number, a wrong legal clause, a source citation that doesn't exist. Protection: always verify important facts, treat AI answers as a suggestion rather than the truth, use RAG so the model works from real documents. RAG reduces hallucinations, but it doesn't rule them out.

Five

Embedding and vector database.

An embedding is a mathematical translation of text into numbers. More precisely: each text becomes a list of a few hundred or thousand numbers that describe its meaning. Texts with similar meaning have similar number patterns - even when they use completely different words.

A vector database is a store that can handle these embeddings. You file all your documents as embeddings. With a question, the question itself also becomes an embedding - and the database finds the documents closest in meaning. That's the technical basis of RAG (see above).
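A minimal sketch of "closest in meaning", assuming cosine similarity as the measure of closeness (a common choice, not the only one). The three-number vectors are invented by hand for illustration; real embeddings come from an embedding model and have hundreds or thousands of numbers.

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """1.0 means same direction (similar meaning), near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy embeddings standing in for a vector database.
STORE = {
    "Terminating an agreement": [0.8, 0.2, 0.1],
    "Invoice payment terms": [0.1, 0.9, 0.1],
    "Opening hours of the canteen": [0.0, 0.2, 0.9],
}

def nearest(query_vector: list, store: dict) -> str:
    """Return the stored text whose embedding is closest to the query."""
    return max(store, key=lambda text: cosine_similarity(query_vector, store[text]))

# Toy embedding for the question "How do I cancel my contract?"
query = [0.9, 0.1, 0.0]
print(nearest(query, STORE))  # prints "Terminating an agreement"
```

The question and the matching document share no words at all; they are found together only because their number patterns point in nearly the same direction.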

Six

Knowledge cutoff and training data.

Every LLM was trained at a particular point in time. What happened after this cutoff it doesn't know - unless it gets current information via RAG or an internet connection. That's why ChatGPT sometimes gives outdated answers to questions about today's weather or the current chancellor.

Training data are the texts the model learned from. With the large models these are billions of web pages, books, scientific articles and code examples. What didn't appear in training, the model doesn't know either. What appeared especially often, it handles especially well.

Further terms in brief

Token

The smallest unit of text an LLM works with. In English, roughly three quarters of a word; other languages often need more tokens per word. When a provider says "100,000 tokens of context", that means: the model can process about 75,000 words at once.
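The arithmetic behind that figure, as a small sketch. The factor of 0.75 words per token is the rule of thumb from this entry, not an exact value; the real ratio depends on the language and on the model's tokenizer.

```python
import math

# Rule of thumb from this glossary; an approximation, not an exact value.
WORDS_PER_TOKEN = 0.75

def tokens_to_words(tokens: int) -> int:
    """Estimate how many words fit into a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)

def words_to_tokens(words: int) -> int:
    """Estimate how many tokens a text of a given length will use."""
    return math.ceil(words / WORDS_PER_TOKEN)

print(tokens_to_words(100_000))  # prints 75000
print(words_to_tokens(300))      # prints 400
```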

Prompt

The input you give the model. A good prompt contains the task, the context and an example. Prompt engineering is the name for the craft of writing good prompts.

Fine-tuning

Further training of a finished model on your own data so it handles a specific topic better. It takes considerable effort and isn't worth it for most SMEs - RAG is usually the better choice.

Open-source model

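An LLM whose weights are publicly available (strictly speaking often "open-weight", because the training data and code aren't always published too). Can be run on your own servers. Examples: Llama (Meta), Mistral. Important when you must not hand data to US providers under any circumstances.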

Context window

The amount of text a model can process at once. 8,000, 100,000 or two million tokens - depending on the model. Bigger isn't always better; more context costs more.

Multimodal

Models that can process not only text but also images, audio or video. Handy for evaluating scans, photos of receipts or spoken recordings.

Inference

The process by which a finished model produces an answer. Every request needs computing power for this - and that determines the running costs of AI applications.

Temperature

A setting that determines how creative or conservative the model's answers are. Low: precise, repeatable. High: varied, sometimes surprising. For facts: low. For texts: higher.
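How the setting works can be sketched with the usual formula: the model's raw scores for candidate words are divided by the temperature before they are turned into probabilities. The three scores below are invented for illustration.

```python
import math

def softmax_with_temperature(scores: list, temperature: float) -> list:
    """Turn raw scores into probabilities; temperature must be above 0."""
    scaled = [s / temperature for s in scores]
    highest = max(scaled)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Invented scores for three candidate next words.
candidates = ["Berlin", "Munich", "Hamburg"]
scores = [4.0, 2.0, 1.0]

low = softmax_with_temperature(scores, 0.2)   # precise, repeatable
high = softmax_with_temperature(scores, 2.0)  # varied, sometimes surprising

for name, p_low, p_high in zip(candidates, low, high):
    print(f"{name}: {p_low:.3f} at low temperature, {p_high:.3f} at high")
```

At low temperature almost all probability lands on the top candidate, so the answer barely varies. At high temperature the other candidates get a real chance of being picked.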

MCP - Model Context Protocol

A standard from Anthropic (late 2024) that lets AI models access your own data sources and tools in a uniform way. Will appear in many applications over the coming years.

Tool use / function calling

An LLM's ability to call external programs - to query a database, send an email, have a calculation done. The basis for agents.
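In rough outline: the model doesn't run anything itself, it writes a structured request, and the surrounding application carries it out. The sketch below assumes a simple JSON format with "name" and "arguments"; that format and both tools are made up for illustration and don't match any particular provider's API.

```python
import json

# Two hypothetical tools the application offers to the model.
def add(a: float, b: float) -> float:
    return a + b

def lookup_customer(customer_id: str) -> str:
    fake_database = {"C-1001": "Example GmbH"}  # invented sample data
    return fake_database.get(customer_id, "unknown customer")

TOOLS = {"add": add, "lookup_customer": lookup_customer}

def run_tool_call(raw_message: str):
    """Parse the model's request and call the matching tool, if it exists."""
    call = json.loads(raw_message)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

# What a model might write when it wants a calculation done.
message = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
print(run_tool_call(message))  # prints 5
```

The result goes back to the model, which uses it to phrase its answer. Only tools listed in TOOLS can ever be called.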

Bias

Distortions a model takes on from its training data. If the data is unbalanced, the answers are too. Important for personnel decisions, credit assessments and similarly sensitive topics.

Guardrails

Protective layers around a model that block certain requests or answers. Important so users can't lure the model into statements that are legally risky or damaging to reputation.
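A very simple form of such a layer, as a sketch: one check before the request reaches the model, one clean-up before the answer reaches the user. The blocked topics and the IBAN pattern are illustrative assumptions; real guardrails are usually more elaborate than keyword and pattern matching.

```python
import re

# Illustrative list of topics the assistant should refuse.
BLOCKED_TOPICS = ("legal advice", "medical diagnosis")

# Rough pattern for IBAN-like account numbers; not a full validity check.
IBAN_PATTERN = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")

def check_request(text: str) -> bool:
    """Input guardrail: True if the request may be passed to the model."""
    lowered = text.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)

def redact_answer(text: str) -> str:
    """Output guardrail: mask account numbers before the answer is shown."""
    return IBAN_PATTERN.sub("[redacted]", text)

print(check_request("Can you give me legal advice on this contract?"))  # prints False
print(redact_answer("Pay to DE89370400440532013000 today"))
```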

Foundation model

A large, generally trained base model on which many specialised applications are built. Examples: GPT-4, Claude, Gemini. Developed by a few companies, used by many.

Generative AI

An umbrella term for AI systems that can create content - texts, images, audio, video, code. In contrast to classic analytical AI, which only recognises and classifies.

If a term is missing

This glossary grows with what customers ask us. If an important term is missing, write to us - we'll add it.

AI vocabulary isn't static. What was mainstream in 2024 may be outdated by 2026 - and new terms keep arriving. We try to keep this list current without overloading it with buzzwords.

If you want to go deeper into the topics

The detailed background pages on AI are linked here - or write to us with your specific question.

For what AI fundamentally is, see "What is AI, really?". For what AI can do today, see "What AI makes possible". For what it can't do, see "What AI can't do, even when it looks like it can".
