AI Vocabulary
Last updated on: 01 April, 2026
-
AI: Machines, robots, or software made to think and make decisions the way humans do. Examples of AI are a chatbot that explains things, a map that suggests the fastest route to a destination, and a car that drives itself.
-
LLM: A Large Language Model, or LLM, is a type of AI that understands and generates human language. Different types of AI work with different kinds of data. A self-driving car, for example, runs on a separate type of AI; driving doesn't require proficiency in language, so that AI is not an LLM. An LLM is the type whose specialty is language.
-
Generative AI / GenAI: As the word generative suggests, this type of AI generates something. That something can be text, audio, images, or video. The popular AI models of today are all GenAI models.
-
AGI (Artificial General Intelligence): AI has a lot of types. Some can suggest short videos that trigger your dopamine release, and some can answer your funny questions. The AI that suggests videos cannot answer questions, and vice versa. Since they can do only a certain type of work, they are called narrow AI. Computer scientists want to create an AI that can think across all domains of knowledge and do all types of tasks a human can do. That let's-do-everything AI is called AGI.
-
Chatbot: The word bot comes from the word robot. A chatbot is software with which you can chat. What you see on Claude.ai or Lumo.proton.me is a chatbot.
-
Model: A model is the main piece of software, the brain that an AI company has made. Different companies have different models. Check huggingface.co/models and you'll see millions of models, or 'brains'. Each model has its own pros and cons. One might be smarter, heavier, and pricier; another might be lighter, cheaper, but less smart. One model might feel like a chatty human being; another might feel like a know-it-all professional. In the current AI race, models are what matter the most.
-
Prompt: The question you ask or the command you type in the chat box is called a prompt. There is a term, "prompt engineering," which essentially means carefully crafting your question. If you ask a certain way, the AI will reply a certain way. There are best practices for prompting that get better answers from AI models.
-
System prompt: It's the prompt that the AI company gives a model before releasing the model to users. The prompt contains instructions on how to behave, its persona, tone, etc. You can't edit a system prompt -- that's up to the company. You can edit your own prompt, which is called the user prompt.
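To make the split concrete, here is a small Python sketch of how a system prompt and a user prompt might travel together in one request. The field names ("system", "messages", "role") are illustrative, not any particular company's API.

```python
# Hypothetical structure of a chat request: the company's hidden
# instructions ride alongside the user's message.

def build_request(system_prompt, user_prompt):
    """Bundle the company's instructions with the user's message."""
    return {
        "system": system_prompt,  # set by the company, not editable by you
        "messages": [
            {"role": "user", "content": user_prompt},  # the part you control
        ],
    }

request = build_request(
    "You are a polite assistant. Answer briefly.",
    "What is a token?",
)
print(request["system"])  # the model reads this before your question
```

The model effectively sees both parts at once, which is why the same user prompt can produce very different answers under different system prompts.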
-
Apps: The model is the underlying brain; you don't directly interact with it. You need an app, an interface, to interact with the brain. On Claude.ai, you see the chat box. That is an app. But it is not the only app that uses the Claude model. Claude Code is a coding app that uses the same model but is a different app.
-
Harness: A harness is the system around an AI model that turns it from a simple chatting model into an assistant that does helpful tasks. It gives a model access to tools, files, and the network. The harness also puts safety boundaries around AI models so that they don't delete or edit things they are not supposed to touch.
Claude Code is a harness around the Claude model that turns it from a text-generating LLM into a software engineering agent. An app is what you interact with, while a harness is what's working behind the scenes. So, Claude Code is both an app and a harness. From the user's POV, it's an app, but from the company's POV, it's a harness.
-
Agent/Agentic AI: An LLM is a brain in a jar, but an agent is a brain with a plan and little hands. An LLM can only give you an output, but an agent can do stuff for you: read emails and reply to them, create and edit files on your computer, book restaurant reservations, etc.
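A toy Python sketch of the loop that makes an agent different from a plain LLM. Everything here is made up for illustration: fake_llm stands in for a real model, and read_email is a pretend tool.

```python
# A plain LLM call returns text and stops. An agent runs a loop:
# ask the model, run the tool it picks, feed the result back, repeat.

def fake_llm(prompt):
    """Pretend model: decides which tool to use based on the prompt."""
    if "unread emails" in prompt:
        return {"action": "read_email"}
    return {"action": "done"}

def read_email():
    """Pretend tool: the agent's 'little hands'."""
    return "Boss: please send the report."

TOOLS = {"read_email": read_email}

def run_agent(task):
    """The agent loop, with a safety cap on iterations."""
    prompt, steps = task, []
    for _ in range(5):
        decision = fake_llm(prompt)
        if decision["action"] == "done":
            break
        result = TOOLS[decision["action"]]()  # actually do something
        steps.append(result)
        prompt = f"Tool said: {result}"       # feed the result back
    return steps

print(run_agent("Check my unread emails"))
```

Real harnesses wrap exactly this kind of loop around a real model, with many more tools and much stricter safety checks.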
-
Tokens: A token is the smallest unit of text that an LLM can read or write. A token is roughly three-quarters of a word; a common rule of thumb is about four characters per token. When you give a prompt, the LLM sees the prompt as a bunch of tokens. In reply, it produces a bunch of tokens. You might ask: who cares what the smallest unit of text is in an LLM's eyes? Some AI plans charge according to the tokens used and produced by the model. On such plans, you will care about the token count of each chat or task.
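If you want a rough feel for token counts, here is a small Python sketch. Real models use trained tokenizers; the four-characters-per-token rule below is only the common approximation.

```python
# Crude token estimator: about one token per four characters of text.
# Real tokenizers split text differently, so treat this as a ballpark.

def estimate_tokens(text):
    """Estimate the token count of a piece of text."""
    return max(1, len(text) // 4)

prompt = "Explain black holes to a five-year-old."
print(estimate_tokens(prompt))
```

On a pay-per-token plan, an estimate like this (times the per-token price) gives you a rough idea of what a prompt will cost before you send it.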
-
Context window: A context window is an AI's "short-term memory." When you chat with an LLM, it doesn't actually remember your chats from the day you started using it. Within one chat, the AI remembers the prompt that started the chat, what it replied, and the messages that followed. But once you start a new chat, the older context window is no longer there. One chat lives in one context window; a new chat starts with a fresh one. Every AI model has a limited context window (typically from tens of thousands to millions of tokens).
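A small Python sketch of how a limited context window forces old messages to be forgotten. The tiny 10-token window and the four-characters-per-token estimate are deliberately crude, for illustration only.

```python
# When the conversation outgrows the window, the oldest messages
# fall off the front -- the model simply never sees them again.

def trim_to_window(messages, window_tokens=10):
    """Keep only the most recent messages that fit in the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):          # walk from newest to oldest
        cost = max(1, len(msg) // 4)        # crude token estimate
        if used + cost > window_tokens:
            break                           # older messages are forgotten
        kept.append(msg)
        used += cost
    return list(reversed(kept))

chat = ["Hi, I am Alice and I love astronomy.",
        "Tell me about Mars.",
        "And about Jupiter?"]
print(trim_to_window(chat))  # the opening message (with the name) is dropped
```

This is why a long chat can "forget" your name from the first message: it scrolled out of the window.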
-
Multimodal: A modality is a format of data. Text is one format. Image is another. Audio and video are also data formats. A unimodal LLM works with only one modality, such as text. If you try to make it understand an image, it won't be able to, because it's unimodal. But a multimodal LLM can understand different data formats. Note that modality is not the same as model. A model is the brain, while a modality is a data format.
-
Inference: There are two stages in a model's life: being built and being run. Being built is called training, and being run is called inference. So, inference is just a fancy term for "a model running on a computer."
-
API: An average person uses publicly available apps to talk to LLMs. The web app at Claude.ai is a human-friendly interface, or app. But what if a programmer wants to develop her own app -- the interface -- and use the Claude LLM as the underlying brain? How would she do it?
She would use something called an API (Application Programming Interface). An API lets a programmer send requests to another computer or piece of software and get answers back. In our case, the developer would get an API key from Anthropic (the company behind Claude), embed it in her app, send a request to Claude via the API, and get the answer back through the same API. It basically becomes a connecting tunnel.
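Here is a hedged Python sketch of the pieces such an API call is made of. The endpoint, key, model name, and JSON fields are all hypothetical; a real provider's documentation defines its own exact format.

```python
# Assembling (not sending) an API request: the key proves who you are,
# the endpoint says where to send it, and the body carries your prompt.
import json

API_KEY = "sk-your-secret-key"                # hypothetical key; keep real ones secret
ENDPOINT = "https://api.example.com/v1/chat"  # hypothetical endpoint

def build_api_call(user_prompt):
    """Assemble the HTTP pieces an app would send to the model's server."""
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": "example-model-1",
        "messages": [{"role": "user", "content": user_prompt}],
    })
    return ENDPOINT, headers, body
    # An app would now POST these with an HTTP library and read the
    # JSON answer that comes back through the same "tunnel".

url, headers, body = build_api_call("Hello!")
print(url)
```

The developer's app shows its own buttons and chat box; underneath, every message becomes a request like this.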
-
Open-source vs. closed-source: I am going to butcher this up a bit to keep it concise. The code behind an LLM, the actual piece of software, generally sits and runs on the servers of the company that created it. We only pay a fee to use it. We never get the underlying code; we are only sending questions and getting answers back. That's closed-source. But if a company gives away the 'source', or the code, lets you legally download it, and run it on your computer, it's called open-source. Go to www.duck.ai, click on the model selection option, and you will see a bunch of open-source models that were created by other companies but are downloaded and run by DuckDuckGo.
-
Benchmark: How do we test the smartness of humans? With standardised tests. How do we test the smartness of LLMs? With standardised tests.
But for reasons unknown to humankind, these standardised tests are called benchmarks. They are basically a set of questions, puzzles, and tasks with known, correct answers. Researchers feed these questions to the AI and grade how many it gets right. There are benchmarks related to general knowledge, math, coding, safety, etc.
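A toy Python benchmark to make the idea concrete. The questions, the "model", and its wrong answer are all made up; real benchmarks do the same thing at a much larger scale.

```python
# A miniature benchmark: questions with known answers, a stand-in
# "model", and a grader that computes the score as a fraction.

BENCHMARK = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
    ("Largest planet?", "Jupiter"),
]

def fake_model(question):
    """Stand-in model that gets one question wrong on purpose."""
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return answers.get(question, "Saturn")   # wrong guess for unknowns

def run_benchmark(model, benchmark):
    """Feed each question to the model and grade it against the known answer."""
    correct = sum(model(q) == a for q, a in benchmark)
    return correct / len(benchmark)

print(run_benchmark(fake_model, BENCHMARK))  # scores 2 out of 3
```

The leaderboard numbers you see in AI news are essentially this loop, run over thousands of much harder questions.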