AI Fundamentals: Key Concepts You Need to Know Before Going Deeper

Starting to work with AI - whether through coding, no-code tools, or integrating models into existing workflows - requires a working vocabulary. Without it, tutorials feel opaque, documentation makes no sense, and conversations with engineers go sideways.

This article walks through the most important foundational concepts in plain language. No fluff, no unnecessary abstraction - just the terms and ideas you will actually encounter and need to understand.

1. IDE (Integrated Development Environment)

An IDE is software that bundles everything a developer needs into one place: a code editor, a debugger, a terminal, a file manager, and often more. Think of it as a fully equipped workstation rather than a blank desk with one pen.

Common examples: VS Code, Cursor, IntelliJ IDEA, PyCharm.

Why it matters for AI: Modern IDEs now embed AI directly into the development environment. Tools like Cursor or GitHub Copilot in VS Code can read your entire project, suggest completions as you type, explain errors, and help you write new features through a chat interface. The IDE has become the primary place where humans and AI collaborate on code.

2. API (Application Programming Interface)

An API is the bridge that lets two separate software systems talk to each other. The classic analogy: you are the customer at a restaurant. You give your order to the waiter (the API), who carries it to the kitchen (the server - in this case, OpenAI or Anthropic’s infrastructure), and brings back the result (the AI-generated response).

The practical implication: You do not need to build or train your own AI model. You call the OpenAI or Anthropic API to borrow their model’s capabilities and embed them in your own product - a chatbot, a writing tool, an automated analysis pipeline. The API is what makes AI accessible without massive compute infrastructure.
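To make this concrete, here is a minimal sketch of what a chat-completion API call looks like under the hood. The field names follow the widely used OpenAI-style schema; the model name and the system message are illustrative placeholders, and other providers use similar but not identical shapes.

```python
import json

def build_chat_request(model: str, user_message: str) -> dict:
    """Assemble the JSON payload for a typical chat-completion API call."""
    return {
        "model": model,
        "messages": [
            # The system message sets the assistant's behavior.
            {"role": "system", "content": "You are a helpful assistant."},
            # The user message is your actual request.
            {"role": "user", "content": user_message},
        ],
        # Cap on output tokens you are willing to pay for.
        "max_tokens": 200,
    }

payload = build_chat_request("gpt-4o", "Explain what an API is in one sentence.")
print(json.dumps(payload, indent=2))
```

In practice you would POST this payload, with an `Authorization` header carrying your API key, to the provider's endpoint; the official SDKs wrap this request/response cycle for you.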

3. The Major AI Models You Will Encounter

These are the foundational Large Language Models (LLMs) that power most of what people call “AI” today:

  • Claude (Anthropic): Models like Claude 3.5 Sonnet are highly regarded for logical reasoning, code writing, and following complex multi-step instructions. Claude tends to produce fewer hallucinations than competitors and has a notably natural writing style.

  • GPT (OpenAI): GPT-4 and GPT-4o are the most flexible general-purpose models available. They handle text, images, and voice well and were largely responsible for igniting the current wave of mainstream AI adoption.

  • Gemini (Google): Gemini 1.5 Pro’s standout feature is an enormous context window - capable of processing hours of video or dozens of books in a single session. It also integrates deeply with Google Workspace (Docs, Drive, Gmail).

  • Llama (Meta): Llama 3 is the most powerful open-source model available. You can download and run it on your own hardware or company servers, which means your data never leaves your infrastructure. This is a significant advantage for privacy-sensitive applications.

4. MCP (Model Context Protocol)

MCP is an open standard that lets AI connect to external data sources and tools in a standardized way.

The problem it solves: Models like Claude or ChatGPT are trained on public internet data. They do not know what is in your internal documents, your Google Drive, your Notion workspace, or your company’s database.

What MCP does: It provides a standardized connection mechanism - sometimes described as “USB-C for AI.” Any tool that implements MCP can connect to any AI that supports MCP, without custom integration work. You can give your AI assistant access to your file system, your GitHub repositories, your databases, or your calendar.

This is what transforms AI from a question-answering machine into an actual execution agent - one that can retrieve real data, take actions, and produce results in the real world.

5. Tokens and Quotas

When you use AI through an API rather than a consumer interface, you are billed and rate-limited based on tokens:

What is a token? AI models do not read text word by word. They split text into tokens - roughly three-quarters of an English word per token on average. You are billed for input tokens (what you send to the model) and output tokens (what the model generates back).
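A rough back-of-the-envelope estimate is often enough for budgeting. The sketch below uses the common 4-characters-per-token heuristic for English text; real tokenizers give exact counts, and the prices here are illustrative placeholders, not any provider's actual rates.

```python
def estimate_tokens(text: str) -> int:
    """Heuristic: English averages roughly 4 characters per token."""
    return max(1, len(text) // 4)

def estimate_cost(input_text: str, expected_output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Estimate the cost of one API call.

    Input and output tokens are billed at different rates; output is
    typically more expensive. Prices are per 1,000 tokens.
    """
    input_tokens = estimate_tokens(input_text)
    return ((input_tokens / 1000) * price_in_per_1k
            + (expected_output_tokens / 1000) * price_out_per_1k)

# A ~1,000-token prompt expecting a ~500-token reply, at placeholder rates:
cost = estimate_cost("x" * 4000, 500, price_in_per_1k=0.01, price_out_per_1k=0.03)
```

Multiply that per-call figure by your expected request volume and the monthly bill stops being a surprise.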

Rate limits: To prevent abuse and manage infrastructure load, providers enforce limits:

  • RPM (Requests Per Minute): How many API calls you can make per minute.
  • TPM (Tokens Per Minute): Total token volume you can process per minute.
  • Usage limits: Monthly spending caps (for example $50 or $100) to prevent runaway costs if a bug causes your code to call the API in an infinite loop overnight.

Understanding token economics is important if you are building anything that runs AI at scale.

6. Prompt Engineering

Prompt engineering is the skill of communicating effectively with AI. The quality of AI output is directly proportional to the quality of the input. A good prompt is specific and structured.

The four elements of an effective prompt:

  • Role: “Act as a senior SEO specialist…”
  • Context: “The product is a B2B SaaS tool for finance teams…”
  • Task: “Write three headline variations for…”
  • Output format: “Return results as a markdown table…”

Prompt engineering is not about memorizing tricks - it is about thinking clearly about what you want and expressing it unambiguously.
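The four elements above can be assembled programmatically, which is how most production systems build prompts rather than typing them by hand each time. This is a minimal sketch; the labels and example values are illustrative, not a required format.

```python
def build_prompt(role: str, context: str, task: str, output_format: str) -> str:
    """Combine the four elements of an effective prompt into one string."""
    return "\n\n".join([
        f"Role: {role}",
        f"Context: {context}",
        f"Task: {task}",
        f"Output format: {output_format}",
    ])

prompt = build_prompt(
    role="Act as a senior SEO specialist.",
    context="The product is a B2B SaaS tool for finance teams.",
    task="Write three headline variations for the landing page.",
    output_format="Return results as a markdown table.",
)
print(prompt)
```

Templating prompts this way also makes them testable and versionable, so a wording change that degrades output quality can be traced and reverted like any other code change.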

7. Hallucination

Hallucination is when AI confidently states something that is factually wrong or entirely made up. This happens because language models work by predicting statistically likely sequences of tokens - they are not reasoning from a database of verified facts. When a model lacks the right information, it sometimes generates plausible-sounding content anyway.

This is why human oversight (Human-in-the-loop) remains essential. AI can be wrong with remarkable confidence, and the output always needs a sanity check - especially for factual claims, data, sources, or anything consequential.

8. Context Window

The context window is the maximum amount of information a model can hold in its working memory during a single interaction. Think of it as the model’s short-term memory.

If a model has a 128k token context window, it can process roughly 100,000 words - about a 300-page book - in one session. Once you exceed that limit, the model begins dropping earlier parts of the conversation to make room for new information. This is when responses start becoming inconsistent or ignoring things you mentioned earlier.
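The "dropping earlier parts" behavior can be sketched as a simple eviction loop. This is a toy illustration, not how any specific provider implements it: it uses the 4-characters-per-token heuristic in place of a real tokenizer, and real systems often summarize old messages rather than discarding them outright.

```python
def trim_to_window(messages: list[dict], window_tokens: int) -> list[dict]:
    """Drop the oldest messages until the conversation fits the window.

    Token counting here is a rough 4-chars-per-token heuristic.
    """
    def count_tokens(message: dict) -> int:
        return max(1, len(message["content"]) // 4)

    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > window_tokens:
        kept.pop(0)  # oldest message is evicted first
    return kept
```

Once eviction kicks in, anything stated only in those early messages is simply gone from the model's view, which is why long sessions start "forgetting" instructions given at the beginning.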

See context-in-ai for a full explanation of how to manage context effectively.

9. RAG (Retrieval-Augmented Generation)

RAG is a technique for grounding AI responses in specific documents without the prohibitive cost of retraining a model on your data.

Here is how it works:

  1. You ask a question.
  2. A search layer scans your document library and retrieves the most relevant passages.
  3. Those passages are injected into the prompt alongside your question, with an instruction like: “Answer this question based only on the provided document excerpts.”

Why this matters: AI responses based on RAG are anchored to real documents - dramatically reducing hallucination risk. For enterprise use cases where accuracy is critical, RAG is typically more practical and less expensive than fine-tuning.
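The three-step flow above can be sketched end to end. The retrieval step here is a toy keyword-overlap scorer so the example stays self-contained; production RAG systems use embeddings and a vector database instead, but the overall shape (retrieve, then inject into the prompt) is the same.

```python
import re

def _words(text: str) -> set[str]:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Toy retrieval: rank documents by word overlap with the query."""
    q = _words(query)
    ranked = sorted(documents, key=lambda d: len(q & _words(d)), reverse=True)
    return ranked[:top_k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Inject the retrieved passages into the prompt alongside the question."""
    excerpts = "\n".join(f"- {p}" for p in retrieve(query, documents))
    return ("Answer the question based only on the provided document excerpts.\n\n"
            f"Excerpts:\n{excerpts}\n\nQuestion: {query}")
```

The final prompt, passages included, is what actually gets sent to the model, so the model's answer is anchored to your documents rather than to its training data.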

10. Agentic AI

Agentic AI describes AI that goes beyond answering questions and actually takes autonomous action to complete goals.

You give an AI agent a goal. The agent breaks it into steps, selects appropriate tools (search, code execution, file manipulation, email sending), executes those steps in sequence, evaluates results, and corrects course when something goes wrong - all without continuous human direction.
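That loop (execute, evaluate, retry) can be sketched in a few lines. In a real agent the LLM itself produces and revises the plan and chooses the tools; here the plan is given up front and the tools are plain functions, purely to make the control flow visible.

```python
def run_agent(plan: list[dict], tools: dict, max_retries: int = 2) -> list:
    """Minimal agent loop: run each planned step with its tool,
    retrying on failure and recording None when a step gives up."""
    results = []
    for step in plan:
        tool = tools[step["tool"]]
        for attempt in range(max_retries + 1):
            try:
                results.append(tool(**step["args"]))
                break  # step succeeded, move on
            except Exception:
                if attempt == max_retries:
                    results.append(None)  # exhausted retries
    return results

# Stub tools standing in for search, code execution, etc.
tools = {
    "add": lambda a, b: a + b,
    "shout": lambda text: text.upper(),
}
plan = [
    {"tool": "add", "args": {"a": 2, "b": 3}},
    {"tool": "shout", "args": {"text": "done"}},
]
```

The error-handling branch is the important part: an agent that cannot notice and recover from a failed step is just a script with extra steps.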

This is the shift from AI as a conversational tool to AI as a collaborator that can complete multi-step workflows independently. See agentic-workflows-enterprise-2026 for how this plays out in organizational contexts.

FAQ

Do I need to understand all these concepts to start using AI productively?

Not all of them immediately. For casual use of AI chat tools, you mainly need to understand prompts, hallucination, and context windows. If you are integrating AI into workflows or building anything with APIs, add tokens, quotas, and RAG to your list. MCP and agentic AI become relevant when you start building more sophisticated automation.

What is the difference between a model and an AI tool?

A model is the underlying neural network that processes language - Claude, GPT-4, Gemini. A tool is the product built on top of a model - Claude.ai, ChatGPT, Cursor. The tool provides the interface and potentially adds features (memory, tools, plugins), while the model does the actual language processing.

How do I know which model to choose for a given task?

General guidance: Claude or GPT-4o for writing and coding, Gemini for tasks involving very large documents, Llama if data privacy requires self-hosting. That said, the difference between top-tier models matters less than how well you structure your context and prompts. A mediocre prompt with a great model often produces worse results than a well-crafted prompt with a competent model.

What is the relationship between RAG and fine-tuning?

Both are ways to make AI more accurate for specific domains. Fine-tuning retrains the model itself on your data - expensive, time-consuming, and produces a specialized model. RAG leaves the base model untouched and retrieves relevant information at query time - faster to set up, more flexible, and much cheaper. For most real-world use cases, RAG is the right starting point.
