How to Build an AI Agent from Scratch (Free, No Framework, 2026)

Q: Can I build an AI agent from scratch in Python without a framework?

Yes. An agent is just a loop. 40-150 lines of Python are enough.

Q: What do I need to build a Python AI agent?

You need Python 3.10+, an API key for an LLM provider, a ReAct loop, one or two tools, memory (a list of messages), and a max-step guardrail.

Q: How many lines of code is a basic AI agent?

A minimal agent takes 40-80 lines; about 150 lines covers tools, memory, and a step limit.

Q: Should beginners use LangChain or build from scratch?

Build from scratch first. Understand the loop. Then use frameworks.

Q: How do I stop the agent from looping forever?

Set a max_steps limit and stop when the agent repeats the same tool call.

Q: Do I need GPT-4 or can I use a cheaper model?

gpt-4o-mini or Claude Haiku is sufficient for agents with 2-3 tools. Use frontier models for complex multi-step tasks.

Q: What is the ReAct pattern?

Reasoning + Acting: Thought → Action → Observation → repeat until Final Answer.

Q: What is the difference between an AI agent and a chatbot?

Chatbot: single-turn, stateless. Agent: multi-turn, stateful, tool-using, autonomous.

Q: What frameworks are best for AI agents in 2026?

LangGraph, OpenAI Agents SDK, CrewAI, Claude Agent SDK.

Q: Can I use free/open-source models to build an AI agent?

Yes. Use Ollama with GGUF models (e.g., via the pguso/agents-from-scratch repo).

What Is an AI Agent? (And Why It's Not a Chatbot)


The One-Loop Mental Model

An AI agent is a loop. You give it a goal, it reasons, takes action, observes results, and repeats until it finishes. That's it. No frameworks needed.

Simple analogy:
- Chatbot: "What's the weather?" → Answers with static knowledge.
- Agent: "What's the weather in Tokyo?" → Calls weather API → Returns live data.

Agent vs Chatbot — The Critical Difference

Feature	Chatbot	AI Agent
State	Stateless	Stateful (keeps memory)
Tools	No tools	Uses tools (APIs, calculators, search)
Loop	Single turn	Multi-step reasoning
Goal	Answer questions	Complete tasks autonomously

Why 2026 Is the Year of AI Agents

73% of enterprises are actively investing in agentic AI (AgileSoftLabs, 2026)
LangGraph has overtaken LangChain for production agents
MCP (Model Context Protocol) is becoming the standard for tool integration
"Skip LangChain, learn the loop first" is now the consensus advice

What You Need Before We Start

Python 3.10+ and a Code Editor

Python 3.10 or newer
Any code editor (VS Code, PyCharm, or even IDLE)
Terminal/command prompt

Free API Keys (Google AI Studio, Anthropic Free Tier, Groq)

Provider	Free Tier	API Key Needed
Google AI Studio (Gemini Flash)	Free tier available (paid rate: $0.25/1M input tokens)	Yes (free)
Anthropic	$5 free credits	Yes
Groq	Free tier available	Yes
DuckDuckGo	Completely free	No (direct API call)

Why We're Skipping LangChain (For Now)

LangChain is powerful but adds complexity. Before using any framework, understand the core loop. Once you grasp the basics, LangChain becomes easier to learn.

The 4 Components Every AI Agent Needs

The LLM Brain (Which Model to Pick in 2026)

Model	Best For	Cost (June 2026)
Claude Sonnet 4.6	Agentic workflows, 1M context	Paid tier
GPT-5.4	General purpose, strong tool calling	Paid tier
Gemini 3.1 Flash	Cheapest option	$0.25/1M input
DeepSeek V3.2	Most cost-effective	$0.27/1M input
Llama 4	Open-source option	Free (local)

Our choice: Gemini 3.1 Flash (free tier) + DuckDuckGo (free search).

Tools — Giving Your Agent Hands

Tools are functions the agent can call. Examples:
- Web search
- Calculator
- Read/write files
- Call REST APIs
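One way to make tools callable by name is a small registry that maps each tool name to a function and a one-line description. The sketch below is only an illustration: `TOOLS`, `register_tool`, `describe_tools`, `call_tool`, and the two toy tools are hypothetical names, not part of any library.

```python
from typing import Callable

# Hypothetical registry: tool name -> (function, description shown to the LLM)
TOOLS: dict[str, tuple[Callable[[str], str], str]] = {}

def register_tool(name: str, description: str):
    """Decorator that adds a function to the registry under `name`."""
    def decorator(func: Callable[[str], str]):
        TOOLS[name] = (func, description)
        return func
    return decorator

@register_tool("word_count", "Count the words in a piece of text.")
def word_count(text: str) -> str:
    return str(len(text.split()))

@register_tool("reverse", "Reverse a string.")
def reverse(text: str) -> str:
    return text[::-1]

def describe_tools() -> str:
    """Render the tool list as text that can be pasted into a prompt."""
    return "\n".join(f"- {name}: {desc}" for name, (_, desc) in TOOLS.items())

def call_tool(name: str, argument: str) -> str:
    """Look up a tool by name and run it; unknown names return an error string."""
    if name not in TOOLS:
        return f"Error: unknown tool '{name}'"
    func, _ = TOOLS[name]
    return func(argument)
```

Keeping every tool behind the same string-in, string-out signature is a simplification; it makes the dispatch code trivial at the cost of parsing arguments inside each tool.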

Memory — Short-Term and Long-Term

Short-term: Conversation history (list of messages)
Long-term: Vector database (FAISS) for retrieving past information

The Orchestration Loop

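history = []
for step in range(max_steps):              # bounded, so the loop cannot run forever
    thought = llm_think(history, tools)    # reason about what to do next
    action = parse_action(thought)         # extract a tool call, if any
    if action is None:                     # no tool call means the agent is done
        break
    observation = execute_action(action)   # run the tool
    history.append(f"{thought}\n{observation}")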
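To watch the loop terminate without calling a real model, the sketch below swaps the LLM for a scripted stand-in that returns canned replies. Everything here is an assumption for illustration: `make_scripted_llm`, `run_agent`, and the `CALL <tool> <arg>` reply format are hypothetical and much simpler than what a real model produces.

```python
def make_scripted_llm(replies):
    """Return a fake llm_think that yields canned replies in order (stand-in for a real model)."""
    remaining = iter(replies)
    def llm_think(history, tools):
        return next(remaining)
    return llm_think

def parse_action(thought):
    """Treat 'CALL <tool> <arg>' as a tool call; anything else is the final answer."""
    parts = thought.split(" ", 2)
    if len(parts) == 3 and parts[0] == "CALL":
        return {"tool": parts[1], "arg": parts[2]}
    return None

def run_agent(llm_think, tools, max_steps=5):
    """Run the think -> act -> observe loop until a final answer or max_steps."""
    history = []
    for _ in range(max_steps):
        thought = llm_think(history, tools)
        action = parse_action(thought)
        if action is None:            # no tool requested: the thought is the answer
            return thought, history
        tool = tools.get(action["tool"])
        observation = tool(action["arg"]) if tool else f"Unknown tool: {action['tool']}"
        history.append({"thought": thought, "observation": observation})
    return "Stopped: step limit reached", history

# One tool call followed by a final answer
tools = {"upper": str.upper}
llm = make_scripted_llm(["CALL upper tokyo", "The answer is TOKYO"])
answer, history = run_agent(llm, tools)
```

The same `run_agent` function works unchanged once `llm_think` is replaced with a function that calls an actual model, which is the point of keeping the loop separate from the model call.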

Step-by-Step: Build Your First AI Agent

Step 1 — Set Up Your Environment (pip install + API key)

# Create virtual environment
python -m venv agent_env
source agent_env/bin/activate  # Windows: agent_env\Scripts\activate

# Install dependencies
pip install requests python-dotenv

# Create .env file with your API key
echo "GEMINI_API_KEY=your_key_here" > .env
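Once the .env file exists, it can help to fail fast when the key is missing or still set to the placeholder. The sketch below assumes `load_dotenv()` from python-dotenv has already copied the .env values into the process environment; `get_api_key` is a hypothetical helper name.

```python
import os

def get_api_key(name="GEMINI_API_KEY"):
    """Return the key from the environment, or raise a clear error if it is missing."""
    key = os.environ.get(name, "").strip()
    # "your_key_here" is the placeholder written by the echo command in this step
    if not key or key == "your_key_here":
        raise RuntimeError(f"{name} is not set. Add it to .env or export it in your shell.")
    return key
```

Calling this once at startup turns a confusing authentication failure deep inside the loop into an immediate, readable error.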

Step 2 — Define Your First Tool (Web Search)

import requests

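def web_search(query):
    """Query the DuckDuckGo Instant Answer API (free, no key needed)."""
    response = requests.get(
        "https://api.duckduckgo.com/",
        params={"q": query, "format": "json"},  # requests URL-encodes the query
        timeout=10,
    )
    response.raise_for_status()
    data = response.json()
    # AbstractText is often an empty string, so fall back explicitly
    return data.get("AbstractText") or "No results found"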

Step 3 — Write the ReAct Loop (The Heart of the Agent)

import os
from dotenv import load_dotenv

load_dotenv()

def llm_think(history, tools):
    """Ask the LLM what to do next."""
    # Simplified: call Gemini API
    # Real implementation would send history + tool descriptions
    return "Search for current weather in Tokyo"

def parse_action(thought):
    """Extract action from LLM response."""
    if "search" in thought.lower():
        return {"tool": "web_search", "query": "weather Tokyo"}
    return None

def execute_action(action):
    """Run the tool."""
    if action["tool"] == "web_search":
        return web_search(action["query"])
    return "No action taken"

# Simple agent loop
history = []
max_steps = 5

for step in range(max_steps):
    thought = llm_think(history, [])
    action = parse_action(thought)
    if not action:
        print("Agent: I don't know what to do.")
        break
    observation = execute_action(action)
    history.append(f"Thought: {thought}\nAction: {action}\nObservation: {observation}")
    print(f"Step {step+1}: {observation}")
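The placeholder `parse_action` above hard-codes its query. If the model is prompted to answer in a "Thought / Action: tool[input] / Final Answer" layout, a real parser might look like the sketch below. That layout is an assumption (a common ReAct convention, not something any API enforces), and `parse_react` is a hypothetical name.

```python
import re

# Matches a line such as: Action: web_search[weather Tokyo]
ACTION_RE = re.compile(r"^Action:\s*(\w+)\[(.*)\]\s*$", re.MULTILINE)
# Matches "Final Answer:" and captures everything after it
FINAL_RE = re.compile(r"^Final Answer:\s*(.+)$", re.MULTILINE | re.DOTALL)

def parse_react(text):
    """Return ('final', answer), ('action', {...}), or ('invalid', raw_text)."""
    final = FINAL_RE.search(text)
    if final:
        return "final", final.group(1).strip()
    action = ACTION_RE.search(text)
    if action:
        return "action", {"tool": action.group(1), "query": action.group(2).strip()}
    return "invalid", text
```

Returning an explicit `invalid` result lets the loop feed a correction back to the model instead of crashing on malformed output.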

Step 4 — Run and Test Your Agent

python agent.py

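Expected output (illustrative; the actual text depends on what the search returns, and the DuckDuckGo Instant Answer API may return nothing for a live-data query like weather). Because the placeholder llm_think always returns the same thought, the loop prints one such line for each of the 5 steps:

Step 1: Current weather in Tokyo: 22°C, sunny.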

Step 5 — Add Memory (Keep Conversation Context)

# Extend history with conversation turns
def add_memory(history, user_input, agent_response):
    history.append(f"User: {user_input}")
    history.append(f"Agent: {agent_response}")
    return history
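A history list that only grows will eventually exceed the model's context window. A simple way to bound it, sketched below, is to keep the first entry (assumed here to hold the original goal) plus the most recent entries. `trim_history` and the default of 20 are hypothetical choices.

```python
def trim_history(history, max_messages=20):
    """Keep the first message (the original goal) plus the most recent ones."""
    if max_messages < 2:
        raise ValueError("max_messages must be at least 2")
    if len(history) <= max_messages:
        return history
    # One slot for the first message, the rest for the newest messages
    return [history[0]] + history[-(max_messages - 1):]
```

Dropping the middle of the conversation loses information; summarizing old turns is a common alternative, at the cost of an extra model call.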

Step 6 — Add Guardrails (Stop Infinite Loops)

# Prevent infinite loops
max_steps = 5
repeat_count = 0
last_action = None

for step in range(max_steps):
    thought = llm_think(history, [])
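    action = parse_action(thought)
    if not action:  # nothing to execute; avoid passing None to execute_action
        print("Agent: I don't know what to do.")
        break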

    # Repeat detection
    if action == last_action:
        repeat_count += 1
        if repeat_count > 2:
            print("Agent stuck in loop. Stopping.")
            break
    else:
        repeat_count = 0

    last_action = action
    observation = execute_action(action)
    history.append(f"Step {step+1}: {observation}")
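The same two guardrails can be packaged into a small object so the loop body stays short and the logic can be tested on its own. This is a sketch under the same thresholds as above; `LoopGuard` and its method names are hypothetical.

```python
class LoopGuard:
    """Tracks consecutive identical actions and a total step budget."""

    def __init__(self, max_steps=5, max_repeats=2):
        self.max_steps = max_steps
        self.max_repeats = max_repeats
        self.steps = 0
        self.repeats = 0
        self.last_action = None

    def check(self, action):
        """Return None if the agent may continue, else a reason string for stopping."""
        self.steps += 1
        if self.steps > self.max_steps:
            return "step limit reached"
        if action == self.last_action:
            self.repeats += 1
            if self.repeats > self.max_repeats:
                return "repeated the same action"
        else:
            self.repeats = 0
        self.last_action = action
        return None
```

In the loop, call `guard.check(action)` before executing the tool and break when it returns a reason.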

The Complete Code (Copy & Run)

import os
import requests
from dotenv import load_dotenv

load_dotenv()

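def web_search(query):
    response = requests.get(
        "https://api.duckduckgo.com/",
        params={"q": query, "format": "json"},  # requests URL-encodes the query
        timeout=10,
    )
    response.raise_for_status()
    data = response.json()
    # AbstractText is often an empty string, so fall back explicitly
    return data.get("AbstractText") or "No results found"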

def llm_think(history, tools):
    # Placeholder: replace with actual LLM call
    return "Search for current weather in Tokyo"

def parse_action(thought):
    if "search" in thought.lower():
        return {"tool": "web_search", "query": "weather Tokyo"}
    return None

def execute_action(action):
    if action["tool"] == "web_search":
        return web_search(action["query"])
    return "No action taken"

# Agent loop
history = []
max_steps = 5
last_action = None
repeat_count = 0

for step in range(max_steps):
    thought = llm_think(history, [])
    action = parse_action(thought)

    if not action:
        print("Agent: I don't know what to do.")
        break

    # Repeat detection
    if action == last_action:
        repeat_count += 1
        if repeat_count > 2:
            print("Agent stuck in loop. Stopping.")
            break
    else:
        repeat_count = 0

    last_action = action
    observation = execute_action(action)
    history.append(f"Step {step+1}: {observation}")
    print(f"Step {step+1}: {observation}")

Make It Useful: Add Real-World Tools

Connect to Any REST API

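def call_api(url, method="GET", data=None):
    """Generic API caller for JSON endpoints."""
    # requests.request handles any HTTP verb, so no branch leaves response undefined
    response = requests.request(method.upper(), url, json=data, timeout=10)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.json()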

Read and Write Files

def read_file(path):
    # Explicit encoding avoids platform-dependent defaults
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

def write_file(path, content):
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
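Because the model chooses the path, file tools are usually restricted to one directory. The sketch below resolves every requested path inside a sandbox folder and rejects anything that escapes it; `WORKSPACE`, the `agent_workspace` folder name, and `safe_path` are hypothetical.

```python
from pathlib import Path

# Hypothetical sandbox directory; all agent file access stays under it
WORKSPACE = Path("agent_workspace").resolve()

def safe_path(user_path):
    """Resolve user_path inside WORKSPACE; raise if it escapes the sandbox."""
    candidate = (WORKSPACE / user_path).resolve()
    # resolve() collapses ".." segments, so traversal attempts are caught here
    if not candidate.is_relative_to(WORKSPACE):
        raise PermissionError(f"Path outside workspace: {user_path}")
    return candidate
```

Passing the result of `safe_path(path)` to `open()` in the read and write tools keeps a request like `../.env` from reaching your API keys.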

Send Emails and Messages

# Example using SMTP (credentials are read from environment variables)
import os
import smtplib
from email.mime.text import MIMEText

def send_email(to, subject, body):
    sender = os.environ["SMTP_USER"]  # hypothetical variable names; set them in .env
    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = to

    with smtplib.SMTP("smtp.example.com", 587) as server:
        server.starttls()
        server.login(sender, os.environ["SMTP_PASSWORD"])  # never hard-code passwords
        server.send_message(msg)

Search the Web (DuckDuckGo Free API)

Already implemented above. No API key needed.

From Toy to Production: What Changes

Vector Memory with FAISS (Free, Local)

import faiss
import numpy as np

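# Create index
dimension = 768  # Embedding size; must match your embedding model
index = faiss.IndexFlatL2(dimension)
texts = []  # texts[i] is the text for vector i in the index

# Add embeddings (simplified)
def add_memory(embedding, text):
    # FAISS expects a float32 array of shape (n, dimension)
    index.add(np.array([embedding], dtype="float32"))
    texts.append(text)
    return True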
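IndexFlatL2 performs exact nearest-neighbour search by L2 (Euclidean) distance. To show the store-and-retrieve idea without installing anything, the sketch below is a pure-Python stand-in, not FAISS itself. `TinyVectorMemory` is a hypothetical class, and the 3-dimensional vectors are toy values; real embeddings come from an embedding model.

```python
import math

class TinyVectorMemory:
    """Pure-Python stand-in for a flat L2 index: stores (vector, text) pairs."""

    def __init__(self):
        self.vectors = []
        self.texts = []

    def add(self, embedding, text):
        self.vectors.append(list(embedding))
        self.texts.append(text)

    def search(self, query, k=1):
        """Return the k stored texts closest to `query` by Euclidean distance."""
        scored = sorted(
            (math.dist(query, vec), text)
            for vec, text in zip(self.vectors, self.texts)
        )
        return [text for _, text in scored[:k]]

memory = TinyVectorMemory()
memory.add([1.0, 0.0, 0.0], "user likes tea")
memory.add([0.0, 1.0, 0.0], "user lives in Tokyo")
closest = memory.search([0.1, 0.9, 0.0], k=1)
```

This brute-force scan is fine for a few thousand entries; a library such as FAISS becomes worthwhile when the collection grows large.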

Model Routing (Cheap Model for Simple Tasks, Smart Model for Hard Ones)

def route_to_model(task_complexity):
    if task_complexity == "simple":
        return "gemini-flash"
    else:
        return "claude-sonnet"

Logging and Debugging Your Agent

import logging

logging.basicConfig(filename="agent.log", level=logging.INFO)

def log_step(step, thought, action, observation):
    logging.info(f"Step {step}: {thought} | Action: {action} | Observation: {observation}")

Deploy as a Discord Bot or Web App

Discord bot: Use discord.py library
Web app: Use Flask or FastAPI

Free vs Paid: What You Can Build for $0

Fully Free Stack (Gemini Flash + DuckDuckGo + FAISS)

Component	Free Option
LLM	Gemini 3.1 Flash (free tier)
Search	DuckDuckGo API (free)
Memory	FAISS (local, free)
Hosting	Replit (free tier) or local

When to Upgrade to Paid Models

When you need > 1M context
For complex multi-step reasoning
When free tier limits are exceeded

Cost Comparison Table (June 2026)

Model	Input Cost (per 1M tokens)	Output Cost (per 1M tokens)
Gemini 3.1 Flash	$0.25	$0.50
DeepSeek V3.2	$0.27	$0.35
Claude Sonnet 4.6	$3.00	$15.00
GPT-5.4	$5.00	$15.00
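To turn the per-token prices into a budget, multiply each rate by the token count and divide by one million. The sketch below copies the prices from the table above; `estimate_cost` and the example workload of 200,000 input and 50,000 output tokens are hypothetical.

```python
# Prices copied from the table above, in USD per 1M tokens: (input, output)
PRICES = {
    "Gemini 3.1 Flash": (0.25, 0.50),
    "DeepSeek V3.2": (0.27, 0.35),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.4": (5.00, 15.00),
}

def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in USD for one workload."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Hypothetical workload: 200,000 input tokens and 50,000 output tokens
flash_cost = estimate_cost("Gemini 3.1 Flash", 200_000, 50_000)
sonnet_cost = estimate_cost("Claude Sonnet 4.6", 200_000, 50_000)
```

For this workload the estimate is about $0.075 on Gemini 3.1 Flash and about $1.35 on Claude Sonnet 4.6. Agents resend the growing history on every step, so input tokens usually dominate.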

Common Mistakes and How to Avoid Them

Infinite Loops (The #1 Agent Killer)

Solution: Set max_steps and repeat detection.

Hallucinated Tool Calls

Solution: Validate tool names against available tools list.
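A minimal validation step might look like the sketch below, which checks the action before anything runs and suggests the closest known name when the model misspells a tool. `validate_action` is a hypothetical helper; `difflib.get_close_matches` is from the standard library.

```python
import difflib

def validate_action(action, available_tools):
    """Return (ok, message). Rejects malformed actions and unknown tool names."""
    if not isinstance(action, dict) or "tool" not in action:
        return False, "Action must be a dict with a 'tool' key."
    name = action["tool"]
    if name in available_tools:
        return True, "ok"
    # Offer the closest known tool name, if any is similar enough
    suggestion = difflib.get_close_matches(name, list(available_tools), n=1)
    hint = f" Did you mean '{suggestion[0]}'?" if suggestion else ""
    return False, f"Unknown tool '{name}'.{hint}"
```

Feeding the error message back to the model as the observation often lets it correct the call on the next step instead of ending the run.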

Vague Objectives That Waste Tokens

Solution: Be specific in prompts: "Search for weather in Tokyo, Japan" not "Search for weather".

Security: Never Expose eval() to an LLM

Solution: Use safe parsing functions, never eval() on LLM output.
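For a calculator tool, one safe alternative to eval() is to parse the expression with the standard `ast` module and evaluate only an allowlist of node types. The sketch below supports the four basic operators and unary minus; `safe_calc` is a hypothetical name, and exponentiation is left out on purpose because huge powers can hang the process.

```python
import ast
import operator

# Allowlist of operators the calculator will evaluate
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def _eval(node):
    """Recursively evaluate numeric constants and allowlisted operators only."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.operand))
    raise ValueError("Unsupported expression")

def safe_calc(expression):
    """Evaluate simple arithmetic; raise ValueError for anything else."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("Not a valid expression") from exc
    return _eval(tree.body)
```

Function calls, attribute access, and names all fall through to the ValueError branch, so input such as `__import__('os').system('ls')` is rejected rather than executed. Division by zero still raises ZeroDivisionError as usual.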

When Should You Use a Framework?

LangGraph for Complex Workflows

Use when:
- You have multi-step workflows
- You need state management
- You are building production agents

CrewAI for Multi-Agent Teams

Use when:
- Multiple agents need to collaborate
- You want pre-built agent templates

The Decision Checklist

Can you build it with < 200 lines of Python?

Do you need production monitoring?

Are you building a one-off script or a product?

Do you need multi-agent collaboration?

If you answered "yes" to the questions about production monitoring, building a product, or multi-agent collaboration, consider a framework.

Frequently Asked Questions

1. Can I build an AI agent from scratch in Python without a framework?