The 3 Levels of AI Chatbots
Not all chatbots are created equal. Understanding the three levels helps you choose the right approach:
Level 1 — Simple Q&A Bot: Takes user input, sends it to an LLM with a system prompt, returns the response. Good for general knowledge tasks. Build time: 1 hour.
Level 2 — RAG-Powered Bot: Retrieves relevant documents from your knowledge base before answering. Uses embeddings and vector search to find the right context. Answers are grounded in your actual data. Build time: 1 day.
Level 3 — Agentic Bot: Can take actions beyond just answering questions. Books appointments, processes refunds, updates records, escalates to humans. Uses tool calling to interact with external systems. Build time: 1-2 weeks.
Most businesses need Level 2 for internal knowledge bots and Level 3 for customer-facing applications.
Building a Level 1 Chatbot
Stack: Next.js + Vercel AI SDK + Claude/GPT-4
The simplest production chatbot needs three things:
1. A system prompt that defines the bot's personality, knowledge boundaries, and response format
2. A streaming API route that sends user messages to the LLM and streams responses back
3. A chat UI that handles message history, loading states, and error handling
Key decisions:
- Model choice: Claude for nuanced conversations, GPT-4 for broad knowledge, open-source for privacy
- Context window management: Summarize or truncate old messages to stay within token limits
- Guardrails: Add content filtering, output validation, and fallback responses
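The context-window point can be sketched as a small truncation helper that keeps the system prompt plus as many recent messages as fit a token budget. The 4-characters-per-token estimate is a rough heuristic, not a real tokenizer; production code would use the model's actual tokenizer.

```typescript
// Keep the system prompt, then fill the remaining budget with the
// newest messages. Rough estimate: ~4 characters per token.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function truncateHistory(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let budget = maxTokens - system.reduce((n, m) => n + estimateTokens(m.content), 0);

  const kept: ChatMessage[] = [];
  // Walk backwards so the newest messages survive truncation.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

Summarizing old messages instead of dropping them preserves more context, at the cost of an extra LLM call.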
Pro tip: Start with a clear system prompt. The quality of your chatbot is 80% determined by how well you define its role, boundaries, and response style.
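To make the tip concrete, here is a hypothetical system prompt for an e-commerce support bot, showing the three elements mentioned above: role, boundaries, and response style.

```typescript
// A made-up example prompt; adapt the role, boundaries, and style
// sections to your own product and policies.
const SYSTEM_PROMPT = `
You are a support assistant for an online store.

Role: answer questions about orders, shipping, and returns.
Boundaries: do not give legal, medical, or financial advice. If you
do not know an answer, say so and offer to escalate to a human agent.
Response style: friendly, concise (under 120 words), plain language.
`.trim();
```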
Adding RAG for Custom Knowledge
RAG (Retrieval-Augmented Generation) grounds your chatbot in real data:
1. Chunk your documents: Split PDFs, docs, and web pages into overlapping chunks (500-1000 tokens each)
2. Generate embeddings: Convert each chunk into a vector using an embedding model
3. Store in a vector database: Use Pinecone, Weaviate, Supabase pgvector, or ChromaDB
4. Query at runtime: When a user asks a question, embed the query, find similar chunks, inject them as context
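Step 1 can be sketched as a simple overlapping splitter. For brevity this version splits on characters (roughly 4 characters per token); a production chunker would split on sentence or paragraph boundaries with a real tokenizer.

```typescript
// Split text into chunks of `chunkSize` characters, each sharing
// `overlap` characters with the previous chunk so context is not
// cut off at chunk boundaries.
function chunkText(text: string, chunkSize = 2000, overlap = 200): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap; // step forward, keeping `overlap` chars of context
  }
  return chunks;
}
```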
The embedding pipeline:
- Documents → Chunks → Embeddings → Vector DB
- User query → Query embedding → Similarity search → Top-K chunks → LLM context
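The retrieval half of the pipeline reduces to scoring each stored vector against the query vector and taking the top-K. A vector database does this with an approximate index; the brute-force sketch below is fine for small collections and shows what is actually being computed.

```typescript
// Cosine-similarity retrieval: score every stored chunk against the
// query embedding and return the K most similar.
type Embedded = { text: string; vector: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(query: number[], chunks: Embedded[], k = 3): Embedded[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.vector) - cosine(query, x.vector))
    .slice(0, k);
}
```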
Common pitfalls:
- Chunks too large = irrelevant context, higher costs
- Chunks too small = missing important context
- No overlap = breaking context at chunk boundaries
- Not filtering by metadata = returning results from wrong categories
With AI coding tools, you can build a complete RAG pipeline in 2-3 hours instead of 2-3 days.
Making It Agentic with Tool Calling
Tool calling transforms your chatbot from answerer to doer:
Define tools as functions the LLM can call:
- `search_orders(customer_id)` — Look up order status
- `process_refund(order_id, reason)` — Issue a refund
- `schedule_callback(phone, preferred_time)` — Book a support callback
- `escalate_to_human(conversation_id, summary)` — Hand off to a human agent
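Two of the tools above might be declared like this, using the JSON Schema style that most tool-calling APIs accept. Exact field names vary slightly by provider, so check your SDK's documentation before copying this shape.

```typescript
// Hypothetical tool declarations: a name, a description the model reads
// to decide when to call the tool, and a JSON Schema for the arguments.
const tools = [
  {
    name: "search_orders",
    description: "Look up the status of a customer's orders",
    parameters: {
      type: "object",
      properties: { customer_id: { type: "string" } },
      required: ["customer_id"],
    },
  },
  {
    name: "process_refund",
    description: "Issue a refund for an eligible order",
    parameters: {
      type: "object",
      properties: {
        order_id: { type: "string" },
        reason: { type: "string" },
      },
      required: ["order_id", "reason"],
    },
  },
];
```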
The flow:
1. User says "I want a refund for order #1234"
2. LLM decides to call `search_orders("1234")`
3. Your code executes the function, returns order data
4. LLM sees the order is eligible, calls `process_refund("1234", "customer request")`
5. Your code processes the refund, returns confirmation
6. LLM tells the user "Your refund of $49.99 has been processed"
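Steps 3 and 5 are your side of the loop: the model returns a tool name plus JSON arguments, and your code routes the call to a handler and sends the result back as a tool message. The handlers below are hypothetical stubs standing in for real services.

```typescript
// Dispatch a model-requested tool call to the matching handler.
type ToolCall = { name: string; args: Record<string, unknown> };

const handlers: Record<string, (args: Record<string, unknown>) => string> = {
  search_orders: () => JSON.stringify({ status: "delivered", refund_eligible: true }),
  process_refund: (args) => JSON.stringify({ refunded: true, order_id: args.order_id }),
};

function executeToolCall(call: ToolCall): string {
  const handler = handlers[call.name];
  // Never trust model output blindly: reject unknown tool names.
  if (!handler) throw new Error(`unknown tool: ${call.name}`);
  return handler(call.args); // this string goes back to the LLM as the tool result
}
```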
Security considerations:
- Validate all tool inputs server-side
- Rate-limit expensive operations
- Require confirmation for destructive actions
- Log all tool calls for audit trails
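The first and third points might look like this for the refund tool: validate the arguments on your server regardless of what the model sent, and gate large refunds behind explicit confirmation. The $200 threshold is a made-up policy for illustration.

```typescript
// Server-side guard for the refund tool. The model's arguments are
// untrusted input, just like a form submission.
type RefundRequest = { orderId: string; amount: number; confirmed: boolean };

function validateRefund(req: RefundRequest): { ok: boolean; error?: string } {
  if (!/^\d+$/.test(req.orderId)) return { ok: false, error: "invalid order id" };
  if (req.amount <= 0) return { ok: false, error: "amount must be positive" };
  // Hypothetical policy: refunds over $200 need a human in the loop.
  if (req.amount > 200 && !req.confirmed)
    return { ok: false, error: "large refunds require human confirmation" };
  return { ok: true };
}
```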
CodeLeap's Developer Track covers all three levels in Weeks 5-7, with production deployment on Vercel in Week 8.