I’ve spent the last month interviewing candidates for a senior full-stack role at Thea Tech Solutions. Every CV comes with ‘AI Integration’ highlighted in bold. But when I ask them to explain how they’d implement an AI feature in a Next.js application, 90% of the time, I get the same answer: ‘We’ll build an agent that uses LangChain to...’
That is usually the moment I tune out.
We are in the middle of a hype cycle where everyone wants to build autonomous agents—software that ‘reasons,’ plans, and acts. It sounds cool in a demo. But in production? It’s a nightmare. If you are building customer-facing software, you shouldn't be looking at agents. You should be looking at tools.
The Hallucination Problem is a Feature, Not a Bug
Let’s look at the hard reality. An LLM is a probabilistic engine. It predicts the next token. When you wrap that in an ‘agent’ loop where you ask it to decide what to do next, you are multiplying your error surface.
I recently had to debug a client’s system where an agent was supposed to fetch data from Supabase and format a report. It worked 80% of the time. The other 20%? It decided to invent a database schema that didn't exist and hallucinated the results. That is not a bug you can easily patch; that is the nature of the model.
In a production environment, especially when dealing with financial or user data, 80% reliability is zero reliability. If I have to put a human in the loop to check the agent's work, I haven't automated anything. I’ve just moved the typing from a data entry clerk to a prompt engineer.
The Case for Deterministic Tools
This brings me to what I actually build: Tools. A tool is a deterministic function wrapped in an LLM interface. It doesn't ‘decide’ to call the weather API because it feels like it. It calls the weather API because the user explicitly asked for the weather, or because the system architecture triggered a specific intent.
I prefer the ‘Controller’ pattern over the ‘Agent’ pattern.
In this architecture, the LLM is nothing more than a translation layer. It translates natural language into a structured JSON object that your TypeScript code can understand. Once you have that JSON, your code takes over completely. No loops, no autonomous wandering, no surprises.
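To make that concrete, here is a minimal sketch of the Controller pattern. The names (`Intent`, `handleIntent`) are my own illustration, not from any library: the model's only job is to emit one of a few JSON shapes, and plain TypeScript does everything after that.

```typescript
// The model is only allowed to return one of these shapes -- nothing else.
type Intent =
  | { kind: 'get_weather'; city: string }
  | { kind: 'get_balance'; userId: string }
  | { kind: 'unknown' };

// Deterministic dispatch: the LLM chose the intent, the code decides what happens.
function handleIntent(intent: Intent): string {
  switch (intent.kind) {
    case 'get_weather':
      return `Fetching weather for ${intent.city}`; // would call a weather API here
    case 'get_balance':
      return `Looking up balance for user ${intent.userId}`; // would query the DB here
    case 'unknown':
      return 'Sorry, I can only help with weather and balances.';
  }
}
```

Once the parsing step has produced an `Intent`, there is nothing probabilistic left in the request path, which is exactly the point.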
How I Build This (The Stack)
At Thea Tech Solutions, our stack is React Native for mobile, Next.js for web, and Supabase for the backend. Here is how we integrate AI without the chaos.
1. Function Calling is King
I don't use raw text prompts for logic. I use OpenAI's function calling (or a compatible API) to force the model into a schema. If you haven't tried this, you are missing out. You define a schema (using Zod or similar) alongside a TypeScript interface, and the LLM returns arguments that match that interface.
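The key step is that you never trust the model's raw output. Here is a sketch of that validation boundary with a hand-rolled type guard; the `GetBalanceArgs` shape is a hypothetical example, and in production I would reach for Zod's `safeParse` rather than write guards by hand.

```typescript
// Hypothetical arguments the model might return for a getBalance call.
interface GetBalanceArgs {
  userId: string;
}

// A hand-rolled type guard; Zod's safeParse does this (and more) in production.
function isGetBalanceArgs(value: unknown): value is GetBalanceArgs {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as Record<string, unknown>).userId === 'string'
  );
}

// The raw model output is just a JSON string -- validate before trusting it.
function parseArgs(raw: string): GetBalanceArgs | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isGetBalanceArgs(parsed) ? parsed : null;
  } catch {
    return null; // malformed JSON is rejected, never executed
  }
}
```

Anything that fails the guard is rejected before it ever touches your database.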
2. The Router Pattern
Instead of a generic agent, I write a router in Next.js.
// This is a simplified example of a Next.js API route
import { z } from 'zod';
import { openai } from '@ai-sdk/openai';
import { generateText, tool } from 'ai';
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

// Define the tools the AI is allowed to use
const tools = {
  getBalance: tool({
    description: 'Get the current balance for a user',
    parameters: z.object({ userId: z.string() }),
    execute: async ({ userId }) => {
      // Direct call to Supabase - deterministic and safe
      const { data, error } = await supabase
        .from('accounts')
        .select('balance')
        .eq('id', userId)
        .single();
      if (error) throw error;
      return data;
    },
  }),
  transferFunds: tool({
    description: 'Transfer money between accounts',
    parameters: z.object({
      fromId: z.string(),
      toId: z.string(),
      amount: z.number(),
    }),
    execute: async ({ fromId, toId, amount }) => {
      // Business logic validation happens HERE, not in the prompt
      if (amount <= 0) throw new Error('Invalid amount');
      // Perform transaction...
      return { status: 'success', txId: '12345' };
    },
  }),
};

export async function POST(req: Request) {
  const { messages } = await req.json();
  // The AI only decides WHICH tool to run, not HOW to run it
  const response = await generateText({
    model: openai('gpt-4o'),
    messages,
    tools,
  });
  return Response.json(response);
}
Notice the difference here. The LLM cannot decide to transfer $1M just because it had a bad day. It can only output the parameters. The execution happens in my controlled TypeScript environment, where I have logging, error handling, and database constraints.
3. Context Injection via RAG (But Keep It Simple)
I see so many teams over-engineering RAG (Retrieval-Augmented Generation). They set up massive vector databases like Pinecone when a simple Postgres text search would suffice.
For most SaaS apps, your context is already in your database. I use Supabase’s pgvector extension. It’s fast enough for 99% of use cases and it keeps the architecture simple. I don't want to manage two databases if I don't have to.
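Under the hood there is nothing magic about vector search: pgvector's cosine distance operator (`<=>`) is just ranking rows by how closely their embeddings point in the same direction as the query embedding. Here is a toy in-memory version of that ranking step, purely to demystify it; it is not Supabase's actual API, which you would call via a SQL function or RPC.

```typescript
// Cosine similarity: the quantity pgvector's cosine distance operator ranks by.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy retrieval: return the ids of the k chunks most similar to the query embedding.
function topK(
  query: number[],
  docs: { id: string; embedding: number[] }[],
  k: number
): string[] {
  return [...docs]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k)
    .map((d) => d.id);
}
```

In Postgres this whole function collapses into one `ORDER BY embedding <=> query LIMIT k` clause, which is why a second, dedicated vector database is usually overkill.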
Cost and Latency
There is a practical side to this too. Agents are expensive. When you let an LLM run in a loop, thinking and reflecting, your token count explodes. You are paying for the model to sit there and ‘think’ about things it shouldn't be thinking about.
By forcing the LLM to act as an intent classifier and tool selector, you get a single API call. It’s fast. It’s cheap. And crucially, it’s cacheable.
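Cacheability falls out of the design: if the classifier is a pure function of the normalized query text, you can memoize it and skip the API call entirely on repeats. A minimal sketch, where `Classifier` and `withCache` are my own illustrative names and the inner function stands in for the real LLM call:

```typescript
// Hypothetical classifier type: in practice this wraps a single LLM API call.
type Classifier = (query: string) => Promise<string>;

// Memoize classifications keyed on the normalized query text.
function withCache(classify: Classifier): Classifier {
  const cache = new Map<string, string>();
  return async (query: string) => {
    const key = query.trim().toLowerCase();
    const hit = cache.get(key);
    if (hit !== undefined) return hit; // cache hit: zero tokens spent
    const intent = await classify(key);
    cache.set(key, intent);
    return intent;
  };
}
```

Try doing that with an agent loop, where every run takes a different path.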
The Verdict
If you are building a toy project or a research demo, go ahead and build an autonomous agent. It’s fun. But if you are building a business, stop trying to build a robot brain. Build a machine with very specific levers.
The LLM is the user interface. It is the new CLI. But your backend? That should remain boring, predictable, and solid.
Focus on building tools that do one thing well. Use the LLM to route the user to those tools. That is how you ship AI features that don't fall apart the moment a user tries something unexpected.