AI Knowledge — 02
Agent Architecture: How It Actually Works
An AI Agent is not just an LLM with a chat interface. It is a system of components: a backend that orchestrates flow, a RAG layer that retrieves context, tools that take real actions, and an LLM that decides what to do next. Here is how they connect.
Full Architecture
User
UI
Frontend
Backend / Server — Flow Orchestrator
RAG
Context retrieval
Tool Executor
Runs actual tools
Flows
Purpose-specific orchestration logic
LLM API
Decides: answer or use tool
Each Component's Role
Backend — the orchestrator
This is where your application logic lives. It receives the user query, decides which flow to run, calls RAG, assembles the prompt, and manages the loop with the LLM.
The LLM does not orchestrate. The backend does. The LLM only decides what to do next within each turn.
RAG — context retrieval
Before the LLM sees the question, the backend searches a vector database for relevant documents or records. That retrieved context is added to the prompt so the LLM answers from real data, not from training memory.
RAG always runs before the LLM call, not after.
Tool Executor — actions
Tools are functions the backend can run: query a database, call an API, write a file, send a notification. The LLM receives a list of tool definitions (names and descriptions) and decides which one to call.
The LLM outputs a tool call request. The backend executes it. The LLM does not run tools directly.
LLM — the decision engine
Given the user message, retrieved context, and available tools, the LLM does one of two things: produce a final answer, or request a tool call. That is its entire job in the loop.
The Loop — what makes it an Agent
A regular chatbot calls the LLM once and returns the response. An Agent can call the LLM multiple times within a single user request until it has gathered everything it needs.
"Lot A2241 failed tensile test. What do we do?"
RAG search → retrieves traceability records + past NCRs
Decides: use MES tool to check in-process lots
Executes MES tool → returns 8 lots on the line
Decides: use Regulatory DB to check notification rules
Executes Regulatory DB tool → AS9100 §8.7, 72h notice required
Has enough data. Generates final answer.
Receives: impact summary + actions + draft documents
Steps 3–6 are the loop. The LLM decided to use two tools before it had enough information to answer. A simple chatbot would have guessed after step 2. The Agent waited until it had real data.
Flows — purpose-specific logic
Not every request should follow the same path. A Flow defines which tools are available, in what order steps run, and how results are handled — for a specific type of task.
nonconformance_flow
supply_query_flow
general_qa_flow
The backend routes each incoming request to the appropriate flow based on intent classification — often done by the LLM itself in the first step.