2.2 Agent Architecture (BT104CO)

1. The Agent Equation

In Russell and Norvig's framework, an agent's internal structure is described by its Agent Architecture: the relationship between the physical hardware and the software that controls it.

Agent = Architecture + Program
  • Architecture: The physical device or computing platform (e.g., PC, robotic chassis, cloud server).
  • Program: The computer code that implements the Agent Function (mapping percepts to actions).

The Architecture makes percepts from sensors available to the Program, runs the logic, and feeds the chosen actions to the Actuators.
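This split can be sketched in a few lines of Python. The function names (`run_agent`, `agent_program`) and the thermostat percept format are illustrative, not from any library:

```python
def agent_program(percept):
    """The Program: maps a percept to an action (a simple thermostat rule)."""
    return "heater_on" if percept["temp"] < 20 else "heater_off"

def run_agent(program, percepts):
    """The Architecture: delivers sensor percepts to the program
    and collects the chosen actions for the actuators."""
    actions = []
    for percept in percepts:              # sensors -> program
        actions.append(program(percept))  # program -> actuators
    return actions

print(run_agent(agent_program, [{"temp": 18}, {"temp": 22}]))
# ['heater_on', 'heater_off']
```

Note that the same architecture (`run_agent`) can host any program, which is exactly why the equation separates the two.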

2. Basic Architecture Diagram

The architecture serves as the "bridge" between the software logic and the physical world.

```mermaid
graph TD
  subgraph Agent ["The Agent"]
    direction TB
    Program["Agent Program \n Logic/Software"]
    Arch["Architecture \n Hardware/PC/Robot"]
  end
  Env["Environment"] -- "Percepts" --> Sensors
  Sensors -- "Data" --> Arch
  Arch -- "Input" --> Program
  Program -- "Decision" --> Arch
  Arch -- "Commands" --> Actuators
  Actuators -- "Actions" --> Env
  style Program fill:#c8e6c9,stroke:#388e3c
  style Arch fill:#bbdefb,stroke:#1976d2
  style Env fill:#fff9c4,stroke:#fbc02d
```

3. Four Basic Types of Agent Architectures

Agent programs are categorized by how they turn percepts into actions. Moving down this list, the agent's "internal mind" becomes progressively more complex.

A. Simple Reflex Agent

Acts based only on the current percept, using pre-defined condition-action rules.

  • Logic: "If [Condition], then [Action]."
  • Best For: Fully observable environments.
  • Examples:
    • Medical System: If "high fever", prescribe paracetamol.
    • Thermostat: If "temp < 20°C", turn on heater.
    • Automated Camera: If "lighting is dark", activate flash.
```mermaid
graph LR
  subgraph Env ["Environment"]
    World["Current State"]
  end
  subgraph Agent ["Simple Reflex Agent"]
    direction TB
    Sensors --> Rules["Condition-Action Rules"]
    Rules --> Actuators
  end
  World -- "Percepts" --> Sensors
  Actuators -- "Actions" --> World
  style Rules fill:#f9f9f9,stroke:#333
  style Sensors fill:#fff,stroke:#333
  style Actuators fill:#fff,stroke:#333
  style Env fill:#eeeeee,stroke:#999
```
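A simple reflex agent is essentially a fixed rule table applied to the current percept only. A minimal sketch, using the three examples above (the percept dictionary keys are made up for illustration):

```python
# Condition-action rules: each entry is (condition on the current percept, action).
RULES = [
    (lambda p: p.get("symptom") == "high fever", "prescribe paracetamol"),
    (lambda p: p.get("temp", 25) < 20,           "turn on heater"),
    (lambda p: p.get("lighting") == "dark",      "activate flash"),
]

def simple_reflex_agent(percept):
    """No memory: only the current percept is consulted."""
    for condition, action in RULES:
        if condition(percept):
            return action
    return "do nothing"

print(simple_reflex_agent({"symptom": "high fever"}))  # prescribe paracetamol
print(simple_reflex_agent({"lighting": "dark"}))       # activate flash
```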

B. Model-Based Reflex Agent

Maintains an internal state that depends on the percept history to handle partially observable environments.

  • Logic: "What is the world like now? (Even the parts I can't see)."
  • Best For: Environments requiring "memory" of the past.
  • Examples:
    • Self-Driving Car: Remembers a cyclist in the blind spot from 2 seconds ago.
    • Dishwasher: Knows it's in the "rinse cycle" even if current sensor readings match the "wash cycle".
```mermaid
graph LR
  subgraph Env ["Environment"]
    World["Current State"]
  end
  subgraph Agent ["Model-Based Reflex Agent"]
    direction TB
    Sensors --> CurState["What the world \n is like now"]
    Evolution["How the world \n evolves"] --> CurState
    ActionsDo["What my \n actions do"] --> CurState
    CurState --> Rules["Condition-Action Rules"]
    Rules --> Actuators
  end
  World -- "Percepts" --> Sensors
  Actuators -- "Actions" --> World
  style CurState fill:#f9f9f9,stroke:#333
  style Evolution fill:#fff,stroke:#333,stroke-dasharray: 5 5
  style ActionsDo fill:#fff,stroke:#333,stroke-dasharray: 5 5
```
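The key addition over the simple reflex agent is an internal state updated from the percept history. A toy sketch of the blind-spot example (the percept strings and class are invented for illustration; "how the world evolves" is folded into one simple update rule):

```python
class ModelBasedAgent:
    def __init__(self):
        self.cyclist_nearby = False  # internal model of the unobserved world

    def update_state(self, percept):
        """Revise the model from the new percept (the 'memory' of the agent)."""
        if percept == "cyclist_visible":
            self.cyclist_nearby = True
        elif percept == "cyclist_passed":
            self.cyclist_nearby = False

    def act(self, percept):
        self.update_state(percept)
        if self.cyclist_nearby:
            return "hold_lane"       # don't merge: cyclist may be in the blind spot
        return "merge_right"

agent = ModelBasedAgent()
agent.act("cyclist_visible")
print(agent.act("nothing_visible"))  # hold_lane — remembered from 2 seconds ago
```

A simple reflex agent given the same `"nothing_visible"` percept would merge, because it has no state to remember the cyclist.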

C. Goal-Based Agent

Uses goals to guide its actions, choosing those that lead to a desired state via search and planning.

  • Logic: "What will happen if I do Action X, and will it get me closer to my Goal?"
  • Best For: Complex tasks where the right action depends on the destination.
  • Examples:
    • GPS/Maps: Evaluates thousands of turns to find the sequence ending at the goal.
    • Robotic Arm: Plans joint trajectories to reach a specific coordinate ("place bolt").
```mermaid
graph LR
  subgraph Env ["Environment"]
    World["Current State"]
  end
  subgraph Agent ["Goal-Based Agent"]
    direction TB
    Sensors --> CurState["What the world \n is like now"]
    CurState --> Predict["What it will be like \n if I do action A"]
    Goals["Goals"] --> Predict
    Predict --> Choice["What action I \n should do now"]
    Choice --> Actuators
  end
  World -- "Percepts" --> Sensors
  Actuators -- "Actions" --> World
  style Goals fill:#e3f2fd,stroke:#1976d2
  style Choice fill:#f9f9f9,stroke:#333
```
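"Search and planning" can be made concrete with a toy GPS example: instead of reacting to the current percept, the agent searches for a whole action sequence ending at the goal. A minimal sketch using breadth-first search on an invented road map:

```python
from collections import deque

ROADS = {  # toy road map: city -> directly reachable cities
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["E"],
    "E": [],
}

def plan_route(start, goal):
    """Breadth-first search: returns a shortest route to the goal, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in ROADS[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

print(plan_route("A", "E"))  # ['A', 'C', 'E']
```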

D. Utility-Based Agent

Chooses actions to maximize a utility function ("happiness" or "success") when there are multiple paths or trade-offs.

  • Logic: "How happy will I be in this state? Is this path better?"
  • Best For: Environments with conflicting requirements (e.g., speed vs. safety).
  • Examples:
    • Automated Taxi: Chooses the fastest, cheapest, and most comfortable route among many.
    • Trading Bot: Balances potential gain against risk of loss.
```mermaid
graph LR
  subgraph Env ["Environment"]
    World["Current State"]
  end
  subgraph Agent ["Utility-Based Agent"]
    direction TB
    Sensors --> CurState["What the world \n is like now"]
    CurState --> Predict["What it will be like \n if I do action A"]
    Predict --> Utility["How happy will \n I be in such a state"]
    Utility --> Choice["What action I \n should do now"]
    Choice --> Actuators
  end
  World -- "Percepts" --> Sensors
  Actuators -- "Actions" --> World
  style Utility fill:#f3e5f5,stroke:#4a148c
```
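Where a goal-based agent only asks "does this route reach the goal?", a utility-based agent scores every route that does and picks the best trade-off. A sketch of the taxi example, with entirely made-up route data and weights:

```python
ROUTES = {  # all three reach the goal; they differ in the trade-offs
    "highway":  {"minutes": 30, "cost": 5.0, "comfort": 0.9},
    "downtown": {"minutes": 45, "cost": 0.0, "comfort": 0.4},
    "scenic":   {"minutes": 60, "cost": 0.0, "comfort": 1.0},
}

def utility(route):
    """Higher is better: reward comfort, penalise time and cost.
    The weights encode this agent's preferences."""
    return 10 * route["comfort"] - 0.1 * route["minutes"] - route["cost"]

best = max(ROUTES, key=lambda name: utility(ROUTES[name]))
print(best)  # scenic — under these weights, comfort dominates
```

Changing the weights (say, penalising minutes more heavily) changes the chosen route without touching the goal: that is the extra degree of freedom utility adds.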

4. The Learning Agent Architecture

Any of the above architectures can be turned into a Learning Agent, allowing the AI to improve over time.

```mermaid
graph LR
  subgraph Env ["Environment"]
    World["Current State"]
  end
  subgraph Agent ["Learning Agent"]
    direction TB
    Sensors --> Critic["Critic"]
    Critic --> Learning["Learning \n Element"]
    Learning --> Performance["Performance \n Element"]
    Performance --> Actuators
    Standard["Performance Standard"] --> Critic
    Learning --> Changes["Knowledge/Modifications"]
    Changes --> Performance
    Learning --> Generator["Problem \n Generator"]
    Generator --> Performance
  end
  World -- "Percepts" --> Sensors
  Actuators -- "Actions" --> World
  style Learning fill:#e8f5e9,stroke:#2e7d32
  style Critic fill:#fff3e0,stroke:#ef6c00
  style Generator fill:#f3e5f5,stroke:#7b1fa2
  style Standard fill:#f5f5f5,stroke:#9e9e9e,stroke-dasharray: 5 5
```
1. Performance Element: This is the "old" agent (Reflex, Goal, etc.) that decides on actions.
2. Learning Element: Responsible for making improvements based on feedback.
3. Critic: Observes the world and evaluates the agent based on a fixed Performance Standard.
4. Problem Generator: Suggests exploratory actions that will lead to new, informative experiences.
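The four components can be sketched in a few lines. This is a deliberately tiny, invented example (route names, feedback scores, and the learning-rate update are all illustrative): the performance element picks routes from learned utilities, the critic scores the outcome against a fixed standard, and the learning element nudges the estimates toward that feedback.

```python
utilities = {"highway": 0.5, "scenic": 0.5}   # the performance element's knowledge
ALPHA = 0.5                                   # learning rate

def choose_route():
    """Performance element: exploit current knowledge."""
    return max(utilities, key=utilities.get)

def critic(route):
    """Critic: score the trip against a fixed performance standard.
    (Here: the scenic road turns out to be bumpy.)"""
    return 0.1 if route == "scenic" else 0.9

def learn(route, feedback):
    """Learning element: move the estimate toward the critic's score."""
    utilities[route] += ALPHA * (feedback - utilities[route])

# Problem generator: deliberately try every route once to gather
# new, informative experience before exploiting.
for route in list(utilities):
    learn(route, critic(route))

print(choose_route())  # highway — scenic was found bumpy and downgraded
```

Without the problem-generator loop, the agent would keep exploiting its initial guess and might never discover that the scenic route is bumpy.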

5. Summary Comparison

| Agent Type | Main Feature | Core Question | Analogy |
| --- | --- | --- | --- |
| Simple Reflex | Condition-Action Rules | What do I see now? | Light switch |
| Model-Based | Internal State | What is the hidden state? | Driver in fog |
| Goal-Based | Future Planning | Will this get me there? | Chess player |
| Utility-Based | Preferences | How good is this path? | Travel agent |

Note on Learning Agents

Any of these four can be a Learning Agent. A "Learning Utility-Based Agent" starts with a basic utility function and learns to prefer different paths based on feedback (e.g., discovering that a road is bumpy and lowering its comfort utility).

Practice Quiz