The Idea
Most AI interactions are stateless — the model doesn't remember last Tuesday, doesn't have opinions formed from experience, and can't reflect on what it's learned. Generative Agents changes this by giving AI characters a comprehensive memory of all their experiences, the ability to synthesize higher-level insights from those memories, and multi-scale plans that guide daily behavior.
Originally demonstrated in "Smallville" (25 AI agents living in a virtual town), these agents formed friendships, organized parties, discussed politics, and navigated social dynamics — all emergent from the architecture. When evaluated, their behavior was rated as more believable than human-authored character scripts.
Component Patterns
This system weaves Level 2 compositions into a memory-driven architecture:
Component patterns: RAG, Reflexion, Plan-and-Execute, Least-to-Most, ReAct, and Meta-Prompting. RAG provides the memory retrieval backbone, Reflexion enables periodic self-reflection, Plan-and-Execute handles multi-scale planning, and Meta-Prompting drives persona-based conversation.
The Architecture
Memory Stream
A comprehensive, append-only record of everything the agent experiences. Observations, reflections, and plans are all stored as timestamped, importance-scored, embedded entries.
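A minimal sketch of such an entry and its append-only stream, in Python. The field and class names are illustrative, not taken from the paper's codebase:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str        # natural-language description of the event
    kind: str        # "observation", "reflection", or "plan"
    importance: int  # 1-10, scored when the memory is created
    embedding: list  # vector used later for relevance lookup
    created_at: float = field(default_factory=time.time)
    last_accessed: float = field(default_factory=time.time)

class MemoryStream:
    """Append-only record: entries are added, never edited or deleted."""
    def __init__(self):
        self.entries = []

    def add(self, memory: Memory):
        self.entries.append(memory)
```

All three kinds of memory share this one structure, which is what lets retrieval treat observations, reflections, and plans uniformly.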
Retrieval
When the agent needs to act, it retrieves the most relevant memories using a three-factor score: how recent is it? How important was it? How relevant is it to right now?
Reflection
When accumulated importance exceeds a threshold, the agent pauses to reflect — generating higher-level insights from raw observations. "John values his friendship with Maria" emerges from many small interactions.
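The trigger-and-synthesize loop can be sketched as follows. `ask_llm` stands in for a call to a language model, the prompts are illustrative, and the fixed importance of 8 is a simplification (the real system scores each insight):

```python
def maybe_reflect(recent_memories, ask_llm, threshold=150):
    """Reflect only when accumulated importance crosses a threshold."""
    if sum(m["importance"] for m in recent_memories) < threshold:
        return []  # not enough has happened yet
    # Step 1: ask what high-level questions the recent memories raise.
    questions = ask_llm(
        "What high-level questions do these memories raise?\n"
        + "\n".join(m["text"] for m in recent_memories)).splitlines()
    # Step 2: answer each question; the answers become new "reflection"
    # memories, appended to the same stream as ordinary observations.
    return [{"text": ask_llm(q), "kind": "reflection", "importance": 8}
            for q in questions]
```

Because reflections re-enter the stream, later reflections can build on earlier ones, forming trees of increasingly abstract insight.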
Planning
Daily plans are decomposed into hourly blocks, then into 5–15 minute actions. When unexpected events occur, the agent replans reactively, so adaptations feel natural rather than scripted.
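The coarse-to-fine decomposition can be sketched as three passes, each feeding the previous level of the plan back to the model. `ask_llm` again stands in for a language-model call, and the prompts are illustrative:

```python
def make_daily_plan(agent_summary, ask_llm):
    """Three-pass, coarse-to-fine planning for one day."""
    outline = ask_llm(f"{agent_summary}\nSketch a broad agenda for today.")
    hourly = ask_llm(f"Break this agenda into hourly blocks:\n{outline}")
    actions = ask_llm(f"Break each block into 5-15 minute actions:\n{hourly}")
    return actions  # the finest-grained plan is what drives behavior
```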
The Memory Stream
John Lin's Memory (Pharmacy Owner, Smallville)
Observations, reflections, and plans all live in the same stream. Importance scores (1–10) help determine what gets surfaced.
Retrieval Scoring
- Recency — recent memories score higher
- Importance — life events rank above routine
- Relevance — how related the memory is to the current situation
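The three factors can be combined into a single weighted score. A sketch follows; the weights and the per-hour decay constant are tunable, and the function names are illustrative:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieval_score(memory, query_embedding, now,
                    decay=0.995, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of recency, importance, and relevance,
    each normalized to [0, 1]."""
    hours_idle = (now - memory["last_accessed"]) / 3600
    recency = decay ** hours_idle               # exponential decay
    importance = memory["importance"] / 10      # 1-10 scale -> [0, 1]
    relevance = cosine_similarity(memory["embedding"], query_embedding)
    w_rec, w_imp, w_rel = weights
    return w_rec * recency + w_imp * importance + w_rel * relevance
```

Memories are ranked by this score and the top handful are placed into the agent's context for the next decision.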
See It in Action
11:00 AM — another agent asks John what he's doing today.
• Conversation with Maria about art project (high recency + relevance)
• Reflection: "John values his friendship with Maria" (high importance)
This response seamlessly weaves together John's plan, recent experience, and relationship values — all from memory retrieval.
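That final assembly step can be sketched as folding the retrieved memories into a persona-conditioned prompt. This is a simplification; the full system also injects current status, location, and relationship summaries, and the function name and wording here are illustrative:

```python
def build_response_prompt(persona, retrieved_memories, question):
    """Fold retrieved memories into a persona-conditioned prompt."""
    memory_block = "\n".join(f"- {m}" for m in retrieved_memories)
    return (
        f"{persona}\n"
        f"Relevant memories:\n{memory_block}\n"
        f'Someone asks: "{question}"\n'
        f"Respond in character, drawing on the memories above."
    )
```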
Why This Works
The key insight is that believable behavior emerges from the combination of memory, reflection, and planning — not from any single component. Raw observations give the agent experiences. Reflections give it values and self-awareness. Plans give it direction. Retrieval scoring ensures the right memories surface at the right time.
The reflection mechanism is especially powerful. Without it, the agent has experiences but no understanding. With it, patterns emerge: "I enjoy spending time with Maria" isn't programmed — it's synthesized from accumulated observations, just like human insight formation.
The System
Record everything. Reflect periodically. Plan across timescales. Retrieve what matters most right now. The result: AI agents that remember, grow, and behave believably over time.
When to Use This
- Building AI characters for games, simulations, or interactive storytelling
- Long-running agents that need persistent identity and coherent behavior
- Virtual assistants with personality, memory, and relationship awareness
- Social simulations where emergent behavior is the goal
When to Skip This
- Task-completion agents — too much overhead for solving a coding problem or answering a question
- Short-lived interactions — no benefit from memory if the conversation is one-and-done
- Factual QA systems — persona and memory are irrelevant to looking up facts
- Latency-sensitive applications — memory retrieval and reflection add processing time
How It Relates
Generative Agents shares its memory and reflection subsystem with the Cognitive Loop. Voyager uses a similar but code-focused memory (skill library vs. experience stream). In Multi-Agent Compositions, each agent can have its own memory stream for persistent persona.
At Level 4, World Model Agents extend the memory architecture with predictive simulation, and Hierarchical Agent Architecture coordinates multiple generative agents operating at different timescales.