NEW · Latest release: richer signals, safer startup validation, cleaner team events.

Build AI agents that survive production

lauren-ai is the AI companion to the Lauren framework: type-safe agents, multi-agent teams, DI-integrated memory, richer lifecycle telemetry, and built-in testing and typechecking, with the structure your team needs.

Documentation · $ pip install lauren-ai · View on GitHub

Python 3.11–3.14 ✓ · Pydantic v2 ✓ · mypy ✓ · Typecheck by default ✓ · Signals ✓ · Teams ✓ · Anthropic ✓ · OpenAI ✓ · Ollama ✓

25+ Features · 4+ Providers · 4 Memory Tiers · Python 3.11–3.14

from lauren_ai import agent, tool, guardrail
from lauren_ai.guardrails import GuardrailResult
from lauren_ai.memory import RedisMemory

# contains_pii() and kb are application-level helpers that are not shown here.

@guardrail()
async def no_pii(text: str) -> GuardrailResult:
    if contains_pii(text):
        return GuardrailResult.block("PII detected")
    return GuardrailResult.allow()

@tool()
async def search_docs(query: str) -> list[str]:
    """Search the knowledge base."""
    return await kb.search(query, top_k=5)

@agent(
    model="claude-haiku-4-5-20251001",
    tools=[search_docs],
    guardrails=[no_pii],
    memory=RedisMemory(),
)
class SupportAgent:
    """Customer support agent with RAG and guardrails."""

Agentic Coding Ready

Give your AI agent full lauren-ai expertise in one command.

60+ pre-loaded SKILL.md context packs cover every feature — providers, tools, RAG, guardrails, memory, teams, and observability. Works with Claude Code, Cursor, Copilot, Continue, and every major coding agent.

# Claude Code, Cursor, Copilot, Continue, Codex CLI — auto-detected
npx skills add lauren-framework/lauren-ai

Auto-detects installed agents and copies skills to their global directory

Featured skills

Building Agents

@agent, lifecycle hooks, streaming, agentic loop

Building Tools

@tool() functions & classes, ToolContext DI

LLM Providers

OpenAI, Anthropic, Ollama — one LLMConfig change

RAG Pipeline

KnowledgeBase, chunking, embedding, retrieval

Guardrails

Content filtering, scope enforcement, PII redaction

Memory

Multi-turn history, sliding window, user facts

Multi-Agent

Agent delegation, handoff tools, sub-agent spawning

Telemetry

Token usage, cost tracking, latency metrics

View all 60+ skills on GitHub

Context files: llms.txt, llms-full.txt, AGENTS.md

Why lauren-ai

Everything you need to ship real agents.

Every feature addresses a real source of bugs and outages in production AI systems. lauren-ai catches them at design time, not at 3 a.m.

Core

Decorator-based Agents

@agent() defines autonomous agentic loops. Agents are DI-injected, lifecycle-hooked, and composable with all other Lauren primitives. The runtime now fails fast when an agent is missing its model configuration.

@agent(model="claude-opus-4-6")
@remember(store=user_memory, extract=True)
@use_guardrails(input=[PromptInjectionFilter()])
@use_tools(search, summarize)
class ResearchAgent: ...

Memory

4-Tier Memory

ShortTermMemory (sliding window), ConversationStore (cross-request history), UserMemoryStore + @remember() (per-user long-term facts), InMemoryVectorStore (RAG). All swappable, all DI-native.
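To make the sliding-window idea concrete, here is a minimal, library-independent sketch. The class name SlidingWindowMemory and its methods are hypothetical and are not the ShortTermMemory API; the sketch only shows how a fixed-size window drops the oldest messages.

```python
from collections import deque


class SlidingWindowMemory:
    """Hypothetical short-term memory: keeps only the newest messages."""

    def __init__(self, max_messages: int) -> None:
        # deque(maxlen=...) silently discards the oldest item when full.
        self._messages: deque[dict] = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def history(self) -> list[dict]:
        return list(self._messages)


memory = SlidingWindowMemory(max_messages=2)
memory.add("user", "Hi")
memory.add("assistant", "Hello!")
memory.add("user", "What is my balance?")  # pushes "Hi" out of the window
```

After the third add, history() holds only the last two messages, which is the behavior a sliding window is meant to give a prompt builder.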

Tools

Tool Registry

@tool() auto-generates JSON Schema from type annotations and docstrings. Function-form or class-form with full DI support and HITL confirmation.
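The schema generation described above can be approximated with the standard library alone. This is a sketch under assumptions, not how @tool() is implemented: schema_from_function and the _JSON_TYPES mapping are hypothetical names, and only a few primitive types are handled.

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_function(fn) -> dict:
    """Build a JSON-Schema-like tool description from a function signature."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown annotations fall back to "string" in this simplified sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the argument is required
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


async def get_weather(city: str, days: int = 1) -> dict:
    """Get the weather forecast for a city."""
    return {"city": city, "days": days}


weather_schema = schema_from_function(get_weather)
```

The docstring becomes the tool description and parameters without defaults become required, which is the general contract a model needs in order to call the tool.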

Safety

Guardrails

@guardrail() intercepts agent responses before delivery. Block, modify, or audit any output — sync or async, DI-injectable, composable.
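The block / modify / allow flow can be illustrated without the library. Everything in this sketch (Verdict, redact_emails, no_passwords, apply_guardrails) is a hypothetical stand-in rather than the GuardrailResult API; it only shows how a chain of checks might compose.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    action: str       # "allow", "modify", or "block"
    text: str = ""    # text to pass on (unused when blocking)
    reason: str = ""  # why the text was blocked


EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_emails(text: str) -> Verdict:
    redacted = EMAIL.sub("[email]", text)
    return Verdict("modify" if redacted != text else "allow", redacted)


def no_passwords(text: str) -> Verdict:
    if "password" in text.lower():
        return Verdict("block", reason="mentions a password")
    return Verdict("allow", text)


def apply_guardrails(text: str, guardrails: list[Callable[[str], Verdict]]) -> Verdict:
    """Run guardrails in order; the first block wins, modifications accumulate."""
    original = text
    for check in guardrails:
        verdict = check(text)
        if verdict.action == "block":
            return verdict
        text = verdict.text
    return Verdict("modify" if text != original else "allow", text)


redacted = apply_guardrails("Reach me at ada@example.com", [redact_emails, no_passwords])
blocked_verdict = apply_guardrails("My password is hunter2", [redact_emails, no_passwords])
```

Ordering matters in a chain like this: a redacting guardrail placed first changes what later guardrails see.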

Teams

Multi-Agent Teams & Events

@team() coordinates specialists with structured handoffs and shared memory. Team events are exported cleanly for streaming UIs, orchestration traces, and typed integrations.

Providers

Provider-Agnostic

Swap Anthropic, OpenAI, Ollama, or any LiteLLM provider behind an identical Transport interface. No code changes to your agents or tools.

Eval

Evaluation Framework

AccuracyEval, AgentJudge (LLM-as-judge), TrajectoryEval, PerformanceEval. Run evals in CI with a single decorator — score, regress, and gate releases on real metrics, not vibes.

Observability

Signals, Tracing & Cost Tracking

ModelCallComplete, ToolCallComplete, AgentRunComplete signals with richer payloads like total_tokens and turns. @traced() for spans. TokenBudget for hard caps. CostTracker per-request.

Testable

Zero-network unit tests

MockTransport + AgentTestClient = fast, deterministic CI. Queue responses, assert on tool calls, snapshot trajectories. Your agents are tested, not prayed over.
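The queue-and-assert pattern behind zero-network tests looks roughly like the sketch below. QueueTransport and answer are hypothetical names used for illustration; they are not the MockTransport or AgentTestClient interfaces.

```python
import asyncio
from collections import deque


class QueueTransport:
    """Hypothetical stand-in for an LLM transport: replays queued replies."""

    def __init__(self) -> None:
        self._replies: deque[str] = deque()
        self.calls: list[str] = []  # prompts seen, kept for assertions

    def queue(self, reply: str) -> None:
        self._replies.append(reply)

    async def complete(self, prompt: str) -> str:
        self.calls.append(prompt)
        if not self._replies:
            raise AssertionError("no queued reply left for prompt: " + prompt)
        return self._replies.popleft()


async def answer(transport: QueueTransport, question: str) -> str:
    """Toy 'agent': one model call, whitespace trimmed from the reply."""
    return (await transport.complete(question)).strip()


transport = QueueTransport()
transport.queue("  Paris \n")
result = asyncio.run(answer(transport, "Capital of France?"))
```

Because replies are scripted in advance, the test is deterministic and needs no API key; the recorded calls let you assert on exactly what the agent sent.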

DI-Integrated

Lauren's full DI container is available inside agents, tools, and guardrails. Singleton/Request/Transient scopes, module boundaries, and lifecycle hooks all apply.

Streaming Support

Token-by-token streaming through EventStream or SSE. Every turn, tool result, and agent handoff streams natively.

Structured Output

Return Pydantic models directly from agents. Type-safe extraction with validation — no manual JSON parsing.

Knowledge Base & RAG

KnowledgeBase with document loaders, hybrid BM25+vector retrieval. kb.as_tool() turns any KB into an agent tool instantly.

Semantic Router

Route messages to specialized agents based on semantic similarity. Embeddings-powered with configurable confidence thresholds.
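A minimal sketch of similarity-based routing with a confidence threshold follows. The function names, the 0.75 threshold, and the toy two-dimensional vectors are illustrative assumptions; a real router would use vectors from an embedding model.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query_vec: list[float], routes: dict[str, list[float]], threshold: float = 0.75):
    """Return the best-matching route name, or None below the threshold."""
    name, score = max(
        ((n, cosine(query_vec, v)) for n, v in routes.items()),
        key=lambda pair: pair[1],
    )
    return name if score >= threshold else None


# Toy "embeddings" standing in for real model output.
routes = {"billing": [1.0, 0.0], "support": [0.0, 1.0]}
billing_hit = route([0.9, 0.1], routes)  # close to the billing vector
no_match = route([0.7, 0.7], routes)     # equally far from both routes
```

Returning None below the threshold gives the caller a clear place to fall back to a default agent instead of guessing.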

Workflow Engine

Step, Parallel, Condition, Loop primitives for deterministic multi-agent pipelines. Full type safety through the chain.
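The composition idea behind step, parallel, and condition primitives can be sketched with plain asyncio. The helpers below (run_steps, parallel, condition) are hypothetical and do not mirror the library's Step, Parallel, or Condition classes.

```python
import asyncio


async def run_steps(value, *steps):
    """Sequential pipeline: each step receives the previous step's output."""
    for step in steps:
        value = await step(value)
    return value


def parallel(*steps):
    """Fan the same input out to several steps and collect results in order."""
    async def run(value):
        return list(await asyncio.gather(*(s(value) for s in steps)))
    return run


def condition(predicate, if_true, if_false):
    """Pick one of two steps based on the current value."""
    async def run(value):
        branch = if_true if predicate(value) else if_false
        return await branch(value)
    return run


async def double(x): return x * 2
async def increment(x): return x + 1
async def total(xs): return sum(xs)
async def label_big(x): return f"big:{x}"
async def label_small(x): return f"small:{x}"


pipeline_result = asyncio.run(
    run_steps(
        3,
        parallel(double, increment),  # 3 -> [6, 4]
        total,                        # [6, 4] -> 10
        condition(lambda x: x >= 10, label_big, label_small),
    )
)
```

Because every primitive is just an async callable taking one value, the pieces nest freely, which is what makes a pipeline like this deterministic and easy to test.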

Cost & Token Tracking

Per-agent and per-request cost tracking. Budget guards raise exceptions before you blow past your limits.
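A budget guard that raises before the limit is crossed can be as small as the sketch below. TokenBudgetGuard and BudgetExceeded are hypothetical names for illustration, not the TokenBudget or CostTracker classes.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push usage past the configured cap."""


class TokenBudgetGuard:
    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def reserve(self, tokens: int) -> None:
        """Check the cap before spending, then record the usage."""
        if self.used + tokens > self.max_tokens:
            raise BudgetExceeded(
                f"{self.used + tokens} tokens would exceed the cap of {self.max_tokens}"
            )
        self.used += tokens


guard = TokenBudgetGuard(max_tokens=1000)
guard.reserve(600)
try:
    guard.reserve(500)  # 1100 > 1000, so this is rejected
    over_budget = False
except BudgetExceeded:
    over_budget = True
```

The rejected reservation leaves the counter untouched, so a caller can catch the exception and retry with a smaller request.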

Extended Thinking

First-class support for Claude extended thinking and OpenAI reasoning models. Surface reasoning traces in your UI.

Multimodal Support

Pass images, PDFs, and files as first-class tool inputs. Automatic base64 encoding and MIME type handling.
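For reference, base64 encoding plus MIME detection needs only the standard library. The encode_attachment helper and the shape of the returned dict are assumptions for illustration; provider payload formats differ.

```python
import base64
import mimetypes


def encode_attachment(filename: str, data: bytes) -> dict:
    """Guess the MIME type from the file name and base64-encode the bytes."""
    mime, _ = mimetypes.guess_type(filename)
    return {
        "filename": filename,
        # Fall back to a generic binary type when the extension is unknown.
        "mime_type": mime or "application/octet-stream",
        "data": base64.b64encode(data).decode("ascii"),
    }


attachment = encode_attachment("receipt.png", b"\x89PNG")
unknown = encode_attachment("blob.zzzunknown", b"\x00\x01")
```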

Prompt Templates

Jinja2-powered prompt templates with type-safe variable injection. Version your prompts alongside your code.

Output Parsers

Parse LLM text into structured data: JSON, YAML, markdown tables, custom regex. Chainable and composable.

Interceptors

Pre/post hooks on every LLM call and tool invocation. Inject logging, rate limiting, or caching transparently.
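As an illustration of pre/post hooks, the sketch below wraps an async call with timing. The intercept decorator and fake_llm_call are hypothetical; the library's interceptor interface may look quite different.

```python
import asyncio
import functools
import time


def intercept(log: list):
    """Wrap an async callable with a pre-hook (start timer) and a post-hook (record)."""

    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            started = time.perf_counter()  # pre-hook
            try:
                return await fn(*args, **kwargs)
            finally:
                # post-hook: runs whether the call succeeded or raised
                log.append((fn.__name__, time.perf_counter() - started))

        return wrapper

    return decorator


call_log: list[tuple[str, float]] = []


@intercept(call_log)
async def fake_llm_call(prompt: str) -> str:
    # Stand-in for a provider call; no network involved.
    return prompt.upper()


reply = asyncio.run(fake_llm_call("ping"))
```

The wrapped function's code is unchanged, which is the sense in which logging, rate limiting, or caching can be added transparently.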

Fail-Fast Validation

Misconfigured agents are rejected before the first turn. Missing model settings and mismatched tool plumbing fail early, not after deployment.

Signal System

ModelCallComplete, ToolCallComplete, AgentRunComplete signals now surface richer observability data. Wire to StatsD, Prometheus, or Sentry.
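Wiring a lifecycle signal to a metrics sink might look like the sketch below. RunComplete and SignalBus are hypothetical stand-ins; only the field names total_tokens and turns are taken from the text above, and the metrics dict stands in for a StatsD or Prometheus client.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RunComplete:
    """Hypothetical payload carrying the fields mentioned above."""
    agent: str
    total_tokens: int
    turns: int


class SignalBus:
    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = {}

    def connect(self, signal_type: type, handler: Callable) -> None:
        self._handlers.setdefault(signal_type, []).append(handler)

    def emit(self, signal) -> None:
        # Dispatch by payload type so handlers only see signals they asked for.
        for handler in self._handlers.get(type(signal), []):
            handler(signal)


metrics: dict[str, int] = {}


def record_tokens(signal: RunComplete) -> None:
    # A real handler might forward this to StatsD or Prometheus instead.
    key = f"{signal.agent}.tokens"
    metrics[key] = metrics.get(key, 0) + signal.total_tokens


bus = SignalBus()
bus.connect(RunComplete, record_tokens)
bus.emit(RunComplete(agent="support", total_tokens=420, turns=3))
```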

LLM Call Abstraction

LLMService injectable for single-turn completions and embeddings. No need for a full agent loop for simple calls.

Typecheck by Default

The default nox workflow now includes mypy, so typing regressions are caught in the normal contributor flow instead of in a separate, after-the-fact check.

Philosophy

Built for production, not prototypes.

Four principles that separate lauren-ai from every other Python agent framework. No global singletons, no surprise side effects.

Structured by Default

Agents, tools, and guardrails are first-class framework citizens, not loose functions. Everything is registered in a module, injected by DI, and visible to the router.

Type-Safe Throughout

From the LLM call to the final response, every data shape is a Pydantic model. Generic types flow through agents, tools, memory, and extractors without Any escape hatches.

Fail Fast, Not Late

Recent runner and executor hardening surfaces configuration and typing mistakes at startup. Missing models and mismatched tool plumbing fail early instead of hiding in production traffic.

Framework-Native

Agent infrastructure reuses the same DI container, module system, lifecycle hooks, signals, and middleware stack as your HTTP controllers. One mental model for your entire app.

Inbound: HTTP Controller

@post("/ask") → Agent[T]

Decorators: @agent → @remember → @use_guardrails → @use_tools

order enforced · DI singleton

Runtime: AgentRunner

loop · tools · budget · signals · fail-fast config

State: Memory tiers

short-term · conversation · user facts · vector

Transport: Provider-agnostic

Anthropic · OpenAI · Ollama · LiteLLM

Code

See it in action.

Real code examples from agents to testing — everything you need to ship production AI.

example.py

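from pydantic import BaseModel

from lauren import controller, post, module, LaurenFactory
from lauren_ai import agent, use_tools, tool, Agent, LLMModule, AgentModule, LLMConfig

# NOTE: Json (the request-body wrapper used below) must also be imported from Lauren;
# the original snippet omits that import, so its exact path is not shown here.

class AskRequest(BaseModel):
    """Request body for POST /travel/ask."""
    question: str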

@tool()
async def get_weather(city: str) -> dict:
    """Get current weather for a city.

    Args:
        city: The city name, e.g. 'London'.
    """
    return {"city": city, "temperature_c": 18, "condition": "cloudy"}

@agent(model="claude-opus-4-6", system="You are a helpful travel assistant.")
@use_tools(get_weather)
class TravelAgent: ...

@controller("/travel")
class TravelController:
    def __init__(self, runner) -> None:
        self._runner = runner

    @post("/ask")
    async def ask(self, body: Json[AskRequest], agent: Agent[TravelAgent]) -> dict:
        response = await self._runner.run(agent, body.question)
        return {"answer": response.content, "turns": response.turns}

Live demo

See the latest platform updates in context.

SecureBank remains the fastest way to see the stack end to end. The latest lauren-ai changes harden the core runtime with stricter startup validation, richer signal payloads, cleaner team exports, and safer DI-backed tool execution.

Open live demo

Fail-Fast Agent Startup

Agents now reject missing model configuration before the first turn begins, so miswired environments fail in development and CI instead of surfacing mid-conversation.

Richer Lifecycle Signals

Signal payloads now include fields like total_tokens and turns, making dashboards, billing views, and live telemetry UIs much easier to wire accurately.

Cleaner Team Event Streams

Team event exports and runner behavior are now easier to consume from streaming interfaces, so multi-agent timelines stay typed and predictable across app boundaries.

Safer DI Tool Execution

The tool executor now narrows DI-backed class tools more reliably, which reduces runtime ambiguity when tools are resolved from the container and invoked through run().

securebank-ai-chatbot.netlify.app

SecureBank AI Chatbot — multi-agent banking demo

Comparison

lauren-ai vs the alternatives.

Same endpoint, two implementations. See why structure wins.

lauren-ai

agent.py

@agent(model="claude-haiku-4-5-20251001",
       tools=[search], memory=RedisMemory())
class MyAgent:
    async def handle(self, msg: str) -> str:
        return await self.run(msg)

LangChain equivalent

agent.py

llm = ChatAnthropic(model="claude-haiku-4-5-20251001")
memory = ConversationBufferMemory()
agent = initialize_agent(
    tools, llm, memory=memory,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
)
result = agent.run(input=msg,
                   chat_history=memory.chat_memory.messages)

Feature	LangChain	CrewAI	lauren-ai
Agent definition model	Function chains / LCEL	Role-based classes	@agent() decorator, DI-native
DI integration	None	None	Full Lauren IoC container
Type safety	Partial	Partial	End-to-end, mypy/pyright compliant
Testing support	Mock-heavy, flaky	Limited	MockTransport + AgentTestClient
Multi-agent coordination	LangGraph	Native	@team() with structured handoffs
Memory system	Various buffer classes	Basic	4-tier: short-term / conversation / user-facts / vector
Guardrails	Manual	Manual	@guardrail() built-in
Provider switching	Various wrappers	Various wrappers	Single Transport interface
Streaming	Yes	Limited	First-class EventStream / SSE
Module system	None	None	NestJS-style modules with imports/exports

At a glance

Criterion	LangChain	CrewAI	lauren-ai
Type Safety	🟡	🔴	🟢🟢
DI Integration	🔴	🔴	🟢🟢
Testing Support	🟡	🟡	🟢🟢
Multi-Agent	🟡	🟢	🟢🟢
Memory System	🟢	🟡	🟢🟢
Guardrails	🟡	🟡	🟢🟢
Provider-Agnostic	🟢	🟢	🟢🟢
Production Ready	🟡	🔴	🟢🟢
Framework Integration	🔴	🔴	🟢🟢
Streaming	🟢	🟡	🟢🟢

🟢🟢 = excellent · 🟢 = good · 🟡 = partial · 🔴 = missing

Guarantees

6 hard guarantees for production AI.

These are the release-level qualities reinforced by the latest runtime and typing changes, not just marketing promises.

Type-Safe Agents

Pydantic throughout. Every input, output, tool call, and memory entry is fully typed. pyright and mypy compliant — no Any escape hatches.

Typecheck by Default

Typing is part of the default developer flow now. The standard nox run includes mypy, so regressions are caught in everyday work rather than as a separate afterthought.

Deterministic Testing

MockTransport + AgentTestClient gives you reproducible test results every time. No flaky tests, no network calls, no API keys needed in CI.

Fail-Fast Configuration

Agent runners now reject missing model configuration before work starts, which turns a silent production misconfiguration into an immediate, debuggable startup error.

Streamable Team Events

Multi-agent team events export cleanly for UIs and orchestration layers. Team timelines are easier to consume, re-export, and type-check across app boundaries.

Richer Observability

Lifecycle signals expose more actionable data, including total_tokens and turns, so tracing, billing, and support dashboards can stay aligned with actual runtime behavior.