Structured Output
StructuredLLM wraps any LLMService so that every completion is deserialized
into a Pydantic model instance, or raises OutputParserError when that is not
possible. Under the hood it uses native tool-calling to constrain the model's
response to a specific JSON schema, so no brittle prompt engineering is required.
Quick start
from pydantic import BaseModel
from lauren_ai import LLMConfig, Message  # assumption: Message is exported at the package root
from lauren_ai._module import LLMModule
class SentimentResult(BaseModel):
    sentiment: str  # "positive" | "negative" | "neutral"
    confidence: float  # 0.0 – 1.0
cfg, mock = LLMConfig.for_testing()
LLMProviderModule = LLMModule.for_root(cfg, transport_override=mock)
# Obtain the LLMService from DI or create it directly
from lauren_ai._module import LLMService
llm = LLMService(transport=mock, config=cfg)
structured = llm.with_structured_output(SentimentResult)
result: SentimentResult = await structured.complete([
    Message.user("This product is absolutely fantastic!")
])
print(result.sentiment)  # "positive"
print(result.confidence)  # 0.95
How it works
- with_structured_output(MyModel) builds a ToolSchema whose input_schema is derived from MyModel.model_json_schema().
- The underlying LLMService.complete() call includes that tool with tool_choice=ToolChoice.specific("structured_output"), forcing the model to emit exactly one tool call.
- The tool call's input dict is unpacked into MyModel(**input) and returned to the caller.
- If the model returns plain JSON text instead of a tool call (as some providers do), the content is parsed as JSON and used as the input dict.
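The steps above can be sketched without any provider in the loop. This is a minimal illustration, not lauren_ai's actual implementation: build_tool_schema and parse_completion are hypothetical helpers, and a completion is modelled as a plain dict rather than the library's own completion type.

```python
import json
from typing import Any

from pydantic import BaseModel


class SentimentResult(BaseModel):
    sentiment: str
    confidence: float


def build_tool_schema(model_cls: type[BaseModel]) -> dict[str, Any]:
    """Hypothetical stand-in for ToolSchema: a tool name plus the model's JSON Schema."""
    return {
        "name": "structured_output",
        "input_schema": model_cls.model_json_schema(),
    }


def parse_completion(completion: dict[str, Any], model_cls: type[BaseModel]) -> BaseModel:
    """Turn a dict-shaped completion (an assumption) into a validated model instance."""
    tool_calls = completion.get("tool_calls") or []
    if tool_calls:
        # Normal path: the forced tool call carries the input dict.
        payload = tool_calls[0]["input"]
    else:
        # Fallback path: the provider answered with plain JSON text.
        payload = json.loads(completion["content"])
    return model_cls(**payload)


tool = build_tool_schema(SentimentResult)
from_tool = parse_completion(
    {"tool_calls": [{"name": "structured_output",
                     "input": {"sentiment": "positive", "confidence": 0.95}}]},
    SentimentResult,
)
from_text = parse_completion(
    {"content": '{"sentiment": "neutral", "confidence": 0.5}'},
    SentimentResult,
)
```

Both paths end in the same model constructor, so Pydantic validation applies whether the payload arrived as a tool call or as JSON text.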
Using StructuredLLM in chains
StructuredLLM supports the | operator so it composes naturally with
PromptTemplate:
from lauren_ai._prompts import PromptTemplate
from lauren_ai._chains import Chain
pipeline: Chain = (
    PromptTemplate(template="Analyse the sentiment of: {text}")
    | llm.with_structured_output(SentimentResult)
)
# Render the prompt, run the LLM, and get a validated SentimentResult
result = await pipeline.run(text="I love this!")
Testing with MockTransport
Use MockTransport.queue_structured() to pre-load a response without making
any network calls:
from lauren_ai._transport._mock import MockTransport
from lauren_ai._config import LLMConfig
transport = MockTransport()
config, _ = LLMConfig.for_testing()
llm = LLMService(transport=transport, config=config)
# Queue an instance — it will be returned by the next complete() call
transport.queue_structured(SentimentResult(sentiment="negative", confidence=0.8))
structured = llm.with_structured_output(SentimentResult)
result = await structured.complete([Message.user("This is terrible.")])
assert result.sentiment == "negative"
assert result.confidence == 0.8
queue_structured accepts any Pydantic model instance and internally builds a
Completion whose tool_calls list contains a single ToolCall with the
serialized data.
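As a rough picture of that internal step, the sketch below uses hypothetical dataclasses named ToolCall and Completion and a hypothetical FakeTransport class. The real lauren_ai types very likely carry more fields, and the next_completion method is invented here purely to read the queue back.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any

from pydantic import BaseModel


@dataclass
class ToolCall:  # hypothetical stand-in, not the lauren_ai class
    name: str
    input: dict[str, Any]


@dataclass
class Completion:  # hypothetical stand-in, not the lauren_ai class
    content: str = ""
    tool_calls: list[ToolCall] = field(default_factory=list)


class FakeTransport:
    """Illustrative FIFO mock: queued completions are handed back in order."""

    def __init__(self) -> None:
        self._queue: deque[Completion] = deque()

    def queue_structured(self, instance: BaseModel) -> None:
        # Serialize the instance and wrap it in a single tool call.
        call = ToolCall(name="structured_output", input=instance.model_dump())
        self._queue.append(Completion(tool_calls=[call]))

    def next_completion(self) -> Completion:
        return self._queue.popleft()


class SentimentResult(BaseModel):
    sentiment: str
    confidence: float


transport = FakeTransport()
transport.queue_structured(SentimentResult(sentiment="negative", confidence=0.8))
completion = transport.next_completion()
restored = SentimentResult(**completion.tool_calls[0].input)
```

Because the queued payload is an ordinary tool call, the structured wrapper can parse it through the same path it uses for a live provider response.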
Schema introspection
Access the generated JSON Schema directly:
structured = llm.with_structured_output(SentimentResult)
print(structured._schema)
# {'properties': {'sentiment': {'title': 'Sentiment', 'type': 'string'},
# 'confidence': {'title': 'Confidence', 'type': 'number'}},
# 'required': ['sentiment', 'confidence'],
# 'title': 'SentimentResult',
# 'type': 'object'}
Error handling
When the model's response cannot be deserialized into the requested model,
OutputParserError is raised:
from lauren_ai._output_parsers import OutputParserError
try:
    result = await structured.complete(messages)
except OutputParserError as exc:
    print(f"Parse failed: {exc}")
Supported providers
| Provider | Mechanism | Notes |
|---|---|---|
| Anthropic | Tool-forcing | Full support |
| OpenAI | Tool-forcing | Full support |
| Ollama | Tool-forcing | Requires a model with tool support |
| LiteLLM | Tool-forcing | Delegates to the upstream provider |
Pydantic requirements
with_structured_output calls model_cls.model_json_schema(), which is a
Pydantic v2 API. Pydantic v1 models are not supported. If the class does not
expose model_json_schema(), an empty schema is used, so the LLM's output may
not be constrained correctly.
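Because the empty-schema fallback is silent, it can be worth failing fast in application code before calling with_structured_output. The helper below, require_v2_schema, is a hypothetical guard written for this page rather than part of lauren_ai; it relies only on the presence of model_json_schema().

```python
from typing import Any

from pydantic import BaseModel


def require_v2_schema(model_cls: type) -> dict[str, Any]:
    """Return the JSON Schema, or raise instead of silently using an empty one."""
    schema_fn = getattr(model_cls, "model_json_schema", None)
    if not callable(schema_fn):
        raise TypeError(
            f"{model_cls.__name__} has no model_json_schema(); "
            "a Pydantic v2 model is required"
        )
    return schema_fn()


class SentimentResult(BaseModel):
    sentiment: str
    confidence: float


class NotAModel:  # plain class standing in for a v1 model or other unsupported type
    pass


schema = require_v2_schema(SentimentResult)

try:
    require_v2_schema(NotAModel)
    rejected = False
except TypeError:
    rejected = True
```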
Limitations
- Streaming is not supported through StructuredLLM.complete(). If the underlying transport returns a streaming iterator, the chunks are collected and the accumulated JSON is parsed, which buffers the full response.
- Recursive schemas (self-referential models) work but may confuse some provider tool implementations.
- StructuredLLM is not directly injectable via the lauren DI container. Obtain it by calling llm_service.with_structured_output(Model) where LLMService is injected normally.
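The buffering behaviour described in the streaming limitation amounts to the pattern below. This is an illustrative sketch only: fake_stream and collect_json are hypothetical names, and chunks are modelled as plain strings rather than the transport's real chunk type.

```python
import asyncio
import json
from typing import Any, AsyncIterator


async def fake_stream() -> AsyncIterator[str]:
    """Hypothetical stream yielding one JSON document in fragments."""
    for chunk in ['{"sentiment": "pos', 'itive", "confi', 'dence": 0.95}']:
        yield chunk


async def collect_json(chunks: AsyncIterator[str]) -> dict[str, Any]:
    # No fragment is valid JSON on its own, so the whole response is held
    # in memory and parsed only after the last chunk has arrived.
    parts: list[str] = []
    async for chunk in chunks:
        parts.append(chunk)
    return json.loads("".join(parts))


payload = asyncio.run(collect_json(fake_stream()))
```

The caller therefore sees nothing until the stream ends, which is why structured completions offer no latency benefit from a streaming transport.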