Hey everyone, Sarah here from agnthq.com. It’s been a crazy few weeks, hasn’t it? Every time I blink, there’s a new AI agent promising to change my life, my workflow, or at least my coffee order. And honestly, a lot of them just… don’t. They’re either too complex, too niche, or they just don’t deliver on the hype.
But then, every now and then, something genuinely interesting pops up. Something that actually makes me sit up and think, “Okay, this might actually be useful.” Today, I want to talk about one of those. Not a brand new, shiny thing, but a platform that’s been steadily evolving and, in my opinion, just hit a sweet spot for practical, everyday use: LangChain’s new Agents framework with its improved structured output and tool calling capabilities.
Now, before you groan and think, “Oh great, another LangChain article,” hear me out. I’ve been fiddling with LangChain since pretty much day one. I remember the early days, stringing together LLMs and tools, feeling like a digital Frankenstein. It was powerful, sure, but often clunky, hard to debug, and felt a bit like writing Python 2 code in 2024. The outputs could be… creative, to say the least, and getting an agent to reliably do what you wanted, especially with multiple steps, felt more like a prayer than programming.
The recent updates, particularly around how agents interact with tools and produce structured output, have really changed the game for me. It’s moved from “experimental playground” to “actually useful for my freelance work.” And that’s a big deal.
My Frustration with Unreliable Agents (and how this helps)
Let me paint a picture. A few months ago, I was trying to build a simple agent for a friend who runs a small e-commerce store. Her problem: customers often ask very similar questions about products, shipping, and returns, and she was spending way too much time copy-pasting answers. My idea was an agent that could:
- Look up product details (price, availability) from a dummy database.
- Check shipping zones and times.
- Formulate a polite, accurate answer.
Sounds straightforward, right? Not really. My early LangChain attempts were a mess. The agent would sometimes hallucinate product IDs, or forget to call the shipping tool, or just output a conversational ramble instead of a concise answer. Getting it to consistently output a specific format, like a JSON object containing the answer and the tools it used, was a nightmare. I’d spend hours trying to coax it with elaborate prompts, only for it to fail on an edge case.
This is where the new agent framework shines. LangChain has really tightened up how agents decide which tools to use and, crucially, how they report back their findings. It’s less about hoping the LLM “figures it out” and more about giving it a clear, structured path.
The Core Idea: Better Tool Calling and Structured Output
The biggest improvement, in my opinion, comes from a combination of things:
- Improved Tool Definition: Tools are now defined with Pydantic schemas, making it much clearer to the LLM what inputs each tool expects.
- Function Calling APIs (e.g., OpenAI’s): LangChain uses these under the hood to make tool selection much more reliable. The LLM doesn’t just “guess” which tool to use; it’s explicitly told about the available functions and their parameters.
- Structured Output Parsers: This is the holy grail for me. No more trying to regex an answer from a free-form text blob. We can now define exactly what structure we expect the agent’s final answer to take.
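To make that second point concrete, here is a hand-written sketch of the per-tool JSON spec that an OpenAI-style function-calling API receives. The `get_weather` tool and its `city` parameter are invented purely for illustration; the point is that LangChain derives an equivalent spec from your Pydantic schema, so the model picks tools from an explicit menu rather than guessing.

```python
# Hand-written sketch of an OpenAI-style "tools" entry for a made-up tool.
# LangChain generates this kind of spec for you from a Pydantic args_schema.
tool_spec = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Returns current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'Paris'.",
                }
            },
            "required": ["city"],
        },
    },
}

# The model then answers with a structured call rather than free text,
# along the lines of: {"name": "get_weather", "arguments": {"city": "Paris"}}
print(tool_spec["function"]["name"])  # get_weather
```

Because the spec names the parameters and marks which are required, a "missing argument" becomes an API-level error rather than a silent hallucination.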
Let’s look at a simple example to illustrate this. Imagine we have a tool to get the current stock level for a product.
Example: A Simple Stock Checker Tool
First, define our tool using a Pydantic model for its input:
```python
from langchain_core.tools import tool
from pydantic import BaseModel, Field

class ProductStockInput(BaseModel):
    product_id: str = Field(description="The unique identifier for the product.")

@tool("get_product_stock", args_schema=ProductStockInput)
def get_product_stock(product_id: str) -> dict:
    """
    Looks up the current stock level for a given product ID.
    Returns a dictionary with product_id and stock_level.
    """
    # This would typically query a database
    stock_data = {
        "P101": 50,
        "P102": 0,  # Out of stock
        "P103": 15,
    }
    stock_level = stock_data.get(product_id, -1)  # -1 for not found
    if stock_level == -1:
        return {"product_id": product_id, "stock_level": "Product not found"}
    return {"product_id": product_id, "stock_level": stock_level}

tools = [get_product_stock]
```
Notice the `args_schema` here. This is crucial. It tells the LLM exactly what arguments `get_product_stock` expects and what their types are. No more ambiguity.
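If you're curious what the model actually sees, you can dump the JSON schema Pydantic derives from the input model. This is a standalone sketch (assuming Pydantic v2, no LangChain needed); the field names, types, and descriptions in this schema are essentially what ends up in the function-calling spec.

```python
from pydantic import BaseModel, Field

class ProductStockInput(BaseModel):
    product_id: str = Field(description="The unique identifier for the product.")

# The JSON schema derived from the model: field names, types, and the
# descriptions the LLM uses to fill in arguments correctly
schema = ProductStockInput.model_json_schema()
print(schema["properties"]["product_id"]["type"])         # string
print(schema["properties"]["product_id"]["description"])  # The unique identifier for the product.
```

Those `description` strings are doing real work here: they're prompt material for the LLM, so write them as carefully as you'd write documentation.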
Building the Agent with Structured Output
Now, let’s build an agent that uses this tool and, importantly, provides its final answer in a structured way. For my e-commerce friend, I wanted the agent to output the customer’s query, the agent’s answer, and any tools it used, all in a nice JSON format.
```python
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field
from typing import Any, Dict, List

# Define the structured output format for the agent's final answer
class AgentResponse(BaseModel):
    original_query: str = Field(description="The customer's original query.")
    agent_answer: str = Field(description="The agent's formulated answer to the query.")
    tools_used: List[str] = Field(description="A list of names of the tools the agent used.")
    metadata: Dict[str, Any] = Field(description="Any additional metadata or findings from the tools.")

# Load the base prompt for the OpenAI tools agent
prompt = hub.pull("hwchase17/openai-tools-agent")
llm = ChatOpenAI(model="gpt-4-0125-preview", temperature=0)  # a recent GPT-4 for reliability

# Create the agent and its executor; return_intermediate_steps lets us
# report which tools were actually called
agent = create_openai_tools_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent, tools=tools, verbose=True, return_intermediate_steps=True
)

# Now, define how the agent should respond. Note that with_structured_output
# lives on the chat model, not on AgentExecutor, so we pipe the executor's
# result through a structured formatting step. This is where the magic
# happens for reliable structured output!
structured_llm = llm.with_structured_output(AgentResponse)

def to_structured(result: dict) -> AgentResponse:
    """Restate the agent's run as an AgentResponse object."""
    steps = result.get("intermediate_steps", [])
    tool_log = "\n".join(
        f"- {action.tool}({action.tool_input}) -> {observation}"
        for action, observation in steps
    )
    return structured_llm.invoke(
        f"Original query: {result['input']}\n"
        f"Tool calls:\n{tool_log or 'none'}\n"
        f"Final answer: {result['output']}\n"
        "Summarize this exchange as an AgentResponse."
    )

structured_agent_executor = agent_executor | to_structured

# Let's test it out!
query1 = "What is the stock level for product P101?"
response1 = structured_agent_executor.invoke({"input": query1})
print("\n--- Response 1 ---")
print(response1.model_dump_json(indent=2))

query2 = "Is product P102 available?"
response2 = structured_agent_executor.invoke({"input": query2})
print("\n--- Response 2 ---")
print(response2.model_dump_json(indent=2))

query3 = "What is the capital of France?"  # Agent shouldn't use our tool for this
response3 = structured_agent_executor.invoke({"input": query3})
print("\n--- Response 3 ---")
print(response3.model_dump_json(indent=2))
```
When you run this, you’ll see in the verbose output how the LLM first identifies the need for `get_product_stock`, calls it with the correct `product_id`, and then uses the result to form its `agent_answer`. More importantly, the *final output* is an `AgentResponse` object, not just a string. This is incredibly powerful for downstream processing, logging, or even just displaying consistent information to a user.
What I love about `with_structured_output`
This `with_structured_output` method is a significant shift. It means I can reliably integrate the agent’s responses into other parts of my application. I don’t have to write brittle parsing logic. If the agent somehow deviates, Pydantic will often catch it, giving me a clear error instead of silently failing or returning garbage.
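To see that safety net in isolation, here's a standalone Pydantic sketch (no agent or LLM involved, assuming Pydantic v2, with `AgentResponse` redeclared minimally so the snippet runs on its own). A wrong-shaped payload fails loudly, and the error names the offending field:

```python
from typing import Any, Dict, List
from pydantic import BaseModel, ValidationError

class AgentResponse(BaseModel):
    original_query: str
    agent_answer: str
    tools_used: List[str]
    metadata: Dict[str, Any]

# A deviating "agent output": tools_used is a bare string, not a list
bad_payload = {
    "original_query": "Is P102 in stock?",
    "agent_answer": "P102 is currently out of stock.",
    "tools_used": "get_product_stock",
    "metadata": {},
}

try:
    AgentResponse(**bad_payload)
except ValidationError as err:
    # Pydantic names the offending field instead of failing silently
    print(err.errors()[0]["loc"])  # ('tools_used',)
```

That `ValidationError` is exactly what you want in production: a clear, catchable signal at the boundary, rather than garbage quietly flowing downstream.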
For my e-commerce friend, this means her customer service portal can now display the agent’s answer with confidence, knowing it’s in the right format. We can even log the `tools_used` and `metadata` fields to understand how often the agent is using specific tools or if there are common questions it can’t answer.
Beyond Simple Tools: Multi-Step Reasoning with Reliability
The real test for an agent is often multi-step reasoning. Let’s add another tool: one to get estimated shipping times based on a product ID and a destination zone.
```python
from langchain_core.tools import tool
from pydantic import BaseModel, Field

# ... (ProductStockInput and get_product_stock remain the same) ...

class ShippingInfoInput(BaseModel):
    product_id: str = Field(description="The unique identifier for the product.")
    destination_zone: str = Field(description="The shipping zone (e.g., 'Zone A', 'Zone B').")

@tool("get_shipping_info", args_schema=ShippingInfoInput)
def get_shipping_info(product_id: str, destination_zone: str) -> dict:
    """
    Provides estimated shipping times for a product to a specific zone.
    Returns a dictionary with product_id, zone, and estimated_days.
    """
    # Dummy data for shipping
    shipping_times = {
        ("P101", "Zone A"): "3-5 business days",
        ("P101", "Zone B"): "5-7 business days",
        ("P103", "Zone A"): "2-4 business days",
        ("P103", "Zone B"): "4-6 business days",
    }
    key = (product_id, destination_zone)
    estimated_days = shipping_times.get(key, "Varies, please contact support")
    return {"product_id": product_id, "destination_zone": destination_zone, "estimated_days": estimated_days}

tools_multi = [get_product_stock, get_shipping_info]
```
```python
# Re-create the agent and executor with the new tools
agent_multi = create_openai_tools_agent(llm, tools_multi, prompt)
agent_executor_multi = AgentExecutor(agent=agent_multi, tools=tools_multi, verbose=True)

# As before, structured output comes from the chat model, not the executor:
# pipe the executor's result through llm.with_structured_output
structured_llm = llm.with_structured_output(AgentResponse)

def format_response(result: dict) -> AgentResponse:
    return structured_llm.invoke(
        f"Query: {result['input']}\nAgent's answer: {result['output']}\n"
        "Restate this exchange as an AgentResponse."
    )

structured_agent_executor_multi = agent_executor_multi | format_response

# Test a multi-step query
multi_step_query = "What is the stock for P101 and how long would it take to ship to Zone A?"
response_multi = structured_agent_executor_multi.invoke({"input": multi_step_query})
print("\n--- Multi-step Response ---")
print(response_multi.model_dump_json(indent=2))
```
You’ll notice in the verbose output that the agent now intelligently calls `get_product_stock` first, then `get_shipping_info`, and combines the information into a coherent answer, all while respecting the `AgentResponse` structure. This is a massive leap forward from the days when you had to explicitly chain these tool calls yourself or pray the LLM would infer the correct sequence.
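A handy detail for logging: if you construct your `AgentExecutor` with `return_intermediate_steps=True`, the result dict includes each step as an `(AgentAction, observation)` pair. Here's a stdlib-only sketch of that shape, with a stand-in dataclass playing the role of LangChain's `AgentAction` (which exposes `.tool` and `.tool_input`):

```python
# FakeAction is a simplified stand-in for langchain's AgentAction,
# just enough to show the shape of result["intermediate_steps"].
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class FakeAction:
    tool: str
    tool_input: Dict[str, Any]

# A multi-step run produces a list of (action, observation) pairs
steps = [
    (FakeAction("get_product_stock", {"product_id": "P101"}),
     {"product_id": "P101", "stock_level": 50}),
    (FakeAction("get_shipping_info", {"product_id": "P101", "destination_zone": "Zone A"}),
     {"estimated_days": "3-5 business days"}),
]

# Pull out a tool-usage log, e.g. for analytics or a tools_used field
tools_used = [action.tool for action, _ in steps]
print(tools_used)  # ['get_product_stock', 'get_shipping_info']
```

This is how you can answer questions like "how often does the agent actually hit the shipping tool?" without parsing any verbose console output.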
My Takeaways & Why This Matters Now
So, why am I making such a big deal about this now? Because it feels like the LangChain agents, specifically with the `create_openai_tools_agent` and `with_structured_output` combination, have finally matured to a point where they are truly practical for developers building real-world applications. No more endless prompt engineering to force a JSON output, no more brittle regex parsing, and significantly fewer “hallucinated tool calls.”
Actionable Takeaways for You:
- Revisit LangChain Agents: If you tried LangChain agents a year ago and got frustrated, now is the time to give them another look. The improvements in tool calling and structured output are substantial.
- Define Tools with Pydantic: Always define your tool inputs using Pydantic models with clear descriptions. This gives the LLM the best chance to understand when and how to use your tools.
- Embrace `with_structured_output`: This is your best friend for reliable agent integration. Define a Pydantic model for your agent’s final output, then pipe your executor’s result through `llm.with_structured_output(YourOutputModel)` (the method lives on the chat model, not on `AgentExecutor`). It will save you countless hours of debugging and parsing.
- Start Simple, then Expand: Don’t try to build a super-agent right away. Start with one or two simple tools and a clear output structure. Once that’s working reliably, gradually add more complexity.
- Use Detailed Tool Descriptions: The `description` field in your `@tool` decorator is what the LLM reads to decide whether to use a tool. Make it clear, concise, and explain what the tool *does* and what it *returns*.
- Use OpenAI’s Function Calling: While this article focuses on LangChain, the underlying reliability often comes from LLMs like OpenAI’s GPT models having solid function-calling capabilities. Ensure you’re using models that support this for best results.
I genuinely believe these advancements make building intelligent agents much more accessible and, more importantly, much more reliable. For anyone building AI-powered features, whether it’s a customer service bot, a data analysis assistant, or just automating tedious tasks, the ability to predictably interact with an agent and receive structured data back is a significant shift. It’s moved agents from being a cool demo to a genuinely useful piece of the developer’s toolkit.
That’s it for me today. Go try it out, and let me know what you build! Drop your thoughts and experiences in the comments below. Happy agent building!
🕒 Originally published: March 15, 2026