
Tool Calling

Parlant implements its own tool-calling mechanism rather than using vendor-provided APIs. This page explains why, and how tools are inferred, evaluated, and executed within the engine.

Why Custom Tool Calling?

Vendor tool-calling APIs (such as OpenAI's function calling and Anthropic's tool use) work well for simple cases. However, they fall short of meeting Parlant's requirements:

1. Vendor Independence

With native tool calling, switching from OpenAI to Anthropic requires changing tool definitions due to different schemas and behaviors. Parlant's abstraction enables provider switching while maintaining identical tool configurations.

2. Guided Calling

Native tool calling presents all available tools to the LLM and allows it to decide what to call. This approach creates several problems:

  • A larger number of tools increases the potential for confusion.
  • The LLM might call tools that are inappropriate for the current context.
  • There is no connection between why a tool is being called and which guideline requested it.

Parlant's guided approach addresses these issues: only tools associated with matched guidelines are considered, ensuring the LLM sees a focused set of relevant tools rather than the entire toolkit.

3. Iteration Support

Native tool calling typically follows a request-response pattern: call a tool, receive a result, and generate output. However, tool results often trigger new guidelines that require additional tools.

Parlant's iterative preparation loop handles this naturally: tool results flow back into guideline matching, potentially triggering additional tool calls within the same response cycle.

4. Optimization Opportunities

Because Parlant understands why each tool is being considered (specifically, which guideline requires it), the system can optimize in several ways:

  • Tools whose data is already present in the context can be skipped.
  • Independent tools can be batched together for parallel execution.
  • Tool execution can be prioritized based on guideline criticality.

Tool-Guideline Association

Tools become available through explicit association with guidelines:

# The guideline
guideline = await agent.create_guideline(
    condition="Customer asks about their account balance",
    action="Look up their balance and tell them"
)

# The tool
tool = await agent.create_tool(
    name="get_balance",
    description="Retrieves account balance for a customer",
    parameters={...}
)

# The association
await agent.associate_tool_with_guideline(
    tool_id=tool.id,
    guideline_id=guideline.id
)

When the guideline matches, the tool becomes a candidate for execution.

Tool Call Inference

Not every associated tool needs to be called. The engine infers which tools are actually needed:

ALGORITHM: Tool Call Inference

INPUT: tool_enabled_guidelines, context
OUTPUT: tool_calls, tool_insights

1. COLLECT candidate tools:
   FOR each matched guideline:
       tools = get_associated_tools(guideline)
       FOR each tool:
           candidates.add({
               tool: tool,
               guideline: guideline,
               reason: "Required by matched guideline"
           })

2. EVALUATE each candidate:
   decision = LLM_evaluate(tool, context, conversation_history):

       NEEDS_TO_RUN:
       - Tool data is required and not available
       - Parameters can be inferred from context

       DATA_ALREADY_IN_CONTEXT:
       - Information the tool would provide is already known
       - Example: Balance already retrieved earlier

       CANNOT_RUN:
       - Required parameters cannot be determined
       - Missing information the customer hasn't provided

3. BUILD call list:
   FOR each candidate with NEEDS_TO_RUN:
       parameters = infer_parameters(tool, context)
       tool_calls.append({
           tool: tool,
           parameters: parameters,
           guideline: guideline
       })

4. RECORD insights:
   FOR each candidate with CANNOT_RUN:
       tool_insights.add({
           tool: tool,
           reason: why_cannot_run,
           missing_data: what_is_missing
       })

5. RETURN tool_calls, tool_insights

Evaluation States

NEEDS_TO_RUN: The tool should be executed. Its result will provide information needed for the response.

DATA_ALREADY_IN_CONTEXT: The tool would return information that is already available, either from an earlier call or from the conversation itself. The tool is skipped to reduce latency and cost.

CANNOT_RUN: The tool requires parameters that are not available. For example, the customer may not have provided their order number yet. In this case, the tool is blocked from execution.
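As a rough illustration (these are not Parlant's actual internal types), the three states and the evaluation record they feed into could be modeled like this:

from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Any


class ToolEvaluationState(Enum):
    # Outcome of evaluating one candidate tool (illustrative names)
    NEEDS_TO_RUN = auto()             # execute: data is needed and parameters can be inferred
    DATA_ALREADY_IN_CONTEXT = auto()  # skip: the information is already available
    CANNOT_RUN = auto()               # block: required parameters are missing


@dataclass
class ToolCandidateEvaluation:
    tool_name: str
    guideline_id: str                 # the guideline that made this tool a candidate
    state: ToolEvaluationState
    inferred_parameters: dict[str, Any] = field(default_factory=dict)
    missing_parameters: list[str] = field(default_factory=list)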

Inference Batching

Multiple candidate tools are evaluated together in a single inference pass, rather than one request per tool, which keeps latency and cost down.
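A sketch of that batching, reusing ToolCandidateEvaluation from the sketch above; the prompt rendering and the llm_evaluate callable are stand-ins, not Parlant's actual interfaces:

from typing import Awaitable, Callable


def render_batch_prompt(candidates: list[dict], context: str) -> str:
    # Illustrative: fold every candidate into one evaluation prompt.
    lines = [f"Context:\n{context}", "", "Candidate tools:"]
    for c in candidates:
        lines.append(f"- {c['tool']} (required by guideline {c['guideline']})")
    lines.append("For each candidate, decide: NEEDS_TO_RUN, DATA_ALREADY_IN_CONTEXT, or CANNOT_RUN.")
    return "\n".join(lines)


async def evaluate_candidates(
    candidates: list[dict],
    context: str,
    llm_evaluate: Callable[[str], Awaitable[list[ToolCandidateEvaluation]]],  # hypothetical LLM wrapper
) -> list[ToolCandidateEvaluation]:
    # One inference request covers all candidates instead of one request per tool.
    prompt = render_batch_prompt(candidates, context)
    return await llm_evaluate(prompt)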

Tool Execution

Tools that pass inference are executed:

ALGORITHM: Tool Execution

INPUT: tool_calls
OUTPUT: tool_events

FOR each tool_call (can be parallelized):

    1. EMIT tool_start event

    2. RESOLVE tool service:
       - Plugin tools: Call Python function
       - OpenAPI tools: Make HTTP request
       - MCP tools: Use MCP protocol

    3. CALL tool:
       result = tool_service.call(
           tool_name,
           arguments,
           tool_context  # agent, session, customer, etc.
       )

    4. PROCESS result:
       tool_result = {
           data: result.data,
           metadata: result.metadata,
           control_options: result.control_options
       }

    5. EMIT tool_end event with result

    6. APPEND to tool_events

RETURN tool_events
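A minimal Python sketch of that loop, assuming an async tool_service.call with the signature shown in step 3 and a caller-supplied emit callback for events (both are stand-ins, not Parlant's actual interfaces):

import asyncio
from typing import Any, Callable


async def execute_tool_calls(
    tool_calls: list[dict],
    tool_service: Any,                  # stand-in: exposes async call(tool_name, arguments, tool_context)
    tool_context: Any,                  # agent, session, customer, etc.
    emit: Callable[[str, dict], None],  # stand-in event sink for tool_start / tool_end
) -> list[dict]:
    async def run_one(call: dict) -> dict:
        emit("tool_start", {"tool": call["tool"]})
        result = await tool_service.call(call["tool"], call["parameters"], tool_context)
        event = {
            "tool": call["tool"],
            "data": result.data,
            "metadata": result.metadata,
            "control_options": result.control_options,
        }
        emit("tool_end", event)
        return event

    # Independent calls run concurrently; gather preserves the original order.
    return list(await asyncio.gather(*(run_one(c) for c in tool_calls)))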

Tool Context

Tools receive context about the execution environment:

class ToolContext:
    agent_id: str
    session_id: str
    customer_id: str
    plugin_data: dict  # Custom data from plugins

This context enables tools to behave differently based on the specific agent or customer involved in the interaction.
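For example, a plugin tool might read the customer ID from the context to decide whose data to fetch. The sketch below uses the ToolContext shape above and the ToolResult shape described in the next section; fetch_balance_for is a hypothetical data-access helper:

async def fetch_balance_for(customer_id: str) -> float:
    # Hypothetical stand-in for a real account lookup.
    return 1250.00


async def get_balance(context: ToolContext) -> ToolResult:
    # The same tool can serve any agent or customer; the context says which one.
    balance = await fetch_balance_for(context.customer_id)
    return ToolResult(
        data={"balance": balance},
        metadata={"source": "accounts-service", "session_id": context.session_id},
        control_options=None,  # no special control behavior requested
    )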

Tool Results

Tools return structured results:

class ToolResult:
    data: Any                        # The actual result data
    metadata: dict                   # Source attribution, etc.
    control_options: ControlOptions  # Session mode, lifespan

Control options enable tools to influence engine behavior:

  • session_mode: Switches the session to manual mode (human takeover)
  • response_lifespan: Specifies how long the result remains valid
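As a sketch of how a tool might use these options, here is a hypothetical escalation tool that switches the session to manual mode so a human can take over (the ControlOptions constructor and its values are illustrative):

async def escalate_to_human(context: ToolContext) -> ToolResult:
    # Hand the conversation to a human by switching the session mode.
    return ToolResult(
        data={"status": "escalated"},
        metadata={"reason": "customer asked for a human agent"},
        control_options=ControlOptions(session_mode="manual"),
    )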

Tool Insights

When tools cannot run, insights explain the reason:

class ToolInsights:
    evaluations: list[ToolEvaluation]  # What was decided for each tool
    missing_data: list[MissingData]    # What parameters weren't available
    invalid_data: list[InvalidData]    # What parameters were wrong

Why Insights Matter

The message composer uses insights to generate appropriate responses:

Without insights:
Customer: "Transfer $500 to my friend"
Agent: "I've transferred $500." (but we don't know who "friend" is!)

With insights:
Customer: "Transfer $500 to my friend"
Agent: "I'd be happy to help transfer $500. Who would you like to send it to?"

The insight informs the composer that "transfer_money could not run because recipient_name is missing." The composer uses this information to request the missing data from the customer.
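A small sketch of that hand-off, assuming each missing-data entry carries a tool name and the missing parameter (the field names here are illustrative):

def render_insight_notes(insights: ToolInsights) -> str:
    # Turn missing-parameter insights into guidance the message composer
    # can include in its prompt, so it asks for the data instead of guessing.
    notes = []
    for item in insights.missing_data:
        notes.append(
            f"The tool '{item.tool_name}' could not run because "
            f"'{item.parameter}' is missing; ask the customer for it."
        )
    return "\n".join(notes)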

Insight Flow

Insights produced during tool call inference are carried forward into message composition: the composer receives them alongside the tool results, so it can acknowledge what could not be done and ask for whatever is missing.

Consequential vs Non-Consequential Tools

Tools have different risk levels:

Non-Consequential Tools

Non-consequential tools are read-only operations with no side effects:

  • get_balance() - Returns account data
  • search_products() - Returns product data
  • check_order_status() - Returns order data

These tools can be called with less stringent validation because errors have no lasting impact on system state.

Consequential Tools

Consequential tools are operations that change state:

  • transfer_money() - Moves funds between accounts
  • cancel_order() - Performs an irreversible action
  • update_profile() - Modifies customer data

These tools require stricter validation:

  • Thorough parameter verification
  • Confirmation before execution (configurable)
  • Higher-criticality ARQ enforcement during message generation

Marking Tools

tool = await agent.create_tool(
    name="transfer_money",
    description="Transfer funds between accounts",
    consequential=True,  # Marks as consequential
    ...
)

The engine uses this to adjust validation depth and potentially require confirmation flows.
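A hypothetical policy lookup along those lines; the field names and criticality levels below are illustrative, not Parlant configuration:

from dataclasses import dataclass


@dataclass
class ToolSpec:
    name: str
    consequential: bool = False


def execution_policy(tool: ToolSpec) -> dict:
    # Consequential tools get deeper checks, an explicit confirmation step,
    # and higher-criticality ARQ enforcement during message generation.
    if tool.consequential:
        return {
            "parameter_verification": "thorough",
            "require_confirmation": True,
            "arq_criticality": "high",
        }
    return {
        "parameter_verification": "standard",
        "require_confirmation": False,
        "arq_criticality": "normal",
    }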

Tool Execution Within Iterations

Tools participate in the preparation loop:

Each iteration can trigger new tools based on what was learned in previous iterations.
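Conceptually, the loop looks something like the sketch below; the callables and the with_tool_events helper are stand-ins for the engine's actual components:

async def preparation_loop(context, match_guidelines, infer_tool_calls,
                           execute_tools, max_iterations: int = 3):
    # Each pass re-matches guidelines against a context that now includes the
    # latest tool results, so earlier results can trigger new tool calls.
    all_tool_events = []
    insights = None
    for _ in range(max_iterations):
        guidelines = await match_guidelines(context)
        tool_calls, insights = await infer_tool_calls(guidelines, context)
        if not tool_calls:
            break                                   # nothing left to run
        events = await execute_tools(tool_calls)
        all_tool_events.extend(events)
        context = context.with_tool_events(events)  # stand-in: fold results back in
    return all_tool_events, insights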

Why This Design?

Why Guided Over Open-Ended?

Open-ended tool calling (providing all tools to the LLM and allowing it to choose) creates several problems:

  • Context pollution from dozens of tool definitions
  • Unpredictable behavior regarding which tool will be selected
  • No connection to business rules explaining why a tool is being called

Guided calling provides significant advantages:

  • Focused context containing only relevant tools
  • Predictable behavior with tools tied to matched guidelines
  • Full auditability showing which guideline triggered each tool call

Why Allow Multiple Iterations?

Single-iteration tool calling fails to handle cascading requirements:

  1. Customer requests "Check my balance" → system calls get_balance()
  2. Balance is high → the "recommend investments" guideline matches
  3. That guideline requires get_investment_options()

Without iteration support, step 3 would never execute.

Tradeoffs

Choice                | Benefit                          | Cost
Guided calling        | Predictable, auditable behavior  | Less flexibility than open-ended approaches
Inference evaluation  | Skips unnecessary calls          | Requires additional LLM inference
Tool insights         | Better error responses           | Requires additional bookkeeping
Consequential marking | Safety for high-risk tools       | Requires developers to classify tools

What's Next