
Tool Calling

Parlant implements its own tool-calling mechanism rather than using vendor-provided APIs. This page explains why, and how tools are inferred, evaluated, and executed within the engine.

Why Custom Tool Calling?

Vendor tool-calling APIs (such as OpenAI's function calling and Anthropic's tool use) work well for simple cases. However, they fall short of meeting Parlant's requirements:

1. Vendor Independence

With native tool calling, switching from OpenAI to Anthropic requires changing tool definitions due to different schemas and behaviors. Parlant's abstraction enables provider switching while maintaining identical tool configurations.

2. Guided Calling

Native tool calling presents all available tools to the LLM and allows it to decide what to call. This approach creates several problems:

  • A larger number of tools increases the potential for confusion.
  • The LLM might call tools that are inappropriate for the current context.
  • There is no connection between why a tool is being called and which guideline requested it.

Parlant's guided approach addresses these issues: only tools associated with matched guidelines are considered, ensuring the LLM sees a focused set of relevant tools rather than the entire toolkit.

3. Iteration Support

Native tool calling typically follows a request-response pattern: call a tool, receive a result, and generate output. However, tool results often trigger new guidelines that require additional tools.

Parlant's iterative preparation loop handles this naturally: tool results flow back into guideline matching, potentially triggering additional tool calls within the same response cycle.

4. Optimization Opportunities

Because Parlant understands why each tool is being considered (specifically, which guideline requires it), the system can optimize in several ways:

  • Tools whose data is already present in the context can be skipped.
  • Independent tools can be batched together for parallel execution.
  • Tool execution can be prioritized based on guideline criticality.

Tool-Guideline Association

Tools become available through explicit association with guidelines:

# The guideline
guideline = await agent.create_guideline(
    condition="Customer asks about their account balance",
    action="Look up their balance and tell them"
)

# The tool
tool = await agent.create_tool(
    name="get_balance",
    description="Retrieves account balance for a customer",
    parameters={...}
)

# The association
await agent.associate_tool_with_guideline(
    tool_id=tool.id,
    guideline_id=guideline.id
)

When the guideline matches, the tool becomes a candidate for execution.

Tool Call Inference

Not every associated tool needs to be called. The engine infers which tools are actually needed:

ALGORITHM: Tool Call Inference

INPUT: tool_enabled_guidelines, context
OUTPUT: tool_calls, tool_insights

1. COLLECT candidate tools:
   FOR each matched guideline:
       tools = get_associated_tools(guideline)
       FOR each tool:
           candidates.add({
               tool: tool,
               guideline: guideline,
               reason: "Required by matched guideline"
           })

2. EVALUATE each candidate:
   decision = LLM_evaluate(tool, context, conversation_history):

       NEEDS_TO_RUN:
       - Tool data is required and not available
       - Parameters can be inferred from context

       DATA_ALREADY_IN_CONTEXT:
       - Information the tool would provide is already known
       - Example: Balance already retrieved earlier

       CANNOT_RUN:
       - Required parameters cannot be determined
       - Missing information the customer hasn't provided

3. BUILD call list:
   FOR each candidate with NEEDS_TO_RUN:
       parameters = infer_parameters(tool, context)
       tool_calls.append({
           tool: tool,
           parameters: parameters,
           guideline: guideline
       })

4. RECORD insights:
   FOR each candidate with CANNOT_RUN:
       tool_insights.add({
           tool: tool,
           reason: why_cannot_run,
           missing_data: what_is_missing
       })

5. RETURN tool_calls, tool_insights

Evaluation States

NEEDS_TO_RUN: The tool should be executed. Its result will provide information needed for the response.

DATA_ALREADY_IN_CONTEXT: The tool would return information that is already available, either from an earlier call or from the conversation itself. The tool is skipped to reduce latency and cost.

CANNOT_RUN: The tool requires parameters that are not available. For example, the customer may not have provided their order number yet. In this case, the tool is blocked from execution.
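As a rough illustration (these are not Parlant's actual internal types), the three states and the evaluation record they feed into could be modeled like this:

from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Any


class ToolEvaluationState(Enum):
    # Outcome of evaluating one candidate tool (illustrative names)
    NEEDS_TO_RUN = auto()             # execute: data is needed and parameters can be inferred
    DATA_ALREADY_IN_CONTEXT = auto()  # skip: the information is already available
    CANNOT_RUN = auto()               # block: required parameters are missing


@dataclass
class ToolCandidateEvaluation:
    tool_name: str
    guideline_id: str                 # the guideline that made this tool a candidate
    state: ToolEvaluationState
    inferred_parameters: dict[str, Any] = field(default_factory=dict)
    missing_parameters: list[str] = field(default_factory=list)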

Inference Batching

Multiple candidate tools are evaluated together in a single inference pass, rather than one request per tool, which keeps latency and cost down.
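A sketch of that batching, reusing ToolCandidateEvaluation from the sketch above; the prompt rendering and the llm_evaluate callable are stand-ins, not Parlant's actual interfaces:

from typing import Awaitable, Callable


def render_batch_prompt(candidates: list[dict], context: str) -> str:
    # Illustrative: fold every candidate into one evaluation prompt.
    lines = [f"Context:\n{context}", "", "Candidate tools:"]
    for c in candidates:
        lines.append(f"- {c['tool']} (required by guideline {c['guideline']})")
    lines.append("For each candidate, decide: NEEDS_TO_RUN, DATA_ALREADY_IN_CONTEXT, or CANNOT_RUN.")
    return "\n".join(lines)


async def evaluate_candidates(
    candidates: list[dict],
    context: str,
    llm_evaluate: Callable[[str], Awaitable[list[ToolCandidateEvaluation]]],  # hypothetical LLM wrapper
) -> list[ToolCandidateEvaluation]:
    # One inference request covers all candidates instead of one request per tool.
    prompt = render_batch_prompt(candidates, context)
    return await llm_evaluate(prompt)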

Tool Execution

Tools that pass inference are executed:

ALGORITHM: Tool Execution

INPUT: tool_calls
OUTPUT: tool_events

FOR each tool_call (can be parallelized):

    1. EMIT tool_start event

    2. RESOLVE tool service:
       - Plugin tools: Call Python function
       - OpenAPI tools: Make HTTP request
       - MCP tools: Use MCP protocol

    3. CALL tool:
       result = tool_service.call(
           tool_name,
           arguments,
           tool_context  # agent, session, customer, etc.
       )

    4. PROCESS result:
       tool_result = {
           data: result.data,
           metadata: result.metadata,
           control_options: result.control_options
       }

    5. EMIT tool_end event with result

    6. APPEND to tool_events

RETURN tool_events
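A minimal Python sketch of that loop, assuming an async tool_service.call with the signature shown in step 3 and a caller-supplied emit callback for events (both are stand-ins, not Parlant's actual interfaces):

import asyncio
from typing import Any, Callable


async def execute_tool_calls(
    tool_calls: list[dict],
    tool_service: Any,                  # stand-in: exposes async call(tool_name, arguments, tool_context)
    tool_context: Any,                  # agent, session, customer, etc.
    emit: Callable[[str, dict], None],  # stand-in event sink for tool_start / tool_end
) -> list[dict]:
    async def run_one(call: dict) -> dict:
        emit("tool_start", {"tool": call["tool"]})
        result = await tool_service.call(call["tool"], call["parameters"], tool_context)
        event = {
            "tool": call["tool"],
            "data": result.data,
            "metadata": result.metadata,
            "control_options": result.control_options,
        }
        emit("tool_end", event)
        return event

    # Independent calls run concurrently; gather preserves the original order.
    return list(await asyncio.gather(*(run_one(c) for c in tool_calls)))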

Tool Context

Tools receive context about the execution environment:

class ToolContext:
    agent_id: str
    session_id: str
    customer_id: str
    plugin_data: dict  # Custom data from plugins

This context enables tools to behave differently based on the specific agent or customer involved in the interaction.
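For example, a plugin tool might read the customer ID from the context to decide whose data to fetch. The sketch below uses the ToolContext shape above and the ToolResult shape described in the next section; fetch_balance_for is a hypothetical data-access helper:

async def fetch_balance_for(customer_id: str) -> float:
    # Hypothetical stand-in for a real account lookup.
    return 1250.00


async def get_balance(context: ToolContext) -> ToolResult:
    # The same tool can serve any agent or customer; the context says which one.
    balance = await fetch_balance_for(context.customer_id)
    return ToolResult(
        data={"balance": balance},
        metadata={"source": "accounts-service", "session_id": context.session_id},
        control_options=None,  # no special control behavior requested
    )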

Tool Results

Tools return structured results:

class ToolResult:
    data: Any                        # The actual result data
    metadata: dict                   # Source attribution, etc.
    control_options: ControlOptions  # Session mode, lifespan

Control options enable tools to influence engine behavior:

  • session_mode: Switches the session to manual mode (human takeover)
  • response_lifespan: Specifies how long the result remains valid
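As a sketch of how a tool might use these options, here is a hypothetical escalation tool that switches the session to manual mode so a human can take over (the ControlOptions constructor and its values are illustrative):

async def escalate_to_human(context: ToolContext) -> ToolResult:
    # Hand the conversation to a human by switching the session mode.
    return ToolResult(
        data={"status": "escalated"},
        metadata={"reason": "customer asked for a human agent"},
        control_options=ControlOptions(session_mode="manual"),
    )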

Tool Insights

When tools cannot run, insights explain the reason:

class ToolInsights:
    evaluations: list[ToolEvaluation]  # What was decided for each tool
    missing_data: list[MissingData]    # What parameters weren't available
    invalid_data: list[InvalidData]    # What parameters were wrong

Why Insights Matter

The message composer uses insights to generate appropriate responses:

Without insights:
Customer: "Transfer $500 to my friend"
Agent: "I've transferred $500." (but we don't know who "friend" is!)

With insights:
Customer: "Transfer $500 to my friend"
Agent: "I'd be happy to help transfer $500. Who would you like to send it to?"

The insight informs the composer that "transfer_money could not run because recipient_name is missing." The composer uses this information to request the missing data from the customer.
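A small sketch of that hand-off, assuming each missing-data entry carries a tool name and the missing parameter (the field names here are illustrative):

def render_insight_notes(insights: ToolInsights) -> str:
    # Turn missing-parameter insights into guidance the message composer
    # can include in its prompt, so it asks for the data instead of guessing.
    notes = []
    for item in insights.missing_data:
        notes.append(
            f"The tool '{item.tool_name}' could not run because "
            f"'{item.parameter}' is missing; ask the customer for it."
        )
    return "\n".join(notes)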

Insight Flow

Insights produced during tool call inference are carried forward into message composition: the composer receives them alongside the tool results, so it can acknowledge what could not be done and ask for whatever is missing.

Consequential vs Non-Consequential Tools

Tools have different risk levels:

Non-Consequential Tools

Non-consequential tools are read-only operations with no side effects:

  • get_balance() - Returns account data
  • search_products() - Returns product data
  • check_order_status() - Returns order data

These tools can be called with less stringent validation because errors have no lasting impact on system state.

Consequential Tools

Consequential tools are operations that change state:

  • transfer_money() - Moves funds between accounts
  • cancel_order() - Performs an irreversible action
  • update_profile() - Modifies customer data

These tools require stricter validation:

  • Thorough parameter verification
  • Confirmation before execution (configurable)
  • Higher-criticality ARQ enforcement during message generation

Marking Tools

tool = await agent.create_tool(
    name="transfer_money",
    description="Transfer funds between accounts",
    consequential=True,  # Marks as consequential
    ...
)

The engine uses this to adjust validation depth and potentially require confirmation flows.
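A hypothetical policy lookup along those lines; the field names and criticality levels below are illustrative, not Parlant configuration:

from dataclasses import dataclass


@dataclass
class ToolSpec:
    name: str
    consequential: bool = False


def execution_policy(tool: ToolSpec) -> dict:
    # Consequential tools get deeper checks, an explicit confirmation step,
    # and higher-criticality ARQ enforcement during message generation.
    if tool.consequential:
        return {
            "parameter_verification": "thorough",
            "require_confirmation": True,
            "arq_criticality": "high",
        }
    return {
        "parameter_verification": "standard",
        "require_confirmation": False,
        "arq_criticality": "normal",
    }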

Tool Execution Within Iterations

Tools participate in the preparation loop:

Each iteration can trigger new tools based on what was learned in previous iterations.
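Conceptually, the loop looks something like the sketch below; the callables and the with_tool_events helper are stand-ins for the engine's actual components:

async def preparation_loop(context, match_guidelines, infer_tool_calls,
                           execute_tools, max_iterations: int = 3):
    # Each pass re-matches guidelines against a context that now includes the
    # latest tool results, so earlier results can trigger new tool calls.
    all_tool_events = []
    insights = None
    for _ in range(max_iterations):
        guidelines = await match_guidelines(context)
        tool_calls, insights = await infer_tool_calls(guidelines, context)
        if not tool_calls:
            break                                   # nothing left to run
        events = await execute_tools(tool_calls)
        all_tool_events.extend(events)
        context = context.with_tool_events(events)  # stand-in: fold results back in
    return all_tool_events, insights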

Why This Design?

Why Guided Over Open-Ended?

Open-ended tool calling (providing all tools to the LLM and allowing it to choose) creates several problems:

  • Context pollution from dozens of tool definitions
  • Unpredictable behavior regarding which tool will be selected
  • No connection to business rules explaining why a tool is being called

Guided calling provides significant advantages:

  • Focused context containing only relevant tools
  • Predictable behavior with tools tied to matched guidelines
  • Full auditability showing which guideline triggered each tool call

Why Allow Multiple Iterations?

Single-iteration tool calling fails to handle cascading requirements:

  1. Customer requests "Check my balance" → system calls get_balance()
  2. Balance is high → the "recommend investments" guideline matches
  3. That guideline requires get_investment_options()

Without iteration support, step 3 would never execute.

Tradeoffs

Choice                | Benefit                          | Cost
Guided calling        | Predictable, auditable behavior  | Less flexibility than open-ended approaches
Inference evaluation  | Skips unnecessary calls          | Requires additional LLM inference
Tool insights         | Better error responses           | Requires additional bookkeeping
Consequential marking | Safety for high-risk tools       | Requires developers to classify tools

What's Next