How to Define Tools for LLM Agents

Jun 01, 2026

Tool definitions are one of the main control surfaces for LLM agents. A prompt tells the model what to do, but tools tell it what actions are possible, what inputs are valid, and what output it can expect back.

For production agents, treat every tool as an API contract between the model and your system. If the contract is vague, the agent will guess. If the schema is loose, the agent will pass bad arguments. If one tool tries to do too much, the agent will call it at the wrong time.

A good tool definition helps the model answer three questions:

  • When should I use this tool?
  • What exact inputs do I need?
  • What will happen after I call it?

This matters whether you are building a simple support agent, a sales assistant, a coding agent, or a multi-step workflow with planning. The same design rules apply whether you use the OpenAI Agents SDK, Claude tool use, LangGraph, or your own agent runtime.

Start with the workflow, not the tool list

Teams often start by asking, “What tools should the agent have?” That is usually too early.

Start with the workflow:

  1. What user request should the agent handle?
  2. What decision points exist?
  3. What data does the agent need before it can act?
  4. Which actions are safe to perform automatically?
  5. Which actions require confirmation?
  6. What should happen when a tool fails?

For example, a support agent that creates tickets may follow this workflow:

  1. User reports an issue.
  2. Agent identifies the customer by email or account ID.
  3. Agent fetches customer details.
  4. Agent asks for missing required information.
  5. Agent creates a support ticket.
  6. Agent returns the ticket ID and expected response time.

That workflow suggests at least two tools: one to fetch the customer record and one to create the ticket. It also tells you which fields are required, when to ask follow-up questions, and where validation belongs.
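One way to make that mapping concrete is to write the workflow down as data before you define any schema. The following is a minimal Python sketch, not a prescribed method: the WorkflowStep class and the tools_needed helper are illustrative assumptions, and the steps are the six listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkflowStep:
    description: str
    # None means the agent talks to the user instead of calling a tool.
    tool: Optional[str] = None


# The six support-ticket steps from the workflow above.
SUPPORT_WORKFLOW = [
    WorkflowStep("User reports an issue"),
    WorkflowStep("Identify the customer by email or account ID"),
    WorkflowStep("Fetch customer details", tool="fetch_customer_record"),
    WorkflowStep("Ask for missing required information"),
    WorkflowStep("Create a support ticket", tool="create_support_ticket"),
    WorkflowStep("Return the ticket ID and expected response time"),
]


def tools_needed(workflow):
    """Return the distinct tools the workflow calls, in first-use order."""
    seen = []
    for step in workflow:
        if step.tool and step.tool not in seen:
            seen.append(step.tool)
    return seen


print(tools_needed(SUPPORT_WORKFLOW))
```

Steps without a tool are the places where the agent should talk to the user, which is a useful reminder that not every step needs a tool.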

If you define tools before clarifying the workflow, you usually end up with broad tools such as handle_customer_issue or update_system. Those names feel convenient, but they hide too many decisions inside one call.

Weak tool definition versus strong tool definition

Here is a weak tool definition for a customer support agent:

{
  "name": "get_info",
  "description": "Gets customer info when needed.",
  "parameters": {
    "type": "object",
    "properties": {
      "id": {
        "type": "string"
      }
    }
  }
}

This tool is weak because the model has to guess what “info” means, when “needed” applies, and what kind of ID is expected. Is it a customer ID, user ID, account ID, email address, ticket ID, or external CRM ID? The schema does not say.

Here is a stronger version:

{
  "name": "fetch_customer_record",
  "description": "Fetch a customer record from the CRM using either customer_id or email. Use this before creating a support ticket when the user's account status, plan, or support tier is unknown.",
  "parameters": {
    "type": "object",
    "additionalProperties": false,
    "properties": {
      "customer_id": {
        "type": "string",
        "description": "Internal CRM customer ID, for example cus_8f31a92."
      },
      "email": {
        "type": "string",
        "format": "email",
        "description": "Customer email address, for example alex@example.com."
      }
    },
    "oneOf": [
      { "required": ["customer_id"] },
      { "required": ["email"] }
    ]
  }
}

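This version gives the model a specific action, a clear use case, valid identifiers, and input constraints. It also blocks extra fields with additionalProperties: false, which reduces malformed calls. The oneOf clause requires exactly one of the two identifiers. Some providers reject oneOf at the top level of a tool schema or support only a subset of JSON Schema, so check your platform's documentation and, where needed, enforce the either-or rule in your runtime instead.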
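The either-or rule between customer_id and email can also be checked in your runtime, which is useful if your provider does not enforce every schema keyword. Here is a minimal sketch using only the Python standard library. The function name and the loose email pattern are assumptions for illustration, not part of any SDK.

```python
import re

# Loose email shape check for illustration only; not a full address validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ALLOWED_FIELDS = {"customer_id", "email"}


def validate_fetch_customer_args(args: dict) -> list:
    """Return a list of error messages; an empty list means the call is valid."""
    errors = []

    # Mirrors additionalProperties: false.
    extra = set(args) - ALLOWED_FIELDS
    if extra:
        errors.append(f"unexpected fields: {sorted(extra)}")

    # Mirrors the oneOf clause: exactly one identifier must be present.
    provided = [name for name in ("customer_id", "email") if name in args]
    if len(provided) != 1:
        errors.append("provide exactly one of customer_id or email")

    for name in provided:
        if not isinstance(args[name], str) or not args[name]:
            errors.append(f"{name} must be a non-empty string")

    email = args.get("email")
    if isinstance(email, str) and not EMAIL_PATTERN.match(email):
        errors.append("email is not a valid address")

    return errors


print(validate_fetch_customer_args({"email": "alex@example.com"}))
print(validate_fetch_customer_args({}))
```

Returning a list of messages rather than raising on the first problem lets you send every issue back to the model in one tool error, so it can correct the call in a single retry.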

Write tool names as specific actions

Use names that describe one action. Good names usually start with a verb:

  • fetch_customer_record
  • create_support_ticket
  • search_knowledge_base
  • refund_payment
  • send_password_reset_email

Avoid names that are too broad:

  • customer_tool
  • admin_action
  • do_task
  • update_data
  • handle_request

The name should help the model pick the right tool without reading a full manual. The description should then explain the conditions for use, required context, side effects, and limits.

Make descriptions operational

A tool description should not read like internal documentation. It should guide model behavior at call time.

Weak description:

"description": "Creates tickets."

Strong description:

"description": "Create a Zendesk support ticket after the user has described the issue and a customer record has been fetched. Do not use for billing refunds, account cancellation, or password reset requests."

The strong version tells the model:

  • What system the action affects.
  • What must happen before the tool call.
  • Which user requests are out of scope.
  • That the tool has a real side effect.

For tools that change state, make the side effect explicit. The agent should know that create_support_ticket writes to Zendesk, while draft_support_ticket only prepares content.
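Your runtime can carry the same distinction as metadata next to each definition. The sketch below is one possible shape, assuming a hypothetical ToolSpec class with a writes_state flag. Neither is part of any agent SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    # Assumption: True when the call changes an external system.
    writes_state: bool


TOOLS = {
    spec.name: spec
    for spec in [
        ToolSpec(
            name="draft_support_ticket",
            description="Prepare ticket content for review. Does not write to Zendesk.",
            writes_state=False,
        ),
        ToolSpec(
            name="create_support_ticket",
            description="Create a Zendesk support ticket. This writes a new ticket.",
            writes_state=True,
        ),
    ]
}


def state_changing_tools(tools):
    """Return the names of tools with side effects, sorted for stable output."""
    return sorted(name for name, spec in tools.items() if spec.writes_state)


print(state_changing_tools(TOOLS))
```

A flag like this gives you one place to hang policies such as confirmation prompts or stricter logging for tools that write.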

Use required parameters aggressively

Missing required parameters are a common cause of unreliable tool calls. If the backend cannot complete the action without a value, require it in the schema.

Here is a practical schema for creating a support ticket:

{
  "name": "create_support_ticket",
  "description": "Create a support ticket for an identified customer. Use only after fetch_customer_record has returned a valid customer_id. Ask the user for missing issue details before calling this tool.",
  "parameters": {
    "type": "object",
    "additionalProperties": false,
    "required": [
      "customer_id",
      "subject",
      "description",
      "priority",
      "source"
    ],
    "properties": {
      "customer_id": {
        "type": "string",
        "description": "Internal CRM customer ID returned by fetch_customer_record, for example cus_8f31a92."
      },
      "subject": {
        "type": "string",
        "minLength": 8,
        "maxLength": 120,
        "description": "Short ticket title written in plain language."
      },
      "description": {
        "type": "string",
        "minLength": 20,
        "maxLength": 4000,
        "description": "Customer issue summary, including relevant error messages, affected product area, and troubleshooting already attempted."
      },
      "priority": {
        "type": "string",
        "enum": ["low", "normal", "high", "urgent"],
        "description": "Urgency level based on customer impact. Use urgent only for security issues, outages, or blocked production workflows."
      },
      "source": {
        "type": "string",
        "enum": ["chat", "email", "api"],
        "description": "Channel where the customer request originated."
      }
    }
  }
}

This schema gives the agent enough structure to produce useful calls and gives your runtime enough information to reject invalid ones.
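To show what rejecting an invalid call can look like, here is a small hand-rolled checker in Python. It is a sketch that covers only the keywords used in the schema above (required, additionalProperties, string type, enum, minLength, maxLength). The validate_args name is an assumption, and a production system would more likely use a full JSON Schema validation library.

```python
# The parameters block from create_support_ticket, with descriptions omitted.
TICKET_PARAMETERS = {
    "type": "object",
    "additionalProperties": False,
    "required": ["customer_id", "subject", "description", "priority", "source"],
    "properties": {
        "customer_id": {"type": "string"},
        "subject": {"type": "string", "minLength": 8, "maxLength": 120},
        "description": {"type": "string", "minLength": 20, "maxLength": 4000},
        "priority": {"type": "string", "enum": ["low", "normal", "high", "urgent"]},
        "source": {"type": "string", "enum": ["chat", "email", "api"]},
    },
}


def validate_args(args: dict, schema: dict) -> list:
    """Check args against a small JSON Schema subset and return error messages."""
    errors = []
    properties = schema.get("properties", {})

    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")

    if schema.get("additionalProperties") is False:
        for name in args:
            if name not in properties:
                errors.append(f"unexpected field: {name}")

    for name, value in args.items():
        rules = properties.get(name)
        if rules is None:
            continue
        if rules.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{name} must be a string")
            continue
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name} must be one of {rules['enum']}")
        if isinstance(value, str):
            if "minLength" in rules and len(value) < rules["minLength"]:
                errors.append(f"{name} must be at least {rules['minLength']} characters")
            if "maxLength" in rules and len(value) > rules["maxLength"]:
                errors.append(f"{name} must be at most {rules['maxLength']} characters")

    return errors


bad_call = {"customer_id": "cus_8f31a92", "subject": "Bug", "priority": "critical"}
for message in validate_args(bad_call, TICKET_PARAMETERS):
    print(message)
```

Returning these messages to the model as the tool result gives it a concrete reason to ask the user for the missing details instead of retrying blindly.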

Avoid ambiguous field names

Ambiguous field names create bad calls even when the model understands the task. Use domain-specific names instead of generic names.

  • Use customer_id instead of id.
  • Use ticket_description instead of text.
  • Use billing_account_id instead of account.
  • Use requested_refund_amount_cents instead of amount.
  • Use user_timezone instead of timezone when multiple time zones may exist.

Good field names reduce prompt complexity. You should not need a long system prompt explaining what id means in five different contexts.

Keep tools narrow

Overloaded tools are hard for agents to use correctly. A tool such as manage_customer might fetch a record, update an address, cancel a subscription, and create a support ticket depending on its parameters. That design pushes routing logic into the model and makes failures harder to debug.

Prefer narrow tools:

  • fetch_customer_record
  • update_customer_billing_address
  • cancel_subscription
  • create_support_ticket

Narrow tools make traces easier to inspect. If a ticket was created incorrectly, you can inspect the exact call, inputs, response, prompt version, and model output that led to it.

This is especially important for plan-and-execute agents, where the model may first decide on a multi-step plan and then execute each step through tools. Narrow actions keep the plan readable and easier to test.
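On the runtime side, narrow tools map naturally to one handler per name. The following Python sketch assumes stubbed handlers and a hypothetical dispatch function. Your framework may provide its own registry.

```python
def fetch_customer_record(customer_id: str) -> dict:
    """Read-only lookup, stubbed here with fixed data."""
    return {"customer_id": customer_id, "account_status": "active"}


def cancel_subscription(customer_id: str, reason: str) -> dict:
    """State-changing action, stubbed here; a real handler would call the billing system."""
    return {"customer_id": customer_id, "subscription_status": "cancelled", "reason": reason}


# One tool name maps to exactly one handler with its own signature.
TOOL_REGISTRY = {
    "fetch_customer_record": fetch_customer_record,
    "cancel_subscription": cancel_subscription,
}


def dispatch(tool_name: str, arguments: dict) -> dict:
    """Route a tool call to its handler and turn routing problems into error results."""
    handler = TOOL_REGISTRY.get(tool_name)
    if handler is None:
        return {"error": f"unknown tool: {tool_name}"}
    try:
        return handler(**arguments)
    except TypeError as exc:
        # Raised when arguments are missing or unexpected for this handler.
        return {"error": f"invalid arguments for {tool_name}: {exc}"}


print(dispatch("fetch_customer_record", {"customer_id": "cus_8f31a92"}))
print(dispatch("manage_customer", {"action": "cancel"}))
```

Because the tool name alone identifies the action, a trace entry such as cancel_subscription tells you what ran without inspecting a mode or action parameter.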

Define validation outside the model

The model should receive a clear schema, but your application should still validate every tool call before execution. Do not trust the model to enforce your business rules.

Validation should check:

  • Required fields are present.
  • Enum fields contain only allowed values.
  • Strings meet length limits.
  • IDs match expected formats.
  • Dates use the expected format and time zone.
  • Amounts use integer cents rather than floating-point dollars.
  • The user has permission to perform the requested action.
  • The action is safe to run without confirmation.

For example, if an agent calls refund_payment with amount: 50.00, your validator should reject it if the API expects amount_cents: 5000. The tool schema should make that clear, and the backend should enforce it.
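A backend check for that refund case might look like the sketch below. The validate_refund_args function and the max_refund_cents limit are hypothetical. The limit stands in for whatever business rule your system applies.

```python
def validate_refund_args(args: dict, max_refund_cents: int = 50_000) -> list:
    """Return error messages for a refund_payment call; empty means valid."""
    errors = []

    # Catch the floating-point dollars mistake explicitly so the error is actionable.
    if "amount" in args:
        errors.append("use amount_cents (integer cents), not amount")

    cents = args.get("amount_cents")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(cents, int) or isinstance(cents, bool):
        errors.append("amount_cents must be an integer")
    elif cents <= 0:
        errors.append("amount_cents must be positive")
    elif cents > max_refund_cents:
        errors.append(f"amount_cents exceeds the limit of {max_refund_cents}")

    return errors


print(validate_refund_args({"amount": 50.00}))
print(validate_refund_args({"amount_cents": 5000}))
```

The first call is rejected with two messages and the second passes, so a floating-point dollar amount never reaches the payment system.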

Return structured tool outputs

Tool outputs need structure too. If your tool returns a blob of text, the model has to parse it, which makes it more likely to misread or drop a value the next step depends on.

Prefer a response like this:

{
  "customer_id": "cus_8f31a92",
  "email": "alex@example.com",
  "plan": "enterprise",
  "support_tier": "priority",
  "account_status": "active"
}

Avoid a response like this:

"Customer Alex is active and has enterprise priority support."

Structured outputs help the next tool call. If the agent needs customer_id to create a ticket, return customer_id as a field.
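As a rough illustration of that handoff, the Python sketch below serializes a typed record to JSON and reads customer_id back for the next call. The CustomerRecord class and the stubbed lookup are assumptions. A real tool would query the CRM.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CustomerRecord:
    customer_id: str
    email: str
    plan: str
    support_tier: str
    account_status: str


def fetch_customer_record_stub(email: str) -> CustomerRecord:
    """Stand-in for a CRM lookup; returns fixed example data."""
    return CustomerRecord(
        customer_id="cus_8f31a92",
        email=email,
        plan="enterprise",
        support_tier="priority",
        account_status="active",
    )


# Serialize the record so the model receives named fields, not prose.
tool_output = json.dumps(asdict(fetch_customer_record_stub("alex@example.com")))

# The next tool call can take customer_id directly from the structured output.
ticket_args = {
    "customer_id": json.loads(tool_output)["customer_id"],
    "subject": "Checkout returns 500 error",
}

print(tool_output)
print(ticket_args)
```

Defining the output as a typed record also documents the response contract in code, so a renamed or missing field shows up as an error in the tool rather than as a confused model.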

Design tool definitions for your agent type

Static agents and dynamic agents call for different kinds of tool design discipline.

With static agents, the workflow is usually fixed. You can define a smaller tool set and write stricter instructions because the path is known. A support triage agent may always search docs, fetch the customer record, then create or update a ticket.

With dynamic agents, the agent chooses a path at runtime. Tool descriptions need stronger boundaries because the model may decide between several valid actions. For example, a dynamic customer operations agent might choose between creating a ticket, issuing a credit, updating account metadata, or escalating to sales.

If your system uses compiler-style planning, an LLM compiler may break a goal into parallel or sequential tool calls. In that setup, tool schemas must be especially precise. The planner needs to know which outputs feed later inputs.
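To illustrate why a planner needs to know which outputs feed later inputs, the sketch below groups a hypothetical plan into batches that could run in parallel, using graphlib from the Python standard library (3.9 or later). The plan and its dependencies are assumptions for the example, not the output of any particular planner.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each step maps to the steps whose outputs it consumes.
PLAN = {
    "search_knowledge_base": set(),
    "fetch_customer_record": set(),
    "create_support_ticket": {"fetch_customer_record", "search_knowledge_base"},
}


def execution_batches(plan):
    """Group steps into batches; every step in a batch has all its inputs ready."""
    sorter = TopologicalSorter(plan)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


for index, batch in enumerate(execution_batches(PLAN), start=1):
    print(index, batch)
```

Here the two lookups land in the first batch and ticket creation in the second. That ordering is only derivable when each tool's schema states which fields it needs and which fields it returns.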

Common mistakes to avoid

Vague tool descriptions

Descriptions such as “gets data” or “handles request” force the model to infer too much. Describe when to use the tool, what it does, what it does not do, and whether it changes state.

Overloaded tools

One tool should not represent five unrelated actions. Split broad tools into smaller tools with clear names and typed parameters.

Missing required parameters

If your backend needs a value, mark it as required. Otherwise, the model may call the tool too early and your runtime will need to recover.

Ambiguous field names

Fields like id, type, data, and message are often too vague. Use names that match the business object and expected value.

No validation

A schema helps the model, but your application still needs runtime validation. Reject bad calls before they reach production systems.

No eval set

You need test cases that measure whether the agent chooses the right tool, passes correct arguments, and handles failures. Include happy paths, missing data, ambiguous requests, permission failures, and tool errors.

Defining tools before clarifying the workflow

Without a workflow, your tool list becomes a guess. Map the user journey first, then define the smallest set of tools needed to complete it.

Build an eval set for tool use

An eval set for tools should test more than final answers. It should inspect the tool call sequence and arguments.

For a support ticket agent, include cases like:

  • Known customer: User provides an email and a clear bug report. Expected behavior: fetch customer, then create ticket.
  • Missing issue details: User says “Your app is broken.” Expected behavior: ask a follow-up question before creating a ticket.
  • Ambiguous identifier: User provides an account name but no email or customer ID. Expected behavior: ask for a valid identifier or search through an approved lookup tool.
  • Out-of-scope request: User asks for a refund. Expected behavior: do not create a support ticket; hand off to the billing tool or escalation path if your workflow defines one.
  • Tool failure: CRM lookup times out. Expected behavior: explain the failure and retry or escalate based on your policy.

Track pass or fail at each step:

  • Did the agent choose the correct tool?
  • Did it avoid tools when it needed more information?
  • Did it pass valid arguments?
  • Did it respect required confirmation steps?
  • Did it handle tool errors correctly?

These checks are easier to run when you log prompts, tool calls, model outputs, and tool responses in one trace.
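A per-case check on tool sequence and arguments can be quite small. The Python sketch below assumes hypothetical ToolCall and EvalCase classes and a score_case helper. In practice you would load the actual calls from your traces.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    arguments: dict = field(default_factory=dict)


@dataclass
class EvalCase:
    label: str
    # Tool names in expected order; an empty list means "ask the user first".
    expected_tools: list


def score_case(case, actual_calls, validators=None):
    """Score one trace on tool sequence and argument validity.

    validators maps a tool name to a function that returns a list of errors.
    """
    validators = validators or {}
    actual_names = [call.name for call in actual_calls]
    arguments_ok = all(
        not validators[call.name](call.arguments)
        for call in actual_calls
        if call.name in validators
    )
    return {
        "label": case.label,
        "correct_sequence": actual_names == case.expected_tools,
        "valid_arguments": arguments_ok,
    }


case = EvalCase("Missing issue details", expected_tools=[])
premature = [ToolCall("create_support_ticket", {"subject": "App is broken"})]
print(score_case(case, premature))
```

The example flags a ticket created before the agent asked a follow-up question, which a check on the final answer alone could miss.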

Suggested screenshots to include in your docs

If you are documenting tool design for your team, include screenshots that show the full execution path. Useful screenshots include:

  • A trace showing the user message, selected tool, tool arguments, tool response, and final answer.
  • A failed tool call where validation rejected a missing required field.
  • An eval result table comparing expected tool calls against actual tool calls.
  • A prompt version diff where a tool description changed.
  • A schema editor or JSON schema view for a production tool.
  • A latency breakdown showing which tool calls added the most time.

These screenshots make tool behavior concrete for engineers, product managers, and reviewers. They also help new team members understand how the agent behaves outside a demo.

A practical checklist for defining agent tools

  • Define the workflow before the tools.
  • Give each tool one clear action.
  • Use specific names such as fetch_customer_record.
  • Write descriptions that explain when to use the tool and when not to use it.
  • Mark backend-required fields as required in the schema.
  • Use enums for bounded choices such as priority, channel, status, and category.
  • Use clear field names that include the business object.
  • Set additionalProperties: false when possible.
  • Validate every tool call before execution.
  • Return structured outputs from tools.
  • Add evals for tool selection, argument quality, missing information, and failure handling.
  • Log tool calls and responses so you can debug production behavior.

Good tool definitions make agents easier to ship

Reliable agents come from clear contracts. The model needs to know which actions exist, when to call them, and which arguments are valid. Your runtime needs to validate those calls, execute them safely, and record what happened.

When tool definitions are specific, narrow, and tested, your agent becomes easier to debug and safer to change. You can update a schema, run an eval set, inspect traces, and see whether the behavior improved before shipping.


PromptLayer helps AI teams manage prompts, tool calls, traces, datasets, and evals for production LLM applications. If you are defining tools for agents and want a cleaner way to test and monitor them, create a PromptLayer account.
