glean-agent-toolkit-builder
Glean Agent Toolkit — Tool Builder Guide
Tool Anatomy
Every tool in the toolkit is a Python function decorated with @tool_spec. The decorator:
- Extracts an input JSON schema from the function signature (via Pydantic)
- Creates a ToolSpec dataclass wrapping the function
- Registers the spec in the global Registry singleton
- Attaches .as_openai_tool(), .as_langchain_tool(), .as_crewai_tool(), .as_adk_tool() convenience methods
Minimal tool
from glean.agent_toolkit.decorators import tool_spec
from glean.agent_toolkit.context import GleanContext


@tool_spec(
    name="my_tool",
    description="Does something useful.",
)
def my_tool(
    ctx: GleanContext | None = None,
    *,
    query: str,
) -> dict:
    # ctx is injected by adapters; LLM frameworks never see it
    ctx = ctx or GleanContext()  # fall back to a default context when none is injected
    client = ctx.get_client()
    # ... call Glean API ...
    return {"answer": "result"}
Rules:
- First parameter must be ctx: GleanContext | None = None
- Use the * separator: all remaining params are keyword-only (visible to the LLM)
- name in the decorator is the tool name exposed to LLMs
- description is the tool description exposed to LLMs
Input Schema with Annotated and Field
Use typing.Annotated with pydantic.Field to add descriptions, examples, and constraints. The decorator uses Pydantic's create_model to generate JSON Schema from these annotations.
from typing import Annotated, Any

from pydantic import Field

from glean.agent_toolkit.decorators import tool_spec
from glean.agent_toolkit.context import GleanContext
from glean.agent_toolkit.tools._common import ToolResult


@tool_spec(
    name="glean_example_search",
    description="Search for examples in the company knowledge base.",
)
def example_search(
    ctx: GleanContext | None = None,
    *,
    query: Annotated[
        str,
        Field(
            description="Search query with optional filters.",
            examples=["API documentation", "security policy updated:past_week"],
        ),
    ],
    datasources: Annotated[
        list[str] | None,
        Field(description="Restrict to specific datasources."),
    ] = None,
    page_size: Annotated[
        int,
        Field(description="Number of results to return.", ge=1, le=100),
    ] = 10,
) -> ToolResult:
    ...
Parameters with defaults become optional in the schema. Parameters without defaults are required. The GleanContext parameter is automatically excluded from the schema.
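The required/optional rule can be illustrated by inspecting a signature directly. The toolkit itself builds the schema with Pydantic's create_model; the sketch below is a stdlib-only approximation. The helper name sketch_input_schema is hypothetical, and excluding ctx by keeping only keyword-only parameters is an assumption about one way to achieve the exclusion.

```python
from __future__ import annotations

import inspect
from typing import Any, Callable


def sketch_input_schema(fn: Callable[..., Any]) -> dict[str, Any]:
    """List keyword-only parameters and mark those without defaults as required."""
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        if param.kind is not inspect.Parameter.KEYWORD_ONLY:
            continue  # skips ctx, which sits before the * separator
        properties[name] = {"annotation": str(param.annotation)}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default, so the caller must supply it
        else:
            properties[name]["default"] = param.default
    return {"type": "object", "properties": properties, "required": required}


def example_search(ctx: object | None = None, *, query: str, page_size: int = 10) -> dict:
    return {}


schema = sketch_input_schema(example_search)
```

Here schema["required"] contains only query, page_size carries its default of 10, and ctx does not appear at all.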
Output Model
Use the output_model parameter to attach a Pydantic model to the tool spec. This generates an output JSON schema and provides structured typing.
from typing import Any

from pydantic import BaseModel

from glean.agent_toolkit.context import GleanContext
from glean.agent_toolkit.decorators import tool_spec
from glean.agent_toolkit.tools._common import ToolResult


class ChatResult(BaseModel):
    answer: str
    sources: list[dict[str, Any]]


@tool_spec(
    name="my_chat_tool",
    description="Chat with an AI assistant.",
    output_model=ChatResult,
)
def my_chat(ctx: GleanContext | None = None, *, message: str) -> ToolResult:
    ...
Two Implementation Patterns
Pattern 1: Stub tools via run_tool() (for Glean's tools.run endpoint)
Most built-in tools use this pattern. They map parameters to ToolsCallParameter objects and delegate to run_tool(), which calls the Glean tools.run (or tools.execute) API endpoint.
Reference: search.py
from glean.agent_toolkit.decorators import tool_spec
from glean.agent_toolkit.tools._common import (
    ToolResult,
    convert_to_tool_params,
    run_tool,
)
from glean.agent_toolkit.context import GleanContext


@tool_spec(
    name="glean_my_search",
    description="Search for something specific.",
)
def my_search(
    ctx: GleanContext | None = None,
    *,
    query: str,
    page_size: int = 10,
) -> ToolResult:
    ctx = ctx or GleanContext()
    client = ctx.get_client()
    # convert_to_tool_params wraps each value in a ToolsCallParameter
    parameters = convert_to_tool_params(query=query, pageSize=str(page_size))
    # run_tool calls the Glean tools.run API and wraps the result in ToolResult
    return run_tool("My Search Display Name", parameters, client=client)
Key functions from _common.py:
- convert_to_tool_params(**kwargs): wraps values into models.ToolsCallParameter(name=key, value=value) dicts
- run_tool(tool_display_name, parameters, *, client=None): calls Glean's tools.run / tools.execute endpoint, returns ToolResult
- arun_tool(...): async version of run_tool
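To make the wrapping step concrete, here is a rough sketch of what a helper like convert_to_tool_params might do. SketchToolsCallParameter is a hypothetical stand-in for models.ToolsCallParameter. Returning a dict keyed by parameter name, stringifying values, and dropping None are all assumptions rather than documented behavior.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SketchToolsCallParameter:
    """Hypothetical stand-in for models.ToolsCallParameter."""

    name: str
    value: str


def sketch_convert_to_tool_params(**kwargs: object) -> dict[str, SketchToolsCallParameter]:
    """Wrap each non-None keyword argument in a parameter object keyed by its name."""
    return {
        key: SketchToolsCallParameter(name=key, value=str(value))
        for key, value in kwargs.items()
        if value is not None  # assumption: unset optional arguments are omitted
    }


params = sketch_convert_to_tool_params(query="onboarding", pageSize=10, datasource=None)
```

The keyword names are passed in the casing the endpoint expects (pageSize rather than page_size), matching the call in the pattern above.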
Pattern 2: Direct API calls (for non-tools.run endpoints)
Some tools call Glean API endpoints directly instead of going through tools.run. This is used when the Glean API has a dedicated endpoint (like client.chat.create()).
Reference: chat.py
from glean.agent_toolkit.decorators import tool_spec
from glean.agent_toolkit.tools._common import (
    ToolResult,
    run_with_error_handling,
    serialize_tool_result,
)
from glean.agent_toolkit.context import GleanContext


# ChatResult is the Pydantic model defined in the Output Model section above.
@tool_spec(
    name="glean_my_chat",
    description="Chat with Glean Assistant.",
    output_model=ChatResult,
)
def my_chat(
    ctx: GleanContext | None = None,
    *,
    message: str,
) -> ToolResult:
    ctx = ctx or GleanContext()
    client = ctx.get_client()

    def _do_chat() -> dict:
        with client as g_client:
            response = g_client.client.chat.create(
                messages=[{"fragments": [{"text": message}]}],
            )
            # Process response...
            return {"answer": "...", "sources": []}

    # run_with_error_handling calls the function and wraps in ToolResult
    return run_with_error_handling(_do_chat)
Key functions:
- run_with_error_handling(fn, *args, **kwargs): calls fn, wraps success in make_ok(), catches exceptions and wraps them in make_error() with classification
- serialize_tool_result(value): calls .model_dump(by_alias=True) on Pydantic models, passes through plain values
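The pass-through behavior of serialize_tool_result can be sketched without Pydantic installed. The version below detects models by duck typing on model_dump, which is an assumption (the real helper may check the type instead). FakeModel is a hypothetical class that only imitates the model_dump(by_alias=...) call.

```python
from __future__ import annotations

from typing import Any


def sketch_serialize_tool_result(value: Any) -> Any:
    """Dump objects that expose model_dump; return everything else unchanged."""
    model_dump = getattr(value, "model_dump", None)
    if callable(model_dump):
        return model_dump(by_alias=True)
    return value


class FakeModel:
    """Hypothetical stand-in imitating a Pydantic model with a field alias."""

    def __init__(self, page_size: int) -> None:
        self.page_size = page_size

    def model_dump(self, by_alias: bool = False) -> dict[str, int]:
        key = "pageSize" if by_alias else "page_size"
        return {key: self.page_size}
```

Dumping by alias matters when response models use API-style field names: the serialized dict then matches the wire format rather than the Python attribute names.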
Error Handling Helpers
All helpers are in glean.agent_toolkit.tools._common:
from glean.agent_toolkit.tools._common import (
    make_ok,
    make_error,
    run_with_error_handling,
    _classify_error,
    ToolResult,
    ErrorType,
    SuggestedAction,
)
make_ok(result) — Build a success ToolResult
return make_ok({"documents": [...]})
# → {"status": "ok", "result": {...}, "error": None, "error_type": None, "suggested_action": None}
make_error(message, error_type, suggested_action) — Build an error ToolResult
return make_error("Token expired", error_type="auth", suggested_action="check_credentials")
_classify_error(exc) — Classify an exception
Returns (error_type, suggested_action) tuple. Classification logic:
- TimeoutError or "timeout" in message → ("timeout", "retry")
- ValueError → ("validation", "rephrase_query")
- 401/403 in message → ("auth", "check_credentials")
- 404 in message → ("not_found", "rephrase_query")
- 429 in message → ("rate_limit", "retry")
- OSError or fallback → ("api", "retry")
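The rules above read naturally as an ordered chain of checks. The following is a sketch of that ordering, not the toolkit's source: sketch_classify_error is a hypothetical name, and lowercasing the message and matching status codes as plain substrings are assumptions.

```python
from __future__ import annotations


def sketch_classify_error(exc: Exception) -> tuple[str, str]:
    """Return (error_type, suggested_action) by applying the rules in listed order."""
    message = str(exc).lower()
    # TimeoutError is a subclass of OSError, so it must be tested before the fallback.
    if isinstance(exc, TimeoutError) or "timeout" in message:
        return ("timeout", "retry")
    if isinstance(exc, ValueError):
        return ("validation", "rephrase_query")
    if "401" in message or "403" in message:
        return ("auth", "check_credentials")
    if "404" in message:
        return ("not_found", "rephrase_query")
    if "429" in message:
        return ("rate_limit", "retry")
    return ("api", "retry")  # OSError and anything unrecognized
```

Order matters here: a ValueError whose message happens to contain "401" is still reported as a validation problem, because the type check runs before the status-code checks.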
run_with_error_handling(fn, *args, **kwargs) — Call and wrap
Calls fn(*args, **kwargs). On success returns make_ok(result). On exception, classifies and returns make_error(...).
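The call-and-wrap flow can be sketched in a few lines. All sketch_ names are hypothetical. The success shape follows the make_ok output shown earlier, while the "status": "error" shape is an assumption, and the classification is reduced to a single ValueError rule to keep the example short.

```python
from __future__ import annotations

from typing import Any, Callable


def sketch_make_ok(result: Any) -> dict[str, Any]:
    return {"status": "ok", "result": result, "error": None,
            "error_type": None, "suggested_action": None}


def sketch_make_error(message: str, error_type: str, suggested_action: str) -> dict[str, Any]:
    # Assumed error shape: mirrors the success dict with status set to "error".
    return {"status": "error", "result": None, "error": message,
            "error_type": error_type, "suggested_action": suggested_action}


def sketch_run_with_error_handling(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> dict[str, Any]:
    try:
        return sketch_make_ok(fn(*args, **kwargs))
    except Exception as exc:
        # Simplified classification: only the ValueError rule is modeled here.
        if isinstance(exc, ValueError):
            return sketch_make_error(str(exc), "validation", "rephrase_query")
        return sketch_make_error(str(exc), "api", "retry")


def parse_page_size(raw: str) -> int:
    return int(raw)  # raises ValueError for non-numeric input


ok = sketch_run_with_error_handling(parse_page_size, "10")
failed = sketch_run_with_error_handling(parse_page_size, "ten")
```

The caller never sees an exception. Both outcomes come back as dicts with the same keys, so an LLM framework can branch on status and suggested_action.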
Registration
After creating your tool module, add it to src/glean/agent_toolkit/tools/__init__.py:
- Add the module name to _tool_modules:
_tool_modules: list[str] = [
    "search",
    "web_search",
    # ... existing modules ...
    "my_new_tool",  # your new module
]
- Add the explicit import and export:
from .my_new_tool import my_new_tool  # noqa: E402

__all__: list[str] = [
    # ... existing exports ...
    "my_new_tool",
]
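The @tool_spec decorator calls get_registry().register(spec) automatically when the module is imported, so listing the module in _tool_modules is sufficient for registration. The explicit import and __all__ entry additionally make the tool importable from glean.agent_toolkit.tools.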
Testing
Use GleanContext dependency injection for testing. Pass a mock or fake Glean client directly — do not use unittest.mock.patch.
from unittest.mock import MagicMock

from glean.agent_toolkit.context import GleanContext


def test_my_tool():
    # Create a mock Glean client
    mock_client = MagicMock()
    mock_client.__enter__ = MagicMock(return_value=mock_client)
    mock_client.__exit__ = MagicMock(return_value=False)
    # Inject via GleanContext
    ctx = GleanContext(client=mock_client)
    # Call the tool with injected context
    result = my_tool(ctx, query="test")
    # Assumes my_tool returns a ToolResult (for example via make_ok or run_with_error_handling)
    assert result["status"] == "ok"
For tools using run_tool(), the client is passed through:
def test_search_tool():
    mock_client = MagicMock()
    mock_client.__enter__ = MagicMock(return_value=mock_client)
    mock_client.__exit__ = MagicMock(return_value=False)
    # Configure the mock to return expected data
    mock_run = MagicMock(return_value={"results": []})
    mock_client.client.tools.run = mock_run
    ctx = GleanContext(client=mock_client)
    result = search(ctx, query="test query")
    assert result["status"] == "ok"
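A hand-written fake is an alternative to MagicMock when you want the test to record exactly what the tool asked for. The sketch below shows the injection pattern in isolation. FakeGleanClient, its search method, SketchContext, and sketch_tool are all hypothetical and do not mirror the real Glean client surface.

```python
from __future__ import annotations

from typing import Any


class FakeGleanClient:
    """Hand-written fake: records calls and supports `with client as g_client`."""

    def __init__(self, response: dict[str, Any]) -> None:
        self.response = response
        self.calls: list[str] = []

    def __enter__(self) -> "FakeGleanClient":
        return self

    def __exit__(self, *exc_info: object) -> bool:
        return False  # never swallow exceptions raised inside the with block

    def search(self, query: str) -> dict[str, Any]:
        self.calls.append(query)
        return self.response


class SketchContext:
    """Hypothetical stand-in for GleanContext(client=...)."""

    def __init__(self, client: FakeGleanClient) -> None:
        self._client = client

    def get_client(self) -> FakeGleanClient:
        return self._client


def sketch_tool(ctx: SketchContext, *, query: str) -> dict[str, Any]:
    with ctx.get_client() as g_client:
        return {"status": "ok", "result": g_client.search(query)}


fake = FakeGleanClient(response={"results": []})
result = sketch_tool(SketchContext(fake), query="test query")
```

After the call, fake.calls holds the queries the tool issued, so the test can assert on the request as well as on the returned status.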
Complete Example: Custom Tool from Scratch
Here is a full example of a custom tool that searches Glean for teams and their members and returns structured results.
1. Create src/glean/agent_toolkit/tools/team_search.py
"""Search for teams and their members."""

from __future__ import annotations

from typing import TYPE_CHECKING, Annotated, Any

from pydantic import BaseModel, Field

from glean.agent_toolkit.decorators import tool_spec
from glean.agent_toolkit.tools._common import (
    ToolResult,
    run_with_error_handling,
    serialize_tool_result,
)

if TYPE_CHECKING:
    from glean.agent_toolkit.context import GleanContext


class TeamSearchResult(BaseModel):
    """Structured result from team search."""

    teams: list[dict[str, Any]]
    total_count: int


@tool_spec(
    name="glean_team_search",
    description=(
        "Search for teams and their members within the organization. "
        "Returns team name, members, and department information."
    ),
    output_model=TeamSearchResult,
)
def team_search(
    ctx: GleanContext | None = None,
    *,
    query: Annotated[
        str,
        Field(
            description="Team name, department, or member name to search for.",
            examples=["backend engineering", "product design", "Jane Smith's team"],
        ),
    ],
    include_members: Annotated[
        bool,
        Field(description="Whether to include team member details."),
    ] = True,
) -> ToolResult:
    """Search for teams matching the query.

    Args:
        ctx: Optional Glean context for client injection.
        query: The team search query.
        include_members: Whether to include member details in results.
    """
    from glean.agent_toolkit.context import GleanContext

    ctx = ctx or GleanContext()
    client = ctx.get_client()

    def _do_search() -> dict[str, Any]:
        # Note: include_members is accepted but not applied in this example;
        # filter member details out of the results here if needed.
        with client as g_client:
            response = g_client.client.search.query(query=query)
            results = serialize_tool_result(response)
            return TeamSearchResult(
                teams=results.get("results", []),
                total_count=len(results.get("results", [])),
            ).model_dump()

    return run_with_error_handling(_do_search)
2. Register in tools/__init__.py
Add "team_search" to _tool_modules, then add the import and export:
from .team_search import team_search # noqa: E402
# Add "team_search" to __all__
3. Use it
from glean.agent_toolkit.tools import team_search
# Direct call
result = team_search(query="backend engineering")
# As an OpenAI tool
openai_tool = team_search.as_openai_tool()
# Via get_tools (automatically included after registration)
from glean.agent_toolkit import get_tools
all_tools = get_tools("langchain") # includes team_search