# LLM Implementation Guidelines
## Directory Structure
LLM-related code is organized in specific directories:
- `apps/web/utils/ai/` - Main LLM implementations
- `apps/web/utils/llms/` - Core LLM utilities and configurations
- `apps/web/__tests__/` - LLM-specific tests
## Key Files
- `utils/llms/index.ts` - Core LLM functionality
- `utils/llms/model.ts` - Model definitions and configurations
- `utils/usage.ts` - Usage tracking and monitoring
## Implementation Pattern
Follow this standard structure for LLM-related functions:
```typescript
import { z } from "zod";
import { createScopedLogger } from "@/utils/logger";
import { createGenerateObject } from "@/utils/llms";
import { getModel } from "@/utils/llms/model";
import type { EmailAccountWithAI } from "@/utils/llms/types";

const logger = createScopedLogger("feature-name");

export async function featureFunction(options: {
  inputData: InputType;
  emailAccount: EmailAccountWithAI;
}) {
  const { inputData, emailAccount } = options;

  if (!inputData /* [other validation conditions] */) {
    logger.warn("Invalid input for feature function");
    return null;
  }

  const system = `[Detailed system prompt that defines the LLM's role and task]`;

  const prompt = `[User prompt with context and specific instructions]

<data>
...
</data>

${emailAccount.about ? `<user_info>${emailAccount.about}</user_info>` : ""}`;

  const modelOptions = getModel(emailAccount.user);

  const generateObject = createGenerateObject({
    userEmail: emailAccount.email,
    label: "Feature Name",
    modelOptions,
  });

  const result = await generateObject({
    ...modelOptions,
    system,
    prompt,
    schema: z.object({
      field1: z.string(),
      field2: z.number(),
      nested: z.object({
        subfield: z.string(),
      }),
      array_field: z.array(z.string()),
    }),
  });

  return result.object;
}
```
## Best Practices
- **System and User Prompts:**
  - Keep system prompts and user prompts separate
  - The system prompt should define the LLM's role and task specifications
  - The user prompt should contain the actual data and context
- **Schema Validation:**
  - Always define a Zod schema for response validation
  - Make schemas as specific as possible to guide the LLM's output
- **Logging:**
  - Use descriptive scoped loggers for each feature
  - Log inputs and outputs at appropriate log levels
  - Include relevant context in log messages
- **Error Handling:**
  - Implement early returns for invalid inputs
  - Use proper error types and logging
  - Implement fallbacks for AI failures
  - Add retry logic for transient failures using `withRetry`
- **Input Formatting:**
  - Use XML-like tags to structure data in prompts
  - Remove excessive whitespace and truncate long inputs
  - Format data consistently across similar functions
- **Type Safety:**
  - Use TypeScript types for all parameters and return values
  - Define clear interfaces for complex input/output structures
- **Code Organization:**
  - Keep related AI functions in the same file or directory
  - Extract common patterns into utility functions
  - Document complex AI logic with clear comments
- **AI-First Behavior:**
  - Prefer generic prompt instructions, structured outputs, and model choice over brittle lexical heuristics that imitate model reasoning
  - Only add deterministic filters when the product truly needs a hard rule outside the model
  - Do not add prompt examples that closely mirror eval fixtures just to make a test pass
- **Draft Attribution Versioning:**
  - When changing draft-generation prompt inputs, retrieval context, routing, or post-processing, bump `DRAFT_PIPELINE_VERSION` in `apps/web/utils/ai/reply/draft-attribution.ts`
  - Treat that version as the analytics attribution marker for reply-draft quality comparisons
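The retry guidance above can be sketched as a small generic helper. This is an illustrative sketch only: the project's actual `withRetry` utility may have a different signature and defaults, and the option names here are assumptions.

```typescript
// Hypothetical sketch of a withRetry helper for transient AI failures.
// Option names and defaults are assumptions, not the project's actual API.
export async function withRetry<T>(
  fn: () => Promise<T>,
  options: { retries?: number; baseDelayMs?: number } = {},
): Promise<T> {
  const { retries = 3, baseDelayMs = 500 } = options;
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === retries) break;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A caller would wrap the LLM call itself, e.g. `withRetry(() => generateObject({ ... }))`, so validation errors and transient provider failures both get a bounded number of attempts.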
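The input-formatting bullets (XML-like tags, whitespace cleanup, truncation) can be illustrated with a small helper. This `formatForPrompt` function is hypothetical, not a utility from the codebase; the tag-wrapping convention matches the `<data>` / `<user_info>` style used in the implementation pattern above.

```typescript
// Hypothetical helper illustrating the input-formatting guidance:
// collapse excess whitespace, truncate long inputs, and wrap data
// in an XML-like tag for the prompt.
export function formatForPrompt(
  tag: string,
  content: string,
  maxLength = 2000,
): string {
  // Collapse runs of whitespace so the prompt stays compact.
  const cleaned = content.replace(/\s+/g, " ").trim();
  // Truncate long inputs to keep token usage bounded.
  const truncated =
    cleaned.length > maxLength ? `${cleaned.slice(0, maxLength)}...` : cleaned;
  return `<${tag}>${truncated}</${tag}>`;
}
```

Centralizing this in one utility keeps formatting consistent across similar AI functions, as the guideline recommends.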
## Testing

See `llm-test.mdc`.