Chapter 12: Background Jobs with Trigger.dev (Long-running LLM Tasks)
Theoretical Foundations
In the architecture of modern web applications, particularly those leveraging sophisticated AI capabilities, a fundamental tension exists between user experience and computational load. The standard request-response cycle is synchronous and blocking by design: a user initiates an action (e.g., clicking "Generate Summary"), the server receives the request, performs the necessary work, and only then sends a response back. If the work is trivial—like fetching a user's profile from a database—this cycle completes in milliseconds. However, when the work involves a long-running LLM task, such as performing a complex data transformation on a large document or executing a multi-step inference chain, this synchronous model breaks down. The user is left staring at a loading spinner, the browser connection may time out, and the server thread is occupied, unable to handle other incoming requests.
This is the problem that background job orchestration solves. It is the practice of taking a long-running, computationally expensive task and offloading it from the main application thread to a separate, asynchronous execution environment. Instead of performing the work during the request, the application simply registers the job to be done and immediately returns a response to the user, often with a token or ID to track the job's status. The heavy lifting happens "in the background," out of the user's direct request path.
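The shape of the pattern can be sketched with a toy in-memory queue, with no real orchestrator involved (all names here, such as `triggerJob` and `runWorker`, are illustrative, not part of any SDK): the trigger call records the job and returns a tracking ID immediately, while a separate worker loop performs the actual work later.

```typescript
type JobStatus = "queued" | "completed" | "failed";
interface JobRecord { id: string; status: JobStatus; result?: string }

const queue: { id: string; payload: string }[] = [];
const records = new Map<string, JobRecord>();
let nextId = 1;

// The "request path": register the work and return immediately with a tracking ID.
function triggerJob(payload: string): string {
  const id = `job_${nextId++}`;
  records.set(id, { id, status: "queued" });
  queue.push({ id, payload });
  return id; // the caller is never blocked by the actual work
}

// The "background": drains the queue independently of the request path.
function runWorker(): void {
  while (queue.length > 0) {
    const job = queue.shift()!;
    const record = records.get(job.id)!;
    try {
      record.result = `summary of: ${job.payload}`; // stand-in for the LLM call
      record.status = "completed";
    } catch {
      record.status = "failed";
    }
  }
}

const id = triggerJob("quarterly report");
console.log(records.get(id)!.status); // "queued": the request has already returned
runWorker();
console.log(records.get(id)!.status); // "completed"
```

A real orchestrator replaces the in-memory queue with durable storage and runs the worker on separate infrastructure, but the contract is the same: trigger returns an ID, the result arrives later.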
Trigger.dev is a specialized service that provides the infrastructure for this pattern. It acts as a durable execution engine for background jobs, particularly well-suited for the unpredictable and often lengthy nature of LLM interactions. It is not merely a task queue; it is a platform that manages the entire lifecycle of a job, from its initial trigger, through its execution (which might span hours or even days), to its final state, including sophisticated error handling and retry logic.
The Analogy: The Restaurant Kitchen vs. The Waitstaff
To understand the role of Trigger.dev and background jobs, consider a high-end restaurant.
- The Main Application Thread (The Waitstaff): The waitstaff are the front line of the customer experience. They take orders, answer questions about the menu, and bring food to the table. Their primary goal is to be responsive and attentive. If a waitstaff member were to personally cook a complex, 20-minute dish for every customer who ordered it, they would be tied up in the kitchen, unable to take new orders or attend to other tables. The entire restaurant's service would grind to a halt. This is analogous to a web server performing a long-running LLM task within the main request-response cycle.
- The Background Job (The Kitchen): The kitchen is a separate, specialized environment with its own staff (chefs) and equipment (ovens, grills). When a waiter takes an order for a complex dish, they don't cook it themselves. They pass the order ticket to the kitchen. The kitchen then takes ownership of the task, managing its preparation, cooking time, and potential issues (e.g., a burnt dish requires remaking). The waiter is free to take the next table's order immediately. The kitchen can even handle multiple complex orders in parallel, managed by a head chef (the job orchestrator) who prioritizes and tracks each ticket.
- Trigger.dev (The Head Chef & Order Management System): Trigger.dev is the head chef and the sophisticated ticketing system combined. It doesn't cook the food itself, but it orchestrates the entire kitchen.
- Job Definition: It knows every recipe (the code for the background job).
- Triggering: It receives the order ticket (the job request from the application).
- Execution Management: It assigns the job to a chef (a serverless compute environment), tracks its progress, and knows if it's "in the oven" (running), "plated" (completed), or "burnt" (failed).
- Resilience: If a chef drops a plate (an execution fails due to a transient network error), the head chef knows the recipe and can immediately instruct another chef to remake the dish (automatic retry logic).
- Notification: Once the dish is ready, the kitchen alerts the waiter (the system sends a webhook or updates a database), who can then inform the customer (the user).
This decoupling is the "why" behind background jobs. It preserves the responsiveness of the waitstaff (the user-facing application) while enabling the kitchen (the backend) to perform complex, time-consuming work without blocking the entire system.
The Asynchronous Workflow in Practice
Let's break down the process of a long-running LLM task using a concrete example: generating a detailed market analysis report from a 100-page document. This involves multiple steps: document ingestion, text chunking, embedding generation, vector database retrieval, and finally, context-augmented synthesis.
In a naive, synchronous architecture, this entire chain would execute within a single HTTP request. The user would wait for minutes, and the server would be under constant load.
With a background job orchestrator like Trigger.dev, the workflow is transformed:
1. Triggering the Job: The user's frontend application makes a simple API call to the Backend for Frontend (B4F) layer. The B4F endpoint's only responsibility is to validate the request and initiate the background job. It immediately returns a `202 Accepted` response, perhaps with a `jobId`. The user sees a "Processing..." status in the UI.
2. Job Execution: Trigger.dev receives the job definition. It spins up a secure, isolated execution environment and runs the job's code. This is where the multi-step process occurs, completely detached from the user's browser.
3. State Management & Checkpointing: A key feature of robust background job systems is the ability to manage state. For a multi-step LLM task, the job might need to store intermediate results (e.g., the embedded chunks of the document). Trigger.dev allows for durable state storage, meaning if an execution is interrupted, it can potentially resume from the last known good state rather than starting over.
4. Error Handling & Retry Logic: LLM APIs are not infallible. They can experience rate limits, temporary network issues, or internal server errors. A naive implementation might simply fail the entire job on the first error. Trigger.dev provides declarative retry logic. A job can be configured with a policy like "retry on failure up to 3 times with an exponential backoff delay." This makes the system resilient to transient failures, which are common when integrating with external AI services.
5. Completion & Notification: Once the final LLM call completes and the report is generated, the job finishes. Trigger.dev can then trigger a callback, such as calling a webhook on the B4F service, updating a record in a database (e.g., setting a `status` field from `processing` to `completed` and storing the `result`), or even pushing a real-time update to the client via WebSockets.
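The exponential-backoff retry policy described above can be sketched as a small helper. To keep the example deterministic, it records the backoff schedule instead of actually sleeping; `retryWithBackoff` is an illustrative name, not a Trigger.dev API.

```typescript
// Retry a function up to maxAttempts times, doubling the delay each attempt.
function retryWithBackoff<T>(
  fn: () => T,
  maxAttempts: number,
  baseDelayMs: number
): { value: T; attempts: number; delaysMs: number[] } {
  const delaysMs: number[] = [];
  for (let attempt = 1; ; attempt++) {
    try {
      return { value: fn(), attempts: attempt, delaysMs };
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // exhausted: propagate the failure
      delaysMs.push(baseDelayMs * 2 ** (attempt - 1)); // exponential schedule
      // a real implementation would sleep for that delay before the next attempt
    }
  }
}

// A stand-in LLM call that fails twice with a rate-limit error, then succeeds.
let calls = 0;
const flaky = () => {
  calls++;
  if (calls < 3) throw new Error("rate limited");
  return "ok";
};

const outcome = retryWithBackoff(flaky, 3, 1000);
console.log(outcome); // succeeds on attempt 3; planned delays were 1000 ms and 2000 ms
```

With an orchestrator, this policy is declared as configuration rather than hand-written, but the schedule it produces is the same.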
The Role in Backend for Frontend (B4F) Architecture
The B4F pattern is about creating a thin, client-specific API layer that sits between the frontend and the broader backend microservices. Its purpose is to shape data for the client's needs and aggregate calls to downstream services. Integrating background jobs into this architecture is a natural fit.
The B4F layer acts as the initial point of contact—the waiter taking the order. It should not contain the heavy business logic itself. Instead, it delegates to the background job orchestrator.
Consider the B4F endpoint in our report generation example. Its logic is minimal:
- Receive the document and user prompt.
- Validate the input.
- Call Trigger.dev to start a `generateReport` job, passing the document ID and prompt as payload.
- Return the `jobId` to the client.
This keeps the B4F layer lean, fast, and scalable. The heavy computational workload is abstracted away into the background. The B4F layer's other responsibility is to provide endpoints for the client to poll for the status of a job, which is a common pattern for asynchronous workflows.
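A dependency-free sketch of that B4F contract, with the real trigger call and the orchestrator's completion callback replaced by stubs (`startReportJob` and `getJobStatus` are illustrative names):

```typescript
type Status = "processing" | "completed" | "failed";
const jobs = new Map<string, { status: Status; result?: string }>();

// B4F endpoint 1: start a job. In a real app this would call the orchestrator
// and return the jobId with a 202 Accepted response.
function startReportJob(documentId: string): { jobId: string } {
  const jobId = `run_${documentId}`;
  jobs.set(jobId, { status: "processing" });
  return { jobId };
}

// B4F endpoint 2: the client polls this until the status flips.
function getJobStatus(jobId: string): { status: Status; result?: string } {
  return jobs.get(jobId) ?? { status: "failed" };
}

const { jobId } = startReportJob("doc-42");
console.log(getJobStatus(jobId).status); // "processing"

// Later, a webhook from the orchestrator marks the job complete:
jobs.set(jobId, { status: "completed", result: "Report ready." });
console.log(getJobStatus(jobId).status); // "completed"
```

Note that neither endpoint does any heavy work: one records intent, the other reads state.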
Visualization of the Asynchronous Workflow
The following diagram illustrates the flow of control and data in a background job system integrated with a B4F architecture.
Connecting to Core AI Concepts: Context Augmentation
This chapter's focus on long-running LLM tasks directly intersects with concepts introduced earlier, such as Context Augmentation. In the context of Retrieval-Augmented Generation (RAG), context augmentation is the final step where retrieved text chunks are packaged with the user's query and sent to the LLM for synthesis.
This step, while conceptually simple, is a perfect candidate for a background job. Imagine a user query that requires retrieving 50 different text chunks from a vector database. The process of:
- Embedding the user's query.
- Performing a similarity search against a massive vector index.
- Fetching the top-k relevant chunks.
- Formatting them into a prompt that fits the LLM's context window.
- Making the final API call to generate the answer.
This entire sequence can take several seconds, if not longer. Performing this synchronously would degrade the user experience. By offloading this to a background job, the application can provide a responsive interface. The job can handle the complexity of the RAG pipeline, manage the state of the retrieval process, and ensure the final synthesized answer is delivered asynchronously, perfectly aligning with the principles of resilient and user-centric AI application design. The background job becomes the engine that powers the "magic" of context-aware AI without making the user wait for it.
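The retrieval steps above can be sketched end to end with toy three-dimensional vectors standing in for real model-generated embeddings (the chunk texts, `retrieveTopK`, and `buildPrompt` are all illustrative):

```typescript
// Toy corpus: each chunk carries a pre-computed embedding.
const chunks = [
  { text: "Revenue grew 12% year over year.", vec: [0.9, 0.1, 0.0] },
  { text: "The office dog is named Biscuit.", vec: [0.0, 0.1, 0.9] },
  { text: "Gross margin improved to 61%.", vec: [0.8, 0.2, 0.1] },
];

const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0);
const norm = (a: number[]) => Math.sqrt(dot(a, a));
const cosine = (a: number[], b: number[]) => dot(a, b) / (norm(a) * norm(b));

// Similarity search: rank chunks by cosine similarity to the query embedding.
function retrieveTopK(queryVec: number[], k: number) {
  return [...chunks]
    .sort((a, b) => cosine(queryVec, b.vec) - cosine(queryVec, a.vec))
    .slice(0, k);
}

// Context augmentation: package the top-k chunks with the user's question.
function buildPrompt(query: string, queryVec: number[]): string {
  const context = retrieveTopK(queryVec, 2)
    .map((c) => `- ${c.text}`)
    .join("\n");
  return `Answer using only this context:\n${context}\n\nQuestion: ${query}`;
}

const prompt = buildPrompt("How did the business perform?", [1, 0, 0]);
console.log(prompt); // the two finance chunks are selected; the off-topic one is not
```

In a real pipeline the embeddings come from a model call and the search runs against a vector database, which is exactly why the sequence belongs in a background job.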
Basic Code Example
This example demonstrates a "Hello World" scenario for a SaaS application where a user submits a document for processing. Instead of performing the heavy LLM-based summarization on the main web server (which could cause timeouts or block the user interface), we offload the task to a Trigger.dev background job. The job runs asynchronously, manages its own lifecycle, and notifies the user when complete.
We will define a single background job that:
- Receives a document ID.
- Simulates fetching a document and running an LLM transformation (summarization).
- Updates a database record with the result.
- Includes basic error handling and retry logic.
The Code
```typescript
// src/trigger/example-job.ts
import { schemaTask, retry } from "@trigger.dev/sdk/v3";
import { z } from "zod";

// 1. Define the payload schema using Zod for runtime validation.
// This ensures the job only runs with valid data, preventing type errors downstream.
const DocumentPayloadSchema = z.object({
  documentId: z.string().uuid(),
  userId: z.string(),
});

/**
 * A simulated database client for our SaaS application.
 * In a real app, this would be Prisma, Drizzle, or a direct SQL client.
 */
const mockDatabase = {
  // Simulate a slow database call
  async findDocumentById(id: string) {
    await new Promise((resolve) => setTimeout(resolve, 100)); // Simulate network latency
    if (id === "invalid-id") throw new Error("Document not found");
    return {
      id,
      content:
        "Trigger.dev is a powerful orchestration platform for background jobs. It simplifies running long-running tasks like LLM inference without blocking your main application threads.",
    };
  },
  // Simulate updating the document with the LLM result
  async updateDocumentSummary(id: string, summary: string) {
    console.log(`[DB] Updating document ${id} with summary.`);
    return { success: true, summary };
  },
};

/**
 * 2. Define the background job using Trigger.dev's `schemaTask` helper,
 * which validates the payload against the schema before `run` executes.
 * This function is the entry point for the background worker.
 */
export const summarizeDocumentTask = schemaTask({
  id: "summarize-document",
  // The expected input type is inferred from the schema.
  schema: DocumentPayloadSchema,
  // 3. The main execution logic.
  // This runs in a separate process, isolated from the main web server.
  run: async (payload, { ctx }) => {
    // Log the start of the job for observability
    console.log(`Starting job for user ${payload.userId}, document ${payload.documentId}`);
    try {
      // 4. Fetch the document data
      const document = await mockDatabase.findDocumentById(payload.documentId);

      // 5. Simulate an LLM data transformation.
      // In a real scenario, this might call an OpenAI API or run a model locally.
      // We wrap it in `retry.onThrow`. If the LLM API fails (e.g., rate limit),
      // this specific block is automatically retried up to 3 times.
      const summary = await retry.onThrow(
        async () => {
          // Simulate LLM inference latency
          await new Promise((resolve) => setTimeout(resolve, 2000));
          // Simulate a potential transient error (random failure)
          if (Math.random() < 0.1) {
            throw new Error("LLM API Rate Limit Exceeded");
          }
          // Mock the LLM output
          return `Summary: ${document.content.substring(0, 50)}... [Processed by Job ID: ${ctx.run.id}]`;
        },
        {
          maxAttempts: 3, // Retry up to 3 times
          minTimeoutInMs: 1000, // Wait at least 1 second between retries
        }
      );

      // 6. Persist the result
      await mockDatabase.updateDocumentSummary(payload.documentId, summary);

      // 7. Return the result (optional, but useful for chaining jobs)
      return {
        status: "success",
        summary,
        jobId: ctx.run.id,
      };
    } catch (error) {
      // 8. Error handling.
      // If the error is not caught here, Trigger.dev marks the run as "Errored".
      // You can configure alerting (Slack, Email) in the Trigger.dev dashboard.
      console.error("Job failed permanently:", error);
      throw error; // Re-throw to ensure the failure is recorded
    }
  },
});
```
Visualizing the Workflow
The following diagram illustrates the flow of data and execution between the user, the main web server, and the Trigger.dev background worker.
Line-by-Line Explanation
1. Payload Validation (Zod)
- Why: Background jobs are often triggered by events (e.g., HTTP requests, cron jobs). Data passed to them might be malformed or change over time.
- How: We use `zod`, a TypeScript schema validation library.
- Under the Hood: When the job is triggered, Trigger.dev validates the incoming payload against this schema. If the `documentId` is not a valid UUID, the job fails immediately with a clear validation error, preventing runtime errors deep inside your logic.
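To make the fail-fast behavior concrete without pulling in the SDK or Zod, here is a hand-rolled, dependency-free equivalent of the same check (`validatePayload` and the regex are illustrative, not part of the library):

```typescript
// A permissive UUID shape check, standing in for z.string().uuid().
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function validatePayload(input: unknown): { documentId: string; userId: string } {
  if (typeof input !== "object" || input === null) {
    throw new Error("payload must be an object");
  }
  const { documentId, userId } = input as Record<string, unknown>;
  if (typeof documentId !== "string" || !UUID_RE.test(documentId)) {
    // The job fails here, before any expensive work runs.
    throw new Error("documentId must be a UUID");
  }
  if (typeof userId !== "string") throw new Error("userId must be a string");
  return { documentId, userId };
}

// A well-formed payload passes through unchanged:
validatePayload({ documentId: "123e4567-e89b-12d3-a456-426614174000", userId: "u1" });

// A malformed one is rejected with a clear message:
try {
  validatePayload({ documentId: "invalid-id", userId: "u1" });
} catch (e) {
  console.log((e as Error).message); // "documentId must be a UUID"
}
```

Zod provides the same guarantee declaratively, plus the inferred TypeScript type for free.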
2. The Task Definition
```typescript
export const summarizeDocumentTask = schemaTask({
  id: "summarize-document",
  schema: DocumentPayloadSchema,
  run: async (payload, { ctx }) => { ... },
});
```
- Why: This registers the function with the Trigger.dev SDK.
- How: The `schemaTask` function takes a configuration object:
  - `id`: A unique string identifier used to reference this job in the dashboard or when triggering it from code.
  - `schema`: Links the validation logic defined above.
  - `run`: The actual function that executes.
- Under the Hood: The `run` function receives two arguments:
  - `payload`: The strongly typed data passed to the job (inferred from the Zod schema).
  - `{ ctx }`: An object containing context about the execution, such as `ctx.run.id` (a unique ID for this specific run), the current attempt number, and other metadata.
3. Data Fetching & Simulation
- Why: Background jobs often need to fetch data that wasn't available or was too large to pass directly in the payload.
- How: We use a mock database client to simulate an async database query.
- Under the Hood: This represents a standard I/O operation. In a real app, this would be a Prisma or Drizzle query. Note the `await` keyword: background jobs support standard async/await patterns, allowing for complex sequential logic.
4. LLM Transformation with Retry Logic
- Why: LLM APIs are prone to transient errors (rate limits, network timeouts). We don't want the entire job to fail permanently on a temporary glitch.
- How: Trigger.dev provides a `retry` utility; we wrap the LLM call in `retry.onThrow`.
- Under the Hood:
  - The inner `async () => { ... }` function contains the logic that might fail.
  - If it throws an error, the `retry.onThrow` wrapper catches it.
  - It waits at least `minTimeoutInMs` (1000 ms) before attempting again.
  - It repeats this up to `maxAttempts` (3 times). If all attempts fail, the error is propagated out of the retry block, and the job fails.
5. Persistence & Return
```typescript
await mockDatabase.updateDocumentSummary(payload.documentId, summary);
return { status: "success", summary, jobId: ctx.run.id };
```
- Why: The result of the background job needs to be stored so the user can see it later (e.g., via polling or a websocket notification).
- How: We update the database and return a result object.
- Under the Hood: The return value is serialized and stored in the Trigger.dev database. This allows you to inspect the result in the dashboard or pass it to downstream jobs if you were chaining tasks.
Common Pitfalls
When implementing background jobs for LLM tasks, especially in serverless or edge environments, watch out for these specific issues:
- Vercel/AWS Lambda Timeouts:
  - Issue: Standard serverless functions (like Vercel API routes) have strict timeouts (usually 10-60 seconds). LLM inference often takes longer.
  - Pitfall: Running the LLM directly in the API route will cause the request to hang or time out, resulting in a "504 Gateway Timeout" error for the user.
  - Solution: Trigger.dev runs jobs on persistent infrastructure that is not subject to standard serverless timeouts. Always offload tasks expected to take more than about 10 seconds to a background job.
- Async/Await Loops in CPU-Bound Tasks:
  - Issue: JavaScript is single-threaded. While `await` releases the CPU for I/O (network requests), heavy computation (like running a Transformer model via Transformers.js) blocks the event loop.
  - Pitfall: If you run a heavy LLM model inside a background job on a standard Node.js worker, it will block that worker from processing other jobs.
  - Solution: For extremely heavy inference, use Trigger.dev's "Compute" options (like running on a GPU instance) or split the workload. If using `Transformers.js` in a standard worker, use streaming APIs where available, or accept that the worker is busy for that duration.
- Hallucinated JSON / Schema Drift:
  - Issue: LLMs are non-deterministic. If your job relies on the LLM outputting valid JSON to be parsed by the next step, it might output a malformed string.
  - Pitfall: `JSON.parse(llmOutput)` throws a syntax error, crashing the job.
  - Solution: Always validate LLM outputs with a schema validator like Zod before processing. If parsing fails, catch the error and use the retry mechanism (potentially with a modified prompt) to ask the LLM to correct itself.
- Idempotency and Duplicate Runs:
  - Issue: In distributed systems, network blips might cause Trigger.dev to retry a job even if the first attempt technically succeeded but the acknowledgement was lost.
  - Pitfall: The job runs twice, charging the user twice or duplicating data.
  - Solution: Design jobs to be idempotent. Use the `ctx.run.id` or a unique transaction ID to check whether a record has already been processed before writing to the database.
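The defense against the "hallucinated JSON" pitfall can be sketched as a parse-then-shape-check helper that returns `null` instead of crashing, so the caller can retry with a corrective prompt (`Summary` and `parseLlmSummary` are illustrative names):

```typescript
interface Summary { title: string; points: string[] }

// Validate an LLM's "JSON" output before trusting it.
function parseLlmSummary(raw: string): Summary | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // malformed JSON: signal the caller to retry
  }
  if (typeof data !== "object" || data === null) return null;
  const { title, points } = data as Record<string, unknown>;
  if (
    typeof title !== "string" ||
    !Array.isArray(points) ||
    !points.every((p) => typeof p === "string")
  ) {
    return null; // syntactically valid JSON but wrong shape (schema drift)
  }
  return { title, points };
}

console.log(parseLlmSummary('{"title":"Q3","points":["growth"]}')); // a Summary object
console.log(parseLlmSummary("Sure! Here is the JSON: {...}")); // null
console.log(parseLlmSummary('{"title":1,"points":[]}')); // null
```

In production you would express the shape check with Zod's `safeParse` rather than by hand, but the control flow (never let a parse failure escape as a crash) is the point.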
The chapter continues with advanced code, exercises, and solutions with analysis; you can find them in the ebook on Leanpub.com or Amazon.
Code License: All code examples are released under the MIT License. Github repo.
Content Copyright: Copyright © 2026 Edgar Milvus. All rights reserved.