
Chapter 20: Capstone - Building a Full-Stack AI Copywriter SaaS

Theoretical Foundations

To understand the architecture of a full-stack AI Copywriter SaaS, we must first visualize the application not as a monolithic block, but as a distributed system of specialized services communicating over a network. In the context of the modern web stack, this is analogous to a high-frequency trading floor.

On a trading floor, you have:

  1. The Trader (The Frontend/Client): The individual making rapid decisions based on incoming data streams. They need a low-latency, reactive dashboard (the UI) that updates instantly as market conditions change.
  2. The Broker (The API Layer/Edge): The intermediary who receives orders, validates them against regulations (authentication/authorization), and routes them to the correct exchange.
  3. The Market Maker (The AI Model/LLM): The complex algorithmic engine that calculates prices (generates text) based on vast amounts of historical data and current inputs. It is computationally expensive and slow compared to the trader's actions.
  4. The Settlement House (The Database): The immutable ledger that records every transaction (user data, generated copy, subscription status) for audit and future reference.

In our specific capstone, we are building this trading floor using Next.js, React Server Components (RSCs), and the Vercel AI SDK. The goal is to create a seamless flow where the "Trader" (user) submits a request, the "Broker" (Server Component) securely routes it, the "Market Maker" (LLM) generates the content, and the "Settlement House" (Database) records the result—all while maintaining the illusion of instantaneous interaction.

The "Why": Solving the Latency and Complexity Gap

The primary challenge in building AI-native applications is the inherent latency of Large Language Models (LLMs). Unlike a traditional SQL query which returns in milliseconds, an LLM inference can take several seconds. If we were to rely solely on client-side fetching (the traditional "Client Component" model), the user would stare at a loading spinner, breaking the immersion and perceived performance.

Furthermore, handling AI outputs requires strict validation. LLMs are non-deterministic; they can hallucinate or return malformed data. We need a mechanism to enforce structure on this chaos.

This is where the Server Components and Vercel AI SDK synergy comes into play. We offload the heavy lifting to the server, stream the response back to the client, and use JSON Schema to ensure the data arriving at the client is not just text, but a structured, typed object.

The Data Flow: A Pipeline Analogy

Imagine the application data flow as a physical water filtration system.

  1. The Input Valve (User Prompt): The user types a prompt into a form. In a traditional React app, this would trigger a fetch call from the browser. In our RSC architecture, this form submission is intercepted by a Server Action. This is akin to a valve that immediately closes off the external environment, ensuring the water (data) is processed in a controlled, secure environment (the server) before it ever reaches the pipes.

  2. The Filtration Membrane (JSON Schema & Zod): Before the water enters the main processing tank (the LLM), it must pass through a membrane that filters out impurities. In our stack, this is the JSON Schema definition. We define exactly what the output of the AI should look like (e.g., { "headline": string, "body": string, "tags": string[] }). The Vercel AI SDK uses this schema to instruct the LLM to output strictly formatted JSON. This is a critical reliability pattern; it prevents the "dirty water" of unstructured text from flowing downstream to the client.

  3. The Turbine (The LLM): The water hits a turbine (the LLM) which spins up the content. However, unlike a static generator, this turbine outputs water continuously (streaming) rather than in one giant bucket.

  4. The Conduit (Streaming): Instead of waiting for the entire tank to fill (standard HTTP response), we pipe the water immediately. The Vercel AI SDK utilizes Server-Sent Events (SSE) or HTTP streaming to send chunks of data as they are generated. The client receives these chunks and stitches them together in real-time. This is the difference between downloading a 10MB file and watching a YouTube video buffer—the latter feels instantaneous because you see progress immediately.
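The chunk-by-chunk idea behind step 4 can be sketched with plain Web Streams, the same primitive the Vercel AI SDK builds on. The helpers `makeTokenStream` and `collect` below are illustrative names invented for this sketch, not SDK APIs:

```typescript
// Illustrative helpers (not SDK APIs): a fake token stream and a consumer
// that stitches chunks together as they arrive.
import { ReadableStream } from 'node:stream/web';

function makeTokenStream(tokens: string[]): ReadableStream<string> {
  return new ReadableStream<string>({
    start(controller) {
      for (const t of tokens) controller.enqueue(t); // each enqueue is one "generated" chunk
      controller.close();
    },
  });
}

async function collect(
  stream: ReadableStream<string>,
  onChunk?: (partial: string) => void,
): Promise<string> {
  const reader = stream.getReader();
  let text = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value !== undefined) {
      text += value;    // stitch chunks together in arrival order
      onChunk?.(text);  // a UI would re-render here (the "typewriter" effect)
    }
  }
  return text;
}
```

Because `onChunk` fires on every arrival, the consumer sees partial output long before the stream closes — the "watching the video buffer" effect described above.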

The "What": Deconstructing the Stack Components

1. React Server Components (RSC) as the Secure Gateway

In previous chapters, we discussed the distinction between Client and Server Components. In this capstone, RSCs serve as the Secure Gateway.

Analogy: Think of an RSC as a VIP Bouncer at an exclusive club (your database and API keys).

  • Client Components are the partygoers. They can see the lights and hear the music, but they cannot enter the back office where the expensive liquor (API keys, database credentials) is stored.
  • Server Components are the bouncers. They stand at the boundary. When a partygoer (user) asks for a drink (data), the bouncer goes behind the counter, mixes the drink securely using private ingredients (server-side LLM calls), and hands it over.

By keeping the LLM API call inside a Server Component (or a Server Action triggered by one), we ensure that the anthropic or openai API keys never leak to the browser. This is a non-negotiable security requirement for a SaaS application.

2. The Vercel AI SDK and the useChat Hook

The Vercel AI SDK abstracts the complexity of streaming protocols. The useChat hook is the interface between our secure server logic and the reactive client UI.

Analogy: The useChat hook is like a smart radio receiver.

  • Traditional Fetch: Like waiting for a cassette tape to finish recording before you can listen to it.
  • useChat: Like tuning into an FM radio station. The moment the DJ speaks (the server sends a token), the speaker plays it.

The hook manages:

  • Message History: It maintains the context of the conversation (system prompts, user inputs, assistant outputs) in a local state array.
  • Streaming State: It handles the asynchronous nature of the stream, appending incoming tokens to the message content as they arrive.
  • Optimistic UI: It allows the UI to update immediately upon user action, even before the server responds, providing a snappy feel.
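As a rough mental model only (not the hook's actual implementation), that state management can be sketched as two pure functions over a message array — one for the optimistic update on submit, one for appending each streamed token:

```typescript
// Toy model of useChat's message state. Names and shapes are
// illustrative; the real hook's internals differ.
type Role = 'user' | 'assistant';
interface Message { role: Role; content: string; }

// Optimistic update: the user's message appears immediately, alongside
// an empty assistant message that the stream will fill in.
function submitPrompt(messages: Message[], prompt: string): Message[] {
  return [
    ...messages,
    { role: 'user', content: prompt },
    { role: 'assistant', content: '' },
  ];
}

// Each incoming token is appended to the trailing assistant message.
// A new array is returned each time, so React re-renders per token.
function appendToken(messages: Message[], token: string): Message[] {
  const last = messages[messages.length - 1];
  if (!last || last.role !== 'assistant') return messages;
  return [...messages.slice(0, -1), { ...last, content: last.content + token }];
}
```

Returning fresh arrays rather than mutating in place is what makes each token arrival visible to React's reconciler.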

3. JSON Schema Output: The Contract

When building an AI Copywriter, we don't just want a wall of text. We want structured data: a headline, a sub-headline, and a list of keywords. Relying on string parsing on the client is brittle.

Analogy: JSON Schema is the Blueprint for a House. If you ask a builder (the LLM) to "build me a house," you might get a shack, a mansion, or a pile of bricks. If you hand the builder a blueprint (JSON Schema) specifying "2 bedrooms, 1 bathroom, kitchen on the left," you are far more likely to get exactly what you need.

In the Vercel AI SDK, we define a schema using a library like Zod or a raw JSON object. The SDK sends this schema to the LLM along with the prompt. The LLM is instructed (via system prompting or function calling capabilities) to format its response to match the schema. The SDK then parses the stream against this schema, ensuring type safety all the way to the UI.
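A dependency-free sketch of that contract check, assuming the { headline, body, tags } shape mentioned earlier (in the real stack, Zod generates this kind of guard for you):

```typescript
// Hand-rolled type guard enforcing the contract
// { headline: string, body: string, tags: string[] }.
// Illustrative only; Zod plays this role in the actual stack.
interface Copy {
  headline: string;
  body: string;
  tags: string[];
}

function isCopy(value: unknown): value is Copy {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.headline === 'string' &&
    typeof v.body === 'string' &&
    Array.isArray(v.tags) &&
    v.tags.every((t) => typeof t === 'string')
  );
}

// A hallucinated or malformed payload is rejected instead of crashing the UI.
const raw: unknown = JSON.parse('{"headline":"Ship faster","body":"...","tags":["saas"]}');
const copy: Copy | null = isCopy(raw) ? raw : null;
```

The point of the guard is that everything downstream of it is fully typed: the UI never touches an unvalidated object.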

Visualization of the Architecture

The following diagram illustrates the request lifecycle in our SaaS application. Note the separation of concerns between the Client (Browser) and the Server (Edge/Node.js).

Diagram: Architecture

Under the Hood: The Mechanics of Streaming JSON

To understand the "how" without code, we must look at the byte stream.

When a user requests a blog post, the server opens a connection to the LLM. The LLM begins generating text. However, instead of sending plain text, the Vercel AI SDK wraps the generation in a stream that emits specific events.

  1. The Start Event: The server signals the client that a new message is beginning.
  2. The Content Event: As the LLM generates tokens (words or sub-words), the SDK intercepts them. It buffers these tokens locally on the server. Once a valid JSON object can be formed from the buffer, it sends a chunk to the client.
    • Crucial Detail: The client does not receive the raw LLM output immediately. It receives a parsed object. If the LLM outputs {"headline": "The Best", the client might not render anything yet. When the LLM finishes the sentence Product"}, the SDK parses the full JSON and sends the complete object to the client.
  3. The Finish Event: The stream closes, and the client marks the message as complete.

This architecture ensures that the client never has to parse complex JSON strings. It simply receives a JavaScript object that matches the interface defined by our TypeScript types.
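The buffering behavior described above can be sketched in a few lines. This is an illustration of the idea — accumulate chunks, emit an object only once the buffer parses as complete JSON — not the SDK's actual parser:

```typescript
// Illustrative buffer: push() returns null until the accumulated
// chunks form valid JSON, then returns the parsed object.
function createJsonBuffer<T>() {
  let buffer = '';
  return {
    push(chunk: string): T | null {
      buffer += chunk;
      try {
        return JSON.parse(buffer) as T; // complete JSON: emit the object
      } catch {
        return null; // still incomplete: keep buffering
      }
    },
  };
}

const buf = createJsonBuffer<{ headline: string }>();
buf.push('{"headline": "The Best'); // → null (invalid JSON so far)
buf.push(' Product"}');             // → { headline: "The Best Product" }
```

This mirrors the "Crucial Detail" above: the client-facing emit happens only when the buffer forms a parseable object.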

Explicit Reference to Previous Concepts

In Chapter 18: "Server-Side Rendering and Data Fetching", we explored how Next.js App Router allows components to fetch data asynchronously on the server. We learned that this reduces the bundle size and improves the First Contentful Paint (FCP).

In this capstone, we are applying that concept to AI generation. Instead of fetching static data (like a blog post from a database), we are fetching dynamic data (generated copy) using the same RSC pattern. The async/await syntax used in Chapter 18 to fetch a database row is now used to await the generation of text from an LLM. The principle remains identical: move the data fetching burden off the client to ensure the user sees a fully rendered page (or in this case, a fully rendered copy block) without the delay of client-side waterfalls.

Theoretical Foundations: The Three Pillars

The theoretical foundation of this capstone rests on three pillars:

  1. Security via Isolation: Using Server Components and Server Actions to keep sensitive keys and business logic on the server, exposing only the necessary data to the client.
  2. Reliability via Structure: Using JSON Schema to enforce a contract between the non-deterministic LLM and the deterministic client application, preventing runtime errors.
  3. Performance via Streaming: Using the Vercel AI SDK to stream tokens rather than waiting for full completion, masking latency and providing a fluid user experience.

By combining these pillars, we move beyond simple "chatbot" implementations and enter the realm of scalable, production-ready SaaS applications where AI is a feature, not a gimmick.

Basic Code Example

This example demonstrates a fundamental pattern for building an AI copywriting feature. We will create a Next.js Server Component that acts as the backend, using the Vercel AI SDK to stream a structured JSON response from an LLM. We will then consume this stream on the client side using the useChat hook to display the generated copy in real-time.

This setup is the architectural backbone of a SaaS copywriter: the server handles the LLM logic and security, while the client provides a responsive UI.

The Architecture

The flow of data in this pattern is linear but relies on two distinct environments (Server and Client) communicating via a streaming HTTP connection.

A linear data flow diagram illustrates how the server handles LLM logic and security while the client provides a responsive UI, communicating via a streaming HTTP connection.

The Code

This example is split into two parts: the Server Action (handling the AI generation) and the Client Component (handling the UI).

File Structure:

  1. app/actions.ts (Server Action)
  2. app/page.tsx (Client Component)
  3. app/api/generate-copy/route.ts (Route Handler)

// File: app/actions.ts
'use server';

import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

/**
 * Server Action: generateCopy
 * 
 * This function runs exclusively on the server. It accepts a user prompt,
 * constructs a strict JSON schema using Zod, and streams the LLM response.
 * 
 * @param {string} prompt - The user's request (e.g., "A landing page headline for a SaaS").
 * @returns {Promise<ReadableStream>} - A stream of text tokens.
 */
export async function generateCopy(prompt: string) {
  // 1. Define the structure of the expected output using Zod.
  // This acts as our JSON Schema. The LLM will be instructed to match this.
  const copySchema = z.object({
    headline: z.string().describe('The catchy main headline'),
    subheadline: z.string().describe('The explanatory subheadline'),
    cta: z.string().describe('Call to action button text'),
  });

  // 2. Call the LLM using the Vercel AI SDK.
  // We use streamText to return a web-standard ReadableStream.
  const result = await streamText({
    model: openai('gpt-4-turbo-preview'),
    system: 'You are a professional copywriter. Generate marketing copy based on the user prompt.',

    // 3. Enforce the JSON schema.
    // This provider-specific option asks OpenAI to emit valid JSON.
    experimental_providerMetadata: {
      openai: {
        response_format: { type: 'json_object' },
      },
    },

    // Note: In newer SDK versions, you might use 'output' or 'schema' directly in the config.
    // Here we append the required field names to the prompt so the LLM knows the shape.
    prompt: `Generate copy for: ${prompt}. Output strictly as a JSON object with the string fields: ${Object.keys(copySchema.shape).join(', ')}.`,
  });

  // 4. Return the stream to the client.
  // The AI SDK returns a ReadableStream<string> which we can pipe to the HTTP response.
  return result.toAIStream();
}
// File: app/page.tsx
'use client';

import { useChat } from 'ai/react';
import { useState } from 'react';

export default function CopywriterPage() {
  // 1. Initialize the useChat hook.
  // This hook manages the message history, input state, and the streaming response.
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    // We point the hook to our custom Server Action.
    api: '/api/generate-copy', // We will create a route handler for this
  });

  // Local state to parse the JSON stream (optional, but good for demonstration)
  const [parsedCopy, setParsedCopy] = useState<any>(null);

  // Note: In a real app, we would route this through a Next.js API Route 
  // that calls the 'generateCopy' server action, or use the 'useActionState' hook.
  // For this 'Hello World', we simulate the client-server interaction via a standard API route.

  return (
    <div className="max-w-2xl mx-auto p-8">
      <h1 className="text-2xl font-bold mb-4">AI Copywriter</h1>

      {/* Message Display Area */}
      <div className="border rounded p-4 h-64 overflow-y-auto mb-4 bg-gray-50">
        {messages.map((m, index) => (
          <div key={index} className="mb-2">
            <strong className="block text-sm text-gray-600">
              {m.role === 'user' ? 'You:' : 'AI:'}
            </strong>
            <span className="text-gray-800">
              {m.content}
            </span>
          </div>
        ))}
        {isLoading && (
          <div className="text-gray-400 italic">Generating...</div>
        )}
      </div>

      {/* Input Form */}
      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          type="text"
          value={input}
          onChange={handleInputChange}
          placeholder="Describe your product..."
          className="flex-1 border p-2 rounded"
          disabled={isLoading}
        />
        <button 
          type="submit" 
          className="bg-blue-600 text-white px-4 py-2 rounded hover:bg-blue-700 disabled:opacity-50"
          disabled={isLoading}
        >
          Generate
        </button>
      </form>
    </div>
  );
}
// File: app/api/generate-copy/route.ts
// This is the bridge between the client hook and the server action.
// In Next.js App Router, we use Route Handlers for API endpoints.

import { generateCopy } from '@/app/actions';
import { StreamingTextResponse } from 'ai';

export async function POST(req: Request) {
  // useChat POSTs the conversation as { messages: [...] }, not { prompt }.
  // We treat the latest user message as the prompt.
  const { messages } = await req.json();
  const prompt = messages?.[messages.length - 1]?.content;

  if (!prompt) {
    return new Response('Prompt is required', { status: 400 });
  }

  // Call the server action (which returns a ReadableStream)
  const stream = await generateCopy(prompt);

  // Return the stream as a standard HTTP response
  return new StreamingTextResponse(stream);
}

Line-by-Line Explanation

1. The Server Action (app/actions.ts)

  • 'use server';: This directive marks the file (or specific functions) as Server Actions. It allows client components to call these functions directly via RPC (Remote Procedure Call) without manually creating API endpoints.
  • Imports: We import streamText from ai (the SDK core), openai (the provider), and z (Zod for schema validation).
  • copySchema: We define a Zod object. This is crucial for the "JSON Schema Output" concept. While the LLM generates text, we use this schema to instruct the model on the required fields (headline, subheadline, cta). In production, you might use a library like jsonrepair to handle minor hallucinations, but the schema minimizes them.
  • streamText: This is the core function of the Vercel AI SDK.
    • model: Specifies the LLM (GPT-4 in this case).
    • prompt: The user input.
    • experimental_providerMetadata: This specific configuration tells the OpenAI provider to request a json_object response format. This is a provider-specific feature that wraps the LLM call in instructions to output valid JSON.
  • result.toAIStream(): The SDK returns a result object containing a stream. We convert this into a standard Web ReadableStream. This allows the data to be sent over HTTP chunk-by-chunk (streaming) rather than waiting for the entire response.

2. The API Route (app/api/generate-copy/route.ts)

  • Standard API Endpoint: Even though we have a Server Action, using a Route Handler is often more robust for streaming in Next.js (especially regarding timeouts and edge compatibility).
  • POST Handler: Receives the request from the client.
  • StreamingTextResponse: A helper from the AI SDK that wraps the ReadableStream into a valid HTTP Response object, setting the correct headers (Content-Type: text/plain; charset=utf-8 and Transfer-Encoding: chunked).

3. The Client Component (app/page.tsx)

  • 'use client';: This marks the component as a Client Component, allowing the use of React hooks like useState and useChat.
  • useChat Hook:
    • This hook abstracts away the complexity of managing a WebSocket or HTTP stream.
    • It automatically handles the POST request to the specified api endpoint.
    • It updates the messages array as tokens arrive from the stream.
  • handleSubmit: Intercepts the form submission, sends the input value to the server, and begins listening for the stream.
  • messages.map: We render the conversation history. The AI's response (m.content) is updated in real-time as the stream chunks arrive, creating the "typewriter" effect.

Common Pitfalls

When building this pattern, developers often encounter specific issues related to streaming and server-side execution.

  1. Vercel/AI SDK Timeouts (The 10s Limit)

    • Issue: Vercel's Hobby plan has a 10-second execution limit for Serverless Functions. LLMs can be slow to generate the first token (cold starts) or for long responses.
    • Symptom: The stream cuts off abruptly, or the request fails with a 504 Gateway Timeout.
    • Fix: Use Edge Runtime or Vercel Pro/Enterprise. If using Edge, ensure your dependencies (like zod or ai) are compatible. In the Route Handler, add:
      export const runtime = 'edge'; // Use Edge runtime
      
    • Note: Edge runtime has a smaller bundle size but restricts access to certain Node.js APIs (like fs).
  2. JSON Hallucination & Parsing Errors

    • Issue: Even with json_object mode enabled, LLMs can output invalid JSON (e.g., trailing commas, unquoted keys, or text before/after the JSON block).
    • Symptom: JSON.parse() throws an error on the client side, crashing the UI.
    • Fix:
      • Server-side: Use streamText with experimental_providerMetadata to force the format.
      • Client-side: Do not parse the raw stream immediately. Instead, stream the text to the UI, and only attempt to parse the final accumulated string if you need structured data. Alternatively, use a library like jsonrepair on the client side before parsing.
  3. Async/Await Loops in Server Components

    • Issue: Trying to use await inside a loop when generating multiple variations of copy (e.g., generating 5 headlines one by one).
    • Symptom: Poor performance; the user waits for the total duration of all requests.
    • Fix: Use Promise.all() to run LLM calls in parallel.
      // BAD
      const headlines = [];
      for (const idea of ideas) {
        headlines.push(await generateHeadline(idea));
      }
      
      // GOOD
      const headlines = await Promise.all(ideas.map(idea => generateHeadline(idea)));
      
    • Warning: Parallel calls increase token usage and costs rapidly. Implement rate limiting (e.g., via Vercel KV or Upstash Redis) to prevent abuse.
  4. Missing 'use server' Directive

    • Issue: Attempting to call streamText directly from a Client Component.
    • Symptom: ReferenceError: window is not defined or API keys being exposed in the browser console.
    • Fix: Ensure the function containing the LLM logic is marked with 'use server' or resides within a Route Handler (app/api/). Never import API keys into client-side code.
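For pitfall 2, a minimal hand-rolled fallback (far less thorough than jsonrepair, and purely illustrative) is to slice out the outermost {...} block before parsing, so chatter before or after the JSON does not crash `JSON.parse`:

```typescript
// Defensive parse: extract the outermost {...} span, then parse inside
// a try/catch. Returns null on failure so the caller can fall back to
// rendering the raw text instead of crashing the UI.
function extractJson<T>(text: string): T | null {
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1)) as T;
  } catch {
    return null; // malformed JSON even after slicing
  }
}

// Tolerates chatter before/after the JSON block:
extractJson<{ headline: string }>('Sure! Here is your copy: {"headline":"Go"} Enjoy.');
// → { headline: "Go" }
```

This handles the common "text before/after the JSON block" failure; trailing commas or unquoted keys still require a repair library.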

The chapter continues with advanced code, exercises, and solutions with analysis; you can find them in the ebook on Leanpub.com or Amazon.





Code License: All code examples are released under the MIT License. Github repo.

Content Copyright: Copyright © 2026 Edgar Milvus | Privacy & Cookie Policy. All rights reserved.

All textual explanations, original diagrams, and illustrations are the intellectual property of the author. To support the maintenance of this site via AdSense, please read this content exclusively online. Copying, redistribution, or reproduction is strictly prohibited.