
Chapter 12: Stepwise Planner vs Handlebars Planner

Theoretical Foundations

At the heart of every AI agent lies a fundamental question: How does the agent decide what to do next? In the context of the Microsoft Semantic Kernel, this question is answered by the planner. A planner is not merely a tool; it is the cognitive architecture that orchestrates the flow of logic, data, and execution. It transforms a high-level intent—such as "summarize the latest sales report and draft an email"—into a concrete sequence of executable steps.

To understand the distinction between the Stepwise Planner and the Handlebars Planner, we must first visualize the two distinct philosophies of reasoning they represent. One is a meticulous engineer drafting a blueprint; the other is a jazz musician improvising a melody based on a chord chart.

The Stepwise Planner is depicted as a meticulous engineer carefully drafting a detailed blueprint, while the Handlebars Planner is shown as a jazz musician freely improvising a melody based on a chord chart.

The Stepwise Planner: The Deterministic Engineer

The Stepwise Planner operates on a principle of iterative decomposition. It treats a complex problem as a series of smaller, manageable sub-problems that are solved sequentially. This approach is deeply rooted in the concept of Chain of Thought (CoT) prompting, a technique explored in earlier chapters where the Large Language Model (LLM) is encouraged to verbalize its reasoning process before arriving at a conclusion.

Theoretical Foundations

The Stepwise Planner does not attempt to generate the entire execution path in a single pass. Instead, it engages in a dialogue with the LLM. It presents the LLM with the available functions (skills), the current state, and the goal. The LLM responds with a single, next logical step. The planner then executes that step, updates the state, and feeds the result back into the LLM for the next decision.

This loop has four phases:

1. Observation: The planner observes the current context and the available tools.
2. Reasoning: The LLM generates a "thought" about what to do next.
3. Action: The planner translates this thought into a function call (invoking a native C# method or a prompt function).
4. Result: The execution returns data, which becomes the new observation.
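The four phases above can be sketched in plain C#, with no Semantic Kernel dependency. The "FINAL:" marker, the "toolName|input" action format, and the delegate-based tool registry are illustrative conventions invented for this sketch, not SK APIs:

```csharp
using System;
using System.Collections.Generic;

// A minimal sketch of the observe-reason-act loop.
// The "FINAL:" marker and "tool|input" action format are illustrative
// conventions, not Semantic Kernel APIs.
public static class StepwiseLoopDemo
{
    public static string Run(
        Func<string, string> askModel,                  // LLM stand-in
        Dictionary<string, Func<string, string>> tools, // available skills
        string goal,
        int maxIterations = 5)
    {
        string observation = goal;
        for (int i = 0; i < maxIterations; i++)
        {
            // Reasoning: ask the model for the next step, given the state so far
            string thought = askModel(observation);

            // Termination: the model signals it has reached a final answer
            if (thought.StartsWith("FINAL:"))
                return thought["FINAL:".Length..].Trim();

            // Action: "toolName|input" -> execute the chosen tool
            var parts = thought.Split('|', 2);
            string result = tools[parts[0]](parts[1]);

            // Result: the tool output becomes part of the next observation
            observation = $"{observation}\n{parts[0]} returned: {result}";
        }
        return "Max iterations reached without a final answer.";
    }
}
```

Note that the loop, not the model, enforces termination: MaxIterations lives in the planner, which is exactly the circuit-breaker role it plays in Semantic Kernel.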

Analogy: The GPS Navigator

Imagine you are driving to a distant city. The Stepwise Planner is like a GPS navigator that recalculates the route at every single intersection. It does not know the entire path to the destination when you start; it only knows the immediate next turn. After you turn, it looks at the new road, checks the traffic, and tells you the next turn. This is incredibly robust because if a road is closed (an execution error), the GPS simply recalculates the next immediate step without needing to rewrite the entire itinerary.

Architectural Implications in C#

In the Semantic Kernel, this is implemented via the Kernel and a stepwise planner class (StepwisePlanner in early releases; FunctionCallingStepwisePlanner in 1.x). The planner uses the Kernel's function registry to identify available skills. The critical C# concept here is delegates: the planner translates the step the LLM chooses into a delegate invocation of a registered kernel function.

When the LLM outputs a structured response (often using XML or JSON tags to denote thoughts and actions), the Stepwise Planner parses this output. It uses C#'s robust parsing libraries (like System.Text.Json) to extract the function name and arguments. It then locates the corresponding C# method registered in the kernel (e.g., a method decorated with [KernelFunction], or [SKFunction] in pre-1.0 releases) and invokes it.
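When the model replies with structured text rather than native tool calls, extraction with System.Text.Json is straightforward. The JSON shape below ("thought"/"action"/"args") is an illustrative convention for this sketch, not the planner's actual wire format:

```csharp
using System;
using System.Text.Json;

// Parse a hypothetical structured LLM response into a function call.
var llmOutput = """
    { "thought": "I should add the numbers first.",
      "action": "Math.Add",
      "args": { "number1": 10, "number2": 5 } }
    """;

using JsonDocument doc = JsonDocument.Parse(llmOutput);
JsonElement root = doc.RootElement;

string action = root.GetProperty("action").GetString()!;
double a = root.GetProperty("args").GetProperty("number1").GetDouble();
double b = root.GetProperty("args").GetProperty("number2").GetDouble();

Console.WriteLine($"Invoke {action}({a}, {b})"); // Invoke Math.Add(10, 5)
```

From here, resolving "Math.Add" against the kernel's function registry and invoking it is a dictionary lookup plus a delegate call.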

This relies heavily on Interfaces and Dependency Injection. The planner is agnostic to the underlying implementation of the function. Whether the function queries a SQL database via Entity Framework Core or calls an external REST API via HttpClient, the planner treats them as abstract steps in its reasoning process.
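A sketch of what this abstraction buys you: the two functions below look identical to the planner (a name, a description, a signature), even though one calls an external REST API through an injected HttpClient and the other is pure computation. The class name and endpoint are illustrative:

```csharp
using System;
using System.ComponentModel;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

public class SalesPlugin
{
    private readonly HttpClient _http;

    // The HttpClient is an injected dependency; the planner never sees it.
    public SalesPlugin(HttpClient http) => _http = http;

    [KernelFunction, Description("Gets total sales for a region from the sales API")]
    public async Task<string> GetSalesAsync(string region) =>
        await _http.GetStringAsync($"https://example.invalid/sales/{region}");

    [KernelFunction, Description("Formats a currency amount for display")]
    public string FormatAmount(double amount) => amount.ToString("C");
}
```

To the planner, both are abstract steps described by name and [Description]; whether a step costs a network round-trip or a CPU instruction is invisible to the reasoning process.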

Advantages:

* Robustness: If the LLM hallucinates an invalid step, the planner can catch the error and ask the LLM to try again.
* Transparency: You can see the exact chain of thought the agent took to reach the solution.
* Dynamic Adaptation: The agent can react to unexpected data returned from a function call and adjust its subsequent plan on the fly.

Trade-offs:

* Latency: The back-and-forth nature (LLM -> Execution -> LLM) introduces significant latency.
* Token Usage: Each iteration consumes tokens for both the prompt and the response, which can be costly.

The Handlebars Planner: The Flexible Architect

The Handlebars Planner takes a radically different approach. Instead of iterative reasoning, it focuses on structural generation. It leverages the Handlebars templating language—a logic-less templating engine—to construct a static execution plan before any code runs.

Theoretical Foundations

The Handlebars Planner prompts the LLM to generate a Handlebars template string. This template acts as a blueprint for the final output. The LLM is instructed to embed the available kernel functions (skills) into this template using specific syntax (e.g., {{function_name arg1 arg2}}).
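To make this concrete, here is the kind of template a planner might emit for a goal like "calculate the square of the sum of 10 and 5", stored as a C# string. The set/get helpers and the Plugin-Function tag naming follow Semantic Kernel's Handlebars conventions, but real generated plans vary by model and library version:

```csharp
// Illustrative example of a generated Handlebars plan; actual output
// differs between models and Semantic Kernel versions.
const string generatedPlan = """
    {{set "sum" (Math-Add number1=10 number2=5)}}
    {{set "square" (Math-Multiply number1=(get "sum") number2=(get "sum"))}}
    The answer is {{get "square"}}.
    """;
```

The important property is that this string is the entire plan: it can be inspected, diffed, and cached before a single function runs.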

Once the template is generated, the planner compiles it. It does not execute immediately. Instead, it waits for input data. When data is provided, the planner "renders" the template. During rendering, it encounters the function tags, executes the underlying kernel functions, and injects the results into the final string.

Analogy: The Mad Libs Generator

Think of the Handlebars Planner as a highly sophisticated "Mad Libs" or a mail-merge template. You provide the template: "Dear {{Recipient}}, your order for {{Product}} (Price: {{GetPrice Product}}) is confirmed." The structure is fixed. The LLM helped write the structure, but the execution is deterministic. You don't decide step-by-step; you fill in the blanks of a pre-defined form. If you need to change the logic, you modify the template structure, not the execution flow.

Architectural Implications in C#

The Handlebars Planner utilizes the Handlebars.Net library. In C#, this manifests as a compilation step where the string template is converted into an executable delegate.

The key C# feature here is Expression Trees and Delegates. When the template is compiled, Semantic Kernel maps the Handlebars helpers (the function tags) to C# delegates that point to the registered kernel functions.

// Conceptual representation of how Handlebars maps to C# delegates.
// This is not the actual internal code, but illustrates the architectural pattern.

public class HandlebarsPlan
{
    private readonly string _template;
    private readonly Kernel _kernel;

    public HandlebarsPlan(string template, Kernel kernel)
    {
        _template = template;
        _kernel = kernel;
    }

    public string Execute(object data)
    {
        // 1. Compile the template (usually done once and cached for performance)
        var compiledTemplate = Handlebars.Compile(_template);

        // 2. Register a helper that bridges template tags to kernel functions.
        // This is where the magic happens: mapping template syntax to C# methods.
        Handlebars.RegisterHelper("invoke_function", (output, helperContext, args) =>
        {
            string functionName = args[0].ToString();

            // Resolve the function from the kernel's plugin registry
            var function = _kernel.Plugins.GetFunction(pluginName: null, functionName);

            // Invoke the function (could be native C# or a prompt function).
            // Rendering is synchronous, so we block on the async call here;
            // acceptable for illustration, but avoid blocking in production code.
            var result = _kernel.InvokeAsync(function).GetAwaiter().GetResult();

            output.Write(result.ToString());
        });

        // 3. Render the template with the supplied data
        return compiledTemplate(data);
    }
}

Advantages:

* Performance: The plan is generated once and can be cached. Subsequent executions are fast because no further LLM calls are needed for planning (unless the template changes).
* Predictability: The execution flow is linear and defined by the template structure. It avoids the "reasoning loops" that Stepwise Planners might fall into.
* Portability: The generated Handlebars template is a string. It can be saved to a file, version-controlled in Git, and reused across different environments.

Trade-offs:

* Rigidity: The template is static. If execution encounters a scenario not accounted for in the template structure (e.g., a conditional branch based on dynamic data), the template might fail or produce incorrect output.
* Complexity in Generation: The LLM must be highly skilled at generating valid Handlebars syntax. A syntax error in the generated template will cause runtime failures.

Comparative Analysis: The Core Trade-Off

The choice between these two planners is not about which is "better," but which aligns with the volatility of your problem domain.

1. Determinism vs. Flexibility

The Stepwise Planner is flexible. It allows the agent to explore the solution space dynamically. It is ideal for "open world" problems where the path to the solution is unknown at the start. The Handlebars Planner is deterministic. It enforces a strict structure. It is ideal for "closed world" problems where the workflow is known, but the data varies (e.g., generating a report, formatting an email).

2. The Role of the LLM

In the Stepwise Planner, the LLM is the driver. It steers the car at every turn. This requires a powerful, reasoning-capable model (like GPT-4) to minimize hallucinations in the chain of thought. In the Handlebars Planner, the LLM is the architect. It draws the blueprint once. Once the blueprint is drawn, a less capable model could theoretically execute the plan (though Semantic Kernel typically uses the LLM for generation).

3. Error Handling and Recovery

* Stepwise: If a function fails, the error message is fed back to the LLM. The LLM analyzes the error and suggests a new step. This is self-healing.
* Handlebars: If a function fails during rendering, the entire execution usually halts. The error is thrown to the C# runtime. Recovery requires catching the exception and potentially generating a new template via the LLM.

4. C# Integration Nuances

* Stepwise relies on the Kernel's ability to dynamically resolve and invoke functions at runtime, leveraging reflection and dynamic invocation heavily. It feels like an interpreter pattern.
* Handlebars relies on string parsing and compilation. It feels more like a compiler pattern, where the template is source code and the execution is the compiled binary.

Real-World Application Scenarios

To solidify these concepts, consider two distinct business requirements.

Scenario A: The Customer Support Triage Agent (Stepwise)

A customer types: "My order #12345 hasn't arrived, and I'm angry."

Why Stepwise? The agent needs to adapt. If the order is still in the warehouse, it apologizes. If it's lost, it initiates a replacement. The Stepwise Planner allows the LLM to reason through these distinct branches dynamically.

Scenario B: The Daily Sales Report Generator (Handlebars)

Why Handlebars? The structure is always the same. We don't need the LLM to reason about how to structure the report every single day. We just need it to generate the template once. Subsequent runs just fill in the data. This is significantly faster and cheaper.
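The "generate once, reuse many times" pattern can be sketched as follows. This assumes a configured `kernel`, and it assumes your Semantic Kernel version lets you reconstruct a HandlebarsPlan from stored template text (check the API surface of your release); the file name is illustrative:

```csharp
using System.IO;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Planning.Handlebars;

// Sketch: persist the generated plan so later runs skip the expensive
// LLM planning call. Assumes a configured `kernel`; the constructor
// taking template text may differ across Semantic Kernel versions.
var planPath = "daily-report.plan.hbs";

HandlebarsPlan plan;
if (File.Exists(planPath))
{
    // Cheap path: reload the cached template, no LLM call needed.
    plan = new HandlebarsPlan(File.ReadAllText(planPath));
}
else
{
    // Expensive path: ask the LLM to generate the template once, then cache it.
    plan = await new HandlebarsPlanner().CreatePlanAsync(
        kernel, "Generate the daily sales report");
    File.WriteAllText(planPath, plan.ToString());
}

var report = await plan.InvokeAsync(kernel, new KernelArguments());
```

Because the cached plan is plain text, it can also be code-reviewed and version-controlled like any other source artifact.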

Conclusion

The Stepwise Planner and Handlebars Planner represent two ends of the agentic planning spectrum. The Stepwise Planner offers a reactive, reasoning-heavy approach suitable for complex, dynamic problem solving. The Handlebars Planner offers a proactive, structure-heavy approach suitable for predictable, high-performance workflows.

Understanding these theoretical foundations allows the AI Engineer to architect systems that are not just intelligent, but also efficient and robust. By selecting the right planner, you define the cognitive boundaries of your agent, balancing the fluidity of thought with the rigidity of execution.

Basic Code Example

Here is a self-contained, "Hello World" level code example comparing the Stepwise Planner and the Handlebars Planner within the Microsoft Semantic Kernel.

The Real-World Context

1. Handlebars Planner is like a template engine. It is excellent for structured, predictable outputs. You might use it to generate a formatted HTML email report or a JSON payload for an API. It relies on a strict template with variables ({{variable}}) and logic ({{#if}}).
2. Stepwise Planner is like a Chain of Thought reasoning engine. It is designed for complex, multi-step problems where the exact sequence of steps isn't known upfront. It breaks the request down into logical steps (e.g., "Step 1: Authenticate user," "Step 2: Call Reset Password API," "Step 3: Call Quota API") and executes them sequentially.

In this example, we will ask both planners to solve a simple math problem: "Calculate the square of the sum of 10 and 5."

Code Example

// NOTE: The planner APIs ship in prerelease packages
// (Microsoft.SemanticKernel.Planners.Handlebars and
// Microsoft.SemanticKernel.Planners.OpenAI) and are marked experimental,
// so exact type names can shift between Semantic Kernel versions.
#pragma warning disable SKEXP0060 // Planners are experimental

using System.ComponentModel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Planning;
using Microsoft.SemanticKernel.Planning.Handlebars;

// 1. SETUP: Define a simple plugin with mathematical capabilities.
public class MathPlugin
{
    [KernelFunction, Description("Adds two numbers")]
    public double Add(double number1, double number2) => number1 + number2;

    [KernelFunction, Description("Multiplies two numbers")]
    public double Multiply(double number1, double number2) => number1 * number2;
}

// 2. MAIN PROGRAM: Comparing the two planners
class Program
{
    static async Task Main(string[] args)
    {
        // Configuration: replace the placeholder key with a real one.
        // For Azure OpenAI, use AddAzureOpenAIChatCompletion instead.
        var kernel = Kernel.CreateBuilder()
            .AddOpenAIChatCompletion(
                modelId: "gpt-3.5-turbo",    // Or any model supporting function calling
                apiKey: "fake-key-for-demo") // Placeholder; calls will fail without a real key
            .Build();

        // Register the Math plugin
        kernel.ImportPluginFromObject(new MathPlugin(), "Math");

        // The user request
        string request = "Calculate the square of the sum of 10 and 5";

        Console.WriteLine($"--- Request: {request} ---\n");

        // ============================================================
        // STRATEGY 1: HANDLEBARS PLANNER
        // ============================================================
        Console.WriteLine("=== HANDLEBARS PLANNER ===");

        var handlebarsPlanner = new HandlebarsPlanner(new HandlebarsPlannerOptions
        {
            // Permit {{#each}} loops in generated templates (off by default)
            AllowLoops = true
        });

        try
        {
            // Generate the plan (a Handlebars template string)
            HandlebarsPlan handlebarsPlan = await handlebarsPlanner.CreatePlanAsync(kernel, request);

            Console.WriteLine("Generated Handlebars Template:");
            Console.WriteLine(handlebarsPlan); // ToString() returns the template text

            Console.WriteLine("\nExecuting Handlebars Plan...");

            // Execute the plan. Pass input variables here if the template expects them;
            // for this math problem, the numbers are embedded in the template itself.
            var handlebarsResult = await handlebarsPlan.InvokeAsync(kernel, new KernelArguments());

            Console.WriteLine($"Result: {handlebarsResult}");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Handlebars Plan Failed: {ex.Message}");
            Console.WriteLine("Note: Handlebars planning needs a model that reliably emits valid template syntax.");
        }

        Console.WriteLine(new string('-', 40));

        // ============================================================
        // STRATEGY 2: STEPWISE PLANNER
        // ============================================================
        // In Semantic Kernel 1.x, the stepwise approach is implemented by
        // FunctionCallingStepwisePlanner, which drives the reason-act loop
        // through the model's native function-calling support.
        Console.WriteLine("\n=== STEPWISE PLANNER ===");

        var stepwisePlanner = new FunctionCallingStepwisePlanner(
            new FunctionCallingStepwisePlannerOptions
            {
                MaxIterations = 5, // Circuit breaker against reasoning loops
                MaxTokens = 2000
            });

        try
        {
            // Plan and execute in one call: the planner loops
            // (reason -> invoke function -> observe) until it has an answer.
            var stepwiseResult = await stepwisePlanner.ExecuteAsync(kernel, request);

            Console.WriteLine($"Final Result: {stepwiseResult.FinalAnswer}");

            // The full reasoning trace is preserved as chat history
            Console.WriteLine("\n--- Steps Taken ---");
            if (stepwiseResult.ChatHistory is not null)
            {
                foreach (var message in stepwiseResult.ChatHistory)
                {
                    Console.WriteLine($"[{message.Role}] {message.Content}");
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Stepwise Plan Failed: {ex.Message}");
        }
    }
}

Detailed Line-by-Line Explanation

* using directives: We import the core Semantic Kernel namespaces (Planning and Planning.Handlebars for the planners), plus System.ComponentModel for the [Description] attributes that advertise each function to the LLM.
* MathPlugin class: A standard C# class whose methods are decorated with [KernelFunction]. The Semantic Kernel treats these as callable tools.
  * Add: Sums two doubles.
  * Multiply: Multiplies two doubles.
  * Why this matters: Planners do not calculate math themselves; they orchestrate calls to plugins that perform the calculations.
* Kernel.CreateBuilder(): The modern .NET way to construct the Kernel.
* .AddOpenAIChatCompletion(...): Configures the chat completion service. The "fake" API key is a placeholder; in a real scenario, this connects to OpenAI or Azure OpenAI.
  * Note: Planners rely heavily on the LLM's ability to generate structured text (or native tool calls) from the available function schemas.
* HandlebarsPlanner: Prompts the LLM to generate a Handlebars template string that calls the registered plugins.
  * CreatePlanAsync: The LLM analyzes the user request and the available plugins (Math.Add, Math.Multiply) to produce a template.
  * Output expectation: A template that feeds the result of an Add call into a Multiply call.
  * plan.InvokeAsync: Executes the generated template. Handlebars is a logic-less templating language, so "execution" is variable substitution plus the function calls embedded in the template syntax.
  * Constraint: Handlebars plans are brittle if the LLM generates invalid syntax (e.g., missing closing braces).
* FunctionCallingStepwisePlanner: Uses the LLM's reasoning and native function calling to work step by step. Unlike Handlebars, it does not produce a template up front; execution happens dynamically.
  * ExecuteAsync runs the loop:
    1. Ask the LLM: "Given the history, what is the next step?"
    2. The LLM responds with a "Thought" and an "Action" (a function call).
    3. The Kernel executes the action.
    4. The result is fed back into the LLM.
    5. Repeat until the LLM determines the goal is met.
  * ChatHistory: The result object carries the full reasoning trace, which is crucial for debugging and transparency.

Common Pitfalls

  1. Hallucinated Plugins (Handlebars):

    • The Mistake: The Handlebars Planner may generate a template calling a function that doesn't exist (e.g., {{Math.Square 5}}) because the LLM hallucinated a capability.
    • The Fix: Keep plugin descriptions precise, and validate the generated template against the registered functions before executing it.
  2. Infinite Loops (Stepwise):

    • The Mistake: The Stepwise Planner gets stuck in a reasoning loop (e.g., "I need to calculate X... I will calculate X... I have calculated X... Now I need to calculate X").
    • The Fix: Always set MaxIterations (as done in the code). In production, implement a "circuit breaker" that detects repeated identical actions.
  3. Model Compatibility:

    • The Mistake: Using a model that doesn't support function calling (like older GPT-3 models) with these planners.
    • The Fix: Both planners rely on the LLM's ability to understand function schemas. Use models like gpt-4, gpt-3.5-turbo (or newer), or open-source equivalents that support tool calling.
  4. State Management:

    • The Mistake: Assuming the planner remembers context across different user sessions.
    • The Fix: Planners are stateless. You must manage KernelArguments manually and pass them in on every invocation.
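Because planners are stateless, the caller owns the state. Here is a sketch of threading KernelArguments through successive invocations; it assumes a configured `kernel` and a previously created Handlebars `plan` as in the example above, and the argument names are illustrative:

```csharp
using Microsoft.SemanticKernel;

// Statelessness in practice: nothing persists between invocations unless
// you pass it in. KernelArguments behaves like a dictionary of named values.
// Assumes an existing `kernel` and a previously created `plan`.
var arguments = new KernelArguments
{
    ["customerName"] = "Ada",
    ["orderId"] = "12345"
};

// First call: the plan reads customerName/orderId from the arguments.
var first = await plan.InvokeAsync(kernel, arguments);

// Carry state forward explicitly for the next call.
arguments["lastResult"] = first.ToString();
var second = await plan.InvokeAsync(kernel, arguments);
```

For multi-session scenarios, persist the arguments (or the relevant subset) in your own session store and rehydrate them before each invocation.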

Visualization of Execution Flow

The following diagram illustrates how the Stepwise Planner loops through reasoning and action, compared to the linear execution of a Handlebars template.

A diagram contrasting the Stepwise Planner's iterative loop of reasoning and action with the linear, sequential execution flow of a Handlebars template.

The chapter continues with advanced code samples, exercises, and solutions with analysis; you can find them in the ebook on Leanpub.com or Amazon.





Code License: All code examples are released under the MIT License. Github repo.

Content Copyright: Copyright © 2026 Edgar Milvus | Privacy & Cookie Policy. All rights reserved.
