Building intelligent content generation pipelines with maintainable, human-readable prompt templates
When building AI-powered applications that go beyond simple chat interfaces, you quickly realize that prompt engineering isn’t just about writing good prompts; it’s about managing them. In complex workflows where multiple AI calls chain together, each with different parameters, models, and contextual data, the naive approach of hardcoding prompts as strings becomes a maintenance nightmare.
This article explores a template-based prompt rendering system designed for multi-step AI orchestration. We’ll dive into the architecture of a real-world social media content generation platform that transforms complex business requirements into publishable content across multiple platforms, all powered by a surprisingly elegant prompt management system.
The problem: when prompts become a code smell#
Imagine building an AI system that generates a complete social media posting plan. This isn’t a single API call; it’s an orchestrated dance of multiple AI interactions:
- Analyze media attached by the user
- Generate strategic pillars based on brand identity, goals, and events
- Distribute content across platforms and timeframes
- Create platform-specific content (Instagram captions, LinkedIn posts, story scripts)
Each step requires different context, different AI models, different parameters. The traditional approach looks something like this:
prompt := fmt.Sprintf(`You are a social media content creator.
Create an Instagram caption for a post about %s.
Brand voice: %s
Target audience: %s
Include %d hashtags.`,
productName, brandVoice, audience, hashtagCount)
response, err := llmClient.Complete(ctx, prompt)

This becomes unwieldy fast. The prompt logic mixes with business logic. Changing a prompt requires modifying Go code. Non-technical team members can’t contribute to prompt improvements. Testing becomes a nightmare.
There had to be a better way.
The solution: using template files#
The architecture we’ll explore treats prompts as standalone, version-controllable artifacts: Markdown files with embedded metadata and template logic. Here’s what a prompt template looks like:
---
temperature: 0.8
max_tokens: 1000
model: "anthropic/claude-sonnet-4"
---
<system>
You are a professional social media copywriter specialized in {{.Platform}}.
Your writing style is {{.BrandVoice}}.
</system>
<user>
Create an engaging caption for the following post:
--- POST DESCRIPTION ---
{{.Description}}
--- END POST DESCRIPTION ---
{{if .TargetAudience}}
--- TARGET AUDIENCE ---
{{.TargetAudience}}
--- END TARGET AUDIENCE ---
{{end}}
{{if .Products}}
--- FEATURED PRODUCTS ---
{{ range $product := .Products }}
- {{$product.Name}}: {{$product.Description}}
{{ end }}
--- END FEATURED PRODUCTS ---
{{end}}
Requirements:
- Include 3-5 relevant hashtags
- Add an engaging call-to-action
- Match the brand voice
- Keep it under 200 characters for optimal engagement
</user>

This single file contains everything the AI call needs to know: the model to use, the temperature setting, the system and user messages, and dynamic placeholders for runtime data.
Anatomy of a prompt template#
Let’s break down the three core components:
1. YAML front matter: the configuration layer#
---
temperature: 0.7
max_tokens: 6000
model: "anthropic/claude-sonnet-4"
---

The front matter defines the “how” of the AI call. This isn’t just convenience; it’s strategic. Different tasks require different model characteristics:
- Creative writing (captions, story scripts): Higher temperature, creative models
- Analytical tasks (distribution planning): Lower temperature, reasoning-focused models
- Vision tasks (image description): Multimodal models with specific capabilities
By embedding these parameters in the template, you create self-describing prompts. A team member reviewing the prompt file immediately understands not just what the AI is asked to do, but how it’s configured to respond.
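To make that concrete, an analytical prompt (say, distribution planning) might pin a cooler configuration than the caption template shown earlier. The values here are illustrative, not recommendations:

```yaml
---
temperature: 0.2   # analytical task: favor deterministic, reasoning-heavy output
max_tokens: 6000   # leave room for a full multi-platform plan
model: "anthropic/claude-sonnet-4"
---
```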
2. System and user sections: clear role separation#
<system>
You are a professional social media copywriter specialized in {{.Platform}}.
Your writing style is {{.BrandVoice}}.
</system>
<user>
Create an engaging caption for the following post:
{{.Description}}
Include 3-5 relevant hashtags and a call-to-action.
</user>

The explicit <system> and <user> tags map directly to the message structure of modern LLM APIs. This isn’t just syntactic sugar; it enforces clear separation between:
- System instructions: Persistent context, role definition, behavioral constraints
- User content: Dynamic, request-specific data and tasks
This separation makes prompts easier to audit, debug, and optimize. You can modify the system prompt independently from the data injection logic.
3. Go templates: the dynamic engine#
The real power comes from Go’s template engine. The prompts support:
Conditional rendering:
{{if .Products}}
--- FEATURED PRODUCTS ---
{{ range $product := .Products }}
- {{$product.Name}}: {{$product.Description}}
{{end}}
--- END FEATURED PRODUCTS ---
{{end}}

Iteration with data handling:
{{ range $index, $hashtag := .SuggestedHashtags }}
#{{$hashtag}}{{if ne $index (minus (len $.SuggestedHashtags) 1)}} {{end}}
{{end}}

(Here minus is a custom helper registered via the template’s Funcs map; Go’s template language has no built-in arithmetic.)

Date formatting:
Post scheduled for: {{.PublishDate.Format "January 2, 2006"}}

Note that Go’s time formatting takes the fixed reference date January 2, 2006 as its layout string, not an example of the desired output.

Conditional field handling:
Target Audience: {{if .TargetAudience}}{{.TargetAudience}}{{else}}General audience{{end}}
Brand Voice: {{.BrandVoice}}
Platform: {{.Platform}}

This creates prompts that are truly dynamic, adapting not just to different data, but to the shape of that data.
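The conditional behavior is plain Go text/template semantics, so it can be exercised in isolation. A minimal, self-contained sketch (not the article’s parser) showing how an optional section disappears when its field is absent:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// A tiny template body with one required field and one optional
// section. The "{{-" trim markers keep whitespace tidy when the
// optional section is skipped.
const body = `Caption request: {{.Description}}
{{- if .TargetAudience}}
Target audience: {{.TargetAudience}}
{{- end}}`

// render executes the template against an arbitrary data map,
// mirroring how the article's steps feed a map into their prompts.
func render(data map[string]any) string {
	t := template.Must(template.New("prompt").Parse(body))
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	// Without TargetAudience the section is omitted entirely.
	fmt.Println(render(map[string]any{"Description": "Spring sale"}))
	fmt.Println("---")
	fmt.Println(render(map[string]any{
		"Description":    "Spring sale",
		"TargetAudience": "Gen Z shoppers",
	}))
}
```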
The prompt parser: where magic meets engineering#
The template system is powered by a dedicated parser that transforms raw Markdown into executable prompt components.
The parser performs three critical operations:
- Extracts YAML front matter using delimiter detection (---)
- Parses XML-like sections using regex to separate system and user content
- Executes Go templates with the provided data context
The result is a Prompt struct containing:
type PromptAttributes struct {
Temperature *float64
Model *string
MaxTokens *int `yaml:"max_tokens"`
}
type Prompt struct {
Attributes PromptAttributes
System *string
User *string
}

This struct becomes the single source of truth for the LLM service call.
(The parser is available as a separate package – see the Appendix below)
The step pattern: input -> process -> output#
With the prompt system in place, the architecture uses a consistent “Step” pattern for each AI interaction. Let’s look at a simplified example: generating a social media caption.
package contentgen
import (
"context"
_ "embed"
)
//go:embed prompts/generate_caption.md
var generateCaptionPrompt string
type Caption struct {
Text string `json:"text" jsonschema_description:"The main caption text"`
Hashtags []string `json:"hashtags" jsonschema_description:"Relevant hashtags"`
CallToAction string `json:"call_to_action" jsonschema_description:"Engagement prompt"`
}
func GenerateCaption(
ctx context.Context,
llmService LLMService,
postDescription string,
brandVoice string,
platform string,
) (*Caption, error) {
// Prepare input data for the template
input := map[string]interface{}{
"Description": postDescription,
"BrandVoice": brandVoice,
"Platform": platform,
}
caption := &Caption{}
// Execute the prompt template and get structured output
err := llmService.ChatCompletionWithStructuredOutput(
ctx,
input,
caption,
generateCaptionPrompt,
)
if err != nil {
return nil, err
}
return caption, nil
}

And here’s what the corresponding prompt template looks like:
---
temperature: 0.8
max_tokens: 1000
model: "anthropic/claude-sonnet-4"
---
<system>
You are a professional social media copywriter specialized in {{.Platform}}.
Your writing style is {{.BrandVoice}}.
</system>
<user>
Create an engaging caption for the following post:
{{.Description}}
Format your response with the caption text, relevant hashtags, and a call-to-action.
</user>

Key architectural elements#
1. Embedded Prompts via //go:embed
//go:embed prompts/generate_caption.md
var generateCaptionPrompt string

Go’s embed directive compiles the prompt file directly into the binary. The prompt is version-controlled alongside the code, but remains easily editable as a separate file during development.
2. Typed Output Structures
type Caption struct {
Text string `json:"text" jsonschema_description:"The main caption text"`
Hashtags []string `json:"hashtags" jsonschema_description:"Relevant hashtags"`
CallToAction string `json:"call_to_action" jsonschema_description:"Engagement prompt"`
}

The output isn’t a raw string; it’s a typed Go struct. The jsonschema_description tags serve double duty:
- They document the expected output for developers
- They’re used to generate JSON Schema for structured output enforcement, ensuring the AI returns properly formatted data
3. Flexible Input Mapping
input := map[string]interface{}{
"Description": postDescription,
"BrandVoice": brandVoice,
"Platform": platform,
}

The input is a dynamic map, allowing each step to customize exactly what data flows into its prompt template. The template’s {{.FieldName}} placeholders reference these keys directly, making the connection between code and template explicit.
The LLM service: structured output and automatic retries#
The core AI interaction happens through a dedicated LLM service that wraps complexity into a clean interface. The public API is straightforward:
type LLMService interface {
ChatCompletionWithStructuredOutput(
ctx context.Context,
inputData any,
outputModel any,
promptStr string,
) error
}

The implementation handles three key responsibilities:
1. Automatic Retry Logic
func (s *LLMService) ChatCompletionWithStructuredOutput(
ctx context.Context,
inputData any,
outputModel any,
promptStr string,
) error {
const maxRetries = 3
for attempt := 1; attempt <= maxRetries; attempt++ {
err := s.executeCompletion(ctx, inputData, outputModel, promptStr)
if err == nil {
return nil
}
if attempt == maxRetries {
return fmt.Errorf("failed after %d attempts: %w", maxRetries, err)
}
}
return nil
}

Transient failures (rate limits, network issues) are handled transparently, improving reliability without burdening calling code with retry logic.
2. Template Parsing and Rendering
func (s *LLMService) executeCompletion(
ctx context.Context,
inputData any,
outputModel any,
promptStr string,
) error {
// Parse the template with provided data
prompt, err := promptparser.Parse(promptStr, inputData)
if err != nil {
return fmt.Errorf("failed to parse prompt: %w", err)
}
// Extract configuration, dereferencing the optional attributes
// with service-level defaults (fields assumed on the service)
modelName, temperature, maxTokens := s.defaultModel, s.defaultTemperature, s.defaultMaxTokens
if m := prompt.Attributes.Model; m != nil { modelName = *m }
if t := prompt.Attributes.Temperature; t != nil { temperature = *t }
if mt := prompt.Attributes.MaxTokens; mt != nil { maxTokens = *mt }
// Build the conversation, skipping sections the template omits
var messages []Message
if prompt.System != nil { messages = append(messages, Message{Role: "system", Content: *prompt.System}) }
if prompt.User != nil { messages = append(messages, Message{Role: "user", Content: *prompt.User}) }
// Call the LLM API with structured output
return s.callLLMWithStructuredOutput(
ctx,
messages,
modelName,
temperature,
maxTokens,
outputModel,
)
}

The service acts as the bridge between template-based prompts and the underlying LLM API, handling all the parsing and configuration extraction.
3. Usage Tracking
func (s *LLMService) callLLMWithStructuredOutput(
ctx context.Context,
messages []Message,
model string,
temperature float64,
maxTokens int,
outputModel any,
) error {
response, err := s.llmClient.CreateStructuredCompletion(
ctx,
model,
messages,
temperature,
maxTokens,
outputModel,
)
if err != nil {
return err
}
// Track token usage for cost monitoring and optimization
s.trackUsage(ctx, Usage{
Model: model,
PromptTokens: response.PromptTokens,
CompletionTokens: response.CompletionTokens,
TotalCost: calculateCost(model, response.PromptTokens, response.CompletionTokens),
})
return nil
}

Every API call logs token consumption and estimated costs, enabling data-driven optimization of prompt templates and model selection.
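The calculateCost helper referenced above isn’t shown; one possible sketch follows, with a made-up rate table. Real per-token prices change over time, so they belong in configuration rather than code:

```go
package main

import "fmt"

// Hypothetical USD rates per million tokens, keyed by model name.
// These numbers are illustrative only.
var costPerMillionTokens = map[string]struct{ in, out float64 }{
	"anthropic/claude-sonnet-4": {in: 3.0, out: 15.0},
}

// calculateCost estimates the dollar cost of a single completion
// from its prompt and completion token counts.
func calculateCost(model string, promptTokens, completionTokens int) float64 {
	r, ok := costPerMillionTokens[model]
	if !ok {
		return 0 // unknown model: report zero rather than guess
	}
	return float64(promptTokens)/1e6*r.in + float64(completionTokens)/1e6*r.out
}

func main() {
	fmt.Printf("%.4f\n", calculateCost("anthropic/claude-sonnet-4", 1000, 500))
}
```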
What makes this architecture outstanding#
Having explored the internals, let’s highlight what sets this approach apart:
1. Separation of concerns#
Prompts live in .md files. Business logic lives in Go code. Model configuration lives in YAML. Each concern is addressable independently:
- Prompt engineers can refine prompts without touching code
- Developers can modify orchestration without breaking prompts
- Ops teams can adjust model configurations without deployments (via configuration overrides)
2. Testability#
Each step is a pure function: given inputs and a prompt, produce structured output. This enables:
- Unit testing with mocked LLM responses
- Integration testing of prompt templates with sample data
- Regression testing when prompts change
3. Observability#
Every AI interaction logs:
- Full rendered prompts (in debug mode)
- Token consumption
- Estimated costs
- Response latency
This visibility is crucial for optimizing both quality and costs.
4. Progressive enhancement#
Need to add a new platform? Implement the connector interface. Need a new content type? Add a new step. Need to improve caption quality? Edit the Markdown prompt file. The architecture accommodates growth without requiring rewrites.
5. Human-readable prompts#
The Markdown format means prompts are documentation. Reading generate_caption.md tells you exactly what the AI is asked to do, complete with all the instructions, constraints, and formatting rules. No need to trace through code to understand AI behavior.
Lessons learned and best practices#
Building and maintaining this system revealed several insights:
1. Prompt templates should be comprehensive but not exhaustive. Include all necessary context, but don’t overload with edge cases. The AI handles unexpected inputs better than rigid rule lists.
2. Use clear delimiters for data sections.
--- POST DESCRIPTION ---
{{.Description}}
--- END POST DESCRIPTION ---

These visual markers help both humans and AI understand data boundaries, improving output reliability.
3. Embrace optional sections.
{{if .TargetAudience}}
--- TARGET AUDIENCE ---
{{.TargetAudience}}
--- END TARGET AUDIENCE ---
{{end}}

Prompts should gracefully handle missing data rather than failing or producing placeholder text.
4. Version your prompts like code. Prompt changes affect output quality. Track them, review them, test them.
5. Monitor token consumption aggressively. Template-based prompts can grow surprisingly large. Establish budgets and alerts.
Conclusion: templates as the foundation for AI engineering#
The shift from hardcoded prompts to template-based rendering isn’t just a code organization improvement; it’s a fundamental change in how we approach AI application development.
By treating prompts as first-class, configurable, version-controlled artifacts, we gain:
- Maintainability without sacrificing flexibility
- Clarity without losing dynamic capability
- Collaboration between technical and non-technical team members
- Observability into AI behavior and costs
As AI applications grow more sophisticated, with longer chains, more models, and higher stakes, this kind of engineering rigor becomes not just helpful, but essential.
The system described here powers real-world social media content generation at scale. It’s been battle-tested with thousands of AI calls, dozens of prompt iterations, and multiple model migrations. The template architecture has proven its worth not just in the happy path, but in the messy reality of production AI systems.
The future of AI engineering isn’t just about writing better prompts; it’s about building better systems for managing them.
Appendix: The Prompt Parser Implementation#
The parser referenced above is available as a standalone package. Here is a simplified implementation:
package promptparser

import (
"fmt"

"gopkg.in/yaml.v3"
)
func Parse(promptStr string, data any) (*Prompt, error) {
// Split the prompt into front matter and content
frontMatter, content, err := extractFrontMatter(promptStr)
if err != nil {
return nil, fmt.Errorf("failed to extract front matter: %w", err)
}
// Parse front matter into PromptAttributes
var promptAttributes PromptAttributes
if frontMatter != "" {
if err := yaml.Unmarshal([]byte(frontMatter), &promptAttributes); err != nil {
return nil, fmt.Errorf("failed to parse front matter: %w", err)
}
}
// Extract system and user sections
systemContent, userContent, err := extractSections(content)
if err != nil {
return nil, fmt.Errorf("failed to extract sections: %w", err)
}
// Execute templates with provided data
var system, user *string
if systemContent != "" {
renderedSystem, err := executeTemplate("system", systemContent, data)
if err != nil {
return nil, fmt.Errorf("failed to execute system template: %w", err)
}
system = &renderedSystem
}
if userContent != "" {
renderedUser, err := executeTemplate("user", userContent, data)
if err != nil {
return nil, fmt.Errorf("failed to execute user template: %w", err)
}
user = &renderedUser
}
return &Prompt{
Attributes: promptAttributes,
System: system,
User: user,
}, nil
}