Pipulate Technical Architecture

Pipulate represents a distinctive approach to building SEO applications — one that deliberately prioritizes simplicity, observability, and user control over conventional enterprise patterns. This document explores Pipulate’s architecture for developers and technical SEOs interested in porting Jupyter notebooks to user-friendly web applications.

Architecture Overview

Pipulate’s design rests on several key architectural decisions and principles:

                 ┌─────────────┐ Like Electron, but full Linux subsystem 
                 │   Browser   │ in a folder for macOS and Windows (WSL)
                 └─────┬───────┘
                       │ HTTP/WS
                       ▼
    ┌───────────────────────────────────────┐
    │           Nix Flake Shell             │ - In-app LLM (where it belongs)
    │  ┌───────────────┐  ┌──────────────┐  │ - 100% reproducible
    │  │   FastHTML    │  │    Ollama    │  │ - 100% local
    │  │   HTMX App    │  │  Local LLM   │  │ - 100% multi-OS    
    │  └───────┬───────┘  └──────────────┘  │
    │          │                            │
    │    ┌─────▼─────┐     ┌────────────┐   │
    │    │MiniDataAPI│◄───►│ SQLite DB  │   │
    │    └───────────┘     └────────────┘   │
    └───────────────────────────────────────┘

Core Tenets

  1. Local-First & Single-Tenant: Your data, your code, your hardware. This guarantees privacy, performance, and eliminates cloud costs or vendor lock-in.

  2. Simplicity & Observability (“Know EVERYTHING!”): We deliberately avoid complex enterprise patterns (heavy ORMs, message queues, client-side state management, build steps) in favor of transparent server-side state management.

  3. Reproducibility: Nix Flakes guarantee identical development and runtime environments across macOS, Linux, and Windows (WSL), solving the “works on my machine” problem.

  4. Future-Proofing: We rely on durable technologies: standard HTTP/HTML (via HTMX), Python (supercharged by AI), Nix (for universal environments), and local AI (Ollama).

  5. WET Workflows, DRY CRUD: Workflows are intentionally explicit and step-by-step (Write Everything Twice/Explicit), making them easy to port from notebooks and debug. Standard CRUD operations leverage a reusable BaseCrud class (Don’t Repeat Yourself).


Critical Implementation Patterns for LLMs

These patterns are essential for LLMs working with Pipulate and are frequently missed:

1. The Auto-Key Generation Pattern (MOST CRITICAL)

When a user hits Enter on an empty key field, this specific sequence occurs:

  1. Form Submission: POSTs to /{APP_NAME}/init with empty pipeline_id
  2. Server Response: The init method MUST return an HX-Refresh response:
    if not user_input:
        from starlette.responses import Response
        response = Response('')
        response.headers['HX-Refresh'] = 'true'
        return response
    
  3. Page Reload: HTMX triggers a full page reload
  4. Auto-Key Population: The landing() method calls pip.generate_pipeline_key(self) to populate the input field
  5. User Interaction: User hits Enter again to start the workflow

Critical Implementation Details:
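
A hedged sketch of how these pieces typically fit together on the landing-page side, assuming the pip.generate_pipeline_key() helper and the _onfocus cursor-positioning convention referenced elsewhere in this document (illustrative, not verbatim project code):

async def landing(self, request):
    """Render the key-entry form with an auto-generated key pre-populated."""
    pip = self.pipulate
    # Step 4 of the sequence above: suggest a key so the user's second
    # Enter press starts the workflow immediately.
    suggested_key = pip.generate_pipeline_key(self)
    return Container(
        Card(
            H2(self.DISPLAY_NAME),
            Form(
                Input(
                    name="pipeline_id",
                    value=suggested_key,
                    # Place the cursor at the end of the suggested key
                    _onfocus="this.setSelectionRange(this.value.length, this.value.length)",
                ),
                Button("Enter", type="submit"),
                # Step 1 of the sequence above: POST to /{APP_NAME}/init
                hx_post=f"/{self.app_name}/init",
                hx_target=f"#{self.app_name}-container",
            ),
        ),
        id=f"{self.app_name}-container",
    )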

2. APP_NAME vs. Filename Distinction

The APP_NAME constant and the plugin filename serve different purposes, and keeping them straight is critical for data integrity: the numbered filename controls how the plugin is registered and where it appears in the menu, while APP_NAME is the stable identifier under which workflow records are stored.

Critical Rule: Never change APP_NAME after workflows have been created, or existing workflow data will be orphaned.
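
A hypothetical illustration of the distinction (the names below are made up):

# plugins/035_kungfu_workflow.py
#   filename  → menu position 35, registered name "kungfu_workflow"
class KungfuWorkflow:
    APP_NAME = "kungfu"   # internal key under which this workflow's records are stored

# Renaming the file only changes where the plugin appears and how it is registered;
# changing APP_NAME strands every workflow record saved under the old name.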

3. Plugin Discovery System

# Plugin Discovery Flow
plugins/
├── 010_tasks.py          → Registered as "tasks" (position 10)
├── 020_hello_workflow.py → Registered as "hello_workflow" (position 20)
├── xx_experimental.py    → Skipped (development prefix)
├── test (Copy).py        → Skipped (parentheses)
└── 999_advanced.py       → Registered as "advanced" (position 999)
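
A simplified sketch of the discovery rules illustrated above (an approximation, not the actual server.py implementation):

from pathlib import Path
import re

def discover_plugins(plugins_dir: str = "plugins"):
    """Yield (menu_position, plugin_name, path) for every registrable plugin file."""
    for path in sorted(Path(plugins_dir).glob("*.py")):
        stem = path.stem
        if stem.lower().startswith("xx_"):        # development prefix → skipped
            continue
        if "(" in stem or ")" in stem:            # "(Copy)" files → skipped
            continue
        match = re.match(r"^(\d+)_(.+)$", stem)   # "010_tasks" → position 10, name "tasks"
        if not match:
            continue
        yield int(match.group(1)), match.group(2), path

# e.g. list(discover_plugins()) → [(10, "tasks", ...), (20, "hello_workflow", ...), (999, "advanced", ...)]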

Technology Stack

FastHTML

FastHTML is a Python web framework that prioritizes simplicity. It generates HTML directly from Python objects (no template language like Jinja2) and minimizes JavaScript by design.

from fasthtml.common import *

# Create app with SQLite database
app, rt, users, User = fast_app('data.db', users={'username': str})

@rt('/')
def get():
    return Html(
        Body(
            Main(
                H1("User List"),
                Form(
                    Input(name="username", placeholder="New user"),
                    Button("Add", type="submit"),
                    hx_post="/add-user",
                    hx_target="#user-list",
                    hx_swap="innerHTML"
                ),
                Ul(
                    id="user-list",
                    *[Li(user.username) for user in users()]
                )
            )
        )
    )

@rt('/add-user', methods=['POST'])
def add_user(username: str = ""):
    if username:
        users.insert(username=username)
    return Ul(*[Li(user.username) for user in users()])

HTMX Integration

HTMX enables dynamic, interactive UIs directly in HTML via attributes, minimizing the need for custom JavaScript. Pipulate uses it to paint server-rendered HTML updates into the page, an approach with the advantages illustrated below:

                        HTMX+Python enables a world-class
                 Python front-end Web Development environment.
                             ┌─────────────────────┐
                             │    Navigation Bar   │  - No template language (like Jinja2)
                             ├─────────┬───────────┤  - HTML elements are Python functions
  Simple Python back-end     │  Main   │   Chat    │  - Minimal custom JavaScript
  HTMX "paints" HTML into    │  Area   │ Interface │  - No React/Vue/Angular overhead
  the DOM on demand──────►   │         │           │  - No virtual DOM, JSX, Redux, etc.
                             └─────────┴───────────┘

MiniDataAPI

MiniDataAPI provides simple, dictionary-based interaction with SQLite tables:

# Example unpacking from server.py
app, rt, (store, Store), (tasks, Task) = fast_app(
    "data/data.db",  
    # Schema definitions as keyword arguments:
    store={'key': str, 'value': str, 'pk': 'key'},
    task={'id': int, 'name': str, 'done': bool, 'pk': 'id'}
)

# To use:
tasks.insert(name="New task", done=False)
all_tasks = tasks()  # Fetch all
one_task = tasks(1)  # Fetch by ID
# Restrict all subsequent queries to completed tasks (persistent filter)
tasks.xtra(done=True)
done_tasks = tasks()

Ollama for Local LLMs

Ollama allows running AI models locally, providing:

                   ┌──────────────────┐
                   │   Local Ollama   │ - No API keys needed
                   │      Server      │ - Completely private processing
                   └────────┬─────────┘
                            │ Streaming via WebSocket
                            ▼
                   ┌──────────────────┐
                   │   Pipulate App   │ - Monitors WS for JSON/commands
                   │(WebSocket Client)│ - Parses responses in real-time
                   └────────┬─────────┘
                            │ In-memory or DB backed
                            ▼
                   ┌──────────────────┐
                   │     Bounded      │ - Manages context window (~128k)
                   │   Chat History   │ - Enables RAG / tool integration
                   └──────────────────┘
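
A minimal sketch of the Ollama side of this flow, using the official ollama Python client and a simple bounded history (the model name and message cap are illustrative; real context management is token-based):

import ollama
from collections import deque

chat_history = deque(maxlen=100)  # bounded chat history (real limit is the ~128k-token context window)

def stream_reply(prompt: str, model: str = "llama3") -> str:
    """Stream a local model's reply, accumulating it into the bounded history."""
    chat_history.append({"role": "user", "content": prompt})
    reply = ""
    for chunk in ollama.chat(model=model, messages=list(chat_history), stream=True):
        piece = chunk["message"]["content"]
        reply += piece   # in Pipulate, each piece is relayed to the browser over the chat WebSocket
    chat_history.append({"role": "assistant", "content": reply})
    return reply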

Nix for Environment Reproducibility

Nix Flakes guarantee identical development and runtime environments across operating systems. This ensures:

                   ┌──────────────────┐
                   │  Linux / macOS   │ - Write code once, run anywhere
                   │  Windows (WSL)   │ - Consistent dev environment via Nix
                   └────────┬─────────┘
                            │ Nix manages dependencies
                            ▼
                   ┌──────────────────┐
                   │   CUDA Support   │ - Auto-detects NVIDIA GPU w/ CUDA
                   │   (if present)   │ - Uses GPU for LLM acceleration
                   └──────────────────┘   - Falls back to CPU if no CUDA

Workflow System Architecture

Pipulate’s primary feature is its step-based workflow system, designed specifically for porting Jupyter Notebook concepts into guided, end-user-friendly interfaces. The system’s core innovation is the run_all_cells() method, which creates a perfect mental model by directly mirroring Jupyter’s “Run All Cells” functionality.

Step-Based Pipeline Flow

  ┌─────────┐        ┌─────────┐        ┌─────────┐   - Fully customizable steps
  │ Step 01 │─piped─►│ Step 02 │─piped─►│ Step 03 │   - Interruption-safe & resumable
  └─────────┘        └─────────┘        └─────────┘   - Easily ported from Notebooks
       │                  │                  │        - One DB record per workflow run
       ▼                  ▼                  ▼
  State Saved        State Saved         Finalized?

The Chain Reaction Pattern: Powered by run_all_cells()

The heart of Pipulate’s workflow system is the “chain reaction” pattern, a critical HTMX mechanism that enables automatic progression between steps. It is encapsulated by the run_all_cells() method, which creates the same mental model as Jupyter’s “Run All Cells” command. The key elements:

return Div(
    Card(...),  # Current step content
    # CRITICAL: This inner Div triggers loading of the next step
    Div(id=next_step_id, hx_get=f"/{app_name}/{next_step_id}", hx_trigger="load"),
    id=step_id
)

This pattern:

  1. Uses the inner Div with id=next_step_id as a container for the next step
  2. The hx_get attribute requests the next step from the server
  3. CRITICALLY: hx_trigger="load" makes this happen automatically when current step renders

Important: Never remove hx_trigger="load" — it’s essential for reliable step progression.

The run_all_cells() Pedagogical Breakthrough: The method name creates instant understanding. Anyone familiar with Jupyter notebooks immediately grasps the concept: workflows execute from top to bottom, stopping only when they encounter a step that requires input, exactly like running all cells in a notebook. This naming choice makes the entire system more intuitive for both developers and AI assistants.
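
A hedged sketch of what run_all_cells() conceptually does (an illustration of the idea, not the exact helper from Pipulate’s codebase): it renders the workflow container with the first step set to load immediately, and the chain reaction pattern above carries execution through every subsequent step.

def run_all_cells(app_name: str, steps):
    first_step_id = steps[0].id
    return Div(
        # Loading step_01 on render kicks off the chain; each step then renders
        # its own next-step placeholder Div with hx_trigger="load".
        Div(id=first_step_id, hx_get=f"/{app_name}/{first_step_id}", hx_trigger="load"),
        id=f"{app_name}-container",
    )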

Workflow Implementation Pattern

Creating workflows follows a consistent pattern:

from collections import namedtuple
Step = namedtuple('Step', ['id', 'done', 'show', 'refill', 'transform'], defaults=(None,))

class MyWorkflow:
    APP_NAME = "unique_name"        # Unique identifier
    DISPLAY_NAME = "User-Facing Name"  # UI display name
    ENDPOINT_MESSAGE = "Welcome message"   # Landing page description
    TRAINING_PROMPT = "workflow_name.md"  # Training context for AI assistance
    
    def __init__(self, app, pipulate, pipeline, db, app_name=APP_NAME):
        self.app = app
        self.pipulate = pipulate
        self.pipeline = pipeline
        self.db = db
        self.app_name = app_name
        self.message_queue = pipulate.get_message_queue()
        
        # Define steps
        self.steps = [
            Step(id='step_01', done='first_field', show='First Step', refill=True),
            Step(id='step_02', done='second_field', show='Second Step', refill=True),
            Step(id='finalize', done='finalized', show='Finalize', refill=False)
        ]
        
        # Register routes
        self.register_routes(app.route)
    
    # Handler methods for each step
    async def step_01(self, request):
        """Handler for step 01 display"""
        # ... implementation
    
    async def step_01_submit(self, request):
        """Handler for step 01 form submission"""
        # ... implementation
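
A hedged sketch of what a submit handler typically does, combining the state helpers described later in this document with the chain reaction return shape shown above (illustrative, not verbatim project code):

async def step_01_submit(self, request):
    pip, app_name = self.pipulate, self.app_name
    form = await request.form()
    value = form.get("first_field", "")

    # Persist this step's value into the workflow's JSON state blob
    pipeline_id = self.db.get("pipeline_id", "unknown")
    state = pip.read_state(pipeline_id)
    state["first_field"] = value
    pip.write_state(pipeline_id, state)

    # Show the completed step and trigger the next one (chain reaction)
    return Div(
        Card(H4("First Step"), P(value)),
        Div(id="step_02", hx_get=f"/{app_name}/step_02", hx_trigger="load"),
        id="step_01",
    )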

Porting from Jupyter Notebooks

Pipulate is specifically designed to convert Jupyter notebook cells into guided workflow steps:

      ┌──────────────────┐    ┌──────────────────┐
      │   Jupyter Lab    │    │    FastHTML      │
      │    Notebooks     │    │     Server       │
      │  ┌──────────┐    │    │  ┌──────────┐    │
      │  │ Cell 1   │    │    │  │ Step 1   │    │
      │  │          │    │--->│  │          │    │
      │  └──────────┘    │    │  └──────────┘    │
      │  ┌──────────┐    │    │  ┌──────────┐    │
      │  │ Cell 2   │    │    │  │ Step 2   │    │
      │  │          │    │--->│  │          │    │
      │  └──────────┘    │    │  └──────────┘    │
      │  localhost:8888  │    │  localhost:5001  │
      └──────────────────┘    └──────────────────┘

Best Practices for Notebook → Workflow Conversion

  1. Split Cell Logic: Split complex notebook cells into smaller, more focused steps
  2. Identify User Input Points: Each form input becomes a distinct workflow step
  3. Use WET Code: Embrace explicit, self-contained step implementations
  4. Preserve State Flow: Ensure data flows properly between steps via the transform function (see the sketch after this list)
  5. Add User Guidance: Provide clear instructions for each step
  6. Implement Validation: Add form validation for better user experience
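
A hedged illustration of the transform field mentioned in item 4 (the exact call signature may differ in the real codebase): it lets a step derive its suggested value from the previous step’s output, mirroring how one notebook cell feeds the next.

Step(
    id='step_02',
    done='url_list',
    show='Extract URLs',
    refill=True,
    # Hypothetical transform: seed this step with the first 10 lines of step_01's output
    transform=lambda prev_value: "\n".join(prev_value.splitlines()[:10]),
)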

Plugin System Architecture

Pipulate supports two main types of plugins:

  1. CRUD Apps: Standard data management interfaces inheriting from BaseCrud
  2. Workflows: Step-by-step processes implemented as plain Python classes

The plugin discovery system:

# Naming conventions for plugins
plugins/010_hello_workflow.py      # Registered as "hello_workflow" in menu position 10
plugins/xx_experimental_flow.py    # Skipped (development prefix)
plugins/hello_flow (Copy).py       # Skipped (temporary copy)
plugins/hello_flow Copy.py         # Skipped (temporary copy)

Workflow for Creating New Plugins

  1. Copy a Template: Copy an existing template (e.g., 500_hello_workflow.py) to 500_hello_workflow (Copy).py
  2. Modify: Develop your workflow (files with parentheses in the name won’t auto-register)
  3. Test: Rename to xx_my_flow.py while iterating (the server auto-reloads, but xx_-prefixed files still won’t register)
  4. Deploy: Rename with a numeric prefix (e.g., 030_my_flow.py) to assign a menu position and activate the plugin

State Management

Pipulate uses two complementary approaches to state management:

1. DictLikeDB for Workflow State

Workflows store their entire state as JSON blobs in the pipeline table, which enables the local-first benefits illustrated below:

      ┌───────────────────────────────┐ # Benefits of Local-First Simplicity
      │          Web Browser          │
      │                               │ - No mysterious client-side state
      │    ┌────────────────────┐     │ - No full-stack framework churn
      │    │   Server Console   │     │ - No complex ORM or SQL layers
      │    │     & Web Logs     │     │ - No external message queues
      │    └─────────┬──────────┘     │ - No build step required
      │              ▼                │ - Direct, observable state changes
      │    ┌─────────────────────┐    │
      │    │  Server-Side State  │    │ 
      │    │  DictLikeDB + JSON  │ ◄─── (Conceptually like server-side cookies)
      │    └─────────────────────┘    │ - Enables the "Know EVERYTHING!" philosophy
      └───────────────────────────────┘

# Reading workflow state
pipeline_id = db.get("pipeline_id", "unknown")
state = pip.read_state(pipeline_id)

# Updating workflow state
state[step.done] = value
pip.write_state(pipeline_id, state)

2. MiniDataAPI for CRUD Operations

Standard database operations use MiniDataAPI’s table objects:

# Insert a new profile
profiles.insert(name="New Profile")

# Update a profile
profiles.update(1, name="Updated Profile")

# Delete a profile
profiles.delete(1)

# Query profiles
all_profiles = profiles()
specific_profile = profiles(1)

Communication Channels

Pipulate uses three primary communication methods:

  1. HTTP: Standard request/response for most page loads and form submissions
  2. WebSockets: Bidirectional communication for LLM streaming and chat (see the sketch after this list)
  3. Server-Sent Events (SSE): Unidirectional server-to-client updates for live reloading and progress notifications
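
A minimal sketch of channel 2 on the server side, assuming FastHTML’s @app.ws decorator and a hypothetical stream_llm_response() helper wrapping the Ollama call:

@app.ws('/ws')
async def chat_ws(msg: str, send):
    """Receive a chat message and stream the LLM's reply back as rendered HTML."""
    async for chunk in stream_llm_response(msg):   # hypothetical async streaming helper
        # Each chunk is appended to the chat pane via an out-of-band swap
        await send(Div(chunk, hx_swap_oob="beforeend", id="msg-list"))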

UI Layout Architecture

The application interface is organized into distinct areas:

    ┌─────────────────────────────┐
    │        Navigation           │ (Profiles, Apps, Search)
    ├───────────────┬─────────────┤
    │               │             │
    │    Main Area  │    Chat     │ (Workflow/App UI)
    │   (Pipeline)  │  Interface  │ (LLM Interaction)
    │               │             │
    ├───────────────┴─────────────┤
    │        Poke Button          │ (Quick Action)
    └─────────────────────────────┘

Development Environment

The Pipulate development experience leverages:

        ┌─────────────┐         ┌──────────────┐
        │ File System │ Changes │  AST Syntax  │ Checks Code
        │  Watchdog   │ Detects │   Checker    │ Validity
        └──────┬──────┘         └───────┬──────┘
               │ Valid Change           │
               ▼                        ▼
 ┌───────────────────────────┐     ┌──────────┐
 │    Uvicorn Server         │◄─── │  Reload  │ Triggers Restart
 │ (Handles HTTP, WS, SSE)   │     │ Process  │
 └───────────────────────────┘     └──────────┘
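
A hedged sketch of the “check syntax before restarting” idea in the diagram above (an approximation, not the project’s actual watchdog handler):

import ast

def is_valid_python(path: str) -> bool:
    """Parse the changed file; only a syntactically valid change should trigger a restart."""
    try:
        with open(path, encoding="utf-8") as f:
            ast.parse(f.read())
        return True
    except SyntaxError:
        return False   # skip the restart so a half-saved file can't crash the running server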

Common LLM Implementation Mistakes

LLMs frequently make these errors when working with Pipulate:

  1. Missing HX-Refresh Response: Forgetting to return the refresh response for empty keys
  2. Incorrect Key Generation: Not using pip.generate_pipeline_key(self) properly
  3. Missing Cursor Positioning: Forgetting the _onfocus attribute for user experience
  4. Wrong Route Handling: Not understanding the difference between landing page and init routes
  5. State Inconsistency: Not properly handling the key generation and storage flow
  6. APP_NAME Changes: Modifying APP_NAME after deployment, orphaning existing data
  7. Chain Reaction Breaks: Not properly implementing the HTMX step progression pattern

Advanced Patterns

Placeholder Steps Pattern

For planning workflow structure before implementing detailed functionality:

Step(
    id='step_XX',            # Use proper sequential numbering
    done='placeholder',      # Field that must acquire data before the step proceeds
    show='Placeholder Step', # What the user sees as that field's label
    refill=True,             # Whether that field refills on revert-to-step
)

Breaking the Chain (Cautionary Pattern)

The no-chain-reaction class should only be used in specific scenarios:

# For polling operations (continuous status checking):
return Div(
    progress_indicator,
    cls="polling-status no-chain-reaction",
    hx_get=f"/{app_name}/check_status",
    hx_trigger="load, every 2s",
    hx_target=f"#{step_id}",
    id=step_id
)

Data Visualization Integration

Pipulate supports embedding visualization components:

import pandas as pd
import matplotlib.pyplot as plt
from io import BytesIO
import base64

# Generate a plot from a DataFrame (df) produced by an earlier workflow step
fig, ax = plt.subplots(figsize=(10, 6))
df.plot(ax=ax)
plt.tight_layout()

# Convert to base64 for embedding
buffer = BytesIO()
plt.savefig(buffer, format='png')
buffer.seek(0)
image_base64 = base64.b64encode(buffer.read()).decode('utf-8')

# Return in HTML
return Div(
    Card(
        H4("Data Visualization"),
        Img(src=f"data:image/png;base64,{image_base64}",
            style="width:100%;max-width:800px"),
    ),
    Div(id=next_step_id, hx_get=f"/{app_name}/{next_step_id}", hx_trigger="load"),
    id=step_id
)

For Technical SEOs: Bringing Python SEO to the Masses

If you’re a technical SEO who uses Python for SEO tasks, Pipulate offers a unique opportunity to make your tools accessible to non-technical team members:

  1. Convert Existing Notebooks: Turn your current SEO data processing notebooks into guided workflows
  2. Standardize Data Collection: Create consistent interfaces for gathering API credentials and configuration
  3. Visualize Results: Present complex SEO data with clear visualizations
  4. Share Your Expertise: Guide users through your SEO methodology step-by-step
  5. Maintain Privacy: Keep sensitive SEO data and API keys local and secure

Core Principles for Developers

Remember these guiding principles when working with Pipulate:

  1. Keep it simple. Avoid complex patterns when simple ones will work.
  2. Stay local and single-user. Embrace the benefits of local-first design.
  3. Be explicit over implicit. WET code that’s clear is better than DRY code that’s obscure.
  4. Preserve the chain reaction. Maintain the core progression mechanism in workflows.
  5. Embrace observability. Make state changes visible and debuggable.

Contributing to Pipulate

Contributions are welcome! Please adhere to the project’s core philosophy: keep things local-first, simple, and observable, preserve the chain reaction pattern, and favor explicit WET workflows over clever abstractions.

License

This project is licensed under the MIT License. See the LICENSE file for details.