Design Patterns for Solo Developers

There’s a pattern I see constantly in architecture content: “here’s how Netflix does it.” Kafka clusters, Kubernetes operators, service meshes, dedicated worker fleets. It’s awesome, it’s correct, but it’s also potentially over the top if you’re one person building a product.

But the opposite advice - “just use a simple queue, you’re not Netflix” - also misses the mark, because how you build the simple version matters. If your simple version is a mess of hacked-together polling loops with no resilience, you’re not buying time; you’re accumulating debt and getting dangerously close to the phenomenon of “proto-duction”.

This post is about the middle path. I built a production async processing pipeline - multi-step job processing, file generation, cloud storage, real-time progress, resume on failure - as a solo developer.

I’ll show you exactly how it worked initially, where the enterprise patterns appear in simplified form, and what the concrete upgrade path looks like when you need it.

First, a quick disclaimer. I put this together from notes I’ve accumulated building products over the years. There is no right or wrong answer to any of this, and I was a bit reluctant to post it as I wasn’t sure it would add value. If it does for you, fantastic! If not, I’m always open to discussion on this sort of thing.

I also added a quick price comparison for each pattern, and for these I got an AI to work it out, so it’s very rough.

Let’s go!

Pattern 1: The Database-Backed Queue

Enterprise version

A dedicated message broker - think Azure Service Bus, AWS SQS, RabbitMQ.

Messages are pushed onto a queue, consumers react when a message arrives. You get built-in dead lettering, message locks, retry policies, fan-out, the whole shebang!

Solo Dev Version

A BackgroundJobs table in PostgreSQL:

CREATE TABLE "BackgroundJobs" (
    "Id"              UUID PRIMARY KEY,
    "UserId"          UUID NOT NULL,
    "JobType"         TEXT NOT NULL,
    "Status"          INT NOT NULL,  -- 0=Queued, 1=Processing, 2=Completed, 3=Failed, 4=Cancelled
    "Payload"         JSONB NOT NULL,
    "Result"          JSONB,
    "ErrorMessage"    TEXT,
    "ProgressPercent" INT,
    "StatusMessage"   TEXT,
    "CreatedAt"       TIMESTAMPTZ NOT NULL,
    "StartedAt"       TIMESTAMPTZ,
    "CompletedAt"     TIMESTAMPTZ
);

When the domain service kicks off a job, it POSTs to the processing service’s internal endpoint. That endpoint writes a row with Status=Queued and returns immediately. The caller doesn’t wait.

The PollingJobProcessor - a .NET IHostedService - runs a loop inside the processing container. It queries for the oldest queued job, marks it as Processing, calls the appropriate handler, and updates the row to Completed or Failed.

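protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        // True when a queued job was found and handled on this pass.
        var processed = await ProcessNextJobAsync(stoppingToken);

        // Wait the short delay after real work, the current backoff delay when idle.
        await Task.Delay(processed ? _minDelay : _currentDelay, stoppingToken);

        // Reset after work; otherwise double the idle delay up to the cap.
        // Assumes the delay fields are int milliseconds.
        _currentDelay = processed ? _minDelay : Math.Min(_currentDelay * 2, _maxDelay);
    }
}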

The exponential backoff is important. When there’s nothing to do, the polling interval grows, reducing pointless database round trips. When a job is found, the interval resets immediately. The database isn’t hammered for no reason.
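The “marks it as Processing” step starts to matter once more than one replica is polling, because two pollers could otherwise pick up the same row. One common way to make the claim atomic in PostgreSQL is FOR UPDATE SKIP LOCKED. The sketch below is in Python rather than .NET to keep it short, and it is an assumption about how the claim could be written, not the query this pipeline actually runs. claim_next_job and the cursor argument (any DB-API cursor, such as one from psycopg) are hypothetical names.

```python
# Claim the oldest queued job in a single statement so that two pollers
# never receive the same row. Status values follow the table above:
# 0 = Queued, 1 = Processing.
CLAIM_SQL = """
UPDATE "BackgroundJobs"
SET "Status" = 1, "StartedAt" = now()
WHERE "Id" = (
    SELECT "Id"
    FROM "BackgroundJobs"
    WHERE "Status" = 0
    ORDER BY "CreatedAt"
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING "Id", "JobType", "Payload";
"""


def claim_next_job(cursor):
    """Return the claimed row, or None when nothing is queued.

    `cursor` is assumed to be a DB-API cursor inside a transaction that the
    caller commits afterwards.
    """
    cursor.execute(CLAIM_SQL)
    return cursor.fetchone()
```

Rows locked by another poller are skipped rather than waited on, so a second replica simply moves on to the next queued job.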

Why this works at solo-dev scale

No additional infrastructure to manage
The database you already have is the queue
Job history is queryable with normal SQL - trivial to see what failed and why
Works fine up to hundreds of concurrent jobs

The upgrade path

When you hit contention (high write volume, or jobs arriving faster than the poller can process them), replace the database write with a Service Bus / SQS message. The handler code doesn’t change. KEDA has native Service Bus and SQS scalers that swap in directly for the Postgres scaler.
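The “handler code doesn’t change” claim holds most cleanly when the handler only ever sees a small queue interface. Here is a rough Python sketch of that idea. JobQueue, TableQueue, BrokerQueue and drain are made-up names, and both queues are in-memory stand-ins rather than real database or SQS / Service Bus clients.

```python
from collections import deque
from typing import Callable, List, Optional, Protocol


class JobQueue(Protocol):
    def enqueue(self, job: dict) -> None: ...
    def dequeue(self) -> Optional[dict]: ...


class TableQueue:
    """Stand-in for the BackgroundJobs table: rows carry a status."""

    def __init__(self) -> None:
        self.rows: List[dict] = []

    def enqueue(self, job: dict) -> None:
        self.rows.append({**job, "status": "Queued"})

    def dequeue(self) -> Optional[dict]:
        for row in self.rows:
            if row["status"] == "Queued":
                row["status"] = "Processing"
                return row
        return None


class BrokerQueue:
    """Stand-in for a broker client: messages disappear once received."""

    def __init__(self) -> None:
        self.messages: deque = deque()

    def enqueue(self, job: dict) -> None:
        self.messages.append(dict(job))

    def dequeue(self) -> Optional[dict]:
        return self.messages.popleft() if self.messages else None


def drain(queue: JobQueue, handler: Callable[[dict], str]) -> List[str]:
    """Run the same handler over whatever the queue hands out."""
    results = []
    while (job := queue.dequeue()) is not None:
        results.append(handler(job))
    return results
```

The handler passed to drain is identical for both queues; only the object behind the interface changes.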

AI-generated rough cost

Solo: £0/month extra - the queue is your existing database, no additional infrastructure.

Enterprise: Azure Service Bus Standard ~£8–10/month base + ~£0.80 per million operations. AWS SQS is ~£0.40 per million requests (practically free at low volume, scales linearly).

Pattern 2: KEDA Autoscaling Without Kubernetes

Enterprise version

KEDA running on Azure Kubernetes Service (AKS) or AWS Elastic Kubernetes Service (EKS), scaling dedicated worker deployments based on queue depth. Platform team owns the cluster, KEDA configuration, and scaling policies.

Solo Dev Version

Azure Container Apps has KEDA built in and managed by Microsoft. You never see the Kubernetes cluster. The scaling rule lives in Terraform:

custom_scale_rule {
  name             = "active-jobs"
  custom_rule_type = "postgresql"
  metadata = {
    query       = "SELECT COUNT(*) FROM \"BackgroundJobs\" WHERE \"Status\" IN (0, 1)"
    targetValue = "1"
  }
  authentication {
    secret_name       = "db-connection-string"
    trigger_parameter = "connection"
  }
}

KEDA runs that query on an interval and sizes the service at roughly one replica per targetValue rows counted (here, one replica per active job), up to the configured maximum. When the queue empties, it scales back to zero. The container only exists when there’s work to do.

This means at zero load you pay nothing. The container cold-starts when a job arrives, processes it, and scales back down. For a solo dev, that’s the difference between a service costing £5/month and £50/month.
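To make the targetValue arithmetic concrete, here is a simplified Python sketch of the replica calculation. It ignores polling intervals, cooldown periods and stabilisation windows, so treat it as an approximation of the behaviour rather than KEDA’s actual implementation. desired_replicas and max_replicas are names I’ve made up for illustration.

```python
import math


def desired_replicas(active_jobs: int, target_value: int, max_replicas: int) -> int:
    """Approximate replica count for a queue-depth style scaler."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    if active_jobs <= 0:
        return 0  # nothing to do: scale to zero
    # One replica per `target_value` jobs, rounded up and capped.
    return min(max_replicas, math.ceil(active_jobs / target_value))


print(desired_replicas(0, 1, 5))   # 0
print(desired_replicas(3, 1, 5))   # 3
print(desired_replicas(12, 1, 5))  # 5 (capped)
print(desired_replicas(5, 2, 10))  # 3
```

With targetValue set to "1" as in the rule above, every queued or processing job asks for its own replica until the cap is reached.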

The upgrade path

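The scaling pattern is identical on AKS, or on AWS EKS with KEDA. Swap the Postgres scaler for an SQS or Service Bus scaler. The targetValue concept is the same - scale up when queue depth exceeds a threshold. KEDA itself needs Kubernetes, so on ECS the equivalent is a service auto scaling policy driven by a queue-depth metric.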

AI-generated rough cost

Solo: ~£5–15/month - Azure Container Apps consumption plan scales to zero, so you pay only for active compute. A lightly used processing service might run a few hours a day.

Enterprise: ~£100–250/month minimum - AKS requires at least one always-on node pool. A basic 2-node cluster (Standard_D2s_v3) runs ~£120/month before you add monitoring, ingress, or node autoscaler overhead.

Pattern 3: Checkpoint Resume

Enterprise version

Idempotent message processing - each step of the pipeline publishes an event, downstream services consume it. If step 4 fails, you replay from step 4’s event without re-running steps 1–3. On Azure: Azure Event Grid or Azure Event Hubs. On AWS: Amazon EventBridge or SNS + SQS. Checkpoint state stored in Azure Cache for Redis or AWS ElastiCache for fast lookups.

Solo Dev Version

Every completed unit gets uploaded to blob storage immediately as a checkpoint:

{userId}/jobs/{jobId}/units/unit_{unitId}.output

When a job starts (or restarts), it first checks for existing checkpoint blobs and downloads them. It then skips any unit that already has a checkpoint. You only process what hasn’t been done.

This matters because the per-unit processing is the expensive step - both in time and compute cost. A job with 200 units that fails after completing unit 180 restarts at unit 181, not unit 1.
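The skip-what’s-done logic is small enough to sketch. This Python version uses a local directory as a stand-in for blob storage, with the same path layout as above. run_job, checkpoint_path and process_unit are hypothetical names, and a real implementation would call the storage SDK instead of the filesystem.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def checkpoint_path(root: Path, user_id: str, job_id: str, unit_id: int) -> Path:
    # Mirrors {userId}/jobs/{jobId}/units/unit_{unitId}.output
    return root / user_id / "jobs" / job_id / "units" / f"unit_{unit_id}.output"


def run_job(
    root: Path,
    user_id: str,
    job_id: str,
    unit_ids: Iterable[int],
    process_unit: Callable[[int], bytes],
) -> List[int]:
    """Process only units without a checkpoint; return the ones done this run."""
    processed_now = []
    for unit_id in unit_ids:
        path = checkpoint_path(root, user_id, job_id, unit_id)
        if path.exists():
            continue  # already checkpointed by an earlier attempt
        output = process_unit(unit_id)  # the expensive step; may raise
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(output)  # checkpoint immediately after each unit
        processed_now.append(unit_id)
    return processed_now
```

If process_unit raises partway through, every unit finished before the failure already has a checkpoint, so calling run_job again with the same arguments picks up at the first missing unit.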

The stale job cleanup service handles the crash case automatically. A background service runs on a schedule and finds any job stuck in Processing state beyond a timeout threshold - meaning the container died mid-job. It:

Counts how many unit checkpoints exist in blob storage
Marks the job as Failed with the checkpoint count in the error payload
Sends a failure callback to the domain service with checkpoint information

On retry, the domain service can resume from where it left off rather than starting over.
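The cleanup pass can be sketched in a few lines of Python. This is an illustration under assumptions, not the service’s real code: Job, fail_stale_jobs and count_checkpoints are hypothetical names, jobs are plain in-memory objects rather than database rows, and the failure callback is left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Optional

PROCESSING = 1  # status values follow the BackgroundJobs table
FAILED = 3


@dataclass
class Job:
    id: str
    status: int
    started_at: datetime
    error_message: Optional[str] = None


def fail_stale_jobs(
    jobs: Iterable[Job],
    now: datetime,
    timeout: timedelta,
    count_checkpoints: Callable[[str], int],
) -> List[Job]:
    """Mark jobs stuck in Processing beyond `timeout` as Failed."""
    failed = []
    for job in jobs:
        if job.status == PROCESSING and now - job.started_at > timeout:
            done = count_checkpoints(job.id)
            job.status = FAILED
            job.error_message = f"Timed out in Processing; {done} unit checkpoints found"
            failed.append(job)  # caller sends the failure callback for these
    return failed
```

Returning the failed jobs keeps the callback to the domain service separate from the state change, which makes the pass easy to test on its own.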

The upgrade path

The checkpoint pattern doesn’t change at any scale. It’s purely a resilience strategy for long-running jobs. At enterprise scale you might use a distributed cache (Redis) for faster checkpoint lookups instead of blob storage queries, but the concept is identical.

AI-generated rough cost

Solo: ~£1–3/month - blob storage at standard tier is ~£0.02/GB. A few thousand checkpoint files barely registers.

Enterprise: Add Redis Cache (~£12–45/month for Basic C1 to Standard C1 on Azure) if checkpoint lookups become a bottleneck. Blob storage cost stays the same.

Pattern 4: Multi-Provider Strategy Pattern

Enterprise version

An abstraction layer over multiple vendors with runtime routing, circuit breakers, and fallback chains. Circuit breaking handled by Polly (.NET), Resilience4j (JVM), or AWS App Mesh / Azure API Management policies at the infrastructure level.

Solo Dev Version

A provider factory with a clean interface:

public interface IProcessingProvider
{
    Task<Stream> ProcessAsync(ProcessingRequest request, CancellationToken ct);
    Task<HealthStatus> GetHealthAsync();
}

Multiple implementations behind that interface, plus a mock provider. The factory resolves the right one based on configuration:

return _settings.Provider switch {
    "ProviderA" => _serviceProvider.GetRequiredService<ProviderAImplementation>(),
    "ProviderB" => _serviceProvider.GetRequiredService<ProviderBImplementation>(),
    "Mock"      => _serviceProvider.GetRequiredService<MockProvider>(),
    _           => throw new InvalidOperationException($"Unknown provider: {_settings.Provider}")
};

Switching providers is a config change. The job pipeline doesn’t know which provider it’s using. The mock provider lets the whole pipeline run in tests without any real external service.

The upgrade path

Add a circuit breaker (Polly) around each provider call, and a fallback chain - if the primary provider is unavailable, fall back to secondary. At enterprise scale you might route different workloads to different providers based on latency or cost. The interface stays the same.
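In .NET you would reach for Polly here. To show the shape of the idea without tying it to Polly’s API, this is a hand-rolled Python sketch of a circuit breaker plus a fallback chain. CircuitBreaker, process_with_fallback and the threshold and reset values are all illustrative assumptions.

```python
import time
from typing import Callable, List, Optional, Tuple


class CircuitBreaker:
    """Opens after N consecutive failures; allows a trial call after a pause."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: let one attempt through once the pause has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


Provider = Tuple[str, Callable[[str], str], CircuitBreaker]


def process_with_fallback(providers: List[Provider], request: str) -> Tuple[str, str]:
    """Try providers in order, skipping any whose circuit is open."""
    errors = []
    for name, call, breaker in providers:
        if not breaker.allow():
            errors.append(f"{name}: circuit open")
            continue
        try:
            result = call(request)
        except Exception as exc:  # any provider error counts as a failure
            breaker.record_failure()
            errors.append(f"{name}: {exc}")
        else:
            breaker.record_success()
            return name, result
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

The pipeline still calls one function and gets one result; which provider answered is a detail it never has to handle.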

AI-generated rough cost

Solo: £0 - this is pure application code. Polly is a free NuGet package. The providers themselves cost money; the pattern doesn’t.

Enterprise: £0 - same. The abstraction layer adds no infrastructure cost at any scale.

Pattern 5: Input Validation Without a Third-Party Service

Enterprise version

A dedicated managed validation service. On Azure: Azure AI Content Safety. On AWS: Amazon Rekognition (image/video), Amazon Comprehend (text). Managed, scalable, compliance-certified.

Solo Dev Version

A Python FastAPI microservice with no dependency on external ML services. Multiple analysis layers run locally and are combined into a weighted validation score. No per-call API costs. No data leaving the platform. No vendor dependency for a core safety feature.
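As a rough illustration of the weighted score, here is a minimal Python sketch. The layer names, weights and the 0-to-1 score range are placeholders I’ve assumed for the example; the real layers aren’t described here.

```python
from typing import Mapping


def weighted_score(layer_scores: Mapping[str, float],
                   weights: Mapping[str, float]) -> float:
    """Combine per-layer scores (assumed 0.0-1.0) into one weighted average."""
    total_weight = sum(weights[name] for name in layer_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(score * weights[name] for name, score in layer_scores.items())
    return weighted_sum / total_weight


# Hypothetical layers: one flags the input strongly, one not at all.
score = weighted_score(
    {"keyword_rules": 1.0, "structure_check": 0.0},
    {"keyword_rules": 3.0, "structure_check": 1.0},
)
print(score)  # 0.75
```

Normalising by the total weight keeps the result in the same range as the inputs, so a single threshold can decide pass or fail however many layers you add.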

The upgrade path

At enterprise scale you’d want the compliance certifications that come with managed services - especially in a regulated market. Managed validation services plug in as provider implementations behind the same interface. Any custom scoring logic that no managed service offers stays in-house regardless.

AI-generated rough cost

Solo: ~£5–15/month - a small always-on container running local ML models (Consumption plan, ~0.5 vCPU). No per-call API fees, no data egress.

Enterprise: Managed validation APIs typically charge per transaction - roughly £1–1.50/1,000 requests depending on the modality. At moderate volume (50k requests/month) that’s ~£50–75/month.

Pattern 6: Real-Time Progress Without Long-Polling

Enterprise version

A dedicated notification service with a message bus fan-out to push updates to connected clients. On Azure: Azure SignalR Service + Azure Service Bus for fan-out. On AWS: AWS API Gateway WebSockets + Amazon SNS for fan-out. WebSocket infrastructure managed separately from the application.

Solo Dev Version

SignalR baked directly into the processing service. As the job handler updates progress - after each unit completes, at each major milestone - it calls:

await _notificationService.NotifyJobProgressAsync(userId, jobId, percent, message);

That pushes directly to any connected WebSocket clients subscribed to that user’s job.

One non-obvious detail: the browser WebSocket API can’t set custom HTTP headers, so the standard Authorization: Bearer header isn’t available. The JWT is passed as a query string parameter instead and extracted in the JWT bearer authentication configuration:

options.Events = new JwtBearerEvents {
    OnMessageReceived = context => {
        var token = context.Request.Query["access_token"];
        if (!string.IsNullOrEmpty(token) && context.HttpContext.Request.Path.StartsWithSegments("/hubs"))
            context.Token = token;
        return Task.CompletedTask;
    }
};

The upgrade path

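At scale, move connection handling to a managed service. With Azure SignalR Service the hub code doesn’t change - just the backing transport. AWS API Gateway WebSockets is not a SignalR backend, so that route means replacing the hub layer while keeping the notification service interface.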

AI-generated rough cost

Solo: £0 - SignalR runs inside your existing service container. No additional infrastructure.

Enterprise: Azure SignalR Service Standard tier ~£40–50/month for 1 unit (1,000 concurrent connections). AWS API Gateway WebSockets ~£1/million messages + £0.25/million connection-minutes - near-zero at low volume, but adds up with always-on connections.

Pattern 7: Service-to-Service Auth Without Passing Tokens Manually

Enterprise version

A service mesh handling mTLS between services automatically. On Azure: Istio on AKS (now GA as an AKS add-on) or Azure API Management with managed identities. On AWS: AWS App Mesh or Amazon ECS Service Connect. Zero application-layer auth between internal services.

Solo Dev Version

A DelegatingHandler that automatically forwards the incoming request’s Bearer token to any outbound HTTP calls:

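protected override async Task<HttpResponseMessage> SendAsync(
    HttpRequestMessage request, CancellationToken ct)
{
    // Read the caller's Authorization header; HttpContext is null outside a request.
    var header = _httpContextAccessor.HttpContext?
        .Request.Headers["Authorization"].ToString();

    // Forward only well-formed Bearer credentials, never other schemes such as Basic.
    if (AuthenticationHeaderValue.TryParse(header, out var auth)
        && string.Equals(auth.Scheme, "Bearer", StringComparison.OrdinalIgnoreCase)
        && !string.IsNullOrEmpty(auth.Parameter))
    {
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", auth.Parameter);
    }

    return await base.SendAsync(request, ct);
}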

Registered on typed HTTP clients at startup. When one service calls another on behalf of a user request, the user’s JWT flows through automatically. No manual token extraction, no service-specific credentials.

The upgrade path

At enterprise scale, move to mutual TLS between services and a service account model - each service has its own identity rather than forwarding user tokens. But for internal services behind a gateway, token forwarding is simple and usually sufficient, as long as each service still validates the token it receives.

AI-generated rough cost

Solo: £0 - a DelegatingHandler is a few lines of code registered at startup.

Enterprise: £0 for the pattern itself, but a service mesh adds real cost. Istio on AKS or App Mesh on EKS adds ops complexity and some CPU overhead; budget ~£20–50/month in additional node capacity to absorb the sidecar proxies.

Pattern 8: Programmatic Container Scaling

Enterprise version

A Kubernetes operator that modifies Deployment replicas via the Kubernetes API, or a Horizontal Pod Autoscaler driven by custom metrics. On Azure: KEDA on AKS with Azure Monitor custom metrics. On AWS: ECS UpdateService API or KEDA on EKS with Amazon CloudWatch metrics.

Solo Dev Version

One service can trigger scaling of another directly via the Azure Management API, authenticated with its managed identity:

PATCH /subscriptions/{id}/resourceGroups/{rg}/providers/Microsoft.App/containerApps/{app}

This means one service can pre-warm another the moment it knows work is coming, rather than waiting for KEDA to react to queue depth. The managed identity means no credentials in config - the container app’s Azure identity is the auth mechanism.
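For a sense of what that call involves, here is a Python sketch that builds (but does not send) the PATCH request. Several things in it are assumptions on my part rather than details from the pipeline: that pre-warming is done by raising minReplicas under properties.template.scale, that the caller already holds a bearer token from its managed identity, and the names build_prewarm_request and api_version. The API version is left as a required argument rather than guessed.

```python
import json
import urllib.request

ARM_BASE = "https://management.azure.com"


def build_prewarm_request(subscription_id: str, resource_group: str, app_name: str,
                          api_version: str, bearer_token: str,
                          min_replicas: int = 1) -> urllib.request.Request:
    """Build a PATCH that asks for at least `min_replicas` running replicas."""
    url = (
        f"{ARM_BASE}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.App/containerApps/{app_name}"
        f"?api-version={api_version}"
    )
    # Assumed body shape: only the scale floor is changed.
    body = {"properties": {"template": {"scale": {"minReplicas": min_replicas}}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to urllib.request.urlopen would send it. If you pre-warm this way, something also has to set the floor back to zero afterwards, or the service stops scaling down.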

The upgrade path

On AWS this is the ECS UpdateService API, callable with an IAM role attached to the task. The pattern is identical.

AI-generated rough cost

Solo: £0 - Azure Management API calls are free. Managed identity is included with Azure Container Apps.

Enterprise: £0 for the API calls themselves. If you move to a Kubernetes HPA with custom metrics you’ll need a Prometheus stack, which adds ~£10–20/month in storage and compute.

Pattern 9: Structured Error Handling Across All Services

Enterprise version

A centralised exception management platform with RFC 7807 Problem Details compliance (RFC 9457 has since superseded RFC 7807). Correlation IDs tied to distributed tracing. On Azure: Azure Application Insights + Azure Monitor. On AWS: AWS X-Ray + Amazon CloudWatch. Cross-cloud or self-hosted: Datadog, Jaeger, or Zipkin.

Solo Dev Version

A shared exception hierarchy and a single ErrorHandlerMiddleware registered in every service:

{
  "traceId": "0HN4K2VG8T3CP:00000001",
  "success": false,
  "status": 404,
  "errors": [
    { "code": "ResourceNotFound", "message": "The requested resource does not exist" }
  ]
}

Every exception type maps to an HTTP status. Every response includes a traceId from the request context. Unexpected exceptions are logged with full context; expected exceptions (validation failures, not found) are returned cleanly without log noise. The client always gets a consistent shape regardless of which service produced the error.
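The mapping from exception type to response shape can be sketched briefly. This is a Python illustration of the idea rather than the .NET middleware itself; AppError, ValidationFailed and to_error_response are hypothetical names, and the generic message for unexpected errors is my own placeholder.

```python
class AppError(Exception):
    """Base for expected errors; subclasses carry their HTTP status and code."""
    status = 500
    code = "InternalError"


class ResourceNotFound(AppError):
    status = 404
    code = "ResourceNotFound"


class ValidationFailed(AppError):
    status = 400
    code = "ValidationFailed"


def to_error_response(exc: Exception, trace_id: str) -> dict:
    """Build the consistent error body for any exception."""
    if isinstance(exc, AppError):
        status, code, message = exc.status, exc.code, str(exc)
    else:
        # Unexpected: log full detail server-side, return a generic message.
        status, code, message = 500, "InternalError", "An unexpected error occurred"
    return {
        "traceId": trace_id,
        "success": False,
        "status": status,
        "errors": [{"code": code, "message": message}],
    }
```

Calling to_error_response(ResourceNotFound("The requested resource does not exist"), trace_id) produces the same shape as the JSON above, and anything that isn’t an AppError collapses to a 500 without leaking internals.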

The upgrade path

Add OpenTelemetry and wire the traceId into a distributed tracing backend - Jaeger, Zipkin, or AWS X-Ray. The traceId is already there; you’re just making it span service boundaries.

AI-generated rough cost

Solo: £0 - Application Insights free tier covers 5GB/month of ingestion, enough for most solo projects. The traceId pattern itself is free.

Enterprise: Azure Monitor / Application Insights ~£2.30/GB over the free tier. At 10GB+/month (common for multi-service prod traffic) that’s ~£15–20/month. AWS X-Ray is ~£4.50/million traces recorded - negligible unless you’re tracing every request at high volume.

The Honest Tradeoffs

What the solo approach doesn’t give you:

Message ordering guarantees - the database queue processes oldest-first but doesn’t give you strict FIFO with the reliability guarantees a broker does
Fan-out - one message triggering multiple consumers simultaneously requires either multiple pollers or a proper pub/sub system
Automatic dead lettering - the stale job cleanup service handles this, but it’s bespoke rather than built-in
Backpressure - a message broker can apply backpressure when consumers are overwhelmed; a database table just fills up

What the solo approach gives you that enterprise often loses:

Debuggability - failed jobs are rows in a database. SELECT * FROM "BackgroundJobs" WHERE "Status" = 3 tells you everything. Dead letter queues require tooling to inspect.
Simplicity - one less service to deploy, monitor, and pay for
SQL joins - you can join job records directly to user records, unit records, and result records. Message brokers don’t do joins.
The upgrade path is real - every pattern here maps to an enterprise equivalent without rewriting the business logic

The Upgrade Checklist

When you’re ready to move to enterprise infrastructure, here’s a rough order of operations:

Add a message broker - write to SQS/Service Bus instead of the database table. Keep the database record for history. Change KEDA scaler from Postgres to queue depth.
Add Redis - replace in-memory caches with Redis for multi-instance support. Add token blocklist for auth revocation.
Add distributed tracing - wire OpenTelemetry into the existing traceId pattern. Add to the gateway so traces span service boundaries.
Externalise SignalR - Azure SignalR Service (hub code unchanged) or AWS API Gateway WebSockets (a new connection layer behind the same notification service).
Add circuit breakers - wrap provider calls with Polly. The provider factory interface is already in place.

None of these steps require rewriting business logic. The patterns were right the first time - the infrastructure underneath them just changes.

Conclusion

The question isn’t “should I build it the enterprise way or the simple way?” The question is “are these the right patterns, implemented simply?”

The database queue, the checkpoint resume, the provider factory, the structured errors, the auth forwarding - these aren’t compromises. They’re the same patterns enterprise teams use, built with less infrastructure overhead because the scale doesn’t justify it yet.

Build the patterns right. Choose the infrastructure for the scale you’re at. Know exactly what to swap when you outgrow it.

The prototypes you build this way can scale with you accordingly, and hopefully this will make your life easier!

Published May 26, 2026

Software engineer and technical founder in London, focused on building practical products in AI and hardware. lookitskris on Twitter