ADR 003: LLM Provider Abstraction

Status

Accepted

Context

Folionaut uses LLM providers for:

  • Chat responses (primary user interaction)
  • Content summarization
  • Potential future features (embeddings, classification)

Requirements:

  • Support multiple LLM providers (Claude, OpenAI, Ollama)
  • Easy provider switching without code changes
  • Consistent error handling across providers
  • Token usage tracking for cost monitoring
  • PII detection and sanitization in responses

Decision

Implement an LLM Provider abstraction layer with a unified interface that all providers implement.

```typescript
interface LLMMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Representative options; individual fields may grow as providers require.
interface ChatOptions {
  model?: string
  maxTokens?: number
  temperature?: number
}

interface LLMResponse {
  content: string
  model: string
  usage: {
    inputTokens: number
    outputTokens: number
    totalTokens: number
  }
  finishReason: 'stop' | 'length' | 'error'
  latencyMs: number
}

// Unified error type; providers mark transient failures as retryable.
class LLMError extends Error {
  constructor(message: string, readonly retryable = false) {
    super(message)
    this.name = 'LLMError'
  }
}

interface LLMProvider {
  readonly name: string
  readonly supportedModels: string[]

  chat(messages: LLMMessage[], options: ChatOptions): Promise<LLMResponse>
  stream(messages: LLMMessage[], options: ChatOptions): AsyncIterable<string>
  isAvailable(): Promise<boolean>
}

// Provider chain with ordered fallback: try each provider in turn,
// skipping unavailable ones and moving on after retryable failures.
class LLMProviderChain {
  constructor(private providers: LLMProvider[]) {}

  async chat(messages: LLMMessage[], options: ChatOptions): Promise<LLMResponse> {
    for (const provider of this.providers) {
      if (!(await provider.isAvailable())) continue
      try {
        return await provider.chat(messages, options)
      } catch (error) {
        if (this.isRetryable(error)) continue
        throw error
      }
    }
    throw new LLMError('All providers unavailable')
  }

  private isRetryable(error: unknown): boolean {
    return error instanceof LLMError && error.retryable
  }
}
```
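
The fallback behavior can be demonstrated with stub providers. Everything below is illustrative: the interfaces are trimmed to the fields the example needs, and the two providers are stand-ins, not real SDK wrappers:

```typescript
// Minimal stand-ins for the ADR's interfaces (illustrative only).
interface LLMMessage { role: 'system' | 'user' | 'assistant'; content: string }
interface LLMResponse { content: string; model: string }

interface LLMProvider {
  readonly name: string
  chat(messages: LLMMessage[]): Promise<LLMResponse>
  isAvailable(): Promise<boolean>
}

class LLMProviderChain {
  constructor(private providers: LLMProvider[]) {}

  async chat(messages: LLMMessage[]): Promise<LLMResponse> {
    for (const provider of this.providers) {
      if (!(await provider.isAvailable())) continue
      try {
        return await provider.chat(messages)
      } catch {
        continue // this sketch treats every failure as retryable
      }
    }
    throw new Error('All providers unavailable')
  }
}

// Hypothetical providers: the primary reports itself down,
// so the chain falls back to the second entry.
const primary: LLMProvider = {
  name: 'claude',
  isAvailable: async () => false,
  chat: async () => ({ content: 'from claude', model: 'claude' }),
}
const fallback: LLMProvider = {
  name: 'ollama',
  isAvailable: async () => true,
  chat: async () => ({ content: 'from ollama', model: 'llama3' }),
}

const chain = new LLMProviderChain([primary, fallback])
```

A call like `chain.chat([{ role: 'user', content: 'hi' }])` resolves with the fallback's response, since the first provider never becomes eligible.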

Alternatives Considered

| Option | Pros | Cons |
| --- | --- | --- |
| Direct SDK usage | Simple, full API access | Vendor lock-in, inconsistent interfaces |
| LangChain | Rich ecosystem, many integrations | Heavy dependency, abstraction overhead |
| Vercel AI SDK | Nice streaming, React integration | Vercel-focused, less control |
| Custom abstraction | Tailored to needs, lightweight | Development effort, maintenance |
| LiteLLM | Many providers, drop-in | Python-focused, less TS support |

Consequences

Positive

  • Provider flexibility: Switch providers via configuration
  • Fallback resilience: Automatic failover if primary provider fails
  • Consistent metrics: Unified token tracking and latency measurement
  • PII protection: Output guardrails sanitize responses before returning to users
  • Testing: Easy to mock LLM calls in tests
  • Cost control: Token tracking enables budget alerts
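
The testing benefit above can be sketched with a canned mock provider. `MockProvider` and its fields are hypothetical, trimmed to what the sketch needs:

```typescript
// A mock provider for unit tests: returns a canned response and
// records the message lists it was called with.
interface LLMMessage { role: 'system' | 'user' | 'assistant'; content: string }
interface LLMResponse { content: string; model: string }

class MockProvider {
  readonly name = 'mock'
  calls: LLMMessage[][] = []

  constructor(private canned: string) {}

  async isAvailable(): Promise<boolean> {
    return true
  }

  async chat(messages: LLMMessage[]): Promise<LLMResponse> {
    this.calls.push(messages)
    return { content: this.canned, model: 'mock-1' }
  }
}

const mock = new MockProvider('Hello from the mock')
```

Tests can then assert on both the returned content and the recorded calls, with no network access.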

Negative

  • Lowest common denominator: Can't use provider-specific features easily
  • Maintenance burden: Must update abstraction for new provider features
  • Abstraction leak: Some provider behaviors hard to hide

Mitigations

  • Provider-specific options: Allow pass-through of provider-specific config
  • Feature detection: Check provider capabilities before using advanced features
  • Minimal abstraction: Only abstract what's actually needed
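
The pass-through mitigation can be sketched as an escape hatch on the chat options, keyed by provider name so each provider picks out only its own entry. The `providerOptions` field and the provider keys are illustrative, not part of a decided interface:

```typescript
// Sketch: provider-specific settings ride along as an opaque bag,
// keyed by provider name; unrelated providers see an empty object.
interface ChatOptions {
  model?: string
  maxTokens?: number
  providerOptions?: Record<string, Record<string, unknown>>
}

function optionsFor(providerName: string, options: ChatOptions): Record<string, unknown> {
  return options.providerOptions?.[providerName] ?? {}
}

const opts: ChatOptions = {
  model: 'claude-sonnet',
  providerOptions: {
    anthropic: { topK: 40 },
    ollama: { numCtx: 8192 },
  },
}
```

This keeps the shared interface small while still letting callers reach vendor-specific knobs when they must.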
