Implementing an OpenAI-compatible API gateway with LiteLLM to load-balance requests across Ollama, LM Studio, and vLLM backends
Why I Built This Gateway

I run multiple local LLM backends in my homelab: Ollama for quick inference, LM Studio for testing different models, and vLLM when I...
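To make the setup concrete, here is a minimal sketch of a LiteLLM proxy config that fronts the three backends behind one model alias. The ports and model names are assumptions (Ollama's default `11434`, LM Studio's default `1234`, and a vLLM server on `8000`), not the author's actual deployment; LiteLLM routes requests across entries that share the same `model_name`.

```yaml
# litellm_config.yaml -- hypothetical sketch, adjust hosts/models to your setup
model_list:
  # Ollama backend (default port 11434)
  - model_name: local-llama        # one alias, three deployments
    litellm_params:
      model: ollama/llama3
      api_base: http://localhost:11434

  # LM Studio exposes an OpenAI-compatible server (default port 1234)
  - model_name: local-llama
    litellm_params:
      model: openai/llama-3-8b-instruct
      api_base: http://localhost:1234/v1
      api_key: "lm-studio"         # LM Studio ignores the key, but one is required

  # vLLM's OpenAI-compatible server (commonly port 8000)
  - model_name: local-llama
    litellm_params:
      model: openai/meta-llama/Meta-Llama-3-8B-Instruct
      api_base: http://localhost:8000/v1
      api_key: "none"

router_settings:
  routing_strategy: simple-shuffle  # random spread across healthy deployments
```

Started with `litellm --config litellm_config.yaml`, the proxy then accepts standard OpenAI-style requests against `model: local-llama` and fans them out across whichever backends are up.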