Self-Hosting

Vane (Perplexica 2.0) Quickstart With Ollama and llama.cpp

Self-hosted AI search with local LLMs

Vane is one of the more pragmatic entries in the “AI search with citations” space: a self-hosted answering engine that mixes live web retrieval with local or cloud LLMs, while keeping the whole stack under your control.

llama-swap Model Switcher Quickstart for OpenAI-Compatible Local LLMs

Hot-swap local LLMs without changing clients.

Before long you are juggling vLLM, llama.cpp, and more, each stack on its own port. Everything downstream still expects a single /v1 base URL, so you end up shuffling ports, profiles, and one-off scripts. llama-swap sits in front of those stacks as that single /v1 proxy.
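The idea is that each backend is declared once and the proxy starts it on demand when a request names its model. As a rough sketch (model names, paths, and the exact schema here are illustrative, not copied from the llama-swap docs; check its README for the real field names), a config might look like:

```yaml
# Illustrative llama-swap-style config -- names and paths are placeholders.
# Each entry maps an OpenAI "model" field value to the command that serves it;
# the proxy launches the matching backend and routes /v1 traffic to it.
models:
  "qwen2.5-7b":
    cmd: llama-server --port ${PORT} -m /models/qwen2.5-7b.gguf
  "llama3-8b-vllm":
    cmd: vllm serve /models/llama3-8b --port ${PORT}
```

Clients then keep one base URL (e.g. `http://localhost:8080/v1`) and switch backends simply by changing the `model` string in an otherwise ordinary OpenAI-compatible request.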