DevOps

Llama-Server Router Mode - Dynamic Model Switching Without Restarts

For a long time, llama.cpp had a glaring limitation: you could only serve one model per process, and switching meant a restart.

OpenClaw Plugins — Ecosystem Guide and Practical Picks

This article is about OpenClaw plugins — native gateway packages that add channels, model providers, tools, speech, memory, media, web search, and other runtime surfaces.

Hermes AI Assistant - Install, Setup, Workflow, and Troubleshooting

Hermes Agent is a self-hosted, model-agnostic AI assistant that runs on a local machine or low-cost VPS, works through terminal and messaging interfaces, and improves over time by turning repeated tasks into reusable skills.

Remote Ollama access via Tailscale or WireGuard, no public ports

Ollama is at its happiest when it is treated like a local daemon: the CLI and your apps talk to a loopback HTTP API, and the rest of the network never finds out it exists.

Ollama in Docker Compose with GPU and Persistent Model Storage

Ollama works great on bare metal. It gets even more interesting when you treat it like a service: a stable endpoint, pinned versions, persistent storage, and a GPU that is either available or not.

Ollama behind a reverse proxy with Caddy or Nginx for HTTPS streaming

Running Ollama behind a reverse proxy is the simplest way to get HTTPS, optional access control, and predictable streaming behaviour.

Apache Flink on K8s and Kafka: PyFlink, Go, ops, and managed pricing

Apache Flink is a framework for stateful computations over unbounded and bounded data streams.

Neo4j graph database for GraphRAG, install, Cypher, vectors, ops

Neo4j is what you reach for when the relationships are the data. If your domain looks like a whiteboard of circles and arrows, forcing it into tables is painful.

IndexNow explained - notify search engines when you publish

Static sites and blogs change whenever you deploy. Search engines that support IndexNow can learn about those changes without waiting for the next blind crawl.

SGLang QuickStart: Install, Configure, and Serve LLMs via OpenAI API

SGLang is a high-performance serving framework for large language models and multimodal models, built to deliver low-latency and high-throughput inference across everything from a single GPU to distributed clusters.

Apache Kafka Quickstart - Install Kafka 4.2 with CLI and Local Examples

Apache Kafka 4.2.0 is the current supported release line, and it’s the best baseline for a modern Quickstart because Kafka 4.x is fully ZooKeeper-free and built around KRaft by default.

llama.swap Model Switcher Quickstart for OpenAI-Compatible Local LLMs

Soon you are juggling vLLM, llama.cpp, and more, each stack on its own port. Everything downstream still wants one /v1 base URL; otherwise you keep shuffling ports, profiles, and one-off scripts. llama-swap is the single /v1 proxy that sits in front of those stacks.

Developer Tools: The Complete Guide to Modern Development Workflows

Developing software involves Git for version control, Docker for containerization, bash for automation, PostgreSQL for databases, and VS Code for editing — along with countless other tools that make or break your productivity. This page collects the essential cheatsheets, workflows, and comparisons you need to work efficiently across the full development stack.

LocalAI QuickStart: Run OpenAI-Compatible LLMs Locally

LocalAI is a self-hosted, local-first inference server designed to behave like a drop-in OpenAI API for running AI workloads on your own hardware (laptop, workstation, or on-prem server).

llama.cpp Quickstart with CLI and Server

I keep coming back to llama.cpp for local inference: it gives you control that Ollama and others abstract away, and it just works. It is easy to run GGUF models interactively with llama-cli or to expose an OpenAI-compatible HTTP API with llama-server.

AI Developer Tools: The Complete Guide to AI-Powered Development

Artificial Intelligence is reshaping how software is written, reviewed, deployed, and maintained. From AI coding assistants to GitOps automation and DevOps workflows, developers now rely on AI-powered tools across the entire software lifecycle.