Hermes AI Assistant Skills for Real Production Setups
Profile-first Hermes setups for serious workloads
Hermes AI assistant, officially documented as Hermes Agent, is not positioned as a simple chat wrapper.
The skills worth keeping, and the ones to skip
OpenClaw has two extension stories, and they are easy to mix up.
Plugins extend the runtime. Skills extend the agent’s behavior.
Plugins first. Skills naming in brief.
This article is about OpenClaw plugins — native gateway packages that add channels, model providers, tools, speech, memory, media, web search, and other runtime surfaces.
How real OpenClaw systems are actually structured
OpenClaw looks simple in demos. In production, it becomes a system.
One database or a real search stack
The real argument is not whether PostgreSQL can search text or whether Elasticsearch can store documents. Both can. The interesting question is where search complexity should live.
Alerting is a response system, not a noise system
Alerting gets described as a monitoring feature far too often. That framing is convenient, but it hides the real problem.
Slack is a workflow UI and alert delivery layer.
Slack integrations look deceptively easy because you can post a message in one HTTP call. The interesting part starts when you want Slack to be interactive and reliable.
Turn Discord into a safe, interactive alert bus.
Discord becomes a serious integration surface when you treat it like one: a place where systems publish events, humans make decisions, and automation continues the workflow.
Chat platforms as control planes for systems
Chat platforms have evolved far beyond messaging tools. In modern systems they operate as interfaces between automated processes and human decision-making.
Patterns for integrations, code structure, and data access.
Most app architecture advice is either too abstract to apply or too narrow to scale. Here are practical trade-offs for production systems across integration, code structure, and data access.
Claude subscriptions no longer power agents
The quiet loophole that powered a wave of agent experimentation is now closed.
Self-hosted AI search with local LLMs
Vane is one of the more pragmatic entries in the “AI search with citations” space: a self-hosted answering engine that mixes live web retrieval with local or cloud LLMs, while keeping the whole stack under your control.
Agentic coding, now with local model backends.
Claude Code is not autocomplete with better marketing. It is an agentic coding tool: it reads your codebase, edits files, runs commands, and integrates with your development tools.
Hermes Agent install and quickstart for devs
Hermes Agent is a self-hosted, model-agnostic AI assistant that runs on a local machine or low-cost VPS, works through terminal and messaging interfaces, and improves over time by turning repeated tasks into reusable skills.
Install TGI, ship fast, debug faster
Text Generation Inference (TGI) has a very specific energy. It is not the newest kid on the inference street, but it is the one that already learned how production breaks.
llama.cpp token speed on 16 GB VRAM (tables).
Here I compare the speed of several LLMs running on a GPU with 16 GB of VRAM and choose the best one for self-hosting.