AI Customer Support Stack
Ten picks for a SaaS team building AI-assisted tier-1 support: omnichannel inbox, ticketing, shared mailbox, CX operations layer, chatbot deflection, conversational AI framework, RAG knowledge base, AI reply drafts, and human-in-the-loop escalation. Replaces the per-seat Intercom/Zendesk bill with open, auditable infrastructure.
What's in this pack
This is the stack a SaaS team would stand up the week they decide tier-1 support has to scale without doubling headcount, not a feature checklist from a vendor blog. Every pick here is open-source, self-hostable, and earns its slot in a real ticket-to-resolution pipeline. The cost story matters: a 10-seat Intercom plan costs several times the monthly burn of a Hetzner box running this entire stack.
The order matters because each layer expects the one below it to exist. Stop after step 4 and you have a working human support desk; stop after step 9 and you have AI-assisted support; step 10 (HumanLayer) is what keeps you out of trouble when the AI gets confident about a refund it shouldn't issue.
Install in this order
- Chatwoot — start with the omnichannel inbox. Email, live chat widget, WhatsApp, Instagram DMs, and Twitter all land in one queue. This is the Intercom/Zendesk replacement and the surface your humans actually live in. Self-host on Docker, point your support email's MX records at it, ship the widget JS, and you have a working inbox by lunch.
- Zammad — ticketing alternative for teams whose primary channel is email + phone + SLA reporting, not chat. Pick one of Chatwoot or Zammad as your system of record; running both is a data-fragmentation tax you will pay for later. Zammad is the right call when finance and ops want classic ticket reports.
- FreeScout — the third inbox option, for the smallest teams. If you're 1-3 people on a shared mailbox and Chatwoot feels like overkill, FreeScout is Laravel, lightweight, and lives on a $5 VPS. Graduate to Chatwoot once you cross 200 conversations a week.
- Erxes — the CX operations layer. Once you have an inbox, you need a customer profile that survives across channels, a segment view for the success team, and a place to wire campaigns. Erxes plugs in next to Chatwoot/Zammad and replaces the HubSpot + Zendesk combo with a single self-hosted spine.
- Botpress — the visual chatbot for tier-1 deflection. The 60% of tickets that are "what's my password reset URL" should never reach a human. Botpress sits in front of Chatwoot's live chat widget, answers FAQ traffic from a flow you author once, and only escalates the rest into the human queue.
- Rasa — the conversational AI framework, for the day Botpress's visual flows hit a ceiling. Rasa gives you proper intent classification, multi-turn dialog management, and Python custom actions. Use it when you have enough conversation data (3-6 months of tickets) to actually train intents, not before.
- haiku.rag — RAG CLI + MCP server. This is the layer that turns your help-center markdown into citable answers.
  haiku-rag add-src ./docs && haiku-rag ask --cite "how do I rotate API keys"
  returns the doc passage with the source URL. Wire it into Botpress, Rasa, and the AI agent below so every answer is grounded in your docs, not the LLM's training set.
- AnythingLLM — the knowledge-base front end. Where haiku.rag is a programmatic primitive, AnythingLLM is the GUI your support manager uses to upload PDFs, sync Notion exports, tag policies, and watch what the bot is being asked. Single source of truth for the corpus that feeds steps 5-7.
- Claude Code Agent: Customer Support — drop this into Claude Code and an LLM drafts ticket responses, FAQ entries, and troubleshooting guides off the open ticket + your RAG corpus. The output goes into Chatwoot/Zammad as a draft, never an auto-send, until your reply-quality dashboard has earned it.
- HumanLayer — the escalation router. The moment an agent is about to issue a refund, cancel an account, or post to a customer-facing channel, HumanLayer wraps that call in an approval loop: a real human gets a Slack ping with the proposed action and either greenlights it or replies inline. Non-negotiable for production AI support.
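The approval loop described in the last step can be sketched in plain Python. This is not the HumanLayer SDK: `require_approval`, `Decision`, `console_approver`, and `issue_refund` are hypothetical names used only for illustration, and a real deployment would replace the console prompt with whatever channel your approvers actually use (for example a Slack message).

```python
from dataclasses import dataclass
from functools import wraps
from typing import Callable


@dataclass
class Decision:
    approved: bool
    comment: str = ""


def require_approval(ask_human: Callable[[str, dict], Decision]):
    """Wrap a high-stakes action so it only runs after a human approves it."""
    def decorator(action):
        @wraps(action)
        def gated(**kwargs):
            decision = ask_human(action.__name__, kwargs)
            if not decision.approved:
                # Hand the reviewer's comment back to the agent instead of acting.
                return {"status": "denied", "comment": decision.comment}
            return {"status": "done", "result": action(**kwargs)}
        return gated
    return decorator


def console_approver(action_name: str, params: dict) -> Decision:
    """Stand-in approver for local testing; swap for a Slack or email prompt."""
    answer = input(f"Approve {action_name}({params})? [y/N] ")
    return Decision(approved=answer.strip().lower() == "y")


@require_approval(console_approver)
def issue_refund(customer_id: str, amount_usd: float) -> str:
    # Hypothetical billing call; replace with your payment provider's client.
    return f"refunded {amount_usd:.2f} USD to {customer_id}"
```

Calling `issue_refund(customer_id="c_123", amount_usd=49.0)` blocks until the approver answers, and a denial returns the reviewer's comment rather than raising, so the agent can adjust its proposal.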
How they fit together
Customer
│
▼
Chatwoot (or Zammad / FreeScout) ◄── Erxes (CX profile + segments)
│
├─ Botpress (FAQ deflection)
│ └─ Rasa (intent + dialog when flows hit a ceiling)
│
▼
Claude Code Agent: Customer Support
(drafts ticket reply)
│
├─ grounded by ─► haiku.rag ◄── AnythingLLM (corpus admin UI)
│
▼
HumanLayer (approval for refunds / cancels / outbound)
│
▼
Human agent or auto-resolve
The critical join is haiku.rag + HumanLayer: RAG keeps the agent honest about what's true, HumanLayer keeps it honest about what's allowed. Without RAG, the bot hallucinates pricing tiers. Without HumanLayer, the bot one day refunds a $40K annual contract because the customer typed angry words.
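One way to picture that join is as a small routing function applied to every AI draft. The sketch below is an assumption about how a team might wire it, not part of any tool in this pack; `HIGH_STAKES`, `next_step`, and the returned labels are hypothetical names.

```python
from typing import List, Optional

# Hypothetical action names; extend to match your own agent's tools.
HIGH_STAKES = {"issue_refund", "cancel_account", "send_outbound_message"}


def next_step(citations: List[str], proposed_action: Optional[str] = None) -> str:
    """Decide where an AI draft goes next, given its citations and any action it proposes."""
    if not citations:
        # No retrieved passage backs the draft, so a human writes the reply.
        return "escalate_to_human"
    if proposed_action in HIGH_STAKES:
        # Grounded, but the action needs explicit human approval first.
        return "needs_approval"
    # Grounded and low-risk: store as a draft for an agent to review and send.
    return "save_as_draft"
```

The grounding check runs first on purpose: an uncited draft never reaches the approval queue, so approvers only review proposals that are at least backed by the docs.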
Tradeoffs you'll hit
- Chatwoot vs Zammad vs FreeScout — Chatwoot is chat-first with the best widget and modern UI; pick it if you sell B2C or PLG. Zammad is ticket-first with SLA reports finance teams expect; pick it for B2B and enterprise. FreeScout is the cheapest, easiest to operate; pick it only if you're under five agents.
- Botpress vs Rasa — Botpress is faster to ship and your CS manager can edit flows. Rasa requires Python and an ML engineer in the room. Default to Botpress until you have real conversation logs to train on; switch to (or add) Rasa once you do.
- AnythingLLM vs raw vector store — AnythingLLM trades performance ceiling for a usable admin UI. For under 10K documents it's the right call. Past that, swap the storage layer for pgvector + a thin custom UI.
- Auto-send AI drafts vs human-in-the-loop — every team thinks they want auto-send. Every team that flips it on without a quality dashboard rolls it back within a month. Keep drafts as drafts for the first 2,000 tickets; let your reply-edit-rate (target <15%) tell you when it's safe to relax.
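The reply-edit-rate in the last tradeoff can be computed from pairs of (AI draft, reply actually sent). The following is a minimal sketch using Python's standard `difflib`; the 2% `tolerance` for ignoring trivial changes and the function names are assumptions, not a metric defined by any tool in this pack.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def was_edited(draft: str, sent: str, tolerance: float = 0.02) -> bool:
    """Treat a reply as edited when it differs from the draft by more than `tolerance`."""
    # Collapse whitespace so reflowed text does not count as an edit.
    a, b = " ".join(draft.split()), " ".join(sent.split())
    return 1.0 - SequenceMatcher(None, a, b).ratio() > tolerance


def reply_edit_rate(pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (draft, sent) pairs where the human changed the draft."""
    if not pairs:
        raise ValueError("need at least one (draft, sent) pair")
    return sum(was_edited(d, s) for d, s in pairs) / len(pairs)
```

Tracking this per intent rather than globally makes it easier to see which categories are ready for more autonomy and which are not.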
Common pitfalls
- Running two inboxes in parallel — "we'll migrate gradually" almost always becomes "we have two systems forever." Pick one of Chatwoot/Zammad/FreeScout and commit. Migration day one is cheaper than living with split data for a year.
- Letting RAG see internal-only docs — your corpus will leak through citations. Tag every doc with an audience flag (public/internal/confidential) and have haiku.rag filter at retrieval. Audit the filter quarterly.
- Skipping HumanLayer for "low-stakes" actions — the day your bot mass-emails a tone-deaf incident message to 30K users is the day you wish refund approval and outbound messaging both went through the same review path. Wrap both from day one.
- No reply-quality dashboard — without edit-distance and CSAT-per-channel metrics, you have no signal for when to expand AI authority. Build the dashboard before you turn on AI drafts, not after.
- Treating the chatbot as the product — Botpress/Rasa exist to route to the right answer fast, not to entertain. If a user has typed two messages and the bot has neither answered nor escalated, you've failed.
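The audience-flag pitfall above can be guarded against with a filter applied to retrieved passages before they reach the model. This sketch does not use haiku.rag's own API, which is not shown in this pack; `Chunk`, `LEVELS`, and `filter_for_audience` are hypothetical names, and the filter is assumed to run on whatever your retriever returns.

```python
from dataclasses import dataclass
from typing import List

# Audience levels from least to most restricted.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}


@dataclass(frozen=True)
class Chunk:
    text: str
    source_url: str
    audience: str  # expected to be one of LEVELS


def filter_for_audience(chunks: List[Chunk], max_audience: str = "public") -> List[Chunk]:
    """Keep only chunks the requester may see; unknown or missing tags are dropped."""
    ceiling = LEVELS[max_audience]
    # Untagged or mistyped audiences rank above every ceiling, so they fail closed.
    return [c for c in chunks if LEVELS.get(c.audience, len(LEVELS)) <= ceiling]
```

Failing closed on untagged documents means a forgotten tag hides a passage from customers instead of leaking it, which also gives the quarterly audit something concrete to check.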
10 ready-to-install resources
Frequently asked questions
How long does it actually take to stand this stack up for a real SaaS team?
Realistic timeline for a 5-person SaaS:
- Week 1: Chatwoot live on a subdomain, support email migrated, widget deployed (2 days of focused work, the rest is DNS waiting).
- Week 2: haiku.rag indexed against your existing help center and AnythingLLM exposed to the support manager for corpus admin.
- Week 3: Botpress flow handling the top-10 deflection intents (password reset, billing portal link, status page).
- Week 4: Claude Code Customer Support agent drafting replies into Chatwoot with HumanLayer gating any action that touches billing or sends external email.
Total: one month with one full-time engineer. Erxes and Rasa land in months 2-3 once the basics are stable.
What does this cost compared to a real Intercom/Zendesk bill?
Pure infra: a single Hetzner CCX23 box (4 vCPU, 16 GB RAM) runs Chatwoot + Erxes + Botpress + haiku.rag + AnythingLLM comfortably for around 30 USD/month. Add an LLM bill (10-50 USD/month while you're ramping) and Anthropic credits for the Customer Support agent. Compare with Intercom's per-seat plan starting around 39 USD/seat/month plus per-resolution AI fees, or Zendesk Suite Professional starting around 115 USD/seat/month. At 10 agents that is roughly 390 USD/month on Intercom or 1,150 USD/month on Zendesk before AI-resolution surcharges, so against Zendesk you're saving four figures monthly.
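The comparison can be checked with a few lines of arithmetic. This sketch uses only the list prices quoted above and takes the upper end of the LLM range; it deliberately leaves out Anthropic credits, per-resolution AI fees, and engineering time, none of which are quantified here. `monthly_saving` is a hypothetical helper name.

```python
def monthly_saving(
    agents: int,
    seat_price_usd: float,
    infra_usd: float = 30.0,
    llm_usd: float = 50.0,
) -> float:
    """Per-seat SaaS bill minus self-hosted infra and LLM spend, in USD/month."""
    return agents * seat_price_usd - (infra_usd + llm_usd)


for vendor, seat_price in {"Intercom": 39.0, "Zendesk Suite Professional": 115.0}.items():
    saving = monthly_saving(10, seat_price)
    print(f"{vendor}: {saving:.0f} USD/month saved at 10 agents")
```

With these inputs the saving comes to 310 USD/month against Intercom and 1,070 USD/month against Zendesk Suite Professional.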
Can the AI agent actually close tickets without a human in the loop?
Technically yes; in practice we recommend no for the first three months. Configure Claude Code Customer Support to draft, not send. Watch your reply-edit-rate — the percentage of AI drafts a human modifies before sending. Below 15% sustained over 500 tickets, you can promote specific intent categories (password resets, status-page questions) to auto-send while keeping refunds, cancellations, and anything customer-facing routed through HumanLayer. Full auto-resolve is reserved for read-only deflection paths the bot already handles in Botpress.
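The promotion rule in this answer can be written down as a small check per intent category. The sketch below is one possible reading of it: the 15% and 500-ticket thresholds come from the text, while `IntentStats`, `ALWAYS_GATED`, and `can_auto_send` are hypothetical names, and the "sustained" requirement is not modelled (you would run the check over a rolling window).

```python
from dataclasses import dataclass

# Categories that stay behind human approval regardless of edit rate.
ALWAYS_GATED = {"refund", "cancellation"}


@dataclass
class IntentStats:
    intent: str
    drafts_reviewed: int
    drafts_edited: int


def can_auto_send(
    stats: IntentStats,
    max_edit_rate: float = 0.15,
    min_tickets: int = 500,
) -> bool:
    """Return True when an intent has earned auto-send under the stated thresholds."""
    if stats.intent in ALWAYS_GATED:
        return False
    if stats.drafts_reviewed < min_tickets:
        # Not enough reviewed drafts yet to trust the rate.
        return False
    return stats.drafts_edited / stats.drafts_reviewed < max_edit_rate
```

For example, `can_auto_send(IntentStats("password_reset", 600, 60))` is true (a 10% edit rate over 600 tickets), while the same numbers for `"refund"` are always false.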
Why both Chatwoot AND Zammad in the same pack?
Because the right pick depends on your channel mix, and a SaaS team building this stack should evaluate both before committing. Chatwoot wins if live chat and social DMs are 60%+ of your volume: its widget and inbox UX are best-in-class. Zammad wins if email and phone dominate and your ops team wants SLA-by-priority reporting out of the box. FreeScout is the escape hatch for teams under five agents who'd rather skip a Postgres deploy. Pick exactly one as your system of record; the other two are options, not recommendations.
Where does the RAG corpus actually come from on day one?
Three sources, in this order: (1) your existing help center if you have one — export to markdown and feed haiku.rag directly; (2) your top-50 historical ticket resolutions, anonymised and pasted into AnythingLLM as a 'resolved-cases' workspace; (3) your internal runbooks, but only after you've tagged them 'public' or 'internal' and confirmed haiku.rag respects the filter. Skip product marketing copy — it's the worst RAG source on earth, full of vague benefit language with no factual content the bot can cite.