
TL;DR
I built Tanvilla, a multi-tenant AI chatbot platform for real estate agencies. It handles WhatsApp, Facebook Messenger, and an embeddable website widget from a unified inbox. The bot understands Tunisian Arabic (derja), searches properties from natural-language requests, books viewings, sends voice replies via ElevenLabs, and feeds everything into a real-time CRM dashboard. It is currently in early validation: the first demo is out and I'm talking to agencies, with no paying customers yet.
Why I Built This
Honestly? I wanted to play with the new Vercel Chat SDK.
When the Chat SDK ecosystem matured in early 2026 with multi-platform support (WhatsApp, Messenger, web, and more) and AI SDK 6 dropped with the ToolLoopAgent class, my immediate reaction was "okay, I need to build something with this." The whole AI agent stack finally felt like something a solo dev could actually ship into production.
So the question wasn't "what's a great business idea?" It was "what's a real use case that will force me to use all these tools properly?"
Real estate kept coming back to me. Every agency in Tunisia runs on WhatsApp. Every agency loses leads at night. Every agency has the same repetitive conversations a hundred times a day ("c'est combien ?", "vous avez des photos ?", "quand je peux visiter ?"). Perfect playground for a conversational AI that needs to actually do things — search a database, send photos, book appointments, qualify leads.
What I wanted to learn:
- How AI agents work in production (not just chat completions)
- Multi-platform messaging at the protocol level
- Multi-tenant SaaS with proper data isolation
- Whether Vercel's whole "Chat SDK + AI SDK + Sandbox" stack actually works together
Spoiler: it does, but production reveals corners the docs don't.
What I Was Trying to Solve
Real estate agents in Tunisia get hammered on WhatsApp all day. A client sees a property on Instagram, DMs the agency, then also messages on WhatsApp, then maybe Facebook Messenger if they don't get a reply. The agent is showing villas, driving between viewings, and trying to respond to 50 messages spread across multiple apps. Nobody wins.
The existing tools all had the same problem: they were either generic chatbot builders that don't understand real estate, or real estate CRMs (Lofty, kvCORE, Real Geeks) that don't understand WhatsApp. Nothing did both, and definitely nothing spoke Tunisian Arabic.
There's also a quieter problem: when an agent quits an agency, their phone goes with them. Months of client conversations, lost overnight. Agencies in Tunisia have basically no system — everything lives in the head of one person or in an Excel spreadsheet from 2019.
So the brief I set for myself: respond to clients across every channel they use, in the language they speak, capture every lead into a CRM the agency actually owns, and make the bot useful — not just a fancy autoresponder.
How I Actually Built It
The initial approach
I started with the chatbot side because that's where the interesting tech was. Built a minimal WhatsApp version using Chat SDK + AI SDK 6 with a fake property database, and tried to have a conversation in derja.
First try: I asked the bot "choufli dar S+2 fi ariana." It found one property. The actual database had four properties in Ariana governorate. The other three were in Ghazela, Raoued, and Soukra — all parts of Ariana — but the bot was doing dumb text matching on the property name.
This is the moment where I realized real estate search isn't a chatbot problem, it's a location data problem. I built a hierarchical locations table (governorate → delegation → locality) with derja aliases on each level. Now when someone says "ariana", the search expands to all child locations. When they say "ghazela", it narrows to that locality. Single change, massive jump in perceived intelligence.
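To make the expansion concrete, here is a minimal in-memory sketch of the idea. The rows, aliases, and function names are illustrative assumptions, not the real schema; in the actual app this lives in a Postgres table and the resulting ids would feed the property query.

```typescript
// Hypothetical stand-in for the locations table (governorate -> delegation -> locality).
type Level = "governorate" | "delegation" | "locality";

interface LocationRow {
  id: number;
  parentId: number | null;
  level: Level;
  name: string;
  aliases: string[]; // lowercase derja / alternate spellings (illustrative)
}

const LOCATIONS: LocationRow[] = [
  { id: 1, parentId: null, level: "governorate", name: "Ariana", aliases: ["aryena"] },
  { id: 2, parentId: 1, level: "delegation", name: "Raoued", aliases: ["rawed"] },
  { id: 3, parentId: 1, level: "delegation", name: "Soukra", aliases: ["la soukra"] },
  { id: 4, parentId: 2, level: "locality", name: "Ghazela", aliases: ["ghazala"] },
];

// Match a user term against canonical names and aliases, case-insensitively.
function findLocation(term: string): LocationRow | undefined {
  const needle = term.trim().toLowerCase();
  return LOCATIONS.find(
    (row) => row.name.toLowerCase() === needle || row.aliases.indexOf(needle) !== -1
  );
}

// Return the matched location id plus the ids of all its descendants.
function expandLocationIds(term: string): number[] {
  const root = findLocation(term);
  if (!root) return [];
  const ids: number[] = [root.id];
  // Breadth-first walk: ids grows while we iterate over it.
  for (let i = 0; i < ids.length; i++) {
    for (const row of LOCATIONS) {
      if (row.parentId === ids[i]) ids.push(row.id);
    }
  }
  return ids;
}
```

With these sample rows, "ariana" expands to every location under the governorate, while "ghazela" resolves to the single locality, which is the widening and narrowing behaviour described above.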
What I shipped (in order)
WhatsApp chatbot with 5 AI tools. The agent has searchProperties, getPropertyPhotos, bookAppointment, escalateToHuman, and updateLeadPreferences. The LLM (GPT-4o-mini) understands derja naturally, extracts intent and entities, then calls the right tools. The tool loop runs until the agent has what it needs.
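For readers who haven't seen a tool loop, here is a dependency-free sketch of the shape. It is not the AI SDK's ToolLoopAgent API: the model is a scripted stub, everything is synchronous for brevity, and the types and helper names are assumptions made for illustration. Only the tool name searchProperties comes from the real project.

```typescript
type ToolInput = Record<string, unknown>;

// The model either asks for a tool call or produces the final reply.
type ModelStep =
  | { kind: "tool"; name: string; input: ToolInput }
  | { kind: "final"; text: string };

type Model = (transcript: string[]) => ModelStep;
type Tool = (input: ToolInput) => string;

// Keep calling the model, feeding tool results back in, until it answers
// or the step budget runs out.
function runToolLoop(
  model: Model,
  tools: Record<string, Tool>,
  userMessage: string,
  maxSteps: number = 5
): string {
  const transcript: string[] = ["user: " + userMessage];
  for (let step = 0; step < maxSteps; step++) {
    const next = model(transcript);
    if (next.kind === "final") return next.text;
    const tool = tools[next.name];
    const result = tool ? tool(next.input) : "error: unknown tool " + next.name;
    transcript.push("tool " + next.name + ": " + result);
  }
  return "Sorry, I could not complete that request.";
}

// Hypothetical narrow tool: one job, one clearly described input.
const demoTools: Record<string, Tool> = {
  searchProperties: (input) => "2 listings found in " + String(input.location),
};

// Scripted stand-in for the LLM: search first, then answer from the result.
const scriptedModel: Model = (transcript) => {
  const prefix = "tool searchProperties: ";
  const toolLine = transcript.find((line) => line.indexOf(prefix) === 0);
  if (toolLine !== undefined) {
    return { kind: "final", text: "I found some options: " + toolLine.slice(prefix.length) };
  }
  return { kind: "tool", name: "searchProperties", input: { location: "ariana" } };
};
```

Calling runToolLoop(scriptedModel, demoTools, "choufli dar S+2 fi ariana") takes two model steps: one tool call, then the final reply. The step budget is the safety net that keeps a confused model from looping forever.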
Multi-tenant routing. Each WhatsApp message arrives at one webhook. The system looks up which organization owns that phone_number_id and routes to the right bot instance. Bot configs are cached for 5 minutes per phone number to avoid hammering the database on every message.
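A minimal sketch of that lookup-plus-cache step, assuming a simple in-memory TTL cache. BotConfig, TtlCache, and resolveBotConfig are hypothetical names, and the database loader is injected so the example stays self-contained.

```typescript
// Hypothetical shape of a per-tenant bot configuration.
interface BotConfig {
  organizationId: string;
  systemPrompt: string;
}

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;
const configCache = new TtlCache<BotConfig>(FIVE_MINUTES_MS);

// Resolve the tenant for an incoming phone_number_id, hitting the DB only on a miss.
async function resolveBotConfig(
  phoneNumberId: string,
  loadFromDb: (id: string) => Promise<BotConfig | null>
): Promise<BotConfig | null> {
  const cached = configCache.get(phoneNumberId);
  if (cached) return cached;
  const fresh = await loadFromDb(phoneNumberId);
  if (fresh) configCache.set(phoneNumberId, fresh);
  return fresh;
}
```

One caveat with this design on serverless: a module-level cache lives per function instance, so it reduces database reads on warm instances rather than eliminating them.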
ElevenLabs TTS for voice messages. When the bot replies, it can also generate a voice note using ElevenLabs Flash v2.5 (eleven_flash_v2_5 model). Agencies can configure a custom voice per organization. This feature alone gets reactions — "wait, the bot can talk?"
Properties module with photo upload, drag-and-drop reorder, status management (available/reserved/sold/rented), location dropdowns wired to the hierarchy table, and a maplibre-gl map view.
Leads kanban with optimistic drag-and-drop. Six stages from New to Converted, with the visual pipeline most agencies have never had. Built with @dnd-kit for the drag interactions.
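The optimistic part boils down to "update the board first, persist second, roll back on failure". Here is a framework-free sketch of that pattern; the Lead shape, the board object, and the persist callback are assumptions, and the real version sits behind React state and @dnd-kit's drag events.

```typescript
interface Lead {
  id: string;
  stage: string;
}

// Pure helper: return a new list with one lead moved to a new stage.
function withStage(leads: Lead[], leadId: string, stage: string): Lead[] {
  return leads.map((lead) => (lead.id === leadId ? { id: lead.id, stage: stage } : lead));
}

// Apply the move immediately, then persist. If persisting fails, restore the snapshot.
// Simplification: a rollback here would also discard any moves made while the save was in flight.
async function moveLeadOptimistically(
  board: { leads: Lead[] },
  leadId: string,
  nextStage: string,
  persist: (leadId: string, stage: string) => Promise<void>
): Promise<boolean> {
  const snapshot = board.leads;
  board.leads = withStage(snapshot, leadId, nextStage); // the UI sees this right away
  try {
    await persist(leadId, nextStage);
    return true;
  } catch (_err) {
    board.leads = snapshot; // the save failed: put the card back
    return false;
  }
}
```

The card moves before the network round trip finishes, which is what makes the drag feel instant; the boolean result lets the caller show an error toast when the rollback happens.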
Calendar with appointment booking from the chatbot. When a client picks a time slot in WhatsApp, it lands in the dashboard calendar instantly. Status tracking: confirmed / pending / completed / cancelled / no-show.
Analytics dashboard. KPI cards, leads by channel (pie chart), leads over time (area chart), pipeline breakdown (bar chart) — all using Recharts.
Facebook Messenger adapter. Full parity with WhatsApp — text, voice via ElevenLabs STT/TTS, image attachments. Same handler pipeline, just routed by page_id instead of phone_number_id.
Embeddable website widget. Agencies copy an iframe snippet and paste it on their website. Cookie-based anonymous visitor identity, native SSE streaming via AI SDK useChat. Same bot brain, just a different transport layer.
Technical decisions
Why the Vercel stack end-to-end: I could have split this into a Node backend with separate services. I went all-in on Vercel because the whole point was to learn how their ecosystem plays together: Next.js 16 and React 19, with Vercel for hosting and Sandbox, and Supabase as the one external piece for data. One repo, one deploy. Solo dev tax: I can't afford to operate five different systems.
Why OpenAI GPT-4o-mini over Claude for production: I started with Claude because Anthropic models had the best derja understanding in my early tests. Switched to GPT-4o-mini for production after seeing the latency difference and the cost difference at scale. The agent tasks here are tool-calling and entity extraction, not creative writing — GPT-4o-mini is more than enough and ~10x cheaper than Claude Sonnet on this workload. I keep both SDKs installed (@ai-sdk/openai and @ai-sdk/anthropic) so I can switch per-tenant if needed.
Why structured location data over geo coordinates: I almost reached for PostGIS and radius search. Then I remembered most users say "I want a place in Ariana", not "I want a place within 3km of these coordinates". Hierarchical location data with proper aliases is dumb-simple and matches how people actually talk. PostGIS can come later if proximity search becomes important.
Why generalized handlers from the start: When I added Messenger and the website widget, the AI handler pipeline barely changed. I'd parameterized Platform as a type early on and dispatched per-platform only at the edges (lead resolver writes to whatsapp_id / messenger_id / web_session_id; conversation resolver tags by platform). The bot brain itself doesn't know what platform it's running on. This made adding new platforms a matter of new adapter + new webhook route, not rewriting the agent.
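A small sketch of what "dispatch only at the edges" can look like. The three column names come from the description above; the message shape and function names are hypothetical.

```typescript
type Platform = "whatsapp" | "messenger" | "web";

// The only place that knows which lead column belongs to which platform.
const LEAD_ID_COLUMN: Record<Platform, string> = {
  whatsapp: "whatsapp_id",
  messenger: "messenger_id",
  web: "web_session_id",
};

// Hypothetical normalized message produced by each adapter.
interface IncomingMessage {
  platform: Platform;
  senderId: string;
  text: string;
}

// Edge: turn a platform-specific sender into a lead lookup filter.
function leadLookupFilter(msg: IncomingMessage): { column: string; value: string } {
  return { column: LEAD_ID_COLUMN[msg.platform], value: msg.senderId };
}

// Core: what the agent receives carries no platform field at all.
interface AgentInput {
  leadKey: string;
  text: string;
}

function toAgentInput(msg: IncomingMessage): AgentInput {
  const filter = leadLookupFilter(msg);
  return { leadKey: filter.column + ":" + filter.value, text: msg.text };
}
```

Because LEAD_ID_COLUMN is typed as Record<Platform, string>, adding a fourth platform to the union makes the compiler flag the missing mapping, which is a cheap way to keep the edges honest.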
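Why RLS for multi-tenancy: Past me would have built tenant isolation in the application layer with WHERE organization_id = ? everywhere. Future me knows that's how you eventually leak data when someone forgets a clause. RLS makes the database enforce isolation: every query that runs under a user-scoped role is filtered by policy, whether or not the application remembers the clause. (Supabase's service-role key bypasses RLS, so server-side code that uses it still has to scope by organization explicitly.) Took two days to set up properly and saved me from an entire class of bugs.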
Problems I hit (the real "oh shit" moments)
The bot stopped responding in production. Worked perfectly locally. In production: webhook receives the message, returns 200 to Meta, then... silence. The AI handler never finishes. The user gets nothing.
Took me a day to figure out: Chat SDK processes messages as detached promises after returning the 200 to Meta. On Vercel serverless, the function instance can be frozen or torn down as soon as the response is returned, so nothing guarantees that detached work ever finishes. Locally, the long-running dev server hid the problem.
The fix: wrap the background work in waitUntil() from @vercel/functions, which tells Vercel to keep the function alive until the promise resolves. I also bumped maxDuration to 120s for complex conversations. Then I parallelized everything I could: Promise.all() on DB saves plus history loading, photo sending with Promise.allSettled(), and post-AI work running concurrently with the reply send. Latency dropped noticeably.
This is the single most important production lesson I learned: a serverless function is not a long-running Node.js server. Detached promises die.
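Here is a sketch of the pattern with every dependency injected so it runs on its own. In a real route handler, waitUntil would be imported from @vercel/functions, the handler would return a Response, and the timeout would be raised with Next.js route segment config (export const maxDuration = 120). The Deps interface and function names are assumptions, not Chat SDK's API.

```typescript
// Same shape as waitUntil from "@vercel/functions": takes a promise, returns nothing.
type WaitUntil = (work: Promise<unknown>) => void;

// Hypothetical collaborators, injected so the sketch has no external imports.
interface Deps {
  waitUntil: WaitUntil;
  saveMessage: (text: string) => Promise<void>;
  loadHistory: (leadId: string) => Promise<string[]>;
  runAgent: (history: string[], text: string) => Promise<{ reply: string; photoUrls: string[] }>;
  sendText: (leadId: string, text: string) => Promise<void>;
  sendPhoto: (leadId: string, url: string) => Promise<void>;
}

async function processMessage(deps: Deps, leadId: string, text: string): Promise<void> {
  // Independent I/O runs in parallel instead of back to back.
  const [, history] = await Promise.all([deps.saveMessage(text), deps.loadHistory(leadId)]);
  const { reply, photoUrls } = await deps.runAgent(history, text);
  await deps.sendText(leadId, reply);
  // allSettled: one failed photo upload should not reject the whole batch.
  await Promise.allSettled(photoUrls.map((url) => deps.sendPhoto(leadId, url)));
}

function handleWebhook(deps: Deps, leadId: string, text: string): { status: number } {
  // Register the background work BEFORE returning, so the platform keeps
  // the function alive until the promise settles.
  deps.waitUntil(processMessage(deps, leadId, text));
  // Acknowledge immediately so Meta does not retry the webhook.
  return { status: 200 };
}
```

The important detail is the order inside handleWebhook: the promise is handed to waitUntil first, and only then is the acknowledgement returned.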
The other production gotcha was Postgres SSL. Chat SDK uses its own Postgres state adapter (@chat-adapter/state-pg) for persistence. Locally it connected fine. In production: SELF_SIGNED_CERT_IN_CHAIN errors immediately. Three things had to be true: use Supabase's Transaction pooler URL (not Direct, because Vercel is serverless and Direct connections don't scale), URL-encode the @ in your password as %40, and pass the URL explicitly to createPostgresState() instead of relying on env var fallback chains. Once those three lined up, everything worked. I didn't find this combination spelled out in any of the docs I read.
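The password-encoding step is the easiest one to get wrong by hand, so here is a tiny sketch that builds the connection string programmatically. The host, user, and port values in the usage note are placeholders (6543 is assumed to be the transaction pooler port), and the sketch deliberately stops short of calling createPostgresState(), whose exact signature isn't shown here.

```typescript
interface PoolerOptions {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Build a Postgres connection URL with the credentials percent-encoded,
// so characters like "@" or "/" in the password cannot break URL parsing.
function buildPoolerUrl(opts: PoolerOptions): string {
  const user = encodeURIComponent(opts.user);
  const password = encodeURIComponent(opts.password);
  return "postgresql://" + user + ":" + password + "@" + opts.host + ":" + opts.port + "/" + opts.database;
}
```

For example, buildPoolerUrl({ user: "postgres.projectref", password: "s3cr@t", host: "pooler.example.com", port: 6543, database: "postgres" }) yields a URL whose password segment reads s3cr%40t. The resulting string is what would then be passed explicitly to the state adapter.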
WhatsApp Business API onboarding is brutal. Meta's developer portal threw a phantom "phone registration error" on my account that blocked the API Setup page even though I was using their test number. Eventually fixed it by calling the /register endpoint directly via Graph API Explorer with a 6-digit PIN. The dashboard was lying — the API itself worked fine. Lesson: when Meta's UI fails, go straight to the API.
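For reference, the same call can be made from code instead of the Graph API Explorer. This sketch only builds the request; the API version is left as a parameter because it changes over time, and the function name and return shape are assumptions. The body fields (messaging_product and pin) follow the Cloud API's phone number registration endpoint as I understand it.

```typescript
interface RegisterRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the phone number registration request.
function buildRegisterRequest(
  phoneNumberId: string,
  pin: string,
  accessToken: string,
  apiVersion: string
): RegisterRequest {
  // Fail early on anything that is not exactly six digits.
  if (!/^\d{6}$/.test(pin)) {
    throw new Error("PIN must be exactly 6 digits");
  }
  return {
    url: "https://graph.facebook.com/" + apiVersion + "/" + phoneNumberId + "/register",
    method: "POST",
    headers: {
      Authorization: "Bearer " + accessToken,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ messaging_product: "whatsapp", pin: pin }),
  };
}
```

Keeping the builder separate from the network call makes it easy to log exactly what would be sent, which helps when the dashboard and the API disagree.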
The "I don't understand" reply. The first time I sent a cold outreach message to a real agency owner, I described the product as "a WhatsApp assistant for real estate." She replied "I don't understand what this is." That was humbling. I realized I'd been describing the technology, not the scenario. I rewrote the pitch to lead with "imagine a client messages you at 11pm: the bot responds instantly with matching properties and books a viewing." Same product, completely different reception. Now I always lead with the scenario.
What I Learned & Would Do Different
The AI agent pattern is genuinely powerful. I went in skeptical that tool-calling LLMs would be reliable enough for production. After building this, I'm convinced this is how most apps will be built in a few years. The bot handles fuzzy requests, multi-step interactions, and context switches that would have taken me hundreds of conditional branches to handle manually. The trick is making the tools narrow and well-described — let the LLM do the orchestration, not the business logic.
Production is where you actually learn. Locally everything works. Production reveals the corners. The waitUntil issue, the Postgres SSL chain, the Vercel function timeouts on long AI chains — none of this is visible until real traffic hits real infrastructure. Spend less time in dev mode, ship the ugly version, see what breaks.
RLS multi-tenancy is the way. I'd never built tenant isolation enforced at the DB level. Now I don't want to go back. Every query is automatically scoped. The app code stays clean. If you're starting a multi-tenant SaaS today, do this on day one.
Generalize early. Splitting the platform-specific code (resolvers, webhook routes) from the platform-agnostic code (agent, tools, business logic) on day one made adding Messenger and the website widget trivial. If I'd started WhatsApp-first and tightly coupled, every new platform would have been a rewrite.
Don't describe technology, describe scenarios. This was the biggest non-technical learning. Every time I talk about Tanvilla as "an AI WhatsApp assistant," people glaze over. Every time I describe the client-messages-at-11pm scenario, people lean in. I now apply this rule to landing pages, cold emails, demos, everywhere.
What I'd do differently: I spent too long perfecting the product before talking to real agencies. I had the chatbot, the kanban, the calendar, the analytics, the Messenger adapter all done before I'd done a single proper customer interview. Classic engineer mistake — building before validating. I've since corrected this by sending out a Mom Test survey and doing customer interviews in parallel with continuing to build.
Where It Stands Now
Honest status: first demo is sent, talking to a couple of agencies in Tunis. No paying customers yet. I'm in the messy middle where the product works but I don't have product-market fit signal yet.
What works end-to-end: WhatsApp + Messenger + embeddable web widget all routing through the same agent, derja understanding, property search via natural language, viewing booking, voice replies, lead capture, real-time dashboard updates, multi-tenant isolation. The technical foundation is solid.
What's not working yet: I'm still figuring out the right pitch, the right pricing anchor, and the right ICP. Tunisian agencies are price-sensitive in a way I didn't fully appreciate. I'm running validation interviews to figure out whether to keep pushing Tunisia hard or expand to a higher-paying market.
What's next:
- Get to 3 paying agencies (to validate the pricing and the basic value proposition)
- Improve the bot's handling of objections ("is this negotiable?", "can I see more photos?")
- Build a smoother onboarding flow so new agencies can set themselves up without me holding their hand
- Maybe add automated property reel video generation via Remotion + Vercel Sandbox (the stack is already in place)
I'm not sure if Tanvilla becomes a real business or stays a learning project. Both outcomes are fine for me right now. The tech I learned applies to a dozen other ideas in my backlog, and even if Tanvilla doesn't scale, I have a working multi-tenant AI agent SaaS template I can fork for the next thing.
The Stack
Framework
- Next.js 16.2.6 (App Router, Turbopack)
- React 19.2.6
- TypeScript
UI
- Tailwind CSS v4
- shadcn/ui + Radix UI
- Lucide + Tabler icons
- @dnd-kit (drag-and-drop kanban)
- Recharts (analytics)
- Motion / Framer Motion v12
- maplibre-gl (property maps)
AI & Messaging
- Vercel AI SDK v6 (ToolLoopAgent for the bot brain)
- Chat SDK v4.28 with @chat-adapter/whatsapp, @chat-adapter/messenger, @chat-adapter/web
- @chat-adapter/state-pg (Postgres-backed conversation state)
- OpenAI (@ai-sdk/openai, GPT-4o-mini) for production
- Anthropic (@ai-sdk/anthropic) kept available
- ElevenLabs (eleven_flash_v2_5) for TTS voice messages
- WhatsApp Business Cloud API direct for sending images/audio
- Facebook Graph API for Messenger
Database & Auth
- Supabase (PostgreSQL + Auth + Storage + Realtime)
- Row Level Security for multi-tenant isolation
- Transaction pooler connection (critical for serverless)
Video & Media (in progress)
- Remotion v4 with @remotion/vercel + @remotion/player
- Vercel Sandbox for video rendering
- Vercel Blob for output storage
Infrastructure & Ops
- Vercel (Edge + Serverless + Sandbox)
- Resend (transactional email via React Email templates)
- PostHog (product analytics)
- Firecrawl (@mendable/firecrawl-js) for scraping listing portals
Live: tanvilla.app
If you're building something similar or want to talk about AI agents, multi-tenant SaaS, or what it's like to validate a B2B product in North Africa — find me on LinkedIn or check my other projects on wassimbenr.com.