The Voice AI That Can Only Book a Room Can Only Do 1 in 5 Calls

Hotel voice AI quietly optimized for the only call type the industry measures: reservations. That's one in five. FlowStay was built for the other four.

The receptionist at a 160-room urban boutique keeps a tally on a Post-it stuck to the underside of her desk. She has been doing it for two weeks, because her GM asked her to. One vertical mark for every inbound call. A letter next to the mark for what it was about.

By the end of her Tuesday double-shift she has forty-one marks. Eleven of them say R, for reservation. The other thirty are split across eight letters. H for housekeeping (“the towel in 412”). M for maintenance (“the ice machine on 3”). F for F&B (“are you still serving”). D for directions. W for the wifi password, always. B for billing (“can I get that receipt re-emailed”). C for concierge (“is there a dry cleaner”). S for a would-be walk-in who just wanted to ask if there was space tonight.

Her GM looks at the Post-it at the end of the week, counts the letters, and asks a question nobody in the voice-AI industry wants to answer: the product you’re selling me can do the R column. What about the other eight letters?

The only column the industry publishes is R

The reference point for hospitality voice analytics is Revinate’s 2024 Hospitality Benchmark Report, built on an analysis of 4.3 million hotel phone calls across North American hoteliers. It is, by a wide margin, the largest empirical corpus of hotel voice data in the industry.

Read the methodology carefully. Revinate tracks lead calls: inbound voice that has a plausible path to a reservation. That is what the dataset measures. That is what it is designed to measure.

From Revinate’s own methodology page on call center metrics, the tracked KPIs are lead call volume, lead call conversion rate, average booking value, offer rate, and outbound revenue per call. Every one of those is a reservation-funnel metric.

Lead call conversion rate: the percentage of reservation calls that convert to a booking.

Revinate, 8 Essential Hotel Call Center Metrics

What the benchmark does not publish, because it is not in the instrument: the categorical mix of the non-reservation remainder of voice traffic. No housekeeping percentage. No maintenance percentage. No F&B percentage. Not because Revinate is hiding it. Because the rest of the industry has never commercially cared to measure it.

This is not a complaint about Revinate. Their report is good, and the reservation-side data in it is load-bearing for every operator we know. It is a complaint about the whole market: the entire non-reservation call universe at hospitality scale is literally unmeasured.

What the ticket taxonomies prove

Here is the structural irony. The hospitality industry already knows the other eight letters exist. It just tracks them somewhere other than the phone.

Open ALICE by Actabl, the most widely deployed hotel operations platform in the independent and upper-upscale segment. Every guest request generated at a property runs through its ticketing system. The request categories, straight from the platform, are the standard taxonomy:

Housekeeping
Maintenance / Engineering
Front desk / Guest services
F&B / Room service
Concierge
Transportation
Wake-up calls
Billing / Reservations

HelloShift’s guest messaging platform surfaces the same structure: housekeeping, maintenance, concierge, F&B, billing, each as its own channel, each with its own ticket queue and its own SLA.

This is what the back of house already runs on. The industry has an explicit, production-grade ontology of non-reservation work. It just never pointed that ontology at the phone system. The phone was left to handle whatever happened to come through it, and the single metric anybody measures is whether the caller who happened to want a reservation got one.

The voice-AI market built to the benchmark, not to the building

Look at what the major voice-AI vendors in hospitality actually shipped.

PolyAI’s hotel page is explicit, and to their credit, ambitious: “Take room reservations, initiate housekeeping requests, resolve billing inquiries, answer frequently asked questions, call routing.” In the published case studies, though, the numbers that get reported are reservation-shaped: PolyAI’s three-chain case study reports Golden Nugget’s voice agent automating “34% of hotel reservation calls,” Mohegan Sun diverting “30% of calls” from the contact center. Those are real outcomes. They are also, notably, reservation-domain outcomes at casino-resort properties where reservations volume is outsized.

Annette, deployed at Sonesta and adopted as a white-labeled operator solution, is a PolyAI reskin. Same engine, same domain bias.

Canary AI Voice, launched in February 2025 with Wyndham as an early global customer, splits the surface into four separate assistants: AI Front Desk, AI Concierge, AI Central Reservations, AI Booking Agent. Already an improvement over the single-surface architecture. The product literature leans, again, into dining, attractions, and booking intents, the places the revenue is easy to attribute.

We are not knocking any of these products. They are good products that solve the reservation-shaped problem extraordinarily well. The point is narrower: every vendor in hospitality voice has optimized, first, for the column the industry measures. And the column the industry measures is R.

What breaks when the phone is not all-letter

The Post-it tally understates what the receptionist is actually doing, because most of the non-R calls happen between her other tasks. She is walking a bag. She is printing a receipt. She is explaining the spa hours to a guest at the desk. The phone rings. It is someone asking where to park. She can’t pick up. It goes to voicemail. Nobody calls back.

That call, the parking question at 2:14pm, is the shape of the gap. A voice AI that can only do R sits idle for most of the day, waiting for reservations traffic. The real-time operational work, the H, M, F, D, W, B, C, S, continues to drain the receptionist’s available attention, and the guest experience degrades in exactly the way J.D. Power has been measuring for years: slower response, lower satisfaction, lower propensity to return.

The vendor’s dashboard shows a healthy automation rate against the reservations baseline. The owner sees no corresponding shift in overall service quality, because the product is not answering most of the calls that shape it.

What a full-letter agent has to do

This is where the product design has to diverge from the benchmark.

A voice system that serves a hotel has to be able to:

Take and route every inbound call, 24 hours a day, on the first ring, in the guest’s language. Not “the reservations line.” The main line.
Classify the intent into the live ticket taxonomy (R, H, M, F, D, W, B, C, S), so the work is captured in the same queue the rest of the property already operates against.
Resolve in-domain what it can resolve. The wifi password. The breakfast hours. The spa closing time. The pool. The standard pre-arrival questions. The “do you take dogs.”
Open a ticket, not a voicemail, for what it can’t. Maintenance request at 10:47pm gets assigned to the overnight engineer with a priority and a location. Housekeeping request at 7:12am gets routed to the floor team before the morning shift starts.
Close a direct booking when the caller has reservation intent, on the phone, with the returning guest’s profile loaded before the second ring.

This is the part that is not reducible to “answer more calls” or “automate the first 40%.” The architecture has to match the ontology of the work, not the shape of the published benchmark.
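As a rough illustration of the classify, resolve, or ticket flow listed above, here is a minimal Python sketch. It is not FlowStay's implementation. The keyword table stands in for a real intent classifier, the canned answers are placeholder property facts, and every name in it (Intent, Ticket, Outcome, classify, handle_call) is hypothetical. Routing unmatched calls to the concierge queue is also an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Intent(Enum):
    """The nine Post-it letters used in this article."""
    RESERVATION = "R"
    HOUSEKEEPING = "H"
    MAINTENANCE = "M"
    FOOD_AND_BEVERAGE = "F"
    DIRECTIONS = "D"
    WIFI = "W"
    BILLING = "B"
    CONCIERGE = "C"
    SPACE_TONIGHT = "S"


# Hypothetical keyword table standing in for a real intent classifier.
KEYWORDS = {
    Intent.WIFI: ("wifi", "wi-fi"),
    Intent.HOUSEKEEPING: ("towel", "housekeeping"),
    Intent.MAINTENANCE: ("ice machine", "broken", "leak"),
    Intent.FOOD_AND_BEVERAGE: ("still serving", "breakfast"),
    Intent.DIRECTIONS: ("park", "directions"),
    Intent.BILLING: ("receipt", "invoice"),
    Intent.SPACE_TONIGHT: ("space tonight", "room tonight"),
    Intent.RESERVATION: ("book", "reservation"),
}

# Placeholder property facts the agent can answer without involving staff.
ANSWERS = {
    Intent.WIFI: "The wifi password is printed on your key card sleeve.",
    Intent.FOOD_AND_BEVERAGE: "The kitchen serves until 10pm.",
    Intent.DIRECTIONS: "Guest parking is behind the building.",
}

BOOKING_INTENTS = {Intent.RESERVATION, Intent.SPACE_TONIGHT}


@dataclass
class Ticket:
    intent: Intent
    summary: str
    location: Optional[str]
    priority: str


@dataclass
class Outcome:
    intent: Intent
    action: str  # "answered", "booking", or "ticket"
    answer: Optional[str] = None
    ticket: Optional[Ticket] = None


def classify(utterance: str) -> Intent:
    """Return the first intent whose keyword appears in the utterance."""
    text = utterance.lower()
    for intent, words in KEYWORDS.items():
        if any(word in text for word in words):
            return intent
    return Intent.CONCIERGE  # assumed fallback: a staffed queue picks it up


def handle_call(utterance: str, location: Optional[str] = None) -> Outcome:
    """Hand off to booking, answer in-domain, or open a ticket (never voicemail)."""
    intent = classify(utterance)
    if intent in BOOKING_INTENTS:
        return Outcome(intent, "booking")
    if intent in ANSWERS:
        return Outcome(intent, "answered", answer=ANSWERS[intent])
    priority = "high" if intent is Intent.MAINTENANCE else "normal"
    return Outcome(intent, "ticket", ticket=Ticket(intent, utterance, location, priority))


for line in ("What's the wifi password?", "The ice machine on 3 is broken"):
    result = handle_call(line)
    print(result.intent.value, result.action)
```

The design point is that every branch ends in one of three outcomes, and the ticket branch carries the same fields (category, location, priority) that the back-of-house queue already uses.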

The measurement gap is the competitive opening

For a new entrant, the absence of industry-level call-mix data is the opposite of a problem. It is a wedge.

If the industry has spent a decade optimizing voice products against a reservations benchmark, and the real call distribution at a full-service independent is, by informal operator tally, roughly one in five for reservations and four in five for the operational long tail, then the best voice AI measured on the R benchmark is, by design, a minority product.
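The arithmetic behind "a minority product" can be sketched in a few lines of Python. The function name overall_coverage is hypothetical. The inputs are the informal one-in-five reservation share and, purely as an illustration, the 34% reservation-call automation figure quoted earlier from a casino-resort case study. Combining those two figures is an assumption, since they come from different property types.

```python
def overall_coverage(reservation_share: float, reservation_automation_rate: float) -> float:
    """Share of ALL inbound calls handled by an agent that only handles reservation calls."""
    for value in (reservation_share, reservation_automation_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("shares and rates must be between 0 and 1")
    return reservation_share * reservation_automation_rate


# Ceiling: even a perfect reservations-only agent cannot exceed the R share.
ceiling = overall_coverage(0.20, 1.0)
# Illustration: a 34% automation rate applied to a one-in-five R share.
example = overall_coverage(0.20, 0.34)

print(f"{ceiling:.0%}")   # 20%
print(f"{example:.1%}")   # 6.8%
```

Under these assumptions a reservations-only agent tops out at a fifth of the phone traffic, however well it scores on its own benchmark.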

The product that wins is not the one with the highest reservation-conversion score on a vendor dashboard. It is the one that, on the receptionist’s Post-it at the end of the week, shows up as the handler for all nine letters.

That is the bet FlowStay is making. We built the voice system for the ticket taxonomy, the one ALICE and HelloShift have been running the back of house on for a decade, not for the one number the industry’s benchmark was instrumented to publish.

The phone at an independent hotel is not a reservations channel with some noise around it. It is the full-letter interface to the building. Treating it as anything less is what leaves the receptionist writing R’s on a Post-it while thirty other things ring through to voicemail.

She is trying to tell you what the product is supposed to do. The tally is right there, under the desk.