On-demand LLM translation inside a chat interface

By Ahmed "Riz" Ratul · 2026-03-24 03:48:13 · AI, LLMs

Real-time Thai ↔ English translation using LLMs with sub-second latency.

The killer feature in mymuaythai.app: a trainer in Bangkok types in Thai, and the student in London reads it in English. Instantly.

Why not the Google Translate API?

We tried it. The translations were technically correct but culturally wrong. Muay Thai has specific terminology that Google mangles. "หมัดตรง" (straight punch) became "straight fist." An LLM with the right system prompt gets this right.
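For illustration, a domain-aware system prompt along these lines is what makes the difference. This is a hypothetical sketch, not the exact production prompt; the helper name and wording are mine:

```typescript
// Hypothetical sketch of a domain-aware translation prompt.
// The wording is illustrative, not the production prompt.
function buildTranslationPrompt(sourceLang: string, targetLang: string): string {
  return [
    `You are a translator for a Muay Thai training chat app.`,
    `Translate the user's message from ${sourceLang} to ${targetLang}.`,
    `Preserve Muay Thai terminology: for example, "หมัดตรง" is "straight punch",`,
    `not a literal word-for-word rendering.`,
    `Output only the translated text, with no commentary.`,
  ].join(" ");
}

const prompt = buildTranslationPrompt("Thai", "English");
```

The key is pinning the domain and the output format; a generic "translate this" prompt drifts back toward literal renderings.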

Architecture

1. User sends message → hits Supabase Edge Function

2. Edge Function detects language (simple heuristic: Thai Unicode range check)

3. If translation needed → call OpenRouter with Claude Haiku (fast, cheap, good enough)

4. Store both original and translated text in the message record

5. Client displays the user's preferred language
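The steps above can be sketched as follows. The Thai Unicode range check in step 2 is the real heuristic; `translate` and `saveMessage` are placeholders for the OpenRouter call and the database write, since those depend on keys and schema:

```typescript
// Step 2: language detection via a Thai Unicode range check (U+0E00-U+0E7F).
// Any Thai character in the message marks it as Thai.
function detectLanguage(text: string): "th" | "en" {
  return /[\u0E00-\u0E7F]/.test(text) ? "th" : "en";
}

// Steps 3-5 sketched: `translate` stands in for the OpenRouter call and
// `saveMessage` for the database write. Both are placeholder signatures.
async function handleMessage(
  text: string,
  translate: (text: string, from: string, to: string) => Promise<string>,
  saveMessage: (original: string, translated: string, lang: string) => Promise<void>,
): Promise<void> {
  const lang = detectLanguage(text);
  const target = lang === "th" ? "en" : "th";
  const translated = await translate(text, lang, target);
  // Store both versions so each client can render its reader's preferred language.
  await saveMessage(text, translated, lang);
}
```

Storing both texts (step 4) rather than translating on read means each message is translated exactly once, no matter how many clients display it.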

Latency budget

Total budget: 500ms. Network to Edge Function: ~50ms. LLM call via OpenRouter: ~200-300ms. Database write: ~50ms. That sums to roughly 300-400ms, leaving around 100ms of headroom, and we hit sub-500ms consistently with Claude Haiku through OpenRouter.
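The budget arithmetic as a sketch (the stage names and numbers mirror the estimates above; they are estimates, not measurements):

```typescript
// Per-stage latency estimates against the 500ms budget.
// Numbers are the rough figures from the text, with the LLM call
// at the upper end of its ~200-300ms range.
const BUDGET_MS = 500;

const stageEstimates: Record<string, number> = {
  networkToEdge: 50, // client -> Supabase Edge Function
  llmCall: 300,      // OpenRouter + Claude Haiku
  dbWrite: 50,       // store original + translated text
};

function totalLatency(stages: Record<string, number>): number {
  return Object.values(stages).reduce((sum, ms) => sum + ms, 0);
}

const headroom = BUDGET_MS - totalLatency(stageEstimates); // ~100ms spare
```

That ~100ms of headroom is what absorbs a slow LLM response without blowing the budget.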

Cost

At ~$0.25 per million input tokens with Haiku, translation costs are negligible. A busy day of 10,000 messages at ~40 input tokens each is ~400k tokens, or about $0.10 in input costs; output tokens are priced higher and add a bit more, but it's still pocket change.
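The back-of-envelope arithmetic, assuming ~40 input tokens per message (an assumption; real message lengths vary, and output tokens are billed separately at a higher rate):

```typescript
// Back-of-envelope input-token cost. Token counts per message
// are an assumption for illustration.
function dailyInputCostUSD(
  messages: number,
  avgInputTokens: number,
  pricePerMillionTokens: number,
): number {
  return (messages * avgInputTokens / 1_000_000) * pricePerMillionTokens;
}

const cost = dailyInputCostUSD(10_000, 40, 0.25); // ≈ $0.10 before output tokens
```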

The model choice matters

We use OpenRouter to route between models. Haiku for translation (speed matters). Claude Sonnet for complex queries where accuracy is critical. Groq-hosted models for anything that needs to feel instant.
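A sketch of that routing. The model slugs are illustrative OpenRouter identifiers and the task names are mine; check OpenRouter's live model list before relying on any of them:

```typescript
type Task = "translation" | "complex" | "instant";

// Illustrative OpenRouter model slugs; verify against the current model list.
function pickModel(task: Task): string {
  switch (task) {
    case "translation": return "anthropic/claude-3-haiku";       // fast + cheap
    case "complex":     return "anthropic/claude-3.5-sonnet";    // accuracy-critical
    case "instant":     return "meta-llama/llama-3.1-8b-instruct"; // Groq-class speed
  }
}

// All three routes hit the same OpenRouter endpoint; only `model` changes.
// Sketch only -- not executed here.
async function complete(model: string, apiKey: string, messages: object[]) {
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages }),
  });
  return res.json();
}
```

The nice property is that swapping a model is a one-string change, so the latency/accuracy trade-off stays a routing decision rather than a code rewrite.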