Give every voice a road.

If the language you work in isn't on the big commercial platforms, you know the struggle. We offer speech-to-text for over 300 languages the industry tends to overlook. It's affordable, it's quick, and it works without fuss. The transcripts won't be perfect, and they're not a substitute for a professional translator. But they give you a solid starting point you can work from, even while you're still in the field.

Try it free, 20 min on us

See how it works →

300+

Languages covered & growing

$0.20/min

Transcription + translation included

20 min

Free trial, no card needed

Featured Languages

Pacific focus

Explore all languages →

How it works

Upload. Select. Receive.

Three steps from audio to a usable transcript. No account setup, no monthly commitment.

Upload your audio

MP3, WAV, M4A, or AAC, up to 150 minutes per file. Drag and drop or select from your device.

Choose your languages

Pick the source language and optionally a target for translation. Our models handle 300+ languages today.

Receive & download

Your transcript is delivered by email and in your dashboard. Choose DOCX, Markdown (for easy AI or LLM processing), or plain text.
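
If you prepare many recordings at once, the limits in step one can be checked before you upload. The sketch below is only an illustration: `check_upload` is a hypothetical helper, not part of our service, and it assumes you already know each recording's length in minutes.

```python
from pathlib import Path

# Limits taken from the upload step above.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac"}
MAX_MINUTES = 150


def check_upload(filename, duration_minutes):
    """Return a list of problems; an empty list means the file fits the stated limits.

    Hypothetical helper: the duration must be supplied by the caller.
    """
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if duration_minutes > MAX_MINUTES:
        problems.append(f"too long: {duration_minutes} min (limit is {MAX_MINUTES} min)")
    return problems


print(check_upload("interview_01.WAV", 42))
print(check_upload("focus_group.ogg", 200))
```

A file that fails on length can be split into shorter parts before uploading.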

Explore all languages →

Language coverage

Why low-resource languages matter.

Over 7,000 languages are spoken in the world today, yet commercial transcription and translation services cover fewer than 150 of them. The communities that speak the rest are not small, and their voices matter just as much. They are health workers, farmers, teachers, and community leaders: people whose knowledge and experience deserve to be documented and shared.

We started with 300+ languages across the Pacific, Africa, South Asia, and Southeast Asia. The service is meant to help, not to replace. Whether you're a researcher, a programme team, or a human translator looking for a rough draft to work from, we hope this makes the job a little easier.

Good: reliable transcription
Medium: usable with light editing
Needs work: best for keyword extraction

Accuracy ratings are based on the Meta Omnilingual ASR 7B LLM-ASR model's published Character Error Rates (CER). "Good" means the model achieves CER at or below 0.5% on Meta's test sets, a strong benchmark for practical use. "Medium" reflects a CER above 0.5% where transcripts need light editing, and "Needs work" covers higher CER ranges best for keyword extraction.

Important: Meta's test sets, while rigorous, may not fully represent real-world recording conditions, speaker diversity, or regional dialects. We are building our own evaluation benchmarks from community audio samples and will publish more representative accuracy data as we grow.

Model accuracy improves over time through fine-tuning on curated data.

Learn the basics

Transcription, translation, explained.

A quick primer on the terms you'll see, so you know what you're getting.

🎤

Transcription

Converting spoken audio into written text in the same language. A 30-minute interview in Swahili becomes a Swahili transcript.

🌍

Translation

Converting the transcript from one language into another. Take that Swahili transcript and get it in English.

Accuracy varies by language pair. Tok Pisin → English is stronger than Tok Pisin → French, simply because more Tok Pisin to English training data exists. Each step (transcription → translation) adds its own margin for error, so mistakes in the transcript carry over into the translation.

📊

Accuracy measures: WER & CER

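WER (Word Error Rate): The share of words the model got wrong (substituted, dropped, or added) compared with a reference transcript. Lower is better.
CER (Character Error Rate): Like WER but at the character level. Useful for languages with long, complex words or no clear word boundaries, where one wrong character would otherwise count as a whole wrong word.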
How we rate accuracy: Our "Good / Medium / Needs work" ratings follow published benchmarks. We are building additional evaluation data from community audio to better reflect real-world conditions.
Audio quality matters: Real-world accuracy depends heavily on recording conditions. Background noise, microphone quality, overlapping speakers, and connection quality all affect results. A clean recording in a quiet room will always outperform field audio.
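
For readers who want to see how these two measures are calculated, here is a minimal sketch. It uses the standard edit-distance definition of WER and CER; the function names are ours for illustration, and the normalisation steps used in the published benchmarks (casing, punctuation, whether spaces count) are not specified here, so treat the details as assumptions.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: fewest substitutions, deletions, and insertions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edits divided by reference length.

    Assumption: spaces are counted as characters.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


reference = "em i gutpela nius"   # what was actually said
hypothesis = "em i gutpela nus"   # what the model produced
print(wer(reference, hypothesis))            # 0.25
print(round(cer(reference, hypothesis), 3))  # 0.059
```

One missing letter makes one word in four wrong (WER 25%) but only one character in seventeen (CER about 6%), which is why the two measures can look very different for the same transcript.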

Tok Pisin transcription with speaker labels and timestamps

SPK_00 [00:00:02] Gutpela moning olgeta. Tenk yu long kam long dispela miting.
SPK_01 [00:00:08] I gat planti samting long stori long wara pipalain.
SPK_00 [00:00:15] Yes, gavman i konfimim pinis ol fund bilong wara projek.
SPK_01 [00:00:22] Em i gutpela nius. Hamas Kina ol i givim?
SPK_00 [00:00:28] K150,000 long namba wan fase. Dispela bai kavim drilling na pasin bilong karim wara i go long ples.
SPK_01 [00:00:40] Na wanem long trenim ol manmeri long lukautim dispela sistem?

English translation of the same passage

SPK_00 [00:00:02] Good morning everyone. Thank you for coming to this meeting.
SPK_01 [00:00:08] There is a lot to discuss about the water pipeline.
SPK_00 [00:00:15] Yes, the government has confirmed the funding for the water project.
SPK_01 [00:00:22] That is good news. How much have they allocated?
SPK_00 [00:00:28] K150,000 for the first phase. This will cover the drilling and the piping to bring water to the village.
SPK_01 [00:00:40] And what about training the community to maintain this system?
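
If you plan to analyse transcripts in your own tools, the speaker labels and timestamps shown in the samples above are straightforward to pick apart. This is a rough sketch that assumes the plain-text download follows the same `SPK_NN [HH:MM:SS]` layout as the samples; `parse_turns` and `Turn` are illustrative names, not part of our service.

```python
import re
from dataclasses import dataclass

# Matches a turn prefix such as "SPK_01 [00:00:08] ".
TURN_PREFIX = re.compile(r"(SPK_\d+)\s+\[(\d{2}):(\d{2}):(\d{2})\]\s+")


@dataclass
class Turn:
    speaker: str
    start_seconds: int
    text: str


def parse_turns(transcript):
    """Split a transcript into speaker turns, one Turn per prefix found."""
    matches = list(TURN_PREFIX.finditer(transcript))
    turns = []
    for index, match in enumerate(matches):
        # A turn's text runs until the next prefix, or to the end of the transcript.
        text_end = matches[index + 1].start() if index + 1 < len(matches) else len(transcript)
        hours, minutes, seconds = (int(part) for part in match.groups()[1:])
        turns.append(Turn(
            speaker=match.group(1),
            start_seconds=hours * 3600 + minutes * 60 + seconds,
            text=transcript[match.end():text_end].strip(),
        ))
    return turns


sample = (
    "SPK_00 [00:00:02] Good morning everyone. Thank you for coming to this meeting.\n"
    "SPK_01 [00:00:08] There is a lot to discuss about the water project.\n"
)
for turn in parse_turns(sample):
    print(turn.speaker, turn.start_seconds, turn.text)
```

Because the prefix is matched anywhere in the text, the same function works whether turns are on separate lines or run together on one line.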

Where it makes a difference

Bringing voices from the field.

A few examples of where this kind of service makes a real difference.

Health Research

Field interviews in local languages

Health researchers conduct interviews with community health workers in rural Mozambique. Recordings are transcribed and translated, turning hours of spoken insight into data that can be analysed for programme evaluation.

Impact & Evidence

Most Significant Change stories

Programme teams collect MSC stories from communities in Tok Pisin. Transcription captures the narrative evidence about what changed and why, which feeds into evaluations, donor reports, and learning cycles.

Qualitative Research

Participatory community research

Research teams run focus groups and key informant interviews across Ethiopian communities in Amharic, Oromo, and Somali. Transcription and translation speed up the research cycle without losing depth.

Simple pricing

Pay for what you use. Nothing more.

No subscriptions, no commitments. Just per-minute pricing and a free trial to get you started.

Transcription & Translation

$0.20 / minute

Transcription in your source language plus translation to English or another target language. All in one price, no separate tiers.

300+ languages covered
Speaker diarization included
Timestamps at sentence level
Source + target text side by side
Download as DOCX, Markdown (for AI/LLM use), or plain text

Try free, 20 min

All pay-as-you-go. No monthly minimum. No cancellation fees. First 20 minutes free.
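
To budget for a project, the arithmetic is simple enough to sketch. The example below assumes the free 20 minutes are used up first and that audio is billed by its exact length with no rounding; neither detail is spelled out above, so take the result as an estimate. `estimate_cost` is an illustrative name only.

```python
from decimal import Decimal

RATE_PER_MINUTE = Decimal("0.20")    # USD, transcription + translation
FREE_TRIAL_MINUTES = Decimal("20")   # first 20 minutes free


def estimate_cost(audio_minutes, free_minutes_left=FREE_TRIAL_MINUTES):
    """Estimate the charge in USD for a given amount of audio.

    Assumptions: free minutes apply first; no rounding up to whole minutes.
    """
    minutes = Decimal(str(audio_minutes))
    if minutes < 0:
        raise ValueError("audio_minutes must not be negative")
    billable = max(minutes - Decimal(str(free_minutes_left)), Decimal("0"))
    return (billable * RATE_PER_MINUTE).quantize(Decimal("0.01"))


print(estimate_cost(90))                        # 14.00
print(estimate_cost(90, free_minutes_left=0))   # 18.00
```

So a first project with 90 minutes of interviews would come to about $14, and the same amount of audio after the trial is used up would be $18.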

Trust & privacy

Your data stays yours.

NGOs and researchers work with sensitive content. We keep that in mind at every step.

🔒

Own infrastructure

Your audio files are processed on our own servers, never sent to third-party APIs for processing.

🚫

No training on your data

We do not use customer recordings, transcripts, or personal information to train or improve our models. Your data stays yours, always.

🗑️

Automatic deletion

Audio files are deleted after processing. Transcripts are retained in your account until you remove them.

🔊

Audio quality matters

Accuracy depends heavily on recording conditions. We'd rather be upfront about that than overpromise. See our accuracy guide for tips on getting the best transcript.

About Last Mile Road

The last mile is where we start.

Most technology builds for the centre: well-resourced languages, stable connectivity, established markets. The last mile is everything outside that radius.

It's the thousands of languages with no transcription tools. The community health workers whose field reports go undocumented. The programme evaluators who rely on oral stories because written records don't exist in the languages they work in.

We're building speech-to-text for these languages, starting with the strongest foundations we can find and improving through continuous fine-tuning, in partnership with the communities we serve.

Powered by Meta Omnilingual ASR (Apache 2.0). Models are improved over time through our own fine-tuning work.

🌍 Language access for all
🤝 Community-rooted
🔓 Built on open source
📈 Always improving