OpenWhispr supports multiple speech-to-text engines and AI providers. Run everything locally for privacy, use cloud APIs for speed, or mix both.
Run speech-to-text entirely on your machine. No internet required, no data leaves your device.
via whisper.cpp · 99+ languages · GGML format
| Model | Size | Speed | Quality |
|---|---|---|---|
| Tiny | 75 MB | Fastest | Basic |
| Base (Recommended) | 142 MB | Fast | Good |
| Small | 466 MB | Medium | Better |
| Medium | 1.5 GB | Slow | High |
| Large v3 | 3 GB | Slowest | Best |
| Turbo | 1.6 GB | Fast | Good |
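If you want to choose a Whisper model by available disk space, the table above can be read as a simple lookup. The sketch below is illustrative only and is not part of OpenWhispr: `pick_model`, `WHISPER_MODELS`, and `QUALITY_RANK` are hypothetical names, the quality ordering (Basic < Good < Better < High < Best) is an assumed reading of the table's labels, and GB sizes are converted assuming 1 GB = 1000 MB.

```python
from typing import Optional

# (name, size in MB, quality label), copied from the table above.
# Assumption: 1 GB = 1000 MB for the GB-sized entries.
WHISPER_MODELS = [
    ("Tiny", 75, "Basic"),
    ("Base", 142, "Good"),
    ("Small", 466, "Better"),
    ("Medium", 1500, "High"),
    ("Large v3", 3000, "Best"),
    ("Turbo", 1600, "Good"),
]

# Assumption: the quality labels are ordered like this.
QUALITY_RANK = {"Basic": 0, "Good": 1, "Better": 2, "High": 3, "Best": 4}


def pick_model(max_size_mb: float) -> Optional[str]:
    """Return the highest-quality model that fits the size budget.

    Ties on quality go to the smaller model. Returns None if nothing fits.
    """
    candidates = [m for m in WHISPER_MODELS if m[1] <= max_size_mb]
    if not candidates:
        return None
    best = max(candidates, key=lambda m: (QUALITY_RANK[m[2]], -m[1]))
    return best[0]


if __name__ == "__main__":
    print(pick_model(500))
```

This ranks on size and quality only. Speed is ignored, so a model such as Turbo, which the table lists as fast but larger, may still be the better fit on a slower machine.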
via sherpa-onnx · 25 languages · ONNX INT8
State-of-the-art English accuracy with 25-language multilingual support. INT8 quantized for efficient local inference.
Parakeet languages: English, Bulgarian, Croatian, Czech, Danish, Dutch, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Ukrainian.
Use your own API keys for cloud-powered transcription. Bring-your-own-key keeps you in control.
Fast, accurate transcription
Most accurate OpenAI transcription
Original Whisper API endpoint
216x real-time speed — the fastest cloud transcription available
Connect Ollama, self-hosted Whisper servers, LocalAI, or any service with an OpenAI-compatible /audio/transcriptions endpoint.
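For reference, a request to an OpenAI-compatible `/audio/transcriptions` endpoint is a multipart form POST carrying a `file` field and a `model` field. The sketch below builds one using only the Python standard library. It shows what such a server expects, not how OpenWhispr sends requests internally. The function name `build_transcription_request`, the base URL `http://localhost:8080/v1`, and the file name are placeholders chosen for the example.

```python
import urllib.request
import uuid
from typing import Optional


def build_transcription_request(
    base_url: str,
    audio_bytes: bytes,
    filename: str,
    model: str,
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST to <base_url>/audio/transcriptions."""
    boundary = uuid.uuid4().hex

    # Text field: which model the server should use.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()

    # File field header, followed by the raw audio bytes.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()

    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + audio_bytes + closing

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if api_key:
        # Hosted providers usually expect a bearer token; local servers may not.
        headers["Authorization"] = f"Bearer {api_key}"

    url = base_url.rstrip("/") + "/audio/transcriptions"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder URL and audio; point these at your own server and file.
    req = build_transcription_request(
        "http://localhost:8080/v1", b"\x00\x01", "clip.wav", "whisper-1"
    )
    print(req.get_method(), req.full_url)
```

To send the request, pass it to `urllib.request.urlopen(req)`. OpenAI's own endpoint returns JSON with a `text` field by default. Other compatible servers may respond differently, so check the response shape of the one you connect.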
Zero-config transcription and AI processing. No API keys needed — just sign in and go.
Managed transcription
We route to the fastest provider automatically
AI text cleanup included
Formatting, punctuation, and filler removal
Free tier available
Upgrade to Pro for unlimited usage
After transcription, OpenWhispr can clean up your text — fixing grammar, removing filler words, and formatting output. Choose from cloud or local AI models.
OpenAI
GPT-5.2, GPT-5 Mini, GPT-5 Nano, GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano
Anthropic
Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5
Google
Gemini 3 Pro, Gemini 3 Flash, Gemini 2.5 Flash Lite
Groq
Qwen3 32B, GPT-OSS 120B, GPT-OSS 20B, LLaMA 3.3 70B, LLaMA 3.1 8B, Mixtral 8x7B
Qwen
Qwen3 8B · 0.6B, 1.7B, 4B, 8B, 32B variants
Mistral
Mistral 7B Instruct · Q4 and Q5 quantizations
Meta LLaMA
LLaMA 3.2 3B · 1B, 3B, 8B variants
OpenAI OSS
GPT-OSS 20B · Open-source flagship
Local reasoning models run via llama.cpp with GGUF quantization. All processing stays on your machine.
1. You speak
Hold your hotkey and talk naturally into any app
2. Model transcribes
Whisper, Parakeet, or a cloud provider converts speech to text
3. AI cleans up
Optional AI processing fixes grammar, removes filler, and formats your text
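The three steps above amount to a small pipeline: audio goes to a transcriber, and the resulting text optionally passes through a cleanup stage. The sketch below models that flow with pluggable functions. It is a conceptual illustration, not OpenWhispr's code: `dictate`, `simple_cleanup`, and `fake_transcribe` are hypothetical names, and the regex-based cleanup is a simple stand-in for the AI model that would do this step in practice.

```python
import re
from typing import Callable, Optional

# Stand-in for AI cleanup: matches a few spoken fillers ("um", "uh", "umm", ...).
FILLERS = re.compile(r"\b(?:um+|uh+)\b,?\s*", re.IGNORECASE)


def simple_cleanup(text: str) -> str:
    """Remove fillers, collapse whitespace, capitalize, and end with punctuation."""
    text = FILLERS.sub("", text)
    text = re.sub(r"\s+", " ", text).strip()
    if text:
        text = text[0].upper() + text[1:]
        if text[-1] not in ".!?":
            text += "."
    return text


def dictate(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    cleanup: Optional[Callable[[str], str]] = None,
) -> str:
    """Step 2: transcribe the audio. Step 3 (optional): clean up the text."""
    raw = transcribe(audio)
    return cleanup(raw) if cleanup else raw


def fake_transcribe(audio: bytes) -> str:
    # Hypothetical transcriber; a real one would call Whisper, Parakeet, or a cloud API.
    return "um so I think uh we should ship it"


if __name__ == "__main__":
    print(dictate(b"", fake_transcribe))                  # raw transcript
    print(dictate(b"", fake_transcribe, simple_cleanup))  # cleaned transcript
```

Keeping the transcriber and the cleanup step as separate callables mirrors the choice described on this page: either stage can be local or cloud-based, and cleanup can be skipped entirely.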
Download OpenWhispr and choose the models that work best for you. Local models are completely free. Cloud providers use your own API keys.