100% local processing
Your files never leave your machine. No upload, no cloud, no waiting for a queue. Works offline after the first model download.
Drop in a recording, pick a language, get clean text or subtitles. Nothing is uploaded — the model runs locally.
OpenAI Whisper supports nearly a hundred languages, with the strongest accuracy on widely spoken ones. Auto-detect is the default; pick a specific language if you want to force it.
A single checkbox switches inference to your GPU through Vulkan — NVIDIA, AMD, Intel — and falls back silently to CPU if the driver does not cooperate.
Save a clean text transcript, ready-to-use subtitles, or both from a single run — the heavy inference happens only once.
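To illustrate why one inference pass is enough: once the recognized segments are in memory, the transcript and the subtitles are two renderings of the same list. The sketch below is in Python rather than the app's own .NET code, and the `Segment` type and function names are hypothetical, not taken from the app.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One recognized chunk of speech (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)


def to_text(segments: list[Segment]) -> str:
    """Render the same segments as a plain transcript, one segment per line."""
    return "\n".join(seg.text.strip() for seg in segments) + "\n"
```

Both `to_srt` and `to_text` read the same segment list, so choosing "both" adds only two cheap string-formatting passes on top of the single Whisper run.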
WhatsApp voice notes (.opus, .ogg), OBS recordings (.mkv), YouTube downloads (.webm), MP3, MP4, MOV, WAV, FLAC, M4A — ffmpeg handles them all.
Your audio and transcripts never leave your machine. The app sends a single ping per installation containing a random ID, the app version, and the system language, which shows which countries and languages to focus on. Nothing else is sent. The underlying components (FFmpeg, whisper.cpp, Whisper.net, .NET) are open source and independently auditable.
Choose any audio or video file on disk and the folder where the transcript should be saved.
Keep auto-detect or lock a specific language. Pick the Whisper model size (Tiny for speed, Medium or LargeV3 for accuracy on long recordings).
The app extracts a clean audio track, runs Whisper, and writes the .txt and/or .srt to your chosen folder. That is it.
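The extraction step can be sketched as a single ffmpeg invocation. This is an illustration in Python, not the app's actual code: the exact flags the app passes are not stated here, and the 16 kHz mono 16-bit PCM target is an assumption based on the input format whisper.cpp's examples expect. Naming the outputs after the source file is also an assumption, and both function names are hypothetical.

```python
from pathlib import Path


def build_extract_command(source: Path, wav_out: Path) -> list[str]:
    """Build an ffmpeg argument list that decodes the input to 16 kHz mono PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the temporary output if it exists
        "-i", str(source),    # any container/codec ffmpeg can read
        "-vn",                # drop video streams, keep audio only
        "-ac", "1",           # downmix to mono
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # 16-bit signed little-endian PCM
        str(wav_out),
    ]


def output_paths(source: Path, out_dir: Path, want_txt: bool, want_srt: bool) -> list[Path]:
    """Name the outputs after the source file, inside the chosen folder (assumed scheme)."""
    paths = []
    if want_txt:
        paths.append(out_dir / f"{source.stem}.txt")
    if want_srt:
        paths.append(out_dir / f"{source.stem}.srt")
    return paths
```

Passing the command as a list (for example to `subprocess.run(cmd, check=True)`) avoids shell quoting problems with file names that contain spaces, which are common in voice-note exports.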
Built with .NET 8 on Windows. Uses FFmpeg (LGPL) for format decoding and Whisper.net (MIT), which wraps whisper.cpp (MIT), for inference. GPU inference runs through the Vulkan runtime, with an AVX-accelerated CPU path as the fallback. Model files are downloaded once from Hugging Face (ggerganov/whisper.cpp) and cached under %LOCALAPPDATA%.
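The download-once behavior amounts to a cache check before any network access. The sketch below is a Python illustration under stated assumptions: the cache subfolder name `ExampleTranscriber` is hypothetical (the page only says the cache lives under %LOCALAPPDATA%), the helper names are invented for this example, and the URL follows the usual Hugging Face `resolve/main` pattern for the ggerganov/whisper.cpp repository with `ggml-<size>.bin` file names.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical folder name; the real cache location under %LOCALAPPDATA% is not documented here.
APP_CACHE_DIR = "ExampleTranscriber"
MODEL_REPO_URL = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main"


def cached_model_path(size: str, local_app_data: Optional[str] = None) -> Path:
    """Return where the model file for a given size (e.g. 'tiny', 'medium') would be cached."""
    base = local_app_data or os.environ.get("LOCALAPPDATA", str(Path.home()))
    return Path(base) / APP_CACHE_DIR / "models" / f"ggml-{size}.bin"


def download_url_if_missing(size: str, local_app_data: Optional[str] = None) -> Optional[str]:
    """Return the URL to fetch, or None when the model is already cached."""
    path = cached_model_path(size, local_app_data)
    if path.is_file():
        return None  # cache hit: no network access needed
    return f"{MODEL_REPO_URL}/{path.name}"
```

Because the check is per file, each model size triggers its own one-time download, and sizes already on disk work offline.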
Yes. The app is free. The Microsoft Store may charge a small one-time fee in some regions — that covers distribution, not the software.
Yes, after the first run. The first time you pick a Whisper model, the app downloads it from Hugging Face. After that, everything is local.
That depends on the model size and audio quality. For clean speech in a supported language, Medium and LargeV3 are close to professional transcription services. For noisy phone recordings in mixed languages, expect rough drafts.
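No. There is no server that receives your files. They are decoded, transcribed, and saved entirely on your machine. The only network connections are the Whisper model download the first time you use a given size and the single ping per installation (a random ID, the app version, and the system language).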
Available on the Microsoft Store for Windows 10 and 11. The Store build is self-contained — no .NET runtime needed.
Get it on Microsoft Store