In addition to live recording, HyperWhisper can transcribe audio files you already have on disk. The flow is the same on both platforms — pick a file from the menu, watch a progress popup, and the result lands in your History — but the supported formats and provider limits differ. Pick your platform below.
macOS
Open the file picker
Click the HyperWhisper menu bar icon, hover over Transcribe File, and choose the mode you want to use. A standard macOS file picker opens immediately.
Each mode in your library shows up as a submenu item, so you can transcribe with Hyper, Voice to text, Meeting, or any custom mode without changing your default first.
HyperWhisper accepts most common audio containers, plus the two main video containers. For video files, the audio track is extracted locally before transcription.
| Type | Extensions |
|---|---|
| Audio | .wav, .mp3, .m4a, .aiff, .webm, .ogg, .flac |
| Video (audio extracted) | .mp4, .mov, .m4v |
Cloud providers each support a different subset of audio formats. If you select a format the provider does not accept, HyperWhisper catches it before upload and tells you which formats that provider supports — so you do not have to wait for a cryptic API error after a long upload.
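
As a rough illustration, the extension table above can be treated as a lookup that runs before any other work. The sketch below is not HyperWhisper's code: the function name `classify_import` and the set names are hypothetical, and only the extensions themselves come from this page.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the macOS table on this page.
MACOS_AUDIO = {".wav", ".mp3", ".m4a", ".aiff", ".webm", ".ogg", ".flac"}
MACOS_VIDEO = {".mp4", ".mov", ".m4v"}


def classify_import(path: str) -> Optional[str]:
    """Return "audio", "video", or None for an unsupported extension."""
    ext = Path(path).suffix.lower()  # case-insensitive match on the suffix
    if ext in MACOS_AUDIO:
        return "audio"
    if ext in MACOS_VIDEO:
        return "video"  # the audio track would be extracted locally first
    return None
```

A per-provider check would need each provider's accepted formats, which this page does not list, so the sketch stops at the app-level table.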
File size limits
Local models have no file size limit. Cloud providers each have their own cap, enforced by HyperWhisper before upload:
| Provider | Max file size |
|---|---|
| LibWhisper / Parakeet (local) | No limit |
| HyperWhisper Cloud | 2 GB |
| Deepgram | 2 GB |
| AssemblyAI | 2.2 GB |
| ElevenLabs | 3 GB |
| Fireworks AI | 1 GB |
| Mistral | 100 MB |
| OpenAI | 25 MB |
| Groq | 25 MB |
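
To make the pre-upload check concrete, here is a minimal sketch that treats the table above as a lookup. It is illustrative only: `MAX_BYTES` and `check_size` are hypothetical names, the message wording is invented, and the code assumes decimal units (1 GB = 10^9 bytes) because this page does not say whether the limits are GB or GiB.

```python
from typing import Optional

MB = 1000 ** 2  # assumption: decimal megabytes
GB = 1000 ** 3  # assumption: decimal gigabytes

# Limits copied from the table above; None means no limit (local models).
MAX_BYTES = {
    "Local": None,
    "HyperWhisper Cloud": 2 * GB,
    "Deepgram": 2 * GB,
    "AssemblyAI": 2200 * MB,
    "ElevenLabs": 3 * GB,
    "Fireworks AI": 1 * GB,
    "Mistral": 100 * MB,
    "OpenAI": 25 * MB,
    "Groq": 25 * MB,
}


def check_size(provider: str, size_bytes: int) -> Optional[str]:
    """Return None if the file fits, otherwise a human-readable error."""
    limit = MAX_BYTES[provider]
    if limit is None or size_bytes <= limit:
        return None
    return (
        f"File is {size_bytes / MB:.1f} MB, but {provider} "
        f"accepts at most {limit / MB:.0f} MB."
    )
```

For example, `check_size("OpenAI", 30 * MB)` returns a message naming the file size, the limit, and the provider, while `check_size("Local", 10 * GB)` returns `None`.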
If the file is too large for the selected mode’s provider, you get a friendly error showing the file size, the provider’s limit, and the provider name. Switch the mode to a different provider (for example, HyperWhisper Cloud or a local model) to transcribe larger files.
What happens during transcription
A floating progress popup appears as soon as you pick the file. It walks through three stages:
Preparing (0–15%)
HyperWhisper validates the file size and format, copies the file into your recordings folder, extracts the audio track if it is a video, and runs VAD silence trimming if you have it enabled and the file is at least 30 seconds long.
Transcribing (15–85%)
The audio is sent to the local model or cloud provider configured by your mode. The progress bar animates while the provider works.
Finishing (85–100%)
Post-processing rules from the mode (formatting, vocabulary, custom prompt) are applied, the transcript is saved, and the main window jumps to History so you can copy or edit the result.
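
One way to picture the three stage ranges is as a mapping from each stage's own progress onto the single bar. The sketch below is an assumption about how such a mapping could work, not a description of the app's internals; `STAGES` and `overall_progress` are hypothetical names, and only the percentage ranges come from this page.

```python
# Stage ranges copied from this page: (start %, end %).
STAGES = {
    "preparing": (0, 15),
    "transcribing": (15, 85),
    "finishing": (85, 100),
}


def overall_progress(stage: str, fraction: float) -> float:
    """Map a stage's own 0.0-1.0 progress onto the overall 0-100 bar."""
    start, end = STAGES[stage]
    fraction = min(max(fraction, 0.0), 1.0)  # clamp out-of-range input
    return start + (end - start) * fraction
```

Halfway through the transcribing stage, `overall_progress("transcribing", 0.5)` lands at 50, the midpoint of the 15–85 range.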
You can cancel at any point with the Cancel button on the popup. If you cancel, the copied file is cleaned up and no transcript is saved.
VAD trimming
If you have Voice Activity Detection enabled in settings and the imported file is 30 seconds or longer, HyperWhisper trims leading and trailing silence before sending it to the provider. The trimmed version is what gets transcribed, but the original audio is preserved, and you can toggle between the two from the History view.
Windows
Open the file picker
Right-click the HyperWhisper tray icon and choose Transcribe Audio File. A standard Windows file dialog opens. Unlike macOS, Windows uses your currently selected mode, so change the mode from the tray menu first if you want to transcribe with something other than the default.
The Windows app currently supports three audio formats:
| Type | Extensions |
|---|---|
| Audio | .wav, .mp3, .m4a |
Video files (.mp4, .mov) are not yet supported on Windows — extract the audio track first with a tool like ffmpeg if you need to transcribe a video.
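
If you script the extraction, the ffmpeg call can be built as below. This is a sketch under assumptions: it presumes `ffmpeg` is installed and on your PATH, `ffmpeg_extract_cmd` is a hypothetical helper name, and WAV is chosen as the output only because it is one of the three formats listed above.

```python
from pathlib import Path
from typing import List


def ffmpeg_extract_cmd(video: str) -> List[str]:
    """Build an ffmpeg command that writes the audio track to a WAV file."""
    out = str(Path(video).with_suffix(".wav"))  # clip.mp4 -> clip.wav
    # -i names the input file; -vn drops the video stream so only audio is written.
    return ["ffmpeg", "-i", video, "-vn", out]
```

Pass the returned list to `subprocess.run(cmd, check=True)` to run it, then import the resulting `.wav` file through Transcribe Audio File.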
File size limits
File size caps follow the provider attached to the selected mode (same as macOS):
| Provider | Max file size |
|---|---|
| WhisperNet (local) | No limit |
| HyperWhisper Cloud | 2 GB |
| Deepgram | 2 GB |
| AssemblyAI | 2.2 GB |
| ElevenLabs | 3 GB |
| Fireworks AI | 1 GB |
| Mistral | 100 MB |
| OpenAI | 25 MB |
| Groq | 25 MB |
If the file exceeds the limit, you get a toast notification with the maximum size for that provider before any upload happens.
What happens during transcription
A progress window appears with the file name and a cancel button:
Preparing (0–15%)
HyperWhisper validates the file. If your mode uses a local model (WhisperNet), the file is converted to a 16 kHz mono WAV in your temp folder — this is the format Whisper expects, and HyperWhisper handles resampling and stereo-to-mono conversion automatically. If your mode uses a cloud provider, the original file is sent as-is, so MP3 and M4A are uploaded without re-encoding.
Transcribing (15–85%)
The audio is sent to WhisperNet (with DirectCompute GPU acceleration if available) or to the cloud provider configured by your mode.
Finishing (85–100%)
Post-processing rules from the mode are applied, the transcript is saved to History, and the main window updates to show the result. The transcribed text is also smart-pasted or copied to your clipboard depending on your settings.
Press Cancel at any time to abort. The progress window closes and no transcript is saved.
The local WhisperNet engine requires 16 kHz mono WAV input. If you import an MP3, an M4A, or a WAV at a different sample rate or channel count, HyperWhisper converts it on the fly using NAudio:
- Reads the source file with AudioFileReader (handles MP3 / M4A via Windows Media Foundation).
- Downmixes stereo to mono if needed.
- Resamples to 16 kHz with WdlResamplingSampleProvider.
- Writes a temporary 16 kHz mono WAV used only for that transcription.
The original file you imported is preserved in your History — only the temp WAV used for the model is converted.
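
To show what the downmix and resample steps do to the samples, here is a simplified sketch in plain Python. It is not the NAudio pipeline: it swaps WdlResamplingSampleProvider for naive linear interpolation with no anti-aliasing filter, works on an in-memory list of float samples rather than a file, and both function names are hypothetical.

```python
from typing import List, Sequence


def downmix_to_mono(interleaved: Sequence[float], channels: int) -> List[float]:
    """Average each frame's channels into a single mono sample."""
    if channels == 1:
        return list(interleaved)
    return [
        sum(interleaved[i:i + channels]) / channels
        for i in range(0, len(interleaved) - channels + 1, channels)
    ]


def resample_linear(
    samples: Sequence[float], src_rate: int, dst_rate: int = 16000
) -> List[float]:
    """Resample by linear interpolation (naive: no anti-aliasing filter)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for j in range(n_out):
        pos = j * src_rate / dst_rate  # fractional index into the source
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]  # clamp at the last sample
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

For instance, one second of 48 kHz stereo audio (96,000 interleaved values) becomes 48,000 mono samples after `downmix_to_mono` and 16,000 samples after `resample_linear`. A production resampler filters before decimating to avoid aliasing, which this sketch deliberately leaves out.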
After transcription
On both platforms the result appears in History with the original audio attached. From there you can:
- Re-copy the text or post-processed version to your clipboard
- Re-run a different mode against the same file
- Edit the transcript inline
- Delete the entry along with the saved audio
If a transcription fails partway through, the entry is still created in History with an error so you can retry without re-importing the file.