Simon Willison shares a practical recipe for transcribing audio files on macOS using the Gemma 4 E2B model with MLX and mlx-vlm. The recipe runs through a simple command-line tool and produces a reasonably accurate transcript with a few minor errors. It offers an accessible way to experiment with local speech-to-text using open-source tools.
Background
Gemma is Google's family of open-weight language models, while MLX is Apple's open-source machine learning framework designed to run efficiently on Apple silicon. Transcribing audio with a locally run model is an emerging use case for on-device AI.
- Source: Simon Willison
- Published: Apr 13, 2026 at 07:57 AM
- Score: 5.0 / 10