asr2clip recognizes speech in real time, converts it to text, and automatically copies the text to the system clipboard. It uses an API service for speech recognition and Python libraries for audio capture and clipboard management.
pip install asr2clip # Install the package
asr2clip --edit # Create/edit config file
asr2clip --test # Test your configuration
asr2clip # Start recording and transcribing
Before you begin, ensure you have the following ready:
| Dependency | Purpose | Linux | macOS | Windows |
|---|---|---|---|---|
| ffmpeg | Audio format conversion | apt install ffmpeg | brew install ffmpeg | Download |
| PortAudio | Audio recording | apt install libportaudio2 | brew install portaudio | Included with sounddevice |
| Clipboard | Copy to clipboard | apt install xclip (X11) or wl-clipboard (Wayland) | Built-in | Built-in |
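To confirm the command-line dependencies are on your PATH before installing, a small check like the following may help. This is a minimal sketch, not part of asr2clip: the helper name `find_tools` is hypothetical, and the executable names (`ffmpeg`, `xclip`, and `wl-copy` from wl-clipboard) are assumptions about what each package installs.

```python
import shutil

# Assumed executable names: ffmpeg installs `ffmpeg`, xclip installs `xclip`,
# and wl-clipboard installs `wl-copy`.
CANDIDATES = {
    "ffmpeg": ["ffmpeg"],
    "clipboard": ["xclip", "wl-copy"],
}


def find_tools(candidates=CANDIDATES, which=shutil.which):
    """Map each dependency to the first executable found on PATH, or None."""
    found = {}
    for name, executables in candidates.items():
        found[name] = next((exe for exe in executables if which(exe)), None)
    return found


if __name__ == "__main__":
    for name, exe in find_tools().items():
        print(f"{name}: {exe or 'missing'}")
```

On macOS and Windows the clipboard is built in, so a "missing" clipboard entry there is expected.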
# Install using pip
pip install asr2clip
# Or install using pipx (recommended for isolated environments)
pipx install asr2clip
# Upgrade to latest version
pip install --upgrade asr2clip
git clone https://github.com/Oaklight/asr2clip.git
cd asr2clip
pip install -e .
The easiest way to configure asr2clip is using the built-in editor:
asr2clip --edit # Opens config file in your default editor
This will create a config file at ~/.config/asr2clip/config.yaml if it doesn’t exist.
The configuration file uses YAML format:
api_base_url: "https://api.openai.com/v1/" # or other compatible API base URL
api_key: "YOUR_API_KEY" # api key for the platform
model_name: "whisper-1" # or other compatible model
# quiet: false # optional, disable logging
# audio_device: "pulse" # optional, audio input device
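To illustrate how the three required fields fit together, the sketch below builds the pieces of a request to an OpenAI-compatible transcription endpoint (`audio/transcriptions` under the base URL). It is an assumption-laden illustration: the function name `build_request` is hypothetical, and how asr2clip itself calls the API is not shown here.

```python
def build_request(config):
    """Return (url, headers, form_fields) for an OpenAI-style transcription call."""
    required = ("api_base_url", "api_key", "model_name")
    missing = [key for key in required if not config.get(key)]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))

    # Tolerate a base URL written with or without a trailing slash.
    base = config["api_base_url"].rstrip("/")
    url = base + "/audio/transcriptions"
    headers = {"Authorization": "Bearer " + config["api_key"]}
    fields = {"model": config["model_name"]}
    return url, headers, fields
```

The audio file itself would be sent as a multipart form upload alongside `fields`; that part is omitted to keep the sketch free of third-party dependencies.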
Config file locations (searched in order):
1. ./asr2clip.conf (current directory)
2. ~/.config/asr2clip/config.yaml
3. ~/.config/asr2clip.conf (legacy)
4. ~/.asr2clip.conf (legacy)

Before using the tool, verify your setup:
asr2clip --test
This will check:
If the default audio device doesn’t work, list available devices and select one:
asr2clip --list_devices # List all audio input devices
asr2clip --device pulse # Use specific device
Or add to your config file:
audio_device: "pulse" # or device index like 12
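Because a device can be given either as a name or as an index, the value has to be interpreted before it reaches the audio library. The sketch below shows one plausible way to do that; `parse_device` is a hypothetical helper, not asr2clip's actual code. The sounddevice library accepts either an integer index or a name string for its `device` argument.

```python
def parse_device(value):
    """Return an int for index-like values, otherwise the device name unchanged."""
    if isinstance(value, int):
        return value
    text = str(value).strip()
    # "12" becomes the index 12; "pulse" stays a name.
    return int(text) if text.isdigit() else text
```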
asr2clip # Record until Ctrl+C, transcribe, copy to clipboard
asr2clip --vad # Continuous recording with voice detection
asr2clip -i audio.mp3 # Transcribe an audio file
usage: asr2clip [-h] [-v] [-c FILE] [-q] [-i FILE] [-o FILE] [--test]
[--list_devices] [--device DEV] [-e] [--generate_config]
[--print_config] [--vad] [--interval SEC] [--adaptive]
[--calibrate] [--silence_threshold RMS]
[--silence_duration SEC] [--no_adaptive]
Record audio and transcribe to clipboard using ASR API
options:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-c FILE, --config FILE
Path to configuration file
-q, --quiet Quiet mode - only output transcription and errors
-i FILE, --input FILE
Transcribe audio file instead of recording
-o FILE, --output FILE
Append transcripts to file
--test Test API configuration and exit
--list_devices List available audio input devices
--device DEV Audio input device (name or index)
-e, --edit Open configuration file in editor
--generate_config Create config file at ~/.config/asr2clip/config.yaml
--print_config Print template configuration to stdout
--vad Continuous recording with voice activity detection
--interval SEC Continuous recording with fixed interval (seconds)
--adaptive Adaptive threshold (default when --vad is used)
--calibrate Calibrate silence threshold from ambient noise
--silence_threshold RMS
Silence threshold (default: auto with adaptive)
--silence_duration SEC
Silence duration to trigger transcription (default: 1.5)
--no_adaptive Disable adaptive threshold (use fixed threshold)
# Single recording (press Ctrl+C to stop)
asr2clip
# Transcribe an audio file
asr2clip -i recording.mp3
# Save transcript to file
asr2clip -o transcript.txt
# Use specific audio device
asr2clip --device pulse
For long recordings like meetings or lectures, use --vad or --interval:
# Continuous with voice activity detection (auto-transcribe on silence)
asr2clip --vad -o ~/meeting.txt
# Continuous with fixed interval (transcribe every 60 seconds)
asr2clip --interval 60 -o ~/meeting.txt
# Combine VAD with max interval
asr2clip --vad --interval 120 -o ~/meeting.txt
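One way to think about combining --vad with --interval is as two independent triggers: a pause in speech, or a maximum chunk length. The sketch below encodes that reading. It is an assumption about the behavior rather than asr2clip's implementation, and the function and parameter names are hypothetical.

```python
def should_transcribe(chunk_seconds, silence_seconds, heard_speech,
                      silence_duration=1.5, max_interval=None):
    """Decide whether the audio buffered so far should be sent for transcription.

    chunk_seconds:   length of the audio buffered since the last transcription
    silence_seconds: length of the current trailing run of silence
    heard_speech:    whether any non-silent audio is in the buffer
    """
    # Fixed-interval trigger: flush once the chunk reaches the maximum length.
    if max_interval is not None and chunk_seconds >= max_interval:
        return True
    # VAD trigger: flush after speech followed by a long enough pause.
    return heard_speech and silence_seconds >= silence_duration
```

Requiring `heard_speech` avoids sending chunks that contain only silence when nobody is talking.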
In continuous mode:
Enable automatic transcription when you stop speaking:
# Auto-transcribe when silence is detected
asr2clip --vad
# Calibrate silence threshold for your environment
asr2clip --calibrate
# Use custom silence settings
asr2clip --vad --silence_threshold 0.005 --silence_duration 2.0
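The silence threshold is an RMS level, so both silence detection and calibration come down to measuring the RMS of short audio chunks. The following is a minimal sketch assuming float samples in the range [-1.0, 1.0]; the function names and the `margin` factor are hypothetical, not taken from asr2clip.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(samples, threshold=0.005):
    """Treat a chunk as silence when its RMS level is below the threshold."""
    return rms(samples) < threshold


def suggest_threshold(ambient_chunks, margin=2.0):
    """Suggest a threshold as a multiple of the average ambient RMS level."""
    if not ambient_chunks:
        raise ValueError("need at least one chunk of ambient audio")
    levels = [rms(chunk) for chunk in ambient_chunks]
    return margin * sum(levels) / len(levels)
```

Setting the threshold somewhat above the measured noise floor keeps background hum from being mistaken for speech.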
VAD options:
- --vad: Enable voice activity detection
- --adaptive: Enable adaptive threshold that adjusts to ambient noise on-the-fly
- --calibrate: Measure ambient noise and suggest threshold
- --silence_threshold: RMS threshold for silence (default: 0.01)
- --silence_duration: Seconds of silence to trigger transcription (default: 1.5)

With VAD enabled, transcription is triggered when:
Tip: Use --adaptive for automatic threshold adjustment during recording:
asr2clip --vad --adaptive
This continuously monitors ambient noise and adjusts the threshold accordingly.
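An adaptive threshold can be sketched as a running estimate of the noise floor that is updated only from quiet chunks. The class below uses an exponential moving average for that estimate. This is an illustrative assumption, not asr2clip's algorithm, and the class name and the `margin` and `alpha` parameters are hypothetical.

```python
class AdaptiveThreshold:
    """Track the ambient noise floor and derive a silence threshold from it."""

    def __init__(self, initial_noise=0.005, margin=2.0, alpha=0.1):
        self.noise = initial_noise  # current noise-floor estimate (RMS)
        self.margin = margin        # threshold = margin * noise floor
        self.alpha = alpha          # smoothing factor for the moving average

    @property
    def threshold(self):
        return self.margin * self.noise

    def update(self, level):
        """Feed one chunk's RMS level; return True if the chunk counts as silence."""
        silent = level < self.threshold
        if silent:
            # Learn only from quiet chunks so speech does not raise the floor.
            self.noise = (1 - self.alpha) * self.noise + self.alpha * level
        return silent
```

A limitation of this simple form is that a sudden, lasting rise in background noise above the current threshold is never absorbed; a fuller version would need a way to recalibrate in that case.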
| Problem | Solution |
|---|---|
| Audio not captured | Run asr2clip --list_devices and select a working device |
| Clipboard not working | Install xclip (X11) or wl-clipboard (Wayland) |
| API errors | Check your API key and endpoint in config |
| Silent audio | Try a different audio device with --device |
Run asr2clip --test to diagnose issues.
If you would like to contribute to this project, please fork the repository and submit a pull request. We welcome any improvements or new features!
This project is licensed under the GNU Affero General Public License v3.0. See the LICENSE file for details.