asr2clip

asr2clip – Speech-to-Text Clipboard Tool

PyPI version GitHub version License

中文

This tool is designed to recognize speech in real-time, convert it to text, and automatically copy the text to the system clipboard. The tool leverages API services for speech recognition and uses Python libraries for audio capture and clipboard management.

TL;DR

pip install asr2clip       # Install the package
asr2clip --edit            # Create/edit config file
asr2clip --test            # Test your configuration
asr2clip                   # Start recording and transcribing

Prerequisites

Before you begin, ensure you have the following ready:

System Dependencies

Dependency Purpose Linux macOS Windows
ffmpeg Audio format conversion apt install ffmpeg brew install ffmpeg Download
PortAudio Audio recording apt install libportaudio2 brew install portaudio Included with sounddevice
Clipboard Copy to clipboard apt install xclip (X11) or wl-clipboard (Wayland) Built-in Built-in

Installation

# Install using pip
pip install asr2clip

# Or install using pipx (recommended for isolated environments)
pipx install asr2clip

# Upgrade to latest version
pip install --upgrade asr2clip

Option 2: Install from source

git clone https://github.com/Oaklight/asr2clip.git
cd asr2clip
pip install -e .

Configuration

Quick Setup

The easiest way to configure asr2clip is using the built-in editor:

asr2clip --edit  # Opens config file in your default editor

This will create a config file at ~/.config/asr2clip/config.yaml if it doesn’t exist.

Configuration File

The configuration file uses YAML format:

api_base_url: "https://api.openai.com/v1/"  # or other compatible API base URL
api_key: "YOUR_API_KEY"                     # api key for the platform
model_name: "whisper-1"                     # or other compatible model
# quiet: false                              # optional, disable logging
# audio_device: "pulse"                     # optional, audio input device

Config file locations (searched in order):

  1. ./asr2clip.conf (current directory)
  2. ~/.config/asr2clip/config.yaml
  3. ~/.config/asr2clip.conf (legacy)
  4. ~/.asr2clip.conf (legacy)

Test Your Configuration

Before using the tool, verify your setup:

asr2clip --test

This will check:

Audio Device Selection

If the default audio device doesn’t work, list available devices and select one:

asr2clip --list_devices    # List all audio input devices
asr2clip --device pulse    # Use specific device

Or add to your config file:

audio_device: "pulse"  # or device index like 12

Usage

Basic Usage

asr2clip                   # Record until Ctrl+C, transcribe, copy to clipboard
asr2clip --vad             # Continuous recording with voice detection
asr2clip -i audio.mp3      # Transcribe an audio file

CLI Options

usage: asr2clip [-h] [-v] [-c FILE] [-q] [-i FILE] [-o FILE] [--test]
                [--list_devices] [--device DEV] [-e] [--generate_config]
                [--print_config] [--vad] [--interval SEC] [--adaptive]
                [--calibrate] [--silence_threshold RMS]
                [--silence_duration SEC] [--no_adaptive]

Record audio and transcribe to clipboard using ASR API

options:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -c FILE, --config FILE
                        Path to configuration file
  -q, --quiet           Quiet mode - only output transcription and errors
  -i FILE, --input FILE
                        Transcribe audio file instead of recording
  -o FILE, --output FILE
                        Append transcripts to file
  --test                Test API configuration and exit
  --list_devices        List available audio input devices
  --device DEV          Audio input device (name or index)
  -e, --edit            Open configuration file in editor
  --generate_config     Create config file at ~/.config/asr2clip/config.yaml
  --print_config        Print template configuration to stdout
  --vad                 Continuous recording with voice activity detection
  --interval SEC        Continuous recording with fixed interval (seconds)
  --adaptive            Adaptive threshold (default when --vad is used)
  --calibrate           Calibrate silence threshold from ambient noise
  --silence_threshold RMS
                        Silence threshold (default: auto with adaptive)
  --silence_duration SEC
                        Silence duration to trigger transcription (default: 1.5)
  --no_adaptive         Disable adaptive threshold (use fixed threshold)

Examples

# Single recording (press Ctrl+C to stop)
asr2clip

# Transcribe an audio file
asr2clip -i recording.mp3

# Save transcript to file
asr2clip -o transcript.txt

# Use specific audio device
asr2clip --device pulse

Continuous Recording Mode

For long recordings like meetings or lectures, use --vad or --interval:

# Continuous with voice activity detection (auto-transcribe on silence)
asr2clip --vad -o ~/meeting.txt

# Continuous with fixed interval (transcribe every 60 seconds)
asr2clip --interval 60 -o ~/meeting.txt

# Combine VAD with max interval
asr2clip --vad --interval 120 -o ~/meeting.txt

In continuous mode:

Voice Activity Detection (VAD)

Enable automatic transcription when you stop speaking:

# Auto-transcribe when silence is detected
asr2clip --daemon --vad

# Calibrate silence threshold for your environment
asr2clip --calibrate

# Use custom silence settings
asr2clip --daemon --vad --silence_threshold 0.005 --silence_duration 2.0

VAD options:

With VAD enabled, transcription is triggered when:

  1. Speech is detected (audio above threshold)
  2. Followed by silence (audio below threshold for the specified duration)

Tip: Use --adaptive for automatic threshold adjustment during recording:

asr2clip --daemon --vad --adaptive

This continuously monitors ambient noise and adjusts the threshold accordingly.

Troubleshooting

Problem Solution
Audio not captured Run asr2clip --list_devices and select a working device
Clipboard not working Install xclip (X11) or wl-clipboard (Wayland)
API errors Check your API key and endpoint in config
Silent audio Try a different audio device with --device

Run asr2clip --test to diagnose issues.

Contributing

If you would like to contribute to this project, please fork the repository and submit a pull request. We welcome any improvements or new features!

License

This project is licensed under the GNU Affero General Public License v3.0. See the LICENSE file for details.