The official installation and getting started instructions are available at talonvoice.com/docs. We strongly recommend you follow the instructions there; this page contains additional advice only. The basic installation flow is: install Talon, choose a speech recognition engine, then install a set of configuration scripts.
Talon has two mostly compatible current versions: public and beta. Both versions have support for Mac, Linux, and Windows.
The beta is provided by @aegis, the developer of Talon. Download links can be found in the pinned messages of the #beta channel on Slack.
On Linux, when you first run run.sh, Talon does some setup work related to permissions for the eye tracking device. Afterwards, you need to replug the eye tracking device and restart Talon (or simply reboot).
If you use GNOME, you need to install the AppIndicator and KStatusNotifierItem Support extension to see Talon’s tray icon, which is the only way of configuring Talon without speech or code.
You might encounter the following error:
ERROR cannot get _NET_CURRENT_DESKTOP
In that case, you need to switch to Xorg. (Your distro might offer this through a cog wheel on the login screen.)
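To check which display server your current session uses before logging out, you can read the XDG_SESSION_TYPE environment variable, which most Linux login managers set to x11 or wayland. The following is a minimal sketch, assuming that variable is set on your system; session_advice is an illustrative helper, not part of Talon.

```python
import os


def session_advice(session_type):
    """Return a short hint for a given XDG_SESSION_TYPE value.

    Illustrative helper only; Talon does not provide this function.
    """
    kind = (session_type or "").strip().lower()
    if kind == "x11":
        return "x11: this session already runs on Xorg"
    if kind == "wayland":
        return "wayland: choose an Xorg session at the login screen"
    return "unknown: XDG_SESSION_TYPE is not set to x11 or wayland"


if __name__ == "__main__":
    # Assumption: the login manager exported XDG_SESSION_TYPE.
    print(session_advice(os.environ.get("XDG_SESSION_TYPE")))
```

If the script reports wayland, log out and pick the Xorg session as described above, then start Talon again.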
Talon uses a speech recognition engine that translates voice audio to text. There are multiple options for speech engines, and you will need to choose one. If you are starting out and do not already have Dragon, wav2letter (w2l conformer) is recommended.

| Engine | Platform | Description | Install | Cost |
|---|---|---|---|---|
| w2l conformer | Win/Mac/Linux | Best option for new users. Excellent accuracy and speed for both commands and dictation. Even lower latency for Talon beta users due to ongoing performance optimisations. | Talon Docs | Free |
| w2l gen2 | Win/Mac/Linux | Speech engine used prior to conformer. Decent command accuracy. Dictation accuracy is lacking. | Talon Docs | Free |
| Dragon | Win | Good accuracy for both commands and dictation. Has quirks which can’t be fixed by us. Professional version is recommended over home version (home version doesn’t have command mode). | Buy and Install Dragon Professional | $300-$500 |

Note: The Professional version of Dragon for Windows is recommended (but not strictly required) because it can be run in Command Mode. Some users have been able to find less expensive copies of Dragon by either waiting for a sale or looking on eBay for older versions.
As of March 2021, w2l only supports English. If you need to dictate text in another language, the Talon beta supports the following options:

| Engine | Platform | Description | Install | Requirements |
|---|---|---|---|---|
| webspeech | Win/Mac/Linux | Excellent accuracy, but added latency. Uses your browser as a voice engine; requires an internet connection. Supports many non-English languages. | See pinned messages in #beta on Slack | Needs Talon Beta |
| vosk | Win/Mac/Linux | Supported languages: https://alphacephei.com/vosk/. | See Github Project | Needs Talon Beta |

Note that you cannot use webspeech or vosk on their own: they handle dictation only, not commands, so you need to pair them with a speech recognition engine that handles command mode.
Note: The Mac Voice Control engine is technically supported for dictation in beta, but it’s not recommended over conformer.
Talon does not come with voice commands out of the box; you must install some configuration scripts. To start out, we strongly recommend that you use the knausj_talon repository. The whole wiki assumes this repository is used unless otherwise noted.
On Mac or Linux:

    mkdir -p ~/.talon/user
    cd ~/.talon/user
    git clone https://github.com/knausj85/knausj_talon.git knausj_talon

On Windows:

    md "%APPDATA%\Talon\user"
    cd "%APPDATA%\Talon\user"
    git clone https://github.com/knausj85/knausj_talon.git knausj_talon

If you don’t have git available and do not want to install it, download the zip archive of knausj_talon and extract it into the Talon user folder (~/.talon/user on Mac or Linux, %APPDATA%\Talon\user on Windows).
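The zip route can also be scripted. The sketch below assumes you have already downloaded the archive yourself and pass its path on the command line; talon_user_dir and install_from_zip are illustrative names, not part of Talon or knausj_talon. It relies on Talon loading scripts from any subfolder of the user folder, so the name of the extracted top-level folder should not matter.

```python
import os
import sys
import zipfile
from pathlib import Path


def talon_user_dir(platform=sys.platform, env=os.environ):
    """Return the Talon user folder used by the git commands above."""
    if platform.startswith("win"):
        return Path(env["APPDATA"]) / "Talon" / "user"
    return Path.home() / ".talon" / "user"


def install_from_zip(zip_path, user_dir):
    """Extract an already-downloaded archive into the Talon user folder.

    Returns the sorted top-level names found in the archive.
    """
    user_dir = Path(user_dir)
    user_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(user_dir)
        top_level = {name.split("/")[0] for name in archive.namelist()}
    return sorted(top_level)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python install_zip.py knausj_talon.zip
    print(install_from_zip(sys.argv[1], talon_user_dir()))
```

Restart Talon after extracting if the new commands are not picked up.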
The examples below are a small selection of common commands for working with apps, tabs, media, the mouse, and so on, which should help you be productive with Talon right away. They are based on the knausj_talon repository (see Configuration Setup) and may vary depending on your individual setup.
knausj_talon has integrated help. It can show you a list of all defined commands, or only the commands that are currently available.
Talon has three basic modes by default: command, dictation, and sleep.
In command mode, your speech is interpreted as commands by default. In dictation mode, your speech is transcribed as plain text by default (with a few commands still available, such as “comma” for punctuation), similar to traditional speech recognition systems. In sleep mode, Talon does nothing until it hears a command that wakes it up.
There are currently no visual cues about the current mode. You can tell which mode you’re in by saying a command and seeing whether it is executed or transcribed literally.
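One way to picture the three modes is as a small state machine. The following is a toy model in Python, not Talon's implementation; the switching phrases in it are placeholders chosen for illustration and may not match the commands in your configuration.

```python
# Toy model of the three modes; not how Talon is implemented.
# The phrases below are placeholders, not confirmed Talon commands.
SWITCHES = {
    "command mode": "command",
    "dictation mode": "dictation",
    "go to sleep": "sleep",
}
WAKE_PHRASE = "wake up"


def hear(mode, phrase):
    """Return (new_mode, effect) for one spoken phrase."""
    if mode == "sleep":
        # Asleep: only the wake phrase has any effect.
        if phrase == WAKE_PHRASE:
            return "command", "woke up"
        return "sleep", "ignored"
    if phrase in SWITCHES:
        return SWITCHES[phrase], "switched mode"
    if mode == "dictation":
        # Dictation: ordinary speech is typed out as text.
        return mode, f"typed: {phrase}"
    # Command mode: speech is treated as a command.
    return mode, f"ran command: {phrase}"


if __name__ == "__main__":
    for mode in ("command", "dictation", "sleep"):
        print(mode, "->", hear(mode, "tab close"))
```

The same phrase is run as a command in one mode, typed out in another, and ignored while asleep, which is the behaviour you can use to work out the current mode.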
window new
window next
window last
window close

focus "app name" (say "focus chrome" for example, to switch active window to chrome)
running list (see all active applications)
running hide (close the list of active applications)

If you are on Ubuntu or another GNOME-based Linux distribution, focus might not work consistently across different workspaces, popping up a notification rather than actually switching focus. This extension may help.

tab (open | new)
tab last
tab next
tab close
tab (reopen | restore)
go tab <number>
go tab final

mute
play next
play previous
(play | pause)

control mouse (say "control mouse" to toggle on/off Tobii moving the mouse)
run calibration (say "run calibration" to start Tobii calibration)

copy that
cut that
paste that
undo that
redo that

dubclick (to double left click)
righty (to right click)
(page | scroll) up
(page | scroll) [down]
wheel down
wheel tiny [down]
wheel downer
wheel up
wheel tiny up
wheel upper
wheel gaze (for scrolling down) (this seems like it would use the Tobii eye tracker but it does not)
wheel stop
wheel left
wheel tiny left
wheel right
wheel tiny right
curse yes (shows cursor)
curse no (hides cursor)
drag

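The command lists above use Talon's rule notation: (a | b) means say exactly one of the alternatives, [x] means the word is optional, and <number> stands for a value you speak. Longer parenthesised remarks such as (to right click) are explanations, not part of the command. As a rough illustration, the sketch below expands a rule into the phrases it accepts. It is a simplified helper written for this page, not Talon's parser, and it does not handle nested groups.

```python
import itertools
import re

# Matches one (a | b) group or one [optional] group, without nesting.
GROUP = re.compile(r"\(([^()]*)\)|\[([^\[\]]*)\]")


def expand(rule):
    """List every phrase matched by a rule with (a | b) and [optional] parts."""
    parts = []  # each entry holds the choices for one slot of the rule
    pos = 0
    for match in GROUP.finditer(rule):
        literal = rule[pos:match.start()].strip()
        if literal:
            parts.append([literal])
        if match.group(1) is not None:
            # (a | b): exactly one alternative is spoken
            parts.append([alt.strip() for alt in match.group(1).split("|")])
        else:
            # [x]: the word may be spoken or left out
            parts.append(["", match.group(2).strip()])
        pos = match.end()
    tail = rule[pos:].strip()
    if tail:
        parts.append([tail])
    return [" ".join(word for word in combo if word)
            for combo in itertools.product(*parts)]


if __name__ == "__main__":
    for rule in ("tab (open | new)", "(page | scroll) [down]", "go tab <number>"):
        print(rule, "->", expand(rule))
```

For example, expand("(page | scroll) [down]") returns ['page', 'page down', 'scroll', 'scroll down'], so saying just "page" scrolls down in this configuration.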
Once the basics are working reasonably well for you, you’ll likely want to improve your experience using Talon: