Our first goal is a working Talon installation and configuration, so that when you say “phrase hello world” out loud, Talon literally types “hello world” into the currently focused window. To get there using the public version of Talon, you need to follow a few steps:
If you would rather use the beta version, which comes with earlier access to new features and higher priority support, then instead do the following:
Ask @aegis for access to the #beta channel.
Below, we describe the difference between public and beta, and what exactly happens at each step.
Talon has two mostly compatible current versions: public and beta. The public release is free, while the beta version requires a $25/month subscription to the developer’s Patreon. The beta version has earlier access to new features and higher priority support. For example, as of September 2020, the beta version supports:
sconv-b5, a more powerful speech recognition model based on neural nets; and
webspeech, the option to use your browser’s speech recognition engine for dictation.
There is also a legacy version of Talon for Mac only that is no longer actively developed, but still functioning. For those considering upgrading from legacy, the new version has a new configuration syntax, supports more operating systems, and has integration with the new wav2letter voice engine, an alternative to Dragon created because Dragon was discontinued for Mac. However, the new API is not backwards-compatible, so you will have to change your configuration files.
To get access to the beta: after joining the beta tier and the Talon Voice Slack, write a message to @aegis requesting access to the #beta channel. He is the developer of Talon, and will grant you access as soon as he’s back at work on Talon.
Click the appropriate download link for your OS on talonvoice.com. (For the beta, find the latest download for your OS in the #beta channel’s pinned messages.) Once the download finishes:
macOS: Open the .dmg and drag and drop it to your Applications.
Linux: Extract the .tar.xz file to a directory of your choosing, for instance ~/bin/. This will make Talon available for starting via
Windows: C:\Program Files\Talon; you can start Talon via C:\Program Files\Talon\talon.exe. Alternatively, if you want to use Dragon (or continue to use it), follow the instructions on Installing Dragon and Setup Talon on Windows 10 with Dragon. (TODO: This doesn’t seem to be up to date with the public version, where the default download is just a single executable.)
We still need to set up a speech recognition engine and a Talon configuration before we are ready to go!
Talon does not come with voice commands or eye tracking out of the box; you must install some configuration scripts into your ~/.talon/user directory (that is, C:\Users\<username>\AppData\Roaming\talon\user on Windows). To start out, we strongly recommend that you use knausj85’s knausj_talon. The whole wiki assumes this repository is used, unless otherwise noted.
On macOS or Linux:

    mkdir -p ~/.talon/user
    cd ~/.talon/user
    git clone https://github.com/knausj85/knausj_talon.git knausj_talon

On Windows:

    md "%APPDATA%\Talon\user"
    cd "%APPDATA%\Talon\user"
    git clone https://github.com/knausj85/knausj_talon.git knausj_talon
If you don’t have git available and do not want to install it, download the zip archive of knausj_talon and extract it to the correct folder.
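The folder locations above can be sketched as a small helper. This is only an illustration and not part of Talon: the function names `talon_user_dir` and `config_installed` are hypothetical, and the paths are simply the ones quoted in the text.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath


def talon_user_dir(system, home, appdata=None):
    """Return the Talon user directory described in the text, as a string.

    system  -- OS name as reported by platform.system(), e.g. "Windows"
    home    -- the user's home directory (used on macOS and Linux)
    appdata -- the value of %APPDATA% (used on Windows only)
    """
    if system == "Windows":
        if appdata is None:
            raise ValueError("appdata is required on Windows")
        # C:\Users\<username>\AppData\Roaming\talon\user
        return str(PureWindowsPath(appdata) / "talon" / "user")
    # ~/.talon/user on macOS and Linux
    return str(PurePosixPath(home) / ".talon" / "user")


def config_installed(user_dir, repo_name="knausj_talon"):
    """Return True if the cloned or extracted repository folder exists."""
    return os.path.isdir(os.path.join(user_dir, repo_name))
```

On your own machine you could call it as `talon_user_dir(platform.system(), os.path.expanduser("~"), os.environ.get("APPDATA"))` (after `import platform`) and pass the result to `config_installed` to confirm that knausj_talon ended up in the right folder.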
One final step before we can test out Talon.
Talon uses a speech recognition engine that translates voice audio to text. There are multiple options for speech engines, and you will need to choose one. When starting out: if you are already using Dragon, you might want to keep using it. Otherwise, Talon’s own engine, wav2letter, is recommended.
Dragon setup: If you have Dragon installed, ensure that it is running and actively listening to your microphone. Talon will automatically recognize Dragon and use it.
Now start or restart Talon. Talon’s icon (TODO: show what it looks like) should show up in the tray area. (If not, recheck the installation, and if all seems in order, ask for help in #help.) Then say “help alphabet” and “help close”. That should open and close a window showing you Talon’s spelling alphabet. Or open any text editor of your liking, and say “phrase hello world”. Talon should type hello world into the text editor. You can also try saying
If the voice commands do nothing, the culprit could be the microphone setting. Right-click Talon’s icon to open a menu, and check under “Microphone” that the correct mic is selected. Make sure the microphone is not muted, and that the gain (or volume slider) of the mic is not too low.
Should that not help, check out Troubleshooting, and ask for help in #help.
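The checks above can be read as an ordered checklist. The sketch below only illustrates that order; `next_check` and its boolean parameters are hypothetical names, and you answer them yourself by looking at the tray icon and the “Microphone” menu, since the function does not query Talon in any way.

```python
def next_check(icon_visible, correct_mic, mic_muted, gain_too_low):
    """Return the next thing to try when voice commands do nothing.

    All four arguments are booleans that you fill in by hand.
    """
    if not icon_visible:
        return "recheck the installation"
    if not correct_mic:
        return "select the correct mic in Talon's Microphone menu"
    if mic_muted:
        return "unmute the microphone"
    if gain_too_low:
        return "raise the microphone gain"
    # Everything on the list looks fine, so escalate.
    return "check Troubleshooting and ask in #help"
```

For example, `next_check(True, True, True, False)` points at the muted microphone before anything else is tried.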
TODO: per-OS guide plus dragon specifics on how to check for correct microphone.
The examples below are just a very small selection of common commands for working with apps, tabs, media, mouse, etc. that should help you be productive with Talon right away. These are based on knausj85’s knausj_talon repository (see Configuration Setup). These commands may vary depending on your individual setup.
knausj_talon has integrated help. It can show you a list of all defined commands, or just the commands that are currently available.
Talon has three basic modes by default: command, dictation, and sleep.
In command mode, your speech is interpreted as commands by default. In dictation mode, your speech is transcribed as plain text by default (although some commands, such as “comma” for punctuation, still apply), similar to traditional speech recognition systems. In sleep mode, Talon does nothing until it hears a command that wakes it up.
There are currently no visual cues about the current mode. You can tell which mode you’re in by running commands and seeing if they are transcribed literally.
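To make the three modes concrete, here is a toy model of how one utterance might be handled in each of them. It is a simplified sketch, not Talon’s implementation: the function `handle`, the `PUNCTUATION` table, and the wake phrase `"wake up"` are hypothetical placeholders, and returning to command mode after waking is an assumption made for the example.

```python
PUNCTUATION = {"comma": ","}  # illustrative subset only


def handle(mode, utterance, wake_phrase="wake up"):
    """Return (new_mode, typed_text) for one spoken utterance."""
    words = utterance.split()
    if mode == "sleep":
        # Ignore everything except the (placeholder) wake phrase.
        if utterance == wake_phrase:
            return ("command", "")
        return ("sleep", "")
    if mode == "dictation":
        # Transcribe literally, but turn punctuation words into symbols.
        text = ""
        for word in words:
            if word in PUNCTUATION:
                text += PUNCTUATION[word]
            else:
                text += (" " if text else "") + word
        return ("dictation", text)
    if mode == "command":
        # "phrase ..." types the rest; other commands type nothing here.
        if words[:1] == ["phrase"]:
            return ("command", " ".join(words[1:]))
        return ("command", "")
    raise ValueError("unknown mode: " + mode)
```

In this model, “phrase hello world” types `hello world` in command mode, comes out as the literal text `phrase hello world` in dictation mode, and is ignored in sleep mode, which is the same test the paragraph above suggests for telling the modes apart.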
window new
window next
window last
window close
focus "app name" (say "focus chrome" for example, to switch active window to chrome)
running list (see all active applications)
running hide (close the list of active applications)
If you are on Ubuntu or another GNOME-based Linux distribution, focus might not work consistently across different workspaces, popping up a notification rather than actually switching focus. This extension may help.
tab last
tab next
tab close
tab reopen
(page | scroll) up
(page | scroll) [down]
mute
play next
play previous
(play | pause)
control mouse (say "control mouse" to toggle on/off Tobii moving the mouse)
run calibration (say "run calibration" to start Tobii calibration)
copy that
cut that
paste that
dubclick (to double left click)
righty (to right click)
(page | scroll) up
(page | scroll) [down]
wheel down
wheel tiny [down]
wheel downer
wheel up
wheel tiny up
wheel upper
wheel gaze (for scrolling down) (this seems like it would use the Tobii eye tracker but it does not)
wheel stop
wheel left
wheel tiny left
wheel right
wheel tiny right
curse yes (shows cursor)
curse no (hides cursor)
drag
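Several entries in the command lists use a compact notation, such as `(page | scroll) up` or `wheel tiny [down]`. Assuming the usual reading, where parentheses with `|` list alternatives and square brackets mark an optional word, the following sketch expands such an entry into the phrases you can actually say. The helper `expand` is a hypothetical name written for this page; it handles only flat, non-nested groups and is not how Talon itself parses commands.

```python
import itertools
import re
from typing import List


def expand(rule: str) -> List[str]:
    """Expand a flat command pattern into every phrase it matches."""
    # Tokens are "(a | b)" groups, "[optional]" groups, or plain words.
    tokens = re.findall(r"\([^()]*\)|\[[^\[\]]*\]|[^\s()\[\]]+", rule)
    choices = []
    for token in tokens:
        if token.startswith("("):
            # Alternatives: exactly one of them is spoken.
            choices.append([alt.strip() for alt in token[1:-1].split("|")])
        elif token.startswith("["):
            # Optional: either nothing or the bracketed text.
            choices.append(["", token[1:-1].strip()])
        else:
            choices.append([token])
    return [
        " ".join(part for part in combo if part)
        for combo in itertools.product(*choices)
    ]
```

For instance, `expand("(page | scroll) up")` gives `["page up", "scroll up"]`, and `expand("wheel tiny [down]")` gives `["wheel tiny", "wheel tiny down"]`.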
Once the basics somewhat work for you, you’ll likely want to improve your experience using Talon: