Shows about tearing down the barriers for our fellow hackers.
In order to meet basic accessibility standards, I need text alternatives to the audio of my online video lectures for my music appreciation class. I have a transcription tool called Dragon Dictate that can do most of the heavy lifting of producing a raw transcript of the audio, but the transcript it generates needs a lot of attention in terms of correction, capitalization, and punctuation. It also needs to have all of the text separated into logical paragraphs, and it really helps to have proper section headings.
There are 20 lectures in all, and I have finished doing 11 of them, but I still have nine to go and no time to do it. I had an idea to crowdsource this effort by giving extra-credit points to my students for doing little bits of it at a time. They get one extra-credit point for every one minute of lecture that they correct.
I got the idea for this from the Distributed Proofreaders project, where volunteers help correct mistakes in the OCR scans of public-domain books before the books are posted on a site like Project Gutenberg. So far I've gotten about 30 minutes of lecture transcripts corrected by students who needed extra credit, and I have high hopes that we will finish the project either this summer or next fall.
One excellent tool that I found while I was figuring out how to handle this project logistically is the Linux command line tool called mp3splt. I use this tool to cut the long lecture files up into one-minute segments like so:
mp3splt -t 1.0.0 L13audio.mp3
I also wrote my own script that generates an HTML page with individual audio players for all of these one-minute audio files, so that students can easily choose an audio file to work on that is exactly one minute long. The script also pushes all of the audio files over to my server after creating Ogg versions of the MP3s using mp32ogg.
# Page and server paths are examples; adjust to your setup.
page=index.html
url="http://servername.edu/path/to/filedir"
LESSON=$(ls *.mp3 | head -n1 | sed -e 's/audio.*$//')
cat >> $page <<EOFtop
<h2><a href="http://servername.edu/path/to/filedir/$LESSON.html">RAW TRANSCRIPT HERE</a></h2>
EOFtop
# One player per segment; the <audio> wrapper, heredoc
# terminators, and loop closing are reconstructed from context
for i in *.mp3; do
    stem=$(basename $i .mp3)
    cat >> $page <<EOF
<audio controls>
<source src="$url/$stem.mp3" type="audio/mpeg">
<source src="$url/$stem.ogg" type="audio/ogg">
</audio>
EOF
done
scp *.ogg servername:~/path/to/filedir/
scp *.mp3 servername:~/path/to/filedir/
scp $page servername:~/path/to/filedir/
- Bloviate: to speak or write verbosely and windily—pundits bloviating on the radio
In this episode I talk about how to set up custom keystrokes so that you can launch or switch to applications easily using the super key on your keyboard. I do this on the classic GNOME desktop environment and have not tested it on GNOME 3 or Unity.
To create a new custom keystroke, open System Settings, then go to Shortcuts. Click the plus sign to open the dialog box where you specify the name of the keystroke and the command to be launched when the keystroke is executed. Click "Apply", then click "Disabled", and it will allow you to type the keystroke you want to use.
At this point the keystroke configuration is ready, but you have to either log out of the current session and log back in, or find some other way to reload the desktop environment configuration before you can actually use the keystroke.
I also talk about how I use my own scripts to check whether a program is running, and then either switch to it if it is running or launch it if it's not. Here is an example for launching or switching to LibreOffice.
#!/bin/bash
# Look for the string "LibreOffice" in the list of
# window titles and check the return code
checktitle=$(wmctrl -l | grep "LibreOffice" &> /dev/null ; echo $?)
# If the return code is 0, grep found the string, so
# use wmctrl to switch to the window that has that
# string in the title.
if [ $checktitle -eq 0 ] ; then
    wmctrl -a "LibreOffice"
# If it returns 1, no window with that string was
# found, so launch the application (the launch line
# below is a reconstruction of the missing step)
else
    libreoffice &
fi
Save the script somewhere in your PATH, make it executable, and then use the script name in the command when you're setting up the keystroke.
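For example, assuming the script above was saved under the made-up name goto-libreoffice, installing it could look like this (a stub copy is created here so the example is self-contained):

```shell
# Create ~/bin if needed, then install the script into it with
# executable permissions. "goto-libreoffice" is a hypothetical
# name; the stub below stands in for the real script.
mkdir -p ~/bin
printf '%s\n' '#!/bin/bash' 'wmctrl -a "LibreOffice"' > goto-libreoffice
install -m 755 goto-libreoffice ~/bin/
```

If ~/bin is on your PATH, goto-libreoffice is then what you enter as the command when defining the keystroke.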
Blather Configuration Part 1: Desktop Management
In this episode I show how to start adding more commands, how to use the language updater script, and how to start doing some basic desktop navigation. I'll show you how to open and quit applications, and how to switch from one application to another using your voice.
For information about installing blather for the first time, as well as the startup script that I use, please refer to episode 0 of this series, which has examples and links for all of that.
To start using the language updater script, you need to move or copy it from the blather source code directory into your PATH (e.g. ~/bin/). To add new commands you will have to edit the main command configuration file:
Commands are configured in a "key: value" pair, where the key is what you wish to say, and the value is the command that will be executed when you say it. We will start out with some very basic ones, but these can be as elaborate as your imagination and scripting skills will allow. You can execute built-in system commands, or you can write your own scripts that will be executed upon the voice command.
Here's an example of a basic desktop application command set:
OPEN CHROMIUM: chromium &
GO TO CHROMIUM: wmctrl -a "google chrome"
QUIT CHROMIUM: wmctrl -c "google chrome"
The first command launches Chromium, the second switches focus to Chromium when you are currently in another program, and the third closes Chromium. These make use of the command-line tool wmctrl, which is a very handy window management tool. The -a and -c options choose which window to put focus on (or close, respectively) based on the window title, which in the commands above is given in quotation marks. wmctrl has many options for finding windows and taking actions, but for now we will just use this basic one.
Once you have one command set of this kind working as you like, it's very easy to set up additional command sets for all of the desktop applications you use most often.
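For example, a hypothetical set for Firefox would follow the same pattern (assuming the window title contains "Mozilla Firefox"):

```
OPEN FIREFOX: firefox &
GO TO FIREFOX: wmctrl -a "Mozilla Firefox"
QUIT FIREFOX: wmctrl -c "Mozilla Firefox"
```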
Some applications are more difficult to handle than others. For example, media players typically change the window title based on which track is playing. This makes it impossible to use the static window title approach above, so I resort to a bit of scripting to find the right window to put focus on or close:
OPEN clementine: clementine &
GO TO clementine: rid=$(pgrep clementine -u $(whoami) |head -n 1) && rwinname=$(wmctrl -lp |grep $rid |sed -e "s/.*$rid * //" | sed -e "s/$(hostname) //") && wmctrl -a "$rwinname"
Opening the music player is easy; switching to it is something else. To make this work, I first find the process ID of the Clementine music player, then use the wmctrl -lp command to list all of the open windows and grep for that process ID. Then I extract the window name from that command's output and use the result, inside quotation marks, in the very last command to change focus to that window. Whew!
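To see what those sed calls are doing, here is the same extraction run on a made-up wmctrl -lp output line (the PID, hostname, and song title are fabricated for this demo; the real one-liner uses $(hostname) rather than a literal):

```shell
# Fake `wmctrl -lp` line: window id, desktop, PID, hostname, title
line='0x03a00003  0 4242   myhost Clementine - Some Song'
rid=4242
# Strip everything up to and including the PID, then drop the
# hostname, leaving only the window title
title=$(printf '%s\n' "$line" | sed -e "s/.*$rid * //" -e "s/myhost //")
printf '%s\n' "$title"   # Clementine - Some Song
```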
One last basic desktop navigation command for this episode. This is one that I use probably more than any other. It performs the Alt+Tab keystroke, which switches focus to the previous window. Here's how I do it:
BACK FLIP: xdotool key alt+Tab
This makes use of the wonderful xdotool package to execute a virtual keystroke. Magic!
In this episode I walk you through the process of getting the Blather GNU/Linux speech recognition program running for the first time.
Arch: On Arch Linux this is really easy. Jezra made a package build for the AUR so you can just install it that way.
Debian: I wrote an installation script for Debian-based systems that installs the dependencies to build pocketsphinx, plus a few extra packages that I use continually when I'm running blather (xvkbd, xdotool, espeak, wmctrl, elinks, xclip, curl). It builds/installs the Sphinx stuff, pulls the blather source code, and puts some configuration files and a startup script in place for you. This should take care of pretty much all of the heavy lifting.
I refer frequently to Jezra's usage notes on the Blather source code page on GitLab, so if you're trying to install this as I talk, you might want to follow along over there.
The trickiest bit in the initial run is the creation and placement of the language files. I normally use a bash script for this, but on this first episode of the series I'm going to use the web-based lmtool to create the language files, just the way Jezra says to do on his usage page. He also includes my automated language updater script in the blather source code, though, so going forward I will be talking about how to use that script instead of the web-based tool.
Blather Launch Script
I use a bash script to launch Blather because I want to set several environment variables: the location of the pocketsphinx GStreamer libraries, the default browser, the default text-to-speech engine, and so forth. Having these environment variables set means that I can use easy-to-remember shortcuts in my blather commands config file. Here is my launch script:
#!/bin/bash
# tell it where the Gstreamer libraries are
# (this path is an example; adjust to your pocketsphinx install)
export GST_PLUGIN_PATH=/usr/local/lib/gstreamer-0.10
# set some shortcuts to use in the commands file
# (the BROWSER and VOICE values here are examples)
export BROWSER="firefox"
#export VOICE="/usr/bin/festival --tts"
export VOICE="espeak"
export KEYPRESS="xvkbd -xsendevent -secure -text"
# add blather script directory to the user's PATH
export PATH=$PATH:/home/$(whoami)/code/blather/scripts
# start blather in continuous mode with the GTK GUI
# and a history of 20 recent commands
python2 /home/$(whoami)/code/blather/Blather.py -c -i g -H 20
In this episode I talk about how you can take advantage of the OpenDyslexic font as a user, and also how, as a content provider, you can use it to help your readers. Incidentally, we also talked about this for a while during episode 1418, one of the 2013 New Year shows.
Transcript Performed by Dragon Dictate [dumped "as is"]
Hi everybody! This is John Kulp In Lafayette, Louisiana. I am going to do a rather strange episode today. What I'm doing is demonstrating the dictation software that I use on the office computer that I have here at work. If you listen to my previous episodes, then you have heard me speak of the blather speech recognition program that I use on my Linux desktop, but as you may also remember, blather is not a dictation tool. Blather is a tool where you have to set up commands that will run other commands. In other words, you have to configure everything from scratch. I do have some capabilities for dictation on my Linux desktop, but they involve using the Google Web speech API and a special dictation box that I have set up, and these are not at all good for longform dictation. For serious dictation, such as writing letters and memos and other longform text, you really need a proper dictation tool. These are available built into the operating systems of Windows and Mac OS 10, but I normally use the Dragon naturally speaking software instead. I have found that it is more accurate and more powerful than the built-in versions that you can get on either Windows or Mac. That doesn't mean you shouldn't try out the built-in speech recognition on Windows and Mac, you definitely should, because I think you would be very impressed with him. I know for sure that the version on Windows learns from your voice and from the corrections that you make to the text that you were spoken, and eventually becomes very powerful in recognizing your speech. The biggest problem that I had with the Windows speech recognition was that it was a huge memory hog and frequently brought my system to a grinding halt. This is not good. Blather never does that, but then again bladder cannot take dictation. The latest system that I use for dictation is on a fairly recent Mac Mini running the nuance Dragon Dictate software. 
This is a very powerful dictation program that learns from your speech patterns and you can also add words to the vocabulary so that it will get them right when it hears them. This is especially important to do if you have frequently used unusual words, such as a name with an alternate spelling from what is normally in the program's dictionary. One of the great things about the Mac Dragon Dictate program, also, is its ability to do transcriptions of audio files. In fact the reason I am speaking this way is that I plan to use the transcription of this recording as the show notes verbatim without any corrections. The difficulty that most people have with dictation software at least initially is doing things like punctuation and capitalization. You have to remember to do these things or else your transcript will come out without any punctuation or capitalization, unless the words that you are speaking are known proper nouns. It also capitalizes automatically at the beginning of the sentences, so that if you use periods frequently then you will have capitalized words after those periods. You can see that I'm having trouble speaking this text in a fluent way, and this is one of the other difficulties that people have when initially using transcription software. It works best when you can express complete thoughts without pausing, because it learns from the context of your words. It has algorithms that calculate the possibility of one word or another based on the context, and so it is much better to speak entire sentences at one than it is to pause while trying to gather your thoughts. This is a major difference from trying to write at the keyboard, where it does not matter at all if you pause for seconds or even minutes while you think of what you want to write next. Anyhow, I highly recommend using some kind of dictation software if you suffer from repetitive strain injuries like I do. This will save you many thousands of keystrokes. 
Even if it's only using the speech recognition that's available on your phones over the web, that's better than nothing. The disadvantage of any of these services that have to send your recording over the web to
get a transcription and then send it back into your device is that they will never learn your voice and your particular speech patterns. In order for that to work best, you really have to use a dedicated standalone speech recognition program that resides locally on your computer and saves your profile and learns from your speaking. Well, I guess that is about it for today, I hope you have enjoyed hearing this brief lesson on dictation. See you next time!
Type the words "foo bar" with
xvkbd -xsendevent -secure -text 'foo bar'
Types out the entire contents of the file "foobar.txt" with
xvkbd -xsendevent -secure -file "foobar.txt"
Send text to the clipboard (the example text is arbitrary):
echo 'foo bar' | xclip -i
Send clipboard contents to standard output:
xclip -o
Press the Ctrl+C key combination with
xdotool key Control+c
Save this complicated command as an environment variable—then the variable "$KEYPRESS" expands to this command.
export KEYPRESS="xvkbd -xsendevent -secure -text"
With virtual keystrokes and CLI access to the clipboard, you're limited only by your imagination and scripting ability. Here are some examples of how I use them, both for the manipulation of text and for navigation. The words in bold-face are the voice commands I use to launch the written commands.
Capitalize this. Copies selected text to the clipboard, pipes it through sed and back into the clipboard, then types the fixed text back into my document:
xdotool key Control+c && xclip -o \
| sed 's/\(.*\)/\L\1/' \
| sed -r 's/\<./\U&/g' \
| xclip -i && $KEYPRESS "$(xclip -o)"
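The sed stages of that pipeline can be tried on their own. Given any string, the first expression lowercases everything and the second capitalizes the first letter of each word (both use GNU sed extensions):

```shell
# Lowercase the whole line, then uppercase each word's first letter
fixed=$(printf '%s\n' 'hELLO wOrld' \
    | sed 's/\(.*\)/\L\1/' \
    | sed -r 's/\<./\U&/g')
printf '%s\n' "$fixed"   # Hello World
```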
Go to grades. This example takes advantage of Firefox "quick search." I start with a single quote to match the linked text "grades" and press the Return key (\r) to follow the link:
$KEYPRESS "'grades\r"
First Inbox. From any location within Thunderbird I can run this command and it executes the keystrokes to take me to the first inbox and put focus on the first message:
xdotool key Control+k && $KEYPRESS "\[Tab]\[Home]\[Left]\[Right]\[Down]" && sleep .2 && xdotool key Tab
single ex staff. Type out an entire Lilypond template into an empty text editor window:
xvkbd -xsendevent -secure -file "/path/to/single_ex_staff.ly"
Paragraph Tags. Puts HTML paragraph tags around selected text:
KEYPRESS='xvkbd -xsendevent -secure -text'
xdotool key Control+c
# type the opening tag, paste the selection back, close the tag
# (the two $KEYPRESS lines reconstruct missing steps)
$KEYPRESS "<p>"
xdotool key Control+v
$KEYPRESS "</p>"
Launching commands with keystrokes in Openbox
I normally use blather voice commands to launch the scripts and keystroke commands, but I have a handful of frequently-used commands that I launch using keystroke combos configured in the Openbox config file (~/.config/openbox/rc.xml on my system). This block configures the super+n key combo to launch my
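The original block isn't reproduced here, but a minimal Openbox keybinding of that shape looks like the following sketch (W-n is Openbox's notation for super+n; the command name is a placeholder):

```xml
<keybind key="W-n">
  <action name="Execute">
    <command>my-script</command>
  </action>
</keybind>
```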
This show is an interview with Joel Gibbard founder of the OpenHand project.
The interview was recorded on my phone, which unfortunately introduced a few glitches. I've cleaned the audio up as best I can. Although frustrating, the occasional glitches have not caused anything to be missed that cannot be inferred from the context of the recording.
After creating an artificial hand for his degree project, Joel Gibbard wanted to continue the work with the goal of producing a workable prosthetic hand for $1000, so he launched the OpenHand project with a successful IndieGoGo fundraiser. In this interview we learn more about the Dextrus hand and the project's progress to date, and hear Joel's vision of affordable prosthetics for amputees worldwide.
For a short four-minute introduction to the project, see Joel's video at
The OpenHand designs and more information are available at
Accessibility tools for the visually impaired
A short explanation of how I personally got involved with accessible computing,
a definition of the term 'accessible' as it is applied to anything in relation
to persons with physical or cognitive impairment, and a very short list of the most
commonly used adaptive tools to improve accessibility to Windows and Linux.
- The Orca screen-reader: https://help.gnome.org/users/orca/stable/
- The brltty refreshable Braille display driver: http://mielke.cc/brltty/
brltty has to be the most impressive example of well-documented Open Source.
- Debian Accessibility: https://www.debian.org/devel/debian-accessibility/
Debian has a fully accessible installer. I have installed Debian 7.4 from the net install CD ISO image. The installer is text-based and presents no problem for even the totally blind.
See the Debian Accessibility page linked to above.
- Ubuntu Accessibility: https://help.ubuntu.com/community/Accessibility
The Ubuntu 'Ubiquity' graphical installer is totally accessible. Installing from a live CD or DVD image is simple. See the page linked above.
- Vinux (an Ubuntu variant which is accessible out-of-the-box): http://vinuxproject.org/
This is an Ubuntu variant which comes up talking from first boot. Not only is the installer accessible, but considerable attention has been paid to including only accessible applications on the CD and DVD images. Applications which are either inaccessible, or which simply have little or no relevance to the visually impaired, are excluded.
- Talking Arch: http://talkingarch.tk/
Chris Brannan created an accessible ISO image of Arch Linux.
This uses the speakup console-mode screen-reader to provide a way for the visually impaired to install Arch Linux. It is console-mode only, but provides a great starting-point. I have tried various desktops on top of this installation, including MATE, LXDE and others.
Talking Arch is now maintained by a couple of names which will be familiar to the Linux VI community: Kyle and Kelly. Erm... embarrassingly, I can't find their last names right now.
Mike Ray. June 2014
In today's show Ahuka tracks down Jonathan Nadeau of the Accessible Computing Foundation to discuss the ongoing campaign to improve the Orca screen reader.
- ORCA fundraiser: http://www.indiegogo.com/projects/orca-bringing-digital-sight-to-the-vision-impaired
- Accessible Computing Foundation http://theacf.co/
- Sonar http://sonargnulinux.com/
In today's show Ken finally gets around to releasing shows recorded at OggCamp 11.
OggCamp 11 was a two-day unconference where technology enthusiasts came together to exchange knowledge on a wide range of topics from Linux and open source software to building home automation systems. It was held August 13 and 14 at Farnham Maltings in Surrey in the UK.
Ken catches up with Steve Lee just before he gave his talk on Open Accessibility. After the talk, we get to hear his presentation.
In this episode Door shares with us life with a speech impediment, his experiences and his speech goals.
Joanmarie Diggs' talk entitled "The Orca Screen Reader, how it does what it does and how you can help"
Joanmarie Diggs is the Lead Developer for Orca and this talk was recorded at the Northeast GNU/Linux Fest 2012-03-17
- If you are in the way of a blind person, say "hi" so they know you're there.
- If a blind person is looking for a seat, tell them where there is a vacant space.
- Ask if they need help (warning not all people might appreciate this)
- Phrases like "see you later" or "did you watch this movie?" don't bother Jonathan, but they may bother some people.
- When leading a blind person (across the street), walk normally and let the blind person hang on to your elbow.
After his outspoken criticism of accessibility in Ubuntu, Jonathan Nadeau has become the standard-bearer for accessibility on the FLOSS desktop. In his interview with KDE spokesperson Aaron Seigo, Jonathan didn't ask any questions about accessibility, and I had been expecting to hear what accessibility improvements are in the pipeline for KDE.
When I contacted Jonathan about it, he immediately replied saying that they did talk about accessibility. He didn't include it because the show was running too long, but said he might release it as a separate podcast. I floated the idea of releasing it on HPR, and he was kind enough to mail me the segment.
A link to the rest of the interview:
HPR now has no shows in the queue. HPR is a community feed, and without shows it will cease to exist. Many people have stepped up and recorded shows, but I know there are many more out there who have it in them to contribute. With that in mind, please record a show today. Thank you.