My current desktop automation setup does command recognition: commands like "open editor / email / browser", "shutdown", "suspend"... about 20 commands in all. 'pocketsphinx_continuous' is started as a daemon at startup and keeps listening in the background (I'm on Ubuntu).
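The glue between the recognizer and the desktop actions can be quite small: each recognized phrase just maps to a command line. A minimal sketch of that dispatch layer (the phrase-to-program mapping and program names below are illustrative examples, not my actual config):

```python
import subprocess

# Illustrative phrase -> command table; in practice this would cover
# all ~20 phrases the recognizer is trained on.
COMMANDS = {
    "open editor": ["gedit"],
    "open browser": ["firefox"],
    "suspend": ["systemctl", "suspend"],
}

def dispatch(transcript):
    """Normalize a recognized phrase and look up its command, or None."""
    return COMMANDS.get(transcript.strip().lower())

def handle(transcript):
    """Launch the matched command without blocking the listener loop."""
    cmd = dispatch(transcript)
    if cmd is not None:
        subprocess.Popen(cmd)
```

A daemon like this would read pocketsphinx_continuous output line by line and call handle() on each hypothesis; unrecognized phrases simply fall through.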
I think that, from a speech-recognition-internals point of view, full transcription is more complex than recognizing these short command phrases. The training or adaptation corpus would have to be much larger than the one I used.