[Utility] Caster: Program by Voice




Caster is an open-source collection of tools aimed at enabling programming entirely by voice. It runs on top of Dragonfly, which in turn runs on either Dragon NaturallySpeaking (DNS) or Windows Speech Recognition (WSR). Caster is written in Python and is extensible, allowing support for additional programming languages and programs. There is a presentation demonstrating coding by voice using Dragonfly. Caster is more feature-rich and user-friendly than plain Dragonfly.

Some of you might be asking yourselves: how is this useful to the average developer who doesn’t want or need to program entirely by voice? Put simply, it can boost your productivity by adding the power of your voice to your development workflow. The sky’s the limit when it comes to your voice, whether through advanced macros or taking control of any program on your computer. Caster can be used to your advantage alongside a keyboard and mouse.

Check out the Caster Repository! Ultimately I’d love to see Sponge and Forge API integrated! Keep in mind that it is in active development but very functional. Most of the framework is complete, but the features need to be fleshed out.


I’m not the author of Caster. I brought Caster to the Sponge forums because I love the community. I need feedback on Caster and want to see if anyone’s interested in collaborating on development. My fork of Caster*. Questions are welcome! Feel free to fork and make pull requests!

Features

  • Easy setup
  • Fuzzy string matching for symbols
  • “Alias” Commands - on-the-fly commands created by highlighting stuff
  • “Record From History” - turn previously spoken commands into macros
  • CCR for multiple languages, easily configure new languages
  • Sikuli integration
  • Four additional mouse navigation modes
  • Text navigation by templates and punctuation-seek
  • The Context Stack - asynchronous and context seeking commands
  • Spec reduction via NodeRule
  • Configurable settings
  • Ergonomic break alarm (optional)
  • Automatic fix macro for Dragon double first letter, other Dragon
    enhancements

Demos of a few of Caster’s features.

Supported languages

  • HTML
  • Python
  • C++
  • JavaScript
  • Java
  • SQL

Supported Editors and IDEs

  • Sublime
  • Atom*
  • JetBrains IDEs
  • Android Studio
  • Eclipse
  • SQL Developer
  • Microsoft Visual Studio
  • Emacs
  • Notepad++

Miscellaneous Programs

  • Firefox
  • Chrome
  • Command Prompt
  • Git Bash
  • WinWord

Documentation
Caster Wiki
Caster Issue Tracker
Dragonfly Documentation

[spoiler=-The differences between WSR and DNS-]
As far as Caster’s features are concerned, there’s no difference between Dragon NaturallySpeaking and Windows Speech Recognition besides the initial setup and how Caster is started. The only exception is described here, temporary as it may be.

There is a significant difference between WSR and DNS in terms of speech recognition accuracy, speed, and application context awareness. DNS 13 pretty much works out of the box with very little training. WSR requires lots of training, and even then don’t expect the accuracy that you might see from DNS. Caster improves WSR’s application context awareness and accuracy, but it cannot make up for the differences between the underlying speech recognition engines.
[/spoiler]

Caster Install Videos for WSR

  1. Video tutorial to install Dragonfly
  2. Video tutorial to install Caster
    *Note: NatLink is not required for WSR. To launch Caster, click “_caster.py”.

-The quick and dirty install guide for Caster for WSR-
Install Python 2.7.9 32-bit. Make sure to select the ‘add to path’ option during the Python install. Then copy and paste the Dragonfly repository to the desktop. Open the folder and, as Administrator, execute ‘python setup.py install’ from CMD. Then run pip.bat from the Caster repository. The Caster repository can be placed anywhere, so long as it’s not moved after running pip.bat.
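
If you’re not sure which Python ended up on your path, a small check like the one below can tell you. This is only a sketch and not part of Caster: the function names `interpreter_info` and `matches_guide` are made up for illustration, and it assumes that the 2.7 version line and 32-bit build are the parts of the requirement that matter.

```python
import struct
import sys


def interpreter_info():
    """Return (major, minor, pointer_bits) for the running interpreter."""
    # The size of a C pointer tells us whether this is a 32- or 64-bit build.
    bits = struct.calcsize("P") * 8
    return sys.version_info[0], sys.version_info[1], bits


def matches_guide(major, minor, bits):
    """True if the interpreter is a 32-bit Python 2.7, as the guide asks for."""
    return (major, minor, bits) == (2, 7, 32)


if __name__ == "__main__":
    major, minor, bits = interpreter_info()
    print("Python %d.%d, %d-bit" % (major, minor, bits))
    if not matches_guide(major, minor, bits):
        print("Warning: the install guide expects 32-bit Python 2.7.")
```

Save it to a file and run it with ‘python’ from CMD; whatever it reports is the interpreter CMD will also use for ‘python setup.py install’.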

-How to use Caster-
Speech recognition prioritizes speech commands over text dictation. With Caster you will dictate with commands. The documentation currently available is geared towards furthering Caster development. The first places to look are the Wiki and CasterQuickReference.pdf in the Caster root directory.

Outside of the references above, the best place to discover commands and how they’re used is in the project itself, by opening up the files. Basic command phrases are pretty easy to spot and are generally self-explanatory.

[spoiler=-Example of Basic command phrases-]
Example from Atom.py
Spoken Command/Action-----------------> Shortcut keys------------->

    #File Menu
 "[open] new window":      R(Key("cs-n"), rdescript="Atom: New Window"),
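
The square brackets in the spoken part mark a word as optional, so either “open new window” or just “new window” triggers the command. To make that concrete, here is a rough sketch that lists the phrases such a spec accepts. The helper `expand_optional` is hypothetical and only handles simple, non-nested brackets; Dragonfly’s own spec parser does much more than this.

```python
import itertools
import re


def expand_optional(spec):
    """List every phrase accepted by a spec that only uses [optional] groups."""
    # Split into plain chunks and bracketed chunks, keeping the brackets.
    parts = [p for p in re.split(r"(\[[^\]]*\])", spec) if p.strip()]
    choices = []
    for part in parts:
        if part.startswith("["):
            # An optional group can be spoken or skipped.
            choices.append([part[1:-1].strip(), ""])
        else:
            choices.append([part.strip()])
    phrases = []
    for combo in itertools.product(*choices):
        phrases.append(" ".join(word for word in combo if word))
    return phrases


print(expand_optional("[open] new window"))
# ['open new window', 'new window']
```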

Example from a CCR module configjava.txt
Spoken Command/Action-----------------> Shortcut keys------------->

 "convert to integer":          Text("Integer.getInteger()")+ Key("left"),

[/spoiler]
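
The Java example above chains two actions with +: it types the text, then presses the left arrow so the cursor ends up between the parentheses. The sketch below imitates that with toy classes so you can see the effect without Dragonfly installed. `ToyText`, `ToyKey`, `ToySequence` and `Buffer` are stand-ins I made up for illustration; they are not Dragonfly’s real `Text` and `Key` classes.

```python
class Buffer(object):
    """A toy text buffer with a cursor, standing in for an editor."""

    def __init__(self):
        self.text = ""
        self.cursor = 0

    def show(self):
        # Mark the cursor position with a vertical bar.
        return self.text[:self.cursor] + "|" + self.text[self.cursor:]


class ToyAction(object):
    def __add__(self, other):
        # Chaining with + builds a sequence that runs left to right.
        return ToySequence([self, other])


class ToyText(ToyAction):
    def __init__(self, text):
        self.text = text

    def execute(self, buf):
        # Insert the text at the cursor and move the cursor past it.
        buf.text = buf.text[:buf.cursor] + self.text + buf.text[buf.cursor:]
        buf.cursor += len(self.text)


class ToyKey(ToyAction):
    def __init__(self, name):
        self.name = name

    def execute(self, buf):
        if self.name == "left":
            buf.cursor = max(0, buf.cursor - 1)
        elif self.name == "right":
            buf.cursor = min(len(buf.text), buf.cursor + 1)
        else:
            raise ValueError("toy example only knows left/right")


class ToySequence(ToyAction):
    def __init__(self, actions):
        self.actions = list(actions)

    def execute(self, buf):
        for action in self.actions:
            action.execute(buf)


buf = Buffer()
(ToyText("Integer.getInteger()") + ToyKey("left")).execute(buf)
print(buf.show())
# Integer.getInteger(|)
```

So after saying “convert to integer” you can immediately dictate whatever goes inside the parentheses.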
Below I’ve laid out the key file paths relative to the Caster root directory, each with a short description.
[spoiler=Key File Paths]
caster-master\caster\bin\data\ccr

  • Continuous command recognition (CCR) modules, such as those for HTML
    or Java, can be toggled. For instance, confightml.txt can be enabled
    by saying ‘enable html’ or disabled by saying ‘disable html’. In
    addition, you can refresh CCR module files on the fly by saying
    ‘refresh (module name)’, e.g. ‘refresh html’.

caster-master\caster\apps

  • Application-specific commands, which become active automatically when
    an application is launched. The file names correspond to the
    applications.

caster-master\caster\dev.py

  • A development module for general dev commands. Must be enabled
    through settings.json

caster-master\caster\bin\data\settings.json

  • Settings file where you can turn various features on and off.
[/spoiler]
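
To show how the ‘enable html’, ‘disable html’ and ‘refresh html’ phrases relate to the files in the ccr folder, here is a rough sketch. The function `parse_ccr_command` is hypothetical and is not how Caster implements it; it only assumes the config(name).txt naming seen in confightml.txt and configjava.txt, and single-word module names.

```python
CCR_VERBS = ("enable", "disable", "refresh")


def parse_ccr_command(phrase):
    """Split a phrase such as 'enable html' into (verb, module file name).

    Returns None if the phrase is not one of the three toggle commands.
    """
    words = phrase.lower().split()
    if len(words) != 2 or words[0] not in CCR_VERBS:
        return None
    verb, module = words
    # Assumed naming convention: config + module name + .txt
    return verb, "config%s.txt" % module


print(parse_ccr_command("enable html"))
# ('enable', 'confightml.txt')
```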

Troubleshooting
[spoiler=General Tips and Troubleshooting]
Background noise can degrade dictation accuracy. This includes breathing and wind. Ideally the microphone should not pick up sound when you are not dictating. Adjust the OS mic volume and gain, along with the mic’s position, to get there.

Try not to max out the mic volume or gain, as that can cause artifacts that degrade speech recognition accuracy. Once you’ve found the appropriate mic volume, keep the microphone boom or microphone placement in the same position.

The quality of the sound card and I/O shielding can influence dictation accuracy. Negative effects can be seen if the microphone is receiving audio without dictation or background noise. Try moving the microphone plug to the back of the PC. If that fails to help, or you don’t have an alternate mic input, you could buy a USB sound card.

No microphone, terrible sound card, poor I/O shielding, can’t afford a USB sound card, or have a crappy PC mic? There is a cheap or free option if you own an Android smartphone: use an app to turn your phone into a wireless microphone. The app wirelessly transmits your phone’s microphone audio to the PC. This bypasses your computer’s sound system completely. I use the WO Mic app with a pair of earbuds with a built-in mic. In addition, I use SoundWire to pipe my PC audio through my phone.
[/spoiler]

[spoiler=-WSR Specific Optimizations-]
When using WSR it is necessary to go through the training at least two times.

  1. Start WSR.
  2. Right click on the WSR microphone Icon or WSR Status bar.
  3. Click

It may be necessary to add words to the WSR dictionary.

  1. Start WSR.
  2. Right click on the WSR microphone Icon or WSR Status bar.
  3. Open Speech Dictionary

Disabling WSR Dictation Pad

  1. Start WSR.
  2. Right click on the WSR microphone Icon or WSR Status bar.
  3. Options
  4. Uncheck ‘Enabled Dictation Pad’

WSR Correction/Alternative Menu

The WSR correction menu is only available in WSR-supported programs like WordPad or Dictation Pad. If you’re having trouble dictating, try speaking in WordPad. WordPad can be launched by saying “Open WordPad”. This allows WSR to obtain user feedback on your dictation patterns, increasing accuracy. WSR automatically brings up an alternatives panel, or you can force a correction by saying ‘correct’ or “correct (dictate a specific word)”.

WSR is more sensitive to the clarity of the spoken word than DNS. See which gives you higher accuracy: speaking naturally, or talking like a robot, halting between your words but not mimicking the tone.
[/spoiler]

You, sir, are awesome. I’ll try this whenever I get home.

I’m currently fleshing out commands for Atom and creating advanced macros that leverage Atom features.

Unless this has a direct relation to Sponge plugin development, can it be moved to off-topic?

Just seems like spam on the forums to bring it up at all, but since there’s an off-topic section, may as well use it.

I’m not certain I agree. It’s directly related to programming and therefore indirectly to Sponge. It’s close enough to a resource, for me.

@ryantheleach

You are entitled to your opinion and I respect that.

Caster’s goal is to provide an extensible framework that enables an alternative input by voice and boosts the productivity of the traditional developer.

There are developers and people with disabilities who cannot use a mouse and keyboard. This project empowers a disabled developer to work and opens a world of possibilities to individuals who thought that programming was out of their reach. I am one of those individuals who require a tool like Caster to be productive.

I’m beginning to teach myself programming, in part by working on this project, as it’s a necessary stepping stone toward starting my own Sponge plugins.

Aside from the project goals, I have my own. As soon as I have Atom completely controlled by voice, I will expand upon the Java CCR module. After that is complete, I will start the creation of a Sponge CCR module. After that, if I’m not burned out, Forge will be on the docket.

The reason I ask for help is that this is a daunting task, one that I’m willing to take on myself if necessary. I brought it to the Sponge community because its members share their talent, creativity, and the best of themselves with the community at large. I had hoped that a few individuals might contribute out of the spirit of this community, even if they don’t use Caster for day-to-day programming.

Once this project matures, I’ll be able to share what I love: Minecraft and Sponge mods! Even better, I will be able to bring Caster and the Sponge API to empower kids in my local community who have disabilities and who think programming is by its very nature inaccessible.

I can’t say I like the setup instructions. Besides, where’s the tutorial for WSR?

I can understand where you’re coming from. Eventually the tutorials will be redone, and hopefully an all-in-one installer can be incorporated. As for setting up Caster with WSR, it’s the same as in the posted videos, except you do not need to install NatLink.

When this happens I will be happy to see it as a plugin resource.

But until then, a general programming section in off-topic would be very welcome. If the Plugin: Resources category ends up with absolutely anything and everything relating to programming thrown in, be it Scala, Clojure, Ruby, or JavaScript, just because it happens that you can build Sponge plugins in those languages… You get my point, surely?

Edit: In hindsight when I said off-topic I should have said General.

I would argue that Caster is a Sponge plugin resource even in its current state. That Caster has a wider scope than just Sponge plugin creation, or that it is a work in progress, shouldn’t subtract from its value as a resource for creating plugins.

As per the definition above, I believe Caster qualifies as a Sponge plugin resource.

  1. Just because Caster does not have a Sponge CCR module does not mean a developer can’t create a Sponge plugin using Caster. Think of a CCR module as a large collection of snippets and specialized code formatting triggered by voice shortcuts instead of the physical input of a keyboard. Without a Sponge CCR module it would take a lot longer to create a plugin, because more voice commands are necessary.

  2. Caster allows for the control of all the officially recommended IDEs by voice.

  3. For anyone who is, or is going to find themselves, in a position where they can’t interact with a keyboard and mouse, this is the only resource that would allow them to program and, in a lesser sense, to create a Sponge plugin.

Edited: For clarity of thought and grammar.
