Use your smart home device to command or dictate to your Windows computer.
This project originated from a simple desire to dictate to my computer, evolving into a quest for voice command functionality akin to my experience with my Android phone. Through the amalgamation of Python and various free tools, I crafted a functional solution. While it may not rival Microsoft's Windows Speech Recognition, my approach enables control from smart devices, offering a personally more valuable experience. Additionally, this project could prove beneficial for individuals facing tactile input impairments, extending its potential applications beyond conventional use.
All requirements are FREE1. The only exceptions would be the Pushbullet and/or IFTTT accounts, if you've already used up your free allotments.
- Windows PC
- Push2Run application
- Python 3.x
- pynput module. TL;DR: `pip install pynput`
- (optional) NirCmd for audial responses
- Pushbullet account
- A smart device from which to send Pushbullet messages. (These are two options I've used)
  - an Amazon Echo smart home assistant [recommended FREE option]
    - with the PC Commander skill (currently in beta)
  - a device with Google Assistant [potentially free option]
    - and an IFTTT account
    - with an applet that links your Google and Pushbullet services
1. Install the Push2Run application.
2. Set up one or both of these smart home device connections to your Pushbullet service.
   - Alexa Route: Refer to the PC Commander website for instructions.
   - (deprecated) Google Route: Follow these instructions.
   With the completion of these steps, you will already be able to do a lot of things, such as shutdown, reboot, Google search, YouTube search, open a program of your choice, etc. See more with these example cards.
3. Install Python. I recommend checking the "Add python.exe to PATH" check box.
4. Download this project's files to a directory of your choosing. Take note of the path as you will need it later.
   - change_audio_volume.py
   - dee_logging.py
   - keypress_functions.py
   - type.py
   - Push2Run_type_cards.p2r (optional)

   The other files are unnecessary.
5. (optional) "Install"2 NirCmd to enable synthesized "voice" responses from your computer.3 This is a small command-line utility that allows you to perform some useful tasks, such as voice synthesis.
6. Set up Push2Run (p2r) cards. By this step, you should be ready to import (or create) cards that will facilitate the connection between Push2Run and these Python project files. To import, simply drag the included Push2Run_type_cards.p2r file (a JSON file) into your Push2Run client. Feel free to discard the file once imported.
   | Card | What it does |
   | --- | --- |
   | Pause/Play | Presses the space bar |
   | Full Screen | Presses the f key |
   | Full Screen and Play | Presses f and the space bar |
   | Type * | Bypasses command interpretation to simply type out the supplied text |
   | Computer! Do Things | A catch-all card. Attempts to interpret any message that didn't trigger a Push2Run card as a command. |
   | No matching phrases | Same catch-all functionality as the card above |
7. Change the path in the cards' Parameter field to the directory you chose in step 4, where you've placed this project's files. This can be done either in the p2r file before importing, or after importing within Push2Run's GUI. In the provided cards, the path is set to `C:\Scripts\python\type\`. Click here to see how to build your own cards.

   Note that all these cards are set to the "Hidden" window state, which is important to prevent a terminal window from being shown.
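The optional NirCmd step above can be driven from Python with a simple subprocess call. A minimal sketch, assuming `nircmd.exe` is on your PATH (NirCmd's real `speak text` command; the `speak` helper name is my own):

```python
# Sketch of calling NirCmd's "speak text" command from Python.
# Assumes nircmd.exe is on PATH; speak() is illustrative, not project code.
import shutil
import subprocess

def speak(message):
    """Build (and, when NirCmd is available, run) a spoken response."""
    cmd = ["nircmd.exe", "speak", "text", message]
    if shutil.which("nircmd.exe"):  # only attempt to run when NirCmd is installed
        subprocess.run(cmd, check=False)
    return cmd

speak("Volume set to twenty percent")
```

The `shutil.which` guard keeps the sketch harmless on machines without NirCmd installed.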
We'll start with the dictation card. With this, you'll be able to tell your computer to type out long sentences.
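To give a feel for what that involves, here is a simplified sketch of spoken-punctuation formatting. This is my own illustration, not type.py's actual implementation:

```python
# Illustrative sketch only -- not the project's actual type.py logic.
PUNCTUATION = {"period": ".", "comma": ",", "exclamation": "!", "question": "?"}

def format_dictation(raw):
    """Turn 'it is a lovely day period mark' into 'It is a lovely day.'"""
    words, out, i = raw.split(), [], 0
    while i < len(words):
        word = words[i].lower()
        # a punctuation word followed by "mark" or "sign" becomes the symbol itself
        if word in PUNCTUATION and i + 1 < len(words) and words[i + 1] in ("mark", "sign"):
            if out:
                out[-1] += PUNCTUATION[word]
            i += 2
            continue
        out.append(words[i])
        i += 1
    sentence = " ".join(out)
    return sentence[:1].upper() + sentence[1:]

print(format_dictation("it is a lovely day period mark"))  # It is a lovely day.
```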
Next is the command card. With this, you'll be able to tell your computer to perform a multitude of physical inputs, either colloquially (ex. "minimize") or literally (ex. "press alt space n"). See more.
With these cards, you'll be able to tell your computer to change the volume. You can also tell it to mute, un-mute, toggle mute, or even to "shut up". I'm still working out the kinks on this one, so I did not include a card for volume adjustments in the included p2r file.
Note that there are two cards as I found it more successful to separate them like so.
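For illustration, pulling a spoken level out of a phrase like "to 20 percent" can be as simple as the sketch below (hypothetical; not the code in change_audio_volume.py):

```python
import re

def parse_volume_phrase(phrase):
    """Extract a 0-100 volume level from a phrase like 'to 20 percent'.
    Returns None when no number is present (e.g. 'mute', 'shut up')."""
    match = re.search(r"\b(\d{1,3})\b", phrase)
    if match:
        return max(0, min(100, int(match.group(1))))  # clamp to a valid range
    return None

print(parse_volume_phrase("to 20 percent"))  # 20
print(parse_volume_phrase("shut up"))        # None
```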
Read an additional brief Push2Run primer...
By this point, you will have an invocation keyword set up to indicate to your digital assistant to forward commands through your Pushbullet service which will be captured by Push2Run. In this readme's example scenarios, we will use the "tell my computer to ~" keywords (the default for both proposed routes) which colloquially just makes sense.
- `$` represents your variable. For example, let's say you've set up your Type card as below, with "type $" as one of the entries in the 'Listen for' field...4
  - You say: "tell my computer to type it is a lovely day period mark"
  - type.py will receive: "-v it is a lovely day period mark" (-v being the verbatim flag). It will then format the string nicely and simulate the key presses to type it out on your computer: "It is a lovely day."
- Within the "Listen for" field, the `*` is a throw-away catch-all. Its only purpose is matching miscellaneous phrases, not capturing text. For example...
  - You say: "tell my computer to lower the gosh darn volume to 20 percent"
  - Push2Run will match and throw away "lower the gosh darn",
  - match the "volume" keyword to the 'Change Volume' card,
  - and pass along "to 20 percent" to the script.
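The behavior of `*` and `$` can be approximated with a regular expression: `*` matches and discards, while `$` captures. This is my own rough model of the matching idea, not Push2Run's actual implementation:

```python
import re

def listen_for_to_regex(pattern):
    """Approximate a Push2Run 'Listen for' phrase as a regex:
    '*' is a throw-away wildcard, '$' captures the text passed to the script."""
    parts = []
    for token in pattern.split():
        if token == "*":
            parts.append(r"(?:.*)")   # matched, then thrown away
        elif token == "$":
            parts.append(r"(.*)")     # captured and passed along
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s*".join(parts) + r"$", re.IGNORECASE)

m = listen_for_to_regex("* volume $").match("lower the gosh darn volume to 20 percent")
print(m.group(1))  # to 20 percent
```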
Caveats, acknowledgements, and known bugs to fix
- An internet connection is required for your computer to receive commands.
- You must be logged into your computer for most, if not all, actions to succeed.
- A digital assistant's attention span is short, so commands must be swift and to the point.
- As such, performing multiple or complex actions utilizing this project may prove difficult. Thankfully the Alexa method has a follow-up mode which alleviates this pressure.
- Giving literal key-press commands can be tricky to near impossible, as it is wholly dependent on what the digital assistant thinks it heard, given its tendency to listen for natural spoken language. For example, it may hear "end" when you say "n". I try to work around this by providing an equivalency dictionary, but it isn't perfect.
- Log file location may differ depending on whether the script is executed from the console5 or by Push2Run.
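The equivalency dictionary idea looks roughly like this (the entries below are hypothetical examples, not the project's actual table):

```python
# Hypothetical misheard-word map, illustrating the equivalency-dictionary idea.
HEARD_AS = {
    "end": "n",   # the assistant often hears "end" when you say the letter "n"
    "are": "r",
    "bee": "b",
    "why": "y",
}

def normalize_keys(spoken):
    """Map misheard words back to the single keys they likely meant."""
    return [HEARD_AS.get(word, word) for word in spoken.lower().split()]

print(normalize_keys("alt space end"))  # ['alt', 'space', 'n']
```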
Here's how to utilize these project files directly, without relying on Push2Run triggers.
- To type out a string to your computer with basic formatting, use `python type.py -v <string>`
- To give your computer a command (for example, these), use `python type.py <command>`
Note that these commands will execute immediately, so if you wish to type into or control a particular application, you will need to either execute the command in a hidden window or use a delay timer.
Please note the following
- You can chain commands together with delimiters "and", and "then".6
- Although Google Assistant will handily detect in your speech when you meant to use punctuation, to explicitly indicate to the script that it should produce a punctuation mark you must say "mark" or "sign" afterwards (I acknowledge it's a mouthful). For example: "open curly bracket mark x closed curly bracket sign" -> "{ x }"
- type a phrase of your choice comma with punctuation exclamation mark
- type i'll be there at 6 pm period mark send
- maximize
- minimize
- restore
- minimize all
- minimize everything else
- move
- resize
- resize left
- resize right
- resize bottom
- resize top
- dock left
- dock right
- close program
- change program
- pause
- play
- full screen (compatible with most players for toggling full screen)
- alt tab [five times]
- alt space n
- shift r
- etc.
- control alt delete <- is a protected key combination thus will NOT work
- go to website dot com
- refresh
- go back
- go forward
- new tab
- close tab
- reopen tab
- change tab
- select all
- cut
- copy
- paste
- undo
- redo
- home
- end
- page up
- page down
- save
- save as
- emojis (don't get excited, just pulls up the menu)
- change input language
- show notifications
- show time
- show calendar
- start dictation (uses Windows Speech Recognition)
- show settings
- take screenshot
- save screenshot
- open system menu
- open control panel
- wait 10 seconds and ...
- type I see you exclamation mark after 3 minutes
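Chained phrases like "wait 10 seconds and take screenshot" can be split on the "and"/"then" delimiters and the delay parsed out. A minimal sketch of the idea (not the project's actual parser):

```python
import re

def split_commands(phrase):
    """Split a chained phrase on the 'and' / 'then' delimiters."""
    return [p.strip() for p in re.split(r"\b(?:and|then)\b", phrase) if p.strip()]

def parse_wait(command):
    """Return the delay in seconds for a 'wait N seconds' command, else None."""
    match = re.match(r"wait\s+(\d+)\s+seconds?", command)
    return int(match.group(1)) if match else None

print(split_commands("wait 10 seconds and take screenshot"))
# ['wait 10 seconds', 'take screenshot']
print(parse_wait("wait 10 seconds"))  # 10
```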
Footnotes
1. Aside from the Windows PC and a device with a smart home assistant, of course. These devices are ubiquitous, but I recognize that access to them is not universal. ↩
2. Download and extract to a location in your PATH environment variable OR to this project's root folder. ↩
3. Currently, audial responses are only used to confirm volume adjustments and to inform the user when a command was not understood. ↩
4. You can list multiple "Listen for" phrases. Be sparing here: the more variability you add, the greater your chances of stepping on another card's toes and causing unexpected results, as you may experience with the Volume cards later. ↩
5. To execute from the console, run `python type.py DESIRED COMMAND HERE`. Use the `-v` argument to avoid interpretation and simply dictate: `python type.py -v DESIRED SENTENCE HERE`. You may choose to use quotation marks around your command (`"DESIRED COMMAND"`) if you wish. ↩
6. Actually, by default Push2Run also uses "and" as a delimiter to separate commands. Given that setting, I acknowledge that the "Full Screen and Play" card is redundant when you have separate "Full Screen" and "Pause/Play (press space bar)" cards. ↩