VOICE-TO-TYPE

Use your smart home device to command or dictate to your Windows computer.

Demo Video

Project Description

This project began as a simple desire to dictate to my computer and grew into a quest for voice command functionality like what I have on my Android phone. By combining Python with various free tools, I built a working solution. While it may not rival Microsoft's Windows Speech Recognition, my approach enables control from smart devices, which I personally find more valuable. This project could also benefit people with impairments that make tactile input difficult, extending its potential applications beyond conventional use.

Prerequisites

All requirements are FREE¹. The only exceptions are the Pushbullet and/or IFTTT accounts, if you've already used up your free allotments.

Windows PC
- Push2Run application
- Python 3.x
  - pynput module. TL;DR: pip install pynput
- (optional) NirCmd for audial responses
Pushbullet account
A smart device from which to send Pushbullet messages. (These are two options I've used)
- an Amazon Echo smart home assistant [recommended FREE option]
  - with the PC Commander skill (currently in beta)
- a device with Google Assistant [Potentially Free option]
  - and an IFTTT account
    - with an applet that links your Google and Pushbullet services

Setup

Install the Push2Run application.
Set up one or both of these smart home device connections to your Pushbullet service.
- Alexa Route: Refer to the PC Commander website for instructions.
- (deprecated) Google Route: Follow these instructions.
With these steps complete, you will already be able to do a lot of things, such as shut down, reboot, run a Google search, run a YouTube search, open a program of your choice, etc. See more with these example cards.
Install Python. I recommend checking the "Add python.exe to PATH" check box.
Download this project's files to a directory of your choosing. Take note of the path as you will need it later.
- change_audio_volume.py
- dee_logging.py
- keypress_functions.py
- type.py
- Push2Run_type_cards.p2r (optional)
The other files are unnecessary.
(optional) "Install"² NirCmd to enable synthesized "voice" responses from your computer.³ This is a small command-line utility that allows you to do some useful tasks such as voice synthesis.
Set up Push2Run (p2r) cards. By this step, you should be ready to import (or create) cards that will facilitate the connection between Push2Run and this project's Python files. To import, simply drag the included Push2Run_type_cards.p2r file (a JSON file) into your Push2Run client. Feel free to discard the file once imported.

What cards will be imported
- Pause/Play
  Presses space bar
- Full Screen
  Presses f key
- Full Screen and Play
  Presses f and space bar
- Type *
  Bypass command interpretation to simply type out the supplied text
- Computer! Do Things
  A catch-all card. Attempts to interpret any messages which didn't trigger a Push2Run card as a command.
- No matching phrases
  Same catch-all functionality as above card
Change the path in the cards' Parameter field to the directory you chose in step 4, where you've placed this project's files. This can be done either in the p2r file before importing, or after importing within Push2Run's GUI. In the provided cards, the path is set to C:\Scripts\python\type\.
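
If you would rather change the path before importing, a plain text substitution may be enough. The sketch below assumes the .p2r file stores the path as a JSON string, so each backslash appears doubled on disk. That storage detail, the file encoding, and the helper names `json_escaped` and `retarget_cards` are assumptions, so inspect the file in a text editor first.

```python
import json
from pathlib import Path

DEFAULT_DIR = "C:\\Scripts\\python\\type\\"  # path used in the provided cards


def json_escaped(path):
    """Return the path as it would appear inside a JSON string (no quotes)."""
    return json.dumps(path)[1:-1]


def retarget_cards(text, new_dir, old_dir=DEFAULT_DIR):
    """Replace the old script directory with new_dir in the raw file text."""
    if not new_dir.endswith("\\"):
        new_dir += "\\"
    return text.replace(json_escaped(old_dir), json_escaped(new_dir))


if __name__ == "__main__":
    cards = Path("Push2Run_type_cards.p2r")
    if cards.exists():
        # Encoding is an assumption; adjust if the file does not open cleanly.
        original = cards.read_text(encoding="utf-8")
        # "D:\\tools\\voice-to-type" is a placeholder for your own directory.
        updated = retarget_cards(original, "D:\\tools\\voice-to-type")
        cards.write_text(updated, encoding="utf-8")
```

Keep a backup copy of the .p2r file before rewriting it, since a bad substitution could leave cards pointing at a directory that does not exist.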

Click here to see how to build your own cards.

Note that all these cards are set to the "Hidden" window state which is important to prevent a terminal window from being shown.

Type card

We'll start with the dictation card. With this, you'll be able to tell your computer to type out long sentences.

Command card

Next is the command card. With this, you'll be able to tell your computer to perform a multitude of physical inputs, either colloquially (ex. "minimize") or literally (ex. "press alt space n"). See more.

Volume card

With these cards, you'll be able to tell your computer to change the volume. You can also tell it to mute, un-mute, toggle mute, or even to "shut up". I'm still working out the kinks for this one so I did not include a card for volume adjustments in the included p2r file.

Note that there are two cards as I found it more successful to separate them like so.
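
To illustrate the kind of parsing a volume card needs, here is a minimal sketch that turns a phrase such as "to 20 percent" into a request and then into NirCmd arguments. This is not how change_audio_volume.py works (its contents are not shown here); it swaps in NirCmd's `setsysvolume`, `mutesysvolume`, and `speak text` commands as one possible approach. The function names are hypothetical.

```python
import re


def parse_volume_request(text):
    """Return ('set', percent), ('mute', None), ('unmute', None),
    ('toggle', None), or None if nothing was recognized."""
    text = text.lower().strip()
    if "toggle" in text:
        return ("toggle", None)
    if any(word in text for word in ("unmute", "un-mute", "un mute")):
        return ("unmute", None)
    if "mute" in text or "shut up" in text:
        return ("mute", None)
    found = re.search(r"(\d{1,3})\s*(?:percent|%)?", text)
    if found:
        return ("set", min(int(found.group(1)), 100))
    return None


def nircmd_volume_args(request, exe="nircmd.exe"):
    """Build the NirCmd argument list for a parsed request."""
    action, percent = request
    if action == "set":
        # NirCmd expects a value between 0 and 65535.
        return [exe, "setsysvolume", str(round(65535 * percent / 100))]
    state = {"mute": "1", "unmute": "0", "toggle": "2"}[action]
    return [exe, "mutesysvolume", state]


def nircmd_confirmation_args(request, exe="nircmd.exe"):
    """Build a spoken confirmation for volume changes (optional)."""
    action, percent = request
    if action == "set":
        message = f"Volume set to {percent} percent"
    else:
        message = f"Volume {action}"
    return [exe, "speak", "text", message]
```

The resulting lists can be passed to `subprocess.run` when NirCmd is installed.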
Read an additional brief Push2Run primer...

By this point, you will have an invocation keyword set up to indicate to your digital assistant to forward commands through your Pushbullet service which will be captured by Push2Run. In this readme's example scenarios, we will use the "tell my computer to ~" keywords (the default for both proposed routes) which colloquially just makes sense.
- $ represents your variable. For example, let's say you've set up your Type card as below with "type $" as one of the entries in the 'Listen for' field...⁴
  
  You say: "tell my computer to type it is a lovely day period mark"
  
  type.py will receive: "-v it is a lovely day period mark" (-v being the verbatim flag). It will then format the string nicely and simulate the key presses to type it out on your computer: "It is a lovely day."
- within the "Listen for" field, the * is a throw-away catch-all. Its only purpose is to match miscellaneous phrases, not to capture text. For example...
  1. You say: "tell my computer to lower the gosh darn volume to 20 percent"
  2. Push2Run will match and throw away "lower the gosh darn".
  3. Match the "volume" keyword to the 'Change Volume' card.
  4. And pass along "to 20 percent" to the script.
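
The $ and * behaviour described above can be modelled with a regular expression. This is only an illustration of the matching rules as described in this primer, not Push2Run's actual implementation; `phrase_to_regex` and `match_phrase` are hypothetical names, and the assumption that * must match at least one word is mine.

```python
import re


def phrase_to_regex(listen_for):
    """Translate a 'Listen for' style phrase into a regular expression.

    '$' becomes a capturing group and '*' a non-capturing wildcard.
    """
    parts = []
    for token in listen_for.split():
        if token == "$":
            parts.append(r"(.+)")
        elif token == "*":
            parts.append(r"(?:.+?)")
        else:
            parts.append(re.escape(token))
    return re.compile(r"^" + r"\s+".join(parts) + r"$", re.IGNORECASE)


def match_phrase(listen_for, spoken):
    """Return the captured '$' text, '' if matched without '$', else None."""
    found = phrase_to_regex(listen_for).match(spoken.strip())
    if found is None:
        return None
    return found.group(1) if found.groups() else ""
```

With this model, the phrase "* volume $" discards "lower the gosh darn" and captures "to 20 percent", mirroring the numbered steps above.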

Caveats, acknowledgements, and known bugs to fix
- An internet connection is required for your computer to receive commands.
- You must be logged into your computer for most, if not all, actions to succeed.
- A digital assistant's attention span is short, so commands must be swift and to the point.
  - As such, performing multiple or complex actions utilizing this project may prove difficult. Thankfully the Alexa method has a follow-up mode which alleviates this pressure.
- Giving literal key-press commands can be tricky to nearly impossible, as it depends entirely on what the digital assistant thinks it heard, and assistants tend to listen for natural spoken language. For example, it may hear "end" when you say "n". I try to work around this with an equivalency dictionary, but it isn't perfect.
- Log file location may differ depending on whether the script is executed from the console⁵ or by Push2Run.
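
As a rough picture of what an equivalency dictionary might look like, the sketch below maps misheard words back to the intended key. Only the "end" to "n" pair comes from this readme; the other entries, the name `HEARD_AS`, and the function `normalize_keys` are hypothetical and not taken from keypress_functions.py.

```python
# Words an assistant might hear in place of single-letter keys.
# Only "end" -> "n" is mentioned in this readme; the rest are guesses.
HEARD_AS = {
    "end": "n",
    "see": "c",
    "are": "r",
    "why": "y",
    "you": "u",
}


def normalize_keys(words, equivalents=HEARD_AS):
    """Map each heard word to its intended key name, leaving others as-is."""
    return [equivalents.get(word.lower(), word.lower()) for word in words]
```

A mapping like this is inherently ambiguous: "end" is also a real key, so a lookup of this kind only makes sense where a letter is the more likely intent.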

How to use directly

Here's how to utilize these project files directly, without relying on Push2Run triggers.

To type out a string to your computer with basic formatting use...

python type.py -v <string>

To give your computer a command (for example these) use...

python type.py <command>

Note that these commands will execute immediately so if you wish to type on or control a particular application, you will need to either execute the command in a hidden window or use a delay timer.
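
A delay timer can be as simple as sleeping before the key presses are sent, which gives you time to focus the target window. The sketch below is one way to do that with pynput's `Controller.type`; the helper names `run_after` and `type_text` are hypothetical and not part of this project's files.

```python
import time


def run_after(delay_seconds, action, sleep=time.sleep):
    """Wait delay_seconds, then call action() and return its result."""
    if delay_seconds > 0:
        sleep(delay_seconds)
    return action()


def type_text(text):
    """Type text into whichever window currently has focus."""
    from pynput.keyboard import Controller  # imported lazily; needs pynput

    Controller().type(text)


# Example: switch to your editor within five seconds of running this.
# run_after(5, lambda: type_text("Hello from the console"))
```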

List of viable commands

Please note the following

You can chain commands together with delimiters "and", and "then".⁶
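
Splitting a chained command on these delimiters could look like the sketch below. It is an assumption about the general approach rather than a copy of type.py, and `split_commands` is a hypothetical name.

```python
import re

# Match "and then" before the single-word delimiters so it is consumed whole.
_DELIMITER = re.compile(r"\s+(?:and\s+then|and|then)\s+", re.IGNORECASE)


def split_commands(utterance):
    """Split a spoken command chain into individual commands."""
    pieces = _DELIMITER.split(utterance.strip())
    return [piece.strip() for piece in pieces if piece.strip()]
```

A naive split like this would also break up dictated sentences containing "and", which is one reason a verbatim mode that bypasses interpretation is useful.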

Google Assistant will often detect when you mean to use punctuation, but to explicitly tell the script to produce a punctuation mark, you must say "mark" or "sign" afterwards. I acknowledge it's a mouthful. For example: "open curly bracket mark x closed curly bracket sign" -> "{ x }"
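
The "mark"/"sign" rule can be sketched as a lookup followed by a little cleanup. The table below is a small hypothetical subset of spoken names, and `format_dictation` is an illustrative name rather than a function from type.py.

```python
import re

# Spoken names and their symbols; a small hypothetical subset.
MARKS = {
    "period": ".",
    "comma": ",",
    "exclamation": "!",
    "question": "?",
    "open curly bracket": "{",
    "closed curly bracket": "}",
}

# Longest names first so multi-word names win over shorter ones.
_NAMES = sorted((re.escape(name) for name in MARKS), key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(_NAMES) + r")\s+(?:mark|sign)\b", re.IGNORECASE
)


def format_dictation(text):
    """Swap '<name> mark' / '<name> sign' for symbols and capitalize the start."""
    out = _PATTERN.sub(lambda found: MARKS[found.group(1).lower()], text)
    # Sentence punctuation hugs the preceding word; brackets keep their spaces.
    out = re.sub(r"\s+([.,!?])", r"\1", out)
    return out[:1].upper() + out[1:]
```

Names spoken without "mark" or "sign" are left untouched in this sketch, so an ordinary word such as "period" can still be typed.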

Typing

type a phrase of your choice comma with punctuation exclamation mark
type i'll be there at 6 pm period mark send

Colloquial

maximize
minimize
restore
minimize all
minimize everything else
move
resize
resize left
resize right
resize bottom
resize top
dock left
dock right
close program
change program

Media

pause
play
full screen (compatible with toggling full screen on most players)

Literal

alt tab [five times]
alt space n
shift r
etc.
control alt delete <- is a protected key combination thus will NOT work
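
One way to interpret a literal command is to hold any leading modifier keys, tap the remaining keys in order, and repeat as requested. The sketch below does this with pynput's `Controller.press` and `Controller.release`; how this project actually sequences key presses is not shown here, so treat the parsing rules, the word tables, and the function names as assumptions.

```python
NUMBER_WORDS = {"two": 2, "three": 3, "four": 4, "five": 5}  # hypothetical subset
MODIFIERS = {"alt": "alt", "shift": "shift", "control": "ctrl", "ctrl": "ctrl"}


def parse_literal(command):
    """Split 'alt tab five times' into (modifiers, keys, repeat)."""
    words = command.lower().split()
    repeat = 1
    if len(words) >= 3 and words[-1] == "times":
        count = words[-2]
        repeat = int(count) if count.isdigit() else NUMBER_WORDS.get(count, 1)
        words = words[:-2]
    modifiers = []
    while words and words[0] in MODIFIERS:
        modifiers.append(MODIFIERS[words.pop(0)])
    return modifiers, words, repeat


def press_literal(command):
    """Hold the modifiers, tap the remaining keys 'repeat' times, then release."""
    from pynput.keyboard import Controller, Key  # lazy import; needs pynput

    modifiers, keys, repeat = parse_literal(command)
    keyboard = Controller()
    held = [getattr(Key, name) for name in modifiers]
    for key in held:
        keyboard.press(key)
    try:
        for _ in range(repeat):
            for name in keys:
                # Single characters are sent as-is; longer names must exist
                # on pynput's Key enum (e.g. tab, space) or this raises.
                key = name if len(name) == 1 else getattr(Key, name)
                keyboard.press(key)
                keyboard.release(key)
    finally:
        for key in reversed(held):
            keyboard.release(key)
```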

in Browser

go to website dot com
refresh
go back
go forward
new tab
close tab
reopen tab
change tab

Text

select all
cut
copy
paste
undo
redo
home
end
page up
page down
save
save as
emojis (don't get excited, just pulls up the menu)
change input language

System

show notifications
show time
show calendar
start dictation (uses Windows Speech Recognition)
show settings
take screenshot
save screenshot
open system menu
open control panel

Misc

wait 10 seconds and ...
type I see you exclamation mark after 3 minutes
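
Both delay forms shown above ("wait 10 seconds" as a step in a chain, and "... after 3 minutes" as a suffix) can be reduced to a number of seconds plus the remaining command. This is a sketch of that idea only; `extract_delay` is a hypothetical name, and the "hour" unit is an addition not listed in this readme.

```python
import re

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}
_UNIT = r"(second|minute|hour)s?"
_WAIT = re.compile(r"^wait\s+(\d+)\s+" + _UNIT + r"$", re.IGNORECASE)
_AFTER = re.compile(r"^(.*\S)\s+after\s+(\d+)\s+" + _UNIT + r"$", re.IGNORECASE)


def extract_delay(command):
    """Return (delay_in_seconds, remaining_command) for one command."""
    command = command.strip()
    found = _WAIT.match(command)
    if found:
        return int(found.group(1)) * UNIT_SECONDS[found.group(2).lower()], ""
    found = _AFTER.match(command)
    if found:
        seconds = int(found.group(2)) * UNIT_SECONDS[found.group(3).lower()]
        return seconds, found.group(1)
    return 0, command
```

The "wait 10 seconds and ..." form relies on the "and" delimiter to separate the wait from whatever follows, so this function only ever sees one command at a time.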

¹ Aside from the Windows PC and a device with a smart home assistant, of course. These devices are ubiquitous, but I recognize that access to them is not universal.
² Download and extract to a location in your PATH environment variable OR this project's root folder.
³ Currently, audial responses are only used to confirm volume adjustments and to inform the user when a command was not understood.
⁴ You can list multiple "Listen for" phrases. Be sparing here, as the more variability you add, the greater your chances of stepping on another card's toes and causing unexpected results, as you may experience with the Volume cards later.
⁵ To execute from the console, run python type.py DESIRED COMMAND HERE. Use the -v argument to skip interpretation and simply dictate: python type.py -v DESIRED SENTENCE HERE. You may put quotation marks around your command ("DESIRED COMMAND") if you wish.
⁶ By default, Push2Run also uses "and" as a delimiter to separate commands. Given that setting, I acknowledge that the "Full Screen and Play" card is redundant when you have separate "Full Screen" and "Pause/Play (press Space bar)" cards.

DeeboyEdx/voice-to-type
