Working with Speech Input Process
Process Purpose
The "Speech Input" process is used to define the operator's speech input in terms of recognition/acceptance and associated functions. It also allows you to define spoken instructions for the operator.
For more information on speech input, see How to Work with Speech.
When you add a process, you are required to define its settings. You do this in the process's properties window, which opens automatically after you add the "Speech Input" process. In this case, the properties window includes four tabs: "General", "Advanced", "Routing" and "Help".
If you need to edit the process later, double-click it to open its properties window and make the necessary changes.
"General" tab
This tab allows you to define the prompt that informs the operator that speech input is possible as well as the words the operator can use for speech input.
Fill in the following options:
Speech Input Dialog |
|
Focus Prompt |
Enter the focus prompt OR click to access the "Speech Prompt Builder" assistant to create it. See The Speech Prompt Builder. The focus prompt's purpose is to provide information or instructions to the operator so that he can use speech input. The focus prompt can be a word, a sound, etc. |
Speech grammar (Word Sequence for Data Input) table |
|
Word List Column |
It is possible to combine two or more word lists to create a word list sequence but this only works for App Word Lists that associate spoken words to numeric/alphanumeric values. App Word Lists that include key codes do NOT apply.
Construct a word sequence with one or more word lists. Select the App Word List from the drop-down OR click and select it from the "Word List" table. The "Word List" table allows you to view the existing App Word Lists (see To View the App Word Lists), to edit them (see To Edit an App Word List) or to add new ones (see To Add an App Word List). As an alternative, you can also enter a variable that contains the name of the intended App Word List.
Remember, this option only works with App Word Lists that associate spoken words to values. The drop-down on the "Word List" table displays all the App Word Lists in your project and it will be up to you to select the compatible App Word List(s).
If you use more than one word list, remember that this is a word sequence, meaning you have to order your word list selection according to the intended speech input sequence.
Ex: For a word sequence such as "Code 5 2 Read", you have to make the following selection:
1st - the word list that includes the word "Code" (List A).
2nd - the word list that includes the numeric units "5" and "2" (List B).
3rd - the word list that includes the word "Read" (List C).
If you were to select List B first, then List C and then List A, you would be creating the following speech input sequence: "5 2 Read Code".
|
Min Column |
Specify the minimum number of units from the selected word list to be used in the word sequence. |
Max Column |
Specify the maximum number of units from the selected word list to be used in the word sequence. |
End of Input |
Select the "End of Input" mode from the drop-down OR click and select a compatible word list from the "Word List" table:
Auto - uses automatic system criteria to determine the end of speech input.
Global Validation Word - requires the operator to speak a specific word to mark the end of speech input. This word is defined in the "Global Words" list and must have a return key/value of <ENTER>.
When selecting this option, make sure that the "Global Words" list contains a spoken word with the key code <ENTER> and that the list is enabled. See Global Words.
None - speech input ends when the operator reaches the maximum length of the input word sequence.
Validation Word List - this option is only available in the drop-down if you have already created an App Word List that includes spoken words associated to the key code "<Enter>", "<Enter1>", "<Enter2>", etc. (up to "<Enter9>").
Example of an App Word List that can be used as a validation list:
In this case, the operator will be able to say "Ended", "Completed", "Done" or "Concluded" to indicate the end of his speech input.
|
End of Prompt |
If required, enter the "End Prompt" OR click to access the "Speech Prompt Builder" assistant to create it. See The Speech Prompt Builder. The end of prompt is a word or sound, heard by the operator, that marks a successful input. |
Store Result into Variable |
|
Variable |
Click to define the variable that receives the data from the speech input (grammar sequence). See To Select/Create a Variable. |
Use the editing icons to the right of the table to move the rows up and down and to delete or add more rows.
In terms of speech grammar sequence, each numeric and/or alphabetic word counts as 1 unit. Although words are composed of many letters, each word is considered 1 unit.
Example 1
Consider a word sequence with 2 word lists - an alphanumeric ("Zone") and a numeric type list ("Numeric1"):
The "Zone" word list contains the words "Green", "Red" and "Blue".
The "Numeric1" word list includes digits from "0" to "9".
The total of minimum units to use per word sequence is 2 (a minimum of 1 unit for each word list).
The total of maximum units to use per word sequence is 3 (a maximum of one unit for the "Zone" list and 2 units for the "Numeric1" list).
This means that the word sequence has to include at least one unit (word) from each word list and that it can include up to 2 numeric words (from the "Numeric1" list).
Since this word sequence allows for variable digit input, it is recommended that you define a validation word. To do this, select "Global Validation Word" or an App Word List that works as a "Validation Word List" in the "End of Input" option. This means that the speech input for this control requires the operator to end with a specific word to validate the value that he said.
If you intend to use a "global validation word", you must define a word with a return key/value of "<ENTER>" in the "Global Words" list.
If you intend to use an App Word List, you must create a specific App Word List with one or more words associated to the key code(s) "<ENTER>", "<ENTER1>", "<ENTER2>" and so on.
In this example, we are using the word "ready" to indicate the end of speech input.
Each numeric and/or alphabetic word counts as 1 unit.
Possible Speech Inputs
"blue" "1" "0" "ready" (1 + 1 + 1 + global validation word or a validation word from an app word list = 3 units)
"red" "7" "ready" (1 + 1 + global validation word or a validation word from an app word list = 2 units)
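The rules of Example 1 can be sketched in a few lines of Python. This is only an illustration of the Min/Max logic described above, not how the speech recognizer is implemented: the names `GRAMMAR`, `matches` and `accepts` are hypothetical, and the sketch assumes "ready" is the validation word and that it must be spoken last.

```python
from typing import Sequence, Set, Tuple

# One row of the speech grammar table: (accepted words, Min, Max).
Row = Tuple[Set[str], int, int]

ZONE = {"green", "red", "blue"}
NUMERIC1 = {str(d) for d in range(10)}

# Example 1: one unit from "Zone", then one or two units from "Numeric1".
GRAMMAR = [(ZONE, 1, 1), (NUMERIC1, 1, 2)]


def matches(words: Sequence[str], rows: Sequence[Row]) -> bool:
    """Return True if `words` can be split, in order, across `rows`."""
    if not rows:
        return len(words) == 0
    accepted, lo, hi = rows[0]
    # Try every allowed unit count for the first row, then check the rest.
    for count in range(lo, min(hi, len(words)) + 1):
        head_ok = all(w in accepted for w in words[:count])
        if head_ok and matches(words[count:], rows[1:]):
            return True
    return False


def accepts(spoken: Sequence[str], validation_word: str = "ready") -> bool:
    """Require the validation word last, then match the rest against GRAMMAR."""
    words = [w.lower() for w in spoken]
    if not words or words[-1] != validation_word:
        return False
    return matches(words[:-1], GRAMMAR)
```

With these assumptions, `accepts(["blue", "1", "0", "ready"])` and `accepts(["red", "7", "ready"])` both succeed, while an input that starts with a digit or contains three digits is rejected.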
Example 2
Consider a word sequence with 2 word lists ("Numeric1" and "Decimal"), with one of the word lists being used twice:
The "Numeric1" word list includes digits from "0" to "9".
The "Decimal" word list contains the word "Dot" with the return value ".".
The total of minimum units to use per word sequence is 1 (only the first selected word list requires a minimum of 1 unit).
The total of maximum units to use per word sequence is 6.
This means that the word sequence has to include at least one unit (word) from the first word list and that it can end there or have 1 more unit (word) from the second list and up to 3 units (words) from the third list.
The global validation word is "ready".
Remember that each numeric and/or alphabetic word counts as 1 unit.
Possible Speech Inputs
"1" "dot" "2" "0" "0" "ready" (1 + 1 + 1 + 1 + 1 + global validation word or a validation word from an app word list = 5 units)
"1" "5" "dot" "1" "2" "5" "ready" (1 + 1 + 1 + 1 + 1 + 1 + global validation word or a validation word from an app word list = 6 units)
"6" "dot" "2" "0" "ready" (1 + 1 + 1 + 1 + global validation word or a validation word from an app word list = 4 units)
"1" "2" "ready" (1 + 1 + global validation word or a validation word from an app word list = 2 units)
"5" "ready" (1 + global validation word or a validation word from an app word list = 1 unit)
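The unit arithmetic of Example 2 can be checked with a short Python sketch. The row values are an assumption derived from the text: only the first row has a minimum of 1, the second row allows at most 1 unit, the third at most 3, and the maximum of 2 for the first row is inferred from the stated total of 6. The names `ROWS`, `unit_totals` and `count_units` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation of the table rows: (word list, Min, Max).
ROWS: List[Tuple[str, int, int]] = [
    ("Numeric1", 1, 2),  # Max of 2 is inferred: 6 - 1 - 3
    ("Decimal", 0, 1),
    ("Numeric1", 0, 3),
]


def unit_totals(rows: Sequence[Tuple[str, int, int]]) -> Tuple[int, int]:
    """Return the total minimum and maximum units of the word sequence."""
    return sum(lo for _, lo, _ in rows), sum(hi for _, _, hi in rows)


def count_units(spoken: Sequence[str], validation_word: str = "ready") -> int:
    """Count spoken units, leaving out the trailing validation word."""
    words = [w.lower() for w in spoken]
    if words and words[-1] == validation_word:
        words = words[:-1]
    return len(words)
```

Each word counts as one unit and the validation word is not counted, which is why "1" "dot" "2" "0" "0" "ready" adds up to 5 units.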
It is possible to merge sequential word lists. This means that the operator can speak the words included in those 2 merged lists in any position in the word sequence.
This is done in the "Min" and "Max" columns of the "Speech Grammar" table, which is where you determine the minimum and maximum length of a word list:
a. Select the arrow symbol in the "Max" column of the first word list to be merged.
b. Select the arrow symbol in the "Min" column of the word list that follows.
Example of Merged Word Lists
There are two sequential word lists:
Word List "Numeric1" |
Word List "Good_Bad" |
The "Numeric1" word list includes digits from "0" to "9".
The "Good_Bad" word list contains the words "Good" (return value is "Yes") and "Bad" (return value is "No").
Selecting the arrow symbols as displayed in the image above ensures the merging of the 2 word lists.
The "End of Input" is set to "None", meaning, the operator does NOT have to say a validation word at the end of his speech input (NOT needed because of the fixed digit input).
This means that the operator must say 2 units, taken from either one or both word lists, in no particular order.
Possible Speech Inputs
"1" "2"
"good" "1"
"1" "bad"
"good" "bad"
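A minimal Python sketch of the merged word lists example follows. It treats the two merged lists as a single pool of words and requires exactly 2 units, as described above. It is an illustration only: the names `MERGED`, `REQUIRED_UNITS` and `merged_return_values` are hypothetical, and the sketch assumes each digit returns itself, which the example does not state.

```python
from typing import List, Optional, Sequence

# Spoken word -> return value. Digits returning themselves is an assumption.
NUMERIC1 = {str(d): str(d) for d in range(10)}
GOOD_BAD = {"good": "Yes", "bad": "No"}

# Merging the lists means a word from either list is valid in any position.
MERGED = {**NUMERIC1, **GOOD_BAD}
REQUIRED_UNITS = 2  # fixed length, so no validation word is needed


def merged_return_values(spoken: Sequence[str]) -> Optional[List[str]]:
    """Return the return values for a valid input, or None if it is rejected."""
    words = [w.lower() for w in spoken]
    if len(words) != REQUIRED_UNITS or any(w not in MERGED for w in words):
        return None
    return [MERGED[w] for w in words]
```

Under these assumptions, "good" "1" yields the return values "Yes" and "1", and an input with one or three words is rejected.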
Proceed to the "Advanced" tab.
"Advanced" tab
In this tab, you can collect information on the received speech input and, if necessary, enable barcode reading as an alternative to speech input.
Fill in the following options:
Recognizer Options |
|
Name |
Enter the name of the process.
Use a name that relates to the speech input you are defining to better identify this specific process within the application's execution rows. Ex: "Location Check Digit", "Delivery confirmation", etc. |
Hint Value(s) |
If required, define "Hint Value(s)" for the current control. Enter the value(s) OR click to select a variable with the intended value. See Variable Usage.
Hint Values refer to the expected speech input value(s) for a certain context (ex: the expected words spoken by the operator when the focus is on a specific "Input keyboard"). These Hint Values are only used in a post-speech processing phase: after the speech input has been recognized, if there is more than one possible match with a similar confidence level, the acceptance criteria force the selection of the Hint Value, even if its score is slightly lower than that of the other recognized set of words.
Ex: If you set the Hint Values for this control to "1 9" and the recognizer returns both "1 9" and "2 6" as possible results with a high confidence level, the Hint Values are taken into consideration and the "2 6" result is discarded.
If required, multiple Hint Values can be defined. For multiple Hint Values use ";" as a separator.
|
Sync Mode |
The "sync mode" determines when the operator can start his speech input, taking into account the current speech output. Select the most appropriate "sync mode" from the corresponding drop-down:
Sequential (End of last Prompt) - The operator has to listen to the complete speech output before he can start his speech input (recommended "sync mode").
Anticipated (start of last Prompt) - The operator can start speaking as soon as the speech output (what the operator hears) is initiated.
Continuous (End of last Input) - The operator can speak inputs continuously, one after the other. This sync mode implies the creation/use of specific grammar constructions to allow the application to recognize/validate each speech input. |
Store Result into Variable |
|
Valid/Key Code |
The variable you define here is meant to receive the key codes/key code values associated to the spoken words. It is only applicable if, in the "End of Input" option of the "General" tab, you have selected a specific type of App Word List with key codes or if you intend to use keys/key combinations in the "Routing" tab.
• In the case of an App Word List: There is a type of App Word List that can be used to indicate the end of input. This list associates one or more spoken words to a specific set of key codes (<ENTER>, <ENTER1>, <ENTER2>, up to <ENTER9>). Each key code has a default value ("EN", "E1", "E2", up to "E9"). This means that, firstly, any of the listed words can be used to validate the end of input and, secondly, you can then use each key code's default value to perform different actions (ex: to redirect the application, depending on the spoken word). The expected variable content will be "EN" for "<ENTER>", "E1" for "<ENTER1>", "E2" for "<ENTER2>" and so on. See Speech Input Validation Example.
• In the case of key codes defined in the "Routing" tab: If you use keys/key combinations for routing purposes, the key code associated to the spoken word will be stored in the variable you select here.
Click to select the variable that receives the key codes for end of speech input and/or routing purposes. See To Select/Create a Variable. |
Valid/Key Word |
Click to select the variable that receives the input validation words you defined in the "General" tab - in the "End of Input" option (ex: a global validation word or the words included in an app word list that are associated to specific key codes) or the key word(s) defined in the "Routing" tab. |
Spoken Words |
Click to select the variable that receives the recognized complete set of spoken words (the recognized grammar sequence, including digits, and validation word defined in the "General" tab).
If you intend to define specific speech commands in the "Routing" and/or "Help" tab, use the system variable "X_VOICE_LAST_WORD_SPOKEN" to ensure that those commands are stored when spoken by the operator. |
% Confidence |
Click to select the variable that receives the confidence percentage level (percentage defined by the recognizer engine for the current speech input result) of the grammar sequence defined in the "General" tab.
In case you define speech commands in the "Routing" and/or "Help" tab, use the system variable "X_VOICE_LAST_WORD_SPOKEN_SCORE" so that the confidence percentage level of those commands, when spoken by the operator, is also stored. |
Scanning Options |
|
Enable Scanner |
If required, check this option to enable barcode reading. The scanned value will be stored in the variable you defined in the "General" tab - in the "Variable" option. |
Scan Profile |
Select the appropriate scanner profile from the drop-down OR click to access a table with the existing scanner profiles. This table allows you to edit the existing scanner profiles (see To Edit a Barcode Scanner Profile) and to add more profiles (see To Create a Barcode Scanner Profile). |
On Scan Go to |
Select a target location from the drop-down or list to be executed after the barcode is read. See Detail of a window below.
When defining a screen as a target destination (ex: via a “Go to” process), you CANNOT use variables to specify the name of that target screen. You must select the intended screen from the available drop-down/list. |
Speech Recording |
|
File Name |
Define the sound file (.wav) that receives the operator's speech input. Enter the name of the file OR click to select a variable with the intended value.
To avoid any issues, make sure the ".wav" file uses a frequency of 22050 Hz. |
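The "Hint Value(s)" behavior described in the "Recognizer Options" above can be pictured with the following Python sketch. It is not the recognizer's actual acceptance logic: the function names are hypothetical, and the `margin` of 5 percentage points used to decide that two confidence levels are "similar" is an assumption made for illustration. Only the ";" separator comes from the option description.

```python
from typing import List, Optional, Sequence, Tuple


def parse_hints(raw: str) -> List[str]:
    """Split a Hint Value(s) entry on ';', the separator for multiple values."""
    return [part.strip() for part in raw.split(";") if part.strip()]


def pick_result(
    candidates: Sequence[Tuple[str, float]],
    hints: Sequence[str],
    margin: float = 5.0,  # assumed tolerance for a "similar" confidence level
) -> Optional[str]:
    """Pick a recognition result, preferring a hint that scores close to the best."""
    if not candidates:
        return None
    best_text, best_score = max(candidates, key=lambda c: c[1])
    # Keep only hinted candidates whose score is within `margin` of the best.
    close_hints = [
        c for c in candidates if c[0] in hints and best_score - c[1] <= margin
    ]
    if close_hints:
        return max(close_hints, key=lambda c: c[1])[0]
    return best_text
```

For instance, with the hint "1 9" and the candidates "2 6" at 91% and "1 9" at 89%, the sketch returns "1 9". If "1 9" only scored 60%, the best-scoring candidate "2 6" would be kept.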
Detail of a window:
"S:Menu" is a screen included in the same program as the process.
"R:Routine_1" is a routine included in the same program as the process.
Speech Input Validation Example
This example describes the use of two spoken words (included in an App Word List) to stipulate the end of the speech input and to redirect the application flow according to the used spoken word.
1. In the "Speech" module, create an App Word List ("Validation_Words") with the following content:
2. Add a "Speech Input" process and fill in the necessary fields to construct a speech grammar sequence (in its "General" tab).
3. In the "End of Input" option, select the "<Validation_Words>" app word list.
4. In the "Advanced" tab of the "Speech Input" properties window, create the variables that will store the following:
Expected Variable Content:
P_Key_Code - "E1" or "E2"
P_Spoken_Word - "Ready" or "Short"
P_All_Spoken_Words - the defined "speech grammar" sequence + "Ready" or "Short"
5. Next to the "Speech Input" process, add a "Case & Branch" process, 2 labels (via the "Set Label" process) and a "Go To" process within each label.
Expected Behavior:
If the operator ends his speech input with the word "Short", the application will consider the input as complete and redirect him to "R:Routine_SHORT".
If the operator ends his speech input with the word "Ready", the application will also consider the input as complete but will redirect him to "R:Routine_READY".
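The flow of this example can be sketched in Python as follows. The key code default values ("EN", "E1" up to "E9") come from the "Valid/Key Code" option. Everything else is an assumption made for illustration: the assignment of "Ready" to <ENTER1> and "Short" to <ENTER2> (the content of "Validation_Words" is not reproduced here), the function name `handle_speech_input`, and the joining of the spoken words with spaces.

```python
from typing import Dict, Optional, Sequence

# Default values of the end-of-input key codes: <ENTER> -> "EN", <ENTER1> -> "E1", ...
KEY_CODE_VALUES: Dict[str, str] = {"<ENTER>": "EN"}
KEY_CODE_VALUES.update({f"<ENTER{i}>": f"E{i}" for i in range(1, 10)})

# Assumed content of the "Validation_Words" App Word List.
VALIDATION_WORDS = {"ready": "<ENTER1>", "short": "<ENTER2>"}

# Stand-in for the "Case & Branch" process that tests P_Key_Code.
BRANCHES = {"E1": "R:Routine_READY", "E2": "R:Routine_SHORT"}


def handle_speech_input(spoken: Sequence[str]) -> Optional[Dict[str, Optional[str]]]:
    """Fill the three variables and pick the branch; None if input is not ended."""
    if not spoken:
        return None
    last = spoken[-1]
    key_code = VALIDATION_WORDS.get(last.lower())
    if key_code is None:
        return None  # no validation word yet, so the input is not complete
    value = KEY_CODE_VALUES[key_code]
    return {
        "P_Key_Code": value,
        "P_Spoken_Word": last,
        "P_All_Spoken_Words": " ".join(spoken),
        "target": BRANCHES.get(value),
    }
```

Branching on the key code value rather than on the spoken word means several words can share one key code and still lead to the same destination.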
Click the "Routing" tab to continue.
"Routing" tab
This tab allows you to redirect the application's workflow by combining specific spoken words/keys with "Go To" processes.
Fill in the following options:
Key Words / Branching |
|
Word List |
If you intend to use words contained in an App Word List, select a list from the drop-down OR click and select it from the "Word List" table. This table allows you to view/edit the existing app word lists (see To Edit an App Word List) and to add more, if required (see To Add an App Word List). |
Words/Keys column |
From the drop-down, select the words/commands and/or keys/key combinations that the operator can use to send the application's workflow to other destinations. If you want to use words/commands and did NOT select an App Word List in the previous drop-down, you have to enter the intended word. It is possible to combine keys/key combinations with words/commands: select the intended key/key combination, enter a semicolon (";" is the only accepted separator) and then enter the intended word (ex: "<F2>;repeat").
See Example of Application Routing with Speech Input below. |
Go to column |
Select a target location from the drop-down or list, in case the defined words/keys are spoken/pressed by the operator. See Detail of a window below.
When defining a screen as a target destination (ex: via a “Go to” process), you CANNOT use variables to specify the name of that target screen. You must select the intended screen from the available drop-down/list. |
Time Out |
|
Seconds |
If required, associate a time out for the redirection execution. Define the time out value (in seconds). |
Go to |
Select a target location from the drop-down or list, in case the defined "Time Out" is reached. See Detail of a window below.
When defining a screen as a target destination (ex: via a “Go to” process), you CANNOT use variables to specify the name of that target screen. You must select the intended screen from the available drop-down/list. |
Example of Application Routing with Speech Input
This example describes the use of an alphanumeric app word list with two spoken words, for application redirection purposes.
1. In the "Speech" module, create an App Word List ("Good_Bad") with the following content:
2. Add a "Speech Input" process and create two labels with the "Set Label" process.
3. In the "Speech Input" properties window, open the "Routing" tab.
4. Select the "Good_Bad" app word list and fill in the table as follows:
Expected Behavior:
If the operator says "Good", the application will open the "R:Routine_YES" routine.
If the operator says "Bad", he will be redirected to the "R:Routine_NO" routine.
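The routing table of this example can be pictured as a simple lookup, sketched here in Python. The "Good"/"Bad" rows and the "<F2>;repeat" entry format come from this section. The label "L:Repeat_Prompt", the function names and the case-insensitive matching are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def parse_triggers(entry: str) -> List[str]:
    """Split a "Words/Keys" entry on ';', the only accepted separator."""
    return [part.strip().lower() for part in entry.split(";") if part.strip()]


def build_routes(rows: Sequence[Tuple[str, str]]) -> Dict[str, str]:
    """Map every word or key of each row to that row's "Go to" target."""
    routes: Dict[str, str] = {}
    for entry, target in rows:
        for trigger in parse_triggers(entry):
            routes[trigger] = target
    return routes


ROWS = [
    ("Good", "R:Routine_YES"),
    ("Bad", "R:Routine_NO"),
    ("<F2>;repeat", "L:Repeat_Prompt"),  # hypothetical target label
]
ROUTES = build_routes(ROWS)


def route(event: Optional[str], timeout_target: Optional[str] = None) -> Optional[str]:
    """Return the target for a spoken word or key; None as event means time out."""
    if event is None:
        return timeout_target
    return ROUTES.get(event.strip().lower())
```

Here `route("Good")` returns "R:Routine_YES", both `route("<F2>")` and `route("repeat")` lead to the same target, and `route(None, "S:Menu")` stands in for the "Time Out" destination.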
Detail of a window:
"S:Menu" is a screen included in the same program as the process.
"R:Routine_1" is a routine included in the same program as the process.
Click the "Help" tab to continue.
"Help" tab
This tab allows you to define a "speech help" for the operator - he says specific words/commands that trigger a response from the application (usually instructions to help the operator with his next task).
Fill in the following options:
Key Words/Information |
|
Word List |
If you intend to use words contained in an App Word List, select one from the drop-down OR click and select it from the "Word List" table. This table allows you to view/edit the existing App Word Lists (see To Edit an App Word List) and to add more, if required (see To Add an App Word List). If you intend to use words that are NOT contained in an App Word List, keep this option empty. |
Spoken Words Column |
Define the word(s) to be used by the operator to ask for a response/instruction. Select the intended words from the drop-down (if an App Word List is selected). If you want to use words that are NOT contained in any App Word Lists, you have to enter the intended word(s). |
Speech out Column |
Enter the corresponding word(s)/sentence(s)/instruction(s) OR click and use the "Speech Prompt Builder" assistant to create the prompt. |
"Speech Help" Example
This example describes a "speech help" that was created using the "Speech Prompt Builder":
After filling in the required options, click to conclude or to abort the operation.
If required, use the icon located on the upper right corner of the properties window to open a "Localization" window where you can edit the text element within that control or add translations to it. See Localization.
You can use relative paths to refer to the file(s) you want to use in your project. See Working with Aliases.
If you want to use a label as a target destination, you can use the "Auto-Label" mechanism. This alternative to the "Set Label" process allows you to create a label in the properties window of a process - specifically, in the fields used to define target destinations (ex: the "If Error..." type fields). See To Automatically Create a Label.
Right-click MCL-Designer's input boxes to access some related options as well as the general "Cut", "Copy", "Paste" and "Search" actions (active/inactive according to the current context).
Ex: If you right-click the "Variable" input box (included in a "Conversion's" properties window), you are provided with general editing/search actions and other more specific options such as "Variable Select" (see "Variable Select"); "Variable Insert" (see "Variable Insert"); "Insert Special Character" (see To Insert Special Characters into a Control's Text Input Field) and "Localization Select" (see Localization List).
If you right-click another input box, it may provide other possibilities.