Products by Recognition Technologies, Inc.

RecoMadeEasy Embedded AudioVisual Recognition Engine by Recognition Technologies, Inc.

RecoMadeEasy® Products

AudioVisual Recognition
(Combination of Speaker, Speech, Face Recognition, and Object Detection and Recognition with a single interface)

Embedded
Server Based

Speaker Recognition
(Language- and Text-Independent, aka: Speaker Biometrics, Voice Biometrics, or SIV)
Recipient: Frost & Sullivan Award 2011

Embedded
Server Based

RecoMadeEasy®
Server-Based Speaker Recognition
Language- and Text-Independence: The speaker recognition system is completely text- and language-independent. This means that a user may enroll her or his voice into the system in one language and be identified or verified in a completely different language. This allows the engine to handle authentication and identification processes across any number of languages.
Platform: The RecoMadeEasy® Speaker Recognition (SPKR) (SIV) System is an award-winning engine developed entirely by Recognition Technologies, Inc., which currently runs on Linux, Mac, and Windows operating systems. The engine is compatible with all microphone devices and most audio file formats. The SIV system is fully integrated with our IVR system, which is compatible with Dialogic® telephony T1 and E1 cards as well as their analog cards. It may also be run in a stand-alone environment, independent of our IVR system, in a telephony or non-telephony setting.

This is a state-of-the-art language- and text-independent speaker recognition system (voice biometrics system) which has been developed to work in different environments. Large-Scale and Small-Scale versions of this speaker identification and speaker verification (SIV) engine have been developed over many years of research to work in telephony as well as stand-alone environments. This speaker biometric engine may be customized to fit your exact needs, including special modifications to fit the operating environment in which your related applications run. Our staff has been actively involved in defining speaker recognition (speaker biometrics or voice biometrics) standards in the VoiceXML and ANSI communities by providing detailed consultation to the VoiceXML and M1 committees involved in defining the speaker verification and identification standards.

Capabilities
The RecoMadeEasy® SIV system operates in six different modalities:
- Speaker Identification (Open-Set and Closed-Set)
  The speaker enrolls his voice with the system. The system trains for this and other speakers' voices. Once the speaker returns, the system only has to listen to the speaker and will be able to identify the speaker's voice among the trained voices it has in the database. The identification process returns an ID for the speaker. There are two different identification approaches. The simpler one is called Closed-Set Identification, in which case the ID of the closest voice in the database is returned. In this case, if the speaker is not in the database, there is a possibility of a mis-tagged ID, since the closest voice in the database is picked. The more sophisticated (but harder) approach is called Open-Set Identification, where the speaker is either tagged with an ID from the database or, if he has not been enrolled in the database, rejected as not enrolled. Our SIV engine supports both Open-Set and Closed-Set approaches.
- Speaker Verification
  In this modality, again, the speaker has to enroll his voice. Once the enrollment process is done (recording of about 30 seconds of speech and obtaining a positive ID of the speaker), the speaker is added to the database. When the speaker returns, he makes a claim of his identity. He then speaks for a few seconds and his voice is matched against the database. His identity is either authenticated or he is rejected as an impostor. It is important to note that there are two possible sources of error: 1. False Acceptance and 2. False Rejection. A false acceptance error happens if an impostor is mistakenly authenticated. This is the error that should be minimized in more security-conscious applications. A false rejection error happens if a legitimate speaker is mistakenly rejected. There is a trade-off between false acceptance and false rejection. Reducing the false acceptance rate makes the security tighter, which naturally increases the number of false rejections. False rejections could become annoying if they are not limited.
- Speaker Classification and Event Detection
  This modality of the engine may be used to classify speakers into groups such as gender groups (male/female/child). Language detection may also be viewed as classification. Age group and many other categories may also be used to perform speaker classification. This may also be used to classify or detect events such as beeps, speech, horn, auto noise, background noise, etc.
- Speaker Detection
  This would be the case where a speaker is already enrolled in the database and we would be trying to find the speaker among recordings or in a live conversation.
- Speaker Tracking
  In this case a speaker's voice is tracked through the conversation and the tracking makes sure the speaker stays on-line.
- Speaker Segmentation
  This would be used to segment the speech between two or more speakers in a conversation.
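The decision logic behind closed-set identification, open-set identification, and verification, along with the false acceptance / false rejection trade-off, can be illustrated with a small toy sketch. This is not the RecoMadeEasy® engine or its SDK. It assumes each voice has already been reduced to a short numeric vector and uses cosine similarity as a stand-in score. All function names, vectors, scores, and thresholds below are hypothetical and chosen only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (stand-in score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def closed_set_identify(probe, enrolled):
    """Return the ID of the closest enrolled voice, even if the match is poor."""
    return max(enrolled, key=lambda spk: cosine(probe, enrolled[spk]))

def open_set_identify(probe, enrolled, threshold):
    """Return the closest ID, or None when no enrolled voice is close enough."""
    best = closed_set_identify(probe, enrolled)
    return best if cosine(probe, enrolled[best]) >= threshold else None

def verify(probe, claimed_id, enrolled, threshold):
    """Accept the identity claim only if the score reaches the threshold."""
    return cosine(probe, enrolled[claimed_id]) >= threshold

def error_rates(genuine_scores, impostor_scores, threshold):
    """False acceptance rate and false rejection rate at a given threshold."""
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr

# Made-up data, for illustration only.
enrolled = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
known = [0.9, 0.1, 0.0]     # resembles the enrolled voice "alice"
stranger = [0.0, 0.1, 1.0]  # resembles nobody in the database

genuine = [0.95, 0.90, 0.80, 0.60]   # scores from true speakers
impostor = [0.70, 0.40, 0.30, 0.10]  # scores from impostors

print(closed_set_identify(stranger, enrolled))     # closest ID, a mis-tag here
print(open_set_identify(stranger, enrolled, 0.5))  # rejected as not enrolled
print(verify(known, "alice", enrolled, 0.5))       # claim accepted
print(error_rates(genuine, impostor, 0.5))         # looser threshold
print(error_rates(genuine, impostor, 0.75))        # tighter threshold
```

With this made-up data, raising the threshold from 0.5 to 0.75 removes the single false acceptance but introduces one false rejection, which is the trade-off described under Speaker Verification.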
The Engine May be Used in the Following Ways
1. Standalone engine which may be run through the use of command lines and system calls.
2. Standalone engine which may be used through a very simple C++ SDK and API. This would be most useful for integrating the engine into current products and IVR systems.
3. As a module of our RecoMadeEasy® IVR system.
4. As a web service using our servers.
5. As a web service using your own servers.
Supported Audio Interface
- All Microphone devices
- All Major Audio File Formats
- All Dialogic JCT Telephony cards (T1 and Analog)
Supported Operating Systems
Linux (both 32-bit and 64-bit versions are supported)
- CentOS 8 and 7.9 Linux (Latest)
- Previous CentOS Linux versions: 7.3, 7.2, 7.1, 7.0, 6.6, 6.4, 6.3, 6.2, 5.7, 5.6, 5.4
- Fedora 38 Linux (Latest)
- Previous Fedora Linux versions: 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, Core 5, Core 4, Core 3, Core 2, Core
- Ubuntu 22.04 Linux (Latest)
- Previous Ubuntu Linux versions: 20.04, 18.04, 16.04
- N.B.: May be made available for other Unix-Like systems upon request
Microsoft Windows
- 64-bit - Windows 7 (Latest)
- 32-bit - Windows XP
Apple Macintosh
- Mac OS X - 10.8.4 (Latest)
- Previous OS X versions: 10.6.8, 10.5
An evaluation account for the hosted version of the RecoMadeEasy® Speaker Recognition software may be made available to interested organizations.

Large-Vocabulary Speech Recognition
Available for English, Spanish, Mandarin, Arabic, and German
Also Available in Bilingual Spanish-English, Mandarin-English, Arabic-English, and German-English
(Customizable domain full transcription ~ 240,000+ word vocabulary)

Embedded
Server Based

Face Recognition
(face detection and recognition)

Embedded
Server Based

Object Recognition
(object detection and recognition)

Embedded
Server Based

Interactive Voice Response (IVR)
(Graph-based logic, easily configured)

Product Details

Automatic Language Proficiency Rating (ALPR)
(Multi-lingual automated language proficiency rating)

Signature Recognition
Status: Advanced Development Stage

Keystroke Recognition
Status: Research Stage

For further information, please contact us at 1-800-215-0841 inside the U.S. or +1-914-997-5676 from any other country. Alternatively, you may send an email to info@recotechnologies.com.