A single comprehensive software engine for enabling Speech, Speaker, Face, Object, Emotion Recognition, Translation, Access Controls, and much more, using a unified set of APIs designed for Integrators and Software Developers -- works standalone (Android and Linux) and in client/server mode
RecoMadeEasy Embedded AudioVisual Recognition Engine by Recognition Technologies, Inc.
  • AudioVisual Recognition (Embedded) (Server Based)
    (Combination of Speaker, Speech, Face Recognition, and Object Detection and Recognition with a single interface)

  • Large-Vocabulary Speech Recognition (Embedded) (Server Based)
    Initially available for English, Spanish, Mandarin, Arabic, and German; now available for 100+ languages
    Also includes multilingual support and code-switching
    (Customizable domain full transcription ~ 300,000+ word vocabulary)
    Server-Based Large-Vocabulary Speech Recognition


    RecoMadeEasy® Large-Vocabulary Speech Recognition is a standalone natural language speech recognition engine that offers comprehensive conversational voice interaction through several mechanisms, including websockets, a C++ API, and a web interface. The engine has a small memory footprint and is designed to run natively on devices that need unconstrained natural language interfaces with high recognition accuracy, even during service interruptions or when full, uninterrupted, and secure access to a cloud server is not guaranteed.

    The RecoMadeEasy® Speech Recognition engine has been developed in our research labs in New York. When presented with an audio or audio-video stream, the engine returns, via the API, JSON or XML results containing the full transcript of the stream with a configurable number of contending results. The results include a likelihood score as well as a confidence score for each result. The engine also returns the timestamps of turns of audio (sentences). It is also capable of returning the timestamps for the words in the transcript, allowing for alignment and manipulation of the results alongside the original audio stream.
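
    As an illustration of how a client might consume such results, the following Python sketch parses a JSON payload, picks the contending result with the highest confidence score, and converts word timestamps into sample offsets for alignment with the original audio. The payload layout and every field name in it (turns, alternatives, transcript, likelihood, confidence, words, start, end) are assumptions made for this example and are not taken from the engine's documented schema.

```python
import json

# Hypothetical result payload. All field names here are assumptions made for
# illustration; consult the SDK documentation for the actual schema.
SAMPLE_RESULT = """
{
  "turns": [
    {
      "start": 0.25,
      "end": 1.25,
      "alternatives": [
        {
          "transcript": "turn on the lights",
          "likelihood": -42.5,
          "confidence": 0.93,
          "words": [
            {"word": "turn",   "start": 0.25,  "end": 0.5},
            {"word": "on",     "start": 0.5,   "end": 0.625},
            {"word": "the",    "start": 0.625, "end": 0.75},
            {"word": "lights", "start": 0.75,  "end": 1.25}
          ]
        },
        {
          "transcript": "turn on the light",
          "likelihood": -44.0,
          "confidence": 0.61,
          "words": []
        }
      ]
    }
  ]
}
"""


def best_alternative(turn):
    """Return the contending result with the highest confidence score."""
    return max(turn["alternatives"], key=lambda alt: alt["confidence"])


def word_sample_ranges(alternative, sample_rate):
    """Map each word's start/end time (in seconds) to audio sample indices."""
    return [
        (w["word"], round(w["start"] * sample_rate), round(w["end"] * sample_rate))
        for w in alternative["words"]
    ]


result = json.loads(SAMPLE_RESULT)
turn = result["turns"][0]
best = best_alternative(turn)
ranges = word_sample_ranges(best, sample_rate=16000)

print(best["transcript"], best["confidence"])
for word, first_sample, last_sample in ranges:
    print(word, first_sample, last_sample)
```

    With sample offsets in hand, a client could cut the original audio at word boundaries, for example to replay or redact a single word.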

    The engine is built to allow users to speak naturally and be understood – even in a far-field, noisy environment. RecoMadeEasy® (Reco Made Easy) is available as an SDK with an included API that contains all necessary components for full integration and enables engineers to get started easily and without any work or costs for development.

    The RecoMadeEasy® AudioVisual Speech Recognition engine is also available as a server-side and a standalone product.

    This engine provides one of the most accurate transcriptions for English, handling many different dialects and accents in a single large-vocabulary transcription engine. It is also capable of real-time processing in a small memory footprint.

    The speech recognition uses a streaming interface in which the recognizer, in the form of a TCP/IP listener, runs on any device. Any lightweight generic client capable of using a websocket interface may stream audio/video to a listener in any codec supported by GStreamer-1.0, including MP3, Ogg Vorbis, Free Lossless Audio Codec (FLAC), MP4, Pulse Code Modulation (PCM), or other codecs such as those supported by the standard Waveform Audio File Format (WAVE). The client gets back real-time transcript results with optional alternative results, including likelihood scores.
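
    The client side of such a streaming exchange can be sketched in Python as follows. This is a minimal, transport-agnostic sketch, not the product's client: it only shows how raw PCM audio might be split into fixed-duration chunks and pushed to a sender. The send callable is a placeholder for whatever websocket client is in use, and the function names, chunk duration, and audio format (16 kHz, 16-bit mono) are assumptions chosen for the example.

```python
from typing import Callable, Iterator


def pcm_chunks(audio: bytes, sample_rate: int, sample_width: int,
               chunk_ms: int) -> Iterator[bytes]:
    """Split mono PCM audio into fixed-duration chunks for streaming."""
    samples_per_chunk = sample_rate * chunk_ms // 1000
    chunk_bytes = samples_per_chunk * sample_width  # stays sample-aligned
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for offset in range(0, len(audio), chunk_bytes):
        yield audio[offset:offset + chunk_bytes]


def stream_audio(send: Callable[[bytes], None], audio: bytes,
                 sample_rate: int = 16000, sample_width: int = 2,
                 chunk_ms: int = 100) -> int:
    """Push audio to `send` chunk by chunk and return the number of chunks.

    `send` is a hypothetical stand-in for the transport, such as the
    binary-send method of a websocket connection to a listener.
    """
    count = 0
    for chunk in pcm_chunks(audio, sample_rate, sample_width, chunk_ms):
        send(chunk)
        count += 1
    return count


# Demo: one second of 16 kHz, 16-bit silence, collected in a list
# instead of being written to a socket.
sent = []
one_second = bytes(16000 * 2)
chunks_sent = stream_audio(sent.append, one_second)
print(chunks_sent, len(sent[0]))
```

    Keeping the transport behind a plain callable lets the same chunking logic be reused with any websocket library, and makes it easy to exercise without a running listener.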

    Supported Languages

      The RecoMadeEasy® (Reco Made Easy) Speech Recognition engine is currently available for the following languages:


      • All dialects of English
      • Major dialects of Spanish
      • Major dialects of Mandarin Chinese
      • Major dialects of Arabic
      • Major dialects of German
      • Multi-Lingual Support for 100+ Languages (includes Code-Switching)

    Supported Operating Systems

      The RecoMadeEasy® Speech Recognition engine is available for the following operating systems. The C++ SDK, command-line interface, and web services may be used in any of the following systems:

    Linux (both 32-bit and 64-bit versions are supported)

    • Fedora 40 Linux (Latest)
    • Previous Fedora Linux versions: 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, Core 5, Core 4, Core 3, Core 2, Core

    • Ubuntu 24.04 Linux (Latest)

    • Previous Ubuntu Linux versions: 22.04, 20.04, 18.04, 16.04

    • CentOS 8 and 7.9 Linux (Latest)
    • Previous CentOS Linux versions: 7.3, 7.2, 7.1, 7.0, 6.6, 6.4, 6.3, 6.2, 5.7, 5.6, 5.4

    • N.B.: May be made available for other Unix-Like systems upon request

  • Speaker Recognition (Embedded) (Server Based)
    (Language- and Text-Independent, aka: Speaker Biometrics, Voice Biometrics, or SIV)
    Recipient: Frost & Sullivan Award 2011

  • Face Recognition (Embedded) (Server Based)
    (Face detection and recognition)

  • Object Recognition (Embedded) (Server Based)
    (Object detection and recognition)

For further information, please contact us at 1-800-215-0841 inside the U.S. or +1-914-997-5676 from any other country. Alternatively, you may send an email to Recognition Technologies, Inc.