Speaker diarization, which estimates speakers from speech without prior training, is now available as a free option for speech recognition APIs, even for real-time recognition

The "Speaker Diarization" feature, a free option of the AmiVoice API provided by the "AmiVoice Cloud Platform," a voice tech platform for developers, is now available for real-time recognition.

Speaker diarization is a technology that estimates who spoke and when for audio containing multiple speakers. Using Advanced Media's proprietary acoustic model, it estimates the speaker from the audio without prior training and automatically links the spoken content to the speaker. It can be used in situations where multiple people are speaking, such as meetings, face-to-face sales, interviews, and adding subtitles to videos.
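As a rough illustration of what "linking the spoken content to the speaker" means in practice, the sketch below merges consecutive recognized words that share a speaker label into speaker turns. The `Token` structure, its field names, and the `speaker0`/`speaker1` labels are assumptions made for this example only; they are not the AmiVoice API response format.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Token:
    """One recognized word with timing and an estimated speaker label.

    Hypothetical structure for illustration; not the AmiVoice API schema.
    """
    text: str
    start_ms: int
    end_ms: int
    speaker: str


def group_by_speaker(tokens):
    """Merge consecutive tokens from the same speaker into one turn."""
    turns = []
    for speaker, run in groupby(tokens, key=lambda t: t.speaker):
        run = list(run)
        turns.append({
            "speaker": speaker,
            "start_ms": run[0].start_ms,   # when the turn begins
            "end_ms": run[-1].end_ms,      # when the turn ends
            "text": " ".join(t.text for t in run),
        })
    return turns


# Made-up sample data: two speakers alternating in a short exchange.
tokens = [
    Token("Good", 0, 300, "speaker0"),
    Token("morning", 300, 800, "speaker0"),
    Token("Hello", 900, 1300, "speaker1"),
    Token("everyone", 1300, 1900, "speaker1"),
    Token("Shall", 2000, 2300, "speaker0"),
    Token("we", 2300, 2400, "speaker0"),
    Token("start", 2400, 2900, "speaker0"),
]

for turn in group_by_speaker(tokens):
    print(f'{turn["speaker"]} [{turn["start_ms"]}-{turn["end_ms"]} ms]: {turn["text"]}')
```

Grouping only consecutive tokens keeps the turns in chronological order, which is the shape typically wanted for meeting minutes or subtitles.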

Previously, it was only available for batch recognition, but now it is also available for real-time recognition.

The speaker diarization function can be used not only when transcribing audio files, but also when recognizing speech in real time as it is spoken.


Features of AmiVoice API


1. No. 1 voice recognition market share. An engine that excels at Japanese and technical terminology

AmiVoice is a highly accurate, high-speed voice recognition engine built on more than 25 years of accumulated know-how and data. It is used in a wide range of settings, from general business to highly specialized workplaces.


2. Starting from 99 yen per hour. High-quality voice recognition available at low cost

Billing is pay-as-you-go and based only on the duration of speech, not the length of the recording. Usage is metered by the second, with no rounding up. Starting from 99 yen (tax included) per hour, you can use a high-quality voice recognition engine at the lowest price in the industry.
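The difference between time spoken and time recorded can be made concrete with a small calculation. The sketch below is an illustration only: the segment boundaries are made-up sample data, and the hourly rate is passed in as a parameter, with the value used here being a placeholder rather than an actual AmiVoice price.

```python
def billable_seconds(speech_segments):
    """Sum the durations of (start, end) speech segments, in seconds."""
    return sum(end - start for start, end in speech_segments)


def cost_yen(speech_segments, yen_per_hour):
    """Cost for the spoken time only, with no rounding of the duration."""
    return billable_seconds(speech_segments) * yen_per_hour / 3600


# A 90-second recording in which speech occupies three segments.
recording_seconds = 90.0
segments = [(0.0, 12.5), (20.0, 47.5), (60.0, 80.0)]

spoken = billable_seconds(segments)
print(f"recorded: {recording_seconds} s, billable speech: {spoken} s")

# Placeholder rate for illustration; substitute the rate of your plan.
print(f"cost: {cost_yen(segments, yen_per_hour=60.0)} yen")
```

Under this model, silence between utterances adds to the recording length but not to the charge.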


3. Achieving high recognition rates with engines that can be selected according to the industry and application

In addition to "general-purpose engines" that can be used across a variety of situations and businesses, we also offer engines specialized for technical and industry-specific terminology, such as that used in the medical field. Recognition rates can be greatly improved by selecting an engine that suits the usage scenario.

You can also use all speech recognition engines for free for 60 minutes each month.

■AmiVoice API Details

https://acp.amivoice.com/amivoice_api/


What is AmiVoice Cloud Platform?



A voice tech platform for developers that provides voice-related technologies, centered on voice recognition. In addition to the "AmiVoice API," which lets you use AmiVoice, Japan's No. 1* high-precision, high-speed AI voice recognition engine, at the lowest price in the industry, from 99 yen (tax included) per hour, the platform also offers the voice recognition development kit "AmiVoice SDK." You can choose the interface that best suits your purpose, such as cloud or on-premise, real-time or batch.

Services for contact center product developers using Amazon Connect are also available.


■AmiVoice Cloud Platform details

https://acp.amivoice.com/

*Source: ecarlate “Speech Recognition Market Trends 2022” Speech Recognition Software/Cloud Service Market
