suflaj t1_ivuj62a wrote

Yeah, as I said previously, Google is a master of it - e.g., look at the Pixel 7's ASR.

I believe it's still called ASR.

1

Snickersman6 t1_ivum0nx wrote

You mentioned automatic speech recognition, which is not really what I was asking about; I was asking about speaker diarization. The link below goes over the differences. It may be a part of ASR, but I don't know if it does that on its own as part of the speech recognition.

https://deepgram.com/blog/what-is-speaker-diarization/

1

suflaj t1_ivumz3h wrote

It has not been marketed as such because it's built on top of ASR. Hence, you search for ASR and then look through its features. In the same way, you look for object detection, and if you need segmentation, you check whether it has a detector that does segmentation. A layman looking for a solution does not search for specific terms, and marketers know this.

Be that as it may, the answer remains the same: Google offers the most advanced and performant solution. It markets it as ASR or, as they call it, speech-to-text, with this so-called diarization being one feature of it.
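
To make "diarization as a feature on top of ASR" a bit more concrete, here is a rough sketch, not any vendor's actual API. It assumes the recognizer hands back words that each carry a speaker tag; `Word` and `to_turns` are hypothetical names made up for illustration. Grouping consecutive words by that tag gives you the "who said what" view.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Word:
    text: str
    speaker: int  # hypothetical speaker tag, assumed to come with each recognized word


def to_turns(words):
    """Group consecutive words that share a speaker tag into (speaker, text) turns."""
    return [
        (speaker, " ".join(w.text for w in group))
        for speaker, group in groupby(words, key=lambda w: w.speaker)
    ]


# Made-up transcript, only to show the shape of the data.
words = [
    Word("hello", 1), Word("there", 1),
    Word("hi", 2),
    Word("how", 1), Word("are", 1), Word("you", 1),
]

for speaker, text in to_turns(words):
    print(f"Speaker {speaker}: {text}")
```

The point is only that the transcript comes from the ASR part, and the speaker labels are extra metadata layered on top of it.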

2