TY - GEN
T1 - Arasid
T2 - International Conference on Embedded Wireless Systems and Networks, EWSN 2019
AU - Chen, Zeya
AU - Ahmed, Mohsin Y.
AU - Salekin, Asif
AU - Stankovic, John A.
N1 - Publisher Copyright:
© 2019 by the authors.
PY - 2019
Y1 - 2019
N2 - Indoor speaker identification systems have been researched for a long time and are widely used in many human interaction acoustic monitoring systems. Many works have focused on improving accuracy in dealing with different realisms, including noise and varying distances from the microphone. However, these works either require significant extra effort such as measuring room types and dimensions, obtaining many speakers’ samples, or requiring expensive hardware such as microphone arrays and complex deployment settings. In this paper, we introduce a complete speaker identification solution using an artificial reverberation generator with different parameters to adjust the original close-distance speech samples so that each speaker has different artificial voice samples. Samples in different environments are not required because these artificial samples are close approximations to different environments. Two kinds of models, GMM-UBM and the i-vector, are evaluated. The models are trained on all samples separately, and testing is done against all in parallel. A score fusing approach with two thresholds, a minimum value and a minimum difference, is applied to the scores in producing the final result. Also, several standard acoustic pre-processing routines, including a voice activity detection algorithm and an overlapped speech remover, are included to make the system fully deployable. Finally, to assess the improvements when applying a reverberation adjustment, we evaluate our system with two literature speech databases, one has 251 people and the other one has four kinds of emotions. Further, we perform an in-lab speaking experiment. The evaluation results show our system has more than 90% accuracy in identifying speakers within 6 meters if the emotion is neutral, and a 10% improvement over no reverberation adjustments when speakers have non-neutral emotions.
AB - Indoor speaker identification systems have been researched for a long time and are widely used in many human interaction acoustic monitoring systems. Many works have focused on improving accuracy in dealing with different realisms, including noise and varying distances from the microphone. However, these works either require significant extra effort such as measuring room types and dimensions, obtaining many speakers’ samples, or requiring expensive hardware such as microphone arrays and complex deployment settings. In this paper, we introduce a complete speaker identification solution using an artificial reverberation generator with different parameters to adjust the original close-distance speech samples so that each speaker has different artificial voice samples. Samples in different environments are not required because these artificial samples are close approximations to different environments. Two kinds of models, GMM-UBM and the i-vector, are evaluated. The models are trained on all samples separately, and testing is done against all in parallel. A score fusing approach with two thresholds, a minimum value and a minimum difference, is applied to the scores in producing the final result. Also, several standard acoustic pre-processing routines, including a voice activity detection algorithm and an overlapped speech remover, are included to make the system fully deployable. Finally, to assess the improvements when applying a reverberation adjustment, we evaluate our system with two literature speech databases, one has 251 people and the other one has four kinds of emotions. Further, we perform an in-lab speaking experiment. The evaluation results show our system has more than 90% accuracy in identifying speakers within 6 meters if the emotion is neutral, and a 10% improvement over no reverberation adjustments when speakers have non-neutral emotions.
KW - Distance
KW - Reverberation
KW - Speaker identification
UR - http://www.scopus.com/inward/record.url?scp=85120792639&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85120792639&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85120792639
SN - 9780994988638
T3 - International Conference on Embedded Wireless Systems and Networks
SP - 154
EP - 165
BT - International Conference on Embedded Wireless Systems and Networks, EWSN 2019
A2 - Liu, Yunhao
A2 - Xing, Guoliang
PB - Junction Publishing
Y2 - 25 February 2019 through 27 February 2019
ER -