In:
The Journal of the Acoustical Society of America, Acoustical Society of America (ASA), Vol. 36, No. 10_Supplement (1964-10-01), p. 1988
Abstract:
Preliminary results are given on a comparative study of various objective talker-recognition procedures, based on spectrographic analysis of 7 replicate utterances of each of 10 words by each of 10 different speakers. The spectrograms are quantized into 17 frequency channels and approximately 50 time channels. Different summarizations are applied to the spectrograms, including marginal energies, totalled across time, in each frequency channel; marginal energies for each time channel; and momentlike descriptions of energy distribution of the time margin. Various combinations of these summarizations were used as inputs to different multivariate distance measures, including (a) distance from unknown to a speaker centroid, using a metric based on a covariance matrix pooled over all speakers; (b) distances based on eigenvectors, using a classical discriminant-analysis approach; (c) distances based on metrics, employing individual speaker covariance matrices. Percent correct identification varied from 22% (discriminant analysis, using one eigenvector of energy margin on time) to 97% [distance (a) applied to the frequency margins]. Frequency classification of energy is better than time classification; distance (a) is better than the others; certain words are much better than others.
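Illustration:
Distance (a) in the abstract, the distance from an unknown utterance to each speaker centroid under a metric derived from a covariance matrix pooled over all speakers, can be read as a Mahalanobis-type nearest-centroid rule. The following is a minimal sketch under that reading, not the authors' procedure. The function names, the use of a pseudo-inverse, and the synthetic data (10 speakers, 7 utterances each, 17 frequency-margin values per utterance) are assumptions made for illustration only; no data from the study are reproduced.

```python
import numpy as np


def pooled_covariance(features, labels):
    """Within-speaker covariance pooled over all speakers (hypothetical helper)."""
    speakers = np.unique(labels)
    dim = features.shape[1]
    scatter = np.zeros((dim, dim))
    for s in speakers:
        x = features[labels == s]
        centered = x - x.mean(axis=0)
        scatter += centered.T @ centered
    # Divide by the pooled degrees of freedom: samples minus number of speakers.
    return scatter / (len(features) - len(speakers))


def classify(unknown, features, labels):
    """Return the speaker whose centroid is nearest to `unknown` in the pooled metric."""
    speakers = np.unique(labels)
    centroids = np.stack([features[labels == s].mean(axis=0) for s in speakers])
    # Pseudo-inverse is an assumption here, chosen to tolerate a near-singular matrix.
    inv_cov = np.linalg.pinv(pooled_covariance(features, labels))
    diffs = centroids - unknown
    # Squared distance diff^T * inv_cov * diff for every centroid at once.
    d2 = np.einsum("ij,jk,ik->i", diffs, inv_cov, diffs)
    return speakers[np.argmin(d2)]


# Synthetic stand-in data: 10 speakers x 7 utterances x 17 frequency channels.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 7)
speaker_means = rng.normal(size=(10, 17)) * 5.0
features = speaker_means[labels] + rng.normal(size=(70, 17)) * 0.1

print(classify(speaker_means[3], features, labels))
```

Pooling the covariance over speakers gives a single metric estimated from all 70 utterances, whereas individual speaker covariance matrices, as in distance (c), would each have to be estimated from only 7 utterances in this synthetic setup.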
Type of Medium:
Online Resource
ISSN:
0001-4966, 1520-8524
Language:
English
Publisher:
Acoustical Society of America (ASA)
Publication Date:
1964
ZDB ID:
1461063-2