C. Julian Chen, Donald A. Miller
This study uses simultaneously acquired voice and electroglottograph (EGG) signals to gain a better understanding of the human voice production process, to segment voice signals pitch-synchronously, and to produce high-resolution visual representations of pitch marks and timbre spectra.
The traditional spectrogram segments the voice signal with a processing window of fixed size and fixed shift, multiplies each segment by a window function, typically a Hamming window, and then applies a fast Fourier transform. The power spectrum is then displayed as a function of both frequency and time. As a result, pitch information and timbre information are mixed. The new design segments the signal into pitch periods, using either the derivative of the EGG signal or the voice signal itself, and then applies Fourier analysis to the signal in each pitch period without a window function. The pitch information and the timbre information are thereby cleanly separated. The graphical representations of both pitch marks and timbre spectra exhibit high resolution and high accuracy.
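The segmentation step based on the EGG derivative can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pitch_marks_from_egg`, the parameters `f0_max` and `rel_threshold`, and the simple threshold-based peak picking are assumptions made for illustration, as is the EGG polarity (glottal closing is assumed to produce a positive peak in the derivative).

```python
import numpy as np

def pitch_marks_from_egg(egg, fs, f0_max=500.0, rel_threshold=0.3):
    """Return sample indices of candidate glottal closing instants.

    Assumptions (not taken from the source): the EGG polarity is such
    that glottal closing gives a positive peak in the derivative, and
    the fundamental frequency never exceeds f0_max (in Hz).
    """
    egg = np.asarray(egg, dtype=float)
    # First difference as a discrete derivative; degg[n] = egg[n] - egg[n-1].
    degg = np.diff(egg, prepend=egg[0])
    threshold = rel_threshold * degg.max()
    min_gap = int(fs / f0_max)  # shortest allowed pitch period in samples

    # Local maxima of the derivative that exceed the threshold.
    mid = degg[1:-1]
    is_peak = (mid > degg[:-2]) & (mid >= degg[2:]) & (mid > threshold)
    candidates = np.flatnonzero(is_peak) + 1

    marks = []
    for idx in candidates:
        if marks and idx - marks[-1] < min_gap:
            # Two peaks closer than the shortest period: keep the stronger.
            if degg[idx] > degg[marks[-1]]:
                marks[-1] = idx
        else:
            marks.append(idx)
    return np.array(marks, dtype=int)
```

Consecutive marks delimit one pitch period each, so the local pitch follows directly from their spacing as `fs / (marks[k + 1] - marks[k])`. Real EGG recordings are noisier than this sketch assumes and would likely need filtering before peak picking.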
Detailed analysis of simultaneously acquired voice and EGG signals provides a more precise understanding of the human voice production process. The transient theory of voice production, proposed by Leonhard Euler in the early 18th century, is substantiated with modern data. Based on the transient theory of voice production, pitch-synchronous spectrogram software is developed, which produces a visual representation of pitch marks and timbre spectra. In addition, the timbre spectrum and the power evolution pattern in each pitch period can be displayed individually.
Simultaneously acquired voice and EGG signals indicate that each glottal closing triggers a decaying elementary wave in the vocal tract. A superposition of those elementary waves constitutes voice. Based on that concept and using EGG data, a pitch-synchronous voice signal processing method is developed. The voice signal is first segmented into pitch periods, and the two ends of each period are then equalized. Fourier analysis is applied to obtain the timbre spectrum of each pitch period. A high-resolution display of the timbre spectrum is generated. The power evolution pattern in each pitch period is also displayed.
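The processing chain described above (segment into pitch periods, equalize the two ends, apply Fourier analysis without a window function) can be sketched as follows. The abstract does not specify how the ends are equalized; the linear cross-fade used here is one plausible choice and should be read as an assumption, as should the names `equalize_ends`, `period_spectrum`, `pitch_synchronous_spectra`, and the parameter `n_fade`. The output is the plain harmonic amplitude spectrum of each period, which may differ from the authors' definition of the timbre spectrum.

```python
import numpy as np

def equalize_ends(signal, start, end, n_fade=8):
    """Return the pitch period signal[start:end], made cyclic.

    Assumed method (not from the source): the last n_fade samples of
    the period are cross-faded with the n_fade samples that precede
    `start`, so the end of the period runs smoothly into its beginning.
    """
    period = np.array(signal[start:end], dtype=float)
    n_fade = int(min(n_fade, start, len(period)))
    if n_fade > 0:
        # Weights rise toward 1 at the very end of the period.
        w = np.arange(1, n_fade + 1) / (n_fade + 1)
        before = np.asarray(signal[start - n_fade:start], dtype=float)
        period[-n_fade:] = (1.0 - w) * period[-n_fade:] + w * before
    return period

def period_spectrum(period):
    """Amplitude spectrum of one cyclic period, with no window function.

    Bin k corresponds to the k-th harmonic of that period's pitch.
    """
    return np.abs(np.fft.rfft(period)) / len(period)

def pitch_synchronous_spectra(voice, marks, fs, n_fade=8):
    """Return a list of (f0 in Hz, amplitude spectrum), one per pitch period."""
    results = []
    for start, end in zip(marks[:-1], marks[1:]):
        period = equalize_ends(voice, int(start), int(end), n_fade)
        f0 = fs / (end - start)
        results.append((f0, period_spectrum(period)))
    return results
```

Because each analysis frame spans exactly one pitch period, the pitch is carried by the frame length and the spectrum carries only the harmonic amplitudes, which is the separation of pitch and timbre that the method relies on. For a perfectly periodic signal the cross-fade leaves the period unchanged.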
Human voice, Production, Analysis, Pitch period, Timbre spectra, Graphical display