Micropower Analog VLSI Continuous Speech Recognition

Abstract

Driven by the proliferation of portable devices such as cellular phones, personal digital assistants (PDAs), and smart wristwatches, demand for efficient and robust user interfaces has grown steadily. An intelligent speech interface offers an attractive alternative to other means of communication and provides hands-free communication with these portable devices. Miniature handheld and wrist-worn devices require extremely low-power solutions to support the use of very small batteries. Micropower analog VLSI provides a viable technology for implementing a speech recognition user interface efficiently enough to run off a wristwatch battery. From a computational perspective, parallel analog techniques are feasible because most of the computation involved in recognition is probabilistic in nature and does not require high precision.

In the first part of the project, we designed and developed efficient speech processing and recognition algorithms for small-vocabulary systems, with a view to efficient implementation in analog hardware. A flexible and scalable design approach allowed us to reduce the complexity of the hardware by trading implementation accuracy for reduced silicon area and power dissipation. Theoretical research in this area has resulted in forward decoding kernel machines (FDKM), a maximum a posteriori (MAP) sequence decoding scheme that combines traditional hidden Markov models (HMMs) with support vector machines (SVMs). The SVMs process acoustic features and produce HMM transition probabilities, and an HMM forward decoding block integrates these probabilities to discriminate between phonetic utterances. The performance of FDKM depends on the discriminative ability of the margin classifier (the SVM) that generates these probabilities. Further investigation in this area has led to the development of the Gini support vector machine (GiniSVM), a sparse large-margin classifier that generates normalized output probability scores. Both GiniSVM and FDKM have demonstrated state-of-the-art performance on various signal processing tasks in speech and image recognition.
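The forward decoding step described above can be illustrated with a minimal sketch. It assumes that, at each frame, a classifier has already produced a matrix of transition probabilities conditioned on the acoustic features, with each column summing to one. The function name `forward_decode`, the data layout, and the toy numbers are hypothetical and chosen only for illustration; they are not taken from the project's software or hardware.

```python
def forward_decode(transitions, alpha0):
    """Run a forward recursion over per-frame transition matrices.

    transitions[n][i][j] is assumed to hold P(state i at frame n |
    state j at frame n-1, features at frame n), so each column sums to 1.
    alpha0 is the initial state distribution.
    Returns the most probable state at each frame and the final distribution.
    """
    alpha = list(alpha0)
    path = []
    for P in transitions:
        # alpha_i[n] = sum_j P_ij[n] * alpha_j[n-1]
        alpha = [sum(P[i][j] * alpha[j] for j in range(len(alpha)))
                 for i in range(len(P))]
        # Renormalize to guard against numerical drift.
        total = sum(alpha)
        alpha = [a / total for a in alpha]
        # Pick the state with the largest forward probability at this frame.
        path.append(max(range(len(alpha)), key=alpha.__getitem__))
    return path, alpha


# Toy example with two states and two frames (made-up probabilities).
frames = [
    [[0.9, 0.2], [0.1, 0.8]],
    [[0.1, 0.05], [0.9, 0.95]],
]
path, alpha = forward_decode(frames, [0.5, 0.5])
print(path, alpha)
```

Because the recursion only runs forward in time, a decision is available at every frame without waiting for the end of the utterance, which is what makes this style of decoding suitable for real-time use.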

In the second part of the project, the GiniSVM and FDKM algorithms were mapped onto a parallel architecture and implemented in low-power current-mode CMOS analog VLSI. Non-volatile floating-gate MOS storage provides full analog programmability and trainability throughout all stages of the architecture. A calibration scheme, coupled with a chip-in-loop retraining procedure, cancels imprecision due to fabrication-induced mismatch in the analog circuit implementation. A GiniSVM/FDKM processor was prototyped and fabricated in 0.5 µm CMOS technology. In experiments on a speaker verification task, the chip achieved real-time recognition accuracy on par with floating-point software while consuming sub-microwatt power.

Further materials resulting from this project: http://bach.ece.jhu.edu/catalyst/fdkm

Publications

“Sub-Microwatt Analog VLSI Support Vector Machine for Pattern Classification and Sequence Estimation,” S. Chakrabartty and G. Cauwenberghs, Adv. Neural Information Processing Systems (NIPS’2004), Cambridge: MIT Press, vol. 17, 2005.

“Spike Sorting with Support Vector Machines,” R.J. Vogelstein, K. Murari, P.H. Thakur, G. Cauwenberghs, S. Chakrabartty and C. Diehl, Proc. 26th Ann. Int. Conf. IEEE Engineering in Medicine and Biology Society (EMBS’2004), San Francisco, Sept. 1-4, 2004 (Region 2 Finalist, EMBS-Whitaker Student Paper Competition).

“Analog Auditory Perception Model for Robust Speech Recognition,” Y. Deng, S. Chakrabartty and G. Cauwenberghs, Proc. IEEE Int. Joint Conf. Neural Networks (IJCNN’2004), Budapest Hungary, July 25-29, 2004.

“Robust Speech Feature Extraction by Growth Transformation in Reproducing Kernel Hilbert Space,” S. Chakrabartty, Y. Deng and G. Cauwenberghs, Proc. IEEE Int. Conf. Acoustics Speech and Signal Processing (ICASSP’2004), Montreal Canada, May 17-21, 2004.

“Margin Propagation and Forward Decoding in Analog VLSI,” S. Chakrabartty and G. Cauwenberghs, Proc. IEEE Int. Symp. Circuits and Systems (ISCAS’2004), Vancouver Canada, May 23-26, 2004.

“Three-Decade Programmable Fully Differential Linear OTA,” Y. Deng, S. Chakrabartty and G. Cauwenberghs, Proc. IEEE Int. Symp. Circuits and Systems (ISCAS’2004), Vancouver Canada, May 23-26, 2004.

“Silicon Support Vector Machine with On-Line Learning,” R. Genov, S. Chakrabartty and G. Cauwenberghs, Int. J. Pattern Recognition and Artificial Intelligence, vol. 17 (3), pp. 385-404, 2003.

“Sparse Probability Regression by Label Partitioning,” S. Chakrabartty, G. Cauwenberghs and Jayadeva, Proc. 16th Conf. Computational Learning Theory (COLT’03), Washington DC, Aug. 24-27, 2003.

“Power Dissipation Limits and Large Margin in Wireless Sensors,” S. Chakrabartty and G. Cauwenberghs, Proc. IEEE Int. Symp. Circuits and Systems (ISCAS’2003), Bangkok Thailand, May 25-28, 2003.

“Robust Cephalometric Landmark Identification Using Support Vector Machines,” S. Chakrabartty, M. Yagi, T. Shibata and G. Cauwenberghs, Proc. IEEE Int. Conf. Acoustics Speech and Signal Processing (ICASSP’2003), Hong Kong, Apr. 6-10, 2003.

“Expectation Maximization of Forward Decoding Kernel Machines,” S. Chakrabartty and G. Cauwenberghs, Proc. 9th Int. Workshop Artificial Intelligence and Statistics (AISTATS’2003), Key West FL, Jan. 3-6, 2003.

“Forward-Decoding Kernel-Based Phone Sequence Recognition,” S. Chakrabartty and G. Cauwenberghs, Adv. Neural Information Processing Systems (NIPS’2002), Cambridge: MIT Press, vol. 15, 2003.

“Forward Decoding Kernel Machines: A Hybrid HMM/SVM Approach to Sequence Recognition,” S. Chakrabartty and G. Cauwenberghs, Proc. SVM’2002, Lecture Notes in Computer Science, vol. 2388, pp. 278-292, 2002.

“Sequence Estimation and Channel Equalization Using Forward Decoding Kernel Machines,” S. Chakrabartty and G. Cauwenberghs, Proc. IEEE Int. Conf. Acoustics Speech and Signal Processing (ICASSP’2002), Orlando FL, May 13-17, 2002.

“Hybrid Support Vector Machine, Hidden Markov Model Approach for Continuous Speech Recognition,” S. Chakrabartty and G. Cauwenberghs, Proc. 43rd IEEE Midwest Symp. Circuits and Systems (MWSCAS’2000), Lansing MI, August 8-11, 2000.

Report

Link to PDF: Final Report