UWEE Tech Report Series

High-speed, Low-Resource ASR Back-end Based on Custom Arithmetic


Xiao Li, Jonathan Malkin, Jeff Bilmes

low power speech recognition, lookup tables, custom arithmetic, high speed, low resource, quantization, normalization, bitwidth allocation, word error reduction


With the skyrocketing popularity of mobile devices, low-resource systems increasingly need processing methods tailored to a specific application. This work presents a high-speed, low-resource speech recognition system using custom arithmetic units, in which all system variables are represented by integer indices and all arithmetic operations are replaced by hardware-based table lookups. To this end, several reordering and rescaling techniques, including a linear/tree-structure accumulation for Gaussian evaluation and a novel method for normalizing Viterbi search scores, are proposed to ensure low entropy for all variables. Furthermore, a discriminatively inspired distortion measure is investigated for scalar quantization to minimize degradation in recognition rate. Finally, heuristic algorithms are explored to optimize system-wide resource allocation. Our best bit-width allocation scheme requires only 59 kB of ROM to hold the lookup tables, and its recognition performance with various vocabulary sizes in both clean and noisy conditions is nearly as good as that of a system using a 32-bit floating-point unit. Simulations on various architectures show that, on most modern processor designs, we can expect a cycle-count speedup of at least three times over systems with floating-point units. Additionally, memory bandwidth is reduced by over 70% and offline storage for model parameters is reduced by 80%.
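
The core idea of the abstract, representing variables as integer indices and replacing arithmetic with table lookups, can be illustrated with a minimal Python sketch. It is not the report's implementation: the uniform codebooks, the value ranges, the bit-widths, and all function names (make_codebook, quantize, build_add_table) are assumptions chosen for illustration, whereas the report studies optimized quantizers and bit-width allocations.

```python
import bisect

def make_codebook(lo, hi, bits):
    """Uniform codebook with 2**bits levels on [lo, hi] (illustrative assumption)."""
    n = 1 << bits
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def quantize(x, codebook):
    """Return the integer index of the codeword nearest to x (codebook is sorted)."""
    i = bisect.bisect_left(codebook, x)
    if i == 0:
        return 0
    if i == len(codebook):
        return len(codebook) - 1
    # Pick whichever neighbour is closer to x.
    return i if codebook[i] - x < x - codebook[i - 1] else i - 1

def build_add_table(cb_a, cb_b, cb_out):
    """Precompute table[i][j] = index in cb_out of cb_a[i] + cb_b[j]."""
    return [[quantize(a + b, cb_out) for b in cb_b] for a in cb_a]

# Hypothetical bit-widths: 4-bit operands, 5-bit result.
cb_in = make_codebook(-4.0, 4.0, 4)
cb_out = make_codebook(-8.0, 8.0, 5)
ADD = build_add_table(cb_in, cb_in, cb_out)

# At run time only indices are stored and combined; no arithmetic unit is used.
i = quantize(1.3, cb_in)
j = quantize(-0.4, cb_in)
k = ADD[i][j]        # "addition" is a single table lookup
approx = cb_out[k]   # decoded only when a real value is needed
```

In this sketch the table for one operation has 2^4 x 2^4 entries of 5 bits each, which shows why the bit-width assigned to each variable directly determines the ROM budget.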
