Fast search algorithms (FSAs) used for variable block size motion estimation follow irregular search (data access) patterns, which poses the main challenge in designing hardware architectures for them. In this study, we build a baseline architecture for fast search algorithms using state-of-the-art components available in academia. We improve its performance by introducing: (1) a super 2-dimensional (2-D) random access memory architecture that reads two rows or two columns at a time, in regular or interleaved fashion, as opposed to the one-row or one-column accessibility of the state of the art; and (2) a 2-D processing element array with a tuned interconnect that supports the neighborhood connections required by conventional fast search algorithms and exploits on-chip data reuse. Results show that, compared to the baseline architecture, our design increases system throughput by up to 85.47% and reduces power by up to 13.83%, at the cost of an area increase of up to 65.53% in the worst case.
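To make the irregular access pattern concrete, the following is a minimal software sketch of one conventional fast search, a small diamond search driven by the sum of absolute differences (SAD). It is an illustration only, not the architecture or the algorithm set evaluated in this study; the function names, the 4x4 block size, and the search range are assumptions chosen for the example.

```python
# Illustrative sketch only: names, block size, and search range are assumptions.

def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by the candidate motion vector (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

# Offsets of the four neighbors tested around the current best candidate.
SMALL_DIAMOND = [(0, -1), (-1, 0), (1, 0), (0, 1)]

def small_diamond_search(cur, ref, bx, by, n, search_range):
    """Greedy descent on the SAD surface. Returns (best_mv, best_cost, trace),
    where `trace` lists every candidate visited, in order."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    trace = [best]
    while True:
        cx, cy = best  # center of the diamond for this iteration
        improved = False
        for ox, oy in SMALL_DIAMOND:
            dx, dy = cx + ox, cy + oy
            # Skip candidates outside the search window or the reference frame.
            if abs(dx) > search_range or abs(dy) > search_range:
                continue
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            trace.append((dx, dy))
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost, improved = (dx, dy), cost, True
        if not improved:
            return best, best_cost, trace

# Usage: a synthetic 16x16 frame pair whose content is shifted 2 pixels horizontally.
ref = [[x * x for x in range(16)] for _ in range(16)]
cur = [[(x + 2) * (x + 2) for x in range(16)] for _ in range(16)]
mv, cost, trace = small_diamond_search(cur, ref, bx=4, by=4, n=4, search_range=3)
print(mv, cost, trace)
```

The order of entries in `trace` depends on the image content, so the reference blocks fetched from memory cannot be scheduled in advance as they can for a full search. Successive candidates also overlap heavily, which is the kind of data reuse an on-chip processing element array can exploit.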
ASJC Scopus subject areas
- Control and Systems Engineering
- General Computer Science
- Electrical and Electronic Engineering