TY - GEN
T1 - A coarse grained and hybrid reconfigurable architecture with flexible NOC router for variable block size motion estimation
AU - Verma, Ruchika
AU - Akoglu, Ali
PY - 2008
Y1 - 2008
N2 - This paper proposes a novel application-specific hybrid coarse-grained reconfigurable architecture with a flexible network-on-chip (NoC) mechanism. The architecture supports variable block size motion estimation (VBSME) with far fewer resources than ASIC-based and coarse-grained reconfigurable architectures. The intelligent NoC router supports the full search motion estimation algorithm as well as fast search algorithms such as diamond, hexagon, big hexagon, and spiral. Our model is a hierarchical, hybrid processing-element-based 2D architecture that supports reuse of reference frame blocks between the processing elements through the NoC routers, which reduces transactions to and from main memory. The proposed architecture is described in Verilog HDL and synthesized with a 90 nm CMOS standard cell library. Results show that our architecture reduces the gate count by 7x compared to its ASIC counterpart, which supports only the full search method. Moreover, the proposed architecture operates at a frequency comparable to that of the ASIC-based implementation to sustain 30 fps. Our approach is based on a simple design that exploits a high level of parallelism with intensive data reuse. The proposed architecture therefore supports run-time reconfiguration for any block size and any search pattern, depending on the application requirement.
AB - This paper proposes a novel application-specific hybrid coarse-grained reconfigurable architecture with a flexible network-on-chip (NoC) mechanism. The architecture supports variable block size motion estimation (VBSME) with far fewer resources than ASIC-based and coarse-grained reconfigurable architectures. The intelligent NoC router supports the full search motion estimation algorithm as well as fast search algorithms such as diamond, hexagon, big hexagon, and spiral. Our model is a hierarchical, hybrid processing-element-based 2D architecture that supports reuse of reference frame blocks between the processing elements through the NoC routers, which reduces transactions to and from main memory. The proposed architecture is described in Verilog HDL and synthesized with a 90 nm CMOS standard cell library. Results show that our architecture reduces the gate count by 7x compared to its ASIC counterpart, which supports only the full search method. Moreover, the proposed architecture operates at a frequency comparable to that of the ASIC-based implementation to sustain 30 fps. Our approach is based on a simple design that exploits a high level of parallelism with intensive data reuse. The proposed architecture therefore supports run-time reconfiguration for any block size and any search pattern, depending on the application requirement.
UR - http://www.scopus.com/inward/record.url?scp=51049118363&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51049118363&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2008.4536528
DO - 10.1109/IPDPS.2008.4536528
M3 - Conference contribution
AN - SCOPUS:51049118363
SN - 9781424416943
T3 - IPDPS Miami 2008 - Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, Program and CD-ROM
BT - IPDPS Miami 2008 - Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, Program and CD-ROM
T2 - IPDPS 2008 - 22nd IEEE International Parallel and Distributed Processing Symposium
Y2 - 14 April 2008 through 18 April 2008
ER -