Publication Details - Sciencepaper Online (中国科技论文在线)

陈云霁


Journal article

A Small-Footprint Accelerator for Large-Scale Neural Networks


ACM Transactions on Computer Systems, 2015, 33(2): 6 | May 1, 2015 | doi.org/10.1145/2701417

URL:https://dl.acm.org/doi/10.1145/2701417

Abstract

Machine-learning tasks are becoming pervasive in a broad range of domains, and in a broad range of systems (from embedded systems to data centers). At the same time, a small set of machine-learning algorithms (especially Convolutional and Deep Neural Networks, i.e., CNNs and DNNs) are proving to be state-of-the-art across many applications. As architectures evolve toward heterogeneous multicores composed of a mix of cores and accelerators, a machine-learning accelerator can achieve the rare combination of efficiency (due to the small number of target algorithms) and broad application scope. Until now, most machine-learning accelerator designs have been focusing on efficiently implementing the computational part of the algorithms. However, recent state-of-the-art CNNs and DNNs are characterized by their large size. In this study, we design an accelerator for large-scale CNNs and DNNs, with a special emphasis on the impact of memory on accelerator design, performance, and energy. We show that it is possible to design an accelerator with a high throughput, capable of performing 452 GOP/s (key NN operations such as synaptic weight multiplications and neurons outputs additions) in a small footprint of 3.02 mm² and 485 mW; compared to a 128-bit 2 GHz SIMD processor, the accelerator is 117.87× faster, and it can reduce the total energy by 21.08×. The accelerator characteristics are obtained after layout at 65 nm. Such a high throughput in a small footprint can open up the usage of state-of-the-art machine-learning algorithms in a broad set of systems and for a broad set of applications.
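The headline figures in the abstract can be combined into a few derived efficiency metrics. The sketch below is only back-of-the-envelope arithmetic on the three numbers quoted above (452 GOP/s, 485 mW, 3.02 mm²). It assumes the quoted throughput and power apply at the same operating point, which the abstract does not state explicitly. The function name `derived_metrics` and the metric names are hypothetical, chosen for illustration, and do not come from the paper.

```python
def derived_metrics(throughput_gops, power_mw, area_mm2):
    """Combine throughput, power, and area into simple efficiency ratios.

    Assumption: throughput and power describe the same operating point.
    """
    power_w = power_mw / 1000.0  # convert mW to W
    return {
        # throughput per unit power
        "gops_per_watt": throughput_gops / power_w,
        # throughput per unit silicon area
        "gops_per_mm2": throughput_gops / area_mm2,
        # energy per operation: (J/s) / (op/s), expressed in picojoules
        "picojoules_per_op": power_w / (throughput_gops * 1e9) * 1e12,
    }


# Figures quoted in the abstract: 452 GOP/s, 485 mW, 3.02 mm^2.
metrics = derived_metrics(452.0, 485.0, 3.02)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption, the quoted figures work out to roughly 932 GOP/s per watt, about 150 GOP/s per mm², and on the order of 1 pJ per operation. These are derived here for orientation and are not results reported in the abstract.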

Keywords: none
