FPGA Implementation of a Convolutional Neural Network Using Pipeline Processing

山田, 瑛叶

http://hdl.handle.net/10076/0002000679

Name / File	License	Action
2023ME0211.pdf (2.4 MB)

Item type

Thesis or Dissertation (1)

Publication date

2024-04-10

Title

FPGA Implementation of a Convolutional Neural Network Using Pipeline Processing (original title: パイプライン処理を用いた畳込みニューラルネットワークのFPGA実装)

Language

jpn

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_46ec

Resource type

thesis

Author

山田, 瑛叶

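Abstract

Description type

Abstract

Description

Convolutional neural networks (CNNs) achieve high accuracy in image processing and are applied in a wide range of fields, including industrial robots and autonomous driving. Unlike training, CNN inference requires real-time operation, so speed tends to be prioritized over accuracy. A field-programmable gate array (FPGA) is an integrated circuit whose internal logic structure the user can reconfigure in the field. Implementing processing in hardware on an FPGA can reduce power consumption and enables high-speed processing through parallelization. In this research, we achieve a speedup by implementing a CNN on an FPGA. Specifically, the whole CNN computation is pipelined layer by layer, and the operations inside each layer are parallelized so that the pipeline runs efficiently. Because the computation speed of some convolutional layers becomes the bottleneck of the pipeline, we resolve this by splitting those convolutional layers. In addition, because the same method can be applied to hardware implementations of various CNN models, we propose automating the design process. The implementation uses high-level synthesis, which allows a shorter design time than design methods based on hardware description languages. The circuit is described in C++, a language used in software development, and directives to the compiler are given as pragmas. In the evaluation experiments, the hardware was implemented on an AMD Alveo U50 and shown to run approximately 460 times faster than an Intel Xeon Silver 4214 processor. Circuit size and power consumption were also evaluated.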

Language

Abstract

Description type

Abstract

Description

Convolutional neural networks (CNNs) achieve high accuracy in image processing and are used in a wide range of applications, such as industrial robotics and self-driving technology. Because CNN inference, unlike training, must run in real time, speed tends to matter more than accuracy. Field-programmable gate arrays (FPGAs) are integrated circuits whose internal logic structure the user can reconfigure in the field. Implementing processing in hardware on an FPGA reduces power consumption and enables high-speed processing through parallelization. In this research, a CNN is implemented on an FPGA to speed up inference. Specifically, the entire CNN computation is pipelined layer by layer, and the operations inside each layer are parallelized so that the pipeline runs efficiently. Since the computation speed of some convolutional layers becomes a bottleneck in the pipeline, the problem is solved by dividing those convolutional layers. In addition, since the same method can be applied to the hardware implementation of various CNN models, we propose automating the design process. The implementation is based on high-level synthesis, which enables design in a shorter period of time than design methods based on hardware description languages. In our design flow, the circuit is described in C++, a language often used in software development, with compiler directives given as pragmas. In the evaluation experiments, the hardware was implemented on AMD's Alveo U50 and was shown to be approximately 460 times faster than the Intel Xeon Silver 4214 processor. The circuit size and power consumption were also evaluated, and it was confirmed that the increase in power consumption is not significant.
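The bottleneck argument in the abstract can be illustrated with a small cycle-count model. The following C++ sketch is not taken from the thesis: the function names (`sequential_cycles`, `pipeline_interval`, `split_stage`), the assumption that a split divides a layer's work evenly into sub-stages, and the cycle counts in the note below are all hypothetical. It only shows why the slowest layer sets the rate of a layer-level pipeline and why dividing that layer helps.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

using Cycles = std::uint64_t;

// Cycles per image when the layers run one after another with no overlap.
Cycles sequential_cycles(const std::vector<Cycles>& stages) {
    return std::accumulate(stages.begin(), stages.end(), Cycles{0});
}

// Steady-state interval of a layer-level pipeline: once the pipeline is full,
// one image completes every max(stage cycles), so the slowest stage gates it.
Cycles pipeline_interval(const std::vector<Cycles>& stages) {
    return stages.empty() ? Cycles{0}
                          : *std::max_element(stages.begin(), stages.end());
}

// Hypothetical model of dividing one layer: stage `index` is replaced by
// `parts` sub-stages, each taking an equal share of the work (rounded up).
std::vector<Cycles> split_stage(const std::vector<Cycles>& stages,
                                std::size_t index, std::size_t parts) {
    assert(index < stages.size() && parts > 0);
    const Cycles per_part = (stages[index] + parts - 1) / parts;  // ceiling
    const auto pos = stages.begin() + static_cast<std::ptrdiff_t>(index);
    std::vector<Cycles> result(stages.begin(), pos);   // stages before the split
    result.insert(result.end(), parts, per_part);      // the new sub-stages
    result.insert(result.end(), pos + 1, stages.end()); // stages after the split
    return result;
}
```

With made-up per-layer costs of 100, 400, and 120 cycles, the model gives 620 cycles per image when run sequentially and a pipelined interval of 400 cycles. Splitting the 400-cycle layer into four parts lowers the interval to 120 cycles, at which point the last layer becomes the new bottleneck.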

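Language

Description

Description type

Other

Description

Computer Architecture Laboratory, Division of Information Engineering, Graduate School of Engineering, Mie University

Description

Description type

Other

Description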

35p

Bibliographic information

Issue date 2024-03

Format

Description type

Other

Description

application/pdf

Author version flag

Publication type

VoR

Publication type resource

http://purl.org/coar/version/c_970fb48d4fbd8a85

Publisher

Mie University

Publisher (reading)

Value

ミエダイガク (Mie Daigaku)

Master's thesis advisor

Name

高木, 一義

Language

Resource type (Mie University)

Value

Master's Thesis


Versions

Ver.1

2024-04-10 05:36:51.270060
