FPT18 Workshop on Embedded Machine Learning

Embedded Machine Learning: Technology and Opportunities

International Conference on Field Programmable Technology, Naha, Okinawa, December 14, 2018 (afternoon). http://www.fpt18.sakura.ne.jp/workshop.html

Organisers: David Boland and Philip Leong, The University of Sydney

This workshop will provide a forum to discuss technical challenges and product opportunities for applying deep learning within embedded systems that take advantage of the energy and latency benefits offered by field programmable gate arrays (FPGAs). In particular, the following questions will be addressed:

Is current academic research addressing the needs of industry? What are the new research problems important to them?
What are the new markets and applications created by FPGA-based machine learning?
Will FPGAs be able to capture a significant market share, or will hardware-accelerated machine learning consolidate to the point where application-specific integrated circuits (ASICs) are the preferred solution?

The format is a 2-hour panel session with guests representing industry and academia. Each speaker will be given 10-15 minutes to briefly state their positions regarding the above questions. We will then open the floor to questions from the audience, which we expect to lead to engaging discussions.

We hope that by attending this workshop you will learn about new embedded applications that could benefit from FPGA-accelerated machine learning, as well as the technical challenges that must be addressed to make this a reality. Furthermore, by gathering like-minded individuals, this workshop provides an opportunity to develop collaborations with panellists or other attendees.

Speakers:

**Kees Vissers** graduated from Delft University in the Netherlands. He worked at Philips Research in Eindhoven, the Netherlands, for many years. His work there included digital video system design, HW-SW co-design, VLIW processor design and dedicated video processors. He was a visiting industrial fellow at Carnegie Mellon University, where he worked on early High-Level Synthesis tools. He was a visiting industrial fellow at UC Berkeley, where he worked on several models of computation and dataflow computing. He was a director of architecture at Trimedia, and CTO at Chameleon Systems. For more than a decade he has headed a team of researchers at Xilinx, including a significant part of the Xilinx European Laboratories. The research topics include next-generation programming environments for processors and FPGA fabric, high-performance video systems, machine learning applications and architectures, wireless applications and new datacenter applications. He has been instrumental in the High-Level Synthesis technology and is one of the technical leads in the novel ACAP technology. He is now a Fellow at Xilinx.

Xinyu Niu received his B.S. degree in Electronic Engineering from Fudan University in 2010, his M.Sc. degree in Electrical and Electronic Engineering from Imperial College London in 2011, and his Ph.D. degree in Computing from Imperial College London in 2014. He has over 30 publications and 17 patent applications in the area of high-performance computing. He is the founder and CEO of Corerain Technologies, an AI chip startup that focuses on high-performance AI computing platforms for IoT applications. Corerain has developed the Custom AI Streaming Accelerator (CAISA) architecture and the Rainbuilder compiler for deep-learning algorithms. The Rainman and Nebula accelerator boards are widely used in cameras, robotics, and servers to provide real-time high-performance AI computing. The Rainbuilder compiler supports seamless compilation of algorithms developed in frameworks such as TensorFlow and Caffe.

Nachiket Kapre is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of Waterloo, Canada. He was previously an Assistant Professor at Nanyang Technological University, Singapore, in the School of Computer Engineering. He received M.S. degrees in Electrical Engineering (2005) and Computer Science (2006) and a Ph.D. in Computer Science (2010) from the California Institute of Technology, Pasadena. He is primarily interested in understanding and exploiting the potential of parallel, spatial architectures such as FPGAs for energy-efficient computing. He has worked in the field of embedded machine learning for the past few years through contributions to (1) the CaffePresso framework presented at CASES 2016 (paper) and 2017 (tutorial), and (2) the NengoFPGA framework built in collaboration with ABR Inc at Waterloo.

Shinya Takamaeda-Yamazaki received the B.E., M.E., and D.E. degrees from Tokyo Institute of Technology, Japan, in 2009, 2011, and 2014, respectively. From 2011 to 2014, he was a JSPS research fellow (DC1). From 2014 to 2016, he was an assistant professor at Nara Institute of Science and Technology, Japan. Since 2016, he has been an associate professor at Hokkaido University, Japan. Since 2018, he has been a JST PRESTO researcher. His research interests include computer architecture, high-level synthesis, and machine learning. He is a member of IEEE, IEICE, and IPSJ.

Hiroki Nakahara received the B.E., M.E., and Ph.D. degrees in computer science from Kyushu Institute of Technology, Fukuoka, Japan, in 2003, 2005, and 2007, respectively. He has held research and faculty positions at Kyushu Institute of Technology, Iizuka, Japan; Kagoshima University, Kagoshima, Japan; and Ehime University, Ehime, Japan. He is now an associate professor at Tokyo Institute of Technology, Japan. He was the Workshop Chairman for the International Workshop on Post-Binary ULSI Systems (ULSIWS) in 2014, 2015, 2016 and 2017. He served as the Program Chairman for the 8th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies (HEART) in

He received the 8th IEEE/ACM MEMOCODE Design Contest 1st Place Award in 2010, the SASIMI Outstanding Paper Award in 2010, the IPSJ Yamashita SIG Research Award in 2011, the 11th FIT Funai Best Paper Award in 2012, the 7th IEEE MCSoC-13 Best Paper Award in 2013, and the ISMVL2013 Kenneth C. Smith Early Career Award in 2014. His research interests include logic synthesis, reconfigurable architecture, digital signal processing, embedded systems, and machine learning. He is a member of the IEEE, the ACM, and the IEICE.