Advance Program

Tutorials: Sunday, August 25, 2024

Time (PDT)	Title	Presenters
7:45AM-8:30AM	Breakfast/Registration
8:30AM-10:35AM	Tutorial 1: AI Assisted Hardware Design - Will AI Elevate or Replace Hardware Engineers? Chair: Bryan Chin, UCSD
	Introduction	Bryan Chin, UCSD
	Introduction to AI for Chip Design	Mark Ren, NVIDIA
	AI Driven Optimization	Stelios Diamantidis, Synopsys
	LLM and Chip Design	Hans Bouwmeester, PrimisAI
10:35AM-11:00AM	Coffee Break (1/2 hr)
11:00AM-12:30PM	Tutorial 1: AI Assisted Hardware Design - Will AI Elevate or Replace Hardware Engineers? Chair: Bryan Chin, UCSD
	Domain Adaptive LLM Models	Hanxian Huang, UCSD
	LLM Agents for Chip Design	Mark Ren, NVIDIA
	Future Directions & Panel Discussion
12:30PM-1:45PM	Lunch (1 hr 15 min)
1:45PM-3:15PM	Tutorial 2: The Cooling of Hot Chips: How thermal technology is keeping up with the AI revolution Chair: Seshu Madhavapeddy, Frore Systems
	Thermal techniques for higher data center compute density	Tom Garvens, Supermicro
	Next-generation cooling for NVIDIA’s Accelerated Computing	Ali Heydari, NVIDIA
3:15PM-3:45PM	Coffee Break (1/2 hr)
3:45PM-6:00PM	Tutorial 2: The Cooling of Hot Chips: How thermal technology is keeping up with the AI revolution Chair: Seshu Madhavapeddy, Frore Systems
	Thermal challenges of Edge Devices	Nader Nikfar, Qualcomm
	Solid-state active cooling helps maintain Moore’s Law	Prabhu Sathyamurthy, Frore Systems
	Applications for thermo-electric cooling	Jesse Edwards, Phononic
6:00PM-8:00PM	Reception

Conference Day 1: Monday, August 26, 2024

Time (PDT)	Title	Presenters
7:45AM-9:15AM	Breakfast/Registration
9:15AM-9:30AM	Welcome
	General Chair Welcome	Ron Diamant, General Chair
	Program Co-Chairs Welcome	Rob Aitken & Larry Yang, PC Co-Chairs
9:30AM-11:00AM	High-Performance Processors Part 1 Chair: Ian Bratt
	Snapdragon X Elite Qualcomm Oryon CPU: Design & Architecture Overview	Gerard Williams, Qualcomm
	Lunar Lake: Powering the Next Generation of AI PCs	Arik Gihon, Intel
	IBM Next Generation Processor and AI Accelerator	Chris Berry, IBM
11:00AM-11:30AM	Coffee Break
11:30AM-1:00PM	Specialized Processors Chair: Renu Raman
	Blackhole and TT-Metalium - The Standalone AI Computer and its Programming Model	Jasmina Vasiljevic & Davor Capalija, Tenstorrent
	SK Hynix AI-Specific Computing Memory Solution: From AiM device to Heterogeneous AiMX-xPU System for Comprehensive LLM Inference	Guhyun Kim, SK Hynix
	Built for the Edge: The next generation Intel® Xeon 6 SoC	Praveen Mosur, Intel
1:00PM-2:15PM	Lunch (1 hr 15 min)
2:15PM-3:15PM	Keynote #1 Chair: Ralph Wittig
	Predictable Scaling and Infrastructure	Trevor Cai, OpenAI
3:15PM-4:15PM	AI Processors Part 1 Chair: Pradeep Dubey
	NVIDIA Blackwell Platform: Advancing Generative AI and Accelerated Computing	Ajay Tirumala & Raymond Wong, NVIDIA
	SambaNova SN40L RDU: Breaking the Barrier of Trillion+ Parameter Scale Gen AI Computing	Raghu Prabhakar, SambaNova
4:15PM-4:45PM	Coffee Break (1/2 hr)
4:45PM-6:45PM	AI Processors Part 2 Chair: David Weaver
	Intel Gaudi 3 AI Accelerator: Architected for Gen AI Training and Inference	Roman Kaplan, Intel
	AMD Instinct™ MI300X Generative AI Accelerator and Platform Architecture	Alan Smith & Vamsi Krishna Alla, AMD
	An AI Compute ASIC with Optical Attach to Enable Next Generation Scale-up Architectures	Manish Mehta, Broadcom
	FuriosaAI RNGD: A Tensor Contraction Processor for Sustainable AI Computing	June Paik, Furiosa
6:45PM-8:30PM	Reception

Conference Day 2: Tuesday, August 27, 2024

Time (PDT)	Title	Presenters
7:45AM-8:30AM	Breakfast/Registration
8:30AM-9:00AM	Poster Lightning Session
9:00AM-10:30AM	AI Processors Part 3 Chair: Yasuo Ishii
	AMD Versal™ AI Edge Series Gen 2 for Vision and Automotive	Tomai Knopp & Jeffrey Chu, AMD
	Onyx: A Programmable Accelerator for Sparse Tensor Algebra	Kalhan Koul, Stanford
	Next Gen MTIA - Meta’s Recommendation Inference Accelerator	Mahesh Maddury & Pankaj Kansal, Meta
10:30AM-11:00AM	Coffee Break
11:00AM-12:30PM	Networking Processors Chair: Jae W. Lee
	DOJO: An Exa-Scale Lossy AI Network using the Tesla Transport Protocol over Ethernet (TTPoE)	Eric Quinnell, Tesla
	ACF-S: An 8-Tbit/s SuperNIC for High-Performance Data Movement in AI & Accelerated Compute Networks	Shrijeet Mukherjee & Thomas Norrie, Enfabrica
	4 Tbit/s Optical Compute Interconnect Chiplet for XPU-to-XPU Connectivity	Saeed Fathololoumi, Intel
12:30PM-1:45PM	Lunch (1 hr 15 min)
1:45PM-2:45PM	Keynote #2 Chair: Ian Bratt
	The Journey to Life with AI Pervasiveness	Victor Peng, AMD
2:45PM-3:45PM	High-Performance Processors Part 2 Chair: Nhon Quach
	Wafer-Scale AI: Enabling Unprecedented AI Compute Performance	Sean Lie, Cerebras
	XiangShan: An Open-Source Project for High-Performance RISC-V Processors Meeting Industrial-Grade Standards	Kaifan Wang, Chinese Academy of Sciences
3:45PM-4:15PM	Coffee Break (1/2 hr)
4:15PM-6:15PM	High-Performance Processors Part 3 Chair: Lingjie Xu
	AmpereOne: Sustainable Computing for AI & Cloud Native Workloads	Matthew Erler, Ampere Computing
	Inside MAIA 100	Sherry Xu & Chandru Ramakrishnan, Microsoft
	AMD Next Generation “Zen 5” Core	Brad Cohen & Mahesh Subramony, AMD
	MN-Core 2: Second-generation processor of MN-Core architecture for AI and general-purpose HPC applications	Jun Makino, Preferred Networks
6:15PM-6:30PM	IEEE TCMM Awards
	IEEE TCMM Awards	Gabriel Southern, TCMM Chair
6:30PM-6:45PM	Closing Remarks
	Closing Remarks	Jan-Willem van de Waerdt, Vice Chair

Posters

Title	Authors & Affiliation
Picasso: An Area/Energy-Efficient End-to-End Diffusion Accelerator with Hyper-Precision Data Type	Sungyeob Yoo, Geonwoo Ko, Seri Ham, Seeyeon Kim, Yi Chen & Joo-Young Kim; Korea Advanced Institute of Science and Technology
NeuGPU: A Neural Graphics Processing Unit for Instant Modeling and Real-Time Rendering for Mobile AR/VR Devices	Junha Ryu, Hankyul Kwon, Wonhoon Park, Zhiyong Li, Beomseok Kwon, Donghyeon Han, Dongseok Im, Sangyeob Kim, Hyungnam Joo, Minsung Kim & Hoi-Jun Yoo; Korea Advanced Institute of Science and Technology
Space-Mate: A 303.5mW Real-Time NeRF SLAM Processor with Sparse Mixture-of-Experts-based Acceleration	Gwangtae Park, Seokchan Song, Haoyang Sang, Dongseok Im, Donghyeon Han, Sangyeob Kim, Hongseok Lee & Hoi-Jun Yoo; Korea Advanced Institute of Science and Technology
A Low-power Large-Language-Model Processor with Big-Little Network and Implicit-Weight-Generation for On-device AI	Sangyeob Kim, Sangjin Kim, Wooyoung Jo, Soyeon Kim, Seongyon Hong, Nayeong Lee & Hoi-Jun Yoo; Korea Advanced Institute of Science and Technology
A 40-nm 13.88-TOPS/W FC-DNN Engine for 16-bit Intelligent Audio Processing Featuring Weight-Sharing and Approximate Computing	Tay-Jyi Lin, Ze Li, Yun-Cheng Chen, Chien-Tung Liu & Jinn-Shyan Wang; National Chung Cheng University, Taiwan
A Trusted Execution Environment RISC-V System on Chip	Binh Kieu-Do-Nguyen, Tuan-Kiet Dang, Khai-Duy Nguyen, Cong-Kha Pham & Trong-Thuc Hoang; University of Electro-Communications
A 1.19GHz 9.52Gsamples/sec Radix-8 FFT Hardware Accelerator in 28nm	Larry Tang, Siyuan Chen, Keshav Harisrikanth, Guanglin Xu, Franz Franchetti & Ken Mai; Carnegie Mellon University
PACE: A Scalable and Energy Efficient CGRA in a RISC-V SoC for Edge Computing Applications	Vishnu Nambiar, Yi Sheng Chong, Thilini Bandara, Dhananjaya Wijerathne, Zhaoying Li, Rohan Juneja, Li-Shiuan Peh, Tulika Mitra & Anh Tuan Do; Institute of Microelectronics, Agency for Science, Technology and Research (A*STAR) and National University of Singapore
LSPU: A 20.7ms Low-latency Point Neural Network-based 3D Perception and Semantic LiDAR SLAM System-on-Chip for Autonomous Driving System	Jueun Jung, Seungbin Kim, Bokyoung Seo, Wuyoung Jang, Sangho Lee, Jeongmin Shin, Donghyeon Han & Kyuho Lee; Ulsan National Institute of Science and Technology
RISC-V-based System-on-Chips for IoT Applications	Khai-Duy Nguyen, Tuan-Kiet Dang, Binh Kieu-Do-Nguyen, Cong-Kha Pham & Trong-Thuc Hoang; University of Electro-Communications (UEC), Tokyo, Japan
A Smart Cache for a SmartNIC! Scaling End-Host Networking to 400Gbps and Beyond	Annus Zulfiqar, Ali Imran, Venkat Kunaparaju, Ben Pfaff, Gianni Antichi & Muhammad Shahbaz; Purdue University
Towards True GPU Performance Scaling for OpenGPU	Blaise Tine & Hyesoon Kim; UCLA
CogniVision: A mW Power envelope SoC for Always-on Smart Vision in 40nm	Anuimesh Gupta, Japesh Vohra & Massimo Alioto; National University of Singapore
NeCTAr and RASoC: Tale of Two Class SoCs for Language Model Inference and Robotics in Intel 16	Viansa Schmulbach, Jason Kim, Ethan Gao, Nikhil Jha, Ethan Wu, Oliver Yu, Ben Oliveau, Xiangwei Kong, Brendan Roberts, Connor McMahon, Lixiang Yin, Vamber Yang, Brendan Brenner, George Moujaes, Boyu Hao, Lucy Revina, Kevin Anderson, Bryan Ngo, Yufeng Chi, Hongyi Huang, Reza Sajadiany, Raghav Gupta, Ella Schwarz, Jennifer Zhou, Ken Ho, Jerry Zhao, Anita Flynn & Borivoje Nikolić; University of California, Berkeley