oai:cds.cern.ch:3025561

Faster, Greener, Precise Enough: Challenges and Directions in GPU Auto-Tuning

APA

van Werkhoven, B. (2026). Faster, Greener, Precise Enough: Challenges and Directions in GPU Auto-Tuning. SciVideos. https://videos.cern.ch/record/3025561

MLA

van Werkhoven, Ben. Faster, Greener, Precise Enough: Challenges and Directions in GPU Auto-Tuning. SciVideos, 5 May 2026, https://videos.cern.ch/record/3025561

BibTeX

@misc{ scivideos_oai:cds.cern.ch:3025561,
  doi = {},
  url = {https://videos.cern.ch/record/3025561},
  author = {van Werkhoven, Ben},
  keywords = {},
  language = {en},
  title = {Faster, Greener, Precise Enough: Challenges and Directions in GPU Auto-Tuning},
  publisher = {},
  year = {2026},
  month = {may},
  note = {oai:cds.cern.ch:3025561, see \url{https://scivideos.org/cern-cds/3025561}}
}
          
van Werkhoven, Ben
Talk number: oai:cds.cern.ch:3025561

Abstract

High-Performance Computing (HPC) drives discovery across science and industry and underpins the rapid advances in AI. At the heart of modern HPC platforms is the Graphics Processing Unit (GPU), which delivers the bulk of compute power but also dominates energy consumption. As GPU architectures increasingly prioritize low-precision arithmetic for AI workloads, HPC applications that depend on higher precision face new programmability challenges alongside new opportunities in mixed-precision computing. Crucially, the energy efficiency of GPU applications depends not only on compute utilization but also on memory traffic patterns, and the fastest implementation is not always the most energy efficient. Reliable exploration of these trade-offs is further complicated by the limited accuracy and temporal resolution of current power measurement tools. Combined with the vast, discontinuous design spaces inherent to GPU programming, manual optimization is infeasible. Automatic performance tuning, or auto-tuning, offers a proven approach to this problem, automatically searching for optimal configurations across algorithm, application, and hardware parameters. To address the emerging demands of mixed-precision computing and energy-aware execution, the field is now moving toward constrained and multi-objective optimization to enable systematic exploration of the trade-offs between performance, energy consumption, and numerical accuracy. In this talk, I will highlight key challenges, recent developments, and future directions in GPU auto-tuning.
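
To make the idea of constrained, multi-objective auto-tuning in the abstract concrete, here is a minimal sketch in Python. It is not taken from the talk and does not use any real auto-tuning tool's API. The parameter names (`block_size_x`, `tile_size`, `clock_mhz`), the constraint in `is_valid`, and the cost model in `measure` are all hypothetical stand-ins; a real tuner would compile and benchmark the kernel on a GPU and read power from a sensor instead of evaluating a formula.

```python
from itertools import product

# Hypothetical tunable parameters for an imaginary GPU kernel.
tune_params = {
    "block_size_x": [32, 64, 128, 256, 512, 1024],
    "tile_size": [1, 2, 4, 8],
    "clock_mhz": [900, 1200, 1500],
}


def is_valid(cfg):
    # Made-up resource constraint: it removes part of the Cartesian
    # product, which is what makes real search spaces discontinuous.
    return cfg["block_size_x"] * cfg["tile_size"] <= 2048


def measure(cfg):
    """Stand-in for compiling and benchmarking a kernel.

    Returns (time_ms, energy_j) from a synthetic model, not real data.
    """
    work = 1e6 / (cfg["block_size_x"] ** 0.5 * cfg["tile_size"] ** 0.3)
    time_ms = work / cfg["clock_mhz"]
    # Assumed power model: power rises faster than linearly with clock.
    power_w = 50 + 0.0001 * cfg["clock_mhz"] ** 2
    energy_j = power_w * time_ms / 1000
    return time_ms, energy_j


def search_space(params):
    """Yield every configuration that satisfies the constraint."""
    keys = list(params)
    for values in product(*params.values()):
        cfg = dict(zip(keys, values))
        if is_valid(cfg):
            yield cfg


def pareto_front(results):
    """Keep configurations that no other one beats on both objectives."""
    front = []
    for cfg, (t, e) in results:
        dominated = any(
            t2 <= t and e2 <= e and (t2 < t or e2 < e)
            for _, (t2, e2) in results
        )
        if not dominated:
            front.append((cfg, (t, e)))
    return front


# Exhaustive search; real tuners use smarter strategies on large spaces.
results = [(cfg, measure(cfg)) for cfg in search_space(tune_params)]
front = pareto_front(results)
fastest = min(results, key=lambda r: r[1][0])
greenest = min(results, key=lambda r: r[1][1])

print("valid configurations:", len(results))
print("fastest: ", fastest[0])
print("greenest:", greenest[0])
print("Pareto-optimal configurations:", len(front))
```

Under this toy model the fastest and the most energy-efficient configurations differ only in clock frequency, which illustrates why a single-objective search for speed can miss the greener choice. Adding numerical accuracy as a third objective would follow the same pattern, with `measure` returning an extra value and `pareto_front` comparing all three.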

00:00:00 Slide 1
00:02:23 Slide 2
00:02:58 Slide 3
00:03:44 Slide 4
00:04:27 Slide 5
00:06:59 Slide 6
00:07:42 Slide 7
00:10:02 Slide 8
00:11:03 Slide 9
00:12:09 Slide 10
00:13:00 Slide 11
00:14:16 Slide 12
00:15:10 Slide 13
00:15:34 Slide 14
00:16:18 Slide 15
00:17:21 Slide 16
00:18:45 Slide 17
00:20:38 Slide 18
00:21:00 Slide 19
00:22:27 Slide 20
00:23:58 Slide 21
00:24:43 Slide 22
00:25:28 Slide 23
00:26:26 Slide 24
00:27:57 Slide 25
00:28:46 Slide 26
00:29:37 Slide 27
00:31:48 Slide 28
00:34:09 Slide 29
00:36:38 Slide 30
00:38:46 Slide 31
00:39:02 Slide 32
00:39:36 Slide 33