The Quest for Energy Efficient Generative AI
Monday | October 27, 2025 | 8:30 - 9:30
Artificial Intelligence (AI) applications have entered and impacted our lives unlike any other recent technological advance. While current AI tools are approaching Artificial General Intelligence (AGI) capabilities, large gaps persist, as returns diminish despite the deployment of enormous computing resources and ever-increasing energy demands. This quandary calls for a holistic approach to finding energy-efficient solutions for achieving AGI. While the holy grail for judging the quality of an AI model has largely been serving accuracy, and only recently its resource usage, neither of these metrics translates directly to energy efficiency, latency, or mobile device battery lifetime. More recently, generative AI models relying on transformer, diffusion, or state-space architectures have revolutionized the way we approach reasoning and learning tasks across all types of modalities, from image to video and language, bringing increased performance at the expense of significant hardware costs. This talk uncovers the need for accurate, platform-specific power and latency models and for efficient hardware-aware design methodologies for AI systems, allowing AI and hardware designers to identify not just the most accurate AI configuration, but also those that satisfy given hardware constraints. Furthermore, combined with model compression techniques such as quantization and sparsity-aware co-design, these approaches demonstrate the feasibility of energy-efficient generative AI for multi-modal tasks, from the cloud to the edge.

