Colloquium on Artificial Intelligence Research and Optimization – Fall 21

Every first Wednesday of the month, at 3:00 pm CST, Zoom

Some of today’s most visible and, indeed, remarkable achievements in artificial intelligence (AI) have come from advances in deep learning (DL). The formula for the success of DL has been compute power – artificial neural networks are a decades-old idea, but it was the use of powerful accelerators, mainly GPUs, that truly enabled DL to blossom into its current form.

As significant as the impacts of DL have been, there is a realization that current approaches are merely scratching the surface of what might be possible and that researchers could more rapidly conduct exploratory research on ever larger and more complex systems – if only more compute power could be effectively applied.

There are three emerging trends that, if properly harnessed, could enable such a boost in compute power applied to AI, thereby paving the way for major advances in AI capabilities.

Optimization algorithms based on higher-order derivatives are well-established numerical methods, offering superior convergence characteristics and inherently exposing more opportunities for scalable parallel performance than first-order methods commonly applied today. Despite their potential advantages, these algorithms have not yet found their way into mainstream AI applications, as they require significantly more powerful computational resources and must manage significantly larger amounts of data.
High-performance computing (HPC) brings more compute power to bear via parallel programming techniques and large-scale hardware clusters and will be required to satisfy the resource requirements of higher-order methods. That DL is not currently taking advantage of HPC resources is not due to lack of imagination or lack of initiative in the community. Rather, matching the needs of DL systems with the capabilities of HPC platforms presents significant challenges that can only be met by coordinated advances across multiple disciplines.
Hardware architecture advances continue apace, with diversification and specialization increasingly being seen as a critical mechanism for increased performance. Cyberinfrastructure (CI) and runtime systems that insulate users from hardware changes, coupled with tools that support performance evaluation and adaptive optimization of AI applications, are increasingly important to achieving high user productivity, code portability, and application performance.

The colloquium brings together experts in the fields of algorithmic theory, artificial intelligence (AI), and high-performance computing (HPC) and aims to transform research in the broader field of AI and Optimization. The first aspect of the colloquium is distributed AI frameworks, e.g., TensorFlow, PyTorch, Horovod, and Phylanx. Here, one challenge is integrating accelerator devices and supporting a wide variety of target architectures, since recent supercomputers are becoming increasingly heterogeneous, equipped with accelerator cards or solely with CPUs. A framework should be easy to deploy and maintain and should provide good portability and productivity. To that end, abstractions and a unified API that hide the zoo of accelerator devices from users are important.

The second aspect is higher-order algorithms, e.g., second-order methods or Bayesian optimization. These methods may achieve higher accuracy but are more computationally intensive. We will look into the theoretical and computational aspects of these methods.

This will be the second term for our Colloquium. For details from the inaugural Colloquium series, including speaker information and links to presentations, click here.

______________________________________________________________________________

Confirmed Speakers

09/08/21	J. Nathan Kutz	University of Washington in Seattle
TBD – rescheduled	George Em Karniadakis	Brown University
*11/03/21 @ 2 pm CST	Daniel Soudry	Simons Institute
12/01/21	Alex Hanna	Google Ethics

Registration

Registration for the colloquium is free. Please complete your registration here: registration form

Local organizers

Patrick Diehl
Katie Bailey
Hartmut Kaiser
Mayank Tyagi

For questions or comments regarding the colloquium, please contact Katie Bailey.

Talks

Speaker: Dr. J. Nathan Kutz, University of Washington in Seattle

Date: Wed 9/8 @ 3:00 pm CDT

Title: Deep learning for the discovery of parsimonious physics models

Abstract: A major challenge in the study of dynamical systems is that of model discovery: turning data into reduced-order models that are not just predictive, but provide insight into the nature of the underlying dynamical system that generated the data. We introduce a number of data-driven strategies for discovering nonlinear multiscale dynamical systems and their embeddings from data. We consider two canonical cases: (i) systems for which we have full measurements of the governing variables, and (ii) systems for which we have incomplete measurements. For systems with full state measurements, we show that the recent sparse identification of nonlinear dynamical systems (SINDy) method can discover governing equations with relatively little data and introduce a sampling method that allows SINDy to scale efficiently to problems with multiple time scales, noise, and parametric dependencies. For systems with incomplete observations, we show that the Hankel alternative view of Koopman (HAVOK) method, based on time-delay embedding coordinates and the dynamic mode decomposition, can be used to obtain linear models and Koopman invariant measurement systems that nearly perfectly capture the dynamics of nonlinear quasiperiodic systems. Neural networks are used in targeted ways to aid in the model reduction process. Together, these approaches provide a suite of mathematical strategies for reducing the data required to discover and model nonlinear multiscale systems.

Bio: Nathan Kutz is the Yasuko Endo and Robert Bolles Professor of Applied Mathematics at the University of Washington, having served as chair of the department from 2007-2015. He received the BS degree in physics and mathematics from the University of Washington in 1990 and the PhD in applied mathematics from Northwestern University in 1994. He was a postdoc in the applied and computational mathematics program at Princeton University before taking his faculty position. He has a wide range of interests, from neuroscience to fluid dynamics, where he integrates machine learning with dynamical systems and control.

Web: Kutz Home | Kutz Research Group (washington.edu)

______________________________________________________________________________

Speaker: Dr. Daniel Soudry, Simons Institute

Date: Wed 11/3 @ 2:00 pm CDT

Title: Algorithmic Bias Control in Deep Learning

Abstract: Deep learning relies on Artificial Neural Networks (ANNs) with deep architectures – machine learning models that have reached unparalleled performance in many domains, such as machine translation, autonomous vehicles, computer vision, text generation, and speech understanding. However, this impressive performance typically requires large datasets and massive ANN models. Gathering the data and training the models can take a long time and incur prohibitive costs. Significant research efforts are being invested in improving ANN training efficiency, i.e., the amount of time, data, and resources required to train these models. Examples include changing the model (e.g., architecture, numerical precision) or the training algorithm (e.g., parallelization). However, such modifications often cause an unexplained degradation in the generalization performance of the ANN to unseen data. Recent findings suggest that this degradation is caused by changes to the hidden algorithmic bias of the training algorithm and model. This bias determines which solution is selected from all solutions that fit the data. I will discuss how understanding and controlling such algorithmic bias can be the key to unlocking the full potential of deep learning.

Bio: Daniel is an associate professor in the Department of Electrical Engineering at the Technion, working in the areas of machine learning and theoretical neuroscience. He did his post-doc (as a Gruss Lipper fellow) working with Prof. Liam Paninski in the Department of Statistics and the Center for Theoretical Neuroscience at Columbia University. He is interested in all aspects of neural networks and deep learning. His recent works focus on quantization, resource efficiency, and implicit bias in neural networks. He is the recipient of the Gruss Lipper fellowship, the Goldberg Award, and Intel’s Rising Star Faculty Award.

______________________________________________________________________________

Speaker: Dr. Alex Hanna, Google Ethics

Date: Wed 12/1 @ 3:00 pm CDT

Title: Beyond Bias: Algorithmic Unfairness, Infrastructure, and Genealogies of Data

Abstract: Problems of algorithmic bias are often framed in terms of lack of representative data or formal fairness optimization constraints to be applied to automated decision-making systems. However, these discussions sidestep deeper issues with data used in AI, including problematic categorizations and the extractive logics of crowdwork and data mining. In this talk, I make two interventions: first, reframing data as a form of infrastructure and, as such, implicating politics and power in the construction of datasets; and second, discussing the development of a research program around the genealogy of datasets used in machine learning and AI systems. These genealogies should be attentive to the constellation of organizations and stakeholders involved in their creation, the intent, values, and assumptions of their authors and curators, and the adoption of datasets by subsequent researchers.

Bio: Alex Hanna is a sociologist and Senior Research Scientist working on the Ethical AI team at Google. Before joining Google, she was an Assistant Professor in the Institute of Communication, Culture, Information and Technology at the University of Toronto. Her research centers on the origins of the training data which form the informational infrastructure of AI and algorithmic fairness frameworks, and the way these datasets exacerbate racial, gender, and class inequality.

______________________________________________________________________________