Colloquium on Artificial Intelligence Research and Optimization

Every first and third Wednesday of the month, at 1:00 pm CST, Zoom

Some of today’s most visible and, indeed, remarkable achievements in artificial intelligence (AI) have come from advances in deep learning (DL). The formula for the success of DL has been compute power – artificial neural networks are a decades-old idea, but it was the use of powerful accelerators, mainly GPUs, that truly enabled DL to blossom into its current form.

As significant as the impacts of DL have been, there is a growing realization that current approaches merely scratch the surface of what might be possible, and that researchers could conduct exploratory research on ever larger and more complex systems far more rapidly – if only more compute power could be effectively applied.

There are three emerging trends that, if properly harnessed, could enable such a boost in compute power applied to AI, thereby paving the way for major advances in AI capabilities. 

  • Optimization algorithms based on higher-order derivatives are well-established numerical methods, offering superior convergence characteristics and inherently exposing more opportunities for scalable parallel performance than the first-order methods commonly applied today. Despite their potential advantages, these algorithms have not yet found their way into mainstream AI applications, as they require significantly more computational power and must manage much larger amounts of data.
  • High-performance computing (HPC) brings more compute power to bear via parallel programming techniques and large-scale hardware clusters and will be required to satisfy the resource requirements of higher-order methods. That DL is not currently taking advantage of HPC resources is not due to lack of imagination or lack of initiative in the community.  Rather, matching the needs of DL systems with the capabilities of HPC platforms presents significant challenges that can only be met by coordinated advances across multiple disciplines.
  • Hardware architecture advances continue apace, with diversification and specialization increasingly being seen as a critical mechanism for increased performance. Cyberinfrastructure (CI) and runtime systems that insulate users from hardware changes, coupled with tools that support performance evaluation and adaptive optimization of AI applications, are increasingly important to achieving high user productivity, code portability, and application performance.

The colloquium brings together experts in the fields of algorithmic theory, AI, and HPC, and aims to transform research in the broader field of AI and optimization. The first aspect of the colloquium is distributed AI frameworks, e.g. TensorFlow, PyTorch, Horovod, and Phylanx. Here, one challenge is the integration of accelerator devices and support for a wide variety of target architectures, since recent supercomputers are increasingly heterogeneous, with some nodes equipped with accelerator cards and others with CPUs only. A framework should be easy to deploy and maintain and should provide good portability and productivity. To that end, abstractions and a unified API that hide the zoo of accelerator devices from users are important.

The second aspect is higher-order algorithms, e.g. second-order methods or Bayesian optimization. These methods can yield higher accuracy, but are more computationally intensive. We will look into the theoretical and computational aspects of these methods.
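The convergence advantage of second-order methods over first-order ones can be seen on a toy problem. The sketch below (a minimal illustration with made-up values, not code from any of the frameworks above) minimizes a strongly convex quadratic: gradient descent needs many iterations, while a single Newton step, which uses the Hessian, lands on the minimizer at the cost of solving a linear system.

```python
import numpy as np

# Minimize f(w) = 0.5 * w^T A w - b^T w, a simple strongly convex quadratic.
# Illustrative sketch only: A, b, and the step size are toy values chosen here.

rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 5))
A = Q @ Q.T + 5 * np.eye(5)          # symmetric positive definite Hessian
b = rng.standard_normal(5)
w_star = np.linalg.solve(A, b)        # exact minimizer, for reference

def grad(w):
    return A @ w - b

# First-order: gradient descent with a safe step size of 1/L.
w = np.zeros(5)
step = 1.0 / np.linalg.eigvalsh(A).max()
gd_iters = 0
while np.linalg.norm(grad(w)) > 1e-8:
    w = w - step * grad(w)
    gd_iters += 1

# Second-order: Newton's method. The Hessian of a quadratic is constant,
# so a single Newton step (one linear solve) reaches the minimizer.
w2 = np.zeros(5)
w2 = w2 - np.linalg.solve(A, grad(w2))

print(gd_iters)                              # gradient descent: many iterations
print(np.linalg.norm(w2 - w_star) < 1e-10)   # Newton: done after one step
```

The linear solve in the Newton step is exactly the kind of dense linear-algebra kernel that parallelizes well on HPC hardware, which is one reason higher-order methods expose more opportunities for scalable parallelism.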

Local organizers

  • Patrick Diehl
  • Katie Bailey
  • Hartmut Kaiser
  • Bita Hasheminezhad
  • Mayank Tagi

For questions or comments regarding the colloquium, please contact Bita Hasheminezhad.

Talks

Speaker:            Dr. Hongchao Zhang, Louisiana State University

Date:                  Wed 2/3 @ 1:00 pm CST

Title:                   Inexact proximal stochastic gradient method for empirical risk minimization

Abstract:            We will talk about algorithmic frameworks of the inexact proximal stochastic gradient method for solving empirical composite optimization, whose objective function is the sum of an average of a large number of smooth convex or nonconvex functions and a convex, but possibly nonsmooth, function. At each iteration, the algorithm inexactly solves a proximal subproblem constructed by using a stochastic gradient of the objective function. Variance reduction techniques are incorporated in the method to reduce the stochastic gradient variance. The main feature of these algorithms is to allow solving the proximal subproblems inexactly while still keeping global convergence with desirable complexity bounds. Global convergence and component gradient complexity bounds are derived for the cases when the objective function is strongly convex, convex, or nonconvex. Some preliminary numerical experiments indicate the efficiency of the algorithm.
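To make the setting concrete, the sketch below shows the basic proximal stochastic gradient iteration on a toy L1-regularized least-squares problem. This is a minimal textbook version, not the speaker's inexact, variance-reduced algorithm: here the proximal subproblem (soft thresholding) has a closed form and is solved exactly, and all data and parameters are synthetic values chosen for illustration.

```python
import numpy as np

# Toy empirical composite optimization problem:
#   min_w  (1/n) * sum_i 0.5 * (x_i^T w - y_i)^2  +  lam * ||w||_1
# smooth average part + convex nonsmooth part, as in the abstract.

rng = np.random.default_rng(1)
n, d = 200, 10
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.0, 0.5]               # sparse ground truth
y = X @ w_true + 0.01 * rng.standard_normal(n)

lam = 0.01

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (closed form, so solved exactly here)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

w = np.zeros(d)
step = 0.01
for _ in range(20000):
    i = rng.integers(n)                      # sample one component function
    g = (X[i] @ w - y[i]) * X[i]             # stochastic gradient of the smooth part
    w = soft_threshold(w - step * g, step * lam)   # proximal (forward-backward) step

print(np.round(w, 2))   # approximately recovers the sparse ground truth
```

The talk's methods replace the plain stochastic gradient with a variance-reduced one and allow the proximal subproblem to be solved only inexactly, which matters when the nonsmooth term has no closed-form proximal operator.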

Bio:                     Hongchao Zhang received his PhD in applied mathematics from the University of Florida in 2006. He then held postdoctoral positions at the Institute for Mathematics and Its Applications (IMA) and at the IBM T.J. Watson Research Center. He joined LSU as an assistant professor in 2008 and is now a professor in the Department of Mathematics and the Center for Computation & Technology (CCT) at LSU. His research interests are nonlinear optimization theory, algorithms, and applications.

______________________________________________________________________________

Speaker:            Dr. John T. Foster, The University of Texas at Austin

Date:                  Wed 3/3 @ 1:00 pm CST

Title:                   Scientific Machine Learning (SciML): An overview and discussion of applications in petroleum engineering

Abstract:            Scientific Machine Learning or SciML is a relatively new phrase that is used to describe the intersection of data science, machine learning, and physics based computational simulation. SciML encompasses many ideas including physics informed neural networks, universal differential equations, and the use of synthetic data generated from physical simulators in training machine learning models for rapid decision making. This talk will give an overview of SciML using simple examples and discuss recent results from our investigations using SciML in petroleum engineering applications, specifically for reservoir simulation and drill string dynamics.

Bio:                     Before joining UT Austin, John was a faculty member in mechanical engineering at UTSA and a Senior Member of the Technical Staff at Sandia National Laboratories. He received his BS and MS in mechanical engineering from Texas Tech University and his PhD from Purdue University. He is a registered Professional Engineer in the State of Texas and the co-founder and CTO of Daytum, a tech-enabled professional education company for data science and machine learning targeting the energy industry.

______________________________________________________________________________