LSU Colloquium on Artificial Intelligence Research and Optimization – Fall 2022

Every first Wednesday of the month, at 3:00 pm Central Time

Some of today’s most visible and, indeed, remarkable achievements in artificial intelligence (AI) have come from advances in deep learning (DL). The formula for the success of DL has been compute power – artificial neural networks are a decades-old idea, but it was the use of powerful accelerators, mainly GPUs, that truly enabled DL to blossom into its current form.

As significant as the impacts of DL have been, there is a realization that current approaches are merely scratching the surface of what might be possible and that researchers could more rapidly conduct exploratory research on ever larger and more complex systems – if only more compute power could be effectively applied.

There are three emerging trends that, if properly harnessed, could enable such a boost in compute power applied to AI, thereby paving the way for major advances in AI capabilities. 

  • Optimization algorithms based on higher-order derivatives are well-established numerical methods, offering convergence characteristics superior to the first-order methods commonly applied today and inherently exposing more opportunities for scalable parallel performance. Despite these potential advantages, such algorithms have not yet found their way into mainstream AI applications, as they require significantly more powerful computational resources and must manage much larger amounts of data.
  • High-performance computing (HPC) brings more compute power to bear via parallel programming techniques and large-scale hardware clusters and will be required to satisfy the resource requirements of higher-order methods. That DL is not currently taking advantage of HPC resources is not due to lack of imagination or lack of initiative in the community.  Rather, matching the needs of DL systems with the capabilities of HPC platforms presents significant challenges that can only be met by coordinated advances across multiple disciplines.
  • Hardware architecture advances continue apace, with diversification and specialization increasingly being seen as a critical mechanism for increased performance. Cyberinfrastructure (CI) and runtime systems that insulate users from hardware changes, coupled with tools that support performance evaluation and adaptive optimization of AI applications, are increasingly important to achieving high user productivity, code portability, and application performance.

The colloquium, hosted by the Center for Computation and Technology (CCT) at LSU, brings together experts in algorithmic theory, artificial intelligence (AI), and high-performance computing (HPC), and aims to transform research in the broader field of AI and optimization. The first aspect of the colloquium is distributed AI frameworks, e.g. TensorFlow, PyTorch, Horovod, and Phylanx. One challenge here is the integration of accelerator devices and support for a wide variety of target architectures, since recent supercomputers are increasingly heterogeneous, with some nodes carrying accelerator cards and others only CPUs. A framework should be easy to deploy and maintain and should provide good portability and productivity; abstractions and a unified API that hide the zoo of accelerator devices from users are therefore important.
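
To make the portability concern concrete, the snippet below is a minimal PyTorch-style sketch (not tied to any of the frameworks named above) of selecting whichever accelerator is available at runtime, so the same script runs on GPU-equipped and CPU-only nodes alike.

    import torch

    def pick_device() -> torch.device:
        """Return the best available device so the same script runs anywhere."""
        if torch.cuda.is_available():          # accelerator present on this node
            return torch.device("cuda")
        return torch.device("cpu")             # fall back to CPU-only nodes

    device = pick_device()
    model = torch.nn.Linear(128, 10).to(device)   # model follows the chosen device
    x = torch.randn(32, 128, device=device)       # as do the input tensors
    logits = model(x)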

The second aspect is higher-order algorithms, e.g. second-order methods or Bayesian optimization. These methods can achieve higher accuracy but are more computationally intensive. We will look into both the theoretical and computational aspects of these methods.
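
As a toy illustration of the trade-off, the sketch below (plain NumPy, assuming an illustrative quadratic loss) compares a single gradient-descent step with a Newton step that additionally uses the Hessian; the second-order step adapts to curvature at the cost of forming and solving a linear system in the second-derivative matrix.

    import numpy as np

    # Toy quadratic loss 0.5 * w^T A w - b^T w, whose minimizer solves A w = b.
    A = np.array([[3.0, 0.2], [0.2, 1.0]])    # illustrative curvature (Hessian)
    b = np.array([1.0, -2.0])

    def grad(w):
        return A @ w - b                       # first-order information

    def hessian(w):
        return A                               # second-order information (constant here)

    w = np.zeros(2)

    # First-order step: follow the negative gradient with a fixed learning rate.
    w_gd = w - 0.1 * grad(w)

    # Second-order (Newton) step: rescale the gradient by the inverse Hessian;
    # for a quadratic loss this lands on the minimizer in a single step.
    w_newton = w - np.linalg.solve(hessian(w), grad(w))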

______________________________________________________________________________

Confirmed Speakers

  • Tue 9/20/22 @ 3:30 pm: Rajeev Shorey, The University of Queensland – IIT Delhi Academy of Research
  • Wed 10/05/22 @ 3:00 pm: Abhinav Bhatele, University of Maryland, College Park
  • Wed 11/02/22 @ 3:00 pm: Steve Sun, Columbia University
  • Wed 12/07/22 @ 3:00 pm: Ope Owoyele, Louisiana State University

Registration

Registration for the colloquium is free. Please complete your registration here: registration form

Logistics

This semester, we will have both Zoom and in-person presentations. In-person presentations will also be available through Zoom. We will keep this page up-to-date with information regarding which presentations will take place at LSU/CCT.

Local organizers

  • Patrick Diehl
  • Katie Bailey
  • Hartmut Kaiser
  • Mayank Tyagi

For questions or comments regarding the colloquium, please contact Katie Bailey.

Talks

Speaker:   Dr. Rajeev Shorey, The University of Queensland – IIT Delhi Academy of Research

Date:       Tues, Sept 20 @ 3:30 pm CDT           

Title:       Recent Investigations in Machine Learning and Edge Computing

Abstract:   The talk covers the following two recent investigations.

(1) Latency-Memory Optimized Splitting of Convolution Neural Networks for Resource Constrained Edge Devices

Abstract:

With the increasing reliance of users on smart devices, bringing essential computation to the edge has become a crucial requirement for any type of business. Many such computations utilize Convolutional Neural Networks (CNNs) to perform AI tasks, which have resource and computation requirements too high to be feasible on edge devices. Splitting the CNN architecture so that part of the computation is performed at the edge and the remainder on the cloud is an area of research that has seen increasing interest. In this work, we assert that running CNNs between an edge device and the cloud amounts to solving a resource-constrained optimization problem that minimizes latency and maximizes resource utilization at the edge. We formulate a multi-objective optimization problem and propose the LMOS algorithm to achieve a Pareto-efficient solution. Experiments on real-world edge devices show that LMOS ensures feasible execution of different CNN models at the edge and improves upon existing state-of-the-art approaches.
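
The LMOS algorithm itself is not reproduced here, but the following hypothetical PyTorch sketch shows the basic operation the work builds on: cutting a sequential CNN at a chosen layer index so the first part runs on the edge device and the remainder on a cloud server, with only the intermediate activation crossing the network.

    import torch
    import torch.nn as nn

    # A small CNN written as a Sequential so it can be cut at any layer index.
    cnn = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, 10),
    )

    split_point = 4                      # hypothetical split chosen by an optimizer
    edge_part = cnn[:split_point]        # layers executed on the edge device
    cloud_part = cnn[split_point:]       # layers executed on the cloud server

    x = torch.randn(1, 3, 64, 64)        # image captured at the edge
    intermediate = edge_part(x)          # only this tensor is sent over the network
    logits = cloud_part(intermediate)    # cloud finishes the inference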

(2) Federated Learning in a Faulty Edge Ecosystem: Analysis, Mitigation and Applications

Abstract:

Federated Learning (FL) deviates from the norm of "send data to model" to "send model to data". When used in an edge ecosystem, numerous heterogeneous edge devices, collecting data through different means and connected through different network channels, become involved in the training process. Failure of edge devices in such an ecosystem due to device faults or network issues is highly likely. In this work, we first analyze the impact of the number of edge devices on an FL model and provide a strategy for selecting an optimal number of devices to contribute to the model. We observe the impact of data distribution on this optimal number of devices. We then investigate how the edge ecosystem behaves when the selected devices fail and provide a mitigation strategy to ensure a robust Federated Learning technique. Finally, we design a real-world application to highlight the impact of the designed mitigation strategy.
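
As background for the "send model to data" idea, here is a minimal federated-averaging sketch in plain Python/NumPy (an illustrative assumption, not the authors' method): each surviving device trains locally on its own data and the server averages whatever parameters come back, skipping devices that failed during the round.

    import numpy as np

    def local_update(global_w, data):
        """One step of local training on a device (toy linear-regression gradient)."""
        X, y = data
        grad = X.T @ (X @ global_w - y) / len(y)
        return global_w - 0.1 * grad

    def fed_avg_round(global_w, devices):
        """Send the model to each device; average whatever comes back."""
        updates = []
        for data, alive in devices:
            if not alive:                 # device or network failure this round
                continue
            updates.append(local_update(global_w, data))
        return np.mean(updates, axis=0) if updates else global_w

    rng = np.random.default_rng(0)
    devices = [((rng.normal(size=(20, 3)), rng.normal(size=20)), rng.random() > 0.2)
               for _ in range(10)]        # roughly 20% of devices fail this round
    w = np.zeros(3)
    for _ in range(5):
        w = fed_avg_round(w, devices)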

 Bio:       Dr Rajeev Shorey is the CEO of The University of Queensland – IIT Delhi Academy of Research (UQIDAR).  Rajeev also serves as an adjunct faculty in the Computer Science & Engineering department at IIT Delhi.

Dr Shorey received his Ph.D. and M.S. (Engg) in Electrical Communication Engineering from the Indian Institute of Science (IISc), Bangalore, India in 1997 and 1991 respectively. He received his B.E degree in Computer Science and Engineering from IISc, Bangalore in 1987.

Dr Shorey’s career spans several reputed research labs – TCS Research & Innovation, General Motors (GM) India Science Laboratory (ISL), IBM India Research Laboratory and SASKEN Technologies. Dr Shorey served as the first President of NIIT University from 2009 to 2013 before joining the TCS Research Labs in 2014.

Dr Shorey’s work has resulted in more than 70 publications in international journals and conferences and several US patents, all in the area of wireless and wired networks. He has 12 issued US patents and several pending US patents to his credit. Dr Shorey serves on the editorial boards of two of the top journals in the area – IEEE Internet of Things Journal and Springer’s Journal of Wireless Networks. His areas of interest are Wireless Networks including 5G Networks, Telematics, IoT, Industrial IoT, IoT Security and Automotive Networks, including Automotive Cybersecurity.

For his contributions in the area of Communication Networks, Dr. Shorey was elected a Fellow of the Indian National Academy of Engineering in 2007. Dr Shorey was recognized by ACM as a Distinguished Scientist in December 2014. He is a Fellow of the Institution of Electronics and Telecommunication Engineers, India and a Senior Member of IEEE.


______________________________________________________________________________

Speaker:         Dr. Abhinav Bhatele, University of Maryland, College Park 

Date:             Wed, Oct 5 @ 3:00 pm CDT     

Title:      Rethinking the Parallelization of Extreme-scale Deep Learning

Abstract:      The rapid increase in memory capacity and computational power of modern architectures, especially accelerators, in large data centers and supercomputers, has led to a frenzy in training extremely large deep neural networks. However, efficient use of large parallel resources for extreme-scale deep learning requires scalable algorithms coupled with high-performing implementations on such machines. In this talk, I will present AxoNN, a parallel deep learning framework that exploits asynchrony and message-driven execution to optimize work scheduling and communication, which are often critical bottlenecks in achieving high performance. I will also discuss different approaches for memory savings such as using CPU memory as a scratch pad, and magnitude-based parameter pruning. Integrating these approaches with AxoNN enables us to train large models using fewer GPUs, and also helps reduce the volume of communication sent over the network.     
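
One of the memory-saving ideas mentioned above, magnitude-based parameter pruning, can be illustrated with PyTorch's built-in pruning utility (a generic sketch, not AxoNN's implementation): the weights with the smallest magnitudes are masked out, shrinking the effective model.

    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    layer = nn.Linear(1024, 1024)

    # Zero out the 30% of weights with the smallest absolute value (L1 magnitude).
    prune.l1_unstructured(layer, name="weight", amount=0.3)

    sparsity = (layer.weight == 0).float().mean().item()
    print(f"fraction of pruned weights: {sparsity:.2f}")   # ~0.30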

Bio:     Abhinav Bhatele is an assistant professor in the department of computer science, and director of the Parallel Software and Systems Group at the University of Maryland, College Park. His research interests are broadly in systems and networks, with a focus on parallel computing and large-scale data analytics. He has published research in parallel programming models and runtimes, network design and simulation, applications of machine learning to parallel systems, parallel deep learning, and on analyzing/visualizing, modeling and optimizing the performance of parallel software and systems. Abhinav has received best paper awards at Euro-Par 2009, IPDPS 2013 and IPDPS 2016. Abhinav was selected as a recipient of the IEEE TCSC Young Achievers in Scalable Computing award in 2014, the LLNL Early and Mid-Career Recognition award in 2018, and the NSF CAREER award in 2021. Abhinav received a B.Tech. degree in Computer Science and Engineering from I.I.T. Kanpur, India in May 2005, and M.S. and Ph.D. degrees in Computer Science from the University of Illinois at Urbana-Champaign in 2007 and 2010 respectively. He was a post-doc and later computer scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory from 2011-2019.

______________________________________________________________________________

Speaker:         Dr. Steve Sun, Columbia University

Date:      Wed, Nov 2 @ 3:00 pm CST 

Title:      Graph embedding for interpretable multiscale plasticity

Abstract:         The history-dependent behaviors of classical plasticity models are often driven by internal variables evolved according to phenomenological laws. The difficulty of interpreting how these internal variables represent a history of deformation, the lack of direct measurement of these internal variables for calibration and validation, and the weak physical underpinning of those phenomenological laws have long been criticized as barriers to creating realistic models. In this work, geometric machine learning on graph data (e.g., finite element solutions) is used as a means to establish a connection between nonlinear dimensionality-reduction techniques and plasticity models. Geometric learning-based encoding on graphs allows the embedding of rich time-history data onto a low-dimensional Euclidean space such that the evolution of plastic deformation can be predicted in the embedded feature space. A corresponding decoder can then convert these low-dimensional internal variables back into a weighted graph such that the dominating topological features of plastic deformation can be observed and analyzed.
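
A very rough sketch of the encode/decode idea follows (plain PyTorch on a dense adjacency matrix, purely illustrative and not the speaker's geometric-learning architecture): node features are aggregated over the graph, compressed to a low-dimensional code playing the role of the internal variables, and decoded back into a weighted adjacency matrix.

    import torch
    import torch.nn as nn

    class GraphAutoencoder(nn.Module):
        def __init__(self, n_nodes, n_feats, code_dim=4):
            super().__init__()
            self.encode = nn.Sequential(nn.Linear(n_feats, 32), nn.ReLU(),
                                        nn.Linear(32, code_dim))
            self.decode = nn.Sequential(nn.Linear(code_dim, 32), nn.ReLU(),
                                        nn.Linear(32, n_nodes * n_nodes))
            self.n_nodes = n_nodes

        def forward(self, adj, feats):
            # One round of neighborhood aggregation, then pool to a graph-level code.
            h = adj @ feats                       # aggregate neighbor features
            code = self.encode(h).mean(dim=0)     # low-dimensional "internal variables"
            # Decode the code back into a weighted adjacency matrix.
            adj_hat = self.decode(code).view(self.n_nodes, self.n_nodes)
            return code, adj_hat

    n = 8
    adj = (torch.rand(n, n) > 0.7).float()        # toy graph
    feats = torch.randn(n, 5)                     # toy node features
    model = GraphAutoencoder(n_nodes=n, n_feats=5)
    code, adj_hat = model(adj, feats)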

Bio:    Dr. Sun is an associate professor at Columbia University and UPS Foundation visiting professor at Stanford University. He obtained his B.S. from UC Davis (2005); M.S. in civil engineering (geomechanics) from Stanford (2007); M.A. (Civil Engineering) from Princeton (2008); and Ph.D. in theoretical and applied mechanics from Northwestern (2011). Sun’s research focuses on theoretical, computational, and data-driven mechanics for porous and energetic materials. He is the recipient of the IACM John Argyris Award (2022), NSF CAREER Award (2019), the EMI Leonardo da Vinci Award (2018), the Zienkiewicz Numerical Methods Engineering Prize (2017), AFOSR Young Investigator Program Award (2017), Dresden Fellowship (2016), ARO Young Investigator Program Award (2015), and the Caterpillar Best Paper Prize (2014).

______________________________________________________________________________

Speaker:          Dr. Ope Owoyele, Louisiana State University

Date:       Wed, Dec 7 @ 3:00 pm CST

Title:      Machine learning techniques for simulation-driven design optimization

Abstract:        An important component of the design process for new reactive flow devices lies in optimizing them for efficiency under constraints relating to undesirable emissions, thermo-mechanical limits, and stability. Computational modeling can play a vital role in this process, whereby design optimization can be performed using computational fluid dynamics (CFD) simulations to identify promising designs for experimental prototyping. However, CFD simulations of such systems are compute-intensive because they involve capturing multi-physical processes that include turbulent gas dynamics, liquid spray injection and breakup, chemical kinetics, heat transfer, and their complex interactions. In this talk, I will present a mixture of deep experts approach that automatically divides modeling tasks amongst specialized learners, leading to simulations that capture experimental trends with reduced computational costs. Efficient simulation-driven design optimization depends not only on tractable and predictive computational models, but also on optimizers that can drive design decisions by using these simulations to quickly identify promising designs. Accordingly, I will also talk about a novel approach that employs reinforcement learning to rapidly discover domain-specific and simulation-efficient optimizers. I will conclude my talk by outlining lingering challenges and future directions.
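
For intuition about the mixture-of-experts idea, here is a hypothetical PyTorch sketch (not the speaker's model): a gating network produces soft weights that decide how much each specialized expert contributes to the prediction for a given input.

    import torch
    import torch.nn as nn

    class MixtureOfExperts(nn.Module):
        def __init__(self, in_dim, out_dim, n_experts=4):
            super().__init__()
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))
                for _ in range(n_experts)
            )
            self.gate = nn.Linear(in_dim, n_experts)   # learns which expert to trust

        def forward(self, x):
            weights = torch.softmax(self.gate(x), dim=-1)                 # (batch, n_experts)
            outputs = torch.stack([e(x) for e in self.experts], dim=-1)   # (batch, out, n_experts)
            return (outputs * weights.unsqueeze(1)).sum(dim=-1)           # weighted combination

    model = MixtureOfExperts(in_dim=10, out_dim=1)
    y = model(torch.randn(32, 10))    # each sample is routed softly to the experts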

Bio:        Opeoluwa (Ope) Owoyele has been an Assistant Professor of Mechanical Engineering at Louisiana State University since August 2021. Before joining LSU, he was a Postdoctoral Appointee in the Computational Multi-Physics Research Section at Argonne National Laboratory (ANL). Prior to this, he was a recipient of the ORISE postgraduate fellowship to perform research at the National Energy Technology Laboratory (NETL). He obtained his Master’s and Ph.D. degrees in Mechanical Engineering from North Carolina State University. At ANL, he received an Impact Argonne Award in the category of Innovation and a Postdoctoral Performance Award in the Engineering Research category. He is also the recipient of the R&D 100 Award for developing a machine learning-genetic algorithm for rapid product design optimization. His research interests lie at the intersection of numerical methods, data science, machine learning, and high-performance computing for design optimization and data-derived reduced-order modeling.