HAKIZUMWAMI BIRALI RUNESHA, PH.D.
Hakizumwami Birali Runesha is Associate Vice President, Research Computing Center (RCC), at The University of Chicago. He is the current President of the Great Lakes Consortium for Petascale Computation (GLCPC)-USA, a member of Rwanda’s National Council of Science and Technology, and a founding member of Intel and Lenovo’s Project Everyscale, an exascale visionary council. Dr. Runesha is a former Director of Scientific Computing and Applications at the University of Minnesota Supercomputing Institute, Research Associate at the Hong Kong University of Science and Technology, and Assistant Professor in the Civil Engineering Department at the University of Kinshasa. He serves as principal investigator on multiple funded research grants and has more than 28 years of experience in high performance computing, big data, and scientific software development. His research interests are in parallel computing, sparse numerical libraries, AI/deep learning, finite element analysis, and design optimization in engineering.
Topic: Reproducibility of published scientific research in HPC: Myth or reality?
Researchers today run simulations on High Performance Computing (HPC) systems and generate a vast array of research artifacts, such as data, code, algorithms, and a diversity of software tools, that need to be shared. However, scholarly publications are still mostly disconnected from the underlying data and application software used to produce the published results and findings. Funding agencies and publishers increasingly require researchers to share the data used to generate the results in a publication, but that data is useful only if the results can be reproduced. Reproduction is often not easy for applications running on scalable HPC systems with thousands of processors and petabytes of storage. This presentation will discuss a funded project to acquire and operate an extensible Data Lifecycle instrument (DaLI) for the management and sharing of data, which enables researchers to (i) acquire, transfer, process, and store data from computations, experiments, and observations in a unified workflow, (ii) manage data collections over their entire life cycle, and (iii) share and publish data. The talk will cover available solutions that are believed to partially address the problem and present an approach and platform developed to make data shareable, findable, and reusable and results reproducible. It will also discuss the role of such platforms in conducting scientific research and the need to update university curricula.