big data - RISC2 Project https://www.risc2-project.eu

Centro de Análisis de Datos y Supercómputo https://www.risc2-project.eu/2023/06/12/centro-de-analisis-de-datos-y-supercomputo/ Mon, 12 Jun 2023 14:05:09 +0000

  • System name: Centro de Análisis de Datos y Supercómputo
  • Location: Universidad de Guadalajara
  • Areas: Big data, mathematics and engineering
  • Web
The post Centro de Análisis de Datos y Supercómputo first appeared on RISC2 Project.
Latin American researchers present greener gateways for Big Data in INRIA Brazil Workshop https://www.risc2-project.eu/2023/05/03/latin-american-researchers-present-greener-gateways-for-big-data-in-inria-brazil-workshop/ Wed, 03 May 2023 13:29:03 +0000

In the scope of the RISC2 Project, the State University of São Paulo and INRIA (Institut National de Recherche en Informatique et en Automatique), a renowned French research institute, held a workshop that set the stage for presenting the results accomplished under the work "Developing Efficient Scientific Gateways for Bioinformatics in Supercomputer Environments Supported by Artificial Intelligence".

The goal of the investigation is to provide users with simplified access to computing infrastructures through scientific solutions that represent significant developments in their fields. In this project, the aim is to develop intelligent, green scientific solutions for BioinfoPortal (a multiuser Brazilian infrastructure) supported by High-Performance Computing environments.

Technologically, it draws on areas such as scientific workflows, data mining, machine learning, and deep learning. If successful, the analysis and interpretation of Big Data will open new paths in molecular biology, genetics, biomedicine, and health; this makes tools capable of efficiently digesting the incoming volume of information a necessity.

The team performed several large-scale bioinformatics experiments that are considered computationally intensive. Currently, artificial intelligence is being used to generate models that analyse computational and bioinformatics metadata in order to understand how automatic learning can predict the efficient use of computational resources. The workshop was held from April 10th to 11th at the University of São Paulo.
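The idea of predicting resource needs from execution metadata can be sketched with a deliberately tiny model. The project applies richer machine learning to real computational and bioinformatics metadata; the single-feature linear fit, the numbers, and the names below are illustrative assumptions only.

```python
# Hypothetical sketch: fit a simple linear model that predicts a task's
# runtime from one piece of execution metadata (input size).

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Toy execution history: (input size in MB, observed runtime in seconds)
sizes = [100, 200, 400, 800]
runtimes = [12.0, 22.0, 42.0, 82.0]

a, b = fit_line(sizes, runtimes)
predicted = a * 1600 + b  # estimated runtime for a 1600 MB input
```

In practice the interesting metadata is multi-dimensional (cores, memory, node type, tool parameters), which is why the project turns to machine-learning models rather than a hand-fitted line.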

The RISC2 Project, which aims to explore the impact of HPC on the economies of Latin America and Europe, relies on the interaction between researchers and policymakers in both regions. It includes 16 academic partners, such as the University of Buenos Aires, the National Laboratory for High Performance Computing of Chile, the Jülich Supercomputing Centre, and the Barcelona Supercomputing Center (the leader of the consortium), among others.

    The post Latin American researchers present greener gateways for Big Data in INRIA Brazil Workshop first appeared on RISC2 Project.

Developing Efficient Scientific Gateways for Bioinformatics in Supercomputer Environments Supported by Artificial Intelligence https://www.risc2-project.eu/2023/03/20/developing-efficient-scientific-gateways-for-bioinformatics-in-supercomputer-environments-supported-by-artificial-intelligence/ Mon, 20 Mar 2023 09:37:46 +0000

Scientific gateways bring enormous benefits to end users by simplifying access to, and hiding the complexity of, the underlying distributed computing infrastructure, but they require significant development and maintenance effort. BioinfoPortal [1], through its CSGrid [2] middleware, takes advantage of the heterogeneous resources of Santos Dumont [3]. However, task submission still involves a substantial step: deciding the configuration that leads to efficient execution. This project aims to develop green and intelligent scientific gateways for BioinfoPortal supported by high-performance computing (HPC) environments and specialised technologies such as scientific workflows, data mining, machine learning, and deep learning.

The efficient analysis and interpretation of Big Data opens new challenges in molecular biology, genetics, biomedicine, and healthcare, towards improved personalised diagnostics and therapeutics; finding new avenues to deal with this massive amount of information becomes necessary. New Bioinformatics and Computational Biology paradigms drive storage, management, and data access. HPC and Big Data advances in this domain represent both a vast new field of opportunities for bioinformatics researchers and a significant challenge.

The BioinfoPortal science gateway is a multiuser Brazilian infrastructure. We present several challenges for efficiently executing applications and discuss our findings on improving the use of computational resources. We performed several large-scale bioinformatics experiments that are considered computationally intensive and time-consuming. We are currently coupling artificial intelligence to generate models that analyse computational and bioinformatics metadata in order to understand how automatic learning can predict the efficient use of computational resources.

The computational executions are conducted on Santos Dumont, the largest supercomputer in Latin America, dedicated to the research community, with 5.1 Petaflops and 36,472 computational cores distributed across 1,134 computational nodes.

    By:

A. Carneiro, B. Fagundes, C. Osthoff, G. Freire, K. Ocaña, L. Cruz, L. Gadelha, M. Coelho, M. Galheigo, and R. Terra are with the National Laboratory of Scientific Computing, Rio de Janeiro, Brazil.

D. Carvalho is with the Federal Center for Technological Education Celso Suckow da Fonseca, Rio de Janeiro, Brazil.

D. Cardoso is with the Polytechnic Institute of Tomar, Portugal.

F. Boito and L. Teylo are with the University of Bordeaux, CNRS, Bordeaux INP, INRIA, LaBRI, Talence, France.

P. Navaux is with the Informatics Institute, Federal University of Rio Grande do Sul, Rio Grande do Sul, Brazil.

    References:

    Ocaña, K. A. C. S.; Galheigo, M.; Osthoff, C.; Gadelha, L. M. R.; Porto, F.; Gomes, A. T. A.; Oliveira, D.; Vasconcelos, A. T. BioinfoPortal: A scientific gateway for integrating bioinformatics applications on the Brazilian national high-performance computing network. Future Generation Computer Systems, v. 107, p. 192-214, 2020.

    Mondelli, M. L.; Magalhães, T.; Loss, G.; Wilde, M.; Foster, I.; Mattoso, M. L. Q.; Katz, D. S.; Barbosa, H. J. C.; Vasconcelos, A. T. R.; Ocaña, K. A. C. S; Gadelha, L. BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments. PeerJ, v. 1, p. 1, 2018.

Coelho, M.; Freire, G.; Ocaña, K.; Osthoff, C.; Galheigo, M.; Carneiro, A. R.; Boito, F.; Navaux, P.; Cardoso, D. O. Desenvolvimento de um Framework de Aprendizado de Máquina no Apoio a Gateways Científicos Verdes, Inteligentes e Eficientes: BioinfoPortal como Caso de Estudo Brasileiro. In: XXIII Simpósio em Sistemas Computacionais de Alto Desempenho – WSCAD 2022 (https://wscad.ufsc.br/), 2022.

    Terra, R.; Ocaña, K.; Osthoff, C.; Cruz, L.; Boito, F.; Navaux, P.; Carvalho, D. Framework para a Construção de Redes Filogenéticas em Ambiente de Computação de Alto Desempenho. In: XXIII Simpósio em Sistemas Computacionais de Alto Desempenho – WSCAD 2022 (https://wscad.ufsc.br/), 2022.

    Ocaña, K.; Cruz, L.; Coelho, M.; Terra, R.; Galheigo, M.; Carneiro, A.; Carvalho, D.; Gadelha, L.; Boito, F.; Navaux, P.; Osthoff, C. ParslRNA-Seq: an efficient and scalable RNAseq analysis workflow for studies of differentiated gene expression. In: Latin America High-Performance Computing Conference (CARLA), 2022, Rio Grande do Sul, Brazil. Proceedings of the Latin American High-Performance Computing Conference – CARLA 2022 (http://www.carla22.org/), 2022.

    [1] https://bioinfo.lncc.br/

    [2] https://git.tecgraf.puc-rio.br/csbase-dev/csgrid/-/tree/CSGRID-2.3-LNCC

[3] https://sdumont.lncc.br

    The post Developing Efficient Scientific Gateways for Bioinformatics in Supercomputer Environments Supported by Artificial Intelligence first appeared on RISC2 Project.

    14th International SuperComputing Camp 2023 https://www.risc2-project.eu/events/14th-international-supercomputing-camp-2023/ Mon, 27 Feb 2023 12:31:53 +0000 https://www.risc2-project.eu/?post_type=mec-events&p=2763

    The post 14th International SuperComputing Camp 2023 first appeared on RISC2 Project.

Managing Data and Machine Learning Models in HPC Applications https://www.risc2-project.eu/2022/11/21/managing-data-and-machine-learning-models-in-hpc-applications/ Mon, 21 Nov 2022 14:09:42 +0000

The synergy of data science (including big data and machine learning) and HPC yields many benefits for data-intensive applications in terms of more accurate predictive data analysis and better decision making. For instance, in the context of the HPDaSc (High Performance Data Science) project between Inria and Brazil, we have shown the importance of real-time analytics for making critical, high-consequence decisions in HPC applications (e.g., preventing useless drilling based on a driller's real-time data, together with real-time visualization of simulated data), as well as the effectiveness of ML in dealing with scientific data (e.g., computing Probability Density Functions (PDFs) over simulated seismic data using Spark).
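The PDF computation mentioned above can be illustrated in miniature. A minimal sketch, assuming a normalized histogram as the density estimator: the actual work ran over seismic data with Spark, while plain Python and Gaussian toy samples stand in for both here.

```python
# Pure-Python sketch: estimate an empirical Probability Density
# Function (PDF) from simulated samples via a normalized histogram.
import random

def empirical_pdf(samples, bins=10):
    """Return (bin_start, density) pairs; the densities integrate to 1."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        i = min(int((x - lo) / width), bins - 1)  # clamp the max sample
        counts[i] += 1
    n = len(samples)
    # density per bin = fraction of samples divided by bin width
    return [(lo + i * width, c / (n * width)) for i, c in enumerate(counts)]

random.seed(42)
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]
pdf = empirical_pdf(samples, bins=20)
```

In a distributed setting the per-bin counting step is what gets parallelized: each worker histograms its partition and the partial counts are summed, which is why the computation maps naturally onto Spark-style aggregation.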

    However, to realize the full potential of this synergy, ML models (or models for short) must be built, combined and ensembled, which can be very complex as there can be many models to select from. Furthermore, they should be shared and reused, in particular, in different execution environments such as HPC or Spark clusters.

To address this problem, we proposed Gypscie [Porto 2022, Zorrilla 2022], a new framework that supports the entire ML lifecycle and enables model reuse and import from other frameworks. The approach behind Gypscie is to combine, in a single framework, rich capabilities for model and data management and for model execution that are typically provided by separate tools. Overall, Gypscie provides: a platform supporting the complete model life-cycle, from model building to deployment, monitoring and policy enforcement; an environment for casual users to find ready-to-use models that best fit a particular prediction problem; an environment to optimize ML task scheduling and execution; an easy way for developers to benchmark their models against competing models and improve them; a central point of access to assess models' compliance with policies and ethics and to obtain and curate observational and predictive data; and provenance information and model explainability. Finally, Gypscie interfaces with multiple execution environments to run ML tasks, e.g., an HPC system such as the Santos Dumont supercomputer at LNCC or a Spark cluster.

Gypscie comes with SAVIME [Silva 2020], a multidimensional array in-memory database system for importing, storing and querying model (tensor) data. The open-source SAVIME system has been developed to support analytical queries over scientific data. It offers an extremely efficient ingestion procedure, which practically eliminates the waiting time to analyze incoming data. It also supports dense and sparse arrays and non-integer dimension indexing, and it provides a functional query language processed by a query optimiser that generates efficient query execution plans.

     

    References

    [Porto 2022] Fabio Porto, Patrick Valduriez: Data and Machine Learning Model Management with Gypscie. CARLA 2022 – Workshop on HPC and Data Sciences meet Scientific Computing, SCALAC, Sep 2022, Porto Alegre, Brazil. pp.1-2. 

    [Zorrilla 2022] Rocío Zorrilla, Eduardo Ogasawara, Patrick Valduriez, Fabio Porto: A Data-Driven Model Selection Approach to Spatio-Temporal Prediction. SBBD 2022 – Brazilian Symposium on Databases, SBBD, Sep 2022, Buzios, Brazil. pp.1-12. 

    [Silva 2020] A.C. Silva, H. Lourenço, D. Ramos, F. Porto, P. Valduriez. Savime: An Array DBMS for Simulation Analysis and Prediction. Journal of Information Data Management 11(3), 2020. 

     

    By LNCC and Inria 

    The post Managing Data and Machine Learning Models in HPC Applications first appeared on RISC2 Project.

HPC meets AI and Big Data https://www.risc2-project.eu/2022/10/06/hpc-meets-ai-and-big-data/ Thu, 06 Oct 2022 08:23:34 +0000

HPC services are no longer solely targeted at highly parallel modelling and simulation tasks. Indeed, the computational power offered by these services is now being used to support data-centric Big Data and Artificial Intelligence (AI) applications. By combining both types of computational paradigms, HPC infrastructures will be key for improving the lives of citizens, speeding up scientific breakthroughs in different fields (e.g., health, IoT, biology, chemistry, physics), and increasing the competitiveness of companies [OG+15, NCR+18].

As the utility and usage of HPC infrastructures increase, more computational and storage power is required to efficiently handle the growing number of targeted applications. In fact, many HPC centers are now aiming at exascale supercomputers supporting at least one exaFLOPS (10^18 operations per second), which represents a thousandfold increase in processing power over the first petascale computer deployed in 2008 [RD+15]. Although this is a necessary requirement for handling the increasing number of HPC applications, several outstanding challenges still need to be tackled so that this extra computational power can be fully leveraged.
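The thousandfold claim is just the ratio of the two scales, spelled out here for concreteness:

```python
# Exascale vs. petascale, in operations per second
PETA = 10 ** 15   # first petascale system, deployed in 2008
EXA = 10 ** 18    # exascale target
factor = EXA // PETA  # the thousandfold increase cited in the text
```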

Management of large infrastructures and heterogeneous workloads: By adding more compute and storage nodes, one is also increasing the complexity of the overall HPC distributed infrastructure, making it harder to monitor and manage. This complexity is further increased by the need to support highly heterogeneous applications that translate into different workloads with specific data storage and processing needs [ECS+17]. For example, on the one hand, traditional scientific modeling and simulation tasks require large slices of computational time, are CPU-bound, and rely on iterative approaches (parametric/stochastic modeling). On the other hand, data-driven Big Data applications comprise shorter computational tasks that are I/O-bound and, in some cases, have real-time response requirements (i.e., latency-oriented). Also, many of these applications leverage AI and machine learning tools that require specific hardware (e.g., GPUs) in order to be efficient.

Support for general-purpose analytics: The increased heterogeneity also demands that HPC infrastructures be able to support general-purpose AI and Big Data applications that were not designed explicitly to run on specialised HPC hardware [KWG+13], so that developers are not required to significantly change their applications in order to execute them efficiently on HPC clusters.

Avoiding the storage bottleneck: Increasing computational power and improving the management of HPC infrastructures alone may still not be enough to fully harness the capabilities of these infrastructures. In fact, Big Data and AI applications are data-driven and require efficient data storage and retrieval from HPC clusters. With an increasing number of applications and heterogeneous workloads, the storage systems supporting HPC may easily become a bottleneck [YDI+16, ECS+17]. Indeed, as pointed out by several studies, storage access time is one of the major bottlenecks limiting the efficiency of current and next-generation HPC infrastructures.

In order to address these challenges, RISC2 partners are exploring:

  • New monitoring and debugging tools that can aid in the analysis of complex AI and Big Data workloads in order to pinpoint potential performance and efficiency bottlenecks, while helping system administrators and developers troubleshoot these [ENO+21].

  • Emerging virtualization technologies, such as containers, that enable users to efficiently deploy and execute traditional AI and Big Data applications in an HPC environment, without requiring any changes to their source code [FMP21].

  • The Software-Defined Storage paradigm, in order to improve the Quality-of-Service (QoS) of HPC storage services when supporting hundreds to thousands of data-intensive AI and Big Data applications [DLC+22, MTH+22].

    To sum up, these three research goals, and respective contributions, will enable the next generation of HPC infrastructures and services that can efficiently meet the demands of Big Data and AI workloads. 

     

    References

    [DLC+22] Dantas, M., Leitão, D., Cui, P., Macedo, R., Liu, X., Xu, W., Paulo, J., 2022. Accelerating Deep Learning Training Through Transparent Storage Tiering. IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid)  

    [ECS+17] Joseph, E., Conway, S., Sorensen, B., Thorp, M., 2017. Trends in the Worldwide HPC Market (Hyperion Presentation). HPC User Forum at HLRS.  

    [FMP21] Faria, A., Macedo, R., Paulo, J., 2021. Pods-as-Volumes: Effortlessly Integrating Storage Systems and Middleware into Kubernetes. Workshop on Container Technologies and Container Clouds (WoC’21). 

    [KWG+13] Katal, A., Wazid, M. and Goudar, R.H., 2013. Big data: issues, challenges, tools and good practices. International conference on contemporary computing (IC3). 

    [NCR+18] Netto, M.A., Calheiros, R.N., Rodrigues, E.R., Cunha, R.L. and Buyya, R., 2018. HPC cloud for scientific and business applications: Taxonomy, vision, and research challenges. ACM Computing Surveys (CSUR). 

    [MTH+22] Macedo, R., Tanimura, Y., Haga, J., Chidambaram, V., Pereira, J., Paulo, J., 2022. PAIO: General, Portable I/O Optimizations With Minor Application Modifications. USENIX Conference on File and Storage Technologies (FAST). 

    [OG+15] Osseyran, A. and Giles, M. eds., 2015. Industrial applications of high-performance computing: best global practices. 

    [RD+15] Reed, D.A. and Dongarra, J., 2015. Exascale computing and big data. Communications of the ACM. 

    [ENO+21] Esteves, T., Neves, F., Oliveira, R., Paulo, J., 2021. CaT: Content-aware Tracing and Analysis for Distributed Systems. ACM/IFIP Middleware conference (Middleware). 

    [YDI+16] Yildiz, O., Dorier, M., Ibrahim, S., Ross, R. and Antoniu, G., 2016, May. On the root causes of cross-application I/O interference in HPC storage systems. IEEE International Parallel and Distributed Processing Symposium (IPDPS). 

     

    By INESC TEC

    The post HPC meets AI and Big Data first appeared on RISC2 Project.

Webinar: Interactive High-Performance Computing with JupyterLab https://www.risc2-project.eu/events/webinar-2-interactive-high-performance-computing-with-jupyterlab/ Tue, 26 Jul 2022 12:31:35 +0000


    Date: September 22, 2022 | 4 p.m. (UTC+1)

    Speaker: Jens Henrik Göbbert, JSC

    Moderator: Esteban Mocskos, Universidad de Buenos Aires

Interactive exploration and analysis of large amounts of data from scientific simulations, in-situ visualization, and application control are convincing scenarios for explorative sciences. Based on the open-source software Jupyter or JupyterLab, it has been possible for some time to combine interactive with reproducible computing while also meeting the challenge of supporting a wide range of different software workflows.

Even on supercomputers, this method enables the creation of documents that combine live code with narrative text, mathematical equations, visualizations, interactive controls, and other rich output. However, a number of challenges must be mastered in order to make existing workflows ready for interactive high-performance computing. With so many possibilities, it is easy to lose sight of the big picture. This webinar provides a compact introduction to interactive high-performance computing.

    Speaker’s presentation is available here.

About the Speaker: Jens Henrik Göbbert graduated in mechanical engineering in 2006 and worked until 2014 as a research assistant at the Institute for Technical Combustion in the area of turbulence modelling and high-performance computing. He then joined the cross-sectional group "Immersive Visualization" of the Jülich Aachen Research Alliance (part of the Virtual Reality Group of the IT Center at RWTH Aachen University). In 2016, he became part of the cross-sectional team "Visualization" of the Jülich Supercomputing Centre at FZJ as an expert in the visualization of large scientific data sets, in-situ visualization and coupling, and interactive supercomputing.

About the Moderator: Esteban Mocskos is a full-time professor at Universidad de Buenos Aires (UBA) and a researcher at the Center for Computer Simulation (CSC-CONICET). He received his Ph.D. in Computer Science from UBA in 2008 and was a postdoc in the Protein Modelling group at UBA. His research interests include distributed systems and blockchain, computer networks, processor architecture, and parallel programming. He is part of the steering committee of the Latin American HPC conference CARLA and one of the committee members of Argentina's National HPC system.

    The post Webinar: Interactive High-Performance Computing with JupyterLab first appeared on RISC2 Project.

Kabré Supercomputer https://www.risc2-project.eu/2022/04/22/kabre-supercomputer/ Fri, 22 Apr 2022 09:12:36 +0000

  • Title: Kabré Supercomputer
  • System name: Kabré Supercomputer
  • Location: National High Technology Center (CeNAT)
  • Web
  • OS: Linux CentOS 7.2
  • Country: Costa Rica
  • Processor architecture:
    • Simulation nodes (32 nodes): Intel Xeon Phi KNL, 64 physical cores, 96 GB of main memory
    • Data science nodes (5 nodes): Intel Xeon, 24 physical cores, 16-128 GB main memory
    • Machine learning nodes (8 nodes): Intel Xeon, 16 physical cores, 16-32 GB main memory, half the nodes with NVIDIA K40 GPU, half with NVIDIA V100 GPU
    • Bioinformatics nodes (7 nodes): Intel Xeon, 24 physical cores, 512-1024 GB main memory
    • Storage capacity: 120 TB
  • Access Policy: Restricted to students and staff of all public universities in Costa Rica
  • Main research domains: Full computational science spectrum, big data, artificial intelligence, bioinformatics
The post Kabré Supercomputer first appeared on RISC2 Project.
National Laboratory for Scientific Computing participated in the ISC2021 https://www.risc2-project.eu/2021/08/13/national-laboratory-for-scientific-computing-participated-in-the-isc2021/ Fri, 13 Aug 2021 09:55:06 +0000

    The National Laboratory for Scientific Computing (LNCC), one of the RISC2 partners from Brazil, presented two posters at the Event for High Performance Computing, Machine Learning and Data Analysis (ISC) 2021.

The posters "Developing Efficient Scientific Gateways for Bioinformatics in Supercomputing Environments Supported by Artificial Intelligence" and "Scalable Numerical Method for Biphasic Flows in Heterogeneous Porous Media in High-Performance Computational Environments" are part of the activities of LNCC within the RISC2 project.

According to Carla Osthoff (LNCC), the former poster presents a collaboration project that aims to develop green and intelligent scientific gateways for bioinformatics supported by high-performance computing (HPC) environments and specialized technologies such as scientific workflows, data mining, machine learning, and deep learning. The efficient analysis and interpretation of Big Data opens new challenges to explore molecular biology, genetics, biomedicine, and healthcare to improve personalized diagnostics and therapeutics, making it necessary to find new avenues to deal with this massive amount of information. New paradigms in Bioinformatics and Computational Biology drive the storing, managing, and accessing of data. HPC and Big Data advances in this domain represent both a vast new field of opportunities for bioinformatics researchers and a significant challenge. The BioinfoPortal science gateway is a multiuser Brazilian infrastructure for bioinformatics applications, benefiting from the HPC infrastructure. The poster presents several challenges for efficiently executing applications and discusses findings on how to improve the use of computational resources. The team performed several large-scale bioinformatics experiments that are considered computationally intensive and time-consuming, and is currently coupling artificial intelligence to generate models that analyze computational and bioinformatics metadata to understand how automatic learning can predict the efficient use of computational resources. The computational executions are carried out on the Santos Dumont supercomputer. This multi-disciplinary project requires expertise from several knowledge areas across four research institutes (LNCC, UFRGS, INRIA Bordeaux, and CeNAT in Costa Rica). Finally, the Brazilian funding agencies (CNPq, CAPES) and the European RISC2 project support the work.

The latter poster presents a project that aims to develop a scalable numerical approach for biphasic flows in heterogeneous porous media in high-performance computing environments. In this system, an elliptic subsystem determines the velocity field, and a non-linear hyperbolic equation represents the transport of the flowing phases (the saturation equation). The model applies a locally conservative finite element method for the mixture velocity, and a high-order, non-oscillatory finite volume method, based on central schemes, for the non-linear hyperbolic equation that governs phase saturation. Specifically, the project aims to build scalable codes for high-performance environments. Having identified the bottlenecks in the code, the project is now working in four research areas: parallel I/O routines and high-performance visualization, to decrease the I/O transfer bottleneck; parallel programming, to reduce code bottlenecks on multicore and manycore architectures; and adaptive MPI, to decrease the message-communication bottleneck. The poster presents the first performance evaluation results, which are used to guide these research areas. This is a multi-disciplinary endeavour requiring expertise from several knowledge areas across four research institutes (LNCC, UFRGS, and UFLA in Brazil, and CeNAT in Costa Rica), supported by the Brazilian funding agencies (CNPq, CAPES) and the RISC2 project.
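To make the central-scheme idea concrete, here is a minimal sketch of a classical first-order central scheme (Lax–Friedrichs) for a 1-D scalar conservation law u_t + f(u)_x = 0, the family of methods described for the saturation equation. The flux, grid, and initial data are toy assumptions, not the project's actual model.

```python
# Toy Lax-Friedrichs step for u_t + f(u)_x = 0 on a periodic 1-D grid.
# Illustrative only: the project's saturation equation is more involved.

def lax_friedrichs_step(u, flux, dx, dt):
    n = len(u)
    new_u = [0.0] * n
    for i in range(n):
        left, right = u[(i - 1) % n], u[(i + 1) % n]  # periodic boundary
        new_u[i] = (0.5 * (left + right)
                    - dt / (2.0 * dx) * (flux(right) - flux(left)))
    return new_u

# Linear advection, f(u) = u, as the simplest possible flux
n_cells = 32
dx = 1.0 / n_cells
dt = 0.5 * dx  # CFL number 0.5 for unit wave speed
u = [1.0 if 8 <= i < 16 else 0.0 for i in range(n_cells)]
initial_mass = sum(u) * dx

for _ in range(20):
    u = lax_friedrichs_step(u, lambda v: v, dx, dt)

# Being in conservation form, the scheme preserves total "saturation"
final_mass = sum(u) * dx
```

The conservation-form update is what the poster's phrase "locally conservative" refers to: each cell's change is a difference of fluxes at its interfaces, so mass is preserved globally by construction.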

    The post National Laboratory for Scientific Computing participated in the ISC2021 first appeared on RISC2 Project.
