RISC2’s partners gather in Brussels to reflect on three years of collaboration between EU and Latin America
https://www.risc2-project.eu/2023/07/26/risc2s-partners-gather-in-brussels-to-reflect-on-three-years-of-collaboration-between-eu-and-latin-america/ | 26 Jul 2023

Over the past three years, the RISC2 project has established a network for the exchange of knowledge and experience that has enabled its European and Latin American partners to strengthen relations in HPC and take significant steps forward in this area. With the project quickly coming to an end, it was time to meet face-to-face in Brussels to reflect on the progress and achievements, the goals set, the difficulties faced, and, above all, what can be expected for the future.

The session began with a welcome and introduction by Mateo Valero (BSC), one of the main drivers of this cooperation and a leading name in the field of HPC. His remarks were complemented by Fabrizio Gagliardi (BSC). Afterward, Elsa Carvalho (INESC TEC) presented the communication work done by the RISC2 team, an important strand for ensuring that news and achievements reach all the partners and countries involved.

Carlos J. Barrios Hernandez then presented the work done within the HPC Observatory, a relevant source of information that European and Latin American research organizations can turn to with HPC and/or AI questions.

The session closed with an important and pertinent debate on how to strengthen cooperation in HPC between the European Union and Latin America, in which all participants contributed and gave their opinion, committing to efforts so that the work developed within the framework of RISC2 is continued.

What did our partners have to say about the meeting?

Rafael Mayo Garcia, CIEMAT:

“The policy event organized by RISC2 in Brussels was of utmost importance for the development of HPC and digital capabilities for a shared infrastructure between the EU and LAC. Even more, it has made crucial contributions to international entities such as CYTED, the Ibero-American Programme for the Development of Science and Technology. On the CIEMAT side, it has been a new step forward in building and participating in a shared HPC ecosystem.”

Esteban Meneses, CeNAT:

“In Costa Rica, CeNAT plays a critical role in fostering technological change. To achieve that goal, it is fundamental to synchronize our efforts with other key players, particularly government institutions. The policy event in Brussels was a great opportunity to get closer to our science and technology ministry and start a dialogue on the importance of HPC, data science, and artificial intelligence for bringing about the societal changes we aim for.”

Esteban Mocskos, UBA:

“The Policy Event recently held in Brussels and organized by the RISC2 project had several remarkable points. The gathering of experts in HPC research and management from Latin America and Europe served to plan the next steps in the joint endeavor to deepen collaboration in this field. Advances in management policies, application optimization, and user engagement were fundamental topics treated during the main sessions, as well as in the one-on-one talks in every corner of the meeting room.
I can say that this meeting will also open different paths in these collaboration efforts, whose results we will surely see over the following years, with a positive impact on both sides of this fruitful relationship: Latin America and Europe.”

Sergio Nesmachnow, Universidad de la República:

“The National Supercomputing Center (Uruguay) and Universidad de la República have led the development of HPC strategies and technologies and their application to relevant problems in Uruguay. Specific meetings such as the policy event organized by RISC2 in Brussels are key to presenting and disseminating current developments and achievements to relevant political and technological leaders in our country, so that they gain knowledge about the usefulness of HPC technologies and infrastructure in fostering national scientific research in key areas such as sustainability, energy, and social development. It was very important to present the network of collaborators in Latin America and Europe and to show the involvement of institutional and government agencies.

Through the contacts and talks during the organization of the meeting, we introduced the project to national authorities, including the National Director of Science and Technology, the Ministry of Education and Culture, and the President of the National Agency for Research and Innovation, as well as the Uruguayan Agency for International Cooperation and academic authorities from all institutions involved in the National Supercomputing Center initiative. We hope the established contacts can result in productive joint efforts to foster the development of HPC and related scientific areas in our country and the region.”

Carla Osthoff, LNCC:

“In Brazil, LNCC plays a critical role in providing high-performance computing resources for the research community, training human resources, and fostering new technologies. The policy event organized by RISC2 in Brussels was fundamental for synchronizing LNCC’s efforts with other government institutions and international entities. On the LNCC side, it has been a new step toward building and participating in a shared HPC ecosystem.

Specific meetings such as the policy event organized by RISC2 in Brussels were very important to present the network of collaborators in Latin America and Europe and to show the involvement of institutional and government agencies.

As a result of joint activities in research and development in the areas of information and communication technologies (ICT), artificial intelligence, applied mathematics, and computational modelling, with emphasis on scientific computing and data science, a Memorandum of Understanding (MoU) has been signed between LNCC and Inria/France. As a result of new joint activities, LNCC and INESC TEC/Portugal are starting a collaboration through the INESC TEC International Visiting Researcher Programme 2023.”

Scientific Machine Learning and HPC
https://www.risc2-project.eu/2023/06/28/scientific-machine-learning-and-hpc/ | 28 Jun 2023

In recent years we have seen rapid growth of interest in artificial intelligence in general, and in machine learning (ML) techniques in particular, across different branches of science and engineering. The rapid growth of the Scientific Machine Learning field derives from the combined development and use of efficient data analysis algorithms, the availability of data from scientific instruments and computer simulations, and advances in high-performance computing. On May 25, 2023, COPPE/UFRJ organized a forum to discuss developments in Artificial Intelligence and their impact on society [*].

Alvaro Coutinho, coordinator of the High Performance Computing Center (Nacad) at COPPE/UFRJ, presented advances in AI in engineering and stressed the importance of multidisciplinary research networks for addressing current issues in Scientific Machine Learning. Alvaro took the opportunity to highlight the need for Brazil to invest in high-performance computing capacity.

The country’s sovereignty requires autonomy in producing ML advances, which in turn depends on HPC support at universities and research centers. Brazil has nine machines in the TOP500 list of the most powerful computer systems in the world, but almost all of them belong to the oil company Petrobras, and universities need much more. ML is well known to require HPC, and when combined with scientific computer simulations it becomes essential.

The conventional notion of ML involves training an algorithm to automatically discover patterns, signals, or structures that may be hidden in huge databases and whose exact nature is unknown and therefore cannot be explicitly programmed. This field faces two major drawbacks: the need for a significant volume of (labelled) data, which is expensive to acquire, and limited extrapolation ability (making predictions beyond the scenarios contained in the training data is difficult).
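
The second drawback is easy to see in a toy experiment (a minimal sketch with made-up data and an off-the-shelf model, not tied to any of the works cited below): a model fit on inputs in [0, 1] gives unreliable answers at x = 2, a scenario absent from its training data.

```python
# Minimal illustration of the extrapolation limitation of conventional ML.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(200, 1))
y_train = np.sin(2 * np.pi * x_train).ravel()   # smooth target signal

model = RandomForestRegressor(random_state=0).fit(x_train, y_train)
print(model.predict([[0.5]]))   # interpolation: close to sin(pi) = 0
print(model.predict([[2.0]]))   # extrapolation: stuck at training-range values
```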

Considering that an algorithm’s predictive ability is a learning skill, current challenges must be addressed to improve the analytical and predictive capacity of Scientific ML algorithms, for example, to maximize their impact in applications such as renewable energy. References [1-5] illustrate recent advances in Scientific Machine Learning in different areas of engineering and computer science.

References:

[*] https://www.coppe.ufrj.br/pt-br/planeta-coppe-noticias/noticias/coppe-e-sociedade-especialistas-debatem-os-reflexos-da-inteligencia

[1] Baker, Nathan, Frank Alexander, Timo Bremer, Aric Hagberg, Yannis Kevrekidis, Habib Najm, Manish Parashar, Abani Patra, James Sethian, Stefan Wild, Karen Willcox, and Steven Lee. Workshop Report on Basic Research Needs for Scientific Machine Learning: Core Technologies for Artificial Intelligence. United States: N. p., 2019. Web. doi:10.2172/1478744.

[2] Brunton, Steven L., Bernd R. Noack, and Petros Koumoutsakos. “Machine learning for fluid mechanics.” Annual Review of Fluid Mechanics 52 (2020): 477-508.

[3] Karniadakis, George Em, et al. “Physics-informed machine learning.” Nature Reviews Physics 3.6 (2021): 422-440.

[4] Inria White Book on Artificial Intelligence: Current challenges and Inria’s engagement, 2nd edition, 2021. URL: https://www.inria.fr/en/white-paper-inria-artificial-intelligence

[5] Silva, Romulo, Umair bin Waheed, Alvaro Coutinho, and George Em Karniadakis. “Improving PINN-based Seismic Tomography by Respecting Physical Causality.” In AGU Fall Meeting Abstracts, vol. 2022, pp. S11C-09. 2022.

More than 100 students participated in the HPC, Data & Architecture Week
https://www.risc2-project.eu/2023/03/21/more-than-100-students-participated-in-the-hpc-data-architecture-week/ | 21 Mar 2023

RISC2 supported the ‘HPC, Data & Architecture Week’, which took place between March 13 and 17, 2023, in Buenos Aires. This initiative aimed to recover and deepen the training of human resources for the development of scientific applications and their efficient use in parallel computing environments.

This event had four main courses: “Foundations of Parallel Programming”, “Large-scale data processing and machine learning”, “New architectures and specific computing platforms”, and “Administration techniques for large-scale computing facilities”.

More than 100 students, who traveled from different parts of the country, actively participated in the event. Thirty of them received financial support for travel and accommodation, provided by the National HPC System (SNCAD) under Argentina’s Ministry of Science.

Esteban Mocskos, one of the organizers of the event, believes that “this kind of event should be organized regularly to sustain the flow of students into the area of HPC”. In his opinion, “a lot of students from Argentina get their first contact with HPC topics. In such a large country, impacting a distant region also means impacting the neighboring countries. Those students will bring their experience to other students in their places”. According to Mocskos, initiatives like the “HPC, Data & Architecture Week” spark a lot of collaborations.

Developing Efficient Scientific Gateways for Bioinformatics in Supercomputer Environments Supported by Artificial Intelligence
https://www.risc2-project.eu/2023/03/20/developing-efficient-scientific-gateways-for-bioinformatics-in-supercomputer-environments-supported-by-artificial-intelligence/ | 20 Mar 2023

Scientific gateways bring enormous benefits to end users by simplifying access to, and hiding the complexity of, the underlying distributed computing infrastructure. However, gateways require significant development and maintenance efforts. BioinfoPortal [1], through its CSGrid [2] middleware, takes advantage of the heterogeneous resources of Santos Dumont [3]. However, task submission still involves a substantial manual step: deciding the configuration that leads to efficient execution. This project aims to develop green and intelligent scientific gateways for BioinfoPortal, supported by high-performance computing (HPC) environments and specialised technologies such as scientific workflows, data mining, machine learning, and deep learning.

The efficient analysis and interpretation of Big Data opens new challenges in molecular biology, genetics, biomedicine, and healthcare for improving personalised diagnostics and therapeutics; finding new avenues to deal with this massive amount of information becomes necessary. New Bioinformatics and Computational Biology paradigms drive storage, management, and data access, and advances in HPC and Big Data in this domain represent a vast new field of opportunities for bioinformatics researchers, as well as a significant challenge.

The BioinfoPortal science gateway is a multiuser Brazilian infrastructure. We present several challenges for efficiently executing applications and discuss our findings on improving the use of computational resources. We performed several large-scale bioinformatics experiments that are considered computationally intensive and time-consuming. We are currently coupling artificial intelligence to generate models from computational and bioinformatics metadata, in order to understand how machine learning can predict the efficient use of computational resources. The computational executions are conducted on Santos Dumont, the largest supercomputer in Latin America, dedicated to the research community, with 5.1 Petaflops and 36,472 computational cores distributed over 1,134 computational nodes.
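
To give a flavour of the learning task described above, here is a hedged sketch of predicting an efficient resource configuration from logged execution metadata; the CSV file, the column names, and the model choice are hypothetical placeholders, not BioinfoPortal’s actual pipeline.

```python
# Sketch: learn to predict an efficient core count for a new task from past
# execution logs. File and column names below are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

runs = pd.read_csv("bioinfo_runs.csv")                  # hypothetical log
X = runs[["tool_id", "input_size_mb", "n_sequences"]]   # tool_id: integer code
y = runs["best_core_count"]                             # config with best runtime

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(model.score(X_te, y_te))                          # R^2 on held-out runs
```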

By:

A. Carneiro, B. Fagundes, C. Osthoff, G. Freire, K. Ocaña, L. Cruz, L. Gadelha, M. Coelho, M. Galheigo, and R. Terra are with the National Laboratory of Scientific Computing, Rio de Janeiro, Brazil.

D. Carvalho is with the Federal Center for Technological Education Celso Suckow da Fonseca, Rio de Janeiro, Brazil.

Douglas Cardoso is with the Polytechnic Institute of Tomar, Portugal.

F. Boito and L. Teylo are with the University of Bordeaux, CNRS, Bordeaux INP, INRIA, LaBRI, Talence, France.

P. Navaux is with the Informatics Institute, Federal University of Rio Grande do Sul, Porto Alegre, Brazil.

References:

Ocaña, K. A. C. S.; Galheigo, M.; Osthoff, C.; Gadelha, L. M. R.; Porto, F.; Gomes, A. T. A.; Oliveira, D.; Vasconcelos, A. T. BioinfoPortal: A scientific gateway for integrating bioinformatics applications on the Brazilian national high-performance computing network. Future Generation Computer Systems, v. 107, p. 192-214, 2020.

Mondelli, M. L.; Magalhães, T.; Loss, G.; Wilde, M.; Foster, I.; Mattoso, M. L. Q.; Katz, D. S.; Barbosa, H. J. C.; Vasconcelos, A. T. R.; Ocaña, K. A. C. S; Gadelha, L. BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments. PeerJ, v. 1, p. 1, 2018.

Coelho, M.; Freire, G.; Ocaña, K.; Osthoff, C.; Galheigo, M.; Carneiro, A. R.; Boito, F.; Navaux, P.; Cardoso, D. O. Desenvolvimento de um Framework de Aprendizado de Máquina no Apoio a Gateways Científicos Verdes, Inteligentes e Eficientes: BioinfoPortal como Caso de Estudo Brasileiro In: XXIII Simpósio em Sistemas Computacionais de Alto Desempenho – WSCAD 2022 (https://wscad.ufsc.br/), 2022.

Terra, R.; Ocaña, K.; Osthoff, C.; Cruz, L.; Boito, F.; Navaux, P.; Carvalho, D. Framework para a Construção de Redes Filogenéticas em Ambiente de Computação de Alto Desempenho. In: XXIII Simpósio em Sistemas Computacionais de Alto Desempenho – WSCAD 2022 (https://wscad.ufsc.br/), 2022.

Ocaña, K.; Cruz, L.; Coelho, M.; Terra, R.; Galheigo, M.; Carneiro, A.; Carvalho, D.; Gadelha, L.; Boito, F.; Navaux, P.; Osthoff, C. ParslRNA-Seq: an efficient and scalable RNAseq analysis workflow for studies of differentiated gene expression. In: Latin America High-Performance Computing Conference (CARLA), 2022, Rio Grande do Sul, Brazil. Proceedings of the Latin American High-Performance Computing Conference – CARLA 2022 (http://www.carla22.org/), 2022.

[1] https://bioinfo.lncc.br/

[2] https://git.tecgraf.puc-rio.br/csbase-dev/csgrid/-/tree/CSGRID-2.3-LNCC

[3] https://sdumont.lncc.br

Inria Brasil Workshops
https://www.risc2-project.eu/events/inria-brasil-workshops/ | 14 Mar 2023

Towards a greater HPC capacity in Latin America
https://www.risc2-project.eu/2023/02/24/towards-a-greater-hpc-capacity-in-latin-america/ | 24 Feb 2023

High-Performance Computing (HPC) has proven to be a strong driver of science and technology development, and is increasingly considered indispensable for most scientific disciplines. HPC is making a difference in key topics of great interest such as climate change, personalised medicine, engineering, astronomy, education, economics, industry and public policy, becoming a pillar for the development of any country, one to which the great powers attach strategic importance and devote billions of dollars, in a competition without limits where data is the new gold.

A country that does not have the computational capacity to solve its own problems will have no alternative but to try to acquire solutions provided by others. One of the most important aspects of sovereignty in the 21st century is the ability to produce mathematical models and the capacity to solve them. Today, the availability of computing power commensurate with one’s wealth exponentially increases a country’s capacity to produce knowledge. In the developed world, it is estimated that for every dollar invested in supercomputing, the return to society is of the order of US$44 (1) and to the academic world US$30 (2). For these reasons, HPC occupies an important place on the political and diplomatic agendas of developed countries.

In Latin America, investment in HPC is very low compared to what the US, Asia and Europe are doing. In order to quantify this difference, we present the tables below, which show the accumulated computing capacity in the ranking of the 500 most powerful supercomputers in the world – the TOP500 (3) – (Table 1), and the local reality (Table 2). Other data are also included, such as the population (in millions), the number of researchers per 1,000 inhabitants (Res/1000), the computing capacity per researcher (GFlops/Res) and the computing capacity per million US$ of GDP (TFlops/M US$). In Table 1, we have grouped the countries by geographical area. America appears as the area with the highest computing capacity, essentially due to the USA, which holds almost 45% of the world’s computing capacity in the TOP500. It is followed by Asia and then Europe. The TOP500 list includes mainly academic research centres, but also industry ones, typically those used in applied research (many private ones do not wish to publish such information for obvious reasons). For example, in Brazil – which shows good computing capacity with 88,175 TFlops – the vast majority is in the hands of the oil industry, and only about 3,000 TFlops are used for basic research. Countries listed in the TOP500 invest in HPC from a few TFlops per million of GDP (Belgium 5, Spain 7, Bulgaria 8), through countries investing on the order of hundreds (Italy 176, Japan 151, USA 138), to even thousands, as is the case of Finland with 1,478. For those countries where we were able to find data on the number of researchers, the figures range from a few GFlops per researcher (Belgium 19, Spain 24, Hungary 52) to close to 1,000 GFlops, i.e. 1 TFlop (USA 970, Italy 966), with Finland surpassing this barrier with 4,647. Note that, unlike what happens locally, countries with a certain degree of development invest in supercomputing every 3-4 years, so the data we are showing will soon be updated and there will be variations in the list. For example, this year a new supercomputer will come into operation in Spain (4), which, with an investment of some 150 million euros, will give Spain one of the most powerful supercomputers in Europe – and the world.

| Country | Rpeak (TFlops) | Population (millions) | Res/1000 | GFlops/Res | TFlops/M US$ |
| --- | --- | --- | --- | --- | --- |
| United States | 3,216,124 | 335 | 9.9 | 969.7 | 138.0 |
| Canada | 71,911 | 39 | 8.8 | 209.5 | 40.0 |
| Brazil | 88,175 | 216 | 1.1 | 371.1 | 51.9 |
| AMERICA | 3,376,211 | 590 | | | |
| China | 1,132,071 | 1,400 | | | 67.4 |
| Japan | 815,667 | 124 | 10.0 | 657.8 | 151.0 |
| South Korea | 128,264 | 52 | 16.6 | 148.6 | 71.3 |
| Saudi Arabia | 98,982 | 35 | | | 141.4 |
| Taiwan | 19,562 | 23 | | | 21.7 |
| Singapore | 15,785 | 6 | | | 52.6 |
| Thailand | 13,773 | 70 | | | 27.5 |
| United Arab Emirates | 12,164 | 10 | | | 15.2 |
| India | 12,082 | 1,380 | | | 4.0 |
| ASIA | 2,248,353 | 3,100 | | | |
| Finland | 443,391 | 6 | 15.9 | 4,647.7 | 1,478.0 |
| Italy | 370,262 | 59 | 6.5 | 965.5 | 176.3 |
| Germany | 331,231 | 85 | 10.1 | 385.8 | 78.9 |
| France | 251,166 | 65 | 11.4 | 339.0 | 83.7 |
| Russia | 101,737 | 145 | | | 59.8 |
| United Kingdom | 92,563 | 68 | 9.6 | 141.8 | 29.9 |
| Netherlands | 56,740 | 18 | 10.6 | 297.4 | 56.7 |
| Switzerland | 38,600 | 9 | 9.4 | 456.3 | 48.3 |
| Sweden | 32,727 | 10 | 15.8 | 207.1 | 54.5 |
| Ireland | 26,320 | 5 | 10.6 | 496.6 | 65.8 |
| Luxembourg | 18,291 | 0.6 | | | 365.8 |
| Poland | 17,099 | 38 | 7.6 | 59.2 | 28.5 |
| Norway | 17,031 | 6 | 13.0 | 218.3 | 34.1 |
| Czech Republic | 12,914 | 10 | 8.3 | 155.6 | 43.0 |
| Spain | 10,296 | 47 | 7.4 | 29.6 | 7.4 |
| Slovenia | 10,047 | 2 | 9.9 | 507.4 | 167.5 |
| Austria | 6,809 | 9 | 11.6 | 65.2 | 13.6 |
| Bulgaria | 5,942 | 6 | | | 8.5 |
| Hungary | 4,669 | 10 | 9.0 | 51.9 | 23.3 |
| Belgium | 3,094 | 12 | 13.6 | 19.0 | 5.2 |
| EUROPE | 1,850,934 | 610.6 | | | |
| OTHER | | | | | |
| Australia | 60,177 | 26 | | | 40.1 |
| Morocco | 5,014 | 39 | | | 50.1 |

Table 1. HPC availability per researcher and relative to GDP in the TOP500 countries (includes HPC in industry).
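
As a sanity check on how the derived columns relate to each other (our own recomputation for illustration, not part of the original tables), the per-researcher figure can be reproduced from Rpeak, population, and researcher density; the US row is used as the example:

```python
# Recompute 'GFlops/Res' for the US row of Table 1.
rpeak_tflops = 3_216_124        # Rpeak in TFlops
population = 335e6              # 335 million inhabitants
res_per_1000 = 9.9              # researchers per 1,000 inhabitants

researchers = population * res_per_1000 / 1000          # ~3.32 million researchers
gflops_per_res = (rpeak_tflops * 1e3) / researchers     # TFlops -> GFlops
print(round(gflops_per_res, 1))                         # 969.7, as in the table
```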

The local reality is far from these figures. Table 2 shows data from Argentina, Brazil, Chile and Mexico. In Chile, the availability of computing power per researcher is 2-3 times less than in the OECD countries with the least computing power, and up to 100 times less than for a researcher in the US. Our investment measured in TFlops per million US$ of GDP is 166 times less than in the US; with respect to the European countries that invest least in HPC it is 9 times less, and with respect to the European average (including Finland) it is 80 times less, i.e. the difference is considerable. It is clear that we need to close this gap. An investment of about 5 million dollars in HPC infrastructure over the next 5 years would multiply our computational capacity by almost 20. However, returning to the example of Spain, the supercomputer it will have this year will offer 23 times more computing power than at present and, therefore, we would only maintain our relative distance. If we do not invest, the gap will increase at least 23-fold and will end up being huge. Therefore, we do not only need a one-time investment; we need to ensure regular investment. Some neighbouring countries are already investing significantly in supercomputing. This is the case in Argentina, which is investing 7 million dollars (2 million for the datacenter and 5 million to buy a new supercomputer), which will increase its current capacity almost 40-fold (5).

| Country | Rpeak (TFlops) | Population (millions) | Res/1000 | GFlops/Res | TFlops/M US$ |
| --- | --- | --- | --- | --- | --- |
| Brazil* | 3,000 | 216 | 1.1 | 12.6 | 1.8 |
| Mexico | 2,200 | 130 | 1.2 | 14.1 | 1.8 |
| Argentina | 400 | 45 | 1.2 | 7.4 | 0.8 |
| Chile | 250 | 20 | 1.3 | 9.6 | 0.8 |

Table 2. HPC availability per researcher and relative to GDP in the region (*only HPC capacity in academia is considered in this table).

For the above reasons, we are working to convince the Chilean authorities that we must have greater funding and, more crucially, permanent state funding for HPC. In relation to this, on July 6 we signed a collaboration agreement among 44 institutions, with the support of the Ministry of Science, to work on the creation of the National Supercomputing Laboratory (6). The agreement recognises that supercomputers are a critical infrastructure for Chile’s development, that it is necessary to centralise requirements and resources at the national level, obtain permanent funding from the State, and create a new institutional framework to provide governance. In an unprecedented inter-institutional collaboration in Chile, competition for HPC resources at the national level is eliminated and the possibility of direct funding from the State is opened up without generating controversy.

Undoubtedly, supercomputing is a fundamental pillar for the development of any country, where increasing investment provides a strategic advantage, and Latin America should not be left behind.

By NLHPC

 

References

(1) Hyperion Research HPC Investments Bring High Returns

(2) EESI-2 Special Study To Measure And Model How Investments In HPC Can Create Financial ROI And Scientific Innovation In Europe 

(3) https://top500.org/ 

(4) https://www.lavanguardia.com/ciencia/20230129/8713515/llega-superordenador-marenostrum-5-bsc-barcelona.html

(5) https://www.hpcwire.com/2022/12/15/argentina-announces-new-supercomputer-for-national-science/

(6) https://uchile.cl/noticias/187955/44-instituciones-crearan-el-laboratorio-nacional-de-supercomputacion

 

Mapping human brain functions using HPC
https://www.risc2-project.eu/2023/02/01/mapping-human-brain-functions-using-hpc/ | 01 Feb 2023

ContentMAP is the first Portuguese project in the field of Psychology and Cognitive Neuroscience to be awarded a European Research Council grant (ERC Starting Grant #802553). This project is mapping how the human brain represents object knowledge – for example, how one represents in the brain all one knows about a knife (that it cuts, that it has a handle, that it is made out of metal and plastic, or metal and wood, that it has a serrated and sharp part, that it is smooth and cold, etc.). To do this, the project collects numerous MRI images while participants see and interact with objects (fMRI). HPC (High-Performance Computing) is of central importance for processing these images. The use of HPC has made it possible to manipulate these data and to perform machine learning analyses and complex computations in a timely manner.

Humans are particularly efficient at recognising objects – think about what surrounds us: one recognises the object one is reading this text from as a screen, the place where one sits as a chair, the utensil from which one drinks coffee as a cup, and one does all of this extremely quickly and virtually automatically. One is able to do all this despite the fact that 1) one holds large amounts of information about each object (if one were asked to write down everything one knows about a pen, one would certainly have a lot to say); and 2) there are several exemplars of each object type (a glass can be tall; it can be made out of glass, metal, paper or plastic; it can come in different colours, etc. – but despite that, any of them would still be a glass). How does one do this? How is one able to store and process so much information in the process of recognising a glass, and to generalise across all the different instances of a glass to arrive at the concept “glass”? The goal of ContentMAP is to understand the processes that lead to successful object recognition.

The answer to these questions lies in a better understanding of the organisational principles of information in the brain. It is, in fact, the efficient organisation of conceptual information and object representations in the brain that allows one to quickly and efficiently recognise the keyboard in front of each of us. To study the neuronal organisation of object knowledge, the project collects large sets of fMRI data from several participants and then tries to decode the organisational principles of information in the brain.

Given the amount of data and the computational requirements of this type of data at the pre-processing and post-processing levels, the use of HPC is essential for conducting these studies in a timely manner. For example, at the post-processing level, the project uses whole-brain Support Vector Machine classification (searchlight procedures) that requires hundreds of thousands of classifiers to be trained. Moreover, for each of these classifiers one needs to compute a sampling distribution of the mean, as well as test the various classifications of interest, and all of this has to be done per participant.
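
To make the computational load concrete, here is a deliberately simplified searchlight loop – one SVM per voxel neighborhood. The toy random data, array sizes, and flattened neighborhood definition are illustrative assumptions, not the project’s actual pipeline:

```python
# Schematic searchlight: train one linear SVM per voxel neighborhood.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 1000            # fMRI trials x voxels (toy sizes)
X = rng.normal(size=(n_trials, n_voxels))
y = rng.integers(0, 2, size=n_trials)     # two object categories

radius = 5                                # neighborhood around each center voxel
scores = np.zeros(n_voxels)
for center in range(n_voxels):
    lo, hi = max(0, center - radius), min(n_voxels, center + radius + 1)
    sphere = X[:, lo:hi]                  # features for this searchlight
    scores[center] = cross_val_score(SVC(kernel="linear"), sphere, y, cv=5).mean()

# 'scores' is a brain map of decoding accuracy; on real data each of the
# hundreds of thousands of searchlights also needs permutation testing,
# which is why HPC becomes essential.
```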

Because of this, the use of the HPC facilities of the Advanced Computing Laboratory (LCA) at the University of Coimbra is crucial. It allows us to perform these analyses in one to two weeks – something that on our 14-core computers would take a few months, which in practice would most probably mean that the analysis would not be done.

By Faculty of Psychology and Educational Sciences, University of Coimbra

 

Reference 

ProAction Lab http://proactionlab.fpce.uc.pt/ 

Webinar: Developing complex workflows that include HPC, Artificial Intelligence and Data Analytics
https://www.risc2-project.eu/events/webinar-5-developing-complex-workflows-that-include-hpc-artificial-intelligence-and-data-analytics/ | 24 Jan 2023


Date: February 22, 2023 | 4 p.m. (UTC)

Speaker: Rosa M. Badia, Barcelona Supercomputing Center

Moderator: Esteban Mocskos, Universidad de Buenos Aires

The evolution of High-Performance Computing (HPC) systems towards ever more complex machines is opening up the opportunity of hosting larger and heterogeneous applications. In this sense, the demand for developing applications that are not purely HPC, but that combine aspects of Artificial Intelligence and/or Data Analytics, is becoming more common. However, there is a lack of environments that support the development of these complex workflows. The webinar will present PyCOMPSs, a parallel task-based programming model in Python. Based on simple annotations, sequential Python programs can be executed in parallel on HPC clusters and other distributed infrastructures.

PyCOMPSs has been extended to support tasks that invoke HPC applications and can be combined with Artificial Intelligence and Data analytics frameworks.

Some of these extensions are made in the framework of the eFlows4HPC project, which in addition is developing the HPC Workflows as a Service (HPCWaaS) methodology to make the development, deployment, execution and reuse of workflows easier. The webinar will present the current status of the PyCOMPSs programming model and how it is being extended in the eFlows4HPC project towards the project’s needs. The HPCWaaS methodology will also be introduced.
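
For a flavour of the annotation style, here is a minimal sketch based on PyCOMPSs’s documented @task/compss_wait_on API (launched with the runcompss command rather than plain Python; the toy function is ours, not from the webinar):

```python
# Minimal PyCOMPSs sketch: annotated functions become asynchronous tasks.
from pycompss.api.task import task
from pycompss.api.api import compss_wait_on

@task(returns=1)
def increment(x):
    return x + 1

def main():
    futures = [increment(v) for v in range(4)]  # tasks may run in parallel
    results = compss_wait_on(futures)           # block until all complete
    print(results)                              # [1, 2, 3, 4]

if __name__ == "__main__":
    main()
```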

About the speaker: Rosa M. Badia holds a PhD in Computer Science (1994) from the Technical University of Catalonia (UPC). She is the manager of the Workflows and Distributed Computing research group at the Barcelona Supercomputing Center (BSC).

Her current research interests are programming models for complex platforms (from edge and fog to Clouds and large HPC systems). The group led by Dr. Badia has been developing the StarSs programming model for more than 15 years, with high adoption success among application developers. Currently the group focuses its efforts on PyCOMPSs/COMPSs, an instance of the programming model for distributed computing, including the Cloud.

Dr Badia has published nearly 200 papers in international conferences and journals on the topics of her research. Her group is very active in projects funded by the European Commission and in contracts with industry. Dr Badia is the PI of the eFlows4HPC project.


 

JUPITER Ascending – First European Exascale Supercomputer Coming to Jülich
https://www.risc2-project.eu/2023/01/02/jupiter-ascending-first-european-exascale-supercomputer-coming-to-julich/ | 02 Jan 2023

It was finally decided in 2022: Forschungszentrum Jülich will be home to Europe’s first exascale computer. The supercomputer is set to be the first in Europe to surpass the threshold of one quintillion (a “1” followed by 18 zeros) calculations per second. The system will be acquired by the European supercomputing initiative EuroHPC JU. The exascale computer should help to solve important and urgent scientific questions regarding, for example, climate change, how to combat pandemics, and sustainable energy production, while also enabling the intensive use of artificial intelligence and the analysis of large data volumes. The overall costs for the system amount to 500 million euros. Of this total, 250 million euros is being provided by EuroHPC JU and a further 250 million euros in equal parts by the German Federal Ministry of Education and Research (BMBF) and the Ministry of Culture and Science of the State of North Rhine-Westphalia (MKW NRW).

The computer named JUPITER (short for “Joint Undertaking Pioneer for Innovative and Transformative Exascale Research”) will be installed in 2023/2024 on the campus of Forschungszentrum Jülich. It is intended that the system will be operated by the Jülich Supercomputing Centre (JSC), whose supercomputers JUWELS and JURECA currently rank among the most powerful in the world. JSC participated in the application procedure for a high-end supercomputer as a member of the Gauss Centre for Supercomputing (GCS), an association of the three German national supercomputing centres: JSC in Jülich, the High-Performance Computing Center Stuttgart (HLRS), and the Leibniz Supercomputing Centre (LRZ) in Garching. The competition was organized by the European supercomputing initiative EuroHPC JU, which was formed by the European Union together with European countries and private companies.

JUPITER is now set to become the first European supercomputer to make the leap into the exascale class. In terms of computing power, it will be more powerful than 5 million modern laptops or PCs. Just like Jülich’s current supercomputer JUWELS, JUPITER will be based on a dynamic, modular supercomputing architecture, which Forschungszentrum Jülich has developed together with European and international partners in the EU’s DEEP research projects.

In a modular supercomputer, various computing modules are coupled together. This enables program parts of complex simulations to be distributed over several modules, ensuring that the various hardware properties can be optimally utilized in each case. Its modular construction also means that the system is well prepared for integrating future technologies such as quantum computing or neuromorphic modules, which emulate the neural structure of a biological brain.

Figure: Modular Supercomputing Architecture – computing and storage modules of the exascale computer in its basic configuration (blue), as well as optional modules (green) and modules for future technologies (purple) as possible extensions.

In its basic configuration, JUPITER will have an enormously powerful booster module with highly efficient GPU-based computation accelerators. Massively parallel applications are accelerated by this booster in a similar way to a turbocharger, for example to calculate high-resolution climate models, develop new materials, simulate complex cell processes and energy systems, advance basic research, or train next-generation, computationally intensive machine-learning algorithms.

One major challenge is the energy required for such large computing power. The average power is anticipated to be up to 15 megawatts. JUPITER has been designed as a “green” supercomputer and will be powered by green electricity. The envisaged warm-water cooling system should help to ensure that JUPITER achieves the highest efficiency values. At the same time, the cooling technology opens up the possibility of intelligently using the waste heat that is produced. For example, just like its predecessor system JUWELS, JUPITER will be connected to the new low-temperature network on the Forschungszentrum Jülich campus. Further potential applications for the waste heat from JUPITER are currently being investigated by Forschungszentrum Jülich.

By Jülich Supercomputing Centre (JSC)

 

Image: Germany’s fastest supercomputer JUWELS at Forschungszentrum Jülich, which is funded in equal parts by the Federal Ministry of Education and Research (BMBF) and the Ministry of Culture and Science of the State of North Rhine-Westphalia (MKW NRW) via the Gauss Centre for Supercomputing (GCS). (Copyright: Forschungszentrum Jülich / Sascha Kreklau)

Managing Data and Machine Learning Models in HPC Applications
https://www.risc2-project.eu/2022/11/21/managing-data-and-machine-learning-models-in-hpc-applications/ | 21 Nov 2022

The synergy of data science (including big data and machine learning) and HPC yields many benefits for data-intensive applications, in terms of more accurate predictive data analysis and better decision making. For instance, in the context of the HPDaSc (High Performance Data Science) project between Inria and Brazil, we have shown the importance of realtime analytics for making critical, high-consequence decisions in HPC applications, e.g., preventing useless drilling based on a driller’s realtime data and realtime visualization of simulated data, as well as the effectiveness of ML in dealing with scientific data, e.g., computing Probability Density Functions (PDFs) over simulated seismic data using Spark.
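
As an illustration of that last point, an empirical PDF can be computed in Spark as a normalized distributed histogram; this is a hedged sketch of the general pattern, in which the file path and column name are hypothetical placeholders rather than the project’s actual data:

```python
# Sketch: empirical PDF (normalized histogram) over a large column with Spark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("seismic-pdf").getOrCreate()
df = spark.read.parquet("hdfs:///sim/seismic_amplitudes.parquet")  # hypothetical path

values = df.rdd.map(lambda row: row["amplitude"])   # hypothetical column name
bins, counts = values.histogram(100)                # histogram computed in parallel
total = sum(counts)
pdf = [c / total for c in counts]                   # discrete PDF estimate per bin
```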

However, to realize the full potential of this synergy, ML models (or models for short) must be built, combined and ensembled, which can be very complex as there can be many models to select from. Furthermore, they should be shared and reused, in particular, in different execution environments such as HPC or Spark clusters.

To address this problem, we proposed Gypscie [Porto 2022, Zorrilla 2022], a new framework that supports the entire ML lifecycle and enables model reuse and import from other frameworks. The approach behind Gypscie is to combine, in a single framework, rich capabilities for model and data management and model execution that are typically provided by separate tools. Overall, Gypscie provides: a platform supporting the complete model life-cycle, from model building to deployment, monitoring and policy enforcement; an environment for casual users to find ready-to-use models that best fit a particular prediction problem; an environment to optimize ML task scheduling and execution; an easy way for developers to benchmark their models against other competitive models and improve them; a central point of access to assess models’ compliance with policies and ethics and to obtain and curate observational and predictive data; and provenance information and model explainability. Finally, Gypscie interfaces with multiple execution environments to run ML tasks, e.g., an HPC system such as the Santos Dumont supercomputer at LNCC or a Spark cluster.

Gypscie comes with SAVIME [Silva 2020], a multidimensional array in-memory database system for importing, storing and querying model (tensor) data. The SAVIME open-source system has been developed to support analytical queries over scientific data. It offers an extremely efficient ingestion procedure, which practically eliminates the waiting time to analyze incoming data. It also supports dense and sparse arrays and non-integer dimension indexing. It offers a functional query language processed by a query optimiser that generates efficient query execution plans.

 

References

[Porto 2022] Fabio Porto, Patrick Valduriez: Data and Machine Learning Model Management with Gypscie. CARLA 2022 – Workshop on HPC and Data Sciences meet Scientific Computing, SCALAC, Sep 2022, Porto Alegre, Brazil. pp.1-2. 

[Zorrilla 2022] Rocío Zorrilla, Eduardo Ogasawara, Patrick Valduriez, Fabio Porto: A Data-Driven Model Selection Approach to Spatio-Temporal Prediction. SBBD 2022 – Brazilian Symposium on Databases, SBBD, Sep 2022, Buzios, Brazil. pp.1-12. 

[Silva 2020] A.C. Silva, H. Lourenço, D. Ramos, F. Porto, P. Valduriez. Savime: An Array DBMS for Simulation Analysis and Prediction. Journal of Information Data Management 11(3), 2020. 

 

By LNCC and Inria 
