call - RISC2 Project https://www.risc2-project.eu Wed, 22 Nov 2023 08:23:24 +0000

Three years of building bridges in HPC research between Europe and Latin America: RISC2 project comes to an end https://www.risc2-project.eu/2023/11/16/three-years-of-building-bridges-in-hpc-research-between-europe-and-latin-america-risc2-project-comes-to-an-end/ Thu, 16 Nov 2023 15:56:54 +0000

The post Three years of building bridges in HPC research between Europe and Latin America: RISC2 project comes to an end first appeared on RISC2 Project.

Artificial intelligence, personalised medicine, the development of new drugs or the fight against climate change. These are just a few examples of areas where high performance computing has an impact and could prove to be essential. With the aim of fostering cooperation between Europe and Latin America in this field, 16 organisations from the two continents have launched the RISC2 project.

“The RISC2 project has proven to be a team effort in which European and Latin American partners worked together to drive HPC collaboration forward. We have been able to create a lively and active community across the Atlantic to stimulate dialogue and boost cooperation that won’t die with RISC2’s formal end”, says Fabrizio Gagliardi, managing director of RISC2.

Since 2021, this knowledge-sharing network has organised webinars, summer schools and meetings with policymakers, and has participated in conferences and dissemination events on both sides of the Atlantic. The project also resulted in the HPC Observatory Repository — a collection of documents and training materials produced as part of the project — and the White Paper on HPC R&I Collaboration Opportunities, a document that reviews the key socio-economic and environmental factors and trends that influence HPC needs.

These were two of the achievements highlighted by European Commission officials and experts during the final evaluation of the project, which could provide continuity to the work carried out by the consortium over the last three years, in line with the wishes of the partners and the advice of the evaluators. “Beyond RISC2, we should keep the momentum and leverage the importance of Latin America in the frame of the Green Deal actions: HPC stakeholders should encourage policymakers to build bilateral agreements and offer open calls focused on HPC collaboration”, reflects Fabrizio Gagliardi.

Scientific Machine Learning and HPC https://www.risc2-project.eu/2023/06/28/scientific-machine-learning-and-hpc/ Wed, 28 Jun 2023 08:24:28 +0000

The post Scientific Machine Learning and HPC first appeared on RISC2 Project.

In recent years we have seen rapid growth of interest in artificial intelligence in general, and machine learning (ML) techniques in particular, in different branches of science and engineering. The rapid growth of the Scientific Machine Learning field derives from the combined development and use of efficient data analysis algorithms, the availability of data from scientific instruments and computer simulations, and advances in high-performance computing. On May 25, 2023, COPPE/UFRJ organized a forum to discuss developments in artificial intelligence and their impact on society [*].

Alvaro Coutinho, coordinator of the High Performance Computing Center (Nacad) at COPPE/UFRJ, presented advances in AI in engineering and the importance of multidisciplinary research networks to address current issues in Scientific Machine Learning. He took the opportunity to highlight the need for Brazil to invest in high-performance computing capacity.

The country’s sovereignty requires autonomy in producing ML advances, which in turn depends on HPC support at universities and research centres. Brazil has nine machines in the TOP500 list of the most powerful computer systems in the world, but almost all of them belong to the oil company Petrobras, and universities need much more. ML is well known to require HPC, and when combined with scientific computer simulations it becomes essential.

The conventional notion of ML involves training an algorithm to automatically discover patterns, signals, or structures that may be hidden in huge databases and whose exact nature is unknown and therefore cannot be explicitly programmed. The field faces two major drawbacks: the need for a significant volume of (labelled) data, which is expensive to acquire, and a limited capacity for extrapolation, since making predictions beyond the scenarios contained in the training data is difficult.
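To make those two characteristics concrete, here is a minimal, purely illustrative sketch (hypothetical data, standard-library Python only): a 1-nearest-neighbour "learner" discovers the pattern y = x² inside its training range without ever being explicitly programmed with the rule, yet fails when asked to extrapolate beyond it.

```python
# Minimal sketch (hypothetical, stdlib only): a 1-nearest-neighbour "learner"
# trained on samples of y = x^2 for x in 0..10. It recovers the pattern inside
# the training range but cannot extrapolate beyond it.

def train(xs):
    return [(x, x * x) for x in xs]  # labelled data: (input, target)

def predict(model, x):
    # nearest-neighbour lookup: no explicit programming of the x^2 rule
    return min(model, key=lambda p: abs(p[0] - x))[1]

model = train(range(11))          # training data covers x = 0..10
inside  = predict(model, 4.2)     # interpolation: close to 4.2^2 ≈ 17.6
outside = predict(model, 50)      # extrapolation: stuck at the boundary, 10^2 = 100

print(inside, outside)            # 16 100 (true value at x=50 is 2500)
```

Inside the training range the prediction is close to the true value; outside it, the model is stuck at the nearest boundary sample, which is why extrapolation is listed as a major limitation.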

Considering that an algorithm’s predictive ability is a learning skill, current challenges must be addressed to improve the analytical and predictive capacity of Scientific ML algorithms, for example to maximise their impact in renewable energy applications. References [1-5] illustrate recent advances in Scientific Machine Learning in different areas of engineering and computer science.

References:

[*] https://www.coppe.ufrj.br/pt-br/planeta-coppe-noticias/noticias/coppe-e-sociedade-especialistas-debatem-os-reflexos-da-inteligencia

[1] Baker, Nathan, Frank Alexander, Timo Bremer, Aric Hagberg, Yannis Kevrekidis, Habib Najm, Manish Parashar, Abani Patra, James Sethian, Stefan Wild, Karen Willcox, and Steven Lee. Workshop Report on Basic Research Needs for Scientific Machine Learning: Core Technologies for Artificial Intelligence. United States: N. p., 2019. Web. doi:10.2172/1478744.

[2] Brunton, Steven L., Bernd R. Noack, and Petros Koumoutsakos. “Machine learning for fluid mechanics.” Annual Review of Fluid Mechanics 52 (2020): 477-508.

[3] Karniadakis, George Em, et al. “Physics-informed machine learning.” Nature Reviews Physics 3.6 (2021): 422-440.

[4] Inria White Book on Artificial Intelligence: Current challenges and Inria’s engagement, 2nd edition, 2021. URL: https://www.inria.fr/en/white-paper-inria-artificial-intelligence

[5] Silva, Romulo, Umair bin Waheed, Alvaro Coutinho, and George Em Karniadakis. “Improving PINN-based Seismic Tomography by Respecting Physical Causality.” In AGU Fall Meeting Abstracts, vol. 2022, pp. S11C-09. 2022.

Subsequent Progress And Challenges Concerning The México-UE Project ENERXICO: Supercomputing And Energy For México https://www.risc2-project.eu/2023/05/24/subsequent-progress-and-challenges-concerning-the-mexico-ue-project-enerxico-supercomputing-and-energy-for-mexico/ Wed, 24 May 2023 09:38:01 +0000

The post Subsequent Progress And Challenges Concerning The México-UE Project ENERXICO: Supercomputing And Energy For México first appeared on RISC2 Project.

In this short note, we briefly describe some subsequent advances and challenges concerning two work packages developed in the ENERXICO Project. The project opened the possibility of collaborating with colleagues from institutions that did not participate in it, for example from the University of Santander in Colombia and from the University of Vigo in Spain. This exemplifies the importance of the RISC2 project, in the sense that strengthening collaboration and finding joint research areas and applied HPC ventures is of great benefit to both our Latin American countries and the EU. We are now initiating talks to target several energy-related topics with some of the RISC2 partners.

The ENERXICO Project focused on developing advanced simulation software solutions for the oil & gas, wind energy and transportation powertrain industries. The institutions that collaborated in the project were, for México: ININ (institution responsible for México), Centro de Investigación y de Estudios Avanzados del IPN (Cinvestav), Universidad Nacional Autónoma de México (UNAM IINGEN, FCUNAM), Universidad Autónoma Metropolitana-Azcapotzalco, Instituto Mexicano del Petróleo, Instituto Politécnico Nacional (IPN) and Pemex; and for the European Union: Barcelona Supercomputing Center (institution responsible for the EU), Technische Universität München, Germany (TUM), Université Grenoble Alpes, France (UGA), CIEMAT, Spain, Repsol, Iberdrola, Bull, France, and Universidad Politécnica de Valencia, Spain.

The project comprised four work packages (WP):

WP1 Exascale Enabling: a cross-cutting work package focused on assessing performance bottlenecks and improving the efficiency of the HPC codes proposed in the vertical WPs (EU Coordinator: BULL, MEX Coordinator: CINVESTAV-COMPUTACIÓN);

WP2 Renewable energies: deployed new applications required to design, optimize and forecast the production of wind farms (EU Coordinator: IBR, MEX Coordinator: ININ);

WP3 Oil and gas energies: addressed the impact of HPC on the entire oil industry chain (EU Coordinator: REPSOL, MEX Coordinator: ININ);

WP4 Biofuels for transport: carried out advanced numerical simulations of biofuels under conditions similar to those of an engine (EU Coordinator: UPV-CMT, MEX Coordinator: UNAM).

For WP1, the following codes were optimized for exascale computers: Alya, Bsit, DualSPHysics, ExaHyPE, SeisSol, SEM46 and WRF.

As an example, we present some of the results for the DualSPHysics code. We evaluated two architectures. The first set of hardware consisted of identical nodes, each equipped with two Intel Xeon Gold 6248 processors clocked at 2.5 GHz and about 192 GB of system memory; each node contained four Nvidia Tesla V100 GPUs with 32 GB of memory each. The second set consisted of identical nodes, each equipped with two AMD Milan 7763 processors clocked at 2.45 GHz and about 512 GB of system memory; each node contained four Nvidia Ampere A100 GPUs with 40 GB of memory each. The code was compiled and linked with CUDA 10.2 and OpenMPI 4, and the application was executed using one GPU per MPI rank.

In Figures 1 and 2 we show the scalability of the code for the strong and weak scaling tests, which indicate that the scaling is very good. Motivated by these excellent results, we are in the process of performing new SPH simulations on the LUMI supercomputer with up to 26,834 million particles, run on up to 500 GPUs, i.e. 53.7 million particles per GPU. These simulations will be done initially for a Wave Energy Converter (WEC) farm (see Figure 3), and later for turbulent models.
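For readers unfamiliar with the metrics behind such plots, strong- and weak-scaling efficiencies are typically derived from wall-clock times as sketched below. The timings here are invented placeholders, not ENERXICO measurements; only the particles-per-GPU figure comes from the text above.

```python
# Hedged sketch: deriving scaling efficiencies from wall-clock times.
# t1 = time on 1 GPU, tn = time on n GPUs. All timings are illustrative only.

def strong_efficiency(t1, tn, n):
    """Fixed total problem size: ideal time on n GPUs is t1 / n."""
    return t1 / (n * tn)

def weak_efficiency(t1, tn):
    """Problem size grows with n: ideal time stays equal to t1."""
    return t1 / tn

print(strong_efficiency(100.0, 13.0, 8))   # hypothetical timings: ~0.96 (96% efficient)
print(weak_efficiency(100.0, 105.0))       # hypothetical timings: ~0.95

# The planned LUMI runs: 26,834 million particles spread over 500 GPUs
print(26_834e6 / 500 / 1e6)                # 53.668 -> the ~53.7 million particles per GPU quoted above
```

Values near 1.0 on both metrics correspond to the "very good" scaling reported in the figures.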

Figure 1. Strong scaling test with a fixed number of particles and an increasing number of GPUs.

 

Figure 2. Weak scaling test with increasing number of particles and GPUs.

 

Figure 3. Wave Energy Converter (WEC) Farm (taken from https://corpowerocean.com/)

 

As part of WP3, ENERXICO developed a first version of a computer code called Black Hole (or BH code) for the numerical simulation of oil reservoirs, based on the numerical technique known as Smoothed Particle Hydrodynamics (SPH). This new code is an extension of the DualSPHysics code (https://dual.sphysics.org/) and is the first SPH-based code developed for the numerical simulation of oil reservoirs, with important benefits over commercial codes based on other numerical techniques.

The BH code is a large-scale, massively parallel reservoir simulator capable of performing simulations with billions of “particles”, or fluid elements, that represent the system under study. It contains improved multi-physics modules that automatically combine the effects of interrelated physical and chemical phenomena to accurately simulate in-situ recovery processes. A graphical user interface has also been developed: a multi-platform application for code execution and visualization, for carrying out simulations with data provided by industrial partners, and for performing comparisons with available commercial packages.

Furthermore, a considerable effort is presently being made to simplify the process of setting up the input for reservoir simulations from exploration data by means of a workflow fully integrated into our industrial partners’ software environment. A crucial part of the numerical simulations is the equation of state. We have developed an equation of state based on crude oil data (the so-called PVT) in two forms: first, as a subroutine integrated into the code; and second, as a subroutine that interpolates property tables generated from the equation-of-state subroutine.
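The second form, table lookup, can be sketched in a few lines. This is not the BH code's implementation; the table values, the property (oil density) and the units are invented for illustration, standing in for tables pre-generated by an equation-of-state routine.

```python
# Hedged sketch of property-table interpolation (hypothetical PVT-style data):
# pressures in MPa, oil densities in kg/m^3 are invented illustrative values.

import bisect

pressure = [10.0, 20.0, 30.0, 40.0]      # table abscissae (sorted)
density  = [780.0, 800.0, 815.0, 825.0]  # tabulated property values

def interp_density(p):
    """Piecewise-linear lookup, clamped at the table ends."""
    if p <= pressure[0]:
        return density[0]
    if p >= pressure[-1]:
        return density[-1]
    i = bisect.bisect_right(pressure, p) - 1
    frac = (p - pressure[i]) / (pressure[i + 1] - pressure[i])
    return density[i] + frac * (density[i + 1] - density[i])

print(interp_density(25.0))  # midway between 800 and 815 -> 807.5
```

Pre-generating such tables trades a little accuracy for much cheaper per-particle evaluation than calling the full equation-of-state routine inside the simulation loop.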

An oil reservoir is composed of a porous rock medium containing a multiphase fluid made of oil, gas, water and other components. The aim of the code is to simulate fluid flow in a porous medium, as well as the behaviour of the system at different pressures and temperatures. The tool should allow the reduction of uncertainties in the predictions that are carried out. For example, it may answer questions about the benefits of injecting a solvent (CO2, nitrogen, combustion gases, methane, etc.) into a reservoir, and the breakthrough times of the gases at the production wells. With these estimates, operators can take the necessary measures to mitigate their presence, and calculate the cost, the injection pressure, the injection volumes and, most importantly, where and for how long to inject. The same happens with more complex processes, such as those where fluids, air or steam are injected and interact with the rock, oil, water and gas present in the reservoir. The simulator should be capable of monitoring and preparing measurement plans.

In order to perform a simulation of an oil reservoir field, an initial model needs to be created. Using geophysical forward and inverse numerical techniques, the ENERXICO project evaluated novel high-performance simulation packages for challenging seismic exploration cases characterized by extreme geometric complexity. We are now exploring high-order methods based upon fully unstructured tetrahedral meshes, as well as tree-structured Cartesian meshes with adaptive mesh refinement (AMR), for better spatial resolution. Using this methodology, our packages (and some commercial packages), together with seismic and geophysical data of naturally fractured reservoir oil fields, are able to create the geometry (see Figure 4) and exhibit basic properties of the oil reservoir field we want to study. A number of numerical simulations are then performed, and from these, exploitation scenarios for the oil fields are generated.

 

Figure 4. A detail of the initial model for a SPH simulation of a porous medium.

 

More information about the ENERXICO Project can be found at: https://enerxico-project.eu/

By: Jaime Klapp (ININ, México) and Isidoro Gitler (Cinvestav, México)

 

 

 

 

Latin American researchers present greener gateways for Big Data in INRIA Brazil Workshop https://www.risc2-project.eu/2023/05/03/latin-american-researchers-present-greener-gateways-for-big-data-in-inria-brazil-workshop/ Wed, 03 May 2023 13:29:03 +0000

The post Latin American researchers present greener gateways for Big Data in INRIA Brazil Workshop first appeared on RISC2 Project.

In the scope of the RISC2 Project, the State University of Sao Paulo and INRIA (Institut National de Recherche en Informatique et en Automatique), a renowned French research institute, held a workshop that set the stage for the presentation of the results accomplished under the work “Developing Efficient Scientific Gateways for Bioinformatics in Supercomputer Environments Supported by Artificial Intelligence”.

The goal of the investigation is to provide users with simplified access to computing infrastructures through scientific solutions that represent significant developments in their fields. This project intends to develop intelligent, green scientific solutions for BioinfoPortal (a multiuser Brazilian infrastructure) supported by High-Performance Computing environments.

Technologically, it includes areas such as scientific workflows, data mining, machine learning, and deep learning. If successful, the analysis and interpretation of Big Data will open new paths in molecular biology, genetics, biomedicine, and health, making tools capable of efficiently digesting the incoming volume of information a necessity.

The team performed several large-scale bioinformatics experiments that are considered computationally intensive. Currently, artificial intelligence is being used to generate models to analyse computational and bioinformatics metadata, in order to understand how automatic learning can predict computational resources efficiently. The workshop was held from April 10th to 11th at the University of Sao Paulo.

The RISC2 Project, which aims to explore the impact of HPC on the economies of Latin America and Europe, relies on the interaction between researchers and policymakers in both regions. It includes 16 academic partners, such as the University of Buenos Aires, the National Laboratory for High Performance Computing of Chile, the Jülich Supercomputing Centre and the Barcelona Supercomputing Center (the leader of the consortium), among others.

Towards a greater HPC capacity in Latin America https://www.risc2-project.eu/2023/02/24/towards-a-greater-hpc-capacity-in-latin-america/ Fri, 24 Feb 2023 15:36:39 +0000

The post Towards a greater HPC capacity in Latin America first appeared on RISC2 Project.

High-Performance Computing (HPC) has proven to be a strong driver of science and technology development, and is increasingly considered indispensable for most scientific disciplines. HPC is making a difference in key topics of great interest such as climate change, personalised medicine, engineering, astronomy, education, economics, industry and public policy. It has become a pillar of development for any country, one to which the great powers attach strategic importance, investing billions of dollars in a limitless competition where data is the new gold.

A country that does not have the computational capacity to solve its own problems will have no alternative but to try to acquire solutions provided by others. One of the most important aspects of sovereignty in the 21st century is the ability to produce mathematical models and the capacity to solve them. Today, the availability of computing power commensurate with a country’s wealth exponentially increases its capacity to produce knowledge. In the developed world, it is estimated that for every dollar invested in supercomputing, the return to society is of the order of US$ 44(1) and to the academic world US$ 30(2). For these reasons, HPC occupies an important place on the political and diplomatic agendas of developed countries.

In Latin America, investment in HPC is very low compared to what the US, Asia and Europe are doing. In order to quantify this difference, we present the tables below, which show the accumulated computing capacity in the ranking of the 500 most powerful supercomputers in the world – the TOP500(3) – (Table 1), and the local reality (Table 2). Other data are also included, such as the population (in millions), the number of researchers per 1,000 inhabitants (Res/1000), the computing capacity per researcher (GFlops/Res) and the computing capacity per million US$ of GDP. In Table 1, we have grouped the countries by geographical area. America appears as the area with the highest computing capacity, essentially due to the USA, which holds almost 45% of the world’s computing capacity in the TOP500. It is followed by Asia and then Europe. The TOP500 list includes mainly academic research centres, but also industrial ones, typically those used in applied research (many private ones do not wish to publish such information for obvious reasons). For example, in Brazil – which shows good computing capacity with 88,175 TFlops – the vast majority is in the hands of the oil industry, and only about 3,000 TFlops are used for basic research. Countries listed in the TOP500 invest in HPC from a few TFlops per million of GDP (Belgium 5, Spain 7, Bulgaria 8), through countries investing in the order of hundreds (Italy 176, Japan 151, USA 138), to even thousands, as is the case of Finland with 1,478. For those countries where we were able to find data on the number of researchers, the figures range from a few GFlops per researcher (Belgium 19, Spain 24, Hungary 52) to close to 1,000 GFlops, i.e. 1 TFlop (USA 970, Italy 966), with Finland surpassing this barrier with 4,647. Note that, unlike what happens locally, countries with a certain degree of development invest in supercomputing every 3-4 years, so the data we are showing will soon be updated and there will be variations in the list.
For example, this year a new supercomputer will come into operation in Spain(4), which, with an investment of some 150 million euros, will give Spain one of the most powerful supercomputers in Europe – and the world.

Country                 Rpeak (TFlops)  Population (M)  Res/1000  GFlops/Res  TFlops/M US$
United States                3,216,124       335            9.9       969.7       138.0
Canada                          71,911        39            8.8       209.5        40.0
Brazil                          88,175       216            1.1       371.1        51.9
AMERICA                      3,376,211       590

China                        1,132,071      1400                                   67.4
Japan                          815,667       124           10.0       657.8       151.0
South Korea                    128,264        52           16.6       148.6        71.3
Saudi Arabia                    98,982        35                                  141.4
Taiwan                          19,562        23                                   21.7
Singapore                       15,785         6                                   52.6
Thailand                        13,773        70                                   27.5
United Arab Emirates            12,164        10                                   15.2
India                           12,082      1380                                    4.0
ASIA                         2,248,353      3100

Finland                        443,391         6           15.9     4,647.7     1,478.0
Italy                          370,262        59            6.5       965.5       176.3
Germany                        331,231        85           10.1       385.8        78.9
France                         251,166        65           11.4       339.0        83.7
Russia                         101,737       145                                   59.8
United Kingdom                  92,563        68            9.6       141.8        29.9
Netherlands                     56,740        18           10.6       297.4        56.7
Switzerland                     38,600         9            9.4       456.3        48.3
Sweden                          32,727        10           15.8       207.1        54.5
Ireland                         26,320         5           10.6       496.6        65.8
Luxembourg                      18,291         0.6                                365.8
Poland                          17,099        38            7.6        59.2        28.5
Norway                          17,031         6           13.0       218.3        34.1
Czech Republic                  12,914        10            8.3       155.6        43.0
Spain                           10,296        47            7.4        29.6         7.4
Slovenia                        10,047         2            9.9       507.4       167.5
Austria                          6,809         9           11.6        65.2        13.6
Bulgaria                         5,942         6                                    8.5
Hungary                          4,669        10            9.0        51.9        23.3
Belgium                          3,094        12           13.6        19.0         5.2
EUROPE                       1,850,934       610.6

OTHER
Australia                       60,177        26                                   40.1
Morocco                          5,014        39                                   50.1

Table 1. HPC availability per researcher and relative to GDP in the TOP500 countries (includes HPC in industry).

The local reality is far from these figures. Table 2 shows data for Argentina, Brazil, Chile and Mexico. In Chile, the availability of computing power per researcher is 2-3 times lower than in the OECD countries with the least computing power, and up to 100 times lower than for a researcher in the US. Our investment measured in TFlops per million US$ of GDP is 166 times less than in the US; with respect to the European countries that invest least in HPC it is 9 times less, and with respect to the European average (including Finland) it is 80 times less, i.e. the difference is considerable. It is clear that we need to close this gap. An investment of about 5 million dollars in HPC infrastructure over the next 5 years would multiply our computational capacity by almost 20. However, returning to the example of Spain, the supercomputer it will receive this year will offer 23 times more computing power than at present, and therefore we would only maintain our relative distance. If we do not invest, the gap will increase by at least 23 times and will end up being huge. Therefore, we do not only need a one-time investment; we need to ensure regular investment. Some neighbouring countries are already investing significantly in supercomputing. This is the case in Argentina, where they are investing 7 million dollars (2 million for the datacenter and 5 million to buy a new supercomputer), which will increase their current capacity by almost 40 times(5).

Country     Rpeak (TFlops)  Population (M)  Res/1000  GFlops/Res  TFlops/M US$
Brazil*              3,000       216            1.1        12.6         1.8
Mexico               2,200       130            1.2        14.1         1.8
Argentina              400        45            1.2         7.4         0.8
Chile                  250        20            1.3         9.6         0.8

Table 2. HPC availability per researcher and relative to GDP in the region (*only HPC capacity in academia is considered in this table).
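The derived columns in Tables 1 and 2 can be reproduced from the raw figures (Rpeak in TFlops, population in millions, researchers per 1,000 inhabitants); the helper below is a small sketch of that arithmetic, not code from the article.

```python
# Reproduce the GFlops-per-researcher column from the raw table figures.

def gflops_per_researcher(rpeak_tflops, pop_millions, res_per_1000):
    researchers = pop_millions * 1e6 * res_per_1000 / 1000   # absolute head count
    return rpeak_tflops * 1000 / researchers                 # TFlops -> GFlops

# Table 2 rows: Chile and Mexico
print(round(gflops_per_researcher(250, 20, 1.3), 1))    # 9.6, matching the table
print(round(gflops_per_researcher(2200, 130, 1.2), 1))  # 14.1, matching the table
```

The same helper applied to the Table 1 rows (e.g. the US: 3,216,124 TFlops, 335 M people, 9.9 Res/1000) reproduces those entries as well, which is a quick consistency check on the published numbers.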

For the above reasons, we are working to convince the Chilean authorities that we must have greater funding and, more crucially, permanent state funding for HPC. In relation to this, on July 6 a collaboration agreement was signed between 44 institutions, with the support of the Ministry of Science, to work on the creation of the National Supercomputing Laboratory(6). The agreement recognises that supercomputers are a critical infrastructure for Chile’s development, that it is necessary to centralise requirements and resources at the national level, to obtain permanent funding from the State, and to create a new institutional framework to provide governance. In an unprecedented inter-institutional collaboration in Chile, competition for HPC resources at the national level is eliminated and the possibility of direct funding from the State is opened up without generating controversy.

Undoubtedly, supercomputing is a fundamental pillar for the development of any country, where increasing investment provides a strategic advantage, and in Latin America we should not be left behind.

By NLHPC

 

References

(1) Hyperion Research HPC Investments Bring High Returns

(2) EESI-2 Special Study To Measure And Model How Investments In HPC Can Create Financial ROI And Scientific Innovation In Europe 

(3) https://top500.org/ 

(4) https://www.lavanguardia.com/ciencia/20230129/8713515/llega-superordenador-marenostrum-5-bsc-barcelona.html

(5) https://www.hpcwire.com/2022/12/15/argentina-announces-new-supercomputer-for-national-science/

(6) https://uchile.cl/noticias/187955/44-instituciones-crearan-el-laboratorio-nacional-de-supercomputacion

 

Mapping human brain functions using HPC https://www.risc2-project.eu/2023/02/01/mapping-human-brain-functions-using-hpc/ Wed, 01 Feb 2023 13:17:19 +0000

The post Mapping human brain functions using HPC first appeared on RISC2 Project.

ContentMAP is the first Portuguese project in the field of Psychology and Cognitive Neuroscience to be awarded a European Research Council grant (ERC Starting Grant #802553). This project is mapping how the human brain represents object knowledge – for example, how one represents in the brain all one knows about a knife: that it cuts, that it has a handle, that it is made out of metal and plastic or metal and wood, that it has a serrated and sharp part, that it is smooth and cold, etc. To do this, the project collects numerous MRI images while participants see and interact with objects (fMRI). High Performance Computing (HPC) is of central importance for processing these images. The use of HPC has made it possible to manipulate these data and to perform machine learning analyses and complex computations in a timely manner.

Humans are particularly efficient at recognising objects – think about what surrounds us: one recognises the object one is reading this text from as a screen, the place where one sits as a chair, the utensil from which one drinks coffee as a cup, and one does all of this extremely quickly and virtually automatically. One is able to do all this despite the fact that 1) one holds large amounts of information about each object (if one were asked to write down everything one knows about a pen, one would certainly have a lot to say); and 2) there are several exemplars of each object type (a glass can be tall; made out of glass, metal, paper or plastic; of different colours, etc. – but despite that, any of them would still be a glass). How does one do this? How is one able to store and process so much information in the process of recognising a glass, and to generalise over all the different instances of a glass to get the concept “glass”? The goal of ContentMAP is to understand the processes that lead to successful object recognition.

The answer to these questions lies in a better understanding of the organisational principles of information in the brain. It is, in fact, the efficient organisation of conceptual information and object representations in the brain that allows one to quickly and efficiently recognise the keyboard in front of us. To study the neuronal organisation of object knowledge, the project collects large sets of fMRI data from several participants, and then tries to decode the organisational principles of information in the brain.

Given the amount of data and the computational requirements of this type of data at the pre-processing and post-processing levels, the use of HPC is essential to enable these studies to be conducted in a timely manner. For example, at the post-processing level, the project uses whole-brain Support Vector Machine classification algorithms (searchlight procedures) that require hundreds of thousands of classifiers to be trained. Moreover, for each of these classifiers one needs to compute a sample distribution of the average, as well as test the various classifications of interest, and this has to be done per participant.
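A searchlight procedure of this kind can be caricatured in a few lines. Everything below is synthetic: a toy 1-D "brain", an invented signal location, and a nearest-centroid classifier standing in for the project's SVMs. The point is only to show why training one classifier per voxel neighbourhood multiplies into enormous compute at whole-brain scale.

```python
# Deliberately tiny searchlight sketch (synthetic data, hypothetical sizes).
# One classifier is trained per voxel on the features around it, which is why
# whole-brain searchlights need hundreds of thousands of classifiers.

import random
random.seed(0)

N_VOXELS, RADIUS, N_TRIALS = 50, 2, 20   # toy 1-D "brain" instead of a 3-D volume

def make_trial(label):
    # class signal hidden in voxels 20..30, pure noise elsewhere
    return [label + random.gauss(0, 0.5) if 20 <= v <= 30 else random.gauss(0, 0.5)
            for v in range(N_VOXELS)]

train = [(make_trial(c), c) for c in (0, 1) for _ in range(N_TRIALS)]
test  = [(make_trial(c), c) for c in (0, 1) for _ in range(N_TRIALS)]

def searchlight_accuracy(center):
    # features = the voxels within RADIUS of this centre voxel
    sel = range(max(0, center - RADIUS), min(N_VOXELS, center + RADIUS + 1))
    # "train" a nearest-centroid classifier: mean pattern per class
    centroid = {c: [sum(x[v] for x, lab in train if lab == c) / N_TRIALS for v in sel]
                for c in (0, 1)}
    dist = lambda x, c: sum((x[v] - m) ** 2 for v, m in zip(sel, centroid[c]))
    hits = sum(min((0, 1), key=lambda c: dist(x, c)) == lab for x, lab in test)
    return hits / len(test)

accs = [searchlight_accuracy(v) for v in range(N_VOXELS)]   # one classifier per voxel
print(max(accs))   # voxels inside the signal region decode well; others stay near chance
```

Scaling this loop from 50 toy voxels to hundreds of thousands of real voxels, with SVMs, permutation distributions and multiple participants, is what makes the HPC facilities indispensable.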

Because of this, the HPC facilities of the Advanced Computing Laboratory (LCA) at the University of Coimbra are crucial. They allow us to perform these analyses in one to two weeks – something that would take a few months on our 14-core computers, which in practice would most probably mean that the analysis would not be done.

By Faculty of Psychology and Educational Sciences, University of Coimbra

 

Reference 

ProAction Lab http://proactionlab.fpce.uc.pt/ 

The post Mapping human brain functions using HPC first appeared on RISC2 Project.

]]>
Call for Proposals to Support High Performance Computing Centers FAPESP-MCTI-MCom-CGI.br https://www.risc2-project.eu/2023/01/12/call-for-proposals-to-support-high-performance-computing-centers-fapesp-mcti-mcom-cgi-br/ Thu, 12 Jan 2023 12:41:06 +0000 https://www.risc2-project.eu/?p=2648 The call is now open until March 31, 2023. The Call for High Performance Computing (HPC) Centers aims to support the acquisition of high performance computing equipment that can provide computational infrastructure to conduct research in all areas of knowledge that are intensive in computing resources. The resources necessary for the development of the infrastructure […]

The post Call for Proposals to Support High Performance Computing Centers FAPESP-MCTI-MCom-CGI.br first appeared on RISC2 Project.

]]>
The call is now open until March 31, 2023.

The Call for High Performance Computing (HPC) Centers aims to support the acquisition of high-performance computing equipment that can provide the computational infrastructure to conduct research in all compute-intensive areas of knowledge. The resources needed to prepare the facilities that will house the high-performance computing equipment are the responsibility of the proponent institutions and constitute the required counterpart for submitting a proposal. In addition, proposers must demonstrate a proven track record as an HPC center.

This program is infrastructural in nature: it is not intended to provide conventional funding for the research projects that will eventually take advantage of the supported infrastructure. Support for such research projects should be sought through the regular research funding lines.

A portion of the maintenance costs of the equipment to be purchased may be requested in this Call. However, proposals are also expected to identify other ways of covering equipment maintenance costs. No funds may be requested to cover the maintenance of the building infrastructure and the supporting systems for the computer equipment, such as air conditioning, which must be covered by the proponent institutions or other sources. Likewise, the salaries and other charges of the support staff that this Call foresees for the operation of the center cannot be requested in submitted proposals and are the sole responsibility of the proposing institutions. Proponents may foresee, in their business plan, charging for the services provided, as long as some level of free access is offered to users from academic institutions.

WHO?

This Call is open to education and research institutions from all over Brazil, whether organised in consortia or not, to support one center in the state of São Paulo and one or two centers in other Brazilian states, for a total amount of up to R$ 100 million. The center based in São Paulo may receive up to R$ 50 million in this Call and must meet the demand for high-performance computing services within the entire state of São Paulo. Centers located in other states may receive up to R$ 25 million, in the case of non-consortium projects, or up to R$ 50 million in the case of a consortium of several institutions that meets the demand for high-performance computing services nationwide.

This Call is launched within the scope of FAPESP's Multiuser Equipment Program (EMU) and is infrastructural in nature.

Learn more about this call here.

 


]]>
Managing Data and Machine Learning Models in HPC Applications https://www.risc2-project.eu/2022/11/21/managing-data-and-machine-learning-models-in-hpc-applications/ Mon, 21 Nov 2022 14:09:42 +0000 https://www.risc2-project.eu/?p=2508 The synergy of data science (including big data and machine learning) and HPC yields many benefits for data-intensive applications in terms of more accurate predictive data analysis and better decision making. For instance, in the context of the HPDaSc (High Performance Data Science) project between Inria and Brazil, we have shown the importance of realtime […]

The post Managing Data and Machine Learning Models in HPC Applications first appeared on RISC2 Project.

]]>
The synergy of data science (including big data and machine learning) and HPC yields many benefits for data-intensive applications, in terms of more accurate predictive data analysis and better decision making. For instance, in the context of the HPDaSc (High Performance Data Science) project between Inria and Brazil, we have shown the importance of real-time analytics in making critical, high-consequence decisions in HPC applications, e.g., preventing useless drilling based on a driller's real-time data and real-time visualization of simulated data, as well as the effectiveness of ML in dealing with scientific data, e.g., computing Probability Density Functions (PDFs) over simulated seismic data using Spark.
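The PDF computation mentioned above can be illustrated without Spark. A minimal numpy sketch with synthetic stand-in amplitudes (a Spark version would distribute the binning across the cluster; the mixture below is invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in for simulated seismic amplitudes: a two-component Gaussian mixture.
samples = np.concatenate([rng.normal(-1.0, 0.3, 5000),
                          rng.normal(1.5, 0.5, 5000)])

# Histogram-based PDF estimate: with density=True, bin counts are normalised
# so that the total area under the histogram integrates to 1.
density, edges = np.histogram(samples, bins=50, density=True)
widths = np.diff(edges)
print(np.sum(density * widths))  # 1.0 (up to floating-point rounding)
```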

However, to realize the full potential of this synergy, ML models (or models for short) must be built, combined and ensembled, which can be very complex as there can be many models to select from. Furthermore, they should be shared and reused, in particular, in different execution environments such as HPC or Spark clusters.

To address this problem, we proposed Gypscie [Porto 2022, Zorrilla 2022], a new framework that supports the entire ML lifecycle and enables model reuse and import from other frameworks. The approach behind Gypscie is to combine in a single framework several rich capabilities for model and data management and model execution that are typically provided by separate tools. Overall, Gypscie provides:

  • a platform supporting the complete model life-cycle, from model building to deployment, monitoring and policy enforcement;
  • an environment for casual users to find ready-to-use models that best fit a particular prediction problem;
  • an environment to optimize ML task scheduling and execution;
  • an easy way for developers to benchmark their models against competing models and improve them;
  • a central point of access to assess models' compliance with policies and ethics, and to obtain and curate observational and predictive data;
  • provenance information and model explainability.

Finally, Gypscie interfaces with multiple execution environments to run ML tasks, e.g., an HPC system such as the Santos Dumont supercomputer at LNCC or a Spark cluster.

Gypscie comes with SAVIME [Silva 2020], an in-memory multidimensional array database system for importing, storing and querying model (tensor) data. The open-source SAVIME system has been developed to support analytical queries over scientific data. It offers an extremely efficient ingestion procedure, which practically eliminates the waiting time before incoming data can be analysed. It also supports dense and sparse arrays and non-integer dimension indexing, and provides a functional query language processed by a query optimiser that generates efficient query execution plans.

 

References

[Porto 2022] Fabio Porto, Patrick Valduriez: Data and Machine Learning Model Management with Gypscie. CARLA 2022 – Workshop on HPC and Data Sciences meet Scientific Computing, SCALAC, Sep 2022, Porto Alegre, Brazil. pp.1-2. 

[Zorrilla 2022] Rocío Zorrilla, Eduardo Ogasawara, Patrick Valduriez, Fabio Porto: A Data-Driven Model Selection Approach to Spatio-Temporal Prediction. SBBD 2022 – Brazilian Symposium on Databases, SBBD, Sep 2022, Buzios, Brazil. pp.1-12. 

[Silva 2020] A.C. Silva, H. Lourenço, D. Ramos, F. Porto, P. Valduriez. Savime: An Array DBMS for Simulation Analysis and Prediction. Journal of Information and Data Management 11(3), 2020. 

 

By LNCC and Inria 


]]>
Using supercomputing for accelerating life science solutions https://www.risc2-project.eu/2022/11/01/using-supercomputing-for-accelerating-life-science-solutions/ Tue, 01 Nov 2022 14:11:06 +0000 https://www.risc2-project.eu/?p=2504 The world of High Performance Computing (HPC) is now moving towards exascale performance, i.e. the ability of calculating 1018 operations per second. A variety of applications will be improved to take advantage of this computing power, leading to better prediction and models in different fields, like Environmental Sciences, Artificial Intelligence, Material Sciences and Life Sciences. In […]

The post Using supercomputing for accelerating life science solutions first appeared on RISC2 Project.

]]>
The world of High Performance Computing (HPC) is now moving towards exascale performance, i.e. the ability to perform 10¹⁸ operations per second. A variety of applications will be improved to take advantage of this computing power, leading to better predictions and models in different fields, such as Environmental Sciences, Artificial Intelligence, Material Sciences and Life Sciences.

In Life Sciences, HPC advancements can improve different areas:

  • a reduced time to scientific discovery;
  • the ability to generate the predictions necessary for precision medicine;
  • new healthcare and genomics-driven research approaches;
  • the processing of huge datasets for deep and machine learning;
  • the optimization of modeling, such as Computer-Aided Drug Design (CADD);
  • enhanced security and protection of healthcare data in HPC environments, in compliance with the European GDPR regulation;
  • the management of massive amounts of data, for example for clinical trials, drug development and genomics data analytics.

The outbreak of COVID-19 has further accelerated this progress in several respects. Some European projects aim to repurpose known active ingredients as new drugs against COVID-19 [Exscalate4CoV, Ligate], while others focus on the management and monitoring of contagion clusters, providing an innovative approach to learning from the SARS-CoV-2 crisis and deriving recommendations for future waves and pandemics [Orchestra].

The ability to deal with massive amounts of data in HPC environments is also used to create databases of nucleic acid sequencing data and use them to detect allelic variant frequencies, as in the NIG project [Nig], a collaboration with the Network for Italian Genomes. Another example of this capability is the set-up of a data sharing platform based on novel Federated Learning schemes to advance research in personalised medicine for haematological diseases [Genomed4All].

Supercomputing is widely used in drug design (the process of finding medicines for diseases that have no, or insufficient, treatments), with many projects, RISC2 among them, active in this field.

Sometimes, when there is no previous knowledge of the biological target – as happened with COVID-19 – discovering new drugs requires creating new molecules from scratch [Novartis]. This process involves billion-dollar investments to produce and test thousands of molecules, and it usually has a low success rate: only about 12% of potential drugs entering clinical development are approved [Engitix]. The whole process, from identifying a possible compound to the end of clinical trials, can take up to 10 years. There is also an uneven coverage of diseases: most compounds target genetic conditions, while only a few antivirals and antibiotics have been found.

The search for candidate drugs mainly follows two approaches: high-throughput screening and virtual screening. The first is more reliable but also very expensive and time-consuming; it is usually applied to well-known targets, mainly by pharmaceutical companies. The second is a good compromise between cost and accuracy and is typically applied to relatively new targets, in academic laboratories, where it is also used to discover or better understand the mechanisms of those targets [Liu2016].

Candidate drugs are usually small molecules that bind to a specific protein, or part of it, inhibiting the protein's usual activity. For example, binding the right ligand to a viral enzyme may stop viral infection. In virtual screening, millions of compounds are screened against the target protein at different levels: the most basic level simply considers whether the compound's shape fits correctly into the protein, while higher levels also take into account features such as specific interactions, protein flexibility, solubility, human tolerance, and so on. A "score" is assigned to each docked ligand, and the compounds with the highest scores are studied further. With massively parallel computers, we can rapidly filter extremely large molecule databases (e.g., billions of molecules).
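At its core, the score-and-filter stage described above is a top-k selection over a huge scored library, which is what makes it so easy to parallelise. A toy sketch with a hypothetical library and random stand-in docking scores (real pipelines use actual docking engines and shard the library across many nodes):

```python
import heapq
import random

random.seed(1)

# Hypothetical compound library: (compound_id, docking_score), higher is better.
# Real libraries hold billions of entries; this one is tiny and random.
library = [(f"mol_{i}", random.gauss(0.0, 1.0)) for i in range(100_000)]

# Keep only the best-scoring candidates for the expensive higher screening
# levels (interactions, flexibility, solubility, ...); nlargest avoids a
# full sort of the library.
top_hits = heapq.nlargest(100, library, key=lambda m: m[1])

print(len(top_hits))  # 100
```

Because each shard of the library can be scored and filtered independently, the per-node results only need a final merge, which is why massively parallel machines filter billions of molecules so effectively.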

The current computational power of HPC clusters allows us to analyse up to 3 million compounds per second [Exscalate]. Even though vaccines were developed remarkably quickly, effective drug treatments for people already suffering from COVID-19 were scarce at the beginning of the pandemic. At that time, supercomputers around the world were enlisted to help with drug design – a real-world example of the power of urgent computing. CINECA participates in Exscalate4CoV [Exscalate4CoV], currently the most advanced competence centre for fighting the coronavirus, combining the most powerful supercomputing resources and Artificial Intelligence with experimental facilities and clinical validation. 

 

References

[Engitix] https://engitix.com/technology/

[Exscalate] https://www.exscalate.eu/en/projects.html

[Exscalate4CoV] https://www.exscalate4cov.eu/

[Genomed4All] https://genomed4all.eu/

[Ligate] https://www.ligateproject.eu/

[Liu2016] T. Liu, D. Lu, H. Zhang, M. Zheng, H. Yang, Ye. Xu, C. Luo, W. Zhu, K. Yu, and H. Jiang, “Applying high-performance computing in drug discovery and molecular simulation” Natl Sci Rev. 2016 Mar; 3(1): 49–63.

[Nig] http://www.nig.cineca.it/

[Novartis] https://www.novartis.com/stories/art-drug-design-technological-age

[Orchestra] https://orchestra-cohort.eu/

 

By CINECA


]]>
Webinar: Application Benchmarking with JUBE: Lessons Learned https://www.risc2-project.eu/events/webinar-3-application-benchmarking-with-jube-lessons-learned/ Tue, 26 Jul 2022 12:36:09 +0000 https://www.risc2-project.eu/?post_type=mec-events&p=2244 Date: October 19, 2022 | 4 p.m. (UTC+1) Speaker: Marc-André Hermanns, RWTH Aachen Moderator: Bernd Mohr, Jülich Supercomputer Centre JUBE can help in the automating application benchmarking on a given platform. JUBE’s features in automatic sandboxing and parameter-space creation can assist to easily sweep build and runtime parameters for an application on a given platform to identify the […]

The post Webinar: Application Benchmarking with JUBE: Lessons Learned first appeared on RISC2 Project.

]]>

Date: October 19, 2022 | 4 p.m. (UTC+1)

Speaker: Marc-André Hermanns, RWTH Aachen

Moderator: Bernd Mohr, Jülich Supercomputer Centre

JUBE can help automate application benchmarking on a given platform. JUBE's automatic sandboxing and parameter-space creation make it easy to sweep build and runtime parameters for an application on a given platform and identify the best build and run configuration.
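The sweep that JUBE automates is essentially a Cartesian product of parameter axes, with each combination run in its own sandbox. A small Python sketch with hypothetical parameter axes (JUBE itself declares these in XML parametersets; the compilers, flags and rank counts below are made-up examples):

```python
from itertools import product

# Hypothetical build and runtime parameter axes for a benchmark sweep.
compilers = ["gcc", "icc"]
opt_flags = ["-O2", "-O3"]
mpi_ranks = [24, 48, 96]

# JUBE expands such axes automatically and runs each combination in its own
# sandbox directory; here we just enumerate the configurations.
runs = [{"compiler": c, "flags": f, "ranks": r}
        for c, f, r in product(compilers, opt_flags, mpi_ranks)]
print(len(runs))  # 2 * 2 * 3 = 12 configurations
```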

This talk presents lessons learned while building a JUBE-based benchmark suite for the RWTH Aachen University job mix that reduces redundant information and allows future applications to be integrated easily. It specifically addresses advanced features for parameter settings and parameter inheritance, along with tips and tricks to overcome some of JUBE's limitations.

About the speaker: Marc-André Hermanns is a member of the HPC group at the IT Center of RWTH Aachen University. His research focuses on tools and interfaces for the performance analysis of parallel applications. He has been involved in the design and implementation of various courses on parallel programming for high-performance computing. Next to supporting HPC users as part of the competence network for high-performance computing in North Rhine-Westphalia (HPC.NRW), he also contributes to the development of online tutorials and courses within the competence network. He is a long-time user of and advocate for JUBE and has created configurations for various applications and benchmarks, both for classical system benchmarking and for the integration of performance analysis tools into such workflows.

About the moderator: Bernd Mohr started designing and developing tools for the performance analysis of parallel programs with his diploma thesis (1987) at the University of Erlangen in Germany, and continued this work in his Ph.D. (1987 to 1992). During a three-year postdoc position at the University of Oregon, he designed and implemented the original TAU performance analysis framework. Since 1996 he has been a senior scientist at Forschungszentrum Jülich, and since 2000 he has led the group "Programming Environments and Performance Analysis". Besides being responsible for user support and training with regard to performance tools at the Jülich Supercomputing Centre (JSC), he leads the Scalasca performance tools efforts in collaboration with Prof. Felix Wolf of TU Darmstadt. Since 2007, he has also served as deputy head of the JSC division "Application Support". He was an active member of the International Exascale Software Project (IESP/BDEC) and a work package leader in the European (EESI2) and Jülich (EIC, ECL) Exascale efforts. He has served on the Steering Committee of the SC and ISC conference series, and is the author of several dozen conference and journal articles on the performance analysis and tuning of parallel programs.

 

Registrations are now closed.


]]>