Project profile
France, Germany, Greece, Italy, Spain, Switzerland
01/04/2021 - 31/03/2024
Motivation – Exascale system challenges
- High node counts (hundreds of thousands of nodes, millions of cores) require new levels of parallelism and communication
- Heterogeneous computing devices: modern accelerators (such as GPUs) process data at tens of TFlop/s, significantly increasing data rates
- This results in a huge increase in interconnect network traffic
The next generation of Exascale systems will require innovative, highly efficient interconnect networks
- Must support high node counts and massively parallel processing systems
- Must provide a set of “smart” features (i.e., efficient network resource management, in-network computing, fabric management) that allow applications to scale efficiently at Exascale levels and beyond
- Must support power-efficient accelerators and compute units, as well as widespread data-centric and AI-related applications
Goal: cover a variety of solutions to the above network requirements
- Improve the current Atos BXI (BullSequana eXascale Interconnect) technology
- Contribute to the design and specification of the next BXI generation to anticipate future system requirements
- Enrich the European interconnect technology ecosystem to foster new kinds of networks, system software and applications
Status / Highlights
- Network requirements are defined, and network traces for MPI-based applications (NEST, LAMMPS, TSMP) have been collected
- Hardware testbeds were set up and are running with good stability
- Architecture specifications defined for both Ethernet Gateway IP and Low-Latency Ethernet MAC & PCS to be used in future switches
- NIC: architecture specification of transport layer defined; first RTL version of Low-Latency Ethernet MAC & PCS completed
- Efficient Network Resource management: congestion characterisation finished; first results of congestion management obtained
- Endpoint functions and reliability: the API of sPIN (Processing In-Network) was defined; a first prototype of MPC MPI using multiple rails was developed; ParaStation MPI was extended to support BXI
Copyright © CEA
Collaboration with other projects
- The three SEA projects (DEEP-SEA, IO-SEA and RED-SEA) develop complementary technologies for the Modular Supercomputing Architecture