Our technology has no boundaries! NVIDIA is building the world’s most groundbreaking, state-of-the-art accelerated compute platforms for the world to use. It’s because of our work that scientists, researchers, and engineers can advance their ideas. We pioneered a supercharged form of computing loved by the fastest-paced computer users in the world - scientists, designers, artists, and gamers.
We are seeking a highly motivated High-Performance System Architect to join our team of experts and help shape the future of high-performance and ML/AI computing. Our next-generation InfiniBand and NVL systems will be at the forefront of connecting and powering the world's most advanced compute clusters, from supercomputers used in AI research to high-performance clusters used in almost every industry today, such as automotive and pharmaceuticals. As a High-Performance System Architect at NVIDIA, you will have the opportunity to work on some of the most cutting-edge technology and help drive the innovation of our next-generation networks that will be used by top researchers and engineers around the world.
What You’ll Be Doing
- Define the InfiniBand and NVL system architecture end-to-end, based on internal and customer requirements, across the entire product life cycle (pre-silicon and post-silicon through deployment).
- Research solutions to enable the next large-scale, high-performance computing clusters. The position spans multiple layers, from algorithms and software to firmware and hardware.
- Collaborate with cross-functional teams, including other architecture teams, logic design, system software, firmware, and research teams, to ensure the successful execution of the project.
What We Need To See
- B.Sc., M.Sc., or Ph.D. degree in Computer Science, Computer Engineering, or Electrical Engineering, or equivalent experience.
- 5+ years of industry or research experience in computer networks.
- Excellent understanding of large-scale network behavior and of the effect of distributed computing workloads on the network.
- Experience in development of simulation environments.
- Strong managerial, problem-solving, and critical-thinking skills.
- Ability to work and operate in a highly dynamic environment.
- Ability to partner with multiple groups in the organization.
Ways To Stand Out Of The Crowd
- Good knowledge of network protocols, such as InfiniBand, IP, TCP, and RoCE, and of network topologies.
- Good knowledge of Python and C++.
- Familiarity with HPC environments, routing algorithms, and the OMNeT++ and ns-3 simulation environments.
- Experience with AI workloads such as LLM and DLRM, and familiarity with communication libraries like NCCL.
We are committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.