
Hod HaSharon | Haifa
Who are we?
Our team at the Huawei Computing Network Innovation Lab is looking for exceptional talent to join us and lead the development of next-generation data centers. We create cutting-edge technologies that combine software and hardware to accelerate compute, storage and networking at large scale. We aim to drive innovation and deliver software-defined infrastructure and algorithms for HPC, AI/ML, and Big Data applications.
We are looking for outstanding candidates with hands-on experience in developing and optimizing AI frameworks. If you are a team player with excellent communication skills and the motivation to revolutionize application performance, you’re welcome on board!
What will you be doing?
• Work as part of an innovative research team to analyze, develop, test and deploy improvements that enhance Huawei’s distributed AI framework
• Develop optimizations that leverage hardware accelerator capabilities, minimize communication overhead and improve training/inference throughput
• Research state-of-the-art distributed AI training and inference algorithms (e.g. FSDP, DDP) to develop accessible model sharding capabilities
• Profile different distributed AI training strategies, compare parallelization methods, and identify the main bottlenecks to be optimized at the computation and network communication levels
• Work in a distributed computing environment to optimize for both scale-up (multi-device) and scale-out (multi-node) systems
• Utilize advanced concepts such as Uncertainty Quantification, Mixed Precision Computing and Model Sparsity to improve performance and enable training of very large AI models
• Collaborate with partners from top universities and open-source communities to conduct state-of-the-art research
What do we want to see?
• B.Sc. degree in computer science, computer engineering, or a closely related field
• Excellent C/C++ programming and software design skills, including debugging, performance analysis, and testing
• Strong technical skills and experience with developing code in a Linux environment
• Excellent teamwork and interpersonal skills
• Ability to work independently, define project goals and scope, and lead your own development effort
• Innovative thinking
Ways to stand out from the crowd:
• M.Sc. or Ph.D. degree
• Proven track record of conducting and publishing independent research
• Experience in optimizing distributed deep learning pipelines with TensorFlow / PyTorch
• Experience in analyzing workloads on large-scale heterogeneous clusters
• Hands-on experience in developing code to target heterogeneous architectures (e.g. CPU/GPU/TPU)
• Experience in developing and contributing to large open-source libraries