Your Mission
As a Senior SLURM/HPC Engineer, you will play a key role in managing and optimizing our HPC infrastructures built on
modern cloud environments. You will collaborate closely with international, multidisciplinary teams and be responsible for the HPC
application layer.
Your primary objective will be to ensure the performance, scalability, and reliability of our infrastructures while integrating
innovative solutions tailored to client needs.
Main Responsibilities
- Manage and optimize SLURM clusters in cloud environments to meet high-performance workload and orchestration
requirements.
- Collaborate with the Kubernetes (K8s) engineer to integrate and optimize HPC orchestration (e.g., SLURM on K8s or
equivalent).
- Design HPC solutions adapted to hybrid environments (on-premises/cloud).
- Implement efficient workflows for end-users, particularly in AI, scientific simulation, and data-intensive fields.
- Maintain, troubleshoot, and continuously improve HPC infrastructures.
- Deploy and manage tailored HPC environments using leading cloud platforms (CoreWeave, Nebius, AWS, Azure, Oracle).
- Work closely with DevOps and infrastructure teams to ensure seamless integration between application and orchestration
layers.
- Draft detailed technical documentation and facilitate knowledge transfer to internal teams.
Your Profile
We are looking for an experienced, passionate candidate with a strong background in high-performance computing, capable of
thriving in complex and challenging technical environments.
Required skills
- Proven expertise in SLURM and its application in HPC environments.
- Solid experience in deploying and managing HPC workloads on Kubernetes (or equivalent).
- Proficiency in leading cloud solutions (CoreWeave, Nebius, AWS, Microsoft Azure, Oracle).
- Strong understanding of HPC workflows and specific user needs (AI, scientific research, etc.).
- Advanced scripting and automation skills (Bash, Python, Ansible, or Terraform).
- Ability to collaborate with multidisciplinary teams, including DevOps and Kubernetes engineers.
- Experience in hybrid cloud environments (on-premises/cloud).
Additional Assets
- Experience integrating cloud-native solutions with HPC environments.
- Knowledge of other HPC cluster managers (e.g., Slurm++, Grid Engine).
- Experience in HPC performance benchmarking and optimization.
What We Offer
- Fully remote: the opportunity to work from anywhere while collaborating with international teams.
- A dynamic, innovation-driven, and collaborative work environment.
- The chance to work on large-scale projects with prestigious clients.
Apply via hiring@sesterce.com to join us and help shape the future of high-performance computing and cloud technology! 🚀