Posted Jun 23, 2026

[Remote] Principal Site Reliability Engineer - AI Infrastructure Operations


Note: This is a remote position open to candidates in the USA. Nscale is a GPU cloud provider focused on AI, offering high-performance infrastructure for AI start-ups and large enterprises. The company is seeking a Principal Site Reliability Engineer to lead reliability strategy, design foundational systems, and drive operational excellence across its AI Infrastructure Operations team.


Responsibilities


Skills


Benefits


Company Overview

  • Nscale builds AI data centers and provides GPU cloud infrastructure that companies use to train, run, and scale large AI models. Founded in 2024 and headquartered in London, England, the company has 201-500 employees. Its website is https://www.nscale.com.
