[Remote] Datacenter Hardware Operations Technician Lead, Industrial Compute

Remote · Ethiopia · Full-time

Note: This is a remote position open to candidates in the USA. OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. The company is seeking a Datacenter Hardware Operations Technician Lead to serve as the senior on-site technical authority for hardware reliability and fleet health at one of OpenAI’s flagship AI campuses. The role involves driving technical triage and resolution of hardware issues, collaborating with engineering, operations, and vendor teams, and establishing operational standards for hardware maintenance.

Responsibilities

  • Serve as OpenAI’s senior on-site hardware operations lead for server, GPU, storage, and rack-level infrastructure
  • Drive technical triage and resolution of complex hardware failures impacting production systems
  • Partner with Fleet Health Engineering to investigate recurring hardware issues, identify failure patterns, and improve fleet reliability
  • Lead root cause analysis (RCA) efforts for critical hardware incidents and develop corrective and preventive action plans
  • Collaborate with Oracle operations teams and OEM vendors to coordinate repairs, replacements, upgrades, and hardware lifecycle activities
  • Establish and continuously improve hardware maintenance procedures, operational runbooks, and troubleshooting standards
  • Analyze hardware failure trends and operational metrics to identify reliability risks and improvement opportunities
  • Support new hardware introductions, validation activities, and production readiness reviews
  • Coordinate spare parts strategy and inventory planning with supply chain and operations teams
  • Partner with Hardware Engineering, Manufacturing, and Infrastructure teams to provide field feedback that improves future platform designs
  • Develop scalable operational standards and best practices that can be deployed across future Stargate campuses
  • Mentor technicians and partner teams on advanced troubleshooting methodologies and hardware operational excellence

Skills

  • 8+ years of experience supporting large-scale datacenter hardware infrastructure, with experience in a senior technician, sustaining engineering, or hardware operations leadership role
  • Deep expertise with server platforms, GPU systems, storage infrastructure, rack integration, and datacenter hardware architecture
  • Strong experience diagnosing complex hardware failures and leading repair efforts in production environments
  • Experience conducting root cause analysis and driving long-term corrective actions
  • Strong understanding of hardware reliability engineering principles and fleet-health management
  • Proven ability to partner effectively across engineering, operations, manufacturing, and vendor organizations
  • Comfortable operating independently in high-priority production environments with significant operational responsibility
  • Excellent written and verbal communication skills with the ability to influence technical and operational decisions
  • Experience developing operational processes, maintenance standards, and technical documentation
  • Ability to travel occasionally to support new campus deployments and operational readiness activities
  • Experience supporting large-scale GPU clusters or AI/ML infrastructure environments
  • Familiarity with fleet health systems, telemetry platforms, and hardware monitoring tools
  • Experience with failure analysis methodologies such as FRACAS, RCCA, 5-Why, Fishbone, or FMEA
  • Knowledge of Linux system administration and hardware validation workflows
  • Experience supporting hyperscale datacenter operations or HPC environments
  • Familiarity with server manufacturing, rack integration, or NPI-to-sustaining transitions
  • Industry certifications such as CompTIA Server+, OEM hardware certifications, or equivalent experience
  • Experience applying Environmental Health and Safety (EHS) practices in mission-critical datacenter environments

Company Overview

  • OpenAI is an AI research and deployment company that develops advanced AI models, including ChatGPT. It is a sub-organization of the OpenAI Foundation. It was founded in 2015 and is headquartered in San Francisco, California, USA, with a workforce of 1001-5000 employees. Its website is https://www.openai.com.