Information Technology · Full Time · Verified

Senior AI Infrastructure Engineer (2026 Vision)

Nexus Horizon Labs
San Francisco
Estimated Salary
USD 180,000 – USD 250,000
Live Update
1 July 2026
Deadline
1 Jul 2027

Job Description

Are you ready to architect the backbone of tomorrow's Artificial General Intelligence? Nexus Horizon Labs is looking for a visionary Senior AI Infrastructure Engineer to lead our high-performance computing initiatives. As we prepare for the massive scalability demands of 2026 and beyond, you will be at the forefront of building resilient, secure, and lightning-fast AI ecosystems.

We are not just building software; we are engineering the future. Join a team of elite engineers dedicated to pushing the boundaries of what is possible in deep learning, quantum-ready architectures, and next-gen cloud integration.

Responsibilities

  • Architect Scalable Infrastructure: Design and deploy high-availability GPU clusters optimized for massive model training and inference workloads.
  • Optimize Training Pipelines: Implement advanced data pipelines and distributed training strategies to reduce latency and maximize compute efficiency.
  • Cloud & Hybrid Strategy: Lead the migration and management of complex cloud environments (AWS/Azure/GCP) with a focus on cost optimization and security compliance.
  • System Reliability: Build automated monitoring and alerting systems to ensure 99.99% uptime for critical AI services.
  • Future-Proofing: Research and prototype technologies relevant to the 2026 tech landscape, including edge computing and federated learning.
  • Team Mentorship: Guide junior engineers and conduct code reviews to maintain the highest standards of engineering excellence.

Qualifications

  • Experience: 5+ years in systems engineering, backend development, or infrastructure architecture.
  • Programming: Proficiency in Python, C++, and shell scripting with a deep understanding of low-level system optimization.
  • Cloud Expertise: Strong hands-on experience with Kubernetes, Docker, and major cloud providers.
  • AI Stack: Familiarity with PyTorch, TensorFlow, and MLOps tools (MLflow, Kubeflow).
  • Problem Solving: Proven ability to troubleshoot complex, multi-node distributed systems issues under pressure.
  • Communication: Excellent written and verbal communication skills for technical documentation and stakeholder presentations.

Required Skills

Python, C++, Kubernetes, Docker, AWS, GCP, PyTorch, TensorFlow, Machine Learning, MLOps, Distributed Systems, Linux

Ready to Take This Challenge?

Make sure your resume is ready. Submit your application now before the deadline.