NVIDIA Senior SRE Engineer in Santa Clara, California

NVIDIA is looking for a seasoned SRE to join its multifaceted and fast-paced Infrastructure, Planning and Processes organization as a Senior SRE Engineer. You will be part of a crew that develops and maintains NVIDIA’s sophisticated internal cloud provisioning product for GPUs and Tegra systems. The team works with other business units within NVIDIA Software, such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence, and Driverless Cars, to meet their infrastructure and systems needs.

As an SRE, you’ll also work with teams such as software engineering to deploy new products and manage our infrastructure and its associated processes and systems. Keen attention to detail, problem-solving abilities, and a solid knowledge base are essential.

What you’ll be doing:

  • Administer Kubernetes systems for DevOps and CI/CD: design and implement clusters, cluster segmentation, and internal/external networking across multiple clusters and environments.

  • Monitor system performance and troubleshoot issues related to CPU, memory, disk, and network utilization.

  • Architect CI/CD pipelines for container build and deployment.

  • Craft and develop tools needed for automating workflows.

  • Develop, improve, and maintain our infrastructure codebase.

  • Craft and implement critical metrics using various analytics methods and dashboards.

  • Take part in prototyping, crafting, and developing cloud infrastructure for NVIDIA.

  • Apply AI techniques to extract useful signals about machines and jobs from the data generated.

What we need to see:

  • Kubernetes domain expertise, with extensive experience building scalable, resilient platforms in both public and private clouds and applying standard platform engineering and architecture methodologies (including architecting and implementing the overall platform, orchestration, security, and monitoring ecosystem). High proficiency in administering and configuring Kubernetes.

  • Proficient with CI/CD tools such as Jenkins, GitLab CI, GitHub Actions, ArgoCD, etc.

  • Experience with data analytics/visualization tools like Kibana, Grafana, Splunk etc.

  • Strong Ansible skills. Experience with other configuration tools like Chef and Puppet is also good to have.

  • Proficient using source code management and binary repository systems like GitLab, GitHub, Artifactory, Perforce etc.

  • Knowledge of monitoring systems such as Zabbix, Alertmanager, PagerDuty and/or similar systems.

  • Well versed in Prometheus, writing custom exporters and PromQL.

  • 8+ years of proven experience.

  • Bachelor's degree in Computer Science, Information Technology, or related field, or equivalent experience.

Ways to stand out from the crowd:

  • Experience managing NVIDIA hardware like GPUs and Tegras.

  • Background with GitLab CI.

  • Experience with building and deploying containers.

  • Solid understanding of containerization and microservices architecture. Certified Kubernetes Administrator (CKA), Certified Kubernetes Security Specialist (CKS) & Certified Kubernetes Application Developer (CKAD) preferred.

  • Ability to design simple systems that can work efficiently without needing much support.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.

The base salary range is 140,000 USD - 258,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits (https://www.nvidia.com/en-us/benefits/). NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
