
Support Operations Engineer (SOE)

Fueled by remarkable growth over the last year, and to keep up with surging demand from customers and the wider market, CoreWeave is expanding its operations to Europe. We’ve built a reputation for delivering cutting-edge GPU infrastructure for leading AI companies, and we are thrilled to continue this journey as a trusted partner to the AI community in Europe. If you thrive in fast-paced, high-growth environments and want to play a key role in building and delivering the critical infrastructure required by AI, we’d love to hear from you. Learn more at www.coreweave.com.

As a Support Operations Engineer, you will be at the forefront of this transformational technology. You will assist cutting-edge companies and developers who use our accelerated compute services and features to run their mission-critical applications. You will be crucial to the success of their production-critical implementations through deployment, monitoring, triaging, and troubleshooting of critical infrastructure and jobs. Your efforts will ensure the efficient and uninterrupted execution of our clients’ jobs.

This role includes shift work, participation in an on-call rotation, and occasional after-hours support. It operates within a fast-paced, global, 24/7 support team environment, requiring flexibility for collaboration across different time zones.

Location & Travel: This is a hybrid role based in London or fully remote within the UK.

  • Hybrid in London: If you live within a commutable distance of our London offices, we expect you to be there at least twice a week to foster collaboration and connection.
  • Fully Remote: If you live outside a commutable distance, you can work remotely with occasional visits to the office.

Salary: £48,000 - £60,000

Job Duties

  • Deployment and configuration of platform infrastructure in a Linux environment
  • Monitor software and infrastructure for issues and act quickly to stem any negative impact
  • Work with Development, Infrastructure, and Network Operations teams to troubleshoot and resolve deployment-related software, network, installation, and configuration issues
  • Support Development, Infrastructure, and Network Operations teams in resolving infrastructure issues
  • Work with contractors in remote sites to install, configure, and troubleshoot servers, network equipment, and data centre infrastructure
  • Reconfigure or decommission existing infrastructure as needed
  • Identify, maintain and create documentation for new hardware deployments, all varieties of corner case scenarios, and troubleshooting workflows
  • Streamline deployments to increase efficiency and reduce deployment times
  • Support the development, testing, and integration of new hardware into the platform
  • Liaise closely with the Client Support Engineers team to monitor customer support requests and act as an extension of the Client Support team
  • Help maintain high customer satisfaction by acting with empathy, understanding the business impact and priority of customer issues, and following our best practices
  • Promptly act on technical incidents and escalations, communicating effectively with all stakeholders
  • Assist with the training and development of new hires
  • Plan, organize, and manage tasks, resources, and timelines across teams to accomplish work accurately and on time

Required skills and experience

  • Strong Linux command-line skills and experience with system administration
  • Experience with High-Performance Computing (HPC) system administration
  • Working experience with Kubernetes and Docker
  • Proficiency in scripting languages such as Bash and Python for automation
  • Solid understanding of distributed computing environments and methodologies, including storage volumes, private networks, load balancers, and virtual machines
  • User support experience and excellent communication skills to assist and train end users
  • You have a knack for solving problems; you're adept at recognizing technical issues and developing appropriate solutions

Desired Skills and Experience

  • Data Center Experience: Proficient in rack-and-stack work and in server and cable troubleshooting.
  • Understanding of networking concepts and troubleshooting (e.g., TCP/IP, InfiniBand).
  • Experience with server hardware installation and server configuration in data center environments.
  • GPU Hardware and HPC: Familiar with GPU hardware and high-performance computing use cases.
  • AI and ML: Knowledgeable in artificial intelligence and machine learning.
  • Operational/System Administration: Experienced in working from ticket queues, Network Operations Centers (NOC), and dashboards.
  • Monitoring Tools: Experienced with Grafana and other monitoring tools.
  • Web Technologies: Intermediate skills in troubleshooting web technologies, including web servers, frameworks, HTTP, and authentication.
  • Cloud Concepts: Skilled in system, API, and infrastructure design using cloud concepts such as storage volumes, private networks, load balancers, and virtual machines.

Why CoreWeave?

At CoreWeave, we work hard, have fun, and move fast! We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:

  • Be Curious at your Core
  • Act like an Owner
  • Empower Employees
  • Deliver Best In-Class Client Experience 
  • Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for takeoff, the growth opportunities within the organization are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too. Come join us!

Benefits

We offer a competitive salary and benefits, including: 

  • Family-level Medical Insurance
  • Family-level Dental Insurance
  • Generous Pension Contribution
  • Life Assurance at 4x Salary
  • Critical Illness Cover
  • Employee Assistance Programme
  • Tuition Reimbursement
  • Work culture focused on innovative disruption

Benefits may vary by location. 

CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.

CoreWeave does not accept speculative CVs. Any unsolicited CVs received will be treated as the property of CoreWeave and your Terms & Conditions associated with the use of CVs will be considered null and void.

Any unsolicited CVs sent by your company to us – that is to say, in any situation where we have not directly engaged your company in writing to supply candidates for a specific vacancy – will be considered by us to be a “free gift”, leaving us liable for no fees whatsoever should we choose to contact the candidate directly and engage the candidate’s services, and will in no way establish any prior claim by your company to representation of that candidate should the candidate’s details also be submitted by any other party.