As a Senior Site Reliability Engineer at Striveworks, you will be challenged – and trusted – on day one to take ownership of a team that is responsible for maintaining, optimizing, and enhancing our on-premises computing environments. You will play a crucial role in the successful deployment of our software solutions to clients. You will be responsible for leading and executing the technical aspects of implementation projects, ensuring seamless integration, customization, and configuration of our software products. Your expertise will guide the implementation team, drive best practices, and ensure client satisfaction.

You are right for this challenge if you value and possess technical expertise and you enjoy pushing the boundaries of your own capabilities. You will be responsible for maintaining Striveworks’ software deployments, both on-prem and across various cloud service providers, using Infrastructure-as-Code methodologies. You will work with a team of software engineers and data scientists, helping them integrate their work into functional AI solutions.

Day-to-day Responsibilities:

  • Oversight for the automation of infrastructure-as-code for standing up virtual machines and custom Kubernetes clusters in AWS, Azure, GCP, on-premises, or hybrid cloud environments
  • Internal triage of issues reported by platform users
  • Working with platform developers to define requirements and build solutions for customer use cases of the platform
  • Software deployments to on-prem unclassified, CUI, Secret, and Top Secret networks
  • Participating in on-call rotations and incident response to swiftly address and resolve critical system issues

The Senior Site Reliability Engineer on the Engagements team leads a small team of Site Reliability Engineers who are directly engaged with the customer, predominantly with their networking and platform management teams, to sustain the Chariot platform in air-gapped computing environments. You will contribute directly to the success of mission-critical systems for national security and commercial clients. You will be expected to wear multiple hats, step into vacuums where more work is needed, and will be given the breadth to explore new technologies. You will work side-by-side daily with software engineers, data scientists, and end users of our products, learning from them so that functional decisions become second nature to you.

The anticipated base pay range for this position is $170,000–$210,000/year. Striveworks’ total compensation package includes a competitive base salary, annual performance-based equity grants, and a lucrative yearly cash bonus.

This position offers a fully remote work environment but requires a willingness to travel to customer sites up to 50% of the time. Preferred locations of residence for this role are Southern Pines, NC or Tampa, FL. Alternatively, you can work hybrid/onsite at our office in northwest Austin, TX.

The Right Fit

We spend a lot of time during our hiring process talking about shared values.

Why? We passionately believe that fostering an environment where people can self-actualize and pursue greatness is the best way to achieve our individual and collective goals.

What does this mean for you? We want to provide you with the conditions to thrive in an environment where you can achieve your goals, where you know the team shares your goals, and where you make and accept decisions for the team with humility. At Striveworks, we want your say/do ratio to be 1 and to know that being part of a top-tier team means that there is no smartest person in the room. If that makes sense, we are already on the same page.

Here’s what we’re looking for:

  • Top Secret U.S. security clearance
  • 4+ years total experience as a Site Reliability Engineer, Software Engineer, or DevOps Engineer
  • 2+ years relevant experience in:
    • Developing for and/or deploying microservices in Kubernetes
    • Programming in Python and Golang
    • Writing and deploying Helm Charts
    • Deploying a web-based application to a DoD/IC air-gapped network
    • Automation and infrastructure-as-code (e.g. Terraform, Ansible)
    • Deploying infrastructure in a cloud such as AWS, Azure, GCP, or OpenStack
  • Understanding of networking concepts, security best practices, and disaster recovery strategies
  • Excellent communication and collaboration skills to work effectively in a cross-functional team environment
  • Strong problem-solving skills and the ability to troubleshoot complex technical issues

The Wish List

We are very interested in candidates who possess the above qualifications, and we appreciate and consider the addition of:

  • Deploying, maintaining, or contributing to CNCF projects
  • Deploying, managing, and/or supporting enterprise information systems in a DoD environment
  • U.S. federal information system security policies, including Security Technical Implementation Guides (STIGs), NIST 800-171, NIST 800-53, CMMC, ICD 503
  • DevSecOps, CI/CD pipelines, or automated security scanning
  • Administration and deployment of GPU-enabled servers
  • Storage technologies, NAS/SAN tools
  • Directly leading engineering initiatives and/or teams
  • Deploying infrastructure to AWS C2S or similar
  • Service mesh
  • Blue-green and canary deployments
  • Multi-cloud

The Benefits

  • Top-of-market salary and total compensation
  • Generous equity plan
  • Health/vision/dental insurance
  • Flexible PTO
  • Parental leave

Striveworks: Better Models, Faster

The world has looked to data analytics to bridge the gap between floods of data and the struggle to use that data effectively to make timely, impactful decisions. Today, most organizations are awash in analytics that “aren’t quite right”—models that were developed too generally, or too slowly, to be effective in dynamic, fast-paced environments. Striveworks is simplifying ModelOps with a powerful and extensible platform that instantiates the data analytic process as code.

Striveworks is trusted by leading Fortune 500 firms as well as leaders in the public sector as a primary solution for managing model development, monitoring, and governance—and ensuring those models solve the real challenges their organizations face.

Striveworks’ Chariot platform enables users to turn their own production data into models and turn models into production systems. Uniquely, as you train, test, deploy, and use models, our lineage system enables you to track not only the “upstream” provenance of model and data sources, but also the “downstream” usage of the resultant model inferences. Combining this with a principled experience for data and model development, Chariot gives our customers in highly regulated industries an unmatched governance solution over the top of a performant ModelOps platform.

Striveworks is an Equal Opportunity Employer and does not discriminate in employment on the basis of race, color, religion, belief, sex (including pregnancy and gender identity or expression), national origin, social or ethnic origin, political affiliation, sexual orientation, marital status, disability, genetic information, age, membership in an employee organization, retaliation, parental status, military service, or other non-merit factor. Striveworks will not tolerate discrimination or harassment of any kind.

If you require assistance or a reasonable accommodation in the application process, please contact Operations at [email protected]

Striveworks is a participating employer in the E-Verify program.