Site Reliability Engineer

Application Developer/Site Reliability Engineer with a Bachelor’s degree in Computer Science, Computer Information Systems, or Information Technology, or a combination of education and experience equating to the U.S. equivalent of a Bachelor’s degree in one of the aforementioned subjects.

Job Duties and Responsibilities:

  • Responsible for the reliability, availability, and health of all Production environments, including ongoing monitoring and proactive, preventive health assessments.
  • Transform Operations and influence Engineering practices to achieve the strategic goal of deploying new code to Production frequently via Continuous Delivery (CD) pipelines.
  • Handle complex and varied product platforms and multiple Cloud deployment platforms.

Key Responsibilities include, but are not limited to:

  • Design the SRE function with the goal of providing 24x7x365 coverage.
  • Build and evolve an Operations Model that can handle complexities spanning various cloud-based deployment models and technology partner integrations.
  • Create & support a delivery ecosystem that thrives on demonstrating value to stakeholders by adopting highly iterative & Continuous delivery models.
  • Work with the product management team to define Service Level Agreements (SLAs) and Service Level Objectives (SLOs), and implement Service Level Indicators (SLIs) for core capabilities.
  • Collaborate with product and engineering to drive and improve the whole lifecycle of operational readiness - from inception to design, through deployment, operations, and proactive refinement.
  • Influence Architectural and Product decisions with a bias towards Scale, Observability, Monitoring, Stability, and Security.
  • Drive incident management process and support a blameless post-mortem culture.
  • Own and drive high profile customer escalations.
  • Drive and implement lean-ops culture by applying self-service, self-healing, and automation.
  • Advocate for SRE Principles and collaborate with all Engineering teams to create a DevOps mindset.
  • Responsible for Capacity forecasting, Budget, and Cost optimization.
  • Define and deliver KPIs and Metrics for Operations & Quality to stakeholders -- Deployment Frequency, MTTR, Lead Time, etc.
  • Adopt and evolve internal processes based on industry best practices in SRE.
  • Grow team members' careers by coaching and mentoring junior engineers, and foster leadership principles and behaviors to groom the next generation of leaders.

Skills/Knowledge required:

  • Minimum 3 years of Software Engineering and/or Infrastructure Operations experience, 2+ years in an SRE role.
  • Ability to work with distributed, multicultural, and diverse teams.
  • Experience with customer escalations and/or operations war rooms.
  • Strong understanding of modern monitoring and logging technologies.
  • Strong analytical skills with a data-driven approach to solving problems.
  • The ability to partner and influence product, engineering, and operations teams is a must.
  • Strong organizational planning and development, business judgment, influencing skills, and technical leadership.
  • Experience with Agile methodologies -- Scrum, Kanban, etc.

Work location is Portland, ME, with required travel to client locations throughout the USA.

Rite Pros is an equal opportunity employer (EOE).

Please mail resumes to:
Rite Pros, Inc.
565 Congress St, Suite #305
Portland, ME 04101