Senior Site Reliability Engineer (m/f/x)

Software Development

REWE Group Österreich

Wien

Company Description

As the IT organization of REWE Group Austria, we and our more than 600 employees develop innovative IT products and services for all our corporate divisions in Austria and abroad, setting the tone for modern trade.

We are looking for a highly skilled and experienced SRE (Site Reliability Engineer) Expert to join our team. The ideal candidate will ensure the reliability, availability, and performance of our critical infrastructure and services. This role involves collaborating with cross-functional teams to build and maintain scalable and efficient systems, implement automation, and drive improvements in system reliability.

Job Description
  • Design, implement, and maintain highly reliable and scalable infrastructure and services using cloud platforms (e.g. GCP).
  • Automate repetitive tasks using tools such as Terraform, Ansible and SaltStack.
  • Collaborate with development and operations teams to ensure smooth deployment and operation of services using CI/CD pipelines (e.g. GitLab).
  • Establish and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to ensure system reliability using monitoring tools like Prometheus and Grafana.
  • Perform capacity planning and optimization to handle growth and scale.
  • Lead incident management and post-mortem processes, including root cause analysis of system failures, to ensure continuous improvement.
Qualifications
  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 5+ years of experience as an SRE, DevOps Engineer, or in a similar role.
  • Strong understanding of cloud infrastructure (specifically GCP) and containerization technologies (Docker, Kubernetes).
  • Proficiency in scripting and programming languages (Java, Python, Go, Bash).
  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK Stack, Fluentd, Splunk).
  • Solid knowledge of networking (DNS, TCP/IP, HTTP), security best practices (SSL/TLS, firewalls, IAM), and system administration (Linux, Windows).
  • Experience with incident management tools (Jira, ServiceNow), version control systems (Git, SVN), and CI/CD.
Additional Information
  • Long-term, interesting and varied work for a reliable employer in a supportive team
  • A family-friendly company culture with flexible working hours and remote working options available according to your individual needs
  • Numerous training and further development opportunities within the Group (5% of working time for self-organized training and education)
  • Staff shopping and travel discounts
  • Easy access by public transport
  • An industry-standard, attractive and performance-based annual gross salary starting at 54,000 euros (on a full-time basis), with the possibility of higher pay according to experience and qualifications

No matter where you are in your career, we have a path for you, whether you’re looking for your first job, advancement in your field, or a career change. We’re proud to employ great people who are passionate about their jobs, and they’re all different. No matter who you are, what you need and where you’re going, REWE Group can be a part of it. Apply now!

Please upload your resume to give us insight into your work experience, anonymously if you like!

We promote a diverse and inclusive work environment. We therefore welcome applications from people of all genders, ages, cultural or social backgrounds, and sexual identities, as well as from people with disabilities. In addition, we would like to increase the proportion of women in technical professions and are particularly pleased to receive applications from women for this position.