About the company

At Jobtome (https://weare.jobtome.com/), we are building a modern, cloud-native recruitment and marketing platform used at scale across multiple countries and brands. Our systems power high-traffic job distribution, integrations with external partners, and real-time data pipelines, with a strong focus on reliability, observability, and automation.

Engineering is a core function of the company: we value ownership, pragmatic decision-making, and long-term technical excellence over short-term fixes.

The role

As a Senior Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our production systems. You will work closely with Backend, Frontend, and Product teams to:

- design resilient architectures
- define reliability standards
- improve observability and incident response
- reduce operational toil through automation

This is not a pure ops role: you will contribute to codebases, collaborate on system design, and help evolve our engineering culture toward SRE best practices.

What you will do

- Design, implement, and maintain reliable and scalable cloud infrastructure
- Define and evolve SLIs, SLOs, and error budgets
- Improve monitoring, alerting, and observability across services
- Lead and participate in incident response, post-mortems, and root-cause analysis
- Automate repetitive operational tasks to reduce toil
- Collaborate with Backend engineers on service design, scalability, and failure modes
- Improve CI/CD pipelines, deployment strategies, and release safety
- Contribute to infrastructure as code and platform tooling
- Act as a reliability advocate across the engineering organization

Tech stack

- Cloud: Google Cloud Platform (preferred), AWS
- Containers & orchestration: Docker, Kubernetes (GKE)
- Infrastructure as Code: Terraform
- CI/CD: GitLab CI/CD
- Observability: Cloud Monitoring, Logging, Prometheus, Grafana
- Languages: Go, Python, Bash
- Networking & security: IAM, VPCs, service accounts, secrets management

What we expect from a senior SRE

- Strong experience running production systems at scale
- Solid understanding of distributed systems and failure modes
- Proven experience with SLO-driven reliability
- Strong coding skills
- Cloud infrastructure automation experience
- Ability to debug complex cross-system issues
- Ownership mindset and strong communication skills
- Pragmatic approach to reliability, speed, and cost trade-offs

Working model

- Flexible working hours
- Remote-friendly setup
- Small autonomous teams
- Direct collaboration with product and leadership

Site Reliability Engineer
Full Remote EU only • Mendrisio, IT