Jan 20
Thomson Reuters
Xico
**Senior Site Reliability Engineer**
Are you passionate about the chance to bring your experience to a world-class company that is market-leading for both content and technology?
If yes, we are looking for you!
Join our team!
Thomson Reuters provides knowledge to act: we deliver information quickly and efficiently so professionals can make decisions that matter.
We combine industry expertise with innovative technology to deliver critical information to leading decision-makers in the legal, tax, and accounting markets, powered by the world's most trusted news organization.
We aim to provide every one of our employees with a positive working environment, continued professional development, and a commitment to work-life balance and equal opportunities.
We are also part of a vast network of global career opportunities and actively encourage the development of staff in the business.
This position will allow you to expand your technical skills while networking with professionals in Cloud operations, technology development, and project management teams.
Thomson Reuters ONESOURCE Indirect Tax's SRE team is looking for a Site Reliability Engineer who will provide hands-on technical skills and share industry best practices with other team members on core SRE principles and tools.
The Site Reliability Engineer will participate in end-to-end operational aspects of ONESOURCE Indirect Tax Applications by working closely with Architects, DevOps, Product, and development teams to ensure we get the most out of the software on AWS and OCI.
**About the Role**
In this opportunity as a **Senior Site Reliability Engineer**, you will:
- Be a Team Player: Working in a collaborative team-oriented environment, you will share information, value diverse ideas, and partner with cross-functional and remote teams.
- Be an Agile Person: with a strong sense of urgency and a desire to work in a fast-paced, dynamic environment, you will deliver solutions against strict timelines.
- Be Innovative: you are empowered to try new approaches and learn new technologies. You will contribute innovative ideas, create solutions, and be accountable for end-to-end deliveries.
- Be an Effective Communicator: through active engagement and communication with cross-functional partners and team members, you will effectively articulate ideas and collaborate on technical developments.
- Represent operations in a technical fashion to leadership and development teams.
- Engage with development and other operations teams to improve stability, scalability, and supportability, and to continue evolving our monitoring and operational procedures for the architecture.
- Contribute to and develop documentation on application services, infrastructure details, recovery procedures, root cause analysis, and post-incident reviews.
- Perform change management functions involving software deployment and server maintenance requests, document the change procedures and instructions, and present them in change advisory board meetings for the necessary approvals.
- Respond to and mitigate incidents as they occur within the environment.
- Apply security best practices for cloud.
- Design, create, and manage the performance, availability, recovery requirements, and standards across the Observability Platform.
- Impact the engineering function by influencing decisions through advice, counsel, or facilitating services.
- Prefer working in an Agile environment and commit to continuously improving that environment.
- Troubleshoot, deploy systems, or execute maintenance tasks as necessary to meet the specified SLOs.
- Perform periodic on-call duty on a rotational basis.
**About You**
You're a fit for the role if your background includes:
- Bachelor's degree
- 5-8 years of experience in an enterprise-level operations support or DevOps role.
- Working knowledge of infrastructure components (e.g., routers, load balancers, cloud products, container systems, compute, storage, and networks)
- Expertise in observability and monitoring tools, like Dynatrace, Datadog, AppDynamics, Splunk, etc.
- Deep understanding of Application performance monitoring (APM) and user monitoring.
- Experience with load balancers and AWS services such as ECS, EMR, Step Functions (state machines), CloudFormation, CloudWatch, Lambda, SQS, ECR, Fargate, and Elasticsearch, as well as networking concepts.
- Sound knowledge of ITSM processes, SLI/SLO/SLA management, incident resolution, and automation techniques.
- Incident response and recovery: SREs are responsible for responding to incidents and implementing processes for incident response, monitoring, and automated recovery.
- Ability to code in at least one programming language (Java, C#, Python, JavaScript, etc.)
- Working knowledge of ITIL Change and Incident management processes.
- Experience in site reliability engineering in Java, Kubernetes, Kafka, and Database platforms (like Postgres)
- Excellent written and verbal communication skills and strong collaboration skills