06 feb
Perficient
Monterrey
Job Description
We currently have a career opportunity for a Senior Site Reliability Engineer to join our team located in Mexico.
At Perficient, we're passionate about building software that solves problems.
We count on our site reliability engineers (SREs) to empower users with a rich feature set, high availability, and stellar performance as they pursue their missions.
As we expand customer deployments, we're seeking an experienced SRE to deliver insights from massive-scale data in real time.
Specifically, we're searching for someone who has fresh ideas and a unique viewpoint,
and who enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences for every interaction.
Perficient is always looking for the best and brightest talent, and we need you!
We're a quickly growing, global digital consulting leader, and we're transforming the world's largest enterprises and biggest brands.
You'll work with the latest technologies, expand your skills, and become a part of our global community of talented, diverse, and knowledgeable colleagues.

Responsibilities
Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding.
Partner with development teams to improve services through rigorous testing and release procedures.
Participate in system design consulting, platform management, and capacity planning.
Create sustainable systems and services through automation and uplifts.
Balance feature development speed and reliability with well-defined service-level objectives.

Qualifications
Bachelor's degree (or equivalent) in computer science or related discipline.
5+ years' experience with Java, J2EE, NoSQL/SQL datastores, Spring Boot, GCP/AWS/Azure, and Docker/K8s in the maintenance and development of multi-tier applications.
Understanding of RESTful APIs and microservices platforms.
4+ years of experience with APM and other monitoring tools such as Dynatrace, New Relic, ELK, Splunk, Prometheus, Sensu, Nagios, Kafka, DataDog, PagerDuty.
Strong experience working with product and development teams to establish error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and effectively driving the use of the budget to ensure maximum domain availability/uptime.
Experience solving complex architecture, design, and business problems, and working to simplify, optimize, and remove bottlenecks.
Experience architecting, designing, and developing automation to reduce toil and improve the recoverability, availability, latency, and scalability of supported applications, with an understanding of MTTD (Mean Time to Detection) and MTTR (Mean Time to Resolution).
Ability to quickly diagnose and resolve issues in high-pressure situations.
Strong verbal and written communication skills to effectively collaborate with cross-functional teams and articulate technical concepts to non-technical stakeholders.
Show the company your skills: fill out the form and add a personal touch to your cover letter. This will help the recruiter choose a candidate.