Jan 09
Leading company
Mexico
Site Reliability Engineer (SRE)
REMOTE / Mexico
Description
Fast Dolphin is an international staffing company with over two decades of experience specializing in recruiting bilingual and multilingual IT experts across the Americas, as well as providing payroll and customized staffing solutions. Moreover, we take pride in boosting the careers and dreams of our professionals.
We are currently looking for a Site Reliability Engineer (SRE), Cloud & Automation, to work in a hybrid position with possible travel to São Paulo, Brazil, for a 6+ month job opportunity.
Job Title: Site Reliability Engineer (SRE) Cloud & Automation
Location: Remote
Start Date: ASAP
Length: 6+ months
Responsibilities
Site Reliability Engineer (SRE) Role Summary
We are looking for a seasoned Site Reliability Engineer with expertise in Kafka, Kubernetes, and MongoDB to ensure infrastructure reliability, scalability, and performance. The role focuses on designing and maintaining resilient systems, optimizing performance, and automating processes while collaborating with teams to improve application deployment and performance.
Key Responsibilities:
- Design and maintain resilient infrastructure for Kafka, Kubernetes, and MongoDB.
- Monitor system performance using modern observability tools.
- Develop monitoring, alerting, and logging frameworks for effective failure detection.
- Perform root cause analysis and resolve incidents to minimize downtime.
- Automate tasks to enhance efficiency and reduce manual efforts.
- Collaborate on application performance and deployment strategies.
- Define best practices for scaling, capacity planning, and disaster recovery.
- Improve infrastructure as code (IaC) and deployment pipelines continuously.
Required Skills:
- Bachelor's degree in Computer Science or related experience.
- 3+ years in SRE, DevOps, or a similar role.
- Expertise in managing Kafka, Kubernetes, and MongoDB in production.
- Strong knowledge of distributed systems, networking, and database tuning.
- Experience with monitoring tools (e.g., Prometheus, Grafana, ELK Stack).
- Proficiency in scripting (Python, Bash) and CI/CD tools (Jenkins, ArgoCD).
- Familiarity with cloud platforms (AWS, Azure, or GCP).
- Excellent problem-solving, communication, and collaboration skills.
Preferred Skills:
- Familiarity with MongoDB Atlas and Kubernetes Operators.
- Knowledge of RedHat OpenShift and configuration tools like Ansible or Terraform.
- Understanding of SLA, SLO, and error budgets within SRE practices.
Languages:
- English B2
- Spanish
If you meet these requirements and are interested in this position, please send your resume, along with your availability to start on this project, to the following e-mail address: ***********@fastdolphin.com
Rosa Trinidad Romero Mancilla
IT Recruiting Master
Fast Dolphin, Inc.
www.fastdolphin.com
12555 Orange Drive, Suite 4059
Ft. Lauderdale, FL 33330
Phone: +1 (954) 233-0647
WhatsApp +52(554) 164-9564
Skype: rostry2000
Publication date: 08-01-2025