12 Jan
Tech Mahindra
Saltillo
Data Engineer
Job Description: Data is at the core of our business.
The data engineer role is a technical position that requires substantial expertise across a broad range of software development and programming fields.
In particular, the data engineer should have sufficient knowledge of big data solutions to implement them on premises or in the cloud. A data engineer generally works on complex big data projects, with a focus on collecting, parsing, managing, analyzing and visualizing large data sets to turn information into insights across multiple platforms.
He or she should be able to determine the required hardware and software design and act on those decisions.
The big data engineer should be able to develop prototypes and proofs of concept for the selected solutions.

Specific responsibilities are as follows:
- Utilize data engineering skills within and outside of the developing information ecosystem for discovery, analytics and data management
- Work with the data science team to deploy machine learning models
- Use data wrangling techniques to convert one "raw" form of data into another, including data visualization, data aggregation, training a statistical model, etc. (a minimal PySpark sketch appears after the skills list below)
- Work with various relational and non-relational data sources, with the target being Azure-based SQL Data Warehouse and Cosmos DB repositories
- Clean, unify and organize messy and complex data sets for easy access and analysis
- Create different levels of abstraction of the data depending on analytics needs
- Hands-on data preparation experience using the Azure technology stack, especially Azure Databricks, is strongly preferred
- Implement discovery solutions for high-speed data ingestion
- Work closely with the data science team to perform complex analytics and data preparation tasks
- Work with the senior data engineers on the team to develop APIs
- Utilize state-of-the-art methods for data mining, especially on unstructured data
- Experience with complex data parsing (Big Data Parser) and Natural Language Processing (NLP) transforms on Azure is a plus
- Design solutions for managing highly complex business rules within the Azure ecosystem
- Performance-tune data loads

Skills Required
- Mid to advanced level knowledge of Python and PySpark is an absolute must
- Knowledge of Azure, the Hadoop 2.0 ecosystem, HDFS, MapReduce, Hive, Pig, Sqoop, Mahout, Spark, etc. is a must
- Experience with web scraping frameworks (Scrapy, Beautiful Soup or similar)
- Extensive experience working with data APIs (RESTful endpoints and/or SOAP; a minimal REST example appears after this list)
- Significant programming experience (with the above technologies as well as Java, R and Python on Linux) is a must
- Knowledge of a commercial distribution such as Hortonworks, Cloudera or MapR is a must
- Excellent working knowledge of relational databases (MySQL, Oracle, etc.)
- Experience with complex data parsing (Big Data Parser) is a must
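As a rough illustration of the data wrangling and cleanup work described in the responsibilities above, here is a minimal PySpark sketch. The file paths, column names and aggregation level are hypothetical placeholders, not part of the actual pipelines.

# Minimal PySpark sketch: clean and unify a messy raw extract, then write a
# curated aggregate. Paths and columns are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("raw-orders-cleanup").getOrCreate()

# Read a raw CSV extract; schema inference is acceptable for a prototype.
raw = spark.read.option("header", True).csv("/mnt/raw/orders.csv")

clean = (
    raw
    .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("customer_id", F.trim(F.col("customer_id")))
    .dropDuplicates(["order_id"])
    .filter(F.col("amount").isNotNull())
)

# One level of abstraction for analytics: daily revenue and customer counts.
daily = clean.groupBy("order_date").agg(
    F.sum("amount").alias("revenue"),
    F.countDistinct("customer_id").alias("customers"),
)

daily.write.mode("overwrite").parquet("/mnt/curated/daily_revenue")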
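Similarly, the following is a minimal sketch of pulling records from a RESTful data API and landing them as flat rows. The endpoint, pagination scheme and field names are placeholders for illustration only.

# Minimal sketch: page through a hypothetical REST endpoint and write flat rows.
import csv
import requests

BASE_URL = "https://api.example.com/v1/measurements"  # placeholder endpoint

def fetch_all(page_size=500):
    """Yield every record, following simple page-number pagination."""
    page = 1
    while True:
        resp = requests.get(
            BASE_URL, params={"page": page, "per_page": page_size}, timeout=30
        )
        resp.raise_for_status()
        batch = resp.json().get("results", [])
        if not batch:
            break
        yield from batch
        page += 1

with open("measurements.csv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=["id", "timestamp", "value"])
    writer.writeheader()
    for record in fetch_all():
        # Keep only the columns downstream jobs expect.
        writer.writerow({k: record.get(k) for k in ("id", "timestamp", "value")})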