Job Description:
Job Duties and Responsibilities:
We are looking for a self-starter to join our Data Engineering team. You will work in a fast-paced environment where you will have the opportunity to build and contribute to the full lifecycle development and maintenance of the data engineering platform.
With the Data Engineering team, you will have the opportunity to:
● Design and implement data engineering solutions that are scalable, reliable, and secure in the cloud environment
● Understand and translate business needs into data engineering
solutions
● Build large-scale data pipelines that can handle big data sets using distributed data processing techniques, supporting the efforts of the data science and data application teams
● Partner with cross-functional stakeholders, including Product Managers, Architects, Data Quality Engineers, and Application and Quantitative Science end users, to deliver engineering solutions
● Contribute to defining data governance across the platform
Basic Requirements:
● A minimum of a BS degree in computer science, software engineering, or a related scientific discipline is desired
● 3+ years of work experience in building scalable and robust data engineering
solutions
● Strong understanding of object-oriented programming and proficiency in Python (TDD) and PySpark to build scalable algorithms
● 3+ years of experience in distributed computing and big data processing using the Apache Spark framework, including Spark optimization techniques
● 2+ years of experience with Databricks, including Delta Lake, Delta tables, Unity Catalog, Delta Sharing, Delta Live Tables (DLT), and incremental data processing
● Advanced SQL coding and query optimization experience, including the ability to write analytical and nested queries
● 3+ years of experience building scalable ETL/ELT data pipelines on Databricks and AWS (EMR)
● 2+ years of experience orchestrating data pipelines using Apache Airflow/MWAA
● Understanding of and experience with AWS services, including ADX, EC2, and S3
● 3+ years of experience with data modeling techniques for structured/unstructured datasets
● Experience with relational/columnar databases (Redshift, RDS) and interactive querying services (Athena, Redshift Spectrum)
● Passion for healthcare and improving patient outcomes
● Demonstrated analytical thinking with strong problem-solving skills
● Stay on top of emerging technologies and possess a willingness to learn
Bonus Experience (optional):
● Experience working in an Agile environment
● Experience operating in a CI/CD environment
● Experience building HTTP/REST APIs using popular frameworks
● Healthcare experience
Location: Pune, Maharashtra, India
Budget: 25,00,000