Job Description
ACG_3024_JOB
Our client is a leading technology institution that is seeking a qualified candidate to join its team.
Design, develop, test, deploy, and monitor data pipelines in Databricks on AWS from various data sources.
Develop, test, deploy, and maintain scalable PySpark and SQL code within Databricks.
Identify opportunities to enhance internal processes through code optimization and automation.
Build data quality dashboards, lineage tracking, and monitoring tools to enable active pipeline monitoring and provide actionable insights on data quality and governance.
Assist in migrating data from legacy systems to newly developed solutions.
Follow and lead best practices related to data security, retention, and privacy policies.
Requirements
Bachelor’s degree in a relevant field.
Minimum of 3 years’ experience developing ETL/ELT pipelines.
Demonstrated skill in designing, developing, and implementing solutions, as well as reporting and analysis.
Skilled in programming with Apache Spark, Python, and SQL.
Familiar with handling data in Text, Delta, Parquet, JSON, CSV, and XML formats.
Knowledgeable in Spark structured streaming.
Experienced in working with AWS infrastructure, especially S3.
Well-versed in Git version control, DevOps methodologies, and CI/CD processes; experience with Atlassian tools is beneficial.
Familiarity with common web API frameworks and web services.
Strong teamwork, relationship-building, and client management skills, with the ability to influence peers and senior management.
Willingness to adopt modern technologies, best practices, and innovative ways of working.
Benefits
Performance bonus up to two months’ salary
Pro-rated 13th month salary
15 days annual leave plus 3 days sick leave, 1 birthday leave, and 1 Christmas leave day
Contact: Nhat Anh Nguyen
Due to the high volume of applications, only shortlisted candidates will be contacted.
Data Engineer • Ho Chi Minh City, Vietnam