Remote Data Engineer
Indeed
Full-time
Onsite
No experience limit
No degree limit
PA239-Parada / Museo Militar, Santiago, Región Metropolitana, Chile
Description

Job Summary: We are seeking a Data Engineer for a remote project, responsible for designing, developing, and documenting ETL solutions for multiple data sources.

Key Responsibilities:
1. Design and implementation of ETL/ELT pipelines.
2. AWS platform configuration for the data lake and orchestration.
3. Documentation and handover of ETL flows.

A technology company is looking for a Data Engineer to work remotely on a project.

**Conditions:**
1. **Rate:** To be agreed upon.
2. **Schedule:** Monday to Friday, 44-hour workweek.
3. **Work location:** Remote.

**Responsibilities:**

Review of documentation and data sources
- Analyze sample files from up to 6 data sources (brokers/vehicles, JSON assets, etc.) and their available metadata.
- Identify key fields, business keys, and requirements for normalization/anonymization.

Technical design of the ETL solution
- Define the common data model for the 6 sources (schemas, data types, partitions, and S3 naming conventions).
- Design the ETL flow.

AWS platform configuration
- Create and/or adjust S3 buckets, folder structures, and basic permissions for the data lake.
- Configure the Glue Catalog (tables and databases) and basic Glue resources for orchestration.

Development of ETL pipelines for up to 6 sources
- **Implement ingestion jobs:** file reading, field typing, error handling.
- **Implement normalization jobs:** column mapping to the standard model, basic enrichments, and generation of curated datasets ready for computation.
- Incorporate minimum data quality rules (mandatory fields, data types, value ranges) and logging of rejected records.

Testing and tuning
- Execute tests with real/sample data across the 6 sources, document incidents, and refine the transformations.
- Measure processing times and review the partitioning structure to optimize subsequent queries.

Documentation and handover
- Document the ETL flows (simple diagrams, job/table descriptions, S3 paths, source-specific rules).
- Conduct a handover session with the client's team to explain how to operate and extend the pipelines.

**Requirements:**
- **Minimum experience:** 3 years as a Data Engineer working with ETL processes.
- **Experience in:** design and implementation of ETL/ELT pipelines (ideally in multi-source consolidation projects).
- **AWS data handling, mandatory:** S3, IAM, and data-oriented compute services (AWS Glue, AWS Lambda, or similar).
- **Desirable:** Athena and/or Redshift for data testing/validation.
- SQL for queries and validations; Python desirable for transformation scripts.
- Experience with data formats such as CSV, Excel, and JSON.
- **Minimum education:** University / Professional Institute / Technical Training Center.

**Salary:** 0 CLP/month.
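For illustration only, the ingestion and data-quality responsibilities described above (field typing, mandatory-field checks, value ranges, logging of rejected records) could be sketched in Python roughly as follows. The field names (`vehicle_id`, `price`) and the range threshold are hypothetical assumptions, not taken from the posting.

```python
import csv
import io

# Hypothetical mandatory fields for one of the 6 sources.
MANDATORY = ["vehicle_id", "price"]

def ingest(raw_csv):
    """Read one CSV source, coerce types, and apply minimal data-quality
    rules, returning (accepted_records, rejected_records_with_reason)."""
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        # Mandatory-field rule: key fields must be present and non-empty.
        if any(not row.get(f) for f in MANDATORY):
            rejected.append((row, "missing mandatory field"))
            continue
        # Typing rule: price must parse as a number.
        try:
            price = float(row["price"])
        except ValueError:
            rejected.append((row, "price is not numeric"))
            continue
        # Value-range rule (illustrative bounds).
        if not (0 < price < 10_000_000):
            rejected.append((row, "price out of range"))
            continue
        accepted.append({"vehicle_id": row["vehicle_id"], "price": price})
    return accepted, rejected
```

In a real Glue or Lambda job the rejected list would typically be written to a quarantine location in S3 rather than held in memory; this sketch only shows the rule structure.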

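The posting also mentions defining S3 naming conventions and a partitioning structure that optimizes later queries. One common approach, sketched here as an assumption rather than the client's actual convention, is Hive-style date partitions under a per-source prefix, which Athena and the Glue Catalog can prune by day; the `curated/` layer name is likewise hypothetical.

```python
from datetime import date

def curated_key(source, ingestion_date, filename):
    """Build an S3 object key with Hive-style partitions, e.g.
    curated/source=<src>/year=YYYY/month=MM/day=DD/<filename>."""
    d = ingestion_date
    return (f"curated/source={source}/"
            f"year={d.year:04d}/month={d.month:02d}/day={d.day:02d}/"
            f"{filename}")

# Example:
# curated_key("brokers", date(2025, 3, 7), "part-000.parquet")
# → "curated/source=brokers/year=2025/month=03/day=07/part-000.parquet"
```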
Source: Indeed
Sofía Muñoz
Indeed · HR

Company

Indeed

© 2025 Servanan International Pte. Ltd.