Data Engineer
Location: On-Site, Plano, TX

Job Description:

Job Title: Data Engineer

Location: Richmond, VA; McLean, VA; or Plano, TX (On-Site)

Key Responsibilities:

  • Create, maintain, and optimize ETL/ELT pipelines to ingest, process, and manage data from various sources using Python, Apache Spark, and AWS services.
  • Design data models, build data structures, and implement data storage solutions that ensure data integrity, consistency, and security.
  • Tune data processing workflows for performance, scalability, and cost efficiency on distributed systems using Spark and AWS.
  • Work with cross-functional teams (e.g., data science, product, analytics) to understand data requirements and support business needs.
  • Document data workflows, processes, and solutions for transparency and reproducibility.
  • Implement data quality checks, error handling, and recovery processes. Ensure compliance with data governance and security protocols.

Key Qualifications:

  • Proficiency in Python for data processing, scripting, and automation.
  • Experience with Spark for data transformation, distributed processing, and ETL workflows.
  • Hands-on experience with core AWS services such as S3, Lambda, Glue, EMR, Redshift, and RDS. Knowledge of IAM for access control and of CloudFormation and/or Terraform for infrastructure management is a plus.
  • Strong understanding of SQL, data warehousing, and database design principles.
  • Familiarity with data modeling, schema design, and query optimization.

Other Skills:

  • Experience with version control (Git) and CI/CD practices.
  • Strong problem-solving skills and ability to work in an Agile environment.
  • Excellent communication skills and ability to work with non-technical stakeholders.

Preferred Qualifications:

  • Familiarity with workflow orchestration tools such as Airflow.
  • Experience with data streaming technologies (e.g., Kafka, Kinesis).

Key Skills:

  • Data engineering, AWS, Airflow