Hadoop Admin
Location: On-Site, Houston, TX

Job Description:

Required Skills/Responsibilities:

Expertise and knowledge: Cloudera Data Platform, Oozie, Hive, Spark, Spark Streaming, and Presto

Data Pipeline Development:

  • Design, develop, and implement scalable data pipelines using Cloudera tools like Hadoop, Spark, Hive, Impala, and HDFS.
  • Write and optimize ETL processes to extract, transform, and load data into data lakes or warehouses.

Big Data Application Development:

  • Develop applications to process large datasets efficiently using frameworks such as Apache Spark and MapReduce.
  • Build solutions for batch and real-time data processing.

Cluster Management:
     
  • Work with Cloudera Manager for cluster setup, configuration, monitoring, and performance optimization.
  • Ensure high availability and scalability of Cloudera clusters.
  • System dimensioning (computational resources, storage, and networks).
  • System reconfiguration in case of hardware extension and/or replacement.
  • OS and Cloudera software upgrades.
  • Cloudera software vulnerability and patch management.
  • Access and permission management.
  • Installation of any other Cloudera application if needed.

Data Storage and Management:

  • Design and implement data storage strategies using HDFS, HBase, and other Cloudera-supported tools.
  • Optimize data storage and retrieval processes to improve performance.

Performance Tuning:
     
  • Monitor and optimize the performance of Hadoop and Spark jobs.
  • Troubleshoot and resolve performance bottlenecks in data pipelines.
  • Assist in designing scalable architectures for high-volume data.
  • Ensure end-to-end (E2E) pipeline stability for already developed and future use cases.
  • Performance tuning of Spark workflows.

Integration and Collaboration:

  • Integrate Cloudera solutions with external systems, databases, and APIs.
  • Collaborate with data scientists, analysts, and other teams to understand requirements and deliver data solutions.

Srinithi / srinithi@vysystems.com

Key Skills:

  • Cloudera Data Platform, Oozie, Hive, Spark, Spark Streaming, Presto, and Hadoop