Hadoop Developer

Responsibilities

  • Lead a team of highly motivated data integration engineers
  • Provide technical advisory and expertise on Analytics subject matter
  • Create, implement, and execute the roadmap for delivering Analytics insights and Machine Learning capabilities
  • Identify useful technology that can be used to fulfill user story requirements from an Analytics perspective
  • Experiment with new technology as an ongoing proof of concept
  • Architect and develop data integration pipelines using a combination of stream and batch processing techniques
  • Integrate multiple data sources using Extraction, Transformation and Loading (ETL)
  • Build data lake and data marts using HDFS, NoSQL and Relational databases
  • Manage multiple Big Data clusters and data storage in the cloud
  • Collect and process event data from multiple application sources, covering both internal Elsevier products and external vendor products
  • Understand data science and work directly with data scientists and machine learning engineers

Required Skills

  • Excellent SQL skills across a range of ANSI compliance levels
  • Basic knowledge of Applied Statistics
  • Advanced knowledge of Systems and Service Architecture
  • Advanced knowledge of Polyglot Persistence and use of RDBMS, In-Memory Key/Value stores, BigTable databases and Distributed File Systems such as HDFS and Amazon S3
  • Extensive knowledge of the Hadoop ecosystem and its components such as HDFS, Kafka, Spark, Flume, Oozie, HBase, Hive
  • Advanced knowledge of ETL/Data Routing and understanding of tools such as NiFi, Kinesis, etc
  • Good understanding of DevOps, SDLC and Agile methodology
  • Software/infrastructure diagrams such as sequence diagrams, UML, and data flow diagrams
  • Requirements Analysis, Planning, Problem Solving, Strategic Planning
  • Excellent Verbal Communication, Self-Motivated with Initiative

Required Experience

  • 8+ years’ experience in software programming using Java, JavaScript, Spring, SQL, etc.
  • 3+ years’ experience in service integration using REST, SOAP, RPC, etc.
  • 3+ years’ experience in Data Management and Data Modeling
  • Industry experience working with large-scale stream processing, batch processing, and data mining
  • Experience with Cloud services such as AWS or Azure
  • Experience with Linux/UNIX systems and the best practices for deploying applications to Hadoop from those environments

Education Requirements

  • Knowledge of the education business domain preferred