Responsibilities
- Lead a team of highly motivated data integration engineers
- Provide technical advice and subject-matter expertise on Analytics
- Create, implement and execute the roadmap for delivering Analytics insights and Machine Learning capabilities
- Experiment with new technologies through ongoing proofs of concept
- Architect and develop data integration pipelines using a combination of stream and batch processing techniques
- Integrate multiple data sources using Extract, Transform, Load (ETL) processes
- Build data lakes and data marts using HDFS, NoSQL and relational databases
- Manage multiple Big Data clusters and data storage in the cloud
- Collect and process event data from multiple application sources, spanning both internal Elsevier and external vendor products
- Understand data science and work directly with data scientists and machine learning engineers
- Identify useful technology that can be used to fulfill user story requirements from an Analytics perspective
Required Skills
- Excellent SQL skills across a range of ANSI compliance levels
- Basic knowledge of Applied Statistics
- Advanced knowledge of Systems and Service Architecture
- Advanced knowledge of Polyglot Persistence and the use of RDBMS, In-Memory Key/Value stores, Bigtable-style wide-column databases and distributed storage such as HDFS and Amazon S3
- Extensive knowledge of the Hadoop ecosystem and related components such as HDFS, Kafka, Spark, Flume, Oozie, HBase and Hive
- Advanced knowledge of ETL/Data Routing and an understanding of tools such as NiFi and Kinesis
- Good understanding of DevOps, SDLC and Agile methodology
- Ability to produce software/infrastructure diagrams such as sequence, UML and data flow diagrams
- Requirements Analysis, Planning, Problem Solving, Strategic Planning
- Excellent verbal communication; self-motivated with initiative
Required Experience
- 8+ years of experience in software programming using Java, JavaScript, Spring, SQL, etc.
- 3+ years of experience in service integration using REST, SOAP, RPC, etc.
- 3+ years of experience in Data Management and Data Modeling
- Industry experience working with large-scale stream processing, batch processing and data mining
- Experience with Cloud services such as AWS or Azure
- Experience with Linux/UNIX systems and best practices for deploying applications to Hadoop from those environments
Education Requirements
- Knowledge of the Education business domain preferred