The successful candidate will be responsible for:

- Creating data feeds from on-premises systems to the AWS Cloud
- Supporting data feeds in production on a break-fix basis
- Creating data marts using Talend or a similar ETL development tool
- Manipulating data using Python and PySpark
- Processing data using the Hadoop paradigm, particularly with Amazon EMR, AWS's distribution of Hadoop
- DevOps for Big Data and Business Intelligence, including automated testing and deployment
- Designing and developing data feeds from an on-premises environment into a data lake in the AWS Cloud
- Designing and developing programmatic transformations of the solution by correctly partitioning, formatting and validating data quality