As a company, FrieslandCampina is driving the transition towards pervasive data-driven decision-making, not only at the strategic level but right down to our 24/7 operations, company-wide. We have been working hard to deliver high-quality business intelligence reports and dashboards that enable descriptive and diagnostic capabilities across all business groups. At the same time, we are focusing our efforts on improving the predictive and forecasting capabilities delivered by our business-embedded data scientists.
The Global Data Platform team supports these analytical efforts and products on a global level. We are building a state-of-the-art AWS cloud-based data landscape, consisting of a data lake and a high-performance data warehouse, built largely on standard AWS services (S3, Redshift, CloudWatch, etc.).
In your capacity as Data Engineer, you ensure that our data pipelines are built to perform and scale. You will be responsible for operating our existing PySpark-based pipelines, which run on standard AWS building blocks such as S3, Lambda and EMR, and for orchestrating those pipelines with Airflow, as well as for driving continuous improvement of our CI/CD pipelines and data access configuration. Besides developing and operating your own pipelines, you will also be expected to review and test those built by other developers.
To succeed in this role, you should have a solid foundational understanding of the data engineer’s toolbox. You will perform a wide range of activities, from exploring new solutions to executing root-cause analysis when production data or CI/CD pipelines break down. You will constantly be challenged to balance assuring an “Always On” production landscape with driving innovation with a fail-fast attitude.
From a technical perspective, you should have demonstrable data engineering skills as well as DevOps/platform engineering skills. Here are some of the skills we expect from you:
- Proven experience with AWS services (ideally certified as AWS Certified Solutions Architect – Associate or AWS Certified Developer – Associate)
- Extensive experience with Python, PySpark, SQL and Linux scripting
- You have a foundational understanding of CI/CD, including the principles of code branching and merging, and can explain how to use AWS CodeCommit and CodePipeline (or at least Git) to build and operate a CI/CD pipeline
- Ideally, you are also familiar with Airflow and know how to build and debug DAGs
- If you already know how to manage access controls using AWS IAM roles coupled with Microsoft Active Directory services, that is a plus. If not, you will learn!
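To give a flavour of the orchestration work mentioned above, here is a minimal sketch of an Airflow DAG wiring together three pipeline steps. All names (the `dag_id`, schedule and task callables) are hypothetical illustrations, not our actual pipelines, and the `schedule` argument assumes Airflow 2.4 or later:

```python
# Hypothetical example of an Airflow DAG; the dag_id, schedule and
# task names are illustrative assumptions, not real pipeline code.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # e.g. pull raw files from S3 into a staging location
    pass


def transform():
    # e.g. run a PySpark job on EMR against the staged data
    pass


def load():
    # e.g. load the transformed output into Redshift
    pass


with DAG(
    dag_id="example_sales_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Task dependencies: extract, then transform, then load
    t_extract >> t_transform >> t_load
```

Debugging a DAG like this typically starts with the per-task logs in the Airflow UI, which is where the root-cause analysis described above often begins.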
As we are an international company, our primary business language is English and we often work with remote teams. This means that you need to bring excellent communication skills and good cultural awareness.