
Junior Data Engineer

Job description

Are you an experienced Data Engineer looking for an international, creative and innovative environment? Would you like to work on a self-service data platform, making sure our data makes its way from a vast array of sources to the right place?

At the IT Department of Randstad Groep Nederland (HQ) we are looking for you! We need a Junior Data Engineer who is available to join our internal team immediately.

Data Engineering at Randstad Groep Nederland (HQ)

As a member of the DataHub Team you are responsible for the development and maintenance of the Randstad data lake and the services offered to data scientists and data analysts.

The DataHub Team uses a variety of technologies, and we are responsible for our own infrastructure.

We provide a platform that distributes data to data scientists and analysts all over the organization, so they can make use of all the data that is generated in the Randstad Group Netherlands.

You will be part of an agile team and play a vital role in the design and development of a cloud-based data platform.

  • Build and manage the DataHub, which includes:

    • A front-end data catalog

    • Management of DataHub users and data science projects

    • Data subscriptions

    • An AWS S3-based data lake

    • etc.

  • Develop ways to improve self-service data consumption and data publishing:

    • Build and manage ETL pipelines in Airflow, which are responsible for ingesting the data and making it available to users

    • Develop standard ways to deliver data in the DataHub

    • Develop CI/CD pipelines for data-consuming teams to let them develop their products

    • etc.

 

You will be responsible for producing quality code and reusable components.

Using containerization, CI/CD and other automation technologies, you will be responsible for creating a backend that is highly available and scalable, while remaining easy to deploy, manageable and secure.

Together with the rest of the team you will be involved in the full product development process, from design and implementation to testing, documentation and automated deployment.

 

Respond to and resolve operational incidents, performing root cause analysis and managing changes required to prevent future occurrences.

In this team you will have a wide range of responsibilities and should be willing to adapt to many different challenges.

Discuss requirements and future improvements with the users of the platform, and also come up with proposals for our users on how to use the platform.

 

Manage and develop our data persistence environments (data lake, storage, etc.) to ensure that data is properly available to users and secured.

Monitor systems for uptime and performance.

 

The data lake we maintain is partly in Redshift, and is moving fully towards S3. The new S3 data lake will be accessed through the Trino query engine that lives on an auto-scaling EKS cluster and ingests raw data via Spark through an EMR cluster that makes use of a fleet of Spot instances. We have created a Django-based metadata catalog that also functions as a portal to monitor the data and to provide services for our consumers.

For general usage we offer data subscriptions through scheduled unloads to a project space on S3. Furthermore, we offer tools to work with machine learning models using SageMaker notebooks.

 

You will be designing and setting up infrastructure on AWS for the expanding services of our platform and developing Airflow DAGs that represent our data pipelines. Most of our coding is done in Python.

 

So what would we like you to know and bring to the table?

  • Experience working in our tech stack, including Python, AWS cloud, Airflow, SQL and Docker;
  • Experience with scripting/automation of tasks;
  • Experience with common DevOps and CI/CD practices to make your own work as easy as possible and guarantee the quality of our products;
  • Being comfortable with working in test-driven development and knowing what this means;
  • Good communication skills;
  • A proactive and energetic mindset;
  • A natural inclination to take the lead;
  • A self-starting & curious personality.

If you really want to impress us, you can do so by having experience in: containerization platforms, Spark, Jenkins, Django, EMR, Kubernetes/EKS, Presto/Trino.
 

What do we offer?

  • Plenty of training and development opportunities within the group. A significant share of our employees have held several roles in their years in the business, with RGN giving you the tools you need to challenge and develop yourself;
  • A very competitive package depending on your experience;
  • 8.5% holiday allowance;
  • A generous monthly benefit budget on top of your salary and holiday pay that you can choose to spend on extra time off or perks such as a bike, tablet or gym subscription, or simply have paid out;
  • 25 days of holiday, with the option to buy an additional 25 days off via the above budget;
  • A generous sabbatical program;
  • A good mobility scheme, laptop and everything you need to perform your job well;
  • An attractive bonus scheme and the option to earn an outperformance bonus twice per year.

 

Does this sound like the right next step for you? Fantastic! Apply directly by clicking apply or contact our Senior Staff Specialist for more information (paul.van.os@randstadgroep.nl | 0651578290).

This is a full-time (40h/week) role. Read more about working at RGN IT and our benefits.

 

Junior Data Engineer

Randstad, Diemen
Contract type: Junior, Permanent
Categories: Data Engineer
Degree level: Master
Career level: Junior