Internship Data Science & Data Engineer - ML in production and streaming data ingestion

By David WORMS

Nov 26, 2019

Context

The exponential growth of data has turned the industry upside down by redefining data storage, processing and ingestion pipelines. Mastering these methods considerably facilitates decision-making and creates new entrepreneurial opportunities. The Internet of Things, or IoT, connects objects to massive storage and processing environments via the Internet. The project consists of building a SaaS platform for collecting and processing streaming data. Depending on the skills and interests of the intern, the subject will focus on processing either video streams or time-series data from sensors.

Objectives

The objective of the internship is to understand the roles of the different actors in a data project (Data Architect, Data Engineer, Data Analyst, Data Scientist…) and to master the DevOps processes and the requirements for putting a machine learning model into production and operating it. The selected project involves working with batch and streaming data, applying Data Science models and gaining in-depth experience with distributed architectures.

Technologies at your disposal

A laptop with the following features:

  • 32GB RAM
  • 1TB SSD
  • 8c/16t CPU

A cluster composed of:

  • 3x 28c/56t Intel Xeon Scalable Gold 6132
  • 3x 192GB RAM DDR4 ECC 2666MHz
  • 3x 14 SSD 480GB SATA Intel S4500 6Gbps

Platforms, components, tools

Kubernetes, Hadoop, NoSQL, Git, LXD, Kafka, Spark, Ceph, Infrastructure as Code…

Environment

Adaltas is a team of consultants with expertise in open source, Big Data and distributed systems. We are present in France, Canada and Morocco. Our Big Data expertise began in 2009 with EDF, supporting the collection of data from its Linky smart meters. Since then, Adaltas has supported major French and international groups in their digital transition and in extracting value from their data. Today, Adaltas is a preferred partner of Cloudera and Databricks, two of the leading vendors in the Big Data ecosystem.

  • Location: Boulogne Billancourt, France
  • Languages: French or English
  • Period: spring-summer 2020

Information

We invite you to contact us if you are interested or if you just want more information.

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to turn their use cases into production projects, reduce their costs and shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.