Kubernetes

Kubernetes is an open source system released in 2015 that allows the orchestration and management of containers on server clusters, whether physical or virtual, public or private.

It can support multiple Linux host machines or servers called nodes, each hosting one or more containers. Kubernetes allows you to create application services on several containers, plan their execution in a cluster, guarantee their integrity and ensure their monitoring.

A server called "Master" will be the conductor of the cluster and will run the containers of one or more nodes according to the availability of resources on each server. Kubernetes therefore provides a routing service between the client and the containers that have deployed the service he’s looking for.

Learn more: Official website; Repository; Twitter
Related tags: Docker

Kubernetes 1.8

Categories: Containers Orchestration, Open Source Summit Europe 2017 | Tags: containerd, CRD, RBAC, Kubernetes, Network, OCI, Release and features

The 1.8 release of Kubernetes brings a lot of new things. With 2500+ pull request, 2000+ commits, 400+ commiters, Kubernetes added 39 new features in this version. This is the richest release in terms…

By Younes YASSINE

Oct 24, 2017

Kubernetes Storage Primitives for Stateful Workloads

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Container Storage Interface (CSI), PVC, Azure, Docker, GCE, Kubernetes, Storage

This article is based on the presentation “Introduction to Kubernetes Storage Primitives for Stateful Workloads” from the OSS Convention Prague 2017 by the {Code} team. So, let’s start, what is…

By Pierre SAUVAGE

Oct 28, 2017

Micro Services

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, DNS, Encryption, gRPC, Istio, Linkerd, Micro Services, MITM, Service Mesh, CNCF, Kubernetes, Proxy, SPOF, SSL/TLS

Back in the days, applications were monolithic and we could use an IP address to access a service. With virtual machines (VM), multiple hosts started to appear on the same machine with multiple apps…

By David WORMS

Nov 14, 2017

Notes after Katacoda Training on Kubernetes Container Orchestration

Categories: Containers Orchestration, Learning | Tags: Helm, Ingress, Kubeadm, CNI, Micro Services, Minikube, Kubernetes

A few weeks ago, I dedicated two days to follow the turorials available on Katacoda, the interactive learning platform for Kubernetes or any other container orchestration platform. I’m sharing my…

By David WORMS

Dec 14, 2017

What's new in Apache Spark 2.3?

Categories: Data Engineering, DataWorks Summit 2018 | Tags: Arrow, PySpark, Tuning, ORC, Spark, Spark MLlib, Data Science, Docker, Kubernetes, pandas, Streaming

Let’s dive into the new features offered by the 2.3 distribution of Apache Spark. This article is a composition of the following talks seen at the DataWorks Summit 2018 and additional research: Apache…

By César BEREZOWSKI

May 23, 2018

TensorFlow on Spark 2.3: The Best of Both Worlds

Categories: Data Science, DataWorks Summit 2018 | Tags: Mesos, C++, CPU, GPU, Tuning, Spark, YARN, JavaScript, Keras, Kubernetes, Machine Learning, Python, TensorFlow

The integration of TensorFlow With Spark has a lot of potential and creates new opportunities. This article is based on a conference seen at the DataWorks Summit 2018 in Berlin. It was about the new…

By Yliess HATI

May 29, 2018

Lando: Deep Learning used to summarize conversations

Categories: Data Science, Learning | Tags: Micro Services, Open API, Deep Learning, Internship, Kubernetes, Neural Network, Node.js

Lando is an application to summarize conversations using Speech To Text to translate the written record of a meeting into text and Deep Learning technics to summarize contents. It allows users to…

By Yliess HATI

Sep 18, 2018

One week to discuss technology in a Moroccan riad

Categories: Adaltas Summit 2018, Learning | Tags: CDSW, Gatsby, React.js, Flink, Hadoop, Knox, Data Science, Deep Learning, Kubernetes, Node.js

Adaltas organise the year its first conference between the 22 and 26 of October. On the agenda of these 5 days of conference: discuss technology in one of the most beautiful riad of Marrakech. Mix the…

By David WORMS

Oct 11, 2018

Apache Flink: past, present and future

Categories: Data Engineering | Tags: Pipeline, Flink, Kubernetes, Machine Learning, SQL, Streaming

Apache Flink is a little gem which deserves a lot more attention. Let’s dive into Flink’s past, its current state and the future it is heading to by following the keynotes and presentations at Flink…

By César BEREZOWSKI

Nov 5, 2018

Microsoft introduces Cloud Native Application Bundles

Categories: Containers Orchestration | Tags: CLI, Helm, Packaging, Docker, Kubernetes

At DockerCon EU 2018 in Barcelona, Matt Butcher, Principal Engineer at Microsoft and inventor of Helm, introduced CNAB, Cloud Native Application Bundles, a packaging format for distributed…

By Arthur BUSSER

Dec 4, 2018

Monitoring a production Hadoop cluster with Kubernetes

Categories: DevOps & SRE | Tags: Thrift, Shinken, Hadoop, Knox, Cluster, Docker, Elasticsearch, Grafana, Kubernetes, Node, Node.js, Prometheus, Python

Monitoring a production grade Hadoop cluster is a real challenge and needs to be constantly evolving. The software we use today is based on Nagios. Very efficient when it comes to the simplest…

By Paul-Adrien CORDONNIER

Dec 21, 2018

LXD: The Missing Piece

Categories: Containers Orchestration | Tags: CPU, Linux, LXD, VM, Docker, Kubernetes

LXD stands for Linux Container Daemon. Yet another container technology. But LXD is very different. It stands apart from the pack. It is not necessarily better nor much faster nor more secure! But it…

By Tariq SAHNOUNI

Dec 28, 2018

Installing Kubernetes on CentOS 7

Categories: Containers Orchestration | Tags: CentOS, cgroups, DevOps, Infrastructure, Namespaces, Red Hat, VM, Ceph, CNCF, Docker, Kubernetes

This article explains how to install a Kubernetes cluster. I will dive into what each step does so you can build a thorough understanding of what is going on. This article is based on my talk from the…

By Arthur BUSSER

Jan 29, 2019

Introduction to Cloudera Data Science Workbench

Categories: Data Science | Tags: Azure, Cloudera, Docker, Git, Kubernetes, Machine Learning, MLOps, Notebook

Cloudera Data Science Workbench is a platform that allows Data Scientists to create, manage, run and schedule data science workflows from their browser. Thus it enables them to focus on their main…

By Mehdi ELALAMI

Feb 28, 2019

Google Cloud Summit Paris Notes

Categories: Events | Tags: AWS, Azure, Cloud, GCP, Kubernetes, On-premises

Google organized its yearly Summit edition 2019 in Paris on the 18th of June. This year’s event was the biggest yet in Paris, which reflect Google’s commitment to position itself in the French market…

By Tariq SAHNOUNI

Jun 26, 2019

Auto-scaling Druid with Kubernetes

Categories: Big Data, Business Intelligence, Containers Orchestration | Tags: Helm, Metrics, OLAP, Operation, Container Orchestration, EC2, Druid, Cloud, CNCF, Data Analytics, Kubernetes, Prometheus, Python

Apache Druid is an open-source analytics data store which could leverage the auto-scaling abilities of Kubernetes due to its distributed nature and its reliance on memory. I was inspired by the talk…

By Leo SCHOUKROUN

Jul 16, 2019

Users and RBAC authorizations in Kubernetes

Categories: Containers Orchestration, Data Governance | Tags: Cyber Security, RBAC, Authentication, Authorization, Kubernetes, SSL/TLS

Having your Kubernetes cluster up and running is just the start of your journey and you now need to operate. To secure its access, user identities must be declared along with authentication and…

By Robert Walid SOARES

Aug 7, 2019

Rook with Ceph doesn't provision my Persistent Volume Claims!

Categories: DevOps & SRE | Tags: PVC, Linux, Rook, Ubuntu, Ceph, Cluster, Internship, Kubernetes

Ceph installation inside Kubernetes can be provisioned using Rook. Currently doing an internship at Adaltas, I was in charge of participating in the setup of a Kubernetes (k8s) cluster. To avoid…

By Eyal CHOJNOWSKI

Sep 9, 2019

Machine Learning model deployment

Categories: Big Data, Data Engineering, Data Science, DevOps & SRE | Tags: DevOps, Operation, AI, Cloud, Machine Learning, MLOps, On-premises, Schema

“Enterprise Machine Learning requires looking at the big picture […] from a data engineering and a data platform perspective,” lectured Justin Norman during the talk on the deployment of Machine…

By Oskar RYNKIEWICZ

Sep 30, 2019

Internship Data Science & Data Engineer - ML in production and streaming data ingestion

Categories: Data Engineering, Data Science | Tags: DevOps, Flink, Hadoop, HBase, Kafka, Spark, Internship, Kubernetes, Python

Context The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitates…

By David WORMS

Nov 26, 2019

InfraOps & DevOps Internship - build a Big Data & Kubernetes PaaS

Categories: Big Data, Containers Orchestration | Tags: DevOps, LXD, Hadoop, Kafka, Spark, Ceph, Internship, Kubernetes, NoSQL

Context The acquisition of a high-capacity cluster is in line with Adaltas’ desire to build a PAAS-type offering to use and to provide Big Data and container orchestration platforms. The platforms are…

By David WORMS

Nov 26, 2019

Hadoop Ozone part 1: an introduction of the new filesystem

Categories: Infrastructure | Tags: HDFS, Ozone, Cluster, Kubernetes

Hadoop Ozone is an object store for Hadoop. It is designed to scale to billions of objects of varying sizes. It is currently in development. The roadmap is available on the project wiki. This article…

By Paul-Adrien CORDONNIER

Dec 3, 2019

Should you move your Big Data and Data Lake to the Cloud

Categories: Big Data, Cloud Computing | Tags: DevOps, AWS, Azure, Cloud, CDP, Databricks, GCP

Should you follow the trend and migrate your data, workflows and infrastructure to GCP, AWS and Azure? During the Strata Data Conference in New-York, a general focus was put on moving customer’s Big…

By Joris RUMMENS

Dec 9, 2019

Hadoop Ozone part 3: advanced replication strategy with Copyset

Categories: Infrastructure | Tags: HDFS, Ozone, Cluster, Kubernetes, Node

Hadoop Ozone provide a way of setting a ReplicationType for every write you make on the cluster. Right now is supported HDFS and Ratis but more advanced replication strategies can be achieved. In this…

By Paul-Adrien CORDONNIER

Dec 3, 2019

Policy enforcing with Open Policy Agent

Categories: Cyber Security, Data Governance | Tags: Kafka, Ranger, Authorization, Cloud, Kubernetes, REST, SSL/TLS

Open Policy Agent is an open-source multi-purpose policy engine. Its main goal is to unify policy enforcement across the cloud native stack. The project was created by Styra and it is currently…

By Leo SCHOUKROUN

Jan 22, 2020

Install and debug Kubernetes inside LXD

Categories: Containers Orchestration | Tags: Debug, Linux, LXD, Docker, Kubernetes, Node

We recently deployed a Kubernetes cluster with the need to maintain clusters isolation on our bare metal nodes across our infrastructure. We knew that Virtual Machines would provide the required…

By Leo SCHOUKROUN

Feb 4, 2020

Optimization of Spark applications in Hadoop YARN

Categories: Data Engineering, Learning | Tags: Tuning, Hadoop, Spark, Python

Apache Spark is an in-memory data processing tool widely used in companies to deal with Big Data issues. Running a Spark application in production requires user-defined resources. This article…

By Ferdinand DE BAECQUE

Mar 30, 2020

Expose a Rook-based Ceph cluster outside of Kubernetes

Categories: Containers Orchestration | Tags: Debug, Rook, Ceph, Docker, Kubernetes

We recently deployed a LXD based Hadoop cluster and we wanted to be able to apply size quotas on some filesystems (ie: service logs, user homes). Quota is a built in feature of the Linux kernel used…

By Leo SCHOUKROUN

Apr 16, 2020

Comparison of different file formats in Big Data

Categories: Big Data, Data Engineering | Tags: Business intelligence, Data structures, Avro, HDFS, ORC, Parquet, Batch processing, Big Data, CSV, JavaScript Object Notation (JSON), Kubernetes, Protocol Buffers

In data processing, there are different types of files formats to store your data sets. Each format has its own pros and cons depending upon the use cases and exists to serve one or several purposes…

By Aida NGOM

Jul 23, 2020

OAuth2 and OpenID Connect, a gentle and working introduction (Part 1)

Categories: Containers Orchestration, Cyber Security | Tags: Go Lang, JAMstack, LDAP, CNCF, Kubernetes, OAuth2, OpenID Connect

Understanding OAuth2, OpenID and OpenID Connect (OIDC), how they relate, how the communications are established, and how to architecture your application with the given access, refresh and id tokens…

By David WORMS

Nov 17, 2020

OAuth2 and OpenID Connect for microservices and public applications (Part 2)

Categories: Containers Orchestration, Cyber Security | Tags: LDAP, Micro Services, CNCF, JavaScript Object Notation (JSON), OAuth2, OpenID Connect

Using OAuth2 and OpenID Connect, it is important to understand how the authorization flow is taking place, who shall call the Authorization Server, how to store the tokens. Moreover, microservices and…

By David WORMS

Nov 20, 2020

Apache Liminal: when MLOps meets GitOps

Categories: Big Data, Containers Orchestration, Data Engineering, Data Science, Tech Radar | Tags: Data Engineering, CI/CD, Data Science, Deep Learning, Deployment, Docker, GitOps, Kubernetes, Machine Learning, MLOps, Open source, Python, TensorFlow

Apache Liminal is an open-source software which proposes a solution to deploy end-to-end Machine Learning pipelines. Indeed it permits to centralize all the steps needed to construct Machine Learning…

By Aargan COINTEPAS

Mar 31, 2021

Adaltas Summit 2021, 2nd edition in corsica

Categories: Adaltas Summit 2021, Learning | Tags: Ansible, Hadoop, Spark, Azure, Blockchain, Deep Learning, Docker, Terraform, Kubernetes, Node.js

For its second edition, the whole Adaltas crew is gathering in Corsica for a whole week with 2 days dedicated to technology the 23rd and the 24th of september 2021. After a year and a half of sanitary…

By David WORMS

Sep 21, 2021

Internship in Web Technologies

Categories: Front End, Learning | Tags: DevOps, LDAP, React.js, CI/CD, Docker, GraphQL, IaC, Internship, Kubernetes, Node.js, OAuth2

Job Description As part of its Big Data activities, Adaltas Academy is an information-sharing platform bringing together articles, training content, and a knowledge base. The users of the platform are…

By David WORMS

Oct 14, 2021

Internship in Data Engineering

Categories: Front End, Learning | Tags: Metrics, Monitoring, Hive, Kafka, Delta Lake, Elasticsearch, IaC, Internship, Kubernetes, Streaming

Job Description Data is a valuable business asset. Some call it the new oil. The data engineer collects, transform and refine raw data into information that can be used by business analysts and data…

By David WORMS

Oct 25, 2021

Spring 2022 internship - building a Data Lab

Categories: Data Science, Learning | Tags: MongoDB, Spark, Argo CD, Elasticsearch, Internship, Keycloak, Kubernetes, OpenID Connect, PostgreSQL

Job Description Over the last few years, we developed the ability to use computers to process large amounts of data. The ecosystem evolved over a large offering of tools and libraries and the creation…

By David WORMS

Nov 24, 2021

GitOps in practice, deploy Kubernetes applications with ArgoCD

Categories: Containers Orchestration, DevOps & SRE, Adaltas Summit 2021 | Tags: Argo CD, CI/CD, Git, GitOps, IaC, Kubernetes

GitOps is a set of practices to deploy applications using Git. Application definitions, configurations, and connectivity are to be stored in a version control software such as Git. Git then serves as…

By Paul-Adrien CORDONNIER

Dec 16, 2021

MinIO object storage within a Kubernetes cluster

Categories: Big Data, Data Governance, Learning | Tags: Amazon S3, Big Data, Cluster, Data Lake, Kubernetes, Storage

MinIO is a popular object storage solution. Often recommended for its simple setup and ease of use, it is not only a great way to get started with object storage: it also provides excellent…

By Luka BIGOT

Jul 9, 2022

Ceph object storage within a Kubernetes cluster with Rook

Categories: Big Data, Data Governance, Learning | Tags: Amazon S3, Big Data, Ceph, Cluster, Data Lake, Kubernetes, Storage

Ceph is a distributed all-in-one storage system. Reliable and mature, its first stable version was released in 2012 and has since then been the reference for open source storage. Ceph’s main perk is…

By Luka BIGOT

Aug 4, 2022

Ingresses and Load Balancers in Kubernetes with MetalLB and nginx-ingress

Categories: Containers Orchestration, Infrastructure, Tech Radar | Tags: Ingress, Kubeadm, Cluster, Deployment, Kubernetes

When it comes to exposing services from a Kubernetes cluster and making it accessible from outside the cluster, the recommended option is to use a load-balancer type service to redirect incoming…

By Kellian COTTART

Sep 8, 2022

Big data infrastructure internship

Categories: Big Data, Data Engineering, DevOps & SRE, Infrastructure | Tags: Infrastructure, Hadoop, Big Data, Cluster, Internship, Kubernetes, TDP

Job description Big Data and distributed computing are at the core of Adaltas. We accompagny our partners in the deployment, maintenance, and optimization of some of the largest clusters in France…

By Stephan BAUM

Dec 2, 2022

Adaltas Summit 2022 Morzine

Categories: Big Data, Adaltas Summit 2022 | Tags: Data Engineering, Infrastructure, Iceberg, Container, Data lakehouse, Docker, Kubernetes

For its third edition, the whole Adaltas crew is gathering in Morzine for a whole week with 2 days dedicated to technology the 15th and the 16Th of september 2022. The speakers choose one of the…

By David WORMS

Jan 13, 2023

Kubernetes: debugging with ephemeral containers

Categories: Containers Orchestration, Tech Radar | Tags: Debug, Kubernetes

Anyone who has ever had to manipulate Kubernetes has found himself confronted with the resolution of pod errors. The methods provided for this purpose are efficient, and allow to overcome the most…

By Pierre BERLAND

Feb 7, 2023

Operating Kafka in Kubernetes with Strimzi

Categories: Big Data, Containers Orchestration, Infrastructure | Tags: Kafka, Big Data, Kubernetes, Open source, Streaming

Kubernetes is not the first platform that comes to mind to run Apache Kafka clusters. Indeed, Kafka’s strong dependency on storage might be a pain point regarding Kubernetes’ way of doing things when…

By Leo SCHOUKROUN

Mar 7, 2023

How to build your OCI images using Buildpacks

Categories: Containers Orchestration, DevOps & SRE | Tags: CI/CD, CNCF, Docker, Kubernetes, OCI

Docker has become the new standard for building your application. In a Docker image we place our source code, its dependencies, some configurations and our application is almost ready to be deployed…

By Paul-Adrien CORDONNIER

Jan 9, 2023

Kubernetes

Related articles