Kubernetes
Kubernetes is an open source system released in 2015 that allows the orchestration and management of containers on server clusters, whether physical or virtual, public or private.
It can support multiple Linux host machines or servers called nodes, each hosting one or more containers. Kubernetes allows you to create application services on several containers, plan their execution in a cluster, guarantee their integrity and ensure their monitoring.
A server called "Master" will be the conductor of the cluster and will run the containers of one or more nodes according to the availability of resources on each server. Kubernetes therefore provides a routing service between the client and the containers that have deployed the service heās looking for.
- Related tags
- Docker
Related articles
Operating Kafka in Kubernetes with Strimzi
Categories: Big Data, Containers Orchestration, Infrastructure | Tags: Kafka, Big Data, Kubernetes, Open source, Streaming
Kubernetes is not the first platform that comes to mind to run Apache Kafka clusters. Indeed, Kafkaās strong dependency on storage might be a pain point regarding Kubernetesā way of doing things whenā¦
Mar 7, 2023
Kubernetes: debugging with ephemeral containers
Categories: Containers Orchestration, Tech Radar | Tags: Debug, Kubernetes
Anyone who has ever had to manipulate Kubernetes has found himself confronted with the resolution of pod errors. The methods provided for this purpose are efficient, and allow to overcome the mostā¦
Feb 7, 2023
Adaltas Summit 2022 Morzine
Categories: Big Data, Adaltas Summit 2022 | Tags: Data Engineering, Infrastructure, Iceberg, Container, Data lakehouse, Docker, Kubernetes
For its third edition, the whole Adaltas crew is gathering in Morzine for a whole week with 2 days dedicated to technology the 15th and the 16Th of september 2022. The speakers choose one of theā¦
By David WORMS
Jan 13, 2023
How to build your OCI images using Buildpacks
Categories: Containers Orchestration, DevOps & SRE | Tags: CI/CD, CNCF, Docker, Kubernetes, OCI
Docker has become the new standard for building your application. In a Docker image we place our source code, its dependencies, some configurations and our application is almost ready to be deployedā¦
Jan 9, 2023
Big data infrastructure internship
Categories: Big Data, Data Engineering, DevOps & SRE, Infrastructure | Tags: Infrastructure, Hadoop, Big Data, Cluster, Internship, Kubernetes, TDP
Job description Big Data and distributed computing are at the core of Adaltas. We accompagny our partners in the deployment, maintenance, and optimization of some of the largest clusters in Franceā¦
By Stephan BAUM
Dec 2, 2022
Ingresses and Load Balancers in Kubernetes with MetalLB and nginx-ingress
Categories: Containers Orchestration, Infrastructure, Tech Radar | Tags: Ingress, Kubeadm, Cluster, Deployment, Kubernetes
When it comes to exposing services from a Kubernetes cluster and making it accessible from outside the cluster, the recommended option is to use a load-balancer type service to redirect incomingā¦
Sep 8, 2022
Ceph object storage within a Kubernetes cluster with Rook
Categories: Big Data, Data Governance, Learning | Tags: Amazon S3, Big Data, Ceph, Cluster, Data Lake, Kubernetes, Storage
Ceph is a distributed all-in-one storage system. Reliable and mature, its first stable version was released in 2012 and has since then been the reference for open source storage. Cephās main perk isā¦
By Luka BIGOT
Aug 4, 2022
MinIO object storage within a Kubernetes cluster
Categories: Big Data, Data Governance, Learning | Tags: Amazon S3, Big Data, Cluster, Data Lake, Kubernetes, Storage
MinIO is a popular object storage solution. Often recommended for its simple setup and ease of use, it is not only a great way to get started with object storage: it also provides excellentā¦
By Luka BIGOT
Jul 9, 2022
GitOps in practice, deploy Kubernetes applications with ArgoCD
Categories: Containers Orchestration, DevOps & SRE, Adaltas Summit 2021 | Tags: Argo CD, CI/CD, Git, GitOps, IaC, Kubernetes
GitOps is a set of practices to deploy applications using Git. Application definitions, configurations, and connectivity are to be stored in a version control software such as Git. Git then serves asā¦
Dec 16, 2021
Spring 2022 internship - building a Data Lab
Categories: Data Science, Learning | Tags: MongoDB, Spark, Argo CD, Elasticsearch, Internship, Keycloak, Kubernetes, OpenID Connect, PostgreSQL
Job Description Over the last few years, we developed the ability to use computers to process large amounts of data. The ecosystem evolved over a large offering of tools and libraries and the creationā¦
By David WORMS
Nov 24, 2021
Internship in Data Engineering
Categories: Front End, Learning | Tags: Metrics, Monitoring, Hive, Kafka, Delta Lake, Elasticsearch, IaC, Internship, Kubernetes, Streaming
Job Description Data is a valuable business asset. Some call it the new oil. The data engineer collects, transform and refine āāraw data into information that can be used by business analysts and dataā¦
By David WORMS
Oct 25, 2021
Internship in Web Technologies
Categories: Front End, Learning | Tags: DevOps, LDAP, React.js, CI/CD, Docker, GraphQL, IaC, Internship, Kubernetes, Node.js, OAuth2
Job Description As part of its Big Data activities, Adaltas Academy is an information-sharing platform bringing together articles, training content, and a knowledge base. The users of the platform areā¦
By David WORMS
Oct 14, 2021
Adaltas Summit 2021, 2nd edition in corsica
Categories: Adaltas Summit 2021, Learning | Tags: Ansible, Hadoop, Spark, Azure, Blockchain, Deep Learning, Docker, Terraform, Kubernetes, Node.js
For its second edition, the whole Adaltas crew is gathering in Corsica for a whole week with 2 days dedicated to technology the 23rd and the 24th of september 2021. After a year and a half of sanitaryā¦
By David WORMS
Sep 21, 2021
Apache Liminal: when MLOps meets GitOps
Categories: Big Data, Containers Orchestration, Data Engineering, Data Science, Tech Radar | Tags: Data Engineering, CI/CD, Data Science, Deep Learning, Deployment, Docker, GitOps, Kubernetes, Machine Learning, MLOps, Open source, Python, TensorFlow
Apache Liminal is an open-source software which proposes a solution to deploy end-to-end Machine Learning pipelines. Indeed it permits to centralize all the steps needed to construct Machine Learningā¦
Mar 31, 2021
OAuth2 and OpenID Connect for microservices and public applications (Part 2)
Categories: Containers Orchestration, Cyber Security | Tags: LDAP, Micro Services, CNCF, JavaScript Object Notation (JSON), OAuth2, OpenID Connect
Using OAuth2 and OpenID Connect, it is important to understand how the authorization flow is taking place, who shall call the Authorization Server, how to store the tokens. Moreover, microservices andā¦
By David WORMS
Nov 20, 2020
OAuth2 and OpenID Connect, a gentle and working introduction (Part 1)
Categories: Containers Orchestration, Cyber Security | Tags: Go Lang, JAMstack, LDAP, CNCF, Kubernetes, OAuth2, OpenID Connect
Understanding OAuth2, OpenID and OpenID Connect (OIDC), how they relate, how the communications are established, and how to architecture your application with the given access, refresh and id tokensā¦
By David WORMS
Nov 17, 2020
Comparison of different file formats in Big Data
Categories: Big Data, Data Engineering | Tags: Business intelligence, Data structures, Avro, HDFS, ORC, Parquet, Batch processing, Big Data, CSV, JavaScript Object Notation (JSON), Kubernetes, Protocol Buffers
In data processing, there are different types of files formats to store your data sets. Each format has its own pros and cons depending upon the use cases and exists to serve one or several purposesā¦
By Aida NGOM
Jul 23, 2020
Expose a Rook-based Ceph cluster outside of Kubernetes
Categories: Containers Orchestration | Tags: Debug, Rook, Ceph, Docker, Kubernetes
We recently deployed a LXD based Hadoop cluster and we wanted to be able to apply size quotas on some filesystems (ie: service logs, user homes). Quota is a built in feature of the Linux kernel usedā¦
Apr 16, 2020
Optimization of Spark applications in Hadoop YARN
Categories: Data Engineering, Learning | Tags: Tuning, Hadoop, Spark, Python
Apache Spark is an in-memory data processing tool widely used in companies to deal with Big Data issues. Running a Spark application in production requires user-defined resources. This articleā¦
Mar 30, 2020
Install and debug Kubernetes inside LXD
Categories: Containers Orchestration | Tags: Debug, Linux, LXD, Docker, Kubernetes, Node
We recently deployed a Kubernetes cluster with the need to maintain clusters isolation on our bare metal nodes across our infrastructure. We knew that Virtual Machines would provide the requiredā¦
Feb 4, 2020
Policy enforcing with Open Policy Agent
Categories: Cyber Security, Data Governance | Tags: Kafka, Ranger, Authorization, Cloud, Kubernetes, REST, SSL/TLS
Open Policy Agent is an open-source multi-purpose policy engine. Its main goal is to unify policy enforcement across the cloud native stack. The project was created by Styra and it is currentlyā¦
Jan 22, 2020
Should you move your Big Data and Data Lake to the Cloud
Categories: Big Data, Cloud Computing | Tags: DevOps, AWS, Azure, Cloud, CDP, Databricks, GCP
Should you follow the trend and migrate your data, workflows and infrastructure to GCP, AWS and Azure? During the Strata Data Conference in New-York, a general focus was put on moving customerās Bigā¦
Dec 9, 2019
Hadoop Ozone part 3: advanced replication strategy with Copyset
Categories: Infrastructure | Tags: HDFS, Ozone, Cluster, Kubernetes, Node
Hadoop Ozone provide a way of setting a ReplicationType for every write you make on the cluster. Right now is supported HDFS and Ratis but more advanced replication strategies can be achieved. In thisā¦
Dec 3, 2019
Hadoop Ozone part 1: an introduction of the new filesystem
Categories: Infrastructure | Tags: HDFS, Ozone, Cluster, Kubernetes
Hadoop Ozone is an object store for Hadoop. It is designed to scale to billions of objects of varying sizes. It is currently in development. The roadmap is available on the project wiki. This articleā¦
Dec 3, 2019
InfraOps & DevOps Internship - build a Big Data & Kubernetes PaaS
Categories: Big Data, Containers Orchestration | Tags: DevOps, LXD, Hadoop, Kafka, Spark, Ceph, Internship, Kubernetes, NoSQL
Context The acquisition of a high-capacity cluster is in line with Adaltasā desire to build a PAAS-type offering to use and to provide Big Data and container orchestration platforms. The platforms areā¦
By David WORMS
Nov 26, 2019
Internship Data Science & Data Engineer - ML in production and streaming data ingestion
Categories: Data Engineering, Data Science | Tags: DevOps, Flink, Hadoop, HBase, Kafka, Spark, Internship, Kubernetes, Python
Context The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitatesā¦
By David WORMS
Nov 26, 2019
Machine Learning model deployment
Categories: Big Data, Data Engineering, Data Science, DevOps & SRE | Tags: DevOps, Operation, AI, Cloud, Machine Learning, MLOps, On-premises, Schema
āEnterprise Machine Learning requires looking at the big picture [ā¦] from a data engineering and a data platform perspective,ā lectured Justin Norman during the talk on the deployment of Machineā¦
Sep 30, 2019
Rook with Ceph doesn't provision my Persistent Volume Claims!
Categories: DevOps & SRE | Tags: PVC, Linux, Rook, Ubuntu, Ceph, Cluster, Internship, Kubernetes
Ceph installation inside Kubernetes can be provisioned using Rook. Currently doing an internship at Adaltas, I was in charge of participating in the setup of a Kubernetes (k8s) cluster. To avoidā¦
Sep 9, 2019
Users and RBAC authorizations in Kubernetes
Categories: Containers Orchestration, Data Governance | Tags: Cyber Security, RBAC, Authentication, Authorization, Kubernetes, SSL/TLS
Having your Kubernetes cluster up and running is just the start of your journey and you now need to operate. To secure its access, user identities must be declared along with authentication andā¦
Aug 7, 2019
Auto-scaling Druid with Kubernetes
Categories: Big Data, Business Intelligence, Containers Orchestration | Tags: Helm, Metrics, OLAP, Operation, Container Orchestration, EC2, Druid, Cloud, CNCF, Data Analytics, Kubernetes, Prometheus, Python
Apache Druid is an open-source analytics data store which could leverage the auto-scaling abilities of Kubernetes due to its distributed nature and its reliance on memory. I was inspired by the talkā¦
Jul 16, 2019
Google Cloud Summit Paris Notes
Categories: Events | Tags: AWS, Azure, Cloud, GCP, Kubernetes, On-premises
Google organized its yearly Summit edition 2019 in Paris on the 18th of June. This yearās event was the biggest yet in Paris, which reflect Googleās commitment to position itself in the French marketā¦
Jun 26, 2019
Introduction to Cloudera Data Science Workbench
Categories: Data Science | Tags: Azure, Cloudera, Docker, Git, Kubernetes, Machine Learning, MLOps, Notebook
Cloudera Data Science Workbench is a platform that allows Data Scientists to create, manage, run and schedule data science workflows from their browser. Thus it enables them to focus on their mainā¦
Feb 28, 2019
Installing Kubernetes on CentOS 7
Categories: Containers Orchestration | Tags: CentOS, cgroups, DevOps, Infrastructure, Namespaces, Red Hat, VM, Ceph, CNCF, Docker, Kubernetes
This article explains how to install a Kubernetes cluster. I will dive into what each step does so you can build a thorough understanding of what is going on. This article is based on my talk from theā¦
Jan 29, 2019
LXD: The Missing Piece
Categories: Containers Orchestration | Tags: CPU, Linux, LXD, VM, Docker, Kubernetes
LXD stands for Linux Container Daemon. Yet another container technology. But LXD is very different. It stands apart from the pack. It is not necessarily better nor much faster nor more secure! But itā¦
Dec 28, 2018
Monitoring a production Hadoop cluster with Kubernetes
Categories: DevOps & SRE | Tags: Thrift, Grafana, Shinken, Hadoop, Knox, Cluster, Docker, Elasticsearch, Kubernetes, Node, Node.js, Prometheus, Python
Monitoring a production grade Hadoop cluster is a real challenge and needs to be constantly evolving. The software we use today is based on Nagios. Very efficient when it comes to the simplestā¦
Dec 21, 2018
Microsoft introduces Cloud Native Application Bundles
Categories: Containers Orchestration | Tags: CLI, Helm, Packaging, Docker, Kubernetes
At DockerCon EU 2018 in Barcelona, Matt Butcher, Principal Engineer at Microsoft and inventor of Helm, introduced CNAB, Cloud Native Application Bundles, a packaging format for distributedā¦
Dec 4, 2018
Apache Flink: past, present and future
Categories: Data Engineering | Tags: Pipeline, Flink, Kubernetes, Machine Learning, SQL, Streaming
Apache Flink is a little gem which deserves a lot more attention. Letās dive into Flinkās past, its current state and the future it is heading to by following the keynotes and presentations at Flinkā¦
Nov 5, 2018
One week to discuss technology in a Moroccan riad
Categories: Adaltas Summit 2018, Learning | Tags: CDSW, Gatsby, React.js, Flink, Hadoop, Knox, Data Science, Deep Learning, Kubernetes, Node.js
Adaltas organise the year its first conference between the 22 and 26 of October. On the agenda of these 5 days of conference: discuss technology in one of the most beautiful riad of Marrakech. Mix theā¦
By David WORMS
Oct 11, 2018
Lando: Deep Learning used to summarize conversations
Categories: Data Science, Learning | Tags: Micro Services, Open API, Deep Learning, Internship, Kubernetes, Neural Network, Node.js
Lando is an application to summarize conversations using Speech To Text to translate the written record of a meeting into text and Deep Learning technics to summarize contents. It allows users toā¦
By Yliess HATI
Sep 18, 2018
TensorFlow on Spark 2.3: The Best of Both Worlds
Categories: Data Science, DataWorks Summit 2018 | Tags: Mesos, C++, CPU, GPU, Tuning, Spark, YARN, JavaScript, Keras, Kubernetes, Machine Learning, Python, TensorFlow
The integration of TensorFlow With Spark has a lot of potential and creates new opportunities. This article is based on a conference seen at the DataWorks Summit 2018 in Berlin. It was about the newā¦
By Yliess HATI
May 29, 2018
What's new in Apache Spark 2.3?
Categories: Data Engineering, DataWorks Summit 2018 | Tags: Arrow, PySpark, Tuning, ORC, Spark, Spark MLlib, Data Science, Docker, Kubernetes, pandas, Streaming
Letās dive into the new features offered by the 2.3 distribution of Apache Spark. This article is a composition of the following talks seen at the DataWorks Summit 2018 and additional research: Apacheā¦
May 23, 2018
Notes after Katacoda Training on Kubernetes Container Orchestration
Categories: Containers Orchestration, Learning | Tags: Helm, Ingress, Kubeadm, CNI, Micro Services, Minikube, Kubernetes
A few weeks ago, I dedicated two days to follow the turorials available on Katacoda, the interactive learning platform for Kubernetes or any other container orchestration platform. Iām sharing myā¦
By David WORMS
Dec 14, 2017
Micro Services
Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, DNS, Encryption, gRPC, Istio, Linkerd, Micro Services, MITM, Service Mesh, CNCF, Kubernetes, Proxy, SPOF, SSL/TLS
Back in the days, applications were monolithic and we could use an IP address to access a service. With virtual machines (VM), multiple hosts started to appear on the same machine with multiple appsā¦
By David WORMS
Nov 14, 2017
Kubernetes Storage Primitives for Stateful Workloads
Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Container Storage Interface (CSI), PVC, Azure, Docker, GCE, Kubernetes, Storage
This article is based on the presentation āIntroduction to Kubernetes Storage Primitives for Stateful Workloadsā from the OSS Convention Prague 2017 by the {Code} team. So, letās start, what isā¦
Oct 28, 2017
Kubernetes 1.8
Categories: Containers Orchestration, Open Source Summit Europe 2017 | Tags: containerd, CRD, RBAC, Kubernetes, Network, OCI, Release and features
The 1.8 release of Kubernetes brings a lot of new things. With 2500+ pull request, 2000+ commits, 400+ commiters, Kubernetes added 39 new features in this version. This is the richest release in termsā¦
Oct 24, 2017