Events
As fervent supporters of and active contributors to the Open Source community, we take part in many meetups and conferences. Each consultant attends at least two international conferences every year. We even organize our own event, which is open to anyone who wishes to join us.
Whenever time allows, we write feedback reports on the events and detailed articles on the technologies presented. These cover new products being introduced and new features planned for upcoming releases.
Latest events coverage
Adaltas Summit 2022 Morzine
Categories: Big Data, Adaltas Summit 2022 | Tags: Data Engineering, Infrastructure, Iceberg, Container, Data lakehouse, Docker, Kubernetes
For its third edition, the whole Adaltas crew is gathering in Morzine for a whole week with 2 days dedicated to technology on the 15th and 16th of September 2022. The speakers choose one of the…
By David WORMS
Jan 13, 2023
WasmEdge: WebAssembly runtimes are coming for the edge
Categories: Containers Orchestration, Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: JAMstack, Linux, Docker, Rust Lang, WebAssembly
With many security challenges solved by design in its core conception, lots of projects benefit from using WebAssembly. WasmEdge runtime is an efficient Virtual Machine optimized for edge computing…
By Guillaume BOUTRY
Sep 29, 2022
Spark on Hadoop integration with Jupyter
Categories: Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: Infrastructure, Jupyter, Spark, YARN, CDP, HDP, Notebook, TDP
For several years, Jupyter notebook has established itself as the notebook solution in the Python universe. Historically, Jupyter is the tool of choice for data scientists who mainly develop in Python…
By Aargan COINTEPAS
Sep 1, 2022
TDP workshop: Become a TDP power user from your terminal
Categories: Events, Learning | Tags: DevOps, Ansible, Hadoop, Open source, TDP
The TDP CLI is used to deploy and operate your TDP services. It relies on tdp-lib to provide control and flexibility at your fingertips. Some time ago, we announced the public release of TDP - Trunk…
By Paul FARAULT
Jun 17, 2022
Databricks logs collection with Azure Monitor at a Workspace Scale
Categories: Cloud Computing, Data Engineering, Adaltas Summit 2021 | Tags: Metrics, Monitoring, Spark, Azure, Databricks, Log4j
Databricks is an optimized data analytics platform based on Apache Spark. Monitoring the Databricks platform is crucial to ensure data quality, job performance, and security issues by limiting access to…
By Claire PLAYE
May 10, 2022
Blockchain 102: Cryptocurrencies, Wallets and DApps
Categories: Adaltas Summit 2021, Infrastructure | Tags: Cryptography, Infrastructure, Blockchain, Consensus
A lot of people own cryptocurrencies today. But holding some tokens on an exchange does not mean interacting with the blockchain. The assets you trade are only numbers stored inside the exchange’s…
By Gauthier LEONARD
Apr 12, 2022
Apache HBase: RegionServers co-location
Categories: Big Data, Adaltas Summit 2021, Infrastructure | Tags: Ambari, Database, Infrastructure, Tuning, Hadoop, HBase, Big Data, HDP, Storage
RegionServers are the processes that manage the storage and retrieval of data in Apache HBase, the non-relational column-oriented database in Apache Hadoop. It is through their daemons that any CRUD…
By Pierre BERLAND
Feb 22, 2022
Blockchain 101: Blockchains and Consensus Mechanisms
Categories: Adaltas Summit 2021, Infrastructure, Learning | Tags: Cryptography, Infrastructure, Blockchain, Consensus
Cryptocurrencies are booming in 2021, with a market cap moving from 750 to more than 3,000 billion dollars. Let’s face it, this is mainly due to speculation. A lot of people involved do not have a…
By Gauthier LEONARD
Jan 18, 2022
GitOps in practice, deploy Kubernetes applications with ArgoCD
Categories: Containers Orchestration, DevOps & SRE, Adaltas Summit 2021 | Tags: Argo CD, CI/CD, Git, GitOps, IaC, Kubernetes
GitOps is a set of practices to deploy applications using Git. Application definitions, configurations, and connectivity are to be stored in a version control system such as Git. Git then serves as…
Dec 16, 2021
Adaltas Summit 2021, 2nd edition in Corsica
Categories: Adaltas Summit 2021, Learning | Tags: Ansible, Hadoop, Spark, Azure, Blockchain, Deep Learning, Docker, Terraform, Kubernetes, Node.js
For its second edition, the whole Adaltas crew is gathering in Corsica for a whole week with 2 days dedicated to technology on the 23rd and 24th of September 2021. After a year and a half of sanitary…
By David WORMS
Sep 21, 2021
Data versioning and reproducible ML with DVC and MLflow
Categories: Data Science, DevOps & SRE, Events | Tags: Data Engineering, Databricks, Delta Lake, Git, Machine Learning, MLflow, Storage
Our talk on data versioning and reproducible Machine Learning, submitted to the Data + AI Summit (formerly known as Spark+AI), has been accepted. The summit will take place online on the 17-19th of November…
Sep 30, 2020
Running Apache Hive 3, new features and tips and tricks
Categories: Big Data, Business Intelligence, DataWorks Summit 2019 | Tags: JDBC, LLAP, Druid, Hadoop, Hive, Kafka, Release and features
Apache Hive 3 brings a bunch of new and nice features to the data warehouse. Unfortunately, like many major FOSS releases, it comes with a few bugs and not much documentation. It has been available since…
By Gauthier LEONARD
Jul 25, 2019
Google Cloud Summit Paris Notes
Categories: Events | Tags: AWS, Azure, Cloud, GCP, Kubernetes, On-premises
Google organized its yearly Summit edition 2019 in Paris on the 18th of June. This year’s event was the biggest yet in Paris, which reflects Google’s commitment to position itself in the French market…
By Tariq SAHNOUNI
Jun 26, 2019
Gatsby.js, React and GraphQL for documentation websites
Categories: Adaltas Summit 2018, Front End | Tags: Gatsby, HTTP, JAMstack, React.js, SEO, API, GitOps, GraphQL, JavaScript, Markdown, Node.js
In the last few months, I have started to redesign some of our Open Source project websites. This includes the websites of the Node.js CSV project, the Node.js HBase client and the Nikita project, our…
By David WORMS
Apr 1, 2019
Apache Knox made easy!
Categories: Big Data, Cyber Security, Adaltas Summit 2018 | Tags: LDAP, Active Directory, Knox, Ranger, Kerberos, REST
Apache Knox is the secure entry point of a Hadoop cluster, but can it also be the entry point for my REST applications? Apache Knox overview: Apache Knox is an application gateway for interacting in a…
By Michael HATOUM
Feb 4, 2019
CodaLab – Data Science competitions
Categories: Data Science, Adaltas Summit 2018, Learning | Tags: Database, Infrastructure, Machine Learning, MySQL, Node.js, Python
CodaLab Competition is a platform for code execution in the field of Data Science. It is a web interface on which a user can submit code or results and compare themselves to others. Let’s see how it…
Dec 17, 2018
Native modules for Node.js with N-API
Categories: Adaltas Summit 2018, Front End | Tags: C++, NPM, JavaScript, Kerberos, Node.js
How to create native modules for Node.js? How to use N-API, the future of native addons development? Writing C/C++ addons is a useful and powerful feature of the Node.js runtime. Let’s explore them…
By Xavier HERMAND
Dec 12, 2018
Hadoop cluster takeover with Apache Ambari
Categories: Big Data, DevOps & SRE, Adaltas Summit 2018 | Tags: Ambari, Automation, iptables, Nikita, Systemd, Cluster, HDP, Kerberos, Node, Node.js, REST
We recently migrated a large production Hadoop cluster from a “manual” automated install to Apache Ambari; we called this the Ambari Takeover. This is a risky process and we will detail why this…
By Leo SCHOUKROUN
Nov 15, 2018
One week to discuss technology in a Moroccan riad
Categories: Adaltas Summit 2018, Learning | Tags: CDSW, Gatsby, React.js, Flink, Hadoop, Knox, Data Science, Deep Learning, Kubernetes, Node.js
Adaltas is organizing its first conference this year, from the 22nd to the 26th of October. On the agenda for these 5 days of conference: discussing technology in one of the most beautiful riads of Marrakech. Mix the…
By David WORMS
Oct 11, 2018
Apache Hadoop YARN 3.0 – State of the union
Categories: Big Data, DataWorks Summit 2018 | Tags: GPU, Hortonworks, Hadoop, HDFS, MapReduce, YARN, Cloudera, Data Science, Docker, Release and features
This article covers the “Apache Hadoop YARN: state of the union” talk given by Wangda Tan from Hortonworks during the DataWorks Summit 2018. What is Apache YARN? As a reminder, YARN is one of the two…
By Lucas BAKALIAN
May 31, 2018
Accelerating query processing with materialized views in Apache Hive
Categories: Business Intelligence, DataWorks Summit 2018 | Tags: Calcite, OLAP, Druid, Hive, Release and features, SQL
The new materialized view feature is coming in Apache Hive 3.0. Jesus Camacho Rodriguez from Hortonworks gave a talk, “Accelerating query processing with materialized views in Apache Hive”, about it…
May 31, 2018
YARN and GPU Distribution for Machine Learning
Categories: Data Science, DataWorks Summit 2018 | Tags: GPU, YARN, Machine Learning, Neural Network, Storage
This article goes over the fundamental principles of Machine Learning and what tools are currently used to run machine learning algorithms. We will then see how a resource manager such as YARN can be…
By Grégor JOUET
May 30, 2018
TensorFlow on Spark 2.3: The Best of Both Worlds
Categories: Data Science, DataWorks Summit 2018 | Tags: Mesos, C++, CPU, GPU, Tuning, Spark, YARN, JavaScript, Keras, Kubernetes, Machine Learning, Python, TensorFlow
The integration of TensorFlow with Spark has a lot of potential and creates new opportunities. This article is based on a talk attended at the DataWorks Summit 2018 in Berlin. It was about the new…
By Yliess HATI
May 29, 2018
Apache Metron in the Real World
Categories: Cyber Security, DataWorks Summit 2018 | Tags: Algorithm, NiFi, Solr, Storm, pcap, RDBMS, HDFS, Kafka, Metron, Spark, Data Science, Elasticsearch, SQL
Apache Metron is a storage and analytic platform specialized in cyber security. This talk was about demonstrating the usages and capabilities of Apache Metron in the real world. The presentation was…
By Michael HATOUM
May 29, 2018
Running Enterprise Workloads in the Cloud with Cloudbreak
Categories: Big Data, Cloud Computing, DataWorks Summit 2018 | Tags: Cloudbreak, Operation, Hadoop, AWS, Azure, GCP, HDP, OpenStack
This article is based on Peter Darvasi and Richard Doktorics’ talk Running Enterprise Workloads in the Cloud at the DataWorks Summit 2018 in Berlin. It presents Hortonworks’ automated deployment tool…
By Joris RUMMENS
May 28, 2018
Omid: Scalable and highly available transaction processing for Apache Phoenix
Categories: Big Data, DataWorks Summit 2018 | Tags: Omid, Phoenix, Transaction, ACID, HBase, SQL
Apache Omid provides a transactional layer on top of key/value NoSQL databases. In practice, it is usually used on top of Apache HBase. Credits to Ohad Shacham for his talk and his work for Apache…
By Xavier HERMAND
May 24, 2018
Apache Beam: a unified programming model for data processing pipelines
Categories: Data Engineering, DataWorks Summit 2018 | Tags: Apex, Beam, Pipeline, Flink, Spark
In this article, we will review the concepts, the history and the future of Apache Beam, which may well become the new standard for defining data processing pipelines. At DataWorks Summit 2018 in…
By Gauthier LEONARD
May 24, 2018
Present and future of Hadoop workflow scheduling: Oozie 5.x
Categories: Big Data, DataWorks Summit 2018 | Tags: Hadoop, Hive, Oozie, Sqoop, CDH, HDP, REST
During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Oozie. It covered the new features released in Oozie 5.0, including future features of…
By Leo SCHOUKROUN
May 23, 2018
What's new in Apache Spark 2.3?
Categories: Data Engineering, DataWorks Summit 2018 | Tags: Arrow, PySpark, Tuning, ORC, Spark, Spark MLlib, Data Science, Docker, Kubernetes, pandas, Streaming
Let’s dive into the new features offered by the 2.3 release of Apache Spark. This article is a compilation of the following talks attended at the DataWorks Summit 2018 and additional research: Apache…
By César BEREZOWSKI
May 23, 2018
Scaling massive, real-time data pipelines with Go
Categories: Open Source Summit Europe 2017, Learning | Tags: Algorithm, Data structures, Go Lang, Pipeline, Protocols, Network
Last week at the Open Source Summit in Prague, Jean de Klerk gave a talk called Scaling massive, real-time data pipelines with Go. This article goes over the main points of the talk, detailing the…
By Arthur BUSSER
Nov 21, 2017
Mesos Introduction
Categories: Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, GPU, Container Orchestration, CUDA, Data Science, Docker
Apache Mesos is an open source cluster management project designed to implement and optimize distributed systems. Mesos enables the management and sharing of resources in a fine-grained and dynamic way…
By Louis BIANCHERIN
Nov 15, 2017
Micro Services
Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, DNS, Encryption, gRPC, Istio, Linkerd, Micro Services, MITM, Service Mesh, CNCF, Kubernetes, Proxy, SPOF, SSL/TLS
Back in the day, applications were monolithic and we could use an IP address to access a service. With virtual machines (VM), multiple hosts started to appear on the same machine with multiple apps…
By David WORMS
Nov 14, 2017
Lightweight containerization with Tupperware
Categories: Containers Orchestration, Open Source Summit Europe 2017, Infrastructure | Tags: Btrfs, LXD, Red Hat, Systemd, Zookeeper, Cloud, Consensus
In this article, I will present a lightweight containerization system set up by Facebook called Tupperware. What is Tupperware? Tupperware is a homemade framework written and used internally at Facebook…
By Lucas BAKALIAN
Nov 3, 2017
Multi-Repo, Multi-Node Gating at Massive Scale
Categories: Cloud Computing, DevOps & SRE, Open Source Summit Europe 2017 | Tags: Infrastructure, Jenkins, Red Hat, Zuul, Ansible, CI/CD, OpenStack
This is a recap and personal review of Monty Taylor’s presentation of OpenStack’s Continuous Integration tool Zuul at the Open Source Summit 2017 in Prague (not to be confused with Netflix’s Zuul project…
By Joris RUMMENS
Oct 28, 2017
Apache Thrift vs REST
Categories: DevOps & SRE, Open Source Summit Europe 2017 | Tags: Thrift, gRPC, HTTP, JavaScript Object Notation (JSON), REST
Adaltas recently attended the Open Source Summit Europe 2017 in Prague. I had the opportunity to follow a presentation made by Randy Abernethy and Jens Geyer of RM-X, a cloud native consulting company…
By Leo SCHOUKROUN
Oct 28, 2017
Kubernetes Storage Primitives for Stateful Workloads
Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Container Storage Interface (CSI), PVC, Azure, Docker, GCE, Kubernetes, Storage
This article is based on the presentation “Introduction to Kubernetes Storage Primitives for Stateful Workloads” from the OSS Convention Prague 2017 by the {Code} team. So, let’s start, what is…
By Pierre SAUVAGE
Oct 28, 2017
Nobody* puts Java in a Container
Categories: Containers Orchestration, Open Source Summit Europe 2017, Infrastructure | Tags: cgroups, Java, JRE, JVM, Namespaces, Docker
This talk was about the issues of putting Java in a container and how, in its latest version, the JDK is now more aware of the container it is running in. The presentation is led by Joerg Schad…
Oct 28, 2017
From Dockerfile to Ansible Containers
Categories: Containers Orchestration, DevOps & SRE, Open Source Summit Europe 2017 | Tags: pip, Shell, Ansible, Docker, Docker Compose, YAML
This talk was an introduction to the Dockerfile format and to the Ansible Container tool, followed by a comparison of both. It was given by Tomas Tomecek from Red Hat’s containerization team. The Dockerfile…
By César BEREZOWSKI
Oct 25, 2017
Kubernetes 1.8
Categories: Containers Orchestration, Open Source Summit Europe 2017 | Tags: containerd, CRD, RBAC, Kubernetes, Network, OCI, Release and features
The 1.8 release of Kubernetes brings a lot of new things. With 2500+ pull requests, 2000+ commits, 400+ committers, Kubernetes added 39 new features in this version. This is the richest release in terms…
By Younes YASSINE
Oct 24, 2017
Cloudera Sessions Paris 2017
Categories: Big Data, Events | Tags: Altus, CDSW, SDX, EC2, Azure, Cloudera, CDH, Data Science, PaaS
Adaltas was at the Cloudera Sessions on October 5, where Cloudera showcased their new products and offerings. Below you’ll find a summary of what we witnessed. Note: the information was aggregated in…
By César BEREZOWSKI
Oct 16, 2017
Apache Apex with Apache SAMOA
Categories: Data Science, Events, Tech Radar | Tags: Apex, Samoa, Storm, Tools, Flink, Hadoop, Machine Learning
Traditional Machine Learning Batch Oriented Supervised - most common Training and Scoring One time model building Data set Training: Model building Holdout: Parameter tuning Test: Accuracy Online…
By Pierre SAUVAGE
Jul 17, 2016
Apache Apex: next gen Big Data analytics
Categories: Data Science, Events, Tech Radar | Tags: Apex, Storm, Tools, Flink, Hadoop, Kafka, Data Science, Machine Learning
Below is a compilation of my notes taken during the presentation of Apache Apex by Thomas Weise from DataTorrent, the company behind Apex. Introduction Apache Apex is an in-memory distributed parallel…
By César BEREZOWSKI
Jul 17, 2016