
Related articles

Announcing Mecano, a set of functions for system deployment


Categories: DevOps & SRE, Node.js | Tags: Automation, Infrastructure, CoffeeScript, JavaScript, Open source

Update July 2016: Mecano is now renamed Nikita. We are releasing Node Mecano on GitHub, which gathers common functions used while deploying systems. The idea was to group those functions into a…


By David WORMS

Feb 12, 2012

Remote connection with SSH


Categories: Cyber Security | Tags: Automation, HTTP, SSH

While teaching Big Data and Hadoop, a student asked me about SSH and how to use it. I'll discuss the protocol and the tools to benefit from it. Lately, I automate the deployment of Hadoop clusters…


By David WORMS

Oct 2, 2013

Ambari - How to blueprint


Categories: Big Data, DevOps & SRE | Tags: Ambari, Automation, DevOps, Operation, Ranger, REST

As infrastructure engineers at Adaltas, we deploy Hadoop clusters. A lot of them. Let's see how to automate this process with REST requests. While really handy for deploying one or two clusters, the…



Jan 17, 2018

Hadoop cluster takeover with Apache Ambari


Categories: Big Data, DevOps & SRE, Adaltas Summit 2018 | Tags: Ambari, Automation, iptables, Nikita, Systemd, Cluster, HDP, Kerberos, Node, Node.js, REST

We recently migrated a large production Hadoop cluster from a "manual" automated install to Apache Ambari; we called this the Ambari Takeover. This is a risky process and we will detail why this…



Nov 15, 2018

Jumbo, the Hadoop cluster bootstrapper


Categories: Infrastructure | Tags: Ambari, Automation, Ansible, Cluster, Vagrant, HDP, REST

Introducing Jumbo, a Hadoop cluster bootstrapper for developers. Jumbo helps you deploy development environments for Big Data technologies. It takes a few minutes to get a custom virtualized Hadoop…


By Gauthier LEONARD

Nov 29, 2018

Internship Data Science & Data Engineer - ML in production and streaming data ingestion


Categories: Data Engineering, Data Science | Tags: DevOps, Flink, Hadoop, HBase, Kafka, Spark, Internship, Kubernetes, Python

Context: The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitates…


By David WORMS

Nov 26, 2019

InfraOps & DevOps Internship - build a Big Data & Kubernetes PaaS


Categories: Big Data, Containers Orchestration | Tags: DevOps, LXD, Hadoop, Kafka, Spark, Ceph, Internship, Kubernetes, NoSQL

Context: The acquisition of a high-capacity cluster is in line with Adaltas' desire to build a PaaS-type offering to use and to provide Big Data and container orchestration platforms. The platforms are…


By David WORMS

Nov 26, 2019

JS monorepos in prod 4: unit testing with Mocha and Should.js


Categories: DevOps & SRE, Front End | Tags: Automation, CI/CD, Git, GitOps, Monorepo, Node.js, Unit tests

Unit testing is essential for every long-term project and allows you to break down the functionality of your code into isolated testable units. Indeed, the main goal of a unit test is to verify if an…


By David WORMS

Feb 25, 2021

Faster model development with H2O AutoML and Flow


Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python

Building Machine Learning (ML) models is a time-consuming process. It requires expertise in statistics, ML algorithms, and programming. On top of that, it also requires the ability to translate a…

Storage size and generation time in popular file formats


Categories: Data Engineering, Data Science | Tags: Avro, HDFS, Hive, ORC, Parquet, Big Data, Data Lake, File Format, JavaScript Object Notation (JSON)

Choosing an appropriate file format is essential, whether your data transits on the wire or is stored at rest. Each file format comes with its own advantages and disadvantages. We covered them in a…


By Barthelemy NGOM

Mar 22, 2021

H2O in practice: a Data Scientist feedback


Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python

Automated machine learning (AutoML) platforms are gaining popularity and becoming a new important tool in the data scientists' toolbox. A few months ago, I introduced H2O, an open-source platform for…

H2O in practice: a protocol combining AutoML with traditional modeling approaches


Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python, XGBoost

H2O comes with a lot of functionalities. The second part of the series H2O in practice proposes a protocol to combine AutoML modeling with traditional modeling and optimization approaches. The objective…

Local development environments with Terraform + LXD


Categories: Containers Orchestration, DevOps & SRE | Tags: Automation, DevOps, KVM, LXD, Virtualization, VM, Terraform, Vagrant

As a Big Data Solutions Architect and InfraOps, I need development environments to install and test software. They have to be configurable, flexible, and performant. Working with distributed systems…


By Gauthier LEONARD

Jun 1, 2023

Canada - Morocco - France

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and how to shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.
