Automation
Related articles

Local development environments with Terraform + LXD
Categories: Containers Orchestration, DevOps & SRE | Tags: Automation, DevOps, KVM, LXD, Virtualization, VM, Terraform, Vagrant
As a Big Data Solutions Architect and InfraOps, I need development environments to install and test software. They have to be configurable, flexible, and performant. Working with distributed systems…
Jun 1, 2023

H2O in practice: a protocol combining AutoML with traditional modeling approaches
Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python, XGBoost
H2O comes with a lot of functionality. The second part of the series H2O in practice proposes a protocol to combine AutoML modeling with traditional modeling and optimization approaches. The objective…
Nov 12, 2021

H2O in practice: a Data Scientist feedback
Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python
Automated machine learning (AutoML) platforms are gaining popularity and becoming an important new tool in the data scientists’ toolbox. A few months ago, I introduced H2O, an open-source platform for…
Sep 29, 2021

Storage size and generation time in popular file formats
Categories: Data Engineering, Data Science | Tags: Avro, HDFS, Hive, ORC, Parquet, Big Data, Data Lake, File Format, JavaScript Object Notation (JSON)
Choosing an appropriate file format is essential, whether your data transits on the wire or is stored at rest. Each file format comes with its own advantages and disadvantages. We covered them in a…
Mar 22, 2021

JS monorepos in prod 4: unit testing with Mocha and Should.js
Categories: DevOps & SRE, Front End | Tags: Automation, CI/CD, Git, GitOps, Monorepo, Node.js, Unit tests
Unit testing is essential for every long-term project and allows you to break down the functionality of your code into isolated, testable units. Indeed, the main goal of a unit test is to verify if an…
By David WORMS
Feb 25, 2021

Faster model development with H2O AutoML and Flow
Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python
Building Machine Learning (ML) models is a time-consuming process. It requires expertise in statistics, ML algorithms, and programming. On top of that, it also requires the ability to translate a…
Dec 10, 2020

InfraOps & DevOps Internship - build a Big Data & Kubernetes PaaS
Categories: Big Data, Containers Orchestration | Tags: DevOps, LXD, Hadoop, Kafka, Spark, Ceph, Internship, Kubernetes, NoSQL
Context: The acquisition of a high-capacity cluster is in line with Adaltas’ desire to build a PaaS-type offering to use and to provide Big Data and container orchestration platforms. The platforms are…
By David WORMS
Nov 26, 2019

Internship Data Science & Data Engineer - ML in production and streaming data ingestion
Categories: Data Engineering, Data Science | Tags: DevOps, Flink, Hadoop, HBase, Kafka, Spark, Internship, Kubernetes, Python
Context: The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitates…
By David WORMS
Nov 26, 2019

Jumbo, the Hadoop cluster bootstrapper
Categories: Infrastructure | Tags: Ambari, Automation, Ansible, Cluster, Vagrant, HDP, REST
Introducing Jumbo, a Hadoop cluster bootstrapper for developers. Jumbo helps you deploy development environments for Big Data technologies. It takes a few minutes to get a custom virtualized Hadoop…
Nov 29, 2018

Hadoop cluster takeover with Apache Ambari
Categories: Big Data, DevOps & SRE, Adaltas Summit 2018 | Tags: Ambari, Automation, iptables, Nikita, Systemd, Cluster, HDP, Kerberos, Node, Node.js, REST
We recently migrated a large production Hadoop cluster from a “manual” automated install to Apache Ambari; we called this the Ambari Takeover. This is a risky process and we will detail why this…
Nov 15, 2018

Ambari - How to blueprint
Categories: Big Data, DevOps & SRE | Tags: Ambari, Automation, DevOps, Operation, Ranger, REST
As infrastructure engineers at Adaltas, we deploy Hadoop clusters. A lot of them. Let’s see how to automate this process with REST requests. While really handy for deploying one or two clusters, the…
Jan 17, 2018

Remote connection with SSH
Categories: Cyber Security | Tags: Automation, HTTP, SSH
While I was teaching Big Data and Hadoop, a student asked me about SSH and how to use it. I’ll discuss the protocol and the tools to benefit from it. Lately, I automate the deployment of Hadoop clusters…
By David WORMS
Oct 2, 2013

Announcing Mecano, a set of functions for system deployment
Categories: DevOps & SRE, Node.js | Tags: Automation, Infrastructure, CoffeeScript, JavaScript, Open source
Update July 2016: Mecano is now renamed Nikita. We are releasing Node Mecano on GitHub, which gathers common functions used while deploying systems. The idea was to group those functions into a…
By David WORMS
Feb 12, 2012