DevOps and Site Reliability Engineering (SRE)

DevOps is a part of corporate culture: a set of principles that a company aspires to and follows over the long term. Supporters of this culture value collaboration, the joy of experimenting and the willingness to learn. Everyone involved in a DevOps culture focuses on one goal throughout the entire software delivery lifecycle (not just development and operations): the rapid delivery of stable, high-quality software, from concept to customer or user.

The automation of software development, testing and deployment through Continuous Delivery (CD) is widely recognized as a key factor in DevOps. Automation enables faster software delivery and helps ensure that solutions have the quality, security and stability they need.

SRE objectives

Defining and contributing to:

  • Service Level Indicator (SLI)
  • Service Level Objective (SLO)
  • Service Level Agreement (SLA)
  • Service risk, level of availability and error budget
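
The relationship between an SLI, an SLO and the error budget listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `error_budget`, the request-based availability SLI and the 99.9% target are hypothetical choices, not values taken from this page.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Compare a measured availability SLI against an SLO target.

    Assumption: the SLI is the fraction of successful requests, and the
    error budget is the number of failures the SLO tolerates in the window.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # SLI: what was actually measured.
    sli = (total_requests - failed_requests) / total_requests
    # Error budget: failures allowed by the SLO over the same window.
    allowed_failures = (1 - slo_target) * total_requests
    return {
        "sli": sli,
        "slo_met": sli >= slo_target,
        "allowed_failures": allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
    }


# Hypothetical window: one million requests, 400 of them failed.
report = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
print(report)
```

With these example numbers the SLO tolerates about 1,000 failed requests, so roughly 600 remain in the budget. An SLA would typically attach contractual consequences to a looser target than the internal SLO.
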
SRE collaboration

Works together with the application developers:

  • Change management
  • Set common goals
  • Ensure production delivery
  • Improve system reliability

SRE responsibilities

Involved and responsible for:

  • Monitoring and alerting
  • Capacity planning and availability
  • Latency, performance and efficiency
  • Emergency response and automation
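
As a rough illustration of the monitoring, alerting and latency responsibilities above, the following Python sketch checks a latency percentile against a threshold. The helper names `percentile` and `should_alert`, the nearest-rank method and the 500 ms threshold are assumptions made for the example. A production setup would normally rely on a dedicated monitoring system rather than hand-written checks.

```python
import math


def percentile(samples, pct):
    """Return the nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def should_alert(latencies_ms, threshold_ms, pct=99):
    """Signal an alert when the chosen latency percentile exceeds the threshold."""
    return percentile(latencies_ms, pct) > threshold_ms


# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies = [120, 95, 110, 130, 105, 900, 115, 100, 125, 98]
print(percentile(latencies, 50), should_alert(latencies, threshold_ms=500))
```

Alerting on a high percentile rather than on the average keeps a single slow outlier visible, which an average over many fast requests would hide.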

Articles related to DevOps

Machine Learning model deployment

Categories: Big Data, Data Engineering, Data Science, DevOps & SRE | Tags: AI, Cloud, DevOps, Machine Learning, On-premise, Operation, Schema

“Enterprise Machine Learning requires looking at the big picture … from a data engineering and a data platform perspective,” lectured Justin Norman during the talk on the deployment of Machine…

By Oskar RYNKIEWICZ

Sep 30, 2019

Rook with Ceph doesn't provision my Persistent Volume Claims!

Categories: DevOps & SRE | Tags: Kubernetes, PVC, Linux, Rook, Ubuntu, Ceph

Ceph installation inside Kubernetes can be provisioned using Rook. Currently doing an internship at Adaltas, I was in charge of participating in the setup of a Kubernetes (k8s) cluster. To avoid…

By Eyal CHOJNOWSKI

Sep 9, 2019

Spark Streaming part 3: DevOps, tools and tests for Spark applications

Categories: Big Data, Data Engineering, DevOps & SRE | Tags: Spark, Apache Spark Streaming, DevOps, Learning and tutorial

Whenever services are unavailable, businesses experience large financial losses. Spark Streaming applications can break, like any other software application. A streaming application operates on data…

By Oskar RYNKIEWICZ

Jun 19, 2019

Monitoring a production Hadoop cluster with Kubernetes

Categories: DevOps & SRE | Tags: Knox, Thrift, Docker, Elasticsearch, Grafana, Kubernetes, Node.js, Prometheus, Python, Shinken, Hadoop

Monitoring a production grade Hadoop cluster is a real challenge and needs to be constantly evolving. The software we use today is based on Nagios. Very efficient when it comes to the simplest…

By Paul-Adrien CORDONNIER

Dec 21, 2018

Hadoop cluster takeover with Apache Ambari

Categories: Big Data, DevOps & SRE, Adaltas Summit 2018 | Tags: Ambari, Automation, HDP, iptables, Kerberos, Nikita, Node.js, REST, Systemd

We recently migrated a large production Hadoop cluster from a “manual” automated install to Apache Ambari. We called this the Ambari Takeover. This is a risky process and we will detail why this…

By Schoukroun LEO

Nov 15, 2018

KVM machines for Vagrant on Archlinux

Categories: DevOps & SRE | Tags: Arch Linux, KVM, Linux, Vagrant, Virtualization, VM

Vagrant supports different providers to manage virtualization. In a Linux environment, you can dramatically improve VM performance by using the libvirt provider and the KVM hypervisor. This tutorial…

By Gauthier LEONARD

Sep 19, 2018

Publishing guidelines

Categories: DevOps & SRE | Tags: Arch Linux, KVM, Markdown, Vagrant, VM

This is as much a set of guidelines targeting everyone publishing content on the web as rules for reviewers to ensure no validation is forgotten before submitting for publication. It mostly targets…

By David WORMS

Feb 26, 2018

Ambari - How to blueprint

Categories: Big Data, DevOps & SRE | Tags: Ambari, Ranger, Automation, CDH, DevOps, HDP, Operation, REST

As infrastructure engineers at Adaltas, we deploy Hadoop clusters. A lot of them. Let’s see how to automate this process with REST requests. While really handy for deploying one or two clusters, the…

By Joris RUMMENS

Jan 17, 2018

Apache Thrift vs REST

Categories: DevOps & SRE, Open Source Summit Europe 2017 | Tags: Thrift, GRPC, HTTP, JSON, REST

Adaltas recently attended the Open Source Summit Europe 2017 in Prague. I had the opportunity to follow a presentation made by Randy Abernethy and Jens Geyer of RM-X, a cloud native consulting company…

By Schoukroun LEO

Oct 28, 2017

From Dockerfile to Ansible Containers

Categories: Containers Orchestration, DevOps & SRE, Open Source Summit Europe 2017 | Tags: Ansible, Docker, Docker Compose, pip, Shell, YAML

This talk was an introduction to the Dockerfile format and to the Ansible Container tool, followed by a comparison of both. It was held by Tomas Tomecek from Red Hat’s containerization team. The Dockerfile…

By César BEREZOWSKI

Oct 25, 2017

Multi-Repo, Multi-Node Gating at Massive Scale

Categories: Cloud Computing, DevOps & SRE, Open Source Summit Europe 2017 | Tags: Ansible, CI/CD, Infrastructure, Jenkins, OpenStack, Red Hat, Zuul

This is a recap and personal review of Monty Taylor’s presentation of OpenStack’s Continuous Integration tool Zuul at the Open Source Summit 2017 in Prague (not to be confused with Netflix’s Zuul project…

By Joris RUMMENS

Oct 24, 2017

MiNiFi: Data at Scales & the Values of Starting Small

Categories: Big Data, DevOps & SRE, Infrastructure | Tags: MiNiFi, NiFi, Cloudera, C++, HDP, HDF, IOT

This conference briefly presented Apache NiFi and explained where MiNiFi came from: basically, it’s a minimal NiFi agent to deploy on small devices to bring data to a cluster’s NiFi pipeline (e.g. IoT…

By César BEREZOWSKI

Jul 8, 2017

HDP cluster monitoring

Categories: Big Data, DevOps & SRE, Infrastructure | Tags: Alert, Ambari, HDP, Metrics, Monitoring, REST

With the current growth of Big Data technologies, more and more companies are building their own clusters in the hope of getting some value from their data. One main concern while building these infrastructures…

By Joris RUMMENS

Jul 5, 2017

Hive Metastore HA with DBTokenStore: Failed to initialize master key

Categories: Big Data, DevOps & SRE | Tags: Hive, Bug, Infrastructure

This article describes my little adventure around a startup error with the Hive Metastore. It should be reproducible with any secure installation, meaning with Kerberos, with high availability enabled…

By David WORMS

Jul 21, 2016

A fresh look at testing Node.js projects: Mocha, Should and Travis

Categories: DevOps & SRE, Node.js | Tags: CI/CD, DevOps, JavaScript, Mocha, Node.js, Unit tests

Today, I finally decided to spend some time around Travis. It’s been a few weeks since that little green image on top of many GitHub homepages has been buzzing me. Well, to be totally honest, this isn…

By David WORMS

Feb 19, 2012

Announcing Mecano, a set of functions for system deployment

Categories: DevOps & SRE, Node.js | Tags: Automation, CoffeeScript, DevOps, Infrastructure, JavaScript, Node.js, Open source

Update July 2016: Mecano is now renamed Nikita. We are releasing Node Mecano on GitHub, which gathers common functions used while deploying systems. The idea was to group those functions into a…

By David WORMS

Feb 12, 2012

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to turn their use cases into projects in production, how to reduce their costs and how to shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.