Cloud computing
Gaining agility, efficiency, cost control, and better analytics by deploying a big data infrastructure in the cloud, while accounting for security requirements and legacy systems, is no small task. Neither is managing an elastic pool of resources in a multi-tenant environment while meeting SLAs, preserving data integrity, and keeping the budget under control.
We design, deploy, and operate public, private, and hybrid cloud solutions every day, built on multiple offerings. We have been involved in different approaches to cloud migration, from “Lift & Shift” to complete re-platforming. These experiences give our consultants the depth and breadth of skills needed to help you navigate, customize, and operate the new normal.
Our consultants work across the entire life cycle of a project, from the feasibility study through to production deployment.
Cloud migration
- Gathering and documenting requirements (functional and non-functional)
- Solution architecture based on the requirements
- Roadmap definition and project planning
- Testing, optimization, and cutover procedures
- Comparison of public cloud services and offerings
Operations and optimization
- Audit of infrastructure, processes, and costs
- Automation of infrastructure deployment
- Definition of and adherence to objectives (SLOs, SLAs)
- Infrastructure, network, and service operations
- Cost analysis, calculation, and optimization (Total Cost of Ownership, TCO)
Cloud integration and development
- Qualification and validation of technologies and services
- Data pipeline ingestion and preparation
- Data loading and system connectivity
- Machine learning (ML) algorithms
- Processing on stream and batch architectures
Articles related to the Cloud
CDP part 5: user permissions management on CDP Public Cloud
Categories: Big Data, Cloud Computing, Data Governance | Tags: Ranger, Cloudera, CDP, Data Warehouse
When you create a user or a group in CDP, it requires permissions to access resources and use the Data Services. This article is the fifth in a series of six: CDP part 1: introduction to end-to-end…
By Tobias CHAVARRIA
July 18, 2023
CDP part 4: user management on CDP Public Cloud with Keycloak
Categories: Big Data, Cloud Computing, Data Governance | Tags: EC2, Big Data, CDP, Docker Compose, Keycloak, SSO
Previous articles in the series cover the deployment of a CDP Public Cloud environment. All the components are ready for use and it is time to make the environment available to other users to explore…
By Tobias CHAVARRIA
July 4, 2023
CDP part 3: Data Services activation on CDP Public Cloud environment
Categories: Big Data, Cloud Computing, Infrastructure | Tags: Infrastructure, AWS, Big Data, Cloudera, CDP
One of the big selling points of Cloudera Data Platform (CDP) is its mature managed service offerings. These are easy to deploy on-premises, in the public cloud or as part of a hybrid solution. The…
By Albert KONRAD
June 27, 2023
CDP part 2: CDP Public Cloud deployment on AWS
Categories: Big Data, Cloud Computing, Infrastructure | Tags: Infrastructure, AWS, Big Data, Cloud, Cloudera, CDP, Cloudera Manager
The Cloudera Data Platform (CDP) Public Cloud provides the foundation upon which full-featured data lakes are created. In a previous article, we introduced the CDP platform. This article is the second…
By Albert KONRAD
June 19, 2023
CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
Categories: Cloud Computing, Data Engineering, Infrastructure | Tags: Data Engineering, Hortonworks, Iceberg, AWS, Azure, Big Data, Cloud, Cloudera, CDP, Cloudera Manager, Data Warehouse
Cloudera Data Platform (CDP) is a hybrid data platform for big data transformation, machine learning and data analytics. In this series we describe how to build and use an end-to-end big data…
By Stephan BAUM
June 8, 2023
Keycloak deployment in EC2
Categories: Cloud Computing, Data Engineering, Infrastructure | Tags: Security, EC2, Authentication, AWS, Docker, Keycloak, SSL/TLS, SSO
Why use Keycloak? Keycloak is an open-source identity provider (IdP) using single sign-on (SSO). An IdP is a tool to create, maintain, and manage identity information for principals and to provide…
By Stephan BAUM
March 14, 2023
Databricks logs collection with Azure Monitor at a Workspace Scale
Categories: Cloud Computing, Data Engineering, Adaltas Summit 2021 | Tags: Metrics, Monitoring, Spark, Azure, Databricks, Log4j
Databricks is an optimized data analytics platform based on Apache Spark. Monitoring the Databricks platform is crucial to ensure data quality, job performance, and security issues by limiting access to…
By Claire PLAYE
May 10, 2022
Using Cloudera Deploy to install Cloudera Data Platform (CDP) Private Cloud
Categories: Big Data, Cloud Computing | Tags: Ansible, Cloudera, CDP, Cluster, Data Warehouse, Vagrant, IaC
Following our recent Cloudera Data Platform (CDP) overview, we cover how to deploy CDP Private Cloud on your local infrastructure. It is entirely automated with the Ansible cookbooks published by…
July 23, 2021
An overview of Cloudera Data Platform (CDP)
Categories: Big Data, Cloud Computing, Data Engineering | Tags: SDX, Big Data, Cloud, Cloudera, CDP, CDH, Data Analytics, Data Hub, Data Lake, Data lakehouse, Data Warehouse
Cloudera Data Platform (CDP) is a cloud computing platform for businesses. It provides integrated and multifunctional self-service tools in order to analyze and centralize data. It brings security and…
July 19, 2021
Find your way into data related Microsoft Azure certifications
Categories: Cloud Computing, Data Engineering | Tags: Data Governance, Azure, Data Science
Microsoft Azure has certification paths for many technical job roles such as developer, data engineer, data scientist and solution architect, among others. Each of these certifications consists of…
By Barthelemy NGOM
April 14, 2021
Connecting to ADLS Gen2 from Hadoop (HDP) and Nifi (HDF)
Categories: Big Data, Cloud Computing, Data Engineering | Tags: NiFi, Hadoop, HDFS, Authentication, Authorization, Azure, Azure Data Lake Storage (ADLS), OAuth2
As data projects built in the Cloud are becoming more and more frequent, a common use case is to interact with Cloud storage from an existing on-premises Big Data platform. Microsoft Azure recently…
By Gauthier LEONARD
November 5, 2020
Automate a Spark routine workflow from GitLab to GCP
Categories: Big Data, Cloud Computing, Containers Orchestration | Tags: Learning and tutorial, Airflow, Spark, CI/CD, GitLab, GitOps, GCP, Terraform
A workflow consists of automating a succession of tasks to be carried out without human intervention. It is an important and widespread concept which particularly applies to operational environments…
June 16, 2020
Introducing Apache Airflow on AWS
Categories: Big Data, Cloud Computing, Containers Orchestration | Tags: PySpark, Learning and tutorial, Airflow, Oozie, Spark, AWS, Docker, Python
Apache Airflow offers a potential solution to the growing challenge of managing an increasingly complex landscape of data management tools, scripts and analytics processes. It is an open-source…
By Aargan COINTEPAS
May 5, 2020
Snowflake, the Data Warehouse for the Cloud, introduction and tutorial
Categories: Business Intelligence, Cloud Computing | Tags: Cloud, Data Lake, Data Science, Data Warehouse, Snowflake
Snowflake is a SaaS-based data-warehousing platform that centralizes, in the cloud, the storage and processing of structured and semi-structured data. The increasing generation of data produced over…
April 7, 2020
Cloudera CDP and Cloud migration of your Data Warehouse
Categories: Big Data, Cloud Computing | Tags: Azure, Cloudera, Data Hub, Data Lake, Data Warehouse
While one of our customers is anticipating a move to the Cloud, and with the recent announcement of Cloudera CDP availability in mid-September during the Strata conference, it seems like the appropriate…
By David WORMS
December 16, 2019
Should you move your Big Data and Data Lake to the Cloud
Categories: Big Data, Cloud Computing | Tags: DevOps, AWS, Azure, Cloud, CDP, Databricks, GCP
Should you follow the trend and migrate your data, workflows and infrastructure to GCP, AWS and Azure? During the Strata Data Conference in New York, a general focus was put on moving customer’s Big…
By Joris RUMMENS
December 9, 2019
Insert rows in BigQuery tables with complex columns
Categories: Cloud Computing, Data Engineering | Tags: GCP, BigQuery, Schema, SQL
Google’s BigQuery is a cloud data warehousing system designed to process enormous volumes of data, with several features available. Out of all those features, let’s talk about the support of Struct…
By César BEREZOWSKI
November 22, 2019
Running Enterprise Workloads in the Cloud with Cloudbreak
Categories: Big Data, Cloud Computing, DataWorks Summit 2018 | Tags: Cloudbreak, Operation, Hadoop, AWS, Azure, GCP, HDP, OpenStack
This article is based on Peter Darvasi and Richard Doktorics’ talk Running Enterprise Workloads in the Cloud at the DataWorks Summit 2018 in Berlin. It presents Hortonworks’ automated deployment tool…
By Joris RUMMENS
May 28, 2018
Micro Services
Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, DNS, Encryption, gRPC, Istio, Linkerd, Micro Services, MITM, Service Mesh, CNCF, Kubernetes, Proxy, SPOF, SSL/TLS
Back in the day, applications were monolithic and we could use an IP address to access a service. With virtual machines (VM), multiple hosts started to appear on the same machine with multiple apps…
By David WORMS
November 14, 2017
Multi-Repo, Multi-Node Gating at Massive Scale
Categories: Cloud Computing, DevOps & SRE, Open Source Summit Europe 2017 | Tags: Infrastructure, Jenkins, Red Hat, Zuul, Ansible, CI/CD, OpenStack
This is a recap and personal review of Monty Taylor’s presentation of OpenStack’s Continuous Integration tool Zuul at the OpenSource Summit 2017 in Prague (not to be confused with Netflix’s Zuul project…
By Joris RUMMENS
October 28, 2017
Kubernetes Storage Primitives for Stateful Workloads
Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Container Storage Interface (CSI), PVC, Azure, Docker, GCE, Kubernetes, Storage
This article is based on the presentation “Introduction to Kubernetes Storage Primitives for Stateful Workloads” from the OSS Convention Prague 2017 by the {Code} team. So, let’s start, what is…
By Pierre SAUVAGE
October 28, 2017
Node.js is now integrated to the Microsoft Azure platform
Categories: Cloud Computing, Tech Radar | Tags: Linux, Azure, Cloud, Node.js
Node is now a first-class citizen in the Microsoft Azure cloud environment alongside .NET, Java and PHP. This integration is the logical consequence of Microsoft’s involvement in the development of…
By David WORMS
December 11, 2011