Hortonworks Data Platform (HDP)

Created in 2011 by Hortonworks, the Hortonworks Data Platform (HDP) is a data platform based on Apache Hadoop. It facilitates the management and processing of massive amounts of data by bringing together several components within a single distribution, including HDFS, HBase, Hive, Spark, YARN, ZooKeeper, and many others. HDP can store, process, and query data, and schedule data workflows.

Because it builds on Apache Hadoop, HDP benefits from a distributed architecture: data is stored and processed across the nodes of a cluster.

Since the merger between Hortonworks and Cloudera in 2019, HDP has been included in CDP (Cloudera Data Platform).

Related articles

Spark on Hadoop integration with Jupyter

Categories: Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: Infrastructure, Jupyter, Spark, YARN, CDP, HDP, Notebook, TDP

For several years, Jupyter notebook has established itself as the notebook solution in the Python universe. Historically, Jupyter is the tool of choice for data scientists who mainly develop in Python…

By Aargan COINTEPAS

Sep 1, 2022

Apache HBase: RegionServers co-location

Categories: Big Data, Adaltas Summit 2021, Infrastructure | Tags: Ambari, Database, Infrastructure, Tuning, Hadoop, HBase, Big Data, HDP, Storage

RegionServers are the processes that manage the storage and retrieval of data in Apache HBase, the non-relational column-oriented database in Apache Hadoop. It is through their daemons that any CRUD…

By Pierre BERLAND

Feb 22, 2022

Build your open source Big Data distribution with Hadoop, HBase, Spark, Hive & Zeppelin

Categories: Big Data, Infrastructure | Tags: Maven, Hadoop, HBase, Hive, Spark, Git, Release and features, TDP, Unit tests

The Hadoop ecosystem gave birth to many popular projects including HBase, Spark and Hive. While technologies like Kubernetes and S3-compatible object storage are growing in popularity, HDFS and YARN…

By Leo SCHOUKROUN

Dec 18, 2020

Connecting to ADLS Gen2 from Hadoop (HDP) and Nifi (HDF)

Categories: Big Data, Cloud Computing, Data Engineering | Tags: NiFi, Hadoop, HDFS, Authentication, Authorization, Azure, Azure Data Lake Storage (ADLS), OAuth2

As data projects built in the Cloud are becoming more and more frequent, a common use case is to interact with Cloud storage from an existing on-premises Big Data platform. Microsoft Azure recently…

By Gauthier LEONARD

Nov 5, 2020

Rebuilding HDP Hive: patch, test and build

Categories: Big Data, Infrastructure | Tags: Maven, Java, Hive, Git, GitHub, Release and features, TDP, Unit tests

The Hortonworks HDP distribution will soon be deprecated in favor of Cloudera’s CDP. One of our clients wanted a new Apache Hive feature backported into HDP 2.6.0. We thought it was a good opportunity…

By Leo SCHOUKROUN

Oct 6, 2020

Installing Hadoop from source: build, patch and run

Categories: Big Data, Infrastructure | Tags: Maven, Java, LXD, Hadoop, HDFS, Docker, TDP, Unit tests

Commercial Apache Hadoop distributions have come and gone. The two leaders, Cloudera and Hortonworks, have merged: HDP is no more and CDH is now CDP. MapR has been acquired by HP and IBM BigInsights…

By Leo SCHOUKROUN

Aug 4, 2020

Notes on the Cloudera Open Source licensing model

Categories: Big Data | Tags: CDSW, License, Cloudera Manager, Open source

Following the publication of its Open Source licensing strategy on July 10, 2019 in an article called “our Commitment to Open Source Software”, Cloudera broadcasted a webinar yesterday October 2…

By David WORMS

Oct 25, 2019

Jumbo, the Hadoop cluster bootstrapper

Categories: Infrastructure | Tags: Ambari, Automation, Ansible, Cluster, Vagrant, HDP, REST

Introducing Jumbo, a Hadoop cluster bootstrapper for developers. Jumbo helps you deploy development environments for Big Data technologies. It takes a few minutes to get a custom virtualized Hadoop…

By Gauthier LEONARD

Nov 29, 2018

Hadoop cluster takeover with Apache Ambari

Categories: Big Data, DevOps & SRE, Adaltas Summit 2018 | Tags: Ambari, Automation, iptables, Kerberos, Nikita, Systemd, Cluster, HDP, Node, Node.js, REST

We recently migrated a large production Hadoop cluster from a “manual” automated install to Apache Ambari; we called this the Ambari Takeover. This is a risky process and we will detail why this…

By Leo SCHOUKROUN

Nov 15, 2018

Curing the Kafka blindness with the UI manager

Categories: Big Data | Tags: Ambari, Hortonworks, HDF, JMX, UI, Kafka, Ranger, HDP

Today it’s really difficult for developers, operators and managers to visualize and monitor what happens in a Kafka cluster. This article covers a new graphical interface to oversee Kafka. It was…

By Lucas BAKALIAN

Jun 20, 2018

Running Enterprise Workloads in the Cloud with Cloudbreak

Categories: Big Data, Cloud Computing, DataWorks Summit 2018 | Tags: Cloudbreak, Operation, Hadoop, AWS, Azure, GCP, HDP, OpenStack

This article is based on Peter Darvasi and Richard Doktorics’ talk Running Enterprise Workloads in the Cloud at the DataWorks Summit 2018 in Berlin. It presents Hortonworks’ automated deployment tool…

By Joris RUMMENS

May 28, 2018

Present and future of Hadoop workflow scheduling: Oozie 5.x

Categories: Big Data, DataWorks Summit 2018 | Tags: Hadoop, Hive, Oozie, Sqoop, CDH, HDP, REST

During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Oozie. It covers the new features released in Oozie 5.0, including future features of…

By Leo SCHOUKROUN

May 23, 2018

Ambari - How to blueprint

Categories: Big Data, DevOps & SRE | Tags: Ambari, Automation, DevOps, Operation, Ranger, REST

As infrastructure engineers at Adaltas, we deploy Hadoop clusters. A lot of them. Let’s see how to automate this process with REST requests. While really handy for deploying one or two clusters, the…

By Joris RUMMENS

Jan 17, 2018

MiNiFi: Data at Scales & the Values of Starting Small

Categories: Big Data, DevOps & SRE, Infrastructure | Tags: MiNiFi, NiFi, C++, HDF, Cloudera, HDP, IOT

This conference briefly presented Apache NiFi and explained where MiNiFi came from: basically, it’s a minimal NiFi agent to deploy on small devices to bring data to a cluster’s NiFi pipeline (e.g., IoT…

By César BEREZOWSKI

Jul 8, 2017

HDP cluster monitoring

Categories: Big Data, DevOps & SRE, Infrastructure | Tags: Alert, Ambari, Metrics, Monitoring, HDP, REST

With the current growth of Big Data technologies, more and more companies are building their own clusters in the hope of getting some value from their data. One main concern while building these infrastructures…

By Joris RUMMENS

Jul 5, 2017

Components for CDH and HDP

Categories: Big Data | Tags: Flume, Hortonworks, Hadoop, Hive, Oozie, Sqoop, Zookeeper, Cloudera, CDH, HDP

I was interested in comparing the different components distributed by Cloudera and Hortonworks. This also gives us an idea of the versions packaged by the two distributions. At the time of this writing…

By David WORMS

Sep 22, 2013
