Trunk Data Platform (TDP)
Trunk Data Platform (TDP) is a fully open source big data distribution based on the Apache ecosystem. The initiative is incubated by The Open Source I Trust (TOSIT), a French association whose mission is to promote open source among large corporations and public institutions.
The TDP distribution is based on the open source versions of big data components of the Apache ecosystem. As part of the TDP project, these components are compiled, tested and deployed automatically.
The TDP distribution defines and qualifies a set of versioned components that interact with each other. In addition, it provides the community with tools for deploying platforms. The resulting stack is versioned and evolves along the following axes:
- Updating the components that compose it, by integrating new versions and applying or backporting fixes;
- Adding new features to the source code of the TDP project.
Any new development ripples through the whole distribution: all the components are recompiled, the tests are validated, and a new version of the distribution is released, following the recommendations of Semantic Versioning (SemVer).
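As an illustration of the Semantic Versioning rule mentioned above, here is a minimal sketch of how a version number could be bumped depending on the kind of change. The `Version` class and the change labels (`breaking`, `feature`, `fix`) are hypothetical names chosen for this example; they are not part of TDP or its tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """Hypothetical MAJOR.MINOR.PATCH version, as described by SemVer."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Only plain "X.Y.Z" strings are handled; pre-release and build
        # metadata are left out of this sketch.
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "Version":
        # The change labels are assumptions made for this illustration.
        if change == "breaking":  # incompatible change: bump MAJOR
            return Version(self.major + 1, 0, 0)
        if change == "feature":   # backward-compatible feature: bump MINOR
            return Version(self.major, self.minor + 1, 0)
        if change == "fix":       # backward-compatible fix: bump PATCH
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


if __name__ == "__main__":
    current = Version.parse("1.2.3")
    print(current.bump("fix"))       # backported fix
    print(current.bump("feature"))   # new feature
    print(current.bump("breaking"))  # incompatible change
```

Under this scheme, a backported fix to a component would only raise the patch number, while a new feature in the TDP source code would raise the minor number and reset the patch number to zero.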
To ensure continuity of service, the first versions made available are aligned with those of the HDP 2.6.5 and HDP 3.1.5 distributions. The list of supported components includes: Hadoop (HDFS, YARN, MapReduce), Hive & Tez, Spark, Ranger, HBase, Phoenix, Knox, Oozie, NiFi, Kafka, and ZooKeeper.
Related articles
Installation Guide to TDP, the 100% open source big data platform
Categories: Big Data, Infrastructure | Tags: Infrastructure, VirtualBox, Hadoop, Vagrant, TDP
The Trunk Data Platform (TDP) is a 100% open source big data distribution, based on Apache Hadoop and compatible with HDP 3.1. Initiated in 2021 by EDF, the DGFiP and Adaltas, the project is governed…
By Paul FARAULT
Oct 18, 2023
New TDP website launched
Categories: Big Data | Tags: Programming, Ansible, Hadoop, Python, TDP
The new TDP (Trunk Data Platform) website is online. We invite you to browse its pages to discover the platform, stay informed, and cultivate contact with the TDP community. TDP is a completely open…
By David WORMS
Oct 3, 2023
Dive into tdp-lib, the SDK in charge of TDP cluster management
Categories: Big Data, Infrastructure | Tags: Programming, Ansible, Hadoop, Python, TDP
All the deployments are automated and Ansible plays a central role. With the growing complexity of the code base, a new system was needed to overcome the Ansible limitations which will enable us to…
Jan 24, 2023
Big data infrastructure internship
Categories: Big Data, Data Engineering, DevOps & SRE, Infrastructure | Tags: Infrastructure, Hadoop, Big Data, Cluster, Internship, Kubernetes, TDP
Job description Big Data and distributed computing are at the core of Adaltas. We accompany our partners in the deployment, maintenance, and optimization of some of the largest clusters in France…
By Stephan BAUM
Dec 2, 2022
Spark on Hadoop integration with Jupyter
Categories: Adaltas Summit 2021, Infrastructure, Tech Radar | Tags: Infrastructure, Jupyter, Spark, YARN, CDP, HDP, Notebook, TDP
For several years, Jupyter notebook has established itself as the notebook solution in the Python universe. Historically, Jupyter is the tool of choice for data scientists who mainly develop in Python…
Sep 1, 2022
TDP workshop: Become a TDP power user from your terminal
Categories: Events, Learning | Tags: DevOps, Ansible, Hadoop, Open source, TDP
The TDP CLI is used to deploy and operate your TDP services. It relies on tdp-lib to provide control and flexibility at your fingertips. Some time ago, we announced the public release of TDP - Trunk…
By Paul FARAULT
Jun 17, 2022
Introducing Trunk Data Platform: the Open-Source Big Data Distribution Curated by TOSIT
Categories: Big Data, DevOps & SRE, Infrastructure | Tags: DevOps, Hortonworks, Ansible, Hadoop, HBase, Knox, Ranger, Spark, Cloudera, CDP, CDH, Open source, TDP
Ever since Cloudera and Hortonworks merged, the choice of commercial Hadoop distributions for on-prem workloads essentially boils down to CDP Private Cloud. CDP can be seen as the “best of both worlds…
Apr 14, 2022
Reliable and reproducible Linux installation with NixOS
Categories: Infrastructure, Learning | Tags: Linux, Packaging, VM, NixOS, TDP
When using an operating system, upgrading packages or installing new ones are common tasks that introduce the risk of affecting the stability of the system. NixOS is a Linux distribution that ensures…
Feb 8, 2022
Nix introduction, main concepts and commands
Categories: Infrastructure, Learning | Tags: Arch Linux, CentOS, Linux, OS X, Packaging, Ubuntu, NixOS, TDP
Nix is a functional package manager for Linux and other Unix systems, making the management of packages more reliable and easy to reproduce. With a traditional package manager, when updating a package…
Feb 1, 2022
Internship in Big Data infrastructure with TDP
Categories: Infrastructure, Learning | Tags: Cyber Security, DevOps, Java, Hadoop, IaC, Internship, TDP
Job Description Big Data and distributed computing is at Adaltas’ core. We support our partners in the deployment, maintenance and optimization of some of France’s largest clusters. Adaltas is also an…
By Daniel HARTY
Oct 25, 2021
Build your open source Big Data distribution with Hadoop, HBase, Spark, Hive & Zeppelin
Categories: Big Data, Infrastructure | Tags: Maven, Hadoop, HBase, Hive, Spark, Git, Release and features, TDP, Unit tests
The Hadoop ecosystem gave birth to many popular projects including HBase, Spark and Hive. While technologies like Kubernetes and S3 compatible object storages are growing in popularity, HDFS and YARN…
Dec 18, 2020
Rebuilding HDP Hive: patch, test and build
Categories: Big Data, Infrastructure | Tags: Maven, Java, Hive, Git, GitHub, Release and features, TDP, Unit tests
The Hortonworks HDP distribution will soon be deprecated in favor of Cloudera’s CDP. One of our clients wanted a new Apache Hive feature backported into HDP 2.6.0. We thought it was a good opportunity…
Oct 6, 2020
Installing Hadoop from source: build, patch and run
Categories: Big Data, Infrastructure | Tags: Maven, Java, LXD, Hadoop, HDFS, Docker, TDP, Unit tests
Commercial Apache Hadoop distributions have come and gone. The two leaders, Cloudera and Hortonworks, have merged: HDP is no more and CDH is now CDP. MapR has been acquired by HP and IBM BigInsights…
Aug 4, 2020