Clusters and workloads migration from Hadoop 2 to Hadoop 3
Jul 25, 2018
- Categories
- Big Data
- Infrastructure
- Tags
- Slider
- Erasure Coding
- Rolling Upgrade
- HDFS
- Spark
- YARN
- Docker
The migration from Hadoop 2 to Hadoop 3 is a hot topic. How do you upgrade your clusters? Which features in the new release may solve current problems and open new opportunities? How are your current processes impacted, and which migration strategy is the most appropriate for your organization?
This article covers a talk given at the 2018 edition of the Dataworks Summit in San Jose, California, by Suma Shivaprasad (Hortonworks Staff Engineer) and Rohith Sharma (Hortonworks Senior Software Engineer). Hadoop 3 is a major version change: it comes with a lot of important new features and some incompatibilities. The previous major version, Hadoop 2, came out in 2012, and in many organizations the migration took several years. Nowadays, clusters running Hadoop 2 are more numerous than those running Hadoop 1 and are held to higher SLAs.
The upgrade is also a challenge for the Hortonworks support and engineering teams. To ensure the smoothest possible upgrades, instructions and release notes need to be as accurate as possible. Challenge accepted (on their behalf).
Upgrade motivations
Let’s remember why we should consider upgrading to 3.x.
HDFS
HDFS comes with many new features:
- Federation GA (General Availability)
- Erasure Coding
  - Significant cost savings in storage
  - Reduction of the storage overhead from 200% to 50%
- Intra-DataNode Disk-Balancer
Erasure Coding significantly decreases the disk space used by HDFS, at the cost of more computation when reading (and reconstructing) files. Prior to Hadoop 3, disks inside a DataNode were only balanced at file creation time. Now the behavior of the new Intra-DataNode Disk-Balancer is similar to the Inter-DataNode balancer (and avoids resorting to external tools).
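As an illustration, the Intra-DataNode balancer is driven by the new hdfs diskbalancer sub-command. A minimal session could look like the following sketch; the hostname is an example and the actual plan file path is printed by the plan step:

```bash
# Generate a balancing plan for one DataNode (hostname is an example)
hdfs diskbalancer -plan datanode01.example.com

# Execute the plan; the actual plan file path is printed by the previous command
hdfs diskbalancer -execute /system/diskbalancer/datanode01.example.com.plan.json

# Follow the progress of the data moves on that DataNode
hdfs diskbalancer -query datanode01.example.com
```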
YARN
New features include:
- Scheduler improvements
  - New resource types: GPUs, FPGAs
  - Fast and global scheduling
- Containerisation with Docker
- Long-running services rehash
- New UI2
- Timeline Server v2
In my opinion, with the exception of HDFS Federation, YARN is the main reason motivating the upgrade to Hadoop 3. Docker scheduling and containerization, GPUs, long-running services… these are features which were sorely missing from the Hadoop clusters currently operated in production. They make Hadoop a viable and more full-featured scheduler able to answer the various existing and emerging use cases.
Things to consider before upgrade
Upgrade Mechanism
There are mainly two types of upgrades:
- Express upgrades
  - “Stop the world” upgrade (service downtime)
  - Cluster downtime
  - Upgrade all processes in one shot
- Rolling upgrades
  - Preserve cluster operation
  - Minimize impact and downtime
  - Can take a long time to complete
  - Upgrade each process separately (masters & workers by batch)
For the Hadoop 2 to Hadoop 3 migration, a rolling upgrade is impossible.
Why?
-_-'
This is mainly due to the fact that some components have incompatible changes:
- HDFS-13596: changes in the edit log format. If one JournalNode is upgraded, the others will fail to exchange information with it.
- HDFS-15502: changes in the MetricsPlugin API. Metrics from upgraded services cannot be read without a code update (in applications or HDFS services).
- HDFS-6440: changes in the image transfer protocol. If one NameNode is upgraded, the other one (Standby NameNode for HA, or Secondary NameNode) will fail to fetch the fsimage.
Incompatibility means that an HDFS DataNode running version 2.x will fail to communicate with an HDFS NameNode running version 3.x. By design, Hadoop does not guarantee protocol compatibility between major versions. Thus, an HDFS rolling upgrade is impossible, and as a consequence so is a rolling upgrade of all the services depending on it.
Cluster Environment
- Java >= 8
- Bash >= 3
- Docker >= 1.12.5
So if you plan to use YARN 3 with Docker containerization (I know you will :) ), upgrade to Docker 1.12.5 or above.
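A quick pre-flight check of a node against these minimum versions can be done with the standard version commands:

```bash
# Check the node environment against the minimum versions listed above
java -version 2>&1 | head -1   # expect 1.8 or above
bash --version | head -1       # expect 3.x or above
docker --version               # expect 1.12.5 or above if you plan to run Docker on YARN
```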
Hadoop Env Files
For every Hadoop user (developer or administrator), environment files are a big issue. I don’t know if you noticed, but it was a mess: hadoop-env.sh applied to HDFS, MapReduce and YARN, while yarn-env.sh (YARN only) and mapreduce-env.sh (MapReduce only) existed alongside it, and the variable names were just as messy.
Hadoop 3.x is the occasion to bring some order. Now we have:
hadoop-env.sh
- The common placeholder for all components
- Precedence rule: (yarn|hdfs)-env.sh > hadoop-env.sh > hard-coded defaults
hdfs-env.sh
- A new file which allows isolating the HDFS environment like YARN already had; HDFS_* variables replace HADOOP_*
- Precedence rule: hdfs-env.sh > hadoop-env.sh > hard-coded defaults
yarn-env.sh
- YARN_* variables replace HADOOP_*
- Precedence rule: yarn-env.sh > hadoop-env.sh > hard-coded defaults
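As a minimal sketch of what this split could look like (paths, heap sizes and GC options below are examples, not recommendations):

```bash
# hadoop-env.sh - shared defaults for every Hadoop daemon
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
export HADOOP_LOG_DIR=/var/log/hadoop

# hdfs-env.sh - HDFS-specific settings, take precedence over hadoop-env.sh
export HDFS_NAMENODE_OPTS="-Xmx8g -XX:+UseG1GC"   # HDFS_* replaces HADOOP_*

# yarn-env.sh - YARN-specific settings, take precedence over hadoop-env.sh
export YARN_RESOURCEMANAGER_OPTS="-Xmx4g"          # YARN_* replaces HADOOP_*
```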
HADOOP_HEAPSIZE (JIRA HADOOP-10950)
- is deprecated
- is replaced by HADOOP_HEAPSIZE_MAX and HADOOP_HEAPSIZE_MIN, which make more sense for setting the Xmx and Xms Java options
- supports units, MB being the default:
  HADOOP_HEAPSIZE_MAX=4096
  HADOOP_HEAPSIZE_MAX=4g
- is auto-tuned based on the host resources when left unset
Configuration Changes
Some default configurations have also changed (content of hdfs-site and yarn-site).
YARN
- Maximum number of completed applications the ResourceManager keeps in its state-store and in memory:
  - yarn.resourcemanager.max-completed-applications: 10000 -> 1000
  - yarn.resourcemanager.state-store.max-completed-application: 10000 -> 1000
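To check the value a running ResourceManager actually uses, one option is to query its configuration endpoint (hostname and port below are examples, 8088 being the default RM web port):

```bash
# Dump the ResourceManager's effective configuration and look up the property
curl -s http://resourcemanager.example.com:8088/conf | grep -A1 'max-completed-applications'
```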
HDFS
- Change in default daemon ports: all the main ports have changed, for example:
  - NameNode UI: 50070 -> 9870 (unsecured), 50470 -> 9871 (secured)
  - DataNode UI: 50075 -> 9864 (unsecured), 50475 -> 9865 (secured)
You can check the whole list on the JIRA page HDFS-9427.
Don’t worry during the upgrade: since the ports are already specified in the existing configuration, they will not change, so the UIs will still be available at the same addresses. BUT if you create a new cluster, with Apache Ambari for example, the ports will be the new defaults.
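A quick way to check which addresses your cluster effectively uses, before and after the upgrade, is to read the live configuration (the keys below are the standard HDFS property names):

```bash
# Read the effective values from the loaded hdfs-site.xml / core-site.xml
hdfs getconf -confKey dfs.namenode.http-address    # NameNode UI (unsecured)
hdfs getconf -confKey dfs.namenode.https-address   # NameNode UI (secured)
hdfs getconf -confKey dfs.datanode.http.address    # DataNode UI (unsecured)
hdfs getconf -confKey dfs.datanode.https.address   # DataNode UI (secured)
```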
Classpath Changes
This one is a tough one. We have classpath isolation now!
Users should rebuild their applications with the shaded hadoop-client jars. Since server and client Java classes are now separated, the jars no longer contain the same dependencies.
- Hadoop dependencies used to leak into the application’s classpath: Guava, Protobuf, Jackson, Jetty… Every Hadoop developer has already encountered those class names.
- Shaded jars are now available and isolate downstream clients from any third-party dependencies.
- HADOOP-11804: the hadoop-client 2.x artifact pulls Hadoop’s transitive dependencies and injects them into the classpath, where their versions can conflict with the application’s own dependencies. This is solved with:
  - hadoop-client-api for compile-time dependencies
  - hadoop-client-runtime for runtime third-party dependencies
  - hadoop-minicluster for test-scope dependencies
- Issue HDFS-6200 resolved: the hadoop-hdfs jar contained both HDFS server and client classes. Clients should now depend on hadoop-hdfs-client instead if they want to do HDFS operations in their application. Applications are thus isolated from the HDFS server classes (fewer dependencies, lighter apps).
- No YARN/MR shaded jars.
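If you want to see the isolation for yourself, one possible check is to list the content of the shaded runtime jar and confirm that third-party classes are relocated under Hadoop’s own namespace (the version number and the relocation prefix below are what we expect, adapt to your build):

```bash
# List relocated third-party classes inside the shaded client runtime jar
jar tf hadoop-client-runtime-3.1.0.jar | grep 'org/apache/hadoop/shaded/' | head
```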
Upgrade Steps
So as a reminder: the express upgrade is the recommended path.
Once you have negotiated the service downtime with your customers, cluster operators should proceed with:
- Install the new packages
  Download and install all the new packages (no hdp-select / hdf-select switch done yet).
- Stop services
  The service downtime starts here.
- Configuration updates
  Write the new hdfs-site.xml, core-site.xml… and hdfs-env.sh!
- Link to the new version
  Use the hdp-select / hdf-select commands to link to the new version, very useful.
- Start services
  This is the right moment to take a coffee, cross your fingers and hope that everything will be all right.
For operators using Ambari, all these steps are taken care of by Ambari, which shows the currently running steps and the stage of the upgrade.
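For operators doing it by hand, a rough sketch of the sequence on a single worker node could look like the following; the version string is a placeholder, and the stop commands use the Hadoop 2 scripts since the old version is still the one linked at that point:

```bash
# 1. Stop the running Hadoop 2 daemons on the node (downtime starts)
hadoop-daemon.sh stop datanode
yarn-daemon.sh stop nodemanager

# 2. Deploy the updated configuration files (hdfs-site.xml, core-site.xml, hdfs-env.sh, ...)
cp /path/to/new-conf/* /etc/hadoop/conf/

# 3. Point the node to the newly installed version (placeholder build number)
hdp-select set all 3.0.0.0-0000

# 4. Start the daemons again with the Hadoop 3 commands
hdfs --daemon start datanode
yarn --daemon start nodemanager
```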
Enable New features
That’s the cool thing. Moving to Hadoop 3 is complicated, but to avoid doing it in Big Bang mode, all the new features can be enabled afterwards. Hence, cluster administrators and operators can take the time to think about how, what and when to enable among the new features listed above (Erasure Coding, Federation, Docker containerization, new resource types, Timeline Server v2…).
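For example, enabling Erasure Coding on a cold-data directory once the cluster is stable could look like this (the policy is one of the built-in ones, the path is an example):

```bash
# Enable one of the built-in erasure coding policies on the cluster
hdfs ec -enablePolicy -policy RS-6-3-1024k

# Apply it to a directory of cold data; new files written there will be erasure coded
hdfs ec -setPolicy -path /data/archive -policy RS-6-3-1024k

# Verify which policy is effective on the directory
hdfs ec -getPolicy -path /data/archive
```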
What about other Components
Hive on Tez
- Hive 3.0 supports Hadoop 3
- Hive does not support rolling upgrades
  - The disk layout of ACID tables has changed and is not backward compatible with 2.x
- A Tez version with support for Hadoop 3
Spark
- SPARK-23534 Umbrella JIRA to build/test with Hadoop 3, ongoing efforts in the community to resolve it.
Apache HBase
- HBase 2.0 supports Hadoop 3 (in HDP 2.6 we only have 1.2.1)
- Does NOT support the rolling upgrade either
Apache Slider
Slider is being retired from the Apache Incubator and will be integrated directly into YARN 3.x. Long-running applications using Slider will need to be ported from Slider to YARN.
Pig/Oozie
- PIG-5253 Enable HADOOP 3 support (planned for pig 0.18.0)
- OOZIE-2973 Make sure Oozie works with Hadoop 3 (planned for Oozie 5.1.0)
Conclusion
Migrating from Hadoop 2 (HDP 2.x) to Hadoop 3 (HDP 3.x) will not be an easy task, whether you are a cluster administrator or a software developer. Teams and management will have to synchronize tightly to make sure everything works fine. Cluster administrators will have to test the upgrade steps and procedures several times and, once ready, software developers will have to adapt their applications and run full integration tests. Apache Ambari will ease the pain for administrators as it takes care of writing the new configuration files and manages the services.
Before enabling the new features, you should first check that everything is all right, and leave a few weeks between the upgrade and the enablement of new features. No rollback is possible, as every on-disk format changes. But that is all theoretical: in the end, the best way is to test it yourself. As a last piece of advice, administrators should test the upgrade with HDP 3.0 as soon as it is released, but for production environments they should wait for HDP 3.1.