Apache Flink

Apache Flink is an open-source data processing framework and distributed processing engine for batch and stream dataflow. Originally named Stratosphere, the project evolved into Apache Flink in 2015 under the Apache Software Foundation.

Flink allows you to manage batch and stream processing through its two main APIs:

- the DataSet API
- the DataStream API
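
To make the batch/stream distinction concrete, here is a minimal sketch in plain Python. It does not use Flink or either of its APIs; the function names `batch_word_count` and `stream_word_count` are hypothetical and only illustrate the two processing styles: computing one result over a bounded data set versus updating a result as each record arrives.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def batch_word_count(lines: List[str]) -> Dict[str, int]:
    """Batch style: the whole bounded data set is available up front."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return dict(counts)  # a single final result


def stream_word_count(lines: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Stream style: emit an updated result after every incoming record."""
    counts: Counter = Counter()
    for line in lines:  # `lines` may be unbounded
        counts.update(line.lower().split())
        yield dict(counts)  # snapshot of the running state


if __name__ == "__main__":
    # Hypothetical sample records, for illustration only.
    events = ["flink reads kafka", "flink writes elasticsearch"]
    print(batch_word_count(events))
    for snapshot in stream_word_count(events):
        print(snapshot)
```

On a bounded input, the last snapshot of the streaming version equals the batch result; the difference is that the streaming version produces usable intermediate results without waiting for the input to end.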

With its built-in fault-tolerance mechanism, Flink ensures high availability while delivering high throughput and low latency. Flink integrates with a wide range of storage systems and has built-in connectors for common data sources and sinks, such as Kafka (as both a source and a sink) and [Elasticsearch](https://www.adaltas.com/en/tag/elk-elasticsearch/) (as a sink).

Flink is also flexible to deploy: it can run as a standalone cluster or integrate with common resource managers such as Hadoop YARN and Kubernetes.

Learn more: Official website
Related tags: Apache Beam; Apache Hadoop; Apache Hadoop YARN; Apache Kafka; Apache Spark; Elasticsearch

Apache Apex with Apache SAMOA

Categories: Data Science, Events, Tech Radar | Tags: Apex, Samoa, Storm, Tools, Flink, Hadoop, Machine Learning

Traditional Machine Learning Batch Oriented Supervised - most common Training and Scoring One time model building Data set Training: Model building Holdout: Parameter tuning Test: Accuracy Online…

By Pierre SAUVAGE

Jul 17, 2016

Apache Apex: next gen Big Data analytics

Categories: Data Science, Events, Tech Radar | Tags: Apex, Storm, Tools, Flink, Hadoop, Kafka, Data Science, Machine Learning

Below is a compilation of my notes taken during the presentation of Apache Apex by Thomas Weise from DataTorrent, the company behind Apex. Introduction Apache Apex is an in-memory distributed parallel…

By César BEREZOWSKI

Jul 17, 2016

Apache Beam: a unified programming model for data processing pipelines

Categories: Data Engineering, DataWorks Summit 2018 | Tags: Apex, Beam, Pipeline, Flink, Spark

In this article, we will review the concepts, the history and the future of Apache Beam, which may well become the new standard for defining data processing pipelines. At Dataworks Summit 2018 in…

By Gauthier LEONARD

May 24, 2018

Deploying a secured Flink cluster on Kubernetes

Categories: Big Data | Tags: Encryption, Flink, HDFS, Kafka, Elasticsearch, Kerberos, SSL/TLS

When deploying secured Flink applications inside Kubernetes, you are faced with two choices. Assuming your Kubernetes is secure, you may rely on the underlying platform or rely on Flink native…

By David WORMS

Oct 8, 2018

One week to discuss technology in a Moroccan riad

Categories: Adaltas Summit 2018, Learning | Tags: CDSW, Gatsby, React.js, Flink, Hadoop, Knox, Data Science, Deep Learning, Kubernetes, Node.js

Adaltas is organising its first conference this year, from the 22nd to the 26th of October. On the agenda of these 5 days of conference: discussing technology in one of the most beautiful riads of Marrakech. Mix the…

By David WORMS

Oct 11, 2018

Apache Flink: past, present and future

Categories: Data Engineering | Tags: Pipeline, Flink, Kubernetes, Machine Learning, SQL, Streaming

Apache Flink is a little gem which deserves a lot more attention. Let’s dive into Flink’s past, its current state and the future it is heading to by following the keynotes and presentations at Flink…

By César BEREZOWSKI

Nov 5, 2018

Internship Data Science & Data Engineer - ML in production and streaming data ingestion

Categories: Data Engineering, Data Science | Tags: DevOps, Flink, Hadoop, HBase, Kafka, Spark, Internship, Kubernetes, Python

Context The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitates…

By David WORMS

Nov 26, 2019
