Apache Flink
Apache Flink is a open-source data processing framework and distributed processing engine for batch and stream dataflow. Originally named Stratosphere, the project evolved into Apache Flink in 2015 under the Apache Foundation.
Flink allows you to manage batch and stream processing through it's two main APIs:
- the Dataset API
- the DataStream API.
With his built-in fault-tolerant mechanism, Flink ensures high availability, delivering high throughput and low latency. Flink integrates with a wide range of storage systems and has built-in connectors for common data sources and sinks like Kafka as a source and a sink or [Elasticsearch] (https://www.adaltas.com/en/tag/elk-elasticsearch/) as a sink.
Also, Flink provides flexibility, operating either as a standalone cluster or integrating with common resource managers such as Hadoop YARN and Kubernetes
- Learn more
- Official website
Related articles
Internship Data Science & Data Engineer - ML in production and streaming data ingestion
Categories: Data Engineering, Data Science | Tags: DevOps, Flink, Hadoop, HBase, Kafka, Spark, Internship, Kubernetes, Python
Context The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitatesā¦
By David WORMS
Nov 26, 2019
Apache Flink: past, present and future
Categories: Data Engineering | Tags: Pipeline, Flink, Kubernetes, Machine Learning, SQL, Streaming
Apache Flink is a little gem which deserves a lot more attention. Letās dive into Flinkās past, its current state and the future it is heading to by following the keynotes and presentations at Flinkā¦
Nov 5, 2018
One week to discuss technology in a Moroccan riad
Categories: Adaltas Summit 2018, Learning | Tags: CDSW, Gatsby, React.js, Flink, Hadoop, Knox, Data Science, Deep Learning, Kubernetes, Node.js
Adaltas organise the year its first conference between the 22 and 26 of October. On the agenda of these 5 days of conference: discuss technology in one of the most beautiful riad of Marrakech. Mix theā¦
By David WORMS
Oct 11, 2018
Deploying a secured Flink cluster on Kubernetes
Categories: Big Data | Tags: Encryption, Flink, HDFS, Kafka, Elasticsearch, Kerberos, SSL/TLS
When deploying secured Flink applications inside Kubernetes, you are faced with two choices. Assuming your Kubernetes is secure, you may rely on the underlying platform or rely on Flink nativeā¦
By David WORMS
Oct 8, 2018
Apache Beam: a unified programming model for data processing pipelines
Categories: Data Engineering, DataWorks Summit 2018 | Tags: Apex, Beam, Pipeline, Flink, Spark
In this article, we will review the concepts, the history and the future of Apache Beam, that may well become the new standard for data processing pipelines definition. At Dataworks Summit 2018 inā¦
May 24, 2018
Apache Apex with Apache SAMOA
Categories: Data Science, Events, Tech Radar | Tags: Apex, Samoa, Storm, Tools, Flink, Hadoop, Machine Learning
Traditional Machine Learning Batch Oriented Supervised - most common Training and Scoring One time model building Data set Training: Model building Holdout: Paremeter tuning Test: Accuracy Onlineā¦
Jul 17, 2016
Apache Apex: next gen Big Data analytics
Categories: Data Science, Events, Tech Radar | Tags: Apex, Storm, Tools, Flink, Hadoop, Kafka, Data Science, Machine Learning
Below is a compilation of my notes taken during the presentation of Apache Apex by Thomas Weise from DataTorrent, the company behind Apex. Introduction Apache Apex is an in-memory distributed parallelā¦
Jul 17, 2016