
Related articles

Node CSV version 0.2.7

Categories: Hack | Tags: Pipeline, CoffeeScript, CSV, Node.js

As I release version 0.2.7 of the CSV parser for Node.js, I stop here to drop a few lines about what has made it into this release. We are now using the latest CoffeeScript, which is version 1.4.…


By David WORMS

Jul 9, 2013

Scaling massive, real-time data pipelines with Go

Categories: Open Source Summit Europe 2017, Learning | Tags: Algorithm, Data structures, Go Lang, Pipeline, Protocols, Network

Last week at the Open Source Summit in Prague, Jean de Klerk held a talk called Scaling massive, real-time data pipelines with Go. This article goes over the main points of the talk, detailing the…


By Arthur BUSSER

Nov 21, 2017

Apache Beam: a unified programming model for data processing pipelines

Categories: Data Engineering, DataWorks Summit 2018 | Tags: Apex, Beam, Pipeline, Flink, Spark

In this article, we will review the concepts, the history and the future of Apache Beam, which may well become the new standard for defining data processing pipelines. At DataWorks Summit 2018 in…

By Gauthier LEONARD

May 24, 2018

Apache Flink: past, present and future

Categories: Data Engineering | Tags: Pipeline, Flink, Kubernetes, Machine Learning, SQL, Streaming

Apache Flink is a little gem which deserves a lot more attention. Let’s dive into Flink’s past, its current state and the future it is heading to by following the keynotes and presentations at Flink…



Nov 5, 2018

Avoid Bottlenecks in distributed Deep Learning pipelines with Horovod

Categories: Data Science | Tags: GPU, Deep Learning, Horovod, Keras, TensorFlow

The Deep Learning training process can be greatly sped up using a cluster of GPUs. When dealing with huge amounts of data, distributed computing quickly becomes a challenge. A common obstacle which…



Nov 15, 2019

Logstash pipelines remote configuration and self-indexing

Categories: Data Engineering, Infrastructure | Tags: Docker, Elasticsearch, Kibana, Logstash, Log4j

Logstash is a powerful data collection engine that integrates into the Elastic Stack (Elasticsearch - Logstash - Kibana). The goal of this article is to show you how to deploy a fully managed Logstash…



Dec 13, 2019

TensorFlow Extended (TFX): the components and their functionalities

Categories: Big Data, Data Engineering, Data Science, Learning | Tags: Beam, Data Engineering, Pipeline, CI/CD, Data Science, Deep Learning, Deployment, Machine Learning, MLOps, Open source, Python, TensorFlow

Putting Machine Learning (ML) and Deep Learning (DL) models in production is certainly a difficult task. It has been recognized as more failure-prone and time-consuming than the modeling itself, yet…

Canada - Morocco - France

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to turn their use cases into projects in production, and on how to reduce their costs and shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.
