Data structures
Related articles
Storage size and generation time in popular file formats
Categories: Data Engineering, Data Science | Tags: Avro, HDFS, Hive, ORC, Parquet, Big Data, Data Lake, File Format, JavaScript Object Notation (JSON)
Choosing an appropriate file format is essential, whether your data transits on the wire or is stored at rest. Each file format comes with its own advantages and disadvantages. We covered them in aā¦
Mar 22, 2021
Download datasets into HDFS and Hive
Categories: Big Data, Data Engineering | Tags: Business intelligence, Data Engineering, Data structures, Database, Hadoop, HDFS, Hive, Big Data, Data Analytics, Data Lake, Data lakehouse, Data Warehouse
Introduction Nowadays, the analysis of large amounts of data is becoming more and more possible thanks to Big data technology (Hadoop, Spark,ā¦). This explains the explosion of the data volume and theā¦
By Aida NGOM
Jul 31, 2020
Comparison of different file formats in Big Data
Categories: Big Data, Data Engineering | Tags: Business intelligence, Data structures, Avro, HDFS, ORC, Parquet, Batch processing, Big Data, CSV, JavaScript Object Notation (JSON), Kubernetes, Protocol Buffers
In data processing, there are different types of files formats to store your data sets. Each format has its own pros and cons depending upon the use cases and exists to serve one or several purposesā¦
By Aida NGOM
Jul 23, 2020
Scaling massive, real-time data pipelines with Go
Categories: Open Source Summit Europe 2017, Learning | Tags: Algorithm, Data structures, Go Lang, Pipeline, Protocols, Network
Last week at the Open Source Summit in Prague, Jean de Klerk held a talk called Scaling massive, real-time data pipelines with Go. This article goes over the main points of the talk, detailing theā¦
Nov 21, 2017