Structured Query Language (SQL)
SQL is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS).
- Learn more
- Wikipedia
Related articles
Spark Streaming part 1: build data pipelines with Spark Structured Streaming
Categories: Data Engineering, Learning | Tags: Kafka, Spark, Apache Spark Streaming, Big Data, Streaming
Spark Structured Streaming is a new engine introduced with Apache Spark 2 used for processing streaming data. It is built on top of the existing Spark SQL engine and the Spark DataFrame. The…
Apr 18, 2019
Insert rows in BigQuery tables with complex columns
Categories: Cloud Computing, Data Engineering | Tags: GCP, BigQuery, Schema, SQL
Google’s BigQuery is a cloud data warehousing system designed to process enormous volumes of data with several features available. Out of all those features, let’s talk about the support of Struct…
Nov 22, 2019
Druid and Hive integration
Categories: Big Data, Business Intelligence, Tech Radar | Tags: LLAP, OLAP, Druid, Hive, Data Analytics, SQL
This article covers the integration between Hive Interactive (LLAP) and Druid. One can see it as a complement to the Ultra-fast OLAP Analytics with Apache Hive and Druid article. Tools description…
Jun 17, 2019
Publish Spark SQL DataFrame and RDD with Spark Thrift Server
Categories: Data Engineering | Tags: Thrift, JDBC, Hadoop, Hive, Spark, SQL
The distributed and in-memory nature of the Spark engine makes it an excellent candidate to expose data to clients that expect low latencies. Dashboards, notebooks, BI studios, KPI-based reports…
Mar 25, 2019
Apache Flink: past, present and future
Categories: Data Engineering | Tags: Pipeline, Flink, Kubernetes, Machine Learning, SQL, Streaming
Apache Flink is a little gem which deserves a lot more attention. Let’s dive into Flink’s past, its current state and the future it is heading to by following the keynotes and presentations at Flink…
Nov 5, 2018
Accelerating query processing with materialized views in Apache Hive
Categories: Business Intelligence, DataWorks Summit 2018 | Tags: Calcite, OLAP, Druid, Hive, Release and features, SQL
The new materialized view feature is coming in Apache Hive 3.0. Jesus Camacho Rodriguez from Hortonworks held a talk “Accelerating query processing with materialized views in Apache Hive” about it…
May 31, 2018
Apache Metron in the Real World
Categories: Cyber Security, DataWorks Summit 2018 | Tags: Algorithm, NiFi, Solr, Storm, pcap, RDBMS, HDFS, Kafka, Metron, Spark, Data Science, Elasticsearch, SQL
Apache Metron is a storage and analytic platform specialized in cyber security. This talk was about demonstrating the usages and capabilities of Apache Metron in the real world. The presentation was…
May 29, 2018
Omid: Scalable and highly available transaction processing for Apache Phoenix
Categories: Big Data, DataWorks Summit 2018 | Tags: Omid, Phoenix, Transaction, ACID, HBase, SQL
Apache Omid provides a transactional layer on top of key/value NoSQL databases. In practice, it is usually used on top of Apache HBase. Credits to Ohad Shacham for his talk and his work for Apache…
May 24, 2018
Splitting HDFS files into multiple hive tables
Categories: Data Engineering | Tags: Flume, Pig, HDFS, Hive, Oozie, SQL
I am going to show how to split a CSV file stored inside HDFS into multiple Hive tables based on the content of each record. The context is simple. We are using Flume to collect logs from all over our…
By David WORMS
Sep 15, 2013
Testing the Oracle SQL Connector for Hadoop HDFS
Categories: Data Engineering | Tags: Database, File system, Oracle, HDFS, CDH, SQL
Using Oracle SQL Connector for HDFS, you can use Oracle Database to access and analyze data residing in HDFS files or a Hive table. You can also query and join data in HDFS or a Hive table with other…
By David WORMS
Jul 15, 2013
Options to connect and integrate Hadoop with Oracle
Categories: Data Engineering | Tags: Database, Java, Oracle, R, RDBMS, Avro, HDFS, Hive, MapReduce, Sqoop, NoSQL, SQL
I will list the different tools and libraries available to us developers in order to integrate Oracle and Hadoop. The Oracle SQL Connector for HDFS described below is covered in a follow-up article…
By David WORMS
May 15, 2013
Apache Hive Essentials How-to by Darren Lee
Categories: Business Intelligence, Learning | Tags: UDF, Hadoop, Hive, File Format, SQL
Recently, I’ve been asked to review a new book on Apache Hive called “Apache Hive Essentials How-to” (edit: the second edition is now available) written by Darren Lee and published by Packt Publishing…
By David WORMS
Apr 23, 2013
Installing and using MADlib with PostgreSQL on OSX
Categories: Data Science | Tags: Database, Greenplum, Statistics, PostgreSQL, SQL
We cover basic installation and usage of PostgreSQL and MADlib on OSX and Ubuntu. Instructions for other environments should be similar. PostgreSQL is an open source database with enterprise…
By David WORMS
Jul 7, 2012