Structured Query Language (SQL)

SQL is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS)

Learn more
Wikipedia

Related articles

Spark Streaming part 1: build data pipelines with Spark Structured Streaming

Spark Streaming part 1: build data pipelines with Spark Structured Streaming

Categories: Data Engineering, Learning | Tags: Kafka, Spark, Apache Spark Streaming, Big Data, Streaming

Spark Structured Streaming is a new engine introduced with Apache Spark 2 used for processing streaming data. It is built on top of the existing Spark SQL engine and the Spark DataFrame. Theā€¦

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

Apr 18, 2019

Insert rows in BigQuery tables with complex columns

Insert rows in BigQuery tables with complex columns

Categories: Cloud Computing, Data Engineering | Tags: GCP, BigQuery, Schema, SQL

Googleā€™s BigQuery is a cloud data warehousing system designed to process enormous volumes of data with several features available. Out of all those features, letā€™s talk about the support of Structā€¦

CĆ©sar BEREZOWSKI

By CĆ©sar BEREZOWSKI

Nov 22, 2019

Druid and Hive integration

Druid and Hive integration

Categories: Big Data, Business Intelligence, Tech Radar | Tags: LLAP, OLAP, Druid, Hive, Data Analytics, SQL

This article covers the integration between Hive Interactive (LDAP) and Druid. One can see it as a complement of the Ultra-fast OLAP Analytics with Apache Hive and Druid article. Tools descriptionā€¦

Pierre SAUVAGE

By Pierre SAUVAGE

Jun 17, 2019

Publish Spark SQL DataFrame and RDD with Spark Thrift Server

Publish Spark SQL DataFrame and RDD with Spark Thrift Server

Categories: Data Engineering | Tags: Thrift, JDBC, Hadoop, Hive, Spark, SQL

The distributed and in-memory nature of the Spark engine makes it an excellent candidate to expose data to clients which expect low latencies. Dashboards, notebooks, BI studios, KPIs-based reportsā€¦

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

Mar 25, 2019

Apache Flink: past, present and future

Apache Flink: past, present and future

Categories: Data Engineering | Tags: Flink, Pipeline, Kubernetes, Machine Learning, SQL, Streaming

Apache Flink is a little gem which deserves a lot more attention. Letā€™s dive into Flinkā€™s past, its current state and the future it is heading to by following the keynotes and presentations at Flinkā€¦

CĆ©sar BEREZOWSKI

By CĆ©sar BEREZOWSKI

Nov 5, 2018

Accelerating query processing with materialized views in Apache Hive

Accelerating query processing with materialized views in Apache Hive

Categories: Business Intelligence, DataWorks Summit 2018 | Tags: Calcite, OLAP, Druid, Hive, Release and features, SQL

The new materialized view feature is coming in Apache Hive 3.0. Jesus Camacho Rodriguez from Hortonworks held a talk ā€Accelerating query processing with materialized views in Apache Hiveā€ about itā€¦

Paul-Adrien CORDONNIER

By Paul-Adrien CORDONNIER

May 31, 2018

Apache Metron in the Real World

Apache Metron in the Real World

Categories: Cyber Security, DataWorks Summit 2018 | Tags: Algorithm, NiFi, Solr, Storm, pcap, RDBMS, HDFS, Kafka, Metron, Spark, Data Science, Elasticsearch, SQL

Apache Metron is a storage and analytic platform specialized in cyber security. This talk was about demonstrating the usages and capabilities of Apache Metron in the real world. The presentation wasā€¦

Michael HATOUM

By Michael HATOUM

May 29, 2018

Omid: Scalable and highly available transaction processing for Apache Phoenix

Omid: Scalable and highly available transaction processing for Apache Phoenix

Categories: Big Data, DataWorks Summit 2018 | Tags: Omid, Phoenix, Transaction, ACID, HBase, SQL

Apache Omid provides a transactional layer on top of key/value NoSQL databases. In practice, it is usually used on top of Apache HBase. Credits to Ohad Shacham for his talk and his work for Apacheā€¦

Xavier HERMAND

By Xavier HERMAND

May 24, 2018

Splitting HDFS files into multiple hive tables

Splitting HDFS files into multiple hive tables

Categories: Data Engineering | Tags: Flume, Pig, HDFS, Hive, Oozie, SQL

I am going to show how to split a CSV file stored inside HDFS as multiple Hive tables based on the content of each record. The context is simple. We are using Flume to collect logs from all over ourā€¦

David WORMS

By David WORMS

Sep 15, 2013

Options to connect and integrate Hadoop with Oracle

Options to connect and integrate Hadoop with Oracle

Categories: Data Engineering | Tags: Database, Java, Oracle, R, RDBMS, Avro, HDFS, Hive, MapReduce, Sqoop, NoSQL, SQL

I will list the different tools and libraries available to us developers in order to integrate Oracle and Hadoop. The Oracle SQL Connector for HDFS described below is covered in a follow up articleā€¦

David WORMS

By David WORMS

May 15, 2013

Testing the Oracle SQL Connector for Hadoop HDFS

Testing the Oracle SQL Connector for Hadoop HDFS

Categories: Data Engineering | Tags: Database, File system, Oracle, HDFS, CDH, SQL

Using Oracle SQL Connector for HDFS, you can use Oracle Database to access and analyze data residing in HDFS files or a Hive table. You can also query and join data in HDFS or a Hive table with otherā€¦

David WORMS

By David WORMS

Jul 15, 2013

Apache Hive Essentials How-to by Darren Lee

Apache Hive Essentials How-to by Darren Lee

Categories: Business Intelligence, Learning | Tags: UDF, Hadoop, Hive, File Format, SQL

Recently, Iā€™ve been ask to review a new book on Apache Hive called ā€œApache Hive Essentials How-toā€ (edit: the second edition is now available) written by Darren Lee and published by Packt Publishingā€¦

David WORMS

By David WORMS

Apr 23, 2013

Installing and using MADlib with PostgreSQL on OSX

Installing and using MADlib with PostgreSQL on OSX

Categories: Data Science | Tags: Database, Greenplum, Statistics, PostgreSQL, SQL

We cover basic installation and usage of PostgreSQL and MADlib on OSX and Ubuntu. Instructions for other environments should be similar. PostgreSQL is an Open Source database with enterpriseā€¦

David WORMS

By David WORMS

Jul 7, 2012

Canada - Morocco - France

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Scienceā€¦

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.

Support Ukrain