Elasticsearch
Elasticsearch is an open source analytics, storage and search engine developed by Elasticsearch B.V. and first released in 2010. It is distributed software written in Java and built on top of Apache Lucene. Lucene handles the indexing and searching of data, and Elasticsearch exposes these capabilities through a REST API.
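As a rough illustration of the REST API mentioned above, the sketch below builds the HTTP requests for storing a document and running a full-text search, using only the Python standard library. The node address (`http://localhost:9200`, unsecured), the `articles` index and the `title` field are assumptions made for the example, and the helper names `index_request`, `search_request` and `send` are hypothetical.

```python
import json
import urllib.request

# Assumption: a local, unsecured development node on the default port.
BASE_URL = "http://localhost:9200"


def index_request(index, document, base_url=BASE_URL):
    """Build a POST request that stores one JSON document in `index`."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_request(index, field, text, base_url=BASE_URL):
    """Build a POST request running a full-text `match` query on one field."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building a request does not touch the network, so this runs without a cluster.
req = index_request("articles", {"title": "Yahoo's Vespa Engine"})
print(req.get_method(), req.full_url)
```

With a node running, `send(index_request(...))` followed by `send(search_request("articles", "title", "vespa"))` would index the document and then search for it. A secured cluster would additionally need authentication and TLS settings, which this sketch leaves out.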
It is often used with Kibana, a data visualization platform, and Logstash, a data processing pipeline, which are tools developed and maintained by the same company. Together they form what's referred to as the ELK Stack.
Grafana, while not part of the ELK Stack, is another open source tool often used with Elasticsearch for visualizing metrics such as memory, CPU usage and system I/O.
Elasticsearch provides advanced search functionality such as auto-completion, synonym handling and typo correction. It is also used as an analytics platform that queries structured data, for instance:
- Analyzing application logs and system metrics
- Sending events to Elasticsearch
- Forecasting future values with machine learning and anomaly detection
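To make the search and analytics use cases above more concrete, here is a minimal sketch of three request bodies: a typo-tolerant search, a bulk payload for sending events, and an aggregation counting error logs per hour. The field names (`title`, `level`, `@timestamp`), the `logs` index and the helper function names are assumptions chosen for illustration, and the `fixed_interval` parameter assumes a reasonably recent Elasticsearch version. Machine learning forecasting is not covered here.

```python
import json


def fuzzy_match(field, text):
    """Search body that tolerates typos.

    `fuzziness: AUTO` lets Elasticsearch accept a small number of
    character edits, depending on the length of each term.
    """
    return {"query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}}}


def bulk_body(index, events):
    """NDJSON payload for the `_bulk` endpoint.

    Each event produces two lines: an action line, then the document itself.
    """
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    # The bulk format requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


def errors_per_hour(timestamp_field="@timestamp", level_field="level"):
    """Aggregation body counting error logs in one-hour buckets."""
    return {
        "size": 0,  # return only the aggregation, not the matching documents
        "query": {"term": {level_field: "error"}},
        "aggs": {
            "per_hour": {
                "date_histogram": {
                    "field": timestamp_field,
                    "fixed_interval": "1h",
                }
            }
        },
    }


print(json.dumps(fuzzy_match("title", "elasticsaerch")))
print(bulk_body("logs", [{"level": "error", "message": "disk full"}]), end="")
```

The search and aggregation bodies would be posted to an index's `_search` endpoint, and the bulk payload to `_bulk` with the `application/x-ndjson` content type.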
Since Elasticsearch is distributed by nature, it scales well as data volumes and query throughput grow.
- Learn more
- Official website
Related articles
Spring 2022 internship - building a Data Lab
Categories: Data Science, Learning | Tags: MongoDB, Spark, Argo CD, Elasticsearch, Internship, Keycloak, Kubernetes, OpenID Connect, PostgreSQL
Job Description Over the last few years, we developed the ability to use computers to process large amounts of data. The ecosystem evolved over a large offering of tools and libraries and the creation…
By David WORMS
Nov 24, 2021
Internship in Data Engineering
Categories: Front End, Learning | Tags: Metrics, Monitoring, Hive, Kafka, Delta Lake, Elasticsearch, IaC, Internship, Kubernetes, Streaming
Job Description Data is a valuable business asset. Some call it the new oil. The data engineer collects, transforms and refines raw data into information that can be used by business analysts and data…
By David WORMS
Oct 25, 2021
Logstash pipelines remote configuration and self-indexing
Categories: Data Engineering, Infrastructure | Tags: Docker, Elasticsearch, Kibana, Logstash, Log4j
Logstash is a powerful data collection engine that integrates in the Elastic Stack (Elasticsearch - Logstash - Kibana). The goal of this article is to show you how to deploy a fully managed Logstash…
Dec 13, 2019
Internship Data Science & Data Engineer - ML in production and streaming data ingestion
Categories: Data Engineering, Data Science | Tags: DevOps, Flink, Hadoop, HBase, Kafka, Spark, Internship, Kubernetes, Python
Context The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitates…
By David WORMS
Nov 26, 2019
Monitoring a production Hadoop cluster with Kubernetes
Categories: DevOps & SRE | Tags: Thrift, Grafana, Shinken, Hadoop, Knox, Cluster, Docker, Elasticsearch, Kubernetes, Node, Node.js, Prometheus, Python
Monitoring a production grade Hadoop cluster is a real challenge and needs to be constantly evolving. The software we use today is based on Nagios. Very efficient when it comes to the simplest…
Dec 21, 2018
Deploying a secured Flink cluster on Kubernetes
Categories: Big Data | Tags: Encryption, Flink, HDFS, Kafka, Elasticsearch, Kerberos, SSL/TLS
When deploying secured Flink applications inside Kubernetes, you are faced with two choices. Assuming your Kubernetes is secure, you may rely on the underlying platform or rely on Flink native…
By David WORMS
Oct 8, 2018
Apache Metron in the Real World
Categories: Cyber Security, DataWorks Summit 2018 | Tags: Algorithm, NiFi, Solr, Storm, pcap, RDBMS, HDFS, Kafka, Metron, Spark, Data Science, Elasticsearch, SQL
Apache Metron is a storage and analytic platform specialized in cyber security. This talk was about demonstrating the usages and capabilities of Apache Metron in the real world. The presentation was…
May 29, 2018
Essential questions about Time Series
Categories: Big Data | Tags: Grafana, Druid, HBase, Hive, ORC, Data Science, Elasticsearch, IOT
Today, the bulk of Big Data is temporal. We see it in the media and among our customers: smart meters, banking transactions, smart factories, connected vehicles… IoT and Big Data go hand in hand. We…
By David WORMS
Mar 18, 2018
Execute Python in an Oozie workflow
Categories: Data Engineering | Tags: Oozie, Elasticsearch, Python, REST
Oozie workflows allow you to use multiple actions to execute code, however doing so with Python can be a bit tricky, let's see how to do that. I've recently designed a workflow that would interact…
Mar 6, 2018
Yahoo's Vespa Engine
Categories: Tech Radar | Tags: Database, Tools, Elasticsearch, Search Engine
Vespa is Yahoo's fully autonomous and self-sufficient big data processing and serving engine. It aims at serving results of queries on huge amounts of data in real time. An example of this would be…
Oct 16, 2017