How Elasticsearch Works

Geomario
2 min read · Feb 8, 2021

When working with unstructured data, you should ask yourself: how does Elasticsearch work?

How does Elasticsearch work?

In my second week as a Software Developer & Data Scientist, I met Elasticsearch. Elasticsearch is an open search and analytics engine. Its queries run in near real time. It provides features such as autocomplete, geolocation-based filters, and multilevel aggregations.

Our company scrapes the web 🐱‍💻. Scraping unstructured data, such as any web page, and transforming it into structured data therefore fits Elasticsearch and our duties perfectly.

Basic Concepts

The structure of Elasticsearch is similar to that of a SQL database. The next table shows the equivalent terms.

Figure 1. Equivalent terms (SQL = Elasticsearch): Database = Index, Table = Type, Row = Document, Column = Field.

Document

Do you remember that the company scrapes the web? Elasticsearch stores data as JSON documents. JSON is flexible and easy for humans to read. That supports our work: we grab unstructured data from websites and store it as JSON. The equivalent of a Row (SQL) in Elasticsearch is a Document (Figure 1).

Figure 2. “Users” type

In Figure 2, we created a “Users” type whose Documents (Rows) hold values such as Luke, Petra, 20, 21, male, etc. The elements in Figure 2 are therefore:

Type = USERS.
Document = Luke, Petra, 20, 21, 1, 2, Male, Female, etc.
Field = ID, Name, Age, Gender, Email.

It is simply a new way of naming our data. As JSON, a Document looks like this:

{
"id": 1,
"name": "Luke",
"gender": "M",
"email": "luke@gmail.com"
}
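To make the Row-to-Document idea concrete, here is a minimal Python sketch that turns SQL-style rows into JSON documents using only the standard library. It is an illustration under assumptions, not part of Elasticsearch itself: the helper name `row_to_document` is made up, the ages are paired with the names in the order they are listed above, and Petra's gender code and email address are hypothetical placeholders.

```python
import json

# Column names of the "Users" table (the Fields, in Elasticsearch terms).
FIELDS = ["id", "name", "age", "gender", "email"]

# Two SQL-style rows. Petra's gender code and email are hypothetical values
# added only so that the second row is complete.
ROWS = [
    (1, "Luke", 20, "M", "luke@gmail.com"),
    (2, "Petra", 21, "F", "petra@example.com"),
]


def row_to_document(row):
    """Pair each column name with its value to form one document."""
    return dict(zip(FIELDS, row))


documents = [row_to_document(row) for row in ROWS]

for doc in documents:
    # Each printed line is one JSON document, the equivalent of one row.
    print(json.dumps(doc))
```

Each dictionary here plays the role of one Document, and its keys play the role of Fields.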

Index

An index in Elasticsearch is the equivalent of a database in SQL. Do not confuse it with a database index. Our data is stored in Elasticsearch indexes much as you store data in databases. An Elasticsearch index name must be lowercase and unique.
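The naming rule can be checked before you ever talk to a cluster. The sketch below is a simplified, local validator: the function name `is_valid_index_name` is hypothetical, and the set of forbidden characters is my reading of Elasticsearch's documented index naming restrictions, so treat it as an assumption rather than the authoritative list. Uniqueness cannot be checked locally, since only the cluster knows which indexes already exist.

```python
# Characters assumed to be rejected in index names (simplified from the
# documented Elasticsearch naming restrictions).
FORBIDDEN = set('\\/*?"<>| ,#:')


def is_valid_index_name(name):
    """Return True if `name` passes a simplified set of naming checks."""
    if not name or name in (".", ".."):
        return False
    if name != name.lower():  # index names must be lowercase
        return False
    if name[0] in "-_+":  # must not start with -, _ or +
        return False
    if any(ch in FORBIDDEN for ch in name):
        return False
    # Names are limited to 255 bytes.
    return len(name.encode("utf-8")) <= 255


for candidate in ["users", "Users", "_users", "web pages"]:
    print(candidate, is_valid_index_name(candidate))
```

Only `users` passes; the other three fail on the uppercase letter, the leading underscore, and the space respectively.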

Type

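A type is the equivalent of a database table. In Figure 2, we created a “USERS” type. Different types separate different kinds of data, so an index can hold related data in more than one type. (This applies to older Elasticsearch versions: since 6.0 a new index can hold only one type, and 8.0 removed types.) Let's see how documents of two types look as JSON and how they correlate: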

{
"articleid": 1,
"name": "Futbol-ball"
}

This document belongs to the article type. The next block shows a document of the comment type.

{
"commentid": "RxftPwUwere-rTs",
"articleid": 1,
"comment": "Best price-quality ball"
}
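The two documents correlate through the shared "articleid" field. As a rough illustration, the Python sketch below groups comments under their article on the client side. This is plain Python, not an Elasticsearch feature, and the helper name `comments_by_article` is hypothetical.

```python
from collections import defaultdict

# One document of the article type and one of the comment type.
articles = [{"articleid": 1, "name": "Futbol-ball"}]
comments = [
    {
        "commentid": "RxftPwUwere-rTs",
        "articleid": 1,
        "comment": "Best price-quality ball",
    }
]


def comments_by_article(comment_docs):
    """Group comment texts by the articleid they refer to."""
    grouped = defaultdict(list)
    for doc in comment_docs:
        grouped[doc["articleid"]].append(doc["comment"])
    return grouped


grouped = comments_by_article(comments)

for article in articles:
    # Look up the comments that share this article's articleid.
    print(article["name"], grouped[article["articleid"]])
```

The "articleid" field works like a foreign key in SQL: it is the only link between the two kinds of document.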

We have entered the beautiful world of Elasticsearch. Now that we know its terminology, we can move on to the basics of this powerful analytical search engine. That is the next post.

Please buy me a coffee, give me a clap 👏, and follow me!

Geomario

👨‍💻 Software & Data Developer | Software Research Engineer | MLE