Elasticsearch is an open-source full-text search and analytics engine used to store, search, and analyze large volumes of data in near real time. It exposes a RESTful API, meaning you can manipulate data using standard HTTP methods against HTTP URIs. It is built on Apache Lucene, which is licensed under the Apache 2.0 license. Common Elasticsearch use cases include log analytics, site search, full-text search, and security intelligence.
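For a quick taste of the REST interface, here is a minimal example against a running node (the index name books and the document content are purely illustrative, and the node is assumed to be listening on localhost:9200):
# Index a document with an HTTP PUT against a REST URI
curl -X PUT 'http://localhost:9200/books/_doc/1' -H 'Content-Type: application/json' -d '{"title": "Elasticsearch in Action", "year": 2015}'
# Retrieve the same document back with an HTTP GET
curl -X GET 'http://localhost:9200/books/_doc/1?pretty'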
In this article, I will demonstrate how to set up a three-node Elasticsearch cluster on Rocky Linux / AlmaLinux 8 using Ansible. This tutorial will help Linux users install and configure a highly available multi-node Elasticsearch cluster.
With 3 nodes, we will have the following:
- Master node – Handles cluster-wide operations such as cluster management and the allocation of data shards to the data nodes.
- Data nodes – Hold the actual shards of indexed data. They handle CRUD, search, and aggregation operations, and therefore consume more memory and I/O.
In this tutorial, we will use 1 Master node and 2 Data nodes.
Step 1 – Prepare the servers
Before we begin ensure:
- You have 3 servers and a client machine installed and updated.
| # | Hostname | Host IP | Server Role |
|---|----------|---------|-------------|
| 1 | client | 192.168.1.15 | Ansible control node |
| 2 | master | 172.16.120.129 | Master node |
| 3 | data-01 | 172.16.120.130 | Data node 1 |
| 4 | data-02 | 172.16.120.131 | Data node 2 |
- A user with sudo privileges or root.
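If the nodes are not yet updated, here is a minimal sketch, to be run on each of the three Elasticsearch nodes (on Rocky/AlmaLinux 8, yum is an alias for dnf):
sudo dnf -y update
# Reboot only if the update pulled in a new kernel
sudo reboot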
Step 2 – Install Ansible on the local system
In this tutorial, we will use Ansible to set up the Elasticsearch cluster. Install Ansible on the client machine for easy administration.
sudo yum -y install epel-release
sudo yum install ansible -y
Confirm Ansible installation:
$ ansible --version
ansible [core 2.16.3]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3.12/site-packages/ansible
ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/bin/ansible
python version = 3.12.3 (main, Jul 2 2024, 20:57:30) [GCC 8.5.0 20210514 (Red Hat 8.5.0-22)] (/usr/bin/python3.12)
jinja version = 3.1.2
libyaml = True
Step 3 – Import Elasticsearch Ansible Role
After a successful Ansible installation, we now import Elasticsearch Ansible role on the Client machine.
Install the latest version of the role:
ansible-galaxy install elastic.elasticsearch
Or install a specific version, for example:
$ ansible-galaxy install elastic.elasticsearch,v7.17.0
- downloading role 'elasticsearch', owned by elastic
- downloading role from https://github.com/elastic/ansible-elasticsearch/archive/v7.17.0.tar.gz
- extracting elastic.elasticsearch to /root/.ansible/roles/elastic.elasticsearch
- elastic.elasticsearch (v7.17.0) was installed successfully
Here, 7.17.0 was the latest release at the time of writing. You can check the latest available version of Elasticsearch from the releases page.
Confirm the added role in the ~/.ansible/roles directory:
$ ls ~/.ansible/roles
elastic.elasticsearch
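You can also list installed roles directly with ansible-galaxy; the output should resemble the following:
$ ansible-galaxy role list
# /root/.ansible/roles
- elastic.elasticsearch, v7.17.0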
Next, we need to configure SSH access to the Elasticsearch cluster hosts as shown below:
vim ~/.ssh/config
Edit your configuration to resemble this:
# Elasticsearch master node
Host master
    Hostname 172.16.120.129
    User rocky

# Elasticsearch data nodes
Host data-01
    Hostname 172.16.120.130
    User rocky

Host data-02
    Hostname 172.16.120.131
    User rocky
Also, set up passwordless SSH from the client machine to all three nodes.
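If the client machine does not yet have an SSH key pair, generate one first (press Enter to accept the default path; a passphrase is optional):
ssh-keygen -t rsa -b 4096
Then copy the public key to each node, replacing username@remotehost with your own details: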
### Master node ###
ssh-copy-id [email protected]
### Data nodes ###
ssh-copy-id [email protected]
ssh-copy-id [email protected]
Confirm that the keys were added successfully by connecting over SSH without a password.
The master node:
$ ssh [email protected]
Last login: Sun Jul 18 08:11:53 2021 from 192.168.1.15
[rocky@master ~]$
In case your private key has a passphrase, add it to the SSH agent to avoid repeated prompts:
$ eval `ssh-agent -s` && ssh-add
Agent pid 2862948
Identity added: /home/ethicalhacker/.ssh/id_rsa (/home/ethicalhacker/.ssh/id_rsa)
Step 4 – Run Elasticsearch Playbook
With all the above configurations set, we now create a playbook and run it.
vim elk.yml
Edit your content as below:
- hosts: el-master-nodes
  roles:
    - role: elastic.elasticsearch
      vars:
        es_enable_xpack: false
        es_data_dirs:
          - "/data/elasticsearch/data"
        es_log_dir: "/data/elasticsearch/logs"
        es_java_install: true
        es_heap_size: "1g"
        es_config:
          cluster.name: "el-cluster"
          cluster.initial_master_nodes: "172.16.120.129:9300"
          discovery.seed_hosts: "172.16.120.129:9300"
          http.port: 9200
          node.data: false
          node.master: true
          bootstrap.memory_lock: false
          network.host: '0.0.0.0'
        es_plugins:
          - plugin: ingest-attachment

- hosts: el-data-nodes
  roles:
    - role: elastic.elasticsearch
      vars:
        es_enable_xpack: false
        es_data_dirs:
          - "/data/elasticsearch/data"
        es_log_dir: "/data/elasticsearch/logs"
        es_java_install: true
        es_config:
          cluster.name: "el-cluster"
          discovery.seed_hosts: "172.16.120.129:9300"
          http.port: 9200
          node.data: true
          node.master: false
          bootstrap.memory_lock: false
          network.host: '0.0.0.0'
        es_plugins:
          - plugin: ingest-attachment
In the above configuration:
- The master node has node.master set to true and node.data set to false.
- The data nodes have node.master set to false and node.data set to true.
- Both plays point discovery.seed_hosts at the master node (172.16.120.129:9300) so that the data nodes can discover and join the cluster.
- Data shards are stored under /data/elasticsearch/data, making it easy to mount dedicated storage there as the cluster grows.
- Logs are stored in /data/elasticsearch/logs.
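Note that the data-node play does not set es_heap_size, so the role's default heap size applies there. You can peek at the role's shipped defaults to see what will be used (this assumes the standard ansible-galaxy layout under ~/.ansible/roles; the exact file holding a given default may vary):
grep -rn es_heap_size ~/.ansible/roles/elastic.elasticsearch/defaults/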
With the above configuration set, we proceed and create an Inventory file as below:
$ vim hosts
[el-master-nodes]
master
[el-data-nodes]
data-01
data-02
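Before running the playbook, it is worth confirming that Ansible can reach all three nodes over SSH:
ansible -i hosts all -m ping
Each host should respond with SUCCESS and "ping": "pong".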
Now run the playbook:
ansible-playbook -i hosts elk.yml
Sample output:
TASK [elastic.elasticsearch : remove x-pack plugin directory when it isn't a plugin] ***
ok: [master]
ok: [data-02]
ok: [data-01]
TASK [elastic.elasticsearch : Check installed elasticsearch plugins] ***********
ok: [data-02]
ok: [master]
ok: [data-01]
TASK [elastic.elasticsearch : set fact plugins_to_remove to install_plugins.stdout_lines] ***
skipping: [data-01]
skipping: [data-02]
skipping: [master]
.............................
This takes quite some time. On completion, you should see output like this:
TASK [elastic.elasticsearch : Copy templates to elasticsearch] *****************
TASK [elastic.elasticsearch : Install templates] *******************************
PLAY RECAP *********************************************************************
data-01 : ok=75 changed=3 unreachable=0 failed=0 skipped=243 rescued=0 ignored=0
data-02 : ok=75 changed=3 unreachable=0 failed=0 skipped=242 rescued=0 ignored=0
master : ok=77 changed=6 unreachable=0 failed=0 skipped=242 rescued=0 ignored=0
In case Elasticsearch fails to start within systemd's default timeout, edit the service unit on the affected nodes and increase the start timeout:
sudo vi /usr/lib/systemd/system/elasticsearch.service
In the [Service] section of the file, add the line TimeoutStartSec=500:
# Disable timeout logic and wait until process is stopped
TimeoutStopSec=0
TimeoutStartSec=500
Reload the daemon
sudo systemctl daemon-reload
You should now see the new timeout value:
$ sudo systemctl show elasticsearch | grep ^Timeout
TimeoutStartUSec=8min 20s
TimeoutStopUSec=infinity
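With the longer timeout in place, restart the service on the affected node and confirm it comes up:
sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch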
Sample illustration after a successful deployment with the Ansible playbook.
Step 5 – Confirm Elasticsearch cluster installation
Log in to the master node:
ssh master
Then check the health status:
$ curl http://localhost:9200/_cluster/health?pretty
{
"cluster_name" : "el-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 2,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Check the master node:
$ curl -XGET 'http://localhost:9200/_cat/master'
3yCGVlAFSDGguEkO8WqSgw 172.16.120.129 172.16.120.129 master
Check the data nodes:
$ curl -XGET 'http://localhost:9200/_cat/nodes'
172.16.120.129 17 94 0 0.00 0.10 0.10 imr * master
172.16.120.130 10 47 0 0.00 0.03 0.03 di - data-01
172.16.120.131 6 30 0 0.00 0.02 0.02 di - data-02
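As a final smoke test, you can create a throwaway index and confirm that its shards are allocated to the data nodes (the name test-index is just an example):
# Create a test index
curl -X PUT 'http://localhost:9200/test-index?pretty'
# See which nodes its shards landed on
curl 'http://localhost:9200/_cat/shards/test-index?v'
# Clean up
curl -X DELETE 'http://localhost:9200/test-index?pretty'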