ES Upgrade

Dec 1, 2023

When you have a multi-node Elasticsearch cluster with different node roles (hot, cold, master), performing a backup and upgrade involves some additional considerations. Below is a step-by-step guide for backing up and upgrading an Elasticsearch cluster with hot, cold, and master nodes.

1. Backup:

a. Snapshot and Restore:

Take a Snapshot:

Use the Snapshot API to create a snapshot of your indices. The snapshot is written to a repository (my_backup in the request below) that must already be registered with the cluster; the repository can live on a shared file system or in cloud storage.
PUT /_snapshot/my_backup/snapshot_1
{
  "indices": "index1,index2",
  "ignore_unavailable": true,
  "include_global_state": false
}

Verify Snapshot:

Confirm that the snapshot completed: the response should report a state of SUCCESS with no failed shards.
GET /_snapshot/my_backup/snapshot_1
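The check above is easy to automate. Here is a minimal sketch, assuming the JSON body of the GET request has already been parsed into a Python dict; the function name snapshot_succeeded and the sample response are illustrative, not part of any Elasticsearch client.

```python
def snapshot_succeeded(response: dict, name: str) -> bool:
    """Return True if the named snapshot finished with no failed shards.

    `response` is assumed to be the parsed body of
    GET /_snapshot/<repository>/<snapshot>.
    """
    for snap in response.get("snapshots", []):
        if snap.get("snapshot") == name:
            failed = snap.get("shards", {}).get("failed", 0)
            return snap.get("state") == "SUCCESS" and failed == 0
    # The snapshot is not listed in the response at all.
    return False


# Hypothetical response, trimmed to the fields the check reads.
sample_response = {
    "snapshots": [
        {
            "snapshot": "snapshot_1",
            "state": "SUCCESS",
            "shards": {"total": 10, "failed": 0, "successful": 10},
        }
    ]
}

if __name__ == "__main__":
    print(snapshot_succeeded(sample_response, "snapshot_1"))
```

A snapshot in state PARTIAL or IN_PROGRESS makes the function return False, which is the safe answer before an upgrade.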

Backup Configuration Files:

Ensure you have a copy of your Elasticsearch configuration files, especially elasticsearch.yml, as they may contain custom settings.
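One way to keep that copy is a small script run on each node before the upgrade. This is only a sketch: the helper name backup_config is made up for illustration, and the paths in the usage note are assumptions (the config directory depends on how Elasticsearch was installed).

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_config(config_dir: str, backup_root: str) -> Path:
    """Copy the whole config directory into a new timestamped folder.

    Returns the path of the copy. Copying the directory rather than a
    single file also picks up jvm.options and log4j2.properties.
    """
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"es-config-{stamp}"
    # copytree creates `target` (and missing parents) and fails if it exists.
    shutil.copytree(config_dir, target)
    return target
```

For example, backup_config("/etc/elasticsearch", "/var/backups") would be a typical call on a package-based install, assuming those paths exist and are readable by the user running the script.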

2. Upgrade:

a. Upgrade Process:

Prepare for Upgrade:

Review the Elasticsearch documentation for any specific upgrade considerations.
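Ensure that all nodes, including hot, cold, and master nodes, are running and that the cluster is green before you begin. Do not stop every node at once: a rolling upgrade takes down one node at a time, and only a full cluster restart requires stopping the whole cluster.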

Upgrade Each Node:

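Perform the rolling upgrade one node at a time, working through each node type separately. Elastic's rolling-upgrade guidance is to upgrade the data nodes first, tier by tier (cold nodes before hot nodes), and the master-eligible nodes last, because upgraded nodes can join a cluster with an older master but older nodes cannot always join a cluster with a newer master.
For each node: disable replica shard allocation (set cluster.routing.allocation.enable to primaries), stop the node, upgrade Elasticsearch and its plugins, restart the node, then re-enable allocation and wait for the cluster to recover before moving to the next node.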
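To plan the sequence, it can help to sort the node list by role before touching anything. The sketch below follows Elastic's documented rolling-upgrade order (data tiers from frozen through hot, then nodes outside any tier, then master-eligible nodes last). The role strings are the ones Elasticsearch reports in GET /_nodes; the node names and the helper functions are hypothetical.

```python
# Data tiers in the order they should be upgraded.
TIER_ORDER = ["data_frozen", "data_cold", "data_warm", "data_hot"]


def upgrade_rank(roles) -> int:
    """Lower rank means upgrade earlier; master-eligible nodes go last."""
    roles = set(roles)
    if "master" in roles:
        return len(TIER_ORDER) + 1
    for position, tier in enumerate(TIER_ORDER):
        if tier in roles:
            return position
    # Nodes outside any data tier (coordinating, ingest, ...) come
    # after the data tiers but before the master-eligible nodes.
    return len(TIER_ORDER)


def upgrade_order(nodes: dict) -> list:
    """Sort node names by upgrade rank, then by name for a stable plan."""
    return sorted(nodes, key=lambda name: (upgrade_rank(nodes[name]), name))


# Hypothetical cluster: node name -> roles.
cluster_nodes = {
    "master-1": ["master"],
    "hot-1": ["data_hot", "data_content"],
    "cold-1": ["data_cold"],
    "coord-1": [],
}

if __name__ == "__main__":
    print(upgrade_order(cluster_nodes))
```

A node that is both master-eligible and a data node is ranked with the masters here; that is a conservative choice for this sketch, not a rule taken from the documentation.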

Check Cluster Health:

Monitor the cluster health after each node upgrade to ensure stability.
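A scripted version of this check might look like the sketch below. It assumes the dict is the parsed body of GET /_cluster/health; the function name safe_to_continue is illustrative. It treats yellow as acceptable once no shards are initializing or relocating, since in a mixed-version cluster some replicas may stay unassigned until more nodes are upgraded.

```python
def safe_to_continue(health: dict) -> bool:
    """Decide whether the next node can be upgraded.

    green  -> proceed
    yellow -> proceed only if no shards are still moving or recovering
    red    -> stop and investigate
    """
    status = health.get("status")
    if status == "green":
        return True
    if status == "yellow":
        moving = health.get("relocating_shards", 0)
        recovering = health.get("initializing_shards", 0)
        return moving == 0 and recovering == 0
    # red, or a response without a status field
    return False
```

Treat the yellow branch as a judgment call: on a cluster where every index has replicas and enough upgraded nodes exist, waiting for green before each step is the stricter option.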

3. Post-Upgrade Steps:

Update Index Settings:

If there are any changes in index settings due to the upgrade, update them as needed.

Verify Plugins:

Confirm that all custom plugins are compatible with the new Elasticsearch version and update them if necessary.

Reconfigure Nodes:

Adjust node configurations, such as heap size, if recommended for the new version.

4. Restore:

a. Restore Process:

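Prepare Target Indices:

Restore only if the upgrade left data missing or corrupted. Do not stop the nodes: the restore API needs a running cluster. Instead, pause indexing into the affected indices, and close or delete any existing index that has the same name as an index in the snapshot, because a restore cannot overwrite an open index.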

Restore Snapshot:

Restore the snapshot created earlier.
POST /_snapshot/my_backup/snapshot_1/_restore
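The restore request also accepts a JSON body that limits which indices are restored and can rename them on the way in. Here is a minimal sketch of building that body; the field names are the restore API's own, while the helper build_restore_body and the restored_ prefix are assumptions made for illustration.

```python
import json


def build_restore_body(indices, rename_prefix=None) -> dict:
    """Build the JSON body for POST /_snapshot/<repo>/<snapshot>/_restore."""
    body = {
        "indices": ",".join(indices),
        "ignore_unavailable": True,
        # Leave cluster-wide settings and templates untouched.
        "include_global_state": False,
    }
    if rename_prefix:
        # Restore each index under a new name so the live index with
        # the original name does not have to be closed or deleted.
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = f"{rename_prefix}$1"
    return body


if __name__ == "__main__":
    print(json.dumps(build_restore_body(["index1", "index2"], "restored_"), indent=2))
```

Restoring under a prefix lets you compare the restored copy against the live index before deciding which one to keep.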

Verify Restore:

Confirm that the indices are successfully restored and check the cluster health.
GET /_cat/indices
GET /_cat/health

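Resume Traffic:

The nodes stay up throughout a restore, so there is nothing to start. Re-enable any indexing or ingest that you paused before the restore.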

5. Verify and Monitor:

Cluster Health:

Monitor the cluster health to ensure it returns to a ‘green’ state.
GET /_cat/health
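If you want a script to block until the cluster is green, a polling loop is enough. The sketch below is deliberately decoupled from any HTTP client: fetch_health is a hypothetical callable that you supply and that returns the parsed body of GET /_cluster/health. The sleep and clock parameters exist only so the loop can be exercised without real waiting.

```python
import time


def wait_for_green(fetch_health, timeout_s=300.0, interval_s=5.0,
                   sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll until the cluster reports green or the timeout expires.

    Returns True once the status is green, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if fetch_health().get("status") == "green":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Elasticsearch can also do the waiting server-side: GET /_cluster/health?wait_for_status=green&timeout=50s holds the request open until the status is reached or the timeout passes, which may be simpler than a client-side loop.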

Application Testing:

Test your applications against the restored data to ensure everything is functioning correctly.

6. Additional Considerations:

Cold Node Migration:
If you’re using cold nodes, consider any necessary steps for migrating data between hot and cold nodes based on your cluster’s architecture.
Upgrade Kibana and Logstash:
Upgrade Kibana and Logstash after completing the Elasticsearch upgrade, following each product's own upgrade guide and checking that the target versions are compatible with the new Elasticsearch version.
Documentation:
Update your documentation to reflect the new Elasticsearch version and any changes made during the upgrade.

Always refer to the official Elasticsearch documentation for your specific version for the most accurate and up-to-date information on upgrading and backup procedures.

Rolling Upgrade vs. Full Cluster Restart

Both rolling upgrades and full cluster restarts are methods for updating a cluster with new software. However, they have different advantages and disadvantages:

Rolling Upgrade:

Pros:

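No planned downtime: Applications and services continue to run during the upgrade, provided each index has replicas on other nodes, minimizing disruption to users.
Lower risk: Problems surface on one node at a time, so the upgrade can be paused before the entire cluster is affected. Note that Elasticsearch does not support in-place downgrades, so returning to the previous version means restoring from a snapshot.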
Gradual resource utilization: The upgrade process is spread out over time, which can help to avoid resource spikes and performance issues.
Scalability: Rolling upgrades can be easily scaled to large clusters.

Cons:

Complexity: Rolling upgrades can be more complex to set up and manage than full cluster restarts.
Potential for errors: If any errors occur during the upgrade, they can affect applications and services.
Longer duration: Rolling upgrades can take longer than full cluster restarts.

Full Cluster Restart: