ES Upgrade

Arun Battepati
Dec 1, 2023

When you have a multi-node Elasticsearch cluster with different node roles (hot, cold, master), performing a backup and upgrade involves some additional considerations. Below is a step-by-step guide for backing up and upgrading an Elasticsearch cluster with hot, cold, and master nodes.

1. Backup:

a. Snapshot and Restore:

  1. Take a Snapshot:
  • Use the Snapshot API to create a snapshot of your indices. The snapshot is written to a repository (my_backup in the request below) that must already be registered and backed by a shared file system or cloud storage.
  • PUT /_snapshot/my_backup/snapshot_1
    { "indices": "index1,index2", "ignore_unavailable": true, "include_global_state": false }
  2. Verify Snapshot:
  • Confirm that the snapshot completed and that its state is SUCCESS with no failed shards.
  • GET /_snapshot/my_backup/snapshot_1
  3. Back Up Configuration Files:
  • Keep a copy of your Elasticsearch configuration files, especially elasticsearch.yml, as they may contain custom settings.
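
The snapshot and verification requests above can be sketched in Python as follows. This is a minimal illustration, not a full client: the helper names build_snapshot_request and snapshot_problems are hypothetical, and the sample response only mimics the shape of a GET /_snapshot/my_backup/snapshot_1 reply. Sending the request (with authentication and TLS) is left out on purpose.

```python
import json


def build_snapshot_request(repo, snapshot, indices):
    """Return (method, path, body) for a create-snapshot call."""
    body = {
        "indices": ",".join(indices),
        "ignore_unavailable": True,
        "include_global_state": False,
    }
    return "PUT", f"/_snapshot/{repo}/{snapshot}", body


def snapshot_problems(get_response, snapshot):
    """Inspect a parsed GET /_snapshot/<repo>/<snapshot> response.

    Returns a list of human-readable problems; an empty list means
    the snapshot looks usable.
    """
    matches = [s for s in get_response.get("snapshots", [])
               if s.get("snapshot") == snapshot]
    if not matches:
        return [f"snapshot {snapshot!r} not found"]
    info = matches[0]
    problems = []
    if info.get("state") != "SUCCESS":
        problems.append(f"state is {info.get('state')}, expected SUCCESS")
    failed = info.get("shards", {}).get("failed", 0)
    if failed:
        problems.append(f"{failed} shard(s) failed")
    return problems


if __name__ == "__main__":
    method, path, body = build_snapshot_request(
        "my_backup", "snapshot_1", ["index1", "index2"])
    print(method, path, json.dumps(body))

    # Hand-written sample in the shape of a real response (assumption).
    sample = {"snapshots": [{"snapshot": "snapshot_1", "state": "SUCCESS",
                             "shards": {"total": 10, "failed": 0,
                                        "successful": 10}}]}
    print(snapshot_problems(sample, "snapshot_1"))
```

Treating anything other than SUCCESS (for example PARTIAL) as a problem is a deliberately strict choice: a backup taken before an upgrade should be complete.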

2. Upgrade:

a. Upgrade Process:

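  1. Prepare for Upgrade:
  • Review the Elasticsearch upgrade documentation for your source and target versions, including any breaking changes.
  • For a rolling upgrade, keep the cluster running and take down only one node at a time. Before stopping a node, disable replica shard allocation (set cluster.routing.allocation.enable to primaries) so the cluster does not start moving shards while the node is offline. Stop all nodes at once only if you are doing a full cluster restart.
  2. Upgrade Each Node:
  • Upgrade one node at a time, one node type at a time. Elastic's guidance is to upgrade the data nodes first (cold nodes, then hot nodes) and the master-eligible nodes last: upgraded nodes can join a cluster with an older master, but older nodes cannot always join a cluster whose master has already been upgraded.
  • On each node, stop Elasticsearch, install the new version, upgrade any plugins, and start the node again.
  3. Check Cluster Health:
  • After each node rejoins, re-enable shard allocation and wait for the cluster health to recover before moving on to the next node.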
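
A rolling upgrade plan can be sketched as below, following Elastic's documented order (data tiers first, master-eligible nodes last). The sketch assumes the nodes use the data tier roles (data_cold, data_hot, and so on) available since Elasticsearch 7.10; clusters that separate hot and cold nodes with custom node attributes would need a different ranking. The node names, upgrade_rank, upgrade_order, and steps_for_node are all hypothetical, and the "MANUAL" step stands in for whatever package or container tooling you use.

```python
# Lower rank = upgraded earlier. Ranking values are an illustration only.
TIER_RANK = {"data_frozen": 0, "data_cold": 1, "data_warm": 2,
             "data_hot": 3, "data": 3}
OTHER_RANK = 4    # nodes with no data tier role and not master-eligible
MASTER_RANK = 5   # master-eligible nodes go last

# Request bodies for the cluster settings API.
DISABLE_REPLICA_ALLOCATION = {
    "persistent": {"cluster.routing.allocation.enable": "primaries"}}
RESET_ALLOCATION = {
    "persistent": {"cluster.routing.allocation.enable": None}}


def upgrade_rank(roles):
    """Rank a node by its roles; master-eligible nodes always come last."""
    if "master" in roles:
        return MASTER_RANK
    ranks = [TIER_RANK[r] for r in roles if r in TIER_RANK]
    # A node serving several tiers is upgraded with its hottest tier.
    return max(ranks) if ranks else OTHER_RANK


def upgrade_order(nodes):
    """nodes maps node name -> list of roles; returns names in upgrade order."""
    return [name for name, roles in
            sorted(nodes.items(), key=lambda kv: (upgrade_rank(kv[1]), kv[0]))]


def steps_for_node(name):
    """Per-node sequence as (method, path or description, body) tuples."""
    return [
        ("PUT", "/_cluster/settings", DISABLE_REPLICA_ALLOCATION),
        ("POST", "/_flush", None),
        ("MANUAL", f"stop, upgrade and restart {name}", None),
        ("PUT", "/_cluster/settings", RESET_ALLOCATION),
        ("GET", "/_cluster/health?wait_for_status=green&timeout=60s", None),
    ]


if __name__ == "__main__":
    cluster = {
        "master-1": ["master"],
        "hot-1": ["data_hot", "data_content"],
        "hot-2": ["data_hot", "data_content"],
        "cold-1": ["data_cold"],
    }
    for node in upgrade_order(cluster):
        for method, target, _body in steps_for_node(node):
            print(node, method, target)
```

Setting cluster.routing.allocation.enable back to null restores the default rather than hard-coding "all", which keeps the cluster settings clean after the upgrade.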

3. Post-Upgrade Steps:

  1. Update Index Settings:
  • If the new version deprecates or changes any index settings you rely on, update them as needed.
  2. Verify Plugins:
  • Confirm that all custom plugins are installed on every node in a build compatible with the new Elasticsearch version, and update them if necessary.
  3. Reconfigure Nodes:
  • Adjust node configurations, such as heap size, if recommended for the new version.
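
The plugin check can be partly automated. The sketch below assumes you have already fetched GET /_cat/plugins?format=json, which returns one row per installed plugin per node with name (the node name) and component (the plugin name) fields. The function name missing_plugins and the sample node and plugin names are hypothetical.

```python
def missing_plugins(cat_plugins_rows, node_names, required):
    """Return {node: [missing plugin, ...]} for nodes lacking required plugins.

    cat_plugins_rows: parsed JSON from GET /_cat/plugins?format=json
    node_names:       every node expected in the cluster
    required:         plugin names that must be present on each node
    """
    installed = {}
    for row in cat_plugins_rows:
        installed.setdefault(row["name"], set()).add(row["component"])

    report = {}
    for node in node_names:
        missing = sorted(set(required) - installed.get(node, set()))
        if missing:
            report[node] = missing
    return report


if __name__ == "__main__":
    rows = [
        {"name": "hot-1", "component": "analysis-icu", "version": "8.11.1"},
        {"name": "hot-1", "component": "repository-gcs", "version": "8.11.1"},
        {"name": "cold-1", "component": "analysis-icu", "version": "8.11.1"},
    ]
    print(missing_plugins(rows, ["hot-1", "cold-1", "master-1"],
                          ["analysis-icu", "repository-gcs"]))
```

Passing the expected node list separately matters because a node with no plugins at all does not appear in the _cat/plugins output.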

4. Restore:

a. Restore Process:

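  1. Keep the Cluster Running:
  • A restore is only needed if the upgrade went wrong or data was lost. The restore API is served by the cluster itself, so do not stop the nodes: the cluster needs an elected master and enough data nodes to hold the restored shards. To prevent data changes during the restore, pause indexing from your applications instead.
  2. Restore Snapshot:
  • Restore the snapshot created earlier. An index that already exists in the cluster must be closed or deleted before it can be restored, or the snapshot copy must be restored under a different name.
  • POST /_snapshot/my_backup/snapshot_1/_restore
  3. Verify Restore:
  • Confirm that the indices are successfully restored and check the cluster health.
  • GET /_cat/indices
  • GET /_cat/health
  4. Resume Traffic:
  • Re-enable indexing and point your applications back at the cluster.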
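
A restore request can be sketched as follows. The restore API runs against a live cluster, and this example uses its rename_pattern and rename_replacement options to restore under a new name so that existing open indices are not touched. The helper name build_restore_request and the restored_ prefix are hypothetical; sending the request is left to whichever HTTP client you use.

```python
import json


def build_restore_request(repo, snapshot, indices, rename_prefix=None):
    """Return (method, path, body) for a snapshot restore call.

    If rename_prefix is given, every restored index is renamed to
    <rename_prefix><original name> instead of replacing the original.
    """
    body = {
        "indices": ",".join(indices),
        "ignore_unavailable": True,
        "include_global_state": False,
    }
    if rename_prefix:
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = f"{rename_prefix}$1"
    return "POST", f"/_snapshot/{repo}/{snapshot}/_restore", body


if __name__ == "__main__":
    method, path, body = build_restore_request(
        "my_backup", "snapshot_1", ["index1", "index2"],
        rename_prefix="restored_")
    print(method, path)
    print(json.dumps(body, indent=2))
```

Restoring under a prefix lets you compare the snapshot copy with the live index before deciding which one to keep.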

5. Verify and Monitor:

  1. Cluster Health:
  • Monitor the cluster health to ensure it returns to a ‘green’ state.
  • GET /_cat/health
  2. Application Testing:
  • Test your applications against the restored data to ensure everything is functioning correctly.
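
Waiting for the cluster to turn green can be scripted. The sketch below polls the JSON endpoint GET /_cluster/health (rather than the text-oriented /_cat/health) and reads its status field. The names wait_for_green and http_health_fetcher are hypothetical, and the fetcher assumes an unauthenticated HTTP endpoint such as http://localhost:9200; a secured cluster needs credentials and TLS settings added.

```python
import json
import time
import urllib.request


def http_health_fetcher(base_url):
    """Build a function that fetches and parses GET /_cluster/health."""
    def fetch():
        url = f"{base_url}/_cluster/health"
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    return fetch


def wait_for_green(fetch_health, attempts=10, delay_seconds=5.0,
                   sleep=time.sleep):
    """Poll until the cluster reports green or attempts run out.

    Returns (is_green, last_status).
    """
    status = None
    for attempt in range(attempts):
        status = fetch_health().get("status")
        if status == "green":
            return True, status
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False, status


if __name__ == "__main__":
    # Simulated responses so the example runs without a cluster.
    responses = iter([{"status": "yellow"}, {"status": "green"}])
    print(wait_for_green(lambda: next(responses), sleep=lambda _s: None))
```

Against a real cluster you would pass http_health_fetcher("http://localhost:9200") as the first argument. The health API also accepts wait_for_status=green with a timeout, which moves the waiting to the server side.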

6. Additional Considerations:

  • Cold Node Migration:
  • If you’re using cold nodes, consider any necessary steps for migrating data between hot and cold nodes based on your cluster’s architecture.
  • Upgrade Kibana and Logstash:
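  • Upgrade Kibana and Logstash after completing the Elasticsearch upgrade, following each product's own upgrade guide. Kibana in particular should run the same version as Elasticsearch, so check whether its instances need to be stopped together rather than upgraded one at a time.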
  • Documentation:
  • Update your documentation to reflect the new Elasticsearch version and any changes made during the upgrade.

Always refer to the official Elasticsearch documentation for your specific version for the most accurate and up-to-date information on upgrading and backup procedures.

Rolling Upgrade vs. Full Cluster Restart

Both rolling upgrades and full cluster restarts are methods for updating a cluster with new software. However, they have different advantages and disadvantages:

Rolling Upgrade:

Pros:

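  • Zero downtime: As long as every index has at least one replica, applications and services continue to run during the upgrade, minimizing disruption to users.
  • Lower risk: Problems surface on one node at a time, so the upgrade can be paused before the entire cluster is affected. Note that Elasticsearch does not support downgrading a node in place; going back to the previous version means restoring a snapshot into a cluster running that version.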
  • Gradual resource utilization: The upgrade process is spread out over time, which can help to avoid resource spikes and performance issues.
  • Scalability: Rolling upgrades can be easily scaled to large clusters.

Cons:

  • Complexity: Rolling upgrades can be more complex to set up and manage than full cluster restarts.
  • Potential for errors: While the upgrade is in progress the cluster runs two versions side by side, and any error in that window can affect applications and services that are still live.
  • Longer duration: Rolling upgrades can take longer than full cluster restarts.

Full Cluster Restart:

Pros:

  • Simplicity: Full cluster restarts are simple to set up and manage.
  • Reduced risk of errors: All nodes come back on the same version, so the cluster never has to operate in a mixed-version state.
  • Faster completion: Full cluster restarts can be completed faster than rolling upgrades.

Cons:

  • Downtime: Applications and services are unavailable during the restart, which can disrupt users.
  • Higher risk of failure: If the new software fails to start, the entire cluster will be unavailable.
  • Resource spikes: Restarting the entire cluster can cause resource spikes, which can lead to performance issues.
  • Limited scalability: Full cluster restarts are not as scalable as rolling upgrades.

Which method to choose depends on several factors, including:

  • The size and complexity of the cluster.
  • The availability requirements of the applications and services.
  • The risk tolerance of the organization.
  • The experience of the IT staff.

Here are some additional tips for choosing the right method:

  • Use a rolling upgrade for critical applications and services that require high availability.
  • Use a full cluster restart for smaller clusters or applications that can tolerate some downtime.
  • Consider using a hybrid approach that combines elements of both methods.
  • Test the upgrade procedure thoroughly before deploying it to production.
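
One concrete input to this decision is whether every index has a replica, because a rolling upgrade only avoids downtime for indices that can be served from another node while one is offline. The sketch below assumes you have fetched GET /_cat/indices?format=json, where each row carries index and rep (replica count, as a string) fields. The function name indices_without_replicas and the sample index names are hypothetical.

```python
def indices_without_replicas(cat_indices_rows):
    """Return the names of indices configured with zero replicas.

    cat_indices_rows: parsed JSON from GET /_cat/indices?format=json.
    A missing or null replica count is treated as zero so that the
    index gets flagged for a manual look.
    """
    flagged = []
    for row in cat_indices_rows:
        replicas = int(row.get("rep") or 0)
        if replicas == 0:
            flagged.append(row["index"])
    return sorted(flagged)


if __name__ == "__main__":
    rows = [
        {"index": "index1", "health": "green", "pri": "3", "rep": "1"},
        {"index": "index2", "health": "green", "pri": "1", "rep": "0"},
    ]
    at_risk = indices_without_replicas(rows)
    if at_risk:
        print("Unavailable while their node restarts:", at_risk)
    else:
        print("Every index has at least one replica.")
```

If the list is not empty, either add replicas before starting or accept that those indices will be briefly unavailable while their node restarts.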
