Simple but realistic Elasticsearch load test before deploying

20th December 2019 – 1422 words

Load testing an Elasticsearch cluster before a migration, upgrade or similar is always a good strategy to reduce bad surprises afterwards. At pludoni GmbH, we have been using Elasticsearch since 2014 as the job search backend for all of our community websites (Empfehlungsbund), as well as for blog article search and keyword optimization.

Recently, I prepared such a migration and needed a way to verify that the cluster holds up after switching, or better yet, improves performance relative to its cost. After checking GitHub et al. for existing code, I was not satisfied and built a small script around the awesome siege tool.

Preparation - Gather good queries for test

To get a realistic performance test, I suggest grabbing the original payloads of queries that your production ES runs. I went one step further and only took a couple of the slowest queries, to boost my confidence in the result.

To do that, first enable the Slow Log in the ES settings via your UI (cerebro, kopf or whatever frontend you use) or via curl:

curl 'http://localhost:9200/index_settings/update' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json;charset=utf-8' --data \

Then, wait a while or produce slow logs via querying. Afterwards the file /var/log/elasticsearch/*_index_search_slowlog.log will fill up.
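
For reference, enabling the slow log can also be sketched in Ruby against Elasticsearch's index settings endpoint. This is a minimal sketch, not the exact call used here: the host localhost:9200, the threshold of 1s and the helper names slowlog_request and send_request are assumptions for illustration. Only the setting name index.search.slowlog.threshold.query.info comes from Elasticsearch itself.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds (without sending) a PUT request that sets the search slow log
# threshold for one index. Queries slower than `threshold` get logged
# at INFO level.
def slowlog_request(base_url, index, threshold)
  req = Net::HTTP::Put.new(URI("#{base_url}/#{index}/_settings"))
  req['Content-Type'] = 'application/json'
  req.body = JSON.generate('index.search.slowlog.threshold.query.info' => threshold)
  req
end

# Sends a prepared request to the host and port of its URI.
def send_request(req)
  Net::HTTP.start(req.uri.host, req.uri.port) { |http| http.request(req) }
end

# Usage (host, index and threshold are assumptions):
# res = send_request(slowlog_request('http://localhost:9200', 'ebsearch_production', '1s'))
# puts res.code
```

Pick a threshold low enough that your slowest production queries actually show up in the log.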

Extract the query payloads by copying all json between brackets:

[2019-12-11 04:30:29,725][INFO ][] [es01.localhost] [ebsearch_production][2] took[1.7s],
  took_millis[1774], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[<<COPY ALL BETWEEN BRACKETS>>], extra_source[],
  • Put each payload into its own file in a common folder, e.g. payloads/1, payloads/2 etc.
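
Copying the payloads by hand gets tedious with more than a handful of entries. The extraction can be sketched in Ruby as follows, assuming the slow log format shown in the sample above (one entry per line, with the payload between source[ and ], extra_source[). The helper names extract_payloads and write_payloads are made up for this example.

```ruby
require 'fileutils'

# Matches everything between "source[" and the last "], extra_source[" on a
# line. The greedy ".*" keeps square brackets inside the JSON (arrays) intact.
SOURCE_PATTERN = /source\[(.*)\], extra_source\[/

# Returns the unique, non-empty query payloads found in the slow log text.
def extract_payloads(log_text)
  log_text.each_line
          .map { |line| line[SOURCE_PATTERN, 1] }
          .compact
          .reject(&:empty?)
          .uniq
end

# Writes each payload to its own numbered file: payloads/1, payloads/2, ...
def write_payloads(payloads, dir = 'payloads')
  FileUtils.mkdir_p(dir)
  payloads.each_with_index do |payload, i|
    File.write(File.join(dir, (i + 1).to_s), payload)
  end
end

# Usage (the log file name is a hypothetical example):
# log = File.read('/var/log/elasticsearch/my_cluster_index_search_slowlog.log')
# write_payloads(extract_payloads(log))
```

Other Elasticsearch versions may use a different slow log layout, so check the pattern against a real log line first.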

Build new test cluster

One hint, if you are not using it yet: use Repository + Snapshots (S3) to quickly seed a new cluster with production-grade data.
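
As a rough sketch of that seeding step, the two requests involved (registering an S3 repository on the new cluster, then restoring an index from a snapshot) could be built in Ruby like this. The repository, bucket and snapshot names are hypothetical, the helper names are made up for this example, and depending on your Elasticsearch version the S3 repository type may require the repository-s3 plugin.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds (without sending) the request that registers an S3 snapshot repository.
def register_repo_request(base_url, repo, bucket)
  req = Net::HTTP::Put.new(URI("#{base_url}/_snapshot/#{repo}"))
  req['Content-Type'] = 'application/json'
  req.body = JSON.generate(type: 's3', settings: { bucket: bucket })
  req
end

# Builds (without sending) the request that restores one index from a snapshot.
def restore_request(base_url, repo, snapshot, index)
  req = Net::HTTP::Post.new(URI("#{base_url}/_snapshot/#{repo}/#{snapshot}/_restore"))
  req['Content-Type'] = 'application/json'
  req.body = JSON.generate(indices: index)
  req
end

# Sends a prepared request to the host and port of its URI.
def deliver(req)
  Net::HTTP.start(req.uri.host, req.uri.port) { |http| http.request(req) }
end

# Usage (all names are hypothetical):
# base = 'http://localhost:9200'
# deliver(register_repo_request(base, 'prod_backup', 'my-es-snapshots'))
# deliver(restore_request(base, 'prod_backup', 'snapshot_1', 'ebsearch_production'))
```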

Test the (old/new) cluster

The test is run by the battle-tested tool Siege, which should be easy to install from your OS package repository (Apt, Brew, etc.). Siege accepts a file with URLs to test as an input parameter. Later, we will use Siege like this:

# Concurrency: 3, for 1 minute
siege -b --log=./siege.log -H 'Content-Type: application/json' --internet --delay=15 -c 3 -t 1M --file=urls.txt

The urls.txt has the format:

http://server/index/_search POST {" payload"}

To generate urls.txt easily with all of the payloads, I’ve created a Rakefile, because Ruby is awesome. Also, we are living in 2019(+), so the Ruby that shipped with your distro should be just fine, no rvm/rbenv needed.

SERVER = ''.freeze             # host[:port] of the cluster under test
DURATION = '1M'.freeze         # 1 minute each test
CONCURRENCY_TESTS = (1..10)    # or [1, 5, 10, 20, 100] etc.
INDEX_NAME = 'ebsearch_production'.freeze

desc 'create urls.txt file with all payloads in payloads/*'
task :urls do
  # one line per payload: URL, HTTP method, request body
  out = Dir['payloads/*'].map do |pl|
    "http://#{SERVER}/#{INDEX_NAME}/_search POST #{File.read(pl).strip}"
  end
  puts "recreating urls.txt for #{SERVER} with #{out.count} requests"
  File.write('urls.txt', out.join("\n"))
end

desc 'run series!'
task :run do
  CONCURRENCY_TESTS.each do |c|
    puts "==== #{c} Concurrent ==== "

    # -m marks the log with a "****" line before each run
    sh %[siege -b -m "#{SERVER}-C#{c}" --log=./siege.log -H 'Content-Type: application/json' --internet --delay=15 -c #{c} -t #{DURATION} --file=urls.txt]
  end
end

desc 'show csv as tsv for copy paste into google spreadsheets'
task :csv do
  lines = File.read('siege.log').lines
  # drop the marker lines, strip the column commas and switch to a decimal comma
  csv = lines.reject { |i| i.include?("****") }.map { |line| line.gsub(',', '').gsub('.', ',') }.join
  puts csv
end

task default: [:urls, :run]
  • Modify the params at the top of the file
  • Run it: rake
  • Once it has finished (number of CONCURRENCY_TESTS * DURATION), print the data with rake csv and paste the output into e.g. Google Spreadsheets to easily generate charts

Bonus points: quick chart with ascii_charts

Install bundler if you have not done so yet: gem install bundler

Append to Rakefile:

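require 'bundler/inline'
gemfile do
  source 'https://rubygems.org' # assumption: the public RubyGems index
  gem 'ascii_charts'
end

desc 'chart'
task :chart do
  require 'ascii_charts'
  lines = File.read('siege.log').lines
  # skip the marker lines and the header line
  csv = lines.reject { |i| i.include?("****") || i.include?('Elap Time') }
  # one hash per siege run, keyed by column name
  data = csv.map { |i| i.split(',').map(&:strip) }.map { |a| %w[date transactions duration transfer response_time requests_s mbs conc success failed].zip(a).to_h }
  puts "======= Response Time / Concurrency ========"
  items = data.map { |d| [d['conc'].to_f.round, d['response_time'].to_f] }
  # assumption: drawn with the Cartesian chart class of ascii_charts
  puts AsciiCharts::Cartesian.new(items).draw

  puts "======= Requests/s / Concurrency ========"
  items = data.map { |d| [d['conc'].to_f.round, d['requests_s'].to_f] }
  puts AsciiCharts::Cartesian.new(items).draw
end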

Bonus: results for our smallish cluster

Our search uses custom search plugins that are quite CPU-intensive, especially with long queries. Overall, we do not have that many concurrent users, so a 3-4 node cluster is generally enough.

The deployment target is the very cost-efficient Hetzner Cloud (HCloud). Here is a quick overview of the cloud instance types Hetzner offers at this point (2019):

  • CX11 (1 Core, 2GB, 3 EUR)
  • CX21 (2 Core, 4GB, 6 EUR)
  • CX31 (2 Core, 8GB, 11 EUR) (not included, because no CPU improvement and RAM is not utilized)
  • CX41 (4 Core, 16GB, 19 EUR)
  • CX51 (8 Core, 32GB, 36 EUR)

I’ve tried the following combinations in the Hetzner cloud. If the rq/s look ridiculously low, please keep in mind that those 10 concurrent users are only searching with the worst queries that I could find.

nodes                                 EUR/month  rq/s @ 10 ccu  response time @ 10 ccu  requests/s/EUR
1x Coordinator (CX11) + 2x Data CX21  15 EUR     5.90           1.67                    0.39
1x Coordinator (CX11) + 3x Data CX21  21 EUR     4.84           2.03                    0.23
1x Coordinator (CX11) + 2x Data CX41  41 EUR     11.24          0.80                    0.27
1x Coordinator (CX11) + 3x Data CX41  60 EUR     17.19          0.58                    0.28
1x Coordinator (CX11) + 4x Data CX41  79 EUR     16.56          0.54                    0.20
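
The cost and efficiency columns follow from simple arithmetic: monthly cost is the coordinator price plus the number of data nodes times their price, and requests/s/EUR is the measured rq/s divided by that cost. Here is a small Ruby sketch of the calculation, using the prices and measurements listed above. The helper names are made up for this example, and the last digit may differ slightly from the table because of rounding.

```ruby
# Monthly prices in EUR, taken from the instance list above.
PRICES_EUR = { 'CX11' => 3, 'CX21' => 6, 'CX41' => 19 }.freeze

# Cost of one coordinator node plus `data_nodes` data nodes of `data_type`.
def monthly_cost(data_type, data_nodes, coordinator: 'CX11')
  PRICES_EUR.fetch(coordinator) + data_nodes * PRICES_EUR.fetch(data_type)
end

# Requests per second that one EUR per month buys.
def requests_per_eur(rq_s, cost)
  rq_s / cost
end

# [data node type, number of data nodes, measured rq/s at 10 concurrent users]
RESULTS = [
  ['CX21', 2, 5.90],
  ['CX21', 3, 4.84],
  ['CX41', 2, 11.24],
  ['CX41', 3, 17.19],
  ['CX41', 4, 16.56]
].freeze

RESULTS.each do |type, nodes, rq_s|
  cost = monthly_cost(type, nodes)
  puts format('%dx %s: %d EUR/month, %.3f rq/s per EUR', nodes, type, cost, requests_per_eur(rq_s, cost))
end
```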

My findings:

  • Even the smallest instance seemed to be fine for the Coordinator node; its CPU usage never reached a noteworthy level
  • Going from 2 CX21 to 3 CX21 did not improve the core metrics (reqs/s, response time) but made them worse. My conclusion is that the CX21 has too little CPU power
  • Likewise, going to 4 CX41 seems to be worse than 3 CX41: slightly fewer rq/s (16.56 vs. 17.19) at a higher price
  • 2 or 3 CX41 offer the best performance for the price
  • CX51 untested
  • PLEASE NOTE: These findings could be entirely specific to our type of querying, which includes a custom search algorithm written in Groovy. Also: I am not an Elasticsearch expert, so there might be tuning params or sharding settings that could be adjusted.