Saturday, March 14, 2015

Investigating elasticsearch indices memory: marking shards active or inactive

If you have enabled DEBUG logging for the indices memory controller (indices.memory: DEBUG in logging.yml) in elasticsearch 0.90 to see how the shards are behaving, you will notice that messages such as the ones below appear quite often in the log file.
[2014-12-11 15:22:27,562][DEBUG][indices.memory           ] [es01] recalculating shard indexing buffer (reason=active/inactive[true] created/deleted[false]), total is [4.2gb] with [11] active shards, each shard set to indexing=[391.8mb], translog=[64kb]
[2014-12-11 15:23:57,562][DEBUG][indices.memory ] [es01] marking shard [index_A][7] as inactive (inactive_time[30m]) indexing wise, setting size to [500kb]
[2014-12-11 15:23:57,562][DEBUG][indices.memory ] [es01] marking shard [index_B][0] as inactive (inactive_time[30m]) indexing wise, setting size to [500kb]

Tracing into the code base, in the class IndexingMemoryController, we noticed that a runnable instance of ShardsIndicesStatusChecker is created and scheduled periodically, with a default interval of 30 seconds. Reading this class, we see that it loops through the indices service and, for each index service, checks the status of every index shard. Based on the log output above, control goes into the path below: the shard was not yet marked inactive, so its inactiveIndexing flag is set to true.
    if (!status.inactiveIndexing) {
        // mark it as inactive only if enough time has passed and there are no ongoing merges going on...
        if ((time - status.time) > inactiveTime.millis() && indexShard.mergeStats().getCurrent() == 0) {
            // inactive for this amount of time, mark it
            activeToInactiveIndexingShards.add(indexShard);
            status.inactiveIndexing = true;
            activeInactiveStatusChanges = true;
            logger.debug("marking shard [{}][{}] as inactive (inactive_time[{}]) indexing wise, setting size to [{}]", indexShard.shardId().index().name(), indexShard.shardId().id(), inactiveTime, Engine.INACTIVE_SHARD_INDEXING_BUFFER);
        }
    }
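
To experiment with this condition outside elasticsearch, here is a minimal stand-alone sketch. It is not the elasticsearch source: the class InactiveShardCheckSketch, the method shouldMarkInactive and its parameter names are hypothetical, and they only model the two conditions visible in the snippet above (enough time has passed and no merges are running).

```java
// Hypothetical, simplified model of the check above; not the elasticsearch source.
final class InactiveShardCheckSketch {

    /**
     * Returns true when a shard that is not yet marked inactive should be marked:
     * more than inactiveTimeMillis has elapsed since statusTimeMillis (standing in
     * for status.time) and no merges are currently running.
     */
    static boolean shouldMarkInactive(boolean alreadyInactive,
                                      long nowMillis,
                                      long statusTimeMillis,
                                      long inactiveTimeMillis,
                                      int currentMerges) {
        if (alreadyInactive) {
            return false; // the flag is already set, nothing to change
        }
        return (nowMillis - statusTimeMillis) > inactiveTimeMillis && currentMerges == 0;
    }

    public static void main(String[] args) {
        long inactiveTime = 30L * 60 * 1000; // inactive_time[30m], as in the log above
        long now = 100L * 60 * 1000;         // arbitrary "current time" for the example

        // Idle for 31 minutes with no merges running.
        System.out.println(shouldMarkInactive(false, now, now - 31L * 60 * 1000, inactiveTime, 0));
        // Idle for 31 minutes, but one merge is still running.
        System.out.println(shouldMarkInactive(false, now, now - 31L * 60 * 1000, inactiveTime, 1));
        // Idle for only 10 minutes.
        System.out.println(shouldMarkInactive(false, now, now - 10L * 60 * 1000, inactiveTime, 0));
    }
}
```

Only the first call satisfies both conditions. Note that the comparison is strict, so a shard idle for exactly 30 minutes is not marked until the next check.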

Going further down the code, because the variable activeInactiveStatusChanges has been set to true, the following if statement evaluates to true:
    if (shardsCreatedOrDeleted || activeInactiveStatusChanges) {
        calcAndSetShardBuffers("active/inactive[" + activeInactiveStatusChanges + "] created/deleted[" + shardsCreatedOrDeleted + "]");
    }

Tracing into the method calcAndSetShardBuffers(), we see that the shard indexing buffer size is calculated and kept within a range. The default minimum shard indexing buffer is 4MB and the default maximum is 512MB. A similar algorithm applies to the translog buffer size: the default minimum translog buffer size is 2KB and the default maximum is 64KB. The "recalculating shard indexing buffer" log message shown above is then written using the code below.
    logger.debug("recalculating shard indexing buffer (reason={}), total is [{}] with [{}] active shards, each shard set to indexing=[{}], translog=[{}]", reason, indexingBuffer, shardsCount, shardIndexingBufferSize, shardTranslogBufferSize);
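
The range logic can be illustrated with a small stand-alone sketch. It assumes that the total buffer is split evenly across the active shards before being clamped, which is what the log line suggests (4.2gb over 11 active shards gives roughly 391.8mb each). The class ShardBufferSketch and the method perShard are hypothetical names, not elasticsearch code, and the 40MB translog total in the example is made up for illustration.

```java
// Hypothetical sketch of "divide evenly, then clamp to [min, max]"; not the elasticsearch source.
final class ShardBufferSketch {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    // Defaults quoted in this post.
    static final long MIN_SHARD_INDEX_BUFFER = 4 * MB;
    static final long MAX_SHARD_INDEX_BUFFER = 512 * MB;
    static final long MIN_SHARD_TRANSLOG_BUFFER = 2 * KB;
    static final long MAX_SHARD_TRANSLOG_BUFFER = 64 * KB;

    /** Splits totalBytes evenly across activeShards and clamps the share to [min, max]. */
    static long perShard(long totalBytes, int activeShards, long min, long max) {
        if (activeShards <= 0) {
            throw new IllegalArgumentException("activeShards must be positive");
        }
        long share = totalBytes / activeShards;
        return Math.max(min, Math.min(max, share));
    }

    public static void main(String[] args) {
        long total = 4400 * MB; // made-up total, close to the 4.2gb in the log

        // 11 active shards: the even share falls inside the range.
        System.out.println(perShard(total, 11, MIN_SHARD_INDEX_BUFFER, MAX_SHARD_INDEX_BUFFER) / MB + "mb");
        // 1 active shard: the share is capped at the 512MB maximum.
        System.out.println(perShard(total, 1, MIN_SHARD_INDEX_BUFFER, MAX_SHARD_INDEX_BUFFER) / MB + "mb");
        // 2000 active shards: the share is raised to the 4MB minimum.
        System.out.println(perShard(total, 2000, MIN_SHARD_INDEX_BUFFER, MAX_SHARD_INDEX_BUFFER) / MB + "mb");
        // Translog buffer with a made-up 40MB total: capped at 64KB, as in the log.
        System.out.println(perShard(40 * MB, 11, MIN_SHARD_TRANSLOG_BUFFER, MAX_SHARD_TRANSLOG_BUFFER) / KB + "kb");
    }
}
```

Under these assumptions, the translog value of 64kb in the log simply means the per-shard share exceeded the default maximum and was capped.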

From this analysis, it looks to me that elasticsearch is working as intended: it checks the index shard status and recalculates the shard indexing and translog buffers accordingly. If you think this analysis is incorrect or would like to contribute more information, please leave your comment below.
