
Sunday, March 15, 2015

Investigating the elasticsearch "already flushing" exception

Today we will take a look at another elasticsearch issue. The example log output below was captured on one of my nodes.
[2015-03-02 04:51:30,450][DEBUG][action.admin.indices.flush] [node02] [index_A][1], node[DEF_-ABC], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.admin.indices.flush.FlushRequest@23465ba5]
org.elasticsearch.index.engine.FlushNotAllowedEngineException: [index_A][1] already flushing...
at org.elasticsearch.index.engine.robin.RobinEngine.flush(RobinEngine.java:825)
at org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:564)
at org.elasticsearch.action.admin.indices.flush.TransportFlushAction.shardOperation(TransportFlushAction.java:114)
at org.elasticsearch.action.admin.indices.flush.TransportFlushAction.shardOperation(TransportFlushAction.java:49)
at org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$AsyncBroadcastAction$2.run(TransportBroadcastOperationAction.java:225)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

We will trace through the code that leads to the exception above and decide whether this is a problem. Below is the "entry" elasticsearch snippet, followed by a simplified sketch of the pattern it implements.
public void run() {
    try {
        onOperation(shard, shardIndex, shardOperation(shardRequest));
    } catch (Throwable e) {
        onOperation(shard, shardIt, shardIndex, e);
    }
}
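
For illustration, here is a simplified, standalone sketch of this asynchronous broadcast pattern (hypothetical names, not the actual TransportBroadcastOperationAction code): the shard-level operation is submitted to a thread pool once per shard, and a failure on one shard is recorded for that shard only, so the broadcast request as a whole still completes.

import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

public class BroadcastSketch {

    // runs the shard-level operation on every shard and records either the result
    // or the failure per shard, never failing the broadcast as a whole
    public static ConcurrentMap<Integer, Object> broadcast(List<Integer> shardIds,
                                                           ExecutorService pool) throws InterruptedException {
        final ConcurrentMap<Integer, Object> perShard = new ConcurrentHashMap<Integer, Object>();
        final CountDownLatch done = new CountDownLatch(shardIds.size());
        for (final int shardId : shardIds) {
            pool.submit(new Runnable() {
                @Override
                public void run() {
                    try {
                        perShard.put(shardId, shardOperation(shardId));
                    } catch (Throwable t) {
                        perShard.put(shardId, t); // keep the per-shard failure, like onOperation(..., e) above
                    } finally {
                        done.countDown();
                    }
                }
            });
        }
        done.await(); // the real action responds asynchronously once every shard has reported back
        return perShard;
    }

    // stand-in for the per-shard work, e.g. flushing shard [index_A][1]
    private static String shardOperation(int shardId) {
        return "flushed shard " + shardId;
    }
}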

So, as the entry snippet shows, this is an asynchronous broadcast action executing a per-shard operation. The execution path tells us that the shard in question is a local shard and that it hit the exception while flushing. The shard-level operation of the flush action is shown below; a client-side example of issuing such a flush follows it.
@Override
protected ShardFlushResponse shardOperation(ShardFlushRequest request) throws ElasticSearchException {
    IndexShard indexShard = indicesService.indexServiceSafe(request.index()).shardSafe(request.shardId());
    indexShard.flush(new Engine.Flush()
            .type(request.full() ? Engine.Flush.Type.NEW_WRITER : Engine.Flush.Type.COMMIT_TRANSLOG)
            .force(request.force()));
    return new ShardFlushResponse(request.index(), request.shardId());
}
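
As a client-side illustration, a flush like the one handled above can be issued from the Java API. This is only a minimal sketch assuming the 0.90-era TransportClient API; the cluster name, host and index name are placeholders.

import org.elasticsearch.action.admin.indices.flush.FlushResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class FlushExample {
    public static void main(String[] args) {
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("cluster.name", "mycluster")   // placeholder cluster name
                .build();
        TransportClient client = new TransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
        try {
            // full=false maps to Engine.Flush.Type.COMMIT_TRANSLOG in the shardOperation() above
            FlushResponse response = client.admin().indices()
                    .prepareFlush("index_A")
                    .setFull(false)
                    .execute().actionGet();
            System.out.println("flushed " + response.getSuccessfulShards() + " shards");
        } finally {
            client.close();
        }
    }
}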

Tracing up the stack trace, the next frame is InternalIndexShard, which is the class that flushes the shard.
@Override
public void flush(Engine.Flush flush) throws ElasticSearchException {
    // we allows flush while recovering, since we allow for operations to happen
    // while recovering, and we want to keep the translog at bay (up to deletes, which
    // we don't gc).
    verifyStartedOrRecovering();
    if (logger.isTraceEnabled()) {
        logger.trace("flush with {}", flush);
    }
    long time = System.nanoTime();
    engine.flush(flush);
    flushMetric.inc(System.nanoTime() - time);
}

As can be read above, we are nearly at the end of the investigation: the shard simply delegates to engine.flush(). Inside the engine (RobinEngine.flush() in the stack trace), a count of in-flight flushes is kept for the shard; when that count is found to be greater than one, it is decremented back and the exception is thrown, telling us that the shard is currently being flushed by another request.
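
For illustration, here is a simplified, standalone sketch of such a guard (hypothetical class and method names, not the actual RobinEngine code): a counter of in-flight flushes is incremented on entry, and a second concurrent flush finds the count above one, decrements it back and bails out with an exception instead of flushing twice.

import java.util.concurrent.atomic.AtomicInteger;

public class FlushGuardSketch {
    // number of flushes currently in flight for this shard
    private final AtomicInteger flushing = new AtomicInteger();

    public void flush() {
        if (flushing.incrementAndGet() > 1) {
            // another flush is already running; give up instead of flushing concurrently
            flushing.decrementAndGet();
            throw new IllegalStateException("already flushing..."); // elasticsearch throws FlushNotAllowedEngineException here
        }
        try {
            doFlush(); // the real engine would commit Lucene and/or the translog here
        } finally {
            flushing.decrementAndGet();
        }
    }

    private void doFlush() {
        // placeholder for the actual flush work
    }
}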

As to why this happened, it is still puzzling, but I have never seen this exception again, and from the investigation above it does not look harmful: the shard in question was simply being flushed already, and the overlapping flush request was rejected.

If you think this analysis is not correct or you have better insight, please leave your comment below. Thank you.

Saturday, March 14, 2015

Investigating how the elasticsearch indices memory controller marks shards active or inactive

If you have enabled debug logging for the indices memory controller (indices.memory: DEBUG in logging.yml) in elasticsearch 0.90 to see how the shards are behaving, you will notice messages such as the ones below appearing quite often in the log file. A minimal logging.yml fragment is shown after the log sample.
[2014-12-11 15:22:27,562][DEBUG][indices.memory           ] [es01] recalculating shard indexing buffer (reason=active/inactive[true] created/deleted[false]), total is [4.2gb] with [11] active shards, each shard set to indexing=[391.8mb], translog=[64kb]
[2014-12-11 15:23:57,562][DEBUG][indices.memory ] [es01] marking shard [index_A][7] as inactive (inactive_time[30m]) indexing wise, setting size to [500kb]
[2014-12-11 15:23:57,562][DEBUG][indices.memory ] [es01] marking shard [index_B][0] as inactive (inactive_time[30m]) indexing wise, setting size to [500kb]
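
For reference, this is roughly how that logger can be enabled, assuming the default logging.yml layout shipped with elasticsearch 0.90 (only the logger section is shown):

logger:
  # surface the shard indexing buffer decisions made by the indices memory controller
  indices.memory: DEBUG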

Tracing into the code base, in the class IndexingMemoryController we noticed that a runnable instance, ShardsIndicesStatusChecker, is scheduled periodically with a default interval of 30 seconds. Reading into this class, we see that the algorithm loops through the indices service and, for each index service, checks the status of every index shard. Based on the log output above, control goes into the path below, which marks the shard as inactive for indexing; a simplified sketch of this periodic check follows the snippet.
if (!status.inactiveIndexing) {
    // mark it as inactive only if enough time has passed and there are no ongoing merges going on...
    if ((time - status.time) > inactiveTime.millis() && indexShard.mergeStats().getCurrent() == 0) {
        // inactive for this amount of time, mark it
        activeToInactiveIndexingShards.add(indexShard);
        status.inactiveIndexing = true;
        activeInactiveStatusChanges = true;
        logger.debug("marking shard [{}][{}] as inactive (inactive_time[{}]) indexing wise, setting size to [{}]",
                indexShard.shardId().index().name(), indexShard.shardId().id(), inactiveTime, Engine.INACTIVE_SHARD_INDEXING_BUFFER);
    }
}
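
To make the overall flow easier to follow, here is a simplified, standalone illustration of the periodic check (hypothetical names and fields, not the actual ShardsIndicesStatusChecker): every 30 seconds each shard's last indexing activity is compared against the 30 minute threshold, and shards with no ongoing merges are flipped to inactive.

import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class StatusCheckerSketch {

    static class ShardStatus {
        volatile long lastActiveTimeMillis = System.currentTimeMillis(); // last time indexing activity was seen
        volatile boolean inactiveIndexing = false;                        // already marked inactive?
        volatile int currentMerges = 0;                                   // ongoing merges on this shard
    }

    private static final long INACTIVE_AFTER_MILLIS = TimeUnit.MINUTES.toMillis(30);

    public static void schedule(final Map<String, ShardStatus> shards) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                long now = System.currentTimeMillis();
                boolean statusChanged = false;
                for (Map.Entry<String, ShardStatus> entry : shards.entrySet()) {
                    ShardStatus status = entry.getValue();
                    if (!status.inactiveIndexing
                            && now - status.lastActiveTimeMillis > INACTIVE_AFTER_MILLIS
                            && status.currentMerges == 0) {
                        status.inactiveIndexing = true; // shrink this shard's indexing buffer
                        statusChanged = true;
                    }
                }
                if (statusChanged) {
                    // the real controller recalculates and pushes the new per-shard buffer sizes here
                    System.out.println("recalculating shard indexing buffer");
                }
            }
        }, 30, 30, TimeUnit.SECONDS);
    }
}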

Going further down the code, because the variable activeInactiveStatusChanges has changed, the if statement below evaluates to true,
if (shardsCreatedOrDeleted || activeInactiveStatusChanges) {
    calcAndSetShardBuffers("active/inactive[" + activeInactiveStatusChanges + "] created/deleted[" + shardsCreatedOrDeleted + "]");
}

Tracing into the method calcAndSetShardBuffers(), we see that the per-shard indexing buffer is calculated by dividing the total indexing buffer among the active shards and clamping the result to a range: the default minimum shard buffer is 4MB and the default maximum is 512MB. A similar calculation is applied to the translog buffer size, with a default minimum of 2KB and a default maximum of 64KB. The log message shown earlier is then written by the code below; a back-of-the-envelope version of the calculation follows it.
logger.debug("recalculating shard indexing buffer (reason={}), total is [{}] with [{}] active shards, each shard set to indexing=[{}], translog=[{}]", reason, indexingBuffer, shardsCount, shardIndexingBufferSize, shardTranslogBufferSize);

From this analysis, it looks to me like elasticsearch is working fine: it checks the index shard status and recalculates the shard indexing and translog buffers accordingly. If you think this analysis is incorrect or would like to contribute more information, please leave your comment below.