Information Technology Blogs: memtable_flush

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
FlushWriter                       0         0            941         0                53

In a Cassandra cluster, I often notice that the "All time blocked" count for the FlushWriter pool keeps increasing while it stays at 0 for the other pools. Is this something we should be concerned about?

Snippet from class ColumnFamilyStore:

/*
 * maybeSwitchMemtable puts Memtable.getSortedContents on the writer executor.  When the write is complete,
 * we turn the writer into an SSTableReader and add it to ssTables_ where it is available for reads.
 *
 * There are two other things that maybeSwitchMemtable does.
 * First, it puts the Memtable into memtablesPendingFlush, where it stays until the flush is complete
 * and it's been added as an SSTableReader to ssTables_.  Second, it adds an entry to commitLogUpdater
 * that waits for the flush to complete, then calls onMemtableFlush.  This allows multiple flushes
 * to happen simultaneously on multicore systems, while still calling onMF in the correct order,
 * which is necessary for replay in case of a restart since CommitLog assumes that when onMF is
 * called, all data up to the given context has been persisted to SSTables.
 */
 private static final ExecutorService flushWriter
        = new JMXEnabledThreadPoolExecutor(DatabaseDescriptor.getFlushWriters(),
                                           StageManager.KEEPALIVE,
                                           TimeUnit.SECONDS,
                                           new LinkedBlockingQueue<Runnable>(DatabaseDescriptor.getFlushQueueSize()),
                                           new NamedThreadFactory("FlushWriter"),
                                           "internal");

Just like other stages such as Stage.replicate_on_write, FlushWriter is an instance of JMXEnabledThreadPoolExecutor. It is governed by two settings that you can change in cassandra.yaml.

memtable_flush_writers: the default is based on the number of data_file_directories specified.

memtable_flush_queue_size: the default is 4.

Whenever maybeSwitchMemtable is called, memtable.flushAndSignal() is called within.

Notice that Memtable.flushAndSignal() is handed an ExecutorService. Following the type hierarchy a few levels down, that object is the JMXEnabledThreadPoolExecutor constructed above for the FlushWriter pool. So whenever a task is rejected because the queue is full, the method rejectedExecution() is triggered, which eventually increases the all time blocked count by one.
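The mechanism can be reproduced outside Cassandra with the plain JDK classes. The following is only a minimal sketch, not Cassandra's actual code: the class FlushWriterBlockedDemo, the method countBlocked and the "slow flush" task are hypothetical names made up for illustration, and the handler is an assumption about the general pattern (count the rejection, then make the submitter wait for queue space).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class FlushWriterBlockedDemo {

    /**
     * Simulates a flush pool with `writers` threads and a bounded queue of
     * `queueSize` slots, keeps every flush stuck, submits `extra` more
     * flushes, and returns how many submissions were counted as blocked.
     */
    static long countBlocked(int writers, int queueSize, int extra) {
        AtomicLong allTimeBlocked = new AtomicLong();
        CountDownLatch diskIsFree = new CountDownLatch(1);

        // Called only when all threads are busy and the queue is full:
        // count the event, then block the submitter until a slot frees up.
        RejectedExecutionHandler countThenWait = (task, executor) -> {
            allTimeBlocked.incrementAndGet();
            try {
                executor.getQueue().put(task);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException(e);
            }
        };

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                writers, writers, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(queueSize), countThenWait);

        // A stand-in for a memtable flush that cannot finish yet (slow disk).
        Runnable slowFlush = () -> {
            try {
                diskIsFree.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        try {
            // Occupy every writer thread and every queue slot; none is rejected.
            for (int i = 0; i < writers + queueSize; i++) {
                pool.execute(slowFlush);
            }

            // Each further submission finds the queue full and is counted once.
            List<Thread> submitters = new ArrayList<>();
            for (int i = 0; i < extra; i++) {
                Thread t = new Thread(() -> pool.execute(slowFlush));
                submitters.add(t);
                t.start();
            }
            while (allTimeBlocked.get() < extra) {
                Thread.sleep(10);
            }

            diskIsFree.countDown(); // let the stuck flushes complete
            for (Thread t : submitters) {
                t.join();
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return allTimeBlocked.get();
    }

    public static void main(String[] args) {
        // 1 writer, queue of 4, 3 extra flushes while everything is stuck.
        System.out.println("All time blocked: " + countBlocked(1, 4, 3)); // prints "All time blocked: 3"
    }
}
```

With one writer and a queue of four, five flushes fit without any rejection, so only the three extra submissions are counted. In this sketch, a larger queue or more writers would absorb more concurrent flushes before the counter starts to move.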

So that's it. Hopefully you now have an idea of what the all time blocked count for the FlushWriter pool is and why it increases. A value that keeps growing is an indication that you should tune the two parameters above in the cassandra.yaml file.

Saturday, May 3, 2014

What "All time blocked" is and why it keeps increasing for the Cassandra FlushWriter pool