Saturday, January 17, 2015

Investigate into Apache Cassandra Memtable updateLiveRatio logging output

Today, we are going to study apache cassandra 1.0.8 on memtable logging its statistics in the cassandra system.log. Example below
WARN [MemoryMeter:1] 2014-10-17 07:38:15,346 Memtable.java (line 176) setting live ratio to minimum of 1.0 instead of 0.16714977001091136
INFO [MemoryMeter:1] 2014-10-17 07:38:15,346 Memtable.java (line 186) CFS(Keyspace='OpsCenter', ColumnFamily='pdps') liveRatio is 1.6474114033116056 (just-counted was 1.0). calculation took 3ms for 595 columns

What does the above logging output means? Is cassandra instance operating normally since logging level is warn? Before we dive into the code to understand its meaning, let first read and understand what is memtable in cassandra context.

Excerpts from datastax documentation

Cassandra is optimized for write throughput. Cassandra writes are first written to a commit log (for durability), and then to an in-memory table structure called a memtable. A write is successful once it is written to the commit log and memory, so there is very minimal disk I/O at the time of write. Writes are batched in memory and periodically written to disk to a persistent table structure called an SSTable (sorted string table). Memtables and SSTables are maintained per column family. Memtables are organized in sorted order by row key and flushed to SSTables sequentially (no random seeking as in relational databases).

Whenever method updateRatio() from class Memtable is called, the following codes applies.
public void updateLiveRatio()
{
if (!MemoryMeter.isInitialized())
{
// hack for openjdk. we log a warning about this in the startup script too.
logger.warn("MemoryMeter uninitialized (jamm not specified as java agent); assuming liveRatio of 10.0. Usually this means cassandra-env.sh disabled jamm because you are using a buggy JRE; upgrade to the Sun JRE instead");
cfs.liveRatio = 10.0;
return;
}

Runnable runnable = new Runnable()
{
public void run()
{
activelyMeasuring = Memtable.this;

long start = System.currentTimeMillis();
// ConcurrentSkipListMap has cycles, so measureDeep will have to track a reference to EACH object it visits.
// So to reduce the memory overhead of doing a measurement, we break it up to row-at-a-time.
long deepSize = meter.measure(columnFamilies);
int objects = 0;
for (Map.Entry<DecoratedKey, ColumnFamily> entry : columnFamilies.entrySet())
{
deepSize += meter.measureDeep(entry.getKey()) + meter.measureDeep(entry.getValue());
objects += entry.getValue().getColumnCount();
}
double newRatio = (double) deepSize / currentThroughput.get();

if (newRatio < MIN_SANE_LIVE_RATIO)
{
logger.warn("setting live ratio to minimum of 1.0 instead of {}", newRatio);
newRatio = MIN_SANE_LIVE_RATIO;
}
if (newRatio > MAX_SANE_LIVE_RATIO)
{
logger.warn("setting live ratio to maximum of 64 instead of {}", newRatio);
newRatio = MAX_SANE_LIVE_RATIO;
}
cfs.liveRatio = Math.max(cfs.liveRatio, newRatio);

logger.info("{} liveRatio is {} (just-counted was {}). calculation took {}ms for {} columns",
new Object[]{ cfs, cfs.liveRatio, newRatio, System.currentTimeMillis() - start, objects });
activelyMeasuring = null;
}
};

try
{
meterExecutor.submit(runnable);
}
catch (RejectedExecutionException e)
{
logger.debug("Meter thread is busy; skipping liveRatio update for {}", cfs);
}
}

There is a new thread which will be execute by the executor meterExecutor. This executor if it is not busy, will start to measure this column family associated with this memtable. There are two essential metrics that is involve in the arithmetic, key size plus value size and the column count. deepSize is the summation of the column family , its key and its value. A new variable newRatio is calculated with the deepSize divided by the currentThroughput.

Valid newRatio range is between 1.0 to 64.0 inclusive. When calculated newRatio is less than 1.0, first line of log such as above will started to appear in cassandra system log and newRatio will be reset to 1.0. The same check when newRatio exceed 64.0, it will be logged and value for newRatio reset to maximum valid value of 64.0. Then column family live ratio is updated with whichever which is higher, the current or the new calculated newRatio.

So technically, this is nothing to really concern about. It's a measurement of the number of operations has increased as compare to the previous. Also, given the next throughput, as estimated of size of the memtable can be calculated based on the newRatio. As a side note, this method updateRatio() is called after a mutation has been applied to the memtable but before a flush is requested.

That's it for today, I hope you learned something.

No comments:

Post a Comment