Showing posts with label Jmxterm. Show all posts
Showing posts with label Jmxterm. Show all posts

Friday, January 16, 2015

operate casandra using jmx in terminal including changing pool size, compacting sstables and key cache

If you operate apache cassandra cluster and if load per node goes huge (like nodetool info show 800GB), compactions become a problem. It's a big problem for apache cassandra 1.0.8 if you have load per node average hover around 600GB to 1TB. The read performance suffers and at times system uptime load goes high. In some instance, I noticed when repair is running, system load goes more than 20. It's not a concern if this is operating well, but the more often you see this, something has gone wrong. Today, I will share my experience on how to operate cassandra when node load is huge and cassandra instance is still running. Often times, there are nice method that is exposed via jmx but to operate remotely, jmx gui client such as jmxconsole is not ideal. Instead, we will using a jmxterm for these operation in apache cassandra 1.0.8. So let's get started.

Changing pool size

So, it is pretty simple, launch it and set to the bean, and then set the CorePoolSize. The steps will be illustrate below.
$ java -jar jmxterm-1.0-alpha-4-uber.jar
$>open localhost:7199
#Connection to localhost:7199 is opened
$>bean org.apache.cassandra.request:type=ReplicateOnWriteStage
#bean is set to org.apache.cassandra.request:type=ReplicateOnWriteStage
$>get CorePoolSize
#mbean = org.apache.cassandra.request:type=ReplicateOnWriteStage:
CorePoolSize = 32;
$>info
#mbean = org.apache.cassandra.request:type=ReplicateOnWriteStage
#class name = org.apache.cassandra.concurrent.JMXConfigurableThreadPoolExecutor
# attributes
%0 - ActiveCount (int, r)
%1 - CompletedTasks (long, r)
%2 - CorePoolSize (int, rw)
%3 - CurrentlyBlockedTasks (int, r)
%4 - PendingTasks (long, r)
%5 - TotalBlockedTasks (int, r)
#there's no operations
#there's no notifications
$>set CorePoolSize 64
$>get CorePoolSize
#mbean = org.apache.cassandra.request:type=ReplicateOnWriteStage:
CorePoolSize = 64;

Alter key cache

Often times, when there is heap pressure in the jvm, the safety valve kicks in.  You can restart the cassandra instance or you can reset the key cache back to the initial value. Assuming your column family name FooBar and keyspace just4fun, then the following are steps to illustrate how is this done.
$>bean org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches
#bean is set to org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches
$>info
#mbean = org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches
#class name = org.apache.cassandra.cache.AutoSavingKeyCache
# attributes
%0 - Capacity (int, rw)
%1 - Hits (long, r)
%2 - RecentHitRate (double, r)
%3 - Requests (long, r)
%4 - Size (int, r)
#there's no operations
#there's no notifications
$>
$>get Size
#mbean = org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches:
Size = 122307;

$>set Capacity 250000
#Value of attribute Capacity is set to 250000
$>get Capacity;
#mbean = org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches:
$>get Capacity
#mbean = org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches:
Capacity = 250000;

Compact sstable

Lastly, to compact sstables. It's amazing we have a sstable that as huge as 84GB! So trigger major compaction is not an option here, often time when load per node goes beyond 600GB, compaction took forever, as GC kick in and cpu keep on recollecting heap, making system load goes high. So here, we will select one sstable that is huge and compact that only. You can also select a few sstable and compact them and separate using comma.
$>bean org.apache.cassandra.db:type=CompactionManager
#bean is set to org.apache.cassandra.db:type=CompactionManager
$>run forceUserDefinedCompaction just4fun FooBar-hc-5-Index.db
#calling operation forceUserDefinedCompaction of mbean org.apache.cassandra.db:type=CompactionManager
#RuntimeMBeanException: java.lang.IllegalArgumentException: FooBar-hc-5-Index.db does not appear to be a data file
$>run forceUserDefinedCompaction just4fun FooBar-hc-401-Data.db
#calling operation forceUserDefinedCompaction of mbean org.apache.cassandra.db:type=CompactionManager
#operation returns:
null

The compaction should be started, you can check in cassandra system log or the nodetool compaction. So that's it, I hope you learned something.

Tuesday, December 31, 2013

Cassandra cluster jmx metrics inspection to decide if cluster expansion is justifiable

One of the important decision during managing the clusters is to determine when cluster capacity should expand. To maintain a cluster in optimal performance will give the applications working nicely and most importantly, it give confidence to the people.

So, to answer question like, how do I determine if my cluster is at bottleneck? To answer this type of question, you will need to have the measuring tools ready and measure over time. Meaning that you need to display statistics in graphical form and with the history, it should give an indication of the cluster performance.

Because the topic will grow huge, hence, we will focus on a specific metric. This article gonna inspect the metric exposed by the jmx beans. In order to inspect the jmx metrics, you will need a jmx client. There is a gui jmx client that comes with the jdk, that is jconsole. Because nature of this article, I would suggest you go for cli jmx client, for example jmxterm. You can read introduction of jmxterm here.

There are many metrics exposed by cassandra jmx beans. But we will focus on bean org.apache.cassandra.db:type=CompactionManager.

If you are using jmxterm, you can read the output below:
$ cat test.script 
open localhost:7199
bean org.apache.cassandra.db:type=CompactionManager
get PendingTasks
$ java -jar jmxterm-1.0-alpha-4-uber.jar -i test.script
Welcome to JMX terminal. Type "help" for available commands.
#Connection to localhost:7199 is opened
#bean is set to org.apache.cassandra.db:type=CompactionManager
#mbean = org.apache.cassandra.db:type=CompactionManager:
PendingTasks = 0;

So if you plot PendingTasks in a graph over time with a periodic interval, it should give insight to your cluster performance. You can also plot the statistics output from nodetool tpstats. I would suggest also, you plot Message type dropped as those metrics indicate over time that the performance is impacted. If you have a stock cassandra settings, you will probably want to fine tune to your node at this point after this investigation and analysis on the graph.

There is no best strategies, as mentioned earlier, you need experience and there are many other metrics measuring tools, example sar , iostats, top, your application measurement. So it take some times to even master all of these but it is crucial if you managed a production cluster.

Sunday, December 22, 2013

Learning Jmxterm

If you have been using jconsole to inspect an application perform under jvm, you might want to look for alternative in command line form. In this article, we are going to spend sometime to learn on Jmxterm . So what is a Jmxterm? Jmxterm is a command line based interactive JMX client. It's designed to allow user to access a Java MBean server from command line without graphical environment. In another word, it's a command line based jconsole.

To get started, you will of cause, needed JDK installed and an java application that you want to inspect. To start using it , go to http://wiki.cyclopsgroup.org/jmxterm/download and start to download. You should have a jmxterm-[version].jar file.

So, I'm gonna demonstrate on how to use Jmxterm by showing with examples of a terminal output.
$ java -jar jmxterm-1.0-alpha-4-uber.jar
Welcome to JMX terminal. Type "help" for available commands.
$>help;
#IllegalArgumentException: Command help; isn't valid, run help to see available commands
$>help
#following commands are available to use:
about - Display about page
bean - Display or set current selected MBean.
beans - List available beans under a domain or all domains
bye - Terminate console and exit
close - Close current JMX connection
domain - Display or set current selected domain.
domains - List all available domain names
exit - Terminate console and exit
get - Get value of MBean attribute(s)
help - Display available commands or usage of a command
info - Display detail information about an MBean
jvms - List all running local JVM processes
open - Open JMX session or display current connection
option - Set options for command session
quit - Terminate console and exit
run - Invoke an MBean operation
set - Set value of an MBean attribute
$> bean
null
$>beans
#IllegalStateException: Connection isn't open yet. Run open command to open a connection
$>domains
#following domains are available
#IllegalStateException: Connection isn't open yet. Run open command to open a connection
$>jvms
5552 ( ) - jmxterm-1.0-alpha-4-uber.jar
$>help open
usage: open [-h] [-p <val>] [-u <val>]
Open JMX session or display current connection
-h,--help Display usage
-p,--password <val> Password for user/password authentication
-u,--user <val> User name for user/password authentication
Without argument this command display current connection. URL can be a <PID>,
<hostname>:<port> or full qualified JMX service URL. For example
open localhost:9991,
open jmx:service:...
$>open 192.168.0.2:7199
#RuntimeIOException: Runtime IO exception: Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused
$>open localhost:7199
#Connection to localhost:7199 is opened
$>bean
null
$>bean org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
#bean is set to org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
$>info
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
#class name = org.apache.cassandra.db.ColumnFamilyStore
# attributes
%0 - AutoCompactionDisabled (boolean, r)
%1 - BloomFilterDiskSpaceUsed (long, r)
%2 - BloomFilterFalsePositives (long, r)
%3 - BloomFilterFalseRatio (double, r)
%4 - BuiltIndexes (java.util.List, r)
%5 - ColumnFamilyName (java.lang.String, r)
%6 - CompactionStrategyClass (java.lang.String, rw)
%7 - CompressionParameters (java.util.Map, rw)
%8 - CompressionRatio (double, r)
%9 - CrcCheckChance (double, w)
%10 - DroppableTombstoneRatio (double, r)
%11 - EstimatedColumnCountHistogram ([J, r)
%12 - EstimatedRowSizeHistogram ([J, r)
%13 - LifetimeReadLatencyHistogramMicros ([J, r)
%14 - LifetimeWriteLatencyHistogramMicros ([J, r)
%15 - LiveCellsPerSlice (double, r)
%16 - LiveDiskSpaceUsed (long, r)
%17 - LiveSSTableCount (int, r)
%18 - MaxRowSize (long, r)
%19 - MaximumCompactionThreshold (int, rw)
%20 - MeanRowSize (long, r)
%21 - MemtableColumnsCount (long, r)
%22 - MemtableDataSize (long, r)
%23 - MemtableSwitchCount (int, r)
%24 - MinRowSize (long, r)
%25 - MinimumCompactionThreshold (int, rw)
%26 - PendingTasks (int, r)
%27 - ReadCount (long, r)
%28 - RecentBloomFilterFalsePositives (long, r)
%29 - RecentBloomFilterFalseRatio (double, r)
%30 - RecentReadLatencyHistogramMicros ([J, r)
%31 - RecentReadLatencyMicros (double, r)
%32 - RecentSSTablesPerReadHistogram ([J, r)
%33 - RecentWriteLatencyHistogramMicros ([J, r)
%34 - RecentWriteLatencyMicros (double, r)
%35 - SSTableCountPerLevel ([I, r)
%36 - SSTablesPerReadHistogram ([J, r)
%37 - TombstonesPerSlice (double, r)
%38 - TotalDiskSpaceUsed (long, r)
%39 - TotalReadLatencyMicros (long, r)
%40 - TotalWriteLatencyMicros (long, r)
%41 - UnleveledSSTables (int, r)
%42 - WriteCount (long, r)
# operations
%0 - long estimateKeys()
%1 - void forceMajorCompaction()
%2 - java.util.List getSSTablesForKey(java.lang.String p1)
%3 - void loadNewSSTables()
%4 - void setCompactionThresholds(int p1,int p2)
#there's no notifications
$>get WriteCount
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies:
WriteCount = 0;
$>get TotalDiskSpaceUsed
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies:
TotalDiskSpaceUsed = 9437;

So, a brief explanation on what I have just done. To start, you will need to run the Jmxterm from a terminal. To understand what commands it has and what can you use for, simply issued command help. In order to inspect, you will need to open a connection to the jvm. Once a connection is established, you get do all sort of operations and in this example, I'm connected to cassandra and inspect on its bean org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies and get the WriteCount and TotalDiskSpaceUsed statistics.

That's all folks! Hope you get an idea on what it does and where it is applicable to you.