Thursday, December 26, 2013

cassandra 2.0 catch 101 – part5

Our cluster status.
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns   Host ID                               Rack
UN  192.168.33.31  147.34 KB  512     34.3%  f13c9390-4c52-4fc2-afa8-f7f74e7fd710  rack1
UN  192.168.33.32  123.31 KB  512     33.2%  bc7fcfcc-9a30-4929-bf24-35ec770856a3  rack1
UN  192.168.33.33  160.74 KB  256     16.5%  999d58bf-2b31-49ff-a452-6f0d01598429  rack1
UN  192.168.33.34  137.39 KB  256     16.0%  222796e9-d330-469a-8dcd-3f3581c9d795  rack1

It is interesting that a node can own a different share of the cluster load depending on the number of tokens assigned to it. This matters because our cluster has different types of hardware; mine, for instance, is pretty old. In the configuration shown below, a node is assigned 512 tokens.
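As a rough check of the numbers above, the expected share of each node can be estimated as its token count divided by the total number of tokens in the ring. This is only a sketch: the token counts are copied from the nodetool output, the function name is made up for illustration, and the real Owns column differs slightly because tokens are placed randomly.

```python
# Token counts copied from the nodetool status output above.
tokens = {
    "192.168.33.31": 512,
    "192.168.33.32": 512,
    "192.168.33.33": 256,
    "192.168.33.34": 256,
}


def expected_ownership(tokens_per_node):
    """Return each node's expected share of the ring as a percentage."""
    total = sum(tokens_per_node.values())
    return {node: 100.0 * count / total for node, count in tokens_per_node.items()}


shares = expected_ownership(tokens)
for node, share in shares.items():
    print(f"{node}: {share:.1f}%")
```

The 512-token nodes come out at roughly a third of the ring each and the 256-token nodes at roughly a sixth, which is close to what nodetool reports.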

The token setting can be found in cassandra.yaml:
# This defines the number of tokens randomly assigned to this node on the ring
# The more tokens, relative to other nodes, the larger the proportion of data
# that this node will store. You probably want all nodes to have the same number
# of tokens assuming they have equal hardware capability.
#
# If you leave this unspecified, Cassandra will use the default of 1 token for legacy compatibility,
# and will use the initial_token as described below.
#
# Specifying initial_token will override this setting.
#
# If you already have a cluster with 1 token per node, and wish to migrate to
# multiple tokens per node, see http://wiki.apache.org/cassandra/Operations
num_tokens: 512

Note that this setting takes effect only once, when a node first joins the cluster. The first time the node joins, 512 tokens are assigned to it, and these tokens are stored in the local column family of the system keyspace. Even if you remove the data directory and start all over, the data will be streamed from other nodes, hence the information still persists. If you really want to change it at a later date, it is possible: treat the node as dead by decommissioning it, stop the Cassandra instance, change num_tokens in the yaml file, and then start the Cassandra instance again. Think this through first, because decommissioning streams data to the other servers, which may create load on the system and extra network traffic.

Wednesday, December 25, 2013

Lightweight Java Game Library

Since childhood, gaming has been one of my favorite activities. If you are from the 80s, Super Mario should sound familiar to you. =) Thirty years have passed, and game development has improved tremendously over that period.

In this article, we are going to explore game development. Most games are written in low-level languages such as C, which makes them complicated and introduces a steep learning curve if you are a beginner. Hence, we will choose a simple starting point to learn about game development. An example of a library that can be used is the Lightweight Java Game Library, or LWJGL for short.

What is Lightweight Java Game Library?

The Lightweight Java Game Library (LWJGL) is a solution aimed directly at professional and amateur Java programmers alike to enable commercial quality games to be written in Java. LWJGL provides developers access to high performance crossplatform libraries such as OpenGL (Open Graphics Library), OpenCL (Open Computing Language) and OpenAL (Open Audio Library) allowing for state of the art 3D games and 3D sound. Additionally LWJGL provides access to controllers such as Gamepads, Steering wheel and Joysticks. All in a simple and straight forward API.

Because this library deals with graphic display, the hardware display driver must be set up correctly. My workstation uses an ATI Radeon card with the xserver-xorg-video-radeon driver, and 3D acceleration is enabled with the package libgl1-mesa-dri. We won't delve deep into graphics driver installation and configuration, since our focus here is game development. You can check whether your driver is set up properly by running glxgears from a terminal. If a window pops up with three spinning gears, your driver installation and setup should be fine for this coding tutorial.

The official wiki is well written and documented to get you started. Following it, I set up my Eclipse environment on Debian sid. The required libraries should be added to the project build path so that they are detected when you run your application. Because I'm running Linux, the native library location points to lwjgl-2.9.1/native/linux. Both the library and its native library location must be configured before any development begins. If you noticed, I've attached the source as well; it is convenient to be able to read the code when you need to be sure about something later during the coding phase.



There are many tutorials to pick from. As a start, I picked the basics: LWJGL Basics 1 (The Display). The source code should be in the link; it is incredibly easy to create the display with a few lines of code, and I got the window to display on the first try. Very impressive and promising.



It is pretty impressive what this library can do. Many examples come with the library, and one of them is an example game. If you are running Linux, just execute
java -cp .:res:jar/lwjgl.jar:jar/lwjgl_test.jar:jar/lwjgl_util.jar:jar/jinput.jar: -Djava.library.path=native/linux org.lwjgl.examples.spaceinvaders.Game

It ran fine in my environment and I played the bundled game; amazing. Maybe in my next article I'm going to try to even complete it.

Monday, December 23, 2013

Elasticsearch index slow log for search and indexing

Today, we are going to learn about Elasticsearch logging for search and indexing. The Elasticsearch config file, elasticsearch.yml, should contain a configuration such as the one below:
################################## Slow Log ##################################

# Shard level query and fetch threshold logging.

#index.search.slowlog.threshold.query.warn: 10s
#index.search.slowlog.threshold.query.info: 5s
#index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 500ms

#index.search.slowlog.threshold.fetch.warn: 1s
#index.search.slowlog.threshold.fetch.info: 800ms
#index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.trace: 200ms

#index.indexing.slowlog.threshold.index.warn: 10s
#index.indexing.slowlog.threshold.index.info: 5s
#index.indexing.slowlog.threshold.index.debug: 2s
index.indexing.slowlog.threshold.index.trace: 500ms

In this example, I have enabled trace logging for the search query and search fetch phases with thresholds of 500ms and 200ms respectively. A search in Elasticsearch consists of a query phase and a fetch phase, hence the two configurations for search. Slow logging for indexing is also enabled, with a threshold of 500ms.

With this configuration set, if an indexing or search operation exceeds its threshold,
an entry will be logged to a file. The log file should be located in the directory given by path.logs,
which is set in elasticsearch.yml.
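To make the threshold behaviour concrete, here is a minimal sketch of how a level could be chosen for a given execution time. It is not Elasticsearch's actual code: the function name is hypothetical, the values are the commented-out query thresholds from the excerpt above converted to milliseconds, and whether the real comparison is inclusive is an assumption.

```python
# Query thresholds in milliseconds, taken from the elasticsearch.yml excerpt.
ALL_QUERY_THRESHOLDS_MS = {"warn": 10_000, "info": 5_000, "debug": 2_000, "trace": 500}

# The configuration used in this post only enables the trace level.
POST_QUERY_THRESHOLDS_MS = {"trace": 500}


def slowlog_level(took_ms, thresholds):
    """Return the most severe enabled level whose threshold is reached, else None."""
    for level in ("warn", "info", "debug", "trace"):
        limit = thresholds.get(level)  # a missing key means the level is disabled
        if limit is not None and took_ms >= limit:
            return level
    return None


print(slowlog_level(1026, POST_QUERY_THRESHOLDS_MS))
print(slowlog_level(300, POST_QUERY_THRESHOLDS_MS))
print(slowlog_level(6000, ALL_QUERY_THRESHOLDS_MS))
```

Checking the levels from most to least severe means a very slow query is reported once, at the highest level it qualifies for.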

So what do the numbers really mean? Excerpts from the official Elasticsearch documentation:

The logging is done on the shard level scope, meaning the execution of a search request within a specific shard. It does not encompass the whole search request, which can be broadcast to several shards in order to execute. Some of the benefits of shard level logging is the association of the actual execution on the specific machine, compared with request level.


 

All settings are index level settings (and each index can have different values for it), and can be changed in runtime using the index update settings API.


 

... and I have tried updating the index setting with a simple tool I made earlier. The idea is the same either way: you just send an HTTP PUT request that puts the variable into the index settings. You can find more information here. The keys for the configuration are available in the ShardSlowLogSearchService.java class.
[jason@node1 bin]$ ./indices-setting.sh set search.slowlog.threshold.query.trace 500
{
"ok" : true,
"acknowledged" : true
}
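The indices-setting.sh tool itself is not shown in this post, so the following is only a sketch of the kind of request it presumably sends, built with the Python standard library. The host, the helper name, and the "500ms" value with an explicit unit are assumptions; index_test is the index that appears in the log line of this post. The request is built and printed but not sent.

```python
import json
import urllib.request


def build_settings_request(index, key, value, host="http://localhost:9200"):
    """Build (but do not send) a PUT request for the index update settings API."""
    body = json.dumps({"index": {key: value}}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_settings_request(
    "index_test", "search.slowlog.threshold.query.trace", "500ms"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply it against a running node: urllib.request.urlopen(req)
```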

[2013-12-23 12:31:12,758][TRACE][index.search.slowlog.query] [node1] [index_test][146] took[1s], took_millis[1026], types[foo,bar], stats[], search_type[QUERY_THEN_FETCH], total_shards[90], source[{"size":80,"timeout":10000,"query":{"filtered":{"query":{"query_string":{"query":"maxis*","default_operator":"and"}},"filter":{"and":{"filters":[{"query":{"match":{"site":{"query":"www.google.com","type":"boolean"}}}},{"range":{"unixtimestamp":{"from":null,"to":1387825199000,"include_lower":true,"include_upper":true}}}]}}}},"filter":{"query":{"match":{"site":{"query":"www.google.com","type":"boolean"}}}},"sort":[{"unixtimestamp":{"order":"desc"}}]}], extra_source[],

In this example, the query ran for about 1 second, which exceeded the threshold set at 500ms.
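If you want to pull the timing out of such entries automatically, a regular expression over the leading fields is enough. The sketch below is written against the single log line shown above (shortened here), so the pattern, the field names, and the reading of the second bracket pair as index and shard number are assumptions; other Elasticsearch versions may format the line differently.

```python
import re

# Matches the leading fields of the slow log entry shown in this post.
SLOWLOG_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\[(?P<level>\w+)\s*\]\[(?P<logger>[^\]]+)\]\s*"
    r"\[(?P<node>[^\]]+)\]\s*\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]\s*"
    r"took\[(?P<took>[^\]]+)\], took_millis\[(?P<took_millis>\d+)\]"
)

# The entry from this post, cut off after the stats field for brevity.
line = (
    "[2013-12-23 12:31:12,758][TRACE][index.search.slowlog.query] [node1] "
    "[index_test][146] took[1s], took_millis[1026], types[foo,bar], stats[]"
)


def parse_slowlog(text):
    """Return the leading fields of a slow log line as a dict, or None."""
    match = SLOWLOG_RE.match(text)
    if match is None:
        return None
    entry = match.groupdict()
    entry["shard"] = int(entry["shard"])
    entry["took_millis"] = int(entry["took_millis"])
    return entry


entry = parse_slowlog(line)
print(entry["index"], entry["shard"], entry["took_millis"])
```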

As for indexing, the fundamental concept is the same, so we won't elaborate on it in this article; that is left to you as an exercise. :-)

Sunday, December 22, 2013

Learning Jmxterm

If you have been using jconsole to inspect how an application performs under the JVM, you might want to look for an alternative in command-line form. In this article, we are going to spend some time learning about Jmxterm. So what is Jmxterm? Jmxterm is a command line based interactive JMX client. It's designed to allow users to access a Java MBean server from the command line without a graphical environment. In other words, it's a command line based jconsole.

To get started you will, of course, need a JDK installed and a Java application that you want to inspect. To start using it, go to http://wiki.cyclopsgroup.org/jmxterm/download and download it. You should then have a jmxterm-[version].jar file.

I'm going to demonstrate how to use Jmxterm with an example of a terminal session.
$ java -jar jmxterm-1.0-alpha-4-uber.jar
Welcome to JMX terminal. Type "help" for available commands.
$>help;
#IllegalArgumentException: Command help; isn't valid, run help to see available commands
$>help
#following commands are available to use:
about - Display about page
bean - Display or set current selected MBean.
beans - List available beans under a domain or all domains
bye - Terminate console and exit
close - Close current JMX connection
domain - Display or set current selected domain.
domains - List all available domain names
exit - Terminate console and exit
get - Get value of MBean attribute(s)
help - Display available commands or usage of a command
info - Display detail information about an MBean
jvms - List all running local JVM processes
open - Open JMX session or display current connection
option - Set options for command session
quit - Terminate console and exit
run - Invoke an MBean operation
set - Set value of an MBean attribute
$> bean
null
$>beans
#IllegalStateException: Connection isn't open yet. Run open command to open a connection
$>domains
#following domains are available
#IllegalStateException: Connection isn't open yet. Run open command to open a connection
$>jvms
5552 ( ) - jmxterm-1.0-alpha-4-uber.jar
$>help open
usage: open [-h] [-p <val>] [-u <val>]
Open JMX session or display current connection
-h,--help Display usage
-p,--password <val> Password for user/password authentication
-u,--user <val> User name for user/password authentication
Without argument this command display current connection. URL can be a <PID>,
<hostname>:<port> or full qualified JMX service URL. For example
open localhost:9991,
open jmx:service:...
$>open 192.168.0.2:7199
#RuntimeIOException: Runtime IO exception: Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused
$>open localhost:7199
#Connection to localhost:7199 is opened
$>bean
null
$>bean org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
#bean is set to org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
$>info
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies
#class name = org.apache.cassandra.db.ColumnFamilyStore
# attributes
%0 - AutoCompactionDisabled (boolean, r)
%1 - BloomFilterDiskSpaceUsed (long, r)
%2 - BloomFilterFalsePositives (long, r)
%3 - BloomFilterFalseRatio (double, r)
%4 - BuiltIndexes (java.util.List, r)
%5 - ColumnFamilyName (java.lang.String, r)
%6 - CompactionStrategyClass (java.lang.String, rw)
%7 - CompressionParameters (java.util.Map, rw)
%8 - CompressionRatio (double, r)
%9 - CrcCheckChance (double, w)
%10 - DroppableTombstoneRatio (double, r)
%11 - EstimatedColumnCountHistogram ([J, r)
%12 - EstimatedRowSizeHistogram ([J, r)
%13 - LifetimeReadLatencyHistogramMicros ([J, r)
%14 - LifetimeWriteLatencyHistogramMicros ([J, r)
%15 - LiveCellsPerSlice (double, r)
%16 - LiveDiskSpaceUsed (long, r)
%17 - LiveSSTableCount (int, r)
%18 - MaxRowSize (long, r)
%19 - MaximumCompactionThreshold (int, rw)
%20 - MeanRowSize (long, r)
%21 - MemtableColumnsCount (long, r)
%22 - MemtableDataSize (long, r)
%23 - MemtableSwitchCount (int, r)
%24 - MinRowSize (long, r)
%25 - MinimumCompactionThreshold (int, rw)
%26 - PendingTasks (int, r)
%27 - ReadCount (long, r)
%28 - RecentBloomFilterFalsePositives (long, r)
%29 - RecentBloomFilterFalseRatio (double, r)
%30 - RecentReadLatencyHistogramMicros ([J, r)
%31 - RecentReadLatencyMicros (double, r)
%32 - RecentSSTablesPerReadHistogram ([J, r)
%33 - RecentWriteLatencyHistogramMicros ([J, r)
%34 - RecentWriteLatencyMicros (double, r)
%35 - SSTableCountPerLevel ([I, r)
%36 - SSTablesPerReadHistogram ([J, r)
%37 - TombstonesPerSlice (double, r)
%38 - TotalDiskSpaceUsed (long, r)
%39 - TotalReadLatencyMicros (long, r)
%40 - TotalWriteLatencyMicros (long, r)
%41 - UnleveledSSTables (int, r)
%42 - WriteCount (long, r)
# operations
%0 - long estimateKeys()
%1 - void forceMajorCompaction()
%2 - java.util.List getSSTablesForKey(java.lang.String p1)
%3 - void loadNewSSTables()
%4 - void setCompactionThresholds(int p1,int p2)
#there's no notifications
$>get WriteCount
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies:
WriteCount = 0;
$>get TotalDiskSpaceUsed
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies:
TotalDiskSpaceUsed = 9437;

So, a brief explanation of what I have just done. To start, you need to run Jmxterm from a terminal. To see what commands it has and what you can use them for, simply issue the help command. In order to inspect anything, you need to open a connection to the JVM. Once a connection is established, you can do all sorts of operations. In this example, I connected to Cassandra, selected its bean org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies, and got the WriteCount and TotalDiskSpaceUsed statistics.
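If the output of the get commands is captured to a file or a pipe, the values can be picked out with a few lines of code. The sketch below assumes the output looks exactly like the session above (comment lines starting with #, values as `Name = value;`); the function name is made up, and values are kept as strings because MBean attributes are not always numeric.

```python
# Output of the two get commands, copied from the session above.
sample_output = """\
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies:
WriteCount = 0;
#mbean = org.apache.cassandra.db:columnfamily=IndexInfo,keyspace=system,type=ColumnFamilies:
TotalDiskSpaceUsed = 9437;
"""


def parse_get_output(text):
    """Collect 'Name = value;' lines into a dict, skipping comments and prompts."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("$>"):
            continue
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # not an attribute line
        values[name] = value.rstrip(";")
    return values


stats = parse_get_output(sample_output)
print(stats)
```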

That's all folks! Hope you get an idea on what it does and where it is applicable to you.

Saturday, December 21, 2013

vifm a true gem

filemanager goes vim


  * An ncurses file manager with a vim-like UI. As a vim user you will feel
right at home: a command like dd deletes just like deleting a line in vim, and
you can move to the other window and type p to paste what is in the
"clipboard" there. The normal movement keys hjkl work as expected: j/k move
down/up the items in the list and h/l move up/down the directory tree. As in
vim, most settings are made in its rc file; the vifmrc is in ~/.vifm. To get
an idea of all the options you have in vifm, go to
http://vifm.sourceforge.net/docs.html#OPTIONS. It's amazing what you can do
with vifm; if, like me, you have been using vim for some time, this file
manager is a true gem. And much like vim, the options are "endless". Browse
the project homepage or github.com and you will find the source code, setups,
and help to make your own setup. I often look at configs and setups on GitHub
to get ideas and maybe improve my own.

  * The documentation for vifm is at http://vifm.sourceforge.net/docs.html.

  * I must say, after using vifm for some time, browsing GitHub, and making my
own vifmrc with some nice filetype settings and hard bookmarks: it's like vim,
the more you use it and add to your rc file, the better and faster it gets.

  * So thanks to ksteen & xaizek for this power tool.
It doesn't look like much, but it is ;)

Friday, December 20, 2013

A maven introduction.

Recently, I have been working on an open source project and stumbled upon Maven. I'm an Ant guy (with an Ant background), so I guessed that getting started with Maven should not be that difficult.

When you see a pom.xml file in a Java project, that tells you it is a Maven project; pom.xml is the Maven configuration file. For instance, a simple Java project would look like
project home
|
+-- src
|   +-- main
|   |   +-- java
|   |   +-- resources
|   +-- test
|       +-- java
|       +-- resources
+-- target
+-- pom.xml

The most basic command, and the one you will probably use most often, is

mvn package

With the above command, mvn will compile your classes, run any tests, and package the deliverable code and resources into target/my-app-1.0.jar. If mvn produces this jar, that should be enough, and the developer can concentrate on the Java project.

But if you are adventurous and want to know more about Maven, read on. Maven has several phases that you can pass to the mvn command. The following are phases of the standard Maven lifecycle, in order.

  • process-resources

  • compile

  • process-test-resources

  • test-compile

  • test

  • package

  • install

  • deploy
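The point of the ordering is that asking for a phase also runs every phase before it. A small sketch of that rule, using only the phases listed above (the full default lifecycle has a few more, such as validate) and a hypothetical helper name:

```python
# Phases in the order listed above; the complete default lifecycle has more.
PHASES = [
    "process-resources",
    "compile",
    "process-test-resources",
    "test-compile",
    "test",
    "package",
    "install",
    "deploy",
]


def phases_run_by(target):
    """Return the phases executed, in order, when `mvn <target>` is invoked."""
    if target not in PHASES:
        raise ValueError(f"unknown phase: {target}")
    return PHASES[: PHASES.index(target) + 1]


print(phases_run_by("package"))
```

So mvn package compiles and tests before packaging, but stops short of install and deploy.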


To satisfy the library dependencies of your project, specify the coordinates of each library it depends on in pom.xml. You can use  this site to search for those libraries.

I hope this serves as a simple starting point for using Maven in your Java project. If you have reached here and have further questions, see this link  and this link .

Changing ElasticSearch logging level by updating cluster setting.

In this article, we are going to learn how to update the logging level for all the nodes in an Elasticsearch cluster. Because logging is crucial to understanding system behaviour, from time to time we change the logging level in elasticsearch.yml and restart the Elasticsearch instance so that the new level is picked up. Unfortunately, a restart in live production takes some time (because of shard recovery), so this is not efficient.

Luckily, there is a cluster setting which allows the logging level to be changed on the fly.

With that, if you want to understand what's happening in a cluster node, you can change the logging level,

e.g.
curl -XPUT localhost:9200/_cluster/settings -d '{
"transient" : {
"logger.cluster.service" : "DEBUG"
}
}'

Then tail the Elasticsearch log; you should see new log entries start to appear. Because these logging settings are handled by the class NodeSettingsService, you should look into the Elasticsearch packages whose classes are initialized with it, for example cluster.service, cluster.routing.allocation.allocator, indices.ttl.IndicesTTLService, etc. Note that the package prefix, org.elasticsearch, is not needed when the setting is updated.
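To illustrate the naming rule, here is a small sketch that turns a package or class name into the request body used with the cluster settings call shown earlier. The helper name is hypothetical, and it assumes the prefix being dropped is the org.elasticsearch. Java package root.

```python
import json

# Assumed package root that is dropped from the logger name.
ES_PREFIX = "org.elasticsearch."


def logger_setting(name, level):
    """Build a transient cluster settings body for one logger."""
    short = name[len(ES_PREFIX):] if name.startswith(ES_PREFIX) else name
    return {"transient": {f"logger.{short}": level.upper()}}


body = logger_setting("org.elasticsearch.cluster.service", "debug")
print(json.dumps(body, indent=2))
```

The printed JSON is the same body as in the curl example, so it can be passed to curl with -d.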

If you want more information, this link would provide better help.