Friday, October 18, 2013

How to generate murmur3 in cassandra2.0

Reading into this doc, got really curious on how the murmur3 hash value is generated.

So I dig at cassandra github, found this this class , it seem that, cassandra 2.0 generate the token for the primary key using this method hash3_x64_128. Below are the method to get it work.. just put this into any java class and see the token generated.

    

public static LongToken genToken(String rowKey) {
ByteBuffer key = ByteBufferUtil.bytes(rowKey);
long hash = MurmurHash.hash3_x64_128(key, key.position(), key.remaining(), 0)[0];
LongToken lk = new LongToken(normalize(hash));
return lk;
}

public static void main(String[] args) {
System.out.println(genToken("jim"));
}

private static long normalize(long v)
{
// We exclude the MINIMUM value; see getToken()
return v == Long.MIN_VALUE ? Long.MAX_VALUE : v;
}

 


with jim, it generated as 2680261686609811218. So that should be correct. Something extra, if you use nodetool to show the token ranges, e.g. nodetool -h localhost describering jw_schema1, you should get an idea with the token generated, the range on which nodes are responsible that hold the row data.

Allocate jvm heap more than 8GB for cassandra

What happen if you allocate jvm heap more than 8GB for cassandra instance? With my past experience, we allocate more than 16GB for the cassandra instance and it is still running fine. But occasionally we encounter performance issue when we increase more than 16GB to the heap. Google a little and found this doc

Excessive heap space size

DataStax recommends using the default heap space size for most use
cases. Exceeding this size can impair the Java virtual machine's
(JVM) ability to perform fluid garbage collections (GC). The
following table shows a comparison of heap space performances
reported by a Cassandra user:

Heap CPU utilization Queries per second Latency
40 GB 50% 750 1 second
8 GB 5% 8500 (not maxed out) 10 ms

For information on heap sizing, see Tuning Java resources.


As the benchmark indicate, the more heap you allocate, the higher the cpu usage is. Though the performance decrease is not linear but rather exponentially. So it is wise to keep the heap at 8GB or not more than 50% of that value. It is not deadly but it certainly decrease the performance of the cluster dramatically which would render it useless. If you encountered memory error in the log, in this situation, apart from other factors, it is better if you consider scale your cluster horizontally, that is adding more nodes to increase the capacity. But a quick workaround should you encounter memory error, the

So what happen really happen in the gc if high heap is allocated? well, excerpt from the guru,

..the concurrent mark/sweep phase runs concurrently with your
application. CMS will cause a stop-the-world full pause it it fails to
complete a CMS sweep in time and you hit the maximum heap size, but
unless that happens, CMS will run concurrently (though there are
stop-the-world pauses involved, that are typically very short, the
mark/sweep phase is concurrent).


Hence, if you really hit stop the world situation, this would render the node useless, because the node is too busy doing gc that, cassandra would not be able to perform.

http://www.mail-archive.com/user@cassandra.apache.org/msg17481.html
http://www.mail-archive.com/user@cassandra.apache.org/msg32312.html

Tuesday, October 15, 2013

arch install/remove/search packages

Much like aptitude on debian, archlinux also use the same tools to add and remove packages.

pacman is the packages handler on arch, but as on debian there are more then one tool for the job, pacman is the stock like apt-get is on debian. The commands I list here is the minimum you have to know, to handel packages on archlinux. pacman is a command-line tool.

pacman <options> packag1 pack2

to update: pacman -Syu

to install: pacman -S package1

to remove: pacman -R package1

to search: pacman -Ss pack-x

Always good practice to run pacman -Sy before any install, to be sure you get the latest package

Like on debian all the dependencies are solved by pacman.

This is just the most common used, to get the full information on it, go to the archwiki or man pacman.

cassandra 2.0 catch 101 - part1 - correct cassandra Unsupported major.minor version 51.0

This is another series of my journey to cassandra 2.0, if you have not read the previous post, you should read here

As cassandra 2.0 required jdk 7.0 or later, if your system has jdk 6 running and configure, it is still possible to run cassandra 2.0 with jdk7, that is to make them co-exists. Download jdk and extract to the directory, e.g. /usr/lib/jvm. add JAVA_HOME=/usr/lib/jvm/jdk1.7.0_04/ to /etc/default/cassandra. This should export the variable JAVA_HOME to your environment so that jdk 7 is used to start the cassandra successfully.

Because the above setting is set to work for cassandra instance only, for the admin tools that come with it, we will set another environment for it. This make sure we don't break our existing work. If you want the environment to work only for yourself, then you should create a file in your home directory. $HOME/.cassandra.in.sh . Below are the content.
JAVA_HOME=/usr/lib/jvm/jdk1.7.0_04
. /usr/share/cassandra/cassandra.in.sh

This first line is the same as previously but the second line, we source the additional environment setting for the admin tools so that it can find the java classes. With that done, you should not get the error Unsupported major.minor version 51.0 below anymore!
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/cassandra/tools/NodeCmd : Unsupported major.minor version 51.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.cassandra.tools.NodeCmd. Program will exit.

Sunday, October 13, 2013

Changing default application that start a file

Sometime the default installation of an operating system configuration is not the way you wanted it to be, so with this article, it will show you how to change the application that open a file. Note that this article will work in gnome version 3.4.2 or later.

gui way
This is an easy way if you don't want to dig deep into how is the gnome mechanism file opening work. Just right click on the file, click on Properties. An pop up window associated with the file you selected appeared. On the tab Open With , locate the application you want the file to open with and then select it.

cli way
If you want to dig a little deeper, there are many configuration files located in /usr/share/applications. These files contain information about an application like application name and application exec command to trigger the start of an application. Once you located the application you want a file to be associated to start with, e.g. I would like csv file to be started with libreoffice-calc.desktop , then go to $HOME/.local/share/applications directory, locate another file, mimeapps.list. Open this file with an text editor, under the group [Default Applications] , add an entry like text/csv=libreoffice-calc.desktop. If you want to associated with more files, locate the general mime file at /etc/mime.types, and start to populate your environment with default application.

https://developer.gnome.org/integration-guide/stable/desktop-files.html.en

Saturday, October 12, 2013

analysis graph - gangnam style view



So after a year run, everyone on earth should not be alien about gangnam style; the korean kpop hit songs. It's impresive we see this video view in youtube break the most view youtube holder and set itself far ahead than the rest top 10 most view youtube videos.

Numbers speak and represent the popularity of the content, in this case, the song and the singer. With this article, I would like to share my thought and history experience about this song view chart. This is just my analysis and opinion based on this graph.

It was released sometime ago but it has not gain momentum, probably due to just released and new to the people. But from the black circle, this is where the slope is at its highest. At this time, videos played all over the tv, news play over and over in the radio, and even in social networking websites, everyone sharing this songs to everyone. The rate of explosion is tremendous!

Then we continue to see the curve going upward...as indicated in the red circle. There was a time where there is a video circulate in the internet said how the view is generated using a paid service company.. LoL.. lame.. that is, if you want to get your views goes high up, you paid a certain amount to a company, and they click on your video and view and does that continously for you. :-) But I don't think it happen for this song, because this gangnam trend is seem throughout the earth.. though it is korean song, the impact is everywhere, we can see mass public doing mob dance everywhere on earth.. also, the hundred of parodies....

Though the graph is still rising, as seen in blue circle, the rate of view is definitely slow down. somewhere during the period of 2013 april, that's about 6 months run! I think it has done good , considering a good pop song would not last like 2-3months... Nonetheless, true fans of Psy and newbie just watch the video still contributing the view rate. At this stage, Psy and his gangnam style has reach the fame and every household would know him.. maybe your pet, dog and cat too. :-)

The record breaking at almost 1.8billions view after july 2013 is really astonishing and greatest achievement. We see the graph not grow but in fact, if you zoom into the last of light yellow circle, you will see the fall of the view rate. I guesss it is only normal everyone is bored with this pop song. Would it grow any higher and stay steady? Trend and time will tell because we see history, there are pop songs goes back into chart, with gangnam style, there is high possibility too. Thereby push the graph higher.

That's it for my analysis and thank you for reading.

cassandra 2.0 catch 101

This is my experience with cassandra 2.0 and cqlsh, reading through the datastax documentation [1] , sometime you need to play around with the new tool to understand with the nature of the usage. It could be a bug, it could be just some changes, nonetheless, it is my experience reading and experiencing in cassandra 2.0. Some bugs may be fix or change in the later cassandra release and thus render this article invalid.. thus this article is only meant for cassandra 2.0 and cqlsh 4.0.0 . I will add more in my journey to cassandra 2.0


when updating config durable_writes to the keyspace, got a few times error... like below.


cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : false};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : false};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 'false'};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 'false'};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 0};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 0};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 1};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 1};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : '1'};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : '1'};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : '0'};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : '0'};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 'False'};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 'False'};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 'True'};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : 'True'};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : True};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : True};] reason: NullPointerException null
cqlsh:jw_schema1> alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : False};
Bad Request: Failed parsing statement: [alter keyspace jw_schema1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3, durable_writes : False};] reason: NullPointerException null


just specified as one config in the alter keyspace command. It strange alter keyspace is not in the cqlsh help file...


cqlsh:jw_schema1> alter keyspace jw_schema1 with durable_writes = false;
cqlsh:jw_schema1> DESCRIBE KEYSPACE jw_schema1;

CREATE KEYSPACE jw_schema1 WITH replication = {
'class': 'SimpleStrategy',
'replication_factor': '3'
} AND durable_writes = 'false';


voila!

[1] http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/configuration/configStorage_r.html#reference_ds_itw_wkz_1k
[2] http://cassandra.apache.org/doc/cql3/CQL.html#alterKeyspaceStmt