
Sunday, March 1, 2015

Studying the logic of elasticsearch discovery.zen.fd ping_timeout and ping_retries

Today, we are going to study two parameters from elasticsearch version 0.90.7, specifically

  • discovery.zen.fd.ping_timeout

  • discovery.zen.fd.ping_retries


Let's find the definitions in the official documentation, here and here.
There are two fault detection processes running. The first is by the master, to ping all the other nodes in the cluster and verify that they are alive. And on the other end, each node pings to master to verify if its still alive or an election process needs to be initiated.

 

ping_timeout — How long to wait for a ping response. Defaults to 30s.
ping_retries — How many ping failures / timeouts cause a node to be considered failed. Defaults to 3.
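
These two settings normally go in elasticsearch.yml. Just for completeness, below is a small hedged sketch of how you might pass them programmatically when embedding a node with the 0.90-era Java API; the values shown are simply the defaults restated, so treat it as illustration rather than a recommendation.

import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.node.Node;
import org.elasticsearch.node.NodeBuilder;

public class FaultDetectionSettings {

    public static void main(String[] args) {
        // Same keys you would otherwise place in elasticsearch.yml.
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("discovery.zen.fd.ping_timeout", "30s") // wait up to 30s for each ping reply
                .put("discovery.zen.fd.ping_retries", 3)     // give up after 3 failed pings
                .build();

        Node node = NodeBuilder.nodeBuilder().settings(settings).node();
        // ... use the node, then node.close() on shutdown.
    }
}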

So these settings are used by the master node and the data nodes to detect whether a node is still okay. Once a ping is sent, the wait for a response is 30 seconds by default, and a node is considered down/failed once it has failed to respond 3 times. Okay, let's go into the code.

NodesFaultDetection.java

Within the class NodesFaultDetection, there is an inner class SendPingRequest. As it implements the interface Runnable, its run method is executed by an executor. The ping is not a network-level ping; instead, objects are written and read over the transport layer to emulate ping behaviour, and you can read the class PingRequest for more information. As you may have noticed, ping_timeout is passed to the super class of PingRequest.

The essence of the logic is pretty much in the call transportService.sendRequest(final DiscoveryNode node, final String action, final TransportRequest request, final TransportRequestOptions options, TransportResponseHandler<T> handler). You might think it would be an ICMP ping, but it is not; no isReachable() call is made anywhere.

In the method handleResponse(PingResponse response), we see that the retry count is reset to 0 and this SendPingRequest object is scheduled again with the ping_interval you set earlier. In the method handleException(TransportException exp), we see that the variable retryCount is increased by one; if the current retry count reaches the default of 3, the node is considered dead and is removed. If the current retry count is still below 3, another ping is sent with the same ping_timeout.
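
To summarise the flow just described, here is a simplified sketch of that retry loop. This is not the actual elasticsearch source; the class NodePinger and the methods sendPing and removeNode are invented purely for illustration.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Invented illustration of the NodesFaultDetection retry behaviour described above.
public class NodePinger {

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final long pingIntervalMillis = 1000;    // discovery.zen.fd.ping_interval, default 1s
    private final int pingRetries = 3;               // discovery.zen.fd.ping_retries, default 3

    private int retryCount = 0;

    // Equivalent of handleResponse(PingResponse): the node answered within ping_timeout.
    void handleResponse() {
        retryCount = 0; // healthy again, reset the counter
        scheduler.schedule(new Runnable() {
            public void run() {
                sendPing();
            }
        }, pingIntervalMillis, TimeUnit.MILLISECONDS);
    }

    // Equivalent of handleException(TransportException): the ping timed out or failed.
    void handleException() {
        retryCount++;
        if (retryCount >= pingRetries) {
            removeNode(); // the node is considered failed and removed from the cluster
        } else {
            sendPing();   // retry straight away, still bounded by ping_timeout
        }
    }

    void sendPing() {
        // In elasticsearch this is transportService.sendRequest(...) with a handler
        // that routes the outcome to handleResponse or handleException.
    }

    void removeNode() {
        // Placeholder: elasticsearch notifies its listeners that the node failed.
    }
}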

MasterFaultDetection.java

Master fault detection is a little different from nodes fault detection. When the public method start of MasterFaultDetection is called, followed by the method innerStart(), a MasterPinger object is created.
this.masterPinger = new MasterPinger();
// start the ping process
threadPool.schedule(pingInterval, ThreadPool.Names.SAME, masterPinger);

So there is a periodic ping with a default interval of 1 second. When the MasterPinger instance is run, we notice that it goes through the same process of sending the request using the transport service: transportService.sendRequest(final DiscoveryNode node, final String action, final TransportRequest request, final TransportRequestOptions options, TransportResponseHandler<T> handler).
The logic of this request sending is the same as in NodesFaultDetection. What is interesting is the methods overridden in the class BaseTransportResponseHandler. In handleResponse, we see that the retry count is reset back to 0 and another ping is scheduled.

In the overridden method handleException(TransportException exp), there are three exception checks: whether the remote node is no longer the master, whether we pinged a node that is not a master, or whether this node does not exist on the master. After that, the retry count is increased by one. If the current retry count is greater than or equal to the default of 3, the pings to the master have failed and the master is considered failed; if it is less than 3, another ping is sent.

That's it. If you think this analysis needs improvement, please leave a comment below. Thank you.

Saturday, February 28, 2015

Implement java remote method invocation on tomcat6

Today, we will learn a bit about remote method invocation (RMI) in java. I know RMI is an old concept, but for the sake of learning, nothing is old :) fun and knowledge are what matter. First, let's see what java remote method invocation is. From wikipedia:
The Java Remote Method Invocation (Java RMI) is a Java API that performs the object-oriented equivalent of remote procedure calls (RPC), with support for direct transfer of serialized Java classes and distributed garbage collection.

 

The original implementation depends on Java Virtual Machine (JVM) class representation mechanisms and it thus only supports making calls from one JVM to another. The protocol underlying this Java-only implementation is known as Java Remote Method Protocol (JRMP).
In order to support code running in a non-JVM context, a CORBA version was later developed.

 

Usage of the term RMI may denote solely the programming interface or may signify both the API and JRMP, whereas the term RMI-IIOP (read: RMI over IIOP) denotes the RMI interface delegating most of the functionality to the supporting CORBA implementation.



and if you do not understand that, looking one step up, java rmi is actually a java implementation of remote procedure call (rpc). Excerpt from wikipedia:
In computer science, a remote procedure call (RPC) is an inter-process communication that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details for this remote interaction.[1] That is, the programmer writes essentially the same code whether the subroutine is local to the executing program, or remote. When the software in question uses object-oriented principles, RPC is called remote invocation or remote method invocation.

Okay, enough of the theory; let's build a simple java rmi example using tomcat. This tutorial assumes you have a tomcat server running and know the basics of deploying a jar file into your running tomcat server.

First, provide access permission to the jar. If you are just starting to learn this and there are too many concepts to grasp at once, it is probably easiest to grant all security permissions first and fine tune later once you are comfortable. You should set this in <TOMCAT_HOME>/conf/catalina.policy
grant {
    permission java.security.AllPermission;
};

Now we will code the server side. First, let's create a java interface which the server will implement and whose methods the client will invoke remotely.
import java.rmi.Remote;

public interface CalculatorInterface extends Remote {

    public final String serviceName = "MyRemoteService";

    public Double Add(Double num1, Double num2) throws Exception;

    public Double Sub(Double num1, Double num2) throws Exception;

    public Double Mul(Double num1, Double num2) throws Exception;

    public Double Div(Double num1, Double num2) throws Exception;

    public Integer Factorial(Integer num) throws Exception;

    public Float Random() throws Exception;
}

So we have a Calculator interface which extends the Remote interface. A few public methods are exposed in this interface and will be invoked by the client later.
public class Calculator implements CalculatorInterface {

    public Calculator() {
        super();
    }

    @Override
    public Double Add(Double num1, Double num2) throws Exception {
        return num1 + num2;
    }

    @Override
    public Double Sub(Double num1, Double num2) throws Exception {
        return num1 - num2;
    }

    @Override
    public Double Mul(Double num1, Double num2) throws Exception {
        return num1 * num2;
    }

    @Override
    public Double Div(Double num1, Double num2) throws Exception {
        return num1 / num2;
    }

    @Override
    public Integer Factorial(Integer num) throws Exception {
        Integer t = 1;
        for (int i = 1; i <= num; i++) {
            t = t * i;
        }
        return t;
    }

    @Override
    public Float Random() throws Exception {
        return (float) Math.random();
    }

}

Here, we implement the calculator; as you can see, these are all basic mathematical operations. Now, we will start this instance in tomcat. The easiest way is probably to implement ServletContextListener and export the stub on a port when tomcat is starting. With that said, let's read the code below.
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

public class InitCalculator implements ServletContextListener {

    public static boolean isRegistered = false;
    public static CalculatorInterface service;

    public InitCalculator() {
        if (!isRegistered) {
            try {
                service = new Calculator();
                CalculatorInterface stub = (CalculatorInterface) UnicastRemoteObject.exportObject(service, 0);
                Registry registry = LocateRegistry.createRegistry(9345);
                registry.rebind(CalculatorInterface.serviceName, stub);
                System.out.println("Remote service bound");
                isRegistered = true;
            } catch (Exception e) {
                System.err.println("Remote service exception:");
                e.printStackTrace();
            }
        }
    }

    @Override
    public void contextDestroyed(ServletContextEvent arg0) {
        // TODO Auto-generated method stub
    }

    @Override
    public void contextInitialized(ServletContextEvent arg0) {
        new InitCalculator();
        System.out.println("started ...");
    }

}

As seen above, when the webapp context is initialized, a new InitCalculator object is created. The remote object is exported and bound in a registry on port 9345, so make sure your firewall allows this, as you will need to access this port remotely later. Very easy code. Remember to register this listener class in the tomcat web descriptor.
<listener>
    <listener-class>com.example.InitCalculator</listener-class>
</listener>

Moving on to the last piece of the puzzle, the client code.
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;

public class CalculatorClient {

    public static void main(String[] args) {
        try {
            Registry registry = LocateRegistry.getRegistry("localhost", 9345);
            String[] names = registry.list();
            for (String name : names) {
                System.out.println("~~~~~" + name + "~~~");
            }
            CalculatorInterface serv = (CalculatorInterface) registry.lookup(CalculatorInterface.serviceName);
            System.out.println("add total " + serv.Add(1d, 1d));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

}

As can be read above, the client code connects to localhost on port 9345 and lists what is in the registry. Then the stub is obtained with a registry lookup on the interface service name, and we can invoke the server methods. Pretty cool stuff. :) See below.
[user@localhost ~]$ java -cp /var/lib/tomcat/webapps/example/WEB-INF/lib/example.jar:. CalculatorClient
~~~~~MyRemoteService~~~
add total 2.0

That's it.

Friday, February 27, 2015

how to determine currently occupied queue size and cache usage in elasticsearch 0.90

Have you encountered a situation where an elasticsearch client indexing into the cluster gets a rejected exception back from the cluster? What about when you have cached some filters in your query and you want to know how much memory they are using at the moment? If yes, and you are using elasticsearch 0.90, then you have come to the right place. I'm going to show you how to view these statistics through the metric APIs that elasticsearch exposes. This is important if you want to determine the health of your cluster.

Okay, let's start with the first one: how to get the currently occupied queue size on a node in the cluster.
[jason@node009 ~]$ curl -XGET 'http://localhost:9200/_nodes/node009/stats/thread_pool?pretty'
{
"cluster_name" : "MY_TEST_Cluster",
"nodes" : {
"1111111111111111111111" : {
"timestamp" : 1422372473667,
"name" : "node009",
"transport_address" : "inet[my.private.ip.com/1.2.3.4:9300]",
"hostname" : "node009.foobar.com",
"thread_pool" : {
"generic" : {
"threads" : 1,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 82,
"completed" : 6378594
},
"index" : {
"threads" : 8,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 8,
"completed" : 25735782
},
"get" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"snapshot" : {
"threads" : 4,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 4,
"completed" : 1003286
},
"merge" : {
"threads" : 4,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 4,
"completed" : 4863710
},
"suggest" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"bulk" : {
"threads" : 8,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 8,
"completed" : 42148
},
"optimize" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"warmer" : {
"threads" : 4,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 4,
"completed" : 2087615
},
"flush" : {
"threads" : 3,
"queue" : 0,
"active" : 1,
"rejected" : 0,
"largest" : 4,
"completed" : 10492
},
"search" : {
"threads" : 512,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 512,
"completed" : 245843
},
"percolate" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"management" : {
"threads" : 5,
"queue" : 0,
"active" : 1,
"rejected" : 0,
"largest" : 5,
"completed" : 2082438
},
"refresh" : {
"threads" : 4,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 4,
"completed" : 1521727
}
}
}
}
}

As can be read above, these are node statistics with only the thread pool stats exposed. If the client is actively indexing into elasticsearch, the metric you should look at is index. The above sample looks pretty good: there is no queue buildup and no rejection.
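
If you want to pull these numbers programmatically, for example to feed a small monitoring script, a plain HTTP GET against the same endpoint is enough. Below is a minimal sketch using only the JDK; the host, port and node name are assumptions you would adapt to your own cluster.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal sketch: fetch the thread pool stats of one node over the REST API.
// Adjust host, port and node name ("node009") for your own cluster.
public class ThreadPoolStats {

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://localhost:9200/_nodes/node009/stats/thread_pool?pretty");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");

        BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        StringBuilder body = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            body.append(line).append('\n');
        }
        reader.close();
        conn.disconnect();

        // The response is the same JSON shown above; parse it with your favourite
        // JSON library and alert when "queue" or "rejected" of the index/bulk pools grows.
        System.out.println(body);
    }
}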

Next, we will take a look at cache usage, using the same node stats api but changing thread_pool to indices.
[jason@node009 ~]$ curl -XGET 'http://localhost:9200/_nodes/node009/stats/indices?pretty'
{
"cluster_name" : "MY_TEST_Cluster",
"nodes" : {
"1111111111111111111111" : {
"timestamp" : 1422373322128,
"name" : "node009",
"transport_address" : "inet[my.private.ip.com/1.2.3.4:9300]",
"hostname" : "node009.foobar.com",
"indices" : {
"docs" : {
"count" : 134502646,
"deleted" : 104806463
},
"store" : {
"size" : "340.9gb",
"size_in_bytes" : 366092384499,
"throttle_time" : "2.1ms",
"throttle_time_in_millis" : 2
},
"indexing" : {
"index_total" : 25692998,
"index_time" : "6.8h",
"index_time_in_millis" : 24495073,
"index_current" : 22015,
"delete_total" : 13217673,
"delete_time" : "14.6m",
"delete_time_in_millis" : 877101,
"delete_current" : 0
},
"get" : {
"total" : 0,
"get_time" : "0s",
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time" : "0s",
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time" : "0s",
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 1,
"query_total" : 204027,
"query_time" : "1.9h",
"query_time_in_millis" : 6856699,
"query_current" : 0,
"fetch_total" : 34409,
"fetch_time" : "2.4m",
"fetch_time_in_millis" : 148210,
"fetch_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size" : "0b",
"current_size_in_bytes" : 0,
"total" : 563950,
"total_time" : "7.5h",
"total_time_in_millis" : 27166425,
"total_docs" : 219150610,
"total_size" : "374.6gb",
"total_size_in_bytes" : 402324404903
},
"refresh" : {
"total" : 1522907,
"total_time" : "10.5h",
"total_time_in_millis" : 38110904
},
"flush" : {
"total" : 10499,
"total_time" : "2.4h",
"total_time_in_millis" : 8726951
},
"warmer" : {
"current" : 0,
"total" : 2089352,
"total_time" : "9.3m",
"total_time_in_millis" : 560985
},
"filter_cache" : {
"memory_size" : "2.8gb",
"memory_size_in_bytes" : 3011274608,
"evictions" : 35449
},
"id_cache" : {
"memory_size" : "0b",
"memory_size_in_bytes" : 0
},
"fielddata" : {
"memory_size" : "140.5mb",
"memory_size_in_bytes" : 147415629,
"evictions" : 86803
},
"completion" : {
"size" : "231b",
"size_in_bytes" : 231
},
"segments" : {
"count" : 700
}
}
}
}
}

As seen above, there are two cache metrics, filter cache and id cache. It is now pretty clear how much cache is used on this node and how many evictions have happened. There is also a fielddata metric, which is memory occupied in the jvm that you might want to keep an eye on during monitoring. If you want to know exactly how much memory each field is using, you can use this api
curl localhost:9200/_nodes/stats/indices/fielddata/field1,field2?pretty

But this one is left for you to play with as homework. Hint: replace field1 and field2 with fields you actually index and read the output. That's it. :-)

Sunday, February 15, 2015

Fix steam error libGL error: failed to load driver: swrast in debian

If you have the steam client installed on debian sid, then once in a while an operating system upgrade breaks the steam client. Below is example output of such an error.
user@localhost:~$ steam
Running Steam on debian 8 64-bit
STEAM_RUNTIME is enabled automatically
Installing breakpad exception handler for appid(steam)/version(1421694684)
libGL error: unable to load driver: r600_dri.so
libGL error: driver pointer missing
libGL error: failed to load driver: r600
libGL error: unable to load driver: swrast_dri.so
libGL error: failed to load driver: swrast
^C

So the steam client fails to launch, and it looks like the 3d graphics driver is unable to load or is not installed. Don't bother installing the package libgl1-mesa-swx11 that provides the swrast file, because at this point installing it will not work, as the conflicts are clearly indicated: Conflicts: libgl1, libgl1-mesa-swrast, mesag3, mesag3+ggi, mesag3-glide, mesag3-glide2, nvidia-glx. Installing it will render the debian gui unusable; I have been down that path :( So don't do that.

So I googled and found a good solution, and below are the steps I took. I hope it works for you too.
user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/i386/usr/lib/i386-linux-gnu$ mv libstdc++.so.6.0.18 libstdc++.so.6.0.18.remove.by.user
user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/i386/usr/lib/i386-linux-gnu$ ls libstdc++.so.6*
lrwxrwxrwx 1 user user 19 Jul 19 2014 libstdc++.so.6 -> libstdc++.so.6.0.18
-rw-r--r-- 1 user user 901K Jul 19 2014 libstdc++.so.6.0.18.remove.by.user
user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/i386/usr/lib/i386-linux-gnu$ rm libstdc++.so.6
rm: remove symbolic link ‘libstdc++.so.6’? y
user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/i386/usr/lib/i386-linux-gnu$ pwd
/home/user/.local/share/Steam/ubuntu12_32/steam-runtime/i386/usr/lib/i386-linux-gnu


user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/amd64/usr/lib/x86_64-linux-gnu$ pwd
/home/user/.local/share/Steam/ubuntu12_32/steam-runtime/amd64/usr/lib/x86_64-linux-gnu
user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/amd64/usr/lib/x86_64-linux-gnu$ mv libstdc++.so.6.0.18 libstdc++.so.6.0.18.remove.by.user
user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/amd64/usr/lib/x86_64-linux-gnu$ rm libstdc++.so.6
rm: remove symbolic link ‘libstdc++.so.6’? y
user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/amd64/usr/lib/x86_64-linux-gnu$ pwd
/home/user/.local/share/Steam/ubuntu12_32/steam-runtime/amd64/usr/lib/x86_64-linux-gnu
user@localhost:~/.local/share/Steam/ubuntu12_32/steam-runtime/amd64/usr/lib/x86_64-linux-gnu$

As you can see above, the example shows the libstdc++.so.6 symbolic links in two different directories, i386 and amd64, being removed. The file each symlink pointed to, libstdc++.so.6.0.18, is renamed rather than deleted, so it can be reverted if something goes wrong later.

After these are removed, start the steam client again; steam will redownload the files and it should work again! :-)

Saturday, February 14, 2015

how to connect to msn server with pidgin 2.10.11

After numerous news reports (here, here and here) that msn will be, or was, shut down, today we will take another look at whether connecting to the msn server is still possible. hehe

Well, this issue of not being able to connect to the msn server happened to me again. But I'm not so sure if microsoft really shut down the messenger server? Anyway, let's fire up pidgin with the --debug option.
(04:39:32) account: Connecting to account ursa@hotmail.com.
(04:39:32) connection: Connecting. gc = 0x7fb12226a4a0
(04:39:32) msn: new httpconn (0x7fb12244ce40)
(04:39:32) proxy: Gnome proxy settings are set to 'manual' but no suitable proxy server is specified. Using Pidgin's proxy settings instead.
(04:39:32) dnsquery: Performing DNS lookup for messenger.hotmail.com
(04:39:32) proxy: Gnome proxy settings are set to 'manual' but no suitable proxy server is specified. Using Pidgin's proxy settings instead.
(04:39:32) dns: Wait for DNS child 4807 failed: No child processes
(04:39:32) dns: Wait for DNS child 4816 failed: No child processes
(04:39:32) dns: Created new DNS child 5206, there are now 1 children.
(04:39:32) dns: Successfully sent DNS request to child 5206
(04:39:32) dns: Got response for 'messenger.hotmail.com'
(04:39:32) dnsquery: IP resolved for messenger.hotmail.com
(04:39:32) proxy: Attempting connection to 64.4.45.209
(04:39:32) proxy: Connecting to messenger.hotmail.com:1863 with no proxy
(04:39:32) proxy: Connection in progress
(04:39:32) proxy: Connecting to messenger.hotmail.com:1863.
(04:39:32) proxy: Error connecting to messenger.hotmail.com:1863 (Connection refused).
(04:39:32) proxy: Connection attempt failed: Connection refused
(04:39:32) msn: Connection error: Connection refused
(04:39:32) msn: Connection error from Notification server (messenger.hotmail.com): Connection refused
(04:39:32) connection: Connection error on 0x7fb12226a4a0 (reason: 0 description: Connection error from Notification server:
Connection refused)
(04:39:32) account: Disconnecting account ursa@hotmail.com (0x7fb1218f83c0)
(04:39:32) connection: Disconnecting connection 0x7fb12226a4a0
(04:39:32) msn: destroy the OIM 0x7fb12226b250
(04:39:32) msn: destroy httpconn (0x7fb12244ce40)
(04:39:32) connection: Destroying connection 0x7fb12226a4a0

Bummer! So connecting to msn really is a problem (again)! I played around with the settings and, surprise surprise, pidgin can connect to msn again. >:-) Here is what I did to make it work.

  1. In the pidgin menu, click on Accounts.

  2. Click on Manage Accounts.

  3. Select your msn account and click on Modify...

  4. In the Modify Account window, click on Advanced tab and check the checkbox Use HTTP Method.

  5. Then in the Proxy tab, for the proxy type, select Use Environmental Settings. Note this setting really depends on your network setup, so check with your network admin.


[Screenshots: Modify Account window, Proxy and Advanced tabs]

Save the settings and click the checkbox in the Enabled column in Accounts. Fingers crossed it will work; at least it did this time for me (until it breaks again). :-)

That's it!

 

Friday, February 13, 2015

using google guava library to hold data for report

Often when one works with a report (just a typical report), it is pretty common to need to hold a list of rows in a data type that has a key and a value, and maybe a page number. So as a java programmer, you will encounter something like this.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map.Entry;

public class Report {

    private List<LinkedHashMap<String, String>> rows = new ArrayList<LinkedHashMap<String, String>>();
    private int page;

    // getter so printReport can reach the rows
    public List<LinkedHashMap<String, String>> getOld() {
        return rows;
    }

    public static void printReport(Report report) {

        List<LinkedHashMap<String, String>> oldReport = report.getOld();

        for (LinkedHashMap<String, String> oldRows : oldReport) {
            for (Entry<String, String> entry : oldRows.entrySet()) {
                System.out.print(entry.getValue());
            }
        }
    }

}

So you have many rows in a report, and within each row you have keys and values. For instance, on the first page of the report you have a person with first name john, last name doe and age 30. Then you have another row for a person with first name dan, last name christensen, age 40, and so on. To print the report, you basically iterate over the data collections and print out the values.

Is there another way that is better, or more efficient?

So I googled and people suggest using guava, so I will take a look at the different features offered by guava and how they help in the situation above. So what is google guava?
The Google Guava is an open-source set of common libraries for Java, mainly developed by Google engineers.

This page gives a general overview of the common libraries found in google guava. As you will notice, there are many features included in this library, but for the report above I will use only two of them. Let's rewrite the above code.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import com.google.common.base.Joiner;
import com.google.common.collect.LinkedHashMultimap;
import com.google.common.collect.LinkedHashMultiset;
import com.google.common.collect.Multimap;
import com.google.common.collect.Multiset;

public class Report {

    private int page;
    private List<String> header;
    private Multimap<String, String> rows;
    private Multiset<String> rowsSet;

    private List<LinkedHashMap<String, String>> old = new ArrayList<LinkedHashMap<String, String>>();

    public Report() {
        page = 0;
        header = new ArrayList<String>();
        rows = LinkedHashMultimap.create();
        rowsSet = LinkedHashMultiset.create();
    }

    public void getReportFromDS() {
        header.addAll(Arrays.asList("firstName", "LastName", "age"));
        rows.put("2829f395317df0f88597ef288f132827794707af", "john");
        rows.put("2829f395317df0f88597ef288f132827794707af", "doe");
        rows.put("2829f395317df0f88597ef288f132827794707af", "30");
        rows.put("d94c2ddf2a4817e5c9a56db45d41ed876e823fcf", "dan");
        rows.put("d94c2ddf2a4817e5c9a56db45d41ed876e823fcf", "christensen");
        rows.put("d94c2ddf2a4817e5c9a56db45d41ed876e823fcf", "40");
        rows.put("1fd23a55e9780810d2e6f0ec9ba1ddb99827e4cf", "chai");
        rows.put("1fd23a55e9780810d2e6f0ec9ba1ddb99827e4cf", "lenny");
        rows.put("1fd23a55e9780810d2e6f0ec9ba1ddb99827e4cf", "20");
    }

    public int getPage() {
        return page;
    }

    public List<String> getHeader() {
        return header;
    }

    public Multimap<String, String> getRows() {
        return rows;
    }

    public List<LinkedHashMap<String, String>> getOld() {
        return old;
    }

    public static void printReport(Report report) {
        Joiner joiner = Joiner.on(", ");
        String headers = joiner.join(report.getHeader());

        System.out.println(headers);
        Map<String, Collection<String>> rows = report.getRows().asMap();

        for (Entry<String, Collection<String>> row : rows.entrySet()) {
            String line = joiner.join(row.getValue().iterator());
            System.out.println(line);
        }

        List<LinkedHashMap<String, String>> oldReport = report.getOld();

        for (LinkedHashMap<String, String> oldRows : oldReport) {
            for (Entry<String, String> entry : oldRows.entrySet()) {
                System.out.print(entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Report sampleReport = new Report();
        sampleReport.getReportFromDS();
        printReport(sampleReport);
    }

}

As noted in the full code above, it contains a constructor to initialize the objects, then a method getReportFromDS() whose data you would probably fetch from your data source, such as a database. Then we have getter methods and a static method to print the report. If you run this app, you will notice it prints out the report header and then the rows.

Joiner is a class which joins strings together in just two lines; even better, you can make it one line ;-). To print each row of the sample report, you only need a single for loop: join the values of the row and print the row out. Less code and more readability. If you measure the object sampleReport, I would guess it also uses a smaller memory footprint.
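
For example, the two lines in printReport that build and print the header could be collapsed into a single statement:

// Same output as the headers/joiner pair in printReport, as one line.
System.out.println(Joiner.on(", ").join(report.getHeader()));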

That's it, just two goodies from google guava. I suggest you read up on the other features offered and make full use of this great library.

Sunday, February 1, 2015

Initial study on apache lucene

Today, we are going to learn about apache lucene. So first things first, what is apache lucene?
Apache Lucene is a free open source information retrieval software library, originally written in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License.

Let's go into an apache lucene "hello world", so we get a basic idea of what it is. Go to the official site and download the latest release. Below is the tutorial I followed from the official documentation, using apache lucene version 4.10.3 with oracle java 7 and slight modifications to the tutorial.
jason@localhost:~/Desktop/lucene-4.10.3$ java -cp ./core/lucene-core-4.10.3.jar:./queryparser/lucene-queryparser-4.10.3.jar:./analysis/common/lucene-analyzers-common-4.10.3.jar:./demo/lucene-demo-4.10.3.jar org.apache.lucene.demo.IndexFiles
Usage: java org.apache.lucene.demo.IndexFiles [-index INDEX_PATH] [-docs DOCS_PATH] [-update]

This indexes the documents in DOCS_PATH, creating a Lucene index in INDEX_PATH that can be searched with SearchFiles
jason@localhost:~/Desktop/lucene-4.10.3$ java -cp ./core/lucene-core-4.10.3.jar:./queryparser/lucene-queryparser-4.10.3.jar:./analysis/common/lucene-analyzers-common-4.10.3.jar:./demo/lucene-demo-4.10.3.jar org.apache.lucene.demo.IndexFiles -index data/ -docs docs/
Indexing to directory 'data/'...
adding docs/grouping/constant-values.html
adding docs/grouping/index.html
adding docs/grouping/allclasses-noframe.html
adding docs/grouping/overview-frame.html
adding docs/grouping/org/apache/lucene/search/grouping/AbstractGroupFacetCollector.html
...
...
...
adding docs/analyzers-phonetic/deprecated-list.html
adding docs/analyzers-phonetic/package-list
adding docs/analyzers-phonetic/allclasses-frame.html
95794 total milliseconds
jason@localhost:~/Desktop/lucene-4.10.3$ uptime
21:10:16 up 16:44, 23 users, load average: 5.45, 4.49, 3.59

As you can see, instead of indexing java source files, I indexed the javadoc in html format and it works nicely. Although my system is loaded, the indexing was still reasonably quick: apache lucene finished indexing a total of 5818 files within 95 seconds. After indexing is done, if you list the directory data, you will notice the lucene index files. If you want to go into the details of what these files are, you should read this documentation.
jason@localhost:~/Desktop/lucene-4.10.3$ ls -l data/
total 13784
-rw-r--r-- 1 jason jason 284 Jan 13 21:07 _0.cfe
-rw-r--r-- 1 jason jason 12387776 Jan 13 21:07 _0.cfs
-rw-r--r-- 1 jason jason 242 Jan 13 21:07 _0.si
-rw-r--r-- 1 jason jason 284 Jan 13 21:07 _1.cfe
-rw-r--r-- 1 jason jason 1677329 Jan 13 21:07 _1.cfs
-rw-r--r-- 1 jason jason 242 Jan 13 21:07 _1.si
-rw-r--r-- 1 jason jason 151 Jan 13 21:07 segments_1
-rw-r--r-- 1 jason jason 36 Jan 13 21:07 segments.gen
-rw-r--r-- 1 jason jason 0 Jan 13 21:06 write.lock
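
Before moving on to search, here is a rough idea of what the IndexFiles demo does under the hood, in case you want to drive lucene from your own code. This is a minimal, hedged sketch based on my reading of the lucene 4.10.3 javadoc; the paths and field names are assumptions, not the demo's exact source.

import java.io.File;
import java.io.FileReader;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class SimpleIndexer {

    public static void main(String[] args) throws Exception {
        // Where the index is written; equivalent to the -index data/ argument above.
        Directory dir = FSDirectory.open(new File("data"));
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_10_3);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_4_10_3, analyzer);
        IndexWriter writer = new IndexWriter(dir, config);

        // Index a single html file from the docs directory (the demo walks the whole tree).
        File htmlFile = new File("docs/grouping/index.html");
        Document doc = new Document();
        doc.add(new StringField("path", htmlFile.getPath(), Field.Store.YES));
        doc.add(new TextField("contents", new FileReader(htmlFile)));
        writer.addDocument(doc);

        writer.close();
        dir.close();
    }
}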

Okay, now to the search.
jason@localhost:~/Desktop/lucene-4.10.3$ java -cp ./core/lucene-core-4.10.3.jar:./queryparser/lucene-queryparser-4.10.3.jar:./analysis/common/lucene-analyzers-common-4.10.3.jar:./demo/lucene-demo-4.10.3.jar  org.apache.lucene.demo.SearchFiles
Exception in thread "main" org.apache.lucene.store.NoSuchDirectoryException: directory '/home/jason/Desktop/lucene-4.10.3/index' does not exist
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:218)
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:242)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:801)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:53)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:67)
at org.apache.lucene.demo.SearchFiles.main(SearchFiles.java:91)
jason@localhost:~/Desktop/lucene-4.10.3$ java -cp ./core/lucene-core-4.10.3.jar:./queryparser/lucene-queryparser-4.10.3.jar:./analysis/common/lucene-analyzers-common-4.10.3.jar:./demo/lucene-demo-4.10.3.jar org.apache.lucene.demo.SearchFiles --help
Exception in thread "main" org.apache.lucene.store.NoSuchDirectoryException: directory '/home/jason/Desktop/lucene-4.10.3/index' does not exist
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:218)
at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:242)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:801)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:53)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:67)
at org.apache.lucene.demo.SearchFiles.main(SearchFiles.java:91)
jason@localhost:~/Desktop/lucene-4.10.3$ java -cp ./core/lucene-core-4.10.3.jar:./queryparser/lucene-queryparser-4.10.3.jar:./analysis/common/lucene-analyzers-common-4.10.3.jar:./demo/lucene-demo-4.10.3.jar org.apache.lucene.demo.SearchFiles -h
Usage: java org.apache.lucene.demo.SearchFiles [-index dir] [-field f] [-repeat n] [-queries file] [-query string] [-raw] [-paging hitsPerPage]

See http://lucene.apache.org/core/4_1_0/demo/ for details.
jason@localhost:~/Desktop/lucene-4.10.3$ java -cp ./core/lucene-core-4.10.3.jar:./queryparser/lucene-queryparser-4.10.3.jar:./analysis/common/lucene-analyzers-common-4.10.3.jar:./demo/lucene-demo-4.10.3.jar org.apache.lucene.demo.SearchFiles -index data
Enter query:
string
Searching for: string
1674 total matching documents
1. docs/benchmark/org/apache/lucene/benchmark/byTask/utils/Format.html
2. docs/analyzers-common/org/apache/lucene/analysis/util/AbstractAnalysisFactory.html
3. docs/queryparser/deprecated-list.html
4. docs/queryparser/org/apache/lucene/queryparser/classic/class-use/ParseException.html
5. docs/queryparser/org/apache/lucene/queryparser/flexible/core/messages/QueryParserMessages.html
6. docs/core/org/apache/lucene/index/IndexFileNames.html
7. docs/analyzers-stempel/org/egothor/stemmer/Diff.html
8. docs/queryparser/org/apache/lucene/queryparser/ext/Extensions.html
9. docs/facet/org/apache/lucene/facet/FacetsConfig.html
10. docs/queryparser/org/apache/lucene/queryparser/flexible/messages/package-summary.html
Press (n)ext page, (q)uit or enter number to jump to a page.
n
11. docs/highlighter/org/apache/lucene/search/highlight/class-use/InvalidTokenOffsetsException.html
12. docs/queryparser/org/apache/lucene/queryparser/xml/DOMUtils.html
13. docs/queryparser/org/apache/lucene/queryparser/classic/MultiFieldQueryParser.html
14. docs/core/org/apache/lucene/index/SegmentInfo.html
15. docs/highlighter/org/apache/lucene/search/vectorhighlight/FragmentsBuilder.html
16. docs/highlighter/org/apache/lucene/search/vectorhighlight/class-use/FieldFragList.html
17. docs/highlighter/org/apache/lucene/search/vectorhighlight/BaseFragmentsBuilder.html
18. docs/queryparser/org/apache/lucene/queryparser/flexible/standard/QueryParserUtil.html
19. docs/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html
20. docs/highlighter/org/apache/lucene/search/postingshighlight/PostingsHighlighter.html
Press (p)revious page, (n)ext page, (q)uit or enter number to jump to a page.
q
Enter query:
quit
Searching for: quit
2 total matching documents
1. docs/demo/src-html/org/apache/lucene/demo/SearchFiles.html
2. docs/changes/Changes.html
Press (q)uit or enter number to jump to a page.
q
Enter query:
^Cjason@localhost:~/Desktop/lucene-4.10.3$

The search is quick even on a loaded system. That's it, a light learning experience with apache lucene.

Saturday, January 31, 2015

How to setup software raid mirroring for disks on xubuntu

Over a period of time, a disk in a computer can be expected to stop working, and when it does, all the data on it is lost. Oh no, that's not good! In this article, we will take a look at mirroring the data from one disk to another disk using software, so that the data is duplicated on at least two disks and a single disk failure no longer means data loss. There is also hardware raid, but in this article we will look into software raid, specifically software raid one, that is, mirroring. For a detailed explanation of software raid one, please read this link, but the short version is that it saves the data onto two disks at once and can read from both disks.

This article assumes you have two disks with the same storage capacity, with only one partition per disk and that partition occupying the whole disk. In the examples, the operating system detects these two disks as sdb and sdc. Let's start by partitioning them. Note: creating partitions will destroy your current data, so make sure you back up your data somewhere else safely before continuing.
root@localhost:~# fdisk /dev/sdb 

Welcome to fdisk (util-linux 2.25.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): p
Disk /dev/sdb: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00053dc0

Device Boot Start End Sectors Size Id Type
/dev/sdb1 2048 976773119 976771072 465.8G fd Linux raid autodetect

As you can see above, this is what the end result should look like. You can type m for help. Creating the partition is your homework, but as a hint, you need to add a new partition, with only 1 partition using all the cylinders. Then you need to change the partition type to Linux raid autodetect, and remember to save the changes you made so fdisk writes the partition and partition type to the disk.

Repeat this procedure for the other disk, sdc. The partition information of sdc should be identical to sdb above. Note, you can use fdisk -l /dev/sdb and fdisk -l /dev/sdc to verify the disks were changed accordingly.
root@localhost:~# fdisk /dev/sdc

Welcome to fdisk (util-linux 2.25.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): p
Disk /dev/sdc: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x00019748

Device Boot Start End Sectors Size Id Type
/dev/sdc1 2048 976773119 976771072 465.8G fd Linux raid autodetect

If you do not have mdadm installed, you should install it now; it is as easy as apt-get install mdadm. mdadm is a Linux utility used to manage software RAID devices.

After mdadm is installed, it is time to add those two partitions into an mdadm array. To do that, issue the following command.
# mdadm --create /dev/md0 --level=mirror --raid-devices=2 /dev/sdb1 /dev/sdc1

The above command should return immediately, and now you can format the new block device using this command.
# mkfs.ext4 /dev/md0

By now the array will be formatted with an ext4 filesystem, and you can check the raid synchronisation progress using the command cat /proc/mdstat. You can also check the raid details using the command mdadm --detail /dev/md0.
root@localhost:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Fri Dec 12 03:54:49 2014
Raid Level : raid1
Array Size : 488254464 (465.64 GiB 499.97 GB)
Used Dev Size : 488254464 (465.64 GiB 499.97 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Thu Jan 8 21:42:16 2015
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Name :
UUID : be4c04c4:349da5d9:cbcd7313:7ec7cf60
Events : 26492

Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1

When the array has finished syncing, you should be able to see output like the following.
root@localhost:~# cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdc1[1] sdb1[0]
488254464 blocks super 1.2 [2/2] [UU]
bitmap: 4/4 pages [16KB], 65536KB chunk

unused devices: <none>

Note the [UU]; if the raid is degraded, for example due to a disk failure, you will see [_U] or [U_] depending on which disk is failing.

The last step is to mount this new device on a mount point so we can start to use it. The example below creates a mount point at /mnt/myBackup and mounts md0 there.
root@localhost:~# mkdir /mnt/myBackup
root@localhost:~# mount /dev/md0 /mnt/myBackup

To make this change survive a reboot, you should add an entry to /etc/fstab.
/dev/md0 /mnt/myBackup ext4 defaults 1 2

You should also save the raid configuration into the mdadm configuration file. The following command does just that.
root@localhost:~# mdadm --detail --scan > /etc/mdadm/mdadm.conf

That's it. I hope your data is safe from now on.

Friday, January 30, 2015

Initial study of docker

Docker has been making a lot of fuss lately, so today we are going to look into Docker. Let's start with something basic: what actually is docker? According to the definition from the official site,
Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

and explanation from wikipedia
Docker is an open-source project that automates the deployment of applications inside software containers, by providing an additional layer of abstraction and automation of operating system–level virtualization on Linux.[2] Docker uses resource isolation features of the Linux kernel such as cgroups and kernel namespaces to allow independent "containers" to run within a single Linux instance, avoiding the overhead of starting virtual machines.[3]

Okay, that's the theory. If you want to quickly get an idea of how docker works, you can try it here!

For people who have run virtual machine environments before, it may seem like, hey, isn't this very similar to a virtual machine? But they are not really the same. See the software stack comparison below, virtual machines versus docker.

[Figure: docker versus virtual machine software stacks]

Next, we will install docker locally; the illustration below uses debian sid. If you run another linux distribution, you should read this page. First we will install docker and then start bash in an ubuntu container. Note that pulling the ubuntu image down may take some time, depending on your internet speed.
root@localhost:~# apt-get install docker.io
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
aufs-tools cgroupfs-mount libnih-dbus1 libnih1 makedev mountall plymouth
Suggested packages:
btrfs-tools debootstrap lxc rinse plymouth-themes
The following NEW packages will be installed:
aufs-tools cgroupfs-mount docker.io libnih-dbus1 libnih1 makedev mountall plymouth
0 upgraded, 8 newly installed, 0 to remove and 557 not upgraded.
Need to get 4,360 kB of archives.
After this operation, 21.6 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://cdn.debian.net/debian/ unstable/main makedev all 2.3.1-93 [42.6 kB]
Get:2 http://cdn.debian.net/debian/ unstable/main plymouth amd64 0.9.0-9 [189 kB]
Get:3 http://cdn.debian.net/debian/ unstable/main libnih1 amd64 1.0.3-4.3 [127 kB]
Get:4 http://cdn.debian.net/debian/ unstable/main libnih-dbus1 amd64 1.0.3-4.3 [97.1 kB]
Get:5 http://cdn.debian.net/debian/ unstable/main mountall amd64 2.54 [68.3 kB]
Get:6 http://cdn.debian.net/debian/ unstable/main aufs-tools amd64 1:3.2+20130722-1.1 [92.9 kB]
Get:7 http://cdn.debian.net/debian/ unstable/main cgroupfs-mount all 1.1 [4,572 B]
Get:8 http://cdn.debian.net/debian/ unstable/main docker.io amd64 1.3.3~dfsg1-2 [3,739 kB]
Fetched 4,360 kB in 44s (97.8 kB/s)
Selecting previously unselected package makedev.
(Reading database ... 324961 files and directories currently installed.)
Preparing to unpack .../makedev_2.3.1-93_all.deb ...
Unpacking makedev (2.3.1-93) ...
Selecting previously unselected package plymouth.
Preparing to unpack .../plymouth_0.9.0-9_amd64.deb ...
Unpacking plymouth (0.9.0-9) ...
Selecting previously unselected package libnih1.
Preparing to unpack .../libnih1_1.0.3-4.3_amd64.deb ...
Unpacking libnih1 (1.0.3-4.3) ...
Selecting previously unselected package libnih-dbus1.
Preparing to unpack .../libnih-dbus1_1.0.3-4.3_amd64.deb ...
Unpacking libnih-dbus1 (1.0.3-4.3) ...
Selecting previously unselected package mountall.
Preparing to unpack .../mountall_2.54_amd64.deb ...
Unpacking mountall (2.54) ...
Selecting previously unselected package aufs-tools.
Preparing to unpack .../aufs-tools_1%3a3.2+20130722-1.1_amd64.deb ...
Unpacking aufs-tools (1:3.2+20130722-1.1) ...
Selecting previously unselected package cgroupfs-mount.
Preparing to unpack .../cgroupfs-mount_1.1_all.deb ...
Unpacking cgroupfs-mount (1.1) ...
Selecting previously unselected package docker.io.
Preparing to unpack .../docker.io_1.3.3~dfsg1-2_amd64.deb ...
Unpacking docker.io (1.3.3~dfsg1-2) ...
Processing triggers for man-db (2.7.0.2-5) ...
Processing triggers for dbus (1.8.12-3) ...
Setting up makedev (2.3.1-93) ...
/run/udev or .udevdb or .udev presence implies active udev. Aborting MAKEDEV invocation.
/run/udev or .udevdb or .udev presence implies active udev. Aborting MAKEDEV invocation.
/run/udev or .udevdb or .udev presence implies active udev. Aborting MAKEDEV invocation.
Setting up plymouth (0.9.0-9) ...
update-initramfs: deferring update (trigger activated)
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
Setting up libnih1 (1.0.3-4.3) ...
Setting up libnih-dbus1 (1.0.3-4.3) ...
Setting up mountall (2.54) ...
Setting up aufs-tools (1:3.2+20130722-1.1) ...
Setting up docker.io (1.3.3~dfsg1-2) ...
Adding group `docker' (GID 139) ...
Done.
Processing triggers for dbus (1.8.12-3) ...
Setting up cgroupfs-mount (1.1) ...
Processing triggers for initramfs-tools (0.117) ...
update-initramfs: Generating /boot/initrd.img-3.9-1-amd64
W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
W: mdadm: no arrays defined in configuration file.
Processing triggers for libc-bin (2.19-13) ...
root@localhost:~#

jason@localhost:~$ docker run -i -t ubuntu /bin/bash
2015/01/08 16:27:07 Post http:///var/run/docker.sock/v1.15/containers/create: dial unix /var/run/docker.sock: permission denied
jason@localhost:~$ sudo docker run -i -t ubuntu /bin/bash
Unable to find image 'ubuntu' locally
Pulling repository ubuntu
8eaa4ff06b53: Download complete
511136ea3c5a: Download complete
3b363fd9d7da: Download complete
607c5d1cca71: Download complete
f62feddc05dc: Download complete
Status: Downloaded newer image for ubuntu:latest
root@bedef9a17ac3:/# cat /etc/issue
Ubuntu 14.04.1 LTS \n \l

root@bedef9a17ac3:/# exit
jason@localhost:~$ sudo docker run ubuntu /bin/echo "hello world"
hello world

One would ask, why should I replace virtualbox with docker? There are four main points, as outlined in this article:

  • Faster delivery of your applications

  • Deploy and scale more easily

  • Get higher density and run more workloads

  • Faster deployment makes for easier management


If you find the above points attractive, perhaps you should consider it, and I leave these additional materials for your further exploration.

  • docker 101 video presentation

  • remember to sign up

  • get the image from docker hub

  • last but not least, documentation

Sunday, January 18, 2015

how to improve apache cassandra 1.0.8 read speed

This article is about improving read speed for apache cassandra 1.0.8. Because read performance is determined by many factors, we will investigate all the relevant areas so that the gains add up collectively. You may experiment with these factors and alter them to suit your node environment to achieve the best result. When cassandra 1.0 was released, the official announcement cited read performance improvements of up to 400%!

First and foremost, several of the articles I use as references are copyrighted; I take no ownership of nor credit for their hard work, as that rightfully belongs to them entirely. I only reference their work to improve my knowledge and to help people (like me) who need help and come to read what I share in this article.

Let's split these improvements into two parts, the hardware and the software.

hardware

ssd

An ssd disk is faster than an hdd disk for reads by multiple orders of magnitude; please read cassandra-benchmark for the benchmark. Although the cassandra used there was version 0.8.10, cassandra 1.0 gained tremendous read improvements on top of that, so the two gains should combine. It is also recommended to read the aforementioned article, as it explains why random access hurts read performance on hdd disks.

multiple disks

The disk allocated to the commit log should be different from the one holding the data directory, because during writes data is repeatedly appended to the commit log. If the data directory is located on a different drive, a read performance gain should be visible.

 

software

memtable
If the write behaviour involves a lot of updates, it is good to look into the memtable settings. There are two settings you can start with:

  • memtable_total_space_in_mb

  • commitlog_total_space_in_mb


Allocating more memory to these settings means frequent writes (updates) will be absorbed in memory, and thus reads will be fast too, as a read checks the memtable first before going to the sstables. But because this has a system-wide impact, you might want to increase it gradually and measure the effect. Read the links below for more information on what these two settings are and how to tune them:
http://www.datastax.com/docs/1.0/operations/tuning#memtable-sizing
http://www.datastax.com/docs/1.0/configuration/node_configuration#memtable-total-space-in-mb
http://www.slideshare.net/driftx/cassandra-summit-2010-performance-tuning slide 14 and slide 26
http://wiki.apache.org/cassandra/MemtableThresholds

WP-DataStax-Cassandra page 16
Specifically, for read performance, Cassandra 1.0 optimizes queries by using a lighter-weight data structure for representing a row fragment from a read, than for a row fragment in a memtable into which updates accumulates. Also, with named reads, Cassandra 1.0 includes enhancements for deserializing the most recent versions of requested columns. Combined with the other optimizations, this makes reads in Cassandra as fast as writes for many workloads.

data compression

Previously I did a study on how compression improves reads; read here, here, here and here. Please read those links, as they provide a more comprehensive explanation than I could give here.

compaction

Compaction can improve (or hurt) read speed. Citation from WP-DataStax-Cassandra:
The above process produces exceptionally fast write operations; however it also can lead to data fragmentation across the disk. Read requests may have to combine data from many SStables as well as Memtables to satisfy end user requests for data, and this can increase query response times.

To reduce data fragmentation and reclaim space taken by obsolete data, Cassandra performs "compactions" that merge the most recent data from many different SStables on disk into a new one.

In my experience, if you trigger a major compaction through nodetool, read latency will increase while the compaction is running, but once the compaction is done, read performance is improved.

This documentation explains the different compaction strategies to use for read- or write-heavy workloads. So identify your environment's write and read patterns, and always measure so you know what could go wrong and when. Choose the compaction strategy that suits your data model; and if cassandra is simply not strong at something, choose another big data technology. Read here for a bad experience that was encountered.

sstables counts

Keep the sstable count as low as possible for a column family. Excerpt from FULLTEXT02, page 39:
If a read operation is performed, initially the data are read from the memtable. If data are not in the memtable, then data get read from SSTable. Multiple SSTables may be looked up to find the data. Reading directly from SSTable decreases the performance because there are many SSTable that might need to be looked at hence requires an I/O operation means it requires touching the disk. Compared to SSTable, reading directly from memtable is fast because there is no I/O involved. The more the I/O operations are involved, the more performance will be degraded. Performance can also be increased by increasing the size of memtable [7].

Cassandra uses Bloom filter to judge quickly whether the key exists in the SSTable or not before touching the disk. Bloom filter is a efficient data structure that checks whether element is a member of a set by dividing the memory into buckets. Check each bucket to see if a key is present and if any bucket is empty then key was never inserted before. If there are many SSTables, then lots of I/O operations would be needed to read the data which can definitely decrease the performance. This is because of the fact that I/O operations are expensive and therefore compaction is used to improve read performance. Compaction merge two SSTables and sort to become one SSTable, which eventually decreases the number of SSTables and number of I/O operations, hence increasing the performance [7].

key / row cache

The key cache should be enabled to reduce searches that touch the disk, especially on spinning disks. Excerpt from FULLTEXT02, page 47:
By default key cache is enabled and Cassandra caches 20,000 keys per Column Family (CF). The key cache decreases the Input/Output (I/O) operations because if key cache is not enabled then I/O operation is required in order to figure out the exact location of the row. Key cache holds the exact location of the data belonging to that key.

 

The row cache holds the entire content of a row in cache. By default, the row cache is disabled. The overhead of enabling or increasing the row cache is that it may require more of Cassandra's Java Virtual Machine (JVM) heap. But if the jna lib is available, storing the row cache off heap is a good option. This article has a diagram of how a read is performed.

concurrent_read

Excerpt from FULLTEXT02 page 48.
Read performance can also be increased by tuning the concurrent reads. The rule is span 4 threads per Central Processing Units (CPU) core in the cluster. The higher the number of threads spanned for read, the higher performance can be achieved if the machines have got faster I/O.

A word of caution: I tried increasing concurrent_read from 32 to 64 and saw some unpredictable behaviour, so it is better to try this in a test environment first.

decreasing read consistency level

If your business requirements can tolerate eventual consistency, then decreasing the read consistency level from quorum to one will improve read speed, as a single node's acknowledgement is sufficient to fulfill the read request, compared to the several nodes required for quorum.
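
As an illustration, here is a rough sketch of a read at consistency level ONE using the raw Thrift client that ships with cassandra. The host, keyspace, column family and key names are made up, and if you use a higher level client such as Hector, it has its own way of setting the consistency level, so treat this only as a sketch.

import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class ReadAtOne {

    public static void main(String[] args) throws Exception {
        // Hypothetical host, keyspace and column family names, for illustration only.
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("myKeyspace");

        ColumnPath path = new ColumnPath("myColumnFamily");
        path.setColumn("myColumn".getBytes("UTF-8"));

        // ConsistencyLevel.ONE: a single replica's answer satisfies the read,
        // instead of waiting for a quorum of replicas.
        ColumnOrSuperColumn result = client.get(ByteBuffer.wrap("myKey".getBytes("UTF-8")),
                                                path,
                                                ConsistencyLevel.ONE);
        System.out.println(new String(result.getColumn().getValue(), "UTF-8"));

        transport.close();
    }
}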

turn off swap space

When a node starts to swap due to a memory shortage, the degradation in the node's responses, be it writes or reads, will be visible. Hence it is best to turn off swap and let the operating system or the jvm kill the process on out-of-memory, rather than letting page swapping start to happen.

java heap

Citation from OS-8.1.3-Cassandra Installation and Configuration Guide page 33
HEAP_NEWSIZE : Size of young generation. Larger value leads to longer GC pause times while smaller value will typically lead to more expensive GC. Set in conjunction with MAX_HEAP_SIZE.

So tune it carefully, since this is pretty low level. Read this article, as it mentions a few garbage collector settings for cassandra and their memory footprint.

upgrade

Each software release improves on or fixes previous defects, and so does cassandra. If an upgrade is viable, you should consider it. For instance, to quote Aaron Morton:
1.0 has key and row caches defined per CF, 1.1 has global ones which are better utilised and easier to manage. 1.2 moves bloom filters and compression meta off heap which reduces GC, which will help.  Things normally get faster.

This is also true.

monitoring

Because increases and/or decreases in data load will impact read response time, it is vital to have monitoring services running. As cited from this paper, p1724_tilmannrabl_vldb2012, page 1:
In modern enterprise systems it is not uncommon to have thousands of different metrics that are reported from a single host machine.

So monitoring crucial cassandra metrics, for example cpu, java heap and io, should give some indication of whether your read speed has degraded.

This is knowledge collected from publicly and freely shared sources. Any mistakes and errors in this article are mine alone and do not reflect on those sources. Thank you, and I hope you learned something.

Saturday, January 17, 2015

Investigate into Apache Cassandra Memtable updateLiveRatio logging output

Today, we are going to study how apache cassandra 1.0.8 logs memtable statistics in the cassandra system.log. Example below:
WARN [MemoryMeter:1] 2014-10-17 07:38:15,346 Memtable.java (line 176) setting live ratio to minimum of 1.0 instead of 0.16714977001091136
INFO [MemoryMeter:1] 2014-10-17 07:38:15,346 Memtable.java (line 186) CFS(Keyspace='OpsCenter', ColumnFamily='pdps') liveRatio is 1.6474114033116056 (just-counted was 1.0). calculation took 3ms for 595 columns

What does the above logging output mean? Is the cassandra instance operating normally, given that the logging level is WARN? Before we dive into the code to understand its meaning, let's first read and understand what a memtable is in the cassandra context.

Excerpt from the DataStax documentation:

Cassandra is optimized for write throughput. Cassandra writes are first written to a commit log (for durability), and then to an in-memory table structure called a memtable. A write is successful once it is written to the commit log and memory, so there is very minimal disk I/O at the time of write. Writes are batched in memory and periodically written to disk to a persistent table structure called an SSTable (sorted string table). Memtables and SSTables are maintained per column family. Memtables are organized in sorted order by row key and flushed to SSTables sequentially (no random seeking as in relational databases).

Whenever the method updateLiveRatio() from the class Memtable is called, the following code applies.
public void updateLiveRatio()
{
    if (!MemoryMeter.isInitialized())
    {
        // hack for openjdk. we log a warning about this in the startup script too.
        logger.warn("MemoryMeter uninitialized (jamm not specified as java agent); assuming liveRatio of 10.0. Usually this means cassandra-env.sh disabled jamm because you are using a buggy JRE; upgrade to the Sun JRE instead");
        cfs.liveRatio = 10.0;
        return;
    }

    Runnable runnable = new Runnable()
    {
        public void run()
        {
            activelyMeasuring = Memtable.this;

            long start = System.currentTimeMillis();
            // ConcurrentSkipListMap has cycles, so measureDeep will have to track a reference to EACH object it visits.
            // So to reduce the memory overhead of doing a measurement, we break it up to row-at-a-time.
            long deepSize = meter.measure(columnFamilies);
            int objects = 0;
            for (Map.Entry<DecoratedKey, ColumnFamily> entry : columnFamilies.entrySet())
            {
                deepSize += meter.measureDeep(entry.getKey()) + meter.measureDeep(entry.getValue());
                objects += entry.getValue().getColumnCount();
            }
            double newRatio = (double) deepSize / currentThroughput.get();

            if (newRatio < MIN_SANE_LIVE_RATIO)
            {
                logger.warn("setting live ratio to minimum of 1.0 instead of {}", newRatio);
                newRatio = MIN_SANE_LIVE_RATIO;
            }
            if (newRatio > MAX_SANE_LIVE_RATIO)
            {
                logger.warn("setting live ratio to maximum of 64 instead of {}", newRatio);
                newRatio = MAX_SANE_LIVE_RATIO;
            }
            cfs.liveRatio = Math.max(cfs.liveRatio, newRatio);

            logger.info("{} liveRatio is {} (just-counted was {}). calculation took {}ms for {} columns",
                        new Object[]{ cfs, cfs.liveRatio, newRatio, System.currentTimeMillis() - start, objects });
            activelyMeasuring = null;
        }
    };

    try
    {
        meterExecutor.submit(runnable);
    }
    catch (RejectedExecutionException e)
    {
        logger.debug("Meter thread is busy; skipping liveRatio update for {}", cfs);
    }
}

A new Runnable is submitted to the executor meterExecutor. If this executor is not busy, it starts to measure the column family associated with this memtable. Two essential measurements are collected: the deep size of the memtable (the column family map plus each key and each value) and the column count. deepSize is the sum of the measured sizes of the column family map, its keys and its values. A new variable newRatio is then calculated as deepSize divided by currentThroughput.

The valid newRatio range is between 1.0 and 64.0 inclusive. When the calculated newRatio is less than 1.0, the first line of log output shown above starts to appear in the cassandra system log and newRatio is reset to 1.0. The same check applies when newRatio exceeds 64.0: it is logged and newRatio is reset to the maximum valid value of 64.0. Then the column family live ratio is updated to whichever is higher, the current value or the newly calculated newRatio.

So technically, this is nothing to really be concerned about. It is a measurement of how much heap the memtable occupies relative to the serialized bytes written into it. Also, given the expected throughput, an estimate of the memtable's size in memory can be calculated from the live ratio. As a side note, this method updateLiveRatio() is called after a mutation has been applied to the memtable but before a flush is requested.
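In other words, the live ratio is just heap bytes per serialized byte, so you can use it to estimate how much heap a memtable will occupy for a given throughput. A tiny sketch of that arithmetic, with made-up numbers:

public class LiveRatioSketch
{
    public static void main(String[] args)
    {
        // values normally measured by MemoryMeter and the memtable, made up here for illustration
        long deepSize = 64L * 1024 * 1024;           // bytes the memtable occupies on heap
        long currentThroughput = 16L * 1024 * 1024;  // serialized bytes written to the memtable

        double newRatio = (double) deepSize / currentThroughput;     // 4.0
        double liveRatio = Math.min(64.0, Math.max(1.0, newRatio));  // clamped to [1.0, 64.0]

        // estimated heap usage for the next, say, 32MB of serialized writes
        long nextThroughput = 32L * 1024 * 1024;
        double estimatedHeap = nextThroughput * liveRatio;
        System.out.println("liveRatio = " + liveRatio + ", estimated heap = " + estimatedHeap + " bytes");
    }
}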

That's it for today, I hope you learned something.

Friday, January 16, 2015

operate cassandra using jmx in terminal including changing pool size, compacting sstables and altering key cache

If you operate an apache cassandra cluster and the load per node becomes huge (nodetool info showing something like 800GB), compaction becomes a problem. It is a big problem for apache cassandra 1.0.8 if your average load per node hovers around 600GB to 1TB. Read performance suffers and at times the system load goes high. In some instances, I noticed that when repair is running, the system load goes above 20. It is not a concern if the cluster is otherwise operating well, but the more often you see this, the more likely something has gone wrong. Today, I will share my experience of how to operate cassandra when the node load is huge and the cassandra instance is still running. Often there are useful methods exposed via JMX, but to operate remotely, a JMX GUI client such as jconsole is not ideal. Instead, we will use jmxterm for these operations in apache cassandra 1.0.8. So let's get started.

Changing pool size

So, it is pretty simple: launch jmxterm, open a connection, set the bean, and then set CorePoolSize. The steps are illustrated below.
$ java -jar jmxterm-1.0-alpha-4-uber.jar
$>open localhost:7199
#Connection to localhost:7199 is opened
$>bean org.apache.cassandra.request:type=ReplicateOnWriteStage
#bean is set to org.apache.cassandra.request:type=ReplicateOnWriteStage
$>get CorePoolSize
#mbean = org.apache.cassandra.request:type=ReplicateOnWriteStage:
CorePoolSize = 32;
$>info
#mbean = org.apache.cassandra.request:type=ReplicateOnWriteStage
#class name = org.apache.cassandra.concurrent.JMXConfigurableThreadPoolExecutor
# attributes
%0 - ActiveCount (int, r)
%1 - CompletedTasks (long, r)
%2 - CorePoolSize (int, rw)
%3 - CurrentlyBlockedTasks (int, r)
%4 - PendingTasks (long, r)
%5 - TotalBlockedTasks (int, r)
#there's no operations
#there's no notifications
$>set CorePoolSize 64
$>get CorePoolSize
#mbean = org.apache.cassandra.request:type=ReplicateOnWriteStage:
CorePoolSize = 64;
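If you prefer to script this instead of typing into jmxterm, the same change can be made with the standard javax.management API. A minimal sketch, assuming the default JMX port 7199 and no authentication:

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SetCorePoolSize
{
    public static void main(String[] args) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url, null);
        MBeanServerConnection mbs = connector.getMBeanServerConnection();

        ObjectName stage = new ObjectName("org.apache.cassandra.request:type=ReplicateOnWriteStage");
        System.out.println("before: " + mbs.getAttribute(stage, "CorePoolSize"));

        // same effect as "set CorePoolSize 64" in jmxterm
        mbs.setAttribute(stage, new Attribute("CorePoolSize", 64));
        System.out.println("after : " + mbs.getAttribute(stage, "CorePoolSize"));

        connector.close();
    }
}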

Alter key cache

Often, when there is heap pressure in the JVM, the cache safety valve kicks in and reduces the cache capacity. You can restart the cassandra instance, or you can set the key cache capacity back to its initial value. Assuming your column family is named FooBar and the keyspace is just4fun, the following steps illustrate how this is done.
$>bean org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches
#bean is set to org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches
$>info
#mbean = org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches
#class name = org.apache.cassandra.cache.AutoSavingKeyCache
# attributes
%0 - Capacity (int, rw)
%1 - Hits (long, r)
%2 - RecentHitRate (double, r)
%3 - Requests (long, r)
%4 - Size (int, r)
#there's no operations
#there's no notifications
$>
$>get Size
#mbean = org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches:
Size = 122307;

$>set Capacity 250000
#Value of attribute Capacity is set to 250000
$>get Capacity;
#mbean = org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches:
$>get Capacity
#mbean = org.apache.cassandra.db:cache=FooBarKeyCache,keyspace=just4fun,type=Caches:
Capacity = 250000;

Compact sstable

Lastly, to compact sstables. It is amazing that we have an sstable as huge as 84GB! Triggering a major compaction is not an option here; often when the load per node goes beyond 600GB, compaction takes forever, as GC kicks in and the CPU keeps collecting heap, driving the system load high. So here we will select one huge sstable and compact only that. You can also select a few sstables and compact them together, separated by commas.
$>bean org.apache.cassandra.db:type=CompactionManager
#bean is set to org.apache.cassandra.db:type=CompactionManager
$>run forceUserDefinedCompaction just4fun FooBar-hc-5-Index.db
#calling operation forceUserDefinedCompaction of mbean org.apache.cassandra.db:type=CompactionManager
#RuntimeMBeanException: java.lang.IllegalArgumentException: FooBar-hc-5-Index.db does not appear to be a data file
$>run forceUserDefinedCompaction just4fun FooBar-hc-401-Data.db
#calling operation forceUserDefinedCompaction of mbean org.apache.cassandra.db:type=CompactionManager
#operation returns:
null
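For completeness, the same user defined compaction can also be triggered programmatically via JMX. A minimal sketch, assuming the default JMX port 7199 and the same hypothetical keyspace and data file name as above (the operation's exact parameter list may differ between cassandra versions):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceCompaction
{
    public static void main(String[] args) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url, null);
        MBeanServerConnection mbs = connector.getMBeanServerConnection();

        ObjectName compactionManager = new ObjectName("org.apache.cassandra.db:type=CompactionManager");

        // same as "run forceUserDefinedCompaction just4fun FooBar-hc-401-Data.db" in jmxterm
        mbs.invoke(compactionManager,
                   "forceUserDefinedCompaction",
                   new Object[] { "just4fun", "FooBar-hc-401-Data.db" },
                   new String[] { String.class.getName(), String.class.getName() });

        connector.close();
    }
}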

Either way, the compaction should start; you can verify it in the cassandra system log or with nodetool compactionstats. So that's it, I hope you learned something.

Saturday, January 3, 2015

apache cassandra 1.0.8 IncompatibleClassChangeError vtable stub and AssertionError Added column does not sort as the last column

Today we will spend some time looking into two errors and see if they are really something to be concerned about. The errors are thrown while apache cassandra version 1.0.8 is running. Okay, let's go to the first error.
ERROR [ReadStage:1559] 2012-10-16 20:38:25,336 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[ReadStage:1559,5,main]
java.lang.IncompatibleClassChangeError: vtable stub
at org.apache.cassandra.db.AbstractColumnContainer.getColumn(AbstractColumnContainer.java:134)
at org.apache.cassandra.db.Memtable$6.computeNext(Memtable.java:402)
at org.apache.cassandra.db.Memtable$6.computeNext(Memtable.java:384)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:93)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1298)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1184)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1151)
at org.apache.cassandra.db.Table.getRow(Table.java:375)
at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:58)
at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:765)
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1224)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

I guess this is a really fatal error. Let's check what IncompatibleClassChangeError means. From the IncompatibleClassChangeError javadoc:

Thrown when an incompatible class change has occurred to some class definition. The definition of some class, on which the currently executing method depends, has since changed.

Honestly, this is really odd; this error was thrown out of nowhere and the JVM stopped. I assume nobody changed the compiled apache cassandra code while the node instance was running. It certainly sounds odd, but I guess when the data held per node is huge, strange things start to happen. Now, let's get into the stack trace and follow its path.

The bottom three stack frames are pretty obvious: a task was executed by the thread pool executor. Then we have the class StorageProxy. Within this class there is a static class LocalReadRunnable, which implements the abstract method runMayThrow(). It seems it is trying to read a row from a table on the local node. The table eventually calls into the column family store to retrieve columns. Tracing even deeper, line 134 of the class AbstractColumnContainer reveals that no exception is thrown from there. This is like a mystery! :) The stack trace analysis and the observed JVM stop show that something is wrong. I am not sure what went wrong, but if you have any idea, please discuss it in a comment below.

Next we look at another error.
ERROR [CompactionExecutor:4] 2014-10-22 06:13:00,884 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[CompactionExecutor:4,1,main]
java.lang.AssertionError: Added column does not sort as the last column
at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:126)
at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:129)
at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:124)
at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:148)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:232)
at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:110)
at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:97)
at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:137)
at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:102)
at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:87)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:118)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:101)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:173)
at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:135)
at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:115)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Again, this error popped out of nowhere; during node operation, this exception was simply thrown. So now we will look into what happened using the stack trace given. Similar to the above, the bottom few stack frames are obvious: a task was executed by the thread pool executor. It is a compaction thread, and while compacting sstables, things break. Rows get reduced and columns get deserialized, and when the columns are rebuilt into a column family, an exception is thrown because a column does not sort after the last column. This is another strange phenomenon: the columns should already have been sorted, but they are not. The javadoc for the method addColumn:

AddColumn throws an exception if the column added does not sort after the last column in the map. The reasoning is that this implementation can get slower if too much insertions are done in unsorted order and right now we only use it when *all* insertion (with this method) are done in sorted order. The assertion throwing is thus a protection against performance regression without knowing about (we can revisit that decision later if we have use cases where most insert are in sorted order but a few are not).

It seems this should be reproducible in a development environment by inserting a few columns out of sorted order. I guess this is not as fatal as the previous error. That's it for this analysis; please share your inputs or comments if you have a workaround.
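To make the assertion concrete, here is a simplified, hypothetical illustration of the kind of check addColumn performs. This is not Cassandra's actual code, just the idea of appending to an array-backed sorted collection and failing if the new element does not sort after the last one:

import java.util.ArrayList;
import java.util.List;

public class ArrayBackedSortedNames
{
    private final List<String> names = new ArrayList<String>();

    public void addName(String name)
    {
        // appending is cheap only if callers insert in sorted order,
        // so an out-of-order append is treated as a bug
        if (!names.isEmpty() && names.get(names.size() - 1).compareTo(name) >= 0)
            throw new AssertionError("Added column does not sort as the last column");
        names.add(name);
    }

    public static void main(String[] args)
    {
        ArrayBackedSortedNames columns = new ArrayBackedSortedNames();
        columns.addName("a");
        columns.addName("b");
        columns.addName("a"); // throws AssertionError, like the compaction error above
    }
}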

UPDATE:

It seems that the second error was fixed in https://issues.apache.org/jira/browse/CASSANDRA-5856

Friday, January 2, 2015

Android first application hello world

First Off, Happy New Year Everybody!

Perhaps there are tens or even hundreds of first-application tutorials for android if you google. Well, today I will share mine too; hey, at the start of the year a hello world would be nice. :) If you have Java programming skills and use the eclipse IDE, then you have come to the right place. In this article, I will use the eclipse IDE to develop a hello world android application. I figured it would be a painless few steps of setup, but it took me several days to get everything going due to my hectic life and problems encountered during this learning journey. I hope that with these explanations, you will have a better and easier learning experience than I did.

This article assumes you have read and set up the Android Developer Toolkit plugin in the eclipse IDE and have sped up the android emulator startup in eclipse. If you have not, please refer to this link and this link respectively. Okay, let's get started; I will show the project creation steps with screenshots. I will document and describe my learning experience below, but if you want to read further on certain topics, please refer to the official documentation at this link.

1. Create a new android application project. If it is not listed in the File menu, try Other... ;)

eclipse-android-new-project

eclipse-android-application-project

2. Click the Next button; a window appears to configure the new android application project. Fill in the information as shown for this introductory course.

eclipse-android-new-application

3. Click the Next button; yet another window to configure the project. The default settings will work fine for this introductory course.

eclipse-android-new-application-configure

4. Click the Next button; this is the setting for the application icons. I'm not a colour person, but the default design works like a charm for me. :-) If you want to take on the challenge, try changing the default icon.

eclipse-android-new-application-configure-icon

5. Click the Next button; the default Blank Activity is just fine to display the hello world text later.

eclipse-android-new-application-create-activity

6. Click the Next button; again, the default configuration works just fine. This is the last configuration step.

eclipse-android-new-application-create-activity-1

7. Click the Finish button and now check your workspace; it should have created all the necessary files.

I remember that the first time I set this up, the tutorial never mentioned installing the android support libraries, so once the Finish button was clicked, the project was created with errors. This is a big learning curve for someone who is new to android, let alone new to java or even the IDE. See the screenshot below.

eclipse-android-androidManifest-missing

For your information, this project requires a support library which is available in the Android SDK Manager. Okay, now let's install the android support library. On the eclipse menu, click on Window and then Android SDK Manager. Find the Android Support Library and install it. See the screenshots below; they should give you sufficient information to get this done.

eclipse-android-sdk-manager

 

eclipse-android-sdk-manager-accept-licence eclipse-android-sdk-manager-installing-support-package eclipse-android-sdk-manager-done-support-package

By now, eclipse will request that you restart, so restart your eclipse. Once restarted, you should see a new project known as appcompat_v7 in the Package Explorer window. If your project still contains errors, check whether the project is built automatically: on the eclipse menu, click Project and make sure Build Automatically is checked. The project should then contain no errors; if there are still errors, click on the Problems tab in eclipse and identify what the errors are. Because the errors depend on your platform and the versions of eclipse and/or the android sdk, if you have problems, google them and fix them; this is left as an exercise. If everything is okay, your project should look similar to mine, as shown below.

appcompat_v7_myfirstapp eclipse-hello-world

Okay, now check the project properties to make sure everything is okay.

eclipse-android-java-build-path eclipse-android-configuration

As you can see above, I have also installed the Android 4.4W.2 sdk for my learning journey, but I have enabled Android 5.0 for this project. The referenced project appcompat_v7 is valid and exists. Both android projects are free from any errors. So far so good.

Note, there is a directory known as gen, and an educated guess says it holds automatically generated files. As such, you should not modify the files within this gen directory; leave them to be generated automatically. Please note that R.java must be auto-generated for this project to compile successfully for our next step, and if you do not see this file, you should be worried and find out what the problem is and why R.java never got generated. Hint: don't try to add an invalid R.java from somewhere else, but fix the root cause here.

Just before we launch the application in the android emulator, let's look at some of the android files here.

eclipse-android-style-xml eclipse-android-androidManifest
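For reference, the activity the wizard generates in a project like this looks roughly like the following; the package name is a placeholder and your generated class may differ slightly depending on the ADT and appcompat versions.

package com.example.myfirstapp; // placeholder package name

import android.os.Bundle;
import android.support.v7.app.ActionBarActivity; // comes from the appcompat_v7 library project

public class MainActivity extends ActionBarActivity
{
    @Override
    protected void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        // activity_main.xml contains a TextView with the "Hello world!" string
        setContentView(R.layout.activity_main);
    }
}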

So all is looking good; it should be ready now to launch this hello world android application in the android emulator. Just like launching a simple java app, the sequence is pretty much the same for android. Right-click on the project, then click on Run As / Debug As. Since this is your first android application, you might not yet have configured an Android Virtual Device (avd), so you should configure one now. If you have done the previous article, Speed up android emulator startup in eclipse, it should be a breeze here: just select that avd. If not, you can follow the procedure using the screenshots below.

eclipse-debug-as-android-application eclipse-android-android-avd-error eclipse-android-android-device-chooser eclipse-android-avd-configuration eclipse-android-avd-configured-vd eclipse-android-start-avd eclipse-android-starting-avd eclipse-android-launching-app

Because emulator startup is slow, I would suggest leaving the emulator running for as long as you are learning to develop the application; starting the emulator on my workstation takes minutes. As seen in the output below, I have configured two different avds, and one screenshot shows My First App appearing in the android menu. Cool!

eclipse-android-started-avd eclipse-android-avd-screen

Try clicking on My First App in the emulator; it should display hello world. :-) That's it for this learning experience. I hope you learned something, and if you want to contribute back, you can do so by donating.

Thank you.