Showing posts with label java. Show all posts

Friday, December 4, 2015

Learn java util concurrent part2

This is the second part of the learning series on java.util.concurrent. You should read the first part too. Today we will learn about ten interfaces in the java.util.concurrent package.

Okay, let's start with a queue interface, BlockingDeque. Some characteristics of this interface include:

  • blocking
  • thread safe
  • does not permit null elements
  • may (or may not) be capacity-constrained.

We can add items to and remove items from this deque. An example is below.

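// Assumption: a deque bounded at 3 elements, so that a fourth add() fails
BlockingDeque<Integer> bd = new LinkedBlockingDeque<>(3);
bd.add(1);
bd.add(2);
System.out.println("size: " + bd.size());

bd.add(3);
System.out.println("size: " + bd.size());
//bd.add(4); // throws IllegalStateException when the deque is full

bd.forEach(s -> System.out.println(s));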
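To get a feel for the other methods, here is a minimal sketch that contrasts add, which throws when a bounded deque is full, with offer, which returns false, and then removes from both ends. The class name DequeSketch and the capacity of 3 are assumptions for illustration only.

```java
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical demo class; the capacity of 3 is an assumption for illustration.
class DequeSketch {

    static String demo() {
        BlockingDeque<Integer> bd = new LinkedBlockingDeque<>(3);
        bd.add(1);                       // inserts at the tail
        bd.addFirst(0);                  // inserts at the head
        bd.addLast(2);                   // deque is now [0, 1, 2] and full

        boolean accepted = bd.offer(3);  // full: returns false instead of throwing
        boolean threw = false;
        try {
            bd.add(3);                   // full: add throws IllegalStateException
        } catch (IllegalStateException e) {
            threw = true;
        }

        int head = bd.pollFirst();       // removes 0 from the head
        int tail = bd.pollLast();        // removes 2 from the tail
        return accepted + " " + threw + " " + head + " " + tail + " " + bd;
    }

    public static void main(String[] args) {
        System.out.println(demo());      // prints: false true 0 2 [1]
    }
}
```

The blocking variants (put, take, putFirst, takeLast and so on) wait instead of failing, which is what makes the deque useful between threads.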

Try playing around with the different methods of this interface to get a basic understanding of it. Next, we have a similar queue interface called BlockingQueue. Its characteristics are the same as those of BlockingDeque; the difference is that BlockingDeque extends BlockingQueue and additionally supports inserting and removing elements at both ends. The JDK ships many classes that implement BlockingQueue.

BlockingQueue<Integer> bq = new ArrayBlockingQueue<Integer>(10);
bq = new LinkedBlockingDeque<Integer>();
bq = new LinkedTransferQueue<Integer>();
bq = new PriorityBlockingQueue<Integer>();
bq = new SynchronousQueue<Integer>();

// DelayQueue only accepts elements that implement Delayed, so it gets its own variable
BlockingQueue<Delayed> dq = new DelayQueue<Delayed>();
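The "blocking" part is easiest to see with two threads. The following is a minimal producer/consumer sketch, assuming a small ArrayBlockingQueue of capacity 2; the class name ProducerConsumerSketch is hypothetical. The producer blocks in put whenever the queue is full and the consumer blocks in take whenever it is empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class, not part of the original sources.
class ProducerConsumerSketch {

    static List<Integer> run() {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);
        List<Integer> received = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    queue.put(i);               // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        try {
            for (int i = 0; i < 5; i++) {
                received.add(queue.take());     // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run());              // prints: [1, 2, 3, 4, 5]
    }
}
```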

Next we have the Callable interface, which is similar to Runnable with one clear distinction: a Callable returns a value (and may throw a checked exception).

ExecutorService executorService = Executors.newFixedThreadPool(1);

Future<Integer> future = executorService.submit(new Summer(11,22));

try {
   Integer total = future.get();
   System.out.println("sum " + total);
} catch (Exception e) {
   e.printStackTrace();
}
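The Summer class itself is not shown in this post (it is in the linked repository). As a rough idea of what such a Callable might look like, here is a hypothetical stand-in named SummerSketch, assuming it simply adds its two constructor arguments.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for Summer; assumed to add its two arguments.
class SummerSketch implements Callable<Integer> {
    private final int a;
    private final int b;

    SummerSketch(int a, int b) {
        this.a = a;
        this.b = b;
    }

    @Override
    public Integer call() {          // the value returned here ends up in the Future
        return a + b;
    }

    static int sumViaExecutor(int a, int b) {
        ExecutorService executorService = Executors.newFixedThreadPool(1);
        try {
            Future<Integer> future = executorService.submit(new SummerSketch(a, b));
            return future.get();     // waits for call() to finish
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("sum " + sumViaExecutor(11, 22));   // prints: sum 33
    }
}
```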

As can be seen above, Summer is an implementation of the Callable interface, and it is submitted to an executor service to be executed. CompletableFuture.AsynchronousCompletionTask is an interesting interface; its official documentation says: "A marker interface identifying asynchronous tasks produced by async methods. This may be useful for monitoring, debugging, and tracking asynchronous activities."

CompletableFuture<Integer> cf = new CompletableFuture<Integer>();
ForkJoinPool.commonPool().submit(
      (Runnable & CompletableFuture.AsynchronousCompletionTask)()->{
   try {
      cf.complete(1);
   } catch (Exception e) {
      cf.completeExceptionally(e);
   }
});

As can be seen above, we submit a lambda to the ForkJoinPool, where the lambda is cast to the intersection of the Runnable and CompletableFuture.AsynchronousCompletionTask interfaces. Moving on, we have the CompletionService interface. If you have long-running tasks, you might want to look into this interface, because it decouples submitting tasks from consuming the results of the completed ones. An example is below.

CompletionService<Integer> longRunningCompletionService = new ExecutorCompletionService<Integer>(executorService);

longRunningCompletionService.submit(() -> {System.out.println("done"); return 1;});

try {
   Future<Integer> result = longRunningCompletionService.take();
   System.out.println(result.get());
} catch (Exception e) {
   e.printStackTrace();
}
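With a single task the benefit is hard to see. The following sketch submits two tasks of different length and shows that take hands back futures in completion order rather than submission order. The class name, task names and sleep times are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; the sleep times are arbitrary.
class CompletionOrderSketch {

    static List<String> run() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CompletionService<String> service = new ExecutorCompletionService<>(pool);
        List<String> order = new ArrayList<>();
        try {
            // submitted first, finishes last
            service.submit(() -> { Thread.sleep(500); return "slow"; });
            // submitted second, finishes first
            service.submit(() -> { Thread.sleep(50); return "fast"; });

            for (int i = 0; i < 2; i++) {
                // take() blocks until the next task has completed
                order.add(service.take().get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(run());   // expected with these timings: [fast, slow]
    }
}
```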

Keep the longRunningCompletionService object around throughout your application and the results can be retrieved later, as the tasks complete. Pretty handy. Moving on, we have a new interface, CompletionStage, which debuted in JDK 8. From the CompletionStage Javadoc: "A stage of a possibly asynchronous computation, that performs an action or computes a value when another CompletionStage completes. A stage completes upon termination of its computation, but this may in turn trigger other dependent stages."

Example code using CompletionStage is as follows. Note that ListenableFuture and ListenableFutureCallback come from Spring, and CompletableCompletionStage and factory are not part of the JDK either.

ListenableFuture<String> springListenableFuture = createSpringListenableFuture();

CompletableCompletionStage<Object> completionStage = factory.createCompletionStage();
springListenableFuture.addCallback(new ListenableFutureCallback<String>() {
   @Override
   public void onSuccess(String result) {
      System.out.println("onSuccess called");
      completionStage.complete(result);
   }
   @Override
   public void onFailure(Throwable t) {
      System.out.println("onFailure called");
      completionStage.completeExceptionally(t);
   }
});

completionStage.thenAccept(System.out::println);
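The same idea of dependent stages can be tried with nothing but the JDK, because CompletableFuture implements CompletionStage. The sketch below is not the author's code; the class name StageSketch and the values are made up for illustration. It chains two stages onto a source and shows a failure being handled by exceptionally.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Hypothetical demo class using only JDK types.
class StageSketch {

    static String runSuccess() {
        CompletableFuture<String> source = new CompletableFuture<>();

        // Each stage runs when the stage before it completes.
        CompletionStage<String> upper = source.thenApply(String::toUpperCase);
        CompletionStage<String> result = upper.thenApply(s -> s + "!");

        source.complete("done");                      // triggers the dependent stages
        return result.toCompletableFuture().join();
    }

    static String runFailure() {
        CompletableFuture<String> source = new CompletableFuture<>();

        CompletionStage<String> result = source
                .thenApply(String::toUpperCase)       // skipped when the source fails
                .exceptionally(t -> "recovered");     // supplies a fallback value

        source.completeExceptionally(new IllegalStateException("boom"));
        return result.toCompletableFuture().join();
    }

    public static void main(String[] args) {
        System.out.println(runSuccess());             // prints: DONE!
        System.out.println(runFailure());             // prints: recovered
    }
}
```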

If you do not understand everything up to this point, take the code and start experimenting with it. The concurrent package also offers two map interfaces, ConcurrentMap and ConcurrentNavigableMap.

ConcurrentMap<String, String> cm = new ConcurrentHashMap<String, String>();
cm = new ConcurrentSkipListMap<String, String>();

ConcurrentNavigableMap<String, String> cnm = new ConcurrentSkipListMap<String, String>();
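As a small sketch of what these two interfaces offer beyond a plain Map, the following uses the atomic update methods of ConcurrentMap and a navigable sub-map view from ConcurrentNavigableMap. The class name, keys and values are assumptions for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical demo class; keys and values are made up.
class ConcurrentMapSketch {

    static String run() {
        ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();
        hits.putIfAbsent("home", 0);                    // atomic insert-if-missing
        hits.merge("home", 1, Integer::sum);            // atomic read-modify-write: 1
        hits.merge("home", 1, Integer::sum);            // now 2
        boolean swapped = hits.replace("home", 2, 10);  // replaces only if the value is still 2

        ConcurrentNavigableMap<String, Integer> sorted = new ConcurrentSkipListMap<>();
        sorted.put("c", 3);
        sorted.put("a", 1);
        sorted.put("b", 2);
        // headMap returns a ConcurrentNavigableMap view of the keys strictly below "c"
        ConcurrentNavigableMap<String, Integer> head = sorted.headMap("c");

        return hits.get("home") + " " + swapped + " " + head + " " + sorted.firstKey();
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints: 10 true {a=1, b=2} a
    }
}
```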

ConcurrentMap provides thread safety and atomicity guarantees, whilst ConcurrentNavigableMap additionally supports NavigableMap operations, and recursively so for its navigable sub-maps. Then we have the Delayed interface.

Random random = new Random();
int delay = random.nextInt(10000);
Delayed employer = new SalaryDelay("a lot of bs reasons", delay);
System.out.println("bullshit delay this time " + employer.getDelay(TimeUnit.SECONDS));

Delayed is an interface with two methods you must implement: getDelay, and compareTo (inherited from Comparable<Delayed>). Fundamentally, it marks objects that should be acted upon only after a given delay rather than immediately. As an example that is pretty common in the working world, an employer delays salary for any number of reasons.

The last two interfaces are related to each other: ExecutorService is a sub-interface of Executor. We have seen an example of ExecutorService above; below is one for Executor.
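To make the two methods concrete, here is a minimal sketch of a Delayed implementation used with a DelayQueue. DelayedMessage is a hypothetical class written for this illustration, not the SalaryDelay class from the linked repository, and the delays of 100 and 300 milliseconds are arbitrary.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical demo; DelayedMessage is not the SalaryDelay class from the repository.
class DelayedSketch {

    static class DelayedMessage implements Delayed {
        private final String text;
        private final long readyAtNanos;

        DelayedMessage(String text, long delayMillis) {
            this.text = text;
            this.readyAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            // Remaining delay; zero or negative means the element is ready.
            return unit.convert(readyAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            // Elements with less remaining delay sort first.
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    static String run() {
        DelayQueue<DelayedMessage> queue = new DelayQueue<>();
        queue.put(new DelayedMessage("late", 300));
        queue.put(new DelayedMessage("early", 100));
        try {
            // take() blocks until the delay of the head element has expired
            String first = queue.take().text;
            String second = queue.take().text;
            return first + "," + second;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints: early,late
    }
}
```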

Executor executor = new ForkJoinPool();
executor = new ScheduledThreadPoolExecutor(1);

BlockingQueue<Runnable> blockingQueue = new ArrayBlockingQueue<Runnable>(4);
executor = new ThreadPoolExecutor(1, 1, 1000, TimeUnit.SECONDS, blockingQueue);

We can see that many classes implement the Executor interface. According to its Javadoc:

  • An Executor is an object that executes submitted Runnable tasks.
  • The Executor interface does not strictly require that execution be asynchronous.
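The second point can be illustrated with a tiny sketch: an Executor that simply runs each task in the calling thread. The class name DirectExecutorSketch is hypothetical.

```java
import java.util.concurrent.Executor;

// Hypothetical demo class: an Executor that is not asynchronous at all.
class DirectExecutorSketch {

    static boolean runsOnCallerThread() {
        // Executor has a single method, execute(Runnable); here it just calls run().
        Executor direct = Runnable::run;

        Thread[] ranOn = new Thread[1];
        direct.execute(() -> ranOn[0] = Thread.currentThread());

        // The task has already finished, and it ran on the thread that submitted it.
        return ranOn[0] == Thread.currentThread();
    }

    public static void main(String[] args) {
        System.out.println(runsOnCallerThread());   // prints: true
    }
}
```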


That's it for this article; we will continue with the rest in the next article!

Oh, before that, you can download the full source from the following links:

https://github.com/jasonwee/videoOnCloud/commit/3d291610df89c610215b8ecdbcc59cbb028ba14b
https://github.com/jasonwee/videoOnCloud/blob/master/src/java/play/learn/java/concurrent/LearnConcurrentInterfaceP1.java
https://github.com/jasonwee/videoOnCloud/blob/master/src/java/play/learn/java/concurrent/SalaryDelay.java
https://github.com/jasonwee/videoOnCloud/blob/master/src/java/play/learn/java/concurrent/Summer.java
https://github.com/jasonwee/videoOnCloud/blob/master/src/java/play/learn/java/CompletableFuture/LearnCompletionStage.java

Sunday, November 22, 2015

Learn java util concurrent part1

Today we are going to learn about the classes in the Java package java.util.concurrent. Because there are many classes in this package, several articles will be needed to cover them. Let's start with a simple class first, TimeUnit.

package play.learn.java.concurrent;

import java.util.concurrent.TimeUnit;

public class LearnTimeUnit {

   public LearnTimeUnit() throws InterruptedException {

      // assume a long-running application that ran for 2 days 7 hours 35 minutes 2 seconds
      long longRunningApplicationDuration = 200102L;

      System.out.println("duration in nanos " + TimeUnit.SECONDS.toNanos(longRunningApplicationDuration));
      System.out.println("duration in days " + TimeUnit.SECONDS.toDays(longRunningApplicationDuration));
      System.out.println("duration in hours " + TimeUnit.SECONDS.toHours(longRunningApplicationDuration));
      System.out.println("duration in micros " + TimeUnit.SECONDS.toMicros(longRunningApplicationDuration));
      System.out.println("duration in millis " + TimeUnit.SECONDS.toMillis(longRunningApplicationDuration));
      System.out.println("duration in minutes " + TimeUnit.SECONDS.toMinutes(longRunningApplicationDuration));
      System.out.println("duration in seconds " + TimeUnit.SECONDS.toSeconds(longRunningApplicationDuration));

      // list every unit that TimeUnit defines
      TimeUnit[] units = TimeUnit.values();
      System.out.println("size " + units.length);

      for (TimeUnit elem : units) {
         System.out.println(elem.name());
      }

      // sleep using the unit instead of converting to milliseconds by hand
      TimeUnit.SECONDS.sleep(10);
   }

   public static void main(String[] args) throws InterruptedException {
      new LearnTimeUnit();
   }

}

TimeUnit provides several helpful methods to convert a duration between units. You can also download the above source code here.

Next, we will take a look at the concurrent exceptions. These exceptions become meaningful when we actually catch them in code. For now, we will go through their definitions to get a basic understanding of them. Below is a summary.
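Note that each conversion truncates: toDays on 200102 seconds gives 2 and toHours gives 55, not "2 days and 7 hours". As a small sketch of how the conversions can be combined into a days/hours/minutes/seconds breakdown (the class and method names are hypothetical):

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper built only from TimeUnit conversions.
class DurationBreakdownSketch {

    static String breakdown(long totalSeconds) {
        long days = TimeUnit.SECONDS.toDays(totalSeconds);
        long totalHours = TimeUnit.SECONDS.toHours(totalSeconds);
        long totalMinutes = TimeUnit.SECONDS.toMinutes(totalSeconds);

        // Subtract the part already accounted for by the larger unit.
        long hours = totalHours - TimeUnit.DAYS.toHours(days);
        long minutes = totalMinutes - TimeUnit.HOURS.toMinutes(totalHours);
        long seconds = totalSeconds - TimeUnit.MINUTES.toSeconds(totalMinutes);

        return days + "d " + hours + "h " + minutes + "m " + seconds + "s";
    }

    public static void main(String[] args) {
        System.out.println(breakdown(200102L));   // prints: 2d 7h 35m 2s
    }
}
```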

BrokenBarrierException
Exception thrown when a thread tries to wait upon a barrier that is in a broken state, or which enters the broken state while the thread is waiting.

CancellationException
Exception indicating that the result of a value-producing task, such as a FutureTask, cannot be retrieved because the task was cancelled.

CompletionException
Exception thrown when an error or other exception is encountered in the course of completing a result or task.

ExecutionException
Exception thrown when attempting to retrieve the result of a task that aborted by throwing an exception. This exception can be inspected using the Throwable.getCause() method.

RejectedExecutionException
Exception thrown by an Executor when a task cannot be accepted for execution.

TimeoutException
Exception thrown when a blocking operation times out. Blocking operations for which a timeout is specified need a means to indicate that the timeout has occurred. For many such operations it is possible to return a value that indicates timeout; when that is not possible or desirable then TimeoutException should be declared and thrown.


Here is a BrokenBarrierException example; for the full source code, you can download it here.

package play.learn.java.concurrent;

import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class LearnBrokenBarrierException {

   private CyclicBarrier cibai;
   public static int count = 0;

   private void manageThread() {
      cibai = new CyclicBarrier(3);

      for (int i = 0; i < 3; i++) {
         new Thread(new Worker(cibai)).start();
      }
   }

   public static void barrierComplete(CyclicBarrier cb) {
      System.out.println("collating task");

      if (count == 3) {
         System.out.println("Exit from system");
         // comment out the exit to keep the cycle running indefinitely
         System.exit(0);
      }
      count++;

      // start the next round of workers on the same (reusable) barrier
      for (int i = 0; i < 3; i++) {
         new Thread(new Worker(cb)).start();
      }
   }

   public static void main(String[] args) {
      new LearnBrokenBarrierException().manageThread();
   }

   static class Worker implements Runnable {

      CyclicBarrier cibai;

      public Worker(CyclicBarrier cb) {
         this.cibai = cb;
      }

      @Override
      public void run() {
         doSomeWork();
         try {
            // await() returns 0 for the last thread to arrive at the barrier
            if (cibai.await() == 0)
               LearnBrokenBarrierException.barrierComplete(cibai);
         } catch (InterruptedException e) {
            e.printStackTrace();
         } catch (BrokenBarrierException e) {
            e.printStackTrace();
         }
      }

      private void doSomeWork() {
         System.out.println("Doing some work");
      }

   }

}

Below is an example covering CancellationException, ExecutionException, RejectedExecutionException and TimeoutException. The full source code can be downloaded here.

package play.learn.java.concurrent;

import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.FutureTask;

public class LearnCancellationException {

   public static void main(String[] args) {
      MyCallable callable1 = new MyCallable(1000);
      MyCallable callable2 = new MyCallable(2000);

      FutureTask<String> futureTask1 = new FutureTask<String>(callable1);
      FutureTask<String> futureTask2 = new FutureTask<String>(callable2);

      ExecutorService executor = Executors.newFixedThreadPool(2);
      executor.execute(futureTask1);
      executor.execute(futureTask2);

      while (true) {
         try {
            if (futureTask1.isDone() && futureTask2.isDone()) {
               System.out.println("Done");
               // shut down executor service
               executor.shutdown();
               return;
            }

            // uncomment to cancel the second task
            //futureTask2.cancel(true);

            if (!futureTask1.isDone()) {
               // wait indefinitely for future task to complete
               System.out.println("FutureTask1 output=" + futureTask1.get());
            }

            System.out.println("Waiting for FutureTask2 to complete");
            // set a small timeout to get a TimeoutException
            String s = futureTask2.get(2000L, TimeUnit.MILLISECONDS);
            if (s != null) {
               System.out.println("FutureTask2 output=" + s);
            }
         } catch (CancellationException e) {
            e.printStackTrace();
         } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
         } catch (TimeoutException e) {
            e.printStackTrace();
         }
      }

   }

   static class MyCallable implements Callable<String> {

      private long waitTime;

      public MyCallable(int timeInMillis) {
         this.waitTime = timeInMillis;
      }

      @Override
      public String call() throws Exception {
         Thread.sleep(waitTime);
         return Thread.currentThread().getName();
      }

   }

}
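RejectedExecutionException is not actually triggered by the listing above. One simple way to see it, sketched here with a hypothetical class name, is to hand a task to an executor that has already been shut down; with the default rejection policy the executor throws.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo class, not part of the original sources.
class RejectedSketch {

    static boolean rejectedAfterShutdown() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.shutdown();                 // no new tasks are accepted from here on
        try {
            executor.execute(() -> System.out.println("never runs"));
            return false;
        } catch (RejectedExecutionException e) {
            return true;                     // the task was refused
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectedAfterShutdown());   // prints: true
    }
}
```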

Here is a CompletionException example; for the full source code, you can download it here.

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class LearnCompletableFuture {

   public void learnCompletionException() {
      try {
         List<String> list = Arrays.asList("A", "B", "C", "D");
         // getNow returns the fallback value if the future has not completed yet
         list.stream().map(s->CompletableFuture.supplyAsync(() -> s+s))
         .map(f->f.getNow("Not Done")).forEach(s->System.out.println(s));

      } catch (CompletionException e) {
         e.printStackTrace();
      }

   }

   public static void main(String[] args) {
      LearnCompletableFuture c = new LearnCompletableFuture();
      c.learnCompletionException();
   }
}
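In the listing above none of the suppliers fails, so the catch block is never reached. As a minimal sketch of when a CompletionException does appear (the class name and the "boom" message are made up), let the supplier throw and then call join; the original exception is available as the cause.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical demo class, not part of the original sources.
class CompletionExceptionSketch {

    static String run() {
        CompletableFuture<String> failing = CompletableFuture.supplyAsync(() -> {
            throw new IllegalStateException("boom");
        });
        try {
            return failing.join();           // rethrows the failure wrapped in CompletionException
        } catch (CompletionException e) {
            Throwable cause = e.getCause();  // the exception thrown by the supplier
            return cause.getClass().getSimpleName() + ": " + cause.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());           // prints: IllegalStateException: boom
    }
}
```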

That's it for this article. The remaining interfaces and classes in java.util.concurrent will be published in the next few articles. Until then.

Saturday, November 21, 2015

Java Garbage Collector

If you are a Java developer, Java garbage collection (GC) pops up from time to time in Javadoc, online articles and online discussions. It is such a hot and tough topic because it is an entirely different paradigm from what programmers usually do, which is coding. In the background, the Java GC frees the heap memory held by objects your classes created that are no longer referenced. In the past, I have also written a few articles related to Java GC, and today I want to go through several blogs and articles I found online, learn the basics and share what I've learned, so that hopefully Java GC will become clearer to Java programmers.

When you start a Java application, the operating system reserves some memory for it, based on the parameters passed to the java command; this memory is known as the heap. The heap is further divided into several regions known as eden, the survivor spaces, old gen and perm gen. In Oracle Java 8 HotSpot, perm gen has been removed (it was replaced by Metaspace), so be sure to always check the official documentation on the garbage collector for changes. Below are a few links for the HotSpot implementation of Java GC.

The survivor spaces are divided into two, survivor 0 and survivor 1. Eden and the survivor spaces are collectively known as the young generation or new generation, whilst old gen is also known as the tenured generation. Garbage collection happens in both the young generation and the old generation. Below are two diagrams showing how the heap regions are divided.



While the concept of garbage collection is the same, the implementation is not, and neither are the default settings or how to tune it. The well-known JVMs include Oracle (formerly Sun) HotSpot, Oracle JRockit and IBM J9. You can find a list of other JVMs here. Essentially, garbage collection is performed on the young generation and the old generation to remove objects on the heap that no longer have a valid reference.

Common java parameter settings are listed below. For the full list, issue the command java -X.

-Xms initial java heap size
-Xmx maximum java heap size
-Xmn the size of the heap for the young generation

There are a few types of GC:
- serial gc
- parallel gc
- parallel old gc
- cms gc 

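You can specify which GC implementation to run on the Java heap, for example with -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseParallelOldGC or -XX:+UseConcMarkSweepGC on HotSpot.

If you run a server application, the metrics exposed by the GC are definitely something to watch. In order to get the metric, you can use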
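One place where such metrics are available from inside the JVM itself is the standard java.lang.management API. The sketch below is only one possible approach and not necessarily the tool the author had in mind; the class name GcMetricsSketch is hypothetical. It prints the collection count and accumulated collection time for each collector. The collector names depend on the JVM and on the GC selected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical demo class; output varies by JVM and by the GC in use.
class GcMetricsSketch {

    static int report() {
        List<GarbageCollectorMXBean> beans = ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : beans) {
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()   // -1 if undefined
                    + " timeMs=" + gc.getCollectionTime());       // -1 if undefined
        }
        return beans.size();
    }

    public static void main(String[] args) {
        System.gc();   // only a hint to the JVM, so the counters may or may not change
        report();
    }
}
```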

That's it for this brief introduction.

Sunday, October 25, 2015

Learning Java Eden Space


If you have been a Java developer for a while, you should have come across Java garbage collection, which frees the objects created by your application so that they do not occupy the whole Java heap. In today's article, we will look into the Java heap and, in particular, the eden space. First, let's look at the Java heap in general.

From this StackOverflow

Heap memory

The heap memory is the runtime data area from which the Java VM allocates memory for all class instances and arrays. The heap may be of a fixed or variable size. The garbage collector is an automatic memory management system that reclaims heap memory for objects.

Eden Space: The pool from which memory is initially allocated for most objects.

Survivor Space: The pool containing objects that have survived the garbage collection of the Eden space.

Tenured Generation: The pool containing objects that have existed for some time in the survivor space.

When you create a new object, the JVM allocates a part of the heap for your object. Visually, it looks something like the following.

                   +-----+  
                   |     |  
   <-minor gc->    v     v   <------------- major gc---------------------->  
   +------------+-----+-----+----------------------------------------------+-------------+  
   |            |     |     |                                              |             |
   | Eden       | S0  | S1  |  Tenure Generation                           | Perm gen    |
   |            |     |     |                                              |             |
   +------------+-----+-----+----------------------------------------------+-------------+  
    <---------------------jvm heap (-Xms -Xmx)----------------------------> -XX:PermSize  
    <-- young gen(-Xmn)---->                                                -XX:MaxPermSize  

When the eden space fills up with objects and a minor GC is performed, the surviving objects are copied to one of the survivor spaces, S0 or S1. At any time, one of the survivor spaces is empty. Because the eden space is relatively small in comparison to the tenured generation, GC in the eden space is quick. Eden and both survivor spaces are also known as the young or new generation.

To understand how the young generation heap gets freed, this article provides a detailed explanation.

The Sun/Oracle HotSpot JVM further divides the young generation into three sub-areas: one large area named "Eden" and two smaller "survivor spaces" named "From" and "To". As a rule, new objects are allocated in "Eden" (with the exception that if a new object is too large to fit into "Eden" space, it will be directly allocated in the old generation). During a GC, the live objects in "Eden" first move into the survivor spaces and stay there until they have reached a certain age (in terms of numbers of GCs passed since their creation), and only then they are transferred to the old generation. Thus, the role of the survivor spaces is to keep young objects in the young generation for a little longer than just their first GC, in order to be able to still collect them quickly should they die soon afterwards.
Based on the assumption that most of the young objects may be deleted during a GC, a copying strategy ("copy collection") is being used for young generation GC. At the beginning of a GC, the survivor space "To" is empty and objects can only exist in "Eden" or "From". Then, during the GC, all objects in "Eden" that are still being referenced are moved into "To". Regarding "From", the still referenced objects in this space are handled depending on their age. If they have not reached a certain age ("tenuring threshold"), they are also moved into "To". Otherwise they are moved into the old generation. At the end of this copying procedure, "Eden" and "From" can be considered empty (because they only contain dead objects), and all live objects in the young generation are located in "To". Should "to" fill up at some point during the GC, all remaining objects are moved into the old generation instead (and will never return). As a final step, "From" and "To" swap their roles (or, more precisely, their names) so that "To" is empty again for the next GC and "From" contains all remaining young objects.

As you can observe from the diagram above, you can set the amount of heap for the eden and survivor spaces using -Xmn in the java parameters. There is also -XX:SurvivorRatio=ratio, and you can find further information here for Java 8. Note that perm gen, shown in the diagram above, has been removed in Java 8, so always find out which Java version runs your application and refer to the documentation for that version.

If you want to monitor the statistics of eden, you can use jstat. Previously I wrote an article about jstat, and you can read here what jstat is and how to use it. You can also enable GC logging so that the JVM writes GC statistics to a file; you can read more here.

Until we meet again in the next article. Please consider donating, thank you!
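The same pools can also be inspected from inside a running application through the java.lang.management API. This is only a sketch with a hypothetical class name; the pool names it prints (for example an eden, a survivor and an old generation pool) depend on the JVM and on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical demo class; pool names differ between collectors and JVMs.
class HeapPoolsSketch {

    static int report() {
        int heapPools = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;                          // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();   // may be null if the pool is no longer valid
            if (usage != null) {
                System.out.println(pool.getName()
                        + " used=" + usage.getUsed()
                        + " max=" + usage.getMax());   // max is -1 when undefined
            }
            heapPools++;
        }
        return heapPools;
    }

    public static void main(String[] args) {
        report();
    }
}
```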

Sunday, August 2, 2015

Learning basic of cobertura

A while back, I was reading an article about code coverage, and when I googled it I found an open-source code coverage tool called cobertura. So naturally I thought I would give it a try, and the result did not disappoint. Read on to find out why. You might be wondering about the name cobertura; here is the explanation from the official site.

"Cobertura" is the Spanish and Portuguese word for "coverage." We were trying to avoid acronyms and coffee references. It's not too hard to associate the word "cobertura" with the word "coverage," and it even has a bit of a zesty kick to it!

Okay, again, why would I want this as I have already junit running?

Cobertura is a free Java tool that calculates the percentage of code accessed by tests. It can be used to identify which parts of your Java program are lacking test coverage. It is based on jcoverage.

So cobertura is an auxiliary to your existing tests, showing how much of your main codebase the tests currently cover. A requirement, therefore, is that you need to have tests written before you use cobertura.

Okay, enough theory, let's dip a toe into the water. First, download the library; you can download it from this link. Next, unzip the file and change into the library directory. There is a nice ready-made example for you to play with.

 $ ls  
 cobertura-2.1.1.jar        cobertura-2.1.1-sources.jar cobertura-check.sh cobertura-instrument.bat     cobertura-merge.bat cobertura-report.bat examples LICENSE.txt  
 cobertura-2.1.1-javadoc.jar cobertura-check.bat      coberturaFlush.war cobertura-instrument.sh     cobertura-merge.sh  cobertura-report.sh  lib        README.markdown  

Change into the examples/basic directory and run the command below. Yes, you will need ant and java installed.

 $ ant -p  
 Buildfile: /home/user/Desktop/cobertura-2.1.1/examples/basic/build.xml  
   Cobertura - http://cobertura.sourceforge.net/  
   Copyright (C) 2003 jcoverage ltd.  
   Copyright (C) 2005 Mark Doliner <thekingant@users.sourceforge.net>  
   Copyright (C) 2006 Dan Godfrey  
   Cobertura is licensed under the GNU General Public License  
   Cobertura comes with ABSOLUTELY NO WARRANTY  
 Main targets:  
  clean   Remove all files created by the build/test process.  
  coverage Compile, instrument ourself, run the tests and generate JUnit and coverage reports.  
 Default target: coverage  

So that's pretty clear: we have two targets, clean and coverage. The coverage target will generate all the necessary files for you. See below.

 $ ant coverage  
 Buildfile: /home/user/Desktop/cobertura-2.1.1/examples/basic/build.xml  
 init:  
   [mkdir] Created dir: /home/user/Desktop/cobertura-2.1.1/examples/basic/classes  
   [mkdir] Created dir: /home/user/Desktop/cobertura-2.1.1/examples/basic/instrumented  
   [mkdir] Created dir: /home/user/Desktop/cobertura-2.1.1/examples/basic/reports/junit-xml  
   [mkdir] Created dir: /home/user/Desktop/cobertura-2.1.1/examples/basic/reports/junit-html  
   [mkdir] Created dir: /home/user/Desktop/cobertura-2.1.1/examples/basic/reports/cobertura-xml  
   [mkdir] Created dir: /home/user/Desktop/cobertura-2.1.1/examples/basic/reports/cobertura-summary-xml  
   [mkdir] Created dir: /home/user/Desktop/cobertura-2.1.1/examples/basic/reports/cobertura-html  
 compile:  
   [javac] /home/user/Desktop/cobertura-2.1.1/examples/basic/build.xml:36: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds  
   [javac] Compiling 2 source files to /home/user/Desktop/cobertura-2.1.1/examples/basic/classes  
   [javac] Note: /home/user/Desktop/cobertura-2.1.1/examples/basic/src/com/example/simple/SimpleTest.java uses unchecked or unsafe operations.  
   [javac] Note: Recompile with -Xlint:unchecked for details.  
 instrument:  
   [delete] Deleting directory /home/user/Desktop/cobertura-2.1.1/examples/basic/instrumented  
 [cobertura-instrument] 21:55:08,566 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]  
 [cobertura-instrument] 21:55:08,566 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]  
 [cobertura-instrument] 21:55:08,566 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-instrument] 21:55:08,567 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs multiple times on the classpath.  
 [cobertura-instrument] 21:55:08,567 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-instrument] 21:55:08,567 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1.jar!/logback.xml]  
 [cobertura-instrument] 21:55:08,601 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@4fce7ceb - URL [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml] is not of type file  
 [cobertura-instrument] 21:55:08,699 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set  
 [cobertura-instrument] 21:55:08,704 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]  
 [cobertura-instrument] 21:55:08,716 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]  
 [cobertura-instrument] 21:55:08,813 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property  
 [cobertura-instrument] 21:55:08,897 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [net.sourceforge.cobertura] to INFO  
 [cobertura-instrument] 21:55:08,897 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to DEBUG  
 [cobertura-instrument] 21:55:08,897 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]  
 [cobertura-instrument] 21:55:08,898 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.  
 [cobertura-instrument] 21:55:08,899 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@7d6b513b - Registering current configuration as safe fallback point  
 [cobertura-instrument]   
 [cobertura-instrument] 21:55:09,216 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]  
 [cobertura-instrument] 21:55:09,217 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]  
 [cobertura-instrument] 21:55:09,217 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-instrument] 21:55:09,218 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs multiple times on the classpath.  
 [cobertura-instrument] 21:55:09,218 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-instrument] 21:55:09,218 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1.jar!/logback.xml]  
 [cobertura-instrument] 21:55:09,243 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@45a5049a - URL [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml] is not of type file  
 [cobertura-instrument] 21:55:09,310 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set  
 [cobertura-instrument] 21:55:09,315 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]  
 [cobertura-instrument] 21:55:09,325 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]  
 [cobertura-instrument] 21:55:09,354 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property  
 [cobertura-instrument] 21:55:09,402 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [net.sourceforge.cobertura] to INFO  
 [cobertura-instrument] 21:55:09,402 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to DEBUG  
 [cobertura-instrument] 21:55:09,402 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]  
 [cobertura-instrument] 21:55:09,403 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.  
 [cobertura-instrument] 21:55:09,405 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@54d13e2e - Registering current configuration as safe fallback point  
 [cobertura-instrument]   
 [cobertura-instrument] Cobertura 2.1.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file  
 [cobertura-instrument] [INFO] Cobertura: Saved information on 1 classes.  
 [cobertura-instrument] [INFO] Cobertura: Saved information on 1 classes.  
 test:  
   [junit] [INFO] Cobertura: Loaded information on 1 classes.  
   [junit] [INFO] Cobertura: Saved information on 1 classes.  
 [junitreport] Processing /home/user/Desktop/cobertura-2.1.1/examples/basic/reports/junit-xml/TESTS-TestSuites.xml to /tmp/null1467716178  
 [junitreport] Loading stylesheet jar:file:/usr/share/ant/lib/ant-junit.jar!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl  
 [junitreport] Transform time: 1272ms  
 [junitreport] Deleting: /tmp/null1467716178  
 coverage-report:  
 [cobertura-report] 21:55:13,533 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]  
 [cobertura-report] 21:55:13,533 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]  
 [cobertura-report] 21:55:13,533 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-report] 21:55:13,535 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs multiple times on the classpath.  
 [cobertura-report] 21:55:13,535 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-report] 21:55:13,535 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1.jar!/logback.xml]  
 [cobertura-report] 21:55:13,561 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@6e038230 - URL [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml] is not of type file  
 [cobertura-report] 21:55:13,636 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set  
 [cobertura-report] 21:55:13,643 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]  
 [cobertura-report] 21:55:13,653 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]  
 [cobertura-report] 21:55:13,684 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property  
 [cobertura-report] 21:55:13,748 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [net.sourceforge.cobertura] to INFO  
 [cobertura-report] 21:55:13,748 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to DEBUG  
 [cobertura-report] 21:55:13,748 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]  
 [cobertura-report] 21:55:13,749 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.  
 [cobertura-report] 21:55:13,751 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@285855bd - Registering current configuration as safe fallback point  
 [cobertura-report]   
 [cobertura-report] Cobertura 2.1.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file  
 [cobertura-report] [INFO] Cobertura: Loaded information on 1 classes.  
 [cobertura-report] Report time: 159ms  
 summary-coverage-report:  
 [cobertura-report] 21:55:14,128 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]  
 [cobertura-report] 21:55:14,129 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]  
 [cobertura-report] 21:55:14,129 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-report] 21:55:14,131 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs multiple times on the classpath.  
 [cobertura-report] 21:55:14,131 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-report] 21:55:14,131 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1.jar!/logback.xml]  
 [cobertura-report] 21:55:14,161 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@52633079 - URL [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml] is not of type file  
 [cobertura-report] 21:55:14,234 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set  
 [cobertura-report] 21:55:14,239 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]  
 [cobertura-report] 21:55:14,250 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]  
 [cobertura-report] 21:55:14,281 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property  
 [cobertura-report] 21:55:14,334 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [net.sourceforge.cobertura] to INFO  
 [cobertura-report] 21:55:14,335 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to DEBUG  
 [cobertura-report] 21:55:14,335 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]  
 [cobertura-report] 21:55:14,336 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.  
 [cobertura-report] 21:55:14,338 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@6e038230 - Registering current configuration as safe fallback point  
 [cobertura-report]   
 [cobertura-report] Cobertura 2.1.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file  
 [cobertura-report] [INFO] Cobertura: Loaded information on 1 classes.  
 [cobertura-report] Report time: 124ms  
 alternate-coverage-report:  
 [cobertura-report] 21:55:14,694 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]  
 [cobertura-report] 21:55:14,694 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]  
 [cobertura-report] 21:55:14,694 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-report] 21:55:14,695 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs multiple times on the classpath.  
 [cobertura-report] 21:55:14,695 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml]  
 [cobertura-report] 21:55:14,695 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1.jar!/logback.xml]  
 [cobertura-report] 21:55:14,727 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@5abce07 - URL [jar:file:/home/user/Desktop/cobertura-2.1.1/cobertura-2.1.1-sources.jar!/logback.xml] is not of type file  
 [cobertura-report] 21:55:14,814 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set  
 [cobertura-report] 21:55:14,821 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]  
 [cobertura-report] 21:55:14,832 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [STDOUT]  
 [cobertura-report] 21:55:14,874 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property  
 [cobertura-report] 21:55:14,934 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting level of logger [net.sourceforge.cobertura] to INFO  
 [cobertura-report] 21:55:14,934 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to DEBUG  
 [cobertura-report] 21:55:14,935 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [STDOUT] to Logger[ROOT]  
 [cobertura-report] 21:55:14,935 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.  
 [cobertura-report] 21:55:14,937 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@52633079 - Registering current configuration as safe fallback point  
 [cobertura-report]   
 [cobertura-report] Cobertura 2.1.1 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file  
 [cobertura-report] [INFO] Cobertura: Loaded information on 1 classes.  
 [cobertura-report] Report time: 171ms  
 coverage:  
 BUILD SUCCESSFUL  
 Total time: 9 seconds  

From the output, we can see that the instrument target ran, followed by the test target, and then all the reports were generated. If you list the directory now, you should see a few additional entries: cobertura.ser, reports, classes and instrumented. All the generated reports are in the reports directory. The one relevant to this article is the cobertura report, and I will show you the cobertura-html report.



That's it. If you want to go further and integrate this code coverage tool into your project, start by looking into the example build.xml. Have fun.

Friday, June 19, 2015

Learn lucene term range query

Today, we are going to learn lucene term range query. But first, what actually is a lucene term range query? From the official javadoc definition:

A Query that matches documents within an range of terms.

This query matches the documents looking for terms that fall into the supplied range according to Byte.compareTo(Byte). It is not intended for numerical ranges; use NumericRangeQuery instead.

This query uses the MultiTermQuery.CONSTANT_SCORE_AUTO_REWRITE_DEFAULT rewrite method.

So the terms are compared byte by byte against the two ends of the range, and because it is a byte to byte comparison, the ordering is lexicographic. If you intend to find a range between two numbers, this is not the class you should use. Okay, if this is not clear, let's go into the code, shall we?

As you know, lucene has two parts: the indexing (write) part and the search (query) part. In this article, we are going to index and then query using term range query. To give you an overview, we have four classes.

  • LuceneConstants - just a settings class for this application.
  • Indexer - the class that does the indexing.
  • Searcher - the class that does the search.
  • LearnTermRangeQuery - our main entry class, which binds the above three classes together.
We create an object, tester, for this learning journey. We then create the index by calling the method createIndex and search the index using term range query.


 LearnTermRangeQuery tester;

 try {
    tester = new LearnTermRangeQuery();
    tester.createIndex();
    tester.searchUsingTermRangeQuery("record2.txt", "record6.txt");
 } catch (Exception e) {
    // do not swallow failures silently while experimenting
    e.printStackTrace();
 }

In the method createIndex(), I use some lambdas, which you can spot by the arrow symbol, so you need to have java8 installed. There are two variables, indexDir and dataDir. The variable indexDir is the directory where the created index will reside, whilst dataDir holds the sample data to be indexed. In the class Indexer, the method getDocument() is essentially where each sample document is prepared for indexing. Nothing fancy, just creating an ordinary lucene document with three fields: file name, file path and file content.

Back to the class LearnTermRangeQuery and its method searchUsingTermRangeQuery(). Notice that we search the range with two file names as the borders. We initialize a lucene directory object and pass it to the index searcher object. Everything else for the lucene index searcher is just standard. We construct the TermRangeQuery and pass it to the searcher object. The results are then shown and the searcher is eventually closed.

Below is the sample output from the eclipse console.

 record 21.txt  
 src/resources/samples.termrange/record 21.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record 21.txt  
 record 33 .txt  
 src/resources/samples.termrange/record 33 .txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record 33 .txt  
 record10.txt  
 src/resources/samples.termrange/record10.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record10.txt  
 record7.txt  
 src/resources/samples.termrange/record7.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record7.txt  
 record6.txt  
 src/resources/samples.termrange/record6.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record6.txt  
 record9.txt  
 src/resources/samples.termrange/record9.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record9.txt  
 record33.txt  
 src/resources/samples.termrange/record33.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record33.txt  
 record2.txt  
 src/resources/samples.termrange/record2.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record2.txt  
 record5.txt  
 src/resources/samples.termrange/record5.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record5.txt  
 record 33.txt  
 src/resources/samples.termrange/record 33.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record 33.txt  
 record3.txt  
 src/resources/samples.termrange/record3.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record3.txt  
 record8.txt  
 src/resources/samples.termrange/record8.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record8.txt  
 record2.1.txt  
 src/resources/samples.termrange/record2.1.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record2.1.txt  
 record1.txt  
 src/resources/samples.termrange/record1.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record1.txt  
 record4.txt  
 src/resources/samples.termrange/record4.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record4.txt  
 record22.txt  
 src/resources/samples.termrange/record22.txt  
 Indexing /home/user/eclipse/test/src/resources/samples.termrange/record22.txt  
 16 File indexed, time taken: 800 ms  
 6 documents found. Time :74ms  
 File : /home/user/eclipse/test/src/resources/samples.termrange/record33.txt  
 File : /home/user/eclipse/test/src/resources/samples.termrange/record2.txt  
 File : /home/user/eclipse/test/src/resources/samples.termrange/record5.txt  
 File : /home/user/eclipse/test/src/resources/samples.termrange/record3.txt  
 File : /home/user/eclipse/test/src/resources/samples.termrange/record4.txt  
 File : /home/user/eclipse/test/src/resources/samples.termrange/record22.txt  
   

As you can see above, the results are not what you would expect if you read the file names numerically from record2.txt to record6.txt: record22.txt and record33.txt are matched because the comparison is lexicographic. So always experiment with a few values before you implement. hehe, have fun! You can get the source for this code at my github.
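To see why those six files come back, here is a minimal sketch in plain Java that needs no lucene at all. It filters the same file names with String.compareTo, which for these ASCII names orders them the same way a byte to byte comparison does. The class name and the helper inRange are made up for illustration, and the lower-inclusive, upper-exclusive bounds are an assumption inferred from the output above (record2.txt is returned, record6.txt is not).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LexicographicRangeDemo {

    // Returns the names in [lower, upper) using plain lexicographic comparison.
    // Assumption: lower bound inclusive, upper bound exclusive.
    static List<String> inRange(List<String> names, String lower, String upper) {
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (name.compareTo(lower) >= 0 && name.compareTo(upper) < 0) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The sixteen file names from the sample output, in indexing order.
        List<String> names = Arrays.asList(
                "record 21.txt", "record 33 .txt", "record10.txt", "record7.txt",
                "record6.txt", "record9.txt", "record33.txt", "record2.txt",
                "record5.txt", "record 33.txt", "record3.txt", "record8.txt",
                "record2.1.txt", "record1.txt", "record4.txt", "record22.txt");

        // "record22.txt" sorts after "record2.txt" because '2' > '.',
        // and before "record6.txt" because '2' < '6'.
        System.out.println(inRange(names, "record2.txt", "record6.txt"));
    }
}
```

Running it prints record33.txt, record2.txt, record5.txt, record3.txt, record4.txt and record22.txt, the same six names the search returned. record2.1.txt is left out because '1' sorts before 't' at the position where the two names first differ.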

Sunday, May 10, 2015

My journey and experience on upgrading apache cassandra from version1.0.12 to 1.1.12

If you have read my previous post on apache cassandra upgrade, this is another journey, a major upgrade of apache cassandra from version 1.0 to 1.1. In this article, I will share my experience of upgrading cassandra from version 1.0.12 to 1.1.12.

The sstable version used by cassandra 1.0.12 is hd, and you should ensure that the sstables on all nodes are at hd before proceeding to upgrade to a newer version of cassandra.
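One way to check this is to look at the version token in the sstable file names. The sketch below is only an illustration: it assumes the pre-1.1 data file naming of columnfamily-version-generation-Data.db, and the class name, the helper notAtVersion and the sample file names are all hypothetical. Verify the naming against your own data directory before relying on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SstableVersionCheck {

    // Assumed pre-1.1 data file layout: <columnfamily>-<version>-<generation>-Data.db
    private static final Pattern DATA_FILE =
            Pattern.compile("^(.+)-([a-z]+)-(\\d+)-Data\\.db$");

    // Returns the data files whose version token differs from the expected one.
    static List<String> notAtVersion(List<String> fileNames, String expectedVersion) {
        List<String> stale = new ArrayList<>();
        for (String name : fileNames) {
            Matcher m = DATA_FILE.matcher(name);
            if (m.matches() && !m.group(2).equals(expectedVersion)) {
                stale.add(name);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Hypothetical file names; in practice, list the keyspace data directory.
        List<String> files = Arrays.asList(
                "Users-hd-12-Data.db", "Users-hc-3-Data.db", "Users-hd-12-Index.db");
        System.out.println("not yet at hd: " + notAtVersion(files, "hd"));
    }
}
```

Any file reported here would need to be rewritten, for example with nodetool upgradesstables, before moving on.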

First, let's read some highlights of cassandra 1.1

  • api version 19.33.0

  • new file cassandra-rackdc.properties, commitlog_archiving.properties

  • new directory structure for sstable and filename change for sstable.

  • more features/improvements to nodetool: compactionstats shows the remaining time, cleanup calculates the exact size required, you can now stop a compaction, rangekeysample, getsstables, repair prints progress, etc.

  • global key and row cache.

  • cql 3.0 beta

  • schema change for cassandra in caching.

  • libthrift version 0.7.0.

  • sstable hf version.

  • the default compressor becomes the snappy compressor.

  • a lot of improvements to leveled compaction strategy.

  • sliced_buffer_size_in_kb option has been removed from the cassandra.yaml configuration file (this option was a no-op since 1.0).

  • thread stack size increased to 160k

  • added flag UseTLAB for jvm to improve read speed.

As this is a newer version of cassandra compared to the previous one, it is always good to set up a test node so you can play around and get familiar with it before actually doing the upgrade. With this new node, you can also quickly test your application clients which write to and/or read from the test cassandra node. It is also recommended to do some load tests to see whether the result is what you expected.

If you want to be extremely careful with the upgrade, then reading the code changes between the versions you are upgrading across is always recommended. This is the link for this upgrade, and I know and understand that there are huge differences between them, so you should split the changes into pieces as small as possible to read through them. You can learn a lot from experienced coders if you spend time reading their code, and you can learn new technology too. It is a daunting, huge task, but if you are willing to spend some time reading the changes, the benefits are too many to describe here.

If you upgrade from 1.0.12 to 1.1.12, cassandra 1.1 is smart enough to move the sstables into the new directory structure. This eases your job, as you do not need to move the sstables yourself. When the new cassandra 1.1.12 starts up, it will move them for you.

So you might want to consider preparing the configuration files for your cluster environment beforehand, for example cassandra.yaml, cassandra-env.sh and cassandra.in.sh. By doing this, you shorten the upgrade and make it less error prone, because you are not editing the files by hand during the upgrade; an upgrade script will symlink them for you. So spend some time writing upgrade and downgrade scripts for the production cluster and test them.

Because the upgrade process will take time (a long one, depending on how many nodes you have in the cluster) and it will tire you in the process (remember, there will be post-upgrade issues which you need to deal with), I suggest you create an upgrade script to handle the upgrade process. The cassandra configuration which you prepared before will be symlinked automatically by this script. When you do this, you reduce risks such as human error, and for a production cluster you will NOT want to risk anything, or at least you will want to cut the risk to the minimum possible.

There is official upgrade documentation here at datastax, but because your cluster environment might be different, you might want to write your own upgrade steps, taking the official documentation into consideration, and have them peer reviewed so you cover as much as possible. Best if your peers also test them and raise questions which you might not have thought of.

If you are using a monitoring system such as opscenter, spm, jconsole, or your own monitoring system, you will want to check whether it supports the newer version of cassandra.

The key cache and row cache per column family have been replaced with a global key cache and a global row cache respectively. These global cache settings can be found in the cassandra.yaml file. If you leave them at the defaults, the key cache defaults to 1 million keys. Below are some new parameters for cassandra 1.1:

  • populate_io_cache_on_flush

  • key_cache_size_in_mb

  • key_cache_save_period

  • row_cache_size_in_mb

  • row_cache_save_period

  • row_cache_provider

  • commitlog_segment_size_in_mb

  • trickle_fsync

  • trickle_fsync_interval_in_kb

  • internode_authenticator

and below are the configuration options that were removed

  • sliced_buffer_size_in_kb

  • thrift_max_message_length_in_mb

For the upgrade in production, these steps were taken:

pre-upgrade, applies to all nodes in the cluster.
* stop any repair or cleanup on all cassandra nodes and make sure no streaming is happening. Streaming happens when a node bootstraps or when you rebuild a node.

upgrade steps.
1. download cassandra 1.1.12 and verify binary is not corrupted.
2. extract the compressed tarball.
3. nodetool snapshot.
4. nodetool drain.
5. stop cassandra if it is not stopped.
6. symlink new configuration files.
7. start cassandra 1.1.12
8. monitor cassandra system.log
9. check monitoring system.

If everything looks okay for the first node, it is best to do two nodes next, and then continue with the rest of the nodes in rolling upgrade fashion. After you migrate, you might also notice that there are 3 additional column families in cassandra 1.1.

cassandra 1.0 system keyspace has a total of 7 column families

  • HintsColumnFamily

  • IndexInfo

  • LocationInfo

  • Migrations

  • NodeIdInfo

  • Schema

  • Versions

cassandra 1.1 system keyspace has a total of 10 column families.

  • HintsColumnFamily

  • IndexInfo

  • LocationInfo

  • Migrations

  • NodeIdInfo

  • Schema

  • schema_columnfamilies

  • schema_columns

  • schema_keyspaces

  • Versions

If you are using leveled compaction strategy, the sstables need to be scrubbed accordingly. There are nodetool scrub and the offline sstablescrub for this job. If you have defined column families using the counter type, you should upgrade the sstables using nodetool upgradesstables.

That's it, and if you need professional service for this, please contact me and I will gladly provide professional advice and/or service.

Saturday, May 9, 2015

Light walkthrough on Java Execution Time Measurement Library (JETM)

Today, let's learn a java library, Java Execution Time Measurement Library or JETM. What is JETM?

From the official site
A small and free library, that helps locating performance problems in existing Java applications.

 

JETM enables developers to track down performance issues on demand, either programmatic or declarative with minimal impact on application performance, even in production.

jetm is pretty cool and has a lot of features.

You can follow the tutorial trail here. The following code is taken from one of the tutorials with minor modifications.
import etm.core.configuration.BasicEtmConfigurator;
import etm.core.configuration.EtmManager;
import etm.core.monitor.EtmMonitor;
import etm.core.monitor.EtmPoint;
import etm.core.renderer.SimpleTextRenderer;

public class BusinessService {

    private static final EtmMonitor etmMonitor = EtmManager.getEtmMonitor();

    public void someMethod() {
        // start a measurement point for this method
        EtmPoint point = etmMonitor.createPoint("BusinessService:someMethod");

        try {
            Thread.sleep((long) (10d * Math.random()));
            nestedMethod();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt status
        } finally {
            point.collect(); // always stop the measurement
        }
    }

    public void nestedMethod() {
        EtmPoint point = etmMonitor.createPoint("BusinessService:nestedMethod");

        try {
            Thread.sleep((long) (15d * Math.random()));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt status
        } finally {
            point.collect();
        }
    }

    public static void main(String[] args) {
        BasicEtmConfigurator.configure(true);
        etmMonitor.start();

        BusinessService bizz = new BusinessService();
        bizz.someMethod();
        bizz.someMethod();
        bizz.someMethod();
        bizz.someMethod();
        bizz.nestedMethod();

        // print the collected measurements as a text table
        etmMonitor.render(new SimpleTextRenderer());
        etmMonitor.stop();
    }

}

Hit the run button in eclipse.
EtmMonitor info [INFO] JETM 1.2.3 started.
|--------------------------------|---|---------|-------|--------|--------|
| Measurement Point | # | Average | Min | Max | Total |
|--------------------------------|---|---------|-------|--------|--------|
| BusinessService:nestedMethod | 1 | 4.121 | 4.121 | 4.121 | 4.121 |
|--------------------------------|---|---------|-------|--------|--------|
| BusinessService:someMethod | 4 | 12.611 | 6.196 | 16.347 | 50.442 |
| BusinessService:nestedMethod | 4 | 5.381 | 0.017 | 10.194 | 21.523 |
|--------------------------------|---|---------|-------|--------|--------|
EtmMonitor info [INFO] Shutting down JETM.

So we saw that nestedMethod was executed once on its own and four times nested inside someMethod, which itself was executed four times. The result shows the minimum and maximum execution times together with the average. The last column shows the total. Pretty neat for a small java library.
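To make the table columns concrete, here is a small sketch of the bookkeeping behind them in plain Java. This is not how JETM is implemented; the class TimingStats and its methods are made up for illustration. It only shows how a count, average, minimum, maximum and total can be kept for each named measurement point.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TimingStats {

    // Aggregated figures for one measurement point, mirroring the table columns.
    static final class Stat {
        long count;
        double min = Double.MAX_VALUE;
        double max;
        double total;

        double average() {
            return count == 0 ? 0.0 : total / count;
        }
    }

    private final Map<String, Stat> stats = new LinkedHashMap<>();

    // Adds one measured duration (in milliseconds) to the named point.
    void record(String point, double millis) {
        Stat s = stats.computeIfAbsent(point, k -> new Stat());
        s.count++;
        s.min = Math.min(s.min, millis);
        s.max = Math.max(s.max, millis);
        s.total += millis;
    }

    Stat get(String point) {
        return stats.get(point);
    }

    public static void main(String[] args) throws InterruptedException {
        TimingStats timing = new TimingStats();
        for (int i = 0; i < 4; i++) {
            long start = System.nanoTime();
            Thread.sleep((long) (10d * Math.random()));
            timing.record("someMethod", (System.nanoTime() - start) / 1_000_000.0);
        }
        Stat s = timing.get("someMethod");
        System.out.printf("someMethod # %d avg %.3f min %.3f max %.3f total %.3f%n",
                s.count, s.average(), s.min, s.max, s.total);
    }
}
```

What a library like JETM adds on top of this is the nesting of points, the renderers and the low overhead, which is why it is still worth using instead of hand-rolled timers.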

 

Friday, May 8, 2015

Elasticsearch no node exception happened in tomcat web container

If you ever get a stack trace like the one below in your web container log file and wonder how to solve it, then read on. But first, a little background: an elasticsearch 0.90 cluster, and a client running in a tomcat web container using the elasticsearch java transport client. Both server and client are running the same elasticsearch version and the same java version.
16.Feb 6:21:30,830 ERROR WebAppTransportClient [put]: error
org.elasticsearch.client.transport.NoNodeAvailableException: No node available
at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:212)
at org.elasticsearch.client.transport.support.InternalTransportClient.execute(InternalTransportClient.java:106)
at org.elasticsearch.client.support.AbstractClient.index(AbstractClient.java:84)
at org.elasticsearch.client.transport.TransportClient.index(TransportClient.java:316)
at org.elasticsearch.action.index.IndexRequestBuilder.doExecute(IndexRequestBuilder.java:324)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:59)
at com.example.elasticsearch.WebAppTransportClient.put(WebAppTransportClient.java:258)
at com.example.elasticsearch.WebAppTransportClient.put(WebAppTransportClient.java:307)
at com.example.threadpool.TaskThread.run(TaskThread.java:38)
at java.lang.Thread.run(Thread.java:662)

This exception disappears once the web container is restarted, but restarting the webapp that often is not a good solution in production. I did some research online and gathered some information, as follows:

* The default number of channels in each of these class are configured with the configuration prefix of transport.connections_per_node.
https://www.found.no/foundation/elasticsearch-networking/

* If you see NoNodeAvailableException you may have hit a connect timeout of the client. Connect timeout is 30 secs IIRC.
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/elasticsearch/VyNpCs17aTA/CcXkYvVMYWAJ

* You can set org.elasticsearch.client.transport to TRACE level in your logging configuration (on the client side) to see the failures it has (to connect for example). For more information, you can turn on logging on org.elasticsearch.client.transport.
https://groups.google.com/forum/#!topic/elasticsearch/Mt2x4d5BCGI

* This means that you started to get disconnections between the client (transport) and the server. It will try and reconnect automatically, and possibly manages to do it. For more information, you can turn on logging on org.elasticsearch.client.transport.
* Can you try and increase the timeout and see how it goes? Set client.transport.ping_timeout in the settings you pass to the TransportClient to 10s for example.
* We had the same problem. reason: The application server uses a older version of log4j than ES needed.
http://elasticsearch-users.115913.n3.nabble.com/No-node-available-Exception-td3920119.html

* The correct method is to add the known host addresses with addTransportAddresses() and afterwards check the connectedNodes() method. If it returns empty list, no nodes could be found.
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/elasticsearch/ceH3UIy14jM/XJSFKd8kAXEJ

* the most common case for NoNodeAvailable is the regular pinging that the transport client does fails to do it, so no nodes end up as the list of nodes that the transport client uses. If you will set client.transport (or org.elasticsearch.client.transport if running embedded) to TRACE, you will see the pinging effort and if it failed or not (and the reason for the failures). This might get us further into trying to understand why it happens.
* .put("client.transport.ping_timeout", pingTimeout)
* .put("client.transport.nodes_sampler_interval", pingSamplerInterval).build();
https://groups.google.com/forum/#!msg/elasticsearch/9aSkB0AVrHU/_4kDkjAFKuYJ

* this has nothing to do with migration errors. Your JVM performs a very long GC of 9 seconds which exceeds the default ping timeout of 5 seconds, so ES dropped the connection ,assuming your JVM is just too busy. Try again if you can reproduce it. If yes, increase the timeout to something like 10 seconds, or consider to update your Java version.
http://elasticsearch-users.115913.n3.nabble.com/Migration-errors-0-20-1-to-0-90-td4035165.html

* During long GC the JVM is somehow suspended. So your client can not see it anymore.
http://grokbase.com/t/gg/elasticsearch/136fw0hppp/transport-client-ping-timeout-no-node-available-exception

* You wrote that you have a 0.90.9 cluster but you added 0.90.0 jars to the client. Is that correct?
* Please check:
* if your cluster nodes and client node is using exactly the same JVM
* if your cluster and client use exactly the same ES version
* if your cluster and client use the same cluster name
* reasons outside ES: IP blocking, network reachability, network interfaces, IPv4/IPv6 etc.
* Then you should be able to connect with TransportClient.

https://groups.google.com/forum/#!msg/elasticsearch/fYmKjGywe8o/z9Ci5L5WjUAJ

I tried all the options mentioned above, and the problem was solved by adding sniff to the transport client settings. For more information, read here.
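For reference, the settings discussed above can be gathered in one place. The sketch below uses a plain java.util.Map as a stand-in, so it runs without the elasticsearch jars; in a real client these pairs would be handed to the settings builder you pass to the TransportClient. I am assuming the sniff option refers to the key client.transport.sniff. The class name, the helper clientSettings and the values my-cluster and 10s are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TransportClientSettingsSketch {

    // Collects the client-side settings as plain key/value pairs.
    static Map<String, String> clientSettings(String clusterName, String pingTimeout) {
        Map<String, String> settings = new LinkedHashMap<>();
        // must match the cluster name configured on the server side
        settings.put("cluster.name", clusterName);
        // assumed key for the sniff option that solved the problem
        settings.put("client.transport.sniff", "true");
        // the ping timeout mentioned in the quoted advice above
        settings.put("client.transport.ping_timeout", pingTimeout);
        return settings;
    }

    public static void main(String[] args) {
        clientSettings("my-cluster", "10s")
                .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```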

I hope this will solve your problem too.

Saturday, April 25, 2015

My way of solving tomcat memory leaking issue

Recently, I made a mistake by accidentally committing some stupid static code into a static method in production, causing heap usage to grow tremendously. Since static state persists for as long as the class stays loaded, tomcat had to be restarted often to free up the heap being held. So today, I will share my experience of how I solved it, and I hope it will give you an idea of how to solve this difficult problem.
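This post does not show the offending code, so the following is only a sketch of one typical shape such a leak takes, assuming a static collection that keeps growing. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class StaticLeakSketch {

    // Leaky shape: a static list lives as long as the class stays loaded,
    // so everything added here stays reachable and is never collected.
    private static final List<byte[]> SEEN = new ArrayList<>();

    static void handleLeaky(byte[] payload) {
        SEEN.add(payload);
    }

    // Safer shape: work on a local variable that becomes garbage after the call.
    static int handleSafe(byte[] payload) {
        List<byte[]> local = new ArrayList<>();
        local.add(payload);
        return local.size();
    }

    // Number of payloads still held by the static list.
    static int retained() {
        return SEEN.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleLeaky(new byte[1024]);
            handleSafe(new byte[1024]);
        }
        System.out.println("payloads still reachable through the static list: " + retained());
    }
}
```

In a web container every request that passes through the leaky method adds to the list, which is why the heap keeps growing until a restart.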
First is the to end, I will summarize the sequence you need to investigate and find out the fix.

* CHECK YOUR CODE.
* learn on how to find the memory leak using google.
* one step at a time to trace until you successfully pin down the problem and fix it.

As you can see, there are only three general steps, and I will say more about each one.
CHECK YOUR CODE.

Always check your code by reading and testing it! Better still, if you know someone experienced, send them your code for inspection. Remember, four eyes and two brains are better than two eyes and one brain. If you are using an open source project, the library is most probably well tested, so you should spend your time investigating your own code. This is difficult, especially for a new programmer, but that should not stop you from hunting down the problem. If you still cannot find it, then start searching for how other people solved it.
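To show the kind of thing to look for when reading your code, here is a minimal sketch of a leak held by static state, next to a bounded alternative. This is a made-up illustration, not the code from my incident; every name in it (StaticLeakSketch, LEAKY_CACHE, MAX_ENTRIES and so on) is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a static-held leak; not the code from the incident.
class StaticLeakSketch {

    // Leaky: a static collection stays reachable for as long as the class is loaded,
    // so every entry added here can never be garbage collected.
    static final List<byte[]> LEAKY_CACHE = new ArrayList<>();

    static void handleRequestLeaky(byte[] payload) {
        LEAKY_CACHE.add(payload); // grows on every call, nothing ever removes entries
    }

    // Bounded alternative: an access-ordered map that evicts the eldest entry
    // once it holds more than MAX_ENTRIES. Not thread safe; a sketch only.
    static final int MAX_ENTRIES = 100;
    static final Map<Integer, byte[]> BOUNDED_CACHE =
            new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                    return size() > MAX_ENTRIES;
                }
            };

    static void handleRequestBounded(int requestId, byte[] payload) {
        BOUNDED_CACHE.put(requestId, payload);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequestLeaky(new byte[1024]);
            handleRequestBounded(i, new byte[1024]);
        }
        System.out.println("leaky size:   " + LEAKY_CACHE.size());
        System.out.println("bounded size: " + BOUNDED_CACHE.size());
    }
}
```

The leaky list keeps every payload it has ever seen, while the bounded map never holds more than MAX_ENTRIES. In a real web application the same pattern is usually less obvious, for example a static map keyed by session or request.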
learn on how to find the memory leak using google.
Nobody is perfect or knows everything, so whenever you are unsure, google it. Search for keywords such as java memory leak, tomcat memory leak, or even best java coding practice. Pay attention to the first 10 links Google returns, then read the blog posts and Stack Overflow answers; they will teach you things you never knew. Tools you will need include jstat, jmap, jhat, and visualvm, which can give you an idea of what the problem is or even where it might be coming from. Remember, reading this material is a way of growing and it takes time, so please be patient at this step: spend an adequate amount of time on it and jot down the important points so you can use them in the final step.
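Besides the external tools, the JVM can report its own heap usage from inside the application, which is handy for a quick log line while you investigate. Here is a minimal sketch using the standard java.lang.management API. The class name HeapSampler, the five samples and the one second interval are arbitrary choices for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapSampler {

    // Returns the current heap usage in kilobytes, as reported by the JVM itself.
    static long usedHeapKb() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed() / 1024;
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample count and interval are arbitrary values chosen for this sketch.
        for (int i = 0; i < 5; i++) {
            System.out.println("used heap (KB): " + usedHeapKb());
            Thread.sleep(1000);
        }
    }
}
```

A value that keeps climbing across many samples, even after garbage collections, is a hint that something is holding on to objects.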

one step at a time to trace until you successfully pin down the problem and fix it.
The final step is probably repeating step 1 and step 2 slowly until you determine the root cause. If you are using a version control system, you should find the last known good version of the code and then check, file by file, where the problem was introduced. This is a TEDIOUS and DAUNTING process, but it is an effective way to find the root cause.
These are the steps I used myself to track down the memory problem in my Tomcat web application. Thank you, and I hope you can benefit too.

Friday, April 24, 2015

Learning java jstat

Today we are going to learn about a Java tool that is incredibly useful if you write Java applications frequently. It is a monitoring tool known as jstat, and it comes with the JDK. You might ask: why would I need jstat when my app runs just fine? For a simple Java application, indeed, you do not need this monitoring tool. However, if you have a long-running application or one with a big codebase, and it sometimes hangs (pauses or freezes) midway through a run, then you should really start looking into this tool. In this article, I'm going to show you how I use it.

But first, let's understand what jstat is.
The jstat tool displays performance statistics for an instrumented HotSpot Java virtual machine (JVM).

As you are aware, objects that your code creates are eventually freed from the heap once they are no longer referenced. If you have a lot of objects and heap usage grows, you can use this monitoring tool to check what is going on with heap allocation. Okay, now let's look at the command syntax.
jstat [ generalOption | outputOptions vmid [interval[s|ms] [count]] ]

Pretty simple: the jstat command followed by a few parameters, which are explained below. You can find the official documentation here.

generalOption
A single general command-line option (-help or -options)

outputOptions
One or more output options, consisting of a single statOption, plus any of the -t, -h, and -J options.

vmid
Virtual machine identifier, a string indicating the target Java virtual machine (JVM). The general syntax is
[protocol:][//]lvmid[@hostname[:port]/servername]
The syntax of the vmid string largely corresponds to the syntax of a URI. The vmid can vary from a simple integer representing a local JVM to a more complex construction specifying a communications protocol, port number, and other implementation-specific values. See Virtual Machine Identifier for details.

interval[s|ms]
Sampling interval in the specified units, seconds (s) or milliseconds (ms). Default units are milliseconds. Must be a positive integer. If specified, jstat will produce its output at each interval.

count
Number of samples to display. Default value is infinity; that is, jstat displays statistics until the target JVM terminates or the jstat command is terminated. Must be a positive integer.
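To make the parameter order concrete, here is a minimal sketch that assembles a jstat command line from those parts and runs it from Java. The class name JstatCommand and its build method are hypothetical helpers, and the sketch assumes jstat is on the PATH and that you pass the pid of a running local JVM as the first argument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class JstatCommand {

    // Builds: jstat -<statOption> <vmid> <interval>ms <count>
    static List<String> build(String statOption, String vmid, int intervalMillis, int count) {
        List<String> command = new ArrayList<>();
        command.add("jstat");
        command.add("-" + statOption);      // outputOptions, for example -gcutil
        command.add(vmid);                  // a plain pid for a local JVM
        command.add(intervalMillis + "ms"); // interval with an explicit unit
        command.add(String.valueOf(count)); // number of samples
        return command;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length < 1) {
            System.err.println("usage: java JstatCommand <pid>");
            return;
        }
        // Assumes jstat is on the PATH; prints three gcutil samples, one per second.
        List<String> command = build("gcutil", args[0], 1000, 3);
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        process.waitFor();
    }
}
```

Passing a count, as done here, makes jstat exit by itself instead of sampling until it is stopped.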

This should be very clear if you are a seasoned Java coder. If it isn't, take a look at the example below.
[iser@localhost ~]$ jstat -gcutil 12345 1s
   S0     S1      E      O      P     YGC      YGCT  FGC    FGCT       GCT
10.08   0.00  70.70  69.22  59.49  122328  4380.327  355  43.146  4423.474
10.08   0.00  84.99  69.22  59.49  122328  4380.327  355  43.146  4423.474
 0.00  15.62   0.00  69.24  59.49  122329  4380.351  355  43.146  4423.497

Here jstat samples the local JVM with process id 12345 at an interval of 1 second and, because no count is given, keeps going until it is stopped. Different types of statistics can be shown; the example above shows a summary of garbage collection statistics. To see other types of GC statistics, run jstat -options to list them. The table below summarizes what each option displays.
Option	Displays...
class	Statistics on the behavior of the class loader.
compiler	Statistics of the behavior of the HotSpot Just-in-Time compiler.
gc	Statistics of the behavior of the garbage collected heap.
gccapacity	Statistics of the capacities of the generations and their corresponding spaces.
gccause	Summary of garbage collection statistics (same as -gcutil), with the cause of the last and current (if applicable) garbage collection events.
gcnew	Statistics of the behavior of the new generation.
gcnewcapacity	Statistics of the sizes of the new generations and its corresponding spaces.
gcold	Statistics of the behavior of the old and permanent generations.
gcoldcapacity	Statistics of the sizes of the old generation.
gcpermcapacity	Statistics of the sizes of the permanent generation.
gcutil	Summary of garbage collection statistics.
printcompilation	HotSpot compilation method statistics.

Out of all these options, the ones you will probably use most often are gcutil, gc and gccapacity. We will look at each of them with an example. Please note that, to protect the privacy of the user, some information has been removed, but what needs to be presented in this article remains as is.

option gcutil

(screenshot: jstat -gcutil output)

The screenshot above shows jstat run with the gcutil option against Java process id 23483, sampling at an interval of 1 second. The output has 10 columns, which are explained below.
Column	Description
S0	Survivor space 0 utilization as a percentage of the space's current capacity.
S1	Survivor space 1 utilization as a percentage of the space's current capacity.
E	Eden space utilization as a percentage of the space's current capacity.
O	Old space utilization as a percentage of the space's current capacity.
P	Permanent space utilization as a percentage of the space's current capacity.
YGC	Number of young generation GC events.
YGCT	Young generation garbage collection time.
FGC	Number of full GC events.
FGCT	Full garbage collection time.
GCT	Total garbage collection time.

The first five columns show space utilization as percentages. The next five show the number of young generation collections and their total time, the number of full collections and their total time, and finally the total garbage collection time. In this screen capture, the eden space fills up quickly and surviving objects are moved to either survivor space 0 or survivor space 1. At one point, some objects survived long enough to be promoted to the old space, increasing its usage by 0.01% to 5.24%. Note also that YGC increased by one to 256 as a result, and that this young generation collection took 13 milliseconds. A similar pattern happens again later: YGC increases by one to 257 with another 13 milliseconds of collection time. In this output there is no change in the full collection count, which is good. Only one full collection has happened, but with a pause of 94 milliseconds! You might want to keep an eye on the E column so that it doesn't fill up too quickly, and adjust the young generation size of your Java app accordingly. As a long-term solution, though, you might want to spend some time finding out which code uses a lot of resources and improving it.
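The changes between samples described above can also be computed instead of read by eye. Here is a minimal sketch, assuming the ten-column gcutil layout shown in this article (newer JDKs print a different set of columns). The class and method names are made up for illustration, and the two rows come from the pid 12345 example earlier in this article, not from the screenshot.

```java
class GcUtilDelta {

    // One parsed row of `jstat -gcutil` output with the ten columns
    // S0 S1 E O P YGC YGCT FGC FGCT GCT.
    static final class Sample {
        final double old;   // O: old space utilization in percent
        final long ygc;     // YGC: number of young generation GC events
        final double ygct;  // YGCT: cumulative young generation GC time in seconds
        final long fgc;     // FGC: number of full GC events

        Sample(String row) {
            String[] f = row.trim().split("\\s+");
            if (f.length != 10) {
                throw new IllegalArgumentException("expected 10 columns: " + row);
            }
            old = Double.parseDouble(f[3]);
            ygc = Long.parseLong(f[5]);
            ygct = Double.parseDouble(f[6]);
            fgc = Long.parseLong(f[7]);
        }
    }

    // Seconds spent in young generation collections between two samples.
    static double youngGcSeconds(Sample before, Sample after) {
        return after.ygct - before.ygct;
    }

    public static void main(String[] args) {
        Sample before = new Sample("10.08 0.00 84.99 69.22 59.49 122328 4380.327 355 43.146 4423.474");
        Sample after = new Sample("0.00 15.62 0.00 69.24 59.49 122329 4380.351 355 43.146 4423.497");

        System.out.println("young collections: " + (after.ygc - before.ygc));
        System.out.println("young gc seconds:  " + youngGcSeconds(before, after));
        System.out.println("full collections:  " + (after.fgc - before.fgc));
        System.out.println("old space change:  " + (after.old - before.old) + " percentage points");
    }
}
```

For these two rows that works out to one young collection taking roughly 0.024 seconds, no new full collection, and a small increase in old space usage.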

option gc

(screenshot: jstat -gc output)

The screenshot above shows jstat run with the gc option against Java process id 28276, sampling at an interval of 1 second. The output has 15 columns, which are explained below.
Column	Description
S0C	Current survivor space 0 capacity (KB).
S1C	Current survivor space 1 capacity (KB).
S0U	Survivor space 0 utilization (KB).
S1U	Survivor space 1 utilization (KB).
EC	Current eden space capacity (KB).
EU	Eden space utilization (KB).
OC	Current old space capacity (KB).
OU	Old space utilization (KB).
PC	Current permanent space capacity (KB).
PU	Permanent space utilization (KB).
YGC	Number of young generation GC events.
YGCT	Young generation garbage collection time.
FGC	Number of full GC events.
FGCT	Full garbage collection time.
GCT	Total garbage collection time.

These statistics show capacity and usage in kilobytes. The first ten columns are straightforward: each space's capacity and its current utilization. The last five columns are the same as the last five columns of gcutil. Notice that when the EU value gets close to the EC value, a young generation collection happens and surviving objects are moved to the survivor spaces. Notice also that the OU column grows gradually. These statistics are almost the same as those of gcutil, except that they are displayed in kilobytes, whereas gcutil displays percentages.
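A similar capacity and usage view is available from inside a running application through the standard memory pool beans. Here is a minimal sketch. The class name PoolUsage is made up, and treating the committed size as roughly comparable to the capacity columns of jstat is my own assumption, not something taken from the jstat documentation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class PoolUsage {

    // Formats one memory pool as "name used=<n>KB committed=<n>KB".
    static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
        if (usage == null) {
            return pool.getName() + " used=n/a committed=n/a";
        }
        return String.format("%-30s used=%dKB committed=%dKB",
                pool.getName(), usage.getUsed() / 1024, usage.getCommitted() / 1024);
    }

    public static void main(String[] args) {
        // Pool names depend on the JVM and the garbage collector in use,
        // so do not rely on any particular name appearing here.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool));
        }
    }
}
```

This prints one line per pool, covering both heap and non-heap pools, so you can watch the eden and old pools from within the application much as you would watch EU and OU in jstat.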

option gccapacity

(screenshot: jstat -gccapacity output)

The screenshot above shows jstat run with the gccapacity option against Java process id 13080, sampling at an interval of 1 second. The output has 16 columns, which are explained below.
Column	Description
NGCMN	Minimum new generation capacity (KB).
NGCMX	Maximum new generation capacity (KB).
NGC	Current new generation capacity (KB).
S0C	Current survivor space 0 capacity (KB).
S1C	Current survivor space 1 capacity (KB).
EC	Current eden space capacity (KB).
OGCMN	Minimum old generation capacity (KB).
OGCMX	Maximum old generation capacity (KB).
OGC	Current old generation capacity (KB).
OC	Current old space capacity (KB).
PGCMN	Minimum permanent generation capacity (KB).
PGCMX	Maximum permanent generation capacity (KB).
PGC	Current permanent generation capacity (KB).
PC	Current permanent space capacity (KB).
YGC	Number of young generation GC events.
FGC	Number of full GC events.

This output is similar to the output of the gc option, but adds the minimum and maximum capacity for each generation of the Java heap.

That's it for this article. I will leave three links for your reference.

http://www.cubrid.org/blog/dev-platform/how-to-monitor-java-garbage-collection/
http://docs.oracle.com/javase/7/docs/technotes/tools/share/jstat.html
http://oracle-base.com/articles/misc/monitoring-java-garbage-collection-using-jstat.php