Friday, July 1, 2016

Yet another sstable corruption - EOFException

While starting up an Apache Cassandra 1.2 instance, I noticed the following error in the log.

 INFO 10:38:23,334 Opening /var/lib/cassandra/data/MYKEYSPACE/MYCOLUMNFAMILY/MYKEYSPACE-COLUMNFAMILY-hf-2508 (2275767 bytes)  
 ERROR 10:38:23,467 Exception in thread Thread[SSTableBatchOpen:2,5,RMI Runtime]  
 org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException  
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:108)  
    at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:63)  
    at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:42)  
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:418)  
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:209)  
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:157)  
    at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:273)  
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)  
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)  
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)  
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)  
    at java.lang.Thread.run(Thread.java:745)  
 Caused by: java.io.EOFException  
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)  
    at java.io.DataInputStream.readUTF(DataInputStream.java:589)  
    at java.io.DataInputStream.readUTF(DataInputStream.java:564)  
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:83)  
     ... 11 more  

As you may have noticed, the sstable version is hf, which belongs to Cassandra 1.1. This node had just been upgraded to Cassandra 1.2 and this was its first boot on 1.2.

Tracing the stack trace above against the Cassandra 1.2 source code, it turns out that the compression metadata cannot be opened because the file is corrupted. I tried nodetool upgradesstables, scrub, and restarting the Cassandra instance, but the error persisted. I guess nothing can really help in this case, so I ended up stopping the Cassandra instance, removing this data sstable together with its metadata files, and starting the instance up again. The error was gone, and I then ran a repair.
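
The removal step can be sketched in Python. This is a minimal sketch, not a Cassandra tool: it assumes every file of one sstable shares the filename prefix seen in the log above (ending in the version and generation, here hf-2508), and it moves the files to a quarantine directory rather than deleting them. The function names and the quarantine directory are hypothetical, and it should only be run while the Cassandra instance is stopped.

```python
import shutil
from pathlib import Path


def sstable_components(data_dir, sstable_prefix):
    """Return all files of one sstable, e.g. prefix 'MYKEYSPACE-COLUMNFAMILY-hf-2508'."""
    # The trailing '-' keeps generation 2508 from also matching 25080, 25081, ...
    return sorted(Path(data_dir).glob(sstable_prefix + "-*"))


def quarantine(components, quarantine_dir):
    """Move the given files out of the data directory and return their new paths."""
    target = Path(quarantine_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in components:
        destination = target / path.name
        shutil.move(str(path), str(destination))
        moved.append(destination)
    return moved
```

Moving instead of deleting keeps the corrupted files around in case they are needed for a later investigation.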

I hope you find this useful in your situation too.

Sunday, June 19, 2016

Investigating apache cassandra 1.2 jmx metrics connection type warn logging

Recently I got the opportunity to upgrade a production Cassandra cluster from 1.1.12 to 1.2.19. In the midst of the upgrade, I noticed the following in the log file during the boot-up of a Cassandra 1.2 instance.

 WARN 10:30:14,987 Error processing org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=Timeouts
 javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=Timeouts
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at com.yammer.metrics.reporting.JmxReporter.registerBean(JmxReporter.java:462)
    at com.yammer.metrics.reporting.JmxReporter.processMeter(JmxReporter.java:412)
    at com.yammer.metrics.reporting.JmxReporter.processMeter(JmxReporter.java:16)
    at com.yammer.metrics.core.Meter.processWith(Meter.java:131)
    at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
    at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
    at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
    at com.yammer.metrics.core.MetricsRegistry.newMeter(MetricsRegistry.java:240)
    at com.yammer.metrics.Metrics.newMeter(Metrics.java:245)
    at org.apache.cassandra.metrics.ConnectionMetrics.<init>(ConnectionMetrics.java:106)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.<init>(OutboundTcpConnectionPool.java:53)
    at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:493)
    at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:507)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:640)
    at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:614)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:59)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

As the logging level is WARN, I did not worry that much. Going into the code, it turns out that Cassandra 1.2 adds a set of metrics known as ConnectionMetrics. The metric in this warning is registered under the domain org.apache.cassandra.metrics, with type Connection and name Timeouts. It is not available in Cassandra 1.1.
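
To make the naming explicit, here is a minimal Python sketch that splits the MBean name from the WARN line into its domain and key properties. It is only an illustration: the function name is made up, and it handles plain names like the one in the log, not quoted values or wildcard patterns.

```python
def parse_object_name(object_name):
    """Split 'domain:key=value,key=value' into (domain, {key: value})."""
    domain, _, properties = object_name.partition(":")
    keys = {}
    for pair in properties.split(","):
        key, _, value = pair.partition("=")
        keys[key] = value
    return domain, keys


domain, keys = parse_object_name(
    "org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=Timeouts"
)
print(domain)         # org.apache.cassandra.metrics
print(keys["type"])   # Connection
print(keys["scope"])  # 1.2.3.4
print(keys["name"])   # Timeouts
```

The scope property carries the address of the peer node, which is why there is one such warning per metric and per peer.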

The same applies to CommandPendingTasks, ResponseCompletedTasks, ResponsePendingTasks, and CommandCompletedTasks.


 WARN 10:38:58,079 Error processing org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=CommandPendingTasks
 javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=CommandPendingTasks
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at com.yammer.metrics.reporting.JmxReporter.registerBean(JmxReporter.java:462)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:438)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:16)
    at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
    at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
    at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
    at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
    at com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
    at com.yammer.metrics.Metrics.newGauge(Metrics.java:70)
    at org.apache.cassandra.metrics.ConnectionMetrics.<init>(ConnectionMetrics.java:71)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.<init>(OutboundTcpConnectionPool.java:53)
    at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:493)
    at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:507)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:640)
    at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:614)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:59)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

 WARN 07:52:19,882 Error processing org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=ResponseCompletedTasks
 javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=ResponseCompletedTasks
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at com.yammer.metrics.reporting.JmxReporter.registerBean(JmxReporter.java:462)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:438)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:16)
    at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
    at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
    at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
    at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
    at com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
    at com.yammer.metrics.Metrics.newGauge(Metrics.java:70)
    at org.apache.cassandra.metrics.ConnectionMetrics.<init>(ConnectionMetrics.java:99)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.<init>(OutboundTcpConnectionPool.java:53)
    at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:493)
    at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:507)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:640)
    at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:614)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:59)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

 WARN 09:06:07,059 Error processing org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=ResponsePendingTasks
 javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=ResponsePendingTasks
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at com.yammer.metrics.reporting.JmxReporter.registerBean(JmxReporter.java:462)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:438)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:16)
    at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
    at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
    at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
    at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
    at com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
    at com.yammer.metrics.Metrics.newGauge(Metrics.java:70)
    at org.apache.cassandra.metrics.ConnectionMetrics.<init>(ConnectionMetrics.java:92)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.<init>(OutboundTcpConnectionPool.java:53)
    at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:493)
    at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:507)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:640)
    at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:614)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:59)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

 WARN 02:13:09,861 Error processing org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=CommandCompletedTasks
 javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=Connection,scope=1.2.3.4,name=CommandCompletedTasks
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
    at com.yammer.metrics.reporting.JmxReporter.registerBean(JmxReporter.java:462)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:438)
    at com.yammer.metrics.reporting.JmxReporter.processGauge(JmxReporter.java:16)
    at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
    at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
    at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
    at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
    at com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
    at com.yammer.metrics.Metrics.newGauge(Metrics.java:70)
    at org.apache.cassandra.metrics.ConnectionMetrics.<init>(ConnectionMetrics.java:78)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.<init>(OutboundTcpConnectionPool.java:53)
    at org.apache.cassandra.net.MessagingService.getConnectionPool(MessagingService.java:493)
    at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:507)
    at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:640)
    at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:614)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:59)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

This is indeed nothing to worry about. Once you are done upgrading all the nodes in the cluster, do another round of restarts and this type of warning will disappear.
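
To confirm that the warnings really are gone after the final restart, the log can be scanned for them. The following is a rough sketch only; the regular expression is written against the WARN lines shown above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the WARN lines shown above and captures the metric name.
WARN_PATTERN = re.compile(
    r"WARN .*Error processing org\.apache\.cassandra\.metrics:type=Connection,"
    r"scope=[^,]+,name=(\w+)"
)


def count_connection_metric_warnings(log_lines):
    """Tally the ConnectionMetrics warnings per metric name."""
    counts = Counter()
    for line in log_lines:
        match = WARN_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

An empty result for the log written after the last restart means none of these warnings were logged again.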


Saturday, June 18, 2016

yet another failed upgrade to cassandra virtual nodes

Recently I was assigned a project to upgrade Cassandra from 1.1 to 1.2 (I know it is an ancient Cassandra, but who cares? We just want it to work, and Cassandra delivers just that). One of the main features of Cassandra 1.2 is virtual nodes.

Although there is a red warning note in this instruction, I took some time to investigate it anyway, knowing that we had not enabled any bleeding-edge features or customized the Cassandra code in-house. If you are picking a Cassandra 1.2 release for your upgrade and you want to try the virtual nodes upgrade as well, choose a version below 1.2.19. Why? Read here: https://github.com/apache/cassandra/blob/cassandra-1.2.19/NEWS.txt#L19-L23

I started a three-node Cassandra 1.2.18 cluster in a sandbox environment, where I could safely test the upgrade from 1.1 to 1.2 and, after that, the upgrade to virtual nodes.

 [user@localhost ~]$ sudo ./cassandra-shuffle -h 127.0.0.2 -p 7210 create
 Token                                      From            To
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~
 73107539768170373009709388315418951678     127.0.0.4       127.0.0.3
 169033493463981801837600797832317151914    127.0.0.2       127.0.0.3
 136467407567251362951457524855448709801    127.0.0.2       127.0.0.3
 133808951575681531205649910734888020649    127.0.0.2       127.0.0.3
 75544457760442718776699701259266250066     127.0.0.4       127.0.0.3


 [user@localhost ~]$ sudo ./cassandra-shuffle -h 127.0.0.2 -p 7210 enable
 [user@localhost ~]$ sudo ./cassandra-shuffle -h 127.0.0.2 -p 7210 ls
 Token                                      Endpoint        Requested at
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 159285821494892418769639546056927958356    127.0.0.3       Tue Feb 02 16:12:41 MYT 2016
 91938269708456681209179988336057166505     127.0.0.3       Tue Feb 02 16:12:41 MYT 2016
 74436767763955288882613195375699296254     127.0.0.3       Tue Feb 02 16:12:41 MYT 2016
 103901321670520924065314251878580267688    127.0.0.3       Tue Feb 02 16:12:41 MYT 2016

 [user@localhost ~]$ apache-cassandra-1.2.18/bin/nodetool -h 127.0.0.2 -p 7210 status
 Datacenter: datacenter1
 ===================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
 UN  127.0.0.2  20.01 GB  256     100.0%            ba12301a-2e3f-49d2-bb5b-e125e91fcd1b  rack1
 UN  127.0.0.3  20.01 GB  1       100.0%            e09705bf-01b1-423a-863c-c425a7796e51  rack1
 UN  127.0.0.4  20.01 GB  1       100.0%            937f97ce-a55f-4785-8636-809456123a63  rack1
 [user@localhost ~]$ apache-cassandra-1.2.18/bin/nodetool -h 127.0.0.2 -p 7210 ring

 Datacenter: datacenter1
 ==========
 Replicas: 257

 Address    Rack  Status  State   Load      Owns     Token
                                                     169919645461171745752870002539170714965
 127.0.0.2  1e    Up      Normal  20.01 GB  100.00%  0
 127.0.0.3  1e    Up      Normal  20.01 GB  100.00%  56713727820156410577229101238628035242
 127.0.0.4  1e    Up      Normal  20.01 GB  100.00%  113427455640312821154458202477256070485
 127.0.0.2  1e    Up      Normal  20.01 GB  100.00%  113648993639610307133275503653969461247
 127.0.0.2  1e    Up      Normal  20.01 GB  100.00%  113870531638907793112092804830682852010
 127.0.0.2  1e    Up      Normal  20.01 GB  100.00%  114092069638205279090910106007396242772
 127.0.0.2  1e    Up      Normal  20.01 GB  100.00%  114313607637502765069727407184109633535
 127.0.0.2  1e    Up      Normal  20.01 GB  100.00%  114535145636800251048544708360823024297
 127.0.0.2  1e    Up      Normal  20.01 GB  100.00%  114756683636097737027362009537536415060
 127.0.0.2  1e    Up      Normal  20.01 GB  100.00%  114978221635395223006179310714249805823
 ...
 ...


 shuffling ongoing
 [user@localhost ~]$ sudo ./cassandra-shuffle -h 127.0.0.2 -p 7210 ls | wc -l
 767
 [user@localhost ~]$ sudo ./cassandra-shuffle -h 127.0.0.2 -p 7210 ls | wc -l
 764

 [user@localhost ~]$ apache-cassandra-1.2.18/bin/nodetool -h 127.0.0.2 -p 7210 netstats
 Mode: RELOCATING
 Not sending any streams.
 Not receiving any streams.
 Read Repair Statistics:
 Attempted: 0
 Mismatch (Blocking): 0
 Mismatch (Background): 0
 Pool Name   Active  Pending  Completed
 Commands    n/a     0        154
 Responses   n/a     0        3310
 [user@localhost ~]$ apache-cassandra-1.2.18/bin/nodetool -h 127.0.0.2 -p 7210 compactionstats
 pending tasks: 0
 Active compaction remaining time :    n/a
 [user@localhost ~]$

As you can read above, I created a shuffle and enabled it. The token count on the first node changed to 256 and the shuffle count started coming down. I thought, hey man, this can actually work! Happily, I announced to the team that it looked like we would be able to migrate to Cassandra vnodes.

However, the next morning, when I checked on the upgrade process, oh gosh, it seemed to have gone into a loop.

 WARN [ScheduledRangeXfers:0] 2016-02-03 05:14:29,594 ScheduledRangeTransferExecutorService.java (line 120) Pausing until token count stabilizes (target=256, actual=282)
 WARN [ScheduledRangeXfers:0] 2016-02-03 05:14:30,836 ScheduledRangeTransferExecutorService.java (line 120) Pausing until token count stabilizes (target=256, actual=282)
 WARN [ScheduledRangeXfers:0] 2016-02-03 05:14:32,667 ScheduledRangeTransferExecutorService.java (line 120) Pausing until token count stabilizes (target=256, actual=282)
 WARN [ScheduledRangeXfers:0] 2016-02-03 05:14:33,339 ScheduledRangeTransferExecutorService.java (line 120) Pausing until token count stabilizes (target=256, actual=282)
 WARN [ScheduledRangeXfers:0] 2016-02-03 05:14:34,582 ScheduledRangeTransferExecutorService.java (line 120) Pausing until token count stabilizes (target=256, actual=282)

    // Excerpt from ScheduledRangeTransferExecutorService.java; the start of this method is not shown.
    if (res.size() < 1)
    {
        LOG.debug("No queued ranges to transfer");
        return;
    }

    // Do nothing while the node holds too many tokens.
    if (!isReady())
        return;

    UntypedResultSet.Row row = res.iterator().next();

    Date requestedAt = row.getTimestamp("requested_at");
    ByteBuffer tokenBytes = row.getBytes("token_bytes");
    Token token = StorageService.getPartitioner().getTokenFactory().fromByteArray(tokenBytes);

    LOG.info("Initiating transfer of {} (scheduled at {})", token, requestedAt.toString());
    try
    {
        StorageService.instance.relocateTokens(Collections.singleton(token));
    }
    catch (Exception e)
    {
        LOG.error("Error removing {}: {}", token, e);
    }
    finally
    {
        LOG.debug("Removing queued entry for transfer of {}", token);
        processInternal(String.format("DELETE FROM system.%s WHERE token_bytes = '%s'",
                                      SystemTable.RANGE_XFERS_CF,
                                      ByteBufferUtil.bytesToHex(tokenBytes)));
    }
}

private boolean isReady()
{
    int targetTokens = DatabaseDescriptor.getNumTokens();
    // High-water mark: the target plus 10%, rounded up.
    int highMark = (int)Math.ceil(targetTokens + (targetTokens * .10));
    int actualTokens = StorageService.instance.getTokens().size();

    if (actualTokens >= highMark)
    {
        LOG.warn("Pausing until token count stabilizes (target={}, actual={})", targetTokens, actualTokens);
        return false;
    }

    return true;
}
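
The arithmetic in isReady() above can be replayed with the numbers from the log. This is a small Python sketch of the same calculation, not Cassandra code; the function names are mine.

```python
import math


def high_mark(target_tokens):
    # Mirrors: (int) Math.ceil(targetTokens + (targetTokens * .10))
    return int(math.ceil(target_tokens + target_tokens * 0.10))


def is_ready(target_tokens, actual_tokens):
    # Mirrors isReady(): transfers pause once the actual count reaches the high mark.
    return actual_tokens < high_mark(target_tokens)


print(high_mark(256))      # 282
print(is_ready(256, 282))  # False
print(is_ready(256, 281))  # True
```

With num_tokens at 256 the high mark comes out at 282, exactly the actual count in the WARN lines. As far as I can tell from the excerpt, every scheduled run then returns at the isReady() check before relocating anything, so nothing changes the token count and the same warning repeats.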

The shuffle count stayed at 744, so unfortunately we have to stay with non-vnodes. If you have had a successful virtual nodes upgrade, please leave a comment below describing the version path you took and the shuffle steps you followed to upgrade your C* cluster to vnodes.

I will end this article with the steps I took. If you intend to move to vnodes, I suggest you don't waste time on an in-place upgrade and instead spin up a new cluster, since more and more changes cannot be done as an upgrade. Two that come to mind now are the partitioner (random to murmur3) and vnodes.


  • stop automatic cassandra maintenance.
  • make sure the data is consistent.
  • make sure ALL SSTABLE VERSIONS ARE IC.


1. change cassandra.yaml on all servers
set num_tokens: 256
leave initial_token empty
rolling restart all servers

2. cassandra-shuffle create

3. cassandra-shuffle enable

4. cassandra-shuffle ls

5. periodic checks.
check in log,
check in nodetool netstats
user@localhost ~$ sudo ./cassandra-shuffle -h 127.0.0.2 -p 7210 ls | wc -l
759
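
The periodic check in step 5 can be sketched in Python. This is only an illustration under a few assumptions: the text passed in is the output of cassandra-shuffle ls in the format shown earlier, tokens are non-negative integers (as with the random partitioner), and both function names are hypothetical.

```python
def pending_relocations(ls_output):
    """Count token rows in `cassandra-shuffle ls` output, skipping header and ruler lines."""
    count = 0
    for line in ls_output.splitlines():
        fields = line.split()
        # A data row starts with the token, which is all digits here.
        if fields and fields[0].isdigit():
            count += 1
    return count


def is_stalled(counts, window=3):
    """True when the last `window` samples are identical and still above zero."""
    recent = counts[-window:]
    return len(recent) == window and recent[0] > 0 and len(set(recent)) == 1
```

Unlike wc -l, pending_relocations does not count the header and ruler lines. Feeding is_stalled a list of samples such as 767, 764, 759, 744, 744, 744 would flag the stall that I only noticed the next morning.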

Friday, June 17, 2016

trying out aggregations on mongodb

Today, we will again look into mongodb but on the specific topic of aggregation.

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods.
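
To make the pipeline idea concrete, here is a plain-Python sketch of a two-stage pipeline: group documents by a key and sum a field, then keep only the groups above a threshold. It does not talk to MongoDB; it only mimics what a $group stage followed by a $match stage computes. The sample documents and the function name are made up, with state and pop fields like the zip code data used in the rest of this post.

```python
from collections import defaultdict

# Hypothetical sample documents, not real zip code data.
zipcodes = [
    {"city": "A", "state": "XX", "pop": 6000000},
    {"city": "B", "state": "XX", "pop": 5000000},
    {"city": "C", "state": "YY", "pop": 4000000},
]


def total_pop_by_state(docs, minimum):
    # Stage 1, like $group with _id "$state" and $sum of "$pop".
    totals = defaultdict(int)
    for doc in docs:
        totals[doc["state"]] += doc["pop"]
    # Stage 2, like $match with totalPop $gte minimum.
    return [
        {"_id": state, "totalPop": total}
        for state, total in totals.items()
        if total >= minimum
    ]


print(total_pop_by_state(zipcodes, 10 * 1000 * 1000))
# [{'_id': 'XX', 'totalPop': 11000000}]
```

Each stage only sees the output of the previous one, which is why the filter can refer to totalPop, a field that exists only after grouping.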

Let's start with a sample aggregation on the zip code data set. Importing 29353 objects took less than a second. Blazing fast, maybe because I'm using an SSD, heh.

 user@localhost:~$ mongoimport --collection zipcodes < ~/Desktop/zips.json   
 connected to: 127.0.0.1  
 Tue Mar 8 20:21:37.950 check 9 29353  
 Tue Mar 8 20:21:38.099 imported 29353 objects  
   
 > db.zipcodes.aggregate( [ { $group: { _id: "$state", totalPop: { $sum: "$pop" } } }, { $match: { totalPop: { $gte: 10*1000*1000 } } } ] )  
 {  
    "result" : [  
       {  
          "_id" : "IL",  
          "totalPop" : 11427576  
       },  
       {  
          "_id" : "OH",  
          "totalPop" : 10846517  
       },  
       {  
          "_id" : "FL",  
          "totalPop" : 12686644  
       },  
       {  
          "_id" : "NY",  
          "totalPop" : 17990402  
       },  
       {  
          "_id" : "PA",  
          "totalPop" : 11881643  
       },  
       {  
          "_id" : "TX",  
          "totalPop" : 16984601  
       },  
       {  
          "_id" : "CA",  
          "totalPop" : 29754890  
       }  
    ],  
    "ok" : 1  
 }  
 > db.zipcodes.aggregate( [ { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } }, { $group: { _id: "$_id.state", avgCityPop: { $avg: "$pop" } } } ] )  
 {  
    "result" : [  
       {  
          "_id" : "NH",  
          "avgCityPop" : 5232.320754716981  
       },  
       {  
          "_id" : "MA",  
          "avgCityPop" : 14855.37037037037  
       },  
       {  
          "_id" : "ME",  
          "avgCityPop" : 3006.4901960784314  
       },  
       {  
          "_id" : "NY",  
          "avgCityPop" : 13131.680291970803  
       },  
       {  
          "_id" : "VT",  
          "avgCityPop" : 2315.8765432098767  
       },  
       {  
          "_id" : "PA",  
          "avgCityPop" : 8679.067202337472  
       },  
       {  
          "_id" : "DE",  
          "avgCityPop" : 14481.91304347826  
       },  
       {  
          "_id" : "DC",  
          "avgCityPop" : 303450  
       },  
       {  
          "_id" : "VA",  
          "avgCityPop" : 8526.177931034483  
       },  
       {  
          "_id" : "SC",  
          "avgCityPop" : 11139.626198083068  
       },  
       {  
          "_id" : "FL",  
          "avgCityPop" : 27400.958963282937  
       },  
       {  
          "_id" : "AL",  
          "avgCityPop" : 7907.2152641878665  
       },  
       {  
          "_id" : "NJ",  
          "avgCityPop" : 15775.89387755102  
       },  
       {  
          "_id" : "WV",  
          "avgCityPop" : 2771.4775888717154  
       },  
       {  
          "_id" : "TN",  
          "avgCityPop" : 9656.350495049504  
       },  
       {  
          "_id" : "OH",  
          "avgCityPop" : 12700.839578454332  
       },  
       {  
          "_id" : "MD",  
          "avgCityPop" : 12615.775725593667  
       },  
       {  
          "_id" : "MN",  
          "avgCityPop" : 5372.21375921376  
       },  
       {  
          "_id" : "ND",  
          "avgCityPop" : 1645.0309278350514  
       },  
       {  
          "_id" : "NC",  
          "avgCityPop" : 10622.815705128205  
       },  
       {  
          "_id" : "MT",  
          "avgCityPop" : 2593.987012987013  
       },  
       {  
          "_id" : "IL",  
          "avgCityPop" : 9954.334494773519  
       },  
       {  
          "_id" : "MO",  
          "avgCityPop" : 5672.195338512764  
       },  
       {  
          "_id" : "KS",  
          "avgCityPop" : 3819.884259259259  
       },  
       {  
          "_id" : "LA",  
          "avgCityPop" : 10465.496277915632  
       },  
       {  
          "_id" : "AR",  
          "avgCityPop" : 4175.355239786856  
       },  
       {  
          "_id" : "CO",  
          "avgCityPop" : 9981.075757575758  
       },  
       {  
          "_id" : "IN",  
          "avgCityPop" : 9271.130434782608  
       },  
       {  
          "_id" : "KY",  
          "avgCityPop" : 4767.164721141375  
       },  
       {  
          "_id" : "OK",  
          "avgCityPop" : 6155.743639921722  
       },  
       {  
          "_id" : "ID",  
          "avgCityPop" : 4320.811158798283  
       },  
       {  
          "_id" : "WY",  
          "avgCityPop" : 3384.5373134328356  
       },  
       {  
          "_id" : "UT",  
          "avgCityPop" : 9518.508287292818  
       },  
       {  
          "_id" : "NV",  
          "avgCityPop" : 18209.590909090908  
       },  
       {  
          "_id" : "NE",  
          "avgCityPop" : 3034.882692307692  
       },  
       {  
          "_id" : "RI",  
          "avgCityPop" : 19292.653846153848  
       },  
       {  
          "_id" : "NM",  
          "avgCityPop" : 5872.360465116279  
       },  
       {  
          "_id" : "CA",  
          "avgCityPop" : 27756.42723880597  
       },  
       {  
          "_id" : "AZ",  
          "avgCityPop" : 20591.16853932584  
       },  
       {  
          "_id" : "HI",  
          "avgCityPop" : 15831.842857142858  
       },  
       {  
          "_id" : "IA",  
          "avgCityPop" : 3123.0821147356583  
       },  
       {  
          "_id" : "MS",  
          "avgCityPop" : 7524.023391812865  
       },  
       {  
          "_id" : "WI",  
          "avgCityPop" : 7323.00748502994  
       },  
       {  
          "_id" : "TX",  
          "avgCityPop" : 13775.02108678021  
       },  
       {  
          "_id" : "SD",  
          "avgCityPop" : 1839.6746031746031  
       },  
       {  
          "_id" : "MI",  
          "avgCityPop" : 12087.512353706112  
       },  
       {  
          "_id" : "GA",  
          "avgCityPop" : 11547.62210338681  
       },  
       {  
          "_id" : "OR",  
          "avgCityPop" : 8262.561046511628  
       },  
       {  
          "_id" : "CT",  
          "avgCityPop" : 14674.625  
       },  
       {  
          "_id" : "WA",  
          "avgCityPop" : 12258.670025188916  
       },  
       {  
          "_id" : "AK",  
          "avgCityPop" : 2976.4918032786886  
       }  
    ],  
    "ok" : 1  
 }  
 > db.zipcodes.aggregate( [ { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } }, { $sort: { pop: 1 } }, { $group: { _id : "$_id.state", biggestCity: { $last: "$_id.city" }, biggestPop:  { $last: "$pop" }, smallestCity: { $first: "$_id.city" }, smallestPop: { $first: "$pop" } } }, { $project: { _id: 0, state: "$_id", biggestCity: { name: "$biggestCity", pop: "$biggestPop" }, smallestCity: { name: "$smallestCity", pop: "$smallestPop" } } } ] )  
 {  
    "result" : [  
       {  
          "biggestCity" : {  
             "name" : "NEWARK",  
             "pop" : 111674  
          },  
          "smallestCity" : {  
             "name" : "BETHEL",  
             "pop" : 108  
          },  
          "state" : "DE"  
       },  
       {  
          "biggestCity" : {  
             "name" : "SAINT LOUIS",  
             "pop" : 397802  
          },  
          "smallestCity" : {  
             "name" : "BENDAVIS",  
             "pop" : 44  
          },  
          "state" : "MO"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CHICAGO",  
             "pop" : 2452177  
          },  
          "smallestCity" : {  
             "name" : "ANCONA",  
             "pop" : 38  
          },  
          "state" : "IL"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CLEVELAND",  
             "pop" : 536759  
          },  
          "smallestCity" : {  
             "name" : "ISLE SAINT GEORG",  
             "pop" : 38  
          },  
          "state" : "OH"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MANCHESTER",  
             "pop" : 106452  
          },  
          "smallestCity" : {  
             "name" : "WEST NOTTINGHAM",  
             "pop" : 27  
          },  
          "state" : "NH"  
       },  
       {  
          "biggestCity" : {  
             "name" : "WASHINGTON",  
             "pop" : 606879  
          },  
          "smallestCity" : {  
             "name" : "PENTAGON",  
             "pop" : 21  
          },  
          "state" : "DC"  
       },  
       {  
          "biggestCity" : {  
             "name" : "GRAND FORKS",  
             "pop" : 59527  
          },  
          "smallestCity" : {  
             "name" : "TROTTERS",  
             "pop" : 12  
          },  
          "state" : "ND"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BALTIMORE",  
             "pop" : 733081  
          },  
          "smallestCity" : {  
             "name" : "ANNAPOLIS JUNCTI",  
             "pop" : 32  
          },  
          "state" : "MD"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MINNEAPOLIS",  
             "pop" : 344719  
          },  
          "smallestCity" : {  
             "name" : "JOHNSON",  
             "pop" : 12  
          },  
          "state" : "MN"  
       },  
       {  
          "biggestCity" : {  
             "name" : "SALT LAKE CITY",  
             "pop" : 186346  
          },  
          "smallestCity" : {  
             "name" : "MODENA",  
             "pop" : 9  
          },  
          "state" : "UT"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CHEYENNE",  
             "pop" : 70185  
          },  
          "smallestCity" : {  
             "name" : "LOST SPRINGS",  
             "pop" : 6  
          },  
          "state" : "WY"  
       },  
       {  
          "biggestCity" : {  
             "name" : "PHOENIX",  
             "pop" : 890853  
          },  
          "smallestCity" : {  
             "name" : "HUALAPAI",  
             "pop" : 2  
          },  
          "state" : "AZ"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BRIDGEPORT",  
             "pop" : 141638  
          },  
          "smallestCity" : {  
             "name" : "EAST KILLINGLY",  
             "pop" : 25  
          },  
          "state" : "CT"  
       },  
       {  
          "biggestCity" : {  
             "name" : "SEATTLE",  
             "pop" : 520096  
          },  
          "smallestCity" : {  
             "name" : "BENGE",  
             "pop" : 2  
          },  
          "state" : "WA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BIRMINGHAM",  
             "pop" : 242606  
          },  
          "smallestCity" : {  
             "name" : "ALLEN",  
             "pop" : 0  
          },  
          "state" : "AL"  
       },  
       {  
          "biggestCity" : {  
             "name" : "LAS VEGAS",  
             "pop" : 597557  
          },  
          "smallestCity" : {  
             "name" : "TUSCARORA",  
             "pop" : 1  
          },  
          "state" : "NV"  
       },  
       {  
          "biggestCity" : {  
             "name" : "OMAHA",  
             "pop" : 358930  
          },  
          "smallestCity" : {  
             "name" : "LAKESIDE",  
             "pop" : 5  
          },  
          "state" : "NE"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MIAMI",  
             "pop" : 825232  
          },  
          "smallestCity" : {  
             "name" : "CECIL FIELD NAS",  
             "pop" : 0  
          },  
          "state" : "FL"  
       },  
       {  
          "biggestCity" : {  
             "name" : "SIOUX FALLS",  
             "pop" : 102046  
          },  
          "smallestCity" : {  
             "name" : "ZEONA",  
             "pop" : 8  
          },  
          "state" : "SD"  
       },  
       {  
          "biggestCity" : {  
             "name" : "HOUSTON",  
             "pop" : 2095918  
          },  
          "smallestCity" : {  
             "name" : "FULTON",  
             "pop" : 0  
          },  
          "state" : "TX"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MILWAUKEE",  
             "pop" : 597324  
          },  
          "smallestCity" : {  
             "name" : "CLAM LAKE",  
             "pop" : 2  
          },  
          "state" : "WI"  
       },  
       {  
          "biggestCity" : {  
             "name" : "JACKSON",  
             "pop" : 204788  
          },  
          "smallestCity" : {  
             "name" : "CHUNKY",  
             "pop" : 79  
          },  
          "state" : "MS"  
       },  
       {  
          "biggestCity" : {  
             "name" : "DES MOINES",  
             "pop" : 148155  
          },  
          "smallestCity" : {  
             "name" : "DOUDS",  
             "pop" : 15  
          },  
          "state" : "IA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "HONOLULU",  
             "pop" : 396643  
          },  
          "smallestCity" : {  
             "name" : "NINOLE",  
             "pop" : 0  
          },  
          "state" : "HI"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CHARLOTTE",  
             "pop" : 465833  
          },  
          "smallestCity" : {  
             "name" : "GLOUCESTER",  
             "pop" : 0  
          },  
          "state" : "NC"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BILLINGS",  
             "pop" : 78805  
          },  
          "smallestCity" : {  
             "name" : "MOSBY",  
             "pop" : 7  
          },  
          "state" : "MT"  
       },  
       {  
          "biggestCity" : {  
             "name" : "TULSA",  
             "pop" : 389072  
          },  
          "smallestCity" : {  
             "name" : "SOUTHARD",  
             "pop" : 8  
          },  
          "state" : "OK"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BOISE",  
             "pop" : 165522  
          },  
          "smallestCity" : {  
             "name" : "KEUTERVILLE",  
             "pop" : 0  
          },  
          "state" : "ID"  
       },  
       {  
          "biggestCity" : {  
             "name" : "INDIANAPOLIS",  
             "pop" : 348868  
          },  
          "smallestCity" : {  
             "name" : "WESTPOINT",  
             "pop" : 145  
          },  
          "state" : "IN"  
       },  
       {  
          "biggestCity" : {  
             "name" : "LOUISVILLE",  
             "pop" : 288058  
          },  
          "smallestCity" : {  
             "name" : "BROWDER",  
             "pop" : 0  
          },  
          "state" : "KY"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BURLINGTON",  
             "pop" : 39127  
          },  
          "smallestCity" : {  
             "name" : "UNIV OF VERMONT",  
             "pop" : 0  
          },  
          "state" : "VT"  
       },  
       {  
          "biggestCity" : {  
             "name" : "PHILADELPHIA",  
             "pop" : 1610956  
          },  
          "smallestCity" : {  
             "name" : "HAMILTON",  
             "pop" : 0  
          },  
          "state" : "PA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "NEW ORLEANS",  
             "pop" : 496937  
          },  
          "smallestCity" : {  
             "name" : "FORDOCHE",  
             "pop" : 0  
          },  
          "state" : "LA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "COLUMBIA",  
             "pop" : 269521  
          },  
          "smallestCity" : {  
             "name" : "QUINBY",  
             "pop" : 0  
          },  
          "state" : "SC"  
       },  
       {  
          "biggestCity" : {  
             "name" : "WICHITA",  
             "pop" : 295115  
          },  
          "smallestCity" : {  
             "name" : "ARNOLD",  
             "pop" : 0  
          },  
          "state" : "KS"  
       },  
       {  
          "biggestCity" : {  
             "name" : "NEWARK",  
             "pop" : 275572  
          },  
          "smallestCity" : {  
             "name" : "IMLAYSTOWN",  
             "pop" : 17  
          },  
          "state" : "NJ"  
       },  
       {  
          "biggestCity" : {  
             "name" : "HUNTINGTON",  
             "pop" : 75343  
          },  
          "smallestCity" : {  
             "name" : "MOUNT CARBON",  
             "pop" : 0  
          },  
          "state" : "WV"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MEMPHIS",  
             "pop" : 632837  
          },  
          "smallestCity" : {  
             "name" : "ALLRED",  
             "pop" : 2  
          },  
          "state" : "TN"  
       },  
       {  
          "biggestCity" : {  
             "name" : "VIRGINIA BEACH",  
             "pop" : 385080  
          },  
          "smallestCity" : {  
             "name" : "WALLOPS ISLAND",  
             "pop" : 0  
          },  
          "state" : "VA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "ANCHORAGE",  
             "pop" : 183987  
          },  
          "smallestCity" : {  
             "name" : "CHEVAK",  
             "pop" : 0  
          },  
          "state" : "AK"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BROOKLYN",  
             "pop" : 2300504  
          },  
          "smallestCity" : {  
             "name" : "RAQUETTE LAKE",  
             "pop" : 0  
          },  
          "state" : "NY"  
       },  
       {  
          "biggestCity" : {  
             "name" : "DENVER",  
             "pop" : 451182  
          },  
          "smallestCity" : {  
             "name" : "CHEYENNE MTN AFB",  
             "pop" : 0  
          },  
          "state" : "CO"  
       },  
       {  
          "biggestCity" : {  
             "name" : "DETROIT",  
             "pop" : 963243  
          },  
          "smallestCity" : {  
             "name" : "LELAND",  
             "pop" : 0  
          },  
          "state" : "MI"  
       },  
       {  
          "biggestCity" : {  
             "name" : "PORTLAND",  
             "pop" : 518543  
          },  
          "smallestCity" : {  
             "name" : "KENT",  
             "pop" : 0  
          },  
          "state" : "OR"  
       },  
       {  
          "biggestCity" : {  
             "name" : "ATLANTA",  
             "pop" : 609591  
          },  
          "smallestCity" : {  
             "name" : "FORT STEWART",  
             "pop" : 0  
          },  
          "state" : "GA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CRANSTON",  
             "pop" : 176404  
          },  
          "smallestCity" : {  
             "name" : "CLAYVILLE",  
             "pop" : 45  
          },  
          "state" : "RI"  
       },  
       {  
          "biggestCity" : {  
             "name" : "ALBUQUERQUE",  
             "pop" : 449584  
          },  
          "smallestCity" : {  
             "name" : "ALGODONES",  
             "pop" : 0  
          },  
          "state" : "NM"  
       },  
       {  
          "biggestCity" : {  
             "name" : "LOS ANGELES",  
             "pop" : 2102295  
          },  
          "smallestCity" : {  
             "name" : "TWIN BRIDGES",  
             "pop" : 0  
          },  
          "state" : "CA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "LITTLE ROCK",  
             "pop" : 192895  
          },  
          "smallestCity" : {  
             "name" : "TOMATO",  
             "pop" : 0  
          },  
          "state" : "AR"  
       },  
       {  
          "biggestCity" : {  
             "name" : "WORCESTER",  
             "pop" : 169856  
          },  
          "smallestCity" : {  
             "name" : "BUCKLAND",  
             "pop" : 16  
          },  
          "state" : "MA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "PORTLAND",  
             "pop" : 63268  
          },  
          "smallestCity" : {  
             "name" : "BUSTINS ISLAND",  
             "pop" : 0  
          },  
          "state" : "ME"  
       }  
    ],  
    "ok" : 1  
 }  
 >   
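
To make the four stages of the last pipeline easier to follow, here is a minimal Python sketch that mimics them in memory. It is not how MongoDB executes the query; the sample documents, the state codes XX and YY, and the function name biggest_and_smallest_city are all made up for illustration.

```python
from collections import defaultdict

# Made-up documents shaped like the zipcodes collection (not real data).
SAMPLE_ZIPCODES = [
    {"_id": "00001", "city": "ALPHA", "state": "XX", "pop": 100},
    {"_id": "00002", "city": "ALPHA", "state": "XX", "pop": 50},
    {"_id": "00003", "city": "BETA", "state": "XX", "pop": 20},
    {"_id": "00004", "city": "GAMMA", "state": "XX", "pop": 500},
    {"_id": "00005", "city": "DELTA", "state": "YY", "pop": 7},
]


def biggest_and_smallest_city(docs):
    # Stage 1 ($group): sum "pop" over all zip codes of each (state, city).
    city_pop = defaultdict(int)
    for doc in docs:
        city_pop[(doc["state"], doc["city"])] += doc["pop"]

    # Stage 2 ($sort): order the cities by population, ascending.
    ordered = sorted(city_pop.items(), key=lambda item: item[1])

    # Stage 3 ($group): per state, the first city seen is the smallest
    # ($first) and the last city seen is the biggest ($last).
    per_state = {}
    for (state, city), pop in ordered:
        entry = per_state.setdefault(
            state, {"smallestCity": {"name": city, "pop": pop}}
        )
        entry["biggestCity"] = {"name": city, "pop": pop}

    # Stage 4 ($project): reshape into one document per state.
    return [
        {
            "state": state,
            "biggestCity": entry["biggestCity"],
            "smallestCity": entry["smallestCity"],
        }
        for state, entry in per_state.items()
    ]


if __name__ == "__main__":
    for row in biggest_and_smallest_city(SAMPLE_ZIPCODES):
        print(row)
```

The point to notice is that the sort has to come before the second grouping. Only then do the first and last city seen for a state correspond to the smallest and the biggest one.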
   

All the queries in the examples work. It's amazing that all three queries returned results within a second! This is only a short article to convince you to use aggregation in MongoDB, and if you have been convinced, you should really try the following useful links too.


https://docs.mongodb.org/manual/core/map-reduce/

https://docs.mongodb.org/manual/reference/aggregation/

Good luck!

Sunday, June 5, 2016

java collection framework

Once I was asked by a company what the data structures in Java are, and I was not prepared at all. But as usual, why bother remembering every detail when we can google and start reading? It turned out that the answer they were seeking was the Java Collections Framework, and then the next question came: what are the characteristics of the collections?

Well, to be really honest, who is going to remember every detail when we can read the javadoc? Anyway, I recently found this chart circulating on Facebook, which reminded me of the questions asked. I thought it would be helpful. We should not have to memorize every fine detail; the essential point is that you know where to get the material and are willing to share it.

So here goes!


This is a short article, and I hope you find the chart more useful as a daily coding reference than as a way to answer some funny questions. Haha!





Saturday, June 4, 2016

First look into sector/sphere

Wikipedia excerpt

Sector/Sphere is an open source software suite for high-performance distributed data storage and processing. It can be broadly compared to Google's GFS/MapReduce stack. Sector is a distributed file system targeting data storage over a large number of commodity computers. Sphere is the programming framework that supports massive in-storage parallel data processing for data stored in Sector. Additionally, Sector/Sphere is unique in its ability to operate in a wide area network (WAN) setting.

Today, we will take a look at another big data technology, Sector/Sphere. Let's download the source here.

 user@localhost:~/Desktop$ tar -xf sector.2.8.tar   
 user@localhost:~/Desktop$ cd sector/  
 user@localhost:~/Desktop/sector$ ls  
 total 116K  
 -rw-r--r-- 1 user user 5.5K Feb 3 2012 rpm.spec  
 -rw-r--r-- 1 user user 2.2K Feb 3 2012 release_note.txt  
 -rw-r--r-- 1 user user 509 Feb 3 2012 NOTICE.txt  
 -rw-r--r-- 1 user user 506 Feb 3 2012 Makefile.common  
 -rw-r--r-- 1 user user 279 Feb 3 2012 Makefile  
 -rw-r--r-- 1 user user 286 Feb 3 2012 README.txt  
 -rw-r--r-- 1 user user 12K Feb 3 2012 LICENSE.txt  
 -rw-r--r-- 1 user user  0 Feb 3 2012 sector.dev.log  
 -rw-r--r-- 1 user user 6.4K Feb 3 2012 README.stable_branch.txt  
 -rw-r--r-- 1 user user 7.3K Feb 3 2012 LOG  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 client  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 common  
 drwxr-xr-x 4 user user 4.0K Feb 3 2012 doc  
 drwxr-xr-x 3 user user 4.0K Feb 3 2012 examples  
 drwxr-xr-x 4 user user 4.0K Feb 3 2012 fuse  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 gmp  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 include  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 lib  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 master  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 security  
 drwxr-xr-x 3 user user 4.0K Feb 3 2012 slave  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 test  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 tools  
 drwxr-xr-x 2 user user 4.0K Feb 3 2012 udt  
 drwxr-xr-x 3 user user 4.0K Feb 3 2012 conf  
 user@localhost:~/Desktop/sector$ find .  
 .  
 ./client  
 ./client/fscache_unittest.cpp  
 ./client/fscache_unittest  
 ./client/fscache.cpp  
 ./client/clientmgmt.cpp  
 ./client/client.h  
 ./client/config_unittest.cpp  
 ./client/client.cpp  
 ./client/Makefile  
 ./client/client_conf.cpp  
 ./client/dbclient.h  
 ./client/fscache.h  
 ./client/dcclient.cpp  
 ./client/dcclient.h  
 ./client/clientmgmt.h  
 ./client/fsclient.cpp  
 ./client/client.cpp.orig  
 ./client/fsclient.h  
 ./release_note.txt  
 ./tools  
 ./tools/fsck.cpp  
 ./tools/mv.cpp  
 ./tools/cp.cpp  
 ./tools/download.cpp  
 ./tools/sysinfo.cpp  
 ./tools/pipe.cpp  
 ./tools/stat.cpp  
 ./tools/shutdown.cpp  
 ./tools/probot.h  
 ./tools/mkdir.cpp  
 ./tools/upload.cpp  
 ./tools/replicate.cpp  
 ./tools/probot.cpp  
 ./tools/stream.cpp  
 ./tools/Makefile  
 ./tools/ls.cpp  
 ./tools/rm.cpp  
 ./LOG  
 ./Makefile.common  
 ./slave  
 ./slave/start_slave.cpp  
 ./slave/serv_db.cpp  
 ./slave/sphere  
 ./slave/sphere/streamhash.cpp  
 ./slave/sphere/.svn  
 ./slave/sphere/.svn/entries  
 ./slave/sphere/.svn/tmp  
 ./slave/sphere/.svn/tmp/prop-base  
 ./slave/sphere/.svn/tmp/props  
 ./slave/sphere/.svn/tmp/text-base  
 ./slave/sphere/.svn/prop-base  
 ./slave/sphere/.svn/props  
 ./slave/sphere/.svn/text-base  
 ./slave/sphere/.svn/text-base/streamhash.cpp.svn-base  
 ./slave/sphere/.svn/text-base/Makefile.svn-base  
 ./slave/sphere/.svn/all-wcprops  
 ./slave/sphere/.svn/format  
 ./slave/sphere/Makefile  
 ./slave/config_unittest.cpp  
 ./slave/serv_file.cpp  
 ./slave/Makefile  
 ./slave/serv_spe.cpp  
 ./slave/slave.cpp  
 ./slave/slave_conf.cpp  
 ./slave/slave.h  
 ./conf  
 ./conf/topology.conf  
 ./conf/client.conf  
 ./conf/users  
 ./conf/users/root  
 ./conf/users/.svn  
 ./conf/users/.svn/entries  
 ./conf/users/.svn/tmp  
 ./conf/users/.svn/tmp/prop-base  
 ./conf/users/.svn/tmp/props  
 ./conf/users/.svn/tmp/text-base  
 ./conf/users/.svn/prop-base  
 ./conf/users/.svn/props  
 ./conf/users/.svn/text-base  
 ./conf/users/.svn/text-base/root.svn-base  
 ./conf/users/.svn/text-base/test.svn-base  
 ./conf/users/.svn/text-base/anonymous.svn-base  
 ./conf/users/.svn/all-wcprops  
 ./conf/users/.svn/format  
 ./conf/users/anonymous  
 ./conf/users/test  
 ./conf/slave.conf  
 ./conf/slaves.list  
 ./conf/replica.conf  
 ./conf/masters.list  
 ./conf/master_acl.conf  
 ./conf/master.conf  
 ./conf/slave_acl.conf  
 ./fuse  
 ./fuse/test2  
 ./fuse/fusedircache.cpp  
 ./fuse/sector-fuse.cpp  
 ./fuse/fusedircache.h  
 ./fuse/result  
 ./fuse/sectorfs.h  
 ./fuse/Makefile  
 ./fuse/sectorfs.cpp  
 ./fuse/test  
 ./rpm.spec  
 ./NOTICE.txt  
 ./lib  
 ./sector.dev.log  
 ./LICENSE.txt  
 ./common  
 ./common/ssltransport.cpp  
 ./common/dhash.cpp  
 ./common/transaction.cpp  
 ./common/conf_unittest.cpp  
 ./common/sfopt.cpp  
 ./common/tcptransport.h  
 ./common/conf.cpp  
 ./common/snode.cpp  
 ./common/topology.h  
 ./common/writelog_unittest.cpp  
 ./common/log.cpp  
 ./common/writelog.cpp  
 ./common/osportable.cpp  
 ./common/stat.cpp  
 ./common/ssltransport.h  
 ./common/strconcat.h  
 ./common/datachn.cpp  
 ./common/replica_conf.h  
 ./common/dhash.h  
 ./common/tcptransport.cpp  
 ./common/topology_unittest.cpp  
 ./common/conf.h  
 ./common/transport.h  
 ./common/threadpool.cpp  
 ./common/udttransport.cpp  
 ./common/stat.h  
 ./common/log.h  
 ./common/sphere.cpp  
 ./common/localfs.h  
 ./common/crypto_unittest.cpp  
 ./common/writelog.h  
 ./common/topology.cpp  
 ./common/index.cpp  
 ./common/format.h  
 ./common/meta.cpp  
 ./common/replica_conf.cpp  
 ./common/Makefile  
 ./common/meta.h  
 ./common/routing.cpp  
 ./common/index.h  
 ./common/constant.cpp  
 ./common/crypto.h  
 ./common/transaction.h  
 ./common/routing.h  
 ./common/crypto_unittest  
 ./common/message.h  
 ./common/udttransport.h  
 ./common/index_unittest.cpp  
 ./common/transaction_unittest.cpp  
 ./common/test  
 ./common/memobj.cpp  
 ./common/datachn.h  
 ./common/crypto.cpp  
 ./common/message.cpp  
 ./common/threadpool.h  
 ./master  
 ./master/start_master.cpp  
 ./master/user.h  
 ./master/start_all.cpp  
 ./master/master.h  
 ./master/config_unittest  
 ./master/master_conf.cpp  
 ./master/master.cpp  
 ./master/replica.h  
 ./master/slavemgmt.h  
 ./master/replica_unittest.cpp  
 ./master/Makefile.orig  
 ./master/replica.cpp  
 ./master/config_unittest.cpp  
 ./master/Makefile  
 ./master/replica_unittest  
 ./master/user.cpp  
 ./master/slavemgmt_unittest.cpp  
 ./master/slavemgmt.cpp  
 ./master/stop_all.cpp  
 ./master/res.save  
 ./udt  
 ./udt/cache.h  
 ./udt/epoll.cpp  
 ./udt/buffer.h  
 ./udt/md5.cpp  
 ./udt/md5.h  
 ./udt/api.cpp  
 ./udt/ccc.cpp  
 ./udt/channel.cpp  
 ./udt/packet.cpp  
 ./udt/udt.h  
 ./udt/packet.h  
 ./udt/queue.h  
 ./udt/cache.cpp  
 ./udt/channel.h  
 ./udt/list.h  
 ./udt/window.h  
 ./udt/window.cpp  
 ./udt/Makefile  
 ./udt/ccc.h  
 ./udt/common.cpp  
 ./udt/queue.cpp  
 ./udt/list.cpp  
 ./udt/api.h  
 ./udt/core.h  
 ./udt/common.h  
 ./udt/buffer.cpp  
 ./udt/core.cpp  
 ./udt/epoll.h  
 ./Makefile  
 ./README.txt  
 ./security  
 ./security/security.h  
 ./security/filesrc.h  
 ./security/security.cpp  
 ./security/sserver.cpp  
 ./security/filesrc.cpp  
 ./security/Makefile  
 ./security/filesrc_unittest.cpp  
 ./security/ssl_cert_gen.cpp  
 ./doc  
 ./doc/hlp  
 ./doc/hlp/ix_endm.gif  
 ./doc/hlp/ix_open.gif  
 ./doc/hlp/ix_endp.gif  
 ./doc/hlp/ix_end.gif  
 ./doc/hlp/ix_listm.gif  
 ./doc/hlp/ix_listp.gif  
 ./doc/hlp/ix_list.gif  
 ./doc/hlp/ix_link.gif  
 ./doc/hlp/ix_up.gif  
 ./doc/hlp/ix_down.gif  
 ./doc/hlp/ix_leaf.gif  
 ./doc/hlp/ix_space.gif  
 ./doc/hlp/ix_book.gif  
 ./doc/hlp/ix_line.gif  
 ./doc/main.htm  
 ./doc/doc  
 ./doc/doc/r-c-client-rmr.htm  
 ./doc/doc/r-c-client-createsf.htm  
 ./doc/doc/r-c-sphereprocess-checkreduceprogress.htm  
 ./doc/doc/t-make.htm  
 ./doc/doc/r-c-sphereprocess-run_mr.htm  
 ./doc/doc/r-c-sinput.htm  
 ./doc/doc/t-config-client.htm  
 ./doc/doc/r-c-sectorfile-upload.htm  
 ./doc/doc/g-dfs.htm  
 ./doc/doc/header.htm  
 ./doc/doc/r-c-sectorfile.htm  
 ./doc/doc/r-c-client-remove.htm  
 ./doc/doc/r-c-client-releasesp.htm  
 ./doc/doc/r-c-client-shutdown.htm  
 ./doc/doc/intro.htm  
 ./doc/doc/t-ec2.htm  
 ./doc/doc/r-c-client-mkdir.htm  
 ./doc/doc/r-c-sphereprocess-read.htm  
 ./doc/doc/guide.htm  
 ./doc/doc/r-c-client-createsp.htm  
 ./doc/doc/installation.htm  
 ./doc/doc/t-config-fuse.htm  
 ./doc/doc/r-c-address.htm  
 ./doc/doc/t-config-slave.htm  
 ./doc/doc/r-c-sectorfile-download.htm  
 ./doc/doc/r-c-sectorfile-close.htm  
 ./doc/doc/r-c-client-move.htm  
 ./doc/doc/r-t-shutdown.htm  
 ./doc/doc/r-c-sectorfile-write.htm  
 ./doc/doc/r-c-sphereprocess-setminunitsize.htm  
 ./doc/doc/r-c-spherestream-init.htm  
 ./doc/doc/r-c-sphereprocess-checkmapprogress.htm  
 ./doc/doc/r-c-sfile.htm  
 ./doc/doc/r-t-stream.htm  
 ./doc/doc/r-c-sectorfile-seekg.htm  
 ./doc/doc/r-c-soutput-resizeresbuf.htm  
 ./doc/doc/g-psphere.htm  
 ./doc/doc/r-t-stat.htm  
 ./doc/doc/r-t-rm.htm  
 ./doc/doc/r-c-spherestream.htm  
 ./doc/doc/r-c-sphereprocess-setdatamoveattr.htm  
 ./doc/doc/r-c-client.htm  
 ./doc/doc/treeview.css  
 ./doc/doc/r-c-snode.htm  
 ./doc/doc/t-cert.htm  
 ./doc/doc/t-password.htm  
 ./doc/doc/r-c-client-stat.htm  
 ./doc/doc/reference.htm  
 ./doc/doc/r-c-sphereprocess-run.htm  
 ./doc/doc/r-c-client-sysinfo.htm  
 ./doc/doc/r-c-sphereprocess-checkprogress.htm  
 ./doc/doc/images  
 ./doc/doc/images/g-spe_clip_image002.gif  
 ./doc/doc/images/g-spe_clip_image005.gif  
 ./doc/doc/images/.svn  
 ./doc/doc/images/.svn/entries  
 ./doc/doc/images/.svn/tmp  
 ./doc/doc/images/.svn/tmp/prop-base  
 ./doc/doc/images/.svn/tmp/props  
 ./doc/doc/images/.svn/tmp/text-base  
 ./doc/doc/images/.svn/prop-base  
 ./doc/doc/images/.svn/prop-base/sector-arch.jpg.svn-base  
 ./doc/doc/images/.svn/prop-base/g-spe_clip_image002.gif.svn-base  
 ./doc/doc/images/.svn/prop-base/g-spe_clip_image005.gif.svn-base  
 ./doc/doc/images/.svn/props  
 ./doc/doc/images/.svn/text-base  
 ./doc/doc/images/.svn/text-base/sector-arch.jpg.svn-base  
 ./doc/doc/images/.svn/text-base/g-spe_clip_image002.gif.svn-base  
 ./doc/doc/images/.svn/text-base/g-spe_clip_image005.gif.svn-base  
 ./doc/doc/images/.svn/all-wcprops  
 ./doc/doc/images/.svn/format  
 ./doc/doc/images/sector-arch.jpg  
 ./doc/doc/r-c-sectorfile-flush.htm  
 ./doc/doc/r-c-sectorfile-read.htm  
 ./doc/doc/r-c-sectorfile-utime.htm  
 ./doc/doc/r-c-sphereprocess-loadoperator.htm  
 ./doc/doc/r-c-client-list.htm  
 ./doc/doc/r-t-ls.htm  
 ./doc/doc/r-c-sectorfile-seekp.htm  
 ./doc/doc/r-c-sphereprocess-setprocnumpernode.htm  
 ./doc/doc/g-psector.htm  
 ./doc/doc/r-c-client-releasesf.htm  
 ./doc/doc/r-t-upload.htm  
 ./doc/doc/r-t-sysinfo.htm  
 ./doc/doc/g-mr.htm  
 ./doc/doc/copy.htm  
 ./doc/doc/r-c-sectorfile-tellp.htm  
 ./doc/doc/r-c-sysstat.htm  
 ./doc/doc/ecode.htm  
 ./doc/doc/r-tools.htm  
 ./doc/doc/t-config-master.htm  
 ./doc/doc/r-c-client-close.htm  
 ./doc/doc/r-c-sectorfile-eof.htm  
 ./doc/doc/r-t-download.htm  
 ./doc/doc/t-advanced.htm  
 ./doc/doc/r-c-sphereprocess-close.htm  
 ./doc/doc/r-c-client-login.htm  
 ./doc/doc/r-t-mkdir.htm  
 ./doc/doc/r-c-sectorfile-tellg.htm  
 ./doc/doc/r-c-soutput-resizeidxbuf.htm  
 ./doc/doc/r-c-sphereprocess-waitforcompletion.htm  
 ./doc/doc/footer.htm  
 ./doc/doc/r-c-sphereprocess.htm  
 ./doc/doc/r-c-sphereprocess-setmaxunitsize.htm  
 ./doc/doc/g-cdn.htm  
 ./doc/doc/r-c-client-logout.htm  
 ./doc/doc/r-c-sphereresult.htm  
 ./doc/doc/g-spe.htm  
 ./doc/doc/r-classes.htm  
 ./doc/doc/r-c-sectorfile-open.htm  
 ./doc/doc/g-udf.htm  
 ./doc/doc/r-t-cp.htm  
 ./doc/doc/sectordoc.css  
 ./doc/doc/r-c-soutput.htm  
 ./doc/doc/r-t-mv.htm  
 ./doc/doc/r-c-client-init.htm  
 ./doc/doc/r-c-spherestream-setoutputpath.htm  
 ./doc/doc/t-config-security.htm  
 ./doc/index.htm  
 ./include  
 ./include/osportable.h  
 ./include/sector.h  
 ./include/sphere.h  
 ./test  
 ./test/test-framework.sh  
 ./test/testfile.cpp  
 ./test/nohup.out  
 ./test/send_dbg_cmd.cpp  
 ./test/stress.cpp  
 ./test/iotest.cpp  
 ./test/Makefile  
 ./test/sanity.sh  
 ./test/recovery.sh  
 ./examples  
 ./examples/README  
 ./examples/wordcount.cpp  
 ./examples/testdc.cpp  
 ./examples/mrword.cpp  
 ./examples/testfs.cpp  
 ./examples/text.idx.cpp  
 ./examples/Makefile  
 ./examples/mrsort.cpp  
 ./examples/funcs  
 ./examples/funcs/sorthash.cpp  
 ./examples/funcs/mr_word.cpp  
 ./examples/funcs/wordbucket.cpp  
 ./examples/funcs/mr_sort.cpp  
 ./examples/funcs/sort.cpp  
 ./examples/funcs/Makefile  
 ./examples/funcs/gen_idx.cpp  
 ./examples/funcs/randwriter.cpp  
 ./README.stable_branch.txt  
 ./gmp  
 ./gmp/gmp.cpp  
 ./gmp/prec.h  
 ./gmp/gmp.vcproj  
 ./gmp/gmp.h  
 ./gmp/Makefile  
 ./gmp/prec.cpp  
 ./gmp/message.h  
 ./gmp/message.cpp  

As you can see above, we downloaded Sector 2.8 and extracted it. The content consists of the client, tools, slave, conf, fuse, common, master, udt, security, doc, test, gmp and examples components.

For the compile instructions, the documentation I referred to can be found here.

 $ make  
 make[1]: Entering directory '/home/user/Desktop/sector/udt'  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX md5.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX common.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX window.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX list.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX buffer.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX packet.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX channel.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX queue.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX core.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX cache.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX epoll.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX api.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra -D IA32 -D LINUX ccc.cpp -c  
 g++ -shared -o libudt.so md5.o common.o window.o list.o buffer.o packet.o channel.o queue.o core.o cache.o epoll.o api.o ccc.o -lstdc++ -lpthread -lssl -lcrypto -L. -L../lib   
 /usr/bin/ld: cannot find -lssl  
 /usr/bin/ld: cannot find -lcrypto  
 collect2: error: ld returned 1 exit status  
 Makefile:25: recipe for target 'libudt.so' failed  
 make[1]: *** [libudt.so] Error 1  
 make[1]: Leaving directory '/home/user/Desktop/sector/udt'  
 Makefile:5: recipe for target 'subdirs' failed  
 make: *** [subdirs] Error 2  
 user@localhost:~/Desktop/sector$   

Oops... it looks like the SSL library was never installed on my system.

 user@localhost:~/Desktop/sector$ sudo apt-get install libssl-dev   
 Reading package lists... Done  
 Building dependency tree      
 Reading state information... Done  
 The following packages were automatically installed and are no longer required:  
  gstreamer0.10-nice gstreamer1.0-clutter libappstream-glib7 libasm3-java libcamel-1.2-52 libdbusmenu-qt5-2 libebook-contacts-1.2-1 libecal-1.2-18 libedata-cal-1.2-27 libedataserver-1.2-20  
  libept1.4.16 libfarstream-0.1-0 libgegl-0.2-0 libgsf-1-114 libgsf-1-common libgsoap7 libisl13 libkf5bookmarks-data libkf5bookmarks5 libkf5kiofilewidgets5 libkf5notifications-data  
  libkf5notifications5 libkf5solid5 libkf5solid5-data libphonon4qt5-4 librarian0 libraw10 libruby2.1 libvncserver1 libvte-2.90-9 libvte-2.90-common libx264-146:i386 libx265-59  
  libx265-59:i386 linux-image-4.1.0-2-amd64 phonon4qt5 phonon4qt5-backend-vlc python-dbus-dev python-zeitgeist rarian-compat vlc-plugin-samba  
 Use 'sudo apt autoremove' to remove them.  
 The following additional packages will be installed:  
  libssl-doc  
 The following NEW packages will be installed:  
  libssl-dev libssl-doc  
 0 upgraded, 2 newly installed, 0 to remove and 287 not upgraded.  
 Need to get 2,788 kB of archives.  
 After this operation, 10.3 MB of additional disk space will be used.  
 Do you want to continue? [Y/n] Y  
 Get:1 http://ftp.us.debian.org/debian testing/main amd64 libssl-dev amd64 1.0.2f-2 [1,538 kB]  
 Get:2 http://ftp.us.debian.org/debian testing/main amd64 libssl-doc all 1.0.2f-2 [1,250 kB]                                                   
 Fetched 2,788 kB in 8s (312 kB/s)                                                                                
 Selecting previously unselected package libssl-dev:amd64.  
 (Reading database ... 224709 files and directories currently installed.)  
 Preparing to unpack .../libssl-dev_1.0.2f-2_amd64.deb ...  
 Unpacking libssl-dev:amd64 (1.0.2f-2) ...  
 Selecting previously unselected package libssl-doc.  
 Preparing to unpack .../libssl-doc_1.0.2f-2_all.deb ...  
 Unpacking libssl-doc (1.0.2f-2) ...  
 Processing triggers for man-db (2.7.5-1) ...  
 Setting up libssl-dev:amd64 (1.0.2f-2) ...  
 Setting up libssl-doc (1.0.2f-2) ...  

Let's compile it again.

 user@localhost:~/Desktop/sector$ make  
 make[1]: Entering directory '/home/user/Desktop/sector/udt'  
 g++ -shared -o libudt.so md5.o common.o window.o list.o buffer.o packet.o channel.o queue.o core.o cache.o epoll.o api.o ccc.o -lstdc++ -lpthread -lssl -lcrypto -L. -L../lib   
 ar -rcs libudt.a md5.o common.o window.o list.o buffer.o packet.o channel.o queue.o core.o cache.o epoll.o api.o ccc.o  
 make[1]: Leaving directory '/home/user/Desktop/sector/udt'  
 make[1]: Entering directory '/home/user/Desktop/sector/udt'  
 mv libudt.so libudt.a ../lib  
 make[1]: Leaving directory '/home/user/Desktop/sector/udt'  
 make[1]: Entering directory '/home/user/Desktop/sector/common'  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra osportable.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra conf.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra sfopt.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra snode.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra meta.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra index.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra memobj.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra transaction.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra topology.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra log.cpp -c  
 log.cpp: In member function ‘void logger::LogAggregate::setLogLevel(logger::LogLevel)’:  
 log.cpp:404:24: warning: variable ‘newLogger’ set but not used [-Wunused-but-set-variable]  
    log_mapped_value_t newLogger;  
             ^  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra sphere.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra constant.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra crypto.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra dhash.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra routing.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra udttransport.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra ssltransport.cpp -c  
 g++ -fPIC -DMULTITHREADED_LOGGER -I. -I../include -I../udt -g -DDEBUG -Wall -Wextra tcptransport.cpp -c  
 tcptransport.cpp: In member function ‘virtual int sector::TCPTransport::close()’:  
 tcptransport.cpp:157:11: error: ‘::close’ has not been declared  
   return ::close(m_iSocket);  
       ^  
 tcptransport.cpp:157:11: note: suggested alternative:  
 In file included from ../include/sector.h:31:0,  
          from tcptransport.cpp:33:  
 ../udt/udt.h:310:13: note:  ‘UDT::close’  
  UDT_API int close(UDTSOCKET u);  
        ^  
 Makefile:30: recipe for target 'tcptransport.o' failed  
 make[1]: *** [tcptransport.o] Error 1  
 make[1]: Leaving directory '/home/user/Desktop/sector/common'  
 Makefile:5: recipe for target 'subdirs' failed  
 make: *** [subdirs] Error 2  

It's a pity that the compilation failed, and that is definitely a blocker for new people trying to pick up this great software. If you develop in or know C++, please leave a comment explaining how to make this compilation work. Otherwise, if you want to know more about this software, this is another useful link.
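For the `::close` error above, one likely cause (an assumption, since I have not patched the Sector source myself) is that tcptransport.cpp never includes `<unistd.h>`, the POSIX header that declares the global `close()`. Older compilers often pulled it in indirectly through other headers, newer GCC versions do not. The sketch below reproduces the same pattern with a hypothetical `DemoTransport` class standing in for `sector::TCPTransport`; `demoCloseResult` is also a made-up helper for illustration.

```cpp
// Sketch: a class with its own close() member calling the global POSIX close().
#include <unistd.h>  // declares ::close(int) and ::pipe(int[2])
#include <cassert>

// Hypothetical stand-in for sector::TCPTransport; not the real class.
class DemoTransport {
public:
    explicit DemoTransport(int fd) : m_iSocket(fd) {}

    // Same shape as the failing line: the leading :: selects the global
    // close(), not this member function.
    int close() {
        const int rc = ::close(m_iSocket);
        m_iSocket = -1;  // avoid closing the same descriptor twice
        return rc;
    }

private:
    int m_iSocket;
};

// Uses a pipe as a stand-in descriptor so the demo needs no network.
// Returns what DemoTransport::close() returned (0 on success).
int demoCloseResult() {
    int fds[2];
    const int ok = ::pipe(fds);
    assert(ok == 0);
    (void)ok;  // silence the unused warning when NDEBUG disables assert
    DemoTransport t(fds[0]);
    ::close(fds[1]);  // close the write end directly
    return t.close();
}
```

If you remove the `#include <unistd.h>` line, g++ should fail with the same "has not been declared" message, so adding that include near the top of tcptransport.cpp is probably the first thing to try.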

Friday, June 3, 2016

First look into ECL (enterprise control language)

It has been a while since I last blogged actively, due to family matters.
But I hope things will go more smoothly from here so I can continue what I
like best: learning information technology and contributing back to the
open source community.

Today, we will take a look at another big data technology. ECL, or Enterprise
Control Language,

is a declarative, data centric programming language designed in 2000 to allow a team of programmers to process big data across a high performance computing cluster without the programmer being involved in many of the lower level, imperative decisions.[1][2]

As this article is only meant as an introduction, we will just go through the documentation officially available from HPCC Systems, LexisNexis Risk Solutions. To get acquainted with ECL quickly, download a virtual image from this link. This virtual machine comes preconfigured with an HPCC system and ECL, ready to play with.

I chose the image of the current version, the gold release, for a 64-bit CPU. Next, you need to install VirtualBox on your PC in order to run this virtual image. In the past, I have described many times how to install VirtualBox via apt-get.

One thing to pay attention to when getting the downloaded virtual image running on VirtualBox: HPCC Systems requires two network adapters, so make sure you have them configured correctly. You don't have to create a new virtual machine; just open the File menu and choose 'Import Appliance'. Then navigate to the downloaded image and import it. Next, power on the virtual machine. You should see something similar to the following.




Open your browser and point it to http://<your hpcc system ip address>:8010/#/stub/ECL . Click on
ECL in the top menu bar and then, in the submenu below, click on Playground. There are some examples
of how to model the data store, insert the data and then query the data. If you want more explanation,
you can read this link.



If ECL interests you, you should really read the programmer's guide here. HPCC Systems offers a lot of documentation, which can be found here.

That's it for this learning experience. I must say it is very easy to set up and quickly learn what ECL is, compared to the previous Hadoop system. If you are looking into big data analytics, ECL might be a good option to begin with.