Showing posts with label blob. Show all posts
Showing posts with label blob. Show all posts

Sunday, May 11, 2014

Store video on cassandra and using hector streaming IO

Today, we are going to learn how to stream in and stream out using hector-client.  There are two classes implemented in hector-client which storing binary in chunk and reading binary in chunk. That's pretty neat! The two mentioned classes are

ChunkOutputStream storing binary as blog into cassandra.
ChunkInputStream read binary as blog from cassandra.

Below is a test case coded with the two classes to show how to store and read data using the two mentioned class.
import static org.junit.Assert.*;

import java.io.IOException;
import java.util.Arrays;

import me.prettyprint.cassandra.connection.HConnectionManager;
import me.prettyprint.cassandra.io.ChunkInputStream;
import me.prettyprint.cassandra.io.ChunkOutputStream;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.cassandra.service.ThriftCluster;
import me.prettyprint.cassandra.service.ThriftKsDef;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;

import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.KsDef;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;

public class HectorStreamTest {

private Keyspace keyspace;
private ThriftCluster cassandraCluster;
private CassandraHostConfigurator cassandraHostConfigurator;
protected HConnectionManager connectionManager;
public static KeyspaceDefinition KEYSPACE_DEV;
public final static String KEYSPACE = "TestKeyspace";
public final static String BLOB_CF = "Blob";
public final static CfDef BLOB_CF_DEF = new CfDef(KEYSPACE, BLOB_CF);

@BeforeClass
public static void setUpBeforeClass() throws Exception {
}

@AfterClass
public static void tearDownAfterClass() throws Exception {
}

@Before
public void setUp() throws Exception {
cassandraHostConfigurator = new CassandraHostConfigurator(
"192.168.0.2:9160");
connectionManager = new HConnectionManager("just4fun",
cassandraHostConfigurator);

KEYSPACE_DEV = new ThriftKsDef(new KsDef(KEYSPACE,
"org.apache.cassandra.locator.SimpleStrategy",
Arrays.asList(new CfDef[] { BLOB_CF_DEF })));
((ThriftKsDef) KEYSPACE_DEV).setReplicationFactor(1);
cassandraCluster = new ThriftCluster("just4fun",
cassandraHostConfigurator);

keyspace = HFactory.createKeyspace(KEYSPACE, cassandraCluster);

cassandraCluster.addKeyspace(KEYSPACE_DEV, true);
}

@After
public void tearDown() throws Exception {
cassandraCluster.dropKeyspace(KEYSPACE);
}

@Test
public void testWriteAndReadStream() throws IOException {
byte[] value = "hello world, store and read binary as a chunk of blob in and from cassandra.".getBytes();

// write to cassandra.
ChunkOutputStream<String> out = new ChunkOutputStream<String>(keyspace, BLOB_CF, "row1", StringSerializer.get(), 2);
out.write(value);
out.close();

// read from cassandra.
ChunkInputStream<String> in = new ChunkInputStream<String>(keyspace, BLOB_CF, "row1", StringSerializer.get());
int i = -1;
int written = 0;

while ((i = in.read()) != -1) {
assertSame(value[written++], (byte) i);
byte[] b = {(byte)i};
System.out.print(new String(b));
}

in.close();
}

}

Keyspace TestKeyspace is created and table Blob is use to write and read the blob data. The main point is probably on the chunk size in ChunkOutputStream, it is set to 2 but you can give another even number to store the byte. Remember, each byte is represented by two hexadecimal characters, see cqlsh output below for more information. The test method testWriteAndReadStream() store the data in variable value using ChunkOutputStream.write() and remember to close it so that the data actually flush to cassandra or else it will stay in client code. To read from cassandra, is by specifing the row to ChunkInputStream and calling method read() which it will return the data in chunk. When the test is done, keyspace is removed.
cqlsh:TestKeyspace> select * from "Blob";

key | column1 | value
------------+--------------------+--------
0x726f7731 | 0x0000000000000000 | 0x6800
0x726f7731 | 0x0000000000000001 | 0x6500
0x726f7731 | 0x0000000000000002 | 0x6c00
0x726f7731 | 0x0000000000000003 | 0x6c00
0x726f7731 | 0x0000000000000004 | 0x6f00
0x726f7731 | 0x0000000000000005 | 0x2000
0x726f7731 | 0x0000000000000006 | 0x7700
0x726f7731 | 0x0000000000000007 | 0x6f00
0x726f7731 | 0x0000000000000008 | 0x7200
0x726f7731 | 0x0000000000000009 | 0x6c00
0x726f7731 | 0x000000000000000a | 0x6400
0x726f7731 | 0x000000000000000b | 0x2c00
0x726f7731 | 0x000000000000000c | 0x2000
0x726f7731 | 0x000000000000000d | 0x7300
0x726f7731 | 0x000000000000000e | 0x7400
0x726f7731 | 0x000000000000000f | 0x6f00
0x726f7731 | 0x0000000000000010 | 0x7200
0x726f7731 | 0x0000000000000011 | 0x6500
0x726f7731 | 0x0000000000000012 | 0x2000
0x726f7731 | 0x0000000000000013 | 0x6100
0x726f7731 | 0x0000000000000014 | 0x6e00
0x726f7731 | 0x0000000000000015 | 0x6400
0x726f7731 | 0x0000000000000016 | 0x2000
0x726f7731 | 0x0000000000000017 | 0x7200
0x726f7731 | 0x0000000000000018 | 0x6500
0x726f7731 | 0x0000000000000019 | 0x6100
0x726f7731 | 0x000000000000001a | 0x6400
0x726f7731 | 0x000000000000001b | 0x2000
0x726f7731 | 0x000000000000001c | 0x6200
0x726f7731 | 0x000000000000001d | 0x6900
0x726f7731 | 0x000000000000001e | 0x6e00
0x726f7731 | 0x000000000000001f | 0x6100
0x726f7731 | 0x0000000000000020 | 0x7200
0x726f7731 | 0x0000000000000021 | 0x7900
0x726f7731 | 0x0000000000000022 | 0x2000
0x726f7731 | 0x0000000000000023 | 0x6100
0x726f7731 | 0x0000000000000024 | 0x7300
0x726f7731 | 0x0000000000000025 | 0x2000
0x726f7731 | 0x0000000000000026 | 0x6100
0x726f7731 | 0x0000000000000027 | 0x2000
0x726f7731 | 0x0000000000000028 | 0x6300
0x726f7731 | 0x0000000000000029 | 0x6800
0x726f7731 | 0x000000000000002a | 0x7500
0x726f7731 | 0x000000000000002b | 0x6e00
0x726f7731 | 0x000000000000002c | 0x6b00
0x726f7731 | 0x000000000000002d | 0x2000
0x726f7731 | 0x000000000000002e | 0x6f00
0x726f7731 | 0x000000000000002f | 0x6600
0x726f7731 | 0x0000000000000030 | 0x2000
0x726f7731 | 0x0000000000000031 | 0x6200
0x726f7731 | 0x0000000000000032 | 0x6c00
0x726f7731 | 0x0000000000000033 | 0x6f00
0x726f7731 | 0x0000000000000034 | 0x6200
0x726f7731 | 0x0000000000000035 | 0x2000
0x726f7731 | 0x0000000000000036 | 0x6900
0x726f7731 | 0x0000000000000037 | 0x6e00
0x726f7731 | 0x0000000000000038 | 0x2000
0x726f7731 | 0x0000000000000039 | 0x6100
0x726f7731 | 0x000000000000003a | 0x6e00
0x726f7731 | 0x000000000000003b | 0x6400
0x726f7731 | 0x000000000000003c | 0x2000
0x726f7731 | 0x000000000000003d | 0x6600
0x726f7731 | 0x000000000000003e | 0x7200
0x726f7731 | 0x000000000000003f | 0x6f00
0x726f7731 | 0x0000000000000040 | 0x6d00
0x726f7731 | 0x0000000000000041 | 0x2000
0x726f7731 | 0x0000000000000042 | 0x6300
0x726f7731 | 0x0000000000000043 | 0x6100
0x726f7731 | 0x0000000000000044 | 0x7300
0x726f7731 | 0x0000000000000045 | 0x7300
0x726f7731 | 0x0000000000000046 | 0x6100
0x726f7731 | 0x0000000000000047 | 0x6e00
0x726f7731 | 0x0000000000000048 | 0x6400
0x726f7731 | 0x0000000000000049 | 0x7200
0x726f7731 | 0x000000000000004a | 0x6100
0x726f7731 | 0x000000000000004b | 0x2e00

(76 rows)

cqlsh:TestKeyspace>

As basic classes provided by hector-client is in the package, it's a good to have feature to have if you want to stream in and stream out content like audio or video. I have implemented something similar here. The concept is similar, write binary content to cassandra and reconstruct the binary data into the file again.

That's it, hope you like this.