Apache HBase [1] is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. [2]. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop [3].
In this article, we will set up a single node for this adventure. Before we begin, download a copy of Apache HBase and extract the compressed archive. At the time of this writing, I'm using Apache HBase version 1.1.1 for this learning experience.
user@localhost:~/Desktop/hbase-1.1.1$ ls
bin CHANGES.txt conf docs hbase-webapps lib LICENSE.txt NOTICE.txt README.txt
If you have not installed Java yet, go ahead and install it. Pick a recent Java, or at least Java 7. Make sure the terminal reports the correct Java version, for example:
user@localhost:~/Desktop/hbase-1.1.1$ java -version
java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
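If you would rather check the version from a script than by eye, here is a minimal Python sketch. The helper name java_major_version is my own invention, and it assumes the banner looks like the first line printed above (the legacy "1.x" numbering) or the newer "9 and later" style.

```python
import re

def java_major_version(banner: str) -> int:
    """Return the major Java version from the first line of `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        raise ValueError("no version string found in: %r" % banner)
    first, second = match.group(1), match.group(2)
    # Legacy scheme: "1.7.0_55" means Java 7. Modern scheme: "11.0.2" means Java 11.
    if first == "1" and second is not None:
        return int(second)
    return int(first)

# The banner line from the transcript above.
banner = 'java version "1.7.0_55"'
if java_major_version(banner) >= 7:
    print("Java is recent enough for this walkthrough")
```

In a real script the banner would have to be captured from the java command itself; note that java -version writes to standard error rather than standard output.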
If you cannot change the system-wide Java configuration, uncomment the JAVA_HOME variable in the HBase environment file, conf/hbase-env.sh, and set it to the Java you installed. The main configuration file for HBase is conf/hbase-site.xml; edit it so that it looks like the following, adjusting the paths to your environment as required.
user@localhost:~/Desktop/hbase-1.1.1$ cat conf/hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-->
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///home/user/Desktop/hbase-1.1.1</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/user/zookeeper</value>
</property>
</configuration>
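The part of that file that matters is just a list of name/value properties. As a rough illustration of the structure, here is a small Python sketch that renders such a list with the standard library. The function name build_hbase_site is hypothetical, and the two paths are simply the ones from my machine above; it leaves out the stylesheet line and the license comment.

```python
import xml.etree.ElementTree as ET

def build_hbase_site(properties: dict) -> str:
    """Render a mapping of property names to values as hbase-site.xml content."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")

# Same two properties as in the file above; change the paths for your environment.
site_xml = build_hbase_site({
    "hbase.rootdir": "file:///home/user/Desktop/hbase-1.1.1",
    "hbase.zookeeper.property.dataDir": "/home/user/zookeeper",
})
print(site_xml)
```

The output is not pretty-printed, but whitespace between the elements makes no difference to the XML structure.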
Okay, we are ready to start HBase. Start it with the helper script bin/start-hbase.sh.
user@localhost:~/Desktop/hbase-1.1.1$ bin/start-hbase.sh
starting master, logging to /home/user/Desktop/hbase-1.1.1/bin/../logs/hbase-user-master-localhost.out
user@localhost:~/Desktop/hbase-1.1.1/logs$ tail -F hbase-user-master-localhost.out SecurityAuth.audit hbase-user-master-localhost.log
==> hbase-user-master-localhost.out <==
==> SecurityAuth.audit <==
2015-08-18 17:49:41,533 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 127.0.1.1 port: 36745 with version info: version: "1.1.1" url: "git://hw11397.local/Volumes/hbase-1.1.1RC0/hbase" revision: "d0a115a7267f54e01c72c603ec53e91ec418292f" user: "ndimiduk" date: "Tue Jun 23 14:44:07 PDT 2015" src_checksum: "6e2d8cecbd28738ad86daacb25dc467e"
2015-08-18 17:49:46,812 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 127.0.0.1 port: 53042 with version info: version: "1.1.1" url: "git://hw11397.local/Volumes/hbase-1.1.1RC0/hbase" revision: "d0a115a7267f54e01c72c603ec53e91ec418292f" user: "ndimiduk" date: "Tue Jun 23 14:44:07 PDT 2015" src_checksum: "6e2d8cecbd28738ad86daacb25dc467e"
2015-08-18 17:49:48,309 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 127.0.0.1 port: 53043 with version info: version: "1.1.1" url: "git://hw11397.local/Volumes/hbase-1.1.1RC0/hbase" revision: "d0a115a7267f54e01c72c603ec53e91ec418292f" user: "ndimiduk" date: "Tue Jun 23 14:44:07 PDT 2015" src_checksum: "6e2d8cecbd28738ad86daacb25dc467e"
2015-08-18 17:49:49,317 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 127.0.0.1 port: 53044 with version info: version: "1.1.1" url: "git://hw11397.local/Volumes/hbase-1.1.1RC0/hbase" revision: "d0a115a7267f54e01c72c603ec53e91ec418292f" user: "ndimiduk" date: "Tue Jun 23 14:44:07 PDT 2015" src_checksum: "6e2d8cecbd28738ad86daacb25dc467e"
==> hbase-user-master-localhost.log <==
2015-08-18 17:49:49,281 INFO [StoreOpener-78a2a3664205fcf679d2043ac3259648-1] hfile.CacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=831688, freeSize=808983544, maxSize=809815232, heapSize=831688, minSize=769324480, minFactor=0.95, multiSize=384662240, multiFactor=0.5, singleSize=192331120, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
2015-08-18 17:49:49,282 INFO [StoreOpener-78a2a3664205fcf679d2043ac3259648-1] compactions.CompactionConfiguration: size [134217728, 9223372036854775807); files [3, 10); ratio 1.200000; off-peak ratio 5.000000; throttle point 2684354560; major period 604800000, major jitter 0.500000, min locality to compact 0.000000
2015-08-18 17:49:49,295 INFO [RS_OPEN_REGION-localhost:60631-0] regionserver.HRegion: Onlined 78a2a3664205fcf679d2043ac3259648; next sequenceid=2
2015-08-18 17:49:49,303 INFO [PostOpenDeployTasks:78a2a3664205fcf679d2043ac3259648] regionserver.HRegionServer: Post open deploy tasks for hbase:namespace,,1439891388424.78a2a3664205fcf679d2043ac3259648.
2015-08-18 17:49:49,322 INFO [PostOpenDeployTasks:78a2a3664205fcf679d2043ac3259648] hbase.MetaTableAccessor: Updated row hbase:namespace,,1439891388424.78a2a3664205fcf679d2043ac3259648. with server=localhost,60631,1439891378840
2015-08-18 17:49:49,332 INFO [AM.ZK.Worker-pool3-t6] master.RegionStates: Transition {78a2a3664205fcf679d2043ac3259648 state=OPENING, ts=1439891389276, server=localhost,60631,1439891378840} to {78a2a3664205fcf679d2043ac3259648 state=OPEN, ts=1439891389332, server=localhost,60631,1439891378840}
2015-08-18 17:49:49,603 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x14f4036b87d0000 type:create cxid:0x1d5 zxid:0x44 txntype:-1 reqpath:n/a Error Path:/hbase/namespace/default Error:KeeperErrorCode = NodeExists for /hbase/namespace/default
2015-08-18 17:49:49,625 INFO [ProcessThread(sid:0 cport:-1):] server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x14f4036b87d0000 type:create cxid:0x1d8 zxid:0x46 txntype:-1 reqpath:n/a Error Path:/hbase/namespace/hbase Error:KeeperErrorCode = NodeExists for /hbase/namespace/hbase
2015-08-18 17:49:49,639 INFO [localhost:51452.activeMasterManager] master.HMaster: Master has completed initialization
2015-08-18 17:49:49,642 INFO [localhost:51452.activeMasterManager] quotas.MasterQuotaManager: Quota support disabled
As you can see, the log files are available, and jps shows that an HMaster process is running.
user@localhost: $ jps
22144 Jps
21793 HMaster
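The same check can be scripted. Below is a minimal Python sketch that parses text in the format jps prints above; the helper name running_java_processes and the sample string are my own, and the sketch assumes one "PID name" pair per line.

```python
def running_java_processes(jps_output: str) -> dict:
    """Map each process name to its PID, given the text printed by `jps`."""
    processes = {}
    for line in jps_output.splitlines():
        parts = line.split(None, 1)
        # Skip blank or malformed lines; keep "PID name" pairs only.
        if len(parts) == 2 and parts[0].isdigit():
            processes[parts[1].strip()] = int(parts[0])
    return processes

# Sample text copied from the jps output above.
sample = "22144 Jps\n21793 HMaster\n"
print("HMaster" in running_java_processes(sample))
```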
Okay, let's try Apache HBase using the HBase shell.
user@localhost:~/Desktop/hbase-1.1.1$ ./bin/hbase shell
2015-08-18 17:55:25,134 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.1, rd0a115a7267f54e01c72c603ec53e91ec418292f, Tue Jun 23 14:44:07 PDT 2015
hbase(main):001:0>
The help command prints a very useful description, such as the following.
hbase(main):001:0> help
HBase Shell, version 1.1.1, rd0a115a7267f54e01c72c603ec53e91ec418292f, Tue Jun 23 14:44:07 PDT 2015
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, show_filters
Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
Group name: tools
Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, major_compact, merge_region, move, split, trace, unassign, wal_roll, zk_dump
Group name: replication
Commands: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs
Group name: snapshots
Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot
Group name: configuration
Commands: update_all_config, update_config
Group name: quotas
Commands: list_quotas, set_quota
Group name: security
Commands: grant, revoke, user_permission
Group name: visibility labels
Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit
command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:
{'key1' => 'value1', 'key2' => 'value2', ...}
and are opened and closed with curley-braces. Key/values are delimited by the
'=>' character combination. Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type
'Object.constants' to see a (messy) list of all constants in the environment.
If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:
hbase> get 't1', "key\x03\x3f\xcd"
hbase> get 't1', "key\003\023\011"
hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
hbase(main):002:0>
To create a table named test with a single column family named cf:
hbase(main):002:0> create 'test', 'cf'
0 row(s) in 1.5700 seconds
=> Hbase::Table - test
hbase(main):003:0>
To confirm that the table exists, use list.
hbase(main):001:0> list 'test'
TABLE
test
1 row(s) in 0.3530 seconds
=> ["test"]
Let's put something into the table we have just created.
hbase(main):002:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.2280 seconds
hbase(main):003:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0140 seconds
hbase(main):004:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0060 seconds
hbase(main):005:0>
Here, we insert three values, one at a time. The first insert is at row1, column cf:a, with a value of value1. Columns in HBase are comprised of a column family prefix, cf in this example, followed by a colon and then a column qualifier suffix, a in this case.
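To make the naming rule concrete, here is a tiny Python sketch that splits a column name the same way. The function split_column is purely illustrative and not part of any HBase client. It splits on the first colon only, since the qualifier may itself contain colons.

```python
def split_column(column: str) -> tuple:
    """Split 'family:qualifier' into (family, qualifier)."""
    family, _, qualifier = column.partition(":")
    if not family:
        raise ValueError("column name needs a column family prefix: %r" % column)
    return family, qualifier

for name in ("cf:a", "cf:b", "cf:c"):
    family, qualifier = split_column(name)
    print("family=%s qualifier=%s" % (family, qualifier))
```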
To read all the rows in the table, use scan.
hbase(main):005:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1439892359305, value=value1
row2 column=cf:b, timestamp=1439892363921, value=value2
row3 column=cf:c, timestamp=1439892369775, value=value3
3 row(s) in 0.0420 seconds
hbase(main):006:0>
To fetch a single row only, use get.
hbase(main):006:0> get 'test', 'row1'
COLUMN CELL
cf:a timestamp=1439892359305, value=value1
1 row(s) in 0.0340 seconds
hbase(main):007:0>
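One way to picture the difference between scan and get is to think of the table as a map sorted by row key. The following Python sketch is only a toy model under that assumption: ToyTable is a made-up class, it keeps just the latest value per cell, and it is in no way how HBase stores data internally.

```python
class ToyTable:
    """A toy, in-memory stand-in for a table: row key -> column -> (timestamp, value)."""

    def __init__(self):
        self._rows = {}

    def put(self, row, column, value, timestamp):
        # Keep only the most recently written cell for each (row, column).
        self._rows.setdefault(row, {})[column] = (timestamp, value)

    def get(self, row):
        # A get touches exactly one row; a missing row yields an empty result.
        return dict(self._rows.get(row, {}))

    def scan(self):
        # A scan walks every row in sorted row-key order.
        for row in sorted(self._rows):
            for column in sorted(self._rows[row]):
                timestamp, value = self._rows[row][column]
                yield row, column, timestamp, value

table = ToyTable()
table.put("row1", "cf:a", "value1", 1439892359305)
table.put("row2", "cf:b", "value2", 1439892363921)
table.put("row3", "cf:c", "value3", 1439892369775)

for cell in table.scan():
    print(cell)
print(table.get("row1"))
```

The timestamps are the ones from the scan output above; in the real shell HBase assigns them for you when you do not pass one to put.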
Something really interesting about Apache HBase: if you want to delete a table or change its settings, you need to disable it first. Afterwards, you can enable it again.
hbase(main):007:0> disable 'test'
0 row(s) in 2.3610 seconds
hbase(main):008:0> enable 'test'
0 row(s) in 1.2790 seconds
hbase(main):009:0>
Okay, now let's delete this table. Because we re-enabled it above, the first drop attempt fails until the table is disabled again.
hbase(main):009:0> drop 'test'
ERROR: Table test is enabled. Disable it first.
Here is some help for this command:
Drop the named table. Table must first be disabled:
hbase> drop 't1'
hbase> drop 'ns1:t1'
hbase(main):010:0> disable 'test'
0 row(s) in 2.2640 seconds
hbase(main):011:0> drop 'test'
0 row(s) in 1.2800 seconds
hbase(main):012:0>
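The disable-before-drop rule seen in this session can be sketched as a small state check. The Python below is a toy model only: ToyCatalog and its methods are hypothetical names of mine, and the error text merely imitates the shell message above.

```python
class ToyCatalog:
    """A toy model of table states: each table is either enabled or disabled."""

    def __init__(self):
        self._enabled = {}  # table name -> True if enabled

    def create(self, name):
        self._enabled[name] = True  # new tables start out enabled

    def exists(self, name):
        return name in self._enabled

    def disable(self, name):
        self._enabled[name] = False

    def enable(self, name):
        self._enabled[name] = True

    def drop(self, name):
        # Refuse to drop a table that is still enabled.
        if self._enabled[name]:
            raise RuntimeError("Table %s is enabled. Disable it first." % name)
        del self._enabled[name]

catalog = ToyCatalog()
catalog.create("test")
try:
    catalog.drop("test")
except RuntimeError as error:
    print("ERROR:", error)
catalog.disable("test")
catalog.drop("test")
print(catalog.exists("test"))
```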
Okay, we are done with this basic walkthrough. Let's quit the shell for now.
hbase(main):012:0> quit
user@localhost:~/Desktop/hbase-1.1.1$
To stop the Apache HBase instance:
user@localhost:~/Desktop/hbase-1.1.1$ ./bin/stop-hbase.sh
stopping hbase.................
user@localhost:~/Desktop/hbase-1.1.1$ jps
23399 Jps
5445 org.eclipse.equinox.launcher_1.3.0.v20140415-2008.jar
If, like me, you come from Apache Cassandra, Apache HBase will look very similar. If this interests you, I will leave you with the following three links, which will take you further.
http://hbase.apache.org/book.html
http://wiki.apache.org/hadoop/Hbase
https://blogs.apache.org/hbase/