Showing posts with label MongoDB. Show all posts

Friday, June 17, 2016

Trying out aggregations on MongoDB

Today, we will look into MongoDB again, this time on the specific topic of aggregation.

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods.
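To make the idea of grouping values and computing a result more concrete, here is a minimal sketch in plain JavaScript that imitates a two-stage pipeline (a `$group` with `$sum`, followed by a `$match` with `$gte`). It does not talk to MongoDB at all. The sample documents, the state codes and the helper names `groupSumByState` and `matchMinPop` are made up for illustration.

```javascript
// Hypothetical sample documents (invented values), one per zip code.
const zips = [
  { city: "A", state: "XX", pop: 6000000 },
  { city: "B", state: "XX", pop: 5000000 },
  { city: "C", state: "YY", pop: 3000000 },
];

// Stage 1, roughly { $group: { _id: "$state", totalPop: { $sum: "$pop" } } }
function groupSumByState(docs) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc.state, (totals.get(doc.state) || 0) + doc.pop);
  }
  return [...totals].map(([state, totalPop]) => ({ _id: state, totalPop }));
}

// Stage 2, roughly { $match: { totalPop: { $gte: min } } }
function matchMinPop(docs, min) {
  return docs.filter((doc) => doc.totalPop >= min);
}

// The output of one stage is the input of the next, like in a pipeline.
const result = matchMinPop(groupSumByState(zips), 10 * 1000 * 1000);
console.log(result); // [ { _id: 'XX', totalPop: 11000000 } ]
```

In a real deployment the server does this work, so the sketch is only meant to show what each stage contributes.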

Let's start with a sample aggregation on the zip code data set. Importing 29353 objects took less than a second. Blazing fast, maybe because I'm using an SSD, heh.

 user@localhost:~$ mongoimport --collection zipcodes < ~/Desktop/zips.json   
 connected to: 127.0.0.1  
 Tue Mar 8 20:21:37.950 check 9 29353  
 Tue Mar 8 20:21:38.099 imported 29353 objects  
   
 > db.zipcodes.aggregate( [ { $group: { _id: "$state", totalPop: { $sum: "$pop" } } }, { $match: { totalPop: { $gte: 10*1000*1000 } } } ] )  
 {  
    "result" : [  
       {  
          "_id" : "IL",  
          "totalPop" : 11427576  
       },  
       {  
          "_id" : "OH",  
          "totalPop" : 10846517  
       },  
       {  
          "_id" : "FL",  
          "totalPop" : 12686644  
       },  
       {  
          "_id" : "NY",  
          "totalPop" : 17990402  
       },  
       {  
          "_id" : "PA",  
          "totalPop" : 11881643  
       },  
       {  
          "_id" : "TX",  
          "totalPop" : 16984601  
       },  
       {  
          "_id" : "CA",  
          "totalPop" : 29754890  
       }  
    ],  
    "ok" : 1  
 }  
 > db.zipcodes.aggregate( [ { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } }, { $group: { _id: "$_id.state", avgCityPop: { $avg: "$pop" } } } ] )  
 {  
    "result" : [  
       {  
          "_id" : "NH",  
          "avgCityPop" : 5232.320754716981  
       },  
       {  
          "_id" : "MA",  
          "avgCityPop" : 14855.37037037037  
       },  
       {  
          "_id" : "ME",  
          "avgCityPop" : 3006.4901960784314  
       },  
       {  
          "_id" : "NY",  
          "avgCityPop" : 13131.680291970803  
       },  
       {  
          "_id" : "VT",  
          "avgCityPop" : 2315.8765432098767  
       },  
       {  
          "_id" : "PA",  
          "avgCityPop" : 8679.067202337472  
       },  
       {  
          "_id" : "DE",  
          "avgCityPop" : 14481.91304347826  
       },  
       {  
          "_id" : "DC",  
          "avgCityPop" : 303450  
       },  
       {  
          "_id" : "VA",  
          "avgCityPop" : 8526.177931034483  
       },  
       {  
          "_id" : "SC",  
          "avgCityPop" : 11139.626198083068  
       },  
       {  
          "_id" : "FL",  
          "avgCityPop" : 27400.958963282937  
       },  
       {  
          "_id" : "AL",  
          "avgCityPop" : 7907.2152641878665  
       },  
       {  
          "_id" : "NJ",  
          "avgCityPop" : 15775.89387755102  
       },  
       {  
          "_id" : "WV",  
          "avgCityPop" : 2771.4775888717154  
       },  
       {  
          "_id" : "TN",  
          "avgCityPop" : 9656.350495049504  
       },  
       {  
          "_id" : "OH",  
          "avgCityPop" : 12700.839578454332  
       },  
       {  
          "_id" : "MD",  
          "avgCityPop" : 12615.775725593667  
       },  
       {  
          "_id" : "MN",  
          "avgCityPop" : 5372.21375921376  
       },  
       {  
          "_id" : "ND",  
          "avgCityPop" : 1645.0309278350514  
       },  
       {  
          "_id" : "NC",  
          "avgCityPop" : 10622.815705128205  
       },  
       {  
          "_id" : "MT",  
          "avgCityPop" : 2593.987012987013  
       },  
       {  
          "_id" : "IL",  
          "avgCityPop" : 9954.334494773519  
       },  
       {  
          "_id" : "MO",  
          "avgCityPop" : 5672.195338512764  
       },  
       {  
          "_id" : "KS",  
          "avgCityPop" : 3819.884259259259  
       },  
       {  
          "_id" : "LA",  
          "avgCityPop" : 10465.496277915632  
       },  
       {  
          "_id" : "AR",  
          "avgCityPop" : 4175.355239786856  
       },  
       {  
          "_id" : "CO",  
          "avgCityPop" : 9981.075757575758  
       },  
       {  
          "_id" : "IN",  
          "avgCityPop" : 9271.130434782608  
       },  
       {  
          "_id" : "KY",  
          "avgCityPop" : 4767.164721141375  
       },  
       {  
          "_id" : "OK",  
          "avgCityPop" : 6155.743639921722  
       },  
       {  
          "_id" : "ID",  
          "avgCityPop" : 4320.811158798283  
       },  
       {  
          "_id" : "WY",  
          "avgCityPop" : 3384.5373134328356  
       },  
       {  
          "_id" : "UT",  
          "avgCityPop" : 9518.508287292818  
       },  
       {  
          "_id" : "NV",  
          "avgCityPop" : 18209.590909090908  
       },  
       {  
          "_id" : "NE",  
          "avgCityPop" : 3034.882692307692  
       },  
       {  
          "_id" : "RI",  
          "avgCityPop" : 19292.653846153848  
       },  
       {  
          "_id" : "NM",  
          "avgCityPop" : 5872.360465116279  
       },  
       {  
          "_id" : "CA",  
          "avgCityPop" : 27756.42723880597  
       },  
       {  
          "_id" : "AZ",  
          "avgCityPop" : 20591.16853932584  
       },  
       {  
          "_id" : "HI",  
          "avgCityPop" : 15831.842857142858  
       },  
       {  
          "_id" : "IA",  
          "avgCityPop" : 3123.0821147356583  
       },  
       {  
          "_id" : "MS",  
          "avgCityPop" : 7524.023391812865  
       },  
       {  
          "_id" : "WI",  
          "avgCityPop" : 7323.00748502994  
       },  
       {  
          "_id" : "TX",  
          "avgCityPop" : 13775.02108678021  
       },  
       {  
          "_id" : "SD",  
          "avgCityPop" : 1839.6746031746031  
       },  
       {  
          "_id" : "MI",  
          "avgCityPop" : 12087.512353706112  
       },  
       {  
          "_id" : "GA",  
          "avgCityPop" : 11547.62210338681  
       },  
       {  
          "_id" : "OR",  
          "avgCityPop" : 8262.561046511628  
       },  
       {  
          "_id" : "CT",  
          "avgCityPop" : 14674.625  
       },  
       {  
          "_id" : "WA",  
          "avgCityPop" : 12258.670025188916  
       },  
       {  
          "_id" : "AK",  
          "avgCityPop" : 2976.4918032786886  
       }  
    ],  
    "ok" : 1  
 }  
 > db.zipcodes.aggregate( [ { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } }, { $sort: { pop: 1 } }, { $group: { _id : "$_id.state", biggestCity: { $last: "$_id.city" }, biggestPop:  { $last: "$pop" }, smallestCity: { $first: "$_id.city" }, smallestPop: { $first: "$pop" } } }, { $project: { _id: 0, state: "$_id", biggestCity: { name: "$biggestCity", pop: "$biggestPop" }, smallestCity: { name: "$smallestCity", pop: "$smallestPop" } } } ] )  
 {  
    "result" : [  
       {  
          "biggestCity" : {  
             "name" : "NEWARK",  
             "pop" : 111674  
          },  
          "smallestCity" : {  
             "name" : "BETHEL",  
             "pop" : 108  
          },  
          "state" : "DE"  
       },  
       {  
          "biggestCity" : {  
             "name" : "SAINT LOUIS",  
             "pop" : 397802  
          },  
          "smallestCity" : {  
             "name" : "BENDAVIS",  
             "pop" : 44  
          },  
          "state" : "MO"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CHICAGO",  
             "pop" : 2452177  
          },  
          "smallestCity" : {  
             "name" : "ANCONA",  
             "pop" : 38  
          },  
          "state" : "IL"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CLEVELAND",  
             "pop" : 536759  
          },  
          "smallestCity" : {  
             "name" : "ISLE SAINT GEORG",  
             "pop" : 38  
          },  
          "state" : "OH"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MANCHESTER",  
             "pop" : 106452  
          },  
          "smallestCity" : {  
             "name" : "WEST NOTTINGHAM",  
             "pop" : 27  
          },  
          "state" : "NH"  
       },  
       {  
          "biggestCity" : {  
             "name" : "WASHINGTON",  
             "pop" : 606879  
          },  
          "smallestCity" : {  
             "name" : "PENTAGON",  
             "pop" : 21  
          },  
          "state" : "DC"  
       },  
       {  
          "biggestCity" : {  
             "name" : "GRAND FORKS",  
             "pop" : 59527  
          },  
          "smallestCity" : {  
             "name" : "TROTTERS",  
             "pop" : 12  
          },  
          "state" : "ND"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BALTIMORE",  
             "pop" : 733081  
          },  
          "smallestCity" : {  
             "name" : "ANNAPOLIS JUNCTI",  
             "pop" : 32  
          },  
          "state" : "MD"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MINNEAPOLIS",  
             "pop" : 344719  
          },  
          "smallestCity" : {  
             "name" : "JOHNSON",  
             "pop" : 12  
          },  
          "state" : "MN"  
       },  
       {  
          "biggestCity" : {  
             "name" : "SALT LAKE CITY",  
             "pop" : 186346  
          },  
          "smallestCity" : {  
             "name" : "MODENA",  
             "pop" : 9  
          },  
          "state" : "UT"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CHEYENNE",  
             "pop" : 70185  
          },  
          "smallestCity" : {  
             "name" : "LOST SPRINGS",  
             "pop" : 6  
          },  
          "state" : "WY"  
       },  
       {  
          "biggestCity" : {  
             "name" : "PHOENIX",  
             "pop" : 890853  
          },  
          "smallestCity" : {  
             "name" : "HUALAPAI",  
             "pop" : 2  
          },  
          "state" : "AZ"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BRIDGEPORT",  
             "pop" : 141638  
          },  
          "smallestCity" : {  
             "name" : "EAST KILLINGLY",  
             "pop" : 25  
          },  
          "state" : "CT"  
       },  
       {  
          "biggestCity" : {  
             "name" : "SEATTLE",  
             "pop" : 520096  
          },  
          "smallestCity" : {  
             "name" : "BENGE",  
             "pop" : 2  
          },  
          "state" : "WA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BIRMINGHAM",  
             "pop" : 242606  
          },  
          "smallestCity" : {  
             "name" : "ALLEN",  
             "pop" : 0  
          },  
          "state" : "AL"  
       },  
       {  
          "biggestCity" : {  
             "name" : "LAS VEGAS",  
             "pop" : 597557  
          },  
          "smallestCity" : {  
             "name" : "TUSCARORA",  
             "pop" : 1  
          },  
          "state" : "NV"  
       },  
       {  
          "biggestCity" : {  
             "name" : "OMAHA",  
             "pop" : 358930  
          },  
          "smallestCity" : {  
             "name" : "LAKESIDE",  
             "pop" : 5  
          },  
          "state" : "NE"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MIAMI",  
             "pop" : 825232  
          },  
          "smallestCity" : {  
             "name" : "CECIL FIELD NAS",  
             "pop" : 0  
          },  
          "state" : "FL"  
       },  
       {  
          "biggestCity" : {  
             "name" : "SIOUX FALLS",  
             "pop" : 102046  
          },  
          "smallestCity" : {  
             "name" : "ZEONA",  
             "pop" : 8  
          },  
          "state" : "SD"  
       },  
       {  
          "biggestCity" : {  
             "name" : "HOUSTON",  
             "pop" : 2095918  
          },  
          "smallestCity" : {  
             "name" : "FULTON",  
             "pop" : 0  
          },  
          "state" : "TX"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MILWAUKEE",  
             "pop" : 597324  
          },  
          "smallestCity" : {  
             "name" : "CLAM LAKE",  
             "pop" : 2  
          },  
          "state" : "WI"  
       },  
       {  
          "biggestCity" : {  
             "name" : "JACKSON",  
             "pop" : 204788  
          },  
          "smallestCity" : {  
             "name" : "CHUNKY",  
             "pop" : 79  
          },  
          "state" : "MS"  
       },  
       {  
          "biggestCity" : {  
             "name" : "DES MOINES",  
             "pop" : 148155  
          },  
          "smallestCity" : {  
             "name" : "DOUDS",  
             "pop" : 15  
          },  
          "state" : "IA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "HONOLULU",  
             "pop" : 396643  
          },  
          "smallestCity" : {  
             "name" : "NINOLE",  
             "pop" : 0  
          },  
          "state" : "HI"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CHARLOTTE",  
             "pop" : 465833  
          },  
          "smallestCity" : {  
             "name" : "GLOUCESTER",  
             "pop" : 0  
          },  
          "state" : "NC"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BILLINGS",  
             "pop" : 78805  
          },  
          "smallestCity" : {  
             "name" : "MOSBY",  
             "pop" : 7  
          },  
          "state" : "MT"  
       },  
       {  
          "biggestCity" : {  
             "name" : "TULSA",  
             "pop" : 389072  
          },  
          "smallestCity" : {  
             "name" : "SOUTHARD",  
             "pop" : 8  
          },  
          "state" : "OK"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BOISE",  
             "pop" : 165522  
          },  
          "smallestCity" : {  
             "name" : "KEUTERVILLE",  
             "pop" : 0  
          },  
          "state" : "ID"  
       },  
       {  
          "biggestCity" : {  
             "name" : "INDIANAPOLIS",  
             "pop" : 348868  
          },  
          "smallestCity" : {  
             "name" : "WESTPOINT",  
             "pop" : 145  
          },  
          "state" : "IN"  
       },  
       {  
          "biggestCity" : {  
             "name" : "LOUISVILLE",  
             "pop" : 288058  
          },  
          "smallestCity" : {  
             "name" : "BROWDER",  
             "pop" : 0  
          },  
          "state" : "KY"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BURLINGTON",  
             "pop" : 39127  
          },  
          "smallestCity" : {  
             "name" : "UNIV OF VERMONT",  
             "pop" : 0  
          },  
          "state" : "VT"  
       },  
       {  
          "biggestCity" : {  
             "name" : "PHILADELPHIA",  
             "pop" : 1610956  
          },  
          "smallestCity" : {  
             "name" : "HAMILTON",  
             "pop" : 0  
          },  
          "state" : "PA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "NEW ORLEANS",  
             "pop" : 496937  
          },  
          "smallestCity" : {  
             "name" : "FORDOCHE",  
             "pop" : 0  
          },  
          "state" : "LA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "COLUMBIA",  
             "pop" : 269521  
          },  
          "smallestCity" : {  
             "name" : "QUINBY",  
             "pop" : 0  
          },  
          "state" : "SC"  
       },  
       {  
          "biggestCity" : {  
             "name" : "WICHITA",  
             "pop" : 295115  
          },  
          "smallestCity" : {  
             "name" : "ARNOLD",  
             "pop" : 0  
          },  
          "state" : "KS"  
       },  
       {  
          "biggestCity" : {  
             "name" : "NEWARK",  
             "pop" : 275572  
          },  
          "smallestCity" : {  
             "name" : "IMLAYSTOWN",  
             "pop" : 17  
          },  
          "state" : "NJ"  
       },  
       {  
          "biggestCity" : {  
             "name" : "HUNTINGTON",  
             "pop" : 75343  
          },  
          "smallestCity" : {  
             "name" : "MOUNT CARBON",  
             "pop" : 0  
          },  
          "state" : "WV"  
       },  
       {  
          "biggestCity" : {  
             "name" : "MEMPHIS",  
             "pop" : 632837  
          },  
          "smallestCity" : {  
             "name" : "ALLRED",  
             "pop" : 2  
          },  
          "state" : "TN"  
       },  
       {  
          "biggestCity" : {  
             "name" : "VIRGINIA BEACH",  
             "pop" : 385080  
          },  
          "smallestCity" : {  
             "name" : "WALLOPS ISLAND",  
             "pop" : 0  
          },  
          "state" : "VA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "ANCHORAGE",  
             "pop" : 183987  
          },  
          "smallestCity" : {  
             "name" : "CHEVAK",  
             "pop" : 0  
          },  
          "state" : "AK"  
       },  
       {  
          "biggestCity" : {  
             "name" : "BROOKLYN",  
             "pop" : 2300504  
          },  
          "smallestCity" : {  
             "name" : "RAQUETTE LAKE",  
             "pop" : 0  
          },  
          "state" : "NY"  
       },  
       {  
          "biggestCity" : {  
             "name" : "DENVER",  
             "pop" : 451182  
          },  
          "smallestCity" : {  
             "name" : "CHEYENNE MTN AFB",  
             "pop" : 0  
          },  
          "state" : "CO"  
       },  
       {  
          "biggestCity" : {  
             "name" : "DETROIT",  
             "pop" : 963243  
          },  
          "smallestCity" : {  
             "name" : "LELAND",  
             "pop" : 0  
          },  
          "state" : "MI"  
       },  
       {  
          "biggestCity" : {  
             "name" : "PORTLAND",  
             "pop" : 518543  
          },  
          "smallestCity" : {  
             "name" : "KENT",  
             "pop" : 0  
          },  
          "state" : "OR"  
       },  
       {  
          "biggestCity" : {  
             "name" : "ATLANTA",  
             "pop" : 609591  
          },  
          "smallestCity" : {  
             "name" : "FORT STEWART",  
             "pop" : 0  
          },  
          "state" : "GA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "CRANSTON",  
             "pop" : 176404  
          },  
          "smallestCity" : {  
             "name" : "CLAYVILLE",  
             "pop" : 45  
          },  
          "state" : "RI"  
       },  
       {  
          "biggestCity" : {  
             "name" : "ALBUQUERQUE",  
             "pop" : 449584  
          },  
          "smallestCity" : {  
             "name" : "ALGODONES",  
             "pop" : 0  
          },  
          "state" : "NM"  
       },  
       {  
          "biggestCity" : {  
             "name" : "LOS ANGELES",  
             "pop" : 2102295  
          },  
          "smallestCity" : {  
             "name" : "TWIN BRIDGES",  
             "pop" : 0  
          },  
          "state" : "CA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "LITTLE ROCK",  
             "pop" : 192895  
          },  
          "smallestCity" : {  
             "name" : "TOMATO",  
             "pop" : 0  
          },  
          "state" : "AR"  
       },  
       {  
          "biggestCity" : {  
             "name" : "WORCESTER",  
             "pop" : 169856  
          },  
          "smallestCity" : {  
             "name" : "BUCKLAND",  
             "pop" : 16  
          },  
          "state" : "MA"  
       },  
       {  
          "biggestCity" : {  
             "name" : "PORTLAND",  
             "pop" : 63268  
          },  
          "smallestCity" : {  
             "name" : "BUSTINS ISLAND",  
             "pop" : 0  
          },  
          "state" : "ME"  
       }  
    ],  
    "ok" : 1  
 }  
 >   
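The third query is the densest one, so here is a rough sketch of its logic in plain JavaScript: sum the population per (state, city), sort ascending by population, then keep the first and last city seen for each state. This only imitates what `$group`, `$sort`, `$first` and `$last` do in that query. The sample documents and the function names `cityTotals` and `extremesByState` are hypothetical.

```javascript
// Hypothetical zip code documents (invented values); a city may span several zip codes.
const zips = [
  { city: "ALPHA", state: "XX", pop: 100 },
  { city: "ALPHA", state: "XX", pop: 50 },
  { city: "BETA", state: "XX", pop: 20 },
  { city: "GAMMA", state: "YY", pop: 7 },
];

// First $group: sum the population per (state, city) pair.
function cityTotals(docs) {
  const totals = new Map();
  for (const { state, city, pop } of docs) {
    const key = JSON.stringify([state, city]);
    totals.set(key, (totals.get(key) || 0) + pop);
  }
  return [...totals].map(([key, pop]) => {
    const [state, city] = JSON.parse(key);
    return { state, city, pop };
  });
}

// $sort ascending by pop, then per state keep the first city (smallest)
// and the last city (biggest), shaped like the $project stage.
function extremesByState(docs) {
  const sorted = cityTotals(docs).sort((a, b) => a.pop - b.pop);
  const byState = new Map();
  for (const { state, city, pop } of sorted) {
    const entry = byState.get(state) || { state, smallestCity: { name: city, pop } };
    entry.biggestCity = { name: city, pop }; // overwritten until the last one wins
    byState.set(state, entry);
  }
  return [...byState.values()];
}

console.log(JSON.stringify(extremesByState(zips), null, 2));
```

The point to notice is that the first/last trick only makes sense because the documents are sorted before they are grouped by state.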
   

All the queries in the examples work. It's amazing that all three queries return results within a second! While this is a short article to convince you to use aggregation on MongoDB, if you have been convinced, you should really try the following useful links too.


https://docs.mongodb.org/manual/core/map-reduce/

https://docs.mongodb.org/manual/reference/aggregation/

Good luck!

Saturday, October 24, 2015

Study MongoDB security by setting up and configuring a server and client on a secure line

It's been a while since I last studied MongoDB. The last topic was administration. Today, we will learn another MongoDB topic: security. In general, MongoDB security means the following:

Maintaining a secure MongoDB deployment requires administrators to implement controls to ensure that users and applications have access to only the data that they require. MongoDB provides features that allow administrators to implement these controls and restrictions for any MongoDB deployment.

This article references the official documentation, which can be found here. As the security topic is pretty huge, this short article focuses on how to set up a MongoDB server to use SSL and how a client accesses the database securely.

First, make sure you have installed the server and client packages. If you are on a deb-based Linux distribution, it is as easy as sudo apt-get install mongodb-clients mongodb-server. Once both packages are installed, check the log file at /var/log/mongodb/mongodb.log for lines similar to the following. So our MongoDB version is 2.6.3 and it is built with OpenSSL support.

 2015-09-27T16:04:48.849+0800 [initandlisten] db version v2.6.3  
 2015-09-27T16:04:48.849+0800 [initandlisten] git version: nogitversion  
 2015-09-27T16:04:48.849+0800 [initandlisten] OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014  

Next, let's generate a public and private key pair and a self-signed certificate.

 user@localhost:~/test1$ openssl req -newkey rsa:2048 -new -x509 -days 365 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key  
 Generating a 2048 bit RSA private key  
 .............................+++  
 ..................................................................................................................................................................................................................+++  
 writing new private key to 'mongodb-cert.key'  
 -----  
 You are about to be asked to enter information that will be incorporated  
 into your certificate request.  
 What you are about to enter is what is called a Distinguished Name or a DN.  
 There are quite a few fields but you can leave some blank  
 For some fields there will be a default value,  
 If you enter '.', the field will be left blank.  
 -----  
 Country Name (2 letter code) [AU]:MY  
 State or Province Name (full name) [Some-State]:KL  
 Locality Name (eg, city) []:Kuala Lumpur  
 Organization Name (eg, company) [Internet Widgits Pty Ltd]:example.com  
 Organizational Unit Name (eg, section) []:Engineering  
 Common Name (e.g. server FQDN or YOUR name) []:Jason Wee  
 Email Address []:jason@example.com  
 user@localhost:~/test1$ ls  
 mongodb-cert.crt mongodb-cert.key  

Now concatenate the private key and the certificate into a single file with the extension .pem.

 user@localhost:~/test1$ cat mongodb-cert.key mongodb-cert.crt > mongodb.pem  
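Before pointing the server at the combined file, it can be worth checking that it really holds both a private key block and a certificate block. The following is a small sketch in JavaScript, not an official tool. The helper names `pemLabels` and `hasKeyAndCertificate` are made up, the sample text uses placeholder bodies instead of real key material, and the exact key label (`PRIVATE KEY` or `RSA PRIVATE KEY`) is assumed to depend on the OpenSSL version.

```javascript
// Return the labels of all PEM blocks found in `text`, in order of appearance.
function pemLabels(text) {
  const labels = [];
  const pattern = /-----BEGIN ([A-Z0-9 ]+)-----/g;
  let match;
  while ((match = pattern.exec(text)) !== null) {
    labels.push(match[1]);
  }
  return labels;
}

// True when the text holds at least one private key block and one certificate block.
// Any label ending in "PRIVATE KEY" counts as a key (assumption, see lead-in).
function hasKeyAndCertificate(text) {
  const labels = pemLabels(text);
  return (
    labels.some((label) => label.endsWith("PRIVATE KEY")) &&
    labels.includes("CERTIFICATE")
  );
}

// Placeholder content standing in for the real mongodb.pem (no real key material).
const combined = [
  "-----BEGIN PRIVATE KEY-----",
  "(placeholder)",
  "-----END PRIVATE KEY-----",
  "-----BEGIN CERTIFICATE-----",
  "(placeholder)",
  "-----END CERTIFICATE-----",
].join("\n");

console.log(pemLabels(combined)); // [ 'PRIVATE KEY', 'CERTIFICATE' ]
console.log(hasKeyAndCertificate(combined)); // true
```

To check the real file under Node.js, the text could be read with `require("fs").readFileSync("mongodb.pem", "utf8")` and passed to `hasKeyAndCertificate`.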

Now, stop the MongoDB instance if it is running, as we will now configure the server to use the certificate we generated previously.

 user@localhost:~/test1$ sudo systemctl status mongodb  
 ● mongodb.service - An object/document-oriented database  
   Loaded: loaded (/lib/systemd/system/mongodb.service; enabled; vendor preset: enabled)  
   Active: inactive (dead) since Sun 2015-09-27 16:13:34 MYT; 23min ago  
    Docs: man:mongod(1)  
  Main PID: 15343 (code=exited, status=0/SUCCESS)  
   
 Sep 27 16:04:48 localhost systemd[1]: Started An object/document-oriented database.  
 Sep 27 16:04:48 localhost systemd[1]: Starting An object/document-oriented database...  
 Sep 27 16:13:33 localhost systemd[1]: Stopping An object/document-oriented database...  
 Sep 27 16:13:34 localhost systemd[1]: Stopped An object/document-oriented database.  
 Sep 27 16:36:30 localhost systemd[1]: Stopped An object/document-oriented database.  
 user@localhost:~/test1$ sudo tail -10 /etc/mongodb.conf   
 # Size limit for in-memory storage of op ids.  
 #opIdMem = <bytes>  
   
 # SSL options  
 # Enable SSL on normal ports  
 sslOnNormalPorts = true  
 # SSL Key file and password  
 #sslPEMKeyFile = /etc/ssl/mongodb.pem  
 sslPEMKeyFile = /home/user/test1/mongodb.pem  
 #sslPEMKeyPassword = pass  
 user@localhost:~/test1$   

In the above output, as an example, I have set sslPEMKeyFile to the mongodb.pem file and set sslOnNormalPorts to true. Now start the MongoDB instance.

 user@localhost:~/test1$ sudo systemctl start mongodb  
 user@localhost:~/test1$   

In the log file, notice that SSL is enabled and there are no SSL-related errors.

 2015-09-27T16:46:41.648+0800 [initandlisten] options: { config: "/etc/mongodb.conf", net: { bindIp: "127.0.0.1", ssl: { PEMKeyFile: "/home/user/test1/mongodb.pem", mode: "requireSSL" } }, storage: { dbPath: "/var/lib/mongodb", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongodb.log" } }  
 2015-09-27T16:46:41.788+0800 [initandlisten] journal dir=/var/lib/mongodb/journal  
 2015-09-27T16:46:41.797+0800 [initandlisten] recover : no journal files present, no recovery needed  
 2015-09-27T16:46:42.162+0800 [initandlisten] waiting for connections on port 27017 ssl  

The server configuration and setup is now done. Now, we will focus on the MongoDB client. If you connect to MongoDB with the client without any SSL options, you will get an error.

 user@localhost:~/test1$ mongo foo  
 MongoDB shell version: 2.6.3  
 connecting to: foo  
 2015-09-27T17:22:54.300+0800 DBClientCursor::init call() failed  
 2015-09-27T17:22:54.302+0800 Error: DBClientBase::findN: transport error: 127.0.0.1:27017 ns: admin.$cmd query: { whatsmyuri: 1 } at src/mongo/shell/mongo.js:146  
 exception: connect failed  
 user@localhost:~/test1$ mongo --ssl --sslPEMKeyFile mongodb.pem  
 MongoDB shell version: 2.6.3  
 connecting to: test  
 Server has startup warnings:   
 2015-09-27T16:46:41.647+0800 [initandlisten]   
 2015-09-27T16:46:41.647+0800 [initandlisten] ** NOTE: This is a 32 bit MongoDB binary.  
 2015-09-27T16:46:41.647+0800 [initandlisten] **    32 bit builds are limited to less than 2GB of data (or less with --journal).  
 2015-09-27T16:46:41.647+0800 [initandlisten] **    See http://dochub.mongodb.org/core/32bit  
 2015-09-27T16:46:41.647+0800 [initandlisten]   
 > show dbs  
 admin (empty)  
 local 0.078GB  
 >   

As you can see above, you need to specify the --ssl parameter and the PEM file via --sslPEMKeyFile. That's it for this article. If you want to go the distance, try using tcpdump to listen to the traffic on this port. Good luck!

Sunday, July 20, 2014

Study MongoDB administration

Today we are going to look into MongoDB administration. We will focus on a few areas: backup, monitoring, configuration, and importing and exporting data.

backup and restore

  • journaling must be enabled on the logical volume.


To back up, start by using the mongodump command.

jason@localhost:~$ mongodump
connected to: 127.0.0.1
2014-07-07T22:46:09.351+0800 all dbs
2014-07-07T22:46:09.352+0800 DATABASE: test to dump/test
2014-07-07T22:46:09.354+0800 test.system.indexes to dump/test/system.indexes.bson
2014-07-07T22:46:09.355+0800 4 documents
2014-07-07T22:46:09.356+0800 test.testData to dump/test/testData.bson
2014-07-07T22:46:09.359+0800 400 documents
2014-07-07T22:46:09.360+0800 Metadata for test.testData to dump/test/testData.metadata.json
2014-07-07T22:46:09.360+0800 test.users to dump/test/users.bson
2014-07-07T22:46:09.361+0800 1 documents
2014-07-07T22:46:09.362+0800 Metadata for test.users to dump/test/users.metadata.json
2014-07-07T22:46:09.362+0800 test.accounts to dump/test/accounts.bson
2014-07-07T22:46:09.369+0800 2 documents
2014-07-07T22:46:09.370+0800 Metadata for test.accounts to dump/test/accounts.metadata.json
2014-07-07T22:46:09.370+0800 test.transactions to dump/test/transactions.bson
2014-07-07T22:46:09.372+0800 1 documents
2014-07-07T22:46:09.373+0800 Metadata for test.transactions to dump/test/transactions.metadata.json
2014-07-07T22:46:09.374+0800 DATABASE: mydb to dump/mydb
2014-07-07T22:46:09.375+0800 mydb.system.indexes to dump/mydb/system.indexes.bson
2014-07-07T22:46:09.376+0800 2 documents
2014-07-07T22:46:09.377+0800 mydb.testData to dump/mydb/testData.bson
2014-07-07T22:46:09.378+0800 27 documents
2014-07-07T22:46:09.389+0800 Metadata for mydb.testData to dump/mydb/testData.metadata.json
2014-07-07T22:46:09.390+0800 mydb.users to dump/mydb/users.bson
2014-07-07T22:46:09.391+0800 1 documents
2014-07-07T22:46:09.392+0800 Metadata for mydb.users to dump/mydb/users.metadata.json
2014-07-07T22:46:09.392+0800 DATABASE: mp3db to dump/mp3db
2014-07-07T22:46:09.393+0800 mp3db.system.indexes to dump/mp3db/system.indexes.bson
2014-07-07T22:46:09.394+0800 4 documents
2014-07-07T22:46:09.395+0800 mp3db.mp3.files to dump/mp3db/mp3.files.bson
2014-07-07T22:46:09.396+0800 1 documents
2014-07-07T22:46:09.397+0800 Metadata for mp3db.mp3.files to dump/mp3db/mp3.files.metadata.json
2014-07-07T22:46:09.397+0800 mp3db.mp3.chunks to dump/mp3db/mp3.chunks.bson
2014-07-07T22:46:09.401+0800 2 documents
2014-07-07T22:46:09.401+0800 Metadata for mp3db.mp3.chunks to dump/mp3db/mp3.chunks.metadata.json
2014-07-07T22:46:09.402+0800 DATABASE: admin to dump/admin
2014-07-07T22:46:09.406+0800 DATABASE: config to dump/config

Let's remove some data from the database before we restore.
jason@localhost:~$ mongo
MongoDB shell version: 2.6.3
connecting to: test
Server has startup warnings:
2014-06-24T21:23:40.227+0800 [initandlisten]
2014-06-24T21:23:40.227+0800 [initandlisten] ** NOTE: This is a 32 bit MongoDB binary.
2014-06-24T21:23:40.227+0800 [initandlisten] ** 32 bit builds are limited to less than 2GB of data (or less with --journal).
2014-06-24T21:23:40.227+0800 [initandlisten] ** Note that journaling defaults to off for 32 bit and is currently off.
2014-06-24T21:23:40.228+0800 [initandlisten] ** See http://dochub.mongodb.org/core/32bit
2014-06-24T21:23:40.228+0800 [initandlisten]
> use mp3db;
switched to db mp3db
> show tables;
mp3.chunks
mp3.files
system.indexes
> db.mp3.chunks.remove({});
WriteResult({ "nRemoved" : 2 })
> db.mp3.files.remove({});
WriteResult({ "nRemoved" : 1 })
> db.mp3.chunks.find();
> db.mp3.files.find();
>

Now we restore using the mongorestore command.
jason@localhost:~$ mongorestore --collection mp3.chunks --db mp3db dump/mp3db/mp3.chunks.bson
connected to: 127.0.0.1
2014-07-07T23:14:43.504+0800 dump/mp3db/mp3.chunks.bson
2014-07-07T23:14:43.504+0800 going into namespace [mp3db.mp3.chunks]
Restoring to mp3db.mp3.chunks without dropping. Restored data will be inserted without raising errors; check your server log
2 objects found
2014-07-07T23:14:43.534+0800 Creating index: { key: { _id: 1 }, name: "_id_", ns: "mp3db.mp3.chunks" }
2014-07-07T23:14:43.635+0800 Creating index: { key: { files_id: 1, n: 1 }, name: "files_id_1_n_1", ns: "mp3db.mp3.chunks" }

jason@localhost:~$ mongorestore --collection mp3.files --db mp3db dump/mp3db/mp3.files.bson
connected to: 127.0.0.1
2014-07-07T23:17:24.813+0800 dump/mp3db/mp3.files.bson
2014-07-07T23:17:24.813+0800 going into namespace [mp3db.mp3.files]
Restoring to mp3db.mp3.files without dropping. Restored data will be inserted without raising errors; check your server log
1 objects found
2014-07-07T23:17:24.819+0800 Creating index: { key: { _id: 1 }, name: "_id_", ns: "mp3db.mp3.files" }
2014-07-07T23:17:24.822+0800 Creating index: { key: { filename: 1, uploadDate: 1 }, name: "filename_1_uploadDate_1", ns: "mp3db.mp3.files" }

The restoration looks good; now we verify the content.
jason@localhost:~$ mongo
MongoDB shell version: 2.6.3
connecting to: test
Server has startup warnings:
2014-06-24T21:23:40.227+0800 [initandlisten]
2014-06-24T21:23:40.227+0800 [initandlisten] ** NOTE: This is a 32 bit MongoDB binary.
2014-06-24T21:23:40.227+0800 [initandlisten] ** 32 bit builds are limited to less than 2GB of data (or less with --journal).
2014-06-24T21:23:40.227+0800 [initandlisten] ** Note that journaling defaults to off for 32 bit and is currently off.
2014-06-24T21:23:40.228+0800 [initandlisten] ** See http://dochub.mongodb.org/core/32bit
2014-06-24T21:23:40.228+0800 [initandlisten]
> use mp3db;
switched to db mp3db
> db.mp3.files.find();
{ "_id" : ObjectId("53ad61c844ae8a6ee12fcb63"), "chunkSize" : NumberLong(262144), "length" : NumberLong(316773), "md5" : "7293e9fd795e2bb6d5035e5b69cb2923", "filename" : "django.mp3", "contentType" : "audio/mpeg", "uploadDate" : ISODate("2014-06-27T12:21:28.646Z"), "aliases" : null }
>

Looks good. Now we move on to monitoring.

monitoring

mongostat - captures and returns the counts of database operations by type (e.g. insert, query, update, delete). These counts report on the load distribution on the server.
jason@localhost:~$ mongostat
connected to: 127.0.0.1
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn time
*0 *0 *0 *0 0 1|0 0 320m 445m 10m 14 config:0.0% 0 0|0 0|0 62b 3k 1 22:33:54
*0 *0 *0 *0 0 1|0 0 320m 445m 10m 0 test:0.0% 0 0|0 0|0 62b 3k 1 22:33:55
*0 *0 *0 *0 0 1|0 0 320m 445m 10m 0 test:0.0% 0 0|0 0|0 62b 3k 1 22:33:56
^C

mongotop tracks and reports the current read and write activity of a MongoDB instance, and reports these statistics on a per collection basis.
jason@localhost:~$ mongotop
connected to: 127.0.0.1

ns total read write 2014-07-07T15:20:38
mp3db.mp3.chunks 0ms 0ms 0ms
local.system.replset 0ms 0ms 0ms
local.system.namespaces 0ms 0ms 0ms
local.system.indexes 0ms 0ms 0ms
local.startup_log 0ms 0ms 0ms
config.version 0ms 0ms 0ms
config.system.namespaces 0ms 0ms 0ms

ns total read write 2014-07-07T15:20:39
mp3db.mp3.chunks 0ms 0ms 0ms
local.system.replset 0ms 0ms 0ms
local.system.namespaces 0ms 0ms 0ms
local.system.indexes 0ms 0ms 0ms
local.startup_log 0ms 0ms 0ms
config.version 0ms 0ms 0ms
config.system.namespaces 0ms 0ms 0ms
^C
jason@localhost:~$

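HTTP Console - MongoDB provides a web interface that exposes diagnostic and monitoring information in a simple web page. It listens on the mongod port plus 1000, so for this server it would be at http://192.168.0.2:28017/ (in 2.6 the HTTP interface is disabled by default and has to be enabled first).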

Next is db.serverStatus() from the mongo shell. The serverStatus command, or db.serverStatus() from the shell, returns a general overview of the status of the database, detailing disk usage, memory use, connections, journaling, and index access. The command returns quickly and does not impact MongoDB performance.
> db.serverStatus()
{
"host" : "debby.e2e.serveftp.net",
"version" : "2.6.3",
"process" : "mongod",
"pid" : NumberLong(3651),
"uptime" : 1130615,
"uptimeMillis" : NumberLong(1130614302),
"uptimeEstimate" : 1115929,
"localTime" : ISODate("2014-07-07T15:27:14.416Z"),
"asserts" : {
"regular" : 0,
"warning" : 0,
"msg" : 0,
"user" : 2,
"rollovers" : 0
},
"backgroundFlushing" : {
"flushes" : 18843,
"total_ms" : 127648,
"average_ms" : 6.774292840842754,
"last_ms" : 2,
"last_finished" : ISODate("2014-07-07T15:26:43.282Z")
},
"connections" : {
"current" : 1,
"available" : 51199,
"totalCreated" : NumberLong(57)
},
"cursors" : {
"note" : "deprecated, use server status metrics",
"clientCursors_size" : 0,
"totalOpen" : 0,
"pinned" : 0,
"totalNoTimeout" : 2,
"timedOut" : 1
},
"extra_info" : {
"note" : "fields vary by platform",
"heap_usage_bytes" : 23483832,
"page_faults" : 13958
},
"globalLock" : {
"totalTime" : NumberLong("1130614310000"),
"lockTime" : NumberLong(184448),
"currentQueue" : {
"total" : 0,
"readers" : 0,
"writers" : 0
},
"activeClients" : {
"total" : 0,
"readers" : 0,
"writers" : 0
}
},
"indexCounters" : {
"accesses" : 451,
"hits" : 451,
"misses" : 0,
"resets" : 0,
"missRatio" : 0
},
"locks" : {
"." : {
"timeLockedMicros" : {
"R" : NumberLong(149),
"W" : NumberLong(184448)
},
"timeAcquiringMicros" : {
"R" : NumberLong(29),
"W" : NumberLong(32)
}
},
"admin" : {
"timeLockedMicros" : {
"r" : NumberLong(7348709),
"w" : NumberLong(0)
},
"timeAcquiringMicros" : {
"r" : NumberLong(55635),
"w" : NumberLong(0)
}
},
"local" : {
"timeLockedMicros" : {
"r" : NumberLong(59492773),
"w" : NumberLong(32)
},
"timeAcquiringMicros" : {
"r" : NumberLong(3164744),
"w" : NumberLong(3)
}
},
"config" : {
"timeLockedMicros" : {
"r" : NumberLong(182516),
"w" : NumberLong(0)
},
"timeAcquiringMicros" : {
"r" : NumberLong(46473),
"w" : NumberLong(0)
}
},
"mydb" : {
"timeLockedMicros" : {
"r" : NumberLong(43791920),
"w" : NumberLong(118)
},
"timeAcquiringMicros" : {
"r" : NumberLong(2159715),
"w" : NumberLong(9)
}
},
"test" : {
"timeLockedMicros" : {
"r" : NumberLong(28235652),
"w" : NumberLong(252)
},
"timeAcquiringMicros" : {
"r" : NumberLong(4052053),
"w" : NumberLong(19)
}
},
"mp3db" : {
"timeLockedMicros" : {
"r" : NumberLong(42491162),
"w" : NumberLong(1053565)
},
"timeAcquiringMicros" : {
"r" : NumberLong(6120501),
"w" : NumberLong(832)
}
}
},
"network" : {
"bytesIn" : 13516862,
"bytesOut" : 34014948,
"numRequests" : 733
},
"opcounters" : {
"insert" : 112,
"query" : 100247,
"update" : 0,
"delete" : 18,
"getmore" : 3,
"command" : 344
},
"opcountersRepl" : {
"insert" : 0,
"query" : 0,
"update" : 0,
"delete" : 0,
"getmore" : 0,
"command" : 0
},
"recordStats" : {
"accessesNotInMemory" : 10,
"pageFaultExceptionsThrown" : 1,
"admin" : {
"accessesNotInMemory" : 0,
"pageFaultExceptionsThrown" : 0
},
"config" : {
"accessesNotInMemory" : 0,
"pageFaultExceptionsThrown" : 0
},
"local" : {
"accessesNotInMemory" : 1,
"pageFaultExceptionsThrown" : 0
},
"mp3db" : {
"accessesNotInMemory" : 1,
"pageFaultExceptionsThrown" : 1
},
"mydb" : {
"accessesNotInMemory" : 2,
"pageFaultExceptionsThrown" : 0
},
"test" : {
"accessesNotInMemory" : 6,
"pageFaultExceptionsThrown" : 0
}
},
"writeBacksQueued" : false,
"mem" : {
"bits" : 32,
"resident" : 12,
"virtual" : 445,
"supported" : true,
"mapped" : 320
},
"metrics" : {
"cursor" : {
"timedOut" : NumberLong(1),
"open" : {
"noTimeout" : NumberLong(2),
"pinned" : NumberLong(0),
"total" : NumberLong(0)
}
},
"document" : {
"deleted" : NumberLong(72),
"inserted" : NumberLong(112),
"returned" : NumberLong(1294),
"updated" : NumberLong(0)
},
"getLastError" : {
"wtime" : {
"num" : 0,
"totalMillis" : 0
},
"wtimeouts" : NumberLong(0)
},
"operation" : {
"fastmod" : NumberLong(0),
"idhack" : NumberLong(0),
"scanAndOrder" : NumberLong(0)
},
"queryExecutor" : {
"scanned" : NumberLong(106),
"scannedObjects" : NumberLong(106)
},
"record" : {
"moves" : NumberLong(0)
},
"repl" : {
"apply" : {
"batches" : {
"num" : 0,
"totalMillis" : 0
},
"ops" : NumberLong(0)
},
"buffer" : {
"count" : NumberLong(0),
"maxSizeBytes" : 268435456,
"sizeBytes" : NumberLong(0)
},
"network" : {
"bytes" : NumberLong(0),
"getmores" : {
"num" : 0,
"totalMillis" : 0
},
"ops" : NumberLong(0),
"readersCreated" : NumberLong(0)
},
"preload" : {
"docs" : {
"num" : 0,
"totalMillis" : 0
},
"indexes" : {
"num" : 0,
"totalMillis" : 0
}
}
},
"storage" : {
"freelist" : {
"search" : {
"bucketExhausted" : NumberLong(0),
"requests" : NumberLong(97),
"scanned" : NumberLong(166)
}
}
},
"ttl" : {
"deletedDocuments" : NumberLong(0),
"passes" : NumberLong(18840)
}
},
"ok" : 1
}
>
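
A few of the numbers in this output can be cross-checked by hand. The sketch below is plain JavaScript, not a MongoDB API: it takes an object shaped like the relevant parts of the serverStatus document above (values copied from that output) and derives the average flush time and the share of each operation type. The name summarizeServerStatus is my own, and the field names are assumed to match the 2.6 output shown here.

```javascript
// Hypothetical helper (not part of MongoDB): derive a few ratios from a
// serverStatus-like document. Field names follow the 2.6 output above.
function summarizeServerStatus(status) {
  const flush = status.backgroundFlushing;
  const ops = status.opcounters;
  const totalOps = Object.values(ops).reduce((sum, n) => sum + n, 0);

  // Share of each operation type, as a fraction of all counted operations.
  const opShare = {};
  for (const [name, count] of Object.entries(ops)) {
    opShare[name] = totalOps === 0 ? 0 : count / totalOps;
  }

  return {
    // Average time per background flush in milliseconds.
    averageFlushMs: flush.flushes === 0 ? 0 : flush.total_ms / flush.flushes,
    totalOps: totalOps,
    opShare: opShare,
  };
}

// Values copied from the db.serverStatus() output above.
const sampleStatus = {
  backgroundFlushing: { flushes: 18843, total_ms: 127648 },
  opcounters: { insert: 112, query: 100247, update: 0, delete: 18, getmore: 3, command: 344 },
};

const summary = summarizeServerStatus(sampleStatus);
console.log(summary.averageFlushMs); // close to the "average_ms" value reported above
console.log(summary.totalOps);
console.log(summary.opShare.query);
```

On this server almost all counted operations are queries, which fits a machine that is mostly being poked at from the shell.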

dbStats - The dbStats command, or db.stats() from the shell, returns a document that addresses storage use and data volumes. The dbStats reflect the amount of storage used, the quantity of data contained in the database, and object, collection, and index counters.
> use mp3db
switched to db mp3db
> db.stats()
{
"db" : "mp3db",
"collections" : 4,
"objects" : 14,
"avgObjSize" : 42210.28571428572,
"dataSize" : 590944,
"storageSize" : 35037184,
"numExtents" : 7,
"indexes" : 4,
"indexSize" : 32704,
"fileSize" : 67108864,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"extentFreeList" : {
"num" : 0,
"totalSize" : 0
},
"ok" : 1
}
>
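
To read the dbStats numbers, it helps to relate them to each other. This is a small sketch in plain JavaScript (summarizeDbStats is a made-up name, not a shell helper) that recomputes the average object size and shows how much storage is allocated per byte of data, using the values from the output above.

```javascript
// Hypothetical helper: relate the main size fields of a db.stats()-like document.
function summarizeDbStats(stats) {
  return {
    // Average document size in bytes: dataSize / objects.
    avgObjSize: stats.objects === 0 ? 0 : stats.dataSize / stats.objects,
    // Bytes allocated for collections per byte of actual document data.
    storageToDataRatio: stats.dataSize === 0 ? 0 : stats.storageSize / stats.dataSize,
  };
}

// Values copied from the db.stats() output for mp3db above.
const mp3dbStats = { objects: 14, dataSize: 590944, storageSize: 35037184 };
const dbSummary = summarizeDbStats(mp3dbStats);
console.log(dbSummary.avgObjSize);
console.log(dbSummary.storageToDataRatio.toFixed(1));
```

The high ratio is not alarming for a tiny database like this one, since storage is preallocated in extents well before the data needs it.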

collStats - The collStats provides statistics that resemble dbStats on the collection level, including a count of the objects in the collection, the size of the collection, the amount of disk space used by the collection, and information about its indexes.
> db.mp3.chunks.stats()
{
"ns" : "mp3db.mp3.chunks",
"count" : 2,
"size" : 589792,
"avgObjSize" : 294896,
"storageSize" : 35012608,
"numExtents" : 4,
"nindexes" : 2,
"lastExtentSize" : 15290368,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 16352,
"indexSizes" : {
"_id_" : 8176,
"files_id_1_n_1" : 8176
},
"ok" : 1
}
>
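
The index fields in collStats are related in a simple way: totalIndexSize is the sum of the entries in indexSizes, and nindexes is the number of entries. A quick sketch in plain JavaScript, with values copied from the output above (checkIndexTotals is a hypothetical name):

```javascript
// Hypothetical helper: check that the index fields of a collStats-like
// document agree with each other.
function checkIndexTotals(stats) {
  const sizes = Object.values(stats.indexSizes);
  const sum = sizes.reduce((total, n) => total + n, 0);
  return {
    indexSizeSum: sum,
    totalMatches: sum === stats.totalIndexSize,
    countMatches: sizes.length === stats.nindexes,
  };
}

// Values copied from the db.mp3.chunks.stats() output above.
const chunkStats = {
  nindexes: 2,
  totalIndexSize: 16352,
  indexSizes: { "_id_": 8176, "files_id_1_n_1": 8176 },
};

const indexCheck = checkIndexTotals(chunkStats);
console.log(indexCheck.indexSizeSum);  // 16352
console.log(indexCheck.totalMatches);  // true
console.log(indexCheck.countMatches);  // true
```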

replSetGetStatus - The replSetGetStatus command (rs.status() from the shell) returns an overview of your replica set’s status. The replSetGetStatus document details the state and configuration of the replica set and statistics about its members.
> rs.status()
{ "ok" : 0, "errmsg" : "not running with --replSet" }

The log is available in /var/log/mongodb/mongod.log

configuration

There are too many configuration options to cover here, but below are the essential ones you might need to change.

As mentioned previously, if the application and the database run on two different servers, then on the server that runs the mongod instance you should comment out bind_ip so that it listens on all interfaces.
# Listen to local interface only. Comment out to listen on all interfaces.
#bind_ip = 127.0.0.1

The default port is 27017; make sure this port and IP address are allowed to be accessed remotely.

Use kernel 2.6.36 or later.

In general, if you use the Ext4 file system, use at least version 2.6.23 of the Linux Kernel.

In general, if you use the XFS file system, use at least version 2.6.25 of the Linux Kernel.

Set the file descriptor limit, -n, and the user process limit (ulimit), -u, above 20,000, according to the suggestions in the ulimit document. A low ulimit will affect MongoDB when under heavy use and can produce errors and lead to failed connections to MongoDB processes and loss of service.

That's it. Please go to the donate page and contribute back if you learned something.

Saturday, July 5, 2014

Study MongoDB GridFS with java example

In the past, we learned basic MongoDB and studied its data models. In this article, we will study MongoDB GridFS by storing a file in MongoDB. Below is a simple Java application that shows how to store, retrieve, and eventually delete a file.
import java.io.File;
import java.io.IOException;

import com.mongodb.DB;
import com.mongodb.DBCursor;
import com.mongodb.Mongo;
import com.mongodb.MongoException;
import com.mongodb.gridfs.GridFS;
import com.mongodb.gridfs.GridFSDBFile;
import com.mongodb.gridfs.GridFSInputFile;

public class LearnMongo {

    public static void main(String[] args) throws MongoException, IOException {
        Mongo mongo = new Mongo("192.168.0.2", 27017);
        DB db = mongo.getDB("mp3db");

        // save the mp3 into GridFS, using the "mp3" bucket
        String newFilename = "django.mp3";
        File mp3File = new File("src/resources/django.mp3");
        GridFS gfsMp3 = new GridFS(db, "mp3");
        GridFSInputFile gfsFile = gfsMp3.createFile(mp3File);
        gfsFile.setFilename(newFilename);
        gfsFile.setContentType("audio/mpeg");
        System.out.println(gfsFile.toString());
        gfsFile.save();

        // get the mp3 back by filename
        GridFSDBFile imageForOutput = gfsMp3.findOne(newFilename);
        System.out.println(imageForOutput);

        // list all files stored in this bucket
        DBCursor cursor = gfsMp3.getFileList();
        while (cursor.hasNext()) {
            System.out.println(cursor.next());
        }

        // write the stored mp3 out to a new file
        imageForOutput.writeTo("/home/jason/Desktop/newsong.mp3");

        // delete the stored mp3
        gfsMp3.remove(gfsMp3.findOne(newFilename));

    }

}

We start by connecting to the server; in this example the MongoDB instance is running on server 192.168.0.2 on port 27017. You may want to check the MongoDB configuration if you connect remotely, as the default configuration only listens on localhost.

Then we get a MongoDB DB object for mp3db. You can store other kinds of files as well, but for this example I'm going to store an mp3. With this ready, we can store the mp3. The important piece of code is probably the one below.
GridFS gfsMp3 = new GridFS(db, "mp3");
GridFSInputFile gfsFile = gfsMp3.createFile(mp3File);

We instantiate two objects, GridFS and GridFSInputFile. You can set additional information such as the filename and content type. Calling GridFSInputFile.save() saves the file into MongoDB. If you have access to the MongoDB CLI, a command such as > db.mp3.files.find(); will show the output below.
{ "_id" : ObjectId("53ad60f944aeaca83109d253"), "chunkSize" : NumberLong(262144), "length" : NumberLong(316773), "md5" : "7293e9fd795e2bb6d5035e5b69cb2923", "filename" : "django.mp3", "contentType" : "audio/mpeg", "uploadDate" : ISODate("2014-06-27T12:18:01.934Z"), "aliases" : null }
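
The chunkSize and length fields in this document tell you how the file was split up. As a rough sketch in plain JavaScript (chunkLayout is a hypothetical helper, not a driver call), assuming every chunk is full except possibly the last one:

```javascript
// Hypothetical helper: sizes of the chunks a file of `length` bytes is split
// into when each chunk holds at most `chunkSize` bytes.
function chunkLayout(length, chunkSize) {
  const chunkCount = Math.ceil(length / chunkSize);
  const sizes = [];
  for (let n = 0; n < chunkCount; n++) {
    // Every chunk is full except possibly the last one.
    sizes.push(Math.min(chunkSize, length - n * chunkSize));
  }
  return sizes;
}

// length and chunkSize copied from the mp3.files document above.
const layout = chunkLayout(316773, 262144);
console.log(layout.length); // 2
console.log(layout);        // [ 262144, 54629 ]
```

So django.mp3 ends up as two documents in mp3.chunks, one full chunk and one smaller remainder.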

To find the mp3, you can use the code GridFSDBFile imageForOutput = gfsMp3.findOne(newFilename); the output is below.
{ "_id" : { "$oid" : "53ad60f944aeaca83109d253"} , "chunkSize" : 262144 , "length" : 316773 , "md5" : "7293e9fd795e2bb6d5035e5b69cb2923" , "filename" : "django.mp3" , "contentType" : "audio/mpeg" , "uploadDate" : { "$date" : "2014-06-27T12:18:01Z"} , "aliases" : null }

You can also use GridFS.getFileList(); to retrieve all the files currently stored in this database. The code then writes the stored object out to a file. As you can see, I'm writing to the desktop just to make sure it is not the source file.

I end this article by removing the object in the MongoDB database.

Friday, July 4, 2014

Study MongoDB data models

Today we are going to learn about MongoDB data models.

It is important to study data models because, as a developer, you want to leverage what MongoDB excels at and be aware of what it is not suitable for.

You can store related data in separate documents and reference them using an id field. But remember that this needs two round trips between the application server and the mongo database.

As such, in this scenario it is better to embed one document inside the other.
user document
{
   _id: <ObjectId1>,          <-------------+
   username : "jasonwee"                    |
}                                           |
                                            |
contact document                            |
{                                           |
   _id: <ObjectId2>,                        |
   user_id: <ObjectId1>,      <-------------+
   phone: "012-3456789"
}

into
user document
{
_id: <ObjectId1>,
contact : {
phone: "012-3456789"
}
}

This modelling guarantees atomicity for the whole document, as MongoDB write operations are atomic at the document level.
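
The step from the referenced model to the embedded one can be sketched in plain JavaScript. This is only an illustration on plain objects; embedContact is a made-up helper, and the ids are written as strings instead of real ObjectIds.

```javascript
// Hypothetical helper: fold a referenced contact document into its user document.
function embedContact(user, contact) {
  if (String(contact.user_id) !== String(user._id)) {
    throw new Error("contact does not belong to this user");
  }
  // Copy everything except the contact's own _id and the back-reference.
  const { _id, user_id, ...contactFields } = contact;
  return { ...user, contact: contactFields };
}

// Placeholder ids stand in for <ObjectId1> and <ObjectId2> from the text.
const user = { _id: "ObjectId1", username: "jasonwee" };
const contact = { _id: "ObjectId2", user_id: "ObjectId1", phone: "012-3456789" };

const embedded = embedContact(user, contact);
console.log(JSON.stringify(embedded));
```

After embedding, one read returns the user together with the contact, and one write updates both.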

Indexes

Use indexes to improve performance for common queries. Build indexes on fields that appear often in queries and for all operations that return sorted results. MongoDB automatically creates a unique index on the _id field.

Each index requires at least 8KB of data space.

GridFS

GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16MB.

Model Relationships Between Documents

  • Model One-to-One Relationships with Embedded Documents
    Presents a data model that uses embedded documents to describe one-to-one relationships between connected data.

  • Model One-to-Many Relationships with Embedded Documents
    Presents a data model that uses embedded documents to describe one-to-many relationships between connected data.

  • Model One-to-Many Relationships with Document References
    Presents a data model that uses references to describe one-to-many relationships between documents.


Model Tree Structures

MongoDB allows various ways to use tree data structures to model large hierarchical or nested data relationships.

  • Model Tree Structures with Parent References 
    Presents a data model that organizes documents in a tree-like structure by storing references to “parent” nodes in “child” nodes.

  • Model Tree Structures with Child References
    Presents a data model that organizes documents in a tree-like structure by storing references to “child” nodes in “parent” nodes.

  • Model Tree Structures with an Array of Ancestors
    Presents a data model that organizes documents in a tree-like structure by storing references to “parent” nodes and an array that stores all ancestors.

  • Model Tree Structures with Materialized Paths
    Presents a data model that organizes documents in a tree-like structure by storing full relationship paths between documents. In addition to the tree node, each document stores the _id of the nodes ancestors or path as a string.

  • Model Tree Structures with Nested Sets
    Presents a data model that organizes documents in a tree-like structure using the Nested Sets pattern. This optimizes discovering subtrees at the expense of tree mutability.
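
To make two of these patterns concrete, here is a small sketch in plain JavaScript. The category documents are made-up sample data, and ancestorsOf and materializedPath are hypothetical helpers working on an in-memory array, not on a real collection. The first walks parent references up to the root; the second builds a materialized path string from the same data, assuming a comma-delimited path format.

```javascript
// Sample category tree stored with parent references (hypothetical data).
const categories = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Programming", parent: "Books" },
  { _id: "Books", parent: null },
];

// Walk up the parent references and return the ancestors, nearest first.
// The length check stops the loop if the data accidentally contains a cycle.
function ancestorsOf(id, docs) {
  const byId = new Map(docs.map((doc) => [doc._id, doc]));
  const ancestors = [];
  let current = byId.get(id);
  while (current && current.parent !== null && ancestors.length < docs.length) {
    ancestors.push(current.parent);
    current = byId.get(current.parent);
  }
  return ancestors;
}

// Build a materialized path (root first, comma-delimited) from the same data.
function materializedPath(id, docs) {
  const path = ancestorsOf(id, docs).reverse();
  return path.length === 0 ? null : "," + path.join(",") + ",";
}

console.log(ancestorsOf("MongoDB", categories));      // [ 'Databases', 'Programming', 'Books' ]
console.log(materializedPath("MongoDB", categories)); // ,Books,Programming,Databases,
```

With parent references, finding all ancestors takes one lookup per level, while a stored path answers the same question from a single document at the cost of rewriting paths when a node moves.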

Sunday, June 22, 2014

Learning basic MongoDB by installing and using CRUD

Today, we are going to learn MongoDB: understanding what MongoDB is, installing it, and doing CRUD operations. We start with the basic question.

what is MongoDB?

MongoDB (from "humongous") is a cross-platform document-oriented database. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster.

With that said, let's move on to installing MongoDB. There are many ways to install MongoDB, but for this article I have chosen to install it using the deb package built by MongoDB. Even though MongoDB comes with Ubuntu, the version in the repository is just too old. Currently the mongodb version in the Ubuntu repository is 1:2.4.9-1ubuntu2, while the official production release is 2.6.1.

The instructions below are from http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/ , but I have summarized them into a one-liner. It adds the MongoDB repository from the official site and installs the latest version.
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10 && echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list && sudo apt-get update && sudo apt-get install mongodb-org

If everything goes well, you should get installation output similar to the following:
jason@localhost:~$ sudo apt-get install mongodb-org
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
jhead libcec2 libgdata-google1.2-1 libgdata1.2-1 libjdependency-java liblockdev1 libmaven-archiver-java libmaven-clean-plugin-java
libmaven-compiler-plugin-java libmaven-dependency-tree-java libmaven-filtering-java libmaven-install-plugin-java libmaven-jar-plugin-java
libmaven-resources-plugin-java libmaven-shade-plugin-java libphp-adodb libpigment0.3-11 libplexus-compiler-java libplexus-digest-java oxideqt-codecs-extra
php-auth-sasl php-cache php-date php-file php-http-request php-log php-mail php-mail-mime php-mdb2 php-mdb2-driver-mysql php-net-dime php-net-ftp
php-net-smtp php-net-socket php-net-url php-services-weather php-soap php-xml-parser php-xml-serializer printer-driver-c2esp printer-driver-min12xxw
printer-driver-pnm2ppa printer-driver-pxljr python-axiom python-coherence python-configobj python-epsilon python-gpod python-louie python-nevow python-pgm
python-pyasn1 python-storm python-tagpy python-twill python-twisted-conch python-twisted-web2 qtdeclarative5-window-plugin tinymce2 xbmc-pvr-argustv
xbmc-pvr-dvbviewer xbmc-pvr-mediaportal-tvserver xbmc-pvr-mythtv-cmyth xbmc-pvr-nextpvr xbmc-pvr-njoy xbmc-pvr-tvheadend-hts xbmc-pvr-vdr-vnsi
xbmc-pvr-vuplus xdg-user-dirs-gtk
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
mongodb-org-mongos mongodb-org-server mongodb-org-shell mongodb-org-tools
The following NEW packages will be installed:
mongodb-org mongodb-org-mongos mongodb-org-server mongodb-org-shell mongodb-org-tools
0 upgraded, 5 newly installed, 0 to remove and 51 not upgraded.
Need to get 113 MB of archives.
After this operation, 284 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://downloads-distro.mongodb.org/repo/ubuntu-upstart/ dist/10gen mongodb-org-shell i386 2.6.1 [4,389 kB]
Get:2 http://downloads-distro.mongodb.org/repo/ubuntu-upstart/ dist/10gen mongodb-org-server i386 2.6.1 [9,308 kB]
Get:3 http://downloads-distro.mongodb.org/repo/ubuntu-upstart/ dist/10gen mongodb-org-mongos i386 2.6.1 [7,045 kB]
Get:4 http://downloads-distro.mongodb.org/repo/ubuntu-upstart/ dist/10gen mongodb-org-tools i386 2.6.1 [92.3 MB]
Get:5 http://downloads-distro.mongodb.org/repo/ubuntu-upstart/ dist/10gen mongodb-org i386 2.6.1 [3,652 B]
Fetched 113 MB in 3min 25s (549 kB/s)
Selecting previously unselected package mongodb-org-shell.
(Reading database ... 564794 files and directories currently installed.)
Preparing to unpack .../mongodb-org-shell_2.6.1_i386.deb ...
Unpacking mongodb-org-shell (2.6.1) ...
Selecting previously unselected package mongodb-org-server.
Preparing to unpack .../mongodb-org-server_2.6.1_i386.deb ...
Unpacking mongodb-org-server (2.6.1) ...
Selecting previously unselected package mongodb-org-mongos.
Preparing to unpack .../mongodb-org-mongos_2.6.1_i386.deb ...
Unpacking mongodb-org-mongos (2.6.1) ...
Selecting previously unselected package mongodb-org-tools.
Preparing to unpack .../mongodb-org-tools_2.6.1_i386.deb ...
Unpacking mongodb-org-tools (2.6.1) ...
Selecting previously unselected package mongodb-org.
Preparing to unpack .../mongodb-org_2.6.1_i386.deb ...
Unpacking mongodb-org (2.6.1) ...
Processing triggers for man-db (2.6.7.1-1) ...
Processing triggers for ureadahead (0.100.0-16) ...
Setting up mongodb-org-shell (2.6.1) ...
Setting up mongodb-org-server (2.6.1) ...
Adding system user `mongodb' (UID 143) ...
Adding new user `mongodb' (UID 143) with group `nogroup' ...
Not creating home directory `/home/mongodb'.
Adding group `mongodb' (GID 155) ...
Done.
Adding user `mongodb' to group `mongodb' ...
Adding user mongodb to group mongodb
Done.
mongod start/running, process 22386
Setting up mongodb-org-mongos (2.6.1) ...
Setting up mongodb-org-tools (2.6.1) ...
Processing triggers for ureadahead (0.100.0-16) ...
Setting up mongodb-org (2.6.1) ...

It looks like the installation is done and went fine; mongod has even been started already. So now let's play with the mongo command line.
jason@localhost:~$ mongo
MongoDB shell version: 2.6.1
connecting to: test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
http://docs.mongodb.org/
Questions? Try the support group
http://groups.google.com/group/mongodb-user
Server has startup warnings:
2014-06-02T22:29:43.933+0800 [initandlisten]
2014-06-02T22:29:43.933+0800 [initandlisten] ** NOTE: This is a 32 bit MongoDB binary.
2014-06-02T22:29:43.933+0800 [initandlisten] ** 32 bit builds are limited to less than 2GB of data (or less with --journal).
2014-06-02T22:29:43.933+0800 [initandlisten] ** Note that journaling defaults to off for 32 bit and is currently off.
2014-06-02T22:29:43.933+0800 [initandlisten] ** See http://dochub.mongodb.org/core/32bit
2014-06-02T22:29:43.934+0800 [initandlisten]
>

As you can see, I'm running a 32-bit build, but the rest of this article should work fine on 64-bit as well. Everything has been smooth sailing so far, so we will start with the create, read, update and delete operations.

  • create




To create or insert a document, it is as easy as
db.inventory.insert( { _id: 10, type: "misc", item: "card", qty: 15 } )

Another insert example: an update with upsert set to true inserts the document when nothing matches the query
db.inventory.update(
{ type: "book", item : "journal" },
{ $set : { qty: 10 } },
{ upsert : true }
)

Interesting insert using save
db.inventory.save( { type: "book", item: "notebook", qty: 40 } )
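
The reason an update can show up under "create" is the upsert option. The sketch below models that behaviour on a plain JavaScript array. It is an in-memory illustration only, not how MongoDB implements it: the upsert function is my own, it only handles simple equality queries, and it skips details such as generating an _id.

```javascript
// In-memory model of update-with-upsert (not the MongoDB implementation).
// `collection` is an array of plain objects; `query` matches on equal fields.
function upsert(collection, query, setFields) {
  const matches = (doc) => Object.keys(query).every((key) => doc[key] === query[key]);
  const existing = collection.find(matches);
  if (existing) {
    Object.assign(existing, setFields); // behaves like $set on the first match
    return { nMatched: 1, nUpserted: 0 };
  }
  // No match: insert a new document built from the query plus the $set fields.
  collection.push({ ...query, ...setFields });
  return { nMatched: 0, nUpserted: 1 };
}

const inventory = [];
const first = upsert(inventory, { type: "book", item: "journal" }, { qty: 10 });
const second = upsert(inventory, { type: "book", item: "journal" }, { qty: 25 });
console.log(first);     // { nMatched: 0, nUpserted: 1 }
console.log(second);    // { nMatched: 1, nUpserted: 0 }
console.log(inventory); // one document, with qty 25
```

The first call finds nothing and inserts; the second call finds that document and only changes qty.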



  • read




To read or query documents, it is as easy as
db.inventory.find( { type: "book", item: "journal" } )

read more example here.




  • update


See the update call in the create section above for an example.


  • delete




To remove all documents from a collection,
db.inventory.remove({})


That's it for this lengthy introduction.