Showing posts with label github. Show all posts

Sunday, June 18, 2017

Set up a local cassandra repository that coexists with cassandra upstream

For the past several years, Apache Cassandra has been part of my daily work. From Cassandra 0.8 through 1.2 and on to 3.9 (the current version as of this writing), I have been modelling data in it, inserting and retrieving data, and administering and maintaining production clusters.

To take a step further, I decided to look into Cassandra development. In this article, I describe how I got the source from the official Cassandra git repository and set up my own repository on GitHub, so that the two coexist.

Previously I had cloned the official repository with git clone http://git-wip-us.apache.org/repos/asf/cassandra.git

 user@localhost:~/cassandra-trunk$ git remote -v  
 origin     http://git-wip-us.apache.org/repos/asf/cassandra.git (fetch)  
 origin     http://git-wip-us.apache.org/repos/asf/cassandra.git (push)  

So far, pretty much the usual. Now, let's change it to the following:
* origin points to my GitHub repository
* upstream points to the official Apache Cassandra repository

First, let's remove the remote origin and then add my GitHub repository. Of course, create an empty repository on GitHub first before you continue.

 user@localhost:~/cassandra-trunk$ git remote remove origin  
 user@localhost:~/cassandra-trunk$ git remote add origin https://github.com/jasonwee/cassandra.git  

Now, let's check the remote origin.

 user@localhost:~/cassandra-trunk$ git remote -v  
 origin     https://github.com/jasonwee/cassandra.git (fetch)  
 origin     https://github.com/jasonwee/cassandra.git (push)  

Okay, everything is on track and as expected. Now let's add the official Cassandra git repository as upstream.

 user@localhost:~/cassandra-trunk$ git remote add upstream http://git-wip-us.apache.org/repos/asf/cassandra.git  

and then we check again.

 user@localhost:~/workspace/StudyCassandra/cassandra-trunk$ git remote -v  
 origin     https://github.com/jasonwee/cassandra.git (fetch)  
 origin     https://github.com/jasonwee/cassandra.git (push)  
 upstream     http://git-wip-us.apache.org/repos/asf/cassandra.git (fetch)  
 upstream     http://git-wip-us.apache.org/repos/asf/cassandra.git (push)  

OK, everything is good to go. Now that we have two remote repositories, how do we continue working? When we pull, we have to specify which remote to pull from and which branch to pull. In the following example, we pull from upstream on the trunk branch (Cassandra's equivalent of master).

 user@localhost:~/cassandra-trunk$ git pull upstream trunk  
 From http://git-wip-us.apache.org/repos/asf/cassandra  
  * branch      trunk   -> FETCH_HEAD  
 Already up-to-date.  

Beautiful, now we can pull from upstream into our working repository. Let's push our repository to GitHub now.

 user@localhost:~/cassandra-trunk$ git push -u origin trunk  
 Counting objects: 266270, done.  
 Delta compression using up to 8 threads.  
 Compressing objects: 100% (42703/42703), done.  
 Writing objects: 100% (266270/266270), 136.16 MiB | 547.00 KiB/s, done.  
 Total 266270 (delta 160130), reused 265316 (delta 159364)  
 remote: Resolving deltas: 100% (160130/160130), done.  
 To https://github.com/jasonwee/cassandra.git  
  * [new branch]   trunk -> trunk  
 Branch trunk set up to track remote branch trunk from origin.  
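
If you want to rehearse the whole sequence without touching the real repositories, here is a minimal sketch. It assumes git is installed and uses two throwaway bare repositories in a temporary directory as stand-ins for the Apache upstream and the GitHub fork; the names upstream.git, fork.git and seed are made up for this illustration.

```shell
#!/bin/sh
# Sketch: rehearse the origin/upstream layout with throwaway local repositories.
set -e
work=$(mktemp -d)
cd "$work"

# Two bare repositories stand in for the Apache upstream and the GitHub fork.
git init -q --bare upstream.git
git init -q --bare fork.git

# Seed the stand-in upstream with a single commit on trunk.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/trunk
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git push -q "$work/upstream.git" trunk
cd "$work"

# Clone the stand-in upstream, then rewire the remotes as in the article.
git clone -q -b trunk "$work/upstream.git" cassandra-trunk
cd cassandra-trunk
git remote remove origin
git remote add origin "$work/fork.git"
git remote add upstream "$work/upstream.git"

# Pull from upstream, then publish to the fork and track it.
git pull -q upstream trunk
git push -q -u origin trunk

remotes=$(git remote | sort | paste -sd, -)
tracking=$(git rev-parse --abbrev-ref 'trunk@{upstream}')
echo "remotes: $remotes"
echo "trunk tracks: $tracking"
```

After the push with -u, a plain git push goes to origin, while upstream still has to be named explicitly when pulling.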

okay, that's it, one more step ahead.

Friday, August 29, 2014

Where to read branch work (or commits) in github?

Have you ever been stuck in any of these situations?

  • You have been working on a branch for days and want to review your own code by browsing through the history, but have no idea how.

  • You want to let a colleague look at the work you have done and review the code for you.

  • You want to see the changes you made in the branch and write a change log before you merge back into the master branch.


Today, we are going to learn just that.

On the command line, you can use git log. LEAD-451 is one of my branches, used here for illustration; change it to the branch you want to view.
git log master...LEAD-451

This shows the commits, including the commit hash, author, date and message. Note that the order is reverse chronological, with the latest at the top and the oldest at the bottom. You can use --reverse to see the oldest first. Also note that with three dots, git log lists commits that are on either branch but not on both; use two dots (master..LEAD-451) to list only the commits on LEAD-451.

If you want to see the status of each changed file, add --name-status to the command.

If you want to see the actual code changes, use git diff:
git diff master...LEAD-451

and you get a lengthy diff of the changes on branch LEAD-451 relative to branch master. git diff already prints in patch format (-p is its default), so you can redirect the output to a file to create a patch. If you want to see which files were changed, added or deleted between these two branches, add --name-status or --name-only.
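
To see the difference between the two-dot and three-dot forms for yourself, here is a small sketch on a throwaway repository. It assumes git is installed; the branch name feature-demo and the file names are invented for the illustration and play the role of LEAD-451 and its changes.

```shell
#!/bin/sh
# Sketch: compare master..branch, master...branch and git diff on a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name demo
git config user.email demo@example.com

echo base > base.txt
git add base.txt
git commit -q -m "base commit"

# Branch work: one new file on feature-demo.
git checkout -q -b feature-demo
echo feature > feature.txt
git add feature.txt
git commit -q -m "add feature file"

# Meanwhile master moves on with an unrelated commit.
git checkout -q master
echo more > master-only.txt
git add master-only.txt
git commit -q -m "master moves on"

# Two dots: commits reachable from feature-demo but not from master.
branch_only=$(git log --format=%s master..feature-demo)
# Three dots in git log: commits on either side, but not on both.
either_side=$(git log --format=%s master...feature-demo | sort | paste -sd, -)
# Three dots in git diff: changes on feature-demo since it forked from master.
changed=$(git diff --name-status master...feature-demo)

echo "branch only: $branch_only"
echo "either side: $either_side"
echo "changed files: $changed"
```

The file added on master after the fork point does not show up in the three-dot diff, which is usually what you want when writing a change log for the branch.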

Enough of the command line; now for a visual representation. For this, I will use GitHub.

For the same comparison, GitHub has a feature called compare view.

https://github.com/Opentracker/luceneOnCassandra/compare/master...LEAD-451

As you can see at the bottom of that page, the output is much the same as what we got on the command line earlier, but GitHub condenses everything into one page, which is very nice.

Assuming you are at your project landing page on GitHub, how do you quickly get to the compare view?

  • Start at the front page, https://github.com/Opentracker/luceneOnCassandra/

  • Click on the branch drop-down and select the branch you want to diff, for example LEAD-451.

  • On the page https://github.com/Opentracker/luceneOnCassandra/tree/LEAD-451, click on the compare button.


 

That's it. I hope you learned something, and please donate as a means to continue funding the maintenance of this blog. Thank you.

Saturday, August 16, 2014

how to push branch work to github and list unpushed git commit

Often, when we work on an issue, we branch from the master branch and start development on that branch. However, if the branch work is never published, your co-developers cannot read the changes. In this article, we are going to learn how to publish the branch work to GitHub.

You should already be familiar with basic branch work in git, for example:
git branch my-branch-work
git checkout my-branch-work
# do development work here until you are ready to merge into the master branch
git checkout master
git merge my-branch-work

If your current local branch is my-branch-work and you try to pull from GitHub, you will get a message similar to the one below.
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details

git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

git branch --set-upstream-to=origin/<branch> my-branch-work

That is when you should push your branch work to GitHub.
jason@localhost:~$ git push -u origin my-branch-work
Username for 'https://github.com': xxxxxx
Password for 'https://xxxxxx@github.com':
Total 0 (delta 0), reused 0 (delta 0)
To https://github.com/organization/myproject.git
* [new branch] my-branch-work -> my-branch-work
Branch my-branch-work set up to track remote branch my-branch-work from origin.

Then make sure you are also on the correct local branch:
$ git branch
* my-branch-work
master

The next time you run git pull, you will not receive the error. When you want to push further branch changes to GitHub, use this command.
$ git push origin my-branch-work
Username for 'https://github.com': xxxxxx
Password for 'https://xxxxxx@github.com':
Counting objects: 10, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (10/10), 2.73 KiB | 0 bytes/s, done.
Total 10 (delta 3), reused 0 (delta 0)
To https://github.com/organization/myproject.git
954be4a..5c1bcb6 my-branch-work -> my-branch-work

Often you commit locally and carry on developing, then pause for a while because of other priorities. When you come back and run git status, you notice that there are some local commits you never pushed, but you have forgotten what is actually in them. So is there a way to view them?
$ git status
On branch master
Your branch is ahead of 'origin/master' by 2 commits.
(use "git push" to publish your local commits)

Yes, there is: you can use a command like git log origin/master..HEAD

Another helpful command is git diff origin/master..HEAD, which shows the actual differences.
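
Here is a minimal sketch of that check on a scratch repository, assuming git is installed. A throwaway bare repository (remote.git, a made-up name) stands in for GitHub, and the commit messages are only for illustration.

```shell
#!/bin/sh
# Sketch: list local commits that have not been pushed to origin/master yet.
set -e
base=$(mktemp -d)
git init -q --bare "$base/remote.git"
git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/master
git config user.name demo
git config user.email demo@example.com
git remote add origin "$base/remote.git"

# One commit that gets published.
git commit -q --allow-empty -m "published commit"
git push -q -u origin master

# Two commits that stay local.
git commit -q --allow-empty -m "local change one"
git commit -q --allow-empty -m "local change two"

# Commits reachable from HEAD but not from origin/master.
unpushed=$(git log --format=%s origin/master..HEAD)
ahead=$(git rev-list --count origin/master..HEAD)
echo "$unpushed"
echo "ahead by $ahead"
```

The count matches the "ahead of 'origin/master' by 2 commits" line that git status prints.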

That's it, I hope you like it and you can donate via our donation page. Thank you.

Monday, December 23, 2013

Elasticsearch index slow log for search and indexing

Today, we are going to learn about Elasticsearch logging for search and indexing. The Elasticsearch config file, elasticsearch.yml, should contain a section such as the one below:
################################## Slow Log ##################################

# Shard level query and fetch threshold logging.

#index.search.slowlog.threshold.query.warn: 10s
#index.search.slowlog.threshold.query.info: 5s
#index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 500ms

#index.search.slowlog.threshold.fetch.warn: 1s
#index.search.slowlog.threshold.fetch.info: 800ms
#index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.trace: 200ms

#index.indexing.slowlog.threshold.index.warn: 10s
#index.indexing.slowlog.threshold.index.info: 5s
#index.indexing.slowlog.threshold.index.debug: 2s
index.indexing.slowlog.threshold.index.trace: 500ms

In this example, I have enabled trace logging for the search query and search fetch phases, with thresholds of 500ms and 200ms respectively. A search in Elasticsearch consists of a query phase and a fetch phase, hence the two settings for search. Logging for indexing is also enabled, with a threshold of 500ms.

With these settings, if an indexing or search operation exceeds its threshold, an entry is logged to a file. The log file should be located in the directory given by path.logs in elasticsearch.yml.

So what do the numbers really mean? Excerpts from the official Elasticsearch documentation:

The logging is done on the shard level scope, meaning the execution of a search request within a specific shard. It does not encompass the whole search request, which can be broadcast to several shards in order to execute. Some of the benefits of shard level logging is the association of the actual execution on the specific machine, compared with request level.


 

All settings are index level settings (and each index can have different values for it), and can be changed in runtime using the index update settings API.


 

... and I have tried updating the index settings via a simple tool I made earlier. The idea is the same either way: you just need to send an HTTP request (the update settings API uses PUT) that puts the variable into the index settings. You can find more information here. The keys for the configuration are available in the ShardSlowLogSearchService.java class.
[jason@node1 bin]$ ./indices-setting.sh set search.slowlog.threshold.query.trace 500
{
"ok" : true,
"acknowledged" : true
}
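
If you do not have such a tool, a rough equivalent with curl could look like the sketch below. The host localhost:9200 is an assumption, index_test is taken from the log entry in this article, and the APPLY switch is a made-up guard so that the script only prints the request unless you explicitly ask it to contact the cluster.

```shell
#!/bin/sh
# Sketch: change a slow log threshold at runtime through the update settings API.
# Assumptions: Elasticsearch listens on localhost:9200 and the index is index_test.
es_url="http://localhost:9200"
index="index_test"
setting="index.search.slowlog.threshold.query.trace"
value="500ms"

# JSON body carrying the single setting to change.
payload=$(printf '{"%s": "%s"}' "$setting" "$value")
endpoint="$es_url/$index/_settings"
echo "PUT $endpoint"
echo "$payload"

# Only contact the cluster when explicitly asked to (APPLY=1 is a hypothetical switch).
if [ "${APPLY:-0}" = "1" ]; then
  curl -s -XPUT "$endpoint" -H 'Content-Type: application/json' -d "$payload"
fi
```

Because the slow log thresholds are index level settings, the same request with another index name sets a different value for that index.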

[2013-12-23 12:31:12,758][TRACE][index.search.slowlog.query] [node1] [index_test][146] took[1s], took_millis[1026], types[foo,bar], stats[], search_type[QUERY_THEN_FETCH], total_shards[90], source[{"size":80,"timeout":10000,"query":{"filtered":{"query":{"query_string":{"query":"maxis*","default_operator":"and"}},"filter":{"and":{"filters":[{"query":{"match":{"site":{"query":"www.google.com","type":"boolean"}}}},{"range":{"unixtimestamp":{"from":null,"to":1387825199000,"include_lower":true,"include_upper":true}}}]}}}},"filter":{"query":{"match":{"site":{"query":"www.google.com","type":"boolean"}}}},"sort":[{"unixtimestamp":{"order":"desc"}}]}], extra_source[],

In this example, the query ran for about 1 second (took_millis[1026]), which exceeded the threshold set at 500ms.
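
If you want to check entries like this in bulk, the took_millis field is easy to pull out with standard tools. The following is only a sketch: the helper name took_millis is made up, and the sample line is a shortened copy of the entry above.

```shell
#!/bin/sh
# Sketch: extract took_millis from a slow log entry and compare it with a threshold.

# Hypothetical helper: print the number inside took_millis[...] for one log line.
took_millis() {
  printf '%s\n' "$1" | sed -n 's/.*took_millis\[\([0-9][0-9]*\)\].*/\1/p'
}

# Shortened copy of the slow log entry shown in the article.
line='[2013-12-23 12:31:12,758][TRACE][index.search.slowlog.query] [node1] [index_test][146] took[1s], took_millis[1026], types[foo,bar]'
threshold_ms=500

took=$(took_millis "$line")
if [ "$took" -gt "$threshold_ms" ]; then
  verdict="over"
else
  verdict="under"
fi
echo "took ${took}ms, threshold ${threshold_ms}ms: $verdict"
```

The same helper can be fed each line of the slow log file to find the slowest shard-level queries.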

As for indexing, the fundamental concept is the same, so we won't elaborate on it in this article; consider it an exercise for you. :-)