Tuesday, March 7, 2023

how to setup mariadb galera cluster in docker container

 1. create network (one time)

   docker network create --subnet=172.18.0.0/16 mynet123


2. create image that can start service in container (one time)

   mkdir systemd

   cd systemd

   vim Dockerfile

   FROM almalinux

   ENV container docker


   RUN (cd /lib/systemd/system/sysinit.target.wants/; for i in ; do [ $i == systemd-tmpfiles-setup.service ] || rm -f $i; done);


   RUN rm -rf /lib/systemd/system/multi-user.target.wants/ \

   && rm -rf /etc/systemd/system/.wants/ \

   && rm -rf /lib/systemd/system/local-fs.target.wants/ \

   && rm -f /lib/systemd/system/sockets.target.wants/udev \

   && rm -f /lib/systemd/system/sockets.target.wants/initctl \

   && rm -rf /lib/systemd/system/basic.target.wants/ \

   && rm -f /lib/systemd/system/anaconda.target.wants/*


   VOLUME [ “/sys/fs/cgroup” ]

   CMD ["/usr/sbin/init"]


3. build image (one time)

   docker build -t almalinux-md .


4. start the container in the background

   docker run -v /tmp/node1/:/var/lib/mysql --detach --rm -it --privileged --net mynet123 --ip 172.18.0.11 --name node1  almalinux-md 

   docker run -v /tmp/node2/:/var/lib/mysql --detach --rm -it --privileged --net mynet123 --ip 172.18.0.12 --name node2  almalinux-md

   docker run -v /tmp/node3/:/var/lib/mysql --detach --rm -it --privileged --net mynet123 --ip 172.18.0.13 --name node3  almalinux-md  


5. attach it and so can install mariadb and gelera, etc

   docker exec -it node1 bash

   docker exec -it node2 bash

   docker exec -it node3 bash


6. install in all 3 nodes

   dnf install -y mariadb-server.x86_64  mariadb.x86_64 rsync python3-policycoreutils vim nc telnet epel-release.noarch mariadb-server-galera.x86_64


7. on first node

   systemctl start mariadb

   systemctl status mariadb

   [root@6c3d7f5bc394 /]# mysql -uroot

   MariaDB [(none)]> set password = password("password");

   quit;

   systemctl stop mariadb

      

8. from step 5. wsrep_node_address and wsrep_node_name update to its corresponding name   


   [root@cfeee129fe92 /]# cat /etc/my.cnf.d/galera.cnf

   [mysqld]

   binlog_format=ROW

   default-storage-engine=innodb

   innodb_autoinc_lock_mode=2

   bind-address=0.0.0.0


   # Galera Provider Configuration

   wsrep_on=ON

   wsrep_provider=/usr/lib64/galera/libgalera_smm.so


   # Galera Cluster Configuration

   wsrep_cluster_name="ot_mariadb_cluster"

   wsrep_cluster_address="gcomm://172.18.0.11,172.18.0.12,172.18.0.13"


   # Galera Synchronization Configuration

   wsrep_sst_method=rsync


   # Galera Node Configuration

   wsrep_node_address="172.18.0.13"

   wsrep_node_name="node3"

   [root@cfeee129fe92 /]# 


9.  on first node, start the cluster

    galera_new_cluster

   

10. check on the first node,

    mysql -u root -ppassword -e "SHOW STATUS LIKE 'wsrep_cluster_size'"   

   

11. bring up the remaining nodes one by one

    systemctl start mariadb

    mysql -u root -ppassword -e "SHOW STATUS LIKE 'wsrep_cluster_size'"   


12. ready to write/read

mysql -u root -ppassword -e 'CREATE DATABASE playground;

CREATE TABLE playground.equipment ( id INT NOT NULL AUTO_INCREMENT, type VARCHAR(50), quant INT, color VARCHAR(25), PRIMARY KEY(id));

INSERT INTO playground.equipment (type, quant, color) VALUES ("slide", 2, "blue");'


mysql -u root -ppassword -e 'SELECT * FROM playground.equipment;'

Thursday, February 24, 2022

Best place I think docker container is for

 docker container has been with us for many years. For software engineer which deal with legacy application, it can be intimidating to migrate to docker application. Because system stability, production investigation and quick fix and service stability surpasses every technology stack. At least that is for me, this cannot be compromise. 


So where is docker container is best for? Where do we start docker container for? For many people, if you google, they build the application into docker image and deploy. Sure, that can be done but for me, as a start, I think for learning purposes and best use cases. Here, I would like share my experience and best use cases.


code compiling

Many of us do code compiling and this is especially true if you want to deploy a bunch of application, modify here and there, build the binary , get the final result and then move on. Installing software packages into container will not taint the host. Modify the configuration also need not worry about taint the host. What is even more merrier, you don't have to clean up after you done compiling. One can just delete the container and start over. Guaranteed reproducible end result. This is similar to chroot but container allow you to do further with different o/s.


application on legacy database

sometime when we migrate development server to the latest o/s but however it break the legacy application or its dependency, example database. In such a case, docker can come to the rescue, you can install legacy database in the docker container whilst the application still can access the database without any puzzle or not expected result. This is even more true when you are busy and focus on upgrading the development server, you don't have the time to upgrade the application or its database. When you are done develop development server, you can then upgrade the database to another version inside docker. I think docker in this case is a really good fit.


quick test for code

Many times when you write bash script, you want to quickly test if the script works fine. But you don't want to test run on your machine as it will change the underlying configuration/state. If you run a container, those code can be quickly test it and when you are done, you can exit the container and nothing harmful done to the host. In this case, you are totally sure your code is tested well and in the same time with less maintenance after tests.


That's it I want to share my opinions and use cases for docker container. If I learn new way of using docker, I will update this document. I hope you will find it useful for your use cases.



Friday, June 25, 2021

How to cherry pick on a commit but only a few files from upstream patch

I forked a repository from upstream and there was a patch I like and I would like that patch goes into my forked repository. But at the same time, I would like to pick only a few changes, that is, not the whole commit and at the same time too, I need to make some minor modifications. So here goes how this is done.

Let's check where we are at now.

 jason@localhost:~/advanced-policy-firewall$ git branch  
 * master  
  rfxn_upstream  
  robertoberto-patch  
  support-ipv6  

Get the commit id from roberto patch, cherry pick the whole commit but do not commit yet.. just stage at the current branch (which is master)

 jason@localhost:~/advanced-policy-firewall$ git cherry-pick -n 0d1df6549820f9592aefb2353b77c52eadbe759f  
 Auto-merging files/vnet/vnetgen  
 CONFLICT (content): Merge conflict in files/vnet/vnetgen  
 error: could not apply 0d1df65... - fixed vnetgen for newer linux distros to prioritize ip over ipconfig  
 hint: after resolving the conflicts, mark the corrected paths  
 hint: with 'git add <paths>' or 'git rm <paths>'  
 jason@localhost:~/advanced-policy-firewall$ git status .  
 On branch master  
 Your branch is up to date with 'origin/master'.  
 Changes to be committed:  
  (use "git restore --staged <file>..." to unstage)  
      modified:  README  
      new file:  test/distros_examples/ifconfig_ubuntu_12  
      new file:  test/distros_examples/ifconfig_ubuntu_20  
      new file:  test/distros_examples/ip_addr_ubuntu_12  
      new file:  test/distros_examples/ip_addr_ubuntu_20  
      new file:  test/distros_examples/ip_link_ubuntu_12  
      new file:  test/distros_examples/ip_link_ubuntu_20  
 Unmerged paths:  
  (use "git restore --staged <file>..." to unstage)  
  (use "git add <file>..." to mark resolution)  
      both modified:  files/vnet/vnetgen  

Let's get the diff of the stage file

 jason@localhost:~/advanced-policy-firewall$ git diff --cached README  
 diff --git a/README b/README  
 index 07ad004..0adf39d 100644  
 --- a/README  
 +++ b/README  
 @@ -199,7 +199,7 @@ Fedora Core Any  
  Slackware 8.0+  
  Debian GNU/Linux 3.0+  
  Suse Linux 8.1+  
 -Unbuntu Any  
 +Ubuntu Any  
  TurboLinux Server 9+  
  TurboLinux Fuji (Desktop)  
  RedHat Linux 7.3,8,9  

Unstage a file because I wanna make minor modification. After changes are make, add it back to the stage

 $ git restore --staged test/distros_examples/ip_link_ubuntu_12  
 $ vim test/distros_examples/ip_link_ubuntu_12  
 $ git add test/distros_examples/ip_link_ubuntu_12  

All good now, let's commit

 $ git commit -m "cherry pick but only selected files from roberto patched"  
 [master 80830c9] cherry pick but only selected files from roberto patched  
  8 files changed, 330 insertions(+), 28 deletions(-)  
  create mode 100644 test/distros_examples/ifconfig_ubuntu_12  
  create mode 100644 test/distros_examples/ifconfig_ubuntu_20  
  create mode 100644 test/distros_examples/ip_addr_ubuntu_12  
  create mode 100644 test/distros_examples/ip_addr_ubuntu_20  
  create mode 100644 test/distros_examples/ip_link_ubuntu_12  
  create mode 100644 test/distros_examples/ip_link_ubuntu_20  
 $ git branch  
 * master  
  rfxn_upstream  
  robertoberto-patch  
  support-ipv6  

and we push the changes to github

 $ git push  
 Enumerating objects: 19, done.  
 Counting objects: 100% (19/19), done.  
 Delta compression using up to 16 threads  
 Compressing objects: 100% (13/13), done.  
 Writing objects: 100% (14/14), 3.68 KiB | 3.68 MiB/s, done.  
 Total 14 (delta 7), reused 0 (delta 0), pack-reused 0  
 remote: Resolving deltas: 100% (7/7), completed with 4 local objects.  
 To https://github.com/jasonwee/advanced-policy-firewall.git  
   f98e9bb..80830c9 master -> master  

in case you want to see the final result, please visit this github commit.

Monday, June 21, 2021

Continuous Integration with GitHub Action

This is a contribution to the SUSE Cloud Native Foundations Scholarship Program which I received and some of the peers want a CI online demo. So here I will share on my knowledge and experience on continuous integration using a public repository and public runner. I will start with a sample project creation, GitHub project setup and end with GitHub action setup and runner. 


What is CI? 

In software engineering, continuous integration (CI) is the practice of merging  all developers' working copies to a shared mainline several times a day.[1]


So essentially, it is a routine where every developer has to go through after they made code changes. Example, syntax check, linting, code compiling, multiple tests, package building, perhaps also support different runtime versions and/or operating systems.

There are many CI software available, please choose the one which match your requirements. I particularly like Jenkins a lot but Jenkins would require you to setup the runner on your machine. Since the objective of this blog is all public, hence, I picked GitHub action.

Let's begin with initial project setup. Reference if you want to know more.

https://maven.apache.org/guides/getting-started/maven-in-five-minutes.html

https://docs.github.com/en/actions/guides/building-and-testing-java-with-maven


```

$ export JAVA_HOME=/usr/lib/jvm/jdk-11.0.5/

$ mvn --version

Apache Maven 3.6.3

Maven home: /usr/share/maven

Java version: 11.0.5, vendor: Oracle Corporation, runtime: /usr/lib/jvm/jdk-11.0.5

Default locale: en_US, platform encoding: UTF-8

OS name: "linux", version: "5.10.0-7-amd64", arch: "amd64", family: "unix"

$ mvn archetype:generate -DgroupId=ch.weetech.app -DartifactId=demo_ci -DarchetypeArtifactId=maven-archetype-quickstart -DarchetypeVersion=1.4 -DinteractiveMode=false

[INFO] Scanning for projects...

[INFO] 

[INFO] ------------------< org.apache.maven:standalone-pom >-------------------

[INFO] Building Maven Stub Project (No POM) 1

[INFO] --------------------------------[ pom ]---------------------------------

...

...

[INFO] ------------------------------------------------------------------------

[INFO] BUILD SUCCESS

[INFO] ------------------------------------------------------------------------

[INFO] Total time:  5.555 s

[INFO] Finished at: 2021-06-20T14:53:50+08:00

[INFO] ------------------------------------------------------------------------

$ cd demo_ci

$ tree .

.

├── pom.xml

└── src

    ├── main

    │   └── java

    │       └── ch

    │           └── weetech

    │               └── app

    │                   └── App.java

    └── test

        └── java

            └── ch

                └── weetech

                    └── app

                        └── AppTest.java


11 directories, 3 files

$ mvn package


[INFO] Scanning for projects...

[INFO] 

[INFO] -----------------------< ch.weetech.app:demo_ci >-----------------------

[INFO] Building demo_ci 1.0-SNAPSHOT

[INFO] --------------------------------[ jar ]---------------------------------

[INFO] 

[INFO] --- maven-resources-plugin:3.0.2:resources (default-resources) @ demo_ci ---

[INFO] Using 'UTF-8' encoding to copy filtered resources.

...

...

...

[INFO] ------------------------------------------------------------------------

[INFO] BUILD SUCCESS

[INFO] ------------------------------------------------------------------------

[INFO] Total time:  2.221 s

[INFO] Finished at: 2021-06-20T14:56:41+08:00

[INFO] ------------------------------------------------------------------------

$ java -cp target/demo_ci-1.0-SNAPSHOT.jar ch.weetech.app.App

Hello World!

```

Once we have the sample project setup, test and compile code locally okay. Then it is time that we initialize git repository locally and push to GitHub. But before  we do of the following, you need to create a new repository on your GitHub.


```

$ mvn clean

[INFO] Scanning for projects...

[INFO] 

[INFO] -----------------------< ch.weetech.app:demo_ci >-----------------------

[INFO] Building demo_ci 1.0-SNAPSHOT

[INFO] --------------------------------[ jar ]---------------------------------

[INFO] ------------------------------------------------------------------------

[INFO] BUILD SUCCESS

[INFO] ------------------------------------------------------------------------

[INFO] Total time:  2.079 s

[INFO] Finished at: 2021-06-20T15:24:46+08:00

[INFO] ------------------------------------------------------------------------

$ git init 

hint: Using 'master' as the name for the initial branch. This default branch name

hint: is subject to change. To configure the initial branch name to use in all

hint: of your new repositories, which will suppress this warning, call:

hint: 

hint: git config --global init.defaultBranch <name>

hint: 

hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and

hint: 'development'. The just-created branch can be renamed via this command:

hint: 

hint: git branch -m <name>

$ git status -m master

$ git status .

On branch master


No commits yet


Untracked files:

  (use "git add <file>..." to include in what will be committed)

pom.xml

src/


nothing added to commit but untracked files present (use "git add" to track)

$ git add pom.xml src/

$ git commit -m "initial"

[master (root-commit) 2cd8641] initial

 3 files changed, 108 insertions(+)

 create mode 100644 pom.xml

 create mode 100644 src/main/java/ch/weetech/app/App.java

 create mode 100644 src/test/java/ch/weetech/app/AppTest.java

$ git remote add origin https://github.com/jasonwee/demo_ci.git

$ git branch -M main

$ git push -u origin main

Enumerating objects: 16, done.

Counting objects: 100% (16/16), done.

Delta compression using up to 16 threads

Compressing objects: 100% (6/6), done.

Writing objects: 100% (16/16), 1.76 KiB | 899.00 KiB/s, done.

Total 16 (delta 0), reused 0 (delta 0), pack-reused 0

To https://github.com/jasonwee/demo_ci.git

 * [new branch]      main -> main

Branch 'main' set up to track remote branch 'main' from 'origin'.

$ git status .

On branch main

Your branch is up to date with 'origin/main'.


nothing to commit, working tree clean


```

Now that you pushed everything to GitHub, next is to setup GitHub Actions.

1. Locate Actions from the project tab.

2. Picked 'Java with Maven' as this best match the current requirement.

3. Check action configurations are okay and commit it. 



4. once you commit, the action will kickstart the first run.


Congratulation! This is the first CI that you setup. The rest, as you may have already guessed will be the same as every developer will do, i.e. making code changes.

```

$ vim README.md

$ cat README.md

a contribution to https://www.udacity.com/scholarships/suse-cloud-native-foundations-scholarship

$ git status .

On branch main

Your branch is up to date with 'origin/main'.


Untracked files:

  (use "git add <file>..." to include in what will be committed)

README.md


nothing added to commit but untracked files present (use "git add" to track)

$ git add README.md 

$ git commit -m "added readme" README.md 

[main a966f09] added readme

 1 file changed, 1 insertion(+)

 create mode 100644 README.md

$ git push

Enumerating objects: 4, done.

Counting objects: 100% (4/4), done.

Delta compression using up to 16 threads

Compressing objects: 100% (3/3), done.

Writing objects: 100% (3/3), 410 bytes | 410.00 KiB/s, done.

Total 3 (delta 0), reused 0 (delta 0), pack-reused 0

To https://github.com/jasonwee/demo_ci.git

   a273126..a966f09  main -> main

```





That's it, in case you figure where can I find this repository to begin mine? https://github.com/jasonwee/demo_ci 


Friday, December 18, 2020

OpenHAB vs Xiaomi Sensor

original from https://www.planetknauer.net/blog/archives/2020-12-OpenHAB-vs-Xiaomi-Sensoren.html 


Some time ago I started to use OpenHAB to expand the monitoring of my servers to the whole apartment. It would be nice to know whether I closed the freezer in the basement again or if I left the door open out of sheer senility. Corresponding sensors are available for relatively little money, e.g. from Xiaomi (or from other manufacturers for a lot more money, whether they are easier to integrate is the big question). However, the integration of the Xiaomi sensors in OpenHAB is a Chinese opera in several acts: 

 Act 1: The starter set The starter set is available for around 80.- and it comes with a pressure switch, two door sensors and two motion detectors as well as the base station. According to the description, everything is very simple: Start the base station, install the Mihome app on an Android phone, select "Mainland China" as the server location, connect the app to the base station, activate developer mode via a hidden menu, read out the key and so that the base station can be integrated into OpenHAB. Then everything would be very easy with a data flow sensor -> Xiaomi Hub -> OpenHAB. Yes but... Exactly, because the whole thing only works with the "Mainland China Edition" base station, which is not available in Europe. You can presumably have them delivered overseas through a Chinese wholesaler (don't forget the travel adapter, Chinese plugs don't fit into European sockets). But the EU edition is unusable: - If you select "Mainland China" as the server location in the app, the base station cannot be found and therefore cannot be connected - If you choose the server location "Europe", the hidden menu to activate the developer mode is missing. 

 Act 2: Obsolete app With a little research it turned out that an outdated version of the Mihome app can be found on dubious pages, which contains an error: It writes a debug log. Where to find the access key. Unfortunately, that alone doesn't help. This allows you to integrate the hub via the Xiaomi Mi IO extension from OpenHAB. But that was all. As a result, there is still no access to the sensors. For this, the developer mode would still have to be activated, which also opens Telnet access on the device. Now there are two more options: A modified Mihome app from a Russian website that is completely in Cyrillic. Well uh ... Njet! Or unpack the soldering iron, tap the serial port, use it to gain terminal access and activate the Telnet server. Now that there is a good chance that this can destroy the device, I prefer to leave it (for now). At least I could resell it while it still works. 

 Act 3: The Aqara Hub A further look at the documentation of the Mihome binding shows: The hub in version 3 (which is available as Aqara Hub for Apple Homekit) should be a little more accessible. Unfortunately, that costs almost as much as the whole set. And then just as little can. I immediately sent it back accordingly ...

Act 4: Cheap Zigbee Stick
The Xiaomi devices, like all proprietary garbage, of course never exactly adhere to the standards. But at least enough that the protocol just barely passes as Zigbee. So I bought a USB Zigbee stick from the richest man in the world for € 9 + € 3.50 shipping. To my great surprise, this, although electronic device, it was sent from amazon germany to switzerland. Very unusual. And arrived super fast too. Also unusual.

It is a simple USB stick with a CC2531 chip and zigbee2mqtt-compatible firmware preinstalled. Very awesome!

In principle, OpenHAB would be able to address the Zigbee stick directly via the Zigbee binding. The data flow would then be Sensor -> USB stick -> OpenHAB. But there was something with the protocol standard at Xiaomi and stick to it. The sensors can be integrated, but are displayed as "offline" and no status can be queried. As usual, the following applies: why easy when it can be complicated?

Now the from-behind-through-the-chest-in-the-eye installation for the data flow sensor -> USB stick -> zigbee2mqtt -> mqtt-broker -> OpenHAB begins.
First the stick is connected, it is recognized as a USB serial device / dev / ttyACM0.
Now an MQTT broker has to be installed, e.g. mosquitto from the Debian package sources. This is started without further configuration.
Next, zigbee2mqtt is installed with what feels like two thousand Node.JS dependencies (including npm from the Debian backports if you use Debian Stable as a base). In contrast to the OpenHAB part that follows later, this is excellently documented, so that this part feels more like paint-by-numbers than system administration.

In principle, the devices can now be integrated. Simply reset the sensor with the SIM pin included in the package, and that's it. According to the instructions, you may have to repeat the process several times, but with the first two sensors it worked right away. A look at journalctl -u zigbee2mqtt -f shows activity.

Now comes the hard part: connecting OpenHAB to MQTT. This is documented very superficially and abstractly. Add to this the chaos with instructions for MQTT1 and MQTT2 binding when you google for solutions. Which now applies to my installation? Boh? Ultimately, I followed the instructions for MQTT2, and that worked at some point. Probably: MQTT1 == OpenHAB1, MQTT2 == OpenHAB2 (and I'm running 2.5).

How to proceed:
In the zigbee2mqtt configuration file /opt/zigbee2mqtt/data/configuration.yaml, the output should not be output as JSON but as an attribute. To do this, insert the following lines, save, restart zigbee2mqtt:

experimental:
    output: attribute

And if we are already fiddling with the configuration, one should also assign sensible friendly_name to the sensors.
First install the MQTT binding in OpenHAB.
Then create a .things file with the required entries in /etc/openhab2/things /. At some point I found halfway suitable instructions in the forum ...
And now you are surprised that the things appear in the GUI, but no data is read ... Signal strength? NaN. Battery level? NaN. Status? Off. grrrmpf. After a long debugging process (yes, zigbee2mqtt writes in mosquitto, you can read along with mosquitto_sub -v -t '#') at some point just triggered the spontaneous Windows reflex and restarted OpenHAB itself. Aaand! Bingo! Everything works. So easy! Incidentally, the restart is necessary for every newly added (or renamed) device.


The finale: the OpenHAB Things file


Bridge mqtt:broker:MosquittoMqttBroker "Mosquitto MQTT Broker" [ host="127.0.0.1", secure=false] {
Thing topic xdoor1 "Xiaomi Door Sensor" @ "Location" {
Channels:
Type switch : contact "contact" [ stateTopic = "zigbee2mqtt/xdoor1/contact", on="true", off="false" ]
Type number : voltage "voltage" [ stateTopic = "zigbee2mqtt/xdoor1/voltage" ]
Type number : battery "battery" [ stateTopic = "zigbee2mqtt/xdoor1/battery" ]
Type number : linkquality "linkquality" [ stateTopic = "zigbee2mqtt/xdoor1/linkquality" ]
}
}


Additional sensors can now easily be added to the bridge block. With a little more typing, sensors can also be defined outside the bridge block:


Thing mqtt:topic:MosquittoMqttBroker:BodySensor "Xiaomi Body Sensor" (mqtt:broker:MosquittoMqttBroker) @ "Location" {
Channels:
Type switch : occupancy "occupancy" [ stateTopic = "zigbee2mqtt/xbody1/occupancy", on="true", off="false" ]
Type number : voltage "voltage" [ stateTopic = "zigbee2mqtt/xbody1/voltage" ]
Type number : battery "battery" [ stateTopic = "zigbee2mqtt/xbody1/battery" ]
Type number : linkquality "linkquality" [ stateTopic = "zigbee2mqtt/xbody1/linkquality" ]
}


The existing channels can be found out via mosquitto_sub or journalctl. As soon as you stimulate a sensor, it sends all of this information to the Zigbee controller.



Applause
Of course, especially in combination with Zigbee (or Z-Wave), OpenHAB is a bottomless pit in terms of possibilities. A lot of technology can be connected even without a wireless connection: printers, mail and XMPP accounts, WLAN (or connected devices), telephone systems, mpd (Music Player Daemon), video cameras (e.g. via Zoneminder - but that would be a blog entry in itself) . With Zigbee everything gets even wilder. After the sensors, the entire rest of the house can be integrated, from lamps, heating and roller shutter control to the washing machine to the lawn mower to the wallbox of the electric vehicle.
If more Zigbee sensors / actuators are to be set up a little further away, you simply take a Raspberry Pi, connect another USB stick to it, install zigbee2mqtt and have the sensor data sent over the network to the MQTT broker on the OpenHAB machine .

Thursday, November 26, 2020

How does Reverse DNS work behind the scene - a layman explanation

Ever wonder what actually happen behind the scene when you do a reverse DNS query?

It is quick and it return a value.

 $ time dig -x 8.8.8.8 +short  
 dns.google.  
 real     0m0.019s  
 user     0m0.005s  
 sys     0m0.005s

In this article, I will explain to you want happen behind the scene.

when the query pass to your resolver, what your resolver does, when you ask it for the ptr (which is 8.8.8.8.in-addr.arpa )

 $ dig ptr 8.8.8.8.in-addr.arpa @a.root-servers.net  

which will tell them: "I don't know about in-addr.arpa - you need to ask the in-addr.arpa server" which correspond to

 ;; AUTHORITY SECTION:  
 in-addr.arpa.          172800     IN     NS     a.in-addr-servers.arpa.  
 in-addr.arpa.          172800     IN     NS     b.in-addr-servers.arpa.  
 in-addr.arpa.          172800     IN     NS     c.in-addr-servers.arpa.  
 in-addr.arpa.          172800     IN     NS     d.in-addr-servers.arpa.  
 in-addr.arpa.          172800     IN     NS     e.in-addr-servers.arpa.  
 in-addr.arpa.          172800     IN     NS     f.in-addr-servers.arpa.  

then the resolver asks one or more of them:


dig ns 8.8.8.8.in-addr.arpa @a.in-addr-servers.arpa

again, it will get delegated to the next servers, which handle "8.in-addr.arpa"


8.in-addr.arpa.		86400	IN	NS	arin.authdns.ripe.net.
8.in-addr.arpa.		86400	IN	NS	z.arin.net.
8.in-addr.arpa.		86400	IN	NS	y.arin.net.
8.in-addr.arpa.		86400	IN	NS	r.arin.net.
8.in-addr.arpa.		86400	IN	NS	x.arin.net.
8.in-addr.arpa.		86400	IN	NS	u.arin.net.

the game continues:


dig ns 8.8.8.8.in-addr.arpa @z.arin.net

"you gotta ask level 3, they know about 8.8.in-addr.arpa"


8.8.in-addr.arpa.	86400	IN	NS	ns1.level3.net.
8.8.in-addr.arpa.	86400	IN	NS	ns2.level3.net.

and the final delegation from level 3 is to the google nameservers:


dig ns 8.8.8.8.in-addr.arpa @ns1.level3.net

[...]

;; AUTHORITY SECTION:
8.8.8.in-addr.arpa.	3600	IN	NS	ns4.google.com.
8.8.8.in-addr.arpa.	3600	IN	NS	ns2.google.com.
8.8.8.in-addr.arpa.	3600	IN	NS	ns3.google.com.
8.8.8.in-addr.arpa.	3600	IN	NS	ns1.google.com.

and only from them will you get the final anser for 8.8.8.8:

 dig PTR 8.8.8.8.in-addr.arpa @ns1.google.com  
 ; <<>> DiG 9.10.6 <<>> PTR 8.8.8.8.in-addr.arpa @ns1.google.com  
 ;; global options: +cmd  
 ;; Got answer:  
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20871  
 ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1  
 ;; WARNING: recursion requested but not available  
 ;; OPT PSEUDOSECTION:  
 ; EDNS: version: 0, flags:; udp: 512  
 ;; QUESTION SECTION:  
 ;8.8.8.8.in-addr.arpa.          IN     PTR  
 ;; ANSWER SECTION:  
 8.8.8.8.in-addr.arpa.     86400     IN     PTR     dns.google.  
 ;; Query time: 132 msec  
 ;; SERVER: 2001:4860:4802:32::a#53(2001:4860:4802:32::a)  
 ;; WHEN: Thu Nov 26 10:53:58 CET 2020  
 ;; MSG SIZE rcvd: 73  

That's it!

Friday, October 2, 2020

cassandra 4.0 important points

Recently I have read a very good apache cassandra 4.0 books as available here. I would really like to recommend this book if you are new or even have use cassandra before (since version 1.0) and would like to know what have change since then. Below are the important points that I think would help me in the future of using cassandra 4.0.

---

https://github.com/jeffreyscarpenter/cassandra-guide

Cassandra versions from 3.0 onward require a Java 8 JVM or later,
preferably the latest stable version. It has been tested on both the
OpenJDK and Oracle’s JDK. Cassandra 4.0 has been compiled and
tested against both Java 8 and Java 11. You can check your installed
Java version by opening a command prompt and executing java -
version .



The committers work hard to ensure that data is readable from one
minor dot release to the next and from one major version to the
next. The commit log, however, needs to be completely cleared out
from version to version (even minor versions).
If you have any previous versions of Cassandra installed, you may
want to clear out the data directories for now, just to get up and
running. If you’ve messed up your Cassandra installation and want
to get started cleanly again, you can delete the data folders.

If you’ve used Cassandra in releases prior to 3.0, you may also be
familiar with the command-line client interface known as
cassandra-cli . The CLI was removed in the 3.0 release because it
depends on the legacy Thrift API, which was deprecated in 3.0 and
removed entirely in 4.0.


Cassandra uses a special type of primary key called a composite key (or compound
key) to represent groups of related rows, also called partitions. The composite key
consists of a partition key, plus an optional set of clustering columns. The partition key
is used to determine the nodes on which rows are stored and can itself consist of mul‐
tiple columns. The clustering columns are used to control how data is sorted for stor‐
age within a partition. Cassandra also supports an additional construct called a static
column, which is for storing data that is not part of the primary key but is shared by
every row in a partition.


Insert, Update, and Upsert
Because Cassandra uses an append model, there is no fundamental
difference between the insert and update operations. If you insert a
row that has the same primary key as an existing row, the row is
replaced. If you update a row and the primary key does not exist,
Cassandra creates it.


Remember that TTL is stored on a per-column level for nonpri‐
mary key columns. There is currently no mechanism for setting
TTL at a row level directly after the initial insert; you would instead
need to reinsert the row, taking advantage of Cassandra’s upsert
behavior. As with the timestamp, there is no way to obtain or set
the TTL value of a primary key column, and the TTL can only be
set for a column when you provide a value for the column.

Primary Keys Are Forever
After you create a table, there is no way to modify the primary key,
because this controls how data is distributed within the cluster, and
even more importantly, how it is stored on disk.



Server-Side Denormalization with Materialized Views
Historically, denormalization in Cassandra has required designing
and managing multiple tables using techniques we will introduce
momentarily. Beginning with the 3.0 release, Cassandra provides
an experimental feature known as materialized views which allows
you to create multiple denormalized views of data based on a base
table design. Cassandra manages materialized views on the server,
including the work of keeping the views in sync with the table.


A key goal as you begin creating data models in Cassandra is to minimize the number
of partitions that must be searched in order to satisfy a given query. Because the parti‐
tion is a unit of storage that does not get divided across nodes, a query that searches a
single partition will typically yield the best performance.

The CQL SELECT statement does support ORDER BY semantics, but only in the order specified by the
clustering columns (ascending or descending).


The Importance of Primary Keys in Cassandra
The design of the primary key is extremely important, as it will
determine how much data will be stored in each partition and how
that data is organized on disk, which in turn will affect how quickly
Cassandra processes read queries.


The queue anti-pattern serves as a reminder that any design that relies on the deletion
of data is potentially a poorly performing design.

A rack is a logical set of nodes in close proximity to each other, perhaps on 
physical machines in a single rack of equipment.

A data center is a logical set of racks, perhaps located in the same building 
and connected by reliable network.


The replication factor is
set per keyspace. The consistency level is specified per query, by the
client. The replication factor indicates how many nodes you want
to use to store a value during each write operation. The consistency
level specifies how many nodes the client has decided must
respond in order to feel confident of a successful read or write
operation. The confusion arises because the consistency level is
based on the replication factor, not on the number of nodes in the
system.

Since the 2.0 release, Cassandra supports a lightweight
transaction (LWT) mechanism that provides linearizable consistency.

The basic Paxos algorithm consists of two stages: prepare/promise and propose/
accept.

In
early implementations of Cassandra, memtables were stored on the JVM heap, but
improvements starting with the 2.1 release have moved some memtable data to native
memory, with configuration options to specify the amount of on-heap and native
memory available.

The counter cache was added in the 2.1 release to improve counter performance
by reducing lock contention for the most frequently accessed counters.

One interesting feature of compaction relates to its intersection with incremental
repair. A feature called anticompaction was added in 2.1.


sers with prior experience may recall that Cassandra exposes an
administrative operation called major compaction (also known as
full compaction) that consolidates multiple SSTables into a single
SSTable. While this feature is still available, the utility of perform‐
ing a major compaction has been greatly reduced over time. In fact,
usage is actually discouraged in production environments, as it
tends to limit Cassandra’s ability to remove stale data.


Traditionally, SSTables have been streamed one partition at a time.
The Cassandra 4.0 release introduced a zero-copy streaming fea‐
ture to stream SSTables in their entirety using zero-copying APIs of
the host operating system. These APIs allow files to be transferred
over the network without first copying them into the CPU. This
feature is enabled by default and has been estimated to improve
streaming speed by a factor of 5.


, the system_traces keyspace
was added in 1.2 to support request tracing. The system_auth and
system_distributed keyspaces were added in 2.2 to support role-
based access control (RBAC) and persistence of repair data, respec‐
tively. Tables related to schema definition were migrated from
system to the system_schema keyspace in 3.0.


Hinted handoffs have traditionally been stored in the sys
tem.hints table. As thoughtful developers have noted, the fact that
hints are really messages to be kept for a short time and deleted
means this usage is really an instance of the well-known anti-
pattern of using Cassandra as a queue, which is discussed in Chap‐
ter 5. Hint storage was moved to flat files in the 3.0 release.


Because Cassandra partitions data across multiple nodes, each
node must maintain its own copy of a secondary index based on
the data stored in partitions it owns. For this reason, queries
involving a secondary index typically involve more nodes, making
them significantly more expensive.
Secondary indexes are not recommended for several specific cases:
• Columns with high cardinality. For example, indexing on the
hotel.address column could be very expensive, as the vast
majority of addresses are unique.
• Columns with very low data cardinality. For example, it would
make little sense to index on the user.title column (from
the user table in Chapter 4) in order to support a query for
every “Mrs.” in the user table, as this would result in a massive
row in the index.
• Columns that are frequently updated or deleted. Indexes built
on these columns can generate errors if the amount of deleted
data (tombstones) builds up more quickly than the compac‐
tion process can handle.


Elimination of the Cluster Object
Previous versions of DataStax drivers supported the concept of a
Cluster object used to create Session objects. Recent driver ver‐
sions (for example, the 4.0 Java driver and later) have combined
Cluster and Session into CqlSession .


Because a CqlSession maintains TCP connections to multiple
nodes, it is a relatively heavyweight object. In most cases, you’ll
want to create a single CqlSession and reuse it throughout your
application, rather than continually building up and tearing down
CqlSessions . Another acceptable option is to create a CqlSession
per keyspace, if your application is accessing multiple keyspaces.


The write path begins when a client initiates a write query to a Cassandra node which
serves as the coordinator for this request. The coordinator node uses the partitioner
to identify which nodes in the cluster are replicas, according to the replication factor
for the keyspace. The coordinator node may itself be a replica, especially if the client
is using a token-aware load balancing policy. If the coordinator knows that there are
not enough replicas up to satisfy the requested consistency level, it returns an error
immediately.
Next, the coordinator node sends simultaneous write requests to all local replicas for
the data being written. If the cluster spans multiple data centers, the local coordinator
node selects a remote coordinator in each of the other data centers to forward the
write to the replicas in that data center. Each of the remote replicas acknowledges the
write directly to the original coordinator node.

The DataStax drivers do not provide separate mechanisms for
counter batches. Instead, you must simply remember to create
batches that include only counter modifications or only non-
counter modifications.


A node is considered unresponsive if it does not respond to a query before the
value specified by read_request_timeout_in_ms in the configuration file. The
default is 5 seconds.


The read repair may be performed either before or after the return to the client. If
you are using one of the two stronger consistency levels ( QUORUM or ALL ), then the
read repair happens before data is returned to the client. If the client specifies a weak
consistency level (such as ONE ), then the read repair is optionally performed in the
background after returning to the client. The percentage of reads that result in back‐
ground repairs for a given table is determined by the read_repair_chance and
dc_local_read_repair_chance options for the table.


The syntax of the WHERE clause involves two rules. First, all elements of the partition
key must be identified. Second, a given clustering key may only be restricted if all pre‐
vious clustering keys are restricted by equality.


While it is possible to change the partitioner on an existing cluster,
it’s a complex procedure, and the recommended approach is to
migrate data to a new cluster with your preferred partitioner using
techniques we discuss in Chapter 15.

Deprecation of Thrift RPC Properties
Historically, Cassandra supported two different client interfaces:
the original Thrift API, also known as the Remote Procedure Call
(RPC) interface, and the CQL native transport first added in 0.8.
For releases through 2.2, both interfaces were supported and
enabled by default. Starting with the 3.0 release, Thrift was disabled
by default and has been removed entirely as of the 4.0 release. If
you’re using an earlier version of Cassandra, know that properties
prefixed with rpc generally refer to the Thrift interface.


Timeouts
If you’re building a cluster that spans multiple data centers, it’s a good idea to
measure the latency between data centers and tune timeout values in the cassan‐
dra.yaml file accordingly.


However, you may wish to reclaim the disk space used by this excess data more
quickly to reduce the strain on your cluster. To do this, you can use the nodetool
cleanup command. To complete as quickly as possible, you can allocate all compac‐
tion threads to the cleanup by adding the -j 0 option. As with the flush command,
you can select to clean up specific keyspaces and tables.

The repair command can be restricted to run in the local data cen‐
ter via the -local option (which you may also specify via the
longer form --in-local-dc ), or in a named data center via the -dc
<name> option (or --in-dc <name> ).

Transitioning to Incremental Repair
Incremental repair became the default in the 2.2 release, and you
must use the -full option to request a full repair. If you are using a
version of Cassandra prior to 2.2, make sure to consult the release
documentation for any additional steps to prepare your cluster for
incremental repair.

If you’re using the PropertyFileSnitch , you’ll need to add the address of your new
node to the properties file on each node and do a rolling restart of the nodes in your
cluster. It is recommended that you wait 72 hours before removing the address of the
old node to avoid confusing the gossiper.

If the node is down, you’ll have to use the nodetool removenode command instead of
decommission . If your cluster uses vnodes, the removenode command causes Cassan‐
dra to recalculate new token ranges for the remaining nodes and stream data from
current replicas to the new owner of each token range.

Beware the Large Partition
In addition to the nodetool tablehistograms discussed earlier,
you can detect large partitions by searching logs for WARN mes‐
sages that reference “Writing large partition” or “Compacting large
partition.” The threshold for warning on compaction of large parti‐
tions is set by the compaction_large_partition_warning_thres
hold_mb property in the cassandra.yaml file.


On the server side, you can configure individual nodes to trace some or all of their
queries via the nodetool settraceprobability command. This command takes a
number between 0.0 (the default) and 1.0, where 0.0 disables tracing and 1.0 traces
every query.


DateTieredCompactionStrategy Deprecated
TWCS replaces the DateTieredCompactionStrategy (DTCS)
introduced in the 2.0.11 and 2.1.1 releases, which had similar goals
but also some rough edges that made it difficult to use and main‐
tain. DTCS is now considered deprecated as of the 3.8 release. New
tables should use TWCS.


Property name                        Default value           Description
read_request_timeout_in_ms           5000 (5 seconds)        How long the coordinator waits for read operations to complete
range_request_timeout_in_ms          10000 (10 seconds)      How long the coordinator should wait for range reads to complete
write_request_timeout_in_ms          2000 (2 seconds)        How long the coordinator should wait for writes to complete
counter_write_request_time_out_in_ms 5000 (5 seconds)        How long the coordinator should wait for counter writes to complete
cas_contention_timeout_in_ms         1000 (1 second)         How long a coordinator should continue to retry a lightweight transaction
truncate_request_timeout_in_ms       60000 (1 minute)        How long the coordinator should wait for truncates to complete (including snapshot)
streaming_socket_timeout_in_ms       3600000 (1 hour)        How long a node waits for streaming to complete
request_timeout_in_ms                10000 (10 seconds)      The default timeout for other, miscellaneous operations


G1GC generally requires fewer tuning decisions; the intended usage is that you need
only define the min and max heap size and a pause time goal. A lower pause time will
cause GC to occur more frequently.

There has been considerable discussion in the Cassandra community about switching
to G1GC as the default. For example, G1GC was originally the default for the Cassan‐
dra 3.0 release, but was backed out because it did not perform as well as the CMS for
heap sizes smaller than 8 GB. The emerging consensus is that the G1GC performs
well without tuning, but the default configuration of ParNew/CMS can result in
shorter pauses when properly tuned.


Request throttling
If you’re concerned about a client flooding the cluster with a large number of
requests, you can use the Java driver’s request throttling feature to limit the rate
of queries to a value you define using configuration options in the
advanced.throttler namespace. Queries in excess of the rate are queued until
the utilization is back within range. This behavior is mostly transparent from the
client perspective, but it is possible to receive a RequestThrottlingException on
executing a statement; this indicates that the CqlSession is overloaded and
unable to queue the request.



As of the 4.0 release, Cassandra supports hot reloading of certificates, which enables
certificate rotation without downtime. The keystore and truststore settings are
reloaded every 10 minutes, or you can force a refresh with the nodetool reloadssl
command.