VirtualBox multi-server development configuration

One virtual machine is not always enough for testing; in fact, in most cases it is not. For example, you may want to try a multi-node configuration of software like Hadoop or Cassandra, or run a failover test of your system. The setup I find most comfortable and useful lets the guest system use the Internet (usually through the default NAT mode), lets the host connect to the guest easily (without port forwarding), and lets guest systems “talk” to each other. That is good enough to mimic most of the production configurations I use. To get there, we need to set up every guest system with two network cards. This article explains how to make it work.

Continue reading

Problem with very slow sendmail startup

I was playing a bit with some virtual machines I need for testing when, after a reboot, I noticed that sendmail was starting very slowly: it took about 3-4 minutes before it was working. I checked the log to see what was wrong:

[root@hdps01 ~]# tail /var/log/maillog
Jul 11 21:26:43 hdps01 sm-msp-queue[1266]: My unqualified host name (hdps01) unknown; sleeping for retry
Jul 11 21:27:43 hdps01 sm-msp-queue[1266]: unable to qualify my own domain name (hdps01) -- using short name
Jul 11 21:27:43 hdps01 sm-msp-queue[1289]: starting daemon (8.14.4): queueing@01:00:00

So, what was the problem?

Continue reading

Be careful with Hadoop and pbzip2!

Or you may be surprised one day to see that your output is missing a lot of data. The problem affects Hadoop versions older than 1.4 (according to Jira) and is caused by a misinterpretation of the end-of-stream (EOS) marker in compressed files: it is treated as end-of-file (EOF), so the reader stops reading the file at that point:

https://issues.apache.org/jira/browse/COMPRESS-146

So, if your Hadoop is misbehaving and your output data looks odd for no apparent reason, ask your admins whether they have replaced bzip2 with pbzip2.
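To make the EOS/EOF difference concrete, here is a minimal sketch in Python. It uses only the standard bz2 module and has nothing to do with Hadoop’s own codec. It assumes what pbzip2 is known for: writing a file as several bzip2 streams concatenated one after another. The two function names are made up for this illustration.

```python
import bz2


def read_first_stream_only(data: bytes) -> bytes:
    """Mimic the buggy behaviour: stop at the first end-of-stream marker."""
    decompressor = bz2.BZ2Decompressor()
    # A BZ2Decompressor handles exactly one stream; bytes that follow the
    # end-of-stream marker are left in .unused_data and ignored here.
    return decompressor.decompress(data)


def read_all_streams(data: bytes) -> bytes:
    """Correct behaviour: after each end of stream, start on the next one."""
    parts = []
    while data:
        decompressor = bz2.BZ2Decompressor()
        parts.append(decompressor.decompress(data))
        # Whatever follows the finished stream is the next stream (if any).
        data = decompressor.unused_data
    return b"".join(parts)


if __name__ == "__main__":
    # Two independently compressed chunks glued together, similar in shape
    # to a multi-stream file (an assumption made for this illustration).
    multi_stream = bz2.compress(b"part one\n") + bz2.compress(b"part two\n")
    print(read_first_stream_only(multi_stream))
    print(read_all_streams(multi_stream))
```

The first function silently returns only the data from the first stream, which is exactly the “missing a lot of data” symptom. The second one keeps going until the input is really exhausted.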

Very good interactive Git workflow cheat sheet

A few minutes ago I was looking for a Git workflow cheat sheet to verify some rarely used parts of my knowledge before doing something I might regret. I was actually hoping to find something very simple (preferably a one-page PDF or so), but instead I found this one, which is a very good, interactive webpage. So I decided to share this find with you, because it’s definitely worth it:

http://ndpsoftware.com/git-cheatsheet.html

Git workflow cheat sheet

The best thing about it is that it presents everything in a very intuitive, visual way which is easy to understand. If you are looking for a command which will completely revert your committed changes, you can just click on “Local Repository” and see which arrow points to “Workspace”: it’s the git reset --hard one. How about keeping the changes you’ve made, but reverting the commit? That’s the arrow with git reset --soft. Brilliant!

It’s good not only for people who are looking for a typical cheat sheet, but also for those who have trouble understanding the Git workflow.

I really like it – nice work guys!

More on Cassandra’s SimpleAuthority permissions

A few days ago I had some doubts about how Cassandra’s SimpleAuthenticator and SimpleAuthority really work. I mean, I was not sure how I should configure them to get the expected results. It may seem obvious now, but I had to look at the source code to find out what is possible and what is not. So, to save you time, here’s a brief description.

Continue reading

My first Cassandra contribution

A bit surprisingly and somewhat accidentally, today I became a Cassandra contributor. In a project I work on, we ran into a problem that made our bulkloading script unable to work together with Cassandra authentication (which I described in one of the previous articles on this blog), so we decided to try solving the issue on our own.

The solution was quite simple, but it gave me a bit more knowledge and understanding of Cassandra. If you are interested in contributing to Cassandra, I think you can take a look at this problem and its solution, or even try to reproduce the problem (on Cassandra 1.1.0-rc1 or earlier) and then solve it on your own. As I said, it was simple, so you won’t get frustrated by the problems you face, but I think it’s a good start for something more. Here is the link to the issue in Cassandra’s bug tracker:

https://issues.apache.org/jira/browse/CASSANDRA-4155

Working on interesting things, being nicely paid for it and contributing to remarkable Open Source projects at the same time – could it be any better? ;)

Mounting HDFS cluster as a block device with hadoop-fuse

Using Hadoop may quickly become very annoying if you have to navigate through the HDFS filesystem with the standard hadoop command. As a Linux user I have got used to TAB autocompletion, which lets me move around my filesystem quickly and easily, so I was really disappointed to lose it. Luckily, there’s a solution which eased my pain!

Continue reading