Apache Hadoop Download and Installation for Ubuntu

If you are just getting acquainted with big data, Hadoop is one tool you need to know about, and you may want to try installing it yourself. The simplest approach is to install Hadoop on one machine, known as a single-node setup. If you don't have a Linux environment of your own, the easiest option is to run Linux as a virtual machine with VMware or VirtualBox.

By doing the installation yourself, you will get a clearer picture of what Hadoop's components are and roughly how Hadoop works.

This tutorial describes the steps for installing Hadoop 2.6.0 on Ubuntu 14.04 running in VMware.

First, make sure you have prepared an Ubuntu Linux system in VMware; then install Java.

Install OpenJDK
Install OpenJDK 7 by entering the following command in your Linux terminal:
$ sudo apt-get install openjdk-7-jdk

Check the Java installation
Then verify the Java installation by entering the following command in the terminal:
$ java -version
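
The same check can also tell you where the JDK lives, which is useful later when setting JAVA_HOME. The following is a minimal sketch, assuming a GNU system where readlink -f is available; java_home_from_bin is a hypothetical helper name, not part of Java or Hadoop.

```shell
# java_home_from_bin PATH: strip the trailing /bin/java (and an optional
# /jre directory before it) from the full path of a java binary.
java_home_from_bin() {
    jh_path="${1%/bin/java}"
    printf '%s\n' "${jh_path%/jre}"
}

# If a JVM is installed, resolve the symlink chain behind `java`
# and print the directory that JAVA_HOME should point to.
if command -v java >/dev/null 2>&1; then
    java_home_from_bin "$(readlink -f "$(command -v java)")"
fi
```

If no java binary is on the PATH, the script prints nothing.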

Create a dedicated Hadoop user
Create a dedicated user for running Hadoop. This step is not mandatory, but it is recommended in order to separate the Hadoop installation from other applications on the same machine. Type the following commands in your terminal:
$ sudo addgroup hdgroup
$ sudo adduser --ingroup hdgroup hduser

Add the Hadoop user to the sudo group (needed to create directories, set permissions, etc.)
$ sudo adduser hduser sudo

Configure SSH
Hadoop requires SSH access to manage its nodes. For single-node Hadoop, we need to configure SSH access to localhost for the Hadoop user we created earlier.

1. Install ssh
$ sudo apt-get install ssh

2. Generate an SSH key for the Hadoop user
$ su - hduser
$ ssh-keygen -t rsa -P ""

The second command creates an RSA key pair with an empty passphrase. Using an empty passphrase is generally not recommended for security reasons, but here we need passwordless access so that Hadoop can interact with its nodes without prompting for a password every time.

3. Enable SSH access to your local machine with the newly created key
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

4. Test SSH settings
The final step is to test the SSH setup by connecting to localhost as the Hadoop user. This step is also needed to store the host key fingerprint in the Hadoop user's known_hosts file.
$ ssh localhost
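
Running the cat command from step 3 more than once appends the same key to authorized_keys each time. The following is a minimal sketch of an idempotent alternative; authorize_key is a hypothetical helper name, and restricting the file mode to 600 is a common precaution rather than something the steps above require.

```shell
# authorize_key PUBKEY_FILE AUTH_FILE: append the public key to AUTH_FILE
# only if that exact line is not already present.
authorize_key() {
    ak_pub="$1"
    ak_auth="$2"
    mkdir -p "$(dirname "$ak_auth")"
    touch "$ak_auth"
    ak_key=$(cat "$ak_pub")
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    grep -qxF -e "$ak_key" "$ak_auth" || printf '%s\n' "$ak_key" >> "$ak_auth"
    # Keep the file readable and writable by its owner only.
    chmod 600 "$ak_auth"
}

# Usage (as hduser):
#   authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
```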

Installing Hadoop
Download Hadoop from the official website and extract it to a directory, for example /usr/local/hadoop. The commands below assume the downloaded archive is already in /usr/local.
$ cd /usr/local
$ sudo tar xzf hadoop-2.6.0.tar.gz
$ sudo mv hadoop-2.6.0 hadoop

Do not forget to give the Hadoop user ownership of the directory
$ sudo chown -R hduser:hdgroup hadoop

Update the .bashrc file
Add the Hadoop-related environment settings to the end of the $HOME/.bashrc file of the Hadoop user. If you are using a shell other than bash, update the corresponding configuration file instead.
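
As a rough sketch of what such settings usually look like, the lines below export the Java and Hadoop locations and put the Hadoop scripts on the PATH. The paths are assumptions based on the install steps in this tutorial (OpenJDK 7 on 64-bit Ubuntu, Hadoop in /usr/local/hadoop); adjust them to match your machine.

```shell
# Assumed install locations; change them if your system differs.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop

# Make the hadoop command and the start/stop scripts available everywhere.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

After editing the file, open a new terminal or run source $HOME/.bashrc so the settings take effect.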

Update the Hadoop configuration files
The following files need to be updated in the directory /usr/local/hadoop/etc/hadoop/

1. File hadoop-env.sh
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

2. Files *-site.xml
Create a temporary directory for Hadoop to use as the value of the hadoop.tmp.dir parameter; this tutorial uses /app/hadoop/tmp.
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hdgroup /app/hadoop/tmp

Note: If the above steps are skipped, you will likely get permission denied errors or a java.io.IOException when formatting the HDFS NameNode.
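
The directory only takes effect once hadoop.tmp.dir is set in core-site.xml. The following is a minimal sketch of writing such a file; write_core_site is a hypothetical helper name, and the fs.defaultFS address (localhost, port 9000) is an assumption for a single-node setup, not a value taken from this tutorial.

```shell
# write_core_site DIR: write a minimal core-site.xml into DIR.
write_core_site() {
    cat > "$1/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Base directory for Hadoop's temporary and default storage paths. -->
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <!-- Assumed NameNode address; the port is a free choice. -->
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}

# Usage (as hduser): write_core_site /usr/local/hadoop/etc/hadoop
```

Note that this overwrites any existing core-site.xml in the target directory, so back up a file you have already customized.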

Format the HDFS file system
Do this only on first installation. Do not format the NameNode of a Hadoop file system that is already in use (contains data), because the format command erases all data in HDFS.
$ /usr/local/hadoop/bin/hadoop namenode -format
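
Because formatting is destructive, it can help to guard the command. The following is a minimal sketch, assuming the NameNode stores its metadata in the default location under hadoop.tmp.dir (dfs/name/current/VERSION); namenode_is_formatted and format_if_empty are hypothetical helper names.

```shell
# namenode_is_formatted TMPDIR: succeeds if a NameNode under TMPDIR
# already holds metadata (assumes the default dfs/name layout).
namenode_is_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

# format_if_empty TMPDIR: run the format command only when no existing
# NameNode metadata is found; otherwise refuse and return non-zero.
format_if_empty() {
    if namenode_is_formatted "$1"; then
        echo "NameNode under $1 is already formatted; refusing to format again." >&2
        return 1
    fi
    /usr/local/hadoop/bin/hadoop namenode -format
}

# Usage (as hduser): format_if_empty /app/hadoop/tmp
```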


Start single-node Hadoop cluster
$ /usr/local/hadoop/sbin/start-dfs.sh

This command starts the NameNode, DataNode, and SecondaryNameNode daemons.

One practical way to check whether the Hadoop processes are running successfully is the jps tool (included in the OpenJDK 7 JDK package).
$ jps
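
Reading the jps listing by eye works, but the check can also be scripted. The following is a minimal sketch that reads jps output (one "PID ClassName" pair per line) and prints the HDFS daemons that are missing; missing_daemons is a hypothetical helper name, and the daemon list assumes only the HDFS services have been started.

```shell
# missing_daemons: read jps output on stdin and print the name of each
# expected HDFS daemon that does not appear in it.
missing_daemons() {
    md_out=$(cat)
    for md_name in NameNode DataNode SecondaryNameNode; do
        # Compare the second field exactly, so that "NameNode" does not
        # match the "SecondaryNameNode" line.
        printf '%s\n' "$md_out" \
            | awk -v d="$md_name" '$2 == d { found = 1 } END { exit !found }' \
            || printf '%s\n' "$md_name"
    done
}

# Usage: jps | missing_daemons
```

An empty result means all three daemons were found.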

Stop Hadoop services
$ /usr/local/hadoop/sbin/stop-dfs.sh

