
Using Cloudera Deploy to install Cloudera Data Platform (CDP) Private Cloud
Following our recent Cloudera Data Platform (CDP) overview, we cover how to deploy CDP Private Cloud on your on-premises infrastructure. The deployment is fully automated using the Ansible playbooks published by Cloudera, and it is reproducible on your local host using Vagrant.
CDP is an enterprise data cloud. It provides a powerful Big Data platform, built-in security with automated compliance and data protection governance, and policy-based, metadata-driven analytics for end users.
Deploying a CDP Private Cloud cluster is not an easy task. Therefore, we present a way to get a local cluster up and running in a few simple steps. We will deploy a basic cluster consisting of two nodes, a master and a worker. The cluster will run the following services: HDFS, YARN and ZooKeeper.
Prerequisites
You can use the on-premises infrastructure of your choice to deploy CDP Private Cloud. In this tutorial we will use Vagrant and VirtualBox to quickly start two virtual machines that will act as the nodes of the cluster.
VirtualBox
VirtualBox is a cross-platform virtualization application. Download the latest version of VirtualBox.
Vagrant
Vagrant is a tool for building and managing virtual machine environments. Download the latest version of Vagrant.
Once Vagrant is installed, you need to install a plugin that automatically installs the host's VirtualBox Guest Additions on the guest system. Open a terminal and type the following command:
vagrant plugin install vagrant-vbguest
Docker
Cloudera Deploy runs inside a Docker container, from which it bootstraps the cluster. Follow the official Docker instructions to install Docker on your machine.
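Before moving on, it can be useful to confirm that the three prerequisites are reachable from your terminal. The sketch below is only an illustration: `missing_tools` is a hypothetical helper name, and it assumes that VirtualBox exposes its `VBoxManage` command-line tool on your PATH.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of Cloudera Deploy.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumption: VBoxManage is the VirtualBox command-line tool.
missing=$(missing_tools VBoxManage vagrant docker)
if [ -n "$missing" ]; then
  echo "Install these before continuing:" $missing
else
  echo "All prerequisites found."
fi
```

You can also run `vagrant plugin list` to check that vagrant-vbguest appears among the installed plugins.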
Getting Started
Bootstrap your nodes
A Vagrantfile is used to configure and provision virtual machines on a per-project basis. Make sure you have an SSH key on your host before proceeding. If none is provided, the quickstart (next section) will generate an SSH key pair. Create a new file called Vagrantfile in your working directory and paste the following code:
box = "centos/7"

Vagrant.configure("2") do |config|
  config.vm.synced_folder ".", "/vagrant", disabled: true
  config.ssh.insert_key = false
  config.vm.box_check_update = false
  ssh_pub_key = File.readlines("#{Dir.home}/.ssh/id_rsa.pub").first.strip
  config.vm.provision "Add ssh_pub_key", type: "shell" do |s|
    s.inline = <<-SHELL
      echo #{ssh_pub_key} >> /home/vagrant/.ssh/authorized_keys
      sudo mkdir -p /root/.ssh/
      sudo echo #{ssh_pub_key} >> /root/.ssh/authorized_keys
      sudo touch /home/vagrant/.ssh/config
      sudo chmod 600 /home/vagrant/.ssh/config
      sudo chown vagrant /home/vagrant/.ssh/config
    SHELL
  end

  config.vm.define :master01 do |node|
    node.vm.box = box
    node.vm.network :private_network, ip: "10.10.10.11"
    node.vm.network :forwarded_port, guest: 22, host: 24011, auto_correct: true
    node.vm.network :forwarded_port, guest: 8080, host: 8080, auto_correct: true
    node.vm.provider "virtualbox" do |d|
      d.memory = 8192
    end
    node.vm.hostname = "master01.nikita.local"
  end

  config.vm.define :worker01 do |node|
    node.vm.box = box
    node.vm.network :private_network, ip: "10.10.10.16"
    node.vm.network :forwarded_port, guest: 22, host: 24015, auto_correct: true
    node.vm.provider "virtualbox" do |d|
      d.customize ["modifyvm", :id, "--memory", 2048]
      d.customize ["modifyvm", :id, "--cpus", 2]
      d.customize ["modifyvm", :id, "--ioapic", "on"]
    end
    node.vm.hostname = "worker01.nikita.local"
  end
end
The master01 node has the FQDN master01.nikita.local and the IP 10.10.10.11. The worker01 node has the FQDN worker01.nikita.local and the IP 10.10.10.16.
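The Vagrantfile reads your public key from ~/.ssh/id_rsa.pub, so provisioning fails if that file is absent. The following sketch checks for it first; `has_public_key` is a hypothetical helper name, and the suggested `ssh-keygen` options are just one reasonable choice, not a requirement of Cloudera Deploy.

```shell
#!/bin/sh
# Succeeds when the given public key file exists and is not empty.
# has_public_key is a hypothetical helper used for illustration only.
has_public_key() {
  [ -s "$1" ]
}

# This is the path read by the Vagrantfile in this tutorial.
pub_key="$HOME/.ssh/id_rsa.pub"
if has_public_key "$pub_key"; then
  echo "Found $pub_key"
else
  echo "No key at $pub_key; you could create one with:"
  echo "  ssh-keygen -t rsa -b 4096 -f \"\$HOME/.ssh/id_rsa\""
fi
```

The snippet only reports what it finds and never generates a key by itself, so it is safe to run repeatedly.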
Now run the following command:
vagrant up
It creates two connected virtual machines that form a small cluster.
Edit your local /etc/hosts file by adding the following lines:
10.10.10.11 master01.nikita.local
10.10.10.16 worker01.nikita.local
Now connect to master01 using SSH (for example, with vagrant ssh master01), then add or edit the following lines in its /etc/hosts file:
10.10.10.11 master01.nikita.local
10.10.10.16 worker01.nikita.local
Repeat the operation by connecting to worker01.
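Because the same two entries are added on the host and on both nodes, a small idempotent helper avoids duplicate lines when the step is repeated. This is a minimal sketch: `add_host_entry` is a hypothetical function name, and it assumes you run it with enough privileges to write to the target file.

```shell
#!/bin/sh
# Append "ip name" to a hosts file unless the name is already listed.
# add_host_entry is a hypothetical helper used for illustration only.
add_host_entry() {
  hosts_file="$1"
  ip="$2"
  name="$3"
  # -F treats the name literally, -w matches it as a whole word.
  if grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Example, to be run as root on the host, master01 and worker01:
#   add_host_entry /etc/hosts 10.10.10.11 master01.nikita.local
#   add_host_entry /etc/hosts 10.10.10.16 worker01.nikita.local
```

Calling the function twice with the same arguments leaves a single entry in the file.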
Download the quickstart script
The quickstart.sh script will set up the Docker container with the software dependencies you need for deployment. Download it to your host computer with the following command:
curl https://raw.githubusercontent.com/cloudera-labs/cloudera-deploy/main/quickstart.sh -o quickstart.sh
Run the quickstart script
The script will prepare and run the Ansible Runner in a Docker container.
chmod +x quickstart.sh
./quickstart.sh
You should see the orange cldr {build}-{version} #> prompt. You are now inside the container.
Create an inventory file
Navigate to the cloudera-deploy folder (/opt/cloudera-deploy inside the container) and create a new file called inventory_static.ini which contains your hosts:
[cloudera_manager]
master01.nikita.local
[cluster_master_nodes]
master01.nikita.local host_template=Master1
[cluster_worker_nodes]
worker01.nikita.local
[cluster_worker_nodes:vars]
host_template=Workers
[cluster:children]
cluster_master_nodes
cluster_worker_nodes
[db_server]
master01.nikita.local
[deployment:children]
cluster
db_server
[deployment:vars]
ansible_user=vagrant
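A quick way to sanity-check the inventory is to list the distinct hosts it declares and confirm that each one matches an entry you added to /etc/hosts. The sketch below does this with awk; `inventory_hosts` is a hypothetical helper name, and the parsing only covers the simple INI layout used in this tutorial (it skips `:vars` and `:children` sections).

```shell
#!/bin/sh
# Print the unique host names declared in a simple INI-style inventory.
# inventory_hosts is a hypothetical helper, not an Ansible command.
inventory_hosts() {
  awk '
    /^[[:space:]]*$/    { next }   # skip blank lines
    /^[[:space:]]*[#;]/ { next }   # skip comments
    /^\[/ { skip = ($0 ~ /:(vars|children)\]/); next }   # section header
    skip  { next }                 # variables and child group names
    { print $1 }                   # first field is the host name
  ' "$1" | sort -u
}

# Only runs when the inventory from this tutorial is in the current folder.
if [ -f inventory_static.ini ]; then
  inventory_hosts inventory_static.ini
fi
```

For the inventory above, the function prints master01.nikita.local and worker01.nikita.local. Ansible's own `ansible-inventory -i inventory_static.ini --graph` gives a fuller view of the group hierarchy.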
Configure the cluster
Set use_download_mirror to no in the definition file located at examples/sandbox/definition.yml to avoid triggering behaviors dependent on public cloud services.
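If you prefer not to edit the file by hand, the change can be scripted. The following is a sketch under one assumption: that the setting appears on a single line of the form `use_download_mirror: <value>`. The function name `set_download_mirror_off` is hypothetical, and anything after the colon on that line (including a trailing comment) is replaced.

```shell
#!/bin/sh
# Rewrite the use_download_mirror line of a definition file to "no".
# Assumes the key sits on one line as "use_download_mirror: <value>".
# A backup of the original file is kept with a .bak suffix.
set_download_mirror_off() {
  sed -i.bak -E \
    's/^([[:space:]]*use_download_mirror:[[:space:]]*).*/\1no/' "$1"
}

# Only runs when the definition file from this tutorial is present.
if [ -f examples/sandbox/definition.yml ]; then
  set_download_mirror_off examples/sandbox/definition.yml
  grep use_download_mirror examples/sandbox/definition.yml
fi
```

Reviewing the grep output afterwards is a cheap way to confirm that the line was actually found and changed.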
Run the main playbook
ansible-playbook /opt/cloudera-deploy/main.yml -e "definition_path=examples/sandbox" -e "profile=/opt/cloudera-deploy/profile.yml" -i /opt/cloudera-deploy/inventory_static.ini -t default_cluster
The command creates a CDP Private Cloud Base cluster on your local infrastructure. More specifically, it deploys a cluster with HDFS, YARN, and ZooKeeper.
Conclusion
Cloudera Data Platform can be deployed in a variety of ways, making it a versatile option when considering a data platform. In this article, we described how to deploy a CDP Private Cloud cluster using Cloudera's official deployment script. This allows the user to test the platform locally and make relevant business decisions. From there, you can add services to your cluster as well as configure CDP Private Cloud's built-in components.
Troubleshooting
Should you encounter any problems with SSH between the host and the two virtual machines, you can force the installation of the VirtualBox Guest Additions for master01 and worker01 by adding the following line to their individual configurations in the Vagrantfile:
node.vbguest.installer_options = { allow_kernel_upgrade: true }
SSH_AUTH_SOCK
The quickstart.sh script may terminate abruptly if it detects that the SSH_AUTH_SOCK path is not properly defined or is empty. If you encounter this error, first run the following command:
This returns the path to the Unix socket used by ssh-agent, which must be set as the SSH_AUTH_SOCK variable in the quickstart script for SSH to work properly. Your quickstart script should now look like this:
In this example, the socket path is “/run/user/1000/keyring/ssh”.
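To tell whether this issue applies to you before launching the quickstart, you can test the variable in your current shell. This is only a diagnostic sketch and not part of the quickstart script itself; `auth_sock_ok` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds when SSH_AUTH_SOCK is set and points to an existing socket.
# auth_sock_ok is a hypothetical helper used for illustration only.
auth_sock_ok() {
  [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

if auth_sock_ok; then
  echo "ssh-agent socket: $SSH_AUTH_SOCK"
else
  echo "SSH_AUTH_SOCK is empty or does not point to a socket"
fi
```

If the check fails, no agent socket is visible to your shell, which matches the situation described in this section.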