Saturday, 26 September 2015

Installing the VM Image for the Jupyter (IPython) Notebook with PySpark

The steps below set up a VirtualBox VM, managed by Vagrant, that runs the Jupyter (IPython) Notebook with PySpark. Two tools must be installed first:

VirtualBox 4.3.28 (or later).
Check whether VirtualBox is already installed on your machine with:
vboxmanage --version
If it is not installed, install it (on Debian or Ubuntu) with:
sudo apt-get install virtualbox

Vagrant 1.7.2 (or later).
Check whether Vagrant is already installed on your machine with:
vagrant --version
If it is not installed, install it (on Debian or Ubuntu) with:
sudo apt-get install vagrant
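
The two version checks above can be combined into one small script. This is only a sketch: the helper names (extract_version, version_ge, check_tool) are made up for illustration, and it assumes a `sort` that supports version sorting with `-V` (GNU coreutils does).

```shell
#!/bin/sh
# Sketch: verify that VirtualBox and Vagrant meet the minimum versions.
# Helper names are illustrative; `sort -V` is assumed to be available.

# Pull the first dotted version number out of a line of text,
# e.g. "Vagrant 1.7.2" -> "1.7.2", "4.3.28r100309" -> "4.3.28".
extract_version() {
    printf '%s\n' "$1" | sed -n 's/[^0-9]*\([0-9][0-9.]*\).*/\1/p'
}

# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# $1 = command name, $2 = minimum version required.
check_tool() {
    tool=$1
    minimum=$2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: not installed (need $minimum or later)"
        return 1
    fi
    # Use the last output line, since warnings may be printed before the version.
    found=$(extract_version "$("$tool" --version 2>/dev/null | tail -n 1)")
    if version_ge "$found" "$minimum"; then
        echo "$tool: version $found is recent enough"
    else
        echo "$tool: version '$found' is older than $minimum"
        return 1
    fi
}

check_tool vboxmanage 4.3.28 || echo "Install or upgrade VirtualBox before continuing."
check_tool vagrant 1.7.2 || echo "Install or upgrade Vagrant before continuing."
```

A plain string comparison would rank 1.10.0 below 1.7.2, which is why the sketch uses a version-aware sort.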



Create a file named Vagrantfile in an empty directory of your choice, with the following contents:

# -*- mode: ruby -*-
# vi: set ft=ruby :

ipythonPort = 8001                 # IPython port to forward (also set in IPython notebook config)

Vagrant.configure(2) do |config|
  config.ssh.insert_key = true
  config.vm.define "sparkvm" do |master|
    master.vm.box = "sparkmooc/base"
    master.vm.box_url = "https://atlas.hashicorp.com/sparkmooc/boxes/base/versions/0.0.7.1/providers/virtualbox.box"
    master.vm.box_download_insecure = true
    master.vm.boot_timeout = 900
    master.vm.network :forwarded_port, host: ipythonPort, guest: ipythonPort, auto_correct: true   # IPython port (set in notebook config)
    master.vm.network :forwarded_port, host: 4040, guest: 4040, auto_correct: true                 # Spark UI (Driver)
    master.vm.hostname = "sparkvm"
    master.vm.usable_port_range = 4040..4090

    master.vm.provider :virtualbox do |v|
      v.name = master.vm.hostname.to_s
    end
  end
end


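Then run vagrant up from the directory that contains the Vagrantfile. The first run downloads the box image, so it can take a while.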
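
A common slip is running vagrant up from the wrong directory. The following is a minimal sketch of a wrapper that refuses to start unless a Vagrantfile is present. The function name start_vm and the example path are hypothetical; `vagrant up` and `vagrant status` are the standard Vagrant commands.

```shell
#!/bin/sh
# Sketch: start the VM only if the given directory holds a Vagrantfile.
# start_vm is an illustrative name, not part of Vagrant.
start_vm() {
    dir=${1:-.}
    if [ ! -f "$dir/Vagrantfile" ]; then
        echo "No Vagrantfile found in $dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$dir" && vagrant up && vagrant status)
}

# Example call (the path is an assumption, use your own directory):
# start_vm "$HOME/sparkvm"
```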

Once the VM is running, open a web browser at "http://localhost:8001/" (on Windows and Mac) or "http://127.0.0.1:8001/" (on Linux) to access the notebook.
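
If the page does not load, it can help to probe the port from a terminal. The sketch below assumes curl is installed; the function names notebook_url and notebook_is_up are illustrative. Because the Vagrantfile sets auto_correct: true, Vagrant may forward a different host port when 8001 is already in use, so the port is passed as a parameter.

```shell
#!/bin/sh
# Sketch: check whether the notebook answers on the forwarded port.
# Function names are illustrative; curl is assumed to be installed.

# Build the notebook URL for a given host port.
notebook_url() {
    echo "http://localhost:$1/"
}

# Succeed if an HTTP request to the notebook URL gets a non-error response.
notebook_is_up() {
    curl --silent --fail --max-time 5 --output /dev/null "$(notebook_url "$1")" 2>/dev/null
}

port=8001   # change this if Vagrant reported a different host port
if notebook_is_up "$port"; then
    echo "Notebook is reachable at $(notebook_url "$port")"
else
    echo "Nothing is answering at $(notebook_url "$port") yet"
fi
```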

