Installing the VM image for the Jupyter (IPython) Notebook with PySpark:
VirtualBox 4.3.28 (or later).
Check whether VirtualBox is installed on your machine with the command:
vboxmanage --version
If it is not installed, install it with the following command:
sudo apt-get install virtualbox
Vagrant 1.7.2 (or later).
Check whether Vagrant is installed on your machine with the command:
vagrant --version
If it is not installed, install it with the following command:
sudo apt-get install vagrant
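The two checks above can be combined into one script that also compares the installed versions against the stated minimums (4.3.28 and 1.7.2). This is a minimal sketch, not part of the original instructions: the helper names version_ge, extract_version and check_tool are hypothetical, and the comparison assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print the first dotted version number in a string,
# e.g. "4.3.28r100309" gives 4.3.28 and "Vagrant 1.7.2" gives 1.7.2.
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Hypothetical helper: check one tool.
# Arguments: display name, command that prints the version, minimum version.
check_tool() {
    name=$1; cmd=$2; min=$3
    # $cmd is left unquoted on purpose so it splits into command and option.
    if ! raw=$($cmd 2>/dev/null); then
        echo "$name: not found"
        return 1
    fi
    found=$(extract_version "$raw")
    if version_ge "$found" "$min"; then
        echo "$name $found: OK"
    else
        echo "$name $found: older than $min"
        return 1
    fi
}

# Print an install hint when a tool is missing or too old.
check_tool VirtualBox "vboxmanage --version" 4.3.28 ||
    echo "Install or upgrade with: sudo apt-get install virtualbox"
check_tool Vagrant "vagrant --version" 1.7.2 ||
    echo "Install or upgrade with: sudo apt-get install vagrant"
```

The script only reports what it finds; it does not install anything itself, so the apt-get commands above still have to be run by hand.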
Create a file named Vagrantfile in an empty directory of your choice, with the following contents:
# -*- mode: ruby -*-
# vi: set ft=ruby :
ipythonPort = 8001 # IPython port to forward (also set in IPython notebook config)
Vagrant.configure(2) do |config|
  config.ssh.insert_key = true
  config.vm.define "sparkvm" do |master|
    master.vm.box = "sparkmooc/base"
    master.vm.box_url = "https://atlas.hashicorp.com/sparkmooc/boxes/base/versions/0.0.7.1/providers/virtualbox.box"
    master.vm.box_download_insecure = true
    master.vm.boot_timeout = 900
    master.vm.network :forwarded_port, host: ipythonPort, guest: ipythonPort, auto_correct: true # IPython port (set in notebook config)
    master.vm.network :forwarded_port, host: 4040, guest: 4040, auto_correct: true # Spark UI (Driver)
    master.vm.hostname = "sparkvm"
    master.vm.usable_port_range = 4040..4090
    master.vm.provider :virtualbox do |v|
      v.name = master.vm.hostname.to_s
    end
  end
end
Then, from the directory containing the Vagrantfile, run the command:
vagrant up
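Once the VM is running, to access the notebook, open a web browser to "http://localhost:8001/" (on Windows and Mac) or "http://127.0.0.1:8001/" (on Linux). If port 8001 was already in use on the host, the auto_correct setting lets Vagrant forward a different host port, which it reports in the output of vagrant up; use that port instead.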
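The notebook server may need a little time to start after the VM boots. The following sketch polls the forwarded port from the host until something answers. It assumes curl is installed, and the function names notebook_url and wait_for_notebook are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the notebook URL for a host port.
# Defaults to 8001, the ipythonPort value in the Vagrantfile.
notebook_url() {
    printf 'http://127.0.0.1:%s/\n' "${1:-8001}"
}

# Hypothetical helper: poll the URL until the server answers or the
# attempts run out. Arguments: host port (default 8001), attempts (default 30).
# Any HTTP response counts as "answering", including redirects and error
# pages, because curl exits with 0 unless the connection itself fails.
wait_for_notebook() {
    url=$(notebook_url "$1")
    attempts=${2:-30}
    while [ "$attempts" -gt 0 ]; do
        if curl --silent --output /dev/null --max-time 2 "$url"; then
            echo "Notebook is answering at $url"
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 2
    done
    echo "No response from $url" >&2
    return 1
}

# Example call (commented out so that sourcing this file does not block):
# wait_for_notebook 8001 30
```

If Vagrant forwarded a different host port, pass that port as the first argument instead of 8001.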