Jupyter Installation (Outdated)¶
Warning
This page is outdated!
Jupyter notebooks allow you to analyze data interactively using your web browser. One advantage of Jupyter is that code, documentation and visualization can be combined in a single notebook, so that they form a unit. Jupyter notebooks can be used for many tasks, such as data cleaning and transformation, numerical simulation, statistical modeling, data visualization and machine learning.
There are two general options for working with Jupyter notebooks on ZIH systems: a remote Jupyter server and JupyterHub.
These sections show how to set up and run a remote Jupyter server with GPUs within a Slurm job. Furthermore, the following sections explain which modules and packages you need for that.
Note
On ZIH systems, there is a JupyterHub, which does not require the manual server setup described below and lets you simply run your Jupyter notebook on HPC nodes. Keep in mind that some special tools are not available with JupyterHub. However, general data analytics tools are available.
The remote Jupyter server is able to offer more freedom with settings and approaches.
Preparation phase (optional)¶
On a ZIH system, start an interactive session for setting up the environment:
marie@login$ srun --pty -n 1 --cpus-per-task=2 --time=2:00:00 --mem-per-cpu=2500 --x11=first bash -l -i
Create a new directory in your home, e.g. Jupyter
marie@compute$ mkdir Jupyter
marie@compute$ cd Jupyter
There are two ways to run Anaconda. The easiest is to load the Anaconda module. The alternative is to download and install Anaconda into your home directory.
- Load Anaconda module (recommended):
marie@compute$ module load release/23.04
marie@compute$ module load Anaconda3
- Download the latest Anaconda release (see the example below), make the installation script executable and run it:
marie@compute$ wget https://repo.continuum.io/archive/Anaconda3-2019.03-Linux-x86_64.sh
marie@compute$ chmod u+x Anaconda3-2019.03-Linux-x86_64.sh
marie@compute$ ./Anaconda3-2019.03-Linux-x86_64.sh
(during installation you have to confirm the license agreement)
Hint
For working with conda virtual environments, it may be necessary to configure your shell via
conda init
as described in Python virtual environments
The next step installs the Anaconda environment into the home directory (/home/userxx/anaconda3). Create a new Anaconda environment with the name jnb.
marie@compute$ conda create --name jnb
Set environmental variables¶
In the shell, activate the previously created Python environment (you can also deactivate it manually later) and install the Jupyter packages into it:
marie@compute$ source activate jnb
marie@compute$ conda install jupyter
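Before going on, it can help to confirm that the jnb environment is really the active one and that the Jupyter packages are visible to its interpreter. The following is a minimal sketch, not part of the official setup: the helper names `active_conda_env` and `has_package` are hypothetical, and it relies on conda exporting the `CONDA_DEFAULT_ENV` variable on activation.

```python
import importlib.util
import os


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None."""
    # conda exports CONDA_DEFAULT_ENV when an environment is activated
    return environ.get("CONDA_DEFAULT_ENV")


def has_package(name):
    """Report whether the current interpreter can find a top-level package."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("active environment:", active_conda_env())
    print("notebook package found:", has_package("notebook"))
```

Run it with the Python interpreter of the activated environment; if the environment name is not jnb or the package is not found, repeat the activation and installation steps above.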
If you need to adjust the configuration, first generate the configuration file template for the Jupyter notebook server:
marie@compute$ jupyter notebook --generate-config
Find the path of the configuration file, usually in the .jupyter directory in your home, e.g.
/home//.jupyter/jupyter_notebook_config.py
Set a password (choose a simple one for testing). You will need it later to log in to the server from your browser session:
marie@compute$ jupyter notebook password
Enter password:
Verify password:
You get a message like this:
[NotebookPasswordApp] Wrote *hashed password* to
/home/marie/.jupyter/jupyter_notebook_config.json
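The hashed password is needed again later for the configuration file. As a hedged sketch, it could be read back from the JSON file with a few lines of Python. The helper name `parse_hashed_password` is hypothetical, and the JSON layout (a `NotebookApp` section with a `password` key) is an assumption about what the classic notebook server writes; check your own file if the result is empty.

```python
import json
from pathlib import Path


def parse_hashed_password(json_text):
    """Return the hashed password from the JSON config text, or None."""
    data = json.loads(json_text)
    # Assumed layout: {"NotebookApp": {"password": "<hashed password>"}}
    return data.get("NotebookApp", {}).get("password")


if __name__ == "__main__":
    path = Path.home() / ".jupyter" / "jupyter_notebook_config.json"
    if path.exists():
        print(parse_hashed_password(path.read_text()))
    else:
        print("no JSON config found at", path)
```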
To enable secure connections, you can create a self-signed certificate:
marie@compute$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem
Fill in the form with sensible values.
Possible entries for your Jupyter configuration (.jupyter/jupyter_notebook_config.py):
c.NotebookApp.certfile = u'<path-to-cert>/mycert.pem'
c.NotebookApp.keyfile = u'<path-to-cert>/mykey.key'
# set ip to '*' otherwise server is bound to localhost only
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
# copy hashed password from the jupyter_notebook_config.json
c.NotebookApp.password = u'<your hashed password here>'
c.NotebookApp.port = 9999
c.NotebookApp.allow_remote_access = True
Note
<path-to-cert> is the path to the directory containing the key and certificate files; the certificate would then be, for example, /home/marie/mycert.pem.
Slurm job file to run the Jupyter server on ZIH system with GPU (1x K80) (also works on K20)¶
#!/bin/bash -l
#SBATCH --gres=gpu:1 # request GPU
#SBATCH --partition=gpu2 # use partition GPU 2
#SBATCH --output=notebook_output.txt
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=02:30:00
#SBATCH --mem=4000M
#SBATCH -J "jupyter-notebook" # job-name
#SBATCH -A p_number_crunch
unset XDG_RUNTIME_DIR # may be needed when running interactively instead of via sbatch, to avoid a 'Permission denied' error
srun jupyter notebook
Start the script above (e.g. saved as jnotebook.slurm) with the sbatch command:
sbatch jnotebook.slurm
If you have questions about the sbatch script, see the article about Slurm.
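If you want to script the submission, the job ID can be taken from the confirmation line that sbatch prints ("Submitted batch job <id>"). The following is a minimal sketch under that assumption; the function names `parse_job_id` and `submit_job` are hypothetical, and `submit_job` only works where the sbatch command is available.

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the job ID from sbatch's 'Submitted batch job <id>' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"unexpected sbatch output: {sbatch_output!r}")
    return int(match.group(1))


def submit_job(script="jnotebook.slurm"):
    """Submit the job file with sbatch and return the job ID."""
    result = subprocess.run(
        ["sbatch", script], capture_output=True, text=True, check=True
    )
    return parse_job_id(result.stdout)
```

The returned ID can then be used to look up the job, for example in the output of squeue.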
Check the status and the token of the server with the command tail notebook_output.txt. The output should look like this:
https://(taurusi2092.taurus.hrsk.tu-dresden.de or 127.0.0.1):9999/
You can see the server node's hostname with the command squeue --me.
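The hostname, port and (if present) token can also be pulled out of the job's output file programmatically. The sketch below assumes the server prints a URL in the form shown above, optionally followed by `?token=...`; the function name `find_server_url` and the regular expression are illustrative and may need adjusting for other notebook versions.

```python
import re

# Matches both "https://(node.domain or 127.0.0.1):9999/" and
# plain URLs such as "http://localhost:8888/?token=abc123".
URL_PATTERN = re.compile(
    r"(?P<scheme>https?)://"
    r"\(?(?P<host>[\w.-]+)(?: or [\d.]+\))?"
    r":(?P<port>\d+)/"
    r"(?:\?token=(?P<token>\w+))?"
)


def find_server_url(log_text):
    """Return scheme, host, port and token of the first server URL, or None."""
    match = URL_PATTERN.search(log_text)
    if match is None:
        return None
    info = match.groupdict()
    info["port"] = int(info["port"])
    return info


if __name__ == "__main__":
    sample = "https://(taurusi2092.taurus.hrsk.tu-dresden.de or 127.0.0.1):9999/"
    print(find_server_url(sample))
```

To use it on a real job, pass the contents of notebook_output.txt as `log_text`.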
Remote connect to the server¶
There are two ways to connect to the server:
- Create an SSH tunnel (recommended). Open another terminal on your local machine and configure the tunnel; look up the connection values in the output file of the Slurm job:
node=taurusi2092 #see the name of the node with squeue -u <your_login>
localport=8887 #local port on your computer
remoteport=9999 #must match the port shown in notebook_output.txt
ssh -fNL ${localport}:${node}:${remoteport} <zih_user>@taurus.hrsk.tu-dresden.de #configure the ssh tunnel for connection to your remote server
pgrep -f "ssh -fNL ${localport}" #verify that tunnel is alive
- Connect directly from your client (local machine). You need to know the node's hostname or IP address, the port of the server and the token to log in (see the paragraph above). To find the IP address, resolve the node's hostname while logged in to the ZIH system:
#command on remote terminal
marie@taurusi2092$ host taurusi2092
# copy IP address from output
# paste IP to your browser or call on local terminal e.g.:
marie@local$ firefox https://<IP>:<PORT> # https important to use SSL cert
To log in to the Jupyter notebook site (https://localhost:8887 if you use the SSH tunnel described above), you have to enter the token. Now you can create and execute notebooks on the ZIH system with GPU support.
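The SSH tunnel option can also be assembled in a small script, which avoids retyping the node and port values. This is a minimal sketch that mirrors the ssh call shown in the tunnel option; the function names `tunnel_command` and `local_url` are hypothetical, and the default login host is simply the one used in the example above.

```python
import shlex


def tunnel_command(node, remote_port, local_port, user,
                   login_host="taurus.hrsk.tu-dresden.de"):
    """Build the ssh argument list that forwards local_port to node:remote_port."""
    forward = f"{local_port}:{node}:{remote_port}"
    # -f: go to background, -N: no remote command, -L: local port forwarding
    return ["ssh", "-fNL", forward, f"{user}@{login_host}"]


def local_url(local_port):
    """Return the URL to open in the browser once the tunnel is up."""
    return f"https://localhost:{local_port}"


if __name__ == "__main__":
    cmd = tunnel_command("taurusi2092", 9999, 8887, "marie")
    print(shlex.join(cmd))  # command to run on the local machine
    print(local_url(8887))
```

The argument list can be passed to `subprocess.run` on the local machine; the sketch only prints the command so that nothing is started by accident.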
Important
If you would like to use JupyterHub after using a manually configured remote Jupyter server (as in the example above), you need to rename the configuration file (/home//.jupyter/jupyter_notebook_config.py) to any other name.
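One way to do the renaming is a short Python script. This is a sketch only: the helper names `backup_path` and `disable_manual_config` and the `.bak` suffix are arbitrary choices for illustration, and the default location assumes the configuration file sits in the .jupyter directory in your home.

```python
from pathlib import Path


def backup_path(config_path):
    """Return the new name for the config file (original name plus '.bak')."""
    config_path = Path(config_path)
    return config_path.with_name(config_path.name + ".bak")


def disable_manual_config(config_path=None):
    """Rename the manual config file, if it exists, and return the new path."""
    if config_path is None:
        config_path = Path.home() / ".jupyter" / "jupyter_notebook_config.py"
    config_path = Path(config_path)
    if not config_path.exists():
        return None  # nothing to rename
    target = backup_path(config_path)
    config_path.rename(target)
    return target
```

Calling `disable_manual_config()` once is enough; renaming the file back restores the manual setup.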