The quickest way to run a Jupyter Notebook instance in a containerised environment such as OpenShift is to use the Docker-formatted images provided by the Jupyter Project developers. Unfortunately, the Jupyter Project images do not run out of the box with the typical default configuration of an OpenShift cluster.
In this second post of this series about running Jupyter Notebooks on OpenShift, I am going to detail the steps required in order to run the Jupyter Notebook software on OpenShift.
Jupyter Project Images
The original code for the Jupyter Project images can be found on GitHub, but the images are also hosted on the Docker Hub Registry.
The Jupyter Project provides a number of images with different capabilities and packages pre-installed. These are:
- jupyter/minimal-notebook: Base image with support for working with Python 3.
- jupyter/scipy-notebook: Builds on minimal-notebook, adding Python packages commonly used in data analysis and visualisation, including numpy, scipy and matplotlib. Also adds support for Python 2.
- jupyter/tensorflow-notebook: Builds on scipy-notebook, adding packages for working with tensorflow.
- jupyter/datascience-notebook: Builds on scipy-notebook, adding support for Julia and R.
- jupyter/pyspark-notebook: Builds on scipy-notebook, adding support for working with Spark and Hadoop clusters.
- jupyter/all-spark-notebook: Builds on pyspark-notebook, adding support for Scala and R.
- jupyter/r-notebook: Base image with support for working with R.
Deploying Images to OpenShift
The Docker-formatted images from the Jupyter Project can be deployed to OpenShift using the Deploy Image page of the web console.
Alternatively, you can deploy the jupyter/minimal-notebook image from the command line using the oc new-app command:
$ oc new-app jupyter/minimal-notebook:latest
--> Found Docker image acba6ac (4 days old) from Docker Hub for "jupyter/minimal-notebook:latest"
* An image stream will be created as "minimal-notebook:latest" that will track this image
* This image will be deployed in deployment config "minimal-notebook"
* Port 8888/tcp will be load balanced by service "minimal-notebook"
* Other containers can access this service through the hostname "minimal-notebook"
--> Creating resources ...
imagestream "minimal-notebook" created
deploymentconfig "minimal-notebook" created
service "minimal-notebook" created
--> Success
Run 'oc status' to view your app.
To expose the Jupyter Notebook so that it is accessible via a public URL, select Create Route from the Overview page in the web console. If using the command line, you can run oc expose:
$ oc expose svc/minimal-notebook
route "minimal-notebook" exposed
Having performed these steps, once the image has been pulled down and deployed, you will find that the image fails to start. Digging into the logs for the failed deployment, you will find an error:
File "/opt/conda/lib/python3.5/site-packages/jupyter_core/migrate.py", line 241, in migrate
with open(os.path.join(env['jupyter_config'], 'migrated'), 'w') as f:
PermissionError: [Errno 13] Permission denied: '/home/jovyan/.jupyter/migrated'
This is due to one aspect of the default security model applied by OpenShift to ensure that, in a multi tenant environment, one user cannot interfere with another.
In such a multi tenant environment, applications in different projects are run with different assigned user IDs. This is enforced by running an image as the assigned user ID, rather than any user ID the image itself says it wants to run as.
In this case the image failed to start because it was not built in a way that allows it to be started as an arbitrary user ID.
The good thing about the Jupyter Project images is that at least they don't expect to run as the root user. Instead, they have been built with the expectation that they run as the jovyan user, with a user ID of 1000. However, the group that the jovyan user is a member of, and how permissions have been set up on directories and files, mean the image will not work in an environment that applies the stricter security regime required of a multi tenant system. The issues with the Jupyter Project images have been reported, but not all of the problems that prevent them from running in a more secure multi tenant environment have been addressed.
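The failure comes down to whether the user ID the container is actually running as can write to directories such as /home/jovyan/.jupyter. As a minimal sketch, a check along the following lines could be run from a shell inside a container to see which user ID and groups are in effect and whether a directory is writable. The check_writable helper name is hypothetical and used only for illustration; it is not part of oc or Jupyter.

```shell
#!/bin/sh
# check_writable is a hypothetical helper name used for illustration only.
# Reports whether the current user can write to the given directory.
check_writable() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "writable"
    else
        echo "not writable"
    fi
}

# Show the user ID and group IDs the process is actually running with.
echo "uid=$(id -u) groups=$(id -G)"

# /home/jovyan/.jupyter is the directory named in the error above.
check_writable /home/jovyan/.jupyter
```

A directory that does not exist is also reported as not writable, so the result is only meaningful when run inside the image itself.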
Overriding the User an Image Runs As
In a situation where an image has not been constructed to allow it to be run as an assigned user ID, you can override the OpenShift default and allow images to run as any user ID. This is done using the oc adm policy add-scc-to-user command, which adds the anyuid security context constraint to the service account the image is run as.
$ oc adm policy add-scc-to-user anyuid -z default
Error from server: User "developer" cannot get securitycontextconstraints at the cluster scope
As shown, this command will fail if you attempt to run it as a normal user. This is because only an administrator has the ability to override the security context constraints.
The reason for this is that giving a user the ability to run an image as any user ID also allows them to run images as the root user. In this case the image declares that it will run as the jovyan user, so it will not run as the root user. Before enabling the ability for a user to run images as any user ID, an administrator should first ensure that the user is trusted, and that the source of any images is known and the images are also trusted.
Presuming the administrator is satisfied, the administrator of the OpenShift cluster should run the command:
# oc adm policy add-scc-to-user anyuid -z default -n myproject
The -n option and the argument that follows declare which project the command should be applied to. In this case it is applied to the project called myproject.
Logging in to Jupyter Notebook
Having enabled the ability to run the Jupyter Notebook image as the jovyan user, trigger a redeployment and the image should now start up.
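One way to trigger the redeployment from the command line is with oc rollout latest, followed by oc rollout status to wait for the result. The sketch below assumes the deployment config is named minimal-notebook, as created by oc new-app earlier. The redeploy wrapper is a hypothetical name introduced here for convenience.

```shell
#!/bin/sh
# redeploy is a hypothetical wrapper name; the oc subcommands are standard.
# Starts a new rollout of a deployment config and waits for it to finish.
redeploy() {
    oc rollout latest "dc/$1" && oc rollout status "dc/$1"
}

# Usage, from a session already logged in to the cluster:
#   redeploy minimal-notebook
```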
To get the URL for the Jupyter Notebook, you can look up the hostname using oc get routes:
$ oc get routes
NAME               HOST/PORT                                          PATH      SERVICES           PORT       TERMINATION
minimal-notebook   minimal-notebook-myproject.192.168.99.100.xip.io             minimal-notebook   8888-tcp
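If you would rather not read the hostname out of the table, the route's host can be requested directly with a jsonpath output template. This is a sketch only: notebook_url is a hypothetical helper name, and it assumes the route is plain HTTP, which matches the empty TERMINATION column above.

```shell
#!/bin/sh
# notebook_url is a hypothetical helper name used for illustration.
# Reads the host assigned to a route and prints it as an http:// URL.
notebook_url() {
    host=$(oc get route "$1" -o jsonpath='{.spec.host}') || return 1
    printf 'http://%s/\n' "$host"
}

# Usage, from a session logged in to the project:
#   notebook_url minimal-notebook
```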
Access the Jupyter Notebook from the browser using the hostname and you will be presented with a login page.
This is the default login page for Jupyter Notebook. As we did not specify a password when we deployed the application, Jupyter Notebook generates a secret token to be used when logging in. The value of this token is output in the logs for the Jupyter Notebook application.
To view the logs, you can run the oc get pods command to get a list of any running pods and then use oc logs on the name of the pod for the running application.
$ oc get pods --selector app=minimal-notebook
NAME                       READY     STATUS    RESTARTS   AGE
minimal-notebook-7-6dwp8   1/1       Running   0          1h
$ oc logs minimal-notebook-7-6dwp8
...
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://localhost:8888/?token=10c88f9dab876869b46884443e1157e5eb199ac615fb33e5
...
As described on the login page, you can also use the jupyter notebook list command to show the running servers. This needs to be run inside the container running the Jupyter Notebook instance. You can do this using the oc rsh command:
$ oc rsh minimal-notebook-7-6dwp8 jupyter notebook list
Currently running servers:
http://localhost:8888/?token=10c88f9dab876869b46884443e1157e5eb199ac615fb33e5 :: /home/jovyan/work
Copy just the token from the URL shown in the logs or in the output of the jupyter notebook list command, and use it in the login page for Jupyter Notebook in your browser. You should then be presented with the Jupyter Notebook dashboard.
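If you prefer not to pick the token out by hand, it can be extracted from the URL with sed. The following is a minimal sketch; extract_token is a hypothetical helper name, and the sample line is the one printed by jupyter notebook list earlier in this post.

```shell
#!/bin/sh
# extract_token is a hypothetical helper name, not part of Jupyter or oc.
# Prints the value of the "token" query parameter found in its argument,
# or nothing if the argument contains no token.
extract_token() {
    # Keep what follows "token=" up to the next "&" or whitespace.
    printf '%s\n' "$1" | sed -n 's/.*[?&]token=\([^&[:space:]]*\).*/\1/p'
}

# Sample line as printed by "jupyter notebook list" above.
line='http://localhost:8888/?token=10c88f9dab876869b46884443e1157e5eb199ac615fb33e5 :: /home/jovyan/work'
extract_token "$line"
```

The same function could be applied to the matching line from the oc logs output, since the URL there has the same form.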
Adding a Persistent Volume
When you work with a Jupyter Notebook, you can create new notebooks or upload an existing notebook. Any changes you make will be saved to the local file system within the container. As a result, if the container running the Jupyter Notebook instance is restarted, all your work will be lost.
If your OpenShift cluster is configured with persistent volumes, you should avoid this by using a persistent volume claim in conjunction with the Jupyter Notebook instance. To claim and mount the persistent volume, you can use the oc set volume command. The persistent volume should be mounted inside the container at /home/jovyan/work.
$ oc set volume dc/minimal-notebook --add --mount-path /home/jovyan/work --claim-size=1G
info: Generated volume name: volume-tnjug
persistentvolumeclaims/pvc-acnzs
deploymentconfig "minimal-notebook" updated
A persistent volume claim can also be made and associated with the Jupyter Notebook application from the web console, by going to the Deployment Config for the Jupyter Notebook application. The Add Storage option can be found in the Actions drop-down menu.
Installing Additional Packages
Which packages for Python are available for you to use from your Jupyter Notebook instance will depend on which of the Jupyter Project images you chose. If you choose the minimal image, or are using an uncommon package, you will need to install it yourself from a terminal created from the Jupyter Notebook dashboard, or from within a notebook. Because everything is discarded when the container running the Jupyter Notebook instance is restarted, you would have to do this each time.
A way to avoid this is to extend the image and add support to it for running it as a Source-to-Image (S2I) builder. Using S2I, you can then build up a custom image which incorporates the packages you need. An S2I builder can also be used to pre-populate an image with notebooks and data files you may need.
I will explain how to create an S2I builder from the Jupyter Project images in the next post in this series.