How to: Jupyter notebook

JupyterLab is a web-based interactive development environment for notebooks, code, and data. Its interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning.

 

Use cases

Data Cleaning and Transformation: Jupyter Notebook is useful for cleaning and transforming messy or raw data into usable formats. Users can write code to handle missing values, normalize data, and perform other preprocessing tasks.
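
As an illustration of the preprocessing described above, here is a minimal sketch of what a notebook cell could contain. It fills missing values with the mean of the known values and then min-max normalizes the column. It uses only the Python standard library so it runs anywhere; the sample readings are made up, and in practice a library such as pandas would likely be used instead.

```python
from statistics import mean

def fill_missing(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("cannot fill a column with no known values")
    avg = mean(known)
    return [avg if v is None else v for v in values]

def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up sample column with two missing readings.
readings = [10.0, None, 30.0, 20.0, None]
filled = fill_missing(readings)
normalized = min_max_normalize(filled)
print(filled)
print(normalized)
```

Filling with the mean is only one possible strategy; dropping incomplete rows or using the median may suit other datasets better.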

 

Education and Learning: Jupyter Notebook is widely used in teaching programming, data science, and other technical subjects. Students can run code snippets, visualize data, and experiment with concepts in a hands-on, interactive manner.

 

Data Analysis and Visualization: Jupyter Notebook is commonly used for exploratory data analysis, where users write and execute code to analyze datasets and create visualizations that uncover insights. For example, data scientists often use Jupyter to analyze financial data, social media trends, or scientific measurements.

 

Machine Learning Prototyping: Data scientists frequently use Jupyter to prototype machine learning models, from data preprocessing and feature engineering to model training and evaluation. This allows for iterative development and testing of models in a single, interactive environment.

 

Running a Jupyter notebook

1) Create a Dockerfile

2a) A minimal Dockerfile looks like this (the packages in the pip install line are optional, but they are useful for data analysis):

FROM jupyter/datascience-notebook:latest

WORKDIR /app

RUN pip install pandas numpy seaborn matplotlib

EXPOSE 8888

CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]

 

2b) If you already have Jupyter notebooks stored locally, use this Dockerfile instead; the COPY line adds the files from the current directory to the image (the packages in the pip install line are optional, but they are useful for data analysis):

FROM jupyter/datascience-notebook:latest

WORKDIR /app

COPY . /app

RUN pip install pandas numpy seaborn matplotlib

EXPOSE 8888

CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]

 

3) Go to cloud.tilaa.com

4) Create a container

 

5) In the configuration for the container add the following:

  • Expose the port for container traffic; use the same port as in the Dockerfile (8888 in the examples above)
  • Make the container accessible from the internet
  • Add a new ingress and fill in the following:
    • Domain: any domain you want; this becomes the URL for the notebook
    • HTTP port: the port number used for the container traffic
    • TLS: enable TLS
    • IP allow-list: you can leave this empty, or specify which IPs are allowed to reach the link
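
To make the idea of an IP allow-list concrete, the following is a rough sketch of how such a check can work. It is not the platform's actual implementation. The function name is_allowed is hypothetical, the addresses come from ranges reserved for documentation, and treating an empty list as "no restriction" is an assumption based on the note that the field can be left empty.

```python
import ipaddress

def is_allowed(client_ip, allow_list):
    """Return True if client_ip matches any entry in allow_list.

    Entries may be single addresses or CIDR ranges.
    An empty allow_list is treated as "no restriction" (assumption).
    """
    if not allow_list:
        return True
    address = ipaddress.ip_address(client_ip)
    networks = (ipaddress.ip_network(entry, strict=False) for entry in allow_list)
    return any(address in network for network in networks)

# Example addresses taken from documentation-only ranges.
office = ["203.0.113.0/24"]
print(is_allowed("203.0.113.10", office))
print(is_allowed("198.51.100.7", office))
print(is_allowed("198.51.100.7", []))
```

Restricting the list to a known office or VPN range is a simple way to keep the notebook login page away from the public internet.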


 

6) Fill in the remaining parts

7) Add the container


 

8) Once the container is running, open its log


 

9) In the log, find the line that contains a link starting with http://127.0.0.1:8888. The token is the part after ‘token=’; you need it to access the notebooks. If nothing shows up yet, refresh the log or wait a little longer.


 

10) Go to your application (the link at the ingress label)

11) Log in with your token, or set up a password to use instead of the token


 

Now you have a Jupyter notebook running.
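
Copying the token out of the log by hand is error-prone, so here is a small sketch that pulls it out of a log line and builds a link for the ingress domain. The log line and the domain notebook.example.com are made-up placeholders, and the function names are hypothetical. The sketch assumes the token appears as a token query parameter in the logged URL, as described in step 9.

```python
import re
from urllib.parse import parse_qs, urlencode, urlsplit

def extract_token(log_line):
    """Return the value of the token query parameter in a log line, or None."""
    match = re.search(r"https?://\S+", log_line)
    if match is None:
        return None
    query = urlsplit(match.group(0)).query
    tokens = parse_qs(query).get("token")
    return tokens[0] if tokens else None

def login_url(domain, token):
    """Build an HTTPS URL for the ingress domain that carries the token."""
    return f"https://{domain}/?{urlencode({'token': token})}"

# Made-up log line; real tokens are much longer.
sample = "     or http://127.0.0.1:8888/?token=abc123def456"
token = extract_token(sample)
print(token)
print(login_url("notebook.example.com", token))
```

Anyone who has a link containing the token can log in, so treat such a link like a password.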

 

Best practices

  • Save the token somewhere safe, so you can find it when you need it later
  • Set up a password for easier access, as a password is easier to remember than a token
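
One possible way to keep the token at hand is to store it in a file that only your own user account can read. The sketch below shows this for POSIX systems; the file name jupyter_token.txt and the token value are placeholders, and a password manager is an equally valid choice.

```python
import os
import stat
import tempfile
from pathlib import Path

def save_token(token, path):
    """Write the token to path and restrict the file to its owner."""
    path = Path(path)
    # Mode 0o600 applies when the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token + "\n")
    # Tighten permissions again in case the file already existed.
    os.chmod(path, 0o600)
    return path

def load_token(path):
    """Read the token back, dropping the trailing newline."""
    return Path(path).read_text().strip()

# Demo in a temporary directory; the token is a made-up placeholder.
with tempfile.TemporaryDirectory() as folder:
    token_file = save_token("abc123def456", Path(folder) / "jupyter_token.txt")
    print(load_token(token_file))
    print(oct(stat.S_IMODE(token_file.stat().st_mode)))
```

On Windows the permission bits behave differently, so the owner-only restriction should not be relied on there.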

 

FAQ

What are the arguments in the CMD for?

  • jupyter: calls the Jupyter command-line tool
  • notebook: specifies that you want to run the notebook server
  • --ip=0.0.0.0: binds Jupyter to all available network interfaces. This is required for reaching Jupyter from outside the container
  • --port=8888: runs Jupyter on this port. It must match the EXPOSE port in the Dockerfile and the port configured for container traffic
  • --no-browser: prevents Jupyter from opening a browser on startup, which isn’t needed in a container
  • --allow-root: allows Jupyter to run as the root user. This is normally discouraged because of security concerns, but it is common practice in containers and generally considered acceptable there
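
Because the --port value has to agree with the EXPOSE line, a quick consistency check can catch a mismatch before the image is built. The following is a minimal sketch with hypothetical helper names. It assumes a single-line, exec-form CMD (the JSON array style used in this article) and does not handle shell-form CMD lines or line continuations.

```python
import json

DOCKERFILE = """\
FROM jupyter/datascience-notebook:latest
WORKDIR /app
RUN pip install pandas numpy seaborn matplotlib
EXPOSE 8888
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
"""

def exposed_ports(dockerfile_text):
    """Return the set of ports named in EXPOSE instructions."""
    ports = set()
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if parts and parts[0].upper() == "EXPOSE":
            # Drop an optional protocol suffix such as "8888/tcp".
            ports.update(int(p.split("/")[0]) for p in parts[1:])
    return ports

def cmd_port(dockerfile_text):
    """Return the --port value from an exec-form CMD, or None if absent."""
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("CMD"):
            # Exec form is a JSON array, so json can parse it directly.
            for arg in json.loads(stripped[3:].strip()):
                if arg.startswith("--port="):
                    return int(arg.split("=", 1)[1])
    return None

def ports_match(dockerfile_text):
    """Return True if the CMD port is one of the exposed ports."""
    port = cmd_port(dockerfile_text)
    return port is not None and port in exposed_ports(dockerfile_text)

print(exposed_ports(DOCKERFILE), cmd_port(DOCKERFILE), ports_match(DOCKERFILE))
```

The same port number must also be entered as the HTTP port of the ingress, which this sketch cannot check.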

 

How do I add packages to jupyter notebook?

There are three ways to add packages to a Jupyter notebook:

  • Before building the image, add the following line to the Dockerfile: RUN pip install <package>, replacing <package> with the package you want to add
  • After the notebook is deployed, install packages from a notebook cell by running the following code in a cell: !pip install <package>
  • Use the Jupyter terminal: click New, select Terminal, and run the following command: pip install <package>
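
When installing from a notebook cell, it can help to check first whether a module is already available, and to call pip through the interpreter that runs the notebook so the package lands in the same environment. The sketch below shows one way to do that. The helper names are hypothetical, and the package name and importable module name are assumed to be known (they can differ, for example scikit-learn versus sklearn).

```python
import importlib.util
import subprocess
import sys

def is_installed(module_name):
    """Return True if the top-level module can be imported here."""
    return importlib.util.find_spec(module_name) is not None

def pip_install_command(package):
    """Return a pip command that targets the running interpreter."""
    return [sys.executable, "-m", "pip", "install", package]

def ensure_installed(package, module_name):
    """Install package only if module_name is missing.

    Returns True if an installation was run, False otherwise.
    """
    if is_installed(module_name):
        return False
    subprocess.check_call(pip_install_command(package))
    return True

# json ships with Python, so nothing is installed here.
print(ensure_installed("json", "json"))
print(pip_install_command("seaborn")[1:])
```

In a notebook cell, calling ensure_installed("seaborn", "seaborn") would then only run pip when seaborn is not yet importable.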