Jupyter Book's Many Manifestations of a Notebook
March 03, 2020
This is an addendum to my 2020-01 post entitled, Jupyter Book to Colab.
Jupyter Book is a tool which generates static HTML renderings of Jupyter .ipynb files.
It can optionally generate links to live Python kernels which can run the code in the original .ipynb files. This is called the “interact” button.
In my earlier post I described how I extended the interact button to work with Colab rather than one of the Jupyter services that Jupyter Book already supports out of the box (e.g. Binder).
While discussing this little hack, I have found this whiteboard sketch helps explain what is going on under the hood. In the context of a Jupyter-Book-interact-Colab deploy, any given Jupyter notebook .ipynb file can have four manifestations.
Let’s walk through the five steps.
1. The source notebook at home
A git repository is archived somewhere, say, Microsoft GitHub (but it could be any git repo). In the context of this post the repo is one built out to work with Jupyter Book, which means it is essentially just a collection of Jupyter notebooks and markdown files.
2. Pre-run notebook as HTML
For step 2, the repo has been fetched from GitHub and run through Jupyter Book with the output being a bunch of static web content (HTML, JavaScript, CSS, and images).
Static web sites are the simplest kind of web site: they are simply file servers talking HTTP. In this diagram the example static site is http://static-bar.com.
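To make "file servers talking HTTP" concrete, here is a minimal sketch using only Python's standard library. The helper name serve_static is hypothetical and the directory would be wherever the Jupyter Book output landed; a real site such as http://static-bar.com would more likely sit behind a dedicated web server or static host.

```python
import functools
import http.server
import threading


def serve_static(directory, port=0):
    """Serve the files under `directory` over HTTP in a background thread.

    Hypothetical helper for illustration. Port 0 asks the OS for any free
    port; the chosen port is available as server.server_address[1].
    """
    # SimpleHTTPRequestHandler maps URL paths straight onto files on disk.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Nothing here knows about notebooks or kernels: the server only hands back whatever files it finds on disk.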
3. Hand off to Colab
This is what Jupyter Book refers to as interacting: moving from a static web page rendering of an (optionally pre-run) notebook to something backed by a live Jupyter kernel. Normally, Jupyter Book will hand off to Binder for provisioning Jupyter kernels. In my hack, open source Binder is replaced with commercial Google Colab.
The hand off is simply an https:// URL to Colab, which ends with a path identifying the .ipynb file that Colab should load from GitHub. That mapping results in a URL of the form:
https://colab.research.google.com/github/my_org/my_repo/blob/my_branch/my_file.ipynb
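As a rough sketch, building that URL is plain string assembly. The function name colab_url and its parameters are hypothetical, chosen to mirror the placeholders in the URL above; this is not code from Jupyter Book itself.

```python
from urllib.parse import quote

COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github"


def colab_url(org, repo, branch, path):
    """Return a Colab URL of the form shown above (hypothetical helper)."""
    # quote() leaves "/" alone by default, so nested notebook paths survive.
    parts = [
        quote(org),
        quote(repo),
        "blob",
        quote(branch),
        quote(path.lstrip("/")),
    ]
    return "/".join([COLAB_GITHUB_PREFIX, *parts])


print(colab_url("my_org", "my_repo", "my_branch", "my_file.ipynb"))
```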
4. Colab kernel spin-up
Next, the web browser follows the https://colab.research.google.com URL, loading a new web page. At Colab, an HTTP GET arrives and the URL is parsed. When Colab sees the /github/ part, it knows that the user is requesting that an .ipynb file be fetched from GitHub. The tail of the URL provides the organization, repo name, branch, and relative file path. Colab then fetches the specified file from github.com.
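I have no view into Colab's actual implementation, but the parsing described above can be sketched as follows. NotebookRef, parse_colab_github_url, and raw_github_url are all hypothetical names, and the sketch assumes a branch name without slashes. The raw.githubusercontent.com form is one way to download a file from a public repo; whether Colab uses it is an assumption.

```python
from typing import NamedTuple
from urllib.parse import unquote, urlparse


class NotebookRef(NamedTuple):
    org: str
    repo: str
    branch: str
    path: str


def parse_colab_github_url(url):
    """Split a /github/<org>/<repo>/blob/<branch>/<path> URL into its parts."""
    segments = urlparse(url).path.strip("/").split("/")
    if len(segments) < 6 or segments[0] != "github" or segments[3] != "blob":
        raise ValueError(f"not a Colab GitHub notebook URL: {url}")
    org, repo, branch = (unquote(s) for s in segments[1:3] + segments[4:5])
    # Everything after the branch is the notebook's path inside the repo.
    path = unquote("/".join(segments[5:]))
    return NotebookRef(org, repo, branch, path)


def raw_github_url(ref):
    """Where the notebook file could be downloaded from (an assumption)."""
    return (
        "https://raw.githubusercontent.com/"
        f"{ref.org}/{ref.repo}/{ref.branch}/{ref.path}"
    )
```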
Behind the scenes Colab spins up a new virtual machine to provide a Jupyter kernel for the request. (Anyone with a Gmail address can have up to two VMs running simultaneously.)
Eventually (quickly) the HTTP response goes down to the browser where the user sees the notebook and can run the code.
5. Persisting a modified version
“Playground mode” is the Colab term for a transient, unpersisted version of a notebook running in a Colab VM. If a reader wants to play with and run the code (read: modify the input notebook) and keep a copy, exiting playground mode will save a copy of the modified .ipynb in the user's Google Drive.
The takeaway is that open source tools make it possible to have a static web site showing HTML renderings of .ipynb files. Those static HTML files can then link to Colab (or Binder) to hook the notebook up to a new VM on demand. A static web site linking to free compute.