The parts of a notebook
On the Assets page in a project, you can see the following information about a notebook:
- The name of the notebook
- The name of the person who last edited the notebook
- The date when the notebook was last modified
- The programming language of the notebook
- Whether a read-only version of the notebook is shared
- The runtime status of the notebook
- Whether the notebook is scheduled to run
When you open a notebook in edit mode, the notebook editor includes the following features:
- Menu bar and toolbar
- Notebook action bar
- The cells in a Jupyter notebook
- Spark job progress bar
- Project token for authorization
Menu bar and toolbar
From the menu bar, you can select features that affect the way the notebook functions. From the toolbar, you can perform the most-used operations within the notebook by clicking an icon.
Notebook action bar
You can select features that enhance notebook collaboration. From the action bar, you can:
- Publish your notebook as a gist or to GitHub.
- Create a permanent URL so that anyone with the link can view your notebook.
- Create a job in which to run your notebook. See Scheduling a notebook.
- Add a project token so that code can access the project resources. See Add code to set the project token.
- View your notebook information. You can:
  - Change the name of your notebook by editing it in the Name field.
  - Edit the description of your notebook in the Description field.
  - View the date when the notebook was created.
  - View the environment details and runtime status; you can change the notebook runtime from here. See Notebook environments.
- Save versions of your notebook.
- View and add data sources.
- Post comments to project collaborators. See Comments.
- Integrate data into your notebook.
- Find resources in the Gallery, for example, useful data sets.
You can add comments to a notebook that you’re editing.
To add a comment to a notebook, click the comment icon, enter text in the comment field, and click Post. All collaborators in the project receive a notification that you added a comment.
You can add a comment that notifies only a specific project collaborator by mentioning that user in the comment. Anywhere in the comment, enter the @ symbol and start entering the user’s name until you can choose it from the search results.
The cells in a Jupyter notebook
A Jupyter notebook consists of a sequence of cells. The flow of a notebook is sequential. You enter code into an input cell, and when you run the cell, the notebook executes the code and prints the output of the computation to an output cell.
You can change the code in an input cell and re-run the cell as often as you like. In this way, the notebook follows a read-evaluate-print loop paradigm. You can choose to use tags to describe cells in a notebook.
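The read-evaluate-print behavior described above can be illustrated with a toy model. This is a minimal sketch, not how a Jupyter kernel is implemented: the `run_cell` helper and the `namespace` dictionary are hypothetical names used only to show that cells share state, so re-running a cell acts on whatever values earlier runs left behind.

```python
import contextlib
import io


def run_cell(source, namespace):
    """Execute one cell's source in the shared namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


# One namespace is shared by all cells, so state persists between runs.
namespace = {}
cells = [
    "x = 2",                    # first input cell: defines x, prints nothing
    "x = x * 10\nprint(x)",     # second input cell: updates x and prints it
]

# Run the cells in order, as the notebook does from top to bottom.
outputs = [run_cell(source, namespace) for source in cells]

# Re-running the second cell uses the current value of x, not the original one.
rerun_output = run_cell(cells[1], namespace)

print(outputs)
print(rerun_output)
```

Because state carries over between runs, the order in which you run and re-run cells affects the results you see.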
The behavior of a cell is determined by a cell’s type. The different types of cells include:
Jupyter code cells
Where you can edit and write new code.
Jupyter markdown cells
Where you can document the computational process. You can input headings to structure your notebook hierarchically.
You can also add and edit image files as attachments to the notebook. The markdown code and images are rendered when the cell is run.
Raw Jupyter NBConvert cells
Where you can write output directly or put code that you don’t want to run. Raw cells are not evaluated by the notebook.
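The three cell types and cell tags can be seen in the notebook file itself, which is stored as JSON. The following is a minimal sketch that builds a small notebook structure by hand, assuming the Jupyter notebook format version 4.4; the `make_cell` helper, the cell contents, and the tag name `load-data` are hypothetical and chosen only for illustration.

```python
import json


def make_cell(cell_type, source, tags=None):
    """Build one notebook cell as a plain dictionary (hypothetical helper)."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if tags:
        # Tags are stored as a list of strings in the cell metadata.
        cell["metadata"]["tags"] = list(tags)
    if cell_type == "code":
        # Code cells also carry an execution counter and a list of outputs.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        make_cell("markdown", "# Load the data\nThis section reads the input."),
        make_cell("code", "values = [1, 2, 3]\nprint(sum(values))", tags=["load-data"]),
        make_cell("raw", "This text is passed through unevaluated."),
    ],
}

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
serialized = json.dumps(notebook, indent=1)

print(cell_types)
```

Writing `serialized` to a file with the `.ipynb` extension would give a file that a notebook editor can open, although in practice the editor creates and maintains this structure for you.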
Spark job progress bar
When you run code in a notebook that triggers Spark jobs, it is often challenging to determine why your code is not running efficiently.
To help you understand what your code is doing and to debug it, you can monitor the execution of the Spark jobs for a code cell.
To enable Spark monitoring for a cell in a notebook:
- Select the code cell you want to monitor.
- Click the Enable Spark Monitoring icon on the notebook toolbar.
The progress bars display the real-time progress of your jobs on the Spark cluster. Each Spark job runs on the cluster in one or more stages, where each stage is a list of tasks that can be run in parallel. The monitoring pane can become very large if the Spark job has many stages.
The job monitoring pane also displays the duration of each job and the status of the job stages. A stage can have one of the following statuses:
- Running: The stage is active and has started.
- Completed: The stage has completed.
- Skipped: The results of this stage were cached from an earlier operation, so its tasks don’t have to run again.
- Pending: The stage hasn’t started yet.
Click the icon again to disable monitoring in a cell.
Note: Spark monitoring is currently only supported in notebooks that run on Python.
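To make the relationship between jobs, stages, tasks, and the stage statuses listed above more concrete, here is a small data model. It is a sketch under stated assumptions and does not use any Spark or notebook API: the `StageStatus`, `Stage`, and `job_progress` names are hypothetical, and the way progress is computed here (finished tasks divided by total tasks, ignoring skipped stages) is an assumption, not a description of how the monitoring pane calculates its progress bars.

```python
from dataclasses import dataclass
from enum import Enum


class StageStatus(Enum):
    """The four stage statuses shown in the job monitoring pane."""
    RUNNING = "Running"
    COMPLETED = "Completed"
    SKIPPED = "Skipped"
    PENDING = "Pending"


@dataclass
class Stage:
    """A stage is a group of tasks that can run in parallel (hypothetical model)."""
    name: str
    total_tasks: int
    finished_tasks: int
    status: StageStatus


def job_progress(stages):
    """Return the fraction of tasks finished, ignoring skipped stages."""
    counted = [s for s in stages if s.status is not StageStatus.SKIPPED]
    total = sum(s.total_tasks for s in counted)
    if total == 0:
        return 1.0  # nothing left to run
    finished = sum(s.finished_tasks for s in counted)
    return finished / total


# Illustrative values only: one job with four stages in different states.
stages = [
    Stage("stage 0", total_tasks=8, finished_tasks=8, status=StageStatus.COMPLETED),
    Stage("stage 1", total_tasks=8, finished_tasks=0, status=StageStatus.SKIPPED),
    Stage("stage 2", total_tasks=4, finished_tasks=1, status=StageStatus.RUNNING),
    Stage("stage 3", total_tasks=4, finished_tasks=0, status=StageStatus.PENDING),
]

progress = job_progress(stages)
print(f"{progress:.2%} of tasks finished")
```

Skipped stages are left out of the calculation in this sketch because their results were already cached and none of their tasks need to run.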