

The IPython.parallel architecture consists of four components:

    digraph IPython_parallel {
        graph [fontname = "Calibri", fontsize="16"];
        node [fontname = "Calibri", fontsize="16"];
        edge [fontname = "Calibri", fontsize="16"];

        // Nodes
        hub [ label="Hub" target="_top", href="../parallel/ipyparallel/intro.html#ipython-hub"]
        engine [ label="Engine" target="_top", href="../parallel/ipyparallel/intro.html#ipython-engine"]
        schedulers [ label="Schedulers" target="_top", href="../parallel/ipyparallel/intro.html#ipython-schedulers"]
        client [ label="Client" target="_top", href="../parallel/ipyparallel/intro.html#ipython-client"]

        // Edges
        engine -> hub
        client -> hub
        schedulers -> hub
        engine -> schedulers
        client -> schedulers
    }


The IPython engine is an extension of the IPython kernel for Jupyter. The module waits for requests from the network, executes code and returns the results. IPython parallel extends the Jupyter messaging protocol with native Python object serialisation and adds some additional commands. Several engines are started for parallel and distributed computing.
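
The request, execute, reply cycle of an engine can be pictured with a toy model. The sketch below is not the Jupyter messaging protocol and not ipyparallel code: the function `handle_request` and the request format (a dict with the keys `code` and `result_name`) are hypothetical, and `pickle` merely stands in for native Python object serialisation.

```python
import pickle


def handle_request(payload: bytes, namespace: dict) -> bytes:
    """Run one pickled request inside `namespace` and return a pickled reply."""
    # Hypothetical request format: {"code": str, "result_name": str}
    request = pickle.loads(payload)
    try:
        # Execute the user code in the engine's own namespace.
        exec(request["code"], namespace)
        reply = {"status": "ok", "result": namespace[request["result_name"]]}
    except Exception as exc:
        # Report failures to the caller instead of crashing the engine.
        reply = {"status": "error", "error": repr(exc)}
    return pickle.dumps(reply)


# Simulate a client that sends one request and reads the reply.
engine_namespace = {}
request = pickle.dumps({"code": "total = sum(range(10))", "result_name": "total"})
reply = pickle.loads(handle_request(request, engine_namespace))
print(reply)  # {'status': 'ok', 'result': 45}
```

A real engine receives such requests over the network and keeps its namespace alive between requests, which is why variables defined in one request are visible in the next.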


The main job of the hub is to establish and monitor connections to clients and engines.


All actions that can be carried out on the engine go through a scheduler. While the engines themselves block when user code is executed, the schedulers hide this from the user to provide a fully asynchronous interface for a number of engines.
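
The idea of an asynchronous interface in front of blocking workers can be illustrated with the standard library alone. This is an analogy under simplifying assumptions, not the way the IPython schedulers are implemented: worker threads play the role of engines, and `blocking_task` is a hypothetical stand-in for user code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_task(x):
    """Stand-in for user code that blocks a worker while it runs."""
    time.sleep(0.1)
    return x * x


# Four worker threads play the role of four engines.
with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns immediately with a Future instead of waiting for the result.
    futures = [pool.submit(blocking_task, n) for n in range(8)]
    # The caller could do other work here and collect the results later.
    results = [f.result() for f in futures]

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Each worker blocks while it runs a task, but the submitting code never does until it explicitly asks for a result.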


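The main Client object connects to the cluster. For each execution model there is a corresponding View, through which users interact with a set of engines. The two standard views are the DirectView, which addresses engines directly, and the LoadBalancedView, which lets a scheduler assign tasks on the user's behalf.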
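
A minimal sketch of how a client might use the DirectView and the LoadBalancedView, assuming `ipyparallel` is installed and a cluster with the default profile is already running. The helper names `square` and `demo_views` are hypothetical and chosen only for this illustration.

```python
def square(x):
    """Small function that is sent to the engines."""
    return x * x


def demo_views():
    """Connect to a running cluster and use both standard views."""
    # Imported here so that the module can be loaded without a cluster.
    import ipyparallel as ipp

    rc = ipp.Client()  # uses the connection file of the default profile
    print("engine ids:", rc.ids)

    dview = rc[:]  # DirectView on all engines
    # apply_sync runs the call on every engine and returns one result per engine.
    print(dview.apply_sync(square, 3))

    lview = rc.load_balanced_view()  # LoadBalancedView, routed by the task scheduler
    # map_sync distributes the items across the engines and returns the results in order.
    print(lview.map_sync(square, range(8)))
```

Calling `demo_views()` only works once the controller and the engines have been started; without a running cluster the `Client()` call fails.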


  1. Starting the IPython Hub:

    $ pipenv run ipcontroller
    [IPControllerApp] Hub listening on tcp:// for registration.
    [IPControllerApp] Hub using DB backend: 'DictDB'
    [IPControllerApp] hub::created hub
    [IPControllerApp] writing connection info to /Users/veit/.ipython/profile_default/security/ipcontroller-client.json
    [IPControllerApp] writing connection info to /Users/veit/.ipython/profile_default/security/ipcontroller-engine.json
    [IPControllerApp] task::using Python leastload Task scheduler

    DB backend

    The database in which the IPython tasks are managed. In addition to the in-memory database DictDB, MongoDB and SQLite are available.

    ipcontroller-client.json

    Configuration file for the IPython client

    ipcontroller-engine.json

    Configuration file for the IPython engine

    Task scheduler

    The routing scheme. leastload always assigns tasks to the engine with the fewest open tasks. Alternatively, lru (least recently used), plainrandom, twobin and weighted can be selected; the latter two also require NumPy.

    This can be configured in ipcontroller_config.py, for example with c.TaskScheduler.scheme_name = 'leastload' or with

    $ pipenv run ipcontroller --scheme=pure
  2. Starting the IPython controller and the engines:

    $ pipenv run ipcluster start
    [IPClusterStart] Starting ipcluster with [daemon=False]
    [IPClusterStart] Creating pid file: /Users/veit/.ipython/profile_default/pid/ipcluster.pid
    [IPClusterStart] Starting Controller with LocalControllerLauncher
    [IPClusterStart] Starting 4 Engines with LocalEngineSetLauncher

    Batch systems

    Besides starting ipcontroller and ipengine locally (see Starting a cluster with SSH), there are also profiles for MPI, PBS, SGE, LSF, HTCondor, Slurm, SSH and WindowsHPC.

    This can be configured in ipcluster_config.py, for example with c.IPClusterEngines.engine_launcher_class = 'SSH', or with

    $ pipenv run ipcluster start --engines=MPI


  3. Starting the Jupyter Notebook and loading the IPython parallel extension:

    $ pipenv run jupyter notebook
    [I NotebookApp] Loading IPython parallel extension
    [I NotebookApp] [jupyter_nbextensions_configurator] enabled 0.4.1
    [I NotebookApp] Serving notebooks from local directory: /Users/veit//jupyter-tutorial
    [I NotebookApp] The Jupyter Notebook is running at:
    [I NotebookApp] http://localhost:8888/?token=4e9acb8993758c2e7f3bda3b1957614c6f3528ee5e3343b3
  4. Finally, the cluster with the default profile can be started in the browser at the URL http://localhost:8888/tree/docs/parallel/ipyparallel#ipyclusters.
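
The leastload routing scheme mentioned in step 1 can be illustrated with a small simulation. This is a simplified model, not the scheduler code of IPython parallel: the function `leastload_route`, the event format and the tie-breaking rule (lowest engine index wins) are assumptions made for this sketch.

```python
def leastload_route(events, n_engines):
    """Route submitted tasks to the engine with the fewest open tasks.

    `events` is a list of ("submit", task_id) or ("done", task_id) tuples.
    Returns a dict that maps each task_id to the index of its engine.
    """
    open_tasks = [0] * n_engines  # number of unfinished tasks per engine
    placement = {}
    for kind, task_id in events:
        if kind == "submit":
            # Pick the least-loaded engine; ties go to the lowest index (assumption).
            engine = min(range(n_engines), key=lambda i: open_tasks[i])
            open_tasks[engine] += 1
            placement[task_id] = engine
        elif kind == "done":
            # A finished task frees capacity on the engine that ran it.
            open_tasks[placement[task_id]] -= 1
    return placement


events = [
    ("submit", "a"), ("submit", "b"), ("submit", "c"),
    ("done", "b"),
    ("submit", "d"), ("submit", "e"),
]
print(leastload_route(events, n_engines=3))
# {'a': 0, 'b': 1, 'c': 2, 'd': 1, 'e': 0}
```

Task `d` goes to engine 1 because that engine became idle when `b` finished, whereas a round-robin scheme would have sent it to engine 0.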