Tutorials

This page contains tutorials and code snippets to showcase datafold’s API. All tutorials can be viewed online below. If you want to execute the notebooks in Jupyter, please also note the instructions in “Run notebooks with Jupyter”.

List

Download all tutorials in a zipped file.

  1. Data structures: PCManifold and TSCDataFrame (download)

    We introduce datafold’s basic data structures for time series collection data and kernel-based algorithms. They are both used internally in model implementations and for input/output.

  2. Uniform subsampling of point cloud manifold (download)

    We show how the PCManifold data structure can be used to subsample a manifold point cloud uniformly.

    Warning

    By default, the tutorial generates a large dataset with 10 million samples. Reduce the number of samples if the available memory is insufficient.

  3. Diffusion Maps: Embedding of an S-curved manifold (download)

    We use a DiffusionMaps model to compute lower-dimensional embeddings of an S-curved point cloud manifold. We also select the best combination of intrinsic parameters automatically with an optimization routine.

  4. Manifold learning on handwritten digits (download)

    We use the DiffusionMaps model to cluster data from handwritten digits and perform an out-of-sample embedding. This example is taken from the scikit-learn project and can be compared against other manifold learning algorithms.

  5. Geometric Harmonics: interpolate function values on data manifold (download)

    We showcase the out-of-sample extension for manifold learning models such as the DiffusionMaps model. For this, we use the GeometricHarmonicsInterpolator for forward and backward interpolation.

    Warning

    The tutorial also requires the Python package scikit-optimize, which is not installed with datafold.

  6. Extended Dynamic Mode Decomposition on Limit Cycle (download)

    We generate data from a dynamical system (Hopf system) and compare different dictionaries of the Extended Dynamic Mode Decomposition (EDMD). We also evaluate out-of-sample predictions with time ranges exceeding the time horizon of the training data.

  7. Jointly Smooth Functions: An Example (download)

    We use JointlySmoothFunctions to learn commonly smooth functions from multimodal data. Also, we introduce JsfDataset, which is used to make JointlySmoothFunctions consistent with scikit-learn’s estimator and transformer APIs. Finally, we demonstrate the out-of-sample extension.

    Warning

    The code for jointly smooth functions inside this notebook is experimental.
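
As orientation for tutorial 6: a Hopf system has a stable limit cycle that trajectories approach from both inside and outside. The sketch below integrates the Hopf normal form in polar coordinates with explicit Euler steps, using only the Python standard library. The parametrization (mu, omega), the step size, and the function name hopf_trajectory are assumptions made for illustration; the notebook may define and integrate the system differently.

```python
import math


def hopf_trajectory(r0, theta0, mu=1.0, omega=1.0, dt=0.01, n_steps=2000):
    """Integrate dr/dt = r * (mu - r**2), dtheta/dt = omega with explicit Euler.

    Returns a list of (x, y) points in Cartesian coordinates.
    All parameter defaults are illustrative assumptions.
    """
    r, theta = r0, theta0
    points = [(r * math.cos(theta), r * math.sin(theta))]
    for _ in range(n_steps):
        r += dt * r * (mu - r * r)  # radial dynamics: attracted to r = sqrt(mu)
        theta += dt * omega         # constant angular velocity
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


# One trajectory starting inside the limit cycle, one starting outside.
inside = hopf_trajectory(r0=0.1, theta0=0.0)
outside = hopf_trajectory(r0=2.0, theta0=0.0)

final_radius_inside = math.hypot(*inside[-1])
final_radius_outside = math.hypot(*outside[-1])
print(f"final radius (inside start):  {final_radius_inside:.4f}")
print(f"final radius (outside start): {final_radius_outside:.4f}")
```

For mu = 1 both trajectories settle on the circle of radius sqrt(mu) = 1. Trajectories of this kind are the sort of time series data that the tutorial uses to fit and compare EDMD dictionaries.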

Run notebooks with Jupyter

Download files

  • If datafold was installed via PyPI, …

    … the tutorials are not included in the package. Download them separately from the above list.

  • If the datafold repository was downloaded, …

    … navigate to the folder /path/to/datafold/tutorials/. Before executing the tutorials, make sure that the package is either installed (python setup.py install) or that /path/to/datafold/ is included in the PYTHONPATH environment variable (export PYTHONPATH=$PYTHONPATH:/path/to/datafold/).
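
Before opening a notebook, it can help to confirm that the interpreter actually finds the required packages. The following is a minimal sketch using only the standard library; the helper name is_importable is hypothetical and not part of datafold. The import name skopt belongs to scikit-optimize, which is needed only for the Geometric Harmonics tutorial.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the interpreter can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# "datafold" must be installed or reachable via PYTHONPATH;
# "skopt" (scikit-optimize) is an optional extra for one tutorial.
for name in ("datafold", "skopt"):
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```

If datafold is reported as not found, run the script with the same Python interpreter that Jupyter uses and re-check the installation or the PYTHONPATH setting described above.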

Start Jupyter

All tutorials are Jupyter notebooks (.ipynb file extension). Jupyter and its dependencies can be installed with

python -m pip install jupyter

For further information visit the Jupyter homepage. To open a Jupyter notebook in a web browser, run

jupyter notebook path/to/datafold/tutorials