Documentation

Developing Python Packages

James Fulton

Climate informatics researcher

Why include documentation?

Helps your users use your code
Document each
- Function
- Class
- Class method

import numpy as np
help(np.sum)

...
sum(a, axis=None, dtype=None, out=None)
    Sum of array elements over a given axis.

    Parameters
    ----------
    a : array_like
        Elements to sum.
    axis : None or int or tuple of ints, optional
        Axis or axes along which a sum is performed.  
        The default, axis=None, will sum all of the 
        elements of the input array.
...

import numpy as np
help(np.array)

...
    array(object, dtype=None, copy=True)

    Create an array.

    Parameters
    ----------
    object : array_like
        An array, any object exposing the array 
        interface ...
    dtype : data-type, optional
        The desired data-type for the array. 
    copy : bool, optional
        If true (default), then the object is copied.
...

import numpy as np
x = np.array([1,2,3,4])
help(x.mean)

...
mean(...) method of numpy.ndarray instance
    a.mean(axis=None, dtype=None, out=None)

    Returns the average of the array elements 
    along given axis.

    Refer to `numpy.mean` for full documentation.
...

Function documentation

def count_words(filepath, words_list):

    """ ... """

Function documentation

def count_words(filepath, words_list):

    """Count the total number of times these words appear."""

Function documentation

def count_words(filepath, words_list):

    """Count the total number of times these words appear.

    The count is performed on a text file at the given location.
    """

Function documentation

def count_words(filepath, words_list):

    """Count the total number of times these words appear.

    The count is performed on a text file at the given location.

    [explain what filepath and words_list are]

    [what is returned]
    """
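As a rough sketch, the bracketed placeholders above could be filled in as follows. The parameter descriptions follow the ones used elsewhere in these slides; the `int` return type and the function body are assumptions, since the slides never show the implementation.

```python
def count_words(filepath, words_list):
    """Count the total number of times these words appear.

    The count is performed on a text file at the given location.

    Parameters
    ----------
    filepath : str
        Path to text file.
    words_list : list of str
        Count the total number of appearances of these words.

    Returns
    -------
    int
        Total number of appearances (assumed return type).
    """
    # Hypothetical body: a simple case-insensitive, whitespace-based count.
    with open(filepath, encoding="utf-8") as f:
        words = f.read().lower().split()
    targets = {w.lower() for w in words_list}
    return sum(1 for w in words if w in targets)
```

Calling `help(count_words)` would then print the docstring in the same way as the NumPy examples earlier.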

Documentation style

Google documentation style

"""Summary line.

Extended description of function.

Args:
    arg1 (int): Description of arg1
    arg2 (str): Description of arg2

NumPy style

    """Summary line.

    Extended description of function.

    Parameters
    ----------
    arg1 : int
        Description of arg1 ...

    Returns
    -------
    numpy.ndarray

reStructuredText style

    """Summary line.

    Extended description of function.

    :param arg1: Description of arg1
    :type arg1: int
    :param arg2: Description of arg2
    :type arg2: str

Epytext style

    """Summary line.

    Extended description of function.

    @type arg1: int
    @param arg1: Description of arg1
    @type arg2: str
    @param arg2: Description of arg2
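The style fragments above stop before the closing quotes. For comparison, here is a minimal sketch of one complete docstring in Google style. The function `repeat_text` and its arguments are hypothetical and exist only to carry the docstring.

```python
import inspect


def repeat_text(arg1, arg2):
    """Repeat a string a number of times.

    Hypothetical function used only to show a complete docstring.

    Args:
        arg1 (int): Number of repetitions.
        arg2 (str): Text to repeat.

    Returns:
        str: The text repeated ``arg1`` times.
    """
    return arg2 * arg1


# inspect.getdoc returns the docstring with its common indentation removed,
# which is the text that help() displays.
print(inspect.getdoc(repeat_text))
```

Whichever style is chosen, the docstring is stored on the function as `__doc__`, so the choice affects readability and tooling rather than behavior.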

NumPy documentation style

Popular in scientific Python packages like

numpy
scipy
pandas
sklearn
matplotlib
dask
etc.

NumPy documentation style

import numpy as np
help(np.percentile)

percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear')
    Compute the q-th percentile of the data along the specified axis.

    Returns the q-th percentile(s) of the array elements.


    Parameters
    ----------
    a : array_like
        Input array or object that can be converted to an array.

Other common types include int, float, bool, str, dict, numpy.ndarray, etc.

NumPy documentation style

import numpy as np
help(np.percentile)

percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear')
    ...
    Parameters
    ----------
    ...
    axis : {int, tuple of int, None}
    ...
    interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'}

List multiple types for a parameter if appropriate
List the accepted values if there are only a few valid options
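The two conventions above can be sketched on a small function. `summarize`, its parameters, and its behavior are hypothetical and not taken from any library; only the brace notation mirrors the `percentile` docstring.

```python
import statistics


def summarize(values, method="mean", ndigits=None):
    """Summarize a sequence of numbers with a single statistic.

    Parameters
    ----------
    values : list of float
        Numbers to summarize.
    method : {'mean', 'median'}
        Statistic to compute.
    ndigits : {int, None}
        Number of decimal places to round to. If None, no rounding is done.

    Returns
    -------
    float
        The requested statistic.
    """
    if method == "mean":
        result = float(statistics.mean(values))
    elif method == "median":
        result = float(statistics.median(values))
    else:
        # Reject anything outside the documented set of accepted values
        raise ValueError(f"unknown method: {method!r}")
    return result if ndigits is None else round(result, ndigits)
```

Listing the accepted strings in braces tells the reader at a glance which values are valid, without having to read the function body.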

NumPy documentation style

import numpy as np
help(np.percentile)

percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear')
    ...
    Returns
    -------
    percentile : scalar or ndarray
        If `q` is a single percentile and `axis=None`, then the result
        is a scalar. If multiple percentiles are given, first axis of
        the result corresponds to the percentiles...
    ...

NumPy documentation style

Other sections

Raises
See Also
Notes
References
Examples

¹ https://numpydoc.readthedocs.io/en/latest/format.html
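As an illustration of two of these sections, the sketch below adds Raises and Examples to a NumPy-style docstring. The function `safe_divide` is hypothetical. Running the Examples section through the standard-library `doctest` module is one possible way to check it, not something the slides prescribe.

```python
import doctest


def safe_divide(numerator, denominator):
    """Divide one number by another.

    Parameters
    ----------
    numerator : float
        Number to be divided.
    denominator : float
        Number to divide by.

    Returns
    -------
    float
        The quotient.

    Raises
    ------
    ZeroDivisionError
        If `denominator` is zero.

    Examples
    --------
    >>> safe_divide(6, 3)
    2.0
    """
    return numerator / denominator


# Collect the interactive examples from the docstring and run them
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(safe_divide, "safe_divide",
                        globs={"safe_divide": safe_divide}):
    runner.run(test)
print(f"{runner.tries} example(s) run, {runner.failures} failure(s)")
```

Keeping the Examples section executable helps stop the documentation from drifting out of date as the code changes.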

Documentation templates and style translation

pyment can be used to generate docstring templates
Run it from the terminal
Supports these documentation styles
- Google
- Numpydoc
- reST (i.e. reStructuredText)
- Javadoc (i.e. epytext)
Can also translate existing documentation from one style to another

Documentation templates and style translation

pyment -w -o numpydoc textanalysis.py

def count_words(filepath, words_list):
    # Open the text file
    ...
    return n

-w - overwrite file
-o numpydoc - output in NumPy style

Documentation templates and style translation

pyment -w -o numpydoc textanalysis.py

def count_words(filepath, words_list):
    """

    Parameters
    ----------
    filepath :

    words_list :


    Returns
    -------
    type
    """

Translate to Google style

pyment -w -o google textanalysis.py

def count_words(filepath, words_list):
    """Count the total number of times these words appear.

    The count is performed on a text file at the given location.

    Parameters
    ----------
    filepath : str
        Path to text file.
    words_list : list of str
        Count the total number of appearances of these words.

    Returns
    -------

    """

Translate to Google style

pyment -w -o google textanalysis.py

def count_words(filepath, words_list):
    """Count the total number of times these words appear.

    The count is performed on a text file at the given location.

    Args:
      filepath(str): Path to text file.
      words_list(list of str): Count the total number of appearances of these words.

    Returns:


    """

Package, subpackage and module documentation

mysklearn/__init__.py

"""
Linear regression for Python
============================

mysklearn is a complete package for implementing
linear regression in Python.
"""

mysklearn/preprocessing/__init__.py

"""
A subpackage for standard preprocessing operations.
"""

mysklearn/preprocessing/normalize.py

"""
A module for normalizing data.
"""
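Package, subpackage and module docstrings are read the same way as function docstrings. The sketch below rebuilds the layout above in a temporary directory (an assumption made so the example is self-contained, with shortened docstring text) and then reads the docstrings back through `__doc__`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the package layout in a temporary directory
root = Path(tempfile.mkdtemp())
subpkg = root / "mysklearn" / "preprocessing"
subpkg.mkdir(parents=True)
(root / "mysklearn" / "__init__.py").write_text(
    '"""\nLinear regression for Python\n"""\n')
(subpkg / "__init__.py").write_text(
    '"""\nA subpackage for standard preprocessing operations.\n"""\n')
(subpkg / "normalize.py").write_text(
    '"""\nA module for normalizing data.\n"""\n')

# Make the temporary package importable, then import the module
sys.path.insert(0, str(root))
normalize = importlib.import_module("mysklearn.preprocessing.normalize")

# Importing the module also imports its parent packages
print(sys.modules["mysklearn"].__doc__)
print(sys.modules["mysklearn.preprocessing"].__doc__)
print(normalize.__doc__)
```

In a real project the files would already exist on disk, and `help(mysklearn)` would show the package docstring directly.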

Let's practice!
