Python package installation with pip

Data Processing in Shell

Susan Sun

Data Person

Python standard library

Python standard library has a collection of:

  • built-in functions (e.g. print())
  • built-in packages (e.g. math, os)

Data science packages like scikit-learn and statsmodels:

  • are NOT part of the Python standard library
  • can be installed through pip, the standard package manager for Python, via the command line
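One quick way to see this difference from the command line is to try importing a module. The sketch below assumes the interpreter is available as python3 (on some systems the command is python). The math module imports on any installation, while sklearn (the import name of scikit-learn) only imports once the package has been installed.

```shell
# math is part of the standard library, so this works on a fresh Python install
python3 -c "import math; print(math.sqrt(16))"

# sklearn is the import name of scikit-learn; the import fails until the package is installed
python3 -c "import sklearn" 2>/dev/null \
  && echo "scikit-learn is installed" \
  || echo "scikit-learn is not installed"
```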

Using pip documentation

Documentation:

pip -h
Usage:
  pip <command> [options]

Commands:
  install       Install packages.
  uninstall     Uninstall packages.
  freeze        Output installed packages in requirements format.
  list          List installed packages.

Using pip documentation

Check the installed pip and Python versions:

pip --version
pip 19.1.1 from /usr/local/lib/python3.5/dist-packages/pip (python 3.5)
python --version
Python 3.5.2
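On a machine with several Python installations, the pip command found first on the PATH may belong to a different interpreter than the python command. A minimal sketch of one way to avoid that ambiguity is to run pip as a module of the interpreter you intend to use. The command name python3 is an assumption here and may be python on your system.

```shell
# Which interpreter does "python3" resolve to?
python3 --version

# Run pip as a module of that same interpreter, so pip and Python are guaranteed to match
python3 -m pip --version || echo "pip is not available for this interpreter"
```

The path printed by the second command shows which Python installation that pip belongs to.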

Upgrading pip

If pip is giving an upgrade warning:

WARNING: You are using pip version 19.1.1, however version 19.2.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Upgrade pip using itself:

pip install --upgrade pip
Collecting pip
     |################################| 1.4MB 10.7MB/s
Successfully installed pip-19.2.1

pip list

pip list: displays the Python packages in your current Python environment

pip list
Package         Version
--------------- -------
agate           1.6.1
agate-dbf       0.2.1
agate-excel     0.2.3
agate-sql       0.5.4
Babel           2.7.0
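The full listing can be long. As a sketch, the output can be filtered with grep, or a single package can be inspected with pip show. The example uses pip itself as the package to look up, since it is present wherever pip runs, and it assumes the interpreter is available as python3.

```shell
# Keep the header lines plus any package whose name starts with "pip"
python3 -m pip list | grep -i -E '^(Package|-|pip)'

# Print the metadata (name, version, location, dependencies) of one installed package
python3 -m pip show pip
```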

pip install one package

pip install installs the specified package along with any dependencies it requires

pip install scikit-learn
Collecting scikit-learn
  Downloading https://files.pythonhosted.org/packages/1f/af/e3c3cd6f61093830059138624dbd26d938d6da1caeec5aeabe772b916069/scikit_learn-0.21.3-cp35-cp35m-manylinux1_x86_64.whl (6.6MB)
     |################################| 6.6MB 32.5MB/s
Collecting scipy>=0.17.0 (from scikit-learn)
  Downloading https://files.pythonhosted.org/packages/14/49/8f13fa215e10a7ab0731cc95b0e9bb66cf83c6a98260b154cfbd0b55fb19/scipy-1.3.0-cp35-cp35m-manylinux1_x86_64.whl (25.1MB)
     |################################| 25.1MB 35.5MB/s
...

pip install a specific version

By default, pip install installs the latest version of the package that is compatible with your Python version.

pip install scikit-learn
Successfully built sklearn
Installing collected packages: joblib, scipy, scikit-learn, sklearn
Successfully installed joblib-0.13.2 scikit-learn-0.21.3 scipy-1.3.0 sklearn-0.0

pip install a specific version

To install a specific (or older) version of the library:

pip install scikit-learn==0.19.2
Collecting scikit-learn==0.19.2
  Downloading https://files.pythonhosted.org/packages/b6/e2/a1e254a4a4598588d4fe88b45ab88a226c289ecfd0f6c90474eb6a9ea6b3/scikit_learn-0.19.2-cp35-cp35m-manylinux1_x86_64.whl (4.9MB)
     |################################| 4.9MB 15.6MB/s
Installing collected packages: scikit-learn
Successfully installed scikit-learn-0.19.2

Upgrading packages using pip

Upgrade the scikit-learn package using pip:

pip install --upgrade scikit-learn
Collecting scikit-learn
  Downloading https://files.pythonhosted.org/packages/1f/af/e3c3cd6f61093830059138624dbd26d938d6da1caeec5aeabe772b916069/scikit_learn-0.21.3-cp35-cp35m-manylinux1_x86_64.whl (6.6MB)
     |################################| 6.6MB 41.5MB/s
Requirement already satisfied, skipping upgrade: numpy>=1.11.0 in /usr/local/lib/python3.5/dist-packages (from scikit-learn) (1.16.4)
Collecting scipy>=0.17.0 (from scikit-learn)
Installing collected packages: scipy, joblib, scikit-learn
Successfully installed joblib-0.13.2 scikit-learn-0.21.3 scipy-1.3.0

pip install multiple packages

To pip install multiple packages, separate the packages with spaces:

pip install scikit-learn statsmodels

Upgrade multiple packages:

pip install --upgrade scikit-learn statsmodels

pip install with requirements.txt

A requirements.txt file contains a list of packages to be installed, one per line:

cat requirements.txt
scikit-learn
statsmodels

Many Python developers include a requirements.txt file in their GitHub repositories.
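Each line of a requirements file accepts the same version specifiers as the command line, and lines starting with # are treated as comments. The sketch below writes a small file that pins scikit-learn to the version used earlier and leaves statsmodels unpinned. The file name requirements-pinned.txt is hypothetical, chosen so an existing requirements.txt is not overwritten.

```shell
# Write a requirements file (the file name is hypothetical)
cat > requirements-pinned.txt <<'EOF'
# one requirement per line; lines starting with # are comments
scikit-learn==0.19.2
statsmodels
EOF

# Inspect the result
cat requirements-pinned.txt
```

To capture the packages already installed in an environment instead of writing the file by hand, pip freeze > requirements.txt saves them with their exact versions.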


pip install with requirements.txt

-r allows pip install to install packages from a pre-written file:

-r, --requirement <file>
Install from the given requirements file. This option can be used multiple times.

In our example:

pip install -r requirements.txt

is the same as

pip install scikit-learn statsmodels
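To make the equivalence concrete, the sketch below joins the lines of a requirements file into the one-line command and prints it instead of running it, so nothing is installed. It creates its own sample file, demo-requirements.txt (a hypothetical name), containing scikit-learn and statsmodels, the names under which these packages are published.

```shell
# Sample file with the two packages from the example (the file name is hypothetical)
printf 'scikit-learn\nstatsmodels\n' > demo-requirements.txt

# Join the lines with spaces to build the equivalent one-line command
equivalent="pip install $(paste -s -d ' ' demo-requirements.txt)"
echo "$equivalent"
```

This simple join only holds for files that list plain requirements; a file with comments or extra options needs pip install -r to be interpreted correctly.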

Let's practice!
