Downloading data using Wget

Data Processing in Shell

Susan Sun

Data Person

What is Wget?

Wget:

  • derives its name from World Wide Web and get
  • native to Linux but compatible with all major operating systems
  • used to download data from HTTP(S) and FTP
  • better than curl at downloading multiple files recursively

Checking Wget Installation

Check if Wget is installed correctly:

which wget

If Wget has been installed, this prints the path to the Wget executable:

/usr/local/bin/wget

If Wget has not been installed, there will be no output.
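The check above can also be scripted. The following is a minimal sketch, assuming a POSIX shell; it uses `command -v` in place of `which` because its exit status reliably signals whether the program was found. The function name `check_wget` and the messages it prints are illustrative only.

```shell
#!/bin/sh
# check_wget is a hypothetical helper name used for illustration.
# `command -v` prints the path of the program and exits non-zero
# when the program is not on the PATH.
check_wget() {
    if wget_path=$(command -v wget); then
        echo "wget found at: $wget_path"
    else
        echo "wget is not installed"
    fi
}

check_wget
```

If the second message appears, install Wget before continuing.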

Wget Installation by Operating System

Wget source code: https://www.gnu.org/software/wget/

Linux (Debian/Ubuntu): run sudo apt-get install wget

macOS: use Homebrew and run brew install wget

Windows: download via GnuWin32

Browsing the Wget Manual

Once installation is complete, use the man command to open the Wget manual:

man wget

[Screenshot: the beginning of the Wget manual displayed in a dark terminal window]

Learning Wget Syntax

Basic Wget syntax:

wget [option flags] [URL]

URL is required.

Wget supports the HTTP, HTTPS, and FTP protocols. It does not support SFTP.

For a full list of the option flags available, see:

wget --help

Downloading a Single File

Commonly used Wget option flags (the same letters mean different things in curl):

-b: Go to background immediately after startup

-q: Turn off the Wget output

-c: Resume a broken download (i.e., continue fetching a partially downloaded file)

wget -bqc https://websitename.com/datafilename.txt
Continuing in background, pid 12345.
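
The combined -bqc form can also be spelled out with Wget's long-form options, which some readers find easier to follow in scripts. The sketch below is a dry run: it only builds and prints the command so it can be reviewed first. The URL is the placeholder from the slide above, not a real file, and the variable names are illustrative.

```shell
#!/bin/sh
# Placeholder URL from the slide; replace it with a real file location.
url="https://websitename.com/datafilename.txt"

# Long-form equivalents of the short flags:
#   --background = -b   (go to background after startup)
#   --quiet      = -q   (turn off Wget's output)
#   --continue   = -c   (resume a partially downloaded file)
cmd="wget --background --quiet --continue $url"

# Print the command instead of running it (dry run).
echo "$cmd"
```

Running the printed command against a real URL should behave the same as the -bqc version.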

Have fun Wget-ing!
