Data Processing in Shell
Susan Sun
Data Person
curl:
Check curl installation:
man curl
If curl has not been installed, you will see:
curl command not found.
For full installation instructions, see https://curl.haxx.se/download.html.
If curl is installed, your console will display the curl manual page.

Press Enter to scroll.

Press q to exit.
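The installation check can also be scripted. The following is a minimal sketch, assuming a POSIX-style shell; check_installed is a hypothetical helper name, not part of curl. It asks the shell whether a command can be found on the PATH instead of opening the manual page.

```shell
# Report whether a command is available on the PATH.
# check_installed is a hypothetical helper name used for illustration.
check_installed() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 is installed"
    else
        echo "$1 is not installed"
    fi
}

check_installed curl
```

Unlike man curl, this does not open a pager, so it can be used inside a setup script.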
Basic curl syntax:
curl [option flags] [URL]
URL is required.
curl supports several protocols, including HTTP, HTTPS, FTP, and SFTP.
For a full list of the options available:
curl --help
Example:
A single file is stored at:
https://websitename.com/datafilename.txt
Use the optional flag -O to save the file with its original name:
curl -O https://websitename.com/datafilename.txt
To save the file under a different name, use the lowercase -o flag followed by the new file name:
curl -o renameddatafilename.txt https://websitename.com/datafilename.txt
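The two flags can be wrapped in one small helper. This is only a sketch: download_as is a hypothetical function name, and the URL in the usage comment is the placeholder host from the example above.

```shell
# Save a URL under its original name, or under a new name if one is given.
# download_as is a hypothetical helper: $1 = URL, $2 = optional local file name.
download_as() {
    if [ -n "${2:-}" ]; then
        curl -o "$2" "$1"   # lowercase -o: write to the name supplied
    else
        curl -O "$1"        # uppercase -O: keep the remote file name
    fi
}

# Example calls (placeholder host, so they are left commented out):
# download_as "https://websitename.com/datafilename.txt"
# download_as "https://websitename.com/datafilename.txt" renameddatafilename.txt
```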
Oftentimes, a server will host multiple data files, with similar filenames:
https://websitename.com/datafilename001.txt
https://websitename.com/datafilename002.txt
...
https://websitename.com/datafilename100.txt
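Using Wildcards (*)
Goal: download every file hosted on https://websitename.com/ that starts with datafilename and ends in .txt.
curl -O https://websitename.com/datafilename*.txt
Caution: curl's URL globbing expands only [ ] ranges and { } lists, not *. curl requests the URL above literally, so this command is not a reliable way to fetch every matching file.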
Using Globbing Parser
The following downloads every file in sequence, from datafilename001.txt through datafilename100.txt:
curl -O https://websitename.com/datafilename[001-100].txt
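Before starting a long download, it can help to preview which URLs a numeric range stands for. The sketch below does not call curl at all; it prints the zero-padded names that a range such as [001-100] covers. preview_range is a hypothetical helper, and the host is the placeholder from the example.

```shell
# Print the URLs that a zero-padded numeric range stands for.
# preview_range is a hypothetical helper.
# $1 = first number, $2 = last number (plain integers, no leading zeros)
preview_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        printf 'https://websitename.com/datafilename%03d.txt\n' "$i"
        i=$((i + 1))
    done
}

preview_range 1 3
```

Calling preview_range 1 100 lists all one hundred URLs, one per line.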
Increment through the files and download every Nth file (e.g. datafilename010.txt, datafilename020.txt, ... datafilename100.txt). The step is counted from the first number in the range, so the range starts at 010:
curl -O https://websitename.com/datafilename[010-100:10].txt
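In a stepped range, curl counts the step from the first number, so the starting value decides which files are selected. The sketch below mimics that counting in plain shell, without calling curl. range_numbers is a hypothetical helper name.

```shell
# List the numbers a stepped range START-END:STEP would visit.
# range_numbers is a hypothetical helper.
# $1 = start, $2 = end, $3 = step (plain integers, no leading zeros)
range_numbers() {
    n=$1
    while [ "$n" -le "$2" ]; do
        printf '%03d\n' "$n"
        n=$((n + $3))
    done
}

range_numbers 1 100 10    # starts at 001, then 011, 021, and so on
range_numbers 10 100 10   # starts at 010, then 020, 030, and so on
```

Starting at 1 never reaches 010 or 100, whereas starting at 10 selects exactly the multiples of ten.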
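curl has two particularly useful option flags in case of timeouts during download:
-L Follows the redirect if the server responds with a 3xx status code.
-C Resumes a previous file transfer if it times out before completion. It takes an offset argument; -C - lets curl work out the offset from the partially downloaded file.
Putting everything together:
curl -L -O -C - https://websitename.com/datafilename[001-100].txt
The order of the flags does not matter, as long as the - stays directly after -C (e.g. -L -C - -O is fine).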
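The combined command can be kept in a small function for reuse. This is a sketch under stated assumptions: fetch_series is a hypothetical name, and the host is the placeholder used throughout. The URL is quoted because some shells (zsh, for instance) treat unquoted square brackets as a filename pattern before curl ever sees them.

```shell
# Download a numbered series, following redirects and resuming partial files.
# fetch_series is a hypothetical helper: $1 = URL, optionally with a [ ] range.
fetch_series() {
    # -L    follow redirects
    # -O    keep each remote file name
    # -C -  resume, letting curl work out the offset from the partial file
    curl -L -O -C - "$1"
}

# Example call (placeholder host, so it is left commented out):
# fetch_series "https://websitename.com/datafilename[001-100].txt"
```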