Course recap
Data Processing in Shell
Data downloading on the command line
- How to download data files via curl and wget
- Documentation and manuals (e.g. man curl, wget --help)
- Multiple file downloads (e.g. wget --limit-rate=200k -i url_list.txt)
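A minimal sketch of the download commands recapped above. The example.com URLs and the helper names fetch_one and fetch_list are placeholders invented for illustration, not part of the course material. The block only writes the URL list and defines the helpers, so nothing is downloaded until a helper is called.

```shell
# Build a URL list, one URL per line (placeholder URLs; replace with real ones).
cat > url_list.txt <<'EOF'
https://example.com/data/file1.csv
https://example.com/data/file2.csv
EOF

# Hypothetical helper: download one file with curl.
# -L follows redirects, -O saves the file under its remote name.
fetch_one() {
  curl -L -O "$1"
}

# Hypothetical helper: download every URL listed in a file with wget.
# --limit-rate caps bandwidth, --wait pauses one second between files,
# -c resumes partially downloaded files, -i reads URLs from the given file.
fetch_list() {
  wget --limit-rate=200k --wait=1 -c -i "$1"
}
```

With real URLs in place, fetch_one https://example.com/data/file1.csv fetches a single file and fetch_list url_list.txt fetches the whole list. man curl and wget --help describe the remaining options.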
Data processing on the command line
- Introduction to the command line data toolkit: csvkit
- Convert files to CSV using in2csv
- Preview and summarize data using csvlook, csvstat
- Filter data using csvcut, csvgrep
- Append multiple data files using csvstack
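The csvkit commands above could be combined roughly as follows. This is a sketch under assumptions: the file names and sample rows are made up for illustration, and the csvkit steps are skipped if csvkit is not installed.

```shell
# Create two small, made-up CSV files to work with.
cat > sales_jan.csv <<'EOF'
region,product,units
north,widget,10
south,gadget,4
EOF

cat > sales_feb.csv <<'EOF'
region,product,units
north,gadget,7
south,widget,12
EOF

if command -v csvstack >/dev/null 2>&1; then
  # Append the files row-wise; -g adds a "group" column recording the source.
  csvstack -g jan,feb sales_jan.csv sales_feb.csv > sales_all.csv

  # Preview as a formatted table, then print per-column summary statistics.
  csvlook sales_all.csv
  csvstat sales_all.csv

  # Filter columns: list the column names, then keep two of them.
  csvcut -n sales_all.csv
  csvcut -c region,units sales_all.csv

  # Filter rows: keep rows whose product column exactly matches "widget".
  csvgrep -c product -m widget sales_all.csv
else
  echo "csvkit is not installed; try: pip install csvkit" >&2
fi
```

Non-CSV inputs such as Excel workbooks would first go through in2csv, for example in2csv data.xlsx > data.csv, where data.xlsx is again a placeholder name.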
Database manipulation on the command line
- Database manipulation using sql2csv, csvsql
- Advanced SQL-like ETL commands using csvkit
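One way the database commands above fit together, as a sketch only. The file orders.csv, the SQLite database orders.db, and the queries are made-up examples, and the csvkit steps are skipped if csvkit is not installed.

```shell
# Create a small, made-up CSV file.
cat > orders.csv <<'EOF'
order_id,customer,amount
1,alice,30
2,bob,15
3,alice,20
EOF

if command -v csvsql >/dev/null 2>&1; then
  # Run SQL directly against the CSV. csvsql loads it into an in-memory
  # SQLite table named after the file (here: orders).
  csvsql --query "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer" orders.csv

  # Load the CSV into a SQLite database file (removed first so reruns work).
  rm -f orders.db
  csvsql --db sqlite:///orders.db --insert orders.csv

  # Extract query results from the database back into a CSV file.
  sql2csv --db sqlite:///orders.db --query "SELECT * FROM orders WHERE amount > 15" > big_orders.csv
  csvlook big_orders.csv
else
  echo "csvkit is not installed; try: pip install csvkit" >&2
fi
```

The same pattern covers a simple extract, transform, load cycle: csvsql --insert loads, SQL in --query transforms, and sql2csv extracts.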
Building data pipelines on the command line
- Execute Python on the command line
- Python package management using pip
- Automate Python models and build pipelines with cron
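A rough sketch of the pipeline steps above. The script name hello_pipeline.py, the schedule, and the log path are assumptions chosen for illustration. The cron entry is only written to a file here, not installed, and the pip install lines are left commented out because they change the environment.

```shell
# Write a tiny, made-up Python script standing in for a pipeline step.
cat > hello_pipeline.py <<'EOF'
print("pipeline step ran")
EOF

# Execute Python on the command line: as a script, then as a one-liner.
python3 hello_pipeline.py
python3 -c 'print(2 + 3)'

# Package management with pip (invoked through the interpreter it belongs to).
python3 -m pip --version || echo "pip is not available for this interpreter" >&2
# python3 -m pip install csvkit
# python3 -m pip install -r requirements.txt

# Draft a cron entry: minute hour day-of-month month day-of-week command.
# This one would run the script every day at 02:00 and append output to a log.
printf '0 2 * * * python3 %s/hello_pipeline.py >> %s/pipeline.log 2>&1\n' \
  "$PWD" "$PWD" > pipeline.cron
cat pipeline.cron
```

To schedule it, the line in pipeline.cron can be pasted into the editor opened by crontab -e, and crontab -l lists what is currently installed.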
Thank you! So long!