Course recap
Data Processing in Shell
Data downloading on the command line
- How to download data files via curl and wget
- Documentation and manuals (e.g. man curl, wget --help)
- Multiple file downloads (e.g. wget --limit-rate=200k -i url_list.txt)
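
As a reminder of how these download commands fit together, here is a minimal sketch. The example.com URLs and the run helper are assumptions made for illustration, not part of the course material. By default the script only prints each command, and setting DRY_RUN=0 executes them.

```shell
#!/usr/bin/env bash
# Sketch only: the example.com URLs are placeholders, not real datasets.
set -eu

# One URL per line, which is the format wget -i expects.
cat > url_list.txt <<'EOF'
https://example.com/data/january.csv
https://example.com/data/february.csv
EOF

# Hypothetical helper: print the command, and run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Single file with curl: -L follows redirects, -o sets the local file name.
run curl -L -o january.csv https://example.com/data/january.csv

# Single file with wget: -c resumes a partial download, -q silences output.
run wget -c -q https://example.com/data/january.csv

# Many files with wget: read URLs from a file, cap bandwidth at 200 KB/s.
run wget --limit-rate=200k -i url_list.txt
```

Keeping the URLs in a file means the same list can be reused or extended without editing the command itself.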
Data processing on the command line
- Introduction to command line data toolkit: csvkit
- Convert files to csv using in2csv
- Print preview using csvlook, csvstat
- Filter data using csvcut, csvgrep
- Append multiple data files using csvstack
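
The csvkit commands above can be chained roughly as follows. This is a sketch under assumptions: the sales files, their columns, and the sales.xlsx name are invented for illustration, and the csvkit steps run only if csvkit is installed.

```shell
#!/usr/bin/env bash
# Sketch only: file names and columns are made up for illustration.
set -eu

# Two small sample files with the same schema.
cat > sales_q1.csv <<'EOF'
region,product,units
north,widget,12
south,gadget,7
EOF
cat > sales_q2.csv <<'EOF'
region,product,units
north,gadget,5
south,widget,9
EOF

if command -v csvstack >/dev/null 2>&1; then
  # Converting a spreadsheet would look like this (sales.xlsx is hypothetical):
  #   in2csv sales.xlsx > sales.csv

  # Append both files; -g labels the rows of each file, -n names that column.
  csvstack -g q1,q2 -n quarter sales_q1.csv sales_q2.csv > sales_all.csv

  # Preview as a table, then print summary statistics per column.
  csvlook sales_all.csv
  csvstat sales_all.csv

  # List column names, then keep only two columns.
  csvcut -n sales_all.csv
  csvcut -c region,units sales_all.csv

  # Keep rows whose product column matches "widget" exactly.
  csvgrep -c product -m widget sales_all.csv
else
  echo "csvkit not found; it can be installed with: pip install csvkit"
fi
```

Because every csvkit tool reads and writes CSV, the same steps can also be joined with pipes, for example csvcut followed by csvlook.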
Database manipulation on the command line
- Database manipulation using sql2csv, csvsql
- Advanced SQL-like ETL commands using csvkit
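
A short sketch of the database commands in use. The orders.csv data and the shop.db SQLite file are assumptions made for illustration, and the csvkit steps run only if csvkit is installed.

```shell
#!/usr/bin/env bash
# Sketch only: orders.csv and shop.db are made-up names for illustration.
set -eu

cat > orders.csv <<'EOF'
order_id,customer,amount
1,alice,30
2,bob,45
3,alice,25
EOF

if command -v csvsql >/dev/null 2>&1; then
  # Query a CSV file directly; the table name is the file name without .csv.
  csvsql --query "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer" orders.csv

  # Load the CSV into a SQLite database file as a table named orders.
  rm -f shop.db
  csvsql --db sqlite:///shop.db --insert orders.csv

  # Pull a query result back out of the database as CSV.
  sql2csv --db sqlite:///shop.db --query "SELECT * FROM orders WHERE amount > 28" > big_orders.csv
  csvlook big_orders.csv
else
  echo "csvkit not found; it can be installed with: pip install csvkit"
fi
```

The same --db connection string format works for other databases supported by SQLAlchemy, provided the matching driver is installed.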
Building data pipelines on the command line
- Execute Python on the command line
- Python package management using pip
- Automate Python models and build pipelines with cron
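
The pipeline steps above might look like this in practice. It is a sketch under assumptions: hello_model.py, model_crontab.txt, and the 02:30 schedule are invented for illustration. Commands that need network access or change the user's crontab are left as comments.

```shell
#!/usr/bin/env bash
# Sketch only: script name, crontab file name, and schedule are hypothetical.
set -eu

# A stand-in for a real model script.
cat > hello_model.py <<'EOF'
print("model run complete")
EOF

if command -v python3 >/dev/null 2>&1; then
  # Execute Python from the command line.
  python3 hello_model.py
  # Check that pip is available for this interpreter.
  python3 -m pip --version || echo "pip not available"
fi

# Typical pip commands (they need network access, so they are not run here):
#   pip install scikit-learn
#   pip install -r requirements.txt
#   pip list

# A crontab entry has five time fields followed by the command:
#   minute hour day-of-month month day-of-week command
# This one would run the script every day at 02:30.
echo "30 2 * * * python3 $PWD/hello_model.py" > model_crontab.txt
cat model_crontab.txt

# To schedule it (this REPLACES the current user's crontab):
#   crontab model_crontab.txt
# To list scheduled jobs:
#   crontab -l
```

Using an absolute path in the crontab entry matters because cron does not start jobs in the directory where the entry was written.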
Thank you! So long!