Course recap

Data Processing in Shell

Susan Sun

Data Person

Data downloading on the command line

  • How to download data files via curl and wget
  • Documentation and manuals (e.g. man curl, wget --help)
  • Multiple file downloads (e.g. wget --limit-rate=200k -i url_list.txt)
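
As a quick reminder of the download commands, here is a minimal sketch. The example.com URLs are placeholders and the run helper with its DRY_RUN switch is a hypothetical convenience, not part of curl or wget; by default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
set -eu

# Work in a scratch directory so nothing is written to the current folder.
workdir="$(mktemp -d)"
cd "$workdir"

# Placeholder URLs (assumptions for illustration only).
cat > url_list.txt <<'EOF'
https://example.com/data/file1.csv
https://example.com/data/file2.csv
EOF

# Hypothetical helper: with DRY_RUN=1 (the default) it prints the command,
# with DRY_RUN=0 it actually executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Single file with curl: -L follows redirects, -O keeps the remote file name.
run curl -L -O "https://example.com/data/file1.csv"

# Single file with wget: -c resumes a partially downloaded file.
run wget -c "https://example.com/data/file1.csv"

# Multiple files with wget: -i reads URLs from a file, --limit-rate caps
# bandwidth, and --wait pauses (in seconds) between downloads.
run wget --limit-rate=200k --wait=1 -i url_list.txt
```

Setting DRY_RUN=0 before running the script performs the real downloads, once the placeholder URLs are replaced with real ones.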

Data processing on the command line

  • Introduction to command line data toolkit: csvkit
  • Convert files to csv using in2csv
  • Preview data and summary statistics using csvlook, csvstat
  • Filter data using csvcut, csvgrep
  • Append multiple data files using csvstack
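
The csvkit commands above can be sketched on two tiny files. The file names, columns, and values are made up for illustration, and the script assumes csvkit is installed (it only prints a hint otherwise).

```shell
#!/usr/bin/env bash
set -eu

workdir="$(mktemp -d)"
cd "$workdir"

# Made-up sample data (assumption): one small file per month.
printf 'region,amount\nEast,100\nWest,250\n' > sales_jan.csv
printf 'region,amount\nEast,200\nWest,50\n'  > sales_feb.csv

if command -v csvstack >/dev/null 2>&1; then
  # in2csv converts other formats to CSV; sales.xlsx is a hypothetical file:
  #   in2csv sales.xlsx > sales.csv

  # Preview the file as a table, then print per-column summary statistics.
  csvlook sales_jan.csv
  csvstat sales_jan.csv

  # Filter by column: -n lists column names, -c selects columns.
  csvcut -n sales_jan.csv
  csvcut -c region sales_jan.csv

  # Filter by row: -c picks the column, -m matches an exact value.
  csvgrep -c region -m East sales_jan.csv

  # Stack files with the same schema; -g and -n add a column that
  # records which source file each row came from.
  csvstack -g jan,feb -n month sales_jan.csv sales_feb.csv > sales_all.csv
  csvlook sales_all.csv
else
  echo "csvkit not found; install it with: pip install csvkit"
fi
```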

Database manipulation on the command line

  • Database manipulation using sql2csv, csvsql
  • Advanced SQL-like ETL commands using csvkit
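
A minimal sketch of the two database tools, assuming csvkit is installed and using a local SQLite file. The sales.csv data, the sales.db file name, and the queries are illustrative assumptions.

```shell
#!/usr/bin/env bash
set -eu

workdir="$(mktemp -d)"
cd "$workdir"

# Made-up sample data (assumption).
printf 'region,amount\nEast,100\nWest,250\nEast,200\nWest,50\n' > sales.csv

if command -v csvsql >/dev/null 2>&1; then
  # Query a CSV file directly: csvsql loads it into an in-memory SQLite
  # database, using the file name without extension as the table name.
  csvsql --query "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region" \
    sales.csv > totals.csv
  cat totals.csv

  # Load the CSV into a database: --insert creates the table and its rows.
  csvsql --db "sqlite:///sales.db" --insert sales.csv

  # Pull query results back out of the database as CSV.
  sql2csv --db "sqlite:///sales.db" \
    --query "SELECT * FROM sales WHERE amount > 100" > big_sales.csv
  cat big_sales.csv
else
  echo "csvkit not found; install it with: pip install csvkit"
fi
```

The same --db connection string format is used by both commands, so a pipeline can load with csvsql and extract with sql2csv against the same database.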

Building data pipelines on the command line

  • Execute Python on the command line
  • Python package management using pip
  • Automate Python models and build pipelines with cron
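
The pipeline steps above can be sketched as follows, assuming python3 is on the PATH. The script name, the requirements.txt file, and the paths in the cron entry are placeholders; the cron line is written to a file rather than installed, so the existing crontab is left untouched.

```shell
#!/usr/bin/env bash
set -eu

workdir="$(mktemp -d)"
cd "$workdir"

# Execute Python on the command line: inline with -c, or from a script file.
python3 -c 'print("hello from -c")'
echo 'print("hello from a script")' > hello.py
python3 hello.py

# Package management with pip. Calling it as "python3 -m pip" ties it to
# the same interpreter; the message covers systems without pip.
python3 -m pip --version || echo "pip is not available for this python3"
# Typical commands (not executed here; requirements.txt is hypothetical):
#   python3 -m pip install -r requirements.txt
#   python3 -m pip list

# Schedule the script with cron. The five fields are:
#   minute hour day-of-month month day-of-week
# This placeholder entry would run every day at 02:00 and append output to a log.
cat > pipeline.cron <<'EOF'
0 2 * * * /usr/bin/python3 /path/to/hello.py >> /path/to/hello.log 2>&1
EOF
cat pipeline.cron

# To install it (note: this replaces the current user's whole crontab):
#   crontab pipeline.cron
# To inspect what is currently scheduled:
#   crontab -l
```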

Thank you! So long!
