Introduction to Bash Scripting
Alex Scriven
Data Scientist
This course will cover:
Firstly, let's consider why Bash?
Bash stands for 'Bourne Again Shell' (a pun)
Developed in the 1980s but still a very popular shell today. Default on many Unix-like systems (and on Macs until macOS Catalina switched the default to zsh)
Unix-like systems run much of the internet and most servers (running ML models, data pipelines)
So why Bash scripting?
You are expected to have some basic knowledge for this course:
cat, grep, sed etc.
If you are rusty, don't worry - we will revise this now!
Some important shell commands:
(e)grep filters input based on regex pattern matching
cat concatenates file contents line-by-line
tail / head give only the last / first -n (a flag) lines
wc does a word or line count (with flags -w, -l)
sed does pattern-matched string replacement
'Regex' or regular expressions are a vital skill for Bash scripting.
You will often need to filter files or data within files, match arguments, and more. It is worth revisiting this.
To test your regex you can use helpful sites like regex101.com
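As a quick refresher, the commands above can be sketched on a small throwaway file. The file name sample.txt and its contents are made up purely for this illustration; they are not part of the course material.

```shell
#!/usr/bin/env bash
# Hypothetical sample file, created in a temp directory just for this sketch.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'alpha 1\nbeta 22\ngamma 333\ndelta 4444\n' > "$tmpdir/sample.txt"

# head / tail: first or last -n lines
first_two=$(head -n 2 "$tmpdir/sample.txt")
last_one=$(tail -n 1 "$tmpdir/sample.txt")

# wc: line count (-l) and word count (-w); tr strips padding some systems add
line_count=$(wc -l < "$tmpdir/sample.txt" | tr -d ' ')
word_count=$(wc -w < "$tmpdir/sample.txt" | tr -d ' ')

# sed: pattern-matched replacement (first match on each line)
replaced=$(sed 's/alpha/ALPHA/' "$tmpdir/sample.txt" | head -n 1)

# grep -E (same as egrep): keep lines containing three or more digits in a row
three_plus_digits=$(grep -E '[0-9]{3,}' "$tmpdir/sample.txt")

echo "lines: $line_count, words: $word_count"
echo "last line: $last_one"
echo "after sed: $replaced"
echo "$three_plus_digits"
```

With this sample file, the line count is 4, the word count is 8, and the grep -E call keeps only the gamma and delta lines.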
Let's revise some shell commands in an example.
Consider a text file fruits.txt with 3 lines of data:
banana
apple
carrot
If we ran grep 'a' fruits.txt we would return:
banana
apple
carrot
But if we ran grep 'p' fruits.txt we would return:
apple
Recall that square brackets create a matching set, such as [eyfv]. A ^ as the first character inside the brackets makes this an inverse set (not these letters/numbers)
So if we ran grep '[pc]' fruits.txt we would return:
apple
carrot
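The set and inverse-set behaviour can be checked with a small sketch that recreates the three-line fruits.txt from above in a temporary directory (the temp path is an assumption made for the example). Note that an inverse set matches any single character outside the set, so it does not by itself exclude whole lines; grep -v is the flag that inverts the line selection.

```shell
#!/usr/bin/env bash
# Recreate the three-line fruits.txt in a throwaway directory.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf 'banana\napple\ncarrot\n' > "$tmpdir/fruits.txt"

# Matching set: lines containing a 'p' or a 'c'
set_match=$(grep '[pc]' "$tmpdir/fruits.txt")

# Inverse set: lines containing at least one character that is NOT 'p' or 'c'.
# Every fruit has such a character, so all three lines are returned.
any_non_pc=$(grep '[^pc]' "$tmpdir/fruits.txt")

# To drop lines that contain 'p' or 'c', invert the match with -v instead.
no_pc=$(grep -v '[pc]' "$tmpdir/fruits.txt")

echo "$set_match"    # apple, carrot
echo "$any_non_pc"   # banana, apple, carrot
echo "$no_pc"        # banana
```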
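You have likely used 'pipes' before in the terminal. If we had many, many fruits in our file we could use sort | uniq -c to count how often each fruit occurs (sort first, because uniq only collapses adjacent duplicate lines).
We could then pipe the result to wc -l to count the distinct fruits, or use head to keep only the first few rows:
cat new_fruits.txt | sort | uniq -c | head -n 3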
14 apple
13 banana
12 carrot
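The pipeline above lists the first rows in alphabetical order, since that is how sort arranged them. A minimal sketch of one way to get the most frequent fruits instead is to add a numeric reverse sort before head. The six-line new_fruits.txt below is invented for the example and is not the course's data file.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for new_fruits.txt, one fruit per line.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
printf '%s\n' carrot apple banana apple carrot apple > "$tmpdir/new_fruits.txt"

# sort groups identical lines together so uniq -c can count them;
# sort -nr then orders by that count, largest first;
# awk only trims the padding uniq -c puts in front of each count.
top_two=$(sort "$tmpdir/new_fruits.txt" | uniq -c | sort -nr | head -n 2 | awk '{print $1, $2}')

# Number of distinct fruits: count the lines that uniq leaves behind.
distinct=$(sort "$tmpdir/new_fruits.txt" | uniq | wc -l | tr -d ' ')

echo "$top_two"
echo "distinct fruits: $distinct"
```

Here the two most frequent entries are apple (3) and carrot (2), and there are 3 distinct fruits.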