As I branch out to cover other connected topics, I still push myself to keep going with the MIT Missing Semester Lectures. Today I started watching Lecture 6 – Git. It is difficult to jump around so much from topic to topic, but I’m trying not to sit on one topic obsessively. I hope that by keeping myself accountable I can balance continued exposure to broad new information with reviewing specific syntax details on topics I am more familiar with. Hopefully this is a good strategy that benefits me in this next month.
TLDR;
Okay, so here are the highlights of what I did:
- I started watching the Lecture 6 YouTube video for the MIT Missing Semester Course
- I went through some more awk examples and learned about the conditional method of using regular expressions with the awk FS variable. It is very useful, but its syntax is a bit peculiar.
- I played with a personal expense file using sed. It was terrifying to run an in-place substitution with sed -i without knowing whether I could easily reverse the change. I guess this is why we should rarely use the -i flag. Without a backup there is no going back once we change the original LOL.
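One way to make sed -i less scary is to give the flag a backup suffix, so the original is saved before the edit. Here is a minimal sketch of that idea. The file name, its contents and the temp directory are made up for illustration (this is not my real expense file), and it assumes a sed that accepts a suffix attached directly to -i, which both GNU and BSD sed do as far as I know.

```shell
# Hypothetical expense file, created in a temp dir so nothing real is touched.
workdir=$(mktemp -d)
printf 'coffee,3.50\nlunch,12.00\n' > "$workdir/expenses.csv"

# -i.bak edits the file in place, but first saves the original as expenses.csv.bak
sed -i.bak 's/coffee/espresso/' "$workdir/expenses.csv"

edited=$(cat "$workdir/expenses.csv")        # the changed file
backup=$(cat "$workdir/expenses.csv.bak")    # the untouched original
echo "$edited"
echo "$backup"

# Undo the edit by moving the backup over the changed file.
mv "$workdir/expenses.csv.bak" "$workdir/expenses.csv"
restored=$(cat "$workdir/expenses.csv")
```

Running sed without -i first and just reading the output is another way to preview a substitution before committing to it.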
Notes on awk
The Field Separator Value (FS)
FS contains the field separator character which is used to divide fields on the input line. The default is “white space”, meaning space and tab characters. FS can be reassigned to another character (typically in BEGIN) to change the field separator.
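As a quick sketch of reassigning FS in BEGIN, here is a single-character separator applied to some made-up colon-separated input (the names and numbers are hypothetical, and the ages variable is only there to hold the result):

```shell
# Hypothetical name:age records; FS is set to ":" before any input is read.
ages=$(printf 'alice:30\nbob:25\n' | awk 'BEGIN { FS = ":" } { print $2 }')
echo "$ages"
# Outputs:
# 30
# 25
```

Setting FS inside BEGIN matters because it has to be in place before awk splits the first line.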
We can use regular expressions for our FS. That allows us to split on several different characters instead of just one. For example:
echo '{foo} bar=baz' | awk -F'[{}= ]+' '{print $3}'
# Outputs: bar
## Or
echo '{foo} bar=baz' | awk 'BEGIN {FS = "[{}= ]+"} {print $3}'
# Outputs: bar
Notice that the FS assignment does not use the explicit regexp markers /regexp/; the pattern is written as a quoted string. This is something to be aware of when using regular expressions with the FS variable.
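It was not obvious to me at first why bar ends up in $3 and not $2 in the example above. A small sketch that prints every field with its number makes it clearer: the line starts with a separator character, so the first field is empty. The fields variable is just a helper I added to hold the output.

```shell
# Print each field of the same input, wrapped in [] so empty fields are visible.
fields=$(echo '{foo} bar=baz' |
  awk 'BEGIN { FS = "[{}= ]+" } { for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')
echo "$fields"
# Outputs:
# $1=[]
# $2=[foo]
# $3=[bar]
# $4=[baz]
```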
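Additionally, if you want to use the regexp special characters that start with a backslash, you will need to write a double backslash \\<char> instead of a single backslash. The FS value is a string, and awk processes escape sequences in the string once before it is used as a regexp, so one backslash gets consumed along the way. This applies to some but not all (meaning I haven’t tested all of them yet to confirm).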
# This DOES NOT Work. The \s will be interpreted as a regular 's' within the double quotes ("").
BEGIN {FS = "\s{3,}"; sum = 0;}
# This Works
BEGIN {FS = "\\s{3,}"; sum = 0;}
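Here is a small runnable sketch of the same double-backslash rule. I swapped \s for an escaped literal dot, since as far as I know \s is a gawk extension and may not work in other awks; the input string and the second variable are made up for illustration.

```shell
# "\\.+" in the awk string becomes the regexp \.+ (one or more literal dots).
second=$(echo 'a...b..c' | awk 'BEGIN { FS = "\\.+" } { print $2 }')
echo "$second"
# Outputs: b
```

With only one backslash the dot would lose its escape and behave like the regexp "any character", which is not what we want here.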
Conclusion
That’s all for today. If you are interested in the MIT course you can check out the video lecture I’m currently going through. The lecture is helpful but isn’t sufficient by itself. Anyways, until next time PEACE!