Commit

updates for version 2.0
learnbyexample committed Aug 22, 2023
1 parent aec1868 commit 96b8442
Showing 40 changed files with 101,994 additions and 1,518 deletions.
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2020 Sundeep Agarwal
Copyright (c) 2023 Sundeep Agarwal

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
65 changes: 32 additions & 33 deletions README.md
@@ -1,40 +1,36 @@
# GNU AWK
# CLI text processing with GNU awk

Example based guide to mastering GNU awk.
Example based guide to mastering GNU awk. Visit https://youtu.be/KIa_EaYwGDI for a short video about the book.

<p align="center">
<img src="./images/gawk.png" width="320px" height="400px" />
</p>
<p align="center"><img src="./images/gawk_ls.png" alt="CLI text processing with GNU awk ebook cover image" /></p>

The book also includes exercises to test your understanding, which is presented together as a single file in this repo - [Exercises.md](./exercises/Exercises.md)
The book also includes exercises to test your understanding, which are presented together as a single file in this repo [Exercises.md](./exercises/Exercises.md).

For solutions to the exercises, see [Exercise_solutions.md](./exercises/Exercise_solutions.md).

You can also use [this interactive TUI app](https://github.com/learnbyexample/TUI-apps/blob/main/AwkExercises) to practice some of the exercises from the book.

See [Version_changes.md](./Version_changes.md) to keep track of changes made to the book.

<br>

# E-book

You can purchase the pdf/epub versions of the book using these links:

* https://learnbyexample.gumroad.com/l/gnu_awk
* https://leanpub.com/gnu_awk

You can also get the book as part of these bundles:

* **Magical one-liners** bundle from https://learnbyexample.gumroad.com/l/oneliners or https://leanpub.com/b/oneliners
* **Awesome Regex** bundle from https://learnbyexample.gumroad.com/l/regex or https://leanpub.com/b/regex
* **All books bundle** bundle from https://learnbyexample.gumroad.com/l/all-books
* Includes all my programming books

See https://learnbyexample.github.io/books/ for list of other books
* You can purchase the pdf/epub versions of the book using these links:
* https://learnbyexample.gumroad.com/l/gnu_awk
* https://leanpub.com/gnu_awk
* You can also get the book as part of these bundles:
* **All books bundle** bundle from https://learnbyexample.gumroad.com/l/all-books
* Includes all my programming books
* **Magical one-liners** bundle from https://learnbyexample.gumroad.com/l/oneliners or https://leanpub.com/b/oneliners
* **Awesome Regex** bundle from https://learnbyexample.gumroad.com/l/regex or https://leanpub.com/b/regex
* See https://learnbyexample.github.io/books/ for a list of other books

For a preview of the book, see [sample chapters](https://github.com/learnbyexample/learn_gnuawk/blob/master/sample_chapters/gnu_awk_sample.pdf)
For a preview of the book, see [sample chapters](./sample_chapters/gnu_awk_sample.pdf).

The book can also be [viewed as a single markdown file in this repo](./gnu_awk.md). See my blogpost on [generating pdf from markdown using pandoc](https://learnbyexample.github.io/tutorial/ebook-generation/customizing-pandoc/) if you are interested in the ebook creation process.
The book can also be [viewed as a single markdown file in this repo](./gnu_awk.md). See my blogpost on [generating pdfs from markdown using pandoc](https://learnbyexample.github.io/customizing-pandoc/) if you are interested in the ebook creation process.

For web version of the book, visit https://learnbyexample.github.io/learn_gnuawk/
For the web version of the book, visit https://learnbyexample.github.io/learn_gnuawk/

<br>

@@ -52,15 +48,17 @@ For web version of the book, visit https://learnbyexample.github.io/learn_gnuawk
<br>

# Feedback
# Feedback and Contributing

[Open an issue](https://github.com/learnbyexample/learn_gnuawk/issues) if you spot any typo/errors.
⚠️ ⚠️ Please DO NOT submit pull requests. Main reason being any modification requires changes in multiple places.

:warning: :warning: Please DO NOT submit pull requests. Main reason being any modification requires changes in multiple places.
I would highly appreciate it if you'd let me know how you felt about this book. It could be anything from a simple thank you, pointing out a typo, mistakes in code snippets, which aspects of the book worked for you (or didn't!) and so on. Reader feedback is essential and especially so for self-published authors.

I'd also highly appreciate your feedback about the book.
You can reach me via:

Twitter: https://twitter.com/learn_byexample
* Issue Manager: [https://github.com/learnbyexample/learn_gnuawk/issues](https://github.com/learnbyexample/learn_gnuawk/issues)
* E-mail: `echo 'bGVhcm5ieWV4YW1wbGUubmV0QGdtYWlsLmNvbQo=' | base64 --decode`
* Twitter: [https://twitter.com/learn_byexample](https://twitter.com/learn_byexample)

<br>

@@ -89,13 +87,14 @@ Twitter: https://twitter.com/learn_byexample
# Acknowledgements

* [GNU awk documentation](https://www.gnu.org/software/gawk/manual/) — manual and examples
* [stackoverflow](https://stackoverflow.com/) and [unix.stackexchange](https://unix.stackexchange.com/) — for getting answers to pertinent questions on `bash`, `awk` and other commands
* [stackoverflow](https://stackoverflow.com/) and [unix.stackexchange](https://unix.stackexchange.com/) — for getting answers to pertinent questions on `awk` and related commands
* [tex.stackexchange](https://tex.stackexchange.com/) — for help on [pandoc](https://github.com/jgm/pandoc/) and `tex` related questions
* [LibreOffice Draw](https://www.libreoffice.org/discover/draw/) — cover image
* [pngquant](https://pngquant.org/) and [svgcleaner](https://github.com/RazrFalcon/svgcleaner) for optimizing images
* [softwareengineering.stackexchange](https://softwareengineering.stackexchange.com/questions/39/whats-your-favourite-quote-about-programming) and [skolakoda](https://skolakoda.org/programming-quotes) for programming quotes
* [/r/commandline/](https://old.reddit.com/r/commandline), [/r/linux4noobs/](https://old.reddit.com/r/linux4noobs/), [/r/linuxquestions/](https://old.reddit.com/r/linuxquestions/) and [/r/linux/](https://old.reddit.com/r/linux/) — helpful forums
* [canva](https://www.canva.com/) — cover image
* [oxipng](https://github.com/shssoichiro/oxipng), [pngquant](https://pngquant.org/) and [svgcleaner](https://github.com/RazrFalcon/svgcleaner) — optimizing images
* [Warning](https://commons.wikimedia.org/wiki/File:Warning_icon.svg) and [Info](https://commons.wikimedia.org/wiki/File:Info_icon_002.svg) icons by [Amada44](https://commons.wikimedia.org/wiki/User:Amada44) under public domain
* [arifmahmudrana](https://github.com/arifmahmudrana) for spotting an ambiguous explanation
* [Pound-Hash](https://github.com/Pound-Hash) for critical feedback
* [mdBook](https://github.com/rust-lang/mdBook) — for web version of the book
* [mdBook-pagetoc](https://github.com/JorelAli/mdBook-pagetoc) — for adding table of contents for each chapter
* [minify-html](https://github.com/wilsonzlin/minify-html) — for minifying html files
@@ -106,7 +105,7 @@ Special thanks to all my friends and online acquaintances for their help, suppor

# License

The book is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/)
The book is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/).

The code snippets are licensed under MIT, see [LICENSE](./LICENSE) file
The code snippets are licensed under MIT, see [LICENSE](./LICENSE) file.

14 changes: 14 additions & 0 deletions Version_changes.md
@@ -1,5 +1,19 @@
<br>

### 2.0

* Command version updated to **GNU awk 5.2.2**
* Many more exercises added, and you can practice some of them using this [interactive TUI app](https://github.com/learnbyexample/TUI-apps/blob/main/AwkExercises)
* Long sections split into smaller ones
* In general, many of the examples, exercises, solutions, descriptions and external links were updated/corrected
* Updated Acknowledgements section
* Code snippets related to info/warning sections will now appear as a single block
* Book title changed to **CLI text processing with GNU awk**
* New cover image
* Images centered for EPUB format

<br>

### 1.4

* Added example for `NF` value when input line doesn't contain the input field separator or if it is empty.
8 changes: 4 additions & 4 deletions code_snippets/Builtin_functions.sh
@@ -74,17 +74,17 @@ echo 'abcdefghij' | awk -v FS= '{print $3, $5}'

## match

s='051 035 154 12 26 98234'
s='051 035 154 12 26 98234 3'

echo "$s" | awk 'match($0, /[0-9]{4,}/){print substr($0, RSTART, RLENGTH)}'

echo "$s" | awk 'match($0, /0*[1-9][0-9]{2,}/, m){print m[0]}'

echo 'foo=42, baz=314' | awk 'match($0, /baz=([0-9]+)/, m){print m[0]}'
echo 'apple=42, fig=314' | awk 'match($0, /fig=([0-9]+)/, m){print m[0]}'

echo 'foo=42, baz=314' | awk 'match($0, /baz=([0-9]+)/, m){print m[1]}'
echo 'apple=42, fig=314' | awk 'match($0, /fig=([0-9]+)/, m){print m[1]}'

s='42 foo-5, baz3; x-83, y-20: f12'
s='42 apple-5, fig3; x-83, y-20: f12'

echo "$s" | awk '{ while( match($0, /([0-9]+),/, m) ){print m[1];
$0=substr($0, RSTART+RLENGTH)} }'
4 changes: 2 additions & 2 deletions code_snippets/Control_Structures.sh
@@ -28,15 +28,15 @@ echo 'titillate' | awk '{do{print} while(gsub(/til/, ""))}'

## next

awk '/\<par/{print "%% " $0; next} {print /s/ ? "X" : "Y"}' word_anchors.txt
awk '/\<par/{print "%% " $0; next} {print /s/ ? "X" : "Y"}' anchors.txt

## exit

seq 3542 4623452 | awk 'NR==2452{print; exit}'

echo $?

awk '/^br/{print "Invalid input"; exit 1}' table.txt
awk '/^br/{print "invalid data"; exit 1}' table.txt

echo $?
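
The `exit` commands above can be tried without `table.txt`. The following is a minimal sketch, assuming two made-up input lines in place of the book's sample file; the variable names `out` and `status` are only for illustration. It captures both the message and the non-zero exit status that `exit 1` produces.

```bash
# Hypothetical input lines stand in for table.txt.
# awk stops at the first line starting with "br" and returns status 1.
out=$(printf 'brown bread\nblue cake\n' | awk '/^br/{print "invalid data"; exit 1}') || status=$?
echo "$out"
echo "exit status: ${status:-0}"
```

Because `exit` ends processing immediately, the second input line is never examined.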

16 changes: 14 additions & 2 deletions code_snippets/Field_separators.sh
@@ -66,7 +66,7 @@ echo ' a b c ' | awk -F' ' '{print NF}'

echo ' a b c ' | awk -F'[ ]' '{print NF}'

echo 'RECONSTRUCTED' | awk -F'[aeiou]+' -v IGNORECASE=1 '{print $1}'
echo 'RECONSTRUCTED' | awk -F'[aeiou]+' -v IGNORECASE=1 '{print $NF}'

echo 'RECONSTRUCTED' | awk -F'e' -v IGNORECASE=1 '{print $1}'

@@ -92,6 +92,12 @@ echo 'Sample123string42with777numbers' | awk -F'[0-9]+' -v OFS=, '1'

echo 'Sample123string42with777numbers' | awk -F'[0-9]+' -v OFS=, '{$1=$1} 1'

echo -v{,O}FS=:

echo 'goal:amazing:whistle:kwality' | awk -v{,O}FS=: '{$2 = 42} 1'

echo 'goal:amazing:whistle:kwality' | awk '{$2 = 42} 1' {,O}FS=:
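
The `-v{,O}FS=:` form above relies on Bash brace expansion, which turns it into the two arguments `-vFS=: -vOFS=:`. The following is a minimal sketch of the spelled-out equivalent, for shells where brace expansion may not be available; the variable name `out` is only for illustration.

```bash
# Spelled-out form of -v{,O}FS=: which sets both the input and output field separators.
# Assigning to $2 rebuilds the record, so the fields are joined with OFS.
out=$(echo 'goal:amazing:whistle:kwality' | awk -v FS=: -v OFS=: '{$2 = 42} 1')
echo "$out"
```

Without the `OFS` assignment, the rebuilt record would be joined with the default single space instead of `:`.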

## Manipulating NF

echo 'goal:amazing:whistle:kwality' | awk -F: -v OFS=, '{NF=2} 1'
@@ -108,7 +114,13 @@ s='Sample123string42with777numbers'

echo "$s" | awk -v FPAT='[0-9]+' '{print $2}'

echo "$s" | awk -v FPAT='[a-zA-Z]+' -v OFS=, '{$1=$1} 1'
s='coat Bin food tar12 best Apple fig_42'

echo "$s" | awk -v FPAT='\\<[a-z0-9]+\\>' -v OFS=, '{$1=$1} 1'

s='items: "apple" and "mango"'

echo "$s" | awk -v FPAT='"[^"]+"' '{print $1}'

s='eagle,"fox,42",bee,frog'

20 changes: 11 additions & 9 deletions code_snippets/Gotchas_and_Tips.sh
@@ -6,7 +6,7 @@ awk -v word="cake" '$2==word' table.txt

awk -v field=2 '{print $field}' table.txt

## Dos style line endings
## DOS style line endings

printf 'mat dog\n123 789\n' | awk '{print $2, $1}'

@@ -36,7 +36,7 @@ echo 'hi log_42 12b' | awk '{gsub(/\</, ":")} 1'

echo 'hi log_42 12b' | awk '{gsub(/\>/, ":")} 1'

## Relying on default initial value
## Relying on the default initial value

awk '{sum += $NF} END{print sum}' table.txt

@@ -46,7 +46,7 @@ awk '{sum += $NF} ENDFILE{print FILENAME ":" sum}' table.txt marks.txt

awk '{sum += $NF} ENDFILE{print FILENAME ":" sum; sum=0}' table.txt marks.txt

## Code in replacement section
## Code in the replacement section

awk '{sub(/^(br|ye)/, ++c ") &")} 1' table.txt

@@ -64,6 +64,8 @@ awk '{sum += $1} END{print sum}' /dev/null

awk '{sum += $1} END{print +sum}' /dev/null
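
The two `/dev/null` commands above differ only in the unary `+`. The following is a minimal sketch of that difference, with the variable names `empty` and `zero` chosen only for illustration. An uninitialized variable prints as an empty string, while `+sum` forces numeric context and prints `0`.

```bash
# No input records are read, so sum is never assigned.
empty=$(awk '{sum += $1} END{print sum}' /dev/null)    # empty string
zero=$(awk '{sum += $1} END{print +sum}' /dev/null)    # numeric 0
printf 'without unary plus: [%s]\nwith unary plus: [%s]\n' "$empty" "$zero"
```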

## Locale based numbers

echo '3.14' | awk '{$0++} 1'

echo '3,14' | awk '{$0++} 1'
@@ -90,21 +92,21 @@ awk 'NF>2{print $(NF-2)}' varying.txt

## Faster execution

time awk '/^([a-d][r-z]){3}$/' /usr/share/dict/words > f1
time awk '/^([a-d][r-z]){3}$/' words.txt > f1

time LC_ALL=C awk '/^([a-d][r-z]){3}$/' /usr/share/dict/words > f2
time LC_ALL=C awk '/^([a-d][r-z]){3}$/' words.txt > f2

time mawk '/^[a-d][r-z][a-d][r-z][a-d][r-z]$/' /usr/share/dict/words > f3
time mawk '/^[a-d][r-z][a-d][r-z][a-d][r-z]$/' words.txt > f3

diff -s f1 f2

diff -s f2 f3

rm f[123]

time awk -F'a' 'NF==4{cnt++} END{print +cnt}' /usr/share/dict/words
time awk -F'a' 'NF==4{cnt++} END{print +cnt}' words.txt

time LC_ALL=C awk -F'a' 'NF==4{cnt++} END{print +cnt}' /usr/share/dict/words
time LC_ALL=C awk -F'a' 'NF==4{cnt++} END{print +cnt}' words.txt

time mawk -F'a' 'NF==4{cnt++} END{print +cnt}' /usr/share/dict/words
time mawk -F'a' 'NF==4{cnt++} END{print +cnt}' words.txt

8 changes: 3 additions & 5 deletions code_snippets/Installation_and_Documentation.sh
@@ -1,19 +1,17 @@
## Installation

wget https://ftp.gnu.org/gnu/gawk/gawk-5.1.0.tar.xz
wget https://ftp.gnu.org/gnu/gawk/gawk-5.2.2.tar.xz

tar -Jxf gawk-5.1.0.tar.xz
tar -Jxf gawk-5.2.2.tar.xz

cd gawk-5.1.0/
cd gawk-5.2.2/

./configure

make

sudo make install

type -a awk

awk --version | head -n1

## Documentation
6 changes: 3 additions & 3 deletions code_snippets/Processing_multiple_records.sh
@@ -1,10 +1,10 @@
## Processing consecutive records

awk 'p ~ /as/ && /not/{print p ORS $0} {p=$0}' programming_quotes.txt
awk 'p ~ /he/ && /you/{print p ORS $0} {p=$0}' para.txt

awk 'p ~ /as/ && /not/{print p} {p=$0}' programming_quotes.txt
awk 'p ~ /he/ && /you/{print p} {p=$0}' para.txt

awk 'p ~ /as/ && /not/; {p=$0}' programming_quotes.txt
awk 'p ~ /he/ && /you/; {p=$0}' para.txt
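
The consecutive-record commands above keep the previous line in `p` and test it together with the current line. The following is a minimal sketch of the same idiom, assuming three made-up input lines in place of `para.txt`; the variable name `out` is only for illustration.

```bash
# Hypothetical input stands in for para.txt.
# A pair is printed only when the previous line matches "he" AND the current line matches "you".
out=$(printf 'hello there\nare you ok\nbye\n' |
      awk 'p ~ /he/ && /you/{print p ORS $0} {p=$0}')
echo "$out"
```

The `{p=$0}` block runs for every line, so `p` always holds the line just before the one being tested.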

## Context matching

14 changes: 7 additions & 7 deletions code_snippets/Record_separators.sh
@@ -1,8 +1,8 @@
## Input record separator

printf 'this,is\na,sample' | awk -v RS=, '{print NR ")", $0}'
printf 'this,is\na,sample,text' | awk -v RS=, '{print NR ")", $0}'

s=' a\t\tb:1000\n\n\n\n123 7777:x y \n \n z '
s=' a\t\tb:1000\n\n\t \n\n123 7777:x y \n \n z :apple banana cherry'

printf '%b' "$s" | awk -v RS=: -v OFS=, '{$1=$1} 1'

@@ -18,7 +18,7 @@ awk -v IGNORECASE=1 -v RS='[e]' 'NR==1' report.log

## Output record separator

printf 'foo\0bar\0' | awk -v RS='\0' -v ORS='.\n' '1'
printf 'apple\0banana\0cherry\0' | awk -v RS='\0' -v ORS='.\n' '1'

cat msg.txt

@@ -44,11 +44,11 @@ echo 'Sample123string42with777numbers' | awk -v RS='[0-9]+' '{print NR, RT}'

## Paragraph mode

cat programming_quotes.txt
cat para.txt

awk -v RS= -v ORS='\n\n' '/you/' programming_quotes.txt
awk -v RS= -v ORS='\n\n' '/do/' para.txt

awk -v RS= '/you/{print s $0; s="\n"}' programming_quotes.txt
awk -v RS= '/do/{print s $0; s="\n"}' para.txt
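
The `print s $0; s="\n"}` form above separates matching paragraphs with a blank line without leaving one after the last paragraph. The following is a minimal sketch, assuming made-up three-paragraph input in place of `para.txt`; the variable name `out` is only for illustration.

```bash
# Hypothetical input with three paragraphs; the filter keeps the last two.
# s is empty for the first match, then a newline for every later match.
out=$(printf 'a\nb\n\n12\n34\n\nhi\nhello\n' |
      awk -v RS= '/[0-9]|hi/{print s $0; s="\n"}')
echo "$out"
```

With `-v ORS='\n\n'` instead, every matching paragraph, including the last, would be followed by an empty line.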

s='\n\n\na\nb\n\n12\n34\n\nhi\nhello\n'

@@ -66,7 +66,7 @@ s='a:b\nc:d\n\n1\n2\n3'

printf '%b' "$s" | awk -F: -v RS= -v ORS='\n---\n' '{$1=$1} 1'

printf '%b' "$s" | awk -F':+' -v RS= -v ORS='\n---\n' '{$1=$1} 1'
printf '%b' "$s" | awk -F'[:]' -v RS= -v ORS='\n---\n' '{$1=$1} 1'

printf '%b' "$s" | awk -F: -v RS='\n\n+' -v ORS='\n---\n' '{$1=$1} 1'
