
Awk is a popular utility for data extraction, text processing, and creating formatted reports.

It is similar to sed but more powerful: sed is limited to stream editing, while awk is a full programming language with variables, conditions, and loops.

The name AWK has no meaning of its own; it is formed from the surname initials of its developers: Alfred Aho, Peter J. Weinberger, and Brian Kernighan.

Here at LinuxAPT, as part of our Server Management Services, we regularly help our customers with Linux terminal related queries.

In this context, we shall look into some useful awk commands that every Linux user should know.

Here, we have created the following data set in people.txt as an example.

The data set has 4 columns: the first field contains the first name, the second field the last name, the third field the age, and the last one the class:

$ cat people.txt
Mike Hamby 19 10
John Fray 14 6
Ellen Hoy 18 8
Robbie Shinn 15 6
Jake Mickel 18 9


How to Print Specific Fields Using Variables

Awk has many built-in variables, each with its own purpose. The field variables $x, where x is the position of the field in the line, let us print specific fields, while $0 refers to the whole line:

$ awk '{print $1, $2}' people.txt
Mike Hamby
John Fray
Ellen Hoy
Robbie Shinn
Jake Mickel
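
Field variables can also be reordered and combined with text. The following is a minimal sketch, not part of the original data preparation: it feeds the same five records to awk through a shell variable (people and result are names chosen for this illustration) and uses $NF, which refers to the last field of each line.

```shell
# Sample records copied from people.txt (first name, last name, age, class).
people='Mike Hamby 19 10
John Fray 14 6
Ellen Hoy 18 8
Robbie Shinn 15 6
Jake Mickel 18 9'

# $2 and $1 swap the name order; $NF is the last field of the line (the class).
result=$(printf '%s\n' "$people" | awk '{print $2 ", " $1, "class " $NF}')
echo "$result"
```

The first line of the output should read "Hamby, Mike class 10". A comma between print arguments inserts a space, while items placed side by side are joined directly.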


BEGIN Pattern

BEGIN is a special pattern rather than a variable. Its action runs once, before awk reads any input, so it is commonly used to add a header or title to the resulting data.

It helps in labelling the output while formatting data tables.

In the following example, we print some text as a heading and then print all the first names:

$ awk 'BEGIN {print "People : "} {print $1}' people.txt
People :
Mike 
John
Ellen
Robbie
Jake
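
A BEGIN block is also a convenient place for a column header. The sketch below is one possible way to do it, assuming fixed column widths of 8 characters (an arbitrary choice for this illustration); it uses printf, which, unlike print, does not add a newline by itself.

```shell
# Sample records copied from people.txt (first name, last name, age, class).
people='Mike Hamby 19 10
John Fray 14 6
Ellen Hoy 18 8
Robbie Shinn 15 6
Jake Mickel 18 9'

result=$(printf '%s\n' "$people" | awk '
BEGIN { printf "%-8s %-8s %s\n", "First", "Last", "Age" }  # runs once, before any input
      { printf "%-8s %-8s %d\n", $1, $2, $3 }               # runs once per line
')
echo "$result"
```

The header line is printed exactly once, followed by one aligned row per person.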


END Pattern

END is the opposite of BEGIN: its action runs once, after all the input has been processed. It can be used for the final reporting of the data set.

In the following example, we print every person's age and then print a closing message:

$ awk '{print $3}
END {
print "These are people's age"
} ' people.txt

In our case, you will see:

$ awk '{print $3}
> END {
> print "These are people's age"
> } ' people.txt
19
14
18
15
18
These are people's age
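
END is typically where totals and averages are reported. As a small sketch under the same sample data (the variable names people, result, and total are chosen for this illustration), the following adds up the age column on every line and prints the average once at the end:

```shell
# Sample records copied from people.txt (first name, last name, age, class).
people='Mike Hamby 19 10
John Fray 14 6
Ellen Hoy 18 8
Robbie Shinn 15 6
Jake Mickel 18 9'

result=$(printf '%s\n' "$people" | awk '
{ total += $3 }                      # runs for every line: add up the age column
END {
    if (NR > 0)                      # avoid dividing by zero on empty input
        printf "Average age: %.1f\n", total / NR
}')
echo "$result"   # Average age: 16.8
```

Inside END, NR still holds the number of lines that were read, which is why it can serve as the divisor.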


Field Separator

Spaces and tabs are the default field separators of the awk command; however, we can split text on other separators such as a comma, colon, or slash.

To achieve this, add the -F flag to the command and provide the separator in single quotation marks:

$ awk -F':' '{print $1}' /etc/passwd
root
daemon
bin
sys
sync
games
man
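
The separator can also be set inside the program through the built-in FS variable. The sketch below uses a few comma-separated records made up for this illustration (they are not from the original article) and shows that both forms give the same result:

```shell
# Hypothetical comma-separated records, invented for this example.
csv='Mike,Hamby,19
John,Fray,14'

# Separator given on the command line with -F.
with_flag=$(printf '%s\n' "$csv" | awk -F',' '{print $2, $3}')

# Same separator assigned to FS in a BEGIN block, before any input is read.
with_fs=$(printf '%s\n' "$csv" | awk 'BEGIN {FS = ","} {print $2, $3}')

echo "$with_flag"
echo "$with_fs"
```

Both commands should print "Hamby 19" and "Fray 14". Setting FS in BEGIN is handy when the program lives in a script file.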


How to Run a Script From a File?

We can also execute an awk script from a file, which gives us the flexibility to build reports efficiently. For this, create the file, write the script in it, and execute it using the awk command. For the demo, you can create a file named demo_script and copy-paste the following script:

$ vi demo_script
{
    sum += $3    # add the age (third field) of every line
}
END {
    print("Sum of all people's age is", sum)
}

The awk command provides a -f flag for executing the script from the file:

$ awk -f demo_script people.txt
Sum of all people's age is 84
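
A script file can also receive values from the command line with the -v flag. The following is a sketch only: the file name older.awk and the variable name limit are hypothetical, and the files are written to a temporary directory so that nothing existing is overwritten.

```shell
workdir=$(mktemp -d)

# Recreate the sample data from people.txt inside the temporary directory.
cat > "$workdir/people.txt" <<'EOF'
Mike Hamby 19 10
John Fray 14 6
Ellen Hoy 18 8
Robbie Shinn 15 6
Jake Mickel 18 9
EOF

# older.awk (hypothetical name): count people whose age is at least "limit".
cat > "$workdir/older.awk" <<'EOF'
$3 >= limit { count++ }
END { print count + 0, "people are", limit, "or older" }
EOF

# -v assigns the awk variable before the script starts; -f reads the script file.
result=$(awk -v limit=16 -f "$workdir/older.awk" "$workdir/people.txt")
echo "$result"   # 3 people are 16 or older

rm -r "$workdir"
```

Writing count + 0 makes awk print 0 instead of an empty string when no line matches.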


How to Use Multiple Statements?

We can execute multiple statements in one action by separating them with semicolons. In the following example, we pipe some text to awk, replace the third field, and print the modified line:

$ echo "Hello, Dr. John" | awk '{$3="George"; print $0}'
Hello, Dr. George
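
One detail worth knowing when a field is modified: awk rebuilds the line using the output field separator OFS, which is a single space by default. The short sketch below (the input string a:b:c is made up for this illustration) shows the effect with a colon-separated line:

```shell
# Assigning to a field rebuilds $0 with OFS (a space by default),
# so the colons are lost unless OFS is set to match FS.
default_ofs=$(echo "a:b:c" | awk -F':' '{$2 = "X"; print $0}')
matched_ofs=$(echo "a:b:c" | awk 'BEGIN {FS = OFS = ":"} {$2 = "X"; print $0}')

echo "$default_ofs"   # a X c
echo "$matched_ofs"   # a:X:c
```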


How to Count the Number of Lines?

We can number the lines of a report using NR, an awk built-in variable that holds the number of the line currently being processed:

$ awk '{print NR "\t" $0}' people.txt
1 Mike Hamby 19 10
2 John Fray 14 6
3 Ellen Hoy 18 8
4 Robbie Shinn 15 6
5 Jake Mickel 18 9
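
Since NR keeps its final value in the END block, it can also report the total number of lines, and it can be used as a condition to select a range of lines. A minimal sketch with the same sample records (count and middle are names chosen for this illustration):

```shell
# Sample records copied from people.txt (first name, last name, age, class).
people='Mike Hamby 19 10
John Fray 14 6
Ellen Hoy 18 8
Robbie Shinn 15 6
Jake Mickel 18 9'

# In END, NR equals the total number of lines read.
count=$(printf '%s\n' "$people" | awk 'END {print NR}')

# A condition on NR selects only lines 2 and 3.
middle=$(printf '%s\n' "$people" | awk 'NR >= 2 && NR <= 3 {print NR ": " $1}')

echo "$count"    # 5
echo "$middle"
```

The second command should print "2: John" and "3: Ellen".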


How to Count the Number of Fields?

Sometimes, while preparing the data, we forget to add data in a specific column, which may lead to irregularities in the report.

We can count the fields of each line using the NF variable, which makes it easier to review and arrange the reports:

$ awk '{print NR".",$0 "\n Count=" NF}' people.txt
1. Mike Hamby 19 10
 Count=4
2. John Fray 14 6
 Count=4
3. Ellen Hoy 18 8
 Count=4
4. Robbie Shinn 15 6
 Count=4
5. Jake Mickel 18 9
 Count=4
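
To find incomplete records directly, NF can be used as a condition so that only the problem lines are printed. The input below is hypothetical: it is a shortened copy of the sample data in which the second record has deliberately lost its class column.

```shell
# Hypothetical input: the second record is missing its fourth field.
data='Mike Hamby 19 10
John Fray 14
Ellen Hoy 18 8'

# Print only the lines that do not have exactly 4 fields.
result=$(printf '%s\n' "$data" | awk 'NF != 4 {print "line " NR " has " NF " fields: " $0}')
echo "$result"   # line 2 has 3 fields: John Fray 14
```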


If Condition

We can use an if condition to prepare a conditional report. In the following example, we print every person whose age is below 16:

$ awk '
BEGIN{
print "People whose age are under 16 are:"
}
{
if($3<16){
print $1
}
}' people.txt
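
With the sample data, the command above should print the header followed by John and Robbie. The same test can also be written as a pattern in front of the action, and an else branch can label the remaining lines. A short sketch (under16 and labels are names chosen for this illustration):

```shell
# Sample records copied from people.txt (first name, last name, age, class).
people='Mike Hamby 19 10
John Fray 14 6
Ellen Hoy 18 8
Robbie Shinn 15 6
Jake Mickel 18 9'

# Pattern form: the action runs only on lines where the condition is true.
under16=$(printf '%s\n' "$people" | awk '$3 < 16 {print $1}')

# if/else form: every line is labelled one way or the other.
labels=$(printf '%s\n' "$people" |
    awk '{ if ($3 < 16) print $1, "under 16"; else print $1, "16 or over" }')

echo "$under16"
echo "$labels"
```

The pattern form prints John and Robbie; the if/else form prints one labelled line per person, starting with "Mike 16 or over".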


For Loop

In the following example, we use a for loop to print 5 random numbers in succession. To generate them we use rand(), a built-in awk function.

This function returns a decimal number from 0 up to (but not including) 1, so we multiply it by 100 and truncate the result with int() to get random integers from 0 to 99:

$ awk 'BEGIN {
for (i = 1; i <= 5; i++){
print int(100 * rand())
}
}'
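
Without a seed, rand() typically produces the same sequence on every run. As a sketch of one common variation, the version below calls srand() first, which seeds the generator from the current time, and adds 1 so that the numbers fall in the range 1 to 100:

```shell
result=$(awk 'BEGIN {
    srand()                             # seed from the current time of day
    for (i = 1; i <= 5; i++) {
        print int(100 * rand()) + 1     # integer from 1 to 100
    }
}')
echo "$result"
```

Because the seed is based on the time in seconds, two runs within the same second may still print the same numbers.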
