When a line of text contains numeric or string values, comparison operators give awk users a handy way to filter it.
In this part of the Awk series, we shall look at how to filter text or strings using comparison operators.
If you are a programmer, you are probably already familiar with comparison operators; for those who are not, the next section explains them.
What are Comparison operators in Awk?
Before diving into how to use comparison operators with Awk, let’s first understand what comparison operators are.
Comparison operators are symbols or keywords used to compare values in programming languages.
In Awk, comparison operators are used to compare numbers or strings, and they include the following:
- > – greater than
- < – less than
- >= – greater than or equal to
- <= – less than or equal to
- == – equal to
- != – not equal to
- some_value ~ /pattern/ – true if some_value matches the pattern
- some_value !~ /pattern/ – true if some_value does not match the pattern
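As a quick illustration of the equality and matching operators, here is a minimal sketch. The three sample lines and the shell variable names (data, equal, match, nomatch) are made up for this example and are not part of the tutorial's file.

```shell
# Hypothetical sample input: an item name and a quantity per line.
data='Mangoes 45
Apples 25
Onions 15'

# == tests equality: keep the line whose first field is exactly "Apples".
equal=$(printf '%s\n' "$data" | awk '$1 == "Apples"')

# ~ tests a regular expression match: first field ends in "es".
match=$(printf '%s\n' "$data" | awk '$1 ~ /es$/ { print $1 }')

# !~ negates the match: first field does not end in "es".
nomatch=$(printf '%s\n' "$data" | awk '$1 !~ /es$/ { print $1 }')

printf 'equal:   %s\n' "$equal"
printf 'match:   %s\n' "$match"
printf 'nomatch: %s\n' "$nomatch"
```

When a rule has a condition but no action, as in the first command, awk prints the whole matching line.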
Now that we have looked at the various comparison operators in Awk, let us understand them better using an example.
Filtering Data with Awk
In this example, we have a file named food_list.txt, a shopping list of food items, and I would like to flag the items whose quantity is less than or equal to 20 by appending (**) to the end of each line.
No Item_Name Quantity Price
1 Mangoes 45 $3.45
2 Apples 25 $2.45
3 Pineapples 5 $4.45
4 Tomatoes 25 $3.45
5 Onions 15 $1.45
6 Bananas 30 $3.45
The general syntax for using comparison operators in Awk is:
expression { actions; }
To achieve the above goal, I will have to run the command below:
awk '$3 <= 20 {print $0 " (**)" } $3 > 20 {print $0}' food_list.txt
Here is the explanation of the command:
- awk – This command invokes the Awk text processing utility.
- '$3 <= 20 {print $0 " (**)" } – This part of the command is a condition followed by an action. It checks whether the value in the third column (Quantity) of each line is less than or equal to 20. If the condition is true, it prints the entire line ($0) with " (**)" appended to it.
- $3 > 20 {print $0} – This part of the command is another condition followed by an action. It checks if the value in the third column (Quantity) of each line is greater than 20. If the condition is true, it prints the entire line ($0) without any modifications.
- food_list.txt – This is the input file that the Awk command will process. It contains the data on which the conditions and actions specified in the command will be applied.
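The header row has the word Quantity in its third column, so comparing it against 20 is a string comparison rather than a numeric one. The sketch below is one possible variation, not the command used in this article: it passes the header through explicitly with the built-in NR variable and uses next so that each line is handled by exactly one rule. The temporary directory and the shell variable names are assumptions made for the example.

```shell
# Recreate the sample file in a temporary directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/food_list.txt" <<'EOF'
No Item_Name Quantity Price
1 Mangoes 45 $3.45
2 Apples 25 $2.45
3 Pineapples 5 $4.45
4 Tomatoes 25 $3.45
5 Onions 15 $1.45
6 Bananas 30 $3.45
EOF

flagged=$(awk '
    NR == 1  { print; next }             # header row: print unchanged, skip the tests
    $3 <= 20 { print $0 " (**)"; next }  # low quantity: append the flag
             { print }                   # every other row: print unchanged
' "$workdir/food_list.txt")

printf '%s\n' "$flagged"
rm -r "$workdir"
```

With this layout only the Pineapples and Onions rows should carry the (**) flag.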
Another example marks lines where the quantity is less than or equal to 20 by appending the word TRUE, separated by a tab, to the end of the line.
awk '$3 <= 20 { printf "%s\t%s\n", $0,"TRUE" ; } $3 > 20 { print $0 ;} ' food_list.txt
Combining Operators in Awk
We can also combine multiple comparison operators to create more complex conditions. For example, if we want to select the food items whose quantity is between 20 and 50, inclusive, we can use the logical AND operator (&&) as shown.
awk '$3 >= 20 && $3 <= 50' food_list.txt
The above command prints the lines where the quantity (third column) is at least 20 and at most 50. Because no action is given, awk applies its default action and prints each matching line.
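Conditions can be combined with the logical OR operator (||) in the same way. The following sketch selects the rows whose quantity falls outside a range; the thresholds 20 and 40, the NR > 1 guard that skips the header, and the temporary file location are all choices made for this illustration.

```shell
# Recreate the sample file in a temporary directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/food_list.txt" <<'EOF'
No Item_Name Quantity Price
1 Mangoes 45 $3.45
2 Apples 25 $2.45
3 Pineapples 5 $4.45
4 Tomatoes 25 $3.45
5 Onions 15 $1.45
6 Bananas 30 $3.45
EOF

# Skip the header (NR > 1), then keep rows with quantity below 20 OR above 40.
outside=$(awk 'NR > 1 && ($3 < 20 || $3 > 40)' "$workdir/food_list.txt")

printf '%s\n' "$outside"
rm -r "$workdir"
```

The parentheses around the || test make the grouping explicit, so the NR > 1 guard applies to both comparisons.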
Summary
This is an introductory tutorial to comparison operators in Awk, so try out other operators and conditions on your own data to discover more.
For those seeking a comprehensive resource, we’ve compiled all the Awk series articles into a book that includes 13 chapters and spans 41 pages, covering both basic and advanced Awk usage with practical examples.
If you run into any problems or have additions in mind, drop a comment in the comment section below. Remember to read the next part of the Awk series, where I will take you through compound expressions.
Forward read.
Filename SRR11910146_1.fastq
Total Sequences 705425
%GC 46
PASS Adapter Content SRR11910146_1.fastq
Reverse read.
Filename SRR11910146_2.fastq
Total Sequences 705425
%GC 46
PASS Adapter Content SRR11910146_2.fastq
I have a file that contains this. If I want to check each row separately, how can I do it? For example, how can I check whether %GC is greater than 50?
Hi,
Please refer: https://unix.stackexchange.com/questions/607973/compare-2-percentage-value-using-if-statement
Just grep for %GC, take the second field value and compare it according to your needs, as per this article.
Regards,
Nathan SR
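A rough sketch of that suggestion done in awk alone, without grep, might look like the following. It assumes the fields are separated by whitespace and that the report is saved as fastqc_summary.txt (a hypothetical file name); the threshold of 50 comes from the question above.

```shell
# Recreate the report from the question in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/fastqc_summary.txt" <<'EOF'
Forward read.
Filename SRR11910146_1.fastq
Total Sequences 705425
%GC 46
PASS Adapter Content SRR11910146_1.fastq
reverse read
Filename SRR11910146_2.fastq
Total Sequences 705425
%GC 46
PASS Adapter Content SRR11910146_2.fastq
EOF

gc_report=$(awk '
    $1 == "Filename" { name = $2 }   # remember which read file this block describes
    $1 == "%GC" {
        # $2 looks like a number, so this is a numeric comparison
        verdict = ($2 > 50) ? "above 50" : "not above 50"
        print name, "%GC", $2, verdict
    }
' "$workdir/fastqc_summary.txt")

printf '%s\n' "$gc_report"
rm -r "$workdir"
```

Each %GC row is reported together with the most recent Filename value, so the forward and reverse reads are checked separately.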
Apparently the script is not smart enough to validate whether $3 is a number or a character.
@Dan
Sure, you can modify it for that purpose and share it with us.
Here is my attempt at a script. The output is good, but I feel it is a bit redundant, particularly checking ($3 ~ /^[0-9]/) twice. Please simplify :)
$ awk '($3 ~ "^[a-zA-Z]") { print $0 } (($3 ~ /^[0-9]/) && ($3 <= 30)) { print $0, "<-- quantity is less than or equal to 30" ;} (($3 ~ /^[0-9]/) && ($3 > 30)) { print $0, "<-- quantity is greater than 30" ;}' food_list.txt
@Dan
Thanks for sharing this, we will take some time to analyze it.
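One possible way to avoid testing the numeric pattern twice is to deal with non-numeric values first and then use a conditional expression for the rest. This is only a sketch: the anchored pattern /^[0-9]+$/, the note variable, the wording of the "less than or equal to" message, and the temporary file location are assumptions made for the example.

```shell
# Recreate the sample file in a temporary directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/food_list.txt" <<'EOF'
No Item_Name Quantity Price
1 Mangoes 45 $3.45
2 Apples 25 $2.45
3 Pineapples 5 $4.45
4 Tomatoes 25 $3.45
5 Onions 15 $1.45
6 Bananas 30 $3.45
EOF

annotated=$(awk '
    $3 !~ /^[0-9]+$/ { print; next }   # third field is not a whole number: print as is
    {
        # only numeric rows reach this rule, so one comparison is enough
        note = ($3 <= 30) ? "<-- quantity is less than or equal to 30" : "<-- quantity is greater than 30"
        print $0, note
    }
' "$workdir/food_list.txt")

printf '%s\n' "$annotated"
rm -r "$workdir"
```

Because the first rule ends with next, the second rule never sees the header, so the numeric check is written only once.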
What is the difference between using these expressions directly and using them inside an if statement? What difference does it make?
@upkar
It depends on what you want to achieve, but you can use an if statement for conditional execution.
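To make the comparison concrete, the sketch below writes the flagging command from the article in both styles, once with the conditions as rule patterns and once with an if statement inside a single action. For this data the two should produce identical output. The shell variable names and the temporary file location are assumptions made for the example.

```shell
# Recreate the sample file in a temporary directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/food_list.txt" <<'EOF'
No Item_Name Quantity Price
1 Mangoes 45 $3.45
2 Apples 25 $2.45
3 Pineapples 5 $4.45
4 Tomatoes 25 $3.45
5 Onions 15 $1.45
6 Bananas 30 $3.45
EOF

# Style 1: each condition is the pattern of its own rule.
by_pattern=$(awk '$3 <= 20 { print $0 " (**)" } $3 > 20 { print $0 }' "$workdir/food_list.txt")

# Style 2: one rule without a pattern; the condition moves into an if/else.
by_if=$(awk '{ if ($3 <= 20) print $0 " (**)"; else print $0 }' "$workdir/food_list.txt")

printf '%s\n' "$by_if"
rm -r "$workdir"
```

The pattern style tends to stay short for simple filters, while an if statement is convenient when the branches share other work inside the same action.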
awk 'NR==1 {print }; NR>1 && $3 > 30 { printf "%s\t%s\n", $0, "**" ; }; NR>1 && $3 <= 30 { print $0 ;}' food_list.txt
would be better.
@jumen
Very good suggestion. It puts to use the concept of Awk built-in variables that we covered in part 10 of the series: https://www.tecmint.com/awk-built-in-variables-examples/
If I want to flag the quantities greater than 30, something goes wrong:
awk '$3 > 30 { printf "%s\t%s\n", $0, "**" ; } $3 <= 30 { print $0 ;}' food_list.txt
awk 'NR==1 {print }; NR>1 && $3 > 30 { printf "%s\t%s\n", $0, "**" ; }; NR>1 && $3 <= 30 { print $0 ;}' food_list.txt