Basic bash concepts, Part-11 (Awk) with solved Hackerrank problems

Hey Pals, this article is about AWK: its basics, some examples, and Hackerrank problems with their solutions.

AWK

AWK is an interpreted programming language that is commonly used from the shell and inside Bash scripts. It is very powerful and specially designed for text processing, with built-in support for arithmetic, on Unix-like systems. Its name is derived from the family names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.

We can use AWK for the following:

  • text processing,
  • pattern matching and formatted text reports,
  • arithmetic operations,
  • string operations.
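As a rough illustration of these uses together, here is a minimal sketch. The two-column input (item name and price) is made up for this example and is not from the original post.

```shell
# Hypothetical two-column input: item name and price.
report=$(printf 'apple 1.5\npear 2.25\n' | awk '
  { total += $2                                  # arithmetic: running sum
    printf "%-6s %5.2f\n", toupper($1), $2 }     # string operation + formatted output
  END { printf "%-6s %5.2f\n", "TOTAL", total }')
echo "$report"
# APPLE   1.50
# PEAR    2.25
# TOTAL   3.75
```

toupper and printf are standard awk built-ins; the column widths are an arbitrary choice.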

If you are new to Bash, you can start from the beginning of this series.

Before this post, you have learned

  1. Basic bash hackerrank
  2. Shell beginner
  3. Bash II chapter
  4. Bash III chapter
  5. Bash IV chapter
  6. Bash V chapter
  7. Bash VI chapter
  8. Bash VII chapter
  9. Bash Part 8 Array
  10. Bash Part 9 Array
  11. Bash Part 10 AWK

Basic bash concepts, Part-11 (Awk)

Awk breaks each line of input into fields. By default, a field is a string of consecutive characters delimited by whitespace, but we can change that. Awk parses and operates on each separate field. This makes it powerful and useful for handling structured text files, especially tables: data organized into consistent chunks, such as rows and columns.
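To make the "we can change that" part concrete, here is a small sketch using the -F option, which sets the field separator. The colon-delimited sample line is an assumption made up for this example.

```shell
# Hypothetical colon-delimited record, in the style of /etc/passwd.
line='alice:x:1001:1001:Alice:/home/alice:/bin/bash'

# -F: tells awk to split fields on ":" instead of whitespace.
# $NF is the last field on the line.
first_and_last=$(echo "$line" | awk -F: '{ print $1, $NF }')
echo "$first_and_last"
# alice /bin/bash
```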

Example

echo yes no | awk '{print $1}'
# yes

echo yes no | awk '{print $2}'
# no

# But what is field #0 ($0)?
echo yes no | awk '{print $0}'
# yes no
# All the fields!

Here yes and no are separated by whitespace, so awk stores yes in $1 and no in $2.
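One detail worth checking before printing several fields at once is how print joins them. The sketch below reuses the same yes/no input; the variable names are only for illustration.

```shell
# Without a comma, awk concatenates the fields with nothing in between.
glued=$(echo yes no | awk '{ print $1 $2 }')
# With a comma, awk joins them with the output field separator (a space by default).
spaced=$(echo yes no | awk '{ print $1, $2 }')

echo "$glued"    # yesno
echo "$spaced"   # yes no
```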

awk '{print $1, $5, $6}' "$filename"
# Prints fields #1, #5, and #6 of file $filename, separated by spaces.

awk '{print $0}' "$filename"
# Prints the entire file!

Hackerrank Problems

Problem 1

You are given a file with four space separated columns containing the scores of students in three subjects. The first column contains a single character (A – Z), the student identifier. The next three columns have three numbers each. The numbers are between 0 and 100, both inclusive. These numbers denote the scores of the students in English, Mathematics, and Science, respectively.

Your task is to identify those lines that do not contain all three scores for students.

Input Format

There will be no more than 10 rows of data.
Each line will be in the following format:
[Identifier] [English Score] [Math Score] [Science Score]

Output Format

For each student, if one or more of the three scores is missing, display:

Not all scores are available for [Identifier]

Sample Input

A 25 27 50
B 35 75
C 75 78 
D 99 88 76

Sample Output

Not all scores are available for B
Not all scores are available for C

Explanation

Only 2 scores have been provided for student B and student C.

Solution

awk '{ if ($2 == "" || $3 == "" || $4 == "") print "Not all scores are available for", $1 }'

Here we check whether field #2, #3, or #4 is empty on each line. If any of them is empty, the student identified by field #1 (B and C in the sample) does not have all three scores.
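An alternative sketch, not the solution given above, is to count the fields with awk's built-in NF variable instead of testing each one. It assumes a missing score always shows up as a shorter line, as in the sample input.

```shell
# Sample input from the problem statement.
scores='A 25 27 50
B 35 75
C 75 78
D 99 88 76'

# NF is the number of fields on the current line; fewer than 4 means a score is missing.
missing=$(printf '%s\n' "$scores" | awk 'NF < 4 { print "Not all scores are available for", $1 }')
echo "$missing"
# Not all scores are available for B
# Not all scores are available for C
```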

Problem 2

You are given a file with four space separated columns containing the scores of students in three subjects. The first column contains a single character (A-Z), the student identifier. The next three columns have three numbers each. The numbers are between 0 and 100, both inclusive. These numbers denote the scores of the students in English, Mathematics, and Science, respectively.

Your task is to identify whether each of the students has passed or failed.
A student is considered to have passed if (s)he has a score 50 or more in each of the three subjects.

Input Format

There will be no more than rows of data.
Each line will be in the following format:
[Identifier] [English Score] [Math Score] [Science Score]

Output Format

Depending on the scores, display the following for each student:

[Identifier] : [Pass] 

or

[Identifier] : [Fail]  

Sample Input

A 25 27 50
B 35 37 75
C 75 78 80
D 99 88 76

Sample Output

A : Fail
B : Fail
C : Pass
D : Pass

Explanation

Only student C and student D have scored >= 50 in all three subjects.

Solution

awk '{
  if ($2 >= 50 && $3 >= 50 && $4 >= 50) grade = "Pass";
  else grade = "Fail";
  print $1, ":", grade;
}'

or

awk '{
  if ($2 >= 50 && $3 >= 50 && $4 >= 50)
    print $1, ": Pass";
  else
    print $1, ": Fail";
}'

$1 contains the student identifier, such as A, B, or C, so we check whether $2, $3, and $4 are all >= 50 and print Pass or Fail accordingly.
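To try this check on the sample input, the sketch below pipes the four sample lines into awk. Passing the pass mark in with -v is an addition for illustration; the variable name min is an assumption, not part of the Hackerrank task.

```shell
# Sample input from the problem statement.
scores='A 25 27 50
B 35 37 75
C 75 78 80
D 99 88 76'

# -v min=50 sets an awk variable (hypothetical name "min") before the program runs.
result=$(printf '%s\n' "$scores" | awk -v min=50 '{
  if ($2 >= min && $3 >= min && $4 >= min) grade = "Pass"
  else grade = "Fail"
  print $1, ":", grade
}')
echo "$result"
# A : Fail
# B : Fail
# C : Pass
# D : Pass
```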

We can also use the ternary operator to do it in one line:

awk '{ print $1, ":", (($2 >= 50 && $3 >= 50 && $4 >= 50) ? "Pass" : "Fail") }'

Problem 3

This is an additional task on the same input file used in Problems 1 and 2.

Your task is to identify the performance grade for each student. If the average of the three scores is 80 or more, the grade is ‘A’. If the average is 60 or above, but less than 80, the grade is ‘B’. If the average is 50 or above, but less than 60, the grade is ‘C’. Otherwise the grade is ‘FAIL’.

Sample Input

A 25 27 50
B 35 37 75
C 75 78 80
D 99 88 76

Sample Output

A 25 27 50 : FAIL
B 35 37 75 : FAIL
C 75 78 80 : B
D 99 88 76 : A

Explanation

A scored an average of less than 50 => FAIL
The same holds for B
C scored an average between 60 and 80 => B
D scored an average of 80 or more => A

Solution

awk '{
  avg = ($2 + $3 + $4) / 3;
  if (avg >= 80) grade = "A";
  else if (avg >= 60) grade = "B";
  else if (avg >= 50) grade = "C";
  else grade = "FAIL";
  print $0, ":", grade;
}'

or we can use the ternary operator:

awk '{ average = ($2 + $3 + $4) / 3; print $0, ":", ((average >= 80) ? "A" : (average >= 60) ? "B" : (average >= 50) ? "C" : "FAIL") }'

Here we only need arithmetic: we take the average of the three scores and check whether it is >= 80, >= 60, >= 50, or lower.
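To see the averages behind each grade, the sketch below also prints the computed average next to the grade. The extra "(average ...)" suffix is only for illustration and is not part of the expected Hackerrank output.

```shell
# Sample input from the problem statement.
scores='A 25 27 50
B 35 37 75
C 75 78 80
D 99 88 76'

graded=$(printf '%s\n' "$scores" | awk '{
  avg = ($2 + $3 + $4) / 3
  if      (avg >= 80) grade = "A"
  else if (avg >= 60) grade = "B"
  else if (avg >= 50) grade = "C"
  else                grade = "FAIL"
  # The average is appended for inspection only.
  printf "%s : %s (average %.2f)\n", $0, grade, avg
}')
echo "$graded"
# A 25 27 50 : FAIL (average 34.00)
# B 35 37 75 : FAIL (average 49.00)
# C 75 78 80 : B (average 77.67)
# D 99 88 76 : A (average 87.67)
```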

Problem 4

You are provided a file with four space-separated columns containing the scores of students in three subjects. The first column contains a single character (A-Z) – the identifier of the student. The next three columns have three numbers (each between 0 and 100, both inclusive) which are the scores of the students in English, Mathematics, and Science respectively.

Concatenate every 2 lines of input with a semicolon.

Sample Input

A 25 27 50
B 35 37 75
C 75 78 80
D 99 88 76 

Sample Output

A 25 27 50;B 35 37 75
C 75 78 80;D 99 88 76 

Explanation

Every pair of lines has been concatenated with a semi-colon.

Solution

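# Set the output record separator to ";" after odd-numbered lines and to a newline after even-numbered ones.
awk 'ORS=NR%2?";":"\n"'

or

# paste reads two lines at a time from standard input and joins them with ";".
paste -d';' - -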
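The ORS one-liner is compact but easy to misread. A more explicit sketch of the same idea, not the author's solution, prints odd-numbered lines with a trailing ";" and lets even-numbered lines end the row:

```shell
# Sample input from the problem statement.
scores='A 25 27 50
B 35 37 75
C 75 78 80
D 99 88 76'

# NR is the current line number. Odd lines are printed with ";" and no newline,
# then "next" skips to the following line; even lines are printed normally.
joined=$(printf '%s\n' "$scores" | awk 'NR % 2 { printf "%s;", $0; next } { print }')
echo "$joined"
# A 25 27 50;B 35 37 75
# C 75 78 80;D 99 88 76
```

With an odd number of input lines, the last line would end in ";" without a newline, just as it does with the ORS version.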

All rights reserved. No part of this Post may be copied, distributed, or transmitted in any form or by any means, without the prior written permission of the website admin, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law. For permission requests, write to the owner, addressed “Attention: Permissions Coordinator,” to the admin @coderinme

A web developer(Front end and Back end), and DBA at csdamu.com. Currently working as Salesforce Developer @ Tech Matrix IT Consulting Private Limited. Check me @about.me/s.saifi
