How to display file columns containing a specific word using awk

I would like to print every column that contains a given word, for example "watermelon". I was thinking of combining these two commands, because each one works separately (one does something for every column in the file, and the other checks whether a column contains a specific word).

awk '{for(i=1;i<=NF-1;i++) printf $i" "; print $i}' a.csv
awk -F"," '{if ($2 == " watermelon") print $2}' a.csv

But when I try to put them together, my code doesn't work:

#!/bin/bash 
awk '{for(i=1;i<=NF-1;i++) 
         awk -F"," '{if ($i == " watermelon") 
              print $i}' a.csv    
        }' a.csv

For example this is my file a.csv

lp, type, name, number, letter
1, fruit, watermelon, 6, a
2, fruit, apple, 7, b
3, vegetable, onion, 8, c
4, vegetable, broccoli, 6, b
5, fruit, orange, 5, c

And this is the result I would like to get when searching for the word watermelon:

name
watermelon
apple
onion
broccoli
orange
about 3 years ago · Santiago Trujillo
2 answers

$ cat tst.awk
BEGIN { FS=OFS=", " }
NR==FNR {
    for (inFldNr=1; inFldNr<=NF; inFldNr++) {
        if ( $inFldNr == tgt ) {
            hits[inFldNr]
        }
    }
    next
}
FNR==1 {
    for (inFldNr=1; inFldNr<=NF; inFldNr++) {
        if ( inFldNr in hits ) {
            out2in[++numOutFlds] = inFldNr
        }
    }
}
{
    for (outFldNr=1; outFldNr<=numOutFlds; outFldNr++) {
        inFldNr = out2in[outFldNr]
        printf "%s%s", $inFldNr, (outFldNr<numOutFlds ? OFS : ORS)
    }
}

$ awk -v tgt='watermelon' -f tst.awk file file
name
watermelon
apple
onion
broccoli
orange

The main difference between the above and @JamesBrown's approach is that, in the second pass over the file, my script loops only over the fields to be output, while James's loops over all input fields. His will therefore be slower in what is presumably the normal case, where not all input fields have to be output.
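To see the script above select more than one column, here is a small self-contained run. The file name sample.csv and its contents are made up for illustration (they are not from the question); the awk script is the one shown above, copied unchanged into a temporary file.

```shell
# Hypothetical sample data: "kiwi" occurs in both the name and alias columns.
dir=$(mktemp -d)
cat > "$dir/sample.csv" <<'EOF'
id, kind, name, alias
1, fruit, kiwi, kiwi
2, fruit, pear, none
EOF

# The script from the answer above, written to a temporary file.
cat > "$dir/tst.awk" <<'EOF'
BEGIN { FS=OFS=", " }
NR==FNR {
    # First pass: remember every field number whose value equals tgt.
    for (inFldNr=1; inFldNr<=NF; inFldNr++) {
        if ( $inFldNr == tgt ) {
            hits[inFldNr]
        }
    }
    next
}
FNR==1 {
    # Start of second pass: build an ordered list of the fields to output.
    for (inFldNr=1; inFldNr<=NF; inFldNr++) {
        if ( inFldNr in hits ) {
            out2in[++numOutFlds] = inFldNr
        }
    }
}
{
    # Second pass: print only the selected fields, joined by OFS.
    for (outFldNr=1; outFldNr<=numOutFlds; outFldNr++) {
        inFldNr = out2in[outFldNr]
        printf "%s%s", $inFldNr, (outFldNr<numOutFlds ? OFS : ORS)
    }
}
EOF

# The same file is named twice so awk reads it in two passes.
out=$(awk -v tgt='kiwi' -f "$dir/tst.awk" "$dir/sample.csv" "$dir/sample.csv")
rm -rf "$dir"
printf '%s\n' "$out"
```

With this made-up input, both matching columns are printed on every line, separated by OFS: "name, alias", then "kiwi, kiwi", then "pear, none".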

Regarding printf $i in your code, by the way: never do that. Always use printf "%s", $i for any input data instead, as the former will fail when your input contains printf formatting characters such as %s.
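A minimal sketch of the safe form, using a made-up input line that contains %s. The unsafe form is deliberately left out, since what it does with such input differs between awk implementations.

```shell
# Hypothetical input containing a printf conversion specifier.
line='100%s done'

# The data is passed as an argument to a fixed "%s" format,
# so awk never interprets the input itself as a format string.
safe=$(printf '%s\n' "$line" | awk '{ printf "%s\n", $0 }')
printf '%s\n' "$safe"
```

The line comes out unchanged because the input is only ever treated as data.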

about 3 years ago · Santiago Trujillo

Here's one that processes the data twice:

$ awk -F', ' '                          # remember to set OFS if you need one
NR==FNR {                               # on the first run
    for(i=1;i<=NF;i++)                  # find 
        if($i=="watermelon")            # watermelon fields
            a[i]                        # and mark them
    next
}
FNR==1 {                                # in case there was no such field
    for(i in a)                         # test 
        next                            # and continue
    exit                                # or exit
}
{                                       # on the second run
    for(i=1;i<=NF;i++)                 
        if(i in a)b=b (b==""?"":OFS) $i # buffer those fields for output
    print b                             # and output
    b=""                                # clean that buffer for next record
}' file file

Output:

name
watermelon
apple
onion
broccoli
orange
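Both answers rely on naming the same file twice and testing NR==FNR to tell the passes apart. Here is a stripped-down sketch of just that idiom, with a made-up three-line input: NR counts records across all files while FNR restarts for each file, so the two are equal only while the first file argument is being read. One caveat of this common idiom is that it misfires if the first file is empty.

```shell
# Hypothetical three-line input file.
tmp=$(mktemp)
printf '%s\n' 'a' 'b' 'c' > "$tmp"

result=$(awk '
NR==FNR { total++; next }          # first pass: count the lines, print nothing
{ print FNR "/" total ": " $0 }    # second pass: the total is already known
' "$tmp" "$tmp")

rm -f "$tmp"
printf '%s\n' "$result"
```

This prints "1/3: a", "2/3: b" and "3/3: c": the first pass collects information about the whole file, and the second pass uses it while printing.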
about 3 years ago · Santiago Trujillo