awk filter columns with calculated values and number range | updated

tok19

Hi i dont know how to achieve expected output from given input file , current code giving output but need to have to expected one

input file                     awk code output
--------

3700 TEXT_0A 34 NA NA        | 3700  TEXT_0A  34    NA   NA    850
3721 TEXT_0A 55 ete 851      | 3721  TEXT_0A  55    ete  851   850
3701 TEXT_0A 100 NA NA       | 3701  TEXT_0A  100   NA   NA    850
3724 TEXT_0A 150 ete 1275    | 3724  TEXT_0A  150   ete  1275  1275
3713 TEXT_0A 170 ete 1445    | 3713  TEXT_0A  170   ete  1445  1445
25 TEXT_XA 200 ete 822       | 3686  TEXT_0A  200   ete  822   1700
3686 TEXT_0A 200 ete 822     | 3692  TEXT_0A  200   ete  1700  1700
3692 TEXT_0A 200 ete 1700    | 3716  TEXT_0A  200   ete  3000  1700
3716 TEXT_0A 200 ete 3000    | 3688  TEXT_0A  250   ete  822   1875
88 TEXT_AF 250 ete 822       | 3708  TEXT_0A  300   ete  2250  2250
69 DNTT_AF 250 ete 822       | 3704  TEXT_0A  304   ete  2280  2280
3688 TEXT_0A 250 ete 822     | 3705  TEXT_0A  304   ete  2350  2280
3708 TEXT_0A 300 ete 2250    | 3689  TEXT_0A  400   ete  3000  3000
3704 TEXT_0A 304 ete 2280    | 3717  TEXT_0A  400   ete  3002  3000
3705 TEXT_0A 304 ete 2350    | 3706  TEXT_0A  404   ete  3030  3030
3689 TEXT_0A 400 ete 3000    | 3696  TEXT_0A  500   ete  3750  3750
3717 TEXT_0A 400 ete 3002    | 3707  TEXT_0A  650   ete  4500  4875
3706 TEXT_0A 404 ete 3030    | 3720  TEXT_0A  800   ete  3000  6000
3696 TEXT_0A 500 ete 3750    | 3694  TEXT_0A  1000  ete  7500  7500
3707 TEXT_0A 649.5 ete 4500  | 
3720 TEXT_0A 800 ete 3000    | 
3694 TEXT_0A 1000 ete 7500   | 

filter rules:

  1. filter lines containging strings _0A and exclude lines containing DNTT _AF _XA based on column $3 <=100 assign to output column $6 fixed value 850
  2. based on column $3 > 100 && $3 <=200 within this number range assign to output column $6 calculated value $3*8.5
  3. based on column $3 > 200 && $3 <=1000 within this number range assign to output column $6 calculated value $3*7.5
  4. and filter it to expected output once compared if not equeal column $5 and $6 below

expected output:

3700  TEXT_0A  34    NA   NA    850
3721  TEXT_0A  55    ete  851   850
3701  TEXT_0A  100   NA   NA    850
3686  TEXT_0A  200   ete  822   1700
3716  TEXT_0A  200   ete  3000  1700
3688  TEXT_0A  250   ete  822   1875
3705  TEXT_0A  304   ete  2350  2280
3717  TEXT_0A  400   ete  3002  3000
3707  TEXT_0A  650   ete  4500  4875
3720  TEXT_0A  800   ete  3000  6000

partially working code:

awk '{if (( $2 ~ /_0A/ && !/DNTT|_AF|_XA/ && $3 <=100 )) \
printf ("%s %s %.0f %s %s %.0f\n",$1,$2,$3,$4,$5,850); \
else if (( $2 ~ /_0A/ && !/DNTT|_AF|_XA/ && $3 > 100 && $3 <=200 )) \
printf ("%s %s %.0f %s %s %.0f\n",$1,$2,$3,$4,$5,$3*8.5); \
else if (( $2 ~ /_0A/ && !/DNTT|_AF|_XA/ && $3 > 200 && $3 <=1000 )) \
printf ("%s %s %.0f %s %s %.0f\n",$1,$2,$3,$4,$5,$3*7.5)}' file2 | column -t
Luuk

Reading the lines from 1 to 4, and converting this to awk:

gawk '$0~"_0A" && $0!~"DNTT"{ $6=(($3>100 && $3<=200)?($3*8.5):($3>200 && $3<=1000?($3*7.5):850)); if($5!=$6){ print }}' input.txt

output:

3700 TEXT_0A 34 NA NA 850
3721 TEXT_0A 55 ete 851 850
3701 TEXT_0A 100 NA NA 850
3686 TEXT_0A 200 ete 822 1700
3716 TEXT_0A 200 ete 3000 1700
3688 TEXT_0A 250 ete 822 1875
3705 TEXT_0A 304 ete 2350 2280
3717 TEXT_0A 400 ete 3002 3000
3707 TEXT_0A 649.5 ete 4500 4871.25
3720 TEXT_0A 800 ete 3000 6000

An if statement (if a=b then c else d;) can also be written as (a=b?c:d)

NOTE: You have to do the exclusion of _AF _XA, it was unclear to me, because of no seeing _XA. That's why i excluded DNTT.

この記事はインターネットから収集されたものであり、転載の際にはソースを示してください。

侵害の場合は、連絡してください[email protected]

編集
0

コメントを追加

0

関連記事

分類Dev

Textbox to filter all values that matches in range

分類Dev

awk: Perform arithmetic on subset of columns and print all columns with modified values

分類Dev

How to elegantly assign values in function of number range?

分類Dev

Group rows by two columns and filter values by comparison

分類Dev

custom function to filter values in pandas dataframe columns

分類Dev

Sum of columns based on range of values of other columns in a Pandas dataframe

分類Dev

loop drop columns over number of missing values

分類Dev

Use like '%' and match NULL values with NUMBER columns

分類Dev

Date filter reference in a calculated field

分類Dev

Filter the rows of a file based on the number of columns that have =0.00000000

分類Dev

Group/merge array values based on a set number range

分類Dev

Filter out duplicate row based on values in 2 columns

分類Dev

Netlogo: Return the Minimum of calculated values

分類Dev

Get values from beginning of file, Insert values as new columns, trim beginning lines using AWK/GAWK/SED

分類Dev

Count number of columns with some values for each row in pandas

分類Dev

R - generate dynamic number of columns and substring column values

分類Dev

Does mysql query cache the dynamically calculated columns

分類Dev

Remove empty columns by awk

分類Dev

How is the time stamp number in /etc/shadow calculated?

分類Dev

DataTable Filter with range of dates

分類Dev

Check if a value in one column in one dataframe is within the range between values in two columns in another dataframe

分類Dev

Tableau calculated field summing up the values

分類Dev

Is there a more concise method to filter columns containing a number greater than or less than, in dplyr?

分類Dev

Reducing number of awk passes

分類Dev

Increment number using AWK

分類Dev

How to use awk or anything else to number of shared x values of 2 different y values in a csv file consists of column a and b?

分類Dev

How to filter values of 2 columns in a dataframe from a second dataframe using pandas

分類Dev

How to filter a dataframe efficient when I have to compare values of two columns?

分類Dev

Updated rows not expected values with JOIN?

Related 関連記事

  1. 1

    Textbox to filter all values that matches in range

  2. 2

    awk: Perform arithmetic on subset of columns and print all columns with modified values

  3. 3

    How to elegantly assign values in function of number range?

  4. 4

    Group rows by two columns and filter values by comparison

  5. 5

    custom function to filter values in pandas dataframe columns

  6. 6

    Sum of columns based on range of values of other columns in a Pandas dataframe

  7. 7

    loop drop columns over number of missing values

  8. 8

    Use like '%' and match NULL values with NUMBER columns

  9. 9

    Date filter reference in a calculated field

  10. 10

    Filter the rows of a file based on the number of columns that have =0.00000000

  11. 11

    Group/merge array values based on a set number range

  12. 12

    Filter out duplicate row based on values in 2 columns

  13. 13

    Netlogo: Return the Minimum of calculated values

  14. 14

    Get values from beginning of file, Insert values as new columns, trim beginning lines using AWK/GAWK/SED

  15. 15

    Count number of columns with some values for each row in pandas

  16. 16

    R - generate dynamic number of columns and substring column values

  17. 17

    Does mysql query cache the dynamically calculated columns

  18. 18

    Remove empty columns by awk

  19. 19

    How is the time stamp number in /etc/shadow calculated?

  20. 20

    DataTable Filter with range of dates

  21. 21

    Check if a value in one column in one dataframe is within the range between values in two columns in another dataframe

  22. 22

    Tableau calculated field summing up the values

  23. 23

    Is there a more concise method to filter columns containing a number greater than or less than, in dplyr?

  24. 24

    Reducing number of awk passes

  25. 25

    Increment number using AWK

  26. 26

    How to use awk or anything else to number of shared x values of 2 different y values in a csv file consists of column a and b?

  27. 27

    How to filter values of 2 columns in a dataframe from a second dataframe using pandas

  28. 28

    How to filter a dataframe efficient when I have to compare values of two columns?

  29. 29

    Updated rows not expected values with JOIN?

ホットタグ

アーカイブ