I have an input file in .csv format that contains tax invoice entries, with fields separated by a pipe character.
For example:
Header--TIN | NAME | INV NO | DATE | NET | TAX | OTHERS | TOTAL
Record1-29001234768 | A S Spares | AB012 | 23/07/2016 | 5600.25 | 200.70 | 10.05 | 5811.00
Record2-29450956221 | HONDA Spare Parts | HOSS0987 |29/09/2016 | 70000 | 2200 | 0 | 72200
The record's NET value, TAX Value, OTHER Charges and TOTAL value column may contain positive whole numbers or positive floating point numbers with 2-4 places after the decimal point.
My requirement is to check whether these columns meet the specified constraints using a regular expression in awk. I need to match these 4 columns against a regular expression so that, if I encounter any value other than a positive whole number or a positive floating point number, I can print an error message to the user.
I've tried the following, but it doesn't seem to work.
if(!($5 ~ /[0-9]+/) || !($5 ~ /[0-9]+[.][0-9]+/) || ($5<=0))
{ printf("NET VALUE (Violates constraints)" }
Can anyone give a working regular expression, or an implementation using a built-in function, that meets my requirements?
Sounds like your validation should be:
$5 ~ /^[0-9]+(\.[0-9]{2,4})?$/
If it matches that, then it's valid: either a whole number, or a whole number followed by a . and between 2 and 4 more digits. No sign is allowed, so negative values are rejected.
The anchors to the start and end of the field are important!
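To show how the anchored check might be applied to all four numeric columns, here is a minimal sketch wrapped in a shell function. The function name `validate_invoices`, the column labels, and the assumption that the first line of the file is the header are all hypothetical. Because the sample data has spaces around each pipe, the sketch trims each field before matching; otherwise the anchors would reject every value. The `{2,4}` interval is spelled out as repeated `[0-9]` because some older awk implementations do not support interval expressions.

```shell
# validate_invoices: hypothetical helper name; reads pipe-separated records on stdin.
validate_invoices() {
    awk -F'|' '
    BEGIN {
        # Hypothetical labels for columns 5-8; adjust to the real header.
        name[5] = "NET"; name[6] = "TAX"; name[7] = "OTHERS"; name[8] = "TOTAL"
    }
    NR > 1 {                                  # assumption: line 1 is the header
        for (i = 5; i <= 8; i++) {
            v = $i
            gsub(/^[ \t]+|[ \t]+$/, "", v)    # trim padding around the pipes
            # digits, optionally followed by a dot and 2, 3 or 4 digits
            if (v !~ /^[0-9]+(\.[0-9][0-9]([0-9][0-9]?)?)?$/)
                printf("line %d: %s VALUE (Violates constraints): %s\n", NR, name[i], v)
        }
    }'
}
```

It could be run as `validate_invoices < invoices.csv` (the file name is a placeholder). Note that the pattern accepts 0, which the sample data uses in the OTHERS column; if zero must be rejected, an extra numeric comparison such as `v + 0 <= 0` would be needed.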
As rightly pointed out in the comments, if you want to accept numbers with no digits before the decimal point, then you will have to go for a more complex regular expression.
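One possible shape for that more complex expression, as a sketch only: add an alternative that allows the fractional part on its own, so that a value such as .50 is accepted while a bare . or 12. is still rejected. The helper name `is_amount` is hypothetical, and the 2 to 4 digit rule is assumed to apply to the leading-dot form as well.

```shell
# is_amount: hypothetical helper; exit status 0 if $1 is an acceptable amount.
is_amount() {
    printf '%s\n' "$1" | awk '
    {
        # a dot followed by 2, 3 or 4 digits
        frac = "[.][0-9][0-9]([0-9][0-9]?)?"
        # either digits with an optional fraction, or a fraction alone
        ok = ($0 ~ ("^([0-9]+(" frac ")?|" frac ")$"))
        exit !ok
    }'
}
```

For example, `is_amount ".50" && echo valid` prints `valid`, whereas `is_amount ".5"` returns a non-zero status.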