Python Regex To Ignore Date Pattern

WeShall

Sample Data:

Weight Measured: 80.7 kg (11/27/1900 24:59:00)
Pulse 64 \F\ Temp 37.3?C (99.1 ?F) \F\ Wt 101.2 kg (223 lb)
Weight as of 11/11/1900 72.2 kg (159 lb 1.6 oz)
Resp. rate 16, height 177.8 cm (5' 10"), weight 84.7 kg (186 lb|
11.2 oz)
And one extra weight example 100lbs

Partially working Regex:

\b(?i)(?:weight|wt)\b(?:.){1,25}?\b(\d+\.?(?:\d+)).*?(\w+)\b

Current output:

('80.7', 'kg'), ('101.2', 'kg'), ('11', '11'), ('84.7', 'kg'), ('100', 'lbs')

Expected output:

('80.7', 'kg'), ('101.2', 'kg'), ('72.2', 'kg'), ('84.7', 'kg'), ('100', 'lbs')

How do I make my current regex ignore dates and capture the value that follows? Also, how do I make this regex stop matching at the end of the line?

Wiktor Stribiżew

You may use

re.findall(r'(?i)\bw(?:eigh)?t\b.{1,25}?\b(?<!\d/)(\d+(?:\.\d+)?)(?!/?\d)\s*(\w+)', text)

See the regex demo

Details

  • (?i) - same as re.I - case insensitive mode on
  • \b - a word boundary
  • w(?:eigh)?t - wt or weight
  • \b - a word boundary
  • .{1,25}? - any 1 to 25 chars other than line break chars, as few as possible
  • \b - a word boundary
  • (?<!\d/) - a negative lookbehind that fails the match if immediately to the left of the current location there is a digit and /
  • (\d+(?:\.\d+)?) - Group 1: one or more digits followed by an optional sequence of a dot and one or more digits
  • (?!/?\d) - a negative lookahead that fails the match if immediately to the right of the current location there is an optional / and a digit
  • \s* - zero or more whitespace characters
  • (\w+) - Group 2: one or more letters, digits or underscores.
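The breakdown above can also be written directly into the pattern with re.VERBOSE. This is only a sketch of the same regex in a commented layout; the name WEIGHT_RE is chosen here for illustration, and the sample line is taken from the question.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE
# so that each part carries its own comment.
# WEIGHT_RE is an illustrative name, not part of the original answer.
WEIGHT_RE = re.compile(r"""
    \b w(?:eigh)?t \b      # "wt" or "weight" as a whole word
    .{1,25}?               # 1 to 25 chars on the same line, as few as possible
    \b (?<!\d/)            # start of a number that is not preceded by digit + "/"
    (\d+(?:\.\d+)?)        # Group 1: integer or decimal value
    (?!/?\d)               # not followed by an optional "/" and a digit
    \s*                    # optional whitespace
    (\w+)                  # Group 2: the unit
""", re.IGNORECASE | re.VERBOSE)

sample = "Weight as of 11/11/1900 72.2 kg (159 lb 1.6 oz)"  # line from the question
print(WEIGHT_RE.findall(sample))
# => [('72.2', 'kg')]
```

The date parts 11, 11 and 1900 are each rejected by one of the two lookarounds, so the lazy .{1,25}? keeps expanding until it reaches 72.2.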

See Python demo:

import re
# Backslashes in the sample are doubled so that \F is not read as an (invalid) escape sequence
text = """Weight Measured: 80.7 kg (11/27/1900 24:59:00)\nPulse 64 \\F\\ Temp 37.3?C (99.1 ?F) \\F\\ Wt 101.2 kg (223 lb)\nWeight as of 11/11/1900 72.2 kg (159 lb 1.6 oz)\nResp. rate 16, height 177.8 cm (5' 10"), weight 84.7 kg (186 lb|\n11.2 oz)\nAnd one extra weight example 100lbs"""
print(re.findall(r'(?i)\bw(?:eigh)?t\b.{1,25}?\b(?<!\d/)(\d+(?:\.\d+)?)(?!/?\d)\s*(\w+)', text))
# => [('80.7', 'kg'), ('101.2', 'kg'), ('72.2', 'kg'), ('84.7', 'kg'), ('100', 'lbs')]
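As for stopping at the end of the line: without re.DOTALL, the dot does not match a line break, so the .{1,25}? part cannot run into the next line. The \s* before the unit can still cross a line break, though. If that is not wanted, one possible tweak (an assumption about the requirement, not something the question spells out) is to replace \s* with [ \t]*. The variable names below are illustrative only.

```python
import re

# Pattern from the answer above.
pattern = r'(?i)\bw(?:eigh)?t\b.{1,25}?\b(?<!\d/)(\d+(?:\.\d+)?)(?!/?\d)\s*(\w+)'
# Hypothetical stricter variant: [ \t]* instead of \s* keeps the unit on the same line.
strict = r'(?i)\bw(?:eigh)?t\b.{1,25}?\b(?<!\d/)(\d+(?:\.\d+)?)(?!/?\d)[ \t]*(\w+)'

# "." does not match "\n", so a value on the next line is not picked up.
value_on_next_line = re.findall(pattern, "weight\n84.7 kg")
print(value_on_next_line)   # => []

# \s* does match "\n", so a unit on the next line is still captured.
unit_on_next_line = re.findall(pattern, "weight 84\nkg")
print(unit_on_next_line)    # => [('84', 'kg')]

# With [ \t]* the match has to finish on the same line.
unit_strict = re.findall(strict, "weight 84\nkg")
print(unit_strict)          # => []
```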
