Python regex to match 6-digit numbers of different formats

jgredecka

I need to find and capture every 6-digit number that either has an OMIM: or MIM# prefix, or has no prefix and is not preceded by a colon.

Expected output

[111111, 222222, 555555, 444444]

What I have tried

import re

sentence = '111111;Dystonia-1,222222,OMIM:555555; 3333333 Dystonic disorder1,MIM#444444'

re1 = r'OMIM:(\d{6})'
re2 = r'MIM#(\d{6})'
re3 = r'[^:](\d{6})'

identifiers = re.compile("(%s|%s|%s)" % (re1, re2, re3)).findall(sentence)

Current output

[
  ( ',222222'     , ''       , ''       , '222222' ),
  ( 'OMIM:555555' , '555555' , ''       , ''       ),
  ( ' 333333'     , ''       , ''       , '333333' ),
  ( 'MIM#444444'  , ''       , '444444' , ''       )
]

JvdV

I think you could try:

\b(?:MIM#|OMIM:|(?<!:))(\d{6})\b

  • \b - Word boundary.
  • (?: - Non-capture group:
    • MIM#|OMIM:|(?<!:) - Literally "MIM#" or "OMIM:" or a negative lookbehind to assert position is not preceded by a colon.
    • ) - Close non-capture group.
  • (\d{6}) - Capture six digits in a capture group.
  • \b - Word boundary.
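
As a rough sketch of the breakdown above, the same pattern can be written with re.VERBOSE so that each part carries its own comment. The name PATTERN and the sample strings are made up for illustration and are not from the question.

```python
import re

# The same pattern, spelled out with re.VERBOSE so each part carries a comment.
# In verbose mode a literal '#' must be escaped, hence MIM\#.
PATTERN = re.compile(r"""
    \b              # word boundary before the prefix or the first digit
    (?:
        MIM\#       # literal "MIM#"
      | OMIM:       # literal "OMIM:"
      | (?<!:)      # or no prefix, as long as the previous character is not a colon
    )
    (\d{6})         # the six digits to capture
    \b              # word boundary, so a seventh digit rejects the match
""", re.VERBOSE)

# Hypothetical inputs, not taken from the question.
samples = ["OMIM:123456", "MIM#123456", "id 654321.",
           "MIM:123456", "1234567", "abc123456"]
results = {s: PATTERN.findall(s) for s in samples}
for s, found in results.items():
    print(f"{s!r:15} -> {found}")
```

The first three samples should each yield one captured number. The last three should yield nothing: a colon without the full OMIM prefix, a seventh digit, and digits glued to letters all fail one of the checks.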

import re
sentence = '111111;Dystonia-1,222222,OMIM:555555; 3333333 Dystonic disorder1,MIM#444444'
print(re.findall(r'\b(?:MIM#|OMIM:|(?<!:))(\d{6})\b', sentence))

Prints:

['111111', '222222', '555555', '444444']
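
Note that re.findall returns strings, while the expected output in the question lists integers. If integers are really what is needed, a minimal sketch of the conversion could look like the following (the names pattern and identifiers are just for illustration). Keep in mind that int() drops leading zeros, so an identifier such as '012345' would become 12345.

```python
import re

sentence = '111111;Dystonia-1,222222,OMIM:555555; 3333333 Dystonic disorder1,MIM#444444'

# Compile once so the pattern can be reused on other sentences.
pattern = re.compile(r'\b(?:MIM#|OMIM:|(?<!:))(\d{6})\b')

# With a single capture group, findall returns a flat list of strings;
# convert each one to int to match the expected output.
identifiers = [int(match) for match in pattern.findall(sentence)]
print(identifiers)  # [111111, 222222, 555555, 444444]
```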
