Python regex: Subsentence that contains number which can have thousand separator and decimal

CaptainCsaba

I have the following text. I would like to collect all subsentences (from comma or period to comma or period) that contain a number. I have managed to write a regex that matches the number and the part after it, but since the number can itself contain commas or periods, I don't know how to grab the words before it. Here is the sentence:

In connection with the consummation of this offering, we will enter into a forward purchase agreement with OrION Capital Structure Solutions UK Limited, or OrION, an affiliate of our sponsor, pursuant to which OrION will commit that it will purchase from us 10,000,000 forward purchase units, or at its option up to an aggregate maximum of 30,000,000 forward purchase units, each consisting of one Class A ordinary share,or a forward purchase share, and one-third of one warrant to purchase one Class A ordinary share, or a forward purchase warrant, for $10.00 per unit, or an aggregate amount of $100,000,000, or at OrION’s option up to an aggregate amount of $300,000,000, in a private placement that will close concurrently with the closing of our initial business combination.

What I want to collect:

["pursuant to which OrION will commit that it will purchase from us 10,000,000 forward purchase units",
"or at its option up to an aggregate maximum of 30,000,000 forward purchase units",
"for $10.00 per unit",
"or an aggregate amount of $100,000,000",
"or at OrION’s option up to an aggregate amount of $300,000,000"]

The regex I wrote currently matches the number and the text after it, up to the next comma or period.

[0-9]{1,2}([,.][0-9]{1,2})?.*?[\.,]

How can I capture the text from the preceding period or comma, then the number (which can contain a decimal or thousands separator), and then the rest of the sentence up to the next comma or period?

EDIT: anubhava and bb1 both gave correct solutions. anubhava solved the question exactly as I asked it, so I marked that answer as accepted. bb1's answer, however, also handles a case that is bound to come up (and that I had not thought of), so in the end I used his answer.

EDIT 2: anubhava has since updated his answer so that it handles the same case as bb1's.

anubhava

You may use this regex with look-around assertions:

(?<=[.,] )(?:[^,.]*?\d+(?:[.,]\d+)*)+[^.,]*(?=[,.])


RegEx Details:

  • (?<=[.,] ): Lookbehind asserting that a comma or dot followed by a space precedes the current position
  • (?:: Start a non-capturing group
    • [^,.]*?: Match 0 or more characters that are not , or . (lazy)
    • \d+(?:[.,]\d+)*: Match a number that may contain . or , as separators
  • )+: End the non-capturing group; + repeats the group 1 or more times
  • [^.,]*: Match 0 or more characters that are not , or .
  • (?=[,.]): Lookahead asserting that a comma or dot follows the current position
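
As a quick check, here is a minimal sketch that applies the pattern above to the sentence from the question using Python's `re` module. The variable names (`text`, `pattern`, `matches`) are chosen for illustration only and are not part of the original answer.

```python
import re

# Sample sentence from the question. \u2019 is the curly apostrophe in "OrION’s".
text = (
    "In connection with the consummation of this offering, we will enter into "
    "a forward purchase agreement with OrION Capital Structure Solutions UK "
    "Limited, or OrION, an affiliate of our sponsor, pursuant to which OrION "
    "will commit that it will purchase from us 10,000,000 forward purchase "
    "units, or at its option up to an aggregate maximum of 30,000,000 forward "
    "purchase units, each consisting of one Class A ordinary share,or a "
    "forward purchase share, and one-third of one warrant to purchase one "
    "Class A ordinary share, or a forward purchase warrant, for $10.00 per "
    "unit, or an aggregate amount of $100,000,000, or at OrION\u2019s option up "
    "to an aggregate amount of $300,000,000, in a private placement that will "
    "close concurrently with the closing of our initial business combination."
)

# The regex from the answer, unchanged. It contains only non-capturing groups,
# so findall() returns the full text of each match.
pattern = re.compile(r"(?<=[.,] )(?:[^,.]*?\d+(?:[.,]\d+)*)+[^.,]*(?=[,.])")

matches = pattern.findall(text)
for m in matches:
    print(m)
```

On this input the pattern yields the five subsentences listed in the question. Note that the lookbehind requires a separator followed by a space, so a subsentence at the very start of the text, or one that follows a comma with no space (as in "share,or"), would not be picked up as a starting point.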
