Extract multiple values from string

user3657339

We were using this approach to find a single keyword:

Get-Content $SourceFile | Select-String -Pattern "search keyword value"

However, we have to extract 4 values, namely embedded pound (£) values (variable currency amounts) and literal substrings, as demonstrated below:

# Sample input
$String =' in the case of a single acquisition the Total Purchase Price of which (less the amount
funded by Acceptable Funding Sources (Excluding Debt)) exceeds £5,000,000 (or its
equivalent) but is less than or equal to £10,000,000 or its equivalent, the Parent shall
supply to the Agent for the Lenders not later than the date a member of the Group
legally commits to make the relevant acquisition, a copy of any financial due diligence
reports obtained by the Group in relation to the Acquisition Target, on a non-reliance
basis (subject to the Agent and any other relevant Reliance Party signing any required
hold harmless letter) and a copy of the acquisition agreement under which the
Acquisition Target is to be acquired;'

# Values to extract

$Value1 = ' in the case of a single acquisition the Total Purchase Price '

$Value2 = ' £5,000,000'

$Value3 = ' £10,000,000'

$Value4 = ' a copy of any financial due diligence
reports obtained by the Group in relation to the Acquisition Target, on a non-reliance
basis (subject to the Agent and any other relevant Reliance Party signing any required
hold harmless letter) and a copy of the acquisition agreement under which the
Acquisition Target is to be acquired;'

mklement0

# Define the regex patterns to search for individually, as elements of an array.
$patterns = 
    # A string literal; escape it, to be safe.
    [regex]::Escape(' in the case of a single acquisition the Total Purchase Price '),     
    # A regex that matches a currency amount in pounds.
    # (Literal ' £', followed by at least one ('+') non-whitespace char. ('\S');
    # this could be made more stringent by matching digits and commas only.)
    ' £\S+',     
    # A string literal that *needs* escaping due to use of '(' and ')'
    # Note the use of a literal here-string (@'<newline>...<newline>'@)
    [regex]::Escape(@'
a copy of any financial due diligence
reports obtained by the Group in relation to the Acquisition Target, on a non-reliance
basis (subject to the Agent and any other relevant Reliance Party signing any required
hold harmless letter) and a copy of the acquisition agreement under which the
Acquisition Target is to be acquired;
'@)

# - Use Get-Content -Raw to read the file *as a whole*
# - Use Select-String -AllMatches to find *multiple* matches (per input string)
# - ($patterns -join '|') joins the individual regexes with an alternation (|)
#   so that matches of any one of them are returned.
Get-Content -Raw $SourceFile | Select-String -AllMatches -Pattern ($patterns -join '|') |
  ForEach-Object {
    # Loop over the matches, each of which exposes the captured substring
    # via its .Value property, and collect them in an *array*, $capturedSubstrings
    # Note: You could use `Set-Variable` to create individual variables $Value1, ...
    #       but it's usually easier to work with an array.
    $capturedSubstrings = foreach ($match in $_.Matches) { $match.Value }
    # Output the array elements in diagnostic form.
    $capturedSubstrings | ForEach-Object { "[$_]" }
  }

Note that -Pattern normally accepts an array of values, so passing -Pattern $patterns directly should work (albeit with subtly different behavior); as of PowerShell Core 6.1.0, however, it doesn't, due to a bug.

Caveat: The assumption is that your script uses the same newline style as $SourceFile (CRLF vs. LF-only); more work is needed if the two differ, which would surface as the last pattern (the multi-line one) not matching.
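
If the newline styles may differ, one possible workaround is to build the multi-line pattern so that it accepts either style. The following is only a minimal sketch: the function name ConvertTo-NewlineAgnosticPattern is hypothetical (it is not a built-in cmdlet), and the two-line literal is a shortened stand-in for the full text above.

```powershell
# Hypothetical helper (not a built-in): turn a literal, possibly multi-line
# string into a regex that matches both CRLF and LF-only input.
function ConvertTo-NewlineAgnosticPattern {
    param([Parameter(Mandatory)][string] $Literal)
    # Split on either newline style, escape each line separately, then
    # rejoin with '\r?\n', which matches CRLF as well as LF-only newlines.
    ($Literal -split '\r?\n' | ForEach-Object { [regex]::Escape($_) }) -join '\r?\n'
}

# Shortened stand-in for the multi-line literal; here it uses CRLF newlines.
$literal = "hold harmless letter) and a copy`r`nof the acquisition agreement"
$pattern = ConvertTo-NewlineAgnosticPattern $literal

# Both newline styles in the input now match.
"hold harmless letter) and a copy`nof the acquisition agreement" -match $pattern    # True
"hold harmless letter) and a copy`r`nof the acquisition agreement" -match $pattern  # True
```

The resulting string could then take the place of the last [regex]::Escape(...) element in $patterns.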

With a file containing the contents of $String above, this yields:

[ in the case of a single acquisition the Total Purchase Price ]
[ £5,000,000]
[ £10,000,000]
[a copy of any financial due diligence
reports obtained by the Group in relation to the Acquisition Target, on a non-reliance
basis (subject to the Agent and any other relevant Reliance Party signing any required
hold harmless letter) and a copy of the acquisition agreement under which the
Acquisition Target is to be acquired;]
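
As the code comment above mentions, the currency pattern could be made more stringent by matching digits and commas only. Here is a rough sketch of one way to do that; the variable names ($pound, $text, $strictAmount, $amounts) are illustrative, and the pattern assumes amounts with comma-separated thousands groups and an optional two-digit decimal part. Unlike ' £\S+', it does not capture the leading space. Writing the pound sign as \u00A3 is a precaution so that the pattern does not depend on the encoding of the script file.

```powershell
# Build a short sample from the question's text; [char]0x00A3 is the pound sign.
$pound = [char]0x00A3
$text  = "exceeds ${pound}5,000,000 (or its equivalent) but is less than or equal to ${pound}10,000,000 or its equivalent"

# Assumed format: pound sign, 1-3 digits, any number of ',ddd' groups,
# and optionally a decimal point followed by two digits.
$strictAmount = '\u00A3\d{1,3}(?:,\d{3})*(?:\.\d{2})?'

# Collect the matched substrings.
$amounts = [regex]::Matches($text, $strictAmount) | ForEach-Object { $_.Value }

# Output in the same diagnostic form as above.
$amounts | ForEach-Object { "[$_]" }
# [£5,000,000]
# [£10,000,000]
```

To use it with the solution above, ' £\S+' in $patterns could be replaced with the stricter pattern, with a leading space prepended if that space should be captured too.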
