Regular expression for matching either or

sbolla

I am doing this in Notepad++.

Here's what my data looks like:

N|12345|JOHN|TAKÁCSI|blah|blah|
N|12466|PÉTER|VÁLI|blah|blah|
Y|45645|SÁNDAR|SÁKU|blah|blah|
N|89789|DÓRA|MERRY|blah|blah|


My regular expression: ^([N|Y]\|.*\|.*[^\x00-\x7F].*\|.*[^\x00-\x7F].*\|)

It matches only the rows that have non-ASCII (accented) characters in both the first and the last name.
It does not match rows where only one of the two names has such a character.

How can I match those rows as well?

DrCord

^[NY]\|\d{5}\|(?:[\w_]+[^\x00-\x7F]?[\w_]+\|){2}(?:[\w_]+[\x00-\x7F]?[\w_]+\|){2}$

matches:

N|12345|JOHN|TAKÁCSI|blah|blah|
N|12466|PÉTER|VÁLI|blah|blah|
Y|45645|SÁNDAR|SÁKU|blah|blah|
N|89789|DÓRA|MERRY|blah|blah|

does not match:

N|89789|DÓRA|MERRY|blah|blÓh|
N|89789|DoRA|MERRY|blaÓh|blah|
N|89789|DoRA|MERRY|blaÓh|blÓah|

Your pattern required both names to contain a non-ASCII character. In this version the non-ASCII character is optional in each name, so a row no longer needs one in both. I have also used parts of @HamZa's answer below to adapt this answer to your data set and requirements.
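
If the goal is to require a non-ASCII character in at least one of the two name fields, one option is to spell out both cases with alternation. The following is only a sketch in Python, not something tested in Notepad++. It assumes the layout shown in the question (flag, id, first name, last name, then other fields), and the names EITHER_NAME and name_has_non_ascii are made up for illustration.

```python
import re

# Assumed layout: flag|id|first name|last name|...
# [^|]* stays inside one field because it cannot cross a pipe.
EITHER_NAME = re.compile(r"""
    ^[NY]\|            # flag field
    [^|]*\|            # id field
    (?:
        [^|]*[^\x00-\x7F][^|]*\|[^|]*\|    # non-ASCII in the first name
      |
        [^|]*\|[^|]*[^\x00-\x7F][^|]*\|    # non-ASCII in the last name
    )
""", re.VERBOSE)


def name_has_non_ascii(line):
    """Return True if the first or last name field has a non-ASCII character."""
    return EITHER_NAME.match(line) is not None


if __name__ == "__main__":
    rows = [
        "N|12345|JOHN|TAKÁCSI|blah|blah|",
        "N|89789|DÓRA|MERRY|blah|blah|",
        "N|89789|DoRA|MERRY|blaÓh|blah|",
    ]
    for row in rows:
        print(name_has_non_ascii(row), row)
```

Written on one line, the same pattern is ^[NY]\|[^|]*\|(?:[^|]*[^\x00-\x7F][^|]*\|[^|]*\||[^|]*\|[^|]*[^\x00-\x7F][^|]*\|). It uses only character classes, groups and alternation, so it should carry over to most engines, but treat that as an assumption until you have tried it on your own file.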
