I have a string that is actually a file path:
str='\\198.168.0.10\share\ccdfiles\UA-midd3-files\UA0001A_15_Jun_2014_08.17.49\Midd3\y12m05d25h03m16.midd3'
I need to extract the target substring 'UA0001A' with MATLAB (though I would expect the regex syntax to be much the same in other tools). It does not have to be exactly 'UA0001A'; it can be an arbitrary combination of letters and digits. To make this more general, the substring (or word) should satisfy the following:
- it is a word made up of letters and digits
- it cannot be purely letters or purely digits
- it cannot include 'midd', 'midd3', 'Midd3', 'MIDD3', etc., so a case-insensitive match could be used to exclude words beginning with 'midd'
- it cannot match 'y[0-9]{2,4}m[0-9]{1,2}d[0-9]{1,2}\w*'
How do I write a regular expression that finds the target substring?
Thanks in advance!
You can use
s = '\\198.168.0.10\share\ccdfiles\UA-midd3-files\UA0001A_15_Jun_2014_08.17.49\Midd3\y12m05d25h03m16.midd3';
res = regexp(s, '(?i)\\(?![^\W_]*(midd|y\d+m\d+))(?=[^\W_]*\d)(?=[^\W_]*[a-zA-Z])([^\W_]+)','tokens');
disp(res{1}{1})
See the regex demo
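Note that res{1}{1} will throw an error if nothing in the path matches. A minimal sketch of a guarded version follows, assuming you only want the first hit. The variable names pat, tok and word are just illustrative, and I made the inner alternation non-capturing with (?:...) so that the word is unambiguously the only token; that is a small tweak of mine, not part of the pattern above.

```matlab
s = '\\198.168.0.10\share\ccdfiles\UA-midd3-files\UA0001A_15_Jun_2014_08.17.49\Midd3\y12m05d25h03m16.midd3';

% Same pattern as above, except that the alternation inside the negative
% lookahead is non-capturing, so the word is the only capture group.
pat = '(?i)\\(?![^\W_]*(?:midd|y\d+m\d+))(?=[^\W_]*\d)(?=[^\W_]*[a-zA-Z])([^\W_]+)';

% With 'once', regexp stops at the first match and returns its tokens
% as a cell array of strings (an empty cell if there is no match).
tok = regexp(s, pat, 'tokens', 'once');

if isempty(tok)
    word = '';       % no path component satisfied all the conditions
else
    word = tok{1};   % text captured by the only capture group
end
disp(word)
```

If you need every qualifying component rather than the first one, drop 'once' and loop over the returned cell array instead.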
Pattern explanation:
(?i) - the case-insensitive modifier
\\ - a literal backslash
(?![^\W_]*(midd|y\d+m\d+)) - a negative lookahead that will fail the match if there is midd or y+digits+m+digits after 0+ letters or digits
(?=[^\W_]*\d) - a positive lookahead that requires at least 1 digit after 0+ digits or letters ([^\W_]*)
(?=[^\W_]*[a-zA-Z]) - there must be at least 1 letter after 0+ letters or digits
([^\W_]+) - Group 1 (what will be extracted) matching 1+ letters or digits (or 1+ characters other than non-word chars and _).
The 'tokens' "mode" will let you extract the captured value rather than the whole match.
See the IDEONE demo
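If the single pattern is hard to read, the same conditions can be checked one at a time. This is only a rough sketch of an alternative, not the regex above: it splits the path on backslashes, takes the leading run of letters/digits of each component, and tests each condition with its own small regex. The names parts, w and kept are made up for the example.

```matlab
s = '\\198.168.0.10\share\ccdfiles\UA-midd3-files\UA0001A_15_Jun_2014_08.17.49\Midd3\y12m05d25h03m16.midd3';

% Split on literal backslashes (in the regex, \\ is one backslash).
parts = regexp(s, '\\', 'split');

kept = {};
for k = 1:numel(parts)
    % Leading run of letters/digits, i.e. what [^\W_]+ would capture.
    w = regexp(parts{k}, '^[^\W_]+', 'match', 'once');
    if isempty(w)
        continue;   % empty component or one starting with a non-word char
    end
    hasDigit  = ~isempty(regexp(w, '\d', 'once'));             % at least 1 digit
    hasLetter = ~isempty(regexp(w, '[a-zA-Z]', 'once'));       % at least 1 letter
    hasMidd   = ~isempty(regexp(w, '(?i)midd', 'once'));       % contains midd, any case
    hasStamp  = ~isempty(regexp(w, '(?i)y\d+m\d+', 'once'));   % y+digits+m+digits
    if hasDigit && hasLetter && ~hasMidd && ~hasStamp
        kept{end+1} = w; %#ok<AGROW>
    end
end
disp(kept)
```

This is more verbose than the one-liner, but each rule from the question maps to one line, which makes it easier to adjust a single condition later.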