How can I correctly capture an optional file extension with a regular expression?
I am trying to match a string with an optional component, but something looks wrong.
(The strings being matched come from a printer log.)
My regex (.NET flavor) is as follows:

    .*(header_\d{10,11}_).*(_.*_\d{8}).*(\.\w{3,4}).*

    .*             # Ignore any garbage at the front
    (header_       # Match the start of the file name,
    \d{10,11}_)    # including the ID (10 to 11 digits)
    .*             # Ignore the type code in the middle
    (_.*_\d{8})    # Match some random characters, then an 8-digit date
    .*             # Ignore anything between this and the file extension
    (\.\w{3,4})    # Match the file extension, 3 or 4 characters long
    .*             # Ignore the rest of the string
I expect it to match strings like these:

    str1 = "header_0000000602_t_mc2e1nrobr1a3s55niyrrqvy_20081212[1].doc [Compatibility Mode]"
    str2 = "Microsoft PowerPoint - header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1].txt"
    str3 = "header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1]"
The capture groups should return something like this:

    $1 = header_0000000602_
    $2 = _mc2e1nrobr1a3s55niyrrqvy_20081212
    $3 = .doc

$3 can be empty if no file extension is found. $3 is the optional part, as you can see in str3 above.
If I add "?" to the end of the third capture group, "(\.\w{3,4})?", the regex no longer captures $3 for any string. If I add "+" instead, "(\.\w{3,4})+", the regex no longer matches str3 at all, which is to be expected.
I feel that adding "?" to the end of the third capture group is the right thing to do, but it does not work as I expect. I am probably being too naive with the ".*" sections that I use to ignore parts of the string.
Doesn't work as expected:

    .*(header_\d*_).*(_.*_.{8}).*(\.\w{3,4})?.*
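The behaviour described in the question can be reproduced with a small sketch. It uses Python's `re` module rather than .NET (an assumption made for illustration; the constructs involved, greedy `.*`, `{m,n}` and optional groups, backtrack the same way in both flavors). The sample strings are the three from the question with whitespace normalised, and the names `required`, `optional` and `captures` are hypothetical helpers.

```python
import re

# The three sample strings from the question (whitespace normalised, an assumption).
samples = [
    "header_0000000602_t_mc2e1nrobr1a3s55niyrrqvy_20081212[1].doc [Compatibility Mode]",
    "Microsoft PowerPoint - header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1].txt",
    "header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1]",
]

# Original pattern: the extension group is mandatory.
required = re.compile(r".*(header_\d{10,11}_).*(_.*_\d{8}).*(\.\w{3,4}).*")

# Same pattern with "?" appended to the extension group.
optional = re.compile(r".*(header_\d{10,11}_).*(_.*_\d{8}).*(\.\w{3,4})?.*")


def captures(pattern, text):
    """Return the tuple of capture groups, or None if the pattern does not match."""
    m = pattern.match(text)
    return m.groups() if m else None


required_results = [captures(required, s) for s in samples]
optional_results = [captures(optional, s) for s in samples]

for s, req, opt in zip(samples, required_results, optional_results):
    print(s)
    print("  required:", req)
    print("  optional:", opt)
```

With the mandatory group, str1 and str2 yield an extension and str3 does not match at all. With the optional group, all three strings match but the third group is never set: the greedy `.*` in front of it has already consumed the rest of the string, and because the group is allowed to match nothing, the engine never needs to backtrack. Python reports such a group as `None`; in .NET the group's `Success` property would be false.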
One possibility is that the .* just before the extension group is greedy. You could try changing it to a lazy match:

    .*(header_\d*_).*(_.*_.{8}).*?(\.\w{3,4})?.*
                                 ^ added that
That wasn't right. This one will match the input you provided, but it assumes that the first . after the date is the start of the file extension:

    .*(header_\d*_).*(_.*_.{8})[^.]*(\.\w{3,4})?.*
Edit: Removed the escaping.
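To see why the lazy variant fails while the `[^.]*` variant works, here is a minimal sketch that runs both against the three sample strings. It again uses Python's `re` module as a stand-in for .NET (an assumption; these constructs behave the same way in both), and the names `lazy`, `fixed` and `groups_for` are hypothetical.

```python
import re

# The three sample strings from the question (whitespace normalised, an assumption).
samples = [
    "header_0000000602_t_mc2e1nrobr1a3s55niyrrqvy_20081212[1].doc [Compatibility Mode]",
    "Microsoft PowerPoint - header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1].txt",
    "header_00000000076_d_al41zguyvgqfj2454jki5l55_20071203[1]",
]

# First attempt: make the ".*" before the extension group lazy.
lazy = re.compile(r".*(header_\d*_).*(_.*_.{8}).*?(\.\w{3,4})?.*")

# Second attempt: skip only non-dot characters before the extension group.
fixed = re.compile(r".*(header_\d*_).*(_.*_.{8})[^.]*(\.\w{3,4})?.*")


def groups_for(pattern, text):
    """Return the capture groups for text, or None if there is no match."""
    m = pattern.match(text)
    return m.groups() if m else None


lazy_results = [groups_for(lazy, s) for s in samples]
fixed_results = [groups_for(fixed, s) for s in samples]

for s, lz, fx in zip(samples, lazy_results, fixed_results):
    print(s)
    print("  lazy :", lz)
    print("  fixed:", fx)
```

In the lazy version, `.*?` first tries to match nothing, the optional group then fails at `[` and is skipped, and the trailing `.*` swallows the rest, so the extension is still never captured. In the second version, `[^.]*` cannot cross a dot, so the engine arrives exactly at `.doc` or `.txt` and the optional group gets its chance to match; str3 has no dot, so its third group stays unset. As the answer notes, this relies on the first dot after the date being the start of the extension.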