I'm a beginner to regex and I am trying to make an expression to find if there are two of the same digits next to each other, and the digit behind and in front of the pair is different.
For example,
123456678 should match as there is a double 6,
1234566678 should not match as there is no double with different surrounding numbers.
12334566 should match because there are two 3s.
So far I have this, which works only with 1 and only as long as the double is not at the start or end of the string. However, I can deal with that by adding a letter at the start and end.
^.*([^1]11[^1]).*$
I know I can use [0-9] instead of the 1s, but the problem is having them all be the same digit.
Thank you!
With regex, it is much more convenient to use the PyPI regex module with a (*SKIP)(*FAIL) based pattern:
import regex
rx = r'(\d)\1{2,}(*SKIP)(*F)|(\d)\2'
l = ["123456678", "1234566678"]
for s in l:
    print(s, bool(regex.search(rx, s)))
See the Python demo. Output:
123456678 True
1234566678 False
Regex details
(\d)\1{2,}(*SKIP)(*F) - a digit and then two or more occurrences of the same digit
| - or
(\d)\2 - a digit and then the same digit.
The point is to match all chunks of 3 or more identical digits and skip them, and then match a chunk of two identical digits.
See the regex demo.
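If installing the third-party regex module is not an option, the same "skip the long runs, keep the pairs" idea can be approximated with the standard re module by walking over every run of identical digits and checking its length. This is only a minimal sketch, not the pattern above: has_exact_pair is a hypothetical helper name, and it assumes a pair at the very start or end of the string also counts.

```python
import re


def has_exact_pair(s):
    """Return True if s contains a run of exactly two identical digits."""
    # (\d)\1* matches one maximal run of a repeated digit at a time,
    # so a run of three or more is consumed whole and never seen as a pair.
    runs = (m.group(0) for m in re.finditer(r"(\d)\1*", s))
    return any(len(run) == 2 for run in runs)


for s in ["123456678", "1234566678", "12334566"]:
    print(s, has_exact_pair(s))
```

For these three strings it prints True, False and True, in that order.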
You can also use a simpler approach with the built-in re module.
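import re

l = ["123456678",
     "1234566678",
     "12334566 "]
for i in l:
    # Each match is (whole run, repeated character); a run has length >= 2.
    matches = re.findall(r"((.)\2+)", i)
    # True when at least one run is exactly two characters long.
    if any(len(x[0]) == 2 for x in matches):
        print("{}-->{}".format(i, True))
    else:
        print("{}-->{}".format(i, False))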
You can customize this based on your rules.
Output:
123456678-->True
1234566678-->False
12334566 -->True
Inspired by the answer of Wiktor Stribiżew, another variation using an alternation with re is to check for the existence of the capturing group that contains a positive match for 2 of the same digits not surrounded by the same digit.
In this case, check for group 3.
((\d)\2{2,})|\d(\d)\3(?!\3)\d
( - start of capture group 1
(\d)\2{2,} - capture group 2, match 1 digit and repeat that same digit 2+ times
) - close group 1
| - or
\d(\d) - match a digit, then capture a digit in group 3
\3(?!\3)\d - match the same digit as in group 3, then match a 4th digit, which should not be the same as the group 3 digit

For example
import re
pattern = r"((\d)\2{2,})|\d(\d)\3(?!\3)\d"
strings = ["123456678", "12334566", "12345654554888", "1221", "1234566678", "1222", "2221", "66", "122", "221", "111"]
for s in strings:
    match = re.search(pattern, s)
    if match and match.group(3):
        print("Match: " + match.string)
    else:
        print("No match: " + s)
Output
Match: 123456678
Match: 12334566
Match: 12345654554888
Match: 1221
No match: 1234566678
No match: 1222
No match: 2221
No match: 66
No match: 122
No match: 221
No match: 111
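If a pair at the start or end of the string should also match (for example in a string of only 2 or 3 digits such as 66, 122 or 221), you could use this pattern and check for group 2 instead: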
(\d)\1{2,}|(\d)\2
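A small sketch of how the group 2 check could look with that pattern. It uses re.finditer rather than re.search so that a pair following an earlier run of 3 or more digits is still found. The names PAIR_OR_LONGER and has_pair are made up for this example.

```python
import re

# First alternative consumes runs of 3+ identical digits (group 1),
# second alternative captures a run of exactly two (group 2).
PAIR_OR_LONGER = re.compile(r"(\d)\1{2,}|(\d)\2")


def has_pair(s):
    """Return True if any match in s was made by the second alternative."""
    # group(2) is None whenever the match came from the first alternative.
    return any(m.group(2) is not None for m in PAIR_OR_LONGER.finditer(s))


strings = ["123456678", "1234566678", "66", "122", "221", "111", "1112233"]
for s in strings:
    print(("Match: " if has_pair(s) else "No match: ") + s)
```

With these inputs, 1234566678 and 111 are reported as "No match" and the rest as "Match". Note that 1112233 matches here only because every match is inspected, not just the first one.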