Regular expression to match two or more consecutive characters

Question

Using regular expressions, I want to match a word which

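starts with a letter
contains English letters,
digits, periods (.), hyphens (-) and underscores (_)
must not contain two or more consecutive periods, hyphens or underscores
may contain multiple periods, hyphens or underscores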

For example,

flin..stones or flin__stones or flin--stones

are not allowed.

fl_i_stones or fli_st.ones or flin.stones or flinstones

are allowed.

So far, my regular expression is ^[a-zA-Z][a-zA-Z\d._-]+$

So my question is how to do this using a regular expression.

Answer 1

Accepted answer

You can use a lookahead and a backreference to solve this. But note that right now you are requiring at least 2 characters: the starting letter and another one (due to the +). You probably want to make that + a * so that the second character class can be repeated 0 or more times:

^(?!.*(.)\1)[a-zA-Z][a-zA-Z\d._-]*$
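To see the effect of the quantifier change in isolation, here is a small sketch in Python. The choice of Python's `re` module is an assumption, since the question does not name a regex flavour, and the helper `accepts` is hypothetical. The lookahead is left out so that only the difference between `+` and `*` shows.

```python
import re

# Character-class part only, without the lookahead.
# PLUS needs at least one character after the leading letter; STAR does not.
PLUS = re.compile(r"^[a-zA-Z][a-zA-Z\d._-]+$")
STAR = re.compile(r"^[a-zA-Z][a-zA-Z\d._-]*$")

def accepts(pattern, word):
    """Return True if the compiled pattern matches the word."""
    return pattern.match(word) is not None

# A single letter is the case where the two quantifiers disagree.
for word in ["a", "ab", "a1", "1a"]:
    print(word, accepts(PLUS, word), accepts(STAR, word))
```

With `+`, a one-letter word such as `a` is rejected; with `*` it is accepted. Both versions still reject a word that starts with a digit.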

How does the lookahead work? Firstly, it's a negative lookahead. If the pattern inside finds a match, the lookahead causes the entire pattern to fail, and vice versa. So we need an inner pattern that matches if the string does contain two consecutive identical characters. First, we move to an arbitrary position in the string (.*), then we match a single (arbitrary) character (.) and capture it with the parentheses. Hence, that one character goes into capturing group 1. Then we require this capturing group to be followed by itself (referencing it with \1). So the inner pattern will check at every single position in the string (due to backtracking) whether there is a character that is followed by itself. If two such consecutive characters are found, the pattern fails. If they cannot be found, the engine jumps back to where the lookahead started (the beginning of the string) and continues with matching the actual pattern.
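The behaviour described above can be checked against the examples from the question. The sketch below uses Python's `re` module (an assumption, as no flavour is named) and a hypothetical helper `is_valid`. One thing it makes visible is that `(.)\1` rejects any doubled character, including doubled letters such as the `ll` in `hello`. If only the separators are meant to be restricted, which is an assumption about the asker's intent, the capture group can be narrowed to `([._-])`; that variant is not part of the original answer.

```python
import re

# Pattern from the answer: rejects ANY character that is immediately repeated.
ANY_REPEAT = re.compile(r"^(?!.*(.)\1)[a-zA-Z][a-zA-Z\d._-]*$")

# Hypothetical variant: only a repeated '.', '_' or '-' is rejected.
SEPARATOR_REPEAT = re.compile(r"^(?!.*([._-])\1)[a-zA-Z][a-zA-Z\d._-]*$")

def is_valid(pattern, word):
    """Return True if the compiled pattern matches the word."""
    return pattern.match(word) is not None

# Examples taken from the question.
rejected = ["flin..stones", "flin__stones", "flin--stones"]
allowed = ["fl_i_stones", "fli_st.ones", "flin.stones", "flinstones"]

for word in rejected + allowed + ["hello"]:
    print(word, is_valid(ANY_REPEAT, word), is_valid(SEPARATOR_REPEAT, word))
```

Both patterns agree on the question's examples and differ only on words like `hello`. Neither rejects two different separators next to each other, such as `a._b`.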

Alternatively, you can split this up into two separate checks. One for valid characters and the starting letter:

^[a-zA-Z][a-zA-Z\d._-]*$

And one for the consecutive characters (where you can invert the match result):

(.)\1

This would greatly increase the readability of your code (because it's less obscure than the lookahead), and it would also allow you to detect the actual problem with the input and return an appropriate and helpful error message.
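A minimal sketch of the two-check approach, again assuming Python's `re` module. The function name `validate` and the wording of the messages are hypothetical; only the two patterns come from the answer.

```python
import re

# Check 1: starting letter and allowed characters.
VALID_CHARS = re.compile(r"^[a-zA-Z][a-zA-Z\d._-]*$")
# Check 2: any character immediately followed by itself.
CONSECUTIVE = re.compile(r"(.)\1")

def validate(word):
    """Return None if the word is acceptable, otherwise an error message."""
    if not VALID_CHARS.match(word):
        return "must start with a letter and contain only letters, digits, '.', '_' or '-'"
    repeat = CONSECUTIVE.search(word)
    if repeat:
        # The match result is inverted: finding a repeat means the word is invalid.
        return f"character {repeat.group(1)!r} is repeated at position {repeat.start()}"
    return None

for word in ["flinstones", "flin..stones", "1flinstones"]:
    print(word, "->", validate(word))
```

Because each check fails separately, the caller can tell an illegal character apart from a doubled one, which a single combined pattern cannot report.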
