kb:zapisi:regex_main [2023/08/15 14:45] – [Greedy and Lazy match] milano | kb:zapisi:regex_main [2023/08/16 09:11] (current) – [Pogled napred i pogled iza — (?=) i (?<=)] milano |
---|
Quantifiers (''* + {}'') are **greedy operators**, so they expand the match as far as they can through the given text. | Quantifiers (''* + {}'') are **greedy operators**, so they expand the match as far as they can through the given text. |
| |
For example, ''<.+>'' matches ''<nowiki><div>simple div</div></nowiki>'' in ''This is a <nowiki><div>simple div</div></nowiki> test''. To capture only the ''div'' tags, we can append ''?'' to the quantifier to make it lazy: \\ | For example, ''<.+>'' matches ''<nowiki><div>simple div</div></nowiki>'' in ''This is a <nowiki><div>simple div</div></nowiki> test''. To capture only the ''div'' tags, we can append ''?'' to the quantifier to make it lazy: \\ |
''<.+?>'' matches any character **one or more** times between ''<'' and ''>'', **expanding only as needed**. \\ | ''<.+?>'' matches any character **one or more** times between ''<'' and ''>'', **expanding only as needed**. \\ |
| |
Note that a better solution avoids ''.'' in favor of a stricter regex: | Note that a better solution avoids ''.'' in favor of a stricter regex: |
''<[^<>]+>'' matches one or more characters other than ''<'' or ''>'', enclosed between ''<'' and ''>''. | ''<[^<>]+>'' matches one or more characters other than ''<'' or ''>'', enclosed between ''<'' and ''>''. |
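
The three patterns above can be compared side by side. The following is a minimal sketch using Python's standard ''re'' module; the variable names are chosen here for illustration and are not part of the original notes.

```python
import re

# Sample text taken from the example above.
text = "This is a <div>simple div</div> test"

# Greedy: .+ runs to the end of the line, then backtracks to the LAST '>'.
greedy = re.findall(r"<.+>", text)

# Lazy: .+? stops at the FIRST '>' that lets the pattern succeed.
lazy = re.findall(r"<.+?>", text)

# Strict: the character class can never cross a '<' or '>'.
strict = re.findall(r"<[^<>]+>", text)

print(greedy)  # ['<div>simple div</div>']
print(lazy)    # ['<div>', '</div>']
print(strict)  # ['<div>', '</div>']
```

The lazy and strict versions return the same result here, but the strict one does not rely on backtracking and cannot accidentally swallow a stray ''<''.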
| |
---- | ---- |
| |
====== Advanced topics ====== | ====== Advanced topics ====== |
| |
===== Boundaries — \b and \B ===== | ===== Boundaries — \b and \B ===== |
''\babc\b'' performs a **"whole words only"** search. \\ \\ | ''\babc\b'' performs a **"whole words only"** search. \\ \\ |
| |
''\b'' is an **anchor, like the caret** (similar to ''$'' and ''^''): it matches positions where **one side** is a word character (like ''\w'') and the **other side** is not a word character (for instance, the beginning of the string or a space character). | ''\b'' is an **anchor, like the caret** (similar to ''$'' and ''^''): it matches positions where **one side** is a word character (like ''\w'') and the **other side** is not a word character (for instance, the beginning of the string or a space character). |
| |
It comes with its **negation**, ''\B'', which matches every position where ''\b'' does not match. It is useful when we want to find a search pattern fully surrounded by word characters. | It comes with its **negation**, ''\B'', which matches every position where ''\b'' does not match. It is useful when we want to find a search pattern fully surrounded by word characters. |
''\Babc\B'' matches only if the pattern is **fully surrounded** by word characters. | ''\Babc\B'' matches only if the pattern is **fully surrounded** by word characters. |
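
To see where ''\b'' and ''\B'' actually match, the sketch below records the start offset of each hit. It assumes Python's ''re'' module; the sample string is made up for illustration.

```python
import re

# Hypothetical sample: "abc" appears alone, as a prefix, and embedded.
text = "abc abcd xabcx abc"

# \babc\b: "abc" must have a word boundary on both sides.
whole_words = [m.start() for m in re.finditer(r"\babc\b", text)]

# \Babc\B: "abc" must have word characters on both sides.
embedded = [m.start() for m in re.finditer(r"\Babc\B", text)]

print(whole_words)  # [0, 15]  (the two standalone "abc" words)
print(embedded)     # [10]     (the "abc" inside "xabcx")
```

The ''abc'' in ''abcd'' matches neither pattern: it has a boundary on its left side only.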
| ===== Back-references — \1 ===== |
| |
===== Back-references — \1 ===== | ''([abc])\1'': ''\1'' matches **the same** text that was matched by the **first capturing group**. \\ |
| ''([abc])([de])\2\1'': we can use ''\2'' (''\3'', ''\4'', etc.) to match **the same** text that **was matched by the second** (third, fourth, etc.) capturing group. \\ |
| ''(?<foo>[abc])\k<foo>'': we name the group ''foo'' and reference it later with ''\k<foo>''. The result is the same as with the first regex. |
| |
''([abc])\1'': ''\1'' matches **the same** text that was matched by the **first capturing group**. \\ | |
''([abc])([de])\2\1'': we can use ''\2'' (''\3'', ''\4'', etc.) to match **the same** text that **was matched by the second** (third, fourth, etc.) capturing group. \\ | |
''(?<foo>[abc])\k<foo>'': we name the group ''foo'' and reference it later with ''\k<foo>''. The result is the same as with the first regex. | |
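
A short sketch of the three back-reference forms, assuming Python's ''re'' module. Python spells named groups differently from the notation used above: ''(?P<foo>...)'' defines the group and ''(?P=foo)'' refers back to it, so the third pattern is adapted accordingly. The sample strings are invented for illustration.

```python
import re

# \1 must repeat exactly what group 1 captured, not just any of [abc].
numbered = re.search(r"([abc])\1", "xxbbyy")
no_repeat = re.search(r"([abc])\1", "abc")  # "ab" and "bc" are not repeats

# Two groups referenced in reverse order: a, d, then d again, then a again.
mirrored = re.search(r"([abc])([de])\2\1", "zadday")

# Named group, written in Python's syntax for (?<foo>[abc])\k<foo>.
named = re.search(r"(?P<foo>[abc])(?P=foo)", "xxbbyy")

print(numbered.group(0))  # bb
print(no_repeat)          # None
print(mirrored.group(0))  # adda
print(named.group("foo")) # b
```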
| |
| ===== Look-ahead and Look-behind — (?=) and (?<=) ===== |
| ''d(?=r)'' matches a ''d'' only if it is **followed by r**, but ''r'' will not be part of the overall regex match. \\ |
| ''(?<=r)d'' matches a ''d'' only if it is **preceded by an r**, but ''r'' will not be part of the overall regex match. \\ |
| |
===== Look-ahead and Look-behind — (?=) and (?<=) ===== | You can also use the negation operator! |
''d(?=r)'' matches a ''d'' only if it is **followed by r**, but ''r'' will not be part of the overall regex match. \\ | |
''(?<=r)d'' matches a ''d'' only if it is **preceded by an r**, but ''r'' will not be part of the overall regex match. \\ | |
| |
You can also use the negation operator! | |
| |
''d(?!r)'' matches a ''d'' only if it is **not followed by r**; the look-ahead itself consumes no characters. \\ | |
''(?<!r)d'' matches a ''d'' only if it is **not preceded by an r**; the look-behind itself consumes no characters. | |
| |
| ''d(?!r)'' matches a ''d'' only if it is **not followed by r**; the look-ahead itself consumes no characters. \\ |
| ''(?<!r)d'' matches a ''d'' only if it is **not preceded by an r**; the look-behind itself consumes no characters. |
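
The four look-around forms can be checked against one string. This is a minimal sketch assuming Python's ''re'' module; the sample text and the helper name ''starts'' are hypothetical and only serve the illustration.

```python
import re

# Hypothetical sample: "d" appears at offsets 0, 4, 6 and 10.
text = "dr rd dx xd"


def starts(pattern: str) -> list[int]:
    """Return the start offset of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]


followed_by_r = starts(r"d(?=r)")       # positive look-ahead
preceded_by_r = starts(r"(?<=r)d")      # positive look-behind
not_followed_by_r = starts(r"d(?!r)")   # negative look-ahead
not_preceded_by_r = starts(r"(?<!r)d")  # negative look-behind

print(followed_by_r)      # [0]
print(preceded_by_r)      # [4]
print(not_followed_by_r)  # [4, 6, 10]
print(not_preceded_by_r)  # [0, 6, 10]

# The look-around only tests a condition: the matched text is just "d".
matched_text = re.search(r"d(?=r)", text).group(0)
print(matched_text)       # d
```

Note that the negative forms also succeed at the edges of the string: the ''d'' at offset 0 has nothing before it, so it is "not preceded by r", and the final ''d'' has nothing after it, so it is "not followed by r".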