Python RegEx


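A RegEx, or regular expression, is a sequence of characters that forms a search pattern.

RegEx can be used to check whether a string contains the specified search pattern.

RegEx Module

Python has a built-in module called re, which can be used to work with regular expressions.

Import the re module: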

import re

RegEx in Python

Once you have imported the re module, you can start using regular expressions.

Example

Search the string to see if it starts with "The" and ends with "Spain":

import re

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)
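The example above stores the result in x without displaying it. As a minimal sketch, the result could be checked like this, relying on the fact that re.search() returns None when nothing matches (the printed messages are illustrative, not part of the re module):

```python
import re

txt = "The rain in Spain"
x = re.search(r"^The.*Spain$", txt)

# re.search() returns a Match object (truthy) or None (falsy).
if x:
    print("YES! We have a match!")
else:
    print("No match")
```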

RegEx Functions

The re module offers a set of functions that allow us to search a string for a match:

Function	Description
findall	Returns a list containing all matches
search	Returns a Match object if there is a match anywhere in the string
split	Returns a list where the string has been split at each match
sub	Replaces one or many matches with a string
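To give a feel for how the four functions in the table differ, here is a small sketch that applies each of them to the same string. The variable names and the "-" replacement are chosen only for illustration.

```python
import re

txt = "The rain in Spain"

all_matches = re.findall(r"ai", txt)  # every non-overlapping match, as a list
first_match = re.search(r"ai", txt)   # Match object for the first match only
pieces = re.split(r"\s", txt)         # split at each white-space character
replaced = re.sub(r"\s", "-", txt)    # replace each white-space character

print(all_matches)         # ['ai', 'ai']
print(first_match.span())  # (5, 7)
print(pieces)              # ['The', 'rain', 'in', 'Spain']
print(replaced)            # The-rain-in-Spain
```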

Metacharacters

Metacharacters are characters with a special meaning:

Character	Description	Example
[]	A set of characters	"[a-m]"
\	Signals a special sequence (can also be used to escape special characters)	"\d"
.	Any character (except newline character)	"he..o"
^	Starts with	"^hello"
$	Ends with	"planet$"
*	Zero or more occurrences	"he.*o"
+	One or more occurrences	"he.+o"
?	Zero or one occurrences	"he.?o"
{}	Exactly the specified number of occurrences	"he{2}o"
|	Either or	"falls|stays"
()	Capture and group
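A few of the metacharacters from the table can be tried out on a short string. This is only a sketch: the sample text "hello planet" and the alternation pattern "hello|world" are assumptions made for the demonstration.

```python
import re

txt = "hello planet"

in_set = re.findall(r"[a-m]", txt)        # characters between a and m
any_two = re.findall(r"he..o", txt)       # "he", any two characters, then "o"
at_start = re.findall(r"^hello", txt)     # only at the start of the string
at_end = re.findall(r"planet$", txt)      # only at the end of the string
either = re.findall(r"hello|world", txt)  # one alternative or the other
groups = re.search(r"(\w+) (\w+)", txt).groups()  # two captured groups

print(in_set)    # ['h', 'e', 'l', 'l', 'l', 'a', 'e']
print(any_two)   # ['hello']
print(at_start)  # ['hello']
print(at_end)    # ['planet']
print(either)    # ['hello']
print(groups)    # ('hello', 'planet')
```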

Special Sequences

A special sequence is a \ followed by one of the characters in the list below, and has a special meaning:

Character	Description	Example
\A	Returns a match if the specified characters are at the beginning of the string	"\AThe"
\b	Returns a match where the specified characters are at the beginning or at the end of a word (the "r" in the beginning is making sure that the string is being treated as a "raw string")	r"\bain" r"ain\b"
\B	Returns a match where the specified characters are present, but NOT at the beginning (or at the end) of a word (the "r" in the beginning is making sure that the string is being treated as a "raw string")	r"\Bain" r"ain\B"
\d	Returns a match where the string contains digits (numbers from 0-9)	"\d"
\D	Returns a match where the string DOES NOT contain digits	"\D"
\s	Returns a match where the string contains a white space character	"\s"
\S	Returns a match where the string DOES NOT contain a white space character	"\S"
\w	Returns a match where the string contains any word characters (characters from a to z and A to Z, digits from 0-9, and the underscore _ character)	"\w"
\W	Returns a match where the string DOES NOT contain any word characters	"\W"
\Z	Returns a match if the specified characters are at the end of the string	"Spain\Z"
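The sketch below tries several of the special sequences on one string and also shows why the table recommends raw strings for \b. The sample text, including the trailing "2024", is an assumption added so that \d has something to match.

```python
import re

txt = "The rain in Spain 2024"

at_begin = re.findall(r"\AThe", txt)   # "The" at the very beginning
word_start = re.findall(r"\bain", txt) # "ain" at the start of a word
word_end = re.findall(r"ain\b", txt)   # "ain" at the end of a word
digits = re.findall(r"\d", txt)        # each single digit
chunks = re.findall(r"\S+", txt)       # runs of non-white-space characters
at_finish = re.findall(r"2024\Z", txt) # "2024" at the very end

print(at_begin)    # ['The']
print(word_start)  # []
print(word_end)    # ['ain', 'ain']
print(digits)      # ['2', '0', '2', '4']
print(chunks)      # ['The', 'rain', 'in', 'Spain', '2024']
print(at_finish)   # ['2024']

# Without the r prefix, "\b" is a single backspace character,
# not the two-character sequence the regex engine expects.
print(len("\b"), len(r"\b"))  # 1 2
```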

Sets

A set is a set of characters inside a pair of square brackets [] with a special meaning:

Set	Description
[arn]	Returns a match where one of the specified characters (`a`, `r`, or `n`) are present
[a-n]	Returns a match for any lower case character, alphabetically between `a` and `n`
[^arn]	Returns a match for any character EXCEPT `a`, `r`, and `n`
[0123]	Returns a match where any of the specified digits (`0`, `1`, `2`, or `3`) are present
[0-9]	Returns a match for any digit between `0` and `9`
[0-5][0-9]	Returns a match for any two-digit number from `00` to `59`
[a-zA-Z]	Returns a match for any character alphabetically between `a` and `z`, lower case OR upper case
[+]	In sets, `+`, `*`, `.`, `|`, `()`, `$`, `{}` have no special meaning, so `[+]` means: return a match for any `+` character in the string
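The sets in the table can be checked with findall(). The following is a sketch only; all of the sample strings are assumptions chosen to keep the output short.

```python
import re

only_i = re.findall(r"[^arn]", "rain")       # everything except a, r and n
letters = re.findall(r"[a-zA-Z]", "A1b2")    # lower case or upper case letters
single = re.findall(r"[0-9]", "8 times before 11:45")      # each single digit
pairs = re.findall(r"[0-5][0-9]", "8 times before 11:45")  # 00 through 59
plus = re.findall(r"[+]", "1+1=2")           # a literal + inside a set

print(only_i)   # ['i']
print(letters)  # ['A', 'b']
print(single)   # ['8', '1', '1', '4', '5']
print(pairs)    # ['11', '45']
print(plus)     # ['+']
```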

The findall() Function

The findall() function returns a list containing all matches.

Example

Print a list of all matches:

import re

txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

The list contains the matches in the order they are found.

If no matches are found, an empty list is returned.

Example

Return an empty list if no match was found:

import re

txt = "The rain in Spain"
x = re.findall("Portugal", txt)
print(x)
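Because findall() always returns a list, even an empty one, its result can be passed straight to len() or a loop without any special handling. A brief sketch, using an illustrative pattern that finds words ending in "ain":

```python
import re

txt = "The rain in Spain"
words = re.findall(r"\w+ain", txt)  # illustrative pattern: words ending in "ain"

print(len(words))   # 2
for word in words:  # matches arrive in left-to-right order
    print(word)
```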

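The search() Function

The search() function searches the string for a match, and returns a Match object if there is a match.

If there is more than one match, only the first occurrence of the match is returned.

Example

Search for the first white-space character in the string: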

import re

txt = "The rain in Spain"
x = re.search(r"\s", txt)  # raw string avoids an invalid-escape warning for \s

print("The first white-space character is located in position:", x.start())

If no matches are found, the value None is returned.

Example

Make a search that returns no match:

import re

txt = "The rain in Spain"
x = re.search("Portugal", txt)
print(x)
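Since search() can return None, calling a method such as x.start() on the result without checking it first would raise an AttributeError. One common way to guard against that is sketched below (the messages are illustrative):

```python
import re

txt = "The rain in Spain"
x = re.search(r"Portugal", txt)

# x is None here, so x.start() must not be called unconditionally.
if x is None:
    print("No match")
else:
    print("Match starts at position:", x.start())
```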

The split() Function

The split() function returns a list where the string has been split at each match.

Example

Split at each white-space character:

import re

txt = "The rain in Spain"
x = re.split(r"\s", txt)  # raw string avoids an invalid-escape warning for \s
print(x)

You can control the number of splits by specifying the maxsplit parameter.

Example

Split the string only at the first occurrence:

import re

txt = "The rain in Spain"
x = re.split(r"\s", txt, maxsplit=1)  # split once, at the first white space only
print(x)
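Because split() cuts the string at each individual match, consecutive white-space characters produce empty strings in the result. The sketch below uses a deliberately irregular sample string (an assumption for this demonstration) and shows that matching runs of white space with \s+ avoids the empty entries:

```python
import re

txt = "The  rain in   Spain"  # irregular spacing, for illustration

per_char = re.split(r"\s", txt)   # one split per white-space character
per_run = re.split(r"\s+", txt)   # one split per run of white space

print(per_char)  # ['The', '', 'rain', 'in', '', '', 'Spain']
print(per_run)   # ['The', 'rain', 'in', 'Spain']
```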

The sub() Function

The sub() function replaces the matches with the text of your choice.

Example

Replace every white-space character with the number 9:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)  # raw string avoids an invalid-escape warning for \s
print(x)

You can control the number of replacements by specifying the count parameter.

Example

Replace the first 2 occurrences:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt, count=2)  # replace only the first two matches
print(x)

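Match Object

A Match object is an object containing information about the search and the result.

Note: If there is no match, the value None is returned instead of a Match object.

Example

Do a search that will return a Match object: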

import re

txt = "The rain in Spain"
x = re.search("ai", txt)
print(x)  # prints a representation of the Match object

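The Match object has properties and methods used to retrieve information about the search and the result:

.span() returns a tuple containing the start and end positions of the match.
.string returns the string passed into the function.
.group() returns the part of the string where there was a match.

Example

Print the position (start and end position) of the first match occurrence.

The regular expression looks for any word that starts with an upper case "S":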

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())

Example

Print the string passed into the function:

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)

Example

Print the part of the string where there was a match.

The regular expression looks for any word that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())

Note: If there is no match, the value None is returned instead of a Match object.
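The three members relate to each other in a simple way: slicing .string with the positions from .span() gives back the same text as .group(). A short sketch of that relationship, with a None check added as a precaution:

```python
import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)

if x is not None:
    start, end = x.span()
    print(start, end)           # 12 17
    print(x.string[start:end])  # Spain
    # Slicing the original string with the span recovers the matched text.
    print(x.string[start:end] == x.group())  # True
```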
