Published on 2025-01-31

Removing Emojis from Strings in Python

The code in the question fails because it contains syntax errors. On Python 2, Unicode string literals must carry the u'' prefix; Python 3.3 and later accept the prefix too, although plain string literals there are already Unicode. In addition, pass the re.UNICODE flag to the regular expression, and decode the input data to Unicode, for example with codecs:

import codecs
import re

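# Simulate input that arrives as UTF-8 bytes, then decode it to a Unicode string.
# The u'' prefix makes Python 2 interpret the \U escape as well.
text = codecs.decode(u'This dog \U0001f602'.encode('UTF-8'), 'UTF-8')
print(text)  # with emoji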

emoji_pattern = re.compile(
    u"["
    u"\U0001F600-\U0001F64F"  # emoticons
    u"\U0001F300-\U0001F5FF"  # symbols & pictographs
    u"\U0001F680-\U0001F6FF"  # transport & map symbols
    u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
    u"]+", flags=re.UNICODE)
print(emoji_pattern.sub(u'', text))  # no emoji


Output:

This dog 😂
This dog

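Note: This pattern matches only the four ranges listed above, so many emojis pass through untouched. For more complete coverage, extend the character class with further ranges from the Unicode Standard, such as Miscellaneous Symbols (U+2600 to U+26FF), Dingbats (U+2700 to U+27BF), and Supplemental Symbols and Pictographs (U+1F900 to U+1F9FF).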
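
As a rough illustration of extending the pattern, here is a minimal sketch that assumes Python 3, where string literals are already Unicode and re.UNICODE is the default for str patterns. The helper name remove_emojis and the choice of extra ranges are assumptions made for this example, not part of the original answer, and the list is still not exhaustive.

```python
import re

# Illustrative, non-exhaustive set of ranges; extend it as needed.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F300-\U0001F5FF"  # symbols & pictographs
    "\U0001F680-\U0001F6FF"  # transport & map symbols
    "\U0001F1E0-\U0001F1FF"  # regional indicators (flags)
    "\U0001F900-\U0001F9FF"  # supplemental symbols & pictographs
    "\u2600-\u26FF"          # miscellaneous symbols
    "\u2700-\u27BF"          # dingbats
    "\uFE0F"                 # variation selector-16 (emoji presentation)
    "\u200D"                 # zero width joiner used in emoji sequences
    "]+"
)


def remove_emojis(text):
    """Return text with every run of characters matched by EMOJI_PATTERN removed.

    Hypothetical helper for this example only.
    """
    return EMOJI_PATTERN.sub("", text)


print(remove_emojis("This dog \U0001F602"))
print(remove_emojis("Face palm \U0001F926 and sun \u2600\uFE0F"))
```

Whitespace around a removed emoji is left as it was, so call strip() or collapse repeated spaces afterwards if that matters for your data.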

