“Python remove stop words” Code Answer

Python remove stop words

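import nltk
from nltk.corpus import stopwords

nltk.download("stopwords")  # only needed once
text = "This is an example sentence"  # example input; replace with your own text
stop = set(stopwords.words("english"))
# keep the lowercased words that are not in the stop word set
filtered_words = [word.lower() for word in text.split() if word.lower() not in stop]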
Plif Plouf
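One thing to watch with text.split(): punctuation stays attached to the words, so a token like "the," will not match "the" in the stop set. Below is a minimal sketch of one way around that, assuming it is acceptable to strip leading and trailing punctuation. The STOP set and the remove_stop_words helper are hypothetical names used only for illustration; in practice you would pass in set(stopwords.words("english")).

```python
import string

# Hypothetical, hand-picked stop set for illustration only.
# In real use, pass set(stopwords.words("english")) from NLTK instead.
STOP = {"the", "is", "a", "of", "and", "on"}

def remove_stop_words(text, stop=STOP):
    """Lowercase each word, strip surrounding punctuation, drop stop words."""
    cleaned = (word.strip(string.punctuation).lower() for word in text.split())
    # skip tokens that were pure punctuation (empty after stripping)
    return [word for word in cleaned if word and word not in stop]

print(remove_stop_words("The cat, and the dog, sat on a mat."))
# ['cat', 'dog', 'sat', 'mat']
```

Note that str.strip only removes punctuation at the ends of a token, so words like "don't" keep their apostrophe.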

Python remove everything except numbers

>>> import re
>>> re.sub(r'\D', '', 'aas30dsa20')  # raw string so \D is not treated as a string escape
'3020'
Dead Dingo
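Removing every non-digit glues the remaining digits together ('30' and '20' become '3020'). If the separate numbers matter, re.findall may be a better fit. A small sketch of both approaches follows; digits_only and extract_numbers are hypothetical helper names, not part of the re module.

```python
import re

def digits_only(s):
    """Delete every non-digit character, concatenating what is left."""
    return re.sub(r"\D", "", s)

def extract_numbers(s):
    """Return each run of consecutive digits as a separate integer."""
    return [int(match) for match in re.findall(r"\d+", s)]

print(digits_only("aas30dsa20"))      # 3020
print(extract_numbers("aas30dsa20"))  # [30, 20]
```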

How to remove stop words in Python

# You need a set of stop words. You can build one yourself or use the built-in sets in modules like nltk and spacy

# in nltk
import nltk
nltk.download('stopwords')  # needed once
nltk.download('punkt')  # tokenizer data used by word_tokenize; newer NLTK releases may ask for 'punkt_tab'
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
stop_words = set(stopwords.words('english'))
example_sent = "This is my awesome sentence"
# tokenization at the word level
word_tokens = word_tokenize(example_sent)
# list of words not in the stop word list
filtered_sentence = [w for w in word_tokens if w.lower() not in stop_words]

# in spacy
# from the terminal (a shell command, not Python):
#   python -m spacy download en_core_web_lg   # or some other pretrained model
# in your program
import spacy
nlp = spacy.load("en_core_web_lg")
stop_words = nlp.Defaults.stop_words
example_sent = "This is my awesome sentence"
doc = nlp(example_sent)
# keep the text of every token that is not in spaCy's default stop word set
filtered_sentence = [w.text for w in doc if w.text.lower() not in stop_words]
wolf-like_hunter
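The snippet above mentions building the stop word set yourself. Here is a rough sketch of that option using only the standard library, assuming a simple regex tokenizer is good enough for your text. BASE_STOP_WORDS, build_stop_words, and filter_tokens are hypothetical names for illustration; the base set could just as well come from nltk or spacy.

```python
import re

# Hypothetical starting set; swap in a library-provided set if you have one.
BASE_STOP_WORDS = {"this", "is", "my", "a", "the", "not"}

def build_stop_words(base, extra=(), keep=()):
    """Add domain-specific words to the base set and remove ones to keep."""
    return (set(base) | {w.lower() for w in extra}) - {w.lower() for w in keep}

def filter_tokens(sentence, stop_words):
    """Split on runs of letters/apostrophes and drop stop words."""
    tokens = re.findall(r"[A-Za-z']+", sentence)
    return [t for t in tokens if t.lower() not in stop_words]

# Treat "awesome" as noise, but keep "not" because it carries meaning
stop_words = build_stop_words(BASE_STOP_WORDS, extra=["awesome"], keep=["not"])
print(filter_tokens("This is not my awesome sentence", stop_words))
# ['not', 'sentence']
```

Keeping negations such as "not" out of the stop set is a common choice for tasks like sentiment analysis, where dropping them can flip the meaning of a sentence.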
