python-kerasHow can I use Python and Keras to perform text classification?
Text classification is a task that can be performed using Python and Keras. It involves assigning a category or label to a given piece of text. This can be done by first tokenizing the text into words, then creating a model using Keras layers and training it on a labeled dataset.
Example code
# Import necessary modules
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense, Flatten, Embedding
# Define parameters for tokenizer
num_words = 1000
tokenizer = Tokenizer(num_words=num_words)
# Fit the tokenizer on the training data
tokenizer.fit_on_texts(X_train)
# Create a sequence of tokens for each review
X_train_tokens = tokenizer.texts_to_sequences(X_train)
# Pad sequences so that all inputs have the same length
X_train_pad = pad_sequences(X_train_tokens, maxlen=max_tokens)
# Create a neural network model
model = Sequential()
# Add an embedding layer
model.add(Embedding(input_dim=num_words,
output_dim=8,
input_length=max_tokens,
name='layer_embedding'))
# Add a convolutional layer
model.add(Conv1D(16, 8, activation='relu'))
# Add a max pooling layer
model.add(MaxPooling1D(4))
# Add a flatten layer
model.add(Flatten())
# Add a fully connected layer
model.add(Dense(10, activation='relu'))
# Add a classifier layer
model.add(Dense(1, activation='sigmoid'))
# Compile the model
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
# Train the model
model.fit(X_train_pad, y_train,
epochs=5,
batch_size=32,
validation_split=0.2)
The code above uses the Keras library to perform text classification. It first tokenizes the text into words, then creates a model using Keras layers and trains it on a labeled dataset. The model consists of an embedding layer, a convolutional layer, a max pooling layer, a flatten layer, a fully connected layer, and a classifier layer. The model is then compiled and trained on the data.
Code explanation
-
from keras.preprocessing.text import Tokenizer
: imports the Tokenizer module from Keras, which is used for tokenizing text into words. -
from keras.preprocessing.sequence import pad_sequences
: imports the pad_sequences module from Keras, which is used for padding sequences so that all inputs have the same length. -
num_words = 1000
: defines the number of words to be used for tokenizing the text. -
tokenizer = Tokenizer(num_words=num_words)
: creates a Tokenizer object with the specified number of words. -
tokenizer.fit_on_texts(X_train)
: fits the tokenizer on the training data. -
X_train_tokens = tokenizer.texts_to_sequences(X_train)
: creates a sequence of tokens for each review. -
X_train_pad = pad_sequences(X_train_tokens, maxlen=max_tokens)
: pads sequences so that all inputs have the same length. -
model = Sequential()
: creates a neural network model. -
model.add(Embedding(input_dim=num_words, output_dim=8, input_length=max_tokens, name='layer_embedding'))
: adds an embedding layer to the model. -
model.add(Conv1D(16, 8, activation='relu'))
: adds a convolutional layer to the model. -
model.add(MaxPooling1D(4))
: adds a max pooling layer to the model. -
model.add(Flatten())
: adds a flatten layer to the model. -
model.add(Dense(10, activation='relu'))
: adds a fully connected layer to the model. -
model.add(Dense(1, activation='sigmoid'))
: adds a classifier layer to the model. -
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
: compiles the model. -
model.fit(X_train_pad, y_train, epochs=5, batch_size=32, validation_split=0.2)
: trains the model.
Helpful links
More of Python Keras
- How do I use validation_data when creating a Keras model in Python?
- How do I use Python Keras to zip a file?
- How can I improve the validation accuracy of my Keras model using Python?
- How do I check which version of Keras I am using in Python?
- How can I enable verbose mode when using Python Keras?
- How can I use word2vec and Keras to develop a machine learning model in Python?
- How do I use zero padding in Python Keras?
- How can I install the python module tensorflow.keras in R?
- How can I use Python and Keras to create a Variational Autoencoder (VAE)?
- How do I save weights in a Python Keras model?
See more codes...