Machine Learning – Natural Language Processing in Python

Warren Hansen | Machine Learning

I wanted to show, with a simple example, how easy it is to process some text and assign a value to it in Python. Data scientists are often called on to do this. The first step is cleaning the data. I’ll use reviews from a restaurant as the example. To keep it simple at the start, I’ll break down how to process one review instead of the whole dataset, then show you how to loop through the entire file.

Here is the example text:
Wow… Loved this place.

First, let’s import the libraries and dataset
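A minimal sketch of the loading step, assuming the reviews are stored as tab-separated text with one review per row. The column names `Review` and `Liked`, the helper `load_reviews`, and the inline `SAMPLE_TSV` data are all hypothetical stand-ins so the snippet runs on its own; with a real file you would pass an open file handle instead.

```python
import csv
import io

# Hypothetical stand-in for the reviews file (tab-separated, header row first).
# Only the first review comes from this article; the second is made up.
SAMPLE_TSV = (
    "Review\tLiked\n"
    "Wow... Loved this place.\t1\n"
    "The soup was cold and the waiter was rude.\t0\n"
)

def load_reviews(handle):
    """Read tab-separated rows into a list of dicts keyed by column name."""
    # QUOTE_NONE keeps quotation marks inside a review from confusing the parser.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)

dataset = load_reviews(io.StringIO(SAMPLE_TSV))
print(dataset[0]["Review"])
```

Tab-separated text is a convenient choice for reviews because the reviews themselves usually contain commas.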

The next step is to clean the text: strip out punctuation and any other non-letter characters, and convert everything to lowercase.

This changes
Wow… Loved this place.
to
wow loved this place
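One way to get this result is with a regular expression, sketched below. The function name `clean_text` is my own choice for illustration.

```python
import re

def clean_text(review):
    """Replace every non-letter with a space, lowercase, and collapse whitespace."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", review)
    return " ".join(letters_only.lower().split())

print(clean_text("Wow... Loved this place."))  # wow loved this place
```

Replacing the unwanted characters with a space, rather than deleting them, keeps words from running together when a review has no space after its punctuation.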

After that, we remove the words without significance, like “of”, “or”, “the”, etc., and reduce each remaining word to its stem, so that “loved” becomes “love”.

This gives us
wow love place
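Here is a rough sketch of that step using only the standard library. Both pieces are simplifications I am assuming for illustration: `STOP_WORDS` is a tiny hand-picked set, and `naive_stem` is a crude suffix rule, not a real stemming algorithm. In practice a library such as NLTK supplies a full English stop-word list and a `PorterStemmer` that would take their place.

```python
# Tiny hand-picked stop-word set (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "is", "of", "or", "the", "this", "that", "to", "was"}

def naive_stem(word):
    """Crude suffix stripping for illustration only; NOT the Porter algorithm."""
    if len(word) > 3 and word.endswith("ed"):
        return word[:-1]  # loved -> love
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]  # places -> place
    return word

def remove_stop_words_and_stem(text):
    """Drop stop words from already-cleaned text, then stem what remains."""
    words = text.split()
    return " ".join(naive_stem(w) for w in words if w not in STOP_WORDS)

print(remove_stop_words_and_stem("wow loved this place"))  # wow love place
```

Stemming matters because it lets “love”, “loved” and “loves” count as the same word later on, which keeps the vocabulary small.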

That’s the end of cleaning the data. Now I will wrap it all together by looping through each line in the data and forming a new cleaned list called corpus.
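Putting the steps together might look like the sketch below. It is self-contained, so it repeats a small hand-picked `STOP_WORDS` set and a crude `naive_stem` rule, both of which are assumptions standing in for a proper stop-word list and stemmer. The `reviews` list is also a stand-in for the review column of the dataset, and only its first entry comes from this article.

```python
import re

# Assumed stand-ins for a real stop-word list and a real stemmer.
STOP_WORDS = {"a", "an", "and", "is", "of", "or", "the", "this", "that", "to", "was"}

def naive_stem(word):
    """Crude suffix stripping for illustration only; NOT the Porter algorithm."""
    if len(word) > 3 and word.endswith("ed"):
        return word[:-1]
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def clean_review(review):
    """Keep letters only, lowercase, drop stop words, and stem the rest."""
    words = re.sub(r"[^a-zA-Z]", " ", review).lower().split()
    return " ".join(naive_stem(w) for w in words if w not in STOP_WORDS)

reviews = [
    "Wow... Loved this place.",                    # from this article
    "The soup was cold and the waiter was rude.",  # made up for illustration
]

# Loop through every review and collect the cleaned versions.
corpus = []
for review in reviews:
    corpus.append(clean_review(review))

print(corpus)
```

Each entry in `corpus` lines up with the review at the same position in the input, so the cleaned text can still be matched to its original row.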

Here is the console

As a reminder, here is the original dataset
