API · Big Data · Books · Data Mining · NLP · Python · Text Analytics · Text Mining

Mastering Social Media Mining with Python

Great news: my book on data mining for social media is finally out! The title is Mastering Social Media Mining with Python. I’ve been working with Packt Publishing over the past few months, and in July the book was finalised and released. Links: ebook and paperback on Packt Publishing (the publisher) ebook and paperback… Continue reading Mastering Social Media Mining with Python

Big Data · Data Mining · Engineering · Python

Building Data Pipelines with Python and Luigi

For a data scientist, the emphasis of the day-to-day job is often more on the R&D side than on engineering. In going from prototypes to production, though, some of the early quick-and-dirty decisions turn out to be sub-optimal and require a decent amount of effort to re-engineer. This usually slows down… Continue reading Building Data Pipelines with Python and Luigi

Data Mining · NLP · Python · Sentiment Analysis

Mining Twitter Data with Python (Part 6 – Sentiment Analysis Basics)

Sentiment Analysis is one of the interesting applications of text analytics. Although the term is often associated with sentiment classification of documents, broadly speaking it refers to the use of text analytics approaches applied to the set of problems related to identifying and extracting subjective material in text sources. This article continues the series on… Continue reading Mining Twitter Data with Python (Part 6 – Sentiment Analysis Basics)

Data Mining · Data Visualisation · NLP · Python

Mining Twitter Data with Python: Part 5 – Data Visualisation Basics

A picture is worth a thousand tweets: more often than not, designing a good visual representation of our data can help us make sense of them and highlight interesting insights. After collecting and analysing Twitter data, the tutorial continues with some notions on data visualisation with Python. Tutorial Table of Contents: Part 1: Collecting data… Continue reading Mining Twitter Data with Python: Part 5 – Data Visualisation Basics

Data Mining · NLP · Python

Mining Twitter Data with Python (Part 4: Rugby and Term Co-occurrences)

Last Saturday was the closing day of the Six Nations Championship, an annual international rugby competition. Before turning on the TV to watch Italy being thrashed by Wales, I decided to use this event to collect some data from Twitter and perform some exploratory text analysis on something more interesting than the small list of… Continue reading Mining Twitter Data with Python (Part 4: Rugby and Term Co-occurrences)

Data Mining · NLP · Python

Mining Twitter Data with Python (Part 3: Term Frequencies)

This is the third part in a series of articles about data mining on Twitter. After collecting data and pre-processing some text, we are ready for some basic analysis. In this article, we’ll discuss the analysis of term frequencies to extract meaningful terms from our tweets. Tutorial Table of Contents: Part 1: Collecting data Part… Continue reading Mining Twitter Data with Python (Part 3: Term Frequencies)

Data Mining · NLP · Python

Mining Twitter Data with Python (Part 2: Text Pre-processing)

This is the second part of a series of articles about data mining on Twitter. In the previous episode, we saw how to collect data from Twitter. In this post, we’ll discuss the structure of a tweet and start digging into the processing steps we need for some text analysis. Table of Contents… Continue reading Mining Twitter Data with Python (Part 2: Text Pre-processing)