Day 16: Goose Extractor - An Article Extractor That Just Works

Today, for my 30-day challenge, I decided to learn how to do article extraction using the Python programming language. I have been interested in article extraction for a few months, ever since I wanted to write a Prismatic clone. Prismatic creates a news feed based on user interests. Extracting an article’s main content, images, and other meta information is a very common requirement for content discovery websites like Prismatic. In this blog post, we will learn how to use a Python package called goose-extractor to accomplish this task. We will first cover some basics and then develop a simple Flask application that uses the Goose Extractor API. Read the full article here: https://www.openshift.com/blogs/day-16-goose-extractor-an-article-extractor-that-just-works