Building an Article Extraction Python API with newspaper3k and flask

Today, I was working on an application that required me to extract the main content HTML of a web page. This is called article extraction. Most of the time you want the text of the article, but I wanted the HTML of the main content. For example, if you are reading the following Washington Post article, I want to extract the main HTML content on the left. I don’t want the sidebar HTML containing ads or other information.

Continue reading “Building an Article Extraction Python API with newspaper3k and flask”

Python json object_hook

Today, I was building a REST client in Python for one of our REST server applications. I decided to use the Python requests library to write it. Requests is a very easy-to-use library that lets you quickly bootstrap a REST API client, and writing the client for our endpoints took about an hour. This REST API client will be used from our custom Jython (the JVM implementation of Python) REPL. The REST API has only two endpoints, both of which return JSON objects, and the response of the first endpoint is fed to the second. I was returning the JSON response as a Python dictionary, so a user can change values in the first response and pass it to the second API call. In Python, you work with a dictionary as shown below. Continue reading “Python json object_hook”
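The excerpt stops before the code, so here is a rough sketch of what the title refers to, not the post's actual code. `json.loads` accepts an `object_hook` callable that receives every decoded JSON object and can return something other than a plain dictionary. The payload, the field names, and the use of `SimpleNamespace` (which assumes CPython 3 rather than Jython) are all assumptions made for illustration.

```python
import json
from types import SimpleNamespace

# Hypothetical payload standing in for the first endpoint's JSON response.
raw = '{"id": 1, "config": {"name": "demo", "retries": 3}}'

# Default behaviour: JSON objects are decoded into plain dictionaries.
as_dict = json.loads(raw)
as_dict["config"]["retries"] = 5

# With object_hook, each decoded JSON object (innermost first) is passed to
# the callable, and the callable's return value replaces the dictionary.
as_obj = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))
as_obj.config.retries = 5  # attribute access instead of key lookups

# Serialize back to JSON before passing it to the second endpoint.
# default=vars turns each SimpleNamespace back into its attribute dict.
payload = json.dumps(as_obj, default=vars)
print(payload)
```

Whether attribute access is worth the extra conversion step depends on who uses the client; for people typing into a REPL it tends to be easier than nested key lookups.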

Python abc.py Puzzle

Let me start with the confession that I am not an expert Python developer, so this might not be a surprise for some of you. Yesterday, I was working on a Python REST API client for one of my server applications using the awesome requests library. To quickly hack together my client, I created a Python virtual environment using virtualenv and installed the required libraries using pip. I was ready to play with Python (again). I created a new file, abc.py, and added a method. For demonstration, let’s suppose our method is called hello, as shown below. Continue reading “Python abc.py Puzzle”

Using boto3 with Jython

A few days back I had a requirement to use boto3 with Jython. boto3 is the AWS SDK for Python, which you can use to work with various Amazon cloud APIs. Jython is the JVM implementation of Python. We were packaging our Jython scripts, boto3, and its dependencies inside a JAR. boto3 and Jython work great together when you use them in the normal way, i.e. when boto3 can load its data model files from the file system. This does not work when you package your script and its dependencies inside a JAR, because the model files are then available on the classpath but not on the file system. In this blog, I will show you how we overcame this limitation. Continue reading “Using boto3 with Jython”

Getting Started with vSphere ESXi — The Missing Tutorial

Today, I got an opportunity to work with vSphere. The plan for the day was to install vSphere on one of our machines and then connect to it using a Python API so that we could launch virtual machines. The official documentation lacked clarity, and it was not easy for a newbie like me to get started with vSphere. Throughout the day we faced numerous problems, stumbled across many blogs and VMware forum posts, and finally managed to create our first VM via the official vSphere Python API, pyvmomi. In this detailed blog, I will go over all the steps required to get started with vSphere. We will start with how to install vSphere on a machine, then look at how to install the command-line client on a Linux machine, and finally learn how to talk to the vSphere host using Python. This blog is a work in progress and I will continue updating it as I learn more about vSphere. Continue reading “Getting Started with vSphere ESXi — The Missing Tutorial”

How to Write Node.js Applications in Python using PythonJS on OpenShift

Today I came across an interesting library called PythonJS that converts Python to JavaScript. Besides JavaScript, it can also generate CoffeeScript and Dart code for the given Python code. In this blog you will learn how you can use PythonJS to deploy an application written with Python’s Tornado web framework to OpenShift’s Node 0.10 cartridge. Read the full blog here https://www.openshift.com/blogs/how-to-write-nodejs-applications-in-python-using-pythonjs-on-openshift

Using Python Flask Jinja2 with Mustache

Today I was building a single-page web application using the Python Flask framework and Backbone.js and faced a problem where Jinja2 was parsing my Mustache templates. Both Jinja2 and Mustache use {{}} in their templates. When a user makes the first request, I render index.html, which contains all my Mustache templates as well. To prevent Jinja2 from parsing the Mustache templates, put all of them between {% raw %} and {% endraw %}, as shown below.

{% raw %}
<script type="text/template" id="company-template">
  <a href="#companies/{{id}}/jobs" class="list-group-item">
    <h4 class="list-group-item-heading">{{name}}</h4>
    <p class="list-group-item-text">{{description}}</p>
  </a>
</script>
{% endraw %}
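
To see the effect outside a Flask application, the following minimal sketch renders a small template directly with the jinja2 package (assumed to be installed). The template string and the `title` variable are made up for illustration. Jinja2 fills in the expression outside the raw block and passes the one inside it through untouched, so the Mustache placeholder reaches the browser intact.

```python
from jinja2 import Template

# Illustrative template: {{ title }} is meant for Jinja2, while {{name}}
# is a Mustache placeholder that must survive server-side rendering.
source = (
    "<h1>{{ title }}</h1>"
    "{% raw %}<h4>{{name}}</h4>{% endraw %}"
)

rendered = Template(source).render(title="Companies")
print(rendered)  # <h1>Companies</h1><h4>{{name}}</h4>
```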

Build Your App on OpenShift Using Flask, SQLAlchemy, and PostgreSQL 9.2

Let me start this blog by confessing that I am a Java guy who first learned Python three years back but hasn’t used it much in day-to-day work. So, after three long years, I have decided to brush up on my Python skills by developing a simple web application. By simple I don’t mean a “Hello World” application, but an application that does some real work, like storing data in a database. After spending some time googling “best web framework in Python,” I zeroed in on Flask. Flask is a microframework for Python based on Werkzeug and Jinja2. It is very easy to learn and follows convention over configuration, which means that many things are preconfigured with sensible defaults.

You can read the full blog here https://www.openshift.com/blogs/build-your-app-on-openshift-using-flask-sqlalchemy-and-postgresql-92