Sentiment Analysis in Scala with Stanford CoreNLP

In this post, we will learn how to use the Stanford CoreNLP library to perform sentiment analysis of unstructured text in Scala.

Sentiment analysis, or opinion mining, is a field that uses natural language processing to analyze sentiments in a given text. It has applications in many domains, ranging from marketing to customer service. A few years back, I wrote a simple Java application that used a Naive Bayes classifier to determine whether people liked a movie, based on sentiment analysis of tweets about it.

Finatra Tutorial: Building Scalable Services The Twitter Way

Finatra is an open-source project by Twitter that can be used to build REST APIs in the Scala programming language. Finatra builds on top of Twitter’s Scala stack: twitter-server, finagle, and twitter-util.

  1. Finagle: It can be used to construct high performance servers.
  2. Twitter Server: It defines a template from which servers at Twitter are built. It uses finagle underneath.
  3. Twitter-Util: A bunch of idiomatic, small, general purpose tools for Scala.

Working through sbt test deadlock

Today, I encountered an issue while running tests for one of my Scala sbt projects. Every time I ran it, the sbt test command hung. After running jvisualvm, I discovered that the hang was caused by a thread deadlock. I couldn’t figure out why the deadlock was happening, since the test cases worked fine when run individually. To work around the issue, I disabled parallel execution of tests.

From the command line, you can use the following command to disable parallel execution of tests:

$ sbt 'set parallelExecution in Test := false' test

You can also add this setting to your build.sbt so that you don’t have to pass it manually each time. In your build.sbt, add the following line:

parallelExecution := false
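The setting can also be scoped to the Test configuration, mirroring the command-line form above. A minimal sketch, assuming sbt 1.x and its slash syntax (an assumption about the sbt version; on sbt 0.13 the equivalent is `parallelExecution in Test := false`):

```scala
// build.sbt
// Assumes sbt 1.x slash syntax; scopes the setting to the Test configuration only.
Test / parallelExecution := false
```

Scoping the setting this way leaves other configurations untouched and only serializes test execution.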

Hands-on guide for building Serverless applications

Yesterday, I released a hands-on guide to building Serverless applications using AWS Lambda and the Serverless framework. The guide is open-source and available on GitHub. Check out the guide and please give feedback.

Serverless is an overloaded word that means different things depending on the context. It could mean using third-party managed services like Firebase, an event-driven architecture style, a next-generation compute service offered by cloud providers, or a framework for building Serverless applications. This series will start with an introduction to Serverless compute and architecture. Once we have learned the basics, we will start developing an application step by step.

Read more https://github.com/shekhargulati/hands-on-serverless-guide.

Gatling: The Ultimate Load Testing Tools for Programmers

Welcome to the tenth blog of the 52 Technologies in 2016 blog series. Gatling is a high performance, open source load testing tool built on top of Scala, Netty, and Akka. It is a modern, next generation load testing tool, very different from existing tools like Apache JMeter. Load testing is conducted to understand the behavior of an application under load. You put load on the application by simulating users and measure its response time to understand how the application will behave under both normal and anticipated peak load conditions.
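The idea of simulating users and measuring response times can be sketched in plain Scala. This is not Gatling's API; it is a minimal illustration using only the standard library, and `handleRequest`, `simulate`, and `percentile` are hypothetical names introduced here. The stand-in `handleRequest` just sleeps for a few milliseconds where a real load test would issue an HTTP request.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object LoadTestSketch {
  // Hypothetical stand-in for the system under test;
  // a real load test would send an HTTP request here.
  def handleRequest(userId: Int): String = {
    Thread.sleep(5)
    s"ok-$userId"
  }

  // Starts one Future per simulated user and returns the time (in ms)
  // each user spent inside handleRequest.
  def simulate(users: Int): Seq[Long] = {
    val timings = (1 to users).map { id =>
      Future {
        val start = System.nanoTime()
        handleRequest(id)
        (System.nanoTime() - start) / 1000000
      }
    }
    Await.result(Future.sequence(timings), 30.seconds)
  }

  // Nearest-rank percentile over the recorded timings.
  def percentile(timings: Seq[Long], p: Double): Long = {
    val sorted = timings.sorted
    val rank = math.ceil(p / 100.0 * sorted.size).toInt
    sorted(math.max(rank - 1, 0))
  }
}
```

For example, `LoadTestSketch.percentile(LoadTestSketch.simulate(50), 95)` gives the 95th percentile response time for 50 simulated users. A dedicated tool adds what this sketch lacks, such as ramp-up profiles, protocol support, and reporting.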

Gatling can be used to load test your HTTP server, but HTTP is not the only protocol it supports. Gatling also has built-in support for the WebSocket and JMS protocols, and you can extend it to support your protocol of choice.

Load testing is neglected by most software teams, which leaves them with a poor understanding of their application's performance characteristics. These days most teams take unit testing and functional testing seriously: they write unit tests, integration tests, and functional tests and integrate them into their software build. Yet they still ignore load testing. I think part of the reason developers still don’t write load tests is that most load testing tools are GUI based, so you can’t code your load tests. At best, they allow you to export your load test as XML.

You can read the full blog at https://github.com/shekhargulati/52-technologies-in-2016/blob/master/10-gatling/README.md

Building A Lightweight Scala REST API Client with OkHttp

Welcome to the sixth blog of the 52-technologies-in-2016 blog series. In this blog, we will learn how to write a Scala REST API client for Medium’s REST API using the OkHttp library. REST APIs have become a standard method of communication between two devices over a network. Most applications expose a REST API that developers can use to work with the application programmatically. For example, if I have to build a realtime opinion mining application, I can use the Twitter or Facebook REST APIs to get hold of their data and build my application. To work with an application's REST API, you can either write your own client or use one of the language-specific clients provided by the application. Over the last few weeks, I have started using Medium for posting non-technical blogs. Medium is a blog publishing platform created by Twitter co-founder Evan Williams, who earlier created Blogger, which was bought by Google in 2003.

Medium opened its REST API to the external world last year. The API is simple and allows you to perform operations like submitting a post, getting details of the authenticated user, getting publications for a user, etc. You can read the Medium API documentation in their GitHub repository. Medium officially provides REST API clients for the Node.js, Python, and Go programming languages. I couldn’t find a Scala client for the Medium REST API, so I decided to write my own using OkHttp.

You can read the full blog here.

Slick 3: Functional Relational Mapping for Mere Mortals Part 2: Querying data

Last week we learnt the basics of the Slick library. We started with a general introduction to Slick, then covered how to define tables, write custom mappers, and perform insert queries. Today, we will learn how to perform select queries with Slick. Slick allows you to work with database tables in the same way as you work with Scala collections. This means that you can use methods like map, filter, sort, etc. to process the data in your tables.
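To make the collection analogy concrete, here is a minimal sketch that uses only plain Scala collections, not Slick itself. The `Task` row type and its data are hypothetical, introduced purely for illustration. With Slick, a similar chain of `filter`, `sortBy`, and `map` written against a table query would be translated to SQL rather than executed in memory.

```scala
object CollectionQuerySketch {
  // Hypothetical row type, standing in for a mapped table row.
  final case class Task(id: Int, title: String, done: Boolean)

  // In-memory data playing the role of a database table.
  val tasks: Seq[Task] = Seq(
    Task(1, "write blog", done = true),
    Task(2, "learn slick", done = false),
    Task(3, "buy milk", done = false)
  )

  // Roughly: SELECT title FROM tasks WHERE done = false ORDER BY title
  val pendingTitles: Seq[String] =
    tasks.filter(t => !t.done).sortBy(_.title).map(_.title)
}
```

Here `CollectionQuerySketch.pendingTitles` holds the titles of the unfinished tasks in alphabetical order.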

You can read the full blog here https://github.com/shekhargulati/52-technologies-in-2016/blob/master/05-slick/README.md

Slick: Functional Relational Mapping for Mere Mortals Part 1

Welcome to the fourth blog of the 52-technologies-in-2016 blog series. Today, we will get started with Slick. Slick (Scala Language-Integrated Connection Kit) is a powerful Scala library for working with relational databases. Slick is not an ORM library. It bases its implementation on functional programming and does not hide the database behind an ORM layer, giving you full control over when database access happens. It allows you to work with a database just as you work with Scala collections. The Slick API is asynchronous in nature, making it suitable for building reactive applications. Although Slick itself is asynchronous, internally it uses JDBC, which is a synchronous API. Slick is a big topic, so today we will only cover the basics. I will write a couple more parts to this blog.

The core idea behind Slick is that, as a developer, you don’t have to write SQL queries. Instead, the library generates the SQL for you when you build the query using the constructs it provides.

You can read the full blog here: https://github.com/shekhargulati/52-technologies-in-2016/blob/master/04-slick/README.md

Sentiment Analysis in Scala with Stanford CoreNLP

So far in this series, we have looked at two open-source Scala projects, finatra and sbt. This week I decided to learn the Stanford CoreNLP library for performing sentiment analysis of unstructured text in Scala.

Sentiment analysis, or opinion mining, is a field that uses natural language processing to analyze sentiments in a given text. It has applications in many domains, ranging from marketing to customer service. A few years back, I wrote a simple Java application that used a Naive Bayes classifier to determine whether people liked a movie, based on sentiment analysis of tweets about it.

From the Stanford CoreNLP website:

Stanford CoreNLP provides a set of natural language analysis tools. It can give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize dates, times, and numeric quantities, and mark up the structure of sentences in terms of phrases and word dependencies, indicate which noun phrases refer to the same entities, indicate sentiment, extract open-class relations between mentions, etc.

You can read the full blog here: https://github.com/shekhargulati/52-technologies-in-2016/blob/master/03-stanford-corenlp/README.md

SBT: The Missing Tutorial

Welcome to the second blog of the 52-technologies-in-2016 blog series. Since last year, I have been using Scala as my main programming language. One of the tools you have to get used to when working with a programming language is its build tool. At my office, we use Gradle for all our projects, be it Scala or Java. In most of my personal Scala projects, I have started using sbt as my preferred build tool. sbt is a general purpose build tool written in Scala. Most of the time we hack our way through a build tool without ever learning it properly. As Scala will be the language that I cover most in this series, I decided to learn sbt thoroughly this week. We (developers) often underestimate the importance of learning a build tool thoroughly and end up not using it in the most effective way. Good working knowledge of a build tool can make us more productive, so we should take it seriously.

You can read the full blog here: https://github.com/shekhargulati/52-technologies-in-2016/blob/master/02-sbt/README.md