Final project submission for Natural Language Processing course at the UC Berkeley School of Information, and grand prize winning submission for the 2016 Wells Fargo Campus Analytics Challenge

pdglenn/WellsFargoAnalyticsChallenge

Wells Fargo Analytics Challenge

Final project submission for Natural Language Processing course at the UC Berkeley School of Information, and grand prize winning submission for the 2016 Wells Fargo Campus Analytics Challenge.

The challenge provided a dataset of social media messages about four major banks and asked the question: What are banking customers saying on social media?

Our approach to the problem was a four-step process:

  1. Clean the data by removing irrelevant messages and common placeholder words introduced by data preprocessing (e.g. NAME, ADDRESS)
  2. Identify the main topics being discussed using bigram collocations
  3. Cross-tabulate messages by topic and bank
  4. Use Latent Dirichlet Allocation (LDA) topic modeling to further separate the messages and identify their substance
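
The first three steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code from this repository: the bank names, sample messages, stop-word list, placeholder set, and all function names are hypothetical, and scoring bigrams by pointwise mutual information (PMI) is one assumed choice of collocation measure. Filtering out irrelevant messages is left out because the criteria used are not described here.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder tokens left behind by preprocessing (assumed set).
PLACEHOLDERS = {"name", "address"}
# Tiny illustrative stop-word list; a real one would be much larger.
STOPWORDS = {"the", "a", "an", "to", "my", "is", "at", "i", "for", "and", "of", "on", "with"}

# Made-up (bank, message) pairs standing in for the challenge dataset.
MESSAGES = [
    ("BankA", "NAME the customer service at BankA is great"),
    ("BankA", "Terrible customer service today"),
    ("BankB", "Love the mobile app"),
    ("BankB", "The mobile app crashed and customer service ignored me"),
]


def tokenize(message):
    """Step 1: lowercase a message and drop placeholders and stop words."""
    words = re.findall(r"[a-z']+", message.lower())
    return [w for w in words if w not in PLACEHOLDERS and w not in STOPWORDS]


def top_bigrams(token_lists, min_count=2):
    """Step 2: rank adjacent word pairs seen at least min_count times by PMI."""
    unigrams = Counter(w for tokens in token_lists for w in tokens)
    bigrams = Counter(pair for tokens in token_lists for pair in zip(tokens, tokens[1:]))
    total = sum(unigrams.values())
    scored = {}
    for (w1, w2), count in bigrams.items():
        if count >= min_count:
            scored[(w1, w2)] = math.log2(count * total / (unigrams[w1] * unigrams[w2]))
    return sorted(scored, key=scored.get, reverse=True)


def crosstab(messages, topics):
    """Step 3: count messages per (topic, bank).

    A message matches a topic when the topic's two words appear
    as adjacent tokens after cleaning.
    """
    table = Counter()
    for bank, text in messages:
        tokens = tokenize(text)
        pairs = set(zip(tokens, tokens[1:]))
        for topic in topics:
            if topic in pairs:
                table[(topic, bank)] += 1
    return table


if __name__ == "__main__":
    topics = top_bigrams([tokenize(text) for _, text in MESSAGES])
    for (topic, bank), count in sorted(crosstab(MESSAGES, topics).items()):
        print(" ".join(topic), bank, count)
```

Step 4 is not sketched here. A library implementation such as scikit-learn's `LatentDirichletAllocation` or gensim's `LdaModel` could be fitted on the cleaned messages within each topic, though which implementation the project actually used is not stated above.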

Submitted by Paul Glenn and Vijay Velagapudi
