Viral-Tweet-With-K-Nearest-Neighbor

A machine learning model that predicts whether a tweet will go viral, based on features associated with that tweet.

Requirements

Pandas
NumPy
scikit-learn
Seaborn

The Data

I scraped the data for this project from Twitter. To do the same, you first need a Twitter Developer Account; create an app there to get your secret keys. I scraped 5000 tweets and collected, for each one, the tweet text, the number of followers the author has, and the number of accounts the author follows (called friends in the data).
I scraped tweets matching the keyword Machine Learning. You can choose your own keyword, or scrape the tweets of a specific user instead.
A description of the data is shown in data description.PNG.
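
As a rough illustration of how the scraped fields could become model inputs, the sketch below reads rows from a CSV and turns each one into a numeric feature vector. The column names (`text`, `followers_count`, `friends_count`) and the use of tweet length as a feature are assumptions made for this example, not details taken from the notebook.

```python
import csv
import io

# Hypothetical sample in the shape the scraper might produce; the column
# names are assumptions, not the actual headers of the project's CSV.
SAMPLE_CSV = """text,followers_count,friends_count
Machine Learning is fun,1200,300
"KNN, explained simply",50,75
"""

def row_to_features(row):
    """Turn one CSV row (a dict of strings) into a list of numbers."""
    return [
        len(row["text"]),              # tweet length in characters
        int(row["followers_count"]),   # accounts following the author
        int(row["friends_count"]),     # accounts the author follows
    ]

def load_features(csv_file):
    """Read every row from an open CSV file and build feature vectors."""
    return [row_to_features(row) for row in csv.DictReader(csv_file)]

features = load_features(io.StringIO(SAMPLE_CSV))
print(features)
```

With a real file, an open file handle such as `open("MachineLearning-tweets.csv", newline="", encoding="utf-8")` can be passed in place of the in-memory sample.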

Viral or Not

A tweet is defined as viral if it has more than a thousand retweets; otherwise it is not viral. Viral is encoded as 1 and not viral as 0, because machine learning models work with numeric labels.
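
The labeling rule can be sketched in a few lines. The function and variable names here are illustrative, and the retweet counts are made-up values chosen to show the boundary: a tweet with exactly 1000 retweets is not viral under a strict "greater than" rule.

```python
VIRAL_THRESHOLD = 1000  # retweets; the cutoff described above

def label_viral(retweet_count, threshold=VIRAL_THRESHOLD):
    """Return 1 if the tweet has more than `threshold` retweets, else 0."""
    return 1 if retweet_count > threshold else 0

# Made-up retweet counts, including the boundary value 1000.
retweets = [12, 1000, 1001, 54000]
labels = [label_viral(count) for count in retweets]
print(labels)  # [0, 0, 1, 1]
```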

At Last

Plot the classifier score over different values of k and look at the result. In my run (see plot.png), the best accuracy is above 97%, which is great. A good next step is to also measure the precision of the classifier, that is, how often a tweet predicted as viral really is viral.
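
To make the k sweep and the precision idea concrete, here is a dependency-free sketch. It is not the notebook's code: the project lists scikit-learn, where `KNeighborsClassifier(n_neighbors=k)` would normally do this job. The tiny data set is made up, and plotting is left out; the loop only collects the scores that would be plotted.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, point, k):
    """Predict a label by majority vote among the k nearest training points."""
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], point),  # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def precision(y_true, y_pred, positive=1):
    """Of the tweets predicted viral, the fraction that really are viral."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0
    return sum(1 for t in predicted_pos if t == positive) / len(predicted_pos)

# Tiny made-up data set of two-number feature vectors, not real tweet data.
train_X = [[1, 1], [2, 1], [1, 2], [8, 8], [9, 8], [8, 9]]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [[2, 2], [9, 9], [1, 3], [7, 8]]
test_y = [0, 1, 0, 1]

# Score the classifier for several odd values of k (odd avoids tied votes).
scores = {}
for k in (1, 3, 5):
    preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
    scores[k] = (accuracy(test_y, preds), precision(test_y, preds))
    print(k, scores[k])
```

On real data the two numbers can differ a lot: if viral tweets are rare, a model can reach high accuracy while still being wrong about most of the tweets it flags as viral.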

Scraping the Data

To scrape the data from Twitter, I use the script Scraping Twitter Data by Keyword.py, which writes the results to a CSV file. This creates a file whose name ends in -tweets.csv. I used the keyword Machine Learning, which is why read_csv is called with MachineLearning-tweets.csv. The data I used is not really enough: about 500 records is probably insufficient, so if you want to train a model on this, use at least 2000 records. A larger dataset does not help in every situation, but in this case I expect the model would do better with more data than it does now. I plan to scrape more tweets by keyword, merge that dataset with this one, and try again.
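
Merging a second scrape into the first could look roughly like the sketch below. Dropping duplicates by tweet text is an assumption made here, since overlapping scrapes for the same keyword may return the same tweet twice; the column names and sample rows are also hypothetical.

```python
import csv
import io

def merge_tweet_rows(*row_lists, key="text"):
    """Concatenate row lists, keeping the first row seen for each key value."""
    seen = set()
    merged = []
    for rows in row_lists:
        for row in rows:
            if row[key] not in seen:
                seen.add(row[key])
                merged.append(row)
    return merged

# Two made-up scrapes that share one tweet ("knn rocks").
old_csv = "text,followers_count\nhello ML,10\nknn rocks,20\n"
new_csv = "text,followers_count\nknn rocks,20\ndeep nets,30\n"
old_rows = list(csv.DictReader(io.StringIO(old_csv)))
new_rows = list(csv.DictReader(io.StringIO(new_csv)))

merged = merge_tweet_rows(old_rows, new_rows)
print(len(merged))
```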

Repository files

LICENSE
README.md
Scraping Twitter Data by Keyword.py
Viral Tweets by KNN.ipynb
data description.PNG
plot.png


taneemishere/Viral-Tweet-With-K-Nearest-Neighbor
