Skip to content

This is an indexer meant to process .tsv tweet files (user, text, favourites, retweets, UNIX timestamp), and create a system of indexes to be later used for search purposes

Notifications You must be signed in to change notification settings

slobodaapl/TweetIndexer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TweetIndexer

This is an indexer meant to process .tsv tweet files (user, text, favourites, retweets, UNIX timestamp), and create a system of indexes to be later used for search purposes It also contains a search utility for the created index, capable of looking up users and their tweets (full text search), is able to predict a user's sentiment for a word, output search to a file, or output the most frequently occuring words.

Usage:

The program asks for a query, into which multiple things can be input separated by a space

  • text:"text to search" -> output all matching tweets and their similarity score
  • user:UserName -> outputs all tweets by the user
  • freq(from,to) -> The frequency of top words, from the top X (top 3rd place) to top X place (up to 17th place)
  • sentiment:UserName:word -> outputs the tweet plus the sentiment score from -1 to 1

Freq and sentiment can only be used alone, however text and user search can either be used independently or together. Due to the amount of output results it's a better idea to limit the amount displayed, or save the output to a file instead of the terminal:

-lX -> argument that limits output to top X best results.. for example: "user:Mihkail100 -l5" outputs only 5 tweets. With text, it outputs 5 best matches. -o -> saves output to output.txt for viewing

-l and -o can be used together, but can only be used with text and user search.

About

This is an indexer meant to process .tsv tweet files (user, text, favourites, retweets, UNIX timestamp), and create a system of indexes to be later used for search purposes

Resources

Stars

Watchers

Forks

Languages