Open
Description
KNN currently does not maintain the top-k state from one round of tweets to the next. To see this clearly, have the tweets come in at a slow rate (set the sleep to 4000): for each query, the single tweet record that was added is reported as the top k every time. In other words, the query focal point and k are saved, but the top k are not. I believe this is because the built-in priority queue holding the current top k is not carried over between rounds and needs to be maintained in a DStream. For now Thamir has agreed to look into this, but anyone is welcome to help, since I need to refocus my efforts on data mining for the next few weeks.
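To illustrate the behavior being asked for, here is a minimal sketch in plain Python, not the project's actual code. It assumes each candidate is a `(distance, tweet)` pair and that a smaller distance means closer to the query focal point. The function name `update_top_k` and the sample data are hypothetical. The function takes the new values for a round plus the previous state, which is the shape PySpark's `DStream.updateStateByKey` expects for its update function, so something like it could be keyed by query if the fix goes that route.

```python
import heapq


def update_top_k(new_candidates, previous_top_k, k=3):
    """Merge this round's (distance, tweet) pairs into the top k kept so far.

    previous_top_k is None on the first round, mirroring how a stateful
    update function sees a key with no prior state.
    """
    merged = list(previous_top_k or []) + list(new_candidates)
    # Keep only the k closest candidates, sorted by ascending distance.
    return heapq.nsmallest(k, merged)


# Simulate three slow rounds; the last two deliver a single tweet each.
rounds = [
    [(0.9, "tweet a"), (0.4, "tweet b")],
    [(0.7, "tweet c")],
    [(0.1, "tweet d")],
]

state = None
for batch in rounds:
    state = update_top_k(batch, state, k=2)

print(state)
```

With the state carried forward, a round that delivers one tweet no longer replaces the whole result: the new tweet only enters the top k if it is closer than one of the tweets already held.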