Research Objectives
Achyudh K Ram edited this page Apr 10, 2016 · 2 revisions
The objective of this research is to answer the following questions:
- How does a typical mailing list subscriber communicate with other subscribers?
- What are the invariant characteristics of a discussion thread on the mailing list?
- Which graph models (simple, weighted directed, hypergraph) are suitable for the discussion threads and subscriber communication?
- In light of the answers to the above questions, how can mailing list filters be updated to remove spam messages for subscribers?
- How can we model the temporal behaviour of the authors in the mailing list?
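As an illustration of the weighted directed model mentioned above, here is a minimal sketch in which each edge (sender, recipient) carries the number of messages sent in that direction. The message records and the `build_weighted_digraph` helper are hypothetical and are not taken from LKML data.

```python
from collections import Counter

# Hypothetical messages as (sender, [recipients in To/CC]); not real LKML data.
messages = [
    ("alice", ["bob", "carol"]),
    ("bob", ["alice"]),
    ("alice", ["bob"]),
    ("carol", ["alice", "bob"]),
]


def build_weighted_digraph(msgs):
    """Return {(sender, recipient): message_count} for a list of messages."""
    edges = Counter()
    for sender, recipients in msgs:
        for recipient in recipients:
            if recipient != sender:  # ignore self-addressed copies
                edges[(sender, recipient)] += 1
    return edges


edges = build_weighted_digraph(messages)
for (src, dst), weight in sorted(edges.items()):
    print(f"{src} -> {dst}: {weight}")
```

A hypergraph model would instead keep each message as a single hyperedge over the sender and all recipients, for example as `frozenset({sender, *recipients})`; which representation is more suitable is one of the open questions listed above.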
Further research objectives:
- Formulation of a proper algorithm for the identification of popular authors in a given mailing list.
- Degree distribution of the nodes and authors in the mailing list.
- Mean path length analysis.
- Community detection: Identification of the various communities of authors in the mailing list and assigning proper labels for these communities.
- Stability and growth of an egocentric neighbourhood.
- Identification of a pattern that leads to changes in marked observers across generations in a thread.
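Two of the objectives above, degree distribution and mean path length, can be sketched with the standard library alone. The small undirected author graph below is a hypothetical example, and the function names are illustrative; a real analysis would likely use a graph library and the actual mailing list data.

```python
from collections import Counter, deque

# Hypothetical undirected author graph as adjacency sets; not real data.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}


def degree_distribution(g):
    """Map each degree value to the number of nodes that have it."""
    return Counter(len(neighbours) for neighbours in g.values())


def bfs_distances(g, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in g[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def mean_path_length(g):
    """Average shortest-path length over ordered pairs of distinct, reachable nodes."""
    total = 0
    pairs = 0
    for source in g:
        for target, d in bfs_distances(g, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


print(dict(degree_distribution(graph)))
print(round(mean_path_length(graph), 3))
```

Unreachable pairs are skipped rather than counted as infinite, which is one possible convention for disconnected graphs.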
We shall use the Linux Kernel Mailing List (LKML) as the base case to form suitable hypotheses. Once the hypotheses are formulated, we shall validate them on other open mailing lists. This research can be divided into data collection, thread analysis, and author-centric analysis.
- Why do mailing list subscribers put people in CC or BCC instead of To?
- What is the Dunbar number (see https://en.wikipedia.org/wiki/Dunbar's_number) on LKML?
- What are the primary labels for each of the subscribers of the LKML?
- Is there a correlation between the number of authors and the length of a thread (either in time or the number of messages)?
- Each user can be thought of as a peer or a server, where servers arrive and depart with a high churn rate. A peer posts a question (a ticket) and waits for service; the ticket is then serviced by a server or, sometimes, by the peer itself. Under this model, what are the churn rate, the diurnal activity, and the message arrival and servicing characteristics?
- Is there a correlation between the number of marked people in a node and the height of the node? Does this relationship depend on the length of the entire thread?
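The question above about the correlation between the number of authors and the length of a thread could be explored along the following lines. This is only a sketch: the thread data is made up, thread length is measured in messages rather than time, and the Pearson coefficient is just one possible choice of correlation measure.

```python
from math import sqrt

# Hypothetical threads: thread id -> author of each message, in order.
threads = {
    "t1": ["alice", "bob", "alice"],
    "t2": ["carol"],
    "t3": ["alice", "bob", "carol", "dave", "bob"],
    "t4": ["bob", "bob"],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


num_authors = [len(set(authors)) for authors in threads.values()]
num_messages = [len(authors) for authors in threads.values()]
r = pearson(num_authors, num_messages)
print(f"authors vs. messages: r = {r:.3f}")
```

The same helper could be applied to thread duration, or to the number of marked people in a node versus the height of the node, once those quantities are extracted from the data.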