Yutaka Takeda edited this page Aug 24, 2014 · 2 revisions

Introduction

You are probably reading this because you care about the scalability and high availability of your system, and you want to know exactly what this module does before adopting it. That is a valid concern. Let me explain how the module works so that the actual code is easier to follow.

Overview

Redis tables

This module uses 4 tables on redis-server:

  1. Global hash (hashes) - for global variables
  2. Channel/listener list (list) - list of listeners
  3. Event ID list (sorted set) - list of event IDs sorted by expiration time
  4. Event list (hashes) - list of events.
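
As a rough mental model, the four tables can be pictured with plain in-memory Python structures. This is only a sketch: the field names (`global_hash`, `listeners`, `event_ids`, `events`) and the `post` helper are hypothetical and are not dtimer's actual Redis keys or API.

```python
import bisect
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Tables:
    """In-memory stand-in for the four Redis tables (all names are hypothetical)."""
    global_hash: dict = field(default_factory=dict)   # 1. global variables
    listeners: deque = field(default_factory=deque)   # 2. channel/listener list
    event_ids: list = field(default_factory=list)     # 3. (expires_at, event_id), kept sorted
    events: dict = field(default_factory=dict)        # 4. event_id -> event payload

    def post(self, event_id, payload, expires_at):
        # Store the payload and index its ID by expiration time.
        self.events[event_id] = payload
        bisect.insort(self.event_ids, (expires_at, event_id))


tables = Tables()
tables.listeners.extend(["ch-a", "ch-b"])
tables.post("ev2", {"msg": "later"}, expires_at=2000)
tables.post("ev1", {"msg": "sooner"}, expires_at=1000)
print(tables.event_ids[0])  # the event that expires next
```

Keeping the ID index separate from the payloads mirrors the split between the sorted set and the event hashes: finding the next event to expire only touches the sorted index.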

Why a Lua script

Juggling the four tables above requires a Lua script so that each operation completes atomically. The scripts are loaded with lured, which makes loading easier and lets me use the scripts' SHA values directly.

Use Cases

dtimer is assumed to be used in a clustered environment, for one of these use cases:

  • One emitter and many listeners. (master & workers)
  • Many emitters to many listeners. (distributed workers)
  • A mixture of the above (namespacing)

Behavior

Basic Idea

Essentially, dtimer posts an event into a list sorted by expiration time, and all listeners then poll to check whether any events are due. This is robust, but on its own the delivery time of a due event may not be accurate. In addition to this polling strategy, dtimer uses pubsub to adjust the next listener's polling time. This lets dtimer fire events closer to their expiration time without having to increase every listener's polling frequency.
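
The polling half of this idea can be sketched in a few lines of Python. This is an illustration under assumptions, not dtimer's code: a min-heap stands in for the Redis sorted set, and `pop_due_events` is a hypothetical helper name.

```python
import heapq


def pop_due_events(queue, now_ms):
    """Remove and return the IDs of all events whose expiration time has passed."""
    due = []
    while queue and queue[0][0] <= now_ms:
        _, event_id = heapq.heappop(queue)
        due.append(event_id)
    return due


queue = []  # min-heap of (expires_at_ms, event_id), standing in for the sorted set
heapq.heappush(queue, (1500, "b"))
heapq.heappush(queue, (1000, "a"))
heapq.heappush(queue, (3000, "c"))

# Every listener runs the same check when it polls; only the first one to
# arrive after the expiration time actually receives the due events.
first_poll = pop_due_events(queue, now_ms=2000)
second_poll = pop_due_events(queue, now_ms=2000)
print(first_poll, second_poll)
```

Because any listener can pick up a due event, losing one listener delays delivery but does not lose the event. The cost is that delivery is only as precise as the polling interval, which is what the pubsub adjustment addresses.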

Details

The Lua script (lib/update.lua) implements the heart of the dtimer process. The script looks for the event that will expire next, then sends a pubsub message to the listener at the top of the list, telling it to update its local interval time (clearTimeout, then setTimeout with the new interval). Pubsub is used only to update the next listener's interval time. On timeout (by setTimeout), the listener fetches the due events and moves itself to the back of the listener list, which achieves round-robin load balancing.
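
The flow described above can be modeled roughly as follows. This is a simplified in-memory sketch, not a port of lib/update.lua: the `Listener` class and the `notify_head` and `on_timeout` functions are hypothetical names, and a stored wake-up time stands in for the clearTimeout/setTimeout pair.

```python
from collections import deque


class Listener:
    def __init__(self, name):
        self.name = name
        self.next_wakeup_ms = None  # stands in for clearTimeout + setTimeout

    def on_interval_update(self, delay_ms, now_ms):
        # Replace any previously scheduled wake-up with the new one.
        self.next_wakeup_ms = now_ms + delay_ms


def notify_head(listeners, events, now_ms):
    """Tell the listener at the head of the list when the next event expires."""
    if not listeners or not events:
        return None
    next_expiry = min(events.values())
    head = listeners[0]
    head.on_interval_update(max(next_expiry - now_ms, 0), now_ms)
    return head


def on_timeout(listener, listeners, events, now_ms):
    """Fetch the due events, then move the listener to the back (round-robin)."""
    due = sorted(eid for eid, expires in events.items() if expires <= now_ms)
    for eid in due:
        del events[eid]
    listeners.remove(listener)
    listeners.append(listener)
    return due


listeners = deque([Listener("a"), Listener("b"), Listener("c")])
events = {"ev1": 1000, "ev2": 1800}  # event_id -> expiration time in ms

head = notify_head(listeners, events, now_ms=400)
due = on_timeout(head, listeners, events, now_ms=head.next_wakeup_ms)
order = [listener.name for listener in listeners]
print(head.name, due, order)
```

After listener "a" handles the first event it moves to the back of the list, so the next interval update goes to "b". That rotation is what spreads the work across listeners.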

The script has a guard for the case where many events are posted with very close expiration times, which could otherwise cause a burst of pubsub interval-update messages to the same listener. To mitigate this, the Lua script sends a pubsub message (for an interval time update) only when the new expiration time is more than 20 msec (called 'gracePeriod') apart from the previous one. Therefore, the accuracy of the event notification time is between the actual RTT and ~20 msec.

A listener may also suddenly go away. In that case, the due event will eventually be picked up by the next listener (everyone is polling, hence robust) at its default interval, within zero to N seconds at worst, where N is the number of listeners. In addition, when publishing, the Lua script checks the return value to learn whether the message reached a subscriber. This lets the script detect that a listener is already gone and then send the interval update to the next listener on the list.
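
A minimal sketch of these two safeguards, under stated assumptions: `maybe_notify`, `state`, and the `publish` callback are hypothetical names, "previous" is taken to mean the expiration time of the last update that was actually sent, and removing an unreachable listener from the list is an assumption of this sketch rather than something this page confirms. The return value of `publish` models the subscriber count that Redis PUBLISH reports.

```python
from collections import deque

GRACE_PERIOD_MS = 20  # the 'gracePeriod' value quoted in the text


def maybe_notify(state, listeners, publish, new_expiry_ms):
    """Publish an interval update unless it falls within the grace period.

    `publish(channel)` returns how many subscribers received the message,
    so 0 means the listener on that channel is gone.
    """
    last = state.get("last_expiry_ms")
    if last is not None and abs(new_expiry_ms - last) <= GRACE_PERIOD_MS:
        return None  # too close to the previous update: skip the message
    while listeners:
        channel = listeners[0]
        if publish(channel) > 0:
            state["last_expiry_ms"] = new_expiry_ms
            return channel
        listeners.popleft()  # assumption: drop the listener nobody answered for
    return None


alive = {"ch-b", "ch-c"}  # "ch-a" has disappeared


def publish(channel):
    return 1 if channel in alive else 0


listeners = deque(["ch-a", "ch-b", "ch-c"])
state = {}

first = maybe_notify(state, listeners, publish, 1000)   # falls through to ch-b
second = maybe_notify(state, listeners, publish, 1010)  # 10 ms apart: suppressed
third = maybe_notify(state, listeners, publish, 1050)   # 50 ms apart: sent
print(first, second, third)
```

Suppressed updates are not lost events. The affected event is still in the sorted list and is picked up when the listener wakes for the earlier, nearby expiration time or on a regular poll.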

(TODO - complete this page)
