
Architecture

Richard Lin edited this page Oct 9, 2020 · 1 revision

We have implemented DataSpread as a web-based tool on top of a PostgreSQL relational database, following the Model-View-Controller approach.

The figure above shows DataSpread's architecture, which at a high level can be divided into three main layers: (a) user interface, (b) execution engine, and (c) storage.

The user interface layer consists of a spreadsheet widget, which presents a spreadsheet to users in a web-based interface and records their interactions with it.

The execution engine layer is a web application developed in Java that resides on an application server. The controller accepts user interactions in the form of events and identifies the corresponding actions; for example, a formula update is sent to the formula parser, and an update to a cell is sent to the cell cache. The dependency graph captures the formula dependencies between cells and aids in triggering the computation of dependent cells. The positional mapper translates row and column numbers into the corresponding stored identifiers and vice versa. The ROM (row-oriented model), COM (column-oriented model), RCV (row-column-value), and hybrid translators each use their corresponding spreadsheet representation and provide a "collection of cells" abstraction to the upper layers. This collection of cells is then cached in memory via an LRU cell cache.
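To make two of these components more concrete, here is a minimal sketch of a dependency graph and an LRU cell cache. It is not DataSpread's actual code: the class names `DependencyGraph` and `LruCellCache`, their methods, and the use of strings such as `"A1"` as cell keys are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: maps each cell to the formula cells that read it.
class DependencyGraph {
    private final Map<String, Set<String>> dependents = new HashMap<>();

    // Record that the formula in `formulaCell` reads `precedent`.
    void addDependency(String formulaCell, String precedent) {
        dependents.computeIfAbsent(precedent, k -> new LinkedHashSet<>()).add(formulaCell);
    }

    // Collect every cell that directly or transitively depends on `changed`.
    // This only gathers the affected set; a real engine would also need to
    // pick a valid evaluation order for the recomputation.
    Set<String> affectedBy(String changed) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(changed);
        while (!queue.isEmpty()) {
            String cell = queue.poll();
            for (String d : dependents.getOrDefault(cell, Collections.<String>emptySet())) {
                if (affected.add(d)) {
                    queue.add(d);
                }
            }
        }
        return affected;
    }
}

// Hypothetical sketch: a fixed-capacity cache that evicts the least recently
// used cell, built on LinkedHashMap's access-order mode.
class LruCellCache {
    private final LinkedHashMap<String, String> cells;

    LruCellCache(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently accessed.
        this.cells = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
    }

    void put(String cell, String value) {
        cells.put(cell, value);
    }

    // Returns the cached value, or null on a miss; a hit marks the cell as recently used.
    String get(String cell) {
        return cells.get(cell);
    }

    // Membership check that does not change the recency order.
    boolean contains(String cell) {
        return cells.containsKey(cell);
    }
}
```

In this sketch, a cache miss would be the point at which the engine falls back to one of the translators to fetch the cell from storage; that fallback is omitted here.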

The storage layer consists of a relational database, which is responsible for persisting data. The data is persisted using a combination of the ROM, COM, and RCV data models, along with two auxiliary structures: positional indexes, which map row and column numbers to the corresponding stored identifiers, and metadata, which records information about the hybrid data model, including which tables are responsible for which rectangular areas of the spreadsheet. The hybrid optimizer determines the optimal hybrid data model and is responsible for migrating data across different tables and primitive data models.
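The metadata's role of recording which table handles which rectangular area can be sketched as a simple lookup structure. This is an illustrative assumption rather than DataSpread's actual schema: the `HybridMetadata` and `Region` classes, the 1-based inclusive coordinates, the linear scan, and table names such as `rom_1` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the hybrid-model metadata: each entry says which
// table, stored under which primitive data model, owns a rectangular area.
class HybridMetadata {
    enum Model { ROM, COM, RCV }

    static final class Region {
        final int top, left, bottom, right; // inclusive bounds (assumed 1-based)
        final Model model;
        final String table;                 // hypothetical table name

        Region(int top, int left, int bottom, int right, Model model, String table) {
            this.top = top;
            this.left = left;
            this.bottom = bottom;
            this.right = right;
            this.model = model;
            this.table = table;
        }

        boolean contains(int row, int col) {
            return row >= top && row <= bottom && col >= left && col <= right;
        }
    }

    private final List<Region> regions = new ArrayList<>();

    void register(Region region) {
        regions.add(region);
    }

    // Linear scan for the region that owns the cell; empty if no table covers it.
    Optional<Region> lookup(int row, int col) {
        for (Region r : regions) {
            if (r.contains(row, col)) {
                return Optional.of(r);
            }
        }
        return Optional.empty();
    }
}
```

A linear scan is enough to show the idea; with many regions, a spatial index would likely be a better fit.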
