resource-disaggregation/jiffy-artifact
Artifact evaluation

This repo contains scripts and instructions for running the experiments from "Honeycomb: Fine-grained Sharing of Ephemeral Storage for Serverless Analytics".

Important other repositories:

  • jiffy The main Jiffy repo containing source code, build instructions, documentation, and more.
  • snowset Snowflake workloads that are used for many of our experiments. We include specific traces used for this paper in this repo to keep it self contained.

AWS: A Note for Artifact Evaluators

For artifact evaluation, we deploy all of our systems on AWS, since our prototype relies heavily on AWS Lambda for serverless applications and AWS EC2 for hosting the various systems. To reduce evaluator burden, we will provide pre-configured instances with all relevant systems set up properly. However, due to AWS EC2 per-user vCPU limits and the high cost of EC2 instances, we are unable to keep all instances running throughout the evaluation period. We request that evaluators reserve time slots through this calendar, and we will make sure the instances are available before each time slot starts. We also request that evaluators mark themselves as Reviewer A/B/C, etc., to preserve anonymity. A private access key will be used to access all EC2 instances; we plan to share the key with evaluators anonymously prior to the start of their time slot. Once the private key is provided, the following steps should permit SSH access to the instances:

chmod 400 key.pem
ssh -i key.pem ubuntu@public_ip

We also provide AWS EC2 AMI images for all systems, saving evaluators the effort of setting up system-specific environments if they want to launch the instances from their own AWS accounts.
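As a sketch, launching an instance from one of these AMIs with the AWS CLI looks like the following; the AMI ID, key pair name, and security group are placeholders to be filled in with the values we share, not actual identifiers from the artifact:

```
aws ec2 run-instances \
    --image-id <ami-id> \
    --instance-type m4.16xlarge \
    --key-name <key-name> \
    --security-group-ids <sg-id> \
    --count 1
```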

Please check this document for tips on using AWS EC2 machines.

Note (Updated March 2nd, 2022): We would really appreciate it if reviewers could help shut down the systems when they finish their testing before the time slot ends. Please check this document for how to shut down the different systems.

Directory structure

  • conf Configuration files including AWS EC2 instance information. Generated automatically by scripts.
  • docs Documentation for general environment setup used by all experiments.
  • exp_e1 Job performance and resource utilization evaluation for Jiffy, ElastiCache, and Pocket, reported in Figure 9 of Section 6.1 in the paper.
  • exp_e2 Throughput and latency evaluation for six systems (S3, DynamoDB, Apache Crail, ElastiCache, Pocket, and Jiffy), reported in Figure 10 of Section 6.2 in the paper.
  • exp_e3 Lifetime management and data repartitioning evaluation for Jiffy, reported in Figure 11 of Section 6.3 in the paper.
  • exp_e4 Controller overhead results for Jiffy, reported in Figure 12 of Section 6.4 in the paper.
  • scripts Convenience scripts for AWS and all evaluated systems.

Availability and Functionality: Building, Configuring and Deploying Honeycomb

While we provide EC2 AMIs and instances for easier reproducibility of experiment results, we recommend following the instructions provided here to build, configure and deploy Honeycomb. The Honeycomb code itself can be found here.

Reproducing Results

AWS EC2 Instances and AWS Lambda Functions

We use AWS EC2 and Lambda services as the evaluation platform. Most experiments require 10-13 m4.16xlarge (64 vCPUs, 256GB DRAM, 25Gb/s network bandwidth) EC2 instances for all evaluated systems. All serverless applications are deployed on AWS Lambda.

We list the configuration for all systems below:

  • jiffy 10 storage servers, 1 directory server, 1 client server
  • pocket 5 DRAM servers, 5 NVME servers, 1 metadata server, 1 controller server, 1 client server
  • s3 A single S3 bucket, AWS takes care of auto-scaling
  • Apache crail 10 TCP datanode servers, 1 namenode server, 1 client server
  • DynamoDB A single table with 10,000 read/write capacity units, auto-scaling disabled; 1 client server
  • ElastiCache We emulate ElastiCache by deploying Redis directly on EC2 servers. Performance is identical, while setup is simpler and the cost is lower.

We also provide a special gateway EC2 instance (also referred to as the client server for some systems above), which has AWS user credentials set up. Lambda functions can be invoked, and any S3/DynamoDB object can be accessed, directly from this instance without additional authentication. Evaluators do not need to launch or stop instances at any point.
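Because the gateway instance has credentials pre-configured, commands like the following work without further authentication. This is only a sketch: the function, bucket, and table names are placeholders, not the actual names used by the experiments (those are listed in the per-experiment READMEs).

```
# Invoke a deployed Lambda function and capture its response
aws lambda invoke --function-name <function-name> --payload '{}' /tmp/out.json

# Access S3 and DynamoDB directly
aws s3 ls s3://<bucket-name>/
aws dynamodb scan --table-name <table-name> --max-items 10
```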

Note: We will try to allocate spot instances, which are much cheaper than on-demand instances. The downside is that an instance may be reclaimed during the evaluation process. We will set the reclaim threshold as high as possible to avoid this.

Experiments

Experiments described in the paper can be run using the scripts provided in this repository. We have also provided descriptions of how to run the experiments manually, but we recommend using provided pre-deployed EC2 instances or EC2 AMIs to avoid configuration overheads.

The repository is structured based on the Evaluation section in the paper. The following table summarizes different experiments in the paper and the directory containing the respective experiment scripts. The READMEs in the respective experiment directories explain the experiment in detail.

| Experiment Name / Section | Related Figures | Experiment Directory | Estimated time |
| --- | --- | --- | --- |
| 6.1. Benefits of Honeycomb | Figure 9 | exp_e1 | 5 hrs |
| 6.2. Performance Benchmarks for Six Systems | Figure 10 | exp_e2 | 5 hrs |
| 6.3. Understanding Honeycomb Benefits | Figure 11 | exp_e3 | 2 hrs |
| 6.4. Controller Overheads | Figure 12 | exp_e4 | 0.5 hr |

Using the Experiment scripts

Environment Variables

Please set the following environment variables:

  • ARTIFACT_ROOT points to the location of the artifact repo (~/jiffy-artifact on the provided instances/AMIs)
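For example, in a shell on the gateway instance (the path below is the checkout location on the provided instances; adjust it for a local clone):

```shell
# ARTIFACT_ROOT tells the experiment scripts where the artifact repo lives.
# ~/jiffy-artifact is the checkout location on the provided instances/AMIs.
export ARTIFACT_ROOT="$HOME/jiffy-artifact"
echo "ARTIFACT_ROOT=$ARTIFACT_ROOT"
```

Adding the export line to ~/.bashrc keeps the variable set across SSH sessions.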

Script Modifications

Unless explicitly mentioned in a README, users are not required to modify any components of the scripts.

Troubleshooting

Please first try to update the artifact repo to the latest version:

cd ~/jiffy-artifact
git pull origin main

Please report any problems encountered during the evaluation via GitHub issues. For anonymity, please create a new GitHub account with a random name.

About

Artifact evaluation for "Jiffy: Fine-grained Sharing of Ephemeral Storage for Serverless Analytics"
