Date: Tuesday, September 25, 2018
Time: 9:00am - 5:00pm
Location: Texas Advanced Computing Center, Austin, TX
Registration: Registration
The R language is used within the Data Science community as a pragmatic problem-solving tool. While more general-purpose languages like Python, JavaScript, and PHP are often the choice of Science Gateway developers, R provides attractive options for enhancing existing gateways with visualization, analytics, and data manipulation, and for rapidly constructing independent gateways and REST APIs. In this 3-hour, hands-on tutorial, participants will begin with a brief introduction to R for data science, then receive a short primer on using the Agave Platform's R SDK to generate and share remote datasets. After the break, they will take the visualization and analysis examples from the first two sections and expose them as a small visualization gateway using Shiny and as a secure REST API using OpenCPU and the Agave Platform.
- Content Level: Introductory: 67%, Intermediate: 33%
- Audience: Developers, administrators, and enthusiasts from the gateways community who want to learn how technologies common in data analytics and data science can add value and functionality to more traditional science gateway technology stacks.
- Prerequisites: Participants with some experience in web application development and scripting languages (R, Python, JavaScript, PHP, etc.) will benefit the most. Users with R experience will find value in the Data Science notebooks and additional materials. Prior experience in R is NOT required.
- Required: Participants must provide their own laptop, tablet, or similar device with a modern web browser to access the training environment.
| Duration | Presenter(s) | Description |
|---|---|---|
| 15 min | Rion & Sean | Introductions and Overview: Introduce instructors and learners. Lay out learning objectives and provide an overview. |
| 30 min | Rion | Agave Overview: Brief introduction to the Agave Platform |
| 45 min | Sean | Intro to R for data science: Clone and customize an example app. Run the app both through the web API and the notebook |
| 15 min | -- | Break |
| 45 min | Rion | Hands on with the Agave R SDK: Primer on using the Agave R SDK to access data, run code, collaborate with other participants, and integrate into their notebook and gateway environments |
| 30 min | Rion | Building Apps, APIs, and Functions in R: Primer on architectural approaches to support gateways in R. |
| 15 min | Sean | Hands on building Shiny apps: Intro to Shiny and pragmatic gateway design |
| 60 min | -- | Lunch |
| 20 min | Sean | Hands on building Shiny apps (continued): Intro to Shiny and pragmatic gateway design |
| 10 min | Rion | Securing and publishing your Shiny app: Publish the example code as a standalone web application, add security, discuss integration strategies |
| 20 min | Sean | Hands on building a Plumber API: Intro to Plumber and best practices for REST API design |
| 20 min | Rion | Securing and publishing your Plumber API: Publish the example code as a REST API with Plumber, add API Management and security |
| 20 min | Sean | Hands on building an OpenCPU API: Intro to OpenCPU and best practices for REST API design |
| 20 min | Rion | Securing and publishing your OpenCPU API: Publish the example code as a REST API, add API Management and security |
| 5 min | Rion | Additional Resources: Connect participants with additional materials |
| 5 min | Rion & Sean | Conclusion/Wrap-Up: Next steps and ways to stay connected |
Rion Dooley

Rion Dooley is principal investigator on the Agave Project, a Science-as-a-Service API platform allowing researchers worldwide to manage data, run code, collaborate freely, and integrate their science anywhere. His previous projects span areas of identity management, distributed web security, full-stack application development, data management, cloud services, and high performance computing. He earned his Ph.D. in computer science from Louisiana State University. Rion actively puts his wife and two daughters at the top of his list of accomplishments. He hopes his work can someday edge out dancing teddy bears and smear-proof lipstick on their lists of favorite inventions.

Sean Cleveland

Sean Cleveland is a Cyberinfrastructure Research Scientist in the Department of Cyberinfrastructure within Information Technology Services at the University of Hawaii. He focuses on supporting the research mission by providing technology assistance and solutions for researchers across the University of Hawaii system. In addition, he is assisting in the rollout of new centralized computing resources and services for the University system. Sean earned his B.S. in Computer Science from Montana State University in 2002 and his doctorate in Microbiology, with an emphasis in Bioinformatics, from Montana State University in 2013. His research interests include evolutionary dynamics, disorder, compensatory mutations and intra-residue contact prediction, and data lifecycle management.
This material is based upon work supported by the National Science Foundation Division of Advanced CyberInfrastructure (1127210).
Special thanks go out to Mahdi Belcaid for his help on the Data Science curriculum.

