msu/csci-540-fall2019

CSCI 540: Advanced Database Systems

NOTE: This is a live document and is subject to change throughout the semester.

Data is everywhere, and a database is often a convenient way to store and process it. But is a relational database always the best choice? In this class we will explore several advanced database models, computational paradigms for processing large data sets, and searching (indexing) techniques. Database models include spatial, key-value, columnar, document, and graph; computational paradigms for large data sets include MapReduce and streaming; searching techniques include approximate nearest neighbor (approx-NN), locality-sensitive hashing (LSH), and inverted indices.

Meeting Times

Mon, Wed, Fri 09:00-09:50, 332 Reid Hall

Instructor

David L. Millman, Ph.D.

Email: david.millman@montana.edu

Office hours: Mon 15:00-15:50, Thurs 13:00-13:50, or by appointment

Office: Barnard Hall 359

Github: dlm

Bitbucket: david_millman

Learning Outcomes

After successfully completing this course, students will be able to:

  • Identify and explain why a database or collection of databases is appropriate for a task
  • Build a system using polyglot persistence
  • Design and implement algorithms for searching and processing massive data sets

Textbook

There is no required textbook, but the optional texts are highly recommended.

Optional and highly recommended:

Others will be added as relevant.

Prerequisites

  • CSCI 440 (Database Systems): DBMS architecture; major database models; relational algebra fundamentals; SQL query language; index file structures; data modeling and management; entity relationship diagrams.

  • Comfort with a Unix-based operating system.

  • Willingness to get your hands dirty installing and working with multiple

Class schedule

The lecture schedule is subject to change throughout the semester, but here is the current plan. Assignments and due dates will be updated as they're assigned in class.

Aug

| Date | Description | Quiz | Assigned | Due | Recommended Reading |
| --- | --- | --- | --- | --- | --- |
| 08/26 | Intro | | | | |
| 08/28 | Env setup | | | | |
| 08/30 | Relational | Quiz 1 (Solution) | Homework 0 | | 7DB-Relational Day 1 |

Sept

| Date | Description | Quiz | Assigned | Due | Recommended Reading |
| --- | --- | --- | --- | --- | --- |
| 09/02 | NO CLASS (LABOR DAY) | | | | |
| 09/04 | Relational | | | | 7DB-Relational Day 2 |
| 09/06 | Relational | | | Homework 0 | 7DB-Relational Day 3 |
| 09/09 | Relational | | Homework 1 | | 7DB-Relational Day 2 |
| 09/11 | Column | | | | 7DB-Hbase Day 1 |
| 09/13 | Column | | Homework 2 | Homework 1 | 7DB-Hbase Day 2 |
| 09/16 | Document | | | | 7DB-Mongo Day 1 |
| 09/18 | Document | | | | 7DB-Mongo Day 2 |
| 09/20 | Document | Quiz 2 | Homework 3 | Homework 2 | 7DB-Mongo Day 2 |
| 09/23 | Graph | | | | 7DB-Neo4j Day 1 |
| 09/25 | Graph | Quiz 3 | | | 7DB-Neo4j Day 2 |
| 09/27 | Graph | | Homework 4 | Homework 3 | 7DB-Neo4j Day 2 |
| 09/30 | Redis | | | | 7DB-Redis Day 1 |

Oct

| Date | Description | Quiz | Assigned | Due | Recommended Reading |
| --- | --- | --- | --- | --- | --- |
| 10/02 | Redis | | | | 7DB-Redis Day 2 |
| 10/04 | Redis | | Homework 5 | Homework 4 | 7DB-Redis Day 3 |
| 10/07 | Hashing | Quiz 4 | | | PDS-Ch 1 |
| 10/09 | Hashing | | | | PDS-Ch 2 |
| 10/11 | Set Membership | | | Homework 5 | PDS-Ch 2 |
| 10/14 | Cardinality | | | | PDS-Ch 3 |
| 10/16 | NO CLASS (DAVE SICK) | | | | |
| 10/18 | NO CLASS (DAVE SICK) | | | | |
| 10/21 | Cardinality | | Presentation | | PDS-Ch 3 |
| 10/23 | Cardinality | Quiz 5 | | | PDS-Ch 3 |
| 10/25 | NO CLASS - Frequency - Video | Quiz 6 | Homework 6 | Presentation | PDS-Ch 4 |
| 10/28 | Frequency | | | | PDS-Ch 4 |
| 10/30 | Frequency | | | | PDS-Ch 4 |

Nov

Date Description Quiz Assigned Due Recommended Reading
11/01 MapReduce Quiz 7 Homework 7 Homework 6 MMD-Ch 2
11/04 MapReduce Quiz 8 Proj Proposal MMD-Ch 2
11/06 Similarity MMD-Ch 3 / PDS-Ch 6
11/08 Similarity Homework 7 MMD-Ch 3 / PDS-Ch 6
11/11 NO CLASS (VETERANS DAY)
11/13 Similarity MMD-Ch 3 / PDS-Ch 6
11/15 Realtime DBs (Saha, Rahman) Exam Proj Proposal Setup Overview Of RealtimeDBs
11/18 DB Security (Kelly, Turksonmez) Proj Discussion Setup
11/20 Rainbow Tables (Johnson) password hashing & salt
11/22 Blockchain DBs (Nelson) Proj Discussion Subspace BigchainDB Blockchains
11/25 Multi-Obj Query Plan (Harris, Zou) Proj
11/27 NO CLASS (THANKSGIVING BREAK) Multi-Obj PQO
11/29 NO CLASS (THANKSGIVING BREAK)

Dec

| Date | Description | Assigned | Due | Recommended Reading |
| --- | --- | --- | --- | --- |
| 12/02 | Community Detection (Gibbs, Hewitt) | | | Graph Algos Ch 6 |
| 12/04 | Streaming Clustering (Folkman, Whitman) | | | clustering MMD--7.6 |
| 12/06 | CouchDB versioning & Conflict Resolution (Hoy, Watson) | | | about couch |
| 12/09 | (Finals week) 08:00-09:50 | | Proj Writeup & Presentation | |

Potential Upcoming Topics:

  • Journaling/write-ahead logging
  • Compression

Evaluation

Your grade for this class will be determined by:

Policies

Attendance

Attendance in class will not be taken, but students are responsible for all material covered in class. If you are not in class, you cannot receive credit for quizzes. Attendance is strongly recommended.

Assignments

There will be regular homework assignments (about every week or every other week, depending on the difficulty of the assignment) consisting of written problems and coding exercises. Homework will be posted in the schedule. Unless otherwise specified, solutions should be submitted as a PDF on Brightspace. (The tool that I use for grading documents only works with PDFs, so any file format other than PDF will receive a 0.) Homework is due at 23:59 on the due date. Late homework will not be accepted.

You do NOT need to write up your solutions with LaTeX, but I highly encourage you to do so. You can find some resources for getting started with LaTeX (and for making figures, and keeping all those files safe with git) in the student resources repo.

I encourage collaboration; see the Collaboration section for details.

Discussion

Group discussions, questions, and announcements will take place on the Brightspace message board. It is okay to send me a direct message or email if you have a question that you feel is not appropriate to share with the class. If, however, you send me a message with a question for which the response would be useful to the rest of the class, I will likely ask you to post publicly.

Collaboration

Collaboration IS encouraged; however, all submitted individual work must be your own, and you must acknowledge your collaborators at the beginning of the submission.

On any group project, every team member is expected to make a substantial contribution. The distribution of the work, however, is up to the team.

A few specifics for the assignments. You may:

  • Work with anyone in the course.
  • Share ideas with others in the course.
  • Help other teams debug their code or proofs.

You may NOT:

  • Submit a proof or code that you did not write.
  • Modify another's proof or code and claim it as your own.

Using resources in addition to the course materials is encouraged, but be sure to properly cite additional resources. Remember, it is NEVER acceptable to pass others' work off as your own.

Paraphrasing or quoting another's work without citing the source is a form of academic misconduct. Even inadvertent or unintentional misuse or appropriation of another's work (such as relying heavily on source material that is not acknowledged) is considered plagiarism. If you have any questions about using and citing sources, you are expected to ask for clarification. My rule of thumb is if I am in doubt, I cite.

By participating in this class, you agree to abide by the student code of conduct. Please review the policy.

Classroom Etiquette

Except for note taking and coding, please keep electronic devices off during class; they can be distracting to other students. Disruptions to the class will result in you being asked to leave the lecture and will negatively impact your grade.

Special needs information

If you have a documented disability for which you are or may be requesting an accommodation(s), you are encouraged to contact me and Disabled Student Services as soon as possible.
