- Instructor: Andrew Crotty
- Day/Time: TuTh 2:00-3:20pm
- Location: Tech A110
This seminar course explores the design and implementation of key-value storage engines, which have become ubiquitous in modern data management applications. It is intended for graduate and advanced undergraduate students interested in systems research. Students will learn how to: (1) read and critically evaluate systems research papers; (2) craft presentations that distill and convey core research ideas; and (3) plan and execute a final project that answers an interesting systems research question.
Since this is a seminar course, it will not include exams or traditional homework assignments. Instead, students will spend most of their time reading, presenting, and discussing research papers. They will also complete a final project of their choosing. Grades will be determined based on the following criteria:
Each class, one student will present the assigned research paper and lead the discussion of it. Presentations should be about 30 minutes long and cover the high-level ideas of the work, key technical details, experimental results, and any relevant background or related work. They should also include 3-5 explicit discussion questions for the class, either interspersed throughout the talk or at the end. Presenters are especially encouraged to point out any shortcomings of the paper and stimulate discussion about areas for improvement. Presentation slots are shown in the Schedule and will be assigned on a first-come, first-served basis.
All students not presenting a given paper must prepare a short write-up modeled on a systems conference review. Write-ups are due before the start of each class and should include the following content:
- An overview of the main idea(s) proposed and summary of the contributions (1 solid paragraph)
- Three (or more) strong points about the paper (1-2 sentences each)
- Three (or more) weak points about the paper (1-2 sentences each)
Each class, one student will present a 5-10 minute overview of any topic related to data management (including their own research). Possible sources of inspiration include:
- Database of Databases: https://dbdb.io/
- CMU DB Group Seminar Series: https://www.youtube.com/c/cmudatabasegroup
- Northwest Database Society: https://www.youtube.com/channel/UCjTWKbxmf6uQ-l5Rp1g68BQ/videos
- Dutch Seminar on Data Systems Design: https://www.youtube.com/channel/UC-lFJg2eN9Qds1MBZzMM9mQ/videos
- Waterloo Data Systems Group: https://www.youtube.com/channel/UCEdu5yMr3Vry91tFjopWv6g/videos
Students may work individually or in groups of 2-3 to complete a final project of their choosing on an approved topic related to the course content. The project materials should include:
- Project proposal (5-10 minutes, modeled after a short conference talk)
- Project presentation (20-30 minutes, modeled after a full conference talk)
- Written report (at least 6 pages excluding references, modeled after a conference paper)
- Programming component (code, benchmarks, demo, etc.)
Date | Topic | Paper |
---|---|---|
9/20 | Course Introduction | link |
9/22 | LSM-tree | link |
9/27 | Berkeley DB | link |
9/29 | LevelDB | link |
10/4 | RocksDB | link |
10/6 | Leaper | link |
10/11 | Project Planning | - |
10/13 | LRU-K | link |
10/18 | Project Planning | - |
10/20 | Project Proposals | - |
10/25 | Voyager Neural Prefetcher | link |
10/27 | DeepBM Buffer Manager | link |
11/1 | Scout Prefetcher | link |
11/3 | Learned Access Patterns | link |
11/8 | Learned Buffer Replacement | link |
11/10 | Prefetching with ML | link |
11/15 | Competitive Caching with ML | link |
11/17 | Project Planning + Discussion | - |
11/22 | Thanksgiving Break (no class) | - |
11/24 | Thanksgiving Break (no class) | - |
11/29 | Work on Final Projects | - |
12/1 | Project Presentations | - |
12/6 | Projects Due (by 11:59pm) | - |
Please see below for additional course policies and information.
Plagiarism will not be tolerated. All writing (e.g., paper write-ups, final project report) must be your own. You may not copy (including minor rewording) directly from papers, outside sources, or other students. You are free to borrow from publicly available presentation materials (e.g., slides, images) and code with proper attribution, but what you produce should not simply be a facsimile of someone else's work. If at all in doubt about whether something is acceptable, please reach out to ask for clarification. See Northwestern's policies on Academic Integrity for additional information: https://www.northwestern.edu/provost/policies-procedures/academic-integrity/
This course has no formal provisions for late submission. Paper write-ups are due prior to the start of class on the day of the scheduled presentation. All other dates (e.g., presentations, final project deadlines) are firm. Extreme circumstances (e.g., medical emergencies) will be accommodated when accompanied by written documentation (e.g., a doctor's note).
Northwestern is committed to fostering an inclusive and supportive environment for everyone. Please reach out if you have a disability or other condition that requires special accommodations. See the AccessibleNU website for more information: https://www.northwestern.edu/accessiblenu/
Being a student can be stressful. If you feel that you are under too much pressure or other issues are affecting your academic performance, please reach out to Northwestern's Counseling and Psychological Services (CAPS), which offers confidential counseling and can provide documentation for late policy accommodations. See the CAPS website for more information: https://www.northwestern.edu/counseling/