CSCI3170 Introduction to Database Systems - CUHK Course Review

Term Taken: 2020 Spring

Remote Teaching due to COVID19

Instructor: Prof. Wong Man Hon

Grading Scheme

  • Final exam (30%)
  • Programming project (30%)
  • Assignments (30%)
  • Short assignments (10%)

Midterm exam cancelled due to COVID-19

Textbook

Database Management Systems. (Ramakrishnan, Gehrke)

Topics Covered

  • ER-diagram.
  • Relational database schema design.
  • Relational algebra.
  • SQL.
  • Schema refinement and normal forms.
  • Database application development.
  • Database storage and indexing.
  • Relational operation cost analysis.
  • Concurrency control and recovery.

Review

This is an elective course for CS majors. Most people regard this course as the easiest 3000-level course offered by the CS department, however, I think some of the contents in the course still require a lot of time to master.


The first part of the course focused on database schema design, that is, SQL, relational algebra, and schema refinement. Though most of the CS majors were already familiar with the basic operations of SQL (i.e. SELECT, INSERT, UPDATE, DELETE), there were still much to learn. For example, in this course, division, join, triggers, views, and aggregate queries were introduced. They are all very useful for developing more complex CRUD applications. After teaching SQL, the course went on to relational algebra, which is the formal theory which SQL is based on. Then, schema refinement and normal forms were taught. Normal form is also a very important topic for designing database schema in real-world applications. Topics like BCNF, 3NF, functional dependency, and canonical covers were introduced. Only by mastering these topics can you design minimized and scalable relational schema.


The second part of the course was a broad topic about the inner workings of database systems. This included indexing, execution plan analysis, concurrency control, and recovery. Indexing is an essential topic if you want to boost up the speed of database queries, and execution plan analysis lets you understand how the pages are accessed in the database storage. These two topics should be fairly easy if you are good at data structures (B+ tree and hash map in particular). The last two topics were concurrency control and recovery, some terms such as View Serializability, Conflict Serializability, 2-Phase Locking Protocol, Recoverability, Avoid Cascading Abort, and Strict will be introduced. Though this is a vast and complex topic, the course didn’t get too deep in it, so it wasn’t too difficult.


The project of the course was to write a library management system with a team of 1-3 person(s). The system should be written in Java with SQL embedded in. It was fairly easy since using the basic SQL operations was enough to fulfill the specifications and some common issues such as the N+1 problem would not be penalized. A person familiar with Java could easily finish it within a few days.


I heard that this course used to be a major required course for CS students, but was removed after a syllabus update. Still, I think every CS students should take this course so that they know how to write good SQL and learn about the inner workings of a storage system.

Project Link

GitHub


More CUHK Course Reviews