CS 315: Principles of Database Systems

Units: 3-0-0-9
Pre-requisites: ESC101, CS210, CS330

Course Contents:

  1. Introduction: Database applications, purpose, accessing and modifying databases, need for transactions, architecture - users and administrators, data mining, information retrieval. iRelational Databases: relational model, database schema, keys, relational query languages, algebra, tuple and domain calculus example queries, (optional: equivalence of relational calculus and relational algebra).

  2. SQL: Data definition, basic SQL query structure, set operations, nested subqueries, aggregation, null values, database modification, join expressions, views.

  3. Database Design: E-R model, E-R diagram, reduction to relational schema, E-R design issues, database integrity, specifying integrity constraints in SQL: unique columns, foreign key, triggers.

  4. Relational Database Design: features of good design, Functional Dependency theory, decomposition using functional dependency and normal forms, algorithms for decomposition, normal forms, (optional: multi-valued dependency and 4th normal form).

  5. Storage and File structure: Overview of secondary storage, RAID and flash storage. Storing tables: row-wise, column database, database buffer. Indexing: concepts, clustered and non-clustered indices, B+-tree indices, multiple key access, hashed files, linear hash files, bitmap indices, Index definition in SQL, ++R-trees.

  6. Query Processing: Overview, measures of query cost, selection, sorting, join processing algorithms-nested loops, merge-sort, hash join, aggregation.

  7. Query Optimization: purpose, transformation of relational expressions, estimating cost and statistics of expression, choosing evaluation plans, linear and bushy plans, dynamic programming algorithms.

  8. Transactions: Concept and purpose, ACID properties and their necessity, transactions in SQL. Problems with full isolation and levels of isolation.

  9. Concurrency Control: lock-based protocols, 2-phase locking, deadlock handling, multiple granularity, timestamp based protocols, index locking, (optional: validation protocols, multi-version protocols, snap shot isolation, predicate locking, concurrency control for index structures).

  10. Recovery: Failures and their classification, recovery and atomicity, recovery algorithms, Undo-Redo with write ahead logging, no Undo no Redo and other combinations, buffer management, (optional: ARIES recovery).

Optional/Advanced topics below covered at the discretion of instructor

  1. Parallel Databases: Avenues for parallelism: I/O parallelism, interquery, inter-query and intra operation parallelism, databases for multi-core machines.

  2. Distributed Databases: Distributed data storage, distributed transactions, commit protocols, concurrency control in distributed databases, heterogeneous and cloud-based databases.

  3. Data Mining: Decision Support Systems, data warehousing, mining, classification, association rules, clustering

  4. Information Retrieval: relevance ranking using terms and hyperlinks,page rank, indexing of documents, measuring retrieval effectiveness.

  5. XML and semi-structured data: necessity, XML document schema, querying: XPath and XQuery languages, applications.

Books and References:

  1. H Garcia-Molina, JD Ullman and Widom, Database Systems: The Complete Book,2nd Ed., Prentice-Hall, 2008.
  2. A Silberschatz, H Korth and S Sudarshan, Database System Concepts, 6th Ed., McGraw-Hill, 2010.
  3. R Elmasri, S Navathe, Fundamentals of Database Systems, 6th edition, Addison-Wesley, 2010.
  4. R Ramakrishnan, J Gehrke, Database Management Systems, 3rd Ed., McGraw-Hill, 2002.