Introduction to Databases


Databases are inescapable today. Most people, whether they are aware of it or not, use databases daily; some use them every hour, even every minute, of the day. Modern computing relies heavily on databases, and more than ever we live in an era of “Big Data” and “Cloud” computing. Understanding databases is therefore important to those of us in the digital industry.

With all that in mind, I’m reviewing a Stanford/Coursera course, “Introduction to Databases,” by Jennifer Widom. I thought I’d summarize some notes as I go along.


1. Introduction

Database Management Systems (DBMS)

Key feature requirements of database design:

A fairly convincing feature list for strong database systems design.

However, not all data-intensive applications and programs will use…databases. Instead, you can use file-based data storage, including things like Excel files. Hadoop, for example, is a processing framework for running operations on files.

Key concepts of DBMS designs

Key people involved

The focus of the course is primarily on database designers and application designers.


2. The Relational Model (RM)

The relational model is more than 35 years old and is the foundation of most DBMS, including commercial DBMS. It’s an incredibly simple model — which is one of its strengths.

The relational model also relies on high-level query languages that are clear, expressive and declarative. Their implementations tend to be efficient as well.

Basic constructs of the RM

Example of creating RM relations and tables in SQL:

CREATE TABLE Student (ID, name, GPA, photo);
CREATE TABLE College (name string, state char(2), enrollment integer);

The RM thus allows for a high-level declarative query language.

Querying a relational database

The following are some basic steps in creating and querying a relational database:

  1. Design a schema (attributes/columns) and create it using DDL (see above)
  2. “Bulk load” the initial data to fill in the tuples/rows
  3. Finally, execute queries on the DB to retrieve results, or update the data itself.
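
The three steps above can be sketched end to end with Python's built-in sqlite3 module. This is only an illustration, not something from the course: the choice of SQLite, the column types, and the college rows are all made-up assumptions.

```python
import sqlite3

# In-memory SQLite database (an assumption for this sketch; any relational DBMS would do).
conn = sqlite3.connect(":memory:")

# Step 1: design the schema and create it with DDL.
conn.execute(
    "CREATE TABLE College (name TEXT, state CHAR(2), enrollment INTEGER)"
)

# Step 2: "bulk load" the initial tuples. These rows are invented sample data.
colleges = [
    ("Alpha College", "CA", 15000),
    ("Beta University", "CA", 36000),
    ("Gamma Institute", "MA", 10000),
]
conn.executemany("INSERT INTO College VALUES (?, ?, ?)", colleges)

# Step 3a: run a declarative query and retrieve the results.
large = conn.execute(
    "SELECT name FROM College WHERE enrollment > 12000 ORDER BY name"
).fetchall()

# Step 3b: or update the data itself, then read the new value back.
conn.execute(
    "UPDATE College SET enrollment = 11000 WHERE name = 'Gamma Institute'"
)
updated = conn.execute(
    "SELECT enrollment FROM College WHERE name = 'Gamma Institute'"
).fetchone()[0]
conn.commit()
```

Note that the query in step 3a says what rows are wanted, not how to find them; the DBMS decides how to execute it.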

Relational databases allow for high-level ad-hoc queries. In all relational query languages, when you ask a query over a set of relations, you get a relation back as the result. This property is called closure of the language. Because the result is itself a relation, another query can be run over it; this is what is known as compositionality.
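
As a rough sketch of compositionality, the snippet below runs one query over the result of another, again using Python's sqlite3 module. The Student rows, the column types, and the alias HighGPA are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Student relation with invented rows.
conn.execute("CREATE TABLE Student (ID INTEGER, name TEXT, GPA REAL)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Ann", 3.9), (2, "Bob", 3.4), (3, "Cy", 3.7)],
)

# The inner SELECT returns a relation (closure), so the outer SELECT
# can treat it exactly like a stored table (compositionality).
names = conn.execute(
    """
    SELECT name
    FROM (SELECT ID, name, GPA FROM Student WHERE GPA > 3.5) AS HighGPA
    ORDER BY name
    """
).fetchall()
```

Here the inner query's result is never stored; it exists only as the input to the outer query.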

Two Basic Relational Query languages

These are two basic forms of how the semantics of a query language are defined.