Lesson 7: Databases

Filesystems Review Questions

Note

  • ext2 may not be consistent upon restart
  • ext3 and ext 4 are not
  • but consistency only guarantees metadata is intact
  • What might happen to a busy ext2 volume on power loss?
  • ext3?
  • ext4?

But what about our poor data?

Enter the Database

A database system's fundamental goal is to provide consistent views of structured data using the tools the operating system makes available.

Chief among them is fsync(2)

Note

fsync instructs the operating system to flush all writes to disk before returning

Structure

SQL databases are structured around Relational Algebra

Relational Algebra Visualized


_images/inner-outer-join-venn.jpg

Note

joins are the principle use of relations.

Installing MySQL

$ yum install mysql-server
$ /sbin/service mysqld start
$ /usr/bin/mysql_secure_installation

Managing MySQL

$ /sbin/service mysqld status
$ mysqladmin -p ping
$ mysqladmin -p create nobel

Configuration

Note

we're going to add: default_storage_engine         = InnoDB

Users & Permissions

$ sudo mysql -p
mysql> CREATE USER 'vagrant'@'localhost'
       IDENTIFIED BY 'password';

mysql> GRANT ALL PRIVILEGES ON nobel.*
       TO 'vagrant'@'localhost'
       WITH GRANT OPTION;

Importing Data

$ wget http://osl.io/nobel -O nobel.sql
$ mysql -p nobel < nobel.sql
$ mysql -p nobel
SHOW TABLES;
DESCRIBE nobel;

Basic Queries

4 basic operations on data:

SELECT

SELECT
   yr, subject, winner
FROM
   nobel
WHERE
   yr = 1960;

Practice

INSERT

INSERT VALUES
   ('2013','Literature','Herta Müller')
INTO
   nobel;

Note

this data stops at 2008, so lets insert some 2009 awards

Practice

In 2009:

UPDATE

UPDATE
   nobel
SET
   winner='Andrew Ryan'
WHERE
   subject='Peace' AND yr='1951';

Note

obviously Andrew Ryan deserves the peace price for his work in the Rapture planned community

Practice

  • Brigid Tenenbaum Medicine prize in 1952

DELETE

DELETE FROM
   nobel
WHERE
   yr = 1989, subject = peace;

Note

peace prizes can be controversial, and perhaps there's a political interest in censoring our database?

Further Reading, Resources, etc.

Hands-On: Make a Database

mysql> create database systemview

mysql> GRANT ALL PRIVILEGES ON systemview.*
       TO 'vagrant'@'localhost'
       WITH GRANT OPTION;

Note

challenge them to do this based on the material in the last hour, maybe also demo the mysql console. Make sure everyone remembers the username and password for the next step.

Databases in Applications

Applications love databases.

Note

All the various things an app might use a database for - note that the vast majority of web apps use them for something

Native SQL

Most languages allow you to speak directly to a database

Python:

#!/usr/bin/python
import MySQLdb

db = ("localhost","testuser","test123","nobel" )

cursor = db.cursor()

cursor.execute("SELECT subject, yr, winner FROM nobel WHERE yr = 1960)

data = cursor.fetchall()

for winner in data:
    print "%s winner in %s: %s " % (winner[0], winner[1], winner[2])

db.close()

Note

Note the plain SQL statement, recognizable from earlier. Point out the cumbersome nature of creating the connection, creating a cursor, sending the sql, getting data from the cursor (iterating over it if you want multiple results), etc. Similar interfaces exist for virtually all languages.

Introducing the ORM

Object Relational Mapper

Note

Make sure people know what you mean by "object", mention possible difference between Postgres, sqlite, MySql, etc. Objects may map to one table, but might also incorporate relationships. ORMs also often optimize queries and manage transactions to make database queries as efficient as possible (like all other magic, though, sometimes this can backfire).

Life With a Python ORM


Look, ma! No SQL!

for subject, yr, winner in session.query(Nobel).filter_by(yr=1960):
    print "%s winner in %s: %s " % (subject, yr, winner)

Much easier to read and understand, but requires some setting up first.

Note

Of course we actually have to do a lot of setup work - setting up the model, engine, session, etc - but you do that once and can interact with the database as much as you want, without worrying about the cursor or connection. Note that we have no SQL in this statement, it is pythonic and has pythonic methods. The database table is now an object.

Setting Up the Magic - SqlAlchemy

SqlAlchemy - a popular Python ORM, frequently used in Flask apps (like SystemView!).

To use it, we'll need to:

Note

Model:A object with all the properties, attributes, etc of our data, can also include code to manipulate that data in order to represent a specific view (i.e. automatically returning sorted results). It's just a python class, instances are just python objects.
Engine:This handles the authentication with the database, it's like the MySQLdb.connect above.
Session:An in-memory record of your changes to objects - all the orm objects you instantiate live int he session, and are only saved to the database when you say so.

Let's Databasify Systemview

Project:

What else? Ideas?

Note

Solicit ideas for another column or two, maybe number of times the term is used (easy incrementing example), or number of results from the least search.

Hands On

sudo yum install python-devel
sudo yum install mysql-devel
git clone git@github.com:DevOpsBootcamp/systemview

Hands On (Cont...)

git checkout -tb save-search origin/save-search
source <path to virtualenv>/bin/activate
pip install -r requirements.txt

Note

Talk about git branches again, explain tracking, git pull for people who already have it cloned, etc. Talk about the virtualenv, have people create a new one if they have lost the one they made last time. Talk about pip and what requirements.txt is all about - point out how easy it is to set up an app this way. Make sure requirements.txt contains sqlalchemy.

DANGER! - people will need mysql-dev package! name varies by distribution, for centos it is libmysqlclient-dev

Goals

http://docs.sqlalchemy.org/en/rel_0_9/orm/tutorial.html

Note

The code in the repo should have a simple model with one column, 'term', you can make a models.py, or just put it all in one file. If you separate them, talk about MVC. The code should start an sqlalchemy engine and session, save the search term normalized (lowercased, stripped), the column should be set to unique. Make sure the code handles the case of the term already existing in the database (when you add a count, increment the count when the term exists). You should probably initialize the db directly in the code, otherwise you'll have to open up a python console, import the app and run the db update.