SQL Aggregation, Grouping, and Having

Aggregating

In SQL, the standard aggregation operators are SUM, MIN, MAX, and COUNT and we can apply them to the attributes of relations. For example we have a schema employee(eid, name, dob, hired_date, salary), which holds records on each of a company’s employees. We can COUNT and see how many employees the company has with the following query:

SELECT COUNT(*)
FROM employee;

The * is an argument to the count operator and means to count all tuples in the relation. With the DISTINCT operator we eliminate duplicates for the query result, so if we wanted to see if any of the employees had the same name we could compare the previous query with the following:

SELECT COUNT(DISTINCT name)
FROM employee;

If the first count is larger than the second, then the company has at least two employees with the same name. COUNT also does not count tuple attributes with NULL value, and AVG and SUM don’t factor them in when calculating either. We can use AVG to find the average salary of our employees and SUM to calculate the total amount paid for employee salaries with these queries:

SELECT AVG(salary)
FROM (employee);

SELECT SUM(salary)
FROM (employee);

Grouping

A grouping query is executed on a relation with the GROUP BY clause. GROUP BY is followed by an attribute list and the resulting tuples are grouped successively according to the list. To see how grouping works, let’s say that the company from our earlier examples is a shipping company and owns trucks. The company keeps track of truck purchases and their usage with the following relations: truck(tid, make, model, year, purch_year, purch_quarter, cost). In a simple example we could query the database and ask which years the company purchased trucks:

SELECT purch_year
FROM truck
GROUP BY purch_year;

GROUP BY would come after a WHERE clause, but in this case we didn’t need one so it was omitted. The result of this query would be a single attribute relation (a single column table) with values reflecting the years in which the company purchased at least one truck. If the company purchased more that one truck in that year that purchase would be aggregated with all others in that year.

Only the attributes that are listed in the GROUP BY clause can appear unaggregated in the SELECT clause. The next two queries would tell the company which years and in which quarters they purchased trucks and the total amount spent in each quarter.

SELECT purch_year, quarter
FROM truck
GROUP BY year, quarter
ORDER BY quarter, year;

SELECT purch_year, SUM(cost) as annualcost
FROM truck
GROUP BY purch_year;

The first query would return a relation of 2-tuples with the first element being purch_year and the second quarter (the result is also first sorted by quarter, then by year). The second query would also return a 2-tuple but the elements would be purch_year and annualcost which is the aggregated (by simple addition in this case) cost of trucks for each year.  Because cost is not in the GROUP BY clause, it must be aggregated.

The HAVING Clause

In our last set of queries, we asked for the costs of trucks purchased on an annual basis. Suppose for some reason the company wanted to find the average cost spent in a year only in years where every truck purchased in that year cost over $15,000. To ask this question we can use the HAVING clause.

SELECT purch_year, AVG(cost) as avgcost
FROM truck
GROUP BY purch_year
HAVING MIN(cost) > 15000;

The HAVING clause applies to each group. In this case, MIN is calculated on the attribute cost for each purchase year (purch_year) then compared to 15,000. If the minimum value for all costs in a purchase year are greater that 15,000 then the purch_year shows up in the query result (a 2-tuple (purch_year, avgcost)).

Introduction to SCCS

The Original Version Control System

What is SCCS?

SCCS stands for Source Code Control System; it is a suite of programs that are used together to manage the development of source code which is eventually compiled into usable software. SCCS can be used to manage other classes of text files (not only source code), but the reason it was created and its predominate use is to mange changes made to source code files.

What is meant by development/changes?

Source code is generally recorded in plain (non-formatted) text files called source files and collectively the source code for a particular piece of software. The source code itself must then be transformed from text into the binary that a CPU understands. This transformation is done via compilation, as is the case with C and Java, or it is translated as is true of PHP and Python source code.

If a programmer wants to fix, add features, or otherwise improve the functionality of her program she would open her source code with a standard text editor, make the desired changes, and then compile (or run through a translator) the source code. Once every thing is working correctly she can then release a new version of the program with the improvement. This is essentially software development in a nut shell.

Okay, so where does SCCS come in?

As stated, SCCS is used to mange the process described in the last section. The most simple case of software development is a single programmer maintaining and improving a single program. Again the general procedure would be:

  1. Create/Edit a source file.
  2. Save edits to that file.
  3. Compile source into the executable program.

If all edits were done in a single sitting, and each edit make was completely correct, then development would happen linearly and management would be similarly straightforward. But making improvements to source code very often requires more than a single editing session. New bugs are regularly found. And making things more complicated, code added as an improvement sometimes introduces new bugs. Because of this the need arises to keep track of changes.

For example, a developer can be working on a program and he makes an couple of improvements. Then he discovers one of the changes made actually crashes the program under certain circumstances. With SCCS, or pretty much any other version control system, the programmer can view a history of the changes made to track down when the bug was introduced. Necessary changes to the source code can then be made to implement a fix, or if required, the entire set of changes that were made can be removed from the source code.

This use of SCCS, or more generally version control systems, is definitely an advantage to programmers working individually as well as in teams. Yet, as with just about anything else, things get more complicated the more people are involved.

First steps

Initially SCCS requires that programmers import their source code into the system. To bring these files in, the user must create a directory named SCCS. Each source file becomes an sfile in the SCCS directory (i.e. s.source.c). After that, working within the system one interacts with these sfiles – and not the source files themselves – when performing maintenance or development. From then on, the program and the programmer(s) use the system keep track of all changes.

To edit a file in the system it needs to be ‘checked out.’ This places a lock on the file so that no one else can work on the file at the same time. The programmer would then make edits and once they are satisfied with the changes (and the program runs correctly*) the would then delta, or merge, their changes back into the sfile. At each delta the programmer writes (or ought to write) a comment on the changes made. (Note that only changes are written to the sfile, a full copy is not made). The deltas are and normalized and serialized. With the deltas, and their accompanying programmer comments, the project is managed.

Quick Look

sfile

The sfile is like a database file. The original source code is imported with the admin program and transformed into the sfile format. These files are then stored in a directory named SCCS. Changes are recorder at each merge with programmer comments and other useful meta data.

editing

To edit a file the programmer uses the get program with the -e flag, or the edit command. A copy of the requested version of the file is then returned to the programmer. The sfile for the file being edited is then locked for any user other than the programmer who checked the file out. After the desired modifications are made (and the program compiles and runs correctly), the changes must be merged back into the sfile. If one wishes to discard their changes they can use the unedit command to back out of any changes made.

merging

To write the changes made to the source code, one must merge the changes with the delta program. This writes the changes to the sfile. Upon issue the sccs delta command, the programmer will be prompted to comment on the changes; this comment will be appended to meta data containing an SID (SCCS ID), or version number which is of the form r.l where r is the ‘release’ number and l is the ‘level’ – which I think of as major and minor version number.

compiling

According to Eric Allman of the University of California at Berkeley’s “An Introduction to the Source Code Control System” (available at: http://sccs.sourceforge.net/man/sccs.me.html):

“A good technique is to edit the files you need, make all necessary changes and tests, compiling and editing as often as necessary without making deltas. When you are satisfied that you have a working version, delta everything being edited, re-get them, and recompile everything.”

To get a file for compilation use the sccs get [file] command. This returns a read only copy of the current version of the file that can be used for compilation. To grab all the source files use the sccs get SCCS/ command in the directory containing your SCCS directory.

Examples

Example project: address book program.

$> ls -F addrBookUI*  bld/         inc/         makefile     README       src/

bld/ object files
inc/ header files
src/ source files

Initializing the SCCS

To initialize the SCCS using bash:

  1. Create an SCCS directory
  2. Import source files into sfiles
$> cd src
$> ls
actions.c     addrBookUI.c  helpers.c
$> mkdir SCCS save
$> for x in $(ls);do
> sccs admin -i$x $x
> mv $x save/
> done
No id keywords (cm7) 
No id keywords (cm7) 
No id keywords (cm7) 
$>

The output, No id keywords (cm7) is related to the consept of ID keywords, which I won’t cover, but this is just a warning (generated for each source file imported into the system). After issuing the above set of commands our source code has been transformed into sfiles (i.e. s.helpers.c) and can be found in the SCCS directory. The original .c files were moved to save for archiving or disposal.

$> ls save  SCCS
$> ls SCCS s.actions.c     s.addrBookUI.c  s.helpers.c

Editing

To work on a source file, it must be checked out of the system with the get program using the -e option, or using the edit command.

$> sccs get -e helpers.c
1.1
new delta 1.2
88 lines
$> sccs edit actions.c
1.1
new delta 1.2
396 lines
$> ls
actions.c  helpers.c  save       SCCS

As you can see the two source files have been made available for editing. To back out of editing use the sccs unedit [file].

Merging a file

After edits have been made, they need to be merged into the project (recorded in the sfile).

$> sccs delta actions.c
comments? Moved a comment
No id keywords (cm7)
1.2
2 inserted
1 deleted
395 unchanged

Compilation

Make sure all edits have been make and recorded in the sfile.

For individual files

$> sccs get helpers.c
1.2
88 lines
No id keywords (cm7)
$> gcc -c helpers.c
$> sccs clean

For the entire source code

$> sccs get SCCS
SCCS/s.actions.c:
1.2
397 lines
No id keywords (cm7)

SCCS/s.helpers.c:
1.2
88 lines
No id keywords (cm7)

SCCS/s.addrBookUI.c:
1.1
263 lines
No id keywords (cm7)
$> make addrBookUI clean
---output omitted--
$> sccs clean

File status

The following commands can be use to tell which files are being edited and by which users:

$> sccs info
Nothing being edited
$> sccs info -ujason
Nothing being edited by jason
$> sccs get -e helpers.c
$> sccs check
helpers.c: being edited:        1.2 1.3        jason 14/09/27 19:10:47

The prt program is use to get information on what changes have been made to a particular file.

$> sccs prt helpers.c 

SCCS/s.helpers.c: 

D 1.2   14/09/27 20:40:13 jason        2 1     00000/00000/00088 


D 1.1   14/09/27 19:50:38 jaosn        1 0     00088/00000/00000 
date and time created 14/09/27 19:50:38 by jason

get with the -m and -p options to get a detailed history of when changes were made.

$> sccs get -m -p helpers.c
1.2 
1.1     /********************************************************************** 
1.1     // 
1.1     // AUTHOR:      Jason Favrod 
1.1     // 
1.1     // DATE:        17 Sept 2014 
1.1     // 
1.1     // FILE:        helpers.c 
1.1     // 
1.1     // DESCRIPTION: A collection of helper functions. 
1.1     // 
1.2     // New Comment
1.1     ***********************************************************************/ 
1.1     #include "helpers.h" 
1.1     
1.1     
1.1     /********************************************************************* 
---output omitted---

Installing the LAMP stack on Debian Wheezy

Otherwise known as a LAMP server, the LAMP software stack consists of GNU\Linux as the operating system, Apache as the Web server, MySql for a database, and PHP (or possibly Pearl or Python) as the programming language used to host a Web application.

I’m going to assume a minimal install (I’ll be using debian-7.6.0-i386-CD-1.iso)
[http://cdimage.debian.org/debian-cd].

First we’ll need to make sure we have all the necessary repos in our
/etc/apt/sources.list file. If you installed from a complete
installation image you probably won’t need to mess with this, but
cat out the sources list and make sure you have the following or
something simialr.

root@debian32-base:# cat /etc/apt/sources.list
deb http://ftp.us.debian.org/debian stable main contrib non-free
deb http://ftp.debian.org/debian/ wheezy-updates main contrib non-free
deb http://security.debian.org/ wheezy/updates main contrib non-free

Okay, now update apt and install our software packages.
You will need to set a password for MySql’s root user.

root@debian32-base:# apt-get update
---output omitted---
root@debian32-base:~# apt-get install apache2 mysql-client mysql-server php5 libapache2-mod-php5
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  apache2-mpm-prefork apache2-utils apache2.2-bin apache2.2-common libaio1 libapr1 libaprutil1 libaprutil1-dbd-sqlite3
  libaprutil1-ldap libdbd-mysql-perl libdbi-perl libhtml-template-perl libmysqlclient18 libonig2 libqdbm14
  mysql-client-5.5 mysql-common mysql-server-5.5 mysql-server-core-5.5 php5-cli php5-common ssl-cert
Suggested packages:
  apache2-doc apache2-suexec apache2-suexec-custom php-pear libipc-sharedcache-perl libterm-readkey-perl tinyca
  openssl-blacklist
The following NEW packages will be installed:
  apache2 apache2-mpm-prefork apache2-utils apache2.2-bin apache2.2-common libaio1 libapache2-mod-php5 libapr1
  libaprutil1 libaprutil1-dbd-sqlite3 libaprutil1-ldap libdbd-mysql-perl libdbi-perl libhtml-template-perl
  libmysqlclient18 libonig2 libqdbm14 mysql-client mysql-client-5.5 mysql-common mysql-server mysql-server-5.5
  mysql-server-core-5.5 php5 php5-cli php5-common ssl-cert
0 upgraded, 27 newly installed, 0 to remove and 2 not upgraded.
Need to get 16.1 MB of archives.
After this operation, 115 MB of additional disk space will be used.
Do you want to continue [Y/n]? 
Get:1 http://security.debian.org/ wheezy/updates/main mysql-common all 5.5.38-0+wheezy1 [78.6 kB]
---output omited---

Right about now, you get hit with the ncurses screen


And then the installation will finish uninterrupted

---output omitted---
Creating config file /etc/php5/apache2/php.ini with new version
[ ok ] Restarting web server: apache2 ... waiting .
Setting up libhtml-template-perl (2.91-1) ...
Setting up mysql-client (5.5.38-0+wheezy1) ...
Setting up mysql-server (5.5.38-0+wheezy1) ...
Setting up php5 (5.4.4-14+deb7u12) ...
Setting up php5-cli (5.4.4-14+deb7u12) ...

Creating config file /etc/php5/cli/php.ini with new version
update-alternatives: using /usr/bin/php5 to provide /usr/bin/php (php) in auto mode
Setting up ssl-cert (1.0.32) ...
root@debian32-base:~# 

Now you should be able to navigate to the server with a Web Browser. Just type
the computer’s IP address into the address bar.


What we don’t know for sure at this point is if PHP is working. Let’s rename
the default index.html file to a PHP file (index.php)
(FYI: to rename in the terminal, we use the move mv command).
Then we can open it with a text editor and add some PHP code and see if it’s
working.

root@debian32-base:~# mv /var/www/index.html /var/www/index.php
root@debian32-base:~# vim /var/www/index.php
<html><body><h1>It works!</h1>
<p>This is the default web page for this server.</p
<p>The web server software is running but no content has been added, yet.</p>
</body>
<?php echo('Hello World'); ?>
</html>

iMac Fan Control Work-Around

I was recently repairing a 24″ iMac (EMC 2211, Model A1225); to get the
computer up and running again I needed to replace the hard drive. This model
iMac has a sensor that attaches to the hard drive to report temperature metrics
for fan control. The replacement hard drive did not accommodate this sensor so
when the iMac booted up, the fan automatically went to full (and loud) speed.
This is how I revved down and quieted the fan.

The first thing I did was search the Web for a way to interface with
whatever OS X built-in software controls the fan speed. This was not
forthcoming, but in my search I found third-party applications that professed
to be able to help me achieve my ultimate goal of quieting down the fan. One
that came highly recommended was
smcFanControl,
but when I installed it the GUI only provided a way to set the minimum fan
speed (in RPMs). I needed a way to control the max fan speed.

Thankfully the author of this program, Hendrik Holtmann, opened sourced his
work. I visited the github
page
for the project and discovered that smcFanControl had a command line
utility that allowed me to set the max speed.

This utility is located inside the smcFanControl.app and is called
smc. The instructions for using the utility are on this
github
page. The first thing I had to do was switch the fan from auto to
forced mode so that I can make adjustments and apply them. First I tried
to set a target speed for the fan, but when I did that it sounded like the fan
went down to the target speed then right back up to the max. So I went ahead
and set the max speed (adjusting to find the highest RPMs at the lowest noise).
Now I could use the computer without the fan blasting, but these settings would
vanish on reset so I set up a cronjob to apply them at reboot.

TLDR

The Script

When I downloaded and extracted the app, I placed it into
/Applications/Utilities and I placed the startup script in my
$HOME directory and named it fancontrol.sh.

#!/bin/sh

# switch from auto to forced mode
/Applications/Utilities/smcFanControl.app/Contents/Resources/smc -k "FS! " -w 0004

# set the max fan speed (in rpms)
/Applications/Utilities/smcFanControl.app/Contents/Resources/smc -k F2Tg -w 2300

The crontab

Access the crontab for editing:

~$> sudo crontab -e

Add this to the crontab

@reboot /Users/jason/fancontrol.sh
~
~
~

Installing an OS in VirtualBox on a Headless FreeBSD Server

Using Virtualization one is able to run a full computer operating system inside of an existing operating system. I’ve been using VirtualBox for years to virtualize many kinds of operating systems for many different reasons on my desktop computers. I first started using VirtualBox to run a Windows installation so I could do things I needed to do in Windows that I had not yet figured out how to do with a free/open source operating system (at that time Ubuntu). As I came to familiarity with *nix platforms, I wanted to try out new techniques I learned and I used VirtualBox to accomplish this. With this program I could have a sandbox where I could easily create, save, snapshot and destroy virtual desktops, servers, and networks. On my new FreeBSD server I wanted to use virtualization to create test servers that I could use while I learn about new software and refine what I know about system administration and computer programming.

The thing this time around is my server has no dedicated monitor. I’ve always used the GUI to interact with the VirtualBox software, but now I needed a way to do all the things I do with VirtualBox with a Command Line Interface (CLI). And that’s how I became introduced to VBoxManage. If you have your system’s VirtualBox package installed you should have VBoxManage installed as well:

~$> VBoxManage
Oracle VM VirtualBox Command Line Management Interface Version 4.3.10_OSE
(C) 2005-2014 Oracle Corporation
All rights reserved.

Usage:

VBoxManage [] General Options: [-v|--version] print version number and exit [-q|--nologo] suppress the logo [--settingspw ] provide the settings password
[--settingspwfile ] provide a file containing the settings password


Commands:

list [--long|-l] vms|runningvms|ostypes|hostdvds|hostfloppies|
intnets|bridgedifs|hostonlyifs|natnets|dhcpservers|
hostinfo|hostcpuids|hddbackends|hdds|dvds|floppies|
usbhost|usbfilters|systemproperties|extpacks|
groups|webcams
---output omitted---

Typing in VBoxManage every time I wanted to issue a command to VirtualBox did not appeal to me at all so I quickly made a bash alias:

~$> echo "alias vbm='VBoxManage'" >> .bash_aliases && source .bash_aliases

Before we go any further, VirtualBox requires some kernel modules to be loaded.

~$ sudo kldload vboxdrv
~$ sudo kldload vboxnetadp

Okay, now we can get some work done. First create the VM and the virtual disk this machine will use.

~$ vbm createvm --name t0 --ostype FreeBSD --register
UUID: e18a1e47-69a0-4487-8114-2d4381141d32
Settings file: '/home/jason/VirtualBox VMs/t0/t0.vbox'
~$ vbm createhd --filename VirtualBox\ VMs/t0/t0.vdi --size 2000
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Disk image created. UUID: 1a1ff4fa-db67-4b1b-b597-fd0453ed82ac

We can now list out the VMs to verify we created the device, then take a look at the profile of a virtual machine with the showvminfo command.

~$ vbm list vms
"t0" {e18a1e47-69a0-4487-8114-2d4381141d32}
~$ vbm showvminfo t0
---output omitted---
~$ ls VirtualBox\ VMs/t0/
t0.vbox t0.vbox-prev t0.vdi

Now that the machine exists (virtually), we need to install a storage bus, and then add devices – namely a hard drive and a dvd drive – to it.

~$ vbm storagectl t0 --name sata_bus --add sata --portcount 4 --bootable on
~$ vbm storageattach t0 --storagectl sata_bus --port 1 --type hdd --medium VirtualBox\ VMs/t0/t0.vdi
~$ vbm storageattach t0 --storagectl sata_bus --port 2 --type dvddrive --medium downloads/FreeBSD-10.0-RELEASE-i386-disc1.iso

Let’s verify:

~$ vbm showvminfo t0|grep sata_bus
Storage Controller Name (0): sata_bus
sata_bus (1, 0): /usr/home/jason/VirtualBox VMs/t0/t0.vdi (UUID: 1a1ff4fa-db67-4b1b-b597-fd0453ed82ac)
sata_bus (2, 0): /usr/home/jason/downloads/FreeBSD-10.0-RELEASE-i386-disc1.iso (UUID: c277d5b5-e7e6-4172-9ea7-253a5a23d38e)

So there exists a virtual machine and it has a blank virtual hard drive and a dvd drive installed. Inside the dvd drive is the FreeBSD installation disc. At this point we could start the VM and install the OS, but how do we connect to the virtual server? Well we need to install a virtual network interface card and setup bridged networking. This will allow us to easily connect to the VM and give it access to the Internet so we can download any packages necessary for installation.

~$ ifconfig
ath0: flags=8802 metric 0 mtu 2290
ether 00:25:56:22:d5:60
nd6 options=21 media: IEEE 802.11 Wireless Ethernet autoselect (autoselect) status: no carrier alc0: flags=8843 metric 0 mtu 1500
options=c3198 ether 00:23:5a:e0:0e:42 inet 192.168.200.8 netmask 0xfffffff0 broadcast 192.168.200.15 inet6 fe80::223:5aff:fee0:e42%alc0 prefixlen 64 scopeid 0x2 nd6 options=29 media: Ethernet autoselect (100baseTX )
status: active
lo0: flags=8049 metric 0 mtu 16384
options=600003 inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 inet 127.0.0.1 netmask 0xff000000 nd6 options=21 ~$ vbm modifyvm t0 --nic1 bridged --bridgeadapter1 alc0 --nictype1 82540EM ~$ vbm showvminfo t0 | grep NIC NIC 1: MAC: 08002770C671, Attachment: Bridged Interface 'alc0', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none NIC 2: disabled NIC 3: disabled NIC 4: disabled NIC 5: disabled NIC 6: disabled NIC 7: disabled NIC 8: disabled

Above, the first thing that I did was list out the network interfaces of the physical machine (with ifconfig). I did this to determine the which network interface I was going to bridge to (the one connect to the outside network). Each VM can have 8 NICs. With second command I set NIC 1 to be a bridged interface to alc0, my wired ethernet, and to emulate an Intel 82540EM Gigabit Ethernet Controller. Now when the VM powers up it will have network connectivity.

Now we can install the operating system. Because the FreeBSD install disc does not establish an IP address without user interaction, I needed to forward X11 through my SSH connection. At first I received an error when I tried to do the X11 forwarding, but I followed this tutorial and cleared that error.

jason@desktop ~$ ssh -X jason@hostserver
~$ # I connected to host server with ssh -X for ssh forwarding
~$ # now I start the vm on the host server and I can interact
~$ # with it with the GUI on my desktop
~$ vbm startvm t0

With our setup the VM will boot the FreeBSD installation media in our dvd drive

After the OS is installed, you’ll want to eject the dvd drive so that when you reboot, you won’t reboot back into the install cd.

~$ vbm storageattach t0 --storagectl sata_bus --port 2 --medium none

Now the OS is installed and (if you set it up that way) it will obtain an IP address on boot and start the SSH server that you can connect to from another machine

~$ vbm startvm t0 --type headless
Waiting for VM "t0" to power on...
VM "t0" has been successfully started.