Good Reasons to Use Unrequired Correlation Names
October 24, 2007 Ted Holt
Correlation names are alternate names given to tables in an SQL command. Sometimes correlation names are required. Other times they are optional, but helpful. I would go so far as to say that it is good practice to use correlation names as often as possible, even when they are not obligatory. Let me share two good reasons to do so. But first, let’s review the basics of correlation names. A correlation name follows a table name in the FROM clause. The table name and correlation name must be separated by white space. You may also include the filler word AS between the two. I always include AS because the query sounds better when I’m reading it. This query has no correlation name. SELECT classid, instructor, building, room FROM schedule
Here’s one that assigns correlation name SK to the schedule table. SELECT classid, instructor, building, room FROM schedule as sk
In this case, using the correlation name doesn’t buy me anything, so let’s look at a more complex query. SELECT classid, classes.name, instructor, faculty.name, building, buildings.name, room FROM schedule LEFT JOIN classes ON classid = classes.id LEFT JOIN buildings ON building = buildings.id LEFT JOIN faculty ON instructor = faculty.id
There are four tables–count ’em! Three of them–Classes, Buildings, and Faculty–have common column (field) names of ID and Name. I had to qualify all instances of the ID and Name fields with the table names in order to eradicate ambiguity. This is fine, but it can turn into an awful lot of typing. This leads us to one of the most common uses of correlation names. You can use correlation names as shorter names for tables (and views). Here’s the same query using b, c, and f as correlation names for the Buildings, Classes, and Faculty tables, respectively. SELECT classid, c.name, instructor, f.name, building, b.name, room FROM schedule LEFT JOIN classes as c ON classid = c.id LEFT JOIN buildings as b ON building = b.id LEFT JOIN faculty as f ON instructor = f.id
The only time a correlation name is required is when one table is used more than once in a query. For instance, to find the names of professors who share classrooms, we might run this query: SELECT s1.building, s1.room, f1.NAME, f2.NAME FROM schedule AS s1 JOIN schedule AS s2 ON s1.building = s2.building AND s1.room = s2.room AND s1.classid <> s2.classid JOIN faculty AS f1 ON s1.instructor = f1.id JOIN faculty AS f2 ON s2.instructor = f2.id ORDER BY 1, 2
In this query, we need to access the schedule table twice and the faculty table twice. Without correlation names, there would be no way to determine which instance of each table was being referenced. I hope that covers the basics. Now, as I was saying, there are good reasons to use correlation names when correlation names are not required. Here are two. First, using a correlation name ensures that the SQL interpreter catches a missing-comma error. For instance, look at this query and tell me how many columns of data it produces. SELECT CLASSID PERIOD, BUILDING, ROOM INSTRUCTOR FROM schedule Yes! That’s right! The query produces three columns of data!
The first column, ClassID, is renamed to Period. The Building column is second. The room number is listed in the third column, but it is renamed to instructor. Clearly, the author of this query intended to produce five columns of data, but omitted two commas. Let’s look at the same query, mistakes included, with correlation names. SELECT sked.CLASSID sked.PERIOD, sked.BUILDING, sked.ROOM sked.INSTRUCTOR FROM schedule as sked SQL produces error SQL0104: Token . was not valid. Valid tokens: , FROM INTO. Adding the missing commas produces the correct output. SELECT sked.CLASSID, sked.PERIOD, sked.BUILDING, sked.ROOM, sked.INSTRUCTOR FROM schedule as sked
I consider this an improvement, as I prefer syntax errors to logic errors any day of the week. A second good reason to use correlation names when they are not required is to enhance the readability of a query. Look at the following query and tell me in which table Room is stored. SELECT classid, classes.NAME, instructor, faculty.NAME AS facname, building, buildings.NAME AS bname, room FROM schedule LEFT JOIN classes ON classid = classes.id LEFT JOIN buildings ON building = buildings.id LEFT JOIN faculty ON instructor = faculty.id Now look at this query, which is functionally equivalent to the preceding one, and answer the same question. SELECT sk.classid, classes.NAME, sk.instructor, faculty.NAME AS facname, sk.building, buildings.NAME AS bname, sk.room FROM schedule AS sk LEFT JOIN classes ON sk.classid = classes.id LEFT JOIN buildings ON sk.building = buildings.id LEFT JOIN faculty ON sk.instructor = faculty.id Did you decide that the second query was easier to understand? The longer the query, the more I appreciate the inclusion of syntactically unnecessary correlation names. P. S. I know that the word unrequired is not in the dictionary. I used it anyway in order to shorten the title. P. P. S. For your reference, here is the data I used in developing these queries. CREATE TABLE BUILDING ( ID DEC (3,0), NAME CHAR(20)); ID NAME 23 Decthee Hall 25 Offuv Center 41 Narrow Hall 42 Vaux Hall CREATE TABLE CLASSES ( ID DEC (3,0), NAME CHAR(20)); ID NAME 101 Antarctic Literature 102 Pig Latin IV 103 Rodeo Appreciation 104 Antarctic History II 105 Cooking with Guppies CREATE TABLE FACULTY (ID CHAR (3), NAME CHAR (20)); ID NAME F01 Monella, Sal F02 Nootix, Herman F03 Andree, Polly F04 LeNoll, Ty F05 O'Var, Sam F06 Sillen, Penny F07 Doo, Billy F08 Fishul, Benny CREATE TABLE classes CLASSID DEC(3, 0), PERIOD CHAR (1), BUILDING DEC (3,0), ROOM DEC (4,0), INSTRUCTOR CHAR (3)) CLASSID PERIOD BUILDING ROOM INSTRUCTOR 101 A 41 320 F02 102 A 41 218 F03 103 B 41 212 F02 104 B 42 302 F05 105 C 41 165 F04 106 B 41 212 F06 107 B 42 302 F01 108 D 41 212 F07 109 D 41 320 F08
|