Much Ado about Nothing: Interesting Facts about Null
September 14, 2005 Ted Holt
In one of my son’s favorite M*A*S*H episodes, a certain Captain Tuttle is given accolades for his bravery and humanitarianism. What gives the show its humor is that Captain Tuttle does not exist. I would like to share some interesting facts about the relational database’s counterpart to Captain Tuttle–the null value. I have found, as did Hawkeye Pierce and the gang, that non-existent things can be useful.
1. Null indicates that a field has no value. A field with a null value is not equivalent to an alpha field with blanks or a numeric field with zeros. Such fields have values–blank and zero, respectively.
2. Null indicates that a field’s value is unknown, inapplicable, both, or we’re not sure which. For example, a null birth date means that a person’s date of birth is not known to whoever entered the data into the database. A null value in a field that indicates whether a hospital patient is pregnant or not might depend on the patient’s sex. For a female patient, null might mean “unknown,” whereas for a male patient, null would mean “inapplicable.” Then again, one might make the case that null could indicate both not pregnant and inapplicable for a female who’s had a hysterectomy, but I’m stretching it a bit here. Some database scholars think there should be two kinds of null values in order to differentiate between unknown and inapplicable.
3. Some scholars do not like null and advocate that null not be used. C. J. (Chris) Date, for example, recommends a default values implementation that, to my knowledge, no relational database management system supports.
4. Even scholars who support the concept of null values agree that it is good to avoid nulls as much as possible. SQL/400 has three functions that convert nulls to other values–COALESCE, IFNULL, and VALUE. I have written about them before. My favorite is COALESCE, because it allows more than two parameters and because it is supported by other relational database management systems.
5. SQL/400 also permits you to convert certain values to nulls. I have written about this as well.
6. Null does not equal null and null does not not equal null. (Yes, there is supposed to be a double not in that sentence.) Take a look at the following data from a table (physical file).
KeyField |
Data1 |
Data2 |
01 |
1 |
1 |
02 |
2 |
2 |
03 |
1 |
2 |
04 |
2 |
1 |
05 |
1 |
Null |
06 |
Null |
2 |
07 |
Null |
Null |
Suppose I run the following query. What rows will be returned?
select * from mydata where data1 = data2
Did you select rows with keys 01 and 02? If so, good for you. Row 07 was not returned because null does not equal null.
What rows will be returned from this query?
select * from mydata where data1 <> data2
I hope you selected rows with key values 03 and 04. Null does not compare to non-null values or to other null values.
7. If you need to compare nulls to each other or to non-null values, you can use the IS [NOT] DISTINCT FROM predicate, which was added to DB2 Universal Database for iSeries in V5R3. Two values are not distinct if they are both null, or if neither is null and the two equal each other. Two values are distinct if they are not not distinct. (Yes, there is a double not in that sentence, too.) For more information, see the V5R3 SQL Reference.
8. Since null does not compare to null, you cannot join with null values. Here’s an example.
Assume two tables, Employee and Department.
Employee |
Name |
Department |
1 |
Joe |
ACC |
2 |
Ben |
ACC |
3 |
Jim |
Null |
4 |
Ace |
IT |
Department |
Department |
Null |
Unassigned |
ACC |
Accounting |
IT |
Information |
Notice that Jim’s department number is null and that the department table includes a row with a null department ID. What do you think happens when we join the two tables on department ID?
select e.id, e.name, e.dept, d.name from qtemp/employee as e join qtemp/department as d on e.dept = d.id
Here is the output from the query.
Employee |
Name |
Department |
Department |
1 |
Joe |
ACC |
Accounting |
2 |
Ben |
ACC |
Accounting |
4 |
Ace |
IT |
Information |
Jim is not listed because null doesn’t match null in the join.
9. Primary keys are not supposed to be allowed to have null values. I don’t think DB2/400 strictly adheres to this convention, because I have created physical files (defined by DDS) that would allow me to add records with nulls in the key fields. However, SQL/400 does appear to enforce this restriction. When I tried to insert null key values into an SQL table, I got error message SQL054, which says that I violated a check constraint.
10. Null values are ignored in aggregation functions in SQL. Look at the following table.
Code |
Data |
A |
10 |
A |
30 |
B |
5 |
B |
Null |
A |
10 |
A |
Null |
B |
Null |
B |
20 |
B |
5 |
A |
50 |
Try to predict the results of the following queries.
select code, count(*) from mydata group by code
Did you get 5 As and 5 Bs? I hope so. Now, look at a slightly different query.
select code, count(data) from mydata group by code
I hope you determined that there are 4 As and 3 Bs. Rows with nulls in the DATA field were ignored. Try this one.
select code, avg(data) from mydata group by code
Did you get averages of 25 and 10 for A and B? The following query makes it more understandable.
select code, count(*), sum(data), count(data), avg(data) from mydata group by code
Here are the results.
Code |
Count(*) |
Sum(Data) |
Count(Data) |
Avg(Data) |
A |
5 |
100 |
4 |
25 |
B |
5 |
30 |
3 |
10 |
When determining the average, DB2/400 ignores rows with null values in the averaged field.
11. If a null value is used in a calculation, the result is null. Consider the following query with calculated fields.
select salary, dec(salary * 1.03,9,2), dec(salary * 1.05,9,2) from qtemp/mydata
If SALARY is null, the two calculated potential raises will also be null.
12. The sort order of null is not standard. Nulls may sort before or after other values. My experience has been that null values follow other values, but I would not bet on it.
select * from mydata order by data, code
Code |
Data |
B |
5 |
B |
5 |
A |
10 |
A |
10 |
B |
20 |
A |
30 |
A |
50 |
A |
Null |
B |
Null |
B |
Null |
13. SQL allows columns (fields) to have null values by default. To make an SQL column (field) “nullable”, do nothing. But if you want to prevent a column from having nulls, add either “NOT NULL” or “NOT NULL WITH DEFAULT” to the column definition. In the following table definition command, DOB is the only column allowed to have null values.
create table people (id integer not null, name varchar(25) not null, dept char(2) not null with default '00', dob date, primary key (id))
DDS, on the other hand, makes fields not nullable by default. To permit nulls, use the ALWNULL keyword at the field level. The following physical file can store null values in the customer name and sales rep fields.
A UNIQUE A R CUSTREC A RECCD 1A TEXT('Active record flag') A DFT('A') A COLHDG(' ' ' ' 'A/I') A COMPANY 3P 0B TEXT('Company') A DFT(1) A COLHDG(' ' ' ' 'Company') A EDTCDE(1) A CUSTNBR 5P 0B TEXT('Customer account number') A COLHDG('Customer' ' account' + A ' number') A EDTCDE(4) A CUSTNAME 20 B TEXT('Customer name') A COLHDG(' ' 'Customer' 'name') A ALWNULL A SLSREPNO 3P 0 A ALWNULL A TSTAMP Z TEXT('Maintenance timestamp') A COLHDG('Changed') A K COMPANY A K CUSTNBR
It’s hard to believe that this much could be said about things that don’t exist. But after seeing the M*A*S*H episode, maybe I shouldn’t be so surprised.