Testing for Valid Data Representation in SQL
October 8, 2008 Hey Mike
How can I tell if a character column contains a valid representation of numeric data? I have a column expression in a Select statement that needs to convert character Column A to numeric only when it contains a valid numeric representation. If Column A doesn’t contain a numeric representation, then I need to convert Column B to numeric. —FHG reader Testing for valid numeric representation is an important task. The problem is that casting character data to numeric will result in a run time error if the data is invalid and therefore it is appropriate to test the data before performing a cast. SQL Server’s T-SQL and Visual Basic for Applications (VBA) both contain an “IsNumeric” function to test if a string contains data that can be converted to numeric. Unfortunately, DB2 for i has no such built-in function. However, there are a variety of ways to perform this test. The first approach is to use the powerful translate function: Select Case When Trim(Translate(Data, ' ', '+-E.0123456789'))> ' ' Then 'N' Else 'Y' End As IsNumeric From DataTable In this example, the Translate function is applied to column “Data”. Any valid numeric characters are replaced by a space. If, after all the numeric characters are removed, there is still something other than a space, then the CASE statement will set this numeric test to “N” (i.e., not numeric). If only spaces are left, then it is assumed that the data is numeric. This technique is very useful but suffers from the limitation that there is no consideration given to the order of the characters or number of special characters, such as the plus (+) or minus (–) symbols, contained within the string. For instance, the value “32+5E23+.2-” would be considered a valid numeric value with this technique! Further, any embedded spaces or a value of all spaces would pass this test. Clearly much more work is needed to bulletproof this approach. An alternative approach would be to write a User-Defined Function (UDF) to do this task. And instead of reinventing the wheel with complicated TRANSLATE logic to find if the data is valid, this UDF would simply re-use what IBM has already given us–the ability to cast character data to numeric with error handling. Here is a function definition that emulates SQL Server’s IsNumeric function. Create Function xxxxx/IsNumeric (@TestData VarChar(64)) Returns Char(1) Deterministic Not Fenced Language SQL Set Option Commit=*None, UsrPrf=*Owner Begin Declare @Double Double Not Null Default 0; Declare Exit Handler For SQLException Return 'N'; If @TestData Is Null Then Return 'N'; End If; Set @Double=Cast(@TestData As Double); Return 'Y'; End The function receives the character data in variable “@TestData” and attempts to convert it to the DOUBLE numeric data type. DOUBLE was chosen because it can hold numbers ranging from -2.23×10308 to 2.23×10308. (Starting with version 6.1, developers can use the new DECFLOAT data type to get a much greater range of numbers!) If the conversion succeeds, the function returns a Y. If the conversion fails or is NULL, IsNumeric returns an N. This function can handle the following representations: Negatives: ‘-327.35’ Positives: ‘+267’ Floating point: ‘-1.7546E-10’ Also, the function is smart enough to know that problems like embedded spaces, or an “E” that is not used as an exponent, constitute an invalid representation. When used within a Select column expression, IsNumeric can be used as follows: Case When IsNumeric(Column1)= 'Y' Then Cast(Column2 As datatype) Else Cast(Column3 As datatype) End Of course, this assumes that the data types of both casts are compatible. Depending on the situation, it may be better to cast just once: Cast(Case When IsNumeric(Column1)= 'Y' Then Column2 Else Column3 End As datatype) It’s easy to use SQL to write additional functions such as “IsDate” or “IsTime” to test for a valid date or time representation. Function IsDate is shown here: Create Function xxxxx/IsDate (@Date VarChar(10)) Returns Char(1) Deterministic Not Fenced Language SQL Set Option Commit=*None, UsrPrf=*Owner Begin Declare @TestDate Date; Declare Exit Handler For SQLException Return 'N'; If @Date Is Null Then Return 'N'; End If; Set @TestDate=Cast(@Date As Date); Return 'Y'; End; Dates are a little bit more difficult to work with because DB2 makes an assumption about how the character date should be formatted based on the SQL session’s date format and date separator settings. First of all, you’re always safe if your date is in one of the four-digit year formats with the proper separator (*ISO,*USA,*EUR,*JIS) regardless of your session’s date format setting. However, if you’re expecting the character data to be in one of the following formats–*YMD,*MDY,*DMY,*JUL–then your SQL session’s date format setting must match. This also means that you can’t validate *YMD and *JUL in the same SQL UDF. If you need this complexity you will need to code additional logic in the UDF. I have found that taking the time to write these test functions pays off in the long run. For one example, it is common to copy a spool file into a physical file and then use SQL to analyze the spool file data to say, add a column of numbers or test a date. In these cases, it is handy to have functions like IsNumeric and IsDate to test data before performing a cast since spool files can often contain many formatting discrepancies. Likewise, in ERP applications it is common to have users type in numbers or dates with special meaning into a free form character reference or comment field. There is no guarantee that a user is going to format this data properly. Again, IsNumeric and IsDate can be used to verify that the user typed something in correctly before Casting the data and risking an error. Overall, using these validation functions can help protect your applications and queries by providing an additional bullet proofing defense. With a little practice, it only takes a few minutes to write these helper functions. Michael Sansoterra is a programmer/analyst for i3 Business Solutions, an IT services firm based in Grand Rapids, Michigan. Send your questions or comments for Michael to Ted Holt via the IT Jungle Contact page.
|
Hi,
I tired the ISDATE function out. It worked for the most part. The issue is that the user is able to enter a date of any format. For the most part they are using month/day/year(2 digits 🙁 ) . I have no control over whats entered or the application they use so I can’t change it to a date field or tell them to use a 4 digit year.
Anyways..
The function above returns N for all of these dates even though I can use the DATE function to cast it to a date field in the query. How can I modify the ISDATE function to accept the date format they are using and DATE says is okay?
Thanks!