Guru: Regular Expressions, Part 2
August 15, 2022 Mike Larsen
In the first part of this series, I showed how to replace characters in a string using SQL and regular expressions. This time, I’ll show other regular expressions that are available for us to use. The regular expression functions I’ll show are REGEXP_COUNT and REGEXP_LIKE and the examples come from production programs that I’ve recently implemented.
In the first example (Figure 1), I want to get a count of the number of occurrences of a pattern within a string. To make this more interesting, I’m searching for two different patterns within a sting. To accomplish this objective, I used REGEXP_COUNT to get a count of how many times the pattern appears in a string.
Before showing the result of this statement, let’s analyze the arguments being used. The first argument is the string to be searched. In a production program, this would likely be a variable or a field from a file or table. The second argument indicates what I’m looking for in the string. In this case, I’m looking for either [CR] or [GS]. The pipe symbol is used to indicate this is an ‘or’ statement. The third argument directs the function to start looking for these patterns in the first position of the string. Finally, the last argument I specified indicates that I want to ignore the case of the letters being searched. The result of the statement is shown in Figure 2.
In the next example, I’ll show the REGEXP_LIKE function (Figure 3). With this function, I’m checking for the existence of a pattern with a string. If the pattern is found, the function returns a ‘Y’. If the pattern isn’t found, the function returns a blank. Let’s look at this function using the same string I used in the first example.
The arguments provided are the same as I used in the REGEXP_COUNT, so they don’t need further explanation. When I run this statement (Figure 4), a ‘Y’ is returned as the pattern is found in the string.
So, what if the pattern isn’t found? I’ll show that scenario in Figure 5.
When I run this statement, a blank is returned (Figure 6).
The final example is from a query I wrote to identify customers who have the letters ‘DBA’ at the beginning of an address field. On the surface, this seems like a simple request, but there is a twist. During the data entry process, this field may have been populated as ‘DBA’, ‘D/B/A’, (DBA)’, ‘DBA:’, or other ways. To identify these customers, I wrote the statement shown in Figure 7.
For privacy purposes, I anonymized the library, table name, and field names in the query. I also anonymized company names in the results (Figure 8).
Hopefully the examples provided in this series are helpful in getting you started with regular expressions. I’ve found them to be very useful in my programming and will continue to use them where it makes sense. I’m always happy to have more tools accessible to me.