Using SQL WITH clause to create temporary static tables at query time

A few days ago, I came across the following problem: I currently work on a project where I am responsible for an application that writes an entry to a log table every time a job is executed. This table contains a lot of information on job statuses, possible problems, exceptions, durations, and so on. I was working on some analytics on this data and needed to enrich it with the version of the software that generated each log entry (since we were not capturing this in the log table). From our configuration management tool, I was able to extract the dates on which each version of the software was deployed to production.

Problem

My intention was to create a temporary table to join onto the logged entries, but I didn't want to create any tables on the Oracle server (mainly because they would only have been needed temporarily and because the schema user I was using didn't have the rights to create tables).

Solution: Using the WITH clause to create temporary static data

Since I was just interested in getting my analytics done, I used an SQL WITH clause to create a temporary static table (which exists only for the duration of the query) and then joined it to the job table. Creating static data with the WITH clause is rather uncommon, since the clause is usually used to create a temporary view by querying existing data.

About the WITH clause

The WITH query_name clause lets you assign a name to a subquery block. You can then reference the subquery block multiple places in the query by specifying the query name. Oracle optimizes the query by treating the query name as either an inline view or as a temporary table.

You can specify this clause in any top-level SELECT statement and in most types of subqueries. The query name is visible to the main query and to all subsequent subqueries except the subquery that defines the query name itself. [1]
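As a minimal sketch of the "reference it in multiple places" point: the query name sample_numbers and its three values are made up purely for illustration. The subquery is defined once and then used twice, in the main query and in a scalar subquery.

```sql
-- Illustrative only: sample_numbers and its values are hypothetical.
WITH sample_numbers AS (
    SELECT 1 AS n FROM dual UNION ALL
    SELECT 2      FROM dual UNION ALL
    SELECT 3      FROM dual
)
SELECT n
FROM   sample_numbers                              -- first reference
WHERE  n > (SELECT AVG(n) FROM sample_numbers);    -- second reference to the same query name
```

The average of 1, 2 and 3 is 2, so only the row with n = 3 is returned.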

The initial static data

I extracted the following data from our configuration management tool. It provides information about when a certain version of the software we use was deployed in production. (The data has been anonymised for privacy reasons, though.)

[Image: deployment dates per software version]

I then semi-manually transformed the column data from Excel into the following SQL statement.

The WITH clause to create the temporary static table

WITH software_versions as (
(SELECT 'V 8.1' as version, TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss')  as from_date, SYSDATE as to_date FROM dual ) UNION
(SELECT 'V 8.0' as version, TO_DATE('07.10.2013 10:06:00', 'dd.mm.yyyy hh24:mi:ss')  as from_date, TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss')  as to_date FROM dual ) UNION
(SELECT 'V 7.2' as version, TO_DATE('29.07.2013 07:54:00', 'dd.mm.yyyy hh24:mi:ss')  as from_date, TO_DATE('07.10.2013 10:06:00', 'dd.mm.yyyy hh24:mi:ss')  as to_date FROM dual ) UNION
(SELECT 'V 7.1' as version, TO_DATE('04.06.2013 10:02:00', 'dd.mm.yyyy hh24:mi:ss')  as from_date, TO_DATE('29.07.2013 07:54:00', 'dd.mm.yyyy hh24:mi:ss')  as to_date FROM dual ) UNION
(SELECT 'V 7.0' as version, TO_DATE('13.05.2013 08:32:00', 'dd.mm.yyyy hh24:mi:ss')  as from_date, TO_DATE('04.06.2013 10:02:00', 'dd.mm.yyyy hh24:mi:ss')  as to_date FROM dual ) UNION
(SELECT 'V 6.0' as version, TO_DATE('10.05.2013 08:34:00', 'dd.mm.yyyy hh24:mi:ss')  as from_date, TO_DATE('13.05.2013 08:32:00', 'dd.mm.yyyy hh24:mi:ss')  as to_date FROM dual ) UNION
(SELECT 'V 5.0' as version, TO_DATE('16.03.2013 08:09:00', 'dd.mm.yyyy hh24:mi:ss')  as from_date, TO_DATE('10.05.2013 08:34:00', 'dd.mm.yyyy hh24:mi:ss')  as to_date FROM dual )
)

Every static SELECT generates one row and uses TO_DATE to convert the from_date and to_date values from text to the DATE datatype. This is important because range queries will be run against this result. For the version currently in production, to_date is set to SYSDATE. Finally, the UNION combines all these single rows into one result set.
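The same technique can also be written as one complete, runnable statement. The following is only a sketch, shortened to two of the rows above. It assumes that the rows are already distinct, so it uses UNION ALL, which skips the duplicate-elimination step that UNION performs. Column aliases are only needed in the first SELECT, because Oracle takes the column names of the result from the first branch of the set operation.

```sql
-- Sketch: static table built from literals, shortened to two versions.
WITH software_versions AS (
    SELECT 'V 8.1' AS version,
           TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss') AS from_date,
           SYSDATE AS to_date            -- version still in production
    FROM   dual
    UNION ALL
    SELECT 'V 8.0',
           TO_DATE('07.10.2013 10:06:00', 'dd.mm.yyyy hh24:mi:ss'),
           TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss')
    FROM   dual
)
SELECT s.version, s.from_date, s.to_date
FROM   software_versions s
ORDER  BY s.from_date;
```

With only a handful of static rows, the difference between UNION and UNION ALL is unlikely to be measurable. The choice matters more when the branches query real tables.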

The result set of the query on the temporary static table created using the WITH clause

--WITH statement was omitted here for better readability
SELECT * FROM software_versions;

[Image: result set of the software_versions query]

Now, we will combine this table with the logged events in the job table.

The job table

SELECT * from job;

[Image: extract of the job table]
This is just an extract; the job table contains many more fields on the status of jobs.

The job table enriched by the software version information

The idea here is to join the software_versions table with the job table and to select only those rows where the job_start_date falls within the range during which a certain version of the software was installed:

--WITH statement was omitted here for better readability
SELECT s.version, j.job_id, j.job_start_date, j.job_end_date
FROM   job j,
       software_versions s
WHERE  j.job_start_date BETWEEN s.from_date AND s.to_date
ORDER BY job_id desc;

[Image: job table enriched with the software version]
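One detail worth keeping in mind: BETWEEN includes both endpoints, and the to_date of one version equals the from_date of the next, so a job that starts exactly at a deployment timestamp would match two versions. The following sketch shows a possible variant that uses a half-open range (from_date inclusive, to_date exclusive) and ANSI JOIN syntax. The sample_job rows are made up and only stand in for the real job table.

```sql
-- Sketch: half-open date ranges so that each job matches exactly one version.
WITH software_versions AS (
    SELECT 'V 8.1' AS version,
           TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss') AS from_date,
           SYSDATE AS to_date
    FROM   dual
    UNION ALL
    SELECT 'V 8.0',
           TO_DATE('07.10.2013 10:06:00', 'dd.mm.yyyy hh24:mi:ss'),
           TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss')
    FROM   dual
),
sample_job AS (   -- hypothetical stand-in for the real job table
    SELECT 1 AS job_id,
           TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss') AS job_start_date  -- exactly on the boundary
    FROM   dual
    UNION ALL
    SELECT 2,
           TO_DATE('01.11.2013 12:00:00', 'dd.mm.yyyy hh24:mi:ss')
    FROM   dual
)
SELECT s.version, j.job_id, j.job_start_date
FROM   sample_job j
JOIN   software_versions s
  ON   j.job_start_date >= s.from_date   -- inclusive lower bound
 AND   j.job_start_date <  s.to_date     -- exclusive upper bound
ORDER  BY j.job_id DESC;
```

Here job 1 is assigned to V 8.1 only, whereas BETWEEN would have returned it once for V 8.0 and once for V 8.1.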

Counting the number of jobs executed by a certain software version

It might also be of interest to see how many jobs were executed by each version of the software.

--WITH statement was omitted here for better readability
SELECT COUNT(job_id) as jobs_executed, s.version
FROM   job j,
       software_versions s
WHERE  j.job_start_date BETWEEN s.from_date AND s.to_date
GROUP BY s.version;

Result:

[Image: number of jobs executed per software version]

Obviously, it is now possible to adapt the query to count only jobs with a certain status (succeeded, failed, …). Please go ahead and try out different filters and groupings. This is beyond the scope of this tutorial, but you can see what I mean! :-D
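As one possible starting point for such a status filter, here is a sketch that counts all jobs and the failed jobs per version in a single pass. The column name job_status, its values and the sample_job rows are assumptions made for illustration; the real job table may name and encode the status differently.

```sql
-- Sketch: conditional counting per version; job_status and its values are hypothetical.
WITH software_versions AS (
    SELECT 'V 8.1' AS version,
           TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss') AS from_date,
           SYSDATE AS to_date
    FROM   dual
    UNION ALL
    SELECT 'V 8.0',
           TO_DATE('07.10.2013 10:06:00', 'dd.mm.yyyy hh24:mi:ss'),
           TO_DATE('20.12.2013 05:04:00', 'dd.mm.yyyy hh24:mi:ss')
    FROM   dual
),
sample_job AS (   -- made-up rows standing in for the real job table
    SELECT 1 AS job_id,
           TO_DATE('01.11.2013 12:00:00', 'dd.mm.yyyy hh24:mi:ss') AS job_start_date,
           'SUCCEEDED' AS job_status
    FROM   dual
    UNION ALL
    SELECT 2, TO_DATE('15.11.2013 09:30:00', 'dd.mm.yyyy hh24:mi:ss'), 'FAILED'    FROM dual
    UNION ALL
    SELECT 3, TO_DATE('02.01.2014 08:00:00', 'dd.mm.yyyy hh24:mi:ss'), 'SUCCEEDED' FROM dual
)
SELECT s.version,
       COUNT(*) AS jobs_executed,
       -- CASE yields NULL for non-failed jobs, and COUNT ignores NULLs
       COUNT(CASE WHEN j.job_status = 'FAILED' THEN 1 END) AS jobs_failed
FROM   sample_job j
JOIN   software_versions s
  ON   j.job_start_date >= s.from_date
 AND   j.job_start_date <  s.to_date
GROUP  BY s.version
ORDER  BY s.version;
```

With these sample rows, V 8.0 shows two executed jobs of which one failed, and V 8.1 shows one executed job and no failures.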



Summary

In this micro tutorial, we have seen one way to create temporary static tables using the WITH clause. (I used an Oracle database for this tutorial, but the WITH clause is also available in other database products.) This temporary table was joined to a job table, and the entries in that table were enriched with the version of the software that generated them. Please feel free to leave comments, and let me know if I should make things clearer or more explicit and whether this tutorial was useful for you.

Links

[1] http://psoug.org/reference/with.html

2 Responses to “Using SQL WITH clause to create temporary static tables at query time”


  1. PoMo

    Just adding that what you’re referring to are called CTEs (common table expressions).

    I only stumbled upon them about a year ago, and while I think they add nothing that couldn’t also be done with subqueries, they’re nonetheless a joy because they make reading a query that much easier.

    Also, even though it is hardly worth mentioning, afaik it’s better for performance to use “UNION ALL” instead of “UNION” when you know that what you union is already distinct (or if you don’t mind potential duplicates).

  2. Marc Giombetti

    Hi Pomo,

    yes, you are right, using UNION ALL would be faster in general, mainly if you are using queries to retrieve data into the CTEs. In this special case, I guess it won’t make a difference in performance (since there are not many entries in the static versions table).

    thanks a lot for your feedback!
    Marc
