UNION and UNION ALL in SQL Server SELECT statement

In this article, we will learn about UNION and UNION ALL and compare the two in terms of performance, behavior, and syntax.

What is the UNION Operator

The UNION operator in SQL Server combines the result sets of two or more SELECT queries into one. Every SELECT statement must return the same number of columns, and the columns in matching positions must have compatible data types. The queries and their data do not need any logical relationship to each other. UNION returns only distinct rows: duplicates are removed from the final result set. To remove the duplicates, SQL Server often sorts the combined rows, so the output frequently looks sorted, but that order is not guaranteed unless you add an ORDER BY clause.

Example

Here we have two tables. Let's run a separate SELECT query on each of them.

Figure: results of the two SELECT statements

Stephen Watson appears in both result sets.

Figure: UNION example

As we can see, the results of both SELECT statements are combined and the duplicate record is removed.
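The behavior described above can be reproduced with a small, runnable sketch. This is only an illustration under stated assumptions: the table names Employees and Customers and every person other than Stephen Watson are hypothetical, and SQLite (through Python's sqlite3 module) stands in for SQL Server so the example runs without a server. The UNION statement itself is standard SQL that SQL Server accepts as well.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (FirstName TEXT, LastName TEXT);
    CREATE TABLE Customers (FirstName TEXT, LastName TEXT);
    -- Hypothetical sample data; only Stephen Watson is shared by both tables.
    INSERT INTO Employees VALUES ('Stephen', 'Watson'), ('Maria', 'Lopez');
    INSERT INTO Customers VALUES ('Stephen', 'Watson'), ('David', 'Chen');
""")

# UNION combines both result sets and keeps each distinct row once.
union_rows = conn.execute("""
    SELECT FirstName, LastName FROM Employees
    UNION
    SELECT FirstName, LastName FROM Customers
""").fetchall()

print(len(union_rows))  # 3 rows: the shared Stephen Watson row appears once
```

Four rows go in (two per table), and three come out because the shared row is returned only once.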

What is the UNION ALL Operator

The UNION ALL operator in SQL Server does a similar job. It is also used between multiple SELECT statements that have the same column structure, and it combines all the rows they return, but it does not remove duplicate rows. Therefore, whenever the inputs contain duplicates, its result has more rows than the result of UNION.

Example

Let's run UNION ALL on the same two SELECT statements.

Figure: UNION ALL example

We can see that the results of both SELECT statements are combined, but this statement returns more rows than the UNION operator did. As mentioned above, UNION ALL does not remove the duplicate rows between the result sets.
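The same comparison can be sketched in a runnable form. As an assumption for illustration, the tables Employees and Customers and all names other than Stephen Watson are hypothetical, and SQLite (via Python's sqlite3 module) is used in place of SQL Server; the UNION ALL statement is standard SQL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (FirstName TEXT, LastName TEXT);
    CREATE TABLE Customers (FirstName TEXT, LastName TEXT);
    -- Hypothetical sample data; only Stephen Watson is shared by both tables.
    INSERT INTO Employees VALUES ('Stephen', 'Watson'), ('Maria', 'Lopez');
    INSERT INTO Customers VALUES ('Stephen', 'Watson'), ('David', 'Chen');
""")

# UNION ALL keeps every row from both queries, duplicates included.
union_all_rows = conn.execute("""
    SELECT FirstName, LastName FROM Employees
    UNION ALL
    SELECT FirstName, LastName FROM Customers
""").fetchall()

print(len(union_all_rows))  # 4 rows: Stephen Watson is returned twice
```

With this data UNION ALL returns all four input rows, one more than UNION would, because the duplicate is kept.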

Importance of Column Order and Data Types

It is very important to keep the columns in the same order in every SELECT statement combined with the UNION operators.

SQL Server does not look for any logical relationship between the result sets. It matches columns purely by position, so a column in the wrong position is either combined with unrelated data or causes a data type conversion error.

We have to take care of three things when using them:

  1. Keep the columns in the same order in every SELECT statement
  2. Return the same number of columns from every SELECT statement
  3. Keep the data types in each position compatible across all the SELECT statements
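The points in the list above can be illustrated with a short sketch. It is only a sketch under assumptions: the tables and names are hypothetical, and SQLite (via Python's sqlite3 module) stands in for SQL Server. One difference matters here: SQLite is loosely typed, so it does not reproduce the data type conversion errors SQL Server can raise when the types in a position are incompatible. The sketch therefore shows only the first two points.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (FirstName TEXT, LastName TEXT);
    CREATE TABLE Customers (FirstName TEXT, LastName TEXT);
    -- Hypothetical sample data.
    INSERT INTO Employees VALUES ('Maria', 'Lopez');
    INSERT INTO Customers VALUES ('David', 'Chen');
""")

# Point 1: columns are matched by position, not by name. Swapping the
# columns in the second SELECT silently puts last names under first names.
swapped_rows = conn.execute("""
    SELECT FirstName, LastName FROM Employees
    UNION ALL
    SELECT LastName, FirstName FROM Customers
""").fetchall()

# Point 2: a different number of columns is rejected outright.
try:
    conn.execute("""
        SELECT FirstName, LastName FROM Employees
        UNION ALL
        SELECT FirstName FROM Customers
    """)
    count_mismatch_rejected = False
except sqlite3.OperationalError:
    count_mismatch_rejected = True

print(swapped_rows)
print(count_mismatch_rejected)
```

The swapped query runs without any error, which is exactly why column order needs attention: the database cannot tell that ('Chen', 'David') has its values in the wrong columns.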

Performance comparison

Figure: UNION ALL estimated execution plan

There is one difference between the two execution plans: the UNION plan contains an additional Merge Join operation.

Figure: UNION estimated execution plan

As discussed above, the UNION operator performs a distinct operation to remove duplicate rows, which typically involves sorting the data. In this example, that step (shown as a Merge Join in the plan above) accounts for 46% of the estimated cost of the statement, which is a significant load. Comparing the two execution plans, UNION has this extra Merge Join operation in between, while UNION ALL does not. When duplicates cannot occur or do not matter, UNION ALL is therefore the cheaper choice.

UNION With ORDER BY

Figure: ORDER BY on multiple SELECT statements combined with UNION

We can also use ORDER BY with SELECT statements combined by the UNION operator. Keep in mind that ORDER BY must be placed at the very end, after the last SELECT statement, and that it sorts the whole combined result.

UNION on its own does not guarantee any particular row order, even though its duplicate removal often leaves the rows looking sorted. Add ORDER BY whenever you need a specific order, on whichever result columns your requirement calls for.
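A minimal sketch of ORDER BY applied to a UNION follows. As before, the tables and all names other than Stephen Watson are hypothetical, and SQLite (via Python's sqlite3 module) is an assumed stand-in for SQL Server; the statement is standard SQL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (FirstName TEXT, LastName TEXT);
    CREATE TABLE Customers (FirstName TEXT, LastName TEXT);
    -- Hypothetical sample data; only Stephen Watson is shared by both tables.
    INSERT INTO Employees VALUES ('Stephen', 'Watson'), ('Maria', 'Lopez');
    INSERT INTO Customers VALUES ('Stephen', 'Watson'), ('David', 'Chen');
""")

# A single ORDER BY after the last SELECT sorts the whole combined result.
ordered_rows = conn.execute("""
    SELECT FirstName, LastName FROM Employees
    UNION
    SELECT FirstName, LastName FROM Customers
    ORDER BY LastName DESC
""").fetchall()

print(ordered_rows)
# [('Stephen', 'Watson'), ('Maria', 'Lopez'), ('David', 'Chen')]
```

The ORDER BY refers to a column name taken from the first SELECT statement and applies to rows coming from both tables, not only to the last query.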

Difference between UNION and JOIN

UNION combines the result sets of two or more SELECT queries into a single result set that includes the rows from all the queries in the union, whereas a JOIN retrieves data from two or more tables based on logical relationships between the tables.

In short, UNION combines rows from two or more queries, whereas JOIN combines columns from two or more tables that have a logical relationship.
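The contrast can be sketched with two small queries. This is an illustration only: the Employees and Departments tables, their columns, and their data are hypothetical, and SQLite (via Python's sqlite3 module) is an assumed stand-in for SQL Server. Both statements are standard SQL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Name TEXT, DepartmentId INTEGER);
    CREATE TABLE Departments (DepartmentId INTEGER, DepartmentName TEXT);
    -- Hypothetical sample data.
    INSERT INTO Employees VALUES ('Maria', 1), ('David', 2);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Support');
""")

# UNION ALL stacks rows: one column, rows from both queries.
union_rows = conn.execute("""
    SELECT Name FROM Employees
    UNION ALL
    SELECT DepartmentName FROM Departments
""").fetchall()

# JOIN places columns side by side for rows related through DepartmentId.
join_rows = conn.execute("""
    SELECT e.Name, d.DepartmentName
    FROM Employees AS e
    JOIN Departments AS d ON e.DepartmentId = d.DepartmentId
    ORDER BY e.Name
""").fetchall()

print(union_rows)  # 4 rows with 1 column each
print(join_rows)   # [('David', 'Support'), ('Maria', 'Sales')]
```

The union result grows in height (more rows), while the join result grows in width (more columns per row) and depends on the DepartmentId relationship between the tables.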


