SQL Count DISTINCT with partition

Posted by John Liu on Friday, July 7, 2023

There are times where you want to count distinct value over a partition window. However, COUNT(DISTINCT column) OVER(PARTITION BY …) is not supported by SQL. One alternative way to achieve the same results without using GROUP BY is to use the DENSE_RANK() function.

--if column does not allow null
SELECT 
    *
    ,CountDistinct = DENSE_RANK() OVER(PARTITION BY columnList ORDER BY column ASC) + DENSE_RANK() OVER(PARTITION BY columnList ORDER BY column DESC) - 1
FROM
    table

Please note, if your data might have null values, the above method might not work correctly as COUNT() ignore null value whereas DENSE_RANK() doesn’t. For that, using following method to subtract null value count.

--if column allows null
SELECT 
    *
    ,CountDistinct = DENSE_RANK() OVER(PARTITION BY columnList ORDER BY column ASC) + DENSE_RANK() OVER(PARTITION BY columnList ORDER BY column DESC) - 1
    - MAX(CASE WHEN column IS NULL THEN 1 ELSE 0 END) OVER (PARTITION BY columnList)
FROM
    table

If you need to ignore specific values, you can add additional logic similar with handling null value to exclude them.