Generating and splitting a delimited string column

1

Download demo query – 1.6 KB

Introduction

Recently I have answered one of the Code Project questions, in brief the question was, how to split the delimited string and return as row values. This is one of the common questions we all are facing in our development. Not only splitting the delimited string, but also we struggle when creating the delimited strings using SQL.

In this article, I will show you how easily this can be done by using the SQL XQuery.

Creating delimited strings

Suppose we have a table called DeveloperProject to store the individual developer project list.

DECLARE @DeveloperProject Table(
 DeveoperID INT,
 ProjectCode VARCHAR(20)
)
INSERT INTO @DeveloperProject(DeveoperID, ProjectCode) VALUES(1,'CODE_PROJECT')
INSERT INTO @DeveloperProject(DeveoperID, ProjectCode) VALUES(1,'MSFT')
INSERT INTO @DeveloperProject(DeveoperID, ProjectCode) VALUES(1,'REPT_MOD')
INSERT INTO @DeveloperProject(DeveoperID, ProjectCode) VALUES(2,'MSFT')
INSERT INTO @DeveloperProject(DeveoperID, ProjectCode) VALUES(2,'CODE_PROJECT')
INSERT INTO @DeveloperProject(DeveoperID, ProjectCode) VALUES(3,'REPT_MOD')

Now we need to convert this as follows

DeveoperID  ProjectCodes
----------- ------------------------
1           CODE_PROJECT,MSFT,REPT_MOD
2           MSFT,CODE_PROJECT
3           REPT_MOD

There are many ways to do this. But here is the easiest way.

SELECT DeveoperID,
    REPLACE((SELECT ProjectCode AS'data()'
        FROM @DeveloperProject dp2
        WHERE dp2.DeveoperID = dp1.DeveoperID
    FOR XML PATH('')),' ',',')AS ProjectCodes
FROM @DeveloperProject dp1
GROUP BY DeveoperID
Instead of REPLACE function you can also use the SUBSTRING function as below
SELECT DeveoperID,
    SUBSTRING((SELECT ','+ ProjectCode
        FROM @DeveloperProject dp2
        WHERE dp2.DeveoperID = dp1.DeveoperID
    FOR XML PATH('')), 2, 8000)AS ProjectCodes
FROM @DeveloperProject dp1
GROUP BY DeveoperID
Table with delimited column

FOR XML() clause

A SELECT query returns results as a rowset. You can optionally retrieve formal results of a SQL query as XML by specifying the FOR XML clause in the query.
http://msdn.microsoft.com/en-us/library/ms178107.aspx

data() function

If you want to get the value of an attribute, rather than the attribute node itself, we can use the data() method.

For ex: if you execute the following query, you will get the xml output with <ProjectCode> attribute nodes.

SELECT ProjectCode
FROM @DeveloperProject c2
FOR XML PATH('')

XML PATH with Nodes
Now run it as follows, and see the result set. It gives the output without <ProjectCode> nodes.

SELECT ProjectCode AS 'data()'
FROM @DeveloperProject c2
FOR XML PATH('')

XML PATH without Nodes

Splitting delimited string

Suppose we have a table called DeveloperProjectCSV to store the developer id and comma separated project list

DECLARE @DeveloperProjectCSV Table(
    DeveoperID INT,
    ProjectCodes  VARCHAR(MAX)
)

We can use previously created script to insert the data to this table.

INSERT INTO @DeveloperProjectCSV(DeveoperID,ProjectCodes)
SELECT DeveoperID,
    REPLACE((SELECT
        ProjectCode AS 'data()'
     FROM @DeveloperProject dp2
     WHERE dp2.DeveoperID = dp1.DeveoperID
     FOR XML PATH('')), ' ', ',') AS ProjectCodes
FROM @DeveloperProject dp1
GROUP BY DeveoperID

Now the table contains following data

DeveoperID  ProjectCodes
----------- ------------------------------
1           CODE_PROJECT,MSFT,REPT_MOD
2           MSFT,CODE_PROJECT
3           REPT_MOD

And we need to split the project codes for each developer.

;WITH cte AS (
    SELECT
        DeveoperID,
        CAST('' + REPLACE(ProjectCodes, ',', '') + '' AS XML) AS ProjectCodes
    FROM @DeveloperProjectCSV
)
SELECT
    DeveoperID,
    xTable.xColumn.value('.', 'VARCHAR(MAX)') AS ProjectCode
FROM cte
CROSS APPLY ProjectCodes.nodes('//r') AS xTable(xColumn)

Splited string

WITH (common_table_expression)

WITH is used to specify the temporary named result set. In above WITH block creates temporary results set with two columns. One is a DeveloperID and other one is an xml type column for ProjectCodes
http://msdn.microsoft.com/en-us/library/ms175972.aspx

nodes() Method

The nodes() method is useful when you want to share an xml data type instance into relational data. It allows you to identify nodes that will be mapped into a new row.

nodes (XQuery) as Table(Column)

http://msdn.microsoft.com/en-us/library/ms188282(v=sql.100).aspx

CROSS APPLY

CROSS APPLY is just like an inner join query clause which allows joining between two table expressions.
http://technet.microsoft.com/en-us/library/ms175156.aspx


Creating delimited strings with SQL 2000

Since SQL 2000 doesn’t support XML PATH, above query doesn’t work with SQL 2000. As an alternative option we can use COALESCE or ISNULL function and create User Defined Function to get the concatenation output.
Suppose we have a table called “DeveloperProject” with following data inserted.

DeveloperID ProjectCode
----------- --------------------
1           CODE_PROJECT
1           MSFT
1           REPT_MOD
2           MSFT
2           CODE_PROJECT
3           REPT_MOD

First we have to create the concatenation function

CREATE FUNCTION dbo.rowToCSV(@DeveoperID int )
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @ProjectCodes VARCHAR(MAX)
    SELECT @ProjectCodes = COALESCE(@ProjectCodes + ', ', '') + ProjectCode
    FROM DeveloperProject
    WHERE DeveoperID = @DeveoperID
    RETURN @ProjectCodes
END

By using the above function, you will can get the same output.

SELECT
DeveoperID,
ProjectCodes = dbo.rowToCSV(DeveoperID )
FROM DeveloperProject
GROUP BY DeveoperID

Delimited String

COALESCE and ISNULL

COALESCE returns the first nonnull expression among its arguments. This is just like ISNULL function in SQL. Both ISNULL and COALESCE can be used to get the same results but there are some differences. http://msdn.microsoft.com/en-us/library/ms190349(v=sql.90).aspx

Splitting delimited string with SQL 2000
Again, SQL 2000 doesn’t support WITH (Common table expression) and the APPLY operators. As an alternative option we can use the spt_values master table.

SELECT  a.DeveloperID,
SUBSTRING(',' + a.ProjectCodes + ',', n.Number + 1, CHARINDEX(',', ',' + a.ProjectCodes + ',', n.Number + 1) - n.Number - 1) AS [Value]
FROM DeveloperProjects AS a
INNER JOIN master..spt_values AS n ON SUBSTRING(',' + a.ProjectCodes + ',', n.Number, 1) = ','
WHERE n.Type = 'p'
    AND n.Number > 0
    AND n.Number < LEN(',' + a.ProjectCodes + ',')

spt_values

spt_values is an undocumented system table, which contains various data items used by Microsoft written stored procedures. We can use the rows with type = ‘P’ to have a numbers table with the numbers from 1-2047.