SSIS: Concatenate/flatten multiple nodes/rows into strings

SSIS: Concatenate/flatten multiple nodes/rows into strings

Postby cccp » Mon Dec 15, 2008 2:30 pm

I am using SSIS. In my package I have an xml source that retrieves an XML structure simliar to (scaled down for readibility):

<root id="8391">
<event>
<title>Christmas Event 2008</title>
<description>Again we are looking forward to a busy christmas
period...</description>
<keyword>Christmas</keyword>
<keyword>Event</keyword>
<location id="344" level="city">
<name>London</name>
<address>541 Wigram Lane</address>
<parent-location id="222" level="Country">
<name>England</name>
<parent-location id="444" level="Subcontinent">
<name>United Kingdom</name>
</parent-location>
</parent-location>
</location>
</event>
</root>

I need to 'flatten' the above so I am passing the information to a stored
procedure as follows (note concatenations on description, keywords and
locations):

@Title = "Christmas Event 2008"
@Description = "Christmas Event 2008<br><br>
Again we are looking forward to a busy christmas period...<br><br>
Address: 541 Wigram Lane<br><br>
London"
@Keywords = Christmas, Event (i.e. need to concatenate the two keyword nodes)
@Locations = London, England, United Kingdom (i.e. need to concatenate the
multiple location & parent-location nodes)

Can anyone help me get started on the best way to approach the 'flattening' I'm not sure which SSIS toolbox items to use - new to SSIS. Especially for traversing the 'child' data (keywords, location).

Today I was playing around with using a script component and the processinput event, looping through all the records for keyword and location and concatenating, I was them attempting to pass to a package level variable, thinking I could add the variable values to an output on my core dataset. That was all good, except my variable doesn't get set. Apparently variables can only be set on the postexecute event - unfortunately this event is not firing.

I suppose my question(s) to everyone out there are:

1. Am I heading down the right track to use script component to concatenate
multiple rows into variables?

2. Or is there a better method i.e. would XSLT be more suited to this task
i.e. Use an xml task to apply the xslt, then pass to xml source, etc?

Thanks :)

User avatar
cccp
Newbie
Newbie
 
Posts: 35
Joined: Mon Apr 30, 2007 10:31 am

Re: SSIS: Concatenate/flatten multiple nodes/rows into strings

Postby Darwin » Mon Dec 15, 2008 2:32 pm

Why don't you use XML data type and its features instead of SSIS.

1. To import XML into a variable use OPENROWSET with BULK provider:

DECLARE @variable xml
SET @variable = (
SELECT *
FROM OPENROWSET(BULK N'C:\myfile.xml', SINGLE_NCLOB) -- or SINGLE_CLOB if not Unicode
)
SELECT @variable -- you can see if it works

2. To explore imported xml use the methods of XML data type: query(), nodes(), value(), exist(), modify(). Below some sample code that retrieves all the information needed from your XML document:

declare @x xml

set @x = '<root id="8391">
<event>
<title>Christmas Event 2008</title>
<description>Again we are looking forward to a busy christmas
period...</description>
<keyword>Christmas</keyword>
<keyword>Event</keyword>
<location id="344" level="city">
<name>London</name>
<address>541 Wigram Lane</address>
<parent-location id="222" level="Country">
<name>England</name>
<parent-location id="444" level="Subcontinent">
<name>United Kingdom</name>
</parent-location>
</parent-location>
</location>
</event>
</root>'

select
CAST(@x.query('/root/event/title/text()[1]') AS varchar(8000)) + '<br><br>' as Title

select
CAST(@x.query('/root/event/title/text()[1]') AS varchar(8000)) + '<br><br>' + char(13) + char(10) +
CAST(@x.query('/root/event/description/text()[1]') AS varchar(8000)) + '<br><br>' + char(13) + char(10) +
CAST(@x.query('/root/event/location/address/text()[1]') AS varchar(8000)) + '<br><br>' + char(13) + char(10) +
CAST(@x.query('/root/event/location/name/text()[1]') AS varchar(8000)) as Description

declare @keywords varchar(8000)
set @keywords = ''
select @keywords = @keywords + T.n.value('text()[1]','varchar(20)') + ', '
from @x.nodes('/root/event/keyword') T(n)
select LEFT(@keywords,LEN(@keywords)-1) AS Keywords

declare @locations varchar(8000)
set @locations = ''
select @locations = @locations + Q.location + ', '
from (
select T.n.value('text()[1]','varchar(20)') AS location
from @x.nodes('//location/name') T(n)
union all
select T.n.value('text()[1]','varchar(20)')
from @x.nodes('//parent-location/name') T(n)
) AS Q
select LEFT(@locations, LEN(@locations)-1) AS Locations

Instead of just selecting you can put those results into the variables and then just pass those variables to the stored procedure's parameters.

So just for summary - you can create a stored procedure that takes a path to XML file as a parameter. Then the procedure takes the document into xml variable (you will probably need to use dynamic sql here), queries xml against all information needed, puts information into variables and executes another procedure passing variables as the values of the parameters.

User avatar
Darwin
Full Member
Full Member
 
Posts: 111
Joined: Mon Oct 09, 2006 11:16 am
Location: Bradenton, FL


Return to Microsoft SQL Server

Who is online

Users browsing this forum: No registered users and 1 guest

cron