Apr 23, 2015

Term extraction transformation SSIS example

This transformation extracts the noun and noun phrases from text. Let's understand Term extraction transformation SSIS by simple example:

Step 1: Create a new table and populate with some data:

CREATE TABLE tblDocument (
     ntDocumentID BIGINT
     ,vcDoucumentName VARCHAR(100)
     ,vcData VARCHAR(MAX)
)

INSERT INTO tblDocument
VALUES
     (1,'Math','I am learning mathematics.')
     ,(2,'Che','I am learning chemistry.')
     ,(3,'Phy','I am learning physics.')

Step 2:
Drag any data source adapter in data flow task area. In this example in is ADO .NET source adapter. It is connected with Sql server:


Step 3:  Drag term extract transformation and connect with data source adapter. After double clicking it will pop up term exaction transformation editor. Select the any on text data type column from where we want to extract the nouns or noun phrases. 


Step 4: If we to exclude a set of nouns or noun phrases. In this example we want to exclude the noun "Chemistry".  We are creating a new table in database and populating exclude terms:

CREATE TABLE tblExclude (
     ntExcludeID BIGINT
     ,vcNoiseWord VARCHAR(100)
)

INSERT INTO tblExclude
VALUES(1,'chemistry')

Again come back to SSIS. Under exclusion tab, create a new database connect where we have just created a table named "tblExclude". Select the table and column name from drop down.

  
Step 5:
Under advance tab:
Term type:   Noun means a single word noun and Noun phrase means groups of words (Noun, adjective etc.) In this example we are choose "Noun"

Score Type:  Frequency means total occurrence of a word. TFIDF is a fractional type score. In this example we are choosing TFIDF.

Parameters:  Frequency threshold means minimum occurrence of word to exact from text column. In this example we are choosing only occurrence is sufficient.   


Step 6: Populate the output to any destination and run it. Sample output:


Apr 22, 2015

Import column transformation example SSIS

Using import column transformation data can be loaded from file system task (content of a file) to database or any other destination adapter.  For example:

Let's assume we have a table named document. We want to load content of file to vcData column according path specified in other column (vcDocumentName). 

CREATE TABLE tblDocument (
     ntDocumentID BIGINT
     ,vcDoucumentName VARCHAR(100)
     ,vcData NVARCHAR(MAX)
)

INSERT INTO tblDocument
VALUES
     (1,'Math',NULL)
     ,(2,'Che',NULL)
     ,(3,'Phy',NULL)

Step 1:
Drag any data source adapter in data flow task area. In this example it is an ADO .NET source adapter and connected with Sql server:

Import column source Adapter

Step 2: It is optional step. In this step we are going to concatenate actual physical file path with the column named vcDoucumentName by using derived column transformation:

Import column transformation Physical Path 

Step 3: Drag import column transformation and connect with derived column transformation. Double click on it which will pop up advance editor.  
SSIS Import column input file path
In Input columns tab select column which keeps file path.

Step 4: Now click on input and output properties tab. In the output Columns node add a new column. In this example it is vcFileData and selecting DT_NTEXT as a data type from drop down. Copy the LinenageID which is 297 in this example. 

Getting file content Import Column SSIS
Step 5: In input columns node, select vcDcoumntName (columns which keeps file path) and enter the lineage ID of the output column in FileDataColumnID as shown in below diagram:
Linking File Path to File Data SSIS Import Column 
Step 6: Drag OLEDB command and connect with import column transformation. Create a new OLEDB connection. Double click on OLEDB command transformation. Under the connection managers tab select new OLED connection from drop down.
Updating File data SSIS Import Column 
Step 7: Under Component Properties tab enter SQL Command:

UPDATE tblDocument
SET vcData = ?
WHERE ntDocumentID = ?

As shown in below diagram: 

Update query to Update with file data SSIS Import column
Step 8: Under column mapping tab, map the destination column with input column.
Column Mapping Import column transformation SSIS 
Step 9: Now execute the package!!
Run Import column Transformation SSIS
And check your table:

ntDocumentID
vcDoucumentName
vcData
1
Math
I am learning mathematics.
2
Che
I am learning chemistry.
3
Phy
I am learning physics.

Apr 20, 2015

How to save data of field in a file of given column path SSIS: Export column transform



Export columns transform is used to store text\image\binary content as a file at given location in different column of a table. For example:

Step 1: Create a new table and populate with some data:
CREATE TABLE tblDocument (
     ntDocumentID BIGINT
     ,vcDoucumentName VARCHAR(100)
     ,vcData VARCHAR(MAX)
)

INSERT INTO tblDocument
VALUES
     (1,'Math','I am learning mathematics.')
     ,(2,'Che','I am learning chemistry.')
     ,(3,'Phy','I am learning physics.')
Step 2:
Drag any data source adapter in data flow task area. In this example in is ADO .NET source adapter. It is connected with Sql server:

Export column source adapter
Step 2: It is option step. In this step we are going to concatenate physical file path in the column named vcDoucumentName by using derived column transformation:

SSIS Export column file path
Step 3: Now drag export column transformation and connect it with derived column transformation. Double click on it and will pop up export column transformation editor. In column tab:
 
Extract column: From drop down choose a  column which keeps content of file.

File Path column: from drop down choose a column   which keeps physical path of the file. 

Allow append: If it is selected and file is exists then it will appended the data in same file.

Force truncate: If it is selected and file is exists then existing will be deleted and a new file will be created. 

Export column SSIS configuration
Now execute the package. You will find for each row there is a file at specified location:

Run export column SSIS