
Monday, January 5, 2009

Loading the data from a table to the XML TARGET

The data looks like this in the source table:

ID  NAME      ADD1   STREET1   CITY1   COUNTRY1   ADD2    STREET2    CITY2   COUNTRY2
1   SUDHEER   216    SHIPRA    GZB     INDIA      C-701   JAIPURIA   GZB     INDIA
2   ABHISHEK  275    DWARKA    DELHI   INDIA

The output structure should be

ID NAME ADDRESS

In ADDRESS you will find ADD, STREET, CITY and COUNTRY, repeated once per address set in the source (two sets, ADD1 and ADD2, in the example above; it could just as well be ADD1 through ADD4).
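
For ID 1 above, the XML target would then look roughly like this (element names are illustrative; the real ones come from your XML schema):

<CUSTOMER>
  <ID>1</ID>
  <NAME>SUDHEER</NAME>
  <ADDRESS>
    <ADD>216</ADD><STREET>SHIPRA</STREET><CITY>GZB</CITY><COUNTRY>INDIA</COUNTRY>
  </ADDRESS>
  <ADDRESS>
    <ADD>C-701</ADD><STREET>JAIPURIA</STREET><CITY>GZB</CITY><COUNTRY>INDIA</COUNTRY>
  </ADDRESS>
</CUSTOMER>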

Sol :

FLOW :

SOURCE TABLE --- QUERY1 --- PIVOT --- QUERY2 --- QUERY3 --- QUERY4 --- XML TARGET

1)QUERY1--* Simply Extracting the data from the Source Table.

2)PIVOT---* Converting the Columns to rows

4 PIVOTAL SETS

1st Set---* ADD1, ADD2
2nd Set---* STREET1,STREET2
3rd Set---* CITY1,CITY2
4th Set---* COUNTRY1,COUNTRY2

Name the DATA FIELD COLUMNS as ADD,STREET,CITY,COUNTRY
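
Conceptually, the Pivot transform with these four sets turns each source row into one output row per address set, roughly what this SQL would return (SRC_CUSTOMER is just a placeholder name for the source table above):

SELECT ID, NAME, ADD1 AS ADD, STREET1 AS STREET, CITY1 AS CITY, COUNTRY1 AS COUNTRY FROM SRC_CUSTOMER
UNION ALL
SELECT ID, NAME, ADD2, STREET2, CITY2, COUNTRY2 FROM SRC_CUSTOMER

So ID 1 comes out as two rows (one per address set), plus the pivot sequence column that the transform adds.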

3)QUERY2---* Build the XML structure you need.

4)QUERY3---* For the parent tree, select the DISTINCT ROWS option

in order to restrict DUPLICATE data.

For the ADDRESS hierarchy, place the following condition in the
WHERE tab:

(((((Xml_Struct.ID = Xml_Struct.Pivot.ID) AND
(Xml_Struct.Pivot.ADD_DATA IS NOT NULL )) AND
(Xml_Struct.Pivot.STRT_DATA IS NOT NULL )) AND
(Xml_Struct.Pivot.CITY_DATA IS NOT NULL )) AND
(Xml_Struct.Pivot.CNTRY_DATA IS NOT NULL ))

In order to restrict NULL data

In the Group By tab place the following :
(((((Xml_Struct.ID = Xml_Struct.Pivot.ID) AND
(Xml_Struct.Pivot.ADD_DATA IS NOT NULL )) AND
(Xml_Struct.Pivot.STRT_DATA IS NOT NULL )) AND
(Xml_Struct.Pivot.CITY_DATA IS NOT NULL )) AND
(Xml_Struct.Pivot.CNTRY_DATA IS NOT NULL ))

In order to restrict DUPLICATE data.

5)QUERY4---* Construct the XML structure for the NRDM column.

Download ATL : http://www.divshare.com/download/6248312-2ed

*****

Monday, December 29, 2008

BODI FAQS

1) Is it necessary to select the input primary key column in the Table Comparison transform?

Yes, selecting the input primary key column(s) is mandatory.

2)Parallel Processing

You can run transforms in parallel by entering a number in the Degree of Parallelism box on a data flow’s Properties window. The number is used to replicate transforms in the data flow which run as separate threads when the Job Server processes the data flow.

** Entering a number in the Degree of Parallelism box on a data flow’s Properties window.
** The number is used to replicate transforms in the data flow

3) Source table partitioning in Data Integrator: how will you load the data?


4) Performance tuning of a job

The following sections describe ways you can adjust Data Integrator
performance:

• Source-based performance options

• Using array fetch size
• Caching data
• Join ordering
• Minimizing extracted data

• Target-based performance options

• Loading method and rows per commit
• Staging tables to speed up auto-correct loads

• Job design performance options

• Improving throughput
• Maximizing the number of pushed-down operations
• Minimizing data type conversion
• Minimizing locale conversion
• Improving Informix repository performance


5) Parameters in Dataflow

LOCAL VARIABLES, GLOBAL VARIABLES, PARAMETERS.


Dataflow Level---Only Parameters.

Workflow Level---Parameters,Local Variables.

Job Level --- Local Variables, Global Variables.

You create local variables, parameters, and global variables using the Variables and Parameters window in the Designer.

You can set values for local or global variables in script objects. You can also
set global variable values using external job, execution, or schedule properties.
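
As a minimal sketch of that last point, a script object placed before the data flow could set a global variable like this (assuming a job-level global variable $G_LOAD_DATE has been declared):

# Hypothetical script object: set the load date once per job run
$G_LOAD_DATE = sysdate();
print('Load date set to ' || to_char($G_LOAD_DATE, 'YYYY.MM.DD'));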


6) SCD Type 2 flow

1 way……….

In the data flow where the intermediate table and the final target table are joined, create an output column named DML_FLAG that holds 0, 1 or 2, based on the following:

a) The SCD columns retrieved from the intermediate table and the final target table are compared here.
b) If they do not match, assign 2.
If they match, assign 1.
If the surrogate key of the final target table is NULL, assign 0.

For DML_FLAG = 2 we then have two data flows, one for inserting the new version and one for updating the target table.
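
A minimal sketch of the DML_FLAG mapping, where 0 = brand-new row, 1 = SCD columns unchanged, 2 = SCD columns changed (the SRC/TGT column names are illustrative only):

ifthenelse(TGT.CUST_SKEY IS NULL, 0,
    ifthenelse(SRC.ADDRESS = TGT.ADDRESS AND SRC.CITY = TGT.CITY, 1, 2))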

Other way……….

SRC ---- TABLE COMPARISON ---- HISTORY PRESERVING ---- KEY GENERATION ---- TARGET

Table Comparison ----- Primary key columns, Compare columns (SCD columns)
History Preserving ----- Compare columns (SCD columns)


7) Audit Strategy Implementation in the project



8) Different logs available in the Data Integrator

3 Data Integrator logs :

• Examining trace logs
• Examining statistics logs
• Examining error logs

**Use the trace logs to determine where an execution failed, whether the execution steps occur in the order you expect, and which parts of the execution are the most time consuming.

**The statistics log (also known as the monitor log) quantifies the activities of the components of the job. It lists the time spent in a given component of a job and the number of data rows that streamed through the component.

**Data Integrator produces an error log for every job execution. Use the error logs to determine how an execution failed. If the execution completed without error, the error log is blank.


9) Different Types of Embedded dataflows? Are they reusable.


An embedded data flow is a data flow that is called from inside another data flow.

There are four types of embedded data flows:

One input ---------Add an embedded data flow at the end of a data flow

One input and one output---------- Add an embedded data flow in the middle of a data flow.

One output --------Add an embedded data flow at the beginning of a data flow

No input or output---------- Replicate an existing data flow.

Yes. They are reusable.


10) Passing the parameters of one dataflow to other dataflows.

11) Bulk Loading

12) In how many ways can we execute a job in BODI?

3 ways:

a) Manually
b) Through scheduling in the Web Admin.
c) Export Execution Command, which exports the job as a batch file that can itself be executed.


13) What are the different types of caches in LOOKUP?

• NO_CACHE — Does not cache any values.

• PRE_LOAD_CACHE — Preloads the result column and compare column
into memory (it loads the values before executing the lookup).
Use this option if the table can fit in memory.

• DEMAND_LOAD_CACHE — Loads the result column and compare
column into memory as the function executes.
Use this option when looking up highly repetitive values that are a small
subset of the data and when missing values are unlikely.


14) What is the significance of USE INPUT KEYS in the target table options?

When this option is set, Data Integrator uses the primary keys coming from the input (source) schema, rather than the target table's own keys, to update the data in the target table.

15) Tell about the QUERY TRANSFORM

Retrieves a data set that satisfies conditions that you specify. A query transform is similar to a SQL SELECT statement.
Select ----- Distinct rows
Outer Join ----- Outer source ---- Inner source
Where ----- You can put any kind of condition, e.g. lookup() or any built-in function that returns a single value.
By using Propose Join, DI automatically joins the columns that have the same name and data type.
Group By ----- Drag in the columns you need to group by.
Order By ----- Drag in the columns you want to sort by.

Can Use SQL().
Can use LookUp,LookUpEXT(),LookupSeq() or Built-In functions.
Can use relational operators in the conditions.
Can use Global variables.
Can use Custom Functions.
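
For example, a WHERE-tab condition combining a relational operator with a global variable could look like this (column and variable names are hypothetical):

(SRC_ORDERS.ORDER_DATE >= $G_START_DATE) AND (SRC_ORDERS.STATUS = 'OPEN')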

16) Implementation of INCREMENTAL LOADING

Use the Last_Load_Timestamp of the target table:

load a source row only if its timestamp is greater than the maximum Last_Load_Timestamp already loaded (see the sketch below).
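
A minimal sketch, assuming a hypothetical target datastore DS_TARGET, target table CUSTOMER_DIM and source timestamp column UPDATE_TIMESTAMP:

# Script before the data flow: fetch the high-water mark into a global variable
$G_LAST_LOAD = sql('DS_TARGET', 'SELECT MAX(LAST_LOAD_TIMESTAMP) FROM CUSTOMER_DIM');

Then, in the source Query's WHERE tab:

SRC_CUSTOMER.UPDATE_TIMESTAMP > $G_LAST_LOAD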

Using a Table Without Importing It

In SQL Transform :

1) Create the required output columns by looking at the table's structure in the backend database.

2) Do not import the table into the respective datastore (the datastore connected to the schema in which the table lives).

3) In the SQL transform, select the datastore whose backend schema actually holds the table, even though the table is not imported into that datastore.

4) Write the appropriate SQL query matching the columns you created (see the sketch after these steps).

5) If you want, you can use a global variable to represent the schema name.

6) Do not click Update Schema.

7) Implement the transformation logic required by the job.

8) Execute the job.
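
A sketch of the SQL text for step 4, with purely hypothetical columns and schema (per step 5, the schema name is the part you could drive from a global variable):

SELECT EMP_ID, EMP_NAME, DEPT_ID
FROM HR_SCHEMA.EMPLOYEE_ARCHIVE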

DOWNLOAD ATL : http://www.divshare.com/download/6200214-d42

Sunday, December 28, 2008

Column having multiple values separated by the @ (at) symbol

DATA RECEIVED VIA FLAT FILE (STRUCTURE)

Header:
REGNO,MAKE,YEAR,DATES LEASED FROM,DATES LEASED TO,CUSTID

1st row:
SV53AYC,FORD,2003,28/10/06@22/10/06@12/10/06,30/10/06@27/10/06@20/10/06,HK113@TU141@BK121

2nd row:
SV54KGH,HONDA,2004,24/08/06@19/8/06,28/08/2006@22/08/2006,BB141@TM186

3rd row:
SV07XKY,TOYOTA,2007,,,

4th row:
SV06VWY,BMW,2006,11/03/07,23/03/07,MK129

(The labels "Header:", "1st row:", etc. are not part of the file; they only mark the lines above.)

Solution :





Downloadable Atl Link :
------------------------

http://www.divshare.com/download/6199434-52c

Contains Source CSV file ...and .....Solution ATL file.
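
(The worked solution is in the ATL above. As a rough sketch only, the multi-valued fields can be split with word_ext(); with a hypothetical file format FF_LEASE and column DATES_LEASED_FROM, the first, second and third values would be:)

word_ext(FF_LEASE.DATES_LEASED_FROM, 1, '@')
word_ext(FF_LEASE.DATES_LEASED_FROM, 2, '@')
word_ext(FF_LEASE.DATES_LEASED_FROM, 3, '@')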

Wednesday, December 17, 2008

Renaming the Datastores of Source / Target / Intermediate Tables

In order to Rename the Datastores of Source / Target / Intermediate Tables.

1) Export the ATL of the job(s) / dataflows / workflows / datastores.

2) Open it in TextPad / Notepad.

3) Using Find & Replace, rename the desired datastore(s).

4) Save and import the modified ATL back into the repository.

Tuesday, December 16, 2008

Extract Only the File Name from the Relative/Absolute Path

# Full path to the file (example value)
$G_FILENAME = 'C:\RAMEHS\PRINCE\KDKD.TXT';

# word_ext with -1 returns the last word, i.e. everything after the final backslash
$G_VAR1 = word_ext($G_FILENAME,-1,'\\');

PRINT('Filename Is...... '||$G_VAR1);