Task Factory Advanced Lookup Cache
Advanced Lookup Cache Connection Manager
The Advanced Lookup Cache Connection Manager is used to setup the cache database and configure how data is retrieved from it.
Available Columns Tab
The available columns tab allows you to define columns that are part of the cache, whether or not the column is output when a lookup is performed, and what type of cache database to use.
Option | Description |
---|---|
Use In-Memory Cache | This tells the connection manager whether to use in-memory cache or on-disk cache when creating the cache database. |
Define Customer Cache Location | By default, the cache connection manager is created in a temporary file location defined by SSIS. You can choose to store the database in a custom location by checking this box and entering/selecting a path to the file. Note: The file should be defined with an .s3db extension. |
Output Columns | The Output column tells the connection manager to return this value when a lookup is performed. The output columns can be seen on the Advanced Lookup Transform UI after a cache connection manager is selected. |
Input Parameters Tab
Option | Description |
---|---|
Input Parameter | Used by the Advanced Lookup Transform to define the lookup values to find rows in the cache database. They are mapped from Input columns in the Advanced Lookup Transform. |
Input Parameter Condition | This is where you can easily create conditions for queries. By selecting the word And, you can select between an And and Or conditions. Selecting the green + adds more conditions and selecting the red X removes them. |
Select the green New Group Icon at the end of the condition to create a new And/Or group.
Properties Tab
Option | Description |
---|---|
Multiple Row Match Policy | This option lets you define what should happen if multiple rows are found during a lookup. |
Make All Comparisons Case Sensitive | Using this option makes all string comparisons performed case sensitive (e.g. SolarWinds does not equal solarwinds). |
Advanced Lookup Cache Transform
Transform Icon | Transform Description |
---|---|
The Advanced Lookup Cache Transform is used to store rows from a source into an in-memory or on-disk cache. See Advanced Lookup Cache Connection Manager to learn more about setting up the connection manager. |
Option | Description |
---|---|
Choose Cache Connection Manager | You can select an existing Advanced Lookup Cache Connection Manager or choose to create a new one. |
Map Cache Columns | After a Cache Connection Manager has been selected, the input columns and columns from the cache are automatically mapped if the names match. Otherwise you need to manually map the columns. |
Advanced Lookup Transform
Transform Icon | Transform Description |
---|---|
The Advanced Lookup Transform allows you to store a temporary cache of rows and perform lookups on that cache to pull that data into a data flow. There are two modes for the Advanced Lookup Transform :
See the Advanced Lookup Cache Connection Manager to learn more about setting up the connection manager. |
Option | Description |
---|---|
Select Cache Connection Manager | Select an Advanced Lookup Cache Connection Manager. If you have not created one yet, you will need to create one. Once you have one selected, it automatically populates the available columns. |
Map Input Columns To Cache Input Parameters | This is where you map input columns to the input parameters defined in the Advanced Lookup Cache Connection Manager. |
What Should Happen To Non-Matched Rows | This tells the component what should happen when a non-matched row is found in the lookup.
|
Output Columns (Defined in Cache Connection Manager) | This section lists all of the output columns defined in the Advanced Lookup Cache Connection Manager. You have a few choices on how to output the values of the output columns based on the column action.
|
Refresh Columns From Cache Button | This button refreshes the columns from the cache connection manager. |
Setting up an Advanced Lookup Transform
Complete the following steps to setup an Advanced Lookup Transform:
1. Create a data flow task, and then add an OLE DB Source. In this example we are using an OLE DB source connected to AdventureWorksDW. (download here).
2. For this example select only the following columns from the columns tab.
- ProductKey
- ProductAlternateKey
- Color
- EnglishProductName
- StandardCost
- StartDate
- EndDate
- LargePhoto
3. Now, create a lookup cache using the Advanced Lookup Cache Transform. Drag an instance of TF Advanced Lookup Cache Transform to the designer window. Attach the OLE DB Source Output to the TF Advanced Lookup Cache Transform. Open the transform by double clicking it, which opens the edit window.
4. Select the Choose Cache Connection Manager drop down list and then choose Create New Lookup Cache Connection...
5. Setup the Advanced Lookup Cache Connection Manager. Then deselect all output except for Product Key.
6. Next, select the Input Parameters tab, and then create two input parameters named ProductAlternateKey and OrderDate.
7. Then, create the following conditions: Note that Is Blank = Is Null.
8. Select OK and then your window should look like this:
9. Add a new data flow task and a new OLE DB source, for this example we are connecting to AdventureWorks and creating a custom SQL query.
10. Add a new TF Advanced Lookup Transform to the designer, drag the output from the OLE DB source to the TF Advanced Lookup Transform. Double click the transform to open it. Once the editor window is open, choose the previously created Cache Connection Manager.
11. In the Map Input Columns to Cache Input Parameters window, connect OrderDate from the Input column to the OrderDate from the Cache Column by dragging one to the other. Do the same for the remaining field, and then select OK.
12. Drag your Matched and Non-Matched outputs to your desired destination. For this example we used TF Terminator Destinations for both.
Disconnected Lookups
Disconnected Lookups allow you to use a cache to perform lookups without an Advanced Lookup Cache Transform in your data flow.
Functions
There are two functions that can be used to execute disconnected lookups:
Function | Description |
---|---|
LookupData(connectionManager As AdvancedLookupCacheConnectionManager, ParamArray args As Object) | Used to retrieve the FIRST column setup as an output column in the Advanced Lookup Cache Manager defined in the connectionManager parameter. Parameters:
|
LookupDataByColumn(connectionManager As AdvancedLookupCacheConnectionManager, outputColumnName As String, ParamArray args As Object | Used to retrieve the value of the column defined in the outputColumnName parameter from the Advanced Lookup Cache Connection manager defined in the connectionManager parameter. Parameters:
|