Over the years I have learned quite a bit about FDM – what I like, what I think needs improvement and what I hate. Unlike most of my posts where I sing the praises of different functionality, today I am going to spend some time talking about a piece of functionality that I absolutely abhor; import scripts.
Before I begin, let’s talk about what an import script is. FDM includes functionality that allows a piece of code to be executed during the import process. There are 2 types of import scripts:
|Data Pump||Used to manipulate a flat file extract that needs to be processed through FDM.|
|Integration||Used to query data directly from a relational source. No file is needed and data is imported directly to the FDM repository|
Data Pump import scripts execute on each record of the data file for the dimension to which they are assigned in the import format. For example, if there is a data pump import script assigned to the Account dimension and the file has 100 records, the import script will execute 100 times during the import. If the import format has a script assigned to both the entity and the account, the total script execution will be 200. Let me put a little asterisks here, we’ll come back to this point later.
Data pump import scripts are generally very performant. The mapping engine is optimized to execute scripts very efficiently. Import scripts will often perform better than cross dimensional maps. Additionally, these scripts are also generally very easy to write.
So why do I have such a disdain for import scripts? Well let me first say that my ire for import scripts is limited to import data pump scripts. Integration scripts are a critical piece of functionality. Before we delve into the reasons why I think import scripts are suboptimal, let me issue this disclaimer. There are times when an import script is necessary. For example, a source general ledger has an account code that looks something like the following:
This string represents the legal entity, the account code and the product code. For my mapping into the EPM system, I just need the middle section of the string to define my GL to EPM account mapping. A simple import script can be used to split this GL account string into its separate segments and use just the segment needed to define the EPM target. This is a good use case for an import script.
So let’s discuss the reasons why I think import scripts are not the best option. There are several reasons. First and foremost, scripts degrade the value proposition of FDM. FDM was created to empower the end-user and to decrease or eliminate the need for IT support of the data integration cycle. FDM is web-based and allows an end-user to control their mapping and determine when the data integration process is executed. Obviously there is security that can be wrapped around the application to limit the scope of an end-user’s empowerment but in general, FDM is geared toward an end-user. Import scripts fly in the face of this. Scripts almost exclusively are an administrator function. This is by design. FDM does not have script level security. If an end-user has access to one script, they have access to all scripts. Granting this level of access to a non-administrator is extremely risky. An end user could inadvertently change the wrong script or make edits to the script that result in the script failing. This action would stop the import and thereby the entire workflow process until the script is corrected.
In concert with the above, I also recommend avoiding import scripts because there is little visibility for the end-user as to action being performed by the script. If you review the import log, there is virtually no information other than that a script skipped the given like. An end-user has no visibility to the script actions because, again, scripts maintenance is an administrative function.
Finally as it relates to reducing the value proposition of FDM, when a change in logic to the script is needed, the administrator will need to make the change. This can impact the timeliness of the data integration cycle, particularly in a multinational corporation. If the administrator is in the United States and someone in Asia needs an update, you could potentially delay the submission of data for an entire business day due to time zone differences.
The next major issue I have with import scripts, which is an extension of the previous point, is that import scripts are often used to bypass the primary function of FDM – mapping data. So often I have seen an import script that looks something like the below:
In this example, the script is using the field Custom3 and Colour to define the result of the import script. First, this script could certainly be optimized to eliminate a number of unnecessary ElseIf statements but let’s save that discussion and just point to my post on scripting. My real issue with this script is that this logic is essentially mapping the data. This can be accomplished in the FDM mapping tables using explicit, Like or cross dimensional maps. This script is completely unnecessary. Moreover and to the first point, the logic of this script lacks any auditability. Any changes to this logic are not logged in the FDM database as they would be with changes to the mapping tables.
My last major gripe with import scripts is one that impacts the end-user as well as potentially data quality. When using import scripts, any time a change is made to the logic in the script, the data file must be re-import to affect the change to the data set. In contrast, changes to the mapping tables can be affected simply by re-validating the data. By using the mapping tables appropriately the end-user can avoid the task of having to find the file last imported and executing the import workflow step. This is a suboptimal process particularly for those organizations that process multiple data files through a single location. More concerning is the idea that data could be changed as a result of the re-import. If a user runs a new trial balance file or simply selects the wrong version, financial results in the target application could be changed as a result of the script “mapping” logic update. While you may think this is ok, it may not be. Some organizations do not want overall financial results to change after a specific date. While alignment of the data may be updated (as a result of updated maps), the total/net financial metrics should not change. The use of import scripts and the need to re-import the data introduces the risk that the source data may be different from what was last processed and loaded to the EPM application.
So the problem with import data pump scripts is multidimensional – they override/bypass core application functionality (mapping), lack auditability, require an administrator to update the logic and require a data file to be re-imported to affect the change. For these reasons, I endeavor to avoid using them when possible.
So your next thought might be, “great Tony, you’ve convinced me, I’m on board but how do I not use import scripts?” Well as I said at the start of this post, there are times when an import script will be needed. But in many cases using a combination of a well-defined import format and mapping will generally address many situations where an import script could be used. In the event that performance suffers due to complex multidimensional mapping, event scripting can be used to deliver performance while addressing the shortcomings of import scripts. The latter is a bit in-depth and likely a topic for another post.
I hope that this post has given you some food for thought. If you have comments from your experience with import scripts, please feel free to share.