Sunday, March 11, 2012

Active (non executed) objects will still drain performance!?

Even if objects (tasks, components) are not explicitly executed (say they sit behind a control flow precedence constraint whose expression condition isn't met), they will still eat up performance.

We have a pkg with 3 DFTs that transform either FLFFtoXML, CSVtoXML, or PIPEtoXML. When we execute this pkg it takes roughly 15 secs to complete, but every DFT we disable shaves 5 secs off. We did some debugging and found that entries of:
DLLGetClassObject succeeded... and
DLLGetClassObject called for VSA7.dll/n...
get written more often the more objects we add/activate. These entries get written even when the objects are not executed, just because the objects are active.

Is this expected to happen? 15 secs might not seem too bad for a handful of files, but it adds up when a large number of files run through a foreach container. Is there any tuning that can be done?

We know one tuning method could be to separate the DFTs into separate pkgs; however, we are trying to avoid this because it would triple the number of pkgs (est 150x3). Is there a way to make the foreach enumerator enumerate more than one at a time? Or have it start (enumerate), call the next pkg, and enumerate again without waiting for the child/sub processes to finish?|||The objects are not executed, but they are still loaded, initialized and validated. I suspect most of the time is spent validating objects. You may try setting DelayValidation to True on each data flow; the task will then be validated only right before it executes, and only if it is executed.|||

JAson_scoobyjw wrote:

Is there a way to make the foreach enumerator enumerate more than one at a time? Or have it start (enumerate), call the next pkg, and enumerate again without waiting for the child/sub processes to finish?

You're talking there about parallel execution (sort of), i.e. carrying out the steps for each enumeration in parallel with each other. That's something that's been talked about for the next version.
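Until something like that exists inside the foreach container, one workaround is to drive the fan-out from outside SSIS. The following is only a rough sketch, not an SSIS feature: it assumes `dtexec` is on the PATH and that each unit of work is a package file started with `dtexec /F`. The names `run_packages`, `dtexec_command` and `noop_command` are made up for illustration, and passing a per-file value into each package is left out.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def dtexec_command(package_path):
    # Assumption: dtexec is on PATH; /F loads a package from a .dtsx file.
    return ["dtexec", "/F", package_path]


def noop_command(package_path):
    # Stand-in command for trying the sketch on a machine without SSIS.
    return [sys.executable, "-c", "pass"]


def run_packages(package_paths, max_workers=4, build_command=dtexec_command):
    """Start one child process per package, at most max_workers at a time.

    Returns a list of (package_path, exit_code) tuples in input order.
    """
    def run_one(path):
        completed = subprocess.run(build_command(path))
        return path, completed.returncode

    # Threads are enough here: each one just waits on its child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, package_paths))
```

Calling `run_packages(["a.dtsx", "b.dtsx"], max_workers=2)` would start both packages without waiting for the first to finish. `max_workers` caps how many run at once so the server is not flooded.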

-Jamie|||

Genius Michael, pure genius! 100pts…

I thought that DelayValidation only applied at design time, but I tested it on a few packages and it worked!

Thanks

P.S. Very cool Jamie, we will still welcome parallel foreach executions... Can we look forward to this in the next version or SP? If next version, then (do I dare ask) any timelines yet?

|||It was in an early release, but was taken out :(|||That's too bad...

Delaying validation works marvelously. However, short of manually updating every single pkg developed thus far to delay validation, is there a global way to do it?

We tried a search/replace on the package code to change DelayValidation=0 to DelayValidation=-1, but discovered this wouldn't work because SSIS must validate the base pkg (meaning the pkg itself must have DelayValidation=False) in order to run. If not, an error like this will occur:

Error: Error 0xC0012050 while loading package file "dtsx". Package failed validation from the ExecutePackage task. The package cannot run. .

We can just go in and change this particular DelayValidation prop manually, but we were curious whether there is another, faster way.
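One possibility is a targeted version of that search/replace, which flips DelayValidation on the nested executables only and leaves the package-level property alone. This is a minimal sketch and rests on an assumption about the file layout: that each `DTS:Executable` element stores the setting in a child `<DTS:Property DTS:Name="DelayValidation">` element in the `www.microsoft.com/SqlServer/Dts` namespace. The function name `delay_child_validation` and the `SAMPLE` package are hypothetical. Check it against a real .dtsx and back the files up first, since re-serializing the XML may change its formatting.

```python
import xml.etree.ElementTree as ET

# Assumed namespace and element layout of the .dtsx file (see lead-in).
DTS_NS = "www.microsoft.com/SqlServer/Dts"
ET.register_namespace("DTS", DTS_NS)

EXECUTABLE = f"{{{DTS_NS}}}Executable"
PROPERTY = f"{{{DTS_NS}}}Property"
NAME = f"{{{DTS_NS}}}Name"

# Hypothetical, heavily trimmed package used only to exercise the function.
SAMPLE = """<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts" DTS:ExecutableType="MSDTS.Package.1">
  <DTS:Property DTS:Name="DelayValidation">0</DTS:Property>
  <DTS:Executable DTS:ExecutableType="DTS.Pipeline.1">
    <DTS:Property DTS:Name="DelayValidation">0</DTS:Property>
  </DTS:Executable>
</DTS:Executable>"""


def delay_child_validation(dtsx_text):
    """Set DelayValidation to -1 on every nested executable.

    The root element (the package itself) is skipped, so it keeps
    DelayValidation=0. Returns (new_xml_text, number_of_properties_changed).
    """
    root = ET.fromstring(dtsx_text)
    changed = 0
    for executable in root.iter(EXECUTABLE):
        if executable is root:
            continue  # leave the package-level property untouched
        # findall() without a path prefix matches direct children only.
        for prop in executable.findall(PROPERTY):
            is_delay = prop.get(NAME) == "DelayValidation"
            if is_delay and (prop.text or "").strip() != "-1":
                prop.text = "-1"
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

Running it over a folder is then a loop that reads each file, calls `delay_child_validation`, and writes the result back only when the change count is non-zero.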
