Note on my shorthand [+] means actionable insight, [i] refers to something of interest, [!] indicates a warning, [?] poses a question I'll need to ask
[i] Use the right tool for the job. SSIS is not a hammer to hit all nails with. T-SQL and other add-ons or complimentary tools have their place.
- Flat Files
- [+] The data will be parsed as it goes in If you can trust the veracity of the source data types turn on Fast Parse which gives an 8% performance boost.
- You'll find it only in the Advanced Editor - Input/Output Properties - By Columns - Custom Properties - fast parse = true
- It only works on reads
- Also works on Data Conversion task
- If the data is not as good quality as you thought the task will still fail but it'll get to the end first
- OLEDB Connections
- Default network package size is 4K. You can increase this to 32K for larger datasets.
- [+] You'll need the infrastructure guys to turn on Jumbo Packets across the network where the data is traveling. Network guys will need to run a traceroute to see where the data goes.
- Change setting in the Connection Properties - All Tab - Packet Size is 0 = default change to 32767
- Set the data set to be "pre-sorted" in the advanced properties so the engine will not have do this. As long as you have pre-sorted the data/
- T-SQL
- [i] Use T-SQL for the things it is good at and avoid the equivalent task in SSIS
- Asynchronous components
- e.g. a Sort. These are tasks that by definition have to wait for the full data set before they can act. They're bottlenecks avoid them if possible.
- Brian's blog has a page which outs all the asynchronous tasks
- [+] Tune the Single Customer View package to remove asynchronous tasks
- A classic example is the OLE DB Command which is row by row and shouldn't be used. Use T-SQL instead for these actions. It even opens and closes a connection for each row as well. This is a terrible component of SSIS.
- Default Buffer Max Row
- [+] In the Data Flow Task properties trial and error a buffer of between 8K and 12K. See what performs best.
- This could provide up to a 20% performance improvement.
- SQL Destination
- [!] Is deprecated now. Don't use it.
- Bulk Inserts
- Turn on Fast Load in the OLE connection under Data Access mode for bulk inserts.
- But you can't use it if you need some rows to use conditional logic.
- Manage CPU Load
- Instead of seeing a massive CPU spike as the package commits the data change the OLE DB connection manager Maximum Input Commit Rows to around 50K to get regular commits and smooth the CPU load.
- [+] Alter the DW load packages specifically the clickstream collection package to use this. Monitor the effect with RedGate SQL Monitor.
- Manage RAM
- [+] Look at the Buffers in use and Buffers spooled in RedGate SQL Monitor. If your Spools > 0 your running out of RAM. You can also find this in the SSIS Pipeline in PerfMon.
- Windows System Resource Manager
- [+] It's free and you already have it. Use it.
- Allows the creation of a soft throttle on resources so a .exe can only use n% of resources.
- It can be set to apply different usage restrictions at different times so you can allow the DW build full resource during the night but limit it during the day
- Work Pile Pattern
- [i] Check this out on his blog
- The basic theory is that you can load balance highly transactional SSIS processes by spawning multiple packages when demand spikes. Very cool.
- Fuzzy Tuning
- I asked about tuning the Fuzzy components. He pointed me to his blog. I couldn't find anything so far.
- But chapter 15 in Professional Microsoft SQL Server 2012 Integration Services appears to have lots of good advice
Overall - Excellent fast-paced session. Really good practical advice delivered by a pro. 5*'s
Hi,
ReplyDeleteSuppose my system has 8GB RAM and I have installed SQL Server 2012 on it along with the DataTools. Sql server engine uses 6GB memory. Will the SSIS uses remaning memory apart from 6GB or it uses 6GB?
Also is there a way to count the number of buffers used and the records accomodated in each buffer and the size of the buffer?
If am not wrong buffer refers to the RAM memory?