//
you're reading...
ETL

Python vs SSIS

I am a long time MS BI stack evangelist, however recently I have been incorporating Python into my ETL arsenal. In several cases I have opted for python over SSIS. Python in combination with Dask and Panda can accomplish many if not all ETL tasks… And here’s why:

Cost

Python is a free and completely open source language maintained by the Python Software Foundation. Most of its packages are available at no cost and you can contribute your own functionality and\or enhance existing functionality if you encounter a bug or need a new feature. SSIS obviously requires a license for every environment you are running on and you may wait quite awhile for bug fixes and\or new features.

Testability

Python has several unit testing frameworks such as PYtest and Unittest. You can design your ETL process in such a way as it will be easily testable to ensure that your process is behaving as expected. SSIS on the other hand does not really offer any built in testing frameworks other then those available from 3rd party vendors.

Scalability

Dask is specifically designed for large datasets and is easily scalable across n nodes. Whereas a single SSIS data flow\package is restricted by the memory available on the current node. With Python you can easily scale out your ETL process across many nodes. Of course you can decide to utilize the SSIS “scale out” feature which does allow for a master\workers configuration. This can get quite complex and costly to set up as you now need licenses for all worker nodes. Set up can be further complicated if you are running your etl in a cloud configuration such as AWS or Azure.

About ldgaller

Accomplished Data Warehouse Architect, DBA and Software Architect with over 15 years of professional experience and demonstrated success designing and implementing solutions that improve business functionality and productivity. Highly diverse technical background with proven ability to design, develop and implement technology on an enterprise level. I approach all projects with passion, diligence, integrity, and exceptional aptitude.

Discussion

No comments yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: