An ETL Tool vs. a Home Grown Solution
[Huge archives of historical data, growing volumes
of transactional data, and informative external data provide
a competitive edge to businesses when they are harvested into
business intelligence systems and data warehouses for
decision making. Yet data warehousing is an expensive proposition
and often organizations hope to reduce costs by economizing
on specific stages of the process. There is always a raging
debate on the economics of creating in-house software versus
buying expensive off-the-shelf tools. Each critical process
is evaluated in this context, and the use of ETL tools and
software does not escape this debate. However,
off-the-shelf products have their own advantages and, though
their programming is generic, they are flexible enough to be
customized. Now there is even a new breed of off-the-shelf
ETL tools priced for the small and mid-size market, such
as VisualETL.]
The off-the-shelf advocates insist that business
intelligence solutions have enough challenges without having
to worry about possible gaps in home grown software
that may corrupt target systems and make data inconsistent.
In their view, software like VisualETL should be used because
such tools have proven track records and have resolved most
data issues through a long process of experimentation
and implementation.
The home grown solution advocates, on the other
hand, dismiss off-the-shelf solutions on the ground
that high costs and generic programming cannot really
address the specific needs of a business entity. Home grown
solutions have distinct advantages: low cost; customization
of the code; optimization of the program to suit the
business's needs; the pace at which the solution can be
built; and the programmers' large knowledge base. Together
these make a home grown solution attractive.
On the other hand, it must be acknowledged
that the disadvantages outweigh the advantages. Home grown
solutions are difficult to manage and maintain. Any change
to the data warehouse affects the ETL solution.
There is no centralized repository of code, and
metadata capabilities are limited. The development
cycle is long and debugging is more difficult. Audit trails
are limited or nonexistent. Moreover, a small mistake in
the process of extraction, transformation and load, or a
lack of foresight in the design of the program, can
cripple data analysis and interpretation.
Fractures in the integration of the various data sources
can corrupt the target system and misrepresent
the facts.
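To make the points about audit trails and silent data corruption concrete, here is a minimal sketch in Python of what one hand-written ETL step might look like. Everything in it is hypothetical and for illustration only: the function name `run_etl`, the column names `customer` and `amount`, and the in-memory lists that stand in for the target table and the audit log. It is not taken from VisualETL or any other product.

```python
import csv
import io
from datetime import datetime, timezone


def run_etl(source_csv, target, audit_log):
    """Extract rows from CSV text, transform them, and load them into `target`.

    Rows that fail transformation are written to `audit_log` instead of
    being loaded, so a bad value cannot silently reach the target.
    Returns a (loaded, rejected) pair of counts.
    """
    loaded = rejected = 0
    reader = csv.DictReader(io.StringIO(source_csv))
    # Line 1 of the source is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        try:
            record = {
                "customer": row["customer"].strip().upper(),
                "amount": round(float(row["amount"]), 2),
            }
        except (KeyError, ValueError, TypeError, AttributeError) as exc:
            # Hypothetical audit entry: where the row failed, why, and when.
            audit_log.append({
                "line": line_no,
                "error": f"{type(exc).__name__}: {exc}",
                "at": datetime.now(timezone.utc).isoformat(),
            })
            rejected += 1
            continue
        target.append(record)
        loaded += 1
    return loaded, rejected


source = "customer,amount\nacme,10.50\nglobex,not-a-number\ninitech,7\n"
target, audit = [], []
loaded, rejected = run_etl(source, target, audit)
print(loaded, rejected)  # 2 1
```

Even this small step needs its own error handling and logging. In a home grown solution, that code has to be written, reviewed, and maintained for every source and every transformation, which is the maintenance burden described above.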
Off-the-shelf ETL solutions like VisualETL
provide an attractive user interface and have centralized
storage for programs. Version control of programs is possible
and customization of transforms becomes fairly simple. Metadata
support is optimal. Transformations can be quickly deployed,
and transform scheduling and auditing are possible. Debugging
is easy and user friendly.
Though the debate continues to rage, more and
more organizations are opting for off-the-shelf solutions
for ETL. They reason that ETL is an important stage in the
process of creating the data warehouse. It is the process
that determines the integrity and accuracy of data. It is
foolish to risk data integrity to cut costs or
for the sake of customization. Off-the-shelf products have
been tested on a variety of data sources and have enough
built-in flexibility to accommodate customization.
With the new breed of reasonably priced ETL
tools such as VisualETL, there is even more reason
to use an off-the-shelf extract, transform and load
tool.