TST for Data validations

Mar 19, 2012 at 11:32 PM

Hi, I am working as a SDET in Database Testing. I am tryitng to find out a tool for Testing stored procedures in my SQL server. We have some jobs for ETL which runs these stored procedures. I need to test the Databse for data validations after stored procedures. Most of the tables don't have foreign keys. Any ideas about creating automated test cases in T-SQL language ? Is TST best for it ? I am not good in C# or .NET so Iam looking for T-SQL tools. Please let me know.


Mar 25, 2012 at 5:33 PM

Short answer:
TST is designed primarily for code validation and not for data validation. There are some downsides in using TST for data validation (see below).

Long answer:
The question is: What kind of ETL jobs are we talking about? Are those actual production mode ETL jobs with real data or test ETL jobs where you seed the data for the purpose of exercising test scenarios? From your description I assume we are talking about the first case: production mode ETL jobs.

The first case: If your goal is to make sure that production data is in good shape and you want to validate production data that results after ETL jobs are run. Using TST for this scenario has one important downside: One individual TST test procedure will stop at the first assert that fails. This means that if you have a test that has several asserts you will only be able to see the first failure and not be sure about the rest. Also I imagine that after you run the ETL you not only want to see if a certain relation is broken but you want to see in how many places and where are those places (as in how many rows and which rows). That may also be cumbersome to do with TST.

Keep in mind that in this case, what you do is to validate particular data and you will not necessarily be able to find certain bugs that you may have in your stored procedures. And this brings us to the second case:

The second case: If your goal is to make sure that you don’t have bugs in your stored procedures. Validating data after the production ETL is run, is not a reliable way to achieve this goal. There are several issues:
1. You find out about the bug only after the data is already impacted.
2. There are bugs that code validation will catch easily and data validation will not catch at all. This is probably the worst case. You end up running with bugs that may not be discovered or are discovered by chance long time after data corruption is introduced in production.
3. You may have bugs that will not be hit in the production for quite a while. Fixing a bug found 6 months after the system was released to production is most of the time way more difficult than fixing the bug 10 minutes after you introduced it.

If your goal is to do data validation (first case) then TST is not ideal. If your goal is to make sure that you don’t have bugs in your stored procedures (second case), TST is a good tool.