Home > Blog Posts > Dev > Data Import Program: Part One

Data Import Program: Part One

Tags:

dev
Published: Apr 21 2017

Estimated Read Time:

History

The problem: flat files with fixed character width that needs importing into an SQL database. The original solution was using a DTS package to pull the files with FTP and then import the files into the database.

The solution worked for many years.

The server was old...maybe a 2000 server with SQL Server 2005, I'm not even sure, and it was being sent to that great big rack in the sky. We were also moving the database to a newer platform as that server was going away soon too.

Now the new server is Windows 2012 with a SQL Server 2014 database. Naturally the DTS Package no longer worked. This was so old and there wasn't a lot of info about how it worked.

Darth Vader documentation meme

I had the original DTS code thankfully and eventually figured out what length each field was, which fields were converting to another type, what tables they write to, and which fields were transformed beforehand.

I started by trying to rewrite the package in SSIS. It never worked. I mean some of it worked but it didn't run to completion. I could call my batch file and FTP the files. I could clear the data before inserting. Importing the data is where it failed. One simple file worked great. The 3 other files, not so much. I am not an expert with SSIS (first time using it!) and I could not get the right conversions for some of the data. Eventually I tracked it down to characters being used in numerical fields to represent negative numbers. Given enough time I might have got it but I wasn't given forever.

The Solution

Keep the partially working SSIS package and at the next step call a custom console application that will import the other 3 files. I wrote about it a bit with the regular expression post. I had to figure out which character was which numeric in the string and then set it to negative. That was probably the most fun part of this all. The program was pretty simple. 

In pseudo-code: open a file and get all text, for each line run against RegEx to get fields, convert fields that need it, set fields to parameter and insert to database. Do this for each file.

The files had varying lengths of data. Nothing that I thought too much about. Roughly 180k, 100k, and 50k records respectively. 

Pretty simple and the process ran. It ran fine for almost 18 months. Until it didn't. 

I did learn a lot about SSIS (but not enough) and had a pretty good time reverse-engineering the process and how it all worked. 

Next post will be about when it stops working and the changes made to resolve it.