Tools of the Trade: Using EditPlus 2 to Generate Code

(Or insert your favorite Regular Expression enabled text editor here).
This is part one of a multi-part series of tools that we use to get the job done, whatever that job may be.
Oftentimes our work is not always “here’s a spec, create it”, but at other times that work is “here’s an export of our database for you to work with”. In an ideal world, we would like to receive this in something that is easily imported into a relational database., but more often than not we just get a straight “dump” (usually multiple CSV files) of the production data. The major challenge that we often face in this scenario is of how to maintain the original data integrity, and how we do it without blowing our budget. That’s where using the regex replace functionality that is part of EditPlus comes in very handy.

Here’s a sample of what the data dump might look like:

Where ID (the first column) might be the Primary key of the item on their system, and foreign keys always refer to different IDs than what’s on your local system. At first this was impossible to do while keeping the data accurate, so I had to ask for the data that the foreign keys referenced. So now we’re all set.  Read on for the solution

An alternative way to do this would be to open these files in Microsoft Excel and import those to a temporary table (in the sense that it will be deleted later, not a SQL Server “#” temporary table), run a bunch of select statements to find the corresponding ID of the item in the local database, or insert it if it doesn’t exist, etc. These have to be done for thousands of records and at least 5 tables. The way that I ultimately opt for each time is to use EditPlus2. This doesn’t have many benefits over importing to a table and using a cursor to loop over the items, however, either way has to be a one off solution since we’re not going to write a generic CSV importer that can handle foreign key lookups or insertions… yet.

The Steps to Using Regex Replace in EditPlus

The first step to solving this problem is to examine the data to determine a good regular expression that can be used to parse it thoroughly the first time. This is what I came up with:

^[0-9]*,([0-9]*),”([A-Z0-9]*)”,”([A-Za-z ,&0-9/()-]*)”,(-?[0-9]*),(-?[0-9]*),(-?[0-9]*),([0|1]),([0|1])$
Yes, it’s a mess. Yes, it works. Better yet, since EditPlus (or your text editor) recognizes your matched groups, it makes it simple to extract these values and plug them into SQL statements. I always keep track of what the groups stand for in comments somewhere to make easy reference when I’m writing the SQL statements. Let’s plug it into EditPlus:

And run the replace to see what we get with this test replace

Now we can write the SQL. Simply “Undo” the previous replace. EditPlus can undo an enormous amount.  The resulting file has close to 4000 lines, and it puts the entire replace into a single state, so pressing “CTRL-Z” once undoes the whole replace.  Fill in your SQL replace text.  It may take some experimentation and brute force, but one undo and you’re back to the original file and can try again.

Run the SQL replace and your SQL should be fully populated and syntactically correct.  If you miss a few things, you can just undo once and try again.  This SQL has a “set @jobCode” statement to lead off, and it’s not declared anywhere, so you need to declare any variables at the top of all the replaced text, not in the replace regular expression.

This produces 185,561 lines of SQL from the 1,913 lines of comma separated values that I received.

Another Practical Use: Generating Public Properties from your Private Member Fields

One other use I will go over, and use it whenever I have to create a class with an inordinate amount of private fields, is to automatically generate public properties. Visual Studio does this but it can only do one at a time, and if you have many projects in your solution, you will notice that it has to “prepare files for refactoring” for about 5 minutes. Eclipse is instant, but I won’t go into IDE differences. Here is a sample class that I will be “refactoring”.

Here’s a good regular expression to read the lines that contain the private members
         ^tprivate ([a-zA-Z]+) _([a-zA-Z0-9]+);$
It’s important to not include the underscore in the second group. Here is the replace text
      // the whole match contains the ending semicolon so there is no need to add it here
      tpublic 1 2 {
      ttget { return this._2; }
      ttset { this._2 = value; }
      t}

The result

If you’re like me and like to keep all your private/protected members in one spot and all properties in another, you can just copy the list of private/protected members, exclude the “” from the replace regular expression, and paste the resulting text where you want it in the class file. Get creative with some comments too!  Also, if you’re like me, you use lowercase private members, instead of the underscore as I have here.  So just don’t forget to replace your getters and setters with the correct variable name after you run the replace, lest you get infinite loops.

Conclusion

This type of problem may come up infrequently, but there may be other problems where writing a whole one-off program to read the file, parse it, maybe even write directly to the database instead of generating SQL, that may just be infeasible, and you may never use it again. It is a valuable skill to be able to choose the right tool for the job.
Stay tuned for the next part in this series.

« Prev Article
Next Article »