Dumping & Checking DBText Data

DBTextCheck will check exported borrower, catalog or loans files for blank records or missing required fields.  The name of the file to be checked is the only required parameter.   Blank record output defaults to DBTblanks.csv.  Records missing required fields will be written to DBTinvalid.csv.

DBTextCheck will produce a summary of each field showing the field name, blank count, maximum length and number of CR/LFs in the field.  DBTextCheck can be downloaded here

Checking Steps

  1. Modify DBTextCheck to select the fields to be checked.  For example : I have set “First Name”, “Last Name” and “Company” as the fields to be checked for blank records.  These field names have been modified or added to the original DBText for Libraries database.  The required fields for borrowers are “City”, “State” and “Zip”.  Capitalization is important.
  2. Export the Borrower.tba, catalog.tba, and Loans.tba files.  Set the Exports Options to:
    • Delimited ASCII Format
    • Record Separator — {3}
    • Field Separator–{4}
    • Entry Separator–| (the default value)
    • Quote Character–” (the default value)
    • Check “Store Field Names in First Row”
    • Records to Export–Export Entire Textbase
    • Fields to Export–Export All Fields
  3. Run DBTextCheck on each of the files and look at the blank and invalid ouput for each.
  4. Fix or delete the errors in DBText.
  5. Repeat 2-4 to remove as many errors as you can.

Program Details

DBTextCheck is written in Perl and uses Getopt::Long.

It starts by reading in the first record to save the field names in @headers.  A hash, %field_no, is set up to be able to check fields by name.  The first field is used to determine which file type is being processed.  This is used to specify which fields are being checked for blank or required fields.  You will want to look at these field names in the code as described in step one above.

As each record is read in, each field is checked for CR/LFs.  They are counted and converted to the pipe character, |, which is the default entry separator.  Leading and trailing double quotes and white space (blanks or tabs) are removed.  Blank entries are also removed.

If all of the fields check for blank records are missing, the record is counted as a blank record. If any of the required fields is missing, the record is counted as an invalid record.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.