# Library Data Transformation Tool

A PowerShell-based utility for transforming and validating CSV data according to mapping rules, with support for multiple field mappings and console minimization. Includes tools for test data generation.

The system consists of the following components:

- Main Transformation Tool (`Library-Data-Transformation-Tool.ps1`): the primary GUI application for transforming and validating data; handles all mapping, validation, and transformation functionality
- Test Data Generation Scripts: `0-Generate-Book-Data.PS1` and `0-Generate-Borrower-Data.PS1` create sample book and borrower data for testing
- Sample Mapping Files: create your own CSV files with mapping rules based on the examples provided
This tool provides a graphical interface for:
- Loading mapping rules from a CSV file
- Loading source data from a CSV file
- Performing data validation against specified rules
- Transforming data fields based on custom functions
- Highlighting validation errors in a grid view
- Exporting processed data to CSV
## Features

- Data Mapping: Define source and target field names, mandatory fields, data types, and validation rules
- Multiple Field Mappings: Support for mapping a single source field to multiple target fields
- Data Validation: Validate data against regex patterns with visual error highlighting
- Data Transformation: Apply custom transformation functions to data fields
- Visual Interface: View data in a grid with alternating row colors, row numbers, and error highlighting
- Console Minimization: Automatically minimizes the PowerShell console window for a cleaner UI experience
- Filtering:
- Show only rows with validation errors
- Show only mapped fields
- Column Ordering:
- Sort by original data file column order
- Sort by mapping file column order
- Export: Export processed data and error logs to CSV files
- Detailed Logging: Comprehensive logging of field mappings, transformations, and errors
- Test Data Generation: Scripts for generating sample book and borrower data with controlled error rates
## Requirements

- Windows PowerShell 5.1 or later
- .NET Framework 4.5 or later
## Usage

- Run the script to open the graphical interface
- Click "Load Files" to select your mapping and data files
- View processed data in the grid (validation errors highlighted in pink)
- Use "Show Errors Only" to filter to rows with validation issues
- Use "Show Mapped Only" to display only fields specified in mapping
- Use "Show All Data" to revert to showing all rows and fields
- Use "Sort by Data" to arrange columns in original CSV order
- Use "Sort by Mapping" to arrange columns in mapping file order
- Export processed data with the "Export Data" button
- Export error logs with the "Export Errors" button
The mapping file should be a CSV with the following columns:
| Column | Description |
|---|---|
| SourceField | Name of the field in the source data |
| NewField | Name to use in the output (leave empty to keep original name) |
| DataType | Data type for conversion (string, int, decimal, datetime) |
| Mandatory | Y/N - whether the field is required |
| Validation | Y/N - whether to validate the field |
| ValidationRule | Regex pattern for validation |
| Transformation | Y/N - whether to transform the field |
| TransformFunction | Name of the transformation function to apply |
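For reference, a mapping file with these columns can be loaded into objects with standard CSV parsing. This is an illustrative snippet, not the tool's actual loader; the two sample rows are taken from the examples later in this document:

```powershell
# Illustrative: parse mapping rules using the columns described above.
$mappingCsv = @'
SourceField,NewField,DataType,Mandatory,Validation,ValidationRule,Transformation,TransformFunction
Title,BookTitle,string,Y,Y,^.{1,100}$,N,
Pages,PageCount,int,N,Y,^\d+$,N,
'@

$mappings = $mappingCsv | ConvertFrom-Csv
# Each row exposes the mapping columns as properties,
# e.g. $mappings[0].NewField is "BookTitle"
```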
The tool processes data through several steps:
Field mapping:

- Maps source fields to new field names if specified in the `NewField` column
- If `NewField` is empty, the original `SourceField` name is retained
- Example: "Title" field becomes "BookTitle" if specified in mapping
- Supports mapping a single source field to multiple target fields
- Example: "Name" field could be mapped to both "FirstName" and "LastName" with different transformations
- The tool preserves a map of field name changes for consistent column ordering
- Tracks and logs fields with multiple mappings for better transparency
Type conversion:

- Converts data values to the specified type in the `DataType` column
- Supported data types include:
  - string: Default type, no conversion performed
  - int: Converts text values to integers (e.g. "123" → 123); fails if the value contains non-numeric characters
  - decimal: Converts text values to decimal numbers (e.g. "45.67" → 45.67); supports decimal points according to the current culture; fails if the value contains invalid characters
  - datetime: Converts text values to date/time objects (e.g. "2023-10-15" → DateTime object); uses the current culture's date format by default; fails if the date format is invalid
- Failed conversions are reported as validation errors
- When conversion fails, the original value is preserved
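The conversion behaviour described above (culture-aware parsing, originals preserved on failure) can be sketched as follows. `Convert-FieldValue` is a hypothetical helper for illustration, not the tool's actual function:

```powershell
# Hypothetical sketch of the type-conversion step (not the tool's actual code):
# values that fail to convert are returned unchanged so the caller can
# record a validation error while preserving the original value.
function Convert-FieldValue($value, $dataType) {
    switch ($dataType) {
        "int" {
            $parsed = 0
            if ([int]::TryParse($value, [ref]$parsed)) { return $parsed }
        }
        "decimal" {
            $parsed = [decimal]0
            # TryParse honours the current culture's decimal separator
            if ([decimal]::TryParse($value, [ref]$parsed)) { return $parsed }
        }
        "datetime" {
            $parsed = [datetime]::MinValue
            # Uses the current culture's date format by default
            if ([datetime]::TryParse($value, [ref]$parsed)) { return $parsed }
        }
        default { return $value }   # "string": no conversion performed
    }
    # Conversion failed: preserve the original value
    return $value
}
```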
Mandatory field checks:

- Checks if required fields (marked with `Mandatory` = "Y") are present
- Reports an error if a mandatory field is empty or null
- Example error: "Row 2 Field 'Title': Mandatory field missing (original value)"
- Also checks if values are empty after transformation with improved error messages
- Detects and reports completely missing mandatory fields in source data
Validation:

- Validates data against regex patterns in the `ValidationRule` column
- Only performed when `Validation` = "Y"
- Validation succeeds if the value matches the pattern
- Example: Validating email addresses against pattern `^[\w\.-]+@[\w\.-]+\.\w+$`
- Validation errors are highlighted in pink in the grid
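In PowerShell terms, each validation amounts to a `-match` test of the value against the rule (an illustrative snippet using the email pattern from the text):

```powershell
# Illustrative: regex validation against a ValidationRule pattern
$rule = '^[\w\.-]+@[\w\.-]+\.\w+$'

"john.doe@example.co.uk" -match $rule   # matches
"not-an-email" -match $rule             # does not match
```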
Transformation:

- Applies custom transformation functions to fields
- Only performed when `Transformation` = "Y"
- Uses the function named in the `TransformFunction` column
- Example transformation: Converting "Mr" to "Male" with the `GenderTransform` function
- Transformations can combine or modify values, change formats, etc.
- Logs the number of fields and records affected by transformations
Processing flow:

- Each record from the source data is processed individually
- For each field in the record:
  - Apply mapping to determine output field name
  - Perform type conversion if specified
  - Check mandatory requirement
  - Apply validation if enabled
  - Apply transformation if enabled
- Record validation errors for display and logging
- Output processed data to the grid
- Display error cells in pink
- Log transformations applied to the data
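The steps above can be sketched as a simplified loop. This is an illustration of the flow, not the tool's actual code; the sample row and mapping rules are hypothetical:

```powershell
# Simplified per-record processing loop: map, check mandatory, validate.
$mappings = @(
    [pscustomobject]@{ SourceField = "Title"; NewField = "BookTitle"; Mandatory = "Y"; Validation = "Y"; ValidationRule = '^.{1,100}$' }
    [pscustomobject]@{ SourceField = "Pages"; NewField = "PageCount"; Mandatory = "N"; Validation = "Y"; ValidationRule = '^\d+$' }
)
$sourceData = @([pscustomobject]@{ Title = "The Great Gatsby"; Pages = "180" })

$errors = @()
$output = foreach ($row in $sourceData) {
    $outRow = [ordered]@{}
    foreach ($m in $mappings) {
        # Apply mapping to determine the output field name
        $name = if ($m.NewField) { $m.NewField } else { $m.SourceField }
        $value = $row.($m.SourceField)
        # Check mandatory requirement
        if ($m.Mandatory -eq "Y" -and [string]::IsNullOrEmpty($value)) {
            $errors += "Field '$($m.SourceField)': Mandatory field missing"
        }
        # Apply validation if enabled
        if ($m.Validation -eq "Y" -and $value -notmatch $m.ValidationRule) {
            $errors += "Field '$($m.SourceField)': Validation failed"
        }
        $outRow[$name] = $value
    }
    [pscustomobject]$outRow
}
```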
The tool now supports two different column ordering options:
- Data File Order (default):
  - Maintains the original column order from the source CSV
  - Translates field names according to mapping
  - Preserves the user's expected order of fields
- Mapping File Order:
  - Orders columns according to the order they appear in the mapping file
  - Useful when a specific output order is required
  - Fields not in mapping appear at the end
- Error cells are highlighted in pink in the grid
- Row numbers clearly show which rows contain errors
- Detailed error messages are displayed in the log
- Export all errors to CSV for documentation
Input CSV:

```csv
Title,Author,Pages,PublishDate
The Great Gatsby,F. Scott Fitzgerald,180,04/10/1925
```

Mapping CSV:

```csv
SourceField,NewField,DataType,Mandatory,Validation,ValidationRule,Transformation,TransformFunction
Title,BookTitle,string,Y,Y,^.{1,100}$,N,
Author,AuthorName,string,Y,N,,N,
Pages,PageCount,int,N,Y,^\d+$,N,
PublishDate,PublicationDate,datetime,Y,N,,N,
```
Processing Result:
- "Title" → "BookTitle" (field mapping)
- "Pages" → 180 (converted from string to integer)
- "PublishDate" → DateTime object (converted to datetime type)
- Validates "Title" length is between 1-100 characters
- Validates "Pages" contains only digits
- All mandatory fields are checked for values
- A single field can be mapped to multiple target fields with different transformations
- The tool tracks and reports fields that have multiple mappings
The tool provides a complete audit trail of the processing through the log panel, showing which fields were transformed, which fields had multiple mappings, and any validation errors encountered. It also reports on generated fields created through transformations.
The ValidationRule column in the mapping file accepts regular expression patterns. Here are common validation patterns for UK formats:
| Data Type | Regex Pattern | Example |
|---|---|---|
| Email | `^[\w\.-]+@[\w\.-]+\.\w+$` | john.doe@example.co.uk |
| UK Phone | `^(?:(?:\+44\s?\|0)(?:1\d{8,9}\|7\d{9}))$` | 07123456789 |
| Date (DD/MM/YYYY) | `^(0[1-9]\|[12][0-9]\|3[01])/(0[1-9]\|1[0-2])/\d{4}$` | 04/10/1925 |
| UK National Insurance | `^[A-CEGHJ-PR-TW-Z]{1}[A-CEGHJ-NPR-TW-Z]{1}[0-9]{6}[A-D]{1}$` | AB123456C |
| UK Company Number | `^(SC\|NI\|AC\|FC\|IC\|IP\|LP\|NA\|NL\|NO\|NP\|NR\|NZ\|OC\|R\|RC\|SA\|SF\|SI\|SL\|SO\|SP)\d{6}$\|^\d{8}$` | SC123456 or 12345678 |
| UK Bank Sort Code | `^\d{2}-\d{2}-\d{2}$` | 12-34-56 |
| UK Bank Account | `^\d{8}$` | 12345678 |
| Numeric only | `^\d+$` | 12345 |
| Price (£) | `^£\d+\.\d{2}$` | £99.95 |
| ISBN | `^(?:ISBN(?:-13)?:?\s)?(?=[0-9X]{10}$\|(?=(?:[0-9]+[-\s]){3})[-\s0-9X]{13}$)` | ISBN-13: 978-0-123456-47-2 |
| URL | `^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$` | https://example.com/path |
| Name | `^[a-zA-Z\s'-]+$` | Mary-Jane O'Brien |
| UK Driving License | `^[A-Z]{5}\d{6}[A-Z]{2}\d[A-Z]{2}$` | SMITH123456AB7CD |

General-purpose patterns:

| Validation Purpose | Regex Pattern | Description |
|---|---|---|
| Alpha-numeric | `^[a-zA-Z0-9]+$` | Letters and numbers only |
| Fixed length | `^.{10}$` | Exactly 10 characters |
| Min-max length | `^.{8,16}$` | Between 8-16 characters |
| Integer range | `^([1-9]\|[1-9][0-9]\|100)$` | Integers from 1 to 100 |
| Time (24-hour) | `^([01]?[0-9]\|2[0-3]):[0-5][0-9]$` | Format: 23:59 |
| Hex color | `^#([A-Fa-f0-9]{6}\|[A-Fa-f0-9]{3})$` | Format: #FF5733 or #F53 |
| IP Address | `^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$` | Format: 192.168.1.1 |
| UK Passport Number | `^[0-9]{9}$` | UK passport number (9 digits) |
| UK NHS Number | `^\d{3}[ -]?\d{3}[ -]?\d{4}$` | Format: 123 456 7890 |
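These patterns can be sanity-checked directly in PowerShell before adding them to a mapping file (illustrative, using two patterns from the tables above):

```powershell
# Quick sanity checks of patterns from the tables above
"AB123456C" -match '^[A-CEGHJ-PR-TW-Z]{1}[A-CEGHJ-NPR-TW-Z]{1}[0-9]{6}[A-D]{1}$'   # matches
"12-34-56"  -match '^\d{2}-\d{2}-\d{2}$'                                            # matches
"1234-56"   -match '^\d{2}-\d{2}-\d{2}$'                                            # does not match
```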
You can add custom transformation functions by editing the script. Each function should:
- Accept a single input value
- Return the transformed value
- Be registered in the `$global:TransformFunctions` hashtable
Example transformation functions:

```powershell
# Transformation functions
function GenderTransform($value) {
    # Check female titles first: "Mrs" also contains the substring "Mr"
    if ($value -match "Mrs" -or $value -match "Miss" -or $value -match "Ms") { return "Female" }
    elseif ($value -match "Mr") { return "Male" }
    else { return "" }
}

function BorrowerTypeTransform($value) {
    if ($value -match "^[A-Za-z][0-9]{4}[A-Za-z]{2}$") {
        return "Student"
    } else {
        return "Teacher"
    }
}

$global:TransformFunctions = @{}
$global:TransformFunctions["GenderTransform"] = ${function:GenderTransform}
$global:TransformFunctions["BorrowerTypeTransform"] = ${function:BorrowerTypeTransform}
```
The recommended way to register transformation functions:

```powershell
function CustomTransform($value) {
    # Your transformation logic here
    return $transformedValue
}

# Register the function
$global:TransformFunctions["CustomTransform"] = ${function:CustomTransform}
```

For more complex validation that regex alone can't handle, you can create custom validation functions similar to transformation functions:

```powershell
function ValidateUKPostcode($value) {
    # First standardize by removing spaces and converting to uppercase
    $value = $value.ToUpper().Replace(" ", "")
    # Basic format check using regex
    if (-not ($value -match '^[A-Z]{1,2}[0-9R][0-9A-Z]?[0-9][ABD-HJLNP-UW-Z]{2}$')) { return $false }
    # Additional validation logic:
    # for example, certain letter combinations are not used in the first position
    $invalidFirstLetters = @("QV", "X")
    $firstPart = if ($value[0..1] -join "" -match "^[A-Z]{2}$") { $value[0..1] -join "" } else { $value[0] }
    if ($invalidFirstLetters -contains $firstPart) { return $false }
    return $true
}

$global:ValidationFunctions = @{}
$global:ValidationFunctions["ValidateUKPostcode"] = ${function:ValidateUKPostcode}
```

Then in your mapping file:

```csv
SourceField,NewField,DataType,Mandatory,Validation,ValidationRule,ErrorHandling,Transformation,TransformFunction,ValidationFunction
Postcode,,string,Y,Y,,Error,N,,ValidateUKPostcode
```
## Validation Best Practices
1. **Start simple**: Begin with basic validations and add complexity as needed
2. **Test thoroughly**: Create test data that intentionally violates validation rules
3. **Layer validation**: Use regex for format, custom functions for complex logic
4. **Balance strictness**: Overly strict validation may reject valid data
5. **Use appropriate error handling**: Choose between warning, error, and log based on data importance
6. **Document patterns**: Keep a record of regex patterns and their purposes
7. **Consider data cleansing**: Sometimes it's better to transform/cleanse than reject
## Test Data Generation
The toolkit includes two PowerShell scripts to generate test data for functional and performance testing:
### 1. 0-Generate-Book-Data.PS1
This script generates realistic library book data with controlled error rates for testing validation rules.
```powershell
# Configurable variables
$NumberOfEntries = 50 # Default number of entries
$ExportFileName = "2-Data-Books.csv" # Default export filename
$ErrorFields = @("Barcode") # Fields where errors will be introduced
$ErrorRate = 15 # Default error rate percentage (15%)
$ErrorPlusMinus = 5 # Plus/minus randomness for error rate (5%)
```

Features:
- Generates book records with titles, authors, genres, barcodes, etc.
- Deliberately introduces errors in specified fields at a controlled rate
- Creates 12-digit barcodes that can be validated by regex
- Outputs data in CSV format ready for import into the transformation tool
- Reports statistics on generated data and error rates
Sample Output Fields:
- Title
- Author
- Genre
- Barcode (12-digit numeric)
- Publisher
- PublicationDate (dd/MM/yyyy format)
- Pages (numeric)
- Language
- Format
- Location
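The controlled error rate works by corrupting a target field in a randomly chosen subset of rows. A minimal sketch of that idea, using the variable names from the script's configurable block (the injection mechanism and the `New-SampleBarcode` helper are assumptions, not the script's exact code):

```powershell
# Sketch of controlled error-rate injection (assumed mechanism)
$NumberOfEntries = 50
$ErrorRate = 15        # target error percentage
$ErrorPlusMinus = 5    # +/- randomness around the target

# Pick an effective rate in [10, 20] percent, then choose that many rows to corrupt
$effectiveRate = Get-Random -Minimum ($ErrorRate - $ErrorPlusMinus) -Maximum ($ErrorRate + $ErrorPlusMinus + 1)
$errorCount = [math]::Round($NumberOfEntries * $effectiveRate / 100)
$errorRows = Get-Random -InputObject (0..($NumberOfEntries - 1)) -Count $errorCount

# Generate a 12-digit barcode; corrupt it (truncate) for rows selected above
function New-SampleBarcode([bool]$corrupt) {
    $digits = -join (1..12 | ForEach-Object { Get-Random -Minimum 0 -Maximum 10 })
    if ($corrupt) { return $digits.Substring(0, 11) }   # 11 digits fails a ^\d{12}$ rule
    return $digits
}
```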
### 2. 0-Generate-Borrower-Data.PS1

This script generates borrower data with controlled error rates for testing validation and transformation rules.

```powershell
# Configurable variables
$NumberOfEntries = 50 # Default number of entries
$ExportFileName = "2-Data-Borrowers.csv" # Default export filename
$ErrorFields = @("BorrowerId") # Fields where errors will be introduced
$ErrorRate = 15 # Default error rate percentage (15%)
$ErrorPlusMinus = 5 # Plus/minus randomness for error rate (5%)
```

Features:
- Generates borrower records with IDs, names, contact information, etc.
- Creates IDs in the format `ANNNNAA` (letter, 4 digits, 2 letters)
- Deliberately introduces errors in specified fields at a controlled rate
- Includes title/salutation (Mr, Mrs, etc.) for GenderTransform function testing
- Outputs data in CSV format ready for import into the transformation tool
Sample Output Fields:
- BorrowerId (format matching `^[A-Za-z][0-9]{4}[A-Za-z]{2}$`)
- Name
- Title (Mr, Mrs, Ms, etc.)
- DateOfBirth (dd/MM/yyyy format)
- Email
- Phone
- Address
- MembershipType
- MembershipStartDate (dd/MM/yyyy format)
- OverdueCount
### Usage for Test Data Generation
1. Adjust the configurable variables at the top of the scripts:
- `$NumberOfEntries`: Number of records to generate
- `$ExportFileName`: Output CSV filename
- `$ErrorFields`: Fields where errors should be introduced
- `$ErrorRate`: Targeted percentage of records with errors
- `$ErrorPlusMinus`: Randomness range for error rate
2. Run the script in PowerShell:
```powershell
.\0-Generate-Book-Data.PS1
.\0-Generate-Borrower-Data.PS1
```

3. Use the generated CSV files for testing the transformation tool.
These scripts are valuable for:
- Testing validation rules with different error rates
- Performance testing with large data volumes
- Verifying transformation functions like GenderTransform and BorrowerTypeTransform
- Training and demonstrations
## Technical Details

The tool is built using:
- PowerShell scripting language
- Windows Forms for the GUI
- .NET Framework classes for data manipulation
- Regular expressions for validation
- P/Invoke for console window management
The main components are:
- Form layout with resizable panels
- DataGridView for data display with row numbers
- Support for row and column selection
- Alternating row colors and error cell highlighting
- Comprehensive logging system
- CSV parsing with support for quoted fields
Example mapping file (Book Records):

```csv
SourceField,NewField,DataType,Mandatory,Validation,ValidationRule,Transformation,TransformFunction
Title,BookTitle,string,Y,Y,^.{1,100}$,N,
Author,AuthorName,string,Y,N,,N,
ISBN,,string,Y,Y,^(?:ISBN(?:-13)?:?\s)?(?=[0-9X]{10}$|(?=(?:[0-9]+[-\s]){3})[-\s0-9X]{13}$),N,
PublishedYear,,int,Y,Y,^(19|20)\d{2}$,N,
PageCount,,int,N,Y,^\d{1,4}$,N,
Publisher,,string,Y,N,,N,
```

Example mapping file (Borrower Records):

```csv
SourceField,NewField,DataType,Mandatory,Validation,ValidationRule,Transformation,TransformFunction
BorrowerID,,string,Y,Y,^[A-Z][0-9]{4}[A-Z]{2}$|^T[0-9]{5}$,Y,BorrowerTypeTransform
BorrowerID,BorrowerType,string,N,N,,Y,BorrowerTypeTransform
Name,FirstName,string,Y,Y,^[A-Za-z\-']{2,30}$,Y,ExtractFirstName
Name,LastName,string,Y,Y,^[A-Za-z\-']{2,30}$,Y,ExtractLastName
Title,,string,N,N,,Y,GenderTransform
Title,Gender,string,N,N,,Y,GenderTransform
EmailAddress,,string,Y,Y,^[\w\.-]+@[\w\.-]+\.\w+$,N,
```

Note how in the Borrower Records example:

- `BorrowerID` is mapped twice (once as-is, once to extract a type)
- `Name` is mapped to both `FirstName` and `LastName` with different transformations
- `Title` is both kept as-is and transformed to `Gender`
The latest version includes several UI enhancements:
- Clear layout: Improved organization with proper panel placement
- Row numbers: Easy identification of data rows in the grid
- Alternating row colors: Better visual separation of rows
- Error highlighting: Clear visual indication of validation issues
- Column headers: Improved styling and click behavior for selection
- Resizable panels: Adjust data and log panel sizes as needed
- Better error handling: Graceful handling of display errors
The tool provides detailed logging of:
- Loaded mapping and data files
- Field name translations
- Original column orders from both data and mapping
- Validation errors with specific details
- Transformation summaries including affected fields and records
- Row and column counts
- Error details and locations
The tool supports mapping a single source field to multiple target fields with different processing rules:
```csv
SourceField,NewField,DataType,Mandatory,Validation,ValidationRule,Transformation,TransformFunction
CustomerName,FirstName,string,Y,Y,^[A-Z][a-z]+$,Y,ExtractFirstName
CustomerName,LastName,string,Y,Y,^[A-Z][a-z]+$,Y,ExtractLastName
```

This allows:
- Splitting a field into multiple components
- Creating different views of the same data
- Applying different validation rules to each target field
- Transforming the same source data in different ways
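The `ExtractFirstName`/`ExtractLastName` transforms referenced in the example are not shown in the script excerpts above; a minimal sketch of how such splitting functions might look (hypothetical implementations, assuming a "First Last" name format — the shipped script's versions may differ):

```powershell
# Hypothetical splitting transforms; the shipped script's versions may differ.
function ExtractFirstName($value) {
    # Everything before the first run of whitespace
    return ($value -split '\s+', 2)[0]
}

function ExtractLastName($value) {
    $parts = $value -split '\s+', 2
    if ($parts.Count -gt 1) { return $parts[1] } else { return "" }
}

# Register alongside the other transformation functions
if (-not $global:TransformFunctions) { $global:TransformFunctions = @{} }
$global:TransformFunctions["ExtractFirstName"] = ${function:ExtractFirstName}
$global:TransformFunctions["ExtractLastName"]  = ${function:ExtractLastName}
```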
The tool now tracks and reports on:
- Fields that have been transformed
- New fields that were generated through transformation
- The number of records affected by transformations
- Source fields with multiple mappings
The tool automatically minimizes the PowerShell console window when launched to provide a cleaner user interface experience. This is accomplished through P/Invoke calls to the Windows API.


