Powershell: compare and filter SQL-Reply/Array with file content
Introduction
In this article, we will explore how to compare a PowerShell array with the contents of a file. The array in question is likely to be the result set from an SQL query, while the file contains document IDs on each line. We will go through the process step by step and provide code examples.
Prerequisites
To follow this article, you should have the following:
- PowerShell installed on your system
- A basic understanding of PowerShell scripting
- The
System.Datanamespace for working with SQL query results
Understanding the Problem
Let’s break down the problem at hand. You have a SQL query that returns a result set containing document IDs and other attributes. This result set is stored in an array. You also have a file that contains document IDs on each line.
Your goal is to compare this array with the contents of the file, filter out any rows that do not match the document IDs present in the file, and then use the remaining rows for further processing.
Step 1: Load the SQL Query Results into an Array
To start, you need to load the result set from your SQL query into a PowerShell array. This can be done using the ConvertFrom-Csv cmdlet.
# Load the SQL query results into an array
$sql_reply = ConvertFrom-Csv @'
Name,Article,Size
David,TShirt,M
Eduard,Trouwsers,S
Marc,Trouwsers,L
Reto,Trouwsers,XS
'@
# Print the first few rows of the result set
$sql_reply | Select-Object -First 5
Step 2: Read the File Contents
Next, you need to read the contents of the file that contains document IDs on each line.
# Define the path to the file containing document IDs
$file_path = 'C:\Path\To\File.txt'
# Use Get-Content to read the file contents
$file_content = Get-Content -Path $file_path
# Print the first few lines of the file contents
$file_content | Select-Object -First 5
Step 3: Filter Out Unmatched Rows
Now that you have both arrays, you can filter out any rows in your result set ($sql_reply) whose documentId attribute does not match a line in your file ($file_content).
# Compare the $file_content array with the documentIds in $sql_reply
$missing_in_db = [string[]](Compare-Object $file_content $sql_reply.documentId | Where {$_.sideindicator -eq "<="} | % {$_.inputobject})
# Get rows from $sql Reply where documentID matches a line in file content
$exist_in_db = [string[]](Compare-Object $file_content $sql_reply.documentId -IncludeEqual -ExcludeDifferent | Where {$_.sideindicator -eq "=="} | % {$_.inputobject})
However, the above code does not correctly filter rows out that don’t match. Instead we need to select only those rows where documentID matches a line in $file_content.
# Filter $sql_reply by matching documentIds with lines in file_content
$filteredResultSet = $sql_reply | Where{$sql_reply.documentID -in $exist_in_db}
Step 4: Process the Remaining Rows
Once you have filtered your result set, you can process the remaining rows for further analysis or output.
# Print a message indicating how many matching document IDs were found
Write-Host "Found $($filteredResultSet.Count) matching document IDs."
# Export the matched rows to a new CSV file
$matchedRowsCsv = New-Object System.IO.StringWriter
$matchedRowsCsvWriter = $matchedRowsCsv.getWriter
$writer = [System.Text.StringBuilder]::new()
$writer.Append("[")
$writer.AppendLine("DocumentID,")
foreach($row in $filteredResultSet) {
if($writer.Length -gt 0) {$writer.Append(",")};
$writer.Write($row.documentId);
$writer.Append(",`" + $row.itemname);
$writer.Append(",`" + $row.datestored)
}
$writer.AppendLine("]")
Write-Host "Exported matching rows to CSV file."
Example Use Case
Let’s say you have a SQL query that returns the following result set:
| documentID | itemname | dateStored |
|---|---|---|
| A | Item1 | 2022-01-01 |
| B | Item2 | 2022-02-01 |
| C | Item3 | 2022-03-01 |
And you have a file containing the following lines:
A B
In this example, we want to filter out any rows in our result set whose documentID attribute does not match a line in our file.
The code above would correctly return only the first two rows of the result set (documentID A and documentID B) because they are present as lines in our file. The third row (documentID C) is not returned because it is not present as a line in our file.
Conclusion
In this article, we went through the steps to compare an array with the contents of a file. We started by loading our SQL query results into an array and reading the contents of our file using Get-Content. Then, we filtered out any rows that did not match based on their documentId attribute. Finally, we processed the remaining rows for further analysis or output.
Last modified on 2023-10-12