0

I have following PS script to get a count. Is there a way to count (minus the header) without importing the entire csv? Sometimes the csv file is very large and sometime it has no records.

Get-ChildItem 'C:\Temp\*.csv' | ForEach {
    $check = Import-Csv $_
    If ($check) { Write-Host "$($_.FullName) contains data" }
    Else { Write-Host "$($_.FullName) does not contain data" }
}

2 Answers 2

1

To count the rows without worrying about the header use this:

$c = (Import-Csv $_.FullName).count

However this has to read the entire file into memory. A faster way to count the file would be to use the Get-Content with the readcount flag like so:

$c = 0
Get-Content $_.FullName -ReadCount 1000 | % {$c += $_.Length}
$c -= 1

To remove the header row from the count you just subtract 1. If your files with no rows don't have an header you can avoid them counting as minus 1 like so:

$c = 0
Get-Content $_.FullName -ReadCount 1000 | % {$c += $_.Length}
$c -= @{$true = 0; $false = - 1}[$c -eq 0]
Sign up to request clarification or add additional context in comments.

Comments

0

Here is the function that will check is CSV file empty (returns True if empty, False otherwise) with following features:

  • Can skip headers
  • Works in PS 2.0 (PS 2.0 hasn't -ReadCount switch for Get-Content cmdlet)
  • Doesn't load entire file in memory
  • Aware of CSV file structure (wouldn't count empty/nonvalid lines).

It accepts following arguments:

  • FileName: Path to CSV file.
  • MaxLine: Maximum lines to read from file.
  • NoHeader: If this switch is not specified, function will skip first line of the file

Usage example:

Test-IsCsvEmpty -FileName 'c:\foo.csv' -MaxLines 2 -NoHeader

function Test-IsCsvEmpty
{
    Param
    (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true)]
        [string]$FileName,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [ValidateRange(1, [int]::MaxValue)]
        [int]$MaxLines = 2,

        [Parameter(ValueFromPipelineByPropertyName = $true)]
        [switch]$NoHeader
    )

    Begin
    {
        # Setup regex for CSV parsing
        $DQuotes = '"'
        $Separator = ','
        # http://stackoverflow.com/questions/15927291/how-to-split-a-string-by-comma-ignoring-comma-in-double-quotes
        $SplitRegex = "$Separator(?=(?:[^$DQuotes]|$DQuotes[^$DQuotes]*$DQuotes)*$)"
    }

    Process
    {
        # Open file in StreamReader
        $InFile = New-Object -TypeName System.IO.StreamReader -ArgumentList $FileName -ErrorAction Stop

        # Set inital values for Raw\Data lines count
        $CsvRawLinesCount = 0
        $CsvDataLinesCount = 0

        # Loop over lines in file
        while(($line = $InFile.ReadLine()) -ne $null)
        {
            # Increase Raw line counter
            $CsvRawLinesCount++

            # Skip header, if requested
            if(!$NoHeader -and ($CsvRawLinesCount -eq 1))
            {
                continue
            }

            # Stop processing if MaxLines limit is reached
            if($CsvRawLinesCount -gt $MaxLines)
            {
                break
            }

            # Try to parse line as CSV
            if($line -match $SplitRegex)
            {
                # If success, increase CSV Data line counter 
                $CsvDataLinesCount++
            }
        }
    }

    End
    {
        # Close file, dispose StreamReader
        $InFile.Close()
        $InFile.Dispose()

        # Write result to the pipeline
        if($CsvDataLinesCount -gt 0)
        {
            $false
        }
        else
        {
            $true
        }
    }
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.