I'm trying to write a script that does the following:

  • determine which folders under a source dir contain PDF files
  • recreate the full original directory structure down to these folders in a destination dir
  • copy the full content of each folder that contains a PDF, regardless of the types of the other files in that folder
  • do not copy any of the files in the parent folders

[Screenshot: description of what I am trying to achieve]

I hope I described it well enough for those of you who don't live in my head to understand what I mean. :D

This is the script I have so far:

# Define the source and destination paths
$source = "H:\MC4"
$destination = "H:\Mirror"

# Get all subdirectories that contain at least one PDF file
$dirsWithPDFs = Get-ChildItem -Path $source -Recurse -Directory | Where-Object {
    Get-ChildItem -Path $_.FullName -Filter *.pdf -File -Recurse | Where-Object { $_.Extension -eq ".pdf" }
}

# Copy each directory with PDF files using Robocopy
foreach ($dir in $dirsWithPDFs) {
    $sourceDir = $dir.FullName
    $destDir = $sourceDir.Replace($source, $destination)
    robocopy $sourceDir $destDir /MIR
}

It nearly does the job. The only problem: the files in the parent folders of my "PDF folder" get copied as well, and I want those parent folders to stay empty of files.

Can you tell me how to do that? Sorry, I've found many similar questions and answers, but none covering this exact situation.

Thanks so much!

  • Can target folders with PDFs contain more folders which might have PDFs of their own? If so, should there be multiple result folders? Commented Jul 30, 2024 at 17:34
  • That's a good question indeed ... I have never seen a case in which a folder with PDFs contained folders with more PDFs, but it might happen. If so, both should be filled with their full original content. Commented Jul 31, 2024 at 5:45

1 Answer

The following code does not use robocopy, although the Copy-Item call could be replaced by it. It first builds a list of the directories that contain one or more .pdf files. For each of them it then constructs the target path by stripping the leading source directory from the directory path and appending the remainder to the destination directory.

[CmdletBinding()]
param ()
$Source = 'C:\src'
$Destination = 'C:\mirror'

$SourceDir = $Source
$DestinationDir = $Destination

if ($SourceDir[-1] -ne [IO.Path]::DirectorySeparatorChar) { $SourceDir += [IO.Path]::DirectorySeparatorChar }
if ($DestinationDir[-1] -ne [IO.Path]::DirectorySeparatorChar) { $DestinationDir += [IO.Path]::DirectorySeparatorChar }
Write-Verbose "SourceDir is $SourceDir"
Write-Verbose "DestinationDir is $DestinationDir"

# -File skips directories whose name happens to end in .pdf
$SourceDirectories = Get-ChildItem -Recurse -File -Path $SourceDir -Filter '*.pdf' |
    Select-Object -ExpandProperty DirectoryName |
    Sort-Object -Unique

# Anchor the pattern so only the leading source path is stripped
$SourceRegex = '^' + [regex]::Escape($SourceDir)
Write-Verbose "SourceRegex = $SourceRegex"

foreach ($SourceDirectory in $SourceDirectories) {
    $TargetPath = $DestinationDir + ($SourceDirectory -replace $SourceRegex,'')
    Write-Verbose "TargetPath = $TargetPath"

    if (-not (Test-Path -Path $TargetPath)) { New-Item -ItemType Directory -Path $TargetPath }
    Copy-Item -Path (Join-Path -Path $SourceDirectory -ChildPath '*') -Destination $TargetPath
}
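As a rough sketch of the robocopy variant mentioned above: when robocopy is called without /S, /E or /MIR, it copies only the files directly inside the source folder, so parent folders are created but stay empty of files. The function names Get-MirrorTarget and Copy-FolderFlat are hypothetical helpers made up for this illustration, and the sketch assumes that every directory passed in lies under the source root.

```powershell
# Hypothetical helper: maps a source folder to its counterpart under the destination root.
# Assumes $Directory starts with $SourceRoot.
function Get-MirrorTarget {
    param(
        [string] $SourceRoot,
        [string] $DestinationRoot,
        [string] $Directory
    )
    $root     = $SourceRoot.TrimEnd('\')
    $relative = $Directory.Substring($root.Length).TrimStart('\')
    if ($relative) { return $DestinationRoot.TrimEnd('\') + '\' + $relative }
    # A PDF directly in the source root maps to the destination root itself.
    return $DestinationRoot.TrimEnd('\')
}

# Hypothetical wrapper around robocopy for a single folder (Windows only).
function Copy-FolderFlat {
    param([string] $From, [string] $To)
    # No /S, /E or /MIR: only the files directly inside $From are copied,
    # and robocopy creates $To if it does not exist yet.
    robocopy $From $To /NJH /NJS | Out-Null
    # Robocopy exit codes of 8 or higher mean at least one copy failed.
    if ($LASTEXITCODE -ge 8) { Write-Warning "robocopy reported errors for $From" }
}

Get-MirrorTarget -SourceRoot 'H:\MC4' -DestinationRoot 'H:\Mirror' -Directory 'H:\MC4\A\B'
# H:\Mirror\A\B
```

Inside the foreach loop, the Test-Path / New-Item / Copy-Item lines could then be swapped for a single call such as Copy-FolderFlat -From $SourceDirectory -To $TargetPath.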

Update:

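This version accepts a list of excluded directories, given as paths relative to $Source. Only PDFs located directly in a listed directory are skipped; its subdirectories are still processed.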

[CmdletBinding()]
param ()
$Source = 'C:\src\'
$Destination = 'C:\mirror'
$Exclusions = @('t\t2')
$ExclusionDirs = foreach ($Exclusion in $Exclusions) { Join-Path -Path $Source -ChildPath $Exclusion }

$SourceDir = $Source
$DestinationDir = $Destination

if ($SourceDir[-1] -ne [IO.Path]::DirectorySeparatorChar) { $SourceDir += [IO.Path]::DirectorySeparatorChar }
if ($DestinationDir[-1] -ne [IO.Path]::DirectorySeparatorChar) { $DestinationDir += [IO.Path]::DirectorySeparatorChar }
Write-Verbose "SourceDir is $SourceDir"
Write-Verbose "DestinationDir is $DestinationDir"

# -File skips directories whose name happens to end in .pdf
$SourceDirectories = Get-ChildItem -Recurse -File -Path $SourceDir -Filter '*.pdf' |
    Where-Object { $_.DirectoryName -notin $ExclusionDirs } |
    Select-Object -ExpandProperty DirectoryName |
    Sort-Object -Unique

# Anchor the pattern so only the leading source path is stripped
$SourceRegex = '^' + [regex]::Escape($SourceDir)
Write-Verbose "SourceRegex = $SourceRegex"

foreach ($SourceDirectory in $SourceDirectories) {
    $TargetPath = $DestinationDir + ($SourceDirectory -replace $SourceRegex,'')
    Write-Verbose "TargetPath = $TargetPath"

    if (-not (Test-Path -Path $TargetPath)) { New-Item -ItemType Directory -Path $TargetPath }
    Copy-Item -Path (Join-Path -Path $SourceDirectory -ChildPath '*') -Destination $TargetPath
}
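The -notin test matches an excluded directory only exactly. If everything beneath an excluded directory should be skipped too, a prefix check is one possible variation. This is a minimal sketch, not part of the tested answer: Test-ExcludedDirectory is a hypothetical helper name, and it assumes Windows-style paths with backslash separators.

```powershell
# Hypothetical helper: returns $true when $Directory is one of the excluded
# folders or lies anywhere beneath one of them.
function Test-ExcludedDirectory {
    param(
        [string]   $Directory,
        [string[]] $ExcludedDirs
    )
    foreach ($excluded in $ExcludedDirs) {
        $root = $excluded.TrimEnd('\')
        # Appending '\' before the prefix test keeps 't2' from matching 't20'.
        if ($Directory -eq $root -or
            $Directory.StartsWith($root + '\', [StringComparison]::OrdinalIgnoreCase)) {
            return $true
        }
    }
    return $false
}

Test-ExcludedDirectory -Directory 'C:\src\t\t2\deep' -ExcludedDirs @('C:\src\t\t2')   # True
Test-ExcludedDirectory -Directory 'C:\src\t\t20'     -ExcludedDirs @('C:\src\t\t2')   # False
```

In the pipeline, the filter would then read Where-Object { -not (Test-ExcludedDirectory -Directory $_.DirectoryName -ExcludedDirs $ExclusionDirs) }.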

4 Comments

Works great, thank you! An additional question came up: can I exclude one specific path? I tried adding an exception for this path like this: Copy-Item -Path (Join-Path -Path $SourceDirectory -ChildPath '*') -Exclude @("$SourceDir\MC196\195303\847779") -Destination $TargetPath but it doesn't work ... I'm not sure where to put it. Can you help, please?
I have tried several things now. My best guess is to add $exclusion = "MC196\195303\847779\" before the first 'if' lines and then $SourceDirectories = Get-ChildItem -Recurse -Path $SourceDir -Filter '*.pdf' | Where-Object {$_.FullName -notlike $exclusion} | Select-Object -ExpandProperty DirectoryName | Sort-Object -Unique in the middle. But it still doesn't work.
Added code to allow excluded directories to be omitted.
I know it says "avoid comments like thanks" but ... thanks! This helped a lot!
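
A likely reason the -notlike attempt from the comments had no effect: without wildcard characters, -like and -notlike compare the whole string, so a path fragment never equals a full file path and nothing is filtered out. The sketch below illustrates this with a made-up file name (report.pdf is an assumption, not taken from the question). As far as I know, the -Exclude parameter of Copy-Item is matched against item names rather than full paths, which would explain why the first attempt did nothing either.

```powershell
$exclusion = 'MC196\195303\847779\'
$pdfPath   = 'H:\MC4\MC196\195303\847779\report.pdf'   # hypothetical file name

# No wildcards: the fragment is compared with the entire path, they differ,
# so -notlike is True and the file is kept.
$keptWithoutWildcards = $pdfPath -notlike $exclusion        # True

# With surrounding wildcards the fragment is found inside the path,
# so -notlike is False and the file is filtered out.
$keptWithWildcards    = $pdfPath -notlike "*$exclusion*"    # False
```

Note that -like treats square brackets as wildcard characters, so folder names containing [ or ] would need extra care.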
