Please consider this Python code:

import json
import sys
from pathlib import Path

def main():
    filenames = sys.argv[1:]

    for filename in filenames:
        path = Path(filename)
        with path.open() as file:
            text = file.read()
            a = json.loads(text)

if __name__ == "__main__":
    main()

This script works fine on Linux when called like this:

python script_name.py logs/*.txt

But on Windows, in Anaconda PowerShell, the same call raises an error:

python script_name.py logs/*.txt
OSError: [Errno 22] Invalid argument: 'logs\\*.txt'

So how do I call a script on Windows with an argument that is a filename wildcard (*.txt)?

5
  • from pathlib import Path Commented Sep 15, 2021 at 17:31
  • Don't update your post via comments. Just click "edit". Commented Sep 15, 2021 at 17:33
  • logs/*.txt doesn't match 'logs\\*.txt'. Are you sure that's correct? Commented Sep 15, 2021 at 17:42
  • When you say "joker", you mean "wildcard", right? Commented Sep 15, 2021 at 17:44
  • my crystal ball tells me that in bash the wildcard is replaced with the list of files while in PowerShell you get the string directly Commented Sep 15, 2021 at 17:57

1 Answer

Issue: *.ext is not a valid path

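*.ext is not a valid path but a glob, a kind of pattern used to match or find files. On Linux, the shell (for example bash) expands the pattern into the list of matching files before Python starts. PowerShell on Windows passes the literal string logs/*.txt to the script, which then tries to open it as if it were a file name.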

Preferred Solution

Pass each argument (either a glob expression or a concrete file path) directly to a method that can both expand a path pattern (glob) and resolve a concrete path.

Pathlib with glob method

Since you already import and use pathlib, you could use its glob method like this:

from pathlib import Path

paths = list(Path('.').glob('*.txt'))
# [PosixPath('test.txt'), PosixPath('production.txt')]
for path in paths:
    with path.open() as file:
        text = file.read()

The output shown in the comment line assumes that there are two .txt files in your current directory (denoted by '.').

Note: You could also pass relative path expressions to glob, like logs/*.txt or even **/*.txt, which will match the files in all sub-directories recursively (denoted by **).
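
As a small sketch of the two patterns from the note, the following builds a throwaway directory tree and compares a flat pattern with a recursive one. The helper find_txt and the file names are made up for illustration.

```python
import tempfile
from pathlib import Path


def find_txt(root: Path, pattern: str) -> list[str]:
    """Return the matches for `pattern` below `root` as sorted POSIX-style relative paths."""
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


# Hypothetical layout, created in a temporary directory just for the demo.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "logs" / "old").mkdir(parents=True)
    for name in ("top.txt", "logs/test.txt", "logs/old/archive.txt"):
        (root / name).write_text("{}")

    flat = find_txt(root, "logs/*.txt")     # only files directly inside logs/
    recursive = find_txt(root, "**/*.txt")  # root and every sub-directory

print(flat)       # ['logs/test.txt']
print(recursive)  # ['logs/old/archive.txt', 'logs/test.txt', 'top.txt']
```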

What if a user passes a concrete file-path?

Keep in mind that a user might pass concrete file names as arguments instead of patterns. You should test whether the glob method can deal with that.

If not, you would have to detect such arguments and handle them with a different lookup.
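
One way to run that test is a quick experiment in a temporary directory. This is only a sketch with made-up file names; it checks what Path.glob returns for a wildcard pattern, for a concrete name that exists, and for a concrete name that does not.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "test.txt").write_text("{}")  # hypothetical sample file

    wildcard = [p.name for p in base.glob("*.txt")]       # pattern with a wildcard
    concrete = [p.name for p in base.glob("test.txt")]    # plain name, file exists
    missing = [p.name for p in base.glob("missing.txt")]  # plain name, no such file

print(wildcard)  # ['test.txt']
print(concrete)  # ['test.txt']
print(missing)   # []
```

So a plain relative file name yields one match or none. One caveat, as far as I know: Path.glob rejects absolute patterns with NotImplementedError, so absolute file paths passed as arguments would still need separate handling.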

Alternative: Pure globs (jokers, wildcards) in Python

Python's standard library also has a dedicated glob module that works on plain strings instead of Path objects. This is how it could work here, too:

import glob

filenames = glob.glob('logs/*.txt')
# ['logs/test.txt', 'logs/production.txt']

See also: Using File Extension Wildcards in os.listdir(path)

But as Charlie G advised, introducing another module is not necessary here when pathlib can already do the globbing.
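
If you do go with the glob module, for example because the arguments may be absolute paths, a sketch could look like this. The helper name expand_with_glob and the sample files are assumptions for the demo; glob.escape only protects the temporary directory name in case it contains wildcard characters.

```python
import glob
import os
import tempfile


def expand_with_glob(patterns):
    """Expand each pattern with glob.glob and return one flat, sorted list."""
    matches = []
    for pattern in patterns:
        matches.extend(glob.glob(pattern))  # one string per call, never a list
    return sorted(matches)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("test.txt", "production.txt", "notes.md"):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write("{}")

    safe_tmp = glob.escape(tmp)
    # One wildcard pattern and one concrete file name, both as absolute paths.
    patterns = [os.path.join(safe_tmp, "*.txt"), os.path.join(safe_tmp, "notes.md")]
    found = [os.path.basename(match) for match in expand_with_glob(patterns)]

print(found)  # ['notes.md', 'production.txt', 'test.txt']
```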

Handle file-name patterns in command-arguments

When passing a file-name pattern like logs/*.txt via the command-line, you should treat each argument separately.

For example a program call from console/shell like:

python script_name.py logs/*.txt 

would work like this:

import sys
from pathlib import Path

if __name__ == "__main__":
    # the first element (index 0) is the name of the program called
    path_patterns = sys.argv[1:]  # get all arguments as a list by slicing
    print('got arguments:', path_patterns)

    for pattern in path_patterns:
        # glob returns a generator, so collect it into a list before printing
        paths = list(Path.cwd().glob(pattern))
        print('file-pattern: ', pattern, 'globbed to paths: ', paths)

Note: the glob method requires a single pattern as a string (type str), not a list. If you pass a list to it, like glob(path_patterns), you will get an error like:

TypeError: expected str, bytes or os.PathLike object, not list

Your sys.argv[1:] uses slicing to get all arguments passed on the command-line. So the resulting list could contain 0, 1 or multiple elements.
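
Putting the pieces together for the script from the question, here is a minimal sketch. The function names expand_patterns and load_json_files and the sample file contents are assumptions, not part of the original code. Each argument is globbed on its own and every matching file is parsed as JSON.

```python
import json
import tempfile
from pathlib import Path


def expand_patterns(patterns, base=None):
    """Expand every pattern relative to `base` (default: cwd) into a flat list of paths."""
    base = Path.cwd() if base is None else Path(base)
    paths = []
    for pattern in patterns:
        paths.extend(sorted(base.glob(pattern)))  # one string pattern per call
    return paths


def load_json_files(paths):
    """Parse each file as JSON and return a {file name: parsed object} dict."""
    return {path.name: json.loads(path.read_text()) for path in paths}


# Demo in a temporary directory with two hypothetical log files.
with tempfile.TemporaryDirectory() as tmp:
    logs = Path(tmp) / "logs"
    logs.mkdir()
    (logs / "test.txt").write_text('{"level": "info"}')
    (logs / "production.txt").write_text('{"level": "error"}')

    loaded = load_json_files(expand_patterns(["logs/*.txt"], base=tmp))

print(loaded)  # {'production.txt': {'level': 'error'}, 'test.txt': {'level': 'info'}}
```

In the real script, main() would call expand_patterns(sys.argv[1:]) instead of using the demo directory. That should behave the same whether the shell already expanded the wildcard (bash) or passed it through literally (PowerShell on Windows), as long as the arguments are relative paths.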

Validate command line arguments

If you only require a single argument (the "globbed" file path), then use path_pattern = sys.argv[1].

Furthermore, it would be good style and defensive programming to check the number of arguments before accessing sys.argv[1] (to avoid an IndexError).

This could be done like this:

import sys

# guard statement testing for the required number of arguments (program + 1 = 2)
if len(sys.argv) < 2:
    print('Requires at least a single argument, the file-path!')
    print('Usage: python script_name.py <file-path>')
    print('Example: python script_name.py logs/*.txt')
    sys.exit(1)  # non-zero exit status signals the error to the caller

# continue because here you are sure at least 1 argument exists
print('got at least 1 required argument: ', sys.argv[1:])
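
As an alternative sketch, the standard-library argparse module can do the same check and print the usage text for you. The function name parse_args and the argument name patterns are my own choices here, not something from the question.

```python
import argparse


def parse_args(argv=None):
    """Require at least one file path or glob pattern on the command line."""
    parser = argparse.ArgumentParser(description="Load JSON from the given files.")
    parser.add_argument(
        "patterns",
        nargs="+",  # one or more arguments; argparse exits with an error if none is given
        metavar="file-path",
        help="file path or glob pattern, e.g. logs/*.txt",
    )
    return parser.parse_args(argv)  # argv=None means: read sys.argv[1:]


# Passing an explicit list makes the sketch runnable without a real command line.
args = parse_args(["logs/*.txt", "extra.json"])
print(args.patterns)  # ['logs/*.txt', 'extra.json']
```

With no arguments, argparse prints a usage message to stderr and exits with status 2, so the manual length check would no longer be needed.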

5 Comments

Using pathlib.Path.glob should be the accepted answer rather than bringing in a separate library as it will handle OS switches natively. OP might have to do some validation of the strings brought in to determine whether wildcards are present in each item of sys.argv[1:], but I think minimizing imports should be preferred.
Actually, you can pass each element of sys.argv[1:] right into pathlib.Path.glob since a direct match of the pattern without a wildcard should return either one or no paths.
@CharlieG Thanks for pointing that out. I first came up with raw use of the glob module, then I found that the glob feature already exists in pathlib. I also agree with the direct match for concrete-file arguments (without wildcards in them).
So with this instruction: filenames = glob.glob(sys.argv[1:]) I got this error:
    filenames = glob.glob(sys.argv[1:])
  File "C:\ProgramData\Anaconda3\lib\glob.py", line 21, in glob
    return list(iglob(pathname, recursive=recursive))
  File "C:\ProgramData\Anaconda3\lib\glob.py", line 42, in _iglob
    dirname, basename = os.path.split(pathname)
  File "C:\ProgramData\Anaconda3\lib\ntpath.py", line 185, in split
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not list
@Theo75 The error message suggests the fix: use a string instead of a list 😉 glob.glob(sys.argv[1]) passes a single string, whereas sys.argv[1:] is Python's list slicing and returns a list. See my update on handling command-line args.
