I am trying to find the duration of a remote video file (say, an MP4).
I already know how to get the duration of a local video file:

    import xml.etree.ElementTree as eltt, subprocess as spr

    def size_from_fn(file_name):
        # Probe a local file by path and read the duration from the XML report.
        size = eltt.fromstring(
            spr.run(["ffprobe",
                "-i", file_name,
                "-show_format", "-output_format", "xml"
                ], stdout = spr.PIPE, stderr = spr.STDOUT).stdout.decode()
            ).find("format").get("duration")
        return size

    def size_from_fd(file_descriptor):
        # Same, but ffprobe reads the video from its standard input.
        size = eltt.fromstring(
            spr.run(["ffprobe",
                "-i", "pipe:0",
                "-show_format", "-output_format", "xml"
                ], stdin = file_descriptor, stdout = spr.PIPE, stderr = spr.STDOUT).stdout.decode()
            ).find("format").get("duration")
        return size

    def size_from_data(data):
        # Same, but the complete video is passed as a bytes object.
        size = eltt.fromstring(
            spr.run(["ffprobe",
                "-i", "pipe:0",
                "-show_format", "-output_format", "xml"
                ], input = data, stdout = spr.PIPE, stderr = spr.STDOUT).stdout.decode()
            ).find("format").get("duration")
        return size

All three work perfectly.
I also know how to get an HTTP response as a file-like object:

    import requests as rq

    def url_to_fd(url):
        req = rq.get(url, stream = True)
        return req.raw

This also works.
However, combining the two (passing the object returned by url_to_fd to size_from_fd) fails with this message from ffprobe: Invalid data found when processing input
I have no idea why. The only difference I know of is that the object returned for a URL is not seekable (it can only be read forward). But if I disable that method on a normal file object:

    with open("test.mp4", "rb") as f:
        f.seek = None
        size_from_fd(f)

this works, which suggests to me that ffprobe doesn't need to seek.
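A possible caveat about this experiment: as far as the subprocess documentation describes it, a file object passed as stdin is handed to the child through its OS-level descriptor (fileno()), so a Python-side change such as f.seek = None is not visible to the child process. The sketch below illustrates this with a stand-in child (a second Python interpreter instead of ffprobe, an assumption made so that it runs without ffprobe installed) that seeks on its standard input even though seek was disabled on the Python object. The helper name child_can_seek is hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for ffprobe: seek to offset 3 on stdin (fd 0), then print what follows.
CHILD_CODE = (
    "import os, sys\n"
    "os.lseek(0, 3, os.SEEK_SET)\n"
    "sys.stdout.write(os.read(0, 16).decode())\n"
)

def child_can_seek(payload=b"abcdef"):
    """Return what the child reads from stdin after seeking to offset 3."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        with open(path, "rb") as f:
            f.seek = None  # same trick as in the question: only the Python object changes
            result = subprocess.run(
                [sys.executable, "-c", CHILD_CODE],
                stdin=f, stdout=subprocess.PIPE,
            )
        return result.stdout.decode()
    finally:
        os.remove(path)
```

If the stand-in prints the bytes after offset 3, the child was still able to seek, so the f.seek = None test may not say much about whether ffprobe seeks. A pipe or a socket, by contrast, cannot be rewound by the child at all.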
Reading the whole response first also works, so I don't know what is going on:

    def get_duration(url):
        complete_data = url_to_fd(url).read()
        return size_from_data(complete_data)

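A streaming variant of this workaround, as a rough sketch: instead of reading the whole body first, feed it to the child process chunk by chunk and stop as soon as the child closes its input. pipe_chunks is a hypothetical helper, not part of requests or subprocess. Whether ffprobe can actually stop early is an assumption that depends on the file: an MP4 whose metadata (the moov atom) sits at the end would still need most of the data.

```python
import subprocess

def pipe_chunks(cmd, chunks):
    """Run cmd, write the byte chunks to its stdin, and return its stdout as text.

    Writing stops as soon as the child exits or closes its input.
    Assumes the child prints only a small report, as ffprobe does.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
    except OSError:
        pass  # includes BrokenPipeError: the child no longer reads its input
    finally:
        try:
            proc.stdin.close()
        except OSError:
            pass  # flushing the last buffered bytes can fail the same way
    output = proc.stdout.read()
    proc.wait()
    return output.decode()
```

With ffprobe, this might be called as pipe_chunks(["ffprobe", "-i", "pipe:0", "-show_format", "-output_format", "xml"], rq.get(url, stream = True).iter_content(chunk_size = 65536)), with the returned XML parsed as in the functions above.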
The problem is that the video files may be arbitrarily large, so I can't afford to download the whole video.
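One approach that may avoid the full download, assuming the local ffprobe build includes HTTP(S) support: give ffprobe the URL itself as input, so that it fetches only the parts of the file it needs (this relies on the server supporting range requests whenever ffprobe has to seek). The sketch splits the work into hypothetical helpers, ffprobe_cmd and parse_duration, so that the parsing can be checked without network access.

```python
import subprocess
import xml.etree.ElementTree as ET

def ffprobe_cmd(source):
    """Build the ffprobe command; source may be a local path or an http(s) URL."""
    return ["ffprobe", "-v", "error", "-i", source,
            "-show_format", "-output_format", "xml"]

def parse_duration(xml_text):
    """Extract the duration in seconds from ffprobe's XML report, or None."""
    fmt = ET.fromstring(xml_text).find("format")
    if fmt is None or fmt.get("duration") is None:
        return None
    return float(fmt.get("duration"))

def duration_from_url(url):
    # stderr is kept apart from stdout so diagnostics cannot corrupt the XML.
    result = subprocess.run(ffprobe_cmd(url), stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, check=True)
    return parse_duration(result.stdout.decode())
```

For example, with a hand-written sample (not real ffprobe output), parse_duration('<ffprobe><format duration="12.500000"/></ffprobe>') gives 12.5.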
Check what req.raw actually contains: if you see some HTML, then the URL is wrong.
ffprobe options that print only the duration: -v quiet -show_entries format=duration -output_format default=noprint_wrappers=1:nokey=1 (from "ffmpeg - How to get video duration in seconds?" on Super User).