Information is power
To process my Subscriber Crown videos for YouTube, I needed a little bit of information. I tend to split up longer videos for ‘ease’ of watching: psychologically, I am to some extent put off by seeing a video that runs to multiple hours. I also think there’s an advantage when coming back, insofar as it is easier to remember ‘I was on video 5’ than ‘I was two and a half hours in’. All of this is based on my own consumption preferences, as opposed to any data-driven engagement metrics, which would be another discussion and a philosophical one at that.
For the Sub Crown videos, I wanted to know both what the average length is, and what the longest is. That’s because if most are around an hour, I would put those up ‘as-is’ (as-are?) and split any outlier long videos. On the other hand, if many are over an hour-and-a-half I can treat the video splitting as part of the video preparation step of processing.
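As a rough sketch of that decision rule (the 90-minute cutoff, the function name and the return values are illustrative assumptions, not something taken from the script further down):

```python
from statistics import mean

# Assumed cutoff: an hour and a half, expressed in seconds
CUTOFF_SECONDS = 90 * 60


def plan_splitting(durations, cutoff=CUTOFF_SECONDS):
    """Decide how to treat a series given its video durations in seconds.

    Returns ("split-all", every index) when the average is over the cutoff,
    otherwise ("as-is", indices of the outliers that still need splitting).
    """
    if not durations:
        raise ValueError("no durations supplied")
    if mean(durations) > cutoff:
        # Long videos are the norm: make splitting part of preparation
        return "split-all", list(range(len(durations)))
    # Most videos are short enough: only the outliers get split
    outliers = [i for i, d in enumerate(durations) if d > cutoff]
    return "as-is", outliers
```

With two hour-long videos and one two-hour video, this would flag only the last one for splitting.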

I debated doing this as a bash function, but quickly realised that it’s more complicated than it’s worth (related U&L answer from yesterday), and so I wrote a small Python script instead.
The script’s output (pictured above) told me:
- the average is indeed over the cutoff for automating the splitting up of videos
- the longest one went on for quite a while (the most recent one, ‘Quadrennic Games’)
- I haven’t moved all of the videos to the right pending directory yet (there should be over 12)
So it was a helpful exercise, and will be useful for other video series.
The process of writing the script also made me realise a few things:
- the issue I had with ‘found 4 tabs, expected 1’ reared its head again; this time I changed my user-wide .pylintrc file
- I have now reused a couple of functions: to_hms for some nice time formatting; and get_media_length to get a video’s duration in seconds using ffprobe.
The latter has now been used in a few places, and I think it’s time to pull out some common functions to my own ‘library’, which I have done in other languages before but not in Python, at least not recently.
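A minimal sketch of what such a shared module might look like, assuming a hypothetical file name of mediautils.py. The two functions are simplified variants rather than copies of the ones in the script below: this to_hms uses divmod throughout, and this get_media_length uses subprocess.run (Python 3.7+) and returns a float.

```python
"""mediautils.py -- hypothetical shared module for video helper functions"""
import subprocess


def to_hms(seconds):
    """Format a number of seconds as HH:MM:SS, with a day count if needed."""
    total = int(float(seconds))  # accept ints, floats or numeric strings
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    clock = "{:02d}:{:02d}:{:02d}".format(hours, minutes, secs)
    if days:
        return "({} days) {}".format(days, clock)
    return clock


def get_media_length(filename):
    """Return the duration of filename in seconds as a float, using ffprobe.

    Raises CalledProcessError if ffprobe fails, and ValueError if ffprobe
    prints something that is not a number.
    """
    result = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1",
         str(filename)],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())
```

Other scripts could then use `from mediautils import to_hms, get_media_length`, provided the module is somewhere on the Python path.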
The Directory Summary Script
(I will probably replace this with a link to the script in a git repo after doing some organisation)
#!/bin/python3
"""directorysummary.py -- give info on videos in a given directory"""

from pathlib import Path
import subprocess
from datetime import timedelta
import time


def get_media_length(filename):
    """Return length of _filename_ in seconds (as a string) using ffprobe"""
    with subprocess.Popen(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1",
         str(filename)],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    ) as ffprobe:
        # communicate() reads all of the output and waits for ffprobe to exit
        output, _ = ffprobe.communicate()
    return output.splitlines()[0].strip().decode("utf-8")


def to_hms(seconds):
    """Convert seconds to hours, minutes, seconds

    See:
    - https://stackoverflow.com/a/8907269
    - https://stackoverflow.com/a/1384565
    """
    def strfdelta(tdelta, fmt):
        d = {"days": tdelta.days}
        d["hours"], rem = divmod(tdelta.seconds, 3600)
        d["minutes"], d["seconds"] = divmod(rem, 60)
        return fmt.format(**d)

    # strftime's %H wraps around after 24 hours, so show days separately
    if seconds > 86399:
        td = timedelta(seconds=float(seconds))
        # zero-pad each field so the output matches the %H:%M:%S branch
        return strfdelta(
            td, "({days} days) {hours:02d}:{minutes:02d}:{seconds:02d}")
    return time.strftime("%H:%M:%S", time.gmtime(seconds))


def directorysummary(directory: Path):
    """Provide a summary of the supplied directory, i.e. gather info and report"""
    videos = list(directory.glob("*.mkv"))
    print("Videos:\t\t{}".format(len(videos)))
    if not videos:
        # max() and min() raise ValueError on an empty list, so stop here
        return
    lengths = []
    # reuse the list built above rather than globbing a second time
    for video in videos:
        # name = video.name
        length = get_media_length(video)
        # print("{}: {}".format(name, length))
        lengths.append(float(length))
    print("Longest:\t{}\nShortest:\t{}\nAverage:\t{}".format(
        to_hms(max(lengths)),
        to_hms(min(lengths)),
        to_hms(sum(lengths)/len(lengths)))
    )


def sanity_check(directory: str):
    """Check supplied directory"""
    # is_dir() is False for a missing path, so it covers the exists() check
    return Path(directory).is_dir()


if __name__ == "__main__":
    import sys
    if len(sys.argv) < 2:
        print("Please supply a directory to summarise")
        sys.exit(1)
    video_directory = sys.argv[1]
    if not sanity_check(video_directory):
        print("{} does not exist or is not a directory"
              .format(video_directory))
        sys.exit(1)
    directorysummary(Path(video_directory))
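
One possible refinement, sketched here rather than taken from the script above: pulling the arithmetic out into its own function makes it checkable without ffprobe or any video files. The name summarise_lengths and the dictionary layout are hypothetical.

```python
def summarise_lengths(lengths):
    """Return count, longest, shortest and average for durations in seconds.

    An empty list gives a count of 0 and None for the other values,
    since there is nothing to take a maximum or average of.
    """
    if not lengths:
        return {"count": 0, "longest": None,
                "shortest": None, "average": None}
    return {
        "count": len(lengths),
        "longest": max(lengths),
        "shortest": min(lengths),
        "average": sum(lengths) / len(lengths),
    }
```

The reporting function would then only need to collect the durations, call this helper, and format the result.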