Information is power
To process my Subscriber Crown videos for YouTube, I needed a little bit of information. I tend to split up longer videos for ‘ease’ of watching: psychologically, I am to an extent put off by seeing a video that runs to multiple hours. I also think there’s an advantage when coming back, insofar as it is easier to remember ‘I was on video 5’ than ‘I was two and a half hours in’. All of this is based on my own consumption preferences, as opposed to any data-driven engagement metrics, which would be another discussion, and a philosophical one at that.
For the Sub Crown videos, I wanted to know both what the average length is, and what the longest is. That’s because if most are around an hour, I would put those up ‘as-is’ (as-are?) and split any outlier long videos. On the other hand, if many are over an hour and a half, I can treat splitting as a routine part of the video preparation step of processing.
I debated doing this as a bash function, but quickly realised that it’s more trouble than it’s worth (related U&L answer from yesterday), and so I wrote a small Python script instead.
The script’s output (pictured above) told me:
- the average is indeed over the cutoff for automating the splitting up of videos
- the longest one went on for quite a while (the most recent one, ‘Quadrennic Games’)
- I haven’t moved all of the videos to the right pending directory yet (there should be over 12)
So it was a helpful exercise, and will be useful for other video series.
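The decision I was trying to make can be written down as a rough sketch. The function name `splitting_plan`, the plan labels, and the exact 90-minute cutoff are assumptions for illustration (the cutoff is just my reading of ‘an hour-and-a-half’), and none of this is part of the script itself:

```python
from statistics import mean

# Assumed cutoff: an hour and a half, expressed in seconds
CUTOFF_SECONDS = 90 * 60


def splitting_plan(lengths, cutoff=CUTOFF_SECONDS):
    """Return (plan, outliers) for a list of video durations in seconds.

    plan is "split-all" when the average exceeds the cutoff (splitting
    becomes a routine preparation step), otherwise "split-outliers"
    (upload as-is and only split the long ones).
    """
    if not lengths:
        raise ValueError("no video lengths supplied")
    # Videos that are individually longer than the cutoff
    outliers = [length for length in lengths if length > cutoff]
    if mean(lengths) > cutoff:
        return "split-all", outliers
    return "split-outliers", outliers
```

With lengths of 3600, 3500 and 7200 seconds the average stays under the cutoff, so only the two-hour video would be flagged for splitting.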
The process of writing the script also made me realise a few things:
- the issue I had with ‘found 4 tabs, expected 1’ reared its head again; this time I changed my user-wide .pylintrc file
- I have now reused a couple of functions: to_hms for some nice time formatting; and get_media_length to get a video’s duration in seconds using ffprobe. The latter has now been used in a few places, and I think it’s time to pull out some common functions to my own ‘library’, which I have done in other languages before but not in python, at least not recently.
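As a rough idea of what such a ‘library’ could look like, here is a minimal sketch. The module name `mediautils.py` and the helper `parse_duration` are hypothetical, and `to_hms` here is a simplified variant (hours simply keep counting past 24) rather than the version used in the script; the ffprobe arguments are the same ones the script uses.

```python
"""mediautils.py -- hypothetical home for the shared helpers"""
import subprocess


def to_hms(seconds):
    """Format a number of seconds as H:MM:SS (hours are not capped at 24)."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return "{}:{:02d}:{:02d}".format(hours, minutes, secs)


def parse_duration(ffprobe_stdout):
    """Turn ffprobe's plain-text duration output into a float of seconds."""
    return float(ffprobe_stdout.strip().splitlines()[0])


def get_media_length(filename):
    """Return the duration of _filename_ in seconds (float) using ffprobe."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(filename)],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError if ffprobe fails
    )
    return parse_duration(result.stdout)
```

Keeping the parsing separate from the ffprobe call means the formatting and parsing can be checked without any video files to hand.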
The Directory Summary Script
(I will probably replace this with a link to the script in a git repo after doing some organisation)
#!/bin/python3
"""directorysummary.py -- give info on videos in a given directory"""

from pathlib import Path
import subprocess
from datetime import timedelta
import time


def get_media_length(filename):
    """Return length of _filename_ in seconds using ffprobe"""
    ffprobe_output = subprocess.Popen(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", filename],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT
    )
    # ffprobe prints the duration on the first line of its output
    return ffprobe_output.stdout.readlines()[0].strip().decode("utf-8")


def to_hms(seconds):
    """Convert seconds to hours, minutes, seconds

    See:
    - https://stackoverflow.com/a/8907269
    - https://stackoverflow.com/a/1384565
    """
    def strfdelta(tdelta, fmt):
        d = {"days": tdelta.days}
        d["hours"], rem = divmod(tdelta.seconds, 3600)
        d["minutes"], d["seconds"] = divmod(rem, 60)
        return fmt.format(**d)

    # strftime wraps around after 24 hours, so handle a day or more separately
    if seconds >= 86400:
        td = timedelta(seconds=float(seconds))
        return strfdelta(
            td, "({days} days) {hours:02d}:{minutes:02d}:{seconds:02d}")
    return time.strftime("%H:%M:%S", time.gmtime(seconds))


def directorysummary(directory: Path):
    """Provide a summary of supplied directory- ie gather info and report"""
    videos = list(directory.glob("*.mkv"))
    print("Videos:\t\t{}".format(len(videos)))
    if not videos:
        # max(), min() and the average are undefined for an empty list
        return
    lengths = []
    for video in videos:
        # name = video.name
        length = get_media_length(video)
        # print("{}: {}".format(name, length))
        lengths.append(float(length))
    print("Longest:\t{}\nShortest:\t{}\nAverage:\t{}".format(
        to_hms(max(lengths)),
        to_hms(min(lengths)),
        to_hms(sum(lengths)/len(lengths)))
    )


def sanity_check(directory: str):
    """Check supplied directory"""
    # exists() must be called; the bare method object is always truthy
    return bool(Path(directory).exists() and Path(directory).is_dir())


if __name__ == "__main__":
    import sys
    if len(sys.argv) < 2:
        print("Please supply a directory to summarise")
        sys.exit(1)
    video_directory = sys.argv[1]
    if not sanity_check(video_directory):
        print("{} does not exist or is not a directory"
              .format(video_directory))
        sys.exit(1)
    directorysummary(Path(video_directory))