Quick Scripts: Summary of Video Directory

Information is power

To process my Subscriber Crown videos for YouTube, I needed a little bit of information. I tend to split up longer videos for ‘ease’ of watching: psychologically, I am to an extent put off by seeing a video that runs to multiple hours. I also think there’s an advantage when coming back, insofar as it is easier to remember ‘I was on video 5’ than ‘I was two and a half hours in’. All of this is based on my own consumption preferences, as opposed to any data-driven engagement metrics, which would be another discussion and a philosophical one at that.

For the Sub Crown videos, I wanted to know both what the average length is, and what the longest is. That’s because if most are around an hour, I would put those up ‘as-is’ (as-are?) and split any outlier long videos. On the other hand, if many are over an hour and a half, I can treat video splitting as part of the preparation step of processing.
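
That decision could be sketched roughly as follows. The 90-minute cutoff is taken from the ‘hour and a half’ figure above; the function name plan_splitting and the sample durations in the usage note are made up for illustration, and this is not part of the script further down.

```python
from statistics import mean

CUTOFF_SECONDS = 90 * 60  # "an hour and a half", expressed in seconds


def plan_splitting(lengths, cutoff=CUTOFF_SECONDS):
    """Return (split_by_default, outliers) for durations given in seconds.

    split_by_default is True when the average duration is over the cutoff,
    ie splitting should just be part of preparing every video.
    outliers lists the individual durations that are over the cutoff.
    """
    if not lengths:
        # nothing to decide for an empty batch
        return False, []

    split_by_default = mean(lengths) > cutoff
    outliers = [length for length in lengths if length > cutoff]
    return split_by_default, outliers
```

With three hypothetical videos of 3600, 3500 and 9000 seconds, the average stays under the cutoff, so only the 9000-second video would be flagged for splitting.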

I debated doing this as a bash function, but quickly realised that it would be more trouble than it’s worth (related U&L answer from yesterday), and so I wrote a small Python script instead.

The script’s output (pictured above) told me:

  • the average is indeed over the cutoff for automating the splitting up of videos
  • the longest one went on for quite a while (the most recent one, ‘Quadrennic Games’)
  • I haven’t moved all of the videos to the right pending directory yet (there should be over 12)

So it was a helpful exercise, and will be useful for other video series.

The process of writing the script also made me realise a few things:

  • the issue I had with ‘found 4 tabs, expected 1’ reared its head again; this time I changed my user-wide .pylintrc file
  • I have now reused a couple of functions: to_hms for some nice time formatting; and get_media_length to get a video’s duration in seconds using ffprobe.

    The latter has now been used in a few places, and I think it’s time to pull out some common functions into my own ‘library’, which I have done in other languages before but not in Python, at least not recently.
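
As a rough idea of what such a ‘library’ might look like, here is a minimal sketch of a shared module. The module name mediautils is an assumption, and the to_hms here is a divmod-based variant rather than the version in the script below; the ffprobe arguments are the same ones the script uses.

```python
"""mediautils.py -- hypothetical shared module; the name is an assumption"""
import subprocess


def get_media_length(filename):
    """Return the duration of filename in seconds, as reported by ffprobe"""
    result = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1",
         str(filename)],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())


def to_hms(seconds):
    """Format seconds as HH:MM:SS, prefixed with a day count if needed"""
    # whole seconds only; any fractional part is dropped
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    hms = "{:02}:{:02}:{:02}".format(hours, minutes, secs)
    return "({} days) {}".format(days, hms) if days else hms
```

A script could then start with `from mediautils import get_media_length, to_hms`, provided the module sits somewhere on the Python path.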

The Directory Summary Script

(I will probably replace this with a link to the script in a git repo after doing some organisation)

#!/usr/bin/env python3
"""directorysummary.py -- give info on videos in a given directory"""
from pathlib import Path
import subprocess
from datetime import timedelta
import time


def get_media_length(filename):
    """Return length of _filename_ in seconds (as a float) using ffprobe"""
    # run() waits for ffprobe to finish; check=True raises if it fails
    ffprobe = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1",
         str(filename)],
        capture_output=True, text=True, check=True
    )

    # stderr is kept separate, so stdout holds only the duration
    return float(ffprobe.stdout.strip())


def to_hms(seconds):
    """Convert seconds to hours, minutes, seconds

    See:
        - https://stackoverflow.com/a/8907269
        - https://stackoverflow.com/a/1384565
    """
    def strfdelta(tdelta, fmt):
        d = {"days": tdelta.days}
        d["hours"], rem = divmod(tdelta.seconds, 3600)
        d["minutes"], d["seconds"] = divmod(rem, 60)
        return fmt.format(**d)

    # a day or more: strftime would wrap around, so show the days explicitly
    if seconds >= 86400:
        td = timedelta(seconds=float(seconds))
        return strfdelta(
            td, "({days} days) {hours:02}:{minutes:02}:{seconds:02}")

    return time.strftime("%H:%M:%S", time.gmtime(seconds))


def directorysummary(directory: Path):
    """Provide a summary of supplied directory- ie gather info and report"""
    videos = list(directory.glob("*.mkv"))
    print("Videos:\t\t{}".format(len(videos)))

    if not videos:
        # max(), min() and the average are undefined for an empty list
        return

    # reuse the list gathered above rather than globbing a second time
    lengths = [float(get_media_length(video)) for video in videos]

    print("Longest:\t{}\nShortest:\t{}\nAverage:\t{}".format(
        to_hms(max(lengths)),
        to_hms(min(lengths)),
        to_hms(sum(lengths)/len(lengths)))
    )


def sanity_check(directory: str):
    """Check that the supplied directory exists and is a directory"""

    # is_dir() is already False for paths that do not exist
    return Path(directory).is_dir()


if __name__ == "__main__":
    import sys

    if len(sys.argv) < 2:
        print("Please supply a directory to summarise")
        sys.exit(1)

    video_directory = sys.argv[1]

    if not sanity_check(video_directory):
        print("{} does not exist or is not a directory"
              .format(video_directory))
        sys.exit(1)

    directorysummary(Path(video_directory))
