Automating YouTube Uploads With OCR Part 9: Bringing it All Together

I love it when a plan comes together

We’ve been working out a way to automate YouTube uploads, using tesseract to OCR frames from Deep Rock Galactic videos and extract the metadata used for each video on YouTube.

We got to the stage where we have good, useful JSON output that our automated upload tool can work on. Job done? Well, yes: I could point the tool at it and let it work on that, but it would take quite a while. You see, to give a broad test base and plenty of ‘live-fire’ ammunition, I let a backlog of a month’s videos build up.

Automating Metadata Updates

Why is that an issue for an automated tool? The YouTube API by default permits 10 000 units per day of access, and uploading a video costs 1600 units. That limits us to six videos per day max, or five once the costs of other API calls are factored in. So I’d rather upload the videos in the background using the web API, and let our automated tool set the metadata.
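To make the quota arithmetic concrete, here is a minimal sketch. The 10 000 and 1600 figures come from the paragraph above; the 150-unit per-video overhead is a made-up illustrative number standing in for "other API calls", not a measured cost.

```python
DAILY_QUOTA = 10_000  # default daily quota units, as stated above
UPLOAD_COST = 1_600   # units per video upload, as stated above


def max_uploads(daily_quota=DAILY_QUOTA, upload_cost=UPLOAD_COST,
                overhead_per_video=0):
    """Return how many whole videos fit in one day's quota.

    overhead_per_video is a hypothetical allowance for the extra calls
    made per video (metadata, thumbnail, playlist and so on).
    """
    return daily_quota // (upload_cost + overhead_per_video)


print(max_uploads())                        # uploads only
print(max_uploads(overhead_per_video=150))  # with assumed overhead
```

With uploads alone the budget covers six videos; any meaningful per-video overhead pushes that down to five, which is why a month's backlog would take over a week to clear through the API.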

For that we need the videoIds reported by the API. My tool of choice to obtain those was shoogle. I wrapped it in a Python script that gets the playlistId of the uploads playlist, grabs the videoIds of the 100 latest videos, and fetches the fileDetails of each to get the uploaded fileName. It then matches that list against the filenames in the JSON entries.
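The final matching step might look something like the sketch below. It assumes the API responses have already been flattened into dicts with videoId and fileName keys, and that each JSON entry stores its source file under a filename key; those shapes are assumptions for illustration, not the actual script.

```python
import os


def match_video_ids(uploads, entries):
    """Attach a videoId to each JSON entry whose filename was uploaded.

    uploads: list of {"videoId": ..., "fileName": ...} dicts (assumed shape)
    entries: list of dicts with a "filename" key (assumed shape)
    Returns (matched, unmatched) lists; inputs are left unmodified.
    """
    # Compare on the base name only, in case one side stores a full path.
    by_name = {os.path.basename(u["fileName"]): u["videoId"] for u in uploads}
    matched, unmatched = [], []
    for entry in entries:
        video_id = by_name.get(os.path.basename(entry["filename"]))
        if video_id is None:
            unmatched.append(entry)
        else:
            matched.append({**entry, "videoId": video_id})
    return matched, unmatched
```

Returning the unmatched entries separately makes it easy to spot videos that have not finished uploading yet, or whose filenames changed along the way.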

So far so good.

Faster Thumbnails

But one of the personal touches that I like to do, and that will likely not be automated away, is to pick a frame from the video for the thumbnail. So I need a way to quickly go through the videos, find a frame that would make a good thumbnail, and add its timestamp as a thumb field on the correct video entry. I’ve used xdotool in the past to speed up some of the more repetitive parts of data entry (if you’ve used AutoHotKey for Windows, it’s similar to that in some ways).

I threw together a quick script to switch to the terminal with vim, go to the filename of the current video in VLC (VLC can expose a JSON interface with the current video’s metadata; the parts I’m interested in are the filename and the current seek position), create a thumb entry holding the current time, and then switch back to VLC. That script can be assigned a key combo in Openbox, so the process is: find frame, hit hotkey, find frame in next video, hotkey, repeat.
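For reference, reading the filename and seek position out of VLC could be sketched as below. It assumes VLC's HTTP interface is enabled and serving status.json on localhost port 8080 with a password set; the URL, the password value and the helper names are placeholders, and the field layout reflects my understanding of that interface rather than the script actually used here.

```python
import base64
import json
import urllib.request


def fetch_status(url="http://localhost:8080/requests/status.json",
                 password="changeme"):
    """Fetch VLC's status JSON (assumed URL; placeholder password)."""
    # VLC's HTTP interface uses basic auth with an empty username.
    token = base64.b64encode(f":{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=2) as response:
        return json.load(response)


def thumb_entry(status):
    """Return (filename, seek position in whole seconds) from a status dict."""
    filename = status["information"]["category"]["meta"]["filename"]
    return filename, int(status["time"])
```

Calling thumb_entry(fetch_status()) from the hotkey script would give the two values needed to find the right entry in vim and write its thumb time.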

Though the process is streamlined, finding a good frame in 47 videos isn’t the quickest! But the final result is worth it:

We have videos with full metadata, thumbnail and scheduled date/time set.

Glorious.

I included a video that failed OCR due to a missing loading screen (I hit record too late). There’s a handful of those; I found five while doing the thumbnails. I could do a bit of further work and get partial output from the loading or ending screen alone, or I could bite the bullet and do those ones manually, using it as a reminder to hit the record button at the right time!