I played some GTA V: Online the other night — my three word review: ‘fun but clunky’ — and uploaded the footage as I usually do, leaving it as a draft to be updated later by my automation tools.
Later on I saw I had a notification on YouTube and thought “Ah! Someone’s subscribed, or commented, or similar”. Actually, I had a copyright claim from Take-Two Interactive for ‘WZLJHRS’. What?
The segment in question, just under two minutes long, was a GTA TV programme (‘Jack Howitzer’, a documentary/mockumentary about a washed-up action movie actor) that I watched while waiting for my friend to arrive at my office. It had some funny moments.
I am mindful of YouTube’s Content ID system, and I pre-emptively mute game music, having been bitten by it in the past. I didn’t suspect for a second that a fake TV show within a game would result in an entire video being blocked.
We’ve been working on automating YouTube uploads, using tesseract to OCR frames from Deep Rock Galactic videos and extract the metadata used for the video on YouTube.
We got to the stage where we have good, useful JSON output that our automated upload tool can work on. Job done? Well, yes: I could point the tool at it and let it work through that, but it would take quite a while. You see, to give a broad test base and plenty of ‘live-fire’ ammunition, I let a backlog of a month’s videos build up.
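For illustration, a single entry in that JSON output might be shaped something like this — the field names, filename, and player names here are my own sketch rather than the tool’s exact schema, though the mission details are the sort of thing the loading screen provides:

```json
{
  "filename": "2023-06-07_20-15_drg.mkv",
  "players": ["Hoxxes_Hero", "RockAndStone"],
  "mission_type": "Point Extraction",
  "mission_name": "Clouded Joy",
  "biome": "Salt Pits",
  "hazard": 4
}
```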
Automating Metadata Updates
Why is that an issue for an automated tool? By default, the YouTube API permits 10,000 quota units per day, and uploading a video costs 1,600 units. That limits us to six videos per day at most, or five once the costs of other API calls are factored in. So I’d rather upload the videos in the background through the web interface (uploads through the site don’t count against the API quota), and let our automated tool set the metadata.
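The quota arithmetic, as a quick sanity check:

```python
# YouTube Data API default daily quota and the cost of one video upload.
DAILY_QUOTA = 10_000
UPLOAD_COST = 1_600

max_uploads = DAILY_QUOTA // UPLOAD_COST
print(max_uploads)  # 6 uploads fit in a day's quota at most...

# ...but listing playlists, fetching fileDetails and updating metadata
# all cost units too, so in practice it's closer to five.
leftover = DAILY_QUOTA - 5 * UPLOAD_COST
print(leftover)  # 2000 units left for other calls at five uploads
```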
For that we need the videoIds reported by the API. My tool of choice to obtain those was shoogle. I wrapped it in a Python script to get the playlistId of the uploads playlist, then grabbed the videoIds of the 100 latest videos, fetched the fileDetails of those to get the uploaded fileName… and matched that list against the filenames of the JSON entries.
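The matching step at the end is just dictionary work. A minimal sketch, assuming each API result carries the videoId and the fileName from fileDetails, and each local JSON entry has a filename key (in reality the inputs come from shoogle’s output, not literals):

```python
def match_videos(api_videos, json_entries):
    """Pair API-reported uploads with local JSON metadata entries by filename.

    api_videos:   list of {"videoId": ..., "fileName": ...} dicts, as
                  reported by videos.list with part=fileDetails.
    json_entries: list of local metadata dicts, each with a "filename" key.
    Returns {videoId: entry} for every filename present in both lists.
    """
    by_name = {v["fileName"]: v["videoId"] for v in api_videos}
    return {
        by_name[entry["filename"]]: entry
        for entry in json_entries
        if entry["filename"] in by_name
    }
```

Entries whose filename has no matching upload are simply skipped, so a partial backlog doesn’t break the run.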
So far so good.
But one of the personal touches that I like to do, and that will likely not be automated away, is to pick a frame from the video for the thumbnail. So I need a way to quickly go through the videos, find a frame that would make a good thumbnail, and add that as a thumb field to the correct video entry. I’ve used xdotool in the past to speed up some of the more repetitive parts of data entry (if you’ve used AutoHotKey on Windows, it’s similar to that in some ways).
I threw together a quick script to switch to the terminal with vim, jump to the filename of the current video in VLC (VLC can expose a JSON interface with the current video’s metadata; the fields I’m interested in are the filename and the current seek position), create a thumb: time entry with the current time, and then switch back to VLC. That script can be assigned a key combo in Openbox, so the process is: find frame, hit hotkey, find frame in next video, hotkey, repeat.
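The VLC side of that script is just an HTTP request. A sketch, assuming VLC’s Lua HTTP interface is enabled with a password set — the endpoint and field names are as I understand VLC’s status.json, and the port and password are whatever you configured:

```python
import base64
import json
import urllib.request


def parse_status(status):
    """Pull the filename and seek position (seconds) out of VLC's status JSON."""
    meta = status["information"]["category"]["meta"]
    return meta["filename"], status["time"]


def vlc_now_playing(host="localhost", port=8080, password="secret"):
    """Ask VLC's HTTP interface what's playing and how far in we are.

    VLC expects HTTP basic auth with an empty username and the
    password configured for the Lua HTTP interface.
    """
    req = urllib.request.Request(f"http://{host}:{port}/requests/status.json")
    token = base64.b64encode(f":{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return parse_status(json.load(resp))
```

The filename and time returned are exactly what the hotkey script needs to type into vim as the thumb entry.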
Though the process is streamlined, finding a good frame in 47 videos isn’t the quickest! But the final result is worth it:
We have videos with full metadata, thumbnail and scheduled date/time set.
I included a video that failed OCR due to a missing loading screen (I hit record too late). There’s a handful of those; I found five while doing the thumbnails. I could do a bit of further work and get partial output from the loading/ending screen alone, or I could bite the bullet and do those ones manually, using it as a reminder to hit the record button at the right time!
However, each session of a game is one video, so I end up with many videos. In fact, Jefe, I’d say I have a plethora of videos. Since they are different rounds of the same games, many of the videos’ descriptions have a similar structure.
Lots of things being similar sounds like fertile ground for automation!
I have a system, described elsewhere, which uploads and publishes videos to YouTube based on metadata I write. That is vastly more convenient than doing it manually through the web interface, which is a bit clunky to work with when handling videos in any quantity.
But if the metadata is similar, what if we could automatically generate that?
Deep Rock Galactic
If you’re not familiar with Deep Rock Galactic, it’s a co-op FPS game for up to four players that sees you going on missions in procedurally-generated caves on a fictional world to extract materials and kill aliens. It’s great fun, but don’t take my word for it, go watch some videos!
DRG has a loading screen that very helpfully includes all the information on it that is needed to generate the metadata for the YouTube video:
Let’s break down the elements:
Here we have the names of the brave dwarven miners. This lets me say who is in the video.
It also has the classes. I don’t use that information currently, but since it’s there I could.
This has the mission type (Point Extraction), and the generated name (Clouded Joy).
Lots going on here:

- Biome (location) of the mission*
- Hazard level (difficulty)*

\* These are in pictograph format, but we can still work with that.
And the rest of the metadata mentioned above is included in tags, but it could be put into the description just as easily.
All the elements are there; all we need to do is a bit of image recognition on them. Fortunately Python has bindings for such things, so as we’ve figured out where everything is, all that’s left to do is write the code, and that’s the easy bit, right?
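As a sketch of that recognition step, assuming pytesseract and Pillow are installed — the crop boxes below are placeholder coordinates for a 1920×1080 frame, not the real loading-screen layout, which you’d measure yourself:

```python
# Placeholder crop boxes (left, top, right, bottom) for a 1920x1080 frame;
# the real numbers come from measuring the loading screen layout.
REGIONS = {
    "players":      (50, 100, 600, 400),
    "mission_name": (700, 80, 1300, 160),
    "mission_type": (700, 170, 1300, 230),
}


def clean(text):
    """Tidy raw OCR output: trim each line and drop the empty ones."""
    return " ".join(line.strip() for line in text.splitlines() if line.strip())


def ocr_regions(frame_path, regions=REGIONS):
    """OCR each named region of a loading-screen frame with tesseract."""
    # Imported here so the pure helpers above work without these installed.
    import pytesseract
    from PIL import Image

    frame = Image.open(frame_path)
    return {name: clean(pytesseract.image_to_string(frame.crop(box)))
            for name, box in regions.items()}
```

The pictograph elements (biome and hazard level) would need image matching rather than text OCR, but the principle of cropping a known region and classifying it is the same.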