This idea has been done before, but I don’t think it’s been done so small! I shat this out in a week or so in the month leading up to DEF CON 32. It was so late we had to ship the boards directly to the hotel and assemble them there. DHL has screwed me before, and I’m not going to let a late package ruin everything again!
The Hardware
I love making SAOs cause I don’t need to fuck around with batteries or power or anything. I suck at that and hate doing it. This SAO is an RP2040 (wow, how original), an SPI-based OLED screen, a Sallen-Key pre-amp circuit and an LM386 for clean audio, and a speaker from a Nintendo DSi.
Playing an MP3 and a video is a pretty strenuous task for the RP2040. The way I have it written, it’s just barely able to do it. I chose a large QSPI flash chip so I could fit all the audio and video data. I went with an SPI screen instead of an I2C one because I can clock those way faster, and I knew I’d be operating at the margins of the RP2040. Any time saved on screen transactions meant smoother video. The Sallen-Key pre-amp and LM386 were cannibalized from another design. The RP2040 does not output an actual audio waveform; the sound that comes out is PWM. That doesn’t amplify very well, so the Sallen-Key filter turns the PWM signal into a proper audio waveform. That’s then piped through the LM386 to make it louder. The DSi speaker was chosen for cost reasons. You can get these for dirt cheap in large quantities on AliExpress. Protip: when looking for parts, see if AliExpress caters to them through some other specific niche. Searching for “small speaker” or something similar might not yield many high quality results, but “DSi speaker” will return tons of products.
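To give a feel for the filter math: a unity-gain Sallen-Key low-pass has a corner frequency of 1/(2π√(R1·R2·C1·C2)), and you want that corner above the audio band but well below the PWM carrier. The component values below are made up for illustration, not pulled from my schematic:

```python
import math

def sallen_key_cutoff(r1, r2, c1, c2):
    """Corner frequency (Hz) of a unity-gain Sallen-Key low-pass filter."""
    return 1.0 / (2 * math.pi * math.sqrt(r1 * r2 * c1 * c2))

# Hypothetical values: 10k resistors and 1nF caps put the corner around
# 16kHz -- above most of the audio band, far below a typical PWM carrier.
fc = sallen_key_cutoff(10e3, 10e3, 1e-9, 1e-9)
print(f"{fc:.0f} Hz")
```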
Downscaling the Video
The music video I ripped from YouTube was in HD resolution and had full color. The screen I had was capable of displaying 128x64 pixels in two colors. Not black and white, black OR white. Just the two colors. No shades of gray. One bit color. I also don’t think there’s any way the RP2040 is going to be capable of decompressing and playing an actual compressed video file. This video would have to be streamed uncompressed. Luckily, 128x64 pixels at one bit color makes for a pretty small video stream. The full video is about 5300 frames long. If each frame is 128x64 bits, that comes out to exactly 1K. 5300 frames is about 5300K, call it 5.3 megabytes. That will totally scrunch into the SPI flash I have on the SAO.
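The arithmetic works out like this (plain Python, nothing badge-specific):

```python
# Each pixel is one bit, so 8 pixels pack into a byte.
BYTES_PER_FRAME = 128 * 64 // 8
FRAMES = 5300

total_bytes = BYTES_PER_FRAME * FRAMES
print(BYTES_PER_FRAME)              # 1024 -> exactly 1K per frame
print(total_bytes / (1024 * 1024))  # ~5.2 -> about 5.2MB for the whole video
```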
Turning the video into a series of bitmaps was easier than you’d think. I used VLC to split the video into individual frames, then a couple of imagemagick one-liners to downscale each one to 128x64, set the color to black or white, and dither it. I was left with every frame of the video as a 128x64 dithered black-or-white bitmap.
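The dithering step is what keeps 1-bit frames watchable. imagemagick handles it for you, but the usual algorithm (Floyd-Steinberg error diffusion) is simple enough to sketch in plain Python. This toy version is just for illustration; it works on a flat list of 8-bit grayscale values:

```python
def floyd_steinberg(gray, width, height):
    """Dither an 8-bit grayscale image (flat list) down to 1-bit black/white."""
    px = [float(v) for v in gray]
    out = [0] * (width * height)
    for y in range(height):
        for x in range(width):
            i = y * width + x
            old = px[i]
            new = 255.0 if old >= 128 else 0.0
            out[i] = 1 if new else 0
            err = old - new
            # Smear the quantization error onto unprocessed neighbors
            # using the classic Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16).
            if x + 1 < width:
                px[i + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    px[i + width - 1] += err * 3 / 16
                px[i + width] += err * 5 / 16
                if x + 1 < width:
                    px[i + width + 1] += err * 1 / 16
    return out

# A flat 50% gray patch dithers to roughly half black, half white pixels.
bits = floyd_steinberg([128] * (8 * 8), 8, 8)
```

Each pixel gets snapped to black or white, and the rounding error is pushed onto its neighbors, which is why a flat gray area comes out as a speckled texture instead of a solid color.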
The Software
Since I was shitting this out, I used CircuitPython. This ended up kneecapping me considerably, since CircuitPython only makes use of one of the two cores on the chip. The other core sat idle, and I wanted to do other things with that core later on.
CircuitPython has some libraries to decode and play MP3 audio that I made use of. It also has libraries for driving the SPI screen. I was able to get a pretty quick prototype working with these, but it needed lots of improvement. The video played at about 3 or 4 frames per second. It sucked and barely looked like video.
Optimizing!
There are a couple of freebies I can use to get the frame rate up. First off, we can increase the clock speed of the screen to 10MHz. That nets us a couple extra frames. We can also increase the clock speed of the RP2040! It actually does pretty well for a little micro; I was able to get it to about 200MHz. Again, that nets us a couple more frames. Why is the video still so slow?
The main bottleneck here is IO. We can only read the data off of the flash so fast. Still, each frame should only be 1K, right? Well, looking at the BMPs I had exported, I noticed that each one was 24K. They’re supposed to be 1K in size! In order to play each frame, I was reading 24 times as much data off of flash as I needed to. Why was this happening? Apparently bitmaps have a configurable color depth. The default is 24-bit color, since most people in the modern era have lots of hard drive space and want nice-looking images. I wanted neither of those things, so I fiddled with the imagemagick incantation and got it to spit out 1K bitmaps with 1-bit color depth. Now it actually looked like real video!
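The size difference is easy to sanity-check:

```python
W, H = 128, 64

# Default BMP export: 24-bit color, three bytes per pixel (ignoring the small header).
size_24bit = W * H * 3
# What I actually needed: 1-bit color, eight pixels per byte.
size_1bit = W * H // 8

print(size_24bit)               # 24576 -> the ~24K files I was actually reading
print(size_1bit)                # 1024  -> the 1K files I wanted
print(size_24bit // size_1bit)  # 24x the flash traffic per frame
```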
Still, I noticed that after the video played for a while it started to slow down a lot. Worse yet, the audio was starting to cut out too. A fun new problem to deal with. Immediately this seemed like some kind of memory leak in CircuitPython. If that was the case, I was screwed. I’m not fixing CircuitPython memory leaks for my shitty add-on to work properly. Still, I remembered the IO bottleneck I had just dealt with and thought a bit. I was storing each one of these images on CircuitPython’s FAT filesystem, and the longer the video played, the worse the performance got. It started out nice and smooth and eventually got really choppy, with a bunch of audio artifacts. What else could be getting worse over time, if not a memory leak?
The problem was with the filesystem itself. Each new image was stored “deeper” in the filesystem, and required the RP2040 to search through its slow flash storage looking for the actual image. That’s why it got worse and worse with time. My constraint here was the number of files in the filesystem. This wasn’t something I could code around; I had to just start cutting frames. The music video originally played at 24 frames per second, but 12 frames per second still looked like a video. It’s a bit shitty, but that’s kind of the point, isn’t it? I ended up cutting every other frame out of the video, which cut my file count in half. This way, it just barely got through all the frames of video before having any real problems.
Assembly
We really left this one till the last minute. I had all the boards shipped to our hotel room in Vegas, to be assembled two days before DEF CON 32.
They came mostly assembled, but we had to solder on the screens, headers, and speakers ourselves. Those were the only through-hole parts, and the only ones we couldn’t get done in China.
It took like six straight hours of work to finish these. We soldered all the parts on ourselves, then manually programmed them all.
Programming these sucked! I used a TagConnect cable to touch the programming points on the PCB, which meant we had to hold the cable in place for two minutes, without moving, while each one was programmed. The TagConnect cable even has clips so it can hold itself in place, but I didn’t add cutouts for these on the board. So dumb of me.
Improvements?
I am not improving this lmao. It works and I don’t care. If I WAS going to improve this though, I would do a couple things.
First, I would add a FUCKING programming header that clips in place. Holding the TagConnect plug in place for two minutes straight sucks. I’d also program the SPI flash directly instead of going through the RP2040. I could probably speed things up that way if I shortened my cables a bit.
Second, I would try streaming the video data out of QSPI flash without using a filesystem. Just turn all the frames into a single binary blob and read chunks of it back. I could probably get a great framerate this way, but it would require more software overhead that I don’t want to deal with.
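A minimal sketch of that approach (the function names here are mine, not from the badge firmware): concatenate every 1K frame into one blob, then seek straight to `frame_index * 1024` instead of opening a new file per frame:

```python
FRAME_SIZE = 128 * 64 // 8  # 1024 bytes per 1-bit frame

def pack_frames(frame_paths, blob_path):
    """Concatenate raw 1K frames into a single blob file."""
    with open(blob_path, "wb") as blob:
        for path in frame_paths:
            with open(path, "rb") as f:
                blob.write(f.read())

def read_frame(blob, index):
    """Seek straight to a frame; no per-file directory lookup needed."""
    blob.seek(index * FRAME_SIZE)
    return blob.read(FRAME_SIZE)
```

Because the offset math is constant-time, frame 5000 costs the same to fetch as frame 0, which is exactly the property the FAT directory lookup lacked.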