VBox – AI-powered radio for musical exploration

Created by Danning Liang and Artem Laptiev at MIT’s School of Architecture and Planning, VBox is an AI-powered radio for musical exploration and group vibrations. Using a large language model, VBox captures various abstract interpretations of the song being played as short texts, and lets you travel down one of these rabbit holes to find more of what is hard to describe.

In a world of endless musical complexity, using generic terms to describe a sound is a crime. Is the track just happy? Is it filthy? Purple? Carpe diem? ^5#@*?

The user starts their musical journey by choosing a cartridge containing the desired musical genre. As they insert it into the radio, VBox picks a song based on the cartridge. While the first song is playing, VBox displays a series of words rolling across the screen: “Serenity”, “Freedom”, “Whisper”, “Enigma”. These words all relate to the song being played in some way, whether through its emotions, meanings, or cultural background. When the user notices a word that echoes strongly with their vibe, a simple press of a button tells VBox to process the word and pick the next exciting direction for the musical journey.

VBox derives its intelligence from OpenAI’s language models, which are prompted to analyze the playing track and extract its various abstract musical, cultural, and contextual properties. The same models are also used to find tracks that share similar properties. The newly identified tracks are streamed through Spotify and played through VBox’s speakers.
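The property-extraction step could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the prompt wording, the function names, and the `ask_model` callable (standing in for whatever language-model client is used) are all assumptions, and a canned reply replaces a real model call.

```python
from typing import Callable, List


def build_property_prompt(title: str, artist: str, n: int = 8) -> str:
    """Build a prompt asking for abstract properties (hypothetical wording)."""
    return (
        f"List {n} single-word abstract properties (mood, meaning, cultural "
        f"context) of the song '{title}' by {artist}. "
        "Reply with a comma-separated list only."
    )


def parse_properties(reply: str) -> List[str]:
    """Split a comma- or newline-separated reply into unique, trimmed words."""
    seen = set()
    words = []
    for raw in reply.replace("\n", ",").split(","):
        word = raw.strip().strip(".\"'")
        if word and word.lower() not in seen:
            seen.add(word.lower())
            words.append(word)
    return words


def extract_properties(
    title: str, artist: str, ask_model: Callable[[str], str]
) -> List[str]:
    """Ask the model (any callable taking a prompt) and parse its reply."""
    return parse_properties(ask_model(build_property_prompt(title, artist)))


def fake_model(prompt: str) -> str:
    """Stand-in for a real language-model call; returns a canned reply."""
    return "Serenity, Freedom, Whisper, Enigma, freedom"


if __name__ == "__main__":
    print(extract_properties("Example Song", "Example Artist", fake_model))
```

Passing the model as a callable keeps the sketch independent of any particular API client. The same pattern could be reused for the second step, where a chosen word is sent back to the model to suggest related tracks.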

The interface of VBox consists of a dot matrix display, a crank, and two buttons, one for queue and one for play. The crank allows users to scroll backward or forward through the generated texts and choose the desired property to queue or play. The hardware of VBox is made from perforated sheet metal assembled with mechanical fasteners, as well as 3D-printed pieces for mounting the electronics. VBox’s internal electronics are run by an Arduino microcontroller, which communicates bidirectionally with ChatGPT and Spotify. Internally, VBox comprises a set of speakers, a dot matrix display, NRF modules, and a rotary encoder. Overall, the aesthetic of VBox is intended to communicate a sense of timelessness and modernity. It raises the question of what it means to marry AI with analog devices and how large language models may be integrated into our daily interactions with objects and culture.
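The crank-and-buttons interaction can be modelled as a small piece of state. The sketch below is only an illustration in Python, not the device firmware: the class and function names are hypothetical, and the wrap-around behaviour at the ends of the list is an assumption the description does not confirm.

```python
from typing import List, Optional


class PropertyScroller:
    """Tracks which generated word is currently selected by the crank."""

    def __init__(self, words: List[str]) -> None:
        if not words:
            raise ValueError("at least one word is required")
        self.words = list(words)
        self.index = 0

    def turn(self, steps: int) -> str:
        """Move forward (positive) or backward (negative) by encoder steps.

        Wrapping past either end of the list is an assumption.
        """
        self.index = (self.index + steps) % len(self.words)
        return self.words[self.index]

    def current(self) -> str:
        return self.words[self.index]


def handle_button(
    scroller: PropertyScroller, button: str, queue: List[str]
) -> Optional[str]:
    """Queue the selected word, or return it for immediate play."""
    word = scroller.current()
    if button == "queue":
        queue.append(word)
        return None
    if button == "play":
        return word
    raise ValueError(f"unknown button: {button}")


if __name__ == "__main__":
    scroller = PropertyScroller(["Serenity", "Freedom", "Whisper", "Enigma"])
    pending: List[str] = []
    scroller.turn(1)   # one click forward
    scroller.turn(-2)  # two clicks backward
    handle_button(scroller, "queue", pending)
    print(pending)
```

Keeping the selection logic separate from the hardware reads makes it easy to test without an encoder attached.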

Created for the Interaction Intelligence on Large Language Objects (LLO) class led by Marcelo Coelho, part of MIT’s Art & Design Major (School of Architecture and Planning).

Danning Liang (@danning.liang) | Artem Laptiev (@arlaptiev) | MIT Art & Design