
Timbre Toys

Timbre Tools Hackathon 2024

This repository contains the source code and assets of the project made by SPIS Girls for the Timbre Tools Hackathon 2024.

Link to the repository.

Link to the video presentation.

What is the project about?

Inspiration

The project comes in the form of a Max patch. It is a timbre manipulation tool that allows users to interact with a timbre the same way they would interact with a physical object. The main inspiration behind the project was sonic interaction and live performance (#sensingtimbre, #timbrecheck, #timbreforall).

Technical Overview

How the audio is modified depends on the camera input. More specifically, the tool extracts visual properties of the live video feed from the webcam (color space, brightness, presence and a full spectrogram). The intensity of the extracted features controls timbre-manipulating audio effects. Additionally, the audio input is convolved with the camera input, adding an unpredictable factor to the whole system.

Video feature extraction and FFT calculations are computed in a real-time JUCE plugin. The plugin is loaded into the MaxMSP patch. In the patch, certain effects and their parameters are controlled by the values generated by the JUCE plugin.

Project Schema (legend): 🏵️ - Predictable Timbre Components; ⭐ - Magic Frequency Computation

In-Depth Description

Feature Extraction

The JUCE plugin is responsible for all image-dependent feature extraction. It periodically captures a still image from the first available webcam and processes it. The image is rescaled and copied into more manageable, writable data structures, since the JUCE Pixel object does not allow manually setting individual color values. The extracted features are:

  • individual color presence (extracted by bleeding the fundamental colors)
  • presence (called slinkiness in the code)
  • crunchiness (the sum of absolute brightness differences between neighboring pixels)
  • brightness

The extracted features are purposefully simple, given the computational constraints and the difficulty of importing more complex vision frameworks into JUCE. The greyscale image is then resized and fed to another object, which runs an FFT on the image itself.
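
As a rough illustration, the sketch below shows how features of this kind could be computed from a single RGB frame using NumPy. The function name and the exact normalization are illustrative, not taken from the plugin source, and the presence/slinkiness measure is omitted because the README does not describe how it is derived.

```python
import numpy as np

def extract_features(frame_rgb):
    """Toy feature extraction from an RGB frame (H x W x 3, values 0-255)."""
    frame = frame_rgb.astype(np.float32) / 255.0

    # Per-channel "color presence": mean intensity of each fundamental color.
    red = frame[..., 0].mean()
    green = frame[..., 1].mean()
    blue = frame[..., 2].mean()

    # Brightness: mean of the greyscale image.
    grey = frame.mean(axis=2)
    brightness = grey.mean()

    # Crunchiness: sum of absolute brightness differences between neighboring pixels.
    crunchiness = np.abs(np.diff(grey, axis=0)).sum() + np.abs(np.diff(grey, axis=1)).sum()

    return {"red": red, "green": green, "blue": blue,
            "brightness": brightness, "crunchiness": crunchiness}
```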

Video Capture

Each incoming frame is first resized to size NxN (where N is a power of 2) and then its 2D FFT is calculated.
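
A minimal NumPy sketch of this step is shown below. The nearest-neighbor subsampling used for the resize and the value of N are placeholders, as the README does not state which resizing method or block size the plugin uses.

```python
import numpy as np

N = 64  # power-of-two target size; the value used by the plugin may differ

def frame_to_spectrum(grey_frame):
    """Resize a greyscale frame to N x N (nearest-neighbor) and take its 2D FFT."""
    h, w = grey_frame.shape
    rows = np.arange(N) * h // N
    cols = np.arange(N) * w // N
    resized = grey_frame[np.ix_(rows, cols)]
    return np.fft.fft2(resized)  # complex N x N image spectrum
```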

FFT Calculations

A simple prototype of the algorithm can be found in the python_test directory.

FFT Audio Processing

In the audio processing loop, samples are grouped into blocks of size N. For each block, its 1D FFT is calculated. The image FFT and audio FFT are then combined by element-wise multiplication. The resulting 2D signal is recovered by running a 2D IFFT. To move from 2D to 1D, the sum of each column is calculated. The resulting 1D signal is then scaled to the range [-1, 1] and multiplied by a Hann window of size N to minimize artifacts.
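
The NumPy sketch below walks through the same steps for one block, reusing the image spectrum from the previous sketch. Broadcasting the 1D audio spectrum across the rows of the 2D image spectrum, and normalizing by the block's peak value, are assumptions rather than details taken from the plugin, which may combine and scale the spectra differently.

```python
import numpy as np

def process_block(audio_block, image_fft):
    """Combine an N-sample audio block with an N x N image spectrum (sketch)."""
    N = len(audio_block)
    audio_fft = np.fft.fft(audio_block)  # 1D audio spectrum of length N

    # Element-wise combination: the audio spectrum is broadcast across the rows
    # of the image spectrum (an assumption; the README does not specify the layout).
    combined = image_fft * audio_fft[np.newaxis, :]

    # Back to a 2D signal via the 2D IFFT, then collapse to 1D by summing each column.
    signal_2d = np.fft.ifft2(combined).real
    signal_1d = signal_2d.sum(axis=0)

    # Scale to [-1, 1] and apply a Hann window of size N to minimize block artifacts.
    peak = np.max(np.abs(signal_1d))
    if peak > 0:
        signal_1d = signal_1d / peak
    return signal_1d * np.hanning(N)
```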

JUCE -> MaxMSP Communication

After extracting features from the video stream, JUCE passes audio and MIDI messages into Max using the Max vst~ object. Each of the extracted features is mapped to an audio effect parameter from Max's built-in audio effects.

MaxMSP

The MIDI messages passed to Max carry the extracted video features: red content, green content, blue content, presence, crunchiness, and brightness. Due to time constraints during the hackathon, only red, blue, green, and presence were mapped to the following audio effects.

  • Red: wet/dry mix of a raindrop synthesizer. This effect takes the incoming audio and makes it sound like falling drops; a wetter mix makes the audio sound more raindrop-like.
  • Green: offset of "Comber," a comb filter effect.
  • Blue: wet/dry mix of a chamber reverb effect.
  • Presence: transposition of a pitch shifter effect. More succinctly, the presence of the slinky bends the pitch.
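
The snippet below sketches how such a mapping could look: it scales normalized feature values into 0-127 MIDI controller values. The CC numbers and feature ranges are placeholders, not values taken from the Max patch or the plugin.

```python
def feature_to_cc(value, lo=0.0, hi=1.0):
    """Map a feature value in [lo, hi] to a MIDI CC value in 0-127."""
    value = min(max(value, lo), hi)
    return int(round((value - lo) / (hi - lo) * 127))

# Hypothetical mapping mirroring the effects listed above (CC numbers are placeholders).
features = {"red": 0.8, "green": 0.3, "blue": 0.5, "presence": 0.6}
cc_numbers = {"red": 20, "green": 21, "blue": 22, "presence": 23}
midi_out = {cc_numbers[name]: feature_to_cc(value) for name, value in features.items()}
```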

All of the audio effects controlled by the incoming MIDI messages are multiplied with the original audio signal, creating a beautiful cacophony of music controlled by the webcam watching the slinky.

Meet The Team

We are a team of 5 from Sound and Music Computing, Aalborg University:

Cumhur, Giacomo, Kate, Levin, Maria

