FeetBox - Me => Cam => dac

joel | Uncategorized | Monday, March 16th, 2009

My final project for 220B is a system for synthesizing interesting musical sounds from video. A video input system (cameraOne) takes webcam input and sends OSC messages to a ChucK program called FeetBocks.

 

CameraOne

The cameraOne code is about 700 lines of Processing code. Input frames at 320×240 resolution are grabbed at about 12 fps and processed to produce the following types of messages:

  • \drgb messages: the video window is split into frames, and the total RGB value in each frame as well as the frame difference (d-channel) are sent to ChucK. In general we map hue to pitch, brightness to intensity, and movement to event triggers. The drgb values are indicated on the display with a small line graph in each frame.
  • \mrgb messages: the position of the average brightness in each frame is computed, and the corresponding xy coordinate is sent to ChucK. This value is mapped to tone controls. Each frame has a small grey square inside indicating the current estimate of this value.
  • \blob messages: the program tracks the position of the blob that best matches a given target color. Matches are computed using a windowed mean shift relative to the current position, with coefficients for RGB and HSV color distance metrics. In addition to the xy position, a “sharpness” score is provided which indicates how much the blob resembles a point source. Blob messages could control theremin-like melodic voices.
    Blobs are shown on the screen as small colored triangles.
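
As a rough illustration of the \drgb and \mrgb measurements described above, here is a minimal sketch in Python (not the actual Processing code, which is not shown here). It assumes a video frame is a list of rows of (r, g, b) tuples, and it calls each sub-window a "cell" to avoid confusion with video frames. The function names `cell_stats` and `brightness_centroid` are hypothetical, and using the sum of absolute channel differences for the d-channel is an assumption.

```python
def cell_stats(frame, prev, x0, y0, w, h):
    """Return (d, r, g, b) for one cell: frame difference plus channel totals."""
    d = r = g = b = 0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            pr, pg, pb = frame[y][x]
            qr, qg, qb = prev[y][x]
            r += pr
            g += pg
            b += pb
            # Assumed difference measure: sum of absolute per-channel changes.
            d += abs(pr - qr) + abs(pg - qg) + abs(pb - qb)
    return d, r, g, b


def brightness_centroid(frame, x0, y0, w, h):
    """Return the brightness-weighted mean (x, y) inside one cell."""
    total = sx = sy = 0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            lum = sum(frame[y][x])  # crude brightness: r + g + b
            total += lum
            sx += lum * x
            sy += lum * y
    if total == 0:
        # A completely black cell has no preferred position; use its center.
        return x0 + (w - 1) / 2, y0 + (h - 1) / 2
    return sx / total, sy / total


# Tiny 2x2 example: one pixel lights up between two frames.
prev = [[(0, 0, 0), (0, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
frame = [[(0, 0, 0), (30, 60, 90)], [(0, 0, 0), (0, 0, 0)]]
print(cell_stats(frame, prev, 0, 0, 2, 2))
print(brightness_centroid(frame, 0, 0, 2, 2))
```

Each cell's (d, r, g, b) tuple would then go out as one \drgb message and the centroid as one \mrgb message.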

The video input window for the controller is shown above.

The video input system is fairly straightforward and mostly complete.
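
The \blob tracking described above can be sketched roughly as follows. This is a simplified Python illustration of one windowed mean-shift step, not the cameraOne implementation: the weighting function, the `falloff` parameter, the square window, and the names `color_distance` and `mean_shift_step` are all assumptions. Only the general idea (a window around the current position, with separate coefficients for RGB and HSV distance) comes from the description. The "sharpness" score is left out because its definition is not given.

```python
import colorsys
import math


def color_distance(c1, c2, k_rgb=1.0, k_hsv=1.0):
    """Weighted sum of an RGB distance and an HSV distance, each scaled to 0..1."""
    n1 = [v / 255.0 for v in c1]
    n2 = [v / 255.0 for v in c2]
    rgb = math.dist(n1, n2) / math.sqrt(3)
    h1, s1, v1 = colorsys.rgb_to_hsv(*n1)
    h2, s2, v2 = colorsys.rgb_to_hsv(*n2)
    dh = abs(h1 - h2)
    dh = min(dh, 1.0 - dh) * 2.0  # hue wraps around; rescale to 0..1
    hsv = math.sqrt(dh * dh + (s1 - s2) ** 2 + (v1 - v2) ** 2) / math.sqrt(3)
    return k_rgb * rgb + k_hsv * hsv


def mean_shift_step(frame, pos, target, radius=8, falloff=10.0):
    """Move pos to the match-weighted mean pixel position inside a square window.

    pos is an (x, y) pixel position assumed to lie inside the frame.
    """
    height, width = len(frame), len(frame[0])
    cx, cy = pos
    total = sx = sy = 0.0
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            # Pixels close to the target color get weights near 1.
            wgt = math.exp(-falloff * color_distance(frame[y][x], target))
            total += wgt
            sx += wgt * x
            sy += wgt * y
    return round(sx / total), round(sy / total)
```

Calling `mean_shift_step` once per video frame, starting from the previous position, lets the estimate follow the blob as long as it stays inside the window. The `k_rgb` and `k_hsv` coefficients trade off the two distance metrics.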

 

FeetBocks - A ChucK System for Song Synthesis and Playback

The interactive music synthesis system has several modes of operation. 

The first mode is a manual mode. Instruments (some from the homeworks and some from ChucK) are loaded, and the \drgb and \blob messages are mapped to controls on those instruments. For the demo I will have kick/snare and two drones mapped to the \drgb messages and a single melodic instrument mapped to a blob.
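
To make the manual-mode mapping concrete (hue to pitch, brightness to intensity, movement to event triggers), here is one possible sketch in Python. The actual ChucK mapping is not given in this post, so the pitch range, the trigger threshold, and the name `drgb_to_note` are illustrative assumptions only.

```python
import colorsys


def drgb_to_note(d, r, g, b, n_pixels, low=48, span=24, threshold=0.05):
    """Map one drgb tuple (channel totals over n_pixels) to (trigger, midi_pitch, gain)."""
    full = 255.0 * n_pixels  # largest possible total for one channel
    hue, _sat, val = colorsys.rgb_to_hsv(r / full, g / full, b / full)
    pitch = low + round(hue * span)       # hue 0..1 spread over `span` semitones
    gain = val                            # brightness drives intensity
    trigger = d / (3 * full) > threshold  # enough movement fires a note
    return trigger, pitch, gain
```

A drum voice would probably only use the trigger, while a drone might ignore it and follow pitch and gain continuously.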

The basic modules are as follows:

  • Instruments:
    • breathy.ck - filtered noise
    • tone.ck - beating oscillators
    • fmfm.ck - cascaded FM
    • voice.ck - instrument abstraction layer - common noteOn, noteOff, toneControl
  • Libraries:
    • hsv.ck - color transformations
    • oscsender.ck - send OSC messages to Processing
  • Song Structure
    • note.ck - everything we need to know about a single note
    • tonality.ck - note relationships independent of time
    • feel.ck - metrical structure/arrangement/notelist synthesis
      • support for different metrical/submetrical structures
      • parameters like density, syncopation and independence of voices
    • measure.ck - wrapper for feel
    • progression.ck - a collection of measures, built with a generative grammar
    • verse.ck - a collection of progressions
    • song.ck - a sequence of verses
    • score.ck - a sequence of songs
  • Control
    • main.ck - keyboard IO, toplevel timing
    • conductor.ck - control of time, handling OSC messages
    • band.ck - collect instruments and play measures

For this quarter I focused mostly on the conductor module (809 lines). This module allows one to play music in several modes:

  • Manual mode: no time - video events (movement) are mapped directly to notes
  • Measure mode: A measure is synthesized and played - when things get boring a new measure is synthesized and the feel interpolates between these measures.  Time advances in measures either by an internal clock, by external triggers, or a mixture of the two.
  • Song mode: A song is synthesized and played

Beat Detection

The biggest challenge so far was building a beat detector. The basic idea is to have the conductor module lock playback to a video motion like a head bob or dancing. In electrical engineering we might use a technique like a phase-locked loop (PLL), but those methods probably don't capture what real people do to track beats, and they are slow to adapt. Instead I use a different method, as follows:

  1. FeetBox has an internal estimate of the tempo and schedules the next beat at a time t1, an interval set by that tempo after the most recent beat.
  2. If the movement in the scene triggers a beat before t1, that beat is played, unless it occurs within a certain holdoff time (~300 ms) of the last beat played. A new estimate of the bpm and the last beat time is made and we return to step 1.
  3. If t1 arrives before a movement-triggered beat, the computer triggers the beat and returns to step 1.

This method produces fairly intuitive results, but is not foolproof.
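
The three steps can be sketched in a few lines of Python. This is a simplified model and not the ChucK conductor code: the class name `BeatTracker`, the `smoothing` factor used to re-estimate the tempo, and the starting tempo are assumptions, since the post only says that a new bpm estimate is made. Times are in seconds.

```python
class BeatTracker:
    """Beat follower: motion can pull a beat early, a timer fills in otherwise."""

    def __init__(self, bpm=120.0, holdoff=0.3, smoothing=0.5):
        self.period = 60.0 / bpm      # current tempo estimate, seconds per beat
        self.holdoff = holdoff        # ignore motion this soon after a beat
        self.smoothing = smoothing    # assumed: weight given to each new interval
        self.last_beat = 0.0
        self.next_beat = self.period  # t1 in the steps above

    def _play(self, when):
        # Step 1: record the beat and schedule the following one.
        self.last_beat = when
        self.next_beat = when + self.period

    def update(self, now, motion=False):
        """Advance to time `now`; return True if a beat is played."""
        if now >= self.next_beat:
            # Step 3: t1 arrived first, so the computer plays the beat.
            self._play(self.next_beat)
            return True
        interval = now - self.last_beat
        if motion and interval >= self.holdoff:
            # Step 2: motion arrived before t1 and outside the holdoff window.
            self.period += self.smoothing * (interval - self.period)
            self._play(now)
            return True
        return False


tracker = BeatTracker(bpm=120.0)
for t, moved in [(0.1, True), (0.4, True), (0.6, False), (0.9, False)]:
    print(t, tracker.update(t, motion=moved))
```

With `smoothing` near 1 the tempo jumps straight to the most recent interval. Smaller values average over several beats, which is steadier but slower to follow a change.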

Future Work

As of today the system is 3500 lines, and is probably only about 30% complete. It will be a struggle to get natural-sounding beats and chord progressions for the show on Tuesday, but I will keep working on it. The performance will focus primarily on the use of beat detection and manual melody.

I don't want to post the code on the web anymore. Email me if you want to take a look at it.

joel

 

 

 
