Owen Campbell, MIR Visualizer
The goal of my project was to extend an existing real-time music visualization system I had written using OpenFrameworks. The original visualizer was controlled entirely by the RMS of the input signal, so I wanted to incorporate a number of the other low-level audio features we covered in class to add more depth to the visuals. I used Paul Reimer's OpenFrameworks addon for Aubio (https://github.com/paulreimer/ofxAudioFeatures), which, despite its lack of documentation, still helped me integrate Aubio with my existing code. Aubio also offers a suite of higher-level feature detection functions, including onset and pitch detection. However, because of how my existing code was structured, I chose to stick with the low-level features for now.
My strategy for expanding the number of control parameters was to break down the behavior of the visualizer and reassign what had previously been controlled by different thresholds and short- and long-term histories of the RMS. For each new feature (overall spectral energy, high-frequency content, and the flux, spread, skewness, kurtosis, and slope of the spectrum) I store the same information I had been collecting for RMS. These features update on every FFT frame, synced to the audio callback in OpenFrameworks. A large part of the still-ongoing process of using these features is determining useful ranges for their output and encapsulating that information in a way that makes it easy to swap features between different characteristics in the visuals. The visual parameters I am mapping the audio features to so far are the lifetime and size of the particles, a spawning control, a trigger to scatter the particles, and color, opacity, and 'texture' parameters which I haven't fully implemented. The texture parameter determines the number of sides of each particle, shifting them from triangles to squares to pentagons all the way up to full circles. However, the OpenFrameworks addon I used to achieve this effect (https://github.com/openframeworks/openFrameworks/tree/master/addons/ofxVectorGraphics) does not seem to support alpha blending, which is a key improvement on the aesthetic of the original visualizer.
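The spectral-shape computation and the "useful ranges" normalization step can be sketched in a few lines of Python (the project itself is C++/OpenFrameworks; all names here are illustrative, not the ofxAudioFeatures API):

```python
import numpy as np

def spectral_shape(mag, freqs):
    """Spectral shape descriptors from one FFT magnitude frame."""
    p = mag / (mag.sum() + 1e-12)            # treat the spectrum as a distribution
    centroid = (freqs * p).sum()
    spread   = np.sqrt(((freqs - centroid) ** 2 * p).sum())
    skew     = (((freqs - centroid) ** 3 * p).sum()) / (spread ** 3 + 1e-12)
    kurt     = (((freqs - centroid) ** 4 * p).sum()) / (spread ** 4 + 1e-12)
    slope    = np.polyfit(freqs, mag, 1)[0]  # linear fit to the magnitude spectrum
    return dict(centroid=centroid, spread=spread, skewness=skew,
                kurtosis=kurt, slope=slope)

def to_unit_range(value, lo, hi):
    """Clamp a raw feature into [0, 1] so it can drive a visual parameter.
    lo/hi would come from the observed ranges of each feature."""
    return float(np.clip((value - lo) / (hi - lo + 1e-12), 0.0, 1.0))

# Example: one frame dominated by a 1 kHz tone
sr, n = 44100, 1024
freqs = np.fft.rfftfreq(n, 1.0 / sr)
t = np.arange(n) / sr
mag = np.abs(np.fft.rfft(np.sin(2 * np.pi * 1000 * t) * np.hanning(n)))
feats = spectral_shape(mag, freqs)
size = to_unit_range(feats['centroid'], 0.0, sr / 2)  # e.g. drive particle size
```

Keeping each mapping as a (feature, lo, hi) triple is one way to make swapping features between visual characteristics a one-line change.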
Anis Haron, Lick detection in an improvised passage
- a 'lick' can be understood as a known sequence of notes used in solos and melodic lines in popular genres such as Jazz and Rock
- a lick is a form of imitation, or a quote – with the goal of developing the phrase and building personal style and vocabulary
- usually quoted from the greats within the genre
- the project will be approached similarly to speech recognition methods, using HMMs
- main issues: building a database of licks, and handling variation between performances
- a lick can be played in any key, at any speed, and with melodic or rhythmic variations
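One of the listed issues, key invariance, has a simple fix that works before any HMM is involved: represent a lick by its pitch intervals rather than absolute pitches, since transposition shifts every pitch but leaves the intervals unchanged. A minimal sketch (exact matching only; tempo and melodic variation are what the HMM would have to absorb):

```python
# Key-invariant lick lookup via interval sequences (illustrative sketch).

def intervals(pitches):
    """Successive MIDI-pitch differences; invariant under transposition."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def find_lick(solo, lick):
    """Return start indices where the lick occurs in the solo, in any key."""
    target = intervals(lick)
    steps = intervals(solo)
    n = len(target)
    return [i for i in range(len(steps) - n + 1) if steps[i:i + n] == target]

lick = [60, 62, 63, 65, 67]               # a lick starting on C
solo = [55, 57, 67, 69, 70, 72, 74, 60]   # contains the lick transposed to G
hits = find_lick(solo, lick)
```

An HMM generalizes this by scoring approximate matches probabilistically instead of requiring the interval sequence to match exactly.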
Aaron Demby Jones, MIR tool for algorithmic composition
Music information retrieval (MIR) typically starts with a particular audio file or collection of audio files from which the user would like to extract data (e.g., tempo, mood, or genre). A number of interesting mathematical procedures are employed in this task. However, these same procedures have not generally been considered from the perspective of generating audio content. In this project, I propose a novel approach to algorithmic composition, derived from techniques of MIR.
Robert Miller, Shifty Looping-X: Expressive, meter-aware, non-repeating rhythmic looping
Shifty Looping-x, a creative application of MIR, uses multi-feature audio segmentation and beat tracking to extend Matthew Wright’s original implementation and development of the shifty looping technique. Traditional looping is advantageous because a few seconds of music can germinate into an arbitrarily larger sample; yet looping quickly becomes dull because of its inherent repetition. As a result, Wright developed a new technique that maintains the advantages of looping while addressing the repetitiveness. Using a Max/MSP patch and an onset detector object, the program finds and suggests possible points within a given sound sample that could serve as start and stop points for a loop. The duration of each suggested loop is an integer number of bars, thus maintaining the metrical integrity of the sample. Once the program suggests loops, the user auditions the candidates and discards any that might dismember musical coherence. Because the program generally suggests an excessive number of loops, auditioning them quickly becomes tedious and obtrusive to the compositional process. Therefore, I propose to incorporate segmentation and texture window analysis into the implementation to improve the program’s ability to find good loops and to create a more autonomous system.
Instead of running as a Max/MSP patch, the program will be written in C++ using Allocore and Essentia, the latter powering the program’s MIR functionality. I will use Tzanetakis and Cook’s scheme for segmentation, but I will replace zero-crossing rate with running cross-correlation (or frame-by-frame autocorrelation). The other features are spectral centroid, spectral rolloff, spectral flux, and root-mean-square energy. Significant and simultaneous changes across these features will be used in the decision-making process of setting a loop point. Beat locations will also be factored into the decision of where to place a loop point, and the distance between start and stop positions will be an integer number of bars.
The main challenges of this project include creating an automated decision maker and controlling for the different types of music that may be passed to it. I will deliver a program (source) that runs via the command line or, time permitting, a simple GUI application built using GLV. The program will be accompanied by a README and a few audio samples demonstrating it.
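The "simultaneous change" decision rule can be prototyped in Python with numpy before committing to the C++/Essentia implementation: compute several frame features, z-score each feature's frame-to-frame delta, and flag frames where most features jump at once. The feature set and thresholds below are illustrative only:

```python
import numpy as np

sr, hop = 22050, 512
rng = np.random.default_rng(0)
t = np.arange(sr * 2) / sr
# Synthetic test signal: 1 s of a 220 Hz tone, then 1 s of noise.
x = np.concatenate([np.sin(2 * np.pi * 220 * t[:sr]),
                    0.5 * rng.standard_normal(sr)])

frames = x[: len(x) // hop * hop].reshape(-1, hop)
mags = np.abs(np.fft.rfft(frames * np.hanning(hop), axis=1))
freqs = np.fft.rfftfreq(hop, 1 / sr)

rms      = np.sqrt((frames ** 2).mean(axis=1))
centroid = (mags * freqs).sum(axis=1) / (mags.sum(axis=1) + 1e-12)
cum = np.cumsum(mags, axis=1)
rolloff  = freqs[np.argmax(cum >= 0.85 * cum[:, -1:], axis=1)]
flux     = np.r_[0, np.sqrt((np.diff(mags, axis=0) ** 2).sum(axis=1))]

feats = np.vstack([rms, centroid, rolloff, flux])
delta = np.abs(np.diff(feats, axis=1))
z = (delta - delta.mean(axis=1, keepdims=True)) / (delta.std(axis=1, keepdims=True) + 1e-12)
votes = (z > 3).sum(axis=0)              # how many features jump at each frame
candidates = np.flatnonzero(votes >= 3)  # candidate loop-point frames
```

The real system would then snap these candidate frames to the nearest beat locations and keep only pairs an integer number of bars apart.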
Hafiz Muhammad, NICM: Raag Analysis
- Time on Pitch & HMM http://nbviewer.ipython.org/gist/muhammadhafiz/6e716182653e2f2e35c2
- Raag Analysis http://nbviewer.ipython.org/gist/muhammadhafiz/062a21c9caa3cae5cb2a
- Spectral difference-sitar onset VS tabla onset http://nbviewer.ipython.org/gist/muhammadhafiz/60719fc11964655b8690
Sean Cyrus Phillips, Classification of Animal Vocalizations
This project attempts to classify sounds as animal or non-animal based on low-level audio features. http://nbviewer.ipython.org/urls/raw.githubusercontent.com/soundbysean/240E_iPython/master/240E_Final_Total_Set.ipynb?create=1
due: Friday 28th February
Submit a written proposal for your final project detailing the goals and techniques used. Detail the deliverables you will submit as your project.
due: Friday 28th February
Tzanetakis, G., & Cook, P. (1999). Multifeature audio segmentation for browsing and annotation. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1–4. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=810860
Foote, J. (2000). Automatic audio segmentation using a measure of audio novelty. Proceedings of the IEEE International Conference on Multimedia and Expo (ICME 2000), 1, 452–455. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=869637
due: Friday 21st February
Explore summarization of the features extracted in HW5 (if it makes sense), either through "texture windows", self-similarity matrices, or other techniques.
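Both summarization routes can be prototyped in a few lines of numpy; the "features" below are synthetic stand-ins, just to show the mechanics:

```python
import numpy as np

rng = np.random.default_rng(1)
# Pretend input: 200 analysis frames x 4 features, with a texture change halfway.
frames = np.vstack([rng.normal(0.0, 1.0, (100, 4)),
                    rng.normal(3.0, 0.5, (100, 4))])

def texture_windows(feats, size=20):
    """Stack mean and std of each non-overlapping block of `size` frames,
    summarizing short-time features over a longer 'texture' horizon."""
    blocks = feats[: len(feats) // size * size].reshape(-1, size, feats.shape[1])
    return np.hstack([blocks.mean(axis=1), blocks.std(axis=1)])

tw = texture_windows(frames)                      # shape (10, 8)

# Self-similarity matrix: cosine similarity of every window with every other.
norm = tw / (np.linalg.norm(tw, axis=1, keepdims=True) + 1e-12)
ssm = norm @ norm.T
```

In the SSM, windows from the same texture form bright blocks on the diagonal, and the texture change shows up as a block boundary, which is what novelty-based segmentation (Foote, 2000) looks for.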
due: Friday 21st February
Tzanetakis, G., & Cook, P. (2002). Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10(5), 293–302. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1021072
Jain, A., Duin, R., & Mao, J. (2000). Statistical pattern recognition: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1), 4–37. Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=824819
Lerch, A. (n.d.). Chapter 8: Musical Genre, Similarity, and Mood. In An Introduction to Audio Content Analysis.
Ogihara, M., & Kim, Y. (2012). Chapter 5: Mood and Emotional Classification. In Music Data Mining.
Trohidis, K., Tsoumakas, G., Kalliris, G., & Vlahavas, I. (2008). Multi-Label Classification of Music into Emotions. ISMIR, 325–330. Retrieved from http://books.google.com/books?hl=en&lr=&id=OHp3sRnZD-oC&oi=fnd&pg=PA325&dq=MULTI-LABEL+CLASSIFICATION+OF+MUSIC+INTO+EMOTIONS&ots=oDSLrDjwc2&sig=i5KPRtHqhp74L6BIVpHecgYp6bw
due: Friday 14th February
Use pitch or tempo extractors from an existing library (Essentia, Marsyas, librosa, etc.) to extract tempo and beat information from your collection. Verify the results, try to identify wrong estimations, and discuss the reasons for them.
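Whichever library you use, the tempo estimate usually comes down to finding the dominant periodicity in an onset-strength envelope, which is also where the typical wrong estimations (halving or doubling of the true tempo) come from. A numpy-only sketch of the idea on a synthetic 120 BPM envelope:

```python
import numpy as np

frame_rate = 100.0                 # onset-envelope frames per second
n = 1000                           # 10 s of envelope
env = np.zeros(n)
period = frame_rate * 60.0 / 120.0           # 50 frames per beat at 120 BPM
env[(np.arange(n) % period) < 1] = 1.0       # a "click" on every beat

# Autocorrelate the mean-removed envelope and keep non-negative lags.
e = env - env.mean()
ac = np.correlate(e, e, mode='full')[n - 1:]

# Search only lags corresponding to a plausible 40-240 BPM range.
lo, hi = int(frame_rate * 60 / 240), int(frame_rate * 60 / 40)
lag = lo + np.argmax(ac[lo:hi])
bpm_est = 60.0 * frame_rate / lag
```

Real extractors add onset detection from the audio itself, tempo priors, and beat-position tracking on top of this; when verifying your collection, octave errors (60 vs 120 BPM) are the first thing to look for.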
due: Friday 14th February
Klapuri, A., & Davy, M. (2006). Signal Processing Methods for Music Transcription. Part II, Chapter 4: Beat Tracking and Musical Metre Analysis. Pages 101–127.
Lerch, A. (n.d.). Chapter 6: Temporal analysis. In An Introduction to Audio Content Analysis (pp. 119–137).
- Aaron: Aaron_hw1
- Anis: http://nbviewer.ipython.org/gist/anisharon/8443429
- Hafiz: Hafiz_hw1.zip
- Joseph: http://nbviewer.ipython.org/gist/anonymous/8470255
- Owen: http://nbviewer.ipython.org/gist/owengc/8486332 , http://nbviewer.ipython.org/gist/owengc/8486246
- Rob: https://gist.github.com/milrob/8483898
- Sean: http://nbviewer.ipython.org/gist/soundbysean/8487190
- Aaron: Aaron_hw2
- Anis: http://nbviewer.ipython.org/gist/anisharon/8593377
- Hafiz: http://nbviewer.ipython.org/gist/muhammadhafiz/2342ea25562384e57037
- Owen: http://nbviewer.ipython.org/gist/owengc/75068e105523d0773e33
- Rob: https://gist.github.com/milrob/16fa49709529722f5e6e
- Sean: http://nbviewer.ipython.org/github/soundbysean/240E_iPython/blob/master/240E_HW2_SCP.ipynb?create=1
due: Friday February 7th
For at least 5 pieces in your collection (try to choose some that are very different, but include some similar ones too), extract 6 temporal or spectral features. Analyze and discuss what the features could tell us about the recordings. Explore how different window sizes and smoothing might affect the results.
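The window-size and smoothing questions can be explored concretely. A minimal numpy sketch on a synthetic signal, using zero-crossing rate as the example feature (the sizes and smoothing length are arbitrary choices):

```python
import numpy as np

sr = 8000
t = np.arange(sr) / sr
rng = np.random.default_rng(2)
x = 0.9 * np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(sr)

def zcr(x, win):
    """Per-frame zero-crossing rate over non-overlapping windows."""
    n = len(x) // win
    frames = x[: n * win].reshape(n, win)
    return (np.abs(np.diff(np.sign(frames), axis=1)) > 0).mean(axis=1)

zcr_small = zcr(x, 256)    # more frames, tracks detail, noisier
zcr_large = zcr(x, 2048)   # fewer frames, smoother, blurs transitions

def smooth(f, k=5):
    """Moving average over the feature trajectory itself."""
    return np.convolve(f, np.ones(k) / k, mode='valid')

zcr_smoothed = smooth(zcr_small)
```

Larger windows average out frame-to-frame noise but blur fast changes; smoothing the trajectory of a small-window feature is an alternative that keeps the frame rate while reducing variance.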
due: Friday February 7th
Lerch, A. (n.d.). Chapter 5: Tonal analysis. In An Introduction to Audio Content Analysis (pp. 79–103). Up to the Polyphonic Input signal chapter.
Rabiner, L., Cheng, M., Rosenberg, a., & McGonegal, C. (1976). A comparative performance study of several pitch detection algorithms. IEEE Transactions on Acoustics, Speech, and Signal Processing, 24(5), 399–418. doi:10.1109/TASSP.1976.1162846
Klapuri, A., & Davy, M. (2006). Signal Processing Methods for Music Transcription. Chapter 4. Beat tracking and music metre analysis.
due: Friday January 31st
Measure the RMS of each track in your music collection and plot the results as a histogram. Separate your collection by artist, album or genre, and show any differences or similarities in the values of RMS.
Additionally, select 5 tracks from your collection and extract windowed RMS. Explore the effect of different sized windows on the result.
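The difference between the single per-track value (for the histogram) and the windowed trajectory can be sketched as follows (synthetic signal, arbitrary window sizes):

```python
import numpy as np

sr = 44100
t = np.arange(2 * sr) / sr
# A 440 Hz tone that is quiet for 1 s, then loud for 1 s.
x = np.sin(2 * np.pi * 440 * t) * np.where(t < 1.0, 0.2, 0.8)

def windowed_rms(x, win, hop):
    starts = range(0, len(x) - win + 1, hop)
    return np.array([np.sqrt(np.mean(x[s:s + win] ** 2)) for s in starts])

track_rms = np.sqrt(np.mean(x ** 2))     # one value per track, for the histogram
rms_short = windowed_rms(x, 1024, 512)   # reacts quickly to the level change
rms_long  = windowed_rms(x, 16384, 8192) # smoother, lags across the jump
```

For a sine of amplitude A the RMS is A/sqrt(2), so the short-window curve should sit near 0.141 in the quiet half and 0.566 in the loud half, while the long-window curve smears the transition.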
due: Friday January 31st
G. Tzanetakis. Chapter 2: Audio Feature Extraction. Sections 2.1 and 2.2. Pages 43-57. From Music Data Mining.
Peeters, G. (2004). A large set of audio features for sound description (similarity and classification) in the CUIDADO project (pp. 1–25). Retrieved from http://www.citeulike.org/group/1854/article/1562527
due: Friday January 24th
Perform an analysis of a set of MIDI files of your choice. Set out a hypothesis or goal for the analysis, then perform it and discuss the results.
Submit an IPython notebook or a similar document from another environment showing the work performed, including a description of the source data set and a discussion of the results.
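One concrete shape such a hypothesis could take: "the pitch-class histogram of a piece identifies its key". The sketch below runs a Krumhansl-Schmuckler style correlation on a hand-written pitch list; a real submission would extract the pitches from MIDI files (e.g. with music21) rather than type them in:

```python
import numpy as np

# Krumhansl's major-key tonal profile (perceived stability of each pitch class).
MAJOR_PROFILE = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                          2.52, 5.19, 2.39, 3.66, 2.29, 2.88])

def estimate_major_key(midi_pitches):
    """Best-matching major-key tonic (0 = C) by correlating the pitch-class
    histogram against all 12 rotations of the profile."""
    hist = np.bincount(np.asarray(midi_pitches) % 12, minlength=12).astype(float)
    scores = [np.corrcoef(hist, np.roll(MAJOR_PROFILE, k))[0, 1]
              for k in range(12)]
    return int(np.argmax(scores))

c_major_scale = [60, 62, 64, 65, 67, 69, 71, 72]   # C D E F G A B C
tonic = estimate_major_key(c_major_scale)
```

This ignores minor keys and note durations; weighting the histogram by duration and adding the minor profile would be the obvious next steps.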
Cuthbert, M., Ariza, C., & Friedland, L. (2011). Feature Extraction and Machine Learning on Symbolic Music using the music21 Toolkit. In ISMIR. Retrieved from http://web.mit.edu/music21/papers/Cuthbert_Ariza_Friedland_Feature-Extraction_ISMIR_2011.pdf
McKay, C., & Fujinaga, I. (2006). jSymbolic: A feature extractor for MIDI files. In Proceedings of the International Computer Music Conference 2006. Retrieved from https://www.music.mcgill.ca/~cmckay/papers/musictech/McKay_ICMC_06_jSymbolic.pdf
Mckay, C. (2004). Automatic Genre Classification of MIDI Recordings. PhD Thesis.
Knopke, I. (2012). Chapter 11: Symbolic Data Mining in Musicology. In Music Data Mining (pp. 327–345).
Due Friday January 17th
Install the required tools on your system, and produce a histogram showing the lengths of the audio files in your collection.
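For WAV files the length histogram needs nothing beyond the standard library; this sketch writes two tiny files so it runs anywhere (a real collection with MP3/FLAC would need a library that reads those formats, such as mutagen or librosa):

```python
import math
import os
import struct
import tempfile
import wave
from collections import Counter

def write_sine_wav(path, seconds, sr=8000, freq=440.0):
    """Write a mono 16-bit sine-tone WAV of the given duration."""
    with wave.open(path, 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sr)
        n = int(seconds * sr)
        samples = (int(12000 * math.sin(2 * math.pi * freq * i / sr))
                   for i in range(n))
        w.writeframes(b''.join(struct.pack('<h', s) for s in samples))

def duration_seconds(path):
    with wave.open(path, 'rb') as w:
        return w.getnframes() / w.getframerate()

tmp = tempfile.mkdtemp()
write_sine_wav(os.path.join(tmp, 'a.wav'), 1.0)
write_sine_wav(os.path.join(tmp, 'b.wav'), 2.5)

durations = [duration_seconds(os.path.join(tmp, f))
             for f in sorted(os.listdir(tmp)) if f.endswith('.wav')]
histogram = Counter(int(d) for d in durations)   # 1-second bins
```

From here, `histogram` (or the raw `durations` list) can be plotted with matplotlib's `hist` in the notebook.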
For Monday January 13th
Li, T., & Li, L. (2012). Chapter 1: Music Data Mining : An Introduction. In Music Data Mining (pp. 3–42).
(This text is available online from the UCSB library when connected inside campus or from the campus VPN)