Website for the graduate-level Deep Learning for Music Information Retrieval I & II workshops, which cover deep learning theory, literature, and practice as applied to digital audio. More specifically, the course covers generative models (including autoencoders).

Taught at the Center for Computer Research in Music and Acoustics (CCRMA), Stanford University by Iran R. Roman and Camille Noufi as part of the CCRMA Summer 2022 Workshop Series.

Course content

  1. Introduction and digital audio review
  2. Audio features and musical objects
  3. Datasets and dimensionality reduction
  4. Cross-validation and linear regression
  5. Logistic regression, binary cross-entropy, and evaluation metrics
  6. Softmax classification
  7. Feedforward neural networks
  8. CNNs and optimization techniques
  9. Autoencoders
  10. VAEs and GANs
  11. Transformers and RNNs

Prerequisites

This is a graduate-level workshop that assumes knowledge of digital audio signal processing, object-oriented programming (we will work with Python 3), differential calculus (chain rule), linear algebra, and basic probability/statistics. To ensure that everybody is on the same page, we will review these concepts as they become relevant to course content. However, if you have never been exposed to these concepts, this course will likely be more challenging than it needs to be.

If you need to review these concepts, check out the following:

Or search online for other materials covering these concepts.

Course logistics

The course runs from August 8th to 19th (2022) and meets daily from 10AM to 5PM (Pacific Daylight Time) in person at CCRMA and over Zoom.

Please register for the course if you would like to attend, either in person or online. The course welcomes all students who meet the course prerequisites. All course materials are in English, so strong knowledge of English reading, writing, and speaking is assumed.

Getting help

You may find the subreddit deeplearningaudio, from a past version of the course, helpful for specific questions relating to this workshop.


© Iran R. Roman & Camille Noufi 2022