audio

New project: auphonic

Submitted by grh on 6. July 2011 - 17:19

Currently I am working on the auphonic project, which involves machine learning, audio signal processing, web development, open-source technologies and much more.

So don't expect many updates on my mur.at page; I will write about new things on the auphonic blog. You can subscribe to the auphonic feed or follow @auphonic on Twitter.


How to become a Pure Data / GEM Professional

Submitted by grh on 20. June 2010 - 19:31

These are materials from a lecture on realtime audio and video programming (using GEM and Pure Data) at the University of Applied Sciences FH Joanneum, for the master programme Media and Interaction Design.

Reservoir Computing: a powerful Black-Box Framework for Nonlinear Audio Processing

Submitted by grh on 25. June 2009 - 15:23

Year:

2009

Authors:

Georg Holzmann

Type:

Conference paper

Publisher:

Proc. of the 12th Int. Conference on Digital Audio Effects (DAFx-09)

Abstract:

This paper proposes reservoir computing as a general framework for nonlinear audio processing.
Reservoir computing is a novel approach to recurrent neural network training with the advantage of a very simple, linear learning algorithm. In theory it can approximate arbitrary nonlinear dynamical systems with arbitrary precision, and it has an inherent temporal processing capability, so it is well suited to many nonlinear audio processing problems. Reservoir computing can be applied whenever nonlinear relationships are present in the data and time information is crucial.

Examples from three application areas are presented: nonlinear system identification of a tube amplifier emulator algorithm; nonlinear audio prediction, as needed in wireless audio transmission where dropouts may occur; and automatic melody transcription from a polyphonic audio stream, as one example from the broad field of music information retrieval.
Reservoir computing was able to outperform state-of-the-art alternative models in all studied tasks.

Publication:

Reservoir Computing DAFx-09 paper

Media:

Audio Examples for DAFx-09 paper (5.3 MB)

Master Thesis on Echo State Networks

Submitted by grh on 24. June 2009 - 20:01

Year:

2008

Authors:

Georg Holzmann

Type:

Master Thesis

Publisher:

Institute for Theoretical Computer Science, TU Graz, Austria

Abstract:

Echo State Networks with Filter Neurons and a Delay&Sum Readout with Applications in Audio Signal Processing

Echo state networks (ESNs) are a novel approach to recurrent neural network training with the advantage of a very simple, linear learning algorithm. In theory they can approximate arbitrary nonlinear dynamical systems with arbitrary precision (universal approximation property), and they have an inherent temporal processing capability, which makes them a very powerful extension of linear black-box modeling techniques to the nonlinear domain. It has been demonstrated on a number of benchmark tasks that echo state networks outperform other methods for nonlinear dynamical modeling.
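To make the "simple, linear learning algorithm" concrete, here is a minimal sketch of an echo state network in plain Python. It is only an illustration and not the thesis code or its C++ library: all function names, the reservoir size, the row-sum scaling of the recurrent weights, the ridge regression readout and the toy target are assumptions chosen for brevity. The reservoir weights stay fixed and random; only the linear readout is trained.

```python
import math
import random

def make_reservoir(n, density=0.2, scale=0.9, seed=0):
    """Random sparse recurrent weights plus input weights (illustrative values)."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1, 1) if rng.random() < density else 0.0
          for _ in range(n)] for _ in range(n)]
    # Scale so the largest absolute row sum is `scale` < 1. With tanh units this
    # makes the state update a contraction, a conservative way to get fading memory.
    norm = max(sum(abs(v) for v in row) for row in w) or 1.0
    w = [[v * scale / norm for v in row] for row in w]
    w_in = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    return w, w_in

def run_reservoir(w, w_in, inputs):
    """Drive the fixed reservoir with a scalar input sequence; collect the states."""
    n = len(w)
    x = [0.0] * n
    states = []
    for u in inputs:
        x = [math.tanh(w_in[i] * u + sum(w[i][j] * x[j] for j in range(n)))
             for i in range(n)]
        states.append(x + [1.0])  # constant 1.0 acts as a bias input to the readout
    return states

def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def train_readout(states, targets, ridge=1e-6):
    """Linear (ridge) regression from reservoir states to targets: the only training."""
    k = len(states[0])
    a = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(k)]
    return solve(a, b)

def predict(states, w_out):
    return [sum(wi * si for wi, si in zip(w_out, s)) for s in states]

if __name__ == "__main__":
    rng = random.Random(1)
    u = [rng.uniform(-1, 1) for _ in range(300)]
    # Toy target (an assumption): a saturating nonlinearity with one sample of memory.
    y = [0.0] + [math.tanh(2 * u[t] + u[t - 1]) for t in range(1, len(u))]
    w, w_in = make_reservoir(20)
    states = run_reservoir(w, w_in, u)
    w_out = train_readout(states, y)
    mse = sum((p - v) ** 2 for p, v in zip(predict(states, w_out), y)) / len(y)
    print("training MSE:", mse)
```

Because the recurrent weights are never adapted, training reduces to a single linear least-squares problem, which is what keeps the method cheap compared with gradient-based recurrent network training.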

This thesis suggests two enhancements of the original network model. First, the previously proposed idea of filters in neurons is extended to arbitrary infinite impulse response (IIR) filter neurons, and the ability of such networks to learn multiple attractors is demonstrated. Second, a delay&sum readout is introduced, which adds trainable delays in the synaptic connections of output neurons and thereby vastly improves the memory capacity of echo state networks. Benchmark tasks show that this new structure is able to outperform standard ESNs and other models. Moreover, no other comparable method for sparse nonlinear system identification with long-term dependencies could be found in the literature.
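The abstract does not say how the delays of the delay&sum readout are trained. As a rough illustration of the idea only, the sketch below picks one delay per signal by taking the peak of its cross-correlation with the target, then shifts the signal by that delay before it would enter a linear readout. The cross-correlation criterion and the helper names `estimate_delay` and `apply_delay` are assumptions made for this example.

```python
def estimate_delay(signal, target, max_delay):
    """Return the delay d in [0, max_delay] that maximizes the absolute
    cross-correlation between signal[t - d] and target[t]."""
    best_d, best_c = 0, -1.0
    for d in range(max_delay + 1):
        c = abs(sum(signal[t - d] * target[t] for t in range(d, len(target))))
        if c > best_c:
            best_d, best_c = d, c
    return best_d

def apply_delay(signal, d):
    """Delay a signal by d samples (0 <= d <= len(signal)), padding with zeros."""
    return [0.0] * d + signal[:len(signal) - d]

if __name__ == "__main__":
    # Toy data: the target is the signal delayed by 5 samples.
    signal = [1.0 if t == 3 else 0.0 for t in range(50)]
    target = apply_delay(signal, 5)
    d = estimate_delay(signal, target, max_delay=20)
    aligned = apply_delay(signal, d)  # this aligned signal would feed the readout
    print("estimated delay:", d)
```

With a delay chosen per readout connection, the linear readout sees each signal at the time lag where it is most informative, so long lags need not be stored in the reservoir itself.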

Finally, real-world applications in the context of audio signal processing are presented and compared to state-of-the-art alternative methods. The first example is a nonlinear system identification task for a tube amplifier. ESNs are then trained for nonlinear audio prediction, as needed in audio restoration or in wireless audio transmission where dropouts may occur. Furthermore, an efficient, open-source C++ library for echo state networks was developed and is briefly presented.

The audio examples can be downloaded below.

Publication:

Master Thesis

Media:

Thesis Audio Examples (6 MB)

Echo State Networks in Audio Processing

Submitted by grh on 24. June 2009 - 19:34

Year:

2007

Authors:

Georg Holzmann

Type:

Technical report

Publisher:

Internet Publication

Abstract:

This article discusses echo state networks, a special form of recurrent neural network, in the area of nonlinear audio signal processing. Echo state networks are a novel approach to recurrent neural networks with a very simple (linear) training algorithm.
Signal processing examples in nonlinear system identification (valve distortion, clipping), inverse modeling (quality enhancement) and audio prediction are briefly presented and discussed.

Publication:

ESNs in Audio Processing

audioeditor

Submitted by grh on 24. June 2009 - 16:04

audio

Started in:

2006

Authors:

Georg Holzmann

License:

GNU General Public License (GPL)

Programming language:

C++

Overview:

This was an attempt to build a C++ multitrack audioeditor using the Qt4 GUI toolkit.
The audio system is already working, but the GUI is not yet really usable.

Audioeditor uses PortAudio to access the hardware in a cross-platform way and libsndfile for file IO.

The audio IO is callback based: the PortAudio engine calls a callback whenever it needs more audio data for output or input. The callback function runs in an interrupt or background thread, which leaves the foreground application free to do other things while the audio runs in the background.

Each track, effect, etc. has to register a REALTIME-SAFE process method with the AudioCtl class, and this method is called for each audio block.
This class also manages all the play/pause/stop/seek and other control methods and synchronizes the different tracks and effects.
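The registration scheme described above can be sketched roughly as follows. This is a Python stand-in written for illustration, not the project's C++ code: only the class name AudioCtl comes from the text, while the method names, the `Track` and `Gain` classes and the block handling are hypothetical. Python itself is not realtime-safe; in the real C++ callback the registered methods would have to avoid allocation, locks and file IO.

```python
class AudioCtl:
    """Hypothetical controller: owns the transport state and the per-block callback."""

    def __init__(self):
        self.playing = False
        self.position = 0          # current playback position in frames
        self._processors = []      # registered process methods, called in order

    def register(self, process):
        """Register a process(out, position) method of a track or effect."""
        self._processors.append(process)

    def play(self):
        self.playing = True

    def pause(self):
        self.playing = False

    def stop(self):
        self.playing = False
        self.position = 0

    def seek(self, frame):
        self.position = frame

    def callback(self, out):
        """Stand-in for the engine callback: fill one output block in place."""
        for i in range(len(out)):
            out[i] = 0.0
        if not self.playing:
            return
        # Every registered track/effect sees the same position, which keeps them in sync.
        for process in self._processors:
            process(out, self.position)
        self.position += len(out)


class Track:
    """Hypothetical track that mixes its samples into the output block."""

    def __init__(self, samples):
        self.samples = samples

    def process(self, out, position):
        for i in range(len(out)):
            idx = position + i
            if idx < len(self.samples):
                out[i] += self.samples[idx]


class Gain:
    """Hypothetical effect that scales whatever is already in the block."""

    def __init__(self, factor):
        self.factor = factor

    def process(self, out, position):
        for i in range(len(out)):
            out[i] *= self.factor


if __name__ == "__main__":
    ctl = AudioCtl()
    ctl.register(Track([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).process)
    ctl.register(Gain(0.5).process)
    block = [0.0] * 4
    ctl.play()
    ctl.callback(block)   # in the real program the audio engine would call this
    print(block)
```

Keeping the transport state in one controller means seek or stop changes a single position value, and every registered process method picks it up on the next block.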

A more detailed (doxygen) documentation can be found in the source tarball below.

Release Tarball:

audioeditor source and documentation

echo noise

Submitted by grh on 23. June 2009 - 10:36

audio
video

Year:

2008

Authors:

Georg Holzmann

Project Description:

Echo noise is an audio and visual performance for the Frechheit Freiheit festival at Minoriten Graz.

A longer recording in higher quality is also available in the download section below. Listen to all recordings with good headphones or loudspeakers; they contain important low frequencies!

Text from the festival curator (translated from German):
Right from the start, the young Styrian Georg Holzmann stood out for the radicalism of his sonic language and the uncompromising nature of his concepts; a highly characteristic sound world rooted in noise and rigorously realized spatial ideas distinguished his early pieces.
Holzmann takes literally the much-quoted Lachenmann dictum, meant figuratively, that a composer must "build" his own instrument each time, and puts it consistently into practice: be it in his research in programming and, most recently, "Computational Intelligence", or in the real, physical construction of an instrumentarium entirely his own. Holzmann thereby joins a long tradition of avant-garde experimenters and is composer, researcher, tinkerer, inventor and, not least, his own performer (e.g. on the "electronic unicorn", which he conceived and built himself).
Curator: Florian Gessler

Public Presentations and Performances:

echo noise @ Minoriten Graz

Recording:

echo noise recording (mp4, 70 MB)
