Key Concepts
- The event (a sound, a picture, etc.) that produces a
brain response from the subject.
- A disk of metal (usually silver plus a layer of
silver chloride) used in EEG to collect the tiny voltage changes coming
from the brain. It is attached to the scalp with a conductive gel or
cream after slightly abrading the skin of the subject. The signal
collected by the electrode is then passed to amplifiers and
analog-to-digital converters.
- The portion of the EEG or MEG signal that is time-locked to the
stimulus within a specific time window. Sweep and epoch are essentially
synonyms.
- The marker on the EEG or MEG that indicates when a
stimulus is being presented to the subject. It is called trigger
because it triggers a brain response.
- It refers to the moment at which the stimulus is presented
to the subject.
- Event-related potentials recorded from the scalp
combine signal (the ERP) and noise (other electrical activity). Since
the signal is often smaller than the noise, it would be difficult or
impossible to distinguish the ERP in a single sweep. In order to
increase the signal-to-noise ratio, the sweeps time-locked to a
repeated stimulus are averaged.
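As a sketch of why averaging works, here is a toy simulation (not real EEG data; NumPy is assumed, and the ERP waveform and noise level are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_sweeps, n_samples = 200, 100
t = np.linspace(0.0, 0.5, n_samples)
erp = 2.0 * np.exp(-((t - 0.2) ** 2) / 0.002)               # hypothetical ERP waveform
sweeps = erp + rng.normal(0.0, 5.0, (n_sweeps, n_samples))  # noise larger than signal
average = sweeps.mean(axis=0)                               # average over sweeps

err_single = np.mean(np.abs(sweeps[0] - erp))
err_avg = np.mean(np.abs(average - erp))
```

Because the noise is uncorrelated across sweeps, its standard deviation in the average falls roughly as 1/sqrt(number of sweeps), while the stimulus-locked ERP is preserved.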
- The electrical or magnetic response that is not
related to the event we want to study. For example, eye movements
cause artefacts in the EEG or MEG.
- This is another procedure used to increase the
signal-to-noise ratio. It consists of digitally filtering the
averaged ERP or ERF and is based on the assumption that
physiological brain responses occur within a specific frequency range.
For example, it is known that movement artefacts occur at very high
frequencies (above 50 Hz), while the most studied ERPs and ERFs occur
at frequencies between 1 and 20 Hz.
- The timing of an ERP or ERF brain response.
- Sound Transfer:
- It is the process of capturing (by playback) the
sound recorded in a given medium, e.g., an old 78-RPM disk, and
transferring the information to a new medium, e.g., a digitized
signal on a compact disc.
- Refers to any sort of signal processing that aims at
reducing or removing spurious noises from a noisy signal.
- Clicks, hiss, thumps, pops, wow, flutter:
- These are names associated with different types of
noises or degradations that commonly occur in old records. For
instance, click is an onomatopoeia (a word formed to imitate
a sound) for a short, spark-like noise. Similarly, hiss refers to the
characteristic noise of tape recordings. As for thumps and pops, they
refer to the sound produced when playing back a disk record whose
surface has been scratched. Wow and flutter relate, respectively, to
slow and fast frequency fluctuations that may occur in the sound if the
speed of the playback apparatus is not kept constant over time.
- Musical noise:
- Although formed by rather contradictory terms,
musical noise refers to a side effect of hiss reduction algorithms.
Perhaps a better term to characterize this effect would be random tonal
noise, since musical noise is perceived as a progression of short tone
bursts whose frequencies change randomly over time.
- As the name suggests, a filter is a device that is
capable of segregating or selecting components from a mixture. For
instance, the holes of a coffee filter are smaller than the average
diameter of ground coffee particles. Thus the coffee powder is
segregated from the beverage. In signal processing the elements to be
segregated are usually frequency components. Thus, a low-pass filter
preserves the low frequencies and attenuates (or removes) the high
frequencies. The same goes for high-pass, band-pass, and band-stop
filters.
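A minimal low-pass example, using a moving-average filter chosen for simplicity rather than quality (NumPy assumed; the signal frequencies are arbitrary):

```python
import numpy as np

def moving_average_lowpass(x, width):
    # Crude low-pass filter: each output sample is the mean of `width` inputs.
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

t = np.linspace(0.0, 1.0, 500)
slow = np.sin(2 * np.pi * 3 * t)           # low-frequency component (preserved)
fast = 0.5 * np.sin(2 * np.pi * 80 * t)    # high-frequency component (attenuated)
y = moving_average_lowpass(slow + fast, width=15)
```

After filtering, y follows the slow component closely while most of the fast component is averaged away.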
- It is the process by which a signal is modified when
passing through a filter.
- Wiener filtering:
- It is a special filter devised for de-noising
purposes. Given a clean and a noisy version of a certain signal, the
Wiener filter is the one that, when used to filter the noisy signal,
outputs the signal with the minimum mean-square error relative to the
clean signal, i.e., the best attainable signal-to-noise ratio (SNR).
One may wonder why one should bother with Wiener filtering at all,
since in order to get rid of the noise in this optimal way, one needs
to know the clean version of the signal beforehand. There are
situations in telecommunications systems in which known reference
signals are sent from the transmitter to the receiver in order to
calibrate the system. During this stage the receiver gets a noisy
signal but already knows what the clean version of it looks like.
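A toy frequency-domain sketch of the idea (NumPy assumed; the per-frequency gain |S|²/(|S|² + |N|²) is a standard form of the Wiener gain, and the noise is known here only because we construct the example ourselves):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1024
t = np.arange(n) / n
clean = np.sin(2 * np.pi * 10 * t)          # known reference signal
noisy = clean + rng.normal(0.0, 0.8, n)     # what the receiver observes

# Wiener gain per frequency bin: H = P_signal / (P_signal + P_noise).
S = np.fft.rfft(clean)
N = np.fft.rfft(noisy - clean)              # noise spectrum, known in this toy case
H = np.abs(S) ** 2 / (np.abs(S) ** 2 + np.abs(N) ** 2)
denoised = np.fft.irfft(H * np.fft.rfft(noisy), n)
```

Bins dominated by signal get gain near 1, bins dominated by noise get gain near 0, so the mean-square error of the output is far below that of the noisy input.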
- Signal-to-Noise Ratio (SNR):
- SNR is an objective measure defined as the
ratio between the power of the signal and the power of the corrupting
noise. The higher the SNR, the cleaner the signal.
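In decibels, the ratio is usually expressed as 10·log10(P_signal / P_noise); a minimal sketch (NumPy assumed):

```python
import numpy as np

def snr_db(signal, noise):
    # SNR in decibels: 10 * log10(signal power / noise power).
    p_signal = np.mean(np.asarray(signal, dtype=float) ** 2)
    p_noise = np.mean(np.asarray(noise, dtype=float) ** 2)
    return 10.0 * np.log10(p_signal / p_noise)
```

Doubling the signal amplitude at a fixed noise level raises the SNR by about 6 dB.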
- Model and Modeling:
- A model is something that represents an idealization of
reality or of a phenomenon. This something can be an object, a
mathematical equation, or a more complex system. For instance, a model
for the shape of the human head is a sphere. The position of an object
in free fall can be modeled by a second-order equation in time.
If the model represents the phenomenon appropriately, it is
possible to predict the future behavior of the phenomenon from the
model. In signal processing, the most commonly used models are rational
functions. Modeling (or model estimation) refers to the process of
finding a suitable model for a given phenomenon or signal. Very often,
models offer a compact representation of a phenomenon or signal. For
instance, one does not need to record the positions traveled by a
free-falling object at a large number of instants to describe the
phenomenon. Through a model, it suffices to know the initial conditions
(position and velocity) and the gravitational acceleration to describe
it.
- Autoregressive (AR) model:
- An AR model is a special case of rational function
models in which the numerator is reduced to a single gain. The basic
idea behind an AR model is that an event at a given time can be guessed
(or predicted) from a linear combination of a limited set of past
events. For instance, the days of the month between the 1st and the
27th obey a first-order AR model: it suffices to know the number
associated with the current day to guess that of the next day, which is
given by the number of the current day plus one. In this example, as in
many other real-life situations, the sequence of events does not
evolve forever, nor has it always existed. In other words, a sequence
of events has a beginning and an end. Because of that, the model fails
at the sequence borders: if today is the 28th of February, tomorrow
will not necessarily be the 29th. Nor, if today is the 1st day of a
month, was yesterday the 0th day of the same month. AR models can
approximate the evolution of the amplitude of audio and speech signals
within short time frames.
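The day-of-month example can be checked numerically: fitting x[n] = a·x[n−1] + b to the days 1…27 recovers a = 1 and b = 1 (NumPy assumed; real AR modeling of audio uses higher orders and zero-mean signals, so this is only the toy case from the text):

```python
import numpy as np

days = np.arange(1, 28)            # the days 1 ... 27
x_prev, x_next = days[:-1], days[1:]
# Least-squares fit of the first-order model x[n] = a * x[n-1] + b.
a, b = np.polyfit(x_prev, x_next, 1)
prediction = a * 14 + b            # predicted day after the 14th
```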
- Inverse filtering:
- It consists of filtering a given signal through the
inverse of its own model. What is the purpose of this? Well, if the
model perfectly represents the signal, the output of the inverse
filtering is null. In reality, there is always some modeling error
involved. Thus, the output of an inverse filter is usually called
modeling residual, modeling error, or excitation. Minimizing the
modeling error is a key point in modeling schemes. Other applications
of the modeling error include monitoring and the detection of spurious
events in time series: a sudden, large modeling error indicates the
presence of outliers in the signal.
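A sketch with a synthetic AR(1) signal (NumPy assumed; the coefficient 0.9 and the injected outlier are invented for illustration). Inverse filtering the signal through its own model recovers the excitation, and an injected spike shows up as a large residual:

```python
import numpy as np

rng = np.random.default_rng(2)
e = rng.normal(0.0, 1.0, 500)          # excitation (white noise)
x = np.zeros(500)
for n in range(1, 500):                # AR(1) signal: x[n] = 0.9*x[n-1] + e[n]
    x[n] = 0.9 * x[n - 1] + e[n]

residual = x[1:] - 0.9 * x[:-1]        # inverse filter: recovers the excitation

x_out = x.copy()
x_out[250] += 10.0                     # inject an outlier
residual_out = x_out[1:] - 0.9 * x_out[:-1]
```

With a perfect model the residual equals the excitation; the outlier produces a residual far larger than anything in the clean case.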
- Additive White Gaussian Noise (AWGN):
- Gaussian noise refers to a sequence whose amplitude
values follow a Gaussian distribution. The term white is borrowed from
optics: it means that the signal contains all frequencies (within a
given spectral range) and all frequency components have equal energy.
In most de-noising problems and formulations the corrupting noise is
assumed to be added to a clean version of the signal. The term additive
refers to this assumption.
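A quick numerical check of the two properties (NumPy assumed): the amplitudes are Gaussian with the chosen variance, and successive samples are uncorrelated, which is the time-domain face of a flat ("white") spectrum:

```python
import numpy as np

rng = np.random.default_rng(3)
noise = rng.normal(0.0, 1.0, 100_000)      # Gaussian amplitude distribution
power = noise.var()                         # total power, close to 1
lag1 = np.mean(noise[:-1] * noise[1:])      # lag-1 correlation, close to 0
```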
- pitch-class (pc):
- All pitches that are one or more octaves apart are
considered equivalent. All C's form one pitch-class, all C sharps (or D
flats) form another, etc. Thus there are twelve pitch-classes. Since
traditional notation always fixes some specific pitch, numeric
notation is used instead. Thus 0 denotes the pitch-class C, 1 denotes
the pitch-class C sharp, etc. Pitch-classes form a cycle, like a clock
face: the neighbours of 0 are 11 and 1.
- interval class:
- As pitch-classes are equivalence classes of pitches,
there is no sense in saying that one pc is "higher" than another. If it
takes n steps clockwise to get from a to b, it takes 12 - n steps
counterclockwise. For example, it takes 11 steps clockwise to get from
6 to 5, but only 1 counterclockwise. The interval class is the smaller
of these two distances.
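Both notions reduce to modular arithmetic; a minimal sketch (plain Python; MIDI numbering, where middle C is 60, is assumed for the pitch-to-pc mapping):

```python
def pitch_class(midi_note):
    # Octave-equivalent class 0..11; with MIDI numbering, 60 (middle C) -> 0.
    return midi_note % 12

def interval_class(a, b):
    # The smaller of the clockwise and counterclockwise distances
    # between two pitch-classes on the clock face.
    n = (b - a) % 12
    return min(n, 12 - n)
```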
- An ordering of a set. For example 5 4 0 9 7 2 8 1 3 6
10 11 is an ordering of the twelve smallest non-negative integers.
- twelve-tone row (row):
- An ordering of the twelve pitch-classes,
conceptually similar to a permutation of twelve integers. A row is
abstract: to play a row, its pitch-classes must be turned into actual
pitches, which cannot be done without adding much information.
- row operation:
- A transformation that maps a row to another row, for
example transposition, inversion, and retrogression.
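The three classic operations are easy to state on numeric rows (plain Python; the example row is the ordering quoted earlier in this glossary):

```python
def transpose(row, n):
    # Shift every pitch-class up by n semitones (mod 12).
    return [(p + n) % 12 for p in row]

def invert(row):
    # Mirror every pitch-class around 0 (mod 12).
    return [(-p) % 12 for p in row]

def retrograde(row):
    # Reverse the order of the row.
    return list(reversed(row))

row = [5, 4, 0, 9, 7, 2, 8, 1, 3, 6, 10, 11]
```

Transposition permutes the pitch-classes, and inversion and retrogression are their own inverses, which is why combining the 12 transpositions with inversion and retrogression yields the 48 rows of a row class.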
- row class:
- A set of rows that are considered equivalent.
Typically row class consists of 48 rows that are related to each other
by transposition, inversion, retrogression and their combinations.
- Usually, when a row operation is applied to a row,
the row is transformed into some other row. However, some rows have the
property that applying a certain row operation leaves the row unchanged.
- The core of personality, a process-like subject which
experiences, and constructs itself.
- A conception, image or experience of the self (What
am I like?)
- A relatively stable conception of the self and the
matters related to it. It includes, for example, self-identity,
gender identity, professional identity, and cultural identity. (Who am
I?)
- Self-regulation means all the psychological processes
that aim for the maintenance of psychological balance and coherent
experience of the self. This psychological balance can be disrupted by
internal factors like inner conflicts or hunger, or external factors
like troubled relationships or environmental disturbances. The psyche
acts to eliminate this discomfort by trying to make the situation, the
experience, and the emotions understandable and controllable for the
self. Thus, the psyche is a homeostatic process which acts to satisfy
its own goals by means of self-regulation.
- Coping and Emotion regulation:
- Coping is a central part of self-regulation. It
includes different cognitive and behavioural attempts to manage a
troubled person-environment relationship. Coping can be divided into
emotion-focused coping (which aims to regulate the experience) and
problem-focused coping (which aims to alter the situation). Emotion
regulation has been regarded either as a synonym for coping in general
or for emotion-focused coping alone. It means not only preventing
negative feelings but also promoting positive ones.
- Object relations:
- A self-object is an extension of the self. Though
physically separate from the self, it is experienced as part of the
self, something that belongs to me and is related to me. Self-objects
can be for example persons, things, or ideologies. The self is partly
constructed on these self-objects, like close relationships with
significant others. The most important self-objects of a child are the
parents. In adolescence, the individual has to separate from parents
and readjust his/her object relations.
- Psychosocial development:
- Developmental psychology studies different
sectors of psychological development: physical-motor, cognitive,
and psychosocial. Psychosocial development concerns sociality,
emotionality, personality, morality, and psychosexuality. Central
challenges of psychosocial development in adolescence are linked with
awakening sexuality, the reconstruction of identity, and the changes in
object relations.
- Dyslexia:
- Dyslexia is recognised as a specific learning
disability of neurological origin that does not imply low intelligence
or poor educational potential, and which is independent of race and
social background. Dyslexia has a genetic cause, but in some cases
birth difficulties may play an important role. The following cognitive
characteristics of dyslexia can be mentioned here: a marked
inefficiency in the working or short-term memory system, inadequate
phonological processing abilities, difficulties with motor skills or
co-ordination, and a range of problems connected with visual
processing. Some educational effects of dyslexia are:
- early difficulties in acquiring phonic skills
- a high proportion of errors in oral reading
- difficulty in extracting the sense from written material
- slow reading speed
- inaccurate reading, omission of words
- Mismatch negativity (MMN):
- The mismatch negativity (MMN) is a component of the
auditory event-related potential (ERP) which is elicited
task-independently by an infrequent change in a repetitive sound. The
MMN is evoked by an infrequently presented stimulus ("deviant")
differing from the frequently occurring stimuli ("standards") in one or
several physical parameters such as duration, intensity, or frequency
(Näätänen, 1992). In addition, it is generated by a
change in spectrally complex stimuli such as phonemes, synthesised
instrumental tones, or the spectral component of tone timbre. The
MMN data imply the existence of a sensory-memory trace in which the
features of the frequently occurring standard stimuli are represented.
- Dichotic listening tests:
- DL tests reveal how the left vs. right hemisphere
auditory cortices contribute to behavioral speech/music sound
discrimination. Dichotic listening tasks affect the two ears
differently, as when one stimulus is conveyed to the left ear while a
different stimulus is simultaneously transmitted to the right. The
subject is free to report the stimulus heard, which, with CV syllables,
is in the majority of cases the sound presented to the right ear. This
data pattern is taken as a behavioral measure of left temporal lobe
processing superiority for phonological stimuli.
- Musicality:
- In this study musicality is defined by means of the
tests developed by Karma and Seashore. The test developed by Seashore
(1967) considers musicality as an entity emerging from relatively
independent subskills organized along the different sound parameters
and cognitive demands (e.g., pitch-discrimination accuracy/temporal
accuracy, vs. memory for pitch/rhythm). In contrast, the test developed
by Karma (1993) considers musicality as a more general ability to
structure sound information cognitively into meaningful chunks. Both
views of musicality have been shown to have their neural counterparts.
- WAIS / WISC:
- Generally used psychological tests for intelligence:
WISC-III (children) and WAIS-R (adults).
- body response:
- Loosely speaking, the part of a musical instrument's
sound production after the initial excitation (e.g., the strings): the
instrument body acting as a resonator and radiator of sound.
- room response:
- The system that describes how the sound generated in
one location in a space is perceived in another location.
- transfer function:
- A mathematical description of the mechanism of a
system: how input signals are mapped to output signals and how
different frequency components are modified by the system.
- impulse response:
- the response of the system to an impulse excitation
(it didn't get much clearer)
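For a linear time-invariant system the impulse response tells the whole story: the output for any input is the convolution of that input with the impulse response. A minimal sketch (NumPy assumed; the three-tap response is invented):

```python
import numpy as np

h = np.array([1.0, 0.5, 0.25])        # hypothetical impulse response
impulse = np.array([1.0, 0.0, 0.0])

response = np.convolve(impulse, h)    # response to an impulse: h itself
x = np.array([1.0, 2.0])
y = np.convolve(x, h)                 # response to any input: x convolved with h
```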
- time-frequency representations and visualizations:
- The representation of the response as a discrete-time
signal, the energy distribution of different frequency components as a
magnitude spectrum, and various ways to display how the spectrum
evolves in time.
- Gesture:
- 1. A movement of the body or any part of it in a way
that conveys some intention/expresses or emphasizes an idea or emotion.
- 2. Motion of the body that contains information.
- To place sets and/or elements in correspondence,
e.g., to map an input variable onto an output variable.
- Virtual widget:
- Virtual model and representation of a functional and
- 1. Being able to effectively and accurately control
(predictable) elements of a process to achieve the intended
outcome/match the outcome with one's inner vision.
- 2. Effectively conveying meaning or emotion.
- User interface:
- 1. Means and/or devices to control/interact with a
system. 2. Means to input information to a system and receive feedback.
- Electronic musical instruments:
- Musical instruments whose sound production is based
on electricity. Nowadays mostly digital and accompanied with digital
- Virtual reality:
- An artificial environment created with computer
hardware and software and presented to the user in such a way that it
appears and feels like a real environment.
- System delay:
- 1. The time that it takes for a system to process input
into feedback. 2. In virtual reality systems the time to get the
information from the trackers and other input devices to the main
application to be processed and finally alter the state of the system.
3. System reaction time to an input.
- Prediction algorithms:
- Mathematical means of estimating the state of a
system at time T + Δt already at time T.
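The simplest such estimate is linear extrapolation from the current position and velocity (a sketch; real VR systems often use Kalman-type filters instead):

```python
def linear_predict(position, velocity, dt):
    # Estimate of the state at time T + dt from the state at time T,
    # assuming the velocity stays constant over the interval.
    return position + velocity * dt
```

For example, a tracker at position 10.0 moving at 2.0 units per second is predicted to be at 11.0 half a second later.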
- Haptic feedback:
- Using the sense of touch to convey feedback
information about an interaction. E.g., one senses the resisting
force of a piano key when it is pressed; the key also physically limits
the vertical movement of the finger.
- Involuntary attention:
- Attention switching to irrelevant stimuli. It is
normally caused by stimuli that are deviant from the target
stimulation. Involuntary attention is part of the orienting reflex. The
P3a component of ERPs is the electrophysiological correlate of an
involuntary attention shift.
- Working memory:
- Memory for intermediate results that must be held
during thinking. In the Baddeley and Hitch model, working memory
consists of three parts: the central executive, the phonological loop,
and the visuospatial sketchpad. The concept of a tripartite working
memory replaced the earlier model of a single unitary short-term
memory.
- Exogenous and endogenous ERP components:
- Exogenous components are those that are modified
by changes in stimulation (e.g., loudness, frequency), while
endogenous components are modified by the internal state of the
subject (e.g., attention). Since each ERP peak may be composed of many
components (generators), it can be partly exogenous and partly
endogenous. The earlier ERP peaks (P1, N1) are believed to be
more exogenous than the later ones (P3, N400).
- fMRI acoustic noise:
- The loud, unpleasant sound produced by the switching
of the coils of the magnet that are responsible for creating gradients
in the magnetic field. Since the gradients are essential for the
localization of the activation, it is impossible to avoid acoustic
noise during MR imaging.
- Odd-ball paradigm:
- the presentation of a rare deviant auditory stimulus
within a continuous sequence of repeating stimuli (standards).
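A sketch of how such a sequence might be generated (plain Python; the 10 % deviant probability is an arbitrary but typical choice, and real experiments add constraints such as a minimum number of standards between deviants):

```python
import random

random.seed(0)
sequence = ["standard"] * 90 + ["deviant"] * 10   # 10 % deviants
random.shuffle(sequence)                          # random presentation order
```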
- Roving-standard paradigm:
- A modification of the odd-ball paradigm: short trains
of stimuli are presented, with the stimulus changing from train to
train. The first stimulus after a change is a deviant, while the
stimuli before the change serve as standards.
- Matching-to-sample paradigm:
- A test of working memory in which the subject has
to compare a cue and a probe presented after a delay and indicate
whether they coincide. The term is used mostly in non-human
primate studies. In humans, the same-different procedure can be
considered a variety of the matching-to-sample paradigm.
- What and where pathways:
- The segregation of spatial and non-spatial
information in the brain. In vision, information about location
proceeds in a dorsal direction from the visual cortex, while
information about the features of the object (e.g., color) spreads in
a ventral direction. There are data supporting a similar division of
spatial and non-spatial information in audition, but this is not as
firmly established.
- A method of magnetoencephalographic (MEG) data
analysis which provides automatic, user-independent source estimation.
- Sound synthesis:
- In this context, sound synthesis means that a sound
signal is produced with a computer, in contrast to traditional
instruments. Sound synthesis is a general term that does not specify
how the sound signal is produced. Practically any method or algorithm
that can produce a sound can be called a sound synthesis method. It
does not have to be physical at all, but for practical purposes it is
useful that the method is consistent with the desired outcome. There
exist many methods/algorithms that can be used to synthesize sound,
such as sampling, FM synthesis, and physical modeling. FM synthesis is
a method with basically two oscillators that control each other. It
enables musical tones to be produced very efficiently and was
introduced in synthesizers in the 1980s, the most popular being the
YAMAHA DX-7.
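The two-oscillator idea can be written in a few lines (NumPy assumed; the carrier and modulator frequencies and the modulation index are arbitrary choices):

```python
import numpy as np

fs = 44100                                   # sample rate (Hz)
t = np.arange(fs) / fs                       # one second of time
fc, fm, index = 440.0, 220.0, 2.0            # carrier, modulator, modulation index
# Simple FM: the modulator oscillator drives the phase of the carrier.
tone = np.sin(2 * np.pi * fc * t + index * np.sin(2 * np.pi * fm * t))
```

Varying the modulation index changes the richness of the spectrum, which is why a single FM pair can produce a wide range of timbres so cheaply.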
- Physical modeling of musical instruments:
- Physical modeling is a field of research that aims to
mimic the behavior of a musical instrument, i.e., tries to imitate the
way the instrument produces its sound. To understand the behavior of
the instrument, it is typically divided into functional parts on an
abstract level, i.e., the instrument itself is not taken apart. For
example, in the case of the violin, the string, the hollow body, and
the bow are first treated as separate units. Once they are understood
separately, the interactions and connections between them are
investigated. Basically, a mathematical model of the behavior is
produced, and this mathematical model is implemented and approximated
with a computer program. In this way physical modeling can be
understood as one method of sound synthesis (see I).
- Commuted synthesis:
- Commuted synthesis is a physical modeling method that
evolved around synthesis of plucked string instruments in 1993 (Julius
Smith and Matti Karjalainen). After the initial idea it has been
expanded so that it can be used for, e.g., synthesizing the violin. The
term commuted refers to the fact that the order of functional parts in
the model is changed. In the case of the acoustic guitar this means
that the string is basically plucked with the body, including the
string-finger interaction. In more technical terms, the string model is
excited with a sample of the instrument body that includes the
string-finger interaction. This might sound odd, but it works and is
theoretically also justifiable.
- Digital waveguide:
- Digital waveguides are a digital signal processing
method for modeling traveling waves inside a defined structure. They
provide models of objects such as strings, horns, and plates; the
parameters of a digital waveguide specify what kind of object is
modeled.
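A minimal string sketch in this spirit is the Karplus-Strong algorithm, which can be read as a heavily simplified digital waveguide: a delay line holds the traveling wave, and a two-point average acts as the loss filter (NumPy assumed; all the constants are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
fs, f0 = 44100, 440.0
delay = int(fs / f0)                    # delay-line length sets the pitch
line = rng.uniform(-1.0, 1.0, delay)    # noise burst as the "pluck"
out = np.zeros(fs // 10)                # 0.1 s of output
idx = 0
for n in range(out.size):
    out[n] = line[idx]
    # Averaging two successive samples low-pass filters the loop,
    # so high frequencies die out faster, as on a real string.
    line[idx] = 0.5 * (line[idx] + line[(idx + 1) % delay])
    idx = (idx + 1) % delay
```

The output starts as noise and decays toward a damped tone near f0 as the loop filter removes energy on every pass.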
- Acoustic measurements:
- In acoustic measurements the purpose is to collect
informative data about the sound source under investigation, e.g., a
guitar. One can record different variables, such as changes in air
pressure (which humans hear), or the velocity or acceleration of some
part of the object. A multitude of different sensors is used to obtain
these variables. For example, changes in air pressure are typically
recorded with a microphone, and acceleration with an accelerometer.
Measurements can be conducted in several different places: in anechoic
conditions, i.e., very damped conditions; in an echoic chamber, which
is the exact opposite of the previous; in a normal room; in a studio;
out in the field, etc. The measurement place depends on the purpose of
the measurement.
- Model parameter estimation:
- Once a model for an instrument has been created, e.g.,
a general physical model of a string, values for how the model should
behave have to be specified. In other words, values for the parameters
that dictate how the model behaves have to be given. It is easy to
understand that two strings with different lengths sound different;
hence, the model of the string has different parameter values for
different strings. Moreover, when a new string is put on an instrument,
it sounds brighter; in the same manner, the parameters of the string
model should be changed. With incorrect parameter values a model
produces a tone that might sound odd. On the other hand, this could be
interesting in some musical experiments.
- Analysis of measurement data:
- First, data are gathered by acoustic measurements.
Then properties of interest are extracted from the signal; for example,
the fundamental frequency of a tone is estimated.
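A sketch of one such analysis step: estimating the fundamental frequency of a synthetic tone by locating the largest peak of its magnitude spectrum (NumPy assumed; real measurement signals would need windowing and more robust peak picking):

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs                       # one second of signal
tone = np.sin(2 * np.pi * 220.0 * t)         # synthetic 220 Hz "measurement"
spectrum = np.abs(np.fft.rfft(tone))         # magnitude spectrum
f0_estimate = np.argmax(spectrum) * fs / tone.size
```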
- Inharmonic sound:
- Sounds produced by musical instruments do not contain
one single frequency, but many frequency components or vibrational
modes. In typical melodic instruments, such as string instruments, the
vibrational modes are multiples of the fundamental frequency f_0, i.e.,
2*f_0, 3*f_0, and so on. Because of this ratio, the vibrational modes
f_0, 2*f_0, ... are called harmonics. In practice, because of string
stiffness, the harmonics are not exactly harmonic, i.e., the second
harmonic might not be exactly 2*f_0 but, e.g., 2.01*f_0. Sounds whose
harmonic structure behaves in this manner are said to be inharmonic.
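For stiff strings the deviation is often summarized by the formula f_n = n·f_0·sqrt(1 + B·n²), where B is a small inharmonicity coefficient; a sketch (plain Python; the value of B used below is invented):

```python
import math

def partial_frequency(n, f0, B):
    # Stiff-string model: the n-th partial is raised above n*f0 by the
    # factor sqrt(1 + B*n^2), where B is the inharmonicity coefficient.
    return n * f0 * math.sqrt(1.0 + B * n * n)
```

With B = 0 the partials are exact harmonics; with a small positive B the upper partials are progressively sharp, as on a piano string.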
- Frequency dependency:
- It is understandable that things change over time and
when things do the process is said to be time dependent. In addition,
practically all processes in musical acoustics are frequency dependent.
This means that as the frequency changes the behavior changes as well,
at least slightly. For example when a singer sings a low note it sounds
different than when he or she sings a high note. Another example could
be a plucked string, where the high-frequency content dies out much
faster than the low-frequency content.
- Body response:
- Most string instruments have a resonant body to which
the strings are attached. The body of the instrument amplifies the
string vibrations and also colors the response. Coloring here means
that different frequencies are amplified or attenuated different
amounts, i.e., a body response is frequency dependent. A body response
can be measured in an anechoic chamber, e.g., by hitting the body of a
guitar with an impulse hammer and recording the radiation. The strings
should be damped or removed for this.