
HOME |
Lineups in the discographical
relational database
by
Jørgen
Mathiasen
October
2006
Questions
about lineups in relational databases are in some
cases discussed in the literature on relational
databases. Actually it should be discussed more
often because the number of music databases on
computers is probably quite high. Instead the
address database is very often used as a case and
the problem discussed in this article is actually
solved with the help of a technique used in
address relational databases, but so far I have
not seen this in connection with discographical
relational databases
The problem is:
How can information on lineups be stored in a
relational database in such a way that the rules
of normalization with regard to consistence,
reduction of redundancy as well as the traditions
of the jazz discography are considered at the
same time?
This article specifies various questions
connected to this problem and a couple of
different solutions in a relational database. The
article takes some experience in relational
database and jazz discographies for granted. For
instance basic questions on relations, normal
forms and what kind of information we can expect
to find in a discography will not be discussed.
The solution of the central problem is aimed at
jazz discographies, but it should sufficiently
based on rules, so that it can be applied on
other music.
Model case
The article will gradually go deeper into the
problems, but for a beginning we need a model
case. I have chosen a classical jazz session with
Louis Armstrong:
Armstrong,
Louis
Louis Armstrong
And His Hot Five: Louis
Armstrong (tp, vo-1) Fred Robinson (tb)
Jimmy Strong (cl) Earl Hines (p) Mancy
Carr (bj) Zutty Singleton (d)
Chicago 1928/06/28
West End Blues (1)
Muggles
Session ID: 999
|
This article concentrates on the
lineup field, in other words the following
information:
| Louis Armstrong (tp, vo-1) Fred
Robinson (tb) Jimmy Strong (cl) Earl
Hines (p) Mancy Carr (bj) Zutty Singleton
(d) |
Since the trumpet and the trombone players were
among the most prominent musicians in early jazz
they were already in the early discographies
mentioned in the beginning of the lineup field.
Apart from that the information is sorted in
instrument families and it became a principle
when the standards of the jazz discography were
established.
For a period three groups were sufficient:
The brass, the reed and the rhythm group. To this
was applied a subdivision of the two first, but
gradually the range of instruments in jazz grew
strongly and in a jazz discography today we can
easily find more than a hundred different
instruments. Jazz discographies have for a long
time maintained the approximate instrument family
structure of the lineup field and so ignored the
fact that some members of the lineup field are
more prominent than others. It is actually so in
the model case, which has participation not only
by Armstrong but also by Earl Hines.
The first analysis of a solution
A text field with the lineup information of
the model case in a session table relinquishes
the use of any of the techniques designed for
relational databases as a means to efficient use
of space, input of consistent data, and
limitation of redundancy, in short the things we
expect from a normalized relational database. A
text field is also the most difficult way to make
the input, and therefore we have a reason to make
an analysis of the content of the lineup field
and try to transform the results into a better
database structure. Lets look once again on
the information in the model case:
| Louis Armstrong |
(tp, vo |
-1) |
It is now divided into the properties name,
sound and annotation. (It is the
tradition of the paper discography to put
parentheses around the two latter, but we will
ignore them here.) After the division of the
original alphanumeric string in properties a new
table begins to emerge. As a part of the
normalization the text field with the lineup
information of a session table should be
transferred into a new table, the lineup table,
but first the records of the session table should
have an ID number (if they dont have it
already), which serves as relational data. The
structure of the two tables and their relation
can temporarily be described in this model:
| Session.table |
1:M
|
Lineup table |
| ID |
|
ID (session) |
| . |
|
Name |
| |
|
Instrument / sound |
| |
|
Annotations |
The model does not describe what kind of
fields a session table needs, but stresses just
what is necessary (the ID serial number) in order
for the relation to work. Solo recordings (one
instrument) only constitute a minor group of all
recordings and therefore the relation between the
two tables is of course one to many.
Following the principle of instrument
families in discographies the field of instrument
will have to be a part of the key in order to
establish the desired structure. A very pragmatic
way is in a separate table to give all
instruments a number. But not only the instrument
field borrows its data from another table. So
does the name field and therefore we need a table
for musicians or names too. These tables should
be established before we make the lineup table.
The instrument table
According to the tradition of the jazz
discography and the requirements of the database
we need at least three fields in this table:
Instrument table
|
| Instrument name |
| Instrument abbreviation |
| ID |
The ID number of the instrument
determines the order of instruments and names of
the lineup records connected to a session.
Abbreviations of instruments
Instruments are abbreviated but there is some
heterogeneity in the existing discographies. Vocal
is for instance often abbreviated vcl and
therefore this abbreviation cannot be used for
the (violin) cello. Vo can be
recommend for vocal instead. Names like ensemble
or choir are also a part of this table and
as mentioned before we can easily find more than
a hundred different instruments names in a jazz
discography. No matter what the abbreviation is
the relational database should be used to rule
out repetitions. In general the established
abbreviations can be recommended as they give the
shortest abbreviations to the most used
instrument. For some instruments it is necessary
to find new meaningful abbreviations and for
others like oboe or harp the name is the
abbreviation too.
It is practical to have a field with the
full name of the instrument in order to avoid
misunderstandings. A jazz discography is usually
written in English, and if necessary another
field can be added to the table in which the name
of the instrument is listed in the native
language.
Originally the abbreviations were made
because the typing became easier but in the
(normalized) relational database the name is just
in one record in one table no matter the
abbreviation is used (seen) in many records. If
the discography is going to be used as a database
exclusively that is without printouts -
the instrument abbreviations become more or less
superfluous. But printouts are useful also for
proofreading and in any case the use of them can
be limited to the printouts.
The name table
The name table is essential to a
discographical database. The time it takes to
make it is justified because its contents can be
used in a number of tables including the lineup
table. As with the session table we will here
just mention the fields relevant to the central
problem.
Bill Evans is the name of a pianist
who has played with Miles Davis. It is also the
name of a saxophonist, who also has played with
Davis. Since confusion is possible and the
relational database cannot distinguish on the
basis of the family name the latter is not
sufficient. To begin with we need a reference
name and with this we are back in the address
database. Names can be divided into different
fields and for our purpose the name must be in
the order: first name - family name. For a jazz
discography however we must take into
consideration when constructing the name table
that many musicians are known only by
other names than the one on the certificate of
baptism: The pianist William John Evans is only
known as Bill Evans.
To distinguish when inputting we also need
to know for instance about the instrument and for
the database we need an ID number of the record.
With this we have identified the fields in the
name table that we shall later need for the
lineup table:
Name table
|
| Reference name |
| ID |
The lineup table
After this we are ready to design the lineup
table. The essence of this table is the
technique used in the address database with
regard to post area names and numbers.
The Post Office has in many places divided
the national post delivery area into smaller
units, which are supplied with a number and a
name, for instance 12619 Berlin. In an
address database this information is divided into
two fields: the post area number and the
post area name. The latter is a function of
the former and by input you just have to write
the number, then the database computerises
the number and fills out the post area name
field. (By the input however, you are guided by
names and must not remember the numbers. In the
discographical relational database it is the name
of the participant and the name of the
instrument.) The post area name is stored in one
record in one table regardless it can be seen in
many records. Redundancy is avoided by this
construction as well as incorrect spellings of
the post area name. The post area number is of
course validated so that at least invalid post
area numbers are avoided. The functional
depending field of post area name is in some
database applications known as a computed field,
in others as a validated field.
A way to solve the central problem
The computed field can be used in the lineup
table in connection with the name field as well
as the instrument field. If Armstrong for
instance has serial number 75 it is the content
of the name ID field, while the database fills
out computerises - the name field and
answers with Louis Armstrong or
Louis Satchmo Armstrong that is with
the reference name of the record in the name
table. In this structure one correction in the
name table will correct all relevant fields of
the lineup table. The same principles are applied
for the instrument.
Coming back to the model case; if trumpet
has instrument ID number 3 the relevant record in
the lineup table will have the content:
| Lineup table |
| |
Field.input |
Shows |
Field type |
Part.of.key |
| ID Session |
|
999 |
Numerical |
Yes |
| ID Instrument |
3 |
3 |
Numerical |
Yes |
| |
|
Tp |
Computed |
|
| ID Musician |
75 |
75 |
Numerical |
Yes |
| Name |
|
Louis Armstrong |
Computed |
|
Notice about the input: The
session ID is transferred from the session table
and it is only necessary to input two numbers to
make the new record.
Records should be created for each
musician of the session (if it is a jazz
discography). By printouts or other outputs the
key, which contains the session ID and the
instrument ID, takes care of the order of the
session participants. They will appear in the
order determined in the instrument table.
The sound function in discographies
For the many jazz recordings with trios,
quartets and quintets the lineup table above will
do, but we can go deeper into the discographical
account of a session, and jazz discographies
often do that. In the model case Armstrong
appeared as a trumpeter and as a singer on West
End Blues. The annotation is still not in the
lineup table and it takes a couple of new
features before we can input this information.
These features will be discussed below.
While Armstrong in the model case appears
in two (sound) functions the reed players of a
big band frequently appear with three or four
functions by a recording. Since we are not aiming
at a paper discography, creating more records in
the session table can solve some of these things,
and to the benefit of the reader through
this the kind of annotation structure of the
paper discography, which can make it a nightmare
to read, can be avoided. However, this is not
relevant for the model case. We can create yet
another field for the sound function or three
more taking the reeds of a big band into
consideration but thereby we arrive at the
problem records with empty fields since we
only need these extra fields in some cases. This
problem is principally solved by the creation of
yet another table, which allows an infinite
number (n) of instruments or sound
functions and still only results in records where
they are needed. The database structure must be
extended with a table for the sound function.
| Session |
1:M
|
Lineup |
1:M
|
Sound.Function |
The key to the sound function table consists of
session ID, musician ID and with the instrument
we have these fields:
Sound function
table
|
| |
Field.input |
Shows |
Field
type |
Part.of.key |
| ID
Session |
|
999 |
Numerical |
Yes |
| ID
Instrument |
3 |
3 |
Numerical |
Yes |
| ID
Musician |
75 |
75 |
Numerical |
Yes |
For each new sound there is a new
record. Meanwhile the fields of the lineup table
are reduced to:
Lineup table
|
| |
Field.input |
Shows |
Field
type |
Part.of.key |
| ID
Session |
|
999 |
Numerical |
Yes |
| ID
Musician |
75 |
75 |
Numerical |
Yes |
| Name |
|
Louis
Armstrong |
Computed |
|
Criticism of the solution
The solution is in accordance with the rules
of a database but it implies a couple of
important disadvantages. Input and editing is now
a heavy thing and also the original key from the
session table to the table, which had an
instrument ID responsible for the order of
instruments, is lost. The order can be
established only through the random musician
serial number, which again leads to new orders of
instruments for each session, and no jazz
discographer would accept that. The order will
have to be re-established through a numerical
field.
A second solution could be to maintain one
instrument field in the table and put all
succeeding sound functions [2
n] into the
sound function table, but it will not make input
or editing easier.
A third solution is to accept records with
empty fields and compromise on the rules of the
relational database.
Accepting a table with empty fields
First we will have to determine how many
fields we will accept. It would be pragmatic to
use big bands as a norm, that is four fields.
That would cover most jazz recordings. It would
also result in quite a number of empty fields.
For musicians like for instance Raahsan Roland
Kirk with a great variety of sounds we will still
need the note field at the bottom of the session.
Extended row of fields for sound function and
its annotations
By extending the number of fields of the
sound function and adding new fields for order
and annotation to the table we will be able to
put the information of the model case into the
right fields in the table. In the table below the
number of sound functions have been reduced to
two, but it is just for the purpose. The table
now has this structure:
Lineup table
|
| |
Field
input |
Shows |
Field
type |
Part.of.key |
| ID
Session |
|
999 |
Numerical |
Yes |
| ID
Musician |
75 |
75 |
Numerical |
Yes |
| Name |
|
Louis
Armstrong |
Computed |
|
| ID
Instrument |
3 |
3 |
Numerical |
Yes |
| |
|
Tp |
Computed |
|
| Order
|
|
|
Numerical |
|
| Annotation |
|
|
Alphanumerical |
|
| ID
Instrument |
94 |
94 |
Numerical |
|
| |
|
Vo |
Computed |
|
| Order |
|
|
Numerical |
|
| Annotation |
-1 |
-1 |
Alphanumerical |
|
For our model case we dont
need to input anything in the order field but in
connection with big bands this field will make it
possible to arrange the order of for instance the
members of the brass and the reed section.
As mentioned above we will get some empty fields
because many records only need the first
instrument field but there will also be empty
fields in the completely normalized database
albeit less.
Sessions with hundreds of participants
At last it should be mentioned that some
recordings of music including such music that is
categorized as jazz occasionally have been
recorded by up to thousand musicians (through the
use of many and big ensembles). In such cases the
traditional jazz discography and its
author - gets into troubles. A solution
which may sound as something of a paradox
is to keep the old text field and in such extreme
cases only to write the relevant into the
text field, if necessary in combination with
selected names stored in the table.
|
|