Discography
 

HOME

Lineups in the discographical relational database

by

Jørgen Mathiasen

October 2006

Questions about lineups in relational databases are in some cases discussed in the literature on relational databases. Actually it should be discussed more often because the number of music databases on computers is probably quite high. Instead the address database is very often used as a case and the problem discussed in this article is actually solved with the help of a technique used in address relational databases, but so far I have not seen this in connection with discographical relational databases

The problem is:

How can information on lineups be stored in a relational database in such a way that the rules of normalization with regard to consistence, reduction of redundancy as well as the traditions of the jazz discography are considered at the same time?

This article specifies various questions connected to this problem and a couple of different solutions in a relational database. The article takes some experience in relational database and jazz discographies for granted. For instance basic questions on relations, normal forms and what kind of information we can expect to find in a discography will not be discussed. The solution of the central problem is aimed at jazz discographies, but it should sufficiently based on rules, so that it can be applied on other music.


Model case
The article will gradually go deeper into the problems, but for a beginning we need a model case. I have chosen a classical jazz session with Louis Armstrong:


Armstrong, Louis

Louis Armstrong And His Hot Five: Louis Armstrong (tp, vo-1) Fred Robinson (tb) Jimmy Strong (cl) Earl Hines (p) Mancy Carr (bj) Zutty Singleton (d)

Chicago 1928/06/28

            West End Blues (1)
            Muggles

Session ID: 999



This article concentrates on the lineup field, in other words the following information:

Louis Armstrong (tp, vo-1) Fred Robinson (tb) Jimmy Strong (cl) Earl Hines (p) Mancy Carr (bj) Zutty Singleton (d)



Since the trumpet and the trombone players were among the most prominent musicians in early jazz they were already in the early discographies mentioned in the beginning of the lineup field. Apart from that the information is sorted in instrument families and it became a principle when the standards of the jazz discography were established.
  For a period three groups were sufficient: The brass, the reed and the rhythm group. To this was applied a subdivision of the two first, but gradually the range of instruments in jazz grew strongly and in a jazz discography today we can easily find more than a hundred different instruments. Jazz discographies have for a long time maintained the approximate instrument family structure of the lineup field and so ignored the fact that some members of the lineup field are more prominent than others. It is actually so in the model case, which has participation not only by Armstrong but also by Earl Hines.


The first analysis of a solution
A text field with the lineup information of the model case in a session table relinquishes the use of any of the techniques designed for relational databases as a means to efficient use of space, input of consistent data, and limitation of redundancy, in short the things we expect from a normalized relational database. A text field is also the most difficult way to make the input, and therefore we have a reason to make an analysis of the content of the lineup field and try to transform the results into a better database structure. Let’s look once again on the information in the model case:

Louis Armstrong (tp, vo -1)



It is now divided into the properties name, sound and annotation. (It is the tradition of the paper discography to put parentheses around the two latter, but we will ignore them here.) After the division of the original alphanumeric string in properties a new table begins to emerge. As a part of the normalization the text field with the lineup information of a session table should be transferred into a new table, the lineup table, but first the records of the session table should have an ID number (if they don’t have it already), which serves as relational data. The structure of the two tables and their relation can temporarily be described in this model:

Session.table

1:M

Lineup table
ID   ID (session)
.   Name
    Instrument / sound
    Annotations



The model does not describe what kind of fields a session table needs, but stresses just what is necessary (the ID serial number) in order for the relation to work. Solo recordings (one instrument) only constitute a minor group of all recordings and therefore the relation between the two tables is of course one to many.
  Following the principle of instrument families in discographies the field of instrument will have to be a part of the key in order to establish the desired structure. A very pragmatic way is in a separate table to give all instruments a number. But not only the instrument field borrows its data from another table. So does the name field and therefore we need a table for musicians or names too. These tables should be established before we make the lineup table.


The instrument table
According to the tradition of the jazz discography and the requirements of the database we need at least three fields in this table:

Instrument table

Instrument name
Instrument abbreviation
ID



The ID number of the instrument determines the order of instruments and names of the lineup records connected to a session.


Abbreviations of instruments
Instruments are abbreviated but there is some heterogeneity in the existing discographies. Vocal is for instance often abbreviated vcl and therefore this abbreviation cannot be used for the (violin) cello. Vo can be recommend for vocal instead. Names like ensemble or choir are also a part of this table and as mentioned before we can easily find more than a hundred different instruments names in a jazz discography. No matter what the abbreviation is the relational database should be used to rule out repetitions. In general the established abbreviations can be recommended as they give the shortest abbreviations to the most used instrument. For some instruments it is necessary to find new meaningful abbreviations and for others like oboe or harp the name is the abbreviation too.
  It is practical to have a field with the full name of the instrument in order to avoid misunderstandings. A jazz discography is usually written in English, and if necessary another field can be added to the table in which the name of the instrument is listed in the native language.
  Originally the abbreviations were made because the typing became easier but in the (normalized) relational database the name is just in one record in one table no matter the abbreviation is used (seen) in many records. If the discography is going to be used as a database exclusively – that is without printouts - the instrument abbreviations become more or less superfluous. But printouts are useful also for proofreading and in any case the use of them can be limited to the printouts.


The name table
The name table is essential to a discographical database. The time it takes to make it is justified because its contents can be used in a number of tables including the lineup table. As with the session table we will here just mention the fields relevant to the central problem.
  Bill Evans is the name of a pianist who has played with Miles Davis. It is also the name of a saxophonist, who also has played with Davis. Since confusion is possible and the relational database cannot distinguish on the basis of the family name the latter is not sufficient. To begin with we need a reference name and with this we are back in the address database. Names can be divided into different fields and for our purpose the name must be in the order: first name - family name. For a jazz discography however we must take into consideration when constructing the name table that many musicians are known only by other names than the one on the certificate of baptism: The pianist William John Evans is only known as Bill Evans.
  To distinguish when inputting we also need to know for instance about the instrument and for the database we need an ID number of the record. With this we have identified the fields in the name table that we shall later need for the lineup table:

Name table

Reference name
ID




The lineup table
After this we are ready to design the lineup table. The essence of this table is the technique used in the address database with regard to post area names and numbers.
  The Post Office has in many places divided the national post delivery area into smaller units, which are supplied with a number and a name, for instance 12619 Berlin. In an address database this information is divided into two fields: the post area number and the post area name. The latter is a function of the former and by input you just have to write the number, then the database computerises the number and fills out the post area name field. (By the input however, you are guided by names and must not remember the numbers. In the discographical relational database it is the name of the participant and the name of the instrument.) The post area name is stored in one record in one table regardless it can be seen in many records. Redundancy is avoided by this construction as well as incorrect spellings of the post area name. The post area number is of course validated so that at least invalid post area numbers are avoided. The functional depending field of post area name is in some database applications known as a computed field, in others as a validated field.


A way to solve the central problem
The computed field can be used in the lineup table in connection with the name field as well as the instrument field. If Armstrong for instance has serial number 75 it is the content of the name ID field, while the database fills out – computerises - the name field and answers with “Louis Armstrong” or “Louis Satchmo Armstrong” that is with the reference name of the record in the name table. In this structure one correction in the name table will correct all relevant fields of the lineup table. The same principles are applied for the instrument.
  Coming back to the model case; if trumpet has instrument ID number 3 the relevant record in the lineup table will have the content:

Lineup table
  Field.input Shows Field type Part.of.key
ID Session   999 Numerical Yes
ID Instrument 3 3 Numerical Yes
    Tp Computed  
ID Musician 75 75 Numerical Yes
Name   Louis Armstrong Computed  



Notice about the input: The session ID is transferred from the session table and it is only necessary to input two numbers to make the new record.
  Records should be created for each musician of the session (if it is a jazz discography). By printouts or other outputs the key, which contains the session ID and the instrument ID, takes care of the order of the session participants. They will appear in the order determined in the instrument table.


The sound function in discographies
For the many jazz recordings with trios, quartets and quintets the lineup table above will do, but we can go deeper into the discographical account of a session, and jazz discographies often do that. In the model case Armstrong appeared as a trumpeter and as a singer on West End Blues. The annotation is still not in the lineup table and it takes a couple of new features before we can input this information. These features will be discussed below.
  While Armstrong in the model case appears in two (sound) functions the reed players of a big band frequently appear with three or four functions by a recording. Since we are not aiming at a paper discography, creating more records in the session table can solve some of these things, and to the benefit of the reader through this the kind of annotation structure of the paper discography, which can make it a nightmare to read, can be avoided. However, this is not relevant for the model case. We can create yet another field for the sound function or three more taking the reeds of a big band into consideration but thereby we arrive at the problem records with empty fields since we only need these extra fields in some cases. This problem is principally solved by the creation of yet another table, which allows an infinite number (n) of instruments or sound functions and still only results in records where they are needed. The database structure must be extended with a table for the sound function.

Session

1:M

Lineup

1:M

Sound.Function



The key to the sound function table consists of session ID, musician ID and with the instrument we have these fields
:

Sound function table

  Field.input Shows Field type Part.of.key
ID Session   999 Numerical Yes
ID Instrument 3 3 Numerical Yes
ID Musician 75 75 Numerical Yes



For each new sound there is a new record. Meanwhile the fields of the lineup table are reduced to:

Lineup table

  Field.input Shows Field type Part.of.key
ID Session   999 Numerical Yes
ID Musician 75 75 Numerical Yes
Name   Louis Armstrong Computed  




Criticism of the solution
The solution is in accordance with the rules of a database but it implies a couple of important disadvantages. Input and editing is now a heavy thing and also the original key from the session table to the table, which had an instrument ID responsible for the order of instruments, is lost. The order can be established only through the random musician serial number, which again leads to new orders of instruments for each session, and no jazz discographer would accept that. The order will have to be re-established through a numerical field.
  A second solution could be to maintain one instrument field in the table and put all succeeding sound functions [2…n] into the sound function table, but it will not make input or editing easier.
  A third solution is to accept records with empty fields and compromise on the rules of the relational database.


Accepting a table with empty fields
First we will have to determine how many fields we will accept. It would be pragmatic to use big bands as a norm, that is four fields. That would cover most jazz recordings. It would also result in quite a number of empty fields. For musicians like for instance Raahsan Roland Kirk with a great variety of sounds we will still need the note field at the bottom of the session.


Extended row of fields for sound function and its annotations
By extending the number of fields of the sound function and adding new fields for order and annotation to the table we will be able to put the information of the model case into the right fields in the table. In the table below the number of sound functions have been reduced to two, but it is just for the purpose. The table now has this structure:

Lineup table

  Field input Shows Field type Part.of.key
ID Session   999 Numerical Yes
ID Musician 75 75 Numerical Yes
Name   Louis Armstrong Computed  
ID Instrument 3 3 Numerical Yes
    Tp Computed  
Order     Numerical  
Annotation     Alphanumerical  
ID Instrument 94 94 Numerical  
    Vo Computed  
Order     Numerical  
Annotation -1 -1 Alphanumerical  



For our model case we don’t need to input anything in the order field but in connection with big bands this field will make it possible to arrange the order of for instance the members of the brass and the reed section.
As mentioned above we will get some empty fields because many records only need the first instrument field but there will also be empty fields in the completely normalized database albeit less.


Sessions with hundreds of participants
At last it should be mentioned that some recordings of music including such music that is categorized as jazz occasionally have been recorded by up to thousand musicians (through the use of many and big ensembles). In such cases the traditional jazz discography – and its author - gets into troubles. A solution – which may sound as something of a paradox – is to keep the old text field and in such extreme cases only to write the relevant into the text field, if necessary in combination with selected names stored in the table.



 
   



Site Meter