7 Aralık 2012 Cuma

Algorithmic Music

To contact us Click HERE
I recently came across a fascinatingarticle that shows how to produce music using a one-line C program. Granted, the musical result is not in the same league as a Bach fugue, but the fact that a 57 character C program can produce sound patterns with interesting dynamic textural variation is really neat. Particularly so because the formula only involves arithmetic and bitwise operations on whole numbers between 0 and 255: no trig functions, lookup tables, Fourier transforms, or any of the many other modern tricks for sound synthesis.

The original inventor of this technique is a Finnish artist/programmer called viznut. He posted several articles and video/sound-recordings about the idea in 2011, which then spawned a whole community of people doing experiments. You can find the original articles and listen to sound-clips at Algorithmic symphonies from one line of code -- how and why?.

Here is a complete example.

main(t) { for(;;t++) putchar((t*5&t>>7)|(t*3&t>>10)); }

Here is a 10 second sound clip you can listen to.Use this link onLinuxto get access to a downloadable OGG-VORBIS file. Use this link onWindows/Macto get access to a downloadable ACC file.

The program outputs 8-bit unsigned values (i.e. whole numbers between0 and 255), which are intended to be played through an old-fashioned8-bit sound chip such as was common on old game platforms andcomputers in the '80s and '90s. The resulting "music" certainlyreminds me of the kind of sound produced by games in that era.What is fascinating is how complex a sound can be produced using sucha simple program.

Copy the text into afile called example.c and compile it. If you do not alreadyhave a C compiler, I recommend using the free, open source, highquality, widely used GNU Ccompiler. Once installed, type gcc example.c to compilethe example into an executable program called a.out.

On Linux you can run the executable and send ("pipe") the output directly tothe sound card with the command ./a.out > /dev/dsp. Onother platforms, you may need to save the output to a file and tell amusic-player program that it uses 8-bit unsigned data that is intendedto be played at 8000 Hz. For instance, you canuse Sox, another good opensource tool, via commands like

./a.out > a.rawsox -V -r 8k -e un -b 8 -c 1 a.raw \  -b 16 -e si -L a.wav rate -h 44.1k gain -1 trim 0 30

The first line runs the program and saves the resulting sound samplesto a "raw" sound file. Press Control-C to interrupt it after a secondor so, since the raw file will already be huge. The second line usessox to convert the first 30 seconds of the sound into a standard WAV filesuitable for playback on any machine. Don't be intimidated by all thearguments, just copy and paste the command. The arguments just specifythe nature of the raw file (8000 samples per second, one unsigned byteeach) and of the desired new format (WAV file with 16 bit signedinteger values at 44100 samples per second). You can change the final"30" to some other number of seconds, if you want.

As yet another alternative, you can use Audacity, anopen-source sound editor with a graphical user interface, if youprefer clicking menus to typing commands.

Look now at the actual program.The variable t representstime; it counts from zero upward. The putchar functionwrites an 8-bit value in "raw" form as a single "character". This isdifferent from printing the value in decimal notation.

All the real action happens inside the argumentto putchar, which is
(t*5&t>>7)|(t*3&t>>10)
in this example. To make it easier to experiment with alternativeformulas, here's an example of a slightly longer program that lets you selectfrom several different formulas by passing a command lineargument. Adding a second argument specified a duration in seconds,and prints the numbers so we can plot them in a graph. Rather than squeeze this into oneline, I've written it using a more traditional format; I've also usedC++ instead of C here, so compile this with g++ instead of gcc.
/**     ./a.out kind > /dev/dsp    ./a.out kind duration > data.txt**/#include <stdio.h>#include <stdlib.h>int main(int argc, char **argv) {  int kind = 0, dur = 0, s = 0;  if(argc >= 2) kind = atoi(argv[1]);  if(argc >= 3) dur = atoi(argv[2]);  for(int t=0; dur==0 or t<dur*8000; t++) {    switch(kind) {    case 1:       s = (t*5&t>>7)|(t*3&t>>10);       break;    case 2:       s = ((t*(t>>8|t>>9)&46&t>>8))^(t&t>>13|t>>6);       break;    }    if(dur == 0) putchar(s);    else printf("%d\n", (unsigned char)s);  }  return 0;}

The Case 1 formula with which we began the article isfrom viznut 2011-10-10.Viznut attributes the Case 2 formula to xpansive 2011-09-29 Lost in Space.

Viznut has listed a large number of additional exampleshere, many ofwhich were contributed by his readers.

By changing the formula, one can get all sorts of differentsounds. The Lost in Space formula is interesting because itchanges, gradually, over time: you hear a pattern of beeps (almost a"melody") that seems to repeat forever, but every now and then thepattern changes on you unexpectedly.

You could modify the program to output 16 bit values and playthem at 44.1kHz, like modern music CD's. This would make it easier toadd multiple voices together to synthesize even more complexpieces. However, this would also distract us from the charm of thereally simple program. The mathematically interesting question is: howdoes it work?

In the simplest case, we could just say s = t; whichwould generate a "sawtooth wave". When t exceeds 255, thehigh order bits are dropped in converting the integer to a byte, so256 becomes 0, 257 becomes 1, and so on. Thus the sound will repeatevery 256 samples. Since they are being played at 8000 samples persecond, the frequency is 8000/256 = 31.25, a very deep tone but withmany higher frequency harmonics, due to the discontinuity in theshape, giving a characteristic buzzing sound.

The interesting parts are the bit manipulations. These remindme of the kinds of tricks sometimes used in random numbergenerators and hash functions.

For readers who do not program in C, the notation
t >> 7
means shift the value of t seven bits to the right, whicheffectively divides t by 2 to the seventh power, or 128. Similarly, t >> 10 divides by 1024.

Next, the ampersand operator (&) performs abitwise AND of two values represented in binary(see Counting and Number Systems). For example, in binary the number 19 is10011, and the number 25 is 11001. If we bitwise AND these together,we get 10001, because AND is true (1) only if both its inputs aretrue. Thus, in C notation, 19&25 gives 17.

The vertical bar means bitwise OR. For instance, 19|25 gives11011 in binary, or 27 in decimal, because OR is true if at least one of its inputs is true.

The caret (^) meansbitwise exclusive-or, which is true if exactly one of its twoinputs is true. For example, 19^25 gives 01010 in binary, or 10 indecimal. The caret notation is a little unfortunate, since in most othercontexts other than C-based languages, the caret means exponentiation,as in 2^8 = 256.

Finally, the star (*) means multiply, as in most languages.

We also need to know the order of operations. As in ordinaryarithmetic, in C multiplication has higher priority than addition,regardless of the order in which they are written; forinstance, 4+5*6 is 34, not 54, because it reallymeans 4+(5*6), not (4+5)*6.

The priority order for bitwise operations works like this:After multiply and add, next comes the shift operator (>>), then AND,then XOR, then OR. If we want something different, we can useparentheses in the usual manner.

Now we can start to analyze the program. Let's start with theCase 1 formula, since it is relativelyshort: (t*5&t>>7)|(t*3&t>>10).

This is an OR of two terms. The terms are pretty similar: theAND of (t times a constant) with (t divided by a constant).

Try adding lines to the program like the following:
    case 3: s = t*5&t>>7; break;    case 4: s = t*3&t>>10; break;

This will let you hear the results of each term by itself. One thingyou can hear immediately is that Case 3varies on a faster time scale than Case 4.

Now let's make a graph of the numeric valuesusing R, the free open-sourcehigh-quality statistical programming language. First, run the programwith the following command line arguments:

./a.out 3 5 > data.txt

This will create a file of 40,000 numbers representing the first 5 seconds of sound from the Case 3 formula. We can plot them with the R commands

a <- read.delim("data.txt")$X0x <- 1:length(a)plot(x, a[x], type='l')

This produces a plot like

We will look at the big triangle more closely in a moment, but thefirst thing that jumps to mind is that the sound may start to repeatafter 32k samples (or about 4 seconds) - and indeed it does, goingfrom soft to loud and then jumping back to soft. In fact, if we plugin t+pow(2,15) in place of t in the Case 3 formula,we can see why it repeats: since the output is 8 bits, any multiplesof 256 are ignored. Right shifting 32k by 7 bits takes it down toexactly 256.

If we only had s = t*5, we would have a sawtooth wavethat repeats every 256/5 samples. At 8000 samples per second, thiswould be a drone sound at 156 Hz: a five times higher pitch thanthe s = t wave. Instead, we AND this with t/128. TheAND operation is a lot like multiplication (and the OR operation is alot like addition). Multiplying by t/128 would put a linear envelopeon the wave matching the diagonal boundary of the big blacktriangle. This is what makes the sound grow progressively louder overthe course of the four seconds.

However, the sound does not simply grow louder. It contains at least 4distinct "notes", each of which contains a myriad of smallervariations. Let's zoom the graph in to get a closer look. Change thedefinition of x in the R code to be

x <- 8000:8400

Each quarter of the 32k period uses 8k = 8192 samples, just over onesecond long. This new definition for x will zoom us in to see the endof the first quarter and the start of the next. It looks like this:

Notice the sharp change in character at the 8192 point. The wavechanges from a sawtooth to a square wave, with a corresponding changein timbre. What's going on?

When t is 8192, shifting right by 7 (or dividing by 128) yields 64,which is a power of two. The 8-bit unsigned numbers we are workingwith have 128 as their most significant bit, and 64 as their secondmost. As we cross the 8192 boundary, the lower order bits (32, 16, 8,4, 2, and 1) all reset to zero, while the 64 bit turns on. Thiseffectively suppresses all the small wiggles in the wave. All that isleft is the oscillation between 0 and 64. Then as t continues toclimb, t/128 gradually picks up some low order bits, so we start tosee small wiggles inside the square wave. As t gets even bigger, thesewill grow.

The net result is that using AND instead of multiplication generates amyriad of ever shifting fluctuations that add color to the sound. Wehear four relatively distinct notes as the top two bits (128 and 64)go from 00 to 01 to 10 to 11 (counting 0, 1, 2, 3 in binary). But wealso hear lots of lower-amplitude, smaller-volume variations in thebackground, as the bits for 32, 16, and so forth turn on and off atdifferent rates.

To clarify this point, let's plot the wave a little further into thissecond quarter. Using x <- 12000:12400 we get

The small wiggles in the square wave have grown into largewiggles. Then just before 12,300 there is another transition. If wemultiply 1024 by 12 we get 12,288. In binary, 12,287 is10111111111111, whereas 12,288 is 11000000000000. Once again, all thelow order bits reset. This is perhaps easier to read in hexadecimal:we go from 0x2fff to0x3000. Numbermonkprovides a convenient calculator for doing these conversions. When wedivide by 128, we get 96 = 64+32 instead of 95 = 64+31. Recall thatsince 32 is a power of 2, 31 has all the lower bits turned on, and 32turns them off. So once again, we have a significant transition in thesound, although less noticeable because it is half as large andpartially hidden by bit 64, which remains on the whole time.

The other term in the Case 1 formula is t*3&t>>10.The sound evolves more slowly, and now we see why: it takes 8 timeslonger to repeat, since we shift right by 10 bits instead of 7. Nowthe question becomes, what happens when we OR the two terms together?OR is somewhat like addition, so we will hear a blend of the twosounds, but once again, there are subtle changes. Let's redo theprevious graphs for the full Case 1 formula.

We see immediately that the sound no longer repeats after 4 seconds(32k samples): instead, while it does go back to a softer volume, theshape of the curve is more complicated than during the first 4 seconds.

Zooming in we see that 8192 still marks a transition, but the waveform is considerably more complex on both sides, since we OR it withanother, independently evolving wave.

Similarly, there is still a transition at 12,288, but once again,there are additional variations happening from the OR with the secondwave. Ultimately, though, the entire sound will indeed repeat afterabout 32 seconds, since both of the individual terms do as well.

You can have fun trying further experiments yourself. The Case2 formula produces a completely different sound (though still withinthe 1980's "video game" genre), though it uses the same basic tools ofsimple bit manipulations. Try graphing or listening to portions of theCase 2 formula to better understand the individual pieces and how theycombine. Then try modifying either formula by changing the constantsor the operators or adding new terms. It is amazing how much varietylurks within these simple forms.

You can also try operators we have not discusses, such as addition andsubtraction. C also uses the twiddle or tilde symbol (~) tomean NOT, which flips each bit in the number. The percent sign (%)means MOD, which is the remainder after integer division.

You can see why I feel this is such a fascinatingtopic. Slight changes to the formula can produce dramatic changes inthe resulting sound. While the output is certainly low in musicalquality compared to a Bach fugue, it is remarkable how much fun youcan have playing around with changes to the formulas and trying toimprove the results. If anyone out there knows of somethingsimilar - i.e. a compact mathematical method of producing a widevariety of interesting variations - whether in music, art, or someother programming-related area, please do let me know about it.

If you enjoyed this article, you might also like Algorithmic Art or What is the difference between Just Intonation and Equal Temperament?

You may also want to use the 'Topic', 'Search' or 'Archive'widgets in the side-bar to find other articles of related interest.Or check outthe Contentspage for a complete list of past topics in historical order.

I hope you enjoyed this discussion. You can use the littlebuttons near the comment box below to share it. Click the little Mto email this post to a friend, or the T toTweet it, or the F to share it on Facebook.

Please post questions, comments and other suggestions using the box below, or email me directly at the address given by the clues at the end of the Welcome post. Remember that you can sign up for email alerts about new posts by entering your address in the widget on the sidebar. If you prefer, you can follow @ingThruMath on Twitter, where I will tweet about each new post to this blog. See you next time!

Hiç yorum yok:

Yorum Gönder