Musical Language
To be clear, this is a document thinking about ways to turn musical notation into an actual programming language, inspired by similar esoteric language projects like “velato” which uses musical intervals to define commands.
MN: Musical Notation.
I use this term a lot here. Musical notation refers to standard Western musical notation, written with a clef, often a key signature, and five lines where each line represents a certain note. And, of course, every other convention that musicians have adopted when writing and reading music.
Also, this might be my masters project. I’ll just call it “MN” because “Musical Notation” or “Sheet Music” are never used in a computer science context so they’re basically free. “.mn” files can be source, or a proprietary reduction of MusicXML that removes everything I haven’t accounted for, and makes it easier to parse.
Introduction
I recently watched an esolang video that showed off Velato, so I looked at the introduction page for the language and it’s not quite what I had hoped. Mind you, the video also didn’t put it in a good light.
The main issue I have with Velato is that it’s basically just a normal language, but commands are represented by note intervals and note intervals alone. Writing in Velato seems like it would feel like just writing a normal program in a strange language with MIDI grammar. This is probably due to the constraint of MIDI files as a format, but I reckon we can do better.
I’m abandoning MIDI as a valid file format. Needing this to look like valid sheet music just means MIDI doesn’t contain enough data. MusicXML is probably exactly what we need.
And I’m also too used to C-style grammar, so I’ll probably end up riffing off of that unless I get suggested a more suitable language format. C-style grammar is great because reading music is kinda like reading words.
I don’t want to make an esoteric language. I want to make a perfectly practical language that uses musical notation instead of written notation. Modern programming depends on Western writing norms: an alphabet that is used to construct words, which then form a sentence through grammar. But not every writing system is like this. For example, I understand the Chinese writing system as essentially a bunch of independent symbols (which include some form of symbol composition) that each represent a word.
So imagine what coding would be like if it wasn’t developed by the western world. That is what I want to do with a proper musical language. A western-developed musical notation, for sure, but it is an entirely different writing system.
We also shouldn’t really use written text on our sheet music, as that seems weird. The goal here is to write music that also compiles into a program and still looks like music, so that coding feels like some hybrid between writing music and writing code, rather than just writing code in the format of a MIDI file or something like that.
What needs to remain free
A lot of MN needs to remain free to keep the music playable and expressive.
- Dynamics. These probably shouldn’t be used by the compiler because due to other limitations, the pace and the feel of the music isn’t really under the composer’s control. Leaving dynamics free lets a player of the piece decide what gets to be intense, or calm, or just quiet or loud.
- Octaves. Forcing composers to use certain octaves for certain functions of their program doesn’t make sense. It means that a piece might be unplayable for a given instrument.
- Key signature and root note. In the grand scheme of things, what key a piece of music is in doesn’t matter. What actual frequency each note represents doesn’t matter. This is why we can play any piece in any key (same mode), and it’s still the same piece.
- Clef. We’re trying to compose readable music, constraining clefs to a function of the programming language doesn’t make sense.
- Instrumentation. Again, we’re making musical pieces which are also programs. Constraining what instruments you can use would just really suck, and actually hamper creativity as the language can only be written with the most popular instruments which the compiler accounts for.
- Breaths and pauses. It goes without saying that we can’t mandate when players can breathe, but even fermatas are part of player expression that shouldn’t be tied to computation.
- Graphical choices. Specifically, noteheads, stem, and flag appearances. Not only are they used for expression (like a guitar’s harmonics) which shouldn’t be enforced, but they could even be used by a composer / programmer to “comment” or help distinguish commands from data or other stuff.
The Turing-completion of Musical Notation
Turing machines are conceptually simple, but if you tried to write code for a Turing machine that did anything reasonable, it would be more like an esoteric language than an actual language. Because of that, making MN Turing complete has less to do with making the notes correspond to commands on a Turing machine and more to do with making the notation itself useful for computation.
From my understanding, to be Turing complete, you need at minimum three things: an infinite loop, a conditional jump, and memory.
Memory doesn’t really need to be a part of MN, apart from writing to memory. So long as the notes can be interpreted as doing something with memory, then we are good on that front. Most languages don’t have memory written into them anyway, it’s always run-time and not write-time or even compile-time.
MN has loops and jumps, but they aren’t infinite or conditional. But we can make them. Theoretically, we are allowed to write whatever we want on a piece of paper, and if a musician doesn’t know what it is because it’s outside of convention, we can just tell them “oh this means that”. More properly, you could just write what to do in plain language. But this is against the spirit, so I’d like to use normal MN writing conventions as provided to me by the palette of symbols used by the MuseScore editor.
Loops
In music, there are two ways to repeat, and no ways to properly loop.
- Repeat last N bars, N∈{1, 2, 4}. The convention uses a symbol that looks like a percent sign, with the number of slashes in-between the dots representing how many bars you repeat. This isn’t a loop, it’s only a repeat, but it might come in handy if we let N be any positive integer.
- Bar repeat signs. The bar line can be replaced with a thicker line, another line, and then two dots. Have one of these point forwards and another backwards, and you have to repeat that section of music twice to play it.
This is probably the worst restriction in the computability of music, because it’s only ever repeat twice and never repeat N times. Regardless, loops are just jumping to the start of the loop forever, so maybe jumping presents another opportunity for looping?
Jumps
In music, there’s one main way to jump.
Capo, Segno, Coda, Fine. Sheet music is Italian, and the words “da” meaning “from”, and “al” meaning “to” are standard musical notation.
The coda is also special because it acts as a sort of teleport. When you go from some place (capo or segno) to the coda, you then skip to after the part where you jumped back, often marked with another coda, making the coda a kind of portal.
It’s worth mentioning the coda and segno also have a “doppia” variation, a doubled version. This is just another segno or coda so that musicians can do the segno-coda trick twice in their music. Similarly, “Da Doppia Segno” is abbreviated to “D.D.S.”.
The main limitation in this system is that it’s one-time. If you read D.C. al fine, you ignore it the second time. This is a shame, because if musicians were computationally minded it would be somewhat easy to set up readable notation for a “while i != N; i++” loop with voltas.
Even though there are doppia versions, it seems fairly easy to allow several segnos and codas in a piece. The concrete rule would probably be: “Da Segno al Coda” goes from the closest segno before the mark to whatever coda it encounters next.
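As a rough sketch of that rule, assuming the parser stores each mark as a bar index in sorted lists (the function name and the data layout are hypothetical, and "encounters next" is read here as "the first coda at or after the chosen segno"):

```python
from bisect import bisect_left, bisect_right


def resolve_ds_al_coda(jump_bar, segno_bars, coda_bars):
    """Return (segno, coda) bar indices for a 'D.S. al Coda' written at jump_bar.

    segno_bars and coda_bars are sorted lists of bar indices (an assumption
    about how marks would be stored, not something the notes specify).
    """
    # Closest segno at or before the bar holding the jump instruction.
    i = bisect_right(segno_bars, jump_bar)
    if i == 0:
        raise ValueError("no segno before the jump")
    segno = segno_bars[i - 1]

    # First coda met when playing onwards from that segno.
    j = bisect_left(coda_bars, segno)
    if j == len(coda_bars):
        raise ValueError("no coda after the segno")
    return segno, coda_bars[j]
```

With segnos at bars 2, 8 and 20 and codas at bars 5, 10 and 16, a jump written at bar 12 would resolve to the segno at bar 8 and the coda at bar 10.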
Ripeti Mentre
We can write an effective “do … while” loop by using repeat bars, and writing at the end “Ripeti mentre x” where x is some check. I’m not a fan of this because it’s a little unclear. It would help if the compiler also accepted the phrase in different languages. I also have to reconcile this with checking variables, which I talk about in the Questa Volta section.
Giocare x Infinitum
Alongside Ripeti volte, “Giocare x Infinitum” is the easiest and clearest way to write a loop. Write some structural repeats (|: … :|). At the end of those, write “x∞”, and the player will repeat that section for infinity. Pair Giocare x Infinitum with the Questa Volta, and we have effective flow control.
Questa Volta
For all intents and purposes, these compromises should be the only out-of-place thing on the sheet music. They’re not completely alien for a musician, and I’m aiming for nothing on the sheet to be weird or odd for a playing musician. The worst offender here is really the Questa Volta.
We can very easily implement flow control with questa volta. Using voltas, we can effectively show that a certain part of the piece should only be played if a condition is met, traditionally if this is the ith time we’ve played this part.
We can extend this framework with text variable names, but then we run into a big issue with the player-programmer balance. Playing the piece, you can’t easily track how variables are being changed by the musical operations. If we reach a questa volta that says “a=5.” or “a eq 5.”, as a musician reading the music we haven’t really been able to keep track of the variable a after its declaration. So what do we do?
Example

Here’s a good looping example. We can use end bar lines to effectively notate different pieces within a single file. In it, there are three voltas. The first one is just to show some valid syntax for “iterations is less than”. MuseScore won’t render the < symbol, so “lt” should at least be valid syntax. “lt 5.” assumes “iterations less than 5” and is equivalent to writing “<5.” or “i<5.”. The “1.” open volta means that if we jump to before the loop, not portrayed in this example, we ignore it the next time over. This implies the compiler keeps careful track of how many times it “plays” each bar.
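To make the volta syntax a bit more concrete, here is a minimal sketch of how the text could be parsed. The function names, the implicit counter name "i", and the choice to read a bare number like “10.” as “on the 10th time through” are all assumptions for illustration; only the spellings “lt”, “<”, “eq” and “=” come from these notes.

```python
import operator
import re

# Operator spellings seen in the notes; anything else is rejected.
OPS = {"lt": operator.lt, "<": operator.lt, "eq": operator.eq, "=": operator.eq}


def parse_volta(text):
    """Parse volta text such as '10.', 'lt 5.', 'i<5.' or 'a eq 5.'.

    Returns (variable_name, comparison_function, number).
    """
    body = text.strip()
    if not body.endswith("."):
        raise ValueError("volta text must end with a full stop")
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[<=]", body[:-1])
    if len(tokens) == 1:      # '10.': plain volta number
        var, op = "i", "eq"
    elif len(tokens) == 2:    # 'lt 5.': the iteration counter is implicit
        var, op = "i", tokens[0]
    elif len(tokens) == 3:    # 'a eq 5.': explicit variable
        var, op = tokens[0], tokens[1]
    else:
        raise ValueError(f"cannot parse volta text: {text!r}")
    if op not in OPS or not tokens[-1].isdigit():
        raise ValueError(f"cannot parse volta text: {text!r}")
    return var, OPS[op], int(tokens[-1])


def volta_applies(text, env):
    """Decide whether the bars under a volta are played, given variable values."""
    var, compare, number = parse_volta(text)
    return compare(env[var], number)
```

For example, `volta_applies("lt 5.", {"i": 3})` holds, while `volta_applies("<5.", {"i": 5})` does not.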
I’ve used the Giocare x infinitum notation, because it’s a lot more readable compared to all of the other options. Inside the repeat bar lines, we have a volta with “10.” and a “To Coda” inside of it, with the coda placed just after the loop. When we reach the tenth iteration, we read that bar and the “to coda” inside of it and finally exit the infinite loop.
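The control flow of that example can be sketched as an ordinary loop. This is only a model of what the compiler might generate, assuming it keeps a play counter per repeated section; `play_infinite_loop` and its parameters are hypothetical names.

```python
def play_infinite_loop(body, exit_iteration):
    """Model |: body :| marked x∞, with a volta numbered exit_iteration holding 'To Coda'."""
    iteration = 0
    while True:                            # the x∞ marking
        iteration += 1                     # play counter for this section
        body(iteration)                    # the bars inside the repeat signs
        if iteration == exit_iteration:    # the volta '10.' applies on this pass
            return iteration               # 'To Coda': jump past the loop
```

Calling `play_infinite_loop(print, 10)` would run the body ten times and then leave the loop, which is the behaviour described for the “10.” volta.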
Memory Management
The other part of Turing completion is being able to write and store to a memory tape.
If we really wanted to be C-style, we’d need some functions that we can call to edit to memory.
Rhythmic Commands
A decent idea is that we could use rhythms to define commands. Once the compiler picks up notes playing in a certain rhythm, it then interprets the following data as parameters. Conveniently, common time can be divided into 8 sections of “note is playing” and “note is not playing”, so we could potentially re-interpret this as a binary number. This would limit creativity to only playing a note when we need to set a bit and needing a rest for any empty bit, but this might be okay and a decent restriction on the music.
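A quick sketch of the binary reading, assuming a bar of common time has already been reduced to eight eighth-note slots and that the first slot is the most significant bit (both are assumptions, as is the function name):

```python
def rhythm_to_byte(slots):
    """Read eight slots (truthy = note sounding, falsy = rest) as one byte, first slot highest."""
    if len(slots) != 8:
        raise ValueError("expected exactly eight eighth-note slots")
    value = 0
    for sounding in slots:
        # Shift what we have so far and append this slot as the lowest bit.
        value = (value << 1) | int(bool(sounding))
    return value
```

A bar of note, rest, note, rest, rest, rest, rest, note would read as 0b10100001, which is 161.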
The rhythm of the commands doesn’t have to be a full common time bar either, and it really shouldn’t be. Ideally, my MN compiler works with any time signature, so maybe bounding commands to 2/4 (possibly the shortest viable time measure) lets them be played in any time signature.
There should also be an opportunity to “replace” the commands of the rhythm with project-specific rhythms that let a composer use whatever rhythm they want in their piece to do a certain command, while also freeing up the pre-defined rhythm to be used elsewhere.
Melodic Operations
Not sure on the specifics but making certain operations outside of commands (like assignment, boolean operations, bitwise operations) accessible through some melodic data (intervals) would work nicely. But, because I want to reserve sharp and flat notations for something else, we only have the eight intervals to work with, and any combinations of those. When calling a melodic operation, we would probably also need to define a root note, which we can probably try and use some compiler magic to work with.
I’m not going to list the intervals, because major or minor doesn’t matter (so we leave sharps free). It’s just the ordinality of the interval that counts. Conveniently, eight is also a power of two. Not sure what to do with this.
Binary Data
Okay, so I’ve had a few ideas on how to interpret data. We could just interpret rests and non-rests as a binary stream in whatever section makes sense for that, with some terminating pattern or interval. Alternatively, we could scratch melodic operations and use how we conveniently have eight intervals to set bits in a byte. I’m not sure.
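Here is a small sketch of the interval option, under a few assumptions of mine: notes are (letter, octave) pairs, accidentals are ignored, any exact multiple of an octave away from the root counts as the eighth interval, and interval n sets bit n-1. None of this is settled design.

```python
STEPS = "CDEFGAB"


def diatonic_index(step, octave):
    """Position of a note counted in diatonic steps, ignoring accidentals."""
    return octave * 7 + STEPS.index(step)


def interval_ordinal(root, note):
    """Ordinal (1 = unison ... 8 = octave) of note relative to root.

    Octave displacement is folded away so that octaves stay free,
    except that an exact octave is kept distinct from a unison.
    """
    diff = diatonic_index(*note) - diatonic_index(*root)
    if diff != 0 and diff % 7 == 0:
        return 8
    return diff % 7 + 1


def intervals_to_byte(root, notes):
    """Set one bit per interval present: interval n sets bit n - 1."""
    value = 0
    for note in notes:
        value |= 1 << (interval_ordinal(root, note) - 1)
    return value
```

With a root of C4, the notes C4, E4 and G4 (unison, third, fifth) would set bits 0, 2 and 4, giving 21.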
In general, when I’m not sure between two interesting and seemingly viable options, I should make them both available. That way, we increase the options available to a composer.
Staffs
Sheet music supports several staffs: either several instruments, or one really big instrument like a piano with a treble and bass clef. The language should work with one staff, the control staff, while letting other staffs either carry function or stay free of it.
We can also use that little pre-segment before the real music starts (I forgot the proper terminology) as a motif - a unique identifier for the piece of music so that it can be referenced for function calls or other stuff.
Staves are very easily just parallel processes. They should also act on the same memory, with a caveat. Musical motifs are shared, so a motif in one stave is the same in the other, but text names aren’t shared if they’re written on the stave. I’m pretty sure MusicXML supports system text, which is how you’d define variable names across staves.
Otherwise, there must be able to exist only one staff upon which the whole program may lie.
Variables
To do flow control inside of questa volta, we need to use staff text to define what motifs refer to what variables so that we can use them in flow control.
So, if you write some staff text, that staff text is now tied to the staff it’s for. If a variable is declared inside of that staff, the text is now that variable’s name used in questa voltas for flow control. Variables are also declared with a musical motif written after whatever syntax is used for “declare variable”, which can be used in the music part to refer to it for operations and stuff.
Then, inside of a questa volta, we can just write “var compare var”. The only issue with this is it’s incomprehensible to a musician, and I’ll have to find some way to reconcile this.
Scoping is also a thing. It only really makes sense if each piece is in its own scope.
If we let staff text actually matter, this makes function calls potentially easier. We can just write “Play (piece)” in the staff text, and then “play” the piece written using title metadata. We can use the music part to dictate what variables are to be passed as parameters by explicitly loading variables into a memory location that will be copied or otherwise referenced by the newly played piece, effectively giving it some starting memory. We can use this system while also allowing motif-hash linking.
Reconciling hard-to-read questa volta
I think it’s just fine?
Theoretically, if a musician playing the piece knew what commands they were playing, and remembered what motifs refer to what text variables, they could follow the value of a text variable and then effectively read the questa volta containing the variable. This is essentially an admission that it doesn’t matter if it’s not reasonably readable by a human, so long as it’s theoretically readable it’s fine.
And since these hard-to-read checks are often exit checks, we can just tell a musician to play the loop some minimum number of times greater than two, or repeat it a certain number of times to demonstrate its different pathing, and play the exit volta at the end. After all, the flow control being part of the sheet music is a way to tie sheet music and the flow of the program. Since commands and data are abstracted into word calls, the performer never “plays” the changing value of a variable (but they can with my bit set method!) so they are inherently abstracted from the running of the program.
Distinguishing commands
By default, everything should be interpreted as a motif, used or not. Ideally, this means composers can write whatever they want and have some parts of the piece used for computation. This is essentially just like writing normally in Python, you’re declaring variables but not using them.
I’m not sure on the details on how commands are recognised from amongst these, but it can’t be something easy like staff text. To re-iterate, I’m not going to let programmer-style syntax into the staff text! Variable declaration is done in the music, and we use some staff text to help us refer to it in questa voltas. Otherwise, staff text remains free.
Function Calls
Functions. The coding system should adopt a sort of one-function-one-composition rule, for the most part because writing several loops in a single piece is really hard, and any code that needs to loop very naturally fits into its own function.
So, each function should be in a separate composition. The issue with this is function calls and how we do them. As a composition, functions might not make a whole lot of sense. You just write some command or operation with the reference to the function and data passed into your music, and it’s somewhat not understandable by a player, but the compiler will just understand that it needs to link the other composition here.
Possible Implementation
MusicXML provides a key definer, which also lets you define mode. This is what defines the root note of the piece, and allows the root note to be any western music key signature mode.
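As a sketch of how the root note could be derived, the snippet below reads the `<fifths>` and `<mode>` children of a MusicXML `<key>` element and walks the line of fifths. The function name is hypothetical, treating a missing `<mode>` as major is my assumption, and only the seven diatonic modes plus "major" and "minor" are handled.

```python
import xml.etree.ElementTree as ET

# How many fifths above the major tonic each mode's tonic sits.
MODE_OFFSET = {
    "major": 0, "ionian": 0, "mixolydian": 1, "dorian": 2,
    "minor": 3, "aeolian": 3, "phrygian": 4, "locrian": 5, "lydian": -1,
}
LINE_OF_FIFTHS = "FCGDAEB"


def root_note(key_xml):
    """Return the tonic name (e.g. 'B', 'Bb', 'F#') for a MusicXML <key> element."""
    key = ET.fromstring(key_xml)
    fifths = int(key.findtext("fifths"))
    # Assumption: a key with no <mode> is treated as major.
    mode = (key.findtext("mode") or "major").strip().lower()
    # Position on the line of fifths, shifted so that F sits at index 0.
    position = fifths + MODE_OFFSET[mode] + 1
    letter = LINE_OF_FIFTHS[position % 7]
    accidentals = position // 7  # positive = sharps, negative = flats
    return letter + ("#" * accidentals if accidentals > 0 else "b" * -accidentals)
```

For instance, two sharps in minor mode gives B, and two flats in major mode gives Bb.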
Tokenising
Tokenising a stream of music is a real challenge, so I think we need a broad approach.
We can define a set of token classes to start with. Here are some “tokens” that have nothing to do with words but help define certain relations.
- VOICE: simultaneous notes in different voices can help distinguish motifs or words.
- STAFF: staff changes are ignored. More staffs are essentially simultaneous computation.
- CLEF: used to define where notes are on the staff.
- KEY, MODE: used to define the working root note.
- TIME: only defines how many notes we can fit in a measure.
Here are the more useful ones.
- SECMARK: an explicit section separator that we can’t define a word across.
- BARLINE: a standard delimiter and measure separator. Can be ignored with slurs.
- MEASURE: words are contained within measures, but a measure may have several words.
- MNOBJ: Music Notation Stream Object, a thing on the continuous music stream. Only ever either a note, chord, or rest.
- NOTE: a single note event. An interval relation to the root with a duration and several voicing elements.
- CHORD: simultaneous grouped notes.
- REST: a rest event with a duration.
- MOTIF: a “word”, which is either a defined command or a new motif which can be assigned a value.
- COMMAND: a command token.
- FUNC: a function call. A motif which isn’t a command token could be a function call.
- VAR: a variable that might have a value. A motif which isn’t a command or a function call is a variable.
- RAW: a motif may be a raw value and not a variable, to be passed as an argument in a function or to be assigned to a variable.
And then there are some extra-textual tokens that aren’t part of the note stream but outside of it. These have rooted positions in the note stream. Many have start points and end points, like voltas.
- STAFF_TEXT: can give a motif a text identifier, to be used in questa voltas.
- QUESTA_VOLTA: might have an end point and has some text. The text is a condition that must be met to play everything in its range.
- JUMP: “al Coda” and other common jumps, but al coda is probably the most common to be used inside of a questa volta.
- MARK: a coda or a segno, that jumps jump to. Capo and Fine are also marks, automatically just the start or end of the piece.
- SLUR: could have a beginning and an end. When they group notes, they can help identify motifs (like a violin stroke).
- ACCENT: accent and marcato markers which, while theoretically part of dynamics, I am going to conveniently use to explicitly mark a new word and word division. However, they shouldn’t be mandatory.
Okay, here’s some concrete definition.
MOTIFs are groups of MNOBJ that can be grouped using the following logic:
- SLUR marks as phrases always group notes into a motif, even when other separators are present.
- BARLINE separates motifs.
- ACCENT as marcato, violin bow directions, define the start of a new motif.
Importantly, when parsing, we need to throw an error if slur phrase markings go over explicit section markers. It wouldn’t look right either.
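The grouping rules could be sketched like this. The event format (dicts with "kind", "name", "slur" and "accent" keys) is invented purely for illustration, and so is the function name; a real parser would work from MusicXML elements instead.

```python
def group_motifs(events):
    """Group note/rest events into motifs using slurs, barlines and accents.

    Each event is a dict with a 'kind' of 'note', 'chord', 'rest', 'barline'
    or 'secmark'. Note-like events carry a 'name' and may carry
    'slur' ('start' or 'stop') and 'accent' (bool). This layout is hypothetical.
    """
    motifs, current, in_slur = [], [], False

    def flush():
        nonlocal current
        if current:
            motifs.append(current)
            current = []

    for ev in events:
        kind = ev["kind"]
        if kind == "secmark":
            if in_slur:
                raise ValueError("slur crosses an explicit section marker")
            flush()
        elif kind == "barline":
            if not in_slur:          # slurs override barlines
                flush()
        else:                        # note, chord or rest
            if ev.get("slur") == "start":
                flush()
                in_slur = True
            elif ev.get("accent") and not in_slur:
                flush()              # an accent opens a new motif
            current.append(ev["name"])
            if ev.get("slur") == "stop":
                in_slur = False
                flush()
    flush()
    return motifs
```

In this sketch a slur that is still open when a section marker arrives raises an error, matching the rule above, while a barline inside a slur is simply ignored.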
Data memory and management
The program’s commands let us bind motifs to memory values, assign them, and perform some basic memory operations. There are also commands to set the default reference, a sort of “register”, where we don’t have to constantly play a motif and instead the commands will default to the register value.
Possible commands
Unless specified otherwise, dest is by default the register value, and if there’s only a var parameter then it is also by default the register value.
Memory
- ALLOC var, like Let, pre-defines a variable’s existence and defines what motif refers to it. It doesn’t really need a command, as any written motif essentially is a null pointer that is dynamically allocated if it is ever used.
- FREE var, which just resets a motif, not sure how useful this will be given the compiler needs to manage memory carefully.
- MOV dest, src. Moves a value from src to dest, essentially equals. With one argument, essentially puts a new reference into the register.
- SET var, other. Like the equals command, some command pattern followed by a motif (or nothing to act on the register), and then a raw value (represented by binary notation done either with intervals or binary pattern).
Arithmetic
- ADD dest, src.
- SUB dest, src.
- MUL dest, src.
- DIV dest, src.
- MOD dest, src.
- INC var.
- DEC var.
- NEG var.
- ABS var.
Bitwise
- AND dest, src.
- OR dest, src.
- XOR dest, src.
- NOT var.
- SHL var, n. Shifts var left by n bits.
- SHR var, n. Shifts var right by n bits.
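To check that the command set hangs together, here is a toy interpreter for it. Everything about the representation is an assumption: motifs are plain strings, raw values are Python integers, unseen motifs read as 0, DIV is integer division, and ALLOC and FREE are left out. The `Machine` class and its method names are hypothetical.

```python
import operator

BINARY = {
    "ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul,
    "DIV": operator.floordiv, "MOD": operator.mod,
    "AND": operator.and_, "OR": operator.or_, "XOR": operator.xor,
    "SHL": operator.lshift, "SHR": operator.rshift,
}
UNARY = {
    "INC": lambda v: v + 1, "DEC": lambda v: v - 1,
    "NEG": operator.neg, "ABS": abs, "NOT": operator.invert,
}


class Machine:
    def __init__(self):
        self.memory = {}       # motif name -> integer value
        self.register = None   # motif that commands fall back to

    def _dest(self, explicit):
        dest = explicit[0] if explicit else self.register
        if dest is None:
            raise ValueError("no destination given and no register set")
        return dest

    def _value(self, operand):
        # Integers are raw values; anything else is treated as a motif name.
        if isinstance(operand, int):
            return operand
        return self.memory.setdefault(operand, 0)

    def run(self, op, *args):
        if op == "MOV" and len(args) == 1:
            self.register = args[0]            # one argument: load the register
        elif op in ("MOV", "SET"):
            *dest, src = args
            self.memory[self._dest(dest)] = self._value(src)
        elif op in BINARY:
            *dest, src = args
            d = self._dest(dest)
            self.memory[d] = BINARY[op](self._value(d), self._value(src))
        elif op in UNARY:
            d = self._dest(args)
            self.memory[d] = UNARY[op](self._value(d))
        else:
            raise ValueError(f"unknown command {op}")
```

A short session: `SET a, 6`, `SET b, 4`, `MOV a` to load the register, then `ADD b` leaves 10 in a, because dest defaults to the register.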
IO
For IO, I’ll probably give the program some default global streams that can have values passed into them as function calls. You play the motif of the stream, then its member function, then the motifs of the variables you’re passing in.
This is also just more program structure in general.
We can use voices to distinguish between ops and motifs, conveniently. For example, let’s say “set” is just the fifth. We can play the fifth an octave lower in the second voice, and write the motif above it, playing at least simultaneously and maybe after it. This won’t be voice-bound because “voices” aren’t always explicitly readable, but say we have something playing over something else, we can assume the higher voice is a melody motif and the lower voice is supposed to be a command.
Otherwise, we can interpret anything as a command. We should also keep in mind that chords could both be a command or a motif.
Additional markings and potential uses
Staccato
Could be used to represent raw binary data when used at the start of a word?
Tremolo
If it’s part of a command motif, runs that command as many times as the tremolo would sound.
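One possible counting rule, sketched under the assumption that tremolos are read as measured tremolos: each stroke adds one beam level on top of whatever flags the note already has, so one stroke on a quarter note means eighths and three strokes mean thirty-seconds. The function name and the choice of quarter notes as the duration unit are mine.

```python
from fractions import Fraction


def tremolo_repeats(duration_quarters, strokes):
    """How many times a measured tremolo sounds on a note.

    duration_quarters is the note length in quarter notes (e.g. Fraction(1, 2)
    for an eighth); strokes is the number of tremolo slashes (at least 1).
    """
    if strokes < 1:
        raise ValueError("a tremolo needs at least one stroke")
    duration = Fraction(duration_quarters)
    # Count the flags the plain note already carries: 0 for a quarter or
    # longer, 1 for an eighth, 2 for a sixteenth, and so on.
    flags, unit = 0, Fraction(1)
    while unit > duration:
        unit /= 2
        flags += 1
    # Each stroke halves the repeated note value once more.
    stroke_unit = Fraction(1, 2 ** (flags + strokes))
    return int(duration / stroke_unit)
```

Under this rule a quarter note with three strokes would run its command 8 times, and a half note with one stroke 4 times.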
Trill
Like tremolo, needs some solid logic behind how many times it “sounds”
Short trill and mordent
Need something good here
Turns
and also need something good here
Gliss
Could be short hand for some more complex command with ranges. On commands that need ranges, glissing between motifs could mean “from this value to that value”? I would prefer something better for gliss.
Fermata
system.sleep() for some time defined by the motif it’s attached to maybe, or if it’s attached to a null motif or an operator it could sleep depending on the tempo. There’s also long and short fermatas. Could always just stack to increase duration. Could be optional, since fermatas are usually a style thing?