Well hi! Thanks for, er, making an example of me, Jeff. Glad you like it.
- “Is it a coincidence that I wrote a chatbot with MegaHal (another 4th order Markov chain) for a game lobby this morning, or are there spy cameras in my cereal!?”
I can’t speak for building security, Tom, but MegaHal is unquestionably what first turned me onto Markov theory back in the day.
- “I’d like to see this applied to music and see what happens if you feed it all the works of Mozart…”
It’s been done. Heck, I did a very small version of it myself for a project in college – analyzing only melodies, and treating melodic intervals and time intervals as two separate tables rather than melody+timing as a single block. What I produced were melodies that sounded like Mozart teeteringly drunk and standing on one foot before the keyboard; but that’s audibly different from truly randomly generated notes, so the project was considered a success.
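For the curious, here’s roughly what that “two separate tables” setup looks like as a toy Python sketch. This isn’t the college project’s code, and the interval/duration lists below are made up rather than pulled from Mozart; the point is just that pitch and rhythm each get their own 1-order table and are sampled independently, which is exactly why the results wobble:

```python
import random
from collections import defaultdict

def interval_table(values):
    """1st-order table: each value -> list of values that followed it."""
    table = defaultdict(list)
    for a, b in zip(values, values[1:]):
        table[a].append(b)
    return table

# Hypothetical training data: a melody as semitone intervals and note durations (beats).
intervals = [2, 2, 1, -3, 0, 2, -2, -1, 4, -2]
durations = [1, 1, 0.5, 0.5, 1, 2, 1, 0.5, 0.5, 1]

pitch_model = interval_table(intervals)
time_model = interval_table(durations)

def sample(model, start, steps):
    """Walk one table on its own; the other table never finds out."""
    out, cur = [start], start
    for _ in range(steps):
        cur = random.choice(model[cur]) if model[cur] else cur
        out.append(cur)
    return out

# Zip the two independent walks back together into (interval, duration) pairs.
new_melody = list(zip(sample(pitch_model, 2, 8), sample(time_model, 1, 8)))
print(new_melody)
```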
One of the challenges with using Markov theory to synthesize creative output is that you have to decide how far-seeing your model is, and that leads to a tradeoff between novelty and coherence.
A 1-order Markov model says: here’s word A, which set of words {B} can follow it? Then you pick a single word B from {B}, see what {C} contains, and so on. It turns out that 1-order chains lead to miserable nonsense. Garkov (and, I think, most Markov language toys) is a 2-order model: given words A and B in sequence, look at the possible followups {C}. That three-words-at-a-time window gets us a lot closer to coherence, but it also means the sentences are less novel – there are fewer places where [A, B] has multiple options in {C}, and where it does, there are fewer of them.
A high-order Markov model produces very coherent output, but rarely produces novel output. There’s also a balance between the order of your model and the size of your corpus – the collection of data you’re feeding into it – since as corpus size goes up, coherence creeps down. So choosing the order of your model depends on how much data you’re working with as well.
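To make the mechanics concrete, here’s a minimal 2-order sketch in Python – just an illustration of the [A, B] → {C} lookup described above, not the actual Garkov code, and the one-line corpus is a stand-in:

```python
import random
from collections import defaultdict

def build_table(words, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    table = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        table[key].append(words[i + order])
    return table

def generate(table, length=20):
    """Start from a random key, then repeatedly pick one of its recorded followups."""
    key = random.choice(list(table.keys()))
    output = list(key)
    for _ in range(length):
        followups = table.get(key)
        if not followups:                     # dead end: this pair never had a successor
            break
        output.append(random.choice(followups))
        key = tuple(output[-len(key):])       # slide the two-word window forward
    return " ".join(output)

corpus = "the cat sat on the mat and the cat ate the lasagna".split()
print(generate(build_table(corpus, order=2)))
```

Bump `order` up and you’ll see the coherence/novelty tradeoff directly: with a tiny corpus, a high-order table mostly just parrots the input back at you.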
Which is all to say: as it works for words, it can also work for music, but what the results are like depends on (among other things) how you parse the music on input, how much music you’re processing, and the order of the model you’re using to analyze it and synthesize new stuff.
- “That’s my IRC bot which uses a Markovian style algorithm.”
Markov IRC bots are one of the best ideas since sliced bread, yeah.
- “That’s more a feature than a bug. By using a specific text as input you can produce a text that sounds like the input – e.g. texts in Old English, or try poems, etc.”
Exactly, David. Try it with arithmetic for some really bizarre stuff; try it with source code as well. Markov is naive, but very willing.
- “How about Markov/Dilbert project - oh wait, it already works that way…”
Heh. In principle, the Garkov code is actually Garfield-agnostic – it’s just a bidirectional Markov structure and some generic display code, paired up with a Garfield display font, some Garfield background strips, and a bunch of Garfield transcripts. I’d very much like to do some other comics with it in the future – the big trick is those transcripts, so if anybody wants to spend a Saturday plowing through Dilbert (or, better, Mary Worth), let me know.
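In case “bidirectional Markov structure” sounds mysterious: it just means keeping a forward table and a backward table, so you can grow a sentence outward in both directions from a seed pair. This is not the Garkov code itself, just a rough Python sketch of that idea with a made-up mini-corpus and made-up function names:

```python
import random
from collections import defaultdict

def build_bidirectional(words, order=2):
    """Forward table (pair -> next word) plus backward table (pair -> previous word)."""
    fwd, bwd = defaultdict(list), defaultdict(list)
    for i in range(len(words) - order):
        fwd[tuple(words[i:i + order])].append(words[i + order])
        bwd[tuple(words[i + 1:i + order + 1])].append(words[i])
    return fwd, bwd

def grow(table, seed, length, forward=True):
    """Extend a word sequence in one direction until a dead end or `length` words."""
    out = list(seed)
    while len(out) < length:
        key = tuple(out[-2:]) if forward else tuple(out[:2])
        options = table.get(key)
        if not options:
            break
        if forward:
            out.append(random.choice(options))
        else:
            out.insert(0, random.choice(options))
    return out

words = "jon i hate mondays said garfield and jon sighed and garfield ate the lasagna".split()
fwd, bwd = build_bidirectional(words)
seed = ["garfield", "ate"]                       # grow a sentence outward from this pair
sentence = grow(bwd, grow(fwd, seed, 8), 12, forward=False)
print(" ".join(sentence))
```

The handy part is that you can insist a particular word shows up in the output (say, a word from the strip you’re parodying) and then build the rest of the sentence around it.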
- “Or maybe even analyse adjacent pixels in graphics/photos and see what kind of weird amorphous blob it produces.”
Ha! That could be fun. Difficult, but fun. Putting together a big corpus – and preprocessing it somehow to get the image complexity to a nice middle area in terms of total number of colors – would probably be the biggest challenge.
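Just to give the flavor of it, here’s a toy Python sketch of that adjacent-pixel idea – a made-up grid of already-quantized color indices standing in for a real photo, and only the left-hand neighbor used as context (a real attempt would probably want the pixel above as well, and a real palette-reduction step first):

```python
import random
from collections import defaultdict

# A tiny "image" as rows of quantized color indices (0-3) -- stand-in for a
# real photo that has already been reduced to a small palette.
image = [
    [0, 0, 1, 1, 2],
    [0, 1, 1, 2, 2],
    [1, 1, 2, 2, 3],
    [1, 2, 2, 3, 3],
]

# 1-order table: color of the left neighbor -> colors seen immediately to its right.
table = defaultdict(list)
for row in image:
    for left, right in zip(row, row[1:]):
        table[left].append(right)

def paint_row(width, start):
    """Synthesize one row of the amorphous blob by walking the left-neighbor table."""
    row, cur = [start], start
    for _ in range(width - 1):
        cur = random.choice(table[cur]) if table[cur] else cur
        row.append(cur)
    return row

blob = [paint_row(5, random.choice(range(4))) for _ in range(4)]
for row in blob:
    print(row)
```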
- “Is the same technique used for the order of the words?”
I know you righted yourself already, but Markov models at the letter rather than the word level have been done before as well, and are rather fun. If you want to generate some plausible new non-words in the language of your choice, a 2-order model does a great job. There’s a nice brief writeup that touches on that line of thinking (with a bilingual twist): http://www.teamten.com/lawrence/projects/markov/ . I know I’ve seen others, but I can’t google to save my life this morning.
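Here’s a quick sketch of that letter-level 2-order idea (not the code from the writeup linked above; the tiny word list is just a stand-in for a real dictionary file):

```python
import random
from collections import defaultdict

def letter_model(words, order=2):
    """2-order letter table: pair of letters -> letters seen after that pair.
    '^' and '$' mark word boundaries so generated words start and stop plausibly."""
    table = defaultdict(list)
    for word in words:
        padded = "^" * order + word + "$"
        for i in range(len(padded) - order):
            table[padded[i:i + order]].append(padded[i + order])
    return table

def make_word(table, order=2, max_len=12):
    out = "^" * order
    while len(out) < max_len + order:
        nxt = random.choice(table[out[-order:]])
        if nxt == "$":                      # hit an end-of-word marker: stop here
            break
        out += nxt
    return out[order:]

# Tiny stand-in corpus; a real run would feed in a dictionary for the language you want.
corpus = ["lasagna", "monday", "garfield", "odie", "sandwich", "window", "kitten"]
model = letter_model(corpus)
print([make_word(model) for _ in range(5)])
```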
- “Well, this may finally explain the comic Zippy. It is a Markov chain.”
Bill Griffith is a genius.
I’m going on at irresponsible length. This is fun stuff, and again, I’m glad you like the Garkov.
-j