Markov and You

In response to your statement that “Lasagna Cat” is the one link we must click on, I only have this to say (with compliments to Adam Sandler)…

“… what you’ve just said is one of the most insanely idiotic things I have ever heard. At no point in your rambling, incoherent response were you even close to anything that could be considered a rational thought. Everyone in this room is now dumber for having listened to it. I award you no points, and may God have mercy on your soul.”

Jeff,
Some day you should try this site in IE8.
It looks weird.

I’m sorry, it can’t be a paul graham essay parody without an excessive number of footnotes.

There is a lot of talk about randomly generated code here…

Genetic algorithms are good for this, and in fact can independently solve or optimise many problems (nothing complicated, sadly). You will probably want to write a language and compiler specially, though…

The basic premise is to copy how evolution works, i.e. introduce random changes to the code and test each variant against some fitness function, keeping the most fit programs and culling the remainder, then repeating the randomisation process (a rough sketch of this loop follows below). It looks like Markov chains could be applied to optimise this (i.e. use that distribution instead of ‘true’ randomness), if someone hasn’t already done this. :)

That might be a way to avoid creating a language just for this purpose… since it could recreate good syntax for an existing language, at least some of the time.
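
For what it’s worth, the mutate/test/cull loop described above fits in a few lines of Python. The sketch below is only a toy: the “program” is just a string and the fitness function is an invented match-the-target score, not real code synthesis, but it shows the shape of the loop.

```python
# Toy genetic algorithm: mutate candidates, score them against a fitness
# function, keep the most fit, cull the rest, and repeat.
import random
import string

TARGET = "print hello"                      # stand-in for a "correct program"
ALPHABET = string.ascii_lowercase + " "

def fitness(candidate):
    # Higher is better: how many positions already match the target.
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(candidate, rate=0.1):
    # The "introduce random changes" step.
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in candidate)

# Start with a population of completely random candidates.
population = ["".join(random.choice(ALPHABET) for _ in TARGET) for _ in range(100)]

for generation in range(1000):
    # Keep the most fit candidates, cull the remainder, then re-randomise.
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == len(TARGET):
        break
    survivors = population[:20]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(80)]

print(generation, repr(population[0]))
```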

Well, this may finally explain the comic Zippy. It is a Markov chain.

The first time I came across Markov Chains was a blog post about creating fake Italian surnames.

http://doubtingtommaso.blogspot.com/2008/03/markov-chains.html

Pretty cool tech! Quite powerful given how broadly it can be applied.

These are those fun comments I see on some of my sites (usually followed by a link to some spam site); I never knew quite how they were generated. If only we could somehow apply them to mad-libs…

Well hi! Thanks for, er, making an example of me, Jeff. Glad you like it.

  • “Is it a coincidence that I wrote a chatbot with MegaHal(another 4th order markov chain) for a game lobby this morning or are their spy cameras in my cereal!?”

I can’t speak for building security, Tom, but MegaHal is unquestionably what first turned me on to Markov theory back in the day.

  • “I’d like to see this applied to music and see what happens if you feed it all the works of Mozart…”

It’s been done. Heck, I did a very small version of it myself for a project in college – analyzing only melodies, and looking at melodic intervals and time intervals as two separate tables rather than at melody+time as a single block. What I produced were melodies that sounded like Mozart teeteringly drunk and standing on one foot before the keyboard; but that’s audibly different from truly randomly generated notes, so the project was considered a success.
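
Purely to illustrate that two-separate-tables idea (the seed melody below is invented, and this is only the gist of the approach, not the actual project code):

```python
# Rough sketch: one Markov table for melodic intervals (semitones) and a
# separate one for note durations (beats), sampled independently.
import random
from collections import defaultdict

intervals = [2, 2, 1, -3, 2, -2, -2, 5, -1, -2, 2]  # semitone steps between notes
durations = [1, 1, 2, 1, 0.5, 0.5, 1, 2, 1, 1, 4]   # note lengths in beats

def build_table(seq):
    # 1-order table: for each value, the values seen following it.
    table = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        table[a].append(b)
    return table

interval_table = build_table(intervals)
duration_table = build_table(durations)

pitch = 60                 # start on middle C (MIDI note number)
interval = intervals[0]
duration = durations[0]
melody = []
for _ in range(16):
    melody.append((pitch, duration))
    interval = random.choice(interval_table.get(interval, intervals))
    duration = random.choice(duration_table.get(duration, durations))
    pitch += interval

print(melody)
```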

One of the challenges with using Markov theory to synthesize creative output is that you have to decide how far-seeing your criteria are, and that leads to a tradeoff between novelty and coherence.

A 1-order Markov model says: here’s word A; which set of words {B} can follow it? Then you pick a single word B from {B}, see what {C} contains, and so on. It turns out that 1-order chains lead to miserable nonsense. Garkov (and MegaHal, and I think probably most Markov language toys) is a 2-order model: given words A and B in sequence, look at possible followups {C}. The three-words-at-a-time process gets us a lot closer to coherence, but it also means the sentences are less novel – there are fewer places where [A, B] has multiple options in {C}, and even where it does, {C} has fewer options to choose from.
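
To make that 2-order table concrete, here is roughly what the “given [A, B], pick from {C}” walk looks like in Python; the one-line corpus is a stand-in, and the real Garkov code is of course more involved than this:

```python
# 2-order, word-level Markov chain: map each adjacent word pair [A, B] to
# the words {C} seen following them, then walk the table at random.
import random
from collections import defaultdict

corpus = "the cat sat on the mat and the cat ate the lasagna on the table".split()

table = defaultdict(list)
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    table[(a, b)].append(c)

a, b = random.choice(list(table))   # start from a random known pair
output = [a, b]
for _ in range(20):
    followups = table.get((a, b))
    if not followups:               # dead end: no word ever followed this pair
        break
    c = random.choice(followups)
    output.append(c)
    a, b = b, c

print(" ".join(output))
```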

A high-order Markov model produces very coherent output but rarely produces novel output. But there’s also a balance of the order of your model against the size of your corpus – the collection of data you’re feeding into it – and as corpus size goes up, coherence creeps down. So choosing the order of your model depends on how much data you’re working with as well.

Which is all to say: as it works for words, so it can work for music, but what the results are like depends on (among other things) how you parse the music on input, how much music you’re processing, and the order of model you’re using to analyze and synthesize new stuff.

  • “That’s my IRC bot which uses a Markovian style algorithm.”

Markov IRC bots are one of the best ideas since sliced bread, yeah.

  • “That’s more a feature than a bug :) By using a specific text as an input you can produce a text sounding like the input. E.g. texts in old English or try poems etc.”

Exactly, David. Try it with arithmetic for some really bizarre stuff; try it with source code as well. Markov is naive, but very willing.

  • “How about Markov/Dilbert project - oh wait, it already works that way…”

Heh. In principle, the Garkov code is actually Garfield-agnostic – it’s just a bidirectional Markov structure and some generic display code, paired up with a Garfield display font, some Garfield background strips, and a bunch of Garfield transcripts. I’d very much like to do some other comics with it in the future – the big trick is those transcripts, so if anybody wants to spend a Saturday plowing through Dilbert (or, better, Mary Worth), let me know.

  • “Or maybe even analyse adjacent pixels in graphics/photos and see what kind of weird amorphous blob it produces.”

Ha! That could be fun. Difficult, but fun. Putting together a big corpus – and preprocessing it somehow to get the image complexity to a nice middle area in terms of total number of colors – would probably be the biggest challenge.

  • “Is the same technique used for the order of the words?”

I know you righted yourself already, but Markov models built at the letter rather than the word level have been done before as well, and are rather fun. If you want to generate some new plausible non-words in the language of your choice, a 2-order model does a great job. There is a nice brief writeup that touches on that line of thinking (with a bilingual twist): http://www.teamten.com/lawrence/projects/markov/ . I know I have seen others, but I can’t google to save my life this morning.
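
As a quick illustration of that letter-level version (the word list below is tiny and made up; a real lexicon for the language of your choice works much better):

```python
# 2-order, letter-level Markov model for generating plausible non-words.
import random
from collections import defaultdict

words = ["lasagna", "garfield", "markov", "monday", "coffee", "jonathan"]

table = defaultdict(list)
for w in words:
    w = "^" + w + "$"                 # mark word start and end
    for a, b, c in zip(w, w[1:], w[2:]):
        table[(a, b)].append(c)

def make_word():
    a, b = "^", random.choice(words)[0]   # begin with a known starting letter
    out = b
    while True:
        c = random.choice(table[(a, b)])
        if c == "$":                      # hit an end-of-word marker
            return out
        out += c
        a, b = b, c

print([make_word() for _ in range(5)])
```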

  • “Well, this may finally explain the comic Zippy. It is a Markov chain.”

Bill Griffith is a genius.

I’m going on at irresponsible length. This is fun stuff, and again, I’m glad you like the Garkov.

-j

Josef became a novel expands on the south west side of energy). She names him as a children’s advocacy group that he moved to the overbridge at the National Urban League; Alan Wolfe of the Kapelle, Reicha was served mainly by Sheffield Midland - Leeds stopping services. The line is markedly virtuosic, reflecting his own time, Miss Fellowes fights the Brookings Institution; Norman J. Ornstein, resident scholar Ludwig Schiedermair in Wallerstein, before his letters to return Timmie to liberate Timmie.
Josef Reicha wrote music publishing firm, who played in ‘Der junge Beethoven’ (Leipzig, 1925) gave specific examples taken from

Only because you brought it up (in a way), Jeff: Are you still using POPFile?

Chapter 3 “Design and Implementation” of Kernighan and Pike’s The Practice of Programming, http://books.google.com/books?id=to6M9_dbjosC, features the Markov Chain algorithm. The chapter also compares the algorithm’s implementation in C, C++, Perl, Java, and Awk. Read it; it is elegant.

I found a svada generator some time ago, which probably works in the same way. It generates grammatically correct sentences without any meaning.

http://www.geekhouse.no/~anders/Svada/

Here is a sample output:
http://www.geekhouse.no/~anders/Svada/wrapper.cgi?lines=42&grammar=oopl

Just to point it out, it looks like the Garfield Randomizer is no longer defunct.

Machine-intelligence music composers are getting better and better. “Vladimir” is one of the latest efforts, and it really does quite well with classical music; less so with more modern music, although I don’t know that you could tell the difference between a 12-tone-row-based composition by a human or a robot. http://www.aaai.org/AITopics/pmwiki/pmwiki.php/AITopics/Music

@phatmonkey - you sir, are a freaking CompSci legend.

Jeff,

You can hear similar nonsense without Markov chains if you interview people for programming jobs on a daily basis.

“Never underestimate the power of human stupidity.” (Robert Heinlein)

@Victor,

“Any good sources for Markov chain generators which take binary input? It could be fun to see what kind of executables it would produce…”

That, my friend, would be wicked cool! Although I would run such a program inside a vault just as a precaution, in case our Markovian baby decides to write anything to the hard disk! :)

@Victor:

It might be easier to randomly generate assembly language.

Marius, owing specifically to its strong grammaticality, the Svada thing looks to be a different beast from a Markov model (at least at the word-by-word level).

Markov’s greatest strength is its simplicity – it is a tiny wisp of a principle from which all kinds of exciting and surprising things can emerge, including, e.g., novel, semi-coherent text. But you can never depend on anything more than semi-coherence on the whole.

The model’s greatest weakness is its lack of context. It knows nothing of sentence structure, or of the logical relationships between words in its tables beyond sheer proximity. Part of what makes Markov language output amusing in many cases is the tendency toward sudden surreal shifts across an unexpected point of transition; that the transition is sometimes grammatical or even semantically plausible is just luck, not a feature of the model.

A pure Markov model is in that sense literally more context-free than even a Context-Free Grammar.

It’s possible that the Svada code depends on a CFG for generating its sentences; or it may rely on some more structurally complex model, or on templates. It’s very neat work, regardless, but I don’t believe at a glance that it is Markovian at all at the level of sentence construction.
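
I have no idea what Svada actually does under the hood, but just to show the contrast: a context-free grammar generator builds sentences by expanding rules rather than by walking adjacency tables. A toy version, with an entirely made-up grammar, looks like this:

```python
# Toy context-free grammar generator: output comes from recursively
# expanding rules, not from tables of which words happened to follow which.
import random

grammar = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N":  [["cat"], ["lasagna"], ["comic"]],
    "V":  [["eats"], ["draws"], ["sleeps"]],
}

def expand(symbol):
    if symbol not in grammar:                    # terminal: an actual word
        return [symbol]
    production = random.choice(grammar[symbol])  # pick one rule for this symbol
    return [word for part in production for word in expand(part)]

print(" ".join(expand("S")))
```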

Ah, sweet Finnegan:

riverrun, past Eve and Adam’s, from swerve of shore to bend
of bay, brings us by a commodius vicus of recirculation back to
Howth Castle and Environs.
Sir Tristram, violer d’amores, fr’over the short sea, had passen-
core rearrived from North Armorica on this side the scraggy
isthmus of Europe Minor to wielderfight his penisolate war: nor
had topsawyer’s rocks by the stream Oconee exaggerated themselse
to Laurens County’s gorgios while they went doublin their mumper
all the time: nor avoice from afire bellowsed mishe mishe to
tauftauf thuartpeatrick: not yet, though venissoon after, had a
kidscad buttended a bland old isaac: not yet, though all’s fair in
vanessy, were sosie sesthers wroth with twone nathandjoe. Rot a
peck of pa’s malt had Jhem or Shen brewed by arclight and rory
end to the regginbrow was to be seen ringsome on the aquaface.
The fall (bababadalgharaghtakamminarronnkonnbronntonner-
ronntuonnthunntrovarrhounawnskawntoohoohoordenenthur-
nuk!) of a once wallstrait oldparr is retaled early in bed and later
on life down through all christian minstrelsy. The great fall of the
offwall entailed at such short notice the pftjschute of Finnegan,
erse solid man, that the humptyhillhead of humself prumptly sends
an unquiring one well to the west in quest of his tumptytumtoes:
and their upturnpikepointandplace is at the knock out in the park
where oranges have been laid to rust upon the green since dev-
linsfirst loved livvy.