Threading, Concurrency, and the Most Powerful Psychokinetic Explosive in the Universe

Back when I was writing for Tech Report, I had an epiphany: the future of CPU development had to be multiple cores on the same die. Even in 2001, a simple extrapolation of transistor counts versus time bore this out: what the heck are they going to do with those millions of transistors they can add to chips every year? Increase level two cache to twenty megabytes? Add to that the well-known heat and scaling problems of Intel's "more GHz, less work per MHz" Pentium 4 architecture and you've got a recipe for both lower clocks and lots of transistors. When you can't go forward, go sideways: more CPUs on the same die.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2004/12/threading-concurrency-and-the-most-powerful-psychokinetic-explosive-in-the-univ.html

Interesting side note; BeOS had a completely asynchronous API, which is cited as one of the reasons for its failure:


http://www.osnews.com/story.php?news_id=3064page=5

BeOS has a very elegant API, really a pleasure to work with, but it is not as powerful than any of its competitors. Additionally, there are no good development tools for BeOS, no good visual GUI designers, no full-featured debuggers, no profilers… Also, under BeOS you constantly need to take care of multithreading issues and write your code around the fact that everything is so multithreaded on BeOS that could create deadlocks where you would least expect it. Writing small apps for the BeOS is a joy, writing anything more complex or serious though is a real pain.


http://www.jelovic.com/weblog/e102.htm

I’ve been hearing for the last few years how BeOS has an amazingly responsive GUI because of the way it is written. You almost never wait for the GUI to respond like you do under Windows or OS X.

That seemed too good to be true; something was obviously missing from the picture. Finally, after reading this article on OS News I understood the BeOS thing: it’s not fast, it simply has a largely asynchronous API. This provides a good GUI experience but is a pain in the ass to program.

At least with traditional programming languages. But say you used a language based on the join calculus like the Polyphonic C#… Hm…

A lot could be gained by familiarizing VB devs with concurrency patterns and how they apply to different kinds of problems.

That’s a tall order; education always is. A healthy start in this direction is Chris Sells’s articles in the MSDN Wonders of Windows Forms column:

Okay, so maybe education is just a noble but unrealistic goal. I imagine there’s a need for a concurrency class library, with an API designed for maximum usability from the perspective of VB devs, that makes bulletproof multithreading as easy as falling off a log most of the time.

Come on: concurrency is a solved problem. So the next step is to make it easy to do right without a comp-sci degree.

I don’t agree that it’s solved; even if it was, how do you deal with the order-of-magnitude increase in difficulty? Clearly we need a new paradigm.

The good news is, it’ll take a solid 3 years before multi-cpu machines* are common.

* beyond hyperthreading, which is only good for a 15 percent performance increase under the most optimal of programming conditions. Hyperthreading is a “mini-me” CPU that isn’t capable of full computing tasks. Interesting commentary from Raymond Chen on this here:

http://weblogs.asp.net/oldnewthing/archive/2004/09/13/228780.aspx

Okay, I take back most of what I said. I thought it would be easier to abstract away some of the complexity of concurrency that the BCL leaves uncovered, but my attempts in the past week to build a novice-friendly class library for multithreading proved fruitless. I’d forgotten how difficult it is to solve even application-specific threading problems, such as implementing the asynchronous methods BeginDoMyWork(), CancelDoMyWork(), and EndDoMyWork() alongside the synchronous method DoMyWork().
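For readers who haven’t met the Begin/Cancel/End shape described above, here is a minimal sketch of it in Python rather than .NET. Everything in it is an assumption made for illustration: the names `WorkHandle`, `WorkCancelled`, and the four functions are hypothetical, the "work" is just summing numbers, and cancellation is cooperative (the work has to check a flag itself). It is not how the BCL or any real library does it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class WorkCancelled(Exception):
    """Raised when the work notices that cancellation was requested."""


class WorkHandle:
    """Hypothetical handle tying one background call to its own cancel flag."""

    def __init__(self, future, cancel_event):
        self.future = future
        self.cancel_event = cancel_event


# Small shared pool for the background calls (size chosen arbitrarily).
_pool = ThreadPoolExecutor(max_workers=2)


def do_my_work(items, cancel_event=None):
    """Synchronous version: sum the items, checking the cancel flag per item."""
    total = 0
    for item in items:
        if cancel_event is not None and cancel_event.is_set():
            raise WorkCancelled()
        total += item
    return total


def begin_do_my_work(items):
    """Start the work on a pool thread and return a handle immediately."""
    cancel_event = threading.Event()
    future = _pool.submit(do_my_work, items, cancel_event)
    return WorkHandle(future, cancel_event)


def cancel_do_my_work(handle):
    """Request cancellation; the work stops at its next flag check."""
    handle.cancel_event.set()


def end_do_my_work(handle):
    """Block until the work finishes; re-raises WorkCancelled or other errors."""
    return handle.future.result()


if __name__ == "__main__":
    handle = begin_do_my_work(range(1000))
    print(end_do_my_work(handle))
```

Even this toy leaves the hard questions open: what happens if the caller never calls the End function, whether a cancel that arrives after the work has finished counts as an error, and which thread the result should be delivered on.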

I haven’t fiddled around with VS 2005 yet, but http://msdn2.microsoft.com/library/system.componentmodel.backgroundworker.aspx looks like it’s a step in the right direction. In the meantime, Rocky Lhotka suggested something similar for VB devs using VS 2002: http://msdn.microsoft.com/library/en-us/dnadvnet/html/vbnet09272002.asp.

I think that’s what Chris was getting at: we don’t have the right toolset to solve the concurrency problem. Someone like Anders Hejlsberg has to attack it and change the framework… possibly even come up with a new language entirely.

All we can offer is band-aids like BackgroundWorker. Which is certainly better than nothing, but it’s nowhere near the paradigm change we’re gonna need three years from now if these predictions pan out!

Scott Swigart has a great blog entry that highlights some of the strategies Microsoft is pursuing to make concurrent programming easier:

http://swigartconsulting.blogs.com/tech_blender/2005/12/multithreaded_p.html

It’s odd to see a discussion like this and not see anyone mention Erlang. http://www.erlang.org

My personal opinion is that Erlang is too “functional” to pick up the necessary programmer support… although I think the “the general public never really masters good multi-threaded programming and it ever remains a thing for the masters” outcome probability is non-zero and in that case something semi-obscure like Erlang could still win out.

But I suspect that if nothing else, we’re going to end up with the basic Erlang model: Extremely cheap messaging, extremely cheap threads (want 5000? go for it), and absolutely no modifying the data in one thread from another, at least logically. (For performance reasons behind the scenes there may be “shared data”, but it’d be copy-on-write.) Might as well chuck in the network-transparency on the messaging while you’re at it.

Technically, there’s no reason that these three characteristics can’t exist in a “conventional”, non-functional language, which is why I don’t think Erlang is going to be the Big Winner since that new language is likely to get a lot more support from the general programming public. But the relationship will be clear.

But I don’t think you’re going to be able to patch your way to it from current languages anytime soon. Dynamically scheduling your heavy threads onto multiple processors under varying loads and trying to share data space the entire time is just fundamentally going to be too much to deal with. Much better to break the task up into lots of small parts and let the computer manage scheduling. Erlang proves it’s possible, at least if you build it into the language from day one, which was really the hard part of believing this.
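The three characteristics listed above (messaging, many small workers, no shared mutable data) can be imitated in a conventional language, which is roughly the point being made. Below is a rough Python sketch, not Erlang: `spawn`, `send`, `make_adder`, and `STOP` are hypothetical names, each "process" is an ordinary OS thread (so nowhere near as cheap as an Erlang process), and isolation is faked by deep-copying message data on send.

```python
import copy
import queue
import threading

STOP = object()  # sentinel telling an actor to shut down


def spawn(handler):
    """Start a worker with its own mailbox and return that mailbox."""
    mailbox = queue.Queue()

    def loop():
        while True:
            message = mailbox.get()
            if message is STOP:
                break
            handler(*message)

    threading.Thread(target=loop, daemon=True).start()
    return mailbox


def send(mailbox, kind, data=None, reply_to=None):
    """Deliver a message; the data is copied so sender and receiver share nothing."""
    mailbox.put((kind, copy.deepcopy(data), reply_to))


def make_adder():
    """Build a handler whose running total is private to the actor's thread."""
    total = 0

    def handle(kind, data, reply_to):
        nonlocal total
        if kind == "add":
            total += sum(data)
        elif kind == "get":
            reply_to.put(total)  # reply by message, never by shared variable

    return handle


if __name__ == "__main__":
    adder = spawn(make_adder())
    batch = [1, 2, 3]
    send(adder, "add", batch)
    batch.append(100)  # mutating after the send does not affect the actor's copy
    reply = queue.Queue()
    send(adder, "get", reply_to=reply)
    print(reply.get(timeout=5))  # 6: the later append never reached the actor
    adder.put(STOP)
```

Because only the actor’s own thread ever touches `total`, there is nothing to lock. The cost is the copy on every send, which is the trade the comment above alludes to with copy-on-write.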

I think one of the biggest hurdles will be to get the general business developer (i.e. average VB, .NET developer) to think about their application in terms of concurrency. They need to start identifying how their application can perform multiple business tasks in parallel. If they decide that identifying these concurrent processes is too difficult or that it takes too much time, they will then just rely on designing the application as they always have: serially.
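As a small illustration of what "identifying business tasks that can run in parallel" might look like, here is a hedged Python sketch. The order-processing steps (`load_customer`, `check_inventory`, `quote_shipping`) and their return values are invented for the example; the only real point is that the three steps do not depend on each other’s results, so they can be submitted together.

```python
from concurrent.futures import ThreadPoolExecutor


# Three hypothetical, mutually independent steps. In a real application each
# would typically wait on a database or a remote service.
def load_customer(order_id):
    return {"customer": f"customer-for-{order_id}"}


def check_inventory(order_id):
    return {"in_stock": True}


def quote_shipping(order_id):
    return {"shipping": 4.99}


STEPS = (load_customer, check_inventory, quote_shipping)


def process_order_serial(order_id):
    """The habitual design: run each step one after the other."""
    result = {"order_id": order_id}
    for step in STEPS:
        result.update(step(order_id))
    return result


def process_order_parallel(order_id):
    """Same steps, submitted together because none needs another's output."""
    result = {"order_id": order_id}
    with ThreadPoolExecutor(max_workers=len(STEPS)) as pool:
        futures = [pool.submit(step, order_id) for step in STEPS]
        for future in futures:
            result.update(future.result())
    return result


if __name__ == "__main__":
    print(process_order_parallel(42) == process_order_serial(42))
```

The parallel version only pays off when the steps spend their time waiting; the hard part, as the comment says, is noticing that the steps are independent in the first place.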

Multi-threaded, concurrent programming is the hardest programming I have ever done, and I took the course (literally - CSC375) from a master - Doug Lea, who wrote ALL of the java.util.concurrent classes. But once I “got” it, it changed my entire outlook on developing software.
Also, for what it is worth, the java.util.concurrent classes can be translated into .NET (for the most part), as the source is available. As a bonus, the javadocs that Doug wrote are excellent.

I don’t have any expectation that anytime soon, a massive breakthrough will occur that will make parallel programming much easier. It’s been an active research project for many years. Better tools will help and somewhat better programming methodologies will help. One of the big problems with modern game development with C/C++ languages is that your junior programmer who’s supposed to be over there working on how the pistol works can’t have one tiny little race condition that interacts with the background thread doing something. I do sweat about the fragility of what we do with the large-scale software stuff with multiple programmers developing on things, and adding multi-core development makes it much scarier and much worse in that regard.

That’s from John Carmack, hardest of the hardcore C programmers, and the author of Doom, Quake and other games.

http://www.gameinformer.com/News/Story/200701/N07.0109.1737.15034.htm?Page=3
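For anyone wondering what "one tiny little race condition" looks like in practice, here is a deliberately small Python sketch (not C/C++, and not taken from any game code). `UnsafeCounter`, `SafeCounter`, and `hammer` are hypothetical names. The unsafe version reads and writes in two separate steps, so updates from other threads can be lost; whether that actually happens on a given run depends on thread scheduling, so no particular wrong answer is promised.

```python
import threading


class UnsafeCounter:
    """Read-modify-write with no protection: updates may be lost."""

    def __init__(self):
        self.value = 0

    def increment(self):
        current = self.value      # read
        self.value = current + 1  # write; another thread may have written in between


class SafeCounter:
    """Same counter, with a lock making the read-modify-write indivisible."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self.value += 1


def hammer(counter, threads=4, per_thread=10_000):
    """Increment the counter from several threads and return the final value."""

    def work():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value


if __name__ == "__main__":
    print("safe:  ", hammer(SafeCounter()))    # always threads * per_thread
    print("unsafe:", hammer(UnsafeCounter()))  # may come up short on some runs
```

The nasty part is that the unsafe version can pass every test on one machine and fail intermittently on another, which is the fragility the quote is worried about.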

Concurrent programming is difficult because the thread model sucks. The thread model is popular because it maps well to the underlying hardware.

The solution is to change to another paradigm - by switching to a language that integrates it. One option is Erlang. Another is occam, based on the CSP model (http://www.wotug.org). It is extremely easy writing concurrent software with occam - and it’s not even functional (although I regard this as a drawback rather than desirable).

A concurrent programming language implements the hard parts of concurrency once and for all. All the programmer has to do is to adopt the habit of writing easily understandable concurrent programs instead of sequential ones. This shouldn’t be too hard - most problems aren’t strictly sequential by nature. And the “hard part” of concurrency isn’t actually so hard to implement if you have a good underlying model.

Of course, limiting yourself to one model of concurrency prevents you from doing certain things that might be crucial to performance in certain situations. But this is little different from when you have to switch to assembler from within C. Anyway, unless you have a very good understanding of what you’re doing, limits are actually a good thing.

How about a web survey hosted on stackoverflow.com that tracks what proportion of developers are doing web/desktop/embedded development?

Sure, it’s only likely to be roughly accurate, but it’d be interesting statistics just the same …

It occurs to me that one of the advantages of functional- or functional-supporting languages is that functional programming is very much about defined interfaces. In the case of concurrent processing, the protocols for interchanging data and how they operate can be analysed easily in terms of functional concepts. On that basis, I think that good concurrent primitives and support are most likely to be handled best by those who have a good backing in functional programming, in the same way that functional programming concepts help very much in object oriented or procedural development.

Isn’t this what we are all doing with Ajax? If you are coding Ajax, even trivial Ajax, you are generally using multiple execution paths (I don’t know if JavaScript truly runs multiple threads though).

Since nearly all new development is occurring on the web, doesn’t this sort of self-solve the problem (as long as Javascript continues to improve)?

I think concurrency is simply taught incorrectly. Event based code execution is not that complicated, it’s trying to make multiple threads somehow finish at the same time that is complicated (in fact, nearly impossible).

No, Javascript does not use multiple threads, and so programming ‘concurrency’ in Javascript is greatly simplified — your ‘other thread’ is guaranteed to run only when no other code is running, so you don’t have to worry about it executing after a check, or halfway through a statement, or during an assignment, or part-way through a procedure.
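The same guarantee can be seen in any single-threaded event loop. Here is a small sketch using Python’s asyncio as a stand-in for the JavaScript model (so this is an analogy, not JavaScript itself; the `withdraw` function and the account dictionary are made up for the example). The check and the update have no `await` between them, so no other task can run in the middle.

```python
import asyncio


async def withdraw(account, amount, log):
    # No await between the check and the update, so on a single-threaded
    # event loop no other task can run in between them.
    if account["balance"] >= amount:
        account["balance"] -= amount
        log.append(f"withdrew {amount}")
    else:
        log.append(f"refused {amount}")
    await asyncio.sleep(0)  # yield to the loop only after the update is complete


async def main():
    account = {"balance": 100}
    log = []
    # Three "simultaneous" withdrawals of 60 from a balance of 100.
    await asyncio.gather(*(withdraw(account, 60, log) for _ in range(3)))
    return account, log


if __name__ == "__main__":
    account, log = asyncio.run(main())
    print(account["balance"], log)
```

Only one of the three withdrawals succeeds and the balance never goes negative, with no locks anywhere. Put an `await` between the check and the update, though, and the check-then-act problem comes straight back.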

I think the general consensus is simply that imperative programming is fundamentally the wrong paradigm for dealing with concurrent applications. Functional (and semi-functional, like Erlang) languages are making leaps and bounds in this area (such as Erlang’s actor model, or Haskell’s STM), while ‘traditional’ imperative languages seem to have reached a local maximum. Concurrency in these models does still require some thought, but overall I would say that there are far fewer opportunities for horrible bugs to slip in.

So I’m curious, @codinghorror: how do you feel about the multithreading in .NET 4+?

Haven’t looked at it! Current project is Ruby and JavaScript…