Exploring Wide Finder

oliver · June 10, 2008, 12:00am

Btw. on the issue of threading there’s some interesting stuff today at http://www.gnome.org/~michael/blog/2008-06-10.html , and linked from there: http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf - “The Problem with Threads”.

(Though, Meeks’ conclusion which mentions helgrind as a helper is somewhat opposite to the PDF, which rather wants to “discard threads as a programming model”… And while I think helgrind is a nice tool for fixing those damn multithreaded apps, I prefer to completely steer clear of threading if possible)

charles10 · June 10, 2008, 12:00am

Regarding language support for multi-core and silver bullets, I wonder how good Apple’s solution with Grand Central in Snow Leopard will be. Unfortunately, it is NDA right now, but when it is out there in public (and open source?), that could be an interesting option.

“Grand Central, a new set of technologies built into Snow Leopard, brings unrivaled support for multicore systems to Mac OS X. More cores, not faster clock speeds, drive performance increases in today’s processors. Grand Central takes full advantage by making all of Mac OS X multicore aware and optimizing it for allocating tasks across multiple cores and processors. Grand Central also makes it much easier for developers to create programs that squeeze every last drop of power from multicore systems.”

So basically, support in the language and some dynamic decisions about allocation of threads.

see http://www.apple.com/pr/library/2008/06/09snowleopard.html and http://www.apple.com/macosx/snowleopard/

astigmatik · June 10, 2008, 12:00am

Regarding Tim Bray’s claim that Ruby is the most readable, I have to disagree as well. C’mon… ‘puts’? ‘chomp’? Its ‘unreadability’ is one of the reasons I never went far with it.

I’m tempted to say that Python is the most readable of all.

Codewiz51 · June 10, 2008, 12:00am

This seems to be much ado about nothing.

Adding even the simplest OpenMP directives has a large impact on performance, without a significant increase in source code.

Why in God’s name would you attempt performance critical code in an interpreted language anyway?

ICR21 · June 10, 2008, 12:00am

I don’t know Ruby, so I found it very difficult to read the first time round. I’m guessing though that if I did know Ruby it wouldn’t be so hard - indeed after reading the comments and explanations I can follow it okay.

Personally one thing about Ruby I’m not too keen on is making things try to read as close to English as possible, because more often than not it seems to mask a lot of the details of what’s going on. What may be quicker to understand at a conceptual level at a glance becomes very difficult to debug when it’s ever so slightly wrong.

There are lots of things I like about Ruby, but the Ruby style and the Ruby culture don’t float my boat, which is a shame.

ICR22 · June 10, 2008, 12:00am

“Their uptime is fairly pathetic for a serious outfit and they’d probably have been quicker to rewrite Twitter in a “serious” language by now rather than desperately trying to work out how to hack RoR into a serious production environment.”

I think Twitter’s problems come more from a very poor architecture rather than the language. http://dev.twitter.com/2008/05/twittering-about-architecture.html

MarcelP · June 10, 2008, 12:00am

“While you’re there, I also suggest reading Tim’s analysis of the results…”

Tim’s analysis is a wikipedia page? I think you got the wrong URL there…

Lee_Grissom · June 10, 2008, 12:00am

A couple of other people mentioned Microsoft Parallel Extensions. Check out Allen Bauer’s blog, he digs into it.
http://blogs.codegear.com/abauer/2008/02/22/38857

RichardH · June 10, 2008, 12:00am

This from a recovering Delphi fan:

WHERE ARE THE COMMENTS?

line =~ %r{GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) }

One simple comment would avoid the need to load my mental RE parser. Once I have figured out what it does, then I need to guess why it’s there.
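For what it’s worth, here is a rough sketch of what a commented version could look like in Ruby, using the x flag so the pattern can carry its own notes. The pattern is the one quoted above; the constant name ARTICLE_HIT, the helper article_path and the sample log line are made up for illustration. The "why" in the comments is a guess: excluding dots appears to keep out requests for files such as images.

```ruby
# Matches a GET request for an article under /ongoing/When/ and captures
# the article path (date plus title) in group 1.
ARTICLE_HIT = %r{
  GET\ /ongoing/When/    # literal request prefix; the space is escaped for x mode
  \d\d\dx/               # three digits and an x, e.g. 200x
  (                      # capture 1: the article path
    \d\d\d\d/\d\d/\d\d/  #   a date, yyyy/mm/dd
    [^\ .]+              #   then one or more characters that are neither space nor dot
  )
  \                      # the space that ends the requested path
}x

# Returns the captured article path, or nil when the line is not an article hit.
def article_path(line)
  match = ARTICLE_HIT.match(line)
  match && match[1]
end

p article_path("GET /ongoing/When/200x/2007/06/17/Web3S HTTP/1.1")
```

A request whose path contains a dot (say, one ending in .png) returns nil, because the title part must run straight into a space.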

Clever code is more fun to write than it is to maintain.

Cheers

MarkA · June 10, 2008, 12:00am

@Jon Raynor

Fewer lines of code doesn’t necessarily equal beautiful code.
Additionally, machines (compilers) don’t care about beauty, they just run it.

This is where Ruby (at least using Matz Ruby Interpreter) breaks all the rules. Because it is a primitive AST-walking interpreter, the smaller a program is syntactically, the faster it is likely to run (at least compared to other semantically similar programs.)

I suspect this fact has a lot to do with Rubyists’ obsession with syntactic compression…

Cedric · June 10, 2008, 12:00am

Sorry Mark, but this is patently false. The quality and speed of the code produced is completely unrelated to the size of the source code, as anyone who’s ever written a “Hello world” in any language can show.

As it turns out, Ruby is one of the slowest dynamically typed languages on the map today.

Shmork · June 10, 2008, 12:00am

This part sums up the unreadability of it to me:

keys_by_count[0 .. 9].each do |key|
  puts "#{counts[key]}: #{key}"
end

Obviously this is some sort of foreach loop. But does it increment key? or what? 0 .. 9? Two dots? I mean, I’m sure it DOES make sense, but it doesn’t look like any improvement over a simple foreach construction, as even the much maligned PHP can do.
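To answer the question in passing: the range inside the brackets is an inclusive slice, so the expression takes the first ten elements of the array, and each hands them to the block one at a time. Nothing is incremented. A small sketch, where the method name top_ten_lines and the sample data are invented for illustration:

```ruby
# Builds one "count: key" line for each of the first ten keys.
def top_ten_lines(counts, keys_by_count)
  # 0..9 is an inclusive range; used as an index it slices out elements 0 through 9.
  keys_by_count[0..9].map { |key| "#{counts[key]}: #{key}" }
end

# Twelve made-up keys with counts 12 down to 1.
counts = {}
12.times { |i| counts["k#{i}"] = 12 - i }
keys_by_count = counts.keys.sort { |a, b| counts[b] <=> counts[a] }

puts top_ten_lines(counts, keys_by_count)
```

If the array holds fewer than ten keys, the slice simply returns what is there.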

But anyway, I think the overall point is totally spot on. The people who rant about elegance, beauty, the way it “should” be done are often far, far removed from the trenches. A good programmer knows not to get obsessed with elegance and beauty, to keep a strong sense of realism at all times.

Jaster · June 10, 2008, 12:00am

The syntax is very readable except for the overuse of non-obvious chars |#~%
these don’t say anything to a non-Ruby programmer… i.e. have to be learned…

This is why APL died…and why many people hate regexes

You have an editor with auto-completion, the compiler does not care about verbose syntax, the program will be exactly the same size and run the same speed … so why are you using lots of silly chars …?

Tony · June 10, 2008, 12:00am

“I think Twitter’s problems come more from a very poor architecture rather than the language.”

A big part of the problem is the framework they chose: Rails. And, yes, I CAN place the blame there, even if the architecture is ultimately the problem. I can do that because the Rails framework encourages a certain DB architecture - one that is inherently unscalable - and discourages more innovative and scalable architectures (sharding, for example).

I don’t understand the mindset of the Ruby / Rails crowd (and yes, I group them together, because it was the advent of Rails that has caused the recent explosion in the popularity of Ruby). It’s really something of a fanboy cult. And if you have any criticism to offer? Well: http://www.robbyonrails.com/articles/2006/04/13/canada-on-rails-day-1-part-1

After spending three months working full-time in Ruby, I can say I don’t get it. I did not find it at all intuitive or easy to use. In fact, quite the opposite. I realize that you spend a lot of time in the manual when learning a new language, but it was ridiculous the amount of time I spent looking up how to do things - things that I would have expected to be obvious.

Ruby MIGHT be taken as a serious language, but only if the current attitudes that are associated with it are set aside. It is one tool among many, and may or may not be the best choice. But so far, I haven’t seen the case where Ruby is clearly the better choice.

engtech · June 10, 2008, 12:00am

“what does this do… well i have to guess massively and say that it sorts the keys and stores them in a new array… but the bit inside the {} is completely ambiguous. it does something to do with pairs of values and something which /looks like/ it might be a swapping operation.”

You’re stumbling over the syntax for Ruby blocks. Yes, other languages don’t do them and it’s a shame because they’re awesome. It takes about 10 minutes to get the hang of them with a good explanation.

This might not be a good explanation.

First of all, {} is the same as do/end when it comes to blocks. People use {} for one-liners, do/end for multi-liners.

Here’s an example:

for (int i = 0; i <= 10; i++) {
  do_something(i);
}

is the same as

(0 .. 10).each do |i|
  do_something(i)
end

is the same as

(0 .. 10).each { |i| do_something(i) }

So what are we doing? We’re using a ‘range’ to describe the numbers from 0 to 10. Then we’re using ‘each’ to execute an arbitrary block of code on every one of them.

Because it’s a range (an array works the same way), our block will get one argument. If it were a hash, our block would get two args: the key and the value.

|i| names the argument to the block i
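A minimal sketch of the one-argument versus two-argument case, in case it helps. The method names squares and pairs are made up for the example:

```ruby
# A range or an array hands the block one value per iteration.
def squares(numbers)
  result = []
  numbers.each { |i| result << i * i }
  result
end

# A hash hands the block a key and a value per iteration.
def pairs(hash)
  result = []
  hash.each { |key, value| result << "#{key}=#{value}" }
  result
end

p squares(0..3)
p pairs({ "a" => 1, "b" => 2 })
```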

But that’s not so useful, right? It’s just a loop with different syntax.

Here’s why it’s cool: you can use blocks with any code.

def my_function
  print "Hello "
  yield
  print " World"
end

and calling it with:

my_function() do
  print "Goodbye"
end

would give you

“Hello Goodbye World”

engtech · June 10, 2008, 12:00am

oh, and Ruby is a lot easier to read if you’re coming from a perl background.

Stuff like the regular expression syntax and the sorting syntax isn’t that strange if you come from a perlish background.

Wayne · June 11, 2008, 12:00am

If I was writing that program I’d avoid the sort routine and just keep hold of the top ten most frequent strings.
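Something along these lines, perhaps. This is only a sketch of the idea, not code from the original program; the method name top_n and the sample counts are invented. It walks the counts once and keeps a small list of the current leaders instead of sorting every key:

```ruby
# Returns the n highest-count [key, count] pairs, highest first,
# without sorting the whole hash.
def top_n(counts, n = 10)
  return [] if n <= 0
  top = [] # kept ordered, highest count first, never longer than n
  counts.each do |key, count|
    # Skip anything that cannot beat the current lowest leader.
    next if top.size == n && count <= top.last[1]
    # Insert before the first leader with a smaller count.
    index = top.index { |_, c| c < count } || top.size
    top.insert(index, [key, count])
    top.pop if top.size > n
  end
  top
end

p top_n({ "a" => 1, "b" => 5, "c" => 3 }, 2)
```

Whether this beats a plain sort in practice depends on how many distinct keys there are; with only a few thousand the difference is probably not worth the extra code.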

FrancisF · June 11, 2008, 12:00am

The speed Ruby gives has nothing to do with its execution time. That said, Ruby 1.9 is coming (not for Rails yet though, big change) and JRuby is pretty quick too - it’s being worked on.

I can do stuff in a quarter of the time (or less) it used to take me in Java, with a tenth of the code. It is easier to maintain because of the use of blocks, which allow templating:

1. set up database connection
2. execute query, or do something to the database
3. tear down the connection
4. and handle any exceptions gracefully

With blocks you just pass in step 2, and you are not at home to the cut-and-paste monster eating your tear-down code. You can do just about anything in the block, and the exception handling can be halfway sane. This applies to all situations of this kind: reading and writing files, etc. You can do this in Java but it is very hard, because the language doesn’t like you, the programmer; it doesn’t trust you to know what you are doing (why is String final? Because it doesn’t trust you - I could go on).
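A rough sketch of that set up / do the work / tear down / handle errors shape, assuming nothing about any real database library. FakeConnection and with_connection are hypothetical names invented for this example; the fake connection only records what happened to it.

```ruby
# Hypothetical stand-in for a real database handle; it just logs calls.
class FakeConnection
  attr_reader :log

  def initialize
    @log = ["open"] # set-up happens here
  end

  def execute(sql)
    @log << "execute: #{sql}"
  end

  def close
    @log << "close"
  end
end

# Wraps the caller's block with set-up, error handling and tear-down.
def with_connection(connection = FakeConnection.new)
  yield connection # the caller's work goes here
  connection
rescue StandardError => error
  connection.log << "error: #{error.message}" # handle the failure in one place
  connection
ensure
  connection.close # tear-down always runs, even after an error
end

p with_connection { |c| c.execute("SELECT 1") }.log
```

The caller writes only the middle part; the tear-down code lives in one method and cannot be lost to copy and paste.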

I think the example is poor, because it reads like something you’d type into the interactive console when you were trying things out. If it needs comments it’s been written wrong - cf. Fowler’s “Refactoring” book where there is a pattern of replacing a complicated comment with a well-named, and factored, method.

There’s the whole metaprogramming and code generation thing, which is really easy in Ruby - see http://s3.amazonaws.com/giles/scissors_041108/scissors.pdf (pdf!!)

Have a poke around in the Scotland on Rails website for more interesting stuff…

http://scotlandonrails.com/talks

Somewhere in the blogosphere there is also an interesting article where some code is actually refactored to have more lines, to make it maintainable. Less is more, but more is better when you can understand it. This is on the Rails Envy podcast site somewhere.

ICR23 · June 11, 2008, 12:00am

As a side note, I and others seem to perform this task so often that I’m surprised I don’t see more implementations of FrequencyDictionary ADTs.
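A minimal sketch of what such an ADT might look like in Ruby. The class name comes from the comment above, but the method names add, count and most_frequent are assumptions, not an existing library:

```ruby
# Counts how often each item has been added.
class FrequencyDictionary
  def initialize
    @counts = Hash.new(0) # missing keys read as zero
  end

  # Records one occurrence of item; returns self so calls can be chained.
  def add(item)
    @counts[item] += 1
    self
  end

  # Number of times item has been added (0 if never).
  def count(item)
    @counts[item]
  end

  # The n most frequent [item, count] pairs, highest count first.
  def most_frequent(n = 10)
    @counts.sort_by { |_, count| -count }.first(n)
  end
end

words = FrequencyDictionary.new
%w[a b a c a b].each { |word| words.add(word) }
p words.most_frequent(2)
```

The order of items with equal counts is not guaranteed in this sketch.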

Ron · June 11, 2008, 12:00am

Just on the odd characters Ruby uses: a friend sent me a Java code fragment in which he looped through printing “Thank You!” a million times (it was a response to a professor who had extended the deadline on a paper). I responded with a single line of Ruby to do the same, and a single line of Lisp.

He wrote back:
underscores, pipes, octothorpes, curly braces – sheesh…
I’ll take a mild dose of verbosity if it means I don’t have
to code something that looks like it’s been zipped already