What's Wrong With Turkey?

Heh I did not know about the dotted i problem.

At least one of the problems above seems to be, though, that the code was executing in the user’s locale, i.e. a string the developer put in his code was being interpreted in the Turkish locale. This seems to me an ASP.Net oddity: as a convenience you set the entire thread to a specific locale, and suddenly the semantics of your original program state (without any input or output) can change completely.

For me, the proper way to deal with this would be to keep the culture of the thread set to a “sane default” (invariant culture, perhaps?) and keep an eye on converting/parsing as necessary per the user culture wherever input/output takes place. I’m not sure why ASP.Net was chosen to behave the way it does, since it seems an easy source of bugs.
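To make that concrete, here is a minimal C sketch of the "sane default" idea, assuming the strings being compared are ASCII identifiers written by the developer (config keys, keywords and the like). The helper names `ascii_lower` and `ascii_ieq` are made up for illustration. They fold case with fixed ASCII rules instead of the current locale, so a user's Turkish setting cannot change the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: lower-case one byte using ASCII rules only.
   Unlike tolower(), the result never depends on setlocale(). */
static char ascii_lower(char c)
{
    return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

/* Hypothetical helper: case-insensitive equality for internal,
   developer-written ASCII identifiers. Not meant for user-visible text. */
bool ascii_ieq(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t i = 0;
    for (; a[i] != '\0' && b[i] != '\0'; ++i) {
        if (ascii_lower(a[i]) != ascii_lower(b[i])) {
            return false;
        }
    }
    return a[i] == b[i];  /* equal only if both strings end here */
}
```

Text that the user typed or will read still needs culture-aware handling at the input/output boundary; this only covers the program's own strings.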

You just opened a can of worms :slight_smile:

“3) The Turkey Test gets you 90% of the way to the goal of internationalizing most apps.* We know French and Spanish are going to work. Why not test with the most difficult (but realistically difficult) locale first?”

I fail to see how Turkish gets you 90% of the way, unless you can easily say that the character-set limitations and formatting rules for dates, etc. encompass 90% of the possibilities for internationalisation. I think it’s naive (no pun intended) to assume that just because your program is compatible with Turkish formatting, it’s going to be compatible in 90% of circumstances.

The approach to take is not to shortcut internationalisation, but to actually do internationalisation properly. How would you say that Turkey will help you internationalise Norwegian, for example, where you can have circles above letters, or Greek, where there are plenty of other symbols available? Sure, there are common characters, but that’s beside the point… that’s more like specialisation than internationalisation.

I know you’re talking about Latin-based character sets, but how on Earth is that true internationalisation?

As a Hungarian developer, I also have fun with internationalization issues in software. Some aspects every developer could add to their checklist:

  1. Some Hungarian accents (ő, ű) do not fit into the ISO-8859-1 codepage used by most Americans, so in most cases you need Unicode to support Hungarian - at least we fit into the Basic Multilingual Plane, unlike some less fortunate cultures.

  2. Academic collation order cannot be implemented without heuristics about the semantics. Unlike the lucky Czechs, we historically decided to represent some of our phonemes with multiple graphemes (for example, “sz” and “zs” are each a single sound). If you write “egészség”, you need to know that it contains an “sz” followed by an “s”, and not an “s” followed by a “zs”, to put it in the right collation order. The solution was to introduce a so-called technical collation order, which is still more complicated than ASCII ordering, but at least it does not need the semantic analysis.

  3. The thing that gives us the most fun is that we write family and given names in reverse order compared to sane countries. My family name is “Végvölgyi”, the given (first!) name is “Attila”, so my full name is “Végvölgyi Attila”. Although I am used to reversing my names in foreign cultures, for software used in Hungary you need to translate the format string you use to glue the parts of the name together. Even gmail fails to do this. By the way, instead of putting “Mr”, “Mrs” and “Ms” before the full name, we put “úr”, “asszony” and “kisasszony” after it, but only in a very formal salutation.

  4. We glue prepositions to the words they refer to, and sometimes we assimilate the word or the preposition. So if you thought you could translate “with” and “hand” separately, you will not deliver your software to Hungary. “kézzel” is the right form, which is made of “kéz” (hand) and “-vel” (with), assimilated. And yes, we also form plurals in a slightly strange way, but leave that for some other time.

My point is that proper internationalization is a linguistic and cultural issue, and you need to get rid of a lot of assumptions if you would like to develop cross-culturally. If you bought into the lie that simply changing an environment variable will help you make your software “speak” a given language, you will be surprised in real life.
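The name-order point from the list above can be sketched in C like this. It is only an illustration: `NameStyle` and `format_full_name` are hypothetical names, and a real application would load the ordering from its locale data rather than from a hard-coded flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-culture setting (assumption: one flag is enough
   for this example; real name formatting has more cases). */
typedef struct {
    bool family_name_first;  /* true for Hungarian, false for English */
} NameStyle;

/* Writes the full name into buf in the order the culture expects.
   Returns whatever snprintf returns (the length that was needed). */
int format_full_name(char *buf, size_t size, NameStyle style,
                     const char *given, const char *family)
{
    assert(buf != NULL && given != NULL && family != NULL);
    return style.family_name_first
        ? snprintf(buf, size, "%s %s", family, given)
        : snprintf(buf, size, "%s %s", given, family);
}
```

With `family_name_first` set, the given name "Attila" and a placeholder family name "Nagy" come out as "Nagy Attila"; with it cleared, as "Attila Nagy". The point is that the glue order is data, not something to bake into the code.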

The Reason is the upper/lower case problem…The worst bug in the computer science world

Only somewhat related, but a university I went to got a new library website, and it was AWFUL. It was impossible to find anything. You’d find the record you’d want and the button to display it would be exactly opposite of where you expected it to be.

I later met someone who was a developer, and found that they had outsourced the entire thing to Israel. Which wouldn’t be a problem, other than that Israelis read from right to left, not left to right. Which made the design weirdness look a little less random: the left-right orientation was pretty much exactly opposite of what I would have intuitively expected it to be.

Culture is a bitch.

Oh Man. Are you ever gonna get hammered for mentioning Turkey and Midnight Express in the same column.

That’s kind of like mentioning America and…

well there’s nothing bad enough to be an equivalent.

Oh I figured an analogy out.

It’s like mentioning America and Abu Ghraib and saying Abu Ghraib is representative of American morality.

Actually, no, it’s worse than that, because in the case of America, Abu Ghraib is real, but Midnight Express is just a movie.

Hey Now Jeff,
I’ve seen the Turkey test since two good friends are Turkish, Tulih and Volkan, who love soccer and PCs. I really liked this post.
Coding Horror Fan,
Catto

“Or there will be… trouble.”

You had to sneak in a Robocop reference, didn’t you?

“When he visited Turkey in 2004, screenwriter Oliver Stone, who won an Academy Award for his adaptation, apologized for the film, expressing regret that ‘many hearts were broken in Turkey’ due to the film.”

“Midnight Express is ‘more violent, as a national hate-film than anything I can remember’, ‘a cultural form that narrows horizons, confirming the audience’s meanest fears and prejudices and resentments’.” John Wakeman (ed.) (1988). World Film Directors. New York: T.H. W. Wilson Co.

  1. I agree with the first point… the characteristics identified as Turkish issues apply to a lot of European countries (I lived in Sweden and experienced the date, period and character issues discussed), yet some of the best software and OS work comes out of Scandinavia.

  2. Stating that we know things work in France and Spain is erroneous. I did work for a large US multinational and when we were deploying world wide process control systems amongst many countries: Australia, US, Canada, Mexico, France, UK, Brazil we had real issues with the French installs, primarily due to the French version of the OS (in this case Windows on the PC clients).

The US engineers had absolutely no concept that there would be localized versions of the OS (this was in the mid 90s).

  3. There was mention of using ISO standards to enter information - most of the world does, it is the US that does not. Other examples: mph instead of kph, paper size (letter versus A4, the international standard)… heck, even look at NASA and it is all imperial units, yet this is (now) part of an international consortium!!

Localization makes the software presentable; however, the underlying data storage and manipulation should be in ISO formats/standards, and you merely have the localized ‘presentation’ for user input and output (this obviously includes text).
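As a rough C sketch of that split, assuming dates are held in a `struct tm`: one function always produces the ISO 8601 storage form, and a second one applies whatever pattern the user's culture calls for at the presentation layer. The function names and the patterns mentioned afterwards are illustrative, not a locale database.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Storage form: always ISO 8601 (YYYY-MM-DD), whatever the user's culture.
   Returns the number of characters written, or 0 if buf is too small. */
size_t date_to_storage(char *buf, size_t size, const struct tm *date)
{
    assert(buf != NULL && date != NULL);
    return strftime(buf, size, "%Y-%m-%d", date);
}

/* Presentation form: the pattern is picked per culture at the UI boundary
   (assumption: the caller supplies a numeric strftime pattern). */
size_t date_to_display(char *buf, size_t size, const struct tm *date,
                       const char *pattern)
{
    assert(buf != NULL && date != NULL && pattern != NULL);
    return strftime(buf, size, pattern, date);
}
```

For 25 March 2008 the storage string is "2008-03-25", while a pattern of "%d.%m.%Y" displays "25.03.2008" and "%m/%d/%Y" displays "03/25/2008". Only the display call ever changes between cultures.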

Has anyone ever heard a good reason for using the mm/dd/yyyy date format? Just curious.

Midnight Express was based on a true story, but we knew Turkish prisons were bad before the movie.

Haaa. Jeff, you really like to stir the pot and get your fingers burned! I do take issue with your characterization of Midnight Express as a credible source for your opinion, and many commenters here pointed out why (it’s basically wrong as a historical document), but I laugh about it because I sense that you are being facetious there. No one in their right mind understands anything about Turkey from that one movie. You are right about using Turkish as a test case for internationalization, and it’s good to see my country’s flag here.

The reason to use YYYYMMDD is not because it helps with a “standard numerical sort”, but because it puts the most significant portion first, just like in common number printing (there may be differences between using comma and period as a decimal separator, but the places extend to the left in increasing significance). No one prints the hundreds place to the right of the ones place, do they?
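Sorting is a side effect of that ordering rather than the reason for it, but it is easy to show. A small C sketch, assuming zero-padded fixed-width dates (the function name `compare_iso_dates` is made up): because the fields run from most to least significant, a plain `strcmp` agrees with chronological order.

```c
#include <assert.h>
#include <string.h>

/* Compares two dates stored as "YYYY-MM-DD" (or "YYYYMMDD").
   Returns <0, 0 or >0 like strcmp. This is only valid because every
   field is zero-padded and ordered from most to least significant. */
int compare_iso_dates(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    assert(strlen(a) == strlen(b));  /* same fixed-width layout expected */
    return strcmp(a, b);
}
```

The same trick fails for mm/dd/yyyy: as strings, "03/25/2009" compares lower than "11/02/2008" even though it is the later date.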

@PaulG, you are unbelievably ignorant.

http://en.wikipedia.org/wiki/Midnight_Express_(film)#Billy_Hayes_interviewed

Note that in non-Turkish locales you want to ignore the differences
between İ and I when changing case. These characters are normalised to
ASCII, on linux at least, by doing upper() then lower().
In the Turkish locale the characters are not merged, as expected.
I’ve used the following function on linux to transform text before comparison:

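#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

static bool ignorecase = true; /* assumed option flag; not defined in the original post */

void transform(wchar_t* wcs)
{
    if (ignorecase) {
        /* Note this handles Turkic case folding: upper-case then
           lower-case each character (standard C has no wcsupper/wcslower) */
        for (wchar_t* p = wcs; *p != L'\0'; ++p) {
            *p = (wchar_t) towlower(towupper((wint_t) *p));
        }
    }
    /* Other possible transformations one could do here are:
           StripDiacritics:  (accented A) - A
           ConvertEnclosed:  Ⓐ - A
           ConvertStylistic: Ａ - A
           TurkicFoldCase:   İ - i
       Note the above are transformations done in msort.

       Note python can normalise some things also:
       unicodedata.digit(u'\u2462')  # ③ - 3
    */
}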

Nothing wrong with Turkey…Turkish language is different than English, that’s all…
Question is; What’s wrong with you?

What do Turkish prisons or Turkey as a country have to do with this problem? Who is breaking your fine software: Turkish people, or the fact that your software is not localized that well?

“Has anyone ever heard a good reason for using the mm/dd/yyyy date format? Just curious.”

So that Americans can understand the date? :wink: It is just tradition at this point, I think.