Tag Archives: Number

Google Translating Numbers

I came across a post on reddit recently (in the wonderful subreddit “mildlyinteresting“, I’d recommend it if you like your internet entertainment mild) in which someone had put some numbers into the Google Translate engine and got some interesting results. It turns out that when you write a list of numbers, each with a full stop after it, and tell Google you’re translating from Spanish to English, some odd things happen.

1. translates as 1., but then 2. translates as Two. (the same happens for 3./Three.). Then, and here’s where things get really mildly interesting, 4. translates as April, and the next four numbers become months too.

Image

Go home Google Translate, you’re drunk etc. The interesting thing about this is the settings that are required to make this happen. There have to be dots after every number; without them Google is boring and translates the numbers to numbers. This tells us something about why Google is treating “5.” as May. Google has learnt (through user feedback and various algorithms) that when there’s a dot after a number, it’s likely to be a date. BUT JUST IN SPANISH. When you set the source language to anything other than Spanish, Google decides that 4. is 4. Why on earth should this be the case for just Spanish text? I do not know the answer, do you?

Other languages do produce interesting results. Translating from Armenian causes semicolons and parentheses to appear all over the place (link here). Then there are those that mix and match whether words or digits are used. When translated from Catalan, the first five numbers come out as “First. Two. Three. Four. 5.”. A translation from Romanian turns the first 9 numbers into months, apart from 3.

A Belarusian translation makes all the numbers ordinal, and rather cryptically translates 3. as “The Third”. (WHAT IS SO SPECIAL ABOUT 3?!) Is this a result of Kings and Queens in Belarusian history? A quick look at the Wikipedia article shows there are indeed some people who are “the third”, but there are equally some “the seconds” and “the fourths” out there. Mysterious.

Image

This translation quirk highlights the fact that a single digit can represent a whole range of things, depending on context. It is only because these numbers have been taken out of context that the translations seem odd to us. In today’s date, 20.8.13, it’s obvious that 8 stands for August, and that 20 should be read as twentieth (if you’re reading the date in the UK). When we see a digit, we see a whole range of different things. And this, dear readers, is one of the reasons that studying number entry is cool and interesting and a reason to be friends with me.
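To make the point about context concrete, here is a small Python sketch. The function name `parse_dotted_date` and its `day_first` flag are made up for this illustration; the idea is just that the same digits mean different things depending on which convention the reader (or the program) assumes.

```python
from datetime import date, datetime


def parse_dotted_date(text: str, day_first: bool = True) -> date:
    """Read a dotted date such as 20.8.13 under an assumed convention.

    day_first=True reads it UK-style (day.month.year);
    day_first=False reads it month-first (month.day.year).
    """
    fmt = "%d.%m.%y" if day_first else "%m.%d.%y"
    return datetime.strptime(text, fmt).date()


# The same characters, two different readings:
uk_reading = parse_dotted_date("4.8.13", day_first=True)     # 4 August 2013
us_reading = parse_dotted_date("4.8.13", day_first=False)    # 8 April 2013
```

With 20.8.13 the month-first reading fails with a `ValueError`, because there is no month 20, which is part of why that particular date feels unambiguous.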


Double blogging

To save me talking about my most recent published work for a second time, here’s a link to my Digit Distribution work on the CHI+MED project’s blog.


Punctuation and number entry

I was recently told a story by JW which made me reconsider my focus purely on digits and decimal points. I now consider the comma to be an interesting character in the world of number entry.

J explained how he was trying to set up a monthly payment online for a particular bill. He wanted $170.00 to be automatically taken out of his account and sent to the billing company. On the website, he got to the text box in the form that asked for the monthly amount he wished to pay, and he entered 170.00.

Only this isn’t what he typed. He typed this out on a laptop keyboard and so used the top row of numbers, and the full stop as a decimal point. When he typed the decimal point, however, his finger slipped and what he actually entered was 170,00.

A few weeks later, J got a call from his wife, frantically asking him why their bank balance was so incredibly low all of a sudden. After inspection, it turned out that for two months, $17,000 had been leaving their account each month. Oops.

When J submitted the form, he didn’t notice his mistake: there was a difference of two pixels between the decimal point and the comma character in the font on the web form. He saw there was something there between the first two zeros and understandably assumed it was the decimal point he’d gone to press.

The first error was typing a comma instead of a decimal point and not noticing it. But really, if this wasn’t picked up by the human, it so easily could or should have been picked up by the system. That comma is in a weird place: if it were representing a thousands separator, there should be an extra zero after it, or it should be between the seven and the zero. This was a malformed number entry, and the system did not pick up on it. Instead, it chose to ignore that comma and strip it out of the input, leaving only 17000. And resulting in an incredibly angry customer.
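As a rough sketch of what “picking up on this” could look like, here is some Python that rejects a misplaced comma instead of silently dropping it. I have no idea how the billing site actually works, so everything here (the name `parse_amount`, the two patterns) is my own assumption. It also assumes the convention in J’s story, where a comma separates thousands and a full stop marks the decimal point; in places where the comma is the decimal mark, 170,00 is a perfectly good number and you would need different rules.

```python
import re
from decimal import Decimal

# Digits with no separators at all, plus an optional decimal part.
_PLAIN = re.compile(r"\d+(\.\d+)?")
# One to three leading digits, then comma-separated groups of exactly
# three digits, plus an optional decimal part (e.g. 17,000 or 1,700.00).
_GROUPED = re.compile(r"\d{1,3}(,\d{3})+(\.\d+)?")


def parse_amount(text: str) -> Decimal:
    """Parse a payment amount, refusing input with a misplaced comma."""
    cleaned = text.strip()
    if _PLAIN.fullmatch(cleaned):
        return Decimal(cleaned)
    if _GROUPED.fullmatch(cleaned):
        # Commas are only removed once we know they sit in sensible places.
        return Decimal(cleaned.replace(",", ""))
    raise ValueError(f"malformed amount: {text!r}")
```

Under these rules 170.00 and 17,000 are both accepted, but 170,00 raises a `ValueError`, which gives the form a chance to ask J what he meant before any money moves.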

Conclusion: Think about the erroneous characters people enter, even in number entry. Don’t ignore them without thinking about why they might be there.
