
Google Translating Numbers

I came across a post on reddit recently (in the wonderful subreddit “mildlyinteresting”, which I’d recommend if you like your internet entertainment mild) in which someone had put some numbers into Google Translate and got some interesting results. It turns out that when you write a list of numbers, each followed by a full stop, and tell Google you’re translating from Spanish to English, some odd things happen.

1. translates as 1., but then 2. translates as Two. (the same happens for 3./Three.). Then, and here’s where things get really mildly interesting, 4. translates as April, and the next four numbers come out as months too.


Go home Google Translate, you’re drunk, etc. The interesting thing about this is the settings required to make it happen. There have to be dots after every number; without them Google is boring and translates the numbers to numbers. This tells us something about why Google is treating “5.” as May. Google has presumably learnt (through user feedback and various algorithms) that when there’s a dot after a number, it’s likely to be part of a date. BUT JUST IN SPANISH. When you set the source language to something other than Spanish, Google decides that 4. is 4. Why on earth should this be the case for just Spanish text? I do not know the answer, do you?

Other languages do produce interesting results. Translating from Armenian causes semicolons and parentheses to appear all over the place (link here). Then there are those that mix and match words and digits. Translated from Catalan, the first five numbers come out as “First. Two. Three. Four. 5.”. Translating from Romanian turns the first 9 numbers into months, apart from 3.

A Belarusian translation makes all the numbers ordinal, and rather cryptically translates 3. as “The Third”. (WHAT IS SO SPECIAL ABOUT 3?!) Is this a result of Kings and Queens in Belarusian history? A quick look at the Wikipedia article shows there are indeed some people who are “the third”, but there are equally some “the seconds” and “the fourths” out there. Mysterious.


This translation quirk highlights the fact that a single digit can represent a whole range of things, depending on context. It is only because these numbers have been taken out of context that the results seem odd to us. In today’s date, 20.8.13, it’s obvious that 8 stands for August and that 20 should be read as twentieth (if you’re reading the date in the UK). When we see a digit, we see a whole range of different things. And this, dear readers, is one of the reasons that studying number entry is cool and interesting and a reason to be friends with me.
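For the programmers among you, here is a small Python sketch of the same point: identical digits mean different things depending on which convention the reader brings. The helper name `read_date` and its `day_first` flag are made up for this illustration, and this is certainly not how Google Translate works internally; it only shows how a day-first (UK) reader and a month-first reader disagree about the same string.

```python
from datetime import datetime


def read_date(text, day_first=True):
    """Parse a dotted date such as '20.8.13'.

    day_first=True is the UK reading (day.month.year);
    day_first=False reads the same digits as month.day.year.
    """
    fmt = "%d.%m.%y" if day_first else "%m.%d.%y"
    return datetime.strptime(text, fmt).date()


# The date from the post: only the day-first reading makes sense.
uk = read_date("20.8.13")
print(uk.day, uk.month, uk.year)

# Read month-first, the same string is rejected: there is no month 20.
try:
    read_date("20.8.13", day_first=False)
except ValueError:
    print("20.8.13 is not a valid month-first date")

# A genuinely ambiguous one: is the 4 a day, or is it April?
print(read_date("4.5.13").month)                   # 4 is the day, 5 is May
print(read_date("4.5.13", day_first=False).month)  # 4 is April
```

The last two lines are the interesting ones: nothing in “4.5.13” itself says whether the 4 is a day or a month, so the reader (or the translation engine) has to guess from context.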
