{
  "path": "/3m2gallm52c23",
  "site": "at://did:plc:tc67mguo6vvlgqpw54mez6jv/site.standard.publication/3m2fzquwkhs2a",
  "$type": "site.standard.document",
  "title": "Translategate: The ghost in the Machine",
  "content": {
    "$type": "pub.leaflet.content",
    "pages": [
      {
        "$type": "pub.leaflet.pages.linearDocument",
        "blocks": [
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "Around early 2023, rumours started flying in the comments of different Instagram posts. There seems to be something not quite right with the translate feature. Commenting \"Oouga Bouga\" triggers the translate button to appear and seems to translate nonsensical variations. Some are racist translations, some are sexual innuendos, some seem to be ramblings of sentient beings."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "This is a screen recorded from that time - "
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "src": "https://www.instagram.com/reel/CpALN8FLC9B/",
              "$type": "pub.leaflet.blocks.website",
              "title": "Your mom on Instagram",
              "description": "6 likes, 24 comments - makeyourmomgreatagain87 on February 23, 2023",
              "previewImage": {
                "$type": "blob",
                "ref": {
                  "$link": "bafkreifgjpuxqbpoqmypsgvivmgysn4ibqdaodhrrj352ntl2rykfgjpfq"
                },
                "mimeType": "image/png",
                "size": 5612
              }
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "None of the user comments show the behaviour anymore. But everyone expected this to be quickly patched because this had the potential to be a PR disaster for Instagram. However, it's December of 2023 and still didn't seem to be patched. "
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "src": "https://www.instagram.com/reel/C07h1PquFIO/",
              "$type": "pub.leaflet.blocks.website",
              "title": "supercarscreams (@supercarscreams) • Instagram reel",
              "description": "53K likes, 39K comments - supercarscreams on December 16, 2023: \"insta found out & patched. can yall help me reach 150k before new years? 🙏😻 - 🏎️ @wsdtony  📸 @v10driver  - #funny #drift #drifter #cargram #carinstagram #cars #car #audi #audis5 #gt500 #hellcat #gt350 #mopar #v8 #v12 #v10 #carswithoutlimits #carfails #carfail #jdm #gtr #bmw #bmee36 #bmwm #bmwm3 #porsche #ferrari #koenigsegg #cayenneturbogt #speed\".",
              "previewImage": {
                "$type": "blob",
                "ref": {
                  "$link": "bafkreidfsx7maj7vk7qmhcj6eubggdwsppqgtduvvzawyglcv2tx3snyui"
                },
                "mimeType": "image/png",
                "size": 5930
              }
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "The comments of this post have some examples that still work in different ways. Coming to March 2024, Matt Rose posts a compilation of all the different variations that were discovered by users."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [
                {
                  "index": {
                    "byteEnd": 31,
                    "byteStart": 5
                  },
                  "features": [
                    {
                      "uri": "https://www.reddit.com/r/OutOfTheLoop/comments/1b8fcck/comment/ktrxxk8/",
                      "$type": "pub.leaflet.richtext.facet#link"
                    }
                  ]
                }
              ],
              "plaintext": "Also a comment by a reddit-user points out the connection to google translate, where similar behaviour has been observed. You can see the behaviour still exists."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "src": "https://translate.google.com/?hl=en&sl=so&tl=en&text=ooga%20booga&op=translate",
              "$type": "pub.leaflet.blocks.website",
              "title": "Google Translate",
              "description": "Google's service, offered free of charge, instantly translates words, phrases, and web pages between English and over 100 other languages.",
              "previewImage": {
                "$type": "blob",
                "ref": {
                  "$link": "bafkreie45xqn5xbsqtcb2bjuoip6wvnndtm5i6vabvnph4msn3mg5w2way"
                },
                "mimeType": "image/png",
                "size": 5310
              }
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "At this point, when searching for other bugs between Somali and English, I came across this post on a Google support forum"
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "src": "https://web.archive.org/web/20250315051518/https://support.google.com/translate/thread/3687976/somali-to-english-bug?hl=en",
              "$type": "pub.leaflet.blocks.website",
              "title": "somali to english bug .  - Google Translate Community",
              "description": "",
              "previewImage": {
                "$type": "blob",
                "ref": {
                  "$link": "bafkreihq2grr4lr7ir4ouln6evpa4ezmcsyhpqlnapeidvvjlpzpewxwau"
                },
                "mimeType": "image/png",
                "size": 20981
              }
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "This issue now has been traced atleast back to 2019 mainly on the same Somali to English translation pipeline."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [
                {
                  "index": {
                    "byteEnd": 28,
                    "byteStart": 13
                  },
                  "features": [
                    {
                      "uri": "https://www.reddit.com/r/TranslateGate/top/?t=all",
                      "$type": "pub.leaflet.richtext.facet#link"
                    }
                  ]
                }
              ],
              "plaintext": "Then I found r/translategate which has examples from August of 2018. If you go through some of the top posts, you will see that the examples exist in multiple languages. These are languages that might not have been extensively trained on. However, some of them are still surprising - Here is a 2025 screenshot of a Bulgarian to English translation that still works."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.image",
              "image": {
                "$type": "blob",
                "ref": {
                  "$link": "bafkreicyoi6wyemzlbal4eg2br47zfwxiaotiszvef3h7b6ccbu2zycc6i"
                },
                "mimeType": "image/png",
                "size": 39634
              },
              "aspectRatio": {
                "width": 1403,
                "height": 502
              }
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "I was quite puzzled at this point at how this was happening post the 2023 cambrian explosion of LLMs. That is when I came across this post by a reddit user"
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "src": "https://imgur.com/gallery/reporting-petscop-bugs-google-translate-IIAoo",
              "$type": "pub.leaflet.blocks.website",
              "title": "Reporting Petscop bugs in Google Translate",
              "description": "Discover the magic of the internet at Imgur, a community powered entertainment destination. Lift your spirits with funny jokes, trending memes, entertaining gifs, inspiring stories, viral videos, and so much more from users like Nathan1123.",
              "previewImage": {
                "$type": "blob",
                "ref": {
                  "$link": "bafkreigi4paqqf5m6stxkhdd6mztalztkw64ewmdclkbl3rsox2ssuryku"
                },
                "mimeType": "image/png",
                "size": 7758
              }
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "This user's theory was that user contributions had poisoned the well. However, if you try to trace it from 2018 to 2025, this would mean that the poisoning was quite effective. So I started to look at other sources to see how Google had built the translation pipelines."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [
                {
                  "index": {
                    "byteEnd": 39,
                    "byteStart": 30
                  },
                  "features": [
                    {
                      "uri": "https://en.wikipedia.org/wiki/AI_winter",
                      "$type": "pub.leaflet.richtext.facet#link"
                    }
                  ]
                }
              ],
              "plaintext": "Around 2012 fresh out of the \"AI winter\" that started in the 1987, neural nets were starting to seem like they could be better approaches to Machine Learning. This was in conjunction with an explosion of GPU-based parallel processing capabilities that enabled neural networks to be more powerful. AlexNet had won a hard fought 10% lead on the runner up in the ImageNet challenge. This was followed by breakthroughs with DeepMind, specifically in AlphaZero and AlphaGo which solved for Chess and Go gameplays."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "In 2014, 3 Google engineers Ilya Sutskever, Oriol Vinyals, Quoc V. Le wrote a paper that demonstrated that sequences of words could be semantically linked by reversing them."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.image",
              "image": {
                "$type": "blob",
                "ref": {
                  "$link": "bafkreifly5oqq24pdklf6gff5ri4a6fexkqj2tk2blma4jaqeeilylurtm"
                },
                "mimeType": "image/png",
                "size": 92441
              },
              "aspectRatio": {
                "width": 1187,
                "height": 598
              }
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "This was a far leaner method of semantic linkage than what was then the bleeding edge of language translation - Statistical Machine Translation. This method broke down sentences into words or phrases and statistically chose translations for those words from a large vocabulary dictionary"
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "What they proposed as an alternative was a \"sequence-to-sequence\" approach instead of a word-to-word approach. To do this they created a new type of neural net that they called the Long Short-Term Memory (LSTM). To train this model, they used 12 million English-French sentence pairs from the WMT'14 dataset consisting of 348M French words and 304M English words. The English sentences were reversed during the training processes. This caused the model to retain a lot more of the context between the words in the sequences. This context however, was not very effective for very long sequences and was prone to hallucinations."
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": "Google, in the years that followed started to train a corpus of"
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.unorderedList",
              "children": [
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "$type": "pub.leaflet.blocks.text",
                    "facets": [
                      {
                        "index": {
                          "byteEnd": 123,
                          "byteStart": 33
                        },
                        "features": [
                          {
                            "uri": "https://research.google/blog/a-neural-network-for-machine-translation-at-production-scale/",
                            "$type": "pub.leaflet.richtext.facet#link"
                          }
                        ]
                      }
                    ],
                    "plaintext": "NMT is born and christened GNMT (https://research.google/blog/a-neural-network-for-machine-translation-at-production-scale/)"
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "$type": "pub.leaflet.blocks.text",
                    "facets": [],
                    "plaintext": "Uses seq2seq LSTMs"
                  },
                  "children": []
                }
              ]
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.text",
              "facets": [],
              "plaintext": ""
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.unorderedList",
              "children": [
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "https://www.reddit.com/r/Instagram/comments/119eyw9/amusing_instagram_translation/",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "Amusing Instagram translation",
                    "description": "96 votes, 60 comments. As of December 2023 this has been patched Possibly old news but I figured this out yesterday. If you comment \"Ooga booga\" and…",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreiab37aq57h75pueqiw4hhogrrk7ibthnszkekavmbsp77kfllf2eq"
                      },
                      "mimeType": "image/png",
                      "size": 8100
                    }
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "https://www.reddit.com/r/OutOfTheLoop/comments/1b8fcck/what_is_going_on_with_the_crazy_translations_of/",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "What is going on with the crazy translations of \"hooga booga\" or similar comments on Instagram?",
                    "description": "https://www.instagram.com/reel/C4JflrYPNNZ/?igsh=MXJ4OTh4Mzg1eDFtYw== The video actually explains what it is, so I'm mainly asking is it an actual…",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreiab37aq57h75pueqiw4hhogrrk7ibthnszkekavmbsp77kfllf2eq"
                      },
                      "mimeType": "image/png",
                      "size": 8100
                    }
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "https://www.reddit.com/r/eastereggs/comments/9ocmko/google_translator_creepy_easteregg/",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "Google translator creepy easteregg",
                    "description": "I was watching a video of google easter eggs and it said that if you write on google translator from Somali to English \"ar ey ou ca pa bl eo fk il li…",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreiab37aq57h75pueqiw4hhogrrk7ibthnszkekavmbsp77kfllf2eq"
                      },
                      "mimeType": "image/png",
                      "size": 8100
                    }
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "https://www.teachyoubackwards.com/qualitative-analysis/#ooga-booga",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "Qualitative Analysis of Google Translate across 108 Languages - Teach You Backwards",
                    "description": "Several myths lead people to believe Google has achieved the dream of universal translation. We examine those myths, and what Google Translate really does.",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreib2m64vppsmh3zvsewgk3u5riztjws4ubkk3ydzzdvjplmck5gb5a"
                      },
                      "mimeType": "image/png",
                      "size": 47511
                    }
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "http://kamu.si/tyb-musa",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "Introduction: Into the Black Box of Google Translate - Teach You Backwards",
                    "description": "Does Google Translate translate? You might put your faith in it. You might laugh at it. I studied it, for all its languages. This is the first comprehensive examination ever conducted of the world’s most-used translation tool.",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreidzb3h3hz6njd4kriqt53xccqhnbgmltbweobyfear7m2qnhd74ay"
                      },
                      "mimeType": "image/png",
                      "size": 847
                    }
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "https://web.archive.org/web/20200610215424/https://people.eecs.berkeley.edu/~clarafy/neurips_irasl_2018.pdf",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "",
                    "description": "",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreih5d4rndyxc3waszwuxfjw3zjipyyknb2ydpeuodx6gofvb2kaol4"
                      },
                      "mimeType": "image/png",
                      "size": 4147
                    }
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "$type": "pub.leaflet.blocks.text",
                    "facets": [],
                    "plaintext": "Hallucinations in Neural Machine Translation"
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "https://arxiv.org/pdf/1609.08144",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "",
                    "description": "",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreif4uwcwufva2dipn22rl7ztlugjrgu4c6yzlkufkgnazsybr57s2y"
                      },
                      "mimeType": "image/png",
                      "size": 1303
                    }
                  },
                  "children": [
                    {
                      "$type": "pub.leaflet.blocks.unorderedList#listItem",
                      "content": {
                        "$type": "pub.leaflet.blocks.text",
                        "facets": [],
                        "plaintext": "Oct 2016"
                      },
                      "children": []
                    },
                    {
                      "$type": "pub.leaflet.blocks.unorderedList#listItem",
                      "content": {
                        "$type": "pub.leaflet.blocks.text",
                        "facets": [],
                        "plaintext": "Three inherent weaknesses of Neural Machine Translation are responsible for this gap: its slower training and inference speed, ineffectiveness in dealing with rare words,and sometimes failure to translate all words in the source sentence"
                      },
                      "children": []
                    }
                  ]
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "https://proceedings.mlr.press/v162/bansal22b/bansal22b.pdf",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "",
                    "description": "",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreifh4a5esk22bmpbgpwwlnqmg3bhua3ocgh5ktdqdlvtrfo4kqvfku"
                      },
                      "mimeType": "image/png",
                      "size": 31496
                    }
                  },
                  "children": [
                    {
                      "$type": "pub.leaflet.blocks.unorderedList#listItem",
                      "content": {
                        "$type": "pub.leaflet.blocks.text",
                        "facets": [],
                        "plaintext": "Back translation for NMT is particularly sensitive to data quality"
                      },
                      "children": []
                    }
                  ]
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "src": "https://aclanthology.org/2024.naacl-long.254.pdf",
                    "$type": "pub.leaflet.blocks.website",
                    "title": "",
                    "description": "",
                    "previewImage": {
                      "$type": "blob",
                      "ref": {
                        "$link": "bafkreica3sopygo2rajro6cheipcqz6evkams3zh7fywmldlhph5ae3pka"
                      },
                      "mimeType": "image/png",
                      "size": 33049
                    }
                  },
                  "children": []
                },
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "$type": "pub.leaflet.blocks.text",
                    "facets": [],
                    "plaintext": "Backdoor Attacks on Multilingual Machine Translation"
                  },
                  "children": []
                }
              ]
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "src": "https://cuny.manifoldapp.org/read/ethnographies-of-datasets-teaching-critical-data-analysis-through-r-notebooks-8395d7ae-bbb0-4547-a738-89b4be8b9c12/section/f5d6c3ca-cc2e-4e2a-9049-9e8dfd5f2603",
              "$type": "pub.leaflet.blocks.website",
              "title": "Ethnographies of Datasets: Teaching Critical Data Analysis through R Notebooks | Ethnographies of Datasets: Teaching Critical Data Analysis through R Notebooks | Manifold @CUNY",
              "description": "by Lindsay Poirier",
              "previewImage": {
                "$type": "blob",
                "ref": {
                  "$link": "bafkreics3t2egbee7hcpphjlahe2kg7se2757jbwuwrd6jgm4mz2tw42vm"
                },
                "mimeType": "image/png",
                "size": 35876
              }
            }
          },
          {
            "$type": "pub.leaflet.pages.linearDocument#block",
            "block": {
              "$type": "pub.leaflet.blocks.unorderedList",
              "children": [
                {
                  "$type": "pub.leaflet.blocks.unorderedList#listItem",
                  "content": {
                    "$type": "pub.leaflet.blocks.text",
                    "facets": [],
                    "plaintext": "Critical Dataset ethnography"
                  },
                  "children": []
                }
              ]
            }
          }
        ]
      }
    ]
  },
  "bskyPostRef": {
    "cid": "bafyreic5i6vumccqikjoe5myju4i2hdnqxk5whn4fvohfg6o4cxufndhwi",
    "uri": "at://did:plc:tc67mguo6vvlgqpw54mez6jv/app.bsky.feed.post/3m2galsagyk23",
    "commit": {
      "cid": "bafyreic2ok52wf5sgr2d7nylso4unmzakk72ydfmgnjvifnfm6mznohfcm",
      "rev": "3m2galsdpz62v"
    },
    "validationStatus": "valid"
  },
  "description": "There is a ghost in Google Translate and it survived the AI era",
  "publishedAt": "2025-10-05T04:19:12.597Z"
}