Since 2014 · Built in Timor-Leste

The original English ↔ Tetun translator.

Tetun Translator is the oldest online translator for Tetun, the national language of Timor-Leste. Started in 2014 as a personal learning tool, it became one of the most-used websites in the country and has shaped the data and tooling the Tetun web runs on.

What is Tetun?

Tetun (also spelled Tetum) is the national language of Timor-Leste, spoken by roughly a million people. Tetun Dili — also called Tetun Prasa — is the urban variety used in government, courts, media, and education. Tetun is one of two official languages of Timor-Leste, alongside Portuguese. It is an Austronesian language that borrows heavily from Portuguese and Malay.

Tetun Terik is the older rural variety spoken outside Dili. This translator targets Tetun Dili, the standard used in official and public contexts.

The story

In 2014, Tony Franklin was living in Timor-Leste and trying to learn Tetun. Frustrated by the absence of any decent online translator for the language, he built one — first as a simple test case, then as a website at translate.tetumdili.com.

Within two years it had spread through the Timorese population and was averaging around 700 daily users. By 2017 it had reached roughly 57,000 users in total and ranked as the 22nd most-visited website in Timor-Leste.

An iOS version, Tetun Translator Pro, launched on 16 March 2019, and a major update in 2023 added photo-to-text and AI-enriched glossary lookups.

In 2026, the translator was rebuilt from the ground up with modern AI: Claude drafts each sentence, then the output is grounded in a 22,000-entry glossary and a 46,000-sentence parallel corpus, and linted against 47 orthography and grammar rules before it is shown to the user. That is the version you are using now.

How it works

Most machine translation engines treat Tetun as a low-resource language and hallucinate freely. This translator does not. Every translation goes through four stages:

  1. Draft. Claude translates the sentence using its knowledge of Tetun grammar and the surrounding context.
  2. Ground in the glossary. Every significant English term is looked up in a 22,000-entry glossary built from the Dili Institute of Technology (DIT) dictionary, DIT Justice Sector, DIT Health & Medical, DIT Tourism, DIT Wordfinder, and the INL Matadalan Ortográfiku ba Tetun-Prasa (2003).
  3. Ground in the corpus. The sentence is compared against 46,000 parallel English–Tetun sentences drawn from government publications, news, and curated DIT material. Real human phrasings take priority over generated ones.
  4. Lint. The output is checked against 47 rules covering spelling, Portuguese loan-word handling, orthography (DIT vs INL), aspect markers, negation, and common calques. Errors are flagged and the sentence is retried.
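
The four stages above can be sketched as a small retry loop. This is only an illustration under stated assumptions, not the translator's actual code: the names translate, Translation, Drafter and Rule are hypothetical, the draft step is a plain callable standing in for the model call, the glossary and corpus are simple dictionaries, and "grounding" is reduced to exact lookups.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical types for illustration only.
Drafter = Callable[[str, list[str]], str]   # (sentence, feedback) -> draft text
Rule = Callable[[str], Optional[str]]       # output -> error message, or None


@dataclass
class Translation:
    text: str
    attempts: int
    issues: list[str] = field(default_factory=list)


def translate(sentence: str, draft: Drafter, glossary: dict[str, str],
              corpus: dict[str, str], rules: list[Rule],
              max_attempts: int = 3) -> Translation:
    feedback: list[str] = []
    text = ""
    for attempt in range(1, max_attempts + 1):
        # 1. Draft (stands in for the model call).
        text = draft(sentence, feedback)

        # 2. Ground in the glossary: every known English term should show up
        #    in the output as its glossary form.
        words = re.findall(r"[a-z']+", sentence.lower())
        issues = [f"use '{tet}' for '{en}'"
                  for en, tet in glossary.items()
                  if en in words and tet not in text.lower()]

        # 3. Ground in the corpus: a human translation of the same sentence
        #    takes priority over the generated one.
        human = corpus.get(sentence.strip().lower())
        if human is not None:
            text, issues = human, []

        # 4. Lint: collect every rule violation.
        issues += [msg for rule in rules if (msg := rule(text)) is not None]

        if not issues:
            return Translation(text, attempt)
        feedback = issues  # flagged errors steer the retry
    return Translation(text, max_attempts, feedback)


# Toy usage: the stand-in drafter only fixes its spelling once it gets feedback.
def fake_draft(sentence: str, feedback: list[str]) -> str:
    return "kompanhia" if feedback else "kompania"


result = translate("company", fake_draft, {"company": "kompanhia"}, {}, [])
print(result.text, result.attempts)  # kompanhia 2
```

Feeding the flagged issues back into the next draft is one plausible reading of "the sentence is retried"; the real system may retry differently.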

The linguistic rules are drawn from the Peace Corps Tetun Language Course (3rd ed., Catharina Williams-van Klinken, 2015), the DIT-TLPDP Tetun for the Justice Sector textbook (2015), and the INL orthography standard (Decree 1/2004).

DIT or INL?

Tetun has two standardised orthographies and this translator speaks both.

  • DIT (default) — Dili Institute of Technology. Academic standard. Uses Portuguese-style nh and lh digraphs, drops most Portuguese accents. Examples: kompanhia, konhesimentu, milhaun.
  • INL — Instituto Nacional de Linguística. Official government standard under Decree 1/2004. Uses ñ and ll, keeps Portuguese accents. Examples: kompañia, koñesimentu, millaun, fó, ne'ebé.
  • Auto — outputs DIT by default and switches to INL automatically for government-register text.
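
As a rough illustration of the digraph difference and the Auto setting, here is a minimal sketch. The function names and the "government" register label are assumptions made for this example. It handles only lowercase nh and lh, and it does not restore accents, which cannot be derived by rule and would need a word list.

```python
def dit_to_inl(word: str) -> str:
    """Rewrite DIT digraphs as their INL spellings (nh -> ñ, lh -> ll).

    Simplification: lowercase only, and accents are left untouched.
    """
    return word.replace("nh", "ñ").replace("lh", "ll")


def resolve_orthography(setting: str, register: str = "general") -> str:
    """Map the Auto / DIT / INL setting to a concrete orthography."""
    setting = setting.lower()
    if setting == "auto":
        # Auto: DIT by default, INL for government-register text.
        return "inl" if register == "government" else "dit"
    if setting in ("dit", "inl"):
        return setting
    raise ValueError(f"unknown orthography setting: {setting!r}")


def render(word: str, setting: str = "auto", register: str = "general") -> str:
    """Render a DIT-spelled word in the orthography the setting selects."""
    if resolve_orthography(setting, register) == "inl":
        return dit_to_inl(word)
    return word


for w in ("kompanhia", "konhesimentu", "milhaun"):
    print(w, "->", dit_to_inl(w))
# kompanhia -> kompañia
# konhesimentu -> koñesimentu
# milhaun -> millaun
```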

Frequently asked questions

What is Tetun?

Tetun (also spelled Tetum) is the national language of Timor-Leste, spoken by roughly a million people. Tetun Dili — also known as Tetun Prasa — is the urban variety used in government, media, and education, and is one of two official languages alongside Portuguese.

When was Tetun Translator first built?

Tetun Translator was started by Tony Franklin in 2014 in Timor-Leste as a personal language-learning tool, and launched publicly in 2015 at translate.tetumdili.com. By 2017 it had reached more than 57,000 users globally, with around 700 daily users, and was ranked among the most-visited websites in Timor-Leste. The iOS app followed on 16 March 2019.

What is the difference between DIT and INL orthography?

DIT (Dili Institute of Technology) is an academic standard that uses Portuguese-style digraphs like nh and lh and drops most Portuguese accents. INL (Instituto Nacional de Linguística) is the official government standard under Decree 1/2004, uses ñ and ll as single letters, and keeps Portuguese accents like fó and ne'ebé. Tetun Translator supports both — set Ortho to Auto, DIT, or INL to control the output.

How is this different from other Tetun translators?

Most machine translation engines treat Tetun as a low-resource language and produce unreliable output. Tetun Translator drafts with Claude, then grounds every sentence against a 22,000-entry glossary and a 46,000-sentence parallel corpus drawn from DIT publications, the INL Matadalan Ortográfiku (2003), and the Peace Corps Tetun Language Course. It then lints the output against 47 orthography and grammar rules. The result is reliable text, not guesses.

Is this translator free?

Yes. Tetun Translator is free to use at translate.tetumdili.com. There are no ads, no login, and no paywall. Upload PDF or DOCX documents and translate up to 20,000 characters per request.

Can I use it for legal, medical, or government text?

The translator is grounded in DIT Justice Sector and Health & Medical glossaries and supports a Government domain mode that biases the output toward INL orthography. However, for legal filings, medical records, or government decrees, a qualified human translator should always review the output before it is signed, filed, or published.

Which Tetun variant does it translate into?

Tetun Dili (Tetun Prasa) — the urban variety used in Dili, government, courts, hospitals, and media. Tetun Terik, the older rural variant, is not currently a translation target.

Sources & credit

The linguistic work behind this translator is not ours. It belongs to the people who built modern Tetun lexicography and language pedagogy:

  • Dr Catharina Williams-van Klinken and the team at Dili Institute of Technology (DIT).
  • Instituto Nacional de Linguística (INL), Universidade Nasionál Timor Lorosa'e.
  • The United States Peace Corps — Tetun Language Course, 3rd edition (2015).
  • Cliff Morris, A Traveller's Dictionary in Tetun–English (foundational).

This translator ingests their published data and pedagogical standards. It does not replace consulting the original works directly; it makes them reachable from a search box.

Built in Timor-Leste · by Tony Franklin