Built-in Localization support and message formatting

I think localization is important. I’m from Germany and there are many servers with German chat, and plugins should be localized as well.

It would be nice to have a simple system for translation and maybe even a unified message (color) formatting system, making translation of big text-heavy plugins simple.

I thought about it when making Bukkit plugins, and I think it should work like gettext. The good thing about it is that the code is still readable, without placeholders for messages like “PLAYER_KILL_MESSAGE”.

//untranslated simple plugin
onlinePlayer.sendMessage(String.format("%1$s was killed by %2$s", victimPlayer.getName(), killPlayer.getName()));

//another example: plural handling
int kills = 3;

if(kills == 1) 
    onlinePlayer.sendMessage("You killed one player.");
else
    onlinePlayer.sendMessage(String.format("You killed %1$d players!", kills));

//translated version
I18n i18n = I18nFactory.getI18n(getClass());

onlinePlayer.sendMessage(i18n.tr("{0} was killed by {1}", victimPlayer.getName(), killPlayer.getName()));

//plural handling
int kills = 3;
// the third argument selects the plural form, so pass the actual count
System.out.println(i18n.trn("You killed one player.", "You killed {0} players!", kills, kills));

Translation Maps:
And now you can create a translation that maps all the strings to other strings:

German Translation file example:

{0} was killed by {1}   ---> {1} hat {0} getötet
You killed one player.  ---> Du hast einen Spieler getötet.
You killed {0} players! ---> Du hast {0} Spieler getötet!

When no translation for a string is found in a language map, the hardcoded string is used and a warning is printed to the console (“Missing message in translation DE…”).
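
A minimal sketch of that lookup, assuming a plain in-memory map per language. `SimpleI18n` and its methods are hypothetical names for illustration, not an existing API, and the plural choice here is the simple "n == 1" rule, whereas real gettext supports per-language plural forms:

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Bukkit, Sponge or gettext.
final class SimpleI18n {
    private final String language;
    private final Map<String, String> translations = new HashMap<>();

    SimpleI18n(String language) {
        this.language = language;
    }

    // Register a translation keyed by the hardcoded source string.
    void add(String source, String translated) {
        translations.put(source, translated);
    }

    // Translate the source string; fall back to it (with a warning) if no entry exists.
    String tr(String source, Object... args) {
        String pattern = translations.get(source);
        if (pattern == null) {
            System.err.println("Missing message in translation " + language + ": " + source);
            pattern = source;
        }
        return MessageFormat.format(pattern, args);
    }

    // Simplified plural handling: singular for n == 1, plural otherwise.
    String trn(String singular, String plural, long n, Object... args) {
        return tr(n == 1 ? singular : plural, args);
    }
}
```

With a German map loaded, `tr("{0} was killed by {1}", "Alice", "Bob")` would return the German sentence, while an unknown string comes back unchanged apart from the filled-in arguments.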

Multi-Language servers:
The Minecraft client sends its configured language to the server. That means there could be an option to display messages in that language, or permissions could be used for that…
Or the whole server uses one config-defined language.

Alternative language config (fallback):
When a plugin is not available in the selected language, have a fallback map:

EN_us       <---> EN
DE_de       <---> DE
DE           ---> EN
PIRATE_SPEAK ---> EN
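
One way such a fallback map could be resolved is a breadth-first search over the fallback edges, returning the first language the plugin actually ships. This is only a sketch; `LocaleFallback`, `addAlias` and `resolve` are made-up names, and the two-way arrows above are modelled as a pair of one-way edges:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical fallback resolver for illustration only.
final class LocaleFallback {
    private final Map<String, List<String>> fallbacks = new HashMap<>();

    // One-way edge: "from" may fall back to "to".
    void addFallback(String from, String to) {
        fallbacks.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Two-way edge, for pairs like DE_de <---> DE.
    void addAlias(String a, String b) {
        addFallback(a, b);
        addFallback(b, a);
    }

    // Breadth-first search from the requested language; the "seen" set guards against cycles.
    Optional<String> resolve(String requested, Set<String> available) {
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        queue.add(requested);
        seen.add(requested);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (available.contains(current)) {
                return Optional.of(current);
            }
            for (String next : fallbacks.getOrDefault(current, Collections.emptyList())) {
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return Optional.empty();
    }
}
```

A plugin that only ships "EN" would then serve a DE_de client by walking DE_de, DE, EN.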

Translation registration
To provide a translation, just use something like:

//set hardcoded language
plugin.setDefaultLocale(Locale.EN);

//add a translation
plugin.addTranslation(Locale.DE, translationMapFromFile);

Translators can make translation pack plugins that use the method above on other plugins:

//WorldEdit German Language Pack Plugin
Plugin worldEditPlugin = server.getPlugin("WorldEdit");
if(worldEditPlugin != null) worldEditPlugin.addTranslation(Locale.DE, translationMapFromFile);

Formatting and line breaks

The translation API could be extended to unify the color formatting of messages:

Maybe using an XML/HTML-like syntax for message translations:

<success>Game started by {0}. Use <cmd>/guess <letter></cmd></success>
<br/>
<info> 
  {1} <hearts>{2}</hearts>
  <br/>
  <cmd>/hangman hide</cmd> to disable this...
</info>

With a tag to color/format map:

success: {
  color: light-green
}
info: {
  color: yellow
}
hearts: {
  bold: true,
  color: dark-red
}
cmd: {
  color: gold
}

Admins can edit these files to modify the message formatting of a plugin…
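
To make the idea concrete, here is a rough sketch of how such tags could be rendered. It assumes each tag is mapped to legacy "§" formatting codes (e.g. gold = §6), which is my assumption and not part of the proposal; `TagFormatter` is a made-up class. Unknown tags such as `<letter>` are left in the text as-is:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical renderer for illustration only.
final class TagFormatter {
    private static final Pattern TAG = Pattern.compile("<(/?)(\\w+)(/?)>");
    private final Map<String, String> styles = new HashMap<>();

    // Map a tag name to the formatting codes it should emit (assumed legacy codes).
    void style(String tag, String codes) {
        styles.put(tag, codes);
    }

    String render(String markup) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>(); // styles of currently open tags, innermost first
        Matcher m = TAG.matcher(markup);
        int last = 0;
        while (m.find()) {
            String name = m.group(2);
            boolean isBreak = name.equals("br");
            if (!isBreak && !styles.containsKey(name)) {
                continue; // unknown tag: keep it as literal text
            }
            out.append(markup, last, m.start());
            last = m.end();
            if (isBreak) {
                out.append('\n');
            } else if (!m.group(1).isEmpty()) {
                // Closing tag: reset, then re-apply the styles that are still open.
                if (!open.isEmpty()) {
                    open.pop();
                }
                out.append("\u00A7r");
                for (Iterator<String> it = open.descendingIterator(); it.hasNext();) {
                    out.append(it.next());
                }
            } else {
                open.push(styles.get(name));
                out.append(styles.get(name));
            }
        }
        out.append(markup, last, markup.length());
        return out.toString();
    }
}
```

The tag-to-code table is the part admins would edit; the translated strings themselves never contain raw color codes.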


It would be nice to have something like this available. I’d never really bothered with localisation on Bukkit, but if it’s easy and unified, I would definitely utilise it.

The problem with this is that you can’t easily change the original translation. If you for example want to change:

{0} was killed by {1}

to

{0} has been killed by {1}

There is no easy way to change all locales, and external language packs don’t work with this anymore. That’s why it’s usually an identifier instead of the English (original) string.


It’s a good idea. Maybe there could be translations into French, Russian or other languages.

@boformer

It may be better to have a localization object which provides a way to register a new message with the translations for it (those could be loaded from config or just be hardcoded) which are stored per-plugin, e.g.

// params: plugin_instance, default_language_code, automatically_detect_language

// Create new localization with English as default language
Localization lang = new Localization(this, "en_US", true);

Then in the initialization method:

// params: language_code_of_message, message_id (?), message_to_print_out

// Add the English message (will be default as it has been set above)
lang.addMessage("en_US", "welcome message", "Hello! Welcome to this server.");
// To add the German translation for the message with id "welcome message"
lang.addMessage("de_DE", "welcome message", "Hallo! Willkommen auf dem Server.");
// To save the messages (to prevent changing it later)
lang.save();

And to print it:

// params: message_string | message_id

// normally you'd do this:
player.sendMessage("Hello! Welcome to this server.");

// but with a localization:
player.sendMessageLocalized("welcome message");
// which would send the en_US message by default, but if the user's client is set to de_DE, the message in that language would be sent

I think this would be a solution. What do you think?
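
For illustration, the storage behind such an object could be as small as a nested map. This is a sketch under my own assumptions: the class is named `LocalizationSketch` to make clear it is not a real API, and the lookup order (client language, then default language, then the raw id) is just one reasonable choice:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-plugin message registry for illustration only.
final class LocalizationSketch {
    private final String defaultLanguage;
    // language code -> (message id -> text)
    private final Map<String, Map<String, String>> messages = new HashMap<>();

    LocalizationSketch(String defaultLanguage) {
        this.defaultLanguage = defaultLanguage;
    }

    void addMessage(String language, String id, String text) {
        messages.computeIfAbsent(language, k -> new HashMap<>()).put(id, text);
    }

    // Try the client's language, then the default language, then the raw id.
    String get(String clientLanguage, String id) {
        String text = lookup(clientLanguage, id);
        if (text == null) {
            text = lookup(defaultLanguage, id);
        }
        return text != null ? text : id;
    }

    private String lookup(String language, String id) {
        Map<String, String> byId = messages.get(language);
        return byId == null ? null : byId.get(id);
    }
}
```

A `sendMessageLocalized` method would then only need the player's client language and the message id.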

I love
I18n.tr(Plugin yourPlugin, String id, String... params)

Open questions:
Marker for params
How to store localization
Lazy load?

P.S.: Maybe add a default (English) param to the method.

I’m not a friend of translating content from English.

It’s better to have a name or identifier for the localized content like:

welcome.member = "Welcome member {player.name}!"
welcome.developer = "Welcome developer {player.name}!"

With the proposed translator localization, an admin’s player name could be “developer mickare”, and the message would come out as “Welcome member developer mickare”.

I’m currently working on an example of a visitor-based localization.

EDIT: I came to the conclusion that a visitor-based localization isn’t generic enough…

  1. There has been some discussion about using resource packs to store translation strings, which is how Minecraft does localization. It is also probably how Mojang plans to do things. However, I feel this would result in a lot of strings being sent to the client, and the client may even refuse to receive them, so I’m not a fan of this suggestion.
  2. Java already has ResourceBundles so whatever implementation we utilize should probably keep that in mind.
  3. We don’t want to invent a new format for storing localized strings. There are various localization services (CrowdIn, Transifex, etc.) that already take certain formats.
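
As a small illustration of point 2, this is roughly what identifier-based lookup with the JDK classes looks like. To keep the sketch self-contained, the bundles are read from in-memory strings with `PropertyResourceBundle` instead of `.properties` files on the classpath; `BundleSketch` and the manual fallback argument are my own simplifications:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Illustration only: in a real plugin the bundles would come from .properties files.
final class BundleSketch {
    // Parse properties-format text into a ResourceBundle.
    static ResourceBundle load(String properties) {
        try {
            return new PropertyResourceBundle(new StringReader(properties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Look the key up in the preferred bundle, otherwise in the fallback bundle,
    // then fill in the arguments with a locale-aware MessageFormat.
    static String message(ResourceBundle bundle, ResourceBundle fallback,
                          Locale locale, String key, Object... args) {
        ResourceBundle source = bundle.containsKey(key) ? bundle : fallback;
        return new MessageFormat(source.getString(key), locale).format(args);
    }
}
```

The properties format is also one of the formats the usual translation services accept, which fits point 3.
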

Oh, that’s not a problem. You only change the hardcoded fallback message (it actually is an identifier) when there is a substantial change, when its meaning has changed (like “{0} was killed by {1}” to “{1} killed {0} with a sword”).

If you want to change a detail like formatting, you can just make a translation file for English as well (or whatever the hardcoded language is).

When the “version” of a message changes, you can change the identifier.

Of course, you don’t really need this part if you also have a translation file for the hardcoded language:

plugin.setDefaultLocale(Locale.EN)

From experience I can say that MC’s built-in format doesn’t quite match anything that CrowdIn offers. It’s closest to .ini, but CrowdIn doesn’t handle quotes and some other ‘special’ characters correctly in that mode.

They all support gettext, and we wouldn’t be inventing a new format if we used it.

http://docs.transifex.com/developer/formats/gettext

I think gettext is the best choice:

  • Many features (plurals, number formats, etc…)
  • There are tools for converting source code (automatic translation file creation)
  • Even without tools, it’s easy to localize a one-language plugin: the only thing you have to do is translate the messages, nothing else
  • You don’t have to think of a name for each message
  • There is always the hardcoded fallback message
  • It’s easy to read the source code

gettext is also what most open source software uses, if that helps. GNOME and KDE, for example.

On the other hand, if we use MC style we can generate a resource pack for the translations, offer it to the client on connect, and fall back to the English translation if they reject it. That lets us mostly push how translations are formatted and such to Mojang and the client.


That sounds nice and I like it.
A nice feature would be to offer an interface where plugins can pre-register simple translations.


… But as far as I remember the JSON formatting of messages with translations AND action elements gets tricky.

Translating something like: "<hover text="generic stuff">mickare</hover> joined the game"

{
  "translate":"multiplayer.player.joined",
  "with": [
     "generic stuff",
     "mickare"
  ]
}

I think this won’t work.
You would have to split it on serverside and then you are back at the start, managing the localization on the server.

Am I wrong?

Personally I favor the ResourceBundle @sk89q mentioned.

But how can we fill the localized strings with content?


With my plugins I developed my own syntax for localization content.
It’s really uncomplicated and easy to understand.

Greetings {player.name}! Your UUID is {player.uuid}

I call it an “object-oriented text replacement”.

So how can we achieve this with a generic approach?

All classes that want to be usable in the localization output need to include an annotation “Localization”:

public @interface Localization {
  String value();
}

By adding this annotation to any class and its methods you can tell the system what is available and where to get it.

@Localization("player")
public class ExamplePlayer extends Player { 

  private final UUID uuid = new UUID(12512315, 634543453);
  private final String name = "Hodor";

  @Localization("uuid")
  public UUID getUuid() {
    return uuid;
  }

  @Localization("name")
  public String getName() {
    return name;
  }

  public void sendMessage(String text) {
     super.sendMessage("Hodor");
  }

}

When a method in a Localization Bundle is called, the annotated methods are invoked as they are requested.
Objects without the @Localization annotation are converted to a String via .toString().


Case fill("{player.name}", new ExamplePlayer() );

  1. An ExamplePlayer object will be resolved to the field “player”.
  2. From player the field .name is resolved to the method .getName()
  3. {player.name} is replaced with the result from .getName()

Case fill("{player.uuid}", new ExamplePlayer() );

  1. An ExamplePlayer object will be resolved to the field “player”.
  2. From player the field .uuid is resolved to the method .getUuid()
  3. UUID has no annotation @Localization
  4. {player.uuid} is replaced with the result from .getUuid().toString()

Examples:

  • {ban.banned_by.uuid}
  • {player.stats.kills.count}

The information about the classes does not have to be generated every time; it can be statically cached in a provider class.
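
Roughly, the resolution and caching described above could look like the sketch below. It is a simplified stand-in, not the real implementation: the annotation is redeclared here with runtime retention so reflection can see it, `PlaceholderFiller` is a made-up name, and `ExamplePlayer` is a plain class instead of a real Player. Placeholders that cannot be resolved are left untouched:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Runtime retention is required, otherwise reflection cannot read the annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Localization {
    String value();
}

// Stand-in for the ExamplePlayer above, without the Player superclass.
@Localization("player")
class ExamplePlayer {
    private final UUID uuid = new UUID(12512315, 634543453);

    @Localization("uuid")
    public UUID getUuid() { return uuid; }

    @Localization("name")
    public String getName() { return "Hodor"; }
}

// Hypothetical resolver for illustration only.
final class PlaceholderFiller {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([\\w.]+)\\}");
    // Cache: class -> (annotated name -> getter), built once per class.
    private static final Map<Class<?>, Map<String, Method>> CACHE = new ConcurrentHashMap<>();

    private static Map<String, Method> getters(Class<?> type) {
        return CACHE.computeIfAbsent(type, t -> {
            Map<String, Method> found = new HashMap<>();
            for (Method method : t.getMethods()) {
                Localization mark = method.getAnnotation(Localization.class);
                if (mark != null && method.getParameterCount() == 0) {
                    found.put(mark.value(), method);
                }
            }
            return found;
        });
    }

    static String fill(String template, Object root) {
        Localization rootMark = root.getClass().getAnnotation(Localization.class);
        if (rootMark == null) {
            throw new IllegalArgumentException("Root object has no @Localization annotation");
        }
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String[] path = m.group(1).split("\\.");
            String replacement = m.group(); // default: leave the placeholder as it is
            if (path[0].equals(rootMark.value())) {
                Object value = resolve(root, path);
                if (value != null) {
                    replacement = String.valueOf(value); // unannotated objects use toString()
                }
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Walk the path after the root name, one annotated getter per segment.
    private static Object resolve(Object root, String[] path) {
        Object current = root;
        for (int i = 1; i < path.length && current != null; i++) {
            Method getter = getters(current.getClass()).get(path[i]);
            if (getter == null) {
                return null;
            }
            try {
                current = getter.invoke(current);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return current;
    }
}
```

Longer paths like {ban.banned_by.uuid} work the same way as long as every intermediate class has annotated getters.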

What do you think?
You can be straightforward to me, I am in a mood to accept criticism. ^^

Instead of
System.out.println(i18n.trn("You killed one player.", "You killed {0} players!", 2, kills));
why not just use a ternary operator?

Java also has MessageFormat.

Gettext totally supports this.
What you would usually do is simply include an “en” translation file as well, translating “was killed by” to “has been killed” for EN. That way, you can even translate differently for en_GB and en_US.

Is this resolved by using the new TextTemplates? We could create several translation_lang.hcon files and then use the template the client needs. Or will there be another solution from Sponge? If not, what would be the best practice for combining TextTemplates and translation?

For a gettext analogue: there is a Localization API in the works. It looks like it will be very configurable, so we don’t have to worry about it.

As for TextTemplates: no, I don’t think so. For this purpose, formatters are usually used, not templates. They have different scopes. A TextTemplate takes a template and several Texts and turns them into a Text; it is designed to be made with a Builder and stored in Configurate’s HOCON format, which looks like raw Minecraft JSON (which is good, but cumbersome to translate). An example of its use is the Pagination service. Formatters like MessageFormat are designed to take a single human-readable, editable format string, any Objects, and a Locale, and turn them into a String. They take into account that different countries use different formats for floats, different thousands separators, different date-time formats, etc. They also natively handle things like plurals (which are usually a pain in languages other than English).

tl;dr: Formatters are for content, TextTemplates are for presentation.
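
A quick sketch of what the formatter side looks like with the JDK's own `java.text.MessageFormat`. The wrapper class `FormatterSketch` and the example pattern are made up for illustration; the choice syntax and the locale-aware number formatting are standard MessageFormat features:

```java
import java.text.MessageFormat;
import java.util.Locale;

// Thin illustrative wrapper around java.text.MessageFormat.
final class FormatterSketch {
    // Format a pattern for a given locale; the locale drives number and date formatting.
    static String format(Locale locale, String pattern, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        // A choice sub-format picks the text by numeric range.
        String pattern = "You killed {0,choice,0#no players|1#one player|1<{0,number,integer} players}.";
        System.out.println(format(Locale.ENGLISH, pattern, 1)); // You killed one player.
        System.out.println(format(Locale.ENGLISH, pattern, 3)); // You killed 3 players.

        // The same number, grouped according to the locale.
        System.out.println(format(Locale.US, "{0,number,integer}", 1234567));      // 1,234,567
        System.out.println(format(Locale.GERMANY, "{0,number,integer}", 1234567)); // 1.234.567
    }
}
```

Note that the choice syntax only covers simple numeric ranges, so languages with more complicated plural rules would still need something beyond this.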

P.S. Also, there is PlaceholderAPI by AniSkywalker, but I haven’t looked into it yet.

@VcSaJen Thanks a lot for your post! The WIP LocalizationApi is what I was looking for. So we have to wait until release version 5. I am happy that there will be an official solution for localization.