Translation at the Carpentries:

Technology past, present, and future



Joel Nitta

Carpentries logo

How to translate?

  • Not as simple as just re-writing text in another language as if translating a novel

  • Carpentries lessons are technical documents (rendered using software) and therefore present unique challenges

Symbol of letter A translated to Japanese hiragana

Challenges in technical translation

  • Need to be able to
    • update translation when original changes
    • deal with source code vs. rendered version
  • Most solutions for translating source code (gettext and PO files) are designed with software in mind, not prose text
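
To make the two requirements above concrete, here is a minimal sketch in base R of what a PO entry looks like and how changed source text might be detected. The helper names (escape_po, make_po_entries, compare_source) are hypothetical and written only for this illustration; they are not part of gettext or of any translation package, and the escaping is simplified (real PO files also escape backslashes and newlines).

```r
# Hypothetical helpers for illustration only; not part of gettext or any package.

# Escape double quotes so a string can sit inside a PO msgid "..." field.
# Simplified: real PO files also escape backslashes and newlines.
escape_po <- function(x) {
  gsub("\"", "\\\"", x, fixed = TRUE)
}

# One PO entry per source paragraph, with an empty msgstr awaiting translation.
make_po_entries <- function(paragraphs) {
  sprintf("msgid \"%s\"\nmsgstr \"\"", escape_po(paragraphs))
}

# Compare the source strings of an earlier PO file with the current lesson:
# "obsolete" strings no longer appear in the lesson, and "untranslated"
# strings are new and have no entry yet.
compare_source <- function(old_msgids, new_paragraphs) {
  list(
    obsolete = setdiff(old_msgids, new_paragraphs),
    untranslated = setdiff(new_paragraphs, old_msgids)
  )
}

old_source <- c("Open the terminal.", "Type `ls` and press Enter.")
new_source <- c("Open the terminal.", "Type `ls -l` and press Enter.")

# Print the PO entries for the current version of the lesson
cat(make_po_entries(new_source), sep = "\n\n")

# Find what changed since the last translation
changes <- compare_source(old_source, new_source)
```

Because each source string is the lookup key (msgid), even a small edit to the original produces a new untranslated entry. This is one reason tooling built for short software messages can be awkward for paragraphs of prose.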

PO files

Translation workflow using PO files

What is technical translation anyway?

Two aspects:

  • internationalization (i18n): Providing the framework to support translation (requires technical knowledge)

  • localization (l10n): Actually translating strings (requires linguistic knowledge)

Globes with different continents at center

Past approach (“Styles” format)

  • The current Carpentries lesson format is called the “Styles” format

Programming With Python built with the Styles template on 2022-01-27

  • The Styles format is based on Jekyll (and some other tools)

  • Translation system designed by David Pérez-Suárez used a tool called PO4gitbook

PO4gitbook

  • All translations controlled from a central repo with submodules for each lesson (can track changes)

  • Rendering not straightforward

    • (most) Translators can’t “preview” lesson
  • Various methods used to localize

    • Transifex (cloud-based)
    • Poedit (local editor)
    • GitHub (online code review; used by JA community)

Screenshot of https://github.com/carpentries-i18n/i18n

Case study: Carpentries-JA

  • Check translations using GitHub PR review tools

Screenshot of PR review of translation on GitHub

Case study: Carpentries-JA

  • Completed r-novice-gapminder; working on git-novice and shell-novice

Screenshot of progress on GitHub projects in Carpentries-JA repo

Case study: Carpentries-JA

Screenshot of SWC workshop website in Japanese

Case study: Carpentries-JA

  • Green stickies 💚
    • GitHub works well for collaboration
    • Can do translation review in browser
  • Red stickies
    • Requirement for git knowledge is a very high barrier to participation
    • Leads to burnout because only a few members can contribute

New approach (“Workbench” format)

  • The upcoming Carpentries lesson format is called “Workbench”, developed by Zhian Kamvar

Programming With Python built with the Carpentries Workbench on 2022-01-27

  • The Workbench format is based on R Markdown and Pandoc

    • Rendering of lessons is greatly simplified
  • I am developing an R package called dovetail to facilitate translation in the Workbench format

sandpaper R package logo

dovetail

  • Each translation is contained within each lesson

  • Rendering is easily accomplished locally by the translator

  • Plan to have a standard system for translation (e.g., pushing/pulling from Transifex)

dovetail

library(dovetail)

# Copy (untranslated) files needed for rendering lesson
create_locale("ja")

# Create PO files ----
create_po_for_locale("ja")

# Edit PO files ----
# for example, with
# usethis::edit_file("po/ja/01-introduction.po")

# Translate md files ----
# translate all (R)md files at once to `./locale/{lang}/`
translate_md_for_locale("ja")

# Build translated lesson ----
sandpaper::build_lesson("locale/ja/")
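
As a rough illustration of the "translate md files" step, the sketch below shows the general idea of substituting translated strings into markdown text, in base R. This is not dovetail's implementation: translate_lines and the toy translation table are hypothetical, and a real workflow reads msgid/msgstr pairs from PO files and works on parsed markdown rather than raw lines.

```r
# Toy translation table (assumed for illustration): names are source strings
# (msgid), values are translations (msgstr). An empty string means the entry
# has not been translated yet.
translations <- c(
  "Introduction" = "はじめに",
  "Open the terminal." = ""
)

# Hypothetical helper: replace each line that has a non-empty translation and
# leave everything else (untranslated or unknown lines) as in the original.
translate_lines <- function(lines, translations) {
  msgstr <- unname(translations[match(lines, names(translations))])
  use <- !is.na(msgstr) & nzchar(msgstr)
  lines[use] <- msgstr[use]
  lines
}

source_lines <- c("Introduction", "Open the terminal.", "Type `ls`.")
result <- translate_lines(source_lines, translations)
print(result)
```

Falling back to the source text for untranslated strings means a partially translated lesson can still be rendered and previewed.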

dovetail

Output of translation

|-- CONTRIBUTING.md             # - Carpentries Rules for Contributions
|-- README.md                   # - Describes lesson
...
|-- po                          # - NEW, contains PO files for translation
|   `-- ja/                     
|       |-- 00-introduction.po 
|       |-- CONTRIBUTING.po     
|       |...                    
|-- locale                      # - NEW, contains translated files
|   `-- ja/                     
|       |-- CONTRIBUTING.md     # - NEW, translated markdown files
|       |-- site/               # - NEW, translated, rendered site
|           |-- built/          
|           |...               
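
The layout above suggests a simple mapping from a source file to its PO file and its translated copy. The sketch below expresses that mapping in base R. The functions po_path and locale_path are hypothetical, and the rule they encode is an assumption inferred from the tree shown here, not taken from dovetail itself.

```r
# Hypothetical helpers; the path rules are assumptions based on the tree above.

# PO file for a source file: po/{lang}/{file name without extension}.po
po_path <- function(md_file, lang) {
  stem <- tools::file_path_sans_ext(basename(md_file))
  file.path("po", lang, paste0(stem, ".po"))
}

# Translated copy of a source file: locale/{lang}/{original relative path}
locale_path <- function(md_file, lang) {
  file.path("locale", lang, md_file)
}

print(po_path("CONTRIBUTING.md", "ja"))
print(locale_path("CONTRIBUTING.md", "ja"))
```

Keeping both directories inside the lesson repository is what lets each lesson carry its own translations.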

dovetail design philosophy

  • Make it easier for maintainers to maintain (i18n)
    • Not dependent on one person maintaining one central repo
  • Make it easier for translators to translate (l10n)
    • Requires minimal technical knowledge to participate (don’t need git)

Promote participation in Carpentries by translating!

  • By making translation (l10n) simple, we can encourage participation and grow local communities

Sprouting plant