Translations at The Carpentries

Joel Nitta
Chiba University
https://www.joelnitta.com
https://joelnitta.github.io/bioc_cab_2023-07-14

What is The Carpentries?

  • A non-profit supporting free (or very low cost) workshops to learn best practices for research computing

Photograph of a Carpentries workshop with participants at their computers and instructors standing

Example Carpentries lesson

Screenshot of Carpentries lesson

https://carpentries-incubator.github.io/targets-workshop

How to translate?

  • Not as simple as just re-writing text in another language as if translating a novel

  • Carpentries lessons are technical documents (rendered using software) and therefore present unique challenges

Symbol of letter A translated to Japanese hiragana

Challenges in technical translation

  • Need to be able to
    • update translation when original changes
    • deal with source code vs. rendered version
  • (I am focusing on translating lessons, i.e., teaching materials. Translating contents of R packages is a different matter.)
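The first challenge, keeping a translation in sync with a changing original, can be sketched in a few lines of base R. This is only an illustration: the strings and variable names are hypothetical, and the translations are keyed by their original English string, similar in spirit to the msgid/msgstr pairs of a PO file.

```r
# Hypothetical translations, keyed by the original English string
translations <- c(
  "Run the workflow" = "ワークフローを実行する",
  "Load the data"    = "データを読み込む"
)

# Hypothetical source strings after an upstream edit:
# "Load the data" has become "Load the raw data"
current_source <- c("Run the workflow", "Load the raw data")

# Source strings with no matching translation need (re)translating
needs_translation <- setdiff(current_source, names(translations))

# Translations whose source string no longer exists are obsolete
obsolete <- setdiff(names(translations), current_source)

print(needs_translation)
print(obsolete)
```

Keying on the source string means an edit to the original shows up as a string that needs work, rather than silently leaving a stale translation in place.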

Example Carpentries lesson (again)

Screenshot of Carpentries lesson

https://carpentries-incubator.github.io/targets-workshop/basic-targets.html#run-the-workflow

Example Carpentries lesson (code)

Screenshot of Carpentries lesson code

https://github.com/joelnitta/targets-workshop/blob/4588d719f590d134aea783152cd1bd4c7695f012/episodes/basic-targets.Rmd#L235

What is technical translation anyway?

Two aspects:

  • internationalization (i18n): Providing the framework to support translation (requires technical knowledge)

  • localization (l10n): Actually translating strings (requires linguistic knowledge)
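A minimal sketch of how the two aspects divide the work, in base R. Everything here is an assumption made for illustration (the lesson lines, the `dictionary` lookup, and the simple fence-toggling logic); it is not how any particular Carpentries tool is implemented.

```r
# Three backticks, built programmatically so this example nests cleanly
fence <- strrep("`", 3)

# Hypothetical R Markdown source, one element per line
rmd <- c(
  "## Run the workflow",
  "",
  paste0(fence, "{r}"),
  "tar_make()",
  fence,
  "",
  "The workflow has finished."
)

# i18n: decide which lines are translatable.
# Lines between (and including) fence lines are code and are left alone.
is_fence <- startsWith(rmd, fence)
in_chunk <- (cumsum(is_fence) %% 2 == 1) | is_fence
translatable <- rmd[!in_chunk & nzchar(rmd)]

# l10n: a translator supplies the target-language strings
dictionary <- c(
  "## Run the workflow"        = "## ワークフローを実行する",
  "The workflow has finished." = "ワークフローが完了しました。"
)

# Substitute translated prose; code lines stay exactly as written
translated <- rmd
idx <- !in_chunk & rmd %in% names(dictionary)
translated[idx] <- unname(dictionary[rmd[idx]])

print(translated)
```

The extraction and substitution steps need programming knowledge, while filling in `dictionary` needs only knowledge of both languages.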

Globes with different continents at center

Past approach (“Styles” format)

  • The old Carpentries lesson format is called the “Styles” format

  • The Styles format is based on Jekyll (and some other tools)

  • Translation system designed by David Pérez-Suárez used a tool called PO4gitbook

Past approach (“Styles” format)

R lesson in Japanese

Past approach (“Styles” format)

  • All translations controlled from a central repo with submodules for each lesson

  • Rendering not straightforward

    • Most translators can’t preview the rendered lesson
  • Various methods used to localize

    • Transifex (cloud-based)
    • Poedit (local editor)
    • GitHub (online code review; used by the JA community)

New approach (“Workbench” format)

  • The upcoming Carpentries lesson format is called “Workbench”, developed by Zhian Kamvar

  • The Workbench format is based on R and Pandoc-flavoured Markdown

    • Rendering of lessons is greatly simplified
  • I am developing an R package called dovetail to facilitate translating lessons in the Workbench format

sandpaper R package logo

dovetail

  • Each translation is contained within its own lesson

  • Rendering is easily accomplished locally by the translator

  • Plan to have a standard system for localization (e.g., pushing/pulling from Crowdin)

dovetail

library(dovetail)

# Copy (untranslated) files needed for rendering lesson
create_locale("ja")

# Create PO files ----
create_po_for_locale("ja")

# Edit PO files ----
# for example, with
# usethis::edit_file("po/ja/01-introduction.po")

# Translate md files ----
# translate all (R)md files at once to `./locale/{lang}/`
translate_md_for_locale("ja")

# Build translated lesson ----
sandpaper::build_lesson("locale/ja/")
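
For the "Edit PO files" step, it can help to see which strings are still untranslated. The sketch below uses only base R and assumes the files follow the standard gettext layout of single-line `msgid`/`msgstr` pairs; the entries are hypothetical, and real PO files may also contain headers, comments, and multi-line strings that this does not handle.

```r
# Hypothetical contents of a small PO file, one element per line
po_lines <- c(
  'msgid "Introduction"',
  'msgstr "はじめに"',
  '',
  'msgid "Run the workflow"',
  'msgstr ""'
)

# Pull out the quoted text of each msgid and msgstr line
msgid  <- sub('^msgid "(.*)"$',  "\\1", grep("^msgid ",  po_lines, value = TRUE))
msgstr <- sub('^msgstr "(.*)"$', "\\1", grep("^msgstr ", po_lines, value = TRUE))

# An empty msgstr means the string has not been translated yet
untranslated <- msgid[msgstr == ""]

print(untranslated)
```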

dovetail

Output of translation

|-- CONTRIBUTING.md             # - Carpentries Rules for Contributions
|-- README.md                   # - Describes lesson
...
|-- po                          # - NEW, contains PO files for translation
|   `-- ja/                     
|       |-- 00-introduction.po
|       |-- CONTRIBUTING.po     
|       |...                    
|-- locale                      # - NEW, contains translated files
|   `-- ja/                     
|       |-- CONTRIBUTING.md     # - NEW, translated markdown files
|       |-- site/               # - NEW, translated, rendered site
|           |-- built/          
|           |...               

dovetail design philosophy

  • Make it easier for maintainers to maintain (i18n)
    • Not dependent on one person maintaining one central repo
  • Make it easier for translators to translate (l10n)
    • Requires minimal technical knowledge to participate (no need for Git)

Promote participation in Carpentries by translating!

  • By making translation (l10n) simple, we can encourage participation and grow local communities

Sprouting plant