Architecture
This page describes how foreignthon-core works internally.
Pipeline
source.xx.py
│
▼
_check_shebang() ← reads "# foreignthon: xx" if present
│
▼
load_pack(lang_code) ← discovers + validates the JSON pack
│
▼
_apply_postfix_syntax() ← rewrites "expr @@keyword:" lines
│
▼
_swap_tokens() ← tokenizer pass: replaces NAME tokens
│
▼
standard Python string ← ready to compile or write to disk
Module overview
| Module | Responsibility |
|---|---|
| transpiler.py | The engine: postfix rewriter and tokenizer pass |
| pack.py | Pack discovery, loading, and validation |
| cli.py | Click commands (fpy run, fpy compile, etc.) |
| errors.py | Bilingual exception hook |
| template.json | Canonical set of all keywords/builtins a pack must cover |
Tokenizer-based translation
ForeignThon uses Python's standard tokenize module rather than regex or AST manipulation.
tokenize.generate_tokens() splits source code into typed tokens. ForeignThon only looks at NAME tokens — identifiers. It replaces any NAME token whose string appears as a key in the active pack mapping. All other token types (strings, comments, operators, numbers) pass through unchanged.
This gives three important guarantees:
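- Strings are safe. A keyword inside "..." is a STRING token, never a NAME, so it is never touched. The same holds for f"..." up to Python 3.11; from Python 3.12 the tokenizer splits f-strings into FSTRING_START, FSTRING_MIDDLE, and FSTRING_END tokens, so literal text is still never a NAME, but names inside {...} replacement fields are emitted as NAME tokens.
- Comments are safe. Comment tokens are passed through verbatim.
- Variable names are safe. A variable like si_condition contains si only as a substring; as a NAME token it is si_condition, which is not in the mapping.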
The whitespace between tokens is preserved by tracking (row, col) positions and copying the gaps from the original source.
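The token pass described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual _swap_tokens(): the name swap_names and the offset bookkeeping are assumptions made for the example. It converts each token's (row, col) position to an absolute offset, copies the gap before the token from the original source, and substitutes only NAME tokens found in the mapping.

```python
import io
import tokenize


def swap_names(source: str, mapping: dict) -> str:
    """Hypothetical stand-in for _swap_tokens(): swap NAME tokens, keep the rest."""
    # Absolute offset of the start of each line, split the way the tokenizer reads them.
    line_starts = [0]
    for line in io.StringIO(source):
        line_starts.append(line_starts[-1] + len(line))

    def offset(pos):
        row, col = pos
        base = line_starts[min(row - 1, len(line_starts) - 1)]
        return min(base + col, len(source))

    out = []
    cursor = 0  # end offset of the last token that was emitted
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        start = max(offset(tok.start), cursor)
        end = max(offset(tok.end), start)
        out.append(source[cursor:start])  # whitespace gap, copied verbatim
        text = source[start:end]          # original token text
        if tok.type == tokenize.NAME and tok.string in mapping:
            text = mapping[tok.string]    # only mapped NAME tokens change
        out.append(text)
        cursor = end
    out.append(source[cursor:])
    return "".join(out)


example = 'si x > 0:  # si en comentario\n    escribir("si")\nsi_condition = 1\n'
translated = swap_names(example, {"si": "if", "escribir": "print"})
```

In this example the comment, the string literal, and the identifier si_condition are all left as they were; only the standalone si and escribir names are replaced.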
Pack discovery
Language packs register themselves using Python entry points:
# in foreignthon-es/pyproject.toml
[project.entry-points."foreignthon.langs"]
es = "foreignthon_es"
pack.py calls importlib.metadata.entry_points(group="foreignthon.langs") at runtime to discover all installed packs. Installing a pack is sufficient — no configuration file needs to be edited.
Each pack module must expose:
from importlib.resources import files
from pathlib import Path
def get_pack_path() -> Path:
    return files(__name__) / "xx.json"
The core calls get_pack_path() to locate the JSON, loads it, and validates that all required sections are present.
Results are cached with @lru_cache so each pack is loaded at most once per process.
Pack mapping
Four sections of the JSON are merged into a single flat dict for translation:
mapping = {}
mapping.update(pack["keywords"])
mapping.update(pack["builtins"])
mapping.update(pack["exceptions"])
mapping.update(pack["stdlib"])
The merged mapping is { foreign_word: english_word }. It is passed directly to _swap_tokens().
If two sections define the same foreign key, later sections win (stdlib last). In practice this does not occur because pack authors ensure uniqueness.
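Since uniqueness is left to pack authors, a small check can make collisions visible. The sketch below is a hypothetical helper (build_mapping is not a name from the codebase) that performs the same merge in the same order while recording which sections define each foreign key.

```python
SECTIONS = ("keywords", "builtins", "exceptions", "stdlib")


def build_mapping(pack: dict):
    """Merge the four sections; also report foreign keys defined more than once."""
    mapping = {}
    seen = {}
    for section in SECTIONS:
        for foreign, english in pack.get(section, {}).items():
            seen.setdefault(foreign, []).append(section)
            mapping[foreign] = english  # later sections overwrite earlier ones
    collisions = {key: where for key, where in seen.items() if len(where) > 1}
    return mapping, collisions


demo_pack = {
    "keywords": {"si": "if"},
    "builtins": {"imprimir": "print"},
    "exceptions": {},
    "stdlib": {"si": "sys"},  # deliberately clashes with the keyword
}
mapping, collisions = build_mapping(demo_pack)
```

Here mapping["si"] ends up as "sys" because stdlib is merged last, and collisions names both sections involved.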
Postfix syntax (@@)
The @@ operator is a source-level pre-processing step that runs before tokenization.
A line like:
x > 0 @@si:
    escribir(x)
is rewritten to:
si x > 0:
    escribir(x)
The rewriter uses a regex that matches (.+?)@@(<keyword>) and moves the keyword to the front. It only operates on lines that contain @@, preserving indentation and line endings.
@@ is never valid Python and never appears in the tokenizer output.
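A line-based rewriter along these lines might look like the following sketch. The function name apply_postfix and the exact pattern are assumptions; in particular, this version requires the keyword to be followed by a colon, which is stricter than the (.+?)@@(<keyword>) pattern described above.

```python
import re


def apply_postfix(source: str, postfix_keywords: list) -> str:
    """Hypothetical sketch: move "expr @@kw:" to "kw expr:" line by line."""
    # Longest keywords first so a short keyword cannot shadow a longer one.
    alternatives = "|".join(
        re.escape(kw) for kw in sorted(postfix_keywords, key=len, reverse=True)
    )
    pattern = re.compile(
        rf"^(?P<indent>[ \t]*)(?P<expr>.+?)[ \t]*@@(?P<kw>{alternatives})(?P<rest>[ \t]*:.*)$"
    )
    out = []
    for line in source.splitlines(keepends=True):
        if "@@" not in line:
            out.append(line)  # untouched lines are copied verbatim
            continue
        body = line.rstrip("\r\n")
        eol = line[len(body):]  # keep the original line ending
        match = pattern.match(body)
        if match is None:
            out.append(line)
            continue
        out.append(
            match["indent"] + match["kw"] + " " + match["expr"] + match["rest"] + eol
        )
    return "".join(out)


rewritten = apply_postfix("x > 0 @@si:\n    escribir(x)\n", ["si", "mientras"])
```

Note that a purely line-based pass like this one cannot tell whether @@ sits inside a string literal, so it should be read as an illustration of the rewrite rather than a complete treatment.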
Decompile direction: fpy decompile --postfix does the reverse — it looks for lines of the form foreign_kw expr: where foreign_kw is in the pack's postfix_keywords list, and rewrites them to expr @@foreign_kw:.
Bilingual error hook
errors.py installs a custom sys.excepthook before running user code:
- On exception, it looks up the exception type name in the pack's exceptions section (reverse map: English → foreign).
- It looks up a translated message in error_messages.
- It prints [XX] ForeignName: translated_msg, then [EN] EnglishName: original_msg.
- It always calls traceback.print_exception() afterwards so the full traceback is shown.
Tracebacks point to the original .xx.py file. This is achieved by populating linecache.cache with the original source before exec()-ing the compiled code, so Python's traceback machinery reads the right lines.
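The hook and the linecache step can be sketched as below. All function names are hypothetical, and the layout of error_messages (keyed here by English exception name) is an assumption, since the pack schema is not spelled out on this page.

```python
import linecache
import sys
import traceback


def register_source(filename: str, source: str) -> None:
    """Seed linecache so tracebacks can display lines of the original file."""
    lines = source.splitlines(keepends=True)
    # Entry layout used by linecache: (size, mtime, lines, fullname).
    # mtime=None stops linecache.checkcache() from discarding the entry.
    linecache.cache[filename] = (len(source), None, lines, filename)


def bilingual_header(exc: BaseException, pack: dict, lang_code: str) -> list:
    """Build the two header lines: foreign first, then English."""
    english_name = type(exc).__name__
    reverse = {en: foreign for foreign, en in pack.get("exceptions", {}).items()}
    foreign_name = reverse.get(english_name, english_name)
    # ASSUMPTION: error_messages maps English exception names to translations.
    translated = pack.get("error_messages", {}).get(english_name, str(exc))
    return [
        f"[{lang_code.upper()}] {foreign_name}: {translated}",
        f"[EN] {english_name}: {exc}",
    ]


def install_hook(pack: dict, lang_code: str) -> None:
    """Replace sys.excepthook with the bilingual version."""
    def hook(exc_type, exc, tb):
        for line in bilingual_header(exc, pack, lang_code):
            print(line, file=sys.stderr)
        traceback.print_exception(exc_type, exc, tb)  # always show the traceback

    sys.excepthook = hook
```

For the traceback to pick up the registered lines, the translated code would also need to be compiled under the same filename, for example compile(python_source, filename, "exec"), and the translation has to keep line numbers aligned with the original file.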
Custom pack override
When .foreignthon.toml declares custom_pack = "path/to/custom.json":
- If the custom JSON has meta.code set, it is treated as a standalone pack and used directly.
- If meta.code is absent, it is treated as an override: it is merged on top of the installed pack, replacing only the keys it defines.
The CLI (cli.py) handles this in _load_effective_pack() by walking up the directory tree to find .foreignthon.toml.
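The two behaviours, plus the upward search for the config file, might be sketched like this. The names find_config and effective_pack are hypothetical, and the merge granularity (entries merged section by section, one level deep) is an assumption about what "replacing only the keys it defines" means.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = ".foreignthon.toml"


def find_config(start: Path) -> Optional[Path]:
    """Walk from start up to the filesystem root looking for the config file."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate
    return None


def effective_pack(installed: dict, custom: dict) -> dict:
    """Return the custom pack as-is if standalone, else overlay it on the installed one."""
    if custom.get("meta", {}).get("code"):
        return custom  # standalone pack: used directly
    # Copy one level deep so the installed pack is not mutated.
    merged = {
        name: dict(section) if isinstance(section, dict) else section
        for name, section in installed.items()
    }
    for name, section in custom.items():
        if isinstance(section, dict) and isinstance(merged.get(name), dict):
            merged[name].update(section)  # only the defined keys are replaced
        else:
            merged[name] = section
    return merged
```

Copying before merging matters if the installed pack comes from a cache, since an in-place update would leak the override into every later lookup.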
File naming and language detection
Language detection order (highest priority first):
1. --lang CLI flag
2. Shebang comment # foreignthon: xx on the first line
3. Double extension .xx.py → xx
4. Fallback to "en" (no-op, since English is Python)
_detect_lang() and _check_shebang() in transpiler.py implement steps 3 and 2 respectively. Step 1 is handled by the run command in cli.py.
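The priority order can be expressed compactly. The sketch below uses hypothetical helper names rather than the real _detect_lang() and _check_shebang(), and the shebang pattern is an assumption about what counts as a valid language code.

```python
import re
from pathlib import Path
from typing import Optional

# ASSUMPTION: language codes consist of word characters and hyphens.
SHEBANG_RE = re.compile(r"^#\s*foreignthon:\s*([\w-]+)")


def check_shebang(source: str) -> Optional[str]:
    """Return the code from a '# foreignthon: xx' comment on the first line."""
    first_line = source.split("\n", 1)[0]
    match = SHEBANG_RE.match(first_line)
    return match.group(1) if match else None


def lang_from_filename(path: str) -> Optional[str]:
    """Return 'xx' for a name like 'script.xx.py', else None."""
    suffixes = Path(path).suffixes
    if len(suffixes) >= 2 and suffixes[-1] == ".py":
        return suffixes[-2].lstrip(".")
    return None


def detect_lang(path: str, source: str, cli_lang: Optional[str] = None) -> str:
    """Apply the documented priority: flag, shebang, extension, then 'en'."""
    return cli_lang or check_shebang(source) or lang_from_filename(path) or "en"
```

For instance, detect_lang("hola.es.py", "# foreignthon: fr\n") yields "fr" because the shebang outranks the extension, while passing cli_lang="de" overrides both.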