# Architecture

This page describes how `foreignthon-core` works internally.

---

## Pipeline

```
source.xx.py
    │
    ▼
_check_shebang()          ← reads "# foreignthon: xx" if present
    │
    ▼
load_pack(lang_code)      ← discovers + validates the JSON pack
    │
    ▼
_apply_postfix_syntax()   ← rewrites "expr @@keyword:" lines
    │
    ▼
_swap_tokens()            ← tokenizer pass: replaces NAME tokens
    │
    ▼
standard Python string    ← ready to compile or write to disk
```

---

## Module overview

| Module | Responsibility |
|---|---|
| `transpiler.py` | The engine: postfix rewriter and tokenizer pass |
| `pack.py` | Pack discovery, loading, and validation |
| `cli.py` | Click commands (`fpy run`, `fpy compile`, etc.) |
| `errors.py` | Bilingual exception hook |
| `template.json` | Canonical set of all keywords/builtins a pack must cover |

---

## Tokenizer-based translation

ForeignThon uses Python's standard `tokenize` module rather than regex or AST manipulation.

`tokenize.generate_tokens()` splits source code into typed tokens. ForeignThon only looks at `NAME` tokens (identifiers and keywords). It replaces any `NAME` token whose string appears as a key in the active pack mapping. All other token types (strings, comments, operators, numbers) pass through unchanged.

This gives three important guarantees:

1. **Strings are safe.** A keyword inside `"..."` is a `STRING` token, never a `NAME`, so it is never touched. The same holds for `f"..."` on Python 3.11 and earlier. From Python 3.12 (PEP 701) the tokenizer splits f-strings: literal text arrives as `FSTRING_MIDDLE` tokens, while names inside `{...}` replacement fields are emitted as `NAME` tokens.
2. **Comments are safe.** Comment tokens are passed through verbatim.
3. **Variable names are safe.** A variable like `si_condition` contains `si` only as a substring; as a `NAME` token it is `si_condition`, which is not in the mapping.

The whitespace between tokens is preserved by tracking `(row, col)` positions and copying the gaps from the original source.

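
The tokenizer pass described above can be sketched as follows. This is a minimal illustration, not the project's actual `_swap_tokens()`: the function name `swap_name_tokens` and the offset bookkeeping are assumptions made for this example. It replaces only `NAME` tokens found in the mapping and copies every other character, including whitespace, straight from the original source.

```python
import io
import tokenize


def swap_name_tokens(source: str, mapping: dict[str, str]) -> str:
    """Hypothetical sketch: replace NAME tokens listed in `mapping`."""
    # Absolute offset of each line start, using the same line splitting
    # (StringIO.readline) that the tokenizer sees.
    line_starts = [0]
    for line in io.StringIO(source):
        line_starts.append(line_starts[-1] + len(line))

    out = []
    pos = 0  # offset in `source` up to which text has already been emitted
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NAME or tok.string not in mapping:
            continue  # strings, comments, operators, other names: untouched
        start = line_starts[tok.start[0] - 1] + tok.start[1]
        end = line_starts[tok.end[0] - 1] + tok.end[1]
        out.append(source[pos:start])     # copy the gap verbatim
        out.append(mapping[tok.string])   # emit the English word
        pos = end
    out.append(source[pos:])              # copy whatever follows the last swap
    return "".join(out)


if __name__ == "__main__":
    demo = 'si_condition = 1\nsi si_condition:  # si\n    escribir("si")\n'
    print(swap_name_tokens(demo, {"si": "if", "escribir": "print"}))
```

In the demo, only the standalone `si` and `escribir` are translated; `si_condition`, the comment, and the string literal are left alone, which mirrors the three guarantees listed above.
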

---

## Pack discovery

Language packs register themselves using Python [entry points](https://packaging.python.org/en/latest/specifications/entry-points/):

```toml
# in foreignthon-es/pyproject.toml
[project.entry-points."foreignthon.langs"]
es = "foreignthon_es"
```

`pack.py` calls `importlib.metadata.entry_points(group="foreignthon.langs")` at runtime to discover all installed packs. Installing a pack is sufficient; no configuration file needs to be edited.

Each pack module must expose:

```python
from importlib.resources import files
from pathlib import Path

def get_pack_path() -> Path:
    return files(__name__) / "xx.json"
```

The core calls `get_pack_path()` to locate the JSON, loads it, and validates that all required sections are present. Results are cached with `@lru_cache` so each pack is loaded at most once per process.

---

## Pack mapping

Four sections of the JSON are merged into a single flat dict for translation:

```python
mapping = {}
mapping.update(pack["keywords"])
mapping.update(pack["builtins"])
mapping.update(pack["exceptions"])
mapping.update(pack["stdlib"])
```

The merged mapping is `{ foreign_word: english_word }`. It is passed directly to `_swap_tokens()`.

If two sections define the same foreign key, later sections win (`stdlib` is merged last). In practice this is not expected to occur, because pack authors are responsible for keeping keys unique across sections.

---

## Postfix syntax (`@@`)

The `@@` operator is a source-level pre-processing step that runs **before** tokenization. A line like:

```
x > 0 @@si:
    escribir(x)
```

is rewritten to:

```
si x > 0:
    escribir(x)
```

The rewriter uses a regex that matches `(.+?)@@()` and moves the keyword to the front. It only operates on lines that contain `@@`, preserving indentation and line endings. `@@` is not valid Python syntax and never appears in the tokenizer output.

**Decompile direction:** `fpy decompile --postfix` does the reverse: it looks for lines of the form `foreign_kw expr:` where `foreign_kw` is in the pack's `postfix_keywords` list, and rewrites them to `expr @@foreign_kw:`.

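
The forward rewrite can be illustrated with a short sketch. The pattern and the function name `apply_postfix` below are assumptions made for this example and are not taken from `transpiler.py`; the sketch only shows the general idea of moving the keyword to the front while keeping indentation and line endings intact.

```python
import re

# Hypothetical pattern (an assumption, not the project's regex):
# indentation, expression, "@@", keyword, then the rest of the line.
_POSTFIX = re.compile(
    r"^(?P<indent>[ \t]*)(?P<expr>.+?)\s*@@(?P<kw>\w+)(?P<rest>.*)$"
)


def apply_postfix(source: str) -> str:
    """Rewrite 'expr @@kw:' lines to 'kw expr:'; leave other lines as-is."""
    out = []
    for line in source.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]          # preserve the original line ending
        match = _POSTFIX.match(body) if "@@" in body else None
        if match:
            body = (
                f"{match['indent']}{match['kw']} "
                f"{match['expr']}{match['rest']}"
            )
        out.append(body + ending)
    return "".join(out)


if __name__ == "__main__":
    print(apply_postfix("x > 0 @@si:\n    escribir(x)\n"))
```

Because this sketch works on raw lines before tokenization, it would also rewrite a `@@` that happens to sit inside a string literal or comment; a real implementation may need to guard against that.
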

---

## Bilingual error hook

`errors.py` installs a custom `sys.excepthook` before running user code:

1. On exception, it looks up the exception type name in the pack's `exceptions` section (reverse map: English → foreign).
2. It looks up a translated message in `error_messages`.
3. It prints `[XX] ForeignName: translated_msg` then `[EN] EnglishName: original_msg`.
4. It always calls `traceback.print_exception()` afterwards so the full traceback is shown.

Tracebacks point to the original `.xx.py` file. This is achieved by populating `linecache.cache` with the original source before `exec()`-ing the compiled code, so Python's traceback machinery reads the right lines.

---

## Custom pack override

When `.foreignthon.toml` declares `custom_pack = "path/to/custom.json"`:

- If the custom JSON has `meta.code` set, it is treated as a **standalone pack** and used directly.
- If `meta.code` is absent, it is treated as an **override**: it is merged on top of the installed pack, replacing only the keys it defines.

The CLI (`cli.py`) handles this in `_load_effective_pack()` by walking up the directory tree to find `.foreignthon.toml`.

---

## File naming and language detection

Language detection order (highest priority first):

1. `--lang` CLI flag
2. Shebang comment `# foreignthon: xx` on the first line
3. Double extension `.xx.py` → `xx`
4. Fallback to `"en"` (no-op, since English is Python)

`_detect_lang()` and `_check_shebang()` in `transpiler.py` implement steps 3 and 2 respectively. Step 1 is handled by the `run` command in `cli.py`.
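
The priority order above can be summarised in one function. This is a sketch under stated assumptions: the name `resolve_lang`, its parameters, and the shebang regex are hypothetical, and the real logic is split across `_detect_lang()`, `_check_shebang()`, and the `run` command.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed shape of the shebang comment, e.g. "# foreignthon: es".
_SHEBANG = re.compile(r"#\s*foreignthon:\s*(\w+)")


def resolve_lang(path: str, first_line: str = "",
                 cli_lang: Optional[str] = None) -> str:
    """Hypothetical helper combining the four detection steps."""
    # 1. An explicit --lang value wins over everything else.
    if cli_lang:
        return cli_lang
    # 2. Shebang comment on the first line of the file.
    match = _SHEBANG.match(first_line)
    if match:
        return match.group(1)
    # 3. Double extension: "script.es.py" has suffixes [".es", ".py"].
    suffixes = Path(path).suffixes
    if len(suffixes) >= 2 and suffixes[-1] == ".py":
        return suffixes[-2].lstrip(".")
    # 4. Fallback: plain Python, no translation needed.
    return "en"


if __name__ == "__main__":
    print(resolve_lang("hola.es.py", "# foreignthon: fr"))
```

Note that this sketch treats any second-to-last suffix as a language code, so a file such as `data.v2.py` would resolve to `v2`; validating the code against the installed packs is left out here.
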