Here is some internal info that might be useful for new contributors trying to understand the codebase and get some work done.
To work on this project, you will just need Node 16+ (and Docker to run tests). We use npm to manage dependencies and Prettier to format our code.

These are the scripts you can run with `npm run`:
General:
- `run-dev`: Run the CLI (with `tsx`). Use `--` to pass arguments to the CLI rather than NPM:
  `npm run run-dev -- extract print --extractor react src/**/*.tsx`
- `build`: Build the CLI.
- `eslint`: Run ESLint.
- `format`: Run ESLint with `--fix`.
- `schema`: Generate REST API schemas (see REST Client).
Tests:
- `test`: Run all tests (unit & E2E). Will start the E2E Tolgee test instance.
- `test:unit`: Run unit tests only.
- `test:e2e`: Run E2E tests only. Will start the E2E Tolgee test instance.
E2E test instance:
- `tolgee:start`: Start the E2E testing instance. It will be available on port 22222.
- `tolgee:stop`: Stop the E2E testing instance.
The CLI uses commander.js to handle all of the command parsing & routing logic. Because the way we deal with arguments is more complex than what the library can do by itself, we have some extra validation logic on top of it.
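For illustration, a subcommand wired up with commander.js looks roughly like the sketch below. The option names and the validation shown are simplified assumptions, not the CLI's actual definitions.

```ts
// Minimal commander.js sketch (option names and validation are illustrative, not the real CLI code).
import { Command } from 'commander';

const program = new Command('tolgee');

program
  .command('extract')
  .description('Extract translation keys from source files')
  .option('--extractor <name>', 'extractor to use (e.g. react)')
  .argument('<patterns...>', 'glob patterns of files to analyze')
  .action((patterns: string[], options: { extractor?: string }) => {
    // The extra validation the library can't express on its own (options that
    // depend on each other or on the loaded config) runs around this point.
    console.log(options.extractor, patterns);
  });

program.parse();
```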
We use cosmiconfig to handle the loading of the `.tolgeerc` file. There is also a module that manages the authentication token store (`~/.tolgee/authentication.json`). These modules can be found in `src/config`.

The `.tolgeerc` file is loaded at program startup, and the tokens (which depend on options) are loaded before our custom validation logic runs.
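For reference, loading a config with cosmiconfig looks roughly like this (a minimal sketch; the module name passed to cosmiconfig and the fallback value are assumptions, and the real loader in `src/config` does more):

```ts
// Minimal cosmiconfig sketch (assumed module name; the actual loader in src/config is richer).
import { cosmiconfig } from 'cosmiconfig';

export async function loadTolgeeRc(): Promise<Record<string, unknown>> {
  const explorer = cosmiconfig('tolgee');
  // Searches the usual places: a "tolgee" key in package.json, .tolgeerc,
  // .tolgeerc.json/.yaml/.js, tolgee.config.js, and so on.
  const result = await explorer.search();
  return (result?.config as Record<string, unknown>) ?? {};
}
```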
ApiClient uses `openapi-typescript` to generate the TypeScript schema and `openapi-fetch` for fetching, so it is a fully typed client. Endpoints that use `multipart/form-data` are a bit problematic (check `ImportClient.ts`).
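In practice, a client built this way looks roughly like the sketch below; the base URL, header, and endpoint are illustrative, and `paths` stands for the type generated by openapi-typescript.

```ts
// Minimal openapi-fetch sketch. `paths` comes from the openapi-typescript codegen;
// the base URL, header, and endpoint below are illustrative assumptions.
import createClient from 'openapi-fetch';
import type { paths } from './schema.generated';

const client = createClient<paths>({
  baseUrl: 'https://app.tolgee.io',
  headers: { 'X-API-Key': process.env.TOLGEE_API_KEY ?? '' },
});

// Every call is checked against the generated schema: path, params and the
// response type are all validated at compile time.
const { data, error } = await client.GET('/v2/projects/{projectId}/translations', {
  params: { path: { projectId: 1000 } },
});
```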
The Tolgee Extractor/Code Analyzer is one of the biggest components of the CLI. It has the following layers (a conceptual sketch of the pipeline follows this list):
- TextMate grammars parse source code files and generate tokens.
- Mappers (`generalMapper`, `jsxMapper`, `vueMapper`) rename tokens to general Tolgee tokens (which are typed).
  - Because tokens are abstracted to general ones, we can reuse many pieces of logic across different file types.
- Mergers allow merging multiple tokens into one. This has two use cases:
  - Simplifying tokens (e.g. three tokens specifying a string can be merged into one).
  - Generating trigger tokens (e.g. `<T` is merged into `trigger.t.component`); these triggers are then mapped to custom rules.
- A very simple semantic tree is then constructed, where we identify blocks, expressions, and objects. When there is a trigger, a custom rule is applied, and there are special node types for important pieces (like `KeyInfoNode` and `NamespaceInfoNode`).
- The last step is generating a report from the semantic tree. We check whether the values are static or dynamic, and because we keep the structure of blocks, we know which `useTranslate` belongs to which `t` function.
- The tree can be manipulated before the report is generated (with the `treeTransform` function), which is used for `vue` and `svelte`, so the `script` tags are hoisted to the top and so on.
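To make the layering concrete, here is a purely conceptual walk-through for a tiny React snippet. The token and report shapes below are illustrative assumptions, not the extractor's actual internal types.

```ts
// Conceptual sketch of the extractor pipeline (all type shapes here are hypothetical).
// Input source:
//   const { t } = useTranslate('common');
//   <T keyName="hello" />

// 1. The TextMate grammar tokenizes the file into scoped tokens.
type RawToken = { scope: string; text: string };

// 2. A mapper (generalMapper/jsxMapper/vueMapper) renames them to general,
//    typed Tolgee tokens that are shared across file types.
type GeneralToken = { type: 'string.literal' | 'punctuation' | 'identifier'; text: string };

// 3. Mergers collapse sequences, e.g. the tokens making up `<T` become the
//    single trigger token `trigger.t.component`, which activates a custom rule.

// 4. The semantic tree keeps block structure, so the block that contains
//    `useTranslate('common')` is also the block that owns the `<T keyName="hello" />`
//    usage; that is how the key gets its namespace.

// 5. The report step walks the tree and emits entries along these lines:
const report = [{ keyName: 'hello', namespace: 'common', dynamic: false, line: 2 }];
```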
To add new TextMate grammars, do not do it manually! Modify the `scripts/grammars.ts` file following these steps (a hedged sketch of what the additions could look like is shown after the list):
- Add the URL to the grammar file to the `Grammars` dictionary.
- Add applicable licensing information to the `GrammarsLicense` dictionary.
- If you need to transform the TextMate grammar:
  - In the `Transformers` object, add a function that'll receive the raw TextMate grammar.
  - Make sure to add a comment to the file stating the file is modified, a link to the original, and a reason for the transformation.
  - Hint: look at how the transformation for `TypeScriptReact` is done.
- In `src/extractor/tokenizer.ts`:
  - Add a new entry to the `Grammar` enum.
  - Add a new entry to the `GrammarFiles` dict.
  - Add new cases in the `extnameToGrammar` function.
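Sketched end to end, the additions could look roughly like this. Everything below is illustrative: the `MyLang` grammar, the enum value, and the file extension are made up, and the real value shapes in the repo may differ.

```ts
// --- scripts/grammars.ts (hypothetical additions for a made-up "MyLang" grammar) ---
const Grammars = {
  // ...existing grammars...
  MyLang: 'https://example.com/mylang.tmLanguage.json',
};

const GrammarsLicense = {
  // ...existing licenses...
  MyLang: 'MIT (https://example.com/LICENSE)',
};

const Transformers = {
  // Only needed when the grammar has to be tweaked; receives the raw TextMate grammar.
  // Remember the comment stating the file is modified, the link to the original,
  // and the reason for the transformation (see how TypeScriptReact is handled).
  MyLang: (grammar: any) => grammar,
};

// --- src/extractor/tokenizer.ts (hypothetical additions) ---
enum Grammar {
  // ...existing entries...
  MY_LANG = 'source.mylang',
}

const GrammarFiles = {
  // ...existing entries...
  [Grammar.MY_LANG]: 'mylang.tmLanguage.json',
};

function extnameToGrammar(extname: string): Grammar | undefined {
  switch (extname) {
    // ...existing cases...
    case '.mylang':
      return Grammar.MY_LANG;
  }
  return undefined;
}
```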
Feel free to join the Slack channel if you have questions!
Happy hacking 🐀