The milestone was to implement 7 more transformations using the AST from DMD as a library.
After successfully reproducing D source code, we realised that the different types of D comments pose a problem for dfmt. Although the lexer generates comment tokens (`TOK.comment`), the actual comment text is discarded in the lexing stage unless it is a DDoc comment. During parsing, the DDoc comments are attached to their relevant top-level nodes. Regular comments can be stored and made retrievable during AST traversal in two ways:
- Introduce a separate field for raw comments in `struct Token` in the lexer, and similarly process the comments into a separate hash table in the parser.
- Introduce a pass before the existing syntactic parsing stage, where a concrete syntax tree (CST) is built, preserving all whitespace, newline, and comment data. In the subsequent pass, strip these nodes from the CST to generate the current AST that DMD creates.
While the second option is better suited to providing a rich syntax tree for tools like dfmt, the first option was chosen since it is minimally invasive and does not require significant changes to the compiler pipeline. These DMD changes are in progress, and upon completion will enable dfmt to retain, reproduce, and format all types of comments in the source code.
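The first option can be pictured with a toy lexer. The following is a minimal sketch in D and not DMD's actual code: `Kind`, `Token`, `rawComment`, `lex`, and `collectComments` are hypothetical stand-ins, and only `//` line comments that are separated from code by whitespace are handled.

```d
import std.ascii : isWhite;

/// Hypothetical token kinds; the real DMD token enum is far larger.
enum Kind { word, comment }

/// Toy stand-in for the lexer's token type.
struct Token
{
    Kind kind;
    string text;       // source text of a word token
    string rawComment; // hypothetical extra field: verbatim comment text
    size_t line;       // 1-based line number
}

/// Toy lexer: splits the input into whitespace-separated words and
/// `//` line comments, keeping the comment text instead of dropping it.
Token[] lex(string src)
{
    Token[] tokens;
    size_t line = 1;
    size_t i = 0;
    while (i < src.length)
    {
        if (src[i] == '\n')
        {
            line++;
            i++;
        }
        else if (isWhite(src[i]))
        {
            i++;
        }
        else if (i + 1 < src.length && src[i] == '/' && src[i + 1] == '/')
        {
            const start = i;
            while (i < src.length && src[i] != '\n')
                i++;
            tokens ~= Token(Kind.comment, null, src[start .. i], line);
        }
        else
        {
            const start = i;
            while (i < src.length && !isWhite(src[i]))
                i++;
            tokens ~= Token(Kind.word, src[start .. i], null, line);
        }
    }
    return tokens;
}

/// Toy "parser" step: gathers the comments into a hash table keyed by
/// line number, so a formatter could look them up during AST traversal.
string[size_t] collectComments(Token[] tokens)
{
    string[size_t] table;
    foreach (tok; tokens)
    {
        if (tok.kind == Kind.comment)
            table[tok.line] = tok.rawComment;
    }
    return table;
}

unittest
{
    auto table = collectComments(lex("int x; // counter\n// standalone\nint y;"));
    assert(table[1] == "// counter");
    assert(table[2] == "// standalone");
    assert(3 !in table);
}
```

Keying the table by line number is only one possible choice for this sketch; the real changes may well key comments differently, for example by full source location.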
The following transformations have been implemented:
- `dfmt_brace_style` (large): Allows configuring brace styles across Allman (default), OTBS, K&R, and Stroustrup.
- `dfmt_selective_import_space`: Adds a space between the module name and the colon for selective imports, e.g. `import std.stdio : writeln`. This is enabled by default.
- `dfmt_space_after_keywords`: Adds a space between keywords like `if`, `while`, `for`, `foreach`, `switch`, etc., and the opening parenthesis, e.g. `if (foo == bar)`. This is enabled by default.
- `dfmt_compact_labeled_statements`: Places labels on the same line as the labeled statement, e.g. `foo: while (1) {}`. This is enabled by default.
- `dfmt_space_before_named_arg_colon`: Adds a space between a named function argument or struct constructor argument and the colon, e.g. `foo(a : 0, b : true);`. This is disabled by default.
- `dfmt_template_constraint_style` (partial): Allows configuring the formatting of template constraints across `always_newline`, `always_newline_indent`, `conditional_newline`, and `conditional_newline_indent`. Two of the configuration options (`always_newline` and `always_newline_indent`) have been fully implemented, and the remaining two options (`conditional_newline` and `conditional_newline_indent`) will be taken up after implementing the line length configuration options (`max_line_length` and `dfmt_soft_max_line_length`).
- feat: add `dfmt_space_after_keywords`
- feat: add `dfmt_selective_import_space`
- feat: add brace styles
- feat: add `dfmt_compact_labeled_statements`
- feat: add `dfmt_space_before_named_arg_colon`
- feat: non-conditional template constraints styles
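To make the effect of these options concrete, here is a small D snippet laid out roughly as the formatter would be expected to emit it. It assumes the Allman brace style, the default-enabled options listed above, and `dfmt_template_constraint_style` set to `always_newline`. The function names `twice` and `sumDoubled` are made up for illustration, and the exact output of the rewrite may differ in details.

```d
// dfmt_selective_import_space: a space sits between the module name and the colon.
import std.algorithm.comparison : min;

// dfmt_template_constraint_style = always_newline: the constraint goes on
// its own line, at the same indentation as the declaration.
T twice(T)(T value)
if (is(T : long))
{
    return value * 2;
}

// dfmt_brace_style = allman: every opening brace is placed on its own line.
int sumDoubled(int limit)
{
    int total = 0;

    // dfmt_compact_labeled_statements: the label shares a line with its statement.
    // dfmt_space_after_keywords: `foreach (` and `if (`, not `foreach(` or `if(`.
    outer: foreach (i; 0 .. limit)
    {
        if (i == 2)
        {
            break outer;
        }
        total += twice(i);
    }
    return min(total, 100);
}
```

With `dfmt_space_before_named_arg_colon` left at its default (disabled), a named-argument call would keep the colon attached to the name, as in `foo(a: 0, b: true);`.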
The `dfmt_single_template_constraint_indent` and `dfmt_single_indent` transformations both allow configuring whether the code on the following line is indented by a single tab instead of two tabs. They have been removed from the list of remaining transformations for the rewrite, since they introduce formatting inconsistencies in the code, which is a problem dfmt is meant to eliminate. In all scenarios, code on the following line should ideally be indented by a single tab, irrespective of the context of the newline. This ensures formatting consistency across all types of expressions. A D forum post will be created to get some community feedback on the necessity of these two transformations before implementing them.
The `dfmt_outdent_attributes` transform has not been implemented in the existing dfmt project yet, and hence will not be a goal for the rewrite; it will instead be a good-to-have transform, outside of the scope of the next milestone.
The next milestone involves completing the `dfmt_template_constraint_style` transformation and implementing the remaining 3 feasible transformations. It will also involve research into the changes required in DMD to support the `dfmt_keep_line_breaks` configuration, which requires newline information to be preserved and available in the AST.