ok, here it goes: | Official Unity Discord | Page 1

loud drift Jan 11, 2024, 10:19 PM

#

So I am currently writing a bunch of classes for working with HLSL and ShaderLab syntax programmatically. I've been loosely basing my work on Roslyn, but because I wasn't concerned with parsing ShaderLab+HLSL, only with constructing syntax trees, I was working on a more "type safe", non-optimized approach for now.

First things first: I declared a bunch of classes such as Tree<Lang>, Node<Lang>, SyntaxOrToken<Lang>, Syntax<Lang>, Token<Lang>, Trivia<Lang>, SyntaxOrToken/Syntax/TriviaList<Lang> (abstract records) etc. They form what I would call the general "shape" a language definition should take, as well as some common, language-agnostic syntax types/constructs.

Then, I created a bunch of records like

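// simplified
record VariableDeclarationSyntax : Syntax<hlsl> {
   // each child slot is a typed, init-only property
   public TypeSyntax Type { get; init; }
   public EqualsToken EqualsToken { get; init; }
   // children exposed in a language-agnostic way for generic traversal
   public SyntaxOrToken<hlsl> Children => ...
}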

I also coded some syntax generators that extend the definitions of those records with useful data. The usage of records here was my design decision to have deep tree comparisons for free (generated by compiler), and the with syntax for immutability.
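One caveat worth sketching here: compiler-generated record equality compares each member with its default equality comparer, so the comparison is only "deep" when child collections themselves implement value equality. A plain array or List<T> member would be compared by reference. The sketch below assumes a hypothetical NodeList<T> wrapper and hypothetical Leaf/Block records; none of these names come from the library described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var a = new Block(new NodeList<Leaf>(new[] { new Leaf("x"), new Leaf("y") }));
var b = new Block(new NodeList<Leaf>(new[] { new Leaf("x"), new Leaf("y") }));
Console.WriteLine(a == b); // True: NodeList compares element by element

// `with` copies the record and replaces only the named member.
var c = a with { Children = new NodeList<Leaf>(new[] { new Leaf("z") }) };
Console.WriteLine(a == c); // False

// Hypothetical leaf node; the compiler generates value equality over Text.
record Leaf(string Text);

// Hypothetical parent node holding a list of children.
record Block(NodeList<Leaf> Children);

// Hypothetical immutable list with value equality, so that record
// equality on Block recurses into the children.
sealed class NodeList<T> : IEquatable<NodeList<T>>
{
    private readonly T[] _items;

    public NodeList(IEnumerable<T> items) => _items = items.ToArray();

    public bool Equals(NodeList<T>? other) =>
        other is not null && _items.SequenceEqual(other._items);

    public override bool Equals(object? obj) => Equals(obj as NodeList<T>);

    public override int GetHashCode()
    {
        var hash = new HashCode();
        foreach (var item in _items) hash.Add(item);
        return hash.ToHashCode();
    }
}
```

If the list types mentioned earlier (SyntaxOrTokenList<Lang> and friends) are themselves records over an array or List<T>, they would probably need a hand-written Equals like the one above to get structural comparison.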

I also decided against using something like the runtime enum field SyntaxKind that they use in Roslyn, because I was annoyed by how tedious it was to make syntax nodes and the "kind" play nicely together. Additionally, I am only concerned with correct syntax trees, so I assumed strong typing, with differentiation by type and pattern matching, to be better for DX.
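A minimal sketch of what "differentiation by type and pattern matching" can look like, as opposed to switching on a kind enum. The Syntax, VariableDeclaration and ReturnStatement records are simplified stand-ins invented for illustration, not the generated types from the library.

```csharp
using System;

Console.WriteLine(Describe(new VariableDeclaration("float4", "color"))); // vector variable color
Console.WriteLine(Describe(new ReturnStatement("color")));               // return of color

// Type patterns and property patterns replace the `switch (node.Kind)`
// plus cast that a kind enum would require.
static string Describe(Syntax node) => node switch
{
    VariableDeclaration { Type: "float4" } d => $"vector variable {d.Name}",
    VariableDeclaration d => $"variable {d.Name}",
    ReturnStatement r => $"return of {r.Expression}",
    _ => "unknown syntax",
};

// Hypothetical, heavily simplified node types.
abstract record Syntax;
record VariableDeclaration(string Type, string Name) : Syntax;
record ReturnStatement(string Expression) : Syntax;
```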

Because the design was based on roslyn, I decided my trees will also be immutable, thus they are constructed bottom-up (i.e. nodes know what is below them, but they don't know their parent, only during traversal a lazy view of the top-down tree is constructed with parent references).

#

Now, I decided against what Roslyn did with their red-green trees: internal, hidden syntax that is a bottom-up, persistent immutable tree, plus external, basically duplicated syntax classes that lazily construct parent references to nodes on access. I decided that traversal operations should be explicit and lightweight, so I created a simple data type that wraps a generic tree node together with a reference to its parent. Doing it the Roslyn way was too much to maintain, even with syntax generators, and I decided that it can always be refactored later (this library has more tree construction, aka bottom-up, operations than top-down lookups).
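A rough sketch of such a wrapper, under the assumption that it is just a pair of (node, parent wrapper) created during traversal. WithParent is the name used later in the thread; SyntaxNode, Token, Declaration, Erased and Child are hypothetical names for illustration.

```csharp
using System;

var name = new Token("x");
var decl = new Declaration(name);

// The root has no parent; children get wrapped lazily while walking down.
var root = new WithParent<Declaration>(decl, null);
var child = root.Child(decl.Name);

Console.WriteLine(ReferenceEquals(child.Node, name));         // True
Console.WriteLine(ReferenceEquals(child.Parent!.Node, decl)); // True
Console.WriteLine(child.Parent.Parent is null);               // True

// Hypothetical bottom-up nodes: they know their children, not their parent.
abstract record SyntaxNode;
record Token(string Text) : SyntaxNode;
record Declaration(Token Name) : SyntaxNode;

// Hypothetical wrapper pairing a node with the wrapper of its parent.
sealed record WithParent<T>(T Node, WithParent<SyntaxNode>? Parent) where T : SyntaxNode
{
    // Records are invariant in T, so the typed wrapper has to be
    // re-wrapped explicitly to be stored as a parent.
    public WithParent<SyntaxNode> Erased => new(Node, Parent);

    // Wrap one of this node's children, remembering this node as its parent.
    public WithParent<TChild> Child<TChild>(TChild child) where TChild : SyntaxNode
        => new(child, Erased);
}
```

The explicit Erased step is where the variance problem shows up: a WithParent<Declaration> is not a WithParent<SyntaxNode>, because only interfaces and delegates can declare `out` type parameters in C#.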

loud drift Jan 11, 2024, 10:55 PM

#

nevertheless, I needed some way of walking over the syntax trees, possibly rewriting their parts (thus constructing new syntax trees with reuse of existing nodes) etc.

For that, I created two simple Visitor interfaces that are language-agnostic and visit a common set of nodes, which includes Syntax<Lang> or Expression<Lang>. For each language generated, I generate yet another interface for the language-specific classes. These are, for example, interface HlslVisitor : Visitor<hlsl> with method signatures like Visit(WithParent<VariableDeclarationSyntax> syntaxWithParent) etc. (Now I see that the typed wrappers were possibly a mistake, but due to declaration-site variance in C#, pattern matching is harder to do)

Now, the actual visitor implementations were used for rewriting parts of the syntax. I called them Mappers, and these are possibly stateful, generated classes implementing the respective visitors. They have generated methods that recursively walk the tree, collect changed nodes and construct a new tree from them. This way the whole "updating tree" logic is encapsulated in the Mappers.
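The rewrite-with-reuse idea can be sketched roughly as follows: a mapper visits children first and only allocates a new parent when some child actually changed, so untouched subtrees are shared between the old and the new tree. Expr, Name, Binary and RenameMapper are hypothetical names; the real Mappers are generated and go through the visitor interfaces.

```csharp
using System;

var tree = new Binary(new Name("a"), new Binary(new Name("b"), new Name("c")));
var renamed = (Binary)new RenameMapper("b", "x").Map(tree);

Console.WriteLine(ReferenceEquals(renamed.Left, tree.Left));   // True: untouched subtree is shared
Console.WriteLine(ReferenceEquals(renamed.Right, tree.Right)); // False: this path was rebuilt
Console.WriteLine(ReferenceEquals(new RenameMapper("q", "x").Map(tree), tree)); // True: nothing changed

// Hypothetical immutable tree.
abstract record Expr;
record Name(string Text) : Expr;
record Binary(Expr Left, Expr Right) : Expr;

// Hypothetical base mapper: the identity rewrite that preserves node identity.
abstract class Mapper
{
    public virtual Expr Map(Expr node) => node switch
    {
        Binary b => MapBinary(b),
        Name n => MapName(n),
        _ => node,
    };

    protected virtual Expr MapName(Name node) => node;

    protected virtual Expr MapBinary(Binary node)
    {
        var left = Map(node.Left);
        var right = Map(node.Right);
        // Reuse the existing node when neither child changed.
        return ReferenceEquals(left, node.Left) && ReferenceEquals(right, node.Right)
            ? node
            : node with { Left = left, Right = right };
    }
}

// Hypothetical concrete mapper overriding a single node type.
sealed class RenameMapper : Mapper
{
    private readonly string _oldName;
    private readonly string _newName;

    public RenameMapper(string oldName, string newName)
    {
        _oldName = oldName;
        _newName = newName;
    }

    protected override Expr MapName(Name node) =>
        node.Text == _oldName ? node with { Text = _newName } : node;
}
```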

Where this fails now in my case is writing hierarchical mappers: for example, a generic, language-agnostic Mapper that is only concerned with differentiating between Token and Syntax nodes can be written, which applies some common formatting logic without duplicating its implementation from one language's rewriter to another.

So now, there is possibly a hierarchy like:

Mapper -> language agnostic mapper, implements language-agnostic visitor
HlslMapper -> hlsl specific mapper, inherits Mapper, implements HlslVisitor on top
ShaderlabMapper -> ...

#

So the interfaces with default methods serve the purpose of "mixing in" default visiting functionality to the concrete visitors, to avoid having to implement every method explicitly and to work around C#'s lack of multiple inheritance.

mossy imp Jan 12, 2024, 2:47 AM

#

DIM is generally considered one of the worst C# features, and multiple inheritance is generally considered a bad practice even in languages that have it (while newer languages choose not to have it)

loud drift Jan 12, 2024, 12:56 PM

#

mossy imp DIM is generally considered one of the worst C# features, and multiple inheritan...

Is DIM Default Interface Methods? Yeah, now that I work with them I find them to be a PITA. Compared to Java's, these are imo harder to work with reasonably.

For the multiple inheritance: yeah, it's not the way I would want to do it either. What I would ideally want is:

a language-agnostic walker/rewriter for common language constructs and common logic
specialized walkers/rewriters building upon the former for each language

Things like Rust's traits could be better to work with. Generally, the whole thing could be easier in a functional world as well, with full ADT support etc.

loud drift Jan 12, 2024, 1:34 PM

#

In the meantime, since yesterday, I made some changes and ditched the default interface implementations (since they were scarcely used), and I guess I now understand why my calls were dispatched to default methods instead of the implementing class's: unless a class provides an explicit interface method implementation, when referring to the runtime type by the interface type the interface default method will be called.
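The original code is not shown, so the following is only one scenario that produces this symptom, assuming a base class that lists the interface without implementing the method and a derived class that adds a same-named method. In C# the interface mapping is decided on the class that lists the interface, so the derived method is a new, unrelated method unless the derived class lists the interface again. A public method on the class that does list the interface is picked up as the implementation, with no explicit interface syntax needed. IVisitor, Mapper, HlslMapper and RelistedMapper are hypothetical names here.

```csharp
using System;

IVisitor viaInterface = new HlslMapper();
Console.WriteLine(viaInterface.Visit());     // default: the interface slot was mapped at Mapper
Console.WriteLine(new HlslMapper().Visit()); // hlsl: ordinary call on the class type

IVisitor relisted = new RelistedMapper();
Console.WriteLine(relisted.Visit());         // relisted: re-listing the interface re-maps the slot

// Hypothetical visitor interface with a default method.
interface IVisitor
{
    string Visit() => "default";
}

// Lists the interface but supplies no Visit, so the default fills the slot.
class Mapper : IVisitor { }

// Does not list IVisitor again: this Visit does not implement IVisitor.Visit.
class HlslMapper : Mapper
{
    public string Visit() => "hlsl";
}

// Lists IVisitor again, so its public Visit becomes the implementation.
class RelistedMapper : Mapper, IVisitor
{
    public string Visit() => "relisted";
}
```

The same mapping rule applies to interfaces without default methods; there the compiler simply reports an error on the base class because nothing fills the slot.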

mossy imp Jan 12, 2024, 4:13 PM

#

From my understanding, the initial motivation for DIM is not to let implementers skip implementing stuff, but rather to avoid breaking a library's backwards compatibility when adding something to an interface that you don't need anyone to implement, which itself is already very niche.

#

So I'd say it's already a misuse of DIM for something it wasn't designed for.

#

But yeah, parsers, compilers, or anything that's "data in, transform, data out" are places functional programming really shines.
