This repository has been archived by the owner on Aug 8, 2024. It is now read-only.

Naming conventions cause friction with C interop #44

Open
orangebowlerhat opened this issue Mar 16, 2024 · 8 comments

Comments

@orangebowlerhat

I think C3 is a great language. It's very well designed; it's the C/C++ I always wanted.

The one bugbear I have is that I need to use C headers and port C code. C3's naming conventions are a constant obstacle to this. I have to keep renaming things and refactoring code because of them.

It's not a part of C, and it goes against the project's stated intention of being low friction for C coders. Plus, it serves no real purpose AFAIK. I'd be much happier if the language did not have rules about case.

@lerno
Collaborator

lerno commented Mar 16, 2024

Unfortunately, the rules are there to ensure that the grammar is unambiguous.

Take something as innocent as:

foo* f = 4;

The meaning of this C code is completely context dependent as foo may either be a type or an identifier.

The alternative would be to change the grammar to make it completely unambiguous. Some languages do this:

var f : foo* = 4;

But then we'd deviate significantly from C syntax. This tactic is used by Go, among others.

Another alternative is to disallow expression statements that do not have side effects, so that, for example, foo * 1; would be disallowed. This eliminates the ambiguity too, but unfortunately requires unbounded lookahead. This makes it impossible to parse using LL(1), and in addition some cleverness is needed to keep this from being wasteful when lexing, as the simple case with two-token lookahead will occur frequently.

This method is used by D.

Given the above, the naming standard is a fairly unobtrusive way to solve this problem.
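To illustrate how a case rule removes the need for context, here is a minimal sketch in C of a classifier that decides what kind of name an identifier is from its spelling alone. The rule it encodes (types start with an uppercase letter and contain a lowercase one, constants are all caps, everything else is a variable or function) is a simplified assumption about C3's convention, and `classify_ident` and the `ident_kind` enum are hypothetical names, not part of the C3 compiler.

```c
#include <ctype.h>

/* Hypothetical token kinds for this sketch; not the C3 compiler's own enum. */
enum ident_kind { IDENT_VALUE, IDENT_TYPE, IDENT_CONST };

/* Classify an identifier from its spelling alone (ASCII assumed).
   Assumed rule, simplified: leading underscores are ignored; a name that
   starts with an uppercase letter is a type if it contains a lowercase
   letter and a constant otherwise; any other name is a variable/function. */
enum ident_kind classify_ident(const char *s)
{
    /* Leading underscores do not affect the classification. */
    while (*s == '_') s++;

    /* Not starting with an uppercase letter: variable or function name. */
    if (!isupper((unsigned char)*s)) return IDENT_VALUE;

    /* Starts with uppercase: look for a lowercase letter in the rest. */
    for (const char *p = s + 1; *p != '\0'; p++) {
        if (islower((unsigned char)*p)) return IDENT_TYPE;
    }
    return IDENT_CONST;
}
```

Under this assumed rule, `classify_ident("SomeFunction")` yields `IDENT_TYPE` and `classify_ident("HANDLE")` yields `IDENT_CONST`, which is the kind of mismatch with existing C names that this issue is about. The upside is that no symbol table is consulted at any point.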

BUT one could imagine variants to simplify binding C headers. For example:

  1. Defining prefixes to remove from external definitions. An example:
// Today
module foo;
extern fn int win32_SomeFunction(Win32_HANDLE) @extern("SomeFunction");
// Possible feature
module foo @externprefix("win32");
// "win32" implicitly removed from the extern declaration
extern fn int win32_SomeFunction(Win32_HANDLE);
  2. Having language-defined prefixes or suffixes that just make things work:
// Today
module foo;
extern fn int win32_SomeFunction(Win32_HANDLE) @extern("SomeFunction");
// Possible feature
module foo;
extern fn int SomeFunction__fn(HANDLE__t);
  3. Having uniform conventions for C types/C functions
module foo;
extern fn int fn_SomeFunction(Type_HANDLE) @extern("SomeFunction");
  4. Variant of 2, having a sigil-style suffix
module foo;
extern fn int SomeFunction@fn(HANDLE@t);
  5. Variant of 2, using sigil-prefixed strings
module foo;
extern fn int @"SomeFunction"($"HANDLE");

@lerno
Collaborator

lerno commented Mar 16, 2024

And I am open to other ideas as well @orangebowlerhat

@orangebowlerhat
Author

orangebowlerhat commented Mar 16, 2024

Clearly I don't understand this well enough. To me,

foo* f = 4;

is declaring a pointer to a foo type, called f, and setting it to 4. Granted, 4 is an odd thing to set a pointer to. How is that ambiguous? What else can it be?

@lerno
Collaborator

lerno commented Mar 17, 2024

Sorry, a better example is foo * a; Is this the variable "foo" multiplied by "a", or a declaration of the variable "a"?

@lerno
Collaborator

lerno commented Mar 17, 2024

Similarly, is foo ** a; a declaration, or is it multiplying foo and *a?
Is foo[1]* a; multiplying foo[1] and a, or declaring a as foo[1]*?
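As a hedged illustration in plain C (not C3), the sketch below shows the same token sequence `foo * a;` being accepted with both meanings, depending only on what `foo` names in the enclosing scope. The function names are made up for this example.

```c
typedef int foo;   /* at file scope, foo names a type */

/* Here foo is the typedef, so "foo * a;" declares a pointer to foo. */
int as_declaration(void)
{
    int x = 7;
    foo * a;       /* declaration: a has type foo* (that is, int*) */
    a = &x;
    return *a;
}

/* Here a local variable named foo hides the typedef, so the very same
   tokens form an expression statement. */
int as_expression(void)
{
    int foo = 6, a = 7;  /* the local variable foo shadows the typedef */
    foo * a;             /* expression: multiplies, result discarded
                            (compilers may warn about an unused value) */
    return foo * a;
}
```

A parser cannot tell these two statements apart without knowing what `foo` was declared as, which is the context dependence discussed here.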

@orangebowlerhat
Author

@lerno

Excuse my taking so long to respond. That all makes sense: they are syntactically ambiguous, and I wonder how a C compiler deals with those cases.

However, I'm not sure how the naming conventions resolve these issues? I assume that C would know that foo is a type and not a variable name? You can't declare a foo* foo, right?

@orangebowlerhat
Author

orangebowlerhat commented Mar 29, 2024

BTW, in terms of the syntax of pointers: in C, I always use foo *a, b because it's syntactically accurate. You can write foo* a, b but b is not a pointer in that case.

However, I prefer the modern language way of doing things, where 'foo* a, b' means both a and b are pointers and foo *a is not valid syntax, so that *a always refers to the contents of the pointer.
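The C behaviour described here can be checked with a small sketch, assuming a C11 compiler for `_Generic`. The macro `IS_INT_POINTER` and both function names are hypothetical helpers for this illustration.

```c
/* Evaluates to 1 if the expression has type int*, else 0 (C11 _Generic). */
#define IS_INT_POINTER(x) _Generic((x), int *: 1, default: 0)

/* In C the '*' binds to the declarator, not to the type:
   "int* a, b;" declares a as int* but b as plain int. */
int first_is_pointer(void)
{
    int x = 0;
    int* a = &x, b = 0;
    (void)b;                   /* silence unused-variable warnings */
    return IS_INT_POINTER(a);
}

int second_is_pointer(void)
{
    int x = 0;
    int* a = &x, b = 0;
    (void)a;
    return IS_INT_POINTER(b);
}
```

With these definitions `first_is_pointer()` returns 1 and `second_is_pointer()` returns 0, which is why the `foo *a, b` spelling matches what C actually does.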

@lerno
Collaborator

lerno commented Mar 29, 2024

> However, I'm not sure how the naming conventions resolve these issues? I assume that C would know that foo is a type and not a variable name? You can't declare a foo* foo, right?

The common way to solve this is the so-called "Lexer Hack", which feeds symbols back into the lexer itself from the semantic analysis. See this Wikipedia page: https://en.wikipedia.org/wiki/Lexer_hack

So it's not resolvable by the grammar. As an example:

int a;

The above is valid, but if we do this:

typedef int a;
int a;

It is not. This is a simple example of how the C grammar is context dependent.

C3 is context free with regard to parsing, so it does not need this feedback into the lexer.
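A minimal sketch of the lexer hack in C might look like the following. It assumes a single flat table of typedef names and ignores scoping, which a real C front end has to handle; `register_typedef`, `lex_identifier` and the token names are hypothetical and not taken from any particular compiler.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds the lexer can hand to the parser. */
enum token_kind { TOK_IDENTIFIER, TOK_TYPE_NAME };

#define MAX_TYPEDEFS 64
static const char *typedef_names[MAX_TYPEDEFS];
static size_t typedef_count = 0;

/* Called by the parser/semantic analysis once it has processed
   "typedef ... name;". Stores the pointer only, so the string must
   outlive the table. Returns 0 on success, -1 if the table is full. */
int register_typedef(const char *name)
{
    if (typedef_count == MAX_TYPEDEFS) return -1;
    typedef_names[typedef_count++] = name;
    return 0;
}

/* Called by the lexer for every identifier: the token kind depends on
   what earlier declarations registered, not on the spelling alone. */
enum token_kind lex_identifier(const char *name)
{
    for (size_t i = 0; i < typedef_count; i++) {
        if (strcmp(typedef_names[i], name) == 0) return TOK_TYPE_NAME;
    }
    return TOK_IDENTIFIER;
}
```

Before `typedef int a;` has been seen, `lex_identifier("a")` returns `TOK_IDENTIFIER`; after `register_typedef("a")` it returns `TOK_TYPE_NAME`. The same spelling produces different tokens depending on earlier declarations, so the lexer and the parser can no longer be separated cleanly.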
