Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 18716: type const(char)[] can not be mapped to C++ #8120

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

jacob-carlborg
Copy link
Contributor

@jacob-carlborg jacob-carlborg commented Apr 3, 2018

D arrays don't have any corresponding type in C++. Instead we mangle it as a templated struct with the name __dslice, i.e. struct __dslice(T), where T is the element type of the array. For an array of ints it would be mangled as the following type: __dslice!int.

Reboot of #7893.

@WalterBright this should interesting for you since you want to kill C string in DMD.

@dlang-bot
Copy link
Contributor

dlang-bot commented Apr 3, 2018

Thanks for your pull request and interest in making D better, @jacob-carlborg! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
18716 normal [ICE] type const(char)[] can not be mapped to C++

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + dmd#8120"

src/dmd/root/array.d Outdated Show resolved Hide resolved
src/dmd/typesem.d Outdated Show resolved Hide resolved
@WalterBright
Copy link
Member

Thanks for doing the work on this, it's a nice job.

I've thought about doing this now and then, having run into the issue myself, but I have reservations:

  • The user has to declare the array type in the C++ code. I run into strong resistance whenever I suggest something like this. If anything, they want an std::vector<T> interface.
  • There's already a solution - make the function extern (C).
  • Or just pass the array parameter as .ptr and .length.

The conclusion I always come to is it is not worth the extra complexity in the D core language.

@jacob-carlborg
Copy link
Contributor Author

@WalterBright This primarily for use within DMD.

Based on your conclusion, you don’t want this feature?

@wilzbach
Copy link
Member

wilzbach commented Apr 4, 2018

The user has to declare the array type in the C++ code. I run into strong resistance whenever I suggest something like this.

As @jacob-carlborg mentioned this is primarily intended for DMD, s.t. D strings can be used everywhere and passed to LDC/GDC.
LDC even has something like this on their side (see #7893 (comment))

See also: #7893 (another attempt at making the interoperability possible)

If anything, they want an std::vector interface.

FWIW: Last time I checked, this hit multiple dead-ends (see: dlang/druntime#1316) :/

The conclusion I always come to is it is not worth the extra complexity in the D core language.

Hmm what's your plan on passing D strings to LDC/GDC then?

@Geod24
Copy link
Member

Geod24 commented Apr 4, 2018

Or just pass the array parameter as .ptr and .length.

I think this is a much more sensible approach. Instead of introducing another symbol, could we turn extern (C++) void func(const(char)[] slice) into extern (C++) void func(const(char)* ptr, size_t length) in the backend ?

@jacob-carlborg
Copy link
Contributor Author

I think this is a much more sensible approach. Instead of introducing another symbol, could we turn extern (C++) void func(const(char)[] slice) into extern (C++) void func(const(char)* ptr, size_t length) in the backend ?

I disagree.

It would be more complicated to rewrite the AST or the IR. I think it’s too late to do it in the backend. My approach is ABI compatible so no rewriting is required.

@jacob-carlborg
Copy link
Contributor Author

@wilzbach auto-tester is reporting that all tests passed here, but looking at the details, one test failed.

@jacob-carlborg jacob-carlborg force-pushed the 18716-dslice-cpp branch 2 times, most recently from d854bde to deb51eb Compare April 4, 2018 19:41
@WalterBright
Copy link
Member

This primarily for use within DMD.

Then I definitely feel this is not worth it. It would be simple enough to manually rewrite:

foo(T[]);
foo(a[])

as:

foo(Array!(T[]));
foo(Array(a[]);

Based on your conclusion, you don’t want this feature?

Right, not worth it.

@ibuclaw
Copy link
Member

ibuclaw commented Apr 4, 2018

Hmm. When I think about moving the dmd implementation from char* to char[], two things come to mind.

  1. Wrap it.
extern(C++) class SomeType
{
  extern(D) char[] doSometing(char[] input);

  char* doSomething(char* input)
  {
    // or .toStringz() instead of ptr.
    return doSomething(input[0..strlen(input)]).ptr;
  }
}
  1. Remove it. There are many methods that are unused outside the frontend and so better off internal to the parser/semantic as far as C++ api is concerned.

@jacob-carlborg
Copy link
Contributor Author

Remove it. There are many methods that are unused outside the frontend and so better off internal to the parser/semantic as far as C++ api is concerned.

How can we know which methods are used in GDC/LDC?

@ibuclaw
Copy link
Member

ibuclaw commented Apr 5, 2018

How can we know which methods are used in GDC/LDC?

We will tell you. 😉

...

Let's start with a rule of thumb. The role of the interface is to initialize the frontend, call entrypoints to read, parse, semantic, and optionally generate header/ddoc/json files for all modules.
It may also provide its own interface for various tasks such as CTFloat, Target, Port, and builtins, and may require access to ctfeInterpret or new'ing objects via Foo::create.
Finally it traverses the AST, gets information out of the object/trees (hopefully without rewriting any part of in-place) and generates code.

Whatever doesn't fall into any of those categories should be good for the taking. i.e: AST types that only live during semantic analysis, anything that sets or mutates state, and resolver methods and functions.

Off the top of my head, the only functions that need to continue returning a char* are mangleExact, toChars, and toPrettyChars. All of which could have an extern(D) or have a new name toString, toPrettyString, etc... for the string variant.

@ibuclaw
Copy link
Member

ibuclaw commented Apr 5, 2018

optionally generate header/ddoc/json files for all modules.

For example (this might already have been done), the only function that needs to be exposed to C++ for header generation is genhdrfile, everything else in the hdr module can be extern(D).

@thewilsonator
Copy link
Contributor

@WalterBright with the (questionable) power of C++ implicit conversions this is a std::vector<T> interface, behold:

#include <vector>

template<typename T> struct __dslice { 
    size_t length; T* ptr;
    __dslice(const std::vector<T>& x) 
    {
         length = x.size();
         ptr = (T*)&x[0]; // seems to default to const not sure why hence the cast
    }
};
void foo(__dslice<int> a) {};

int main(int a, char** b)
{
    std::vector<int> x;
    foo(x);
}

Proof this compiles: https://godbolt.org/g/VFA1xV

If somebody is using the code then they are already #includeing something, include the definition as part of that (maybe adding an include guard to avoid multiple definitions).

@jacob-carlborg
Copy link
Contributor Author

Wrap it

@ibuclaw Unfortunately it's not possible to overload a function taking const(char)* and one taking const(char)[] when passing a string literal:

void symbol_name(const(char)* name) {}
void symbol_name(const(char)[] name) {}

void main()
{
    symbol_name("foo");
}
main.d(6): Error: main.symbol_name called with argument types (string) matches both:
main.d(1):     main.symbol_name(const(char)* name)
and:
main.d(2):     main.symbol_name(const(char)[] name)

@ibuclaw
Copy link
Member

ibuclaw commented Jun 30, 2018

@jacob-carlborg - then make the C++ overload include a length parameter?

Identifier::idPool is a good example of this, the const(char)* override is only declared in C++ headers.

D:

dmd/src/dmd/identifier.d

Lines 175 to 193 in f1effe0

/********************************************
* Create an identifier in the string table.
*/
extern (D) static Identifier idPool(const(char)[] s)
{
return idPool(s.ptr, cast(uint)s.length);
}
static Identifier idPool(const(char)* s, uint len)
{
StringValue* sv = stringtable.update(s, len);
Identifier id = cast(Identifier)sv.ptrvalue;
if (!id)
{
id = new Identifier(sv.toDchars(), len, TOK.identifier);
sv.ptrvalue = cast(char*)id;
}
return id;
}

C++:

dmd/src/dmd/identifier.h

Lines 42 to 47 in f1effe0

static Identifier *idPool(const char *s, unsigned len);
static inline Identifier *idPool(const char *s)
{
return idPool(s, strlen(s));
}

@jacob-carlborg
Copy link
Contributor Author

jacob-carlborg commented Jun 30, 2018

then make the C++ overload include a length parameter?

@ibuclaw there’s already one overload with and one without the length parameter. I’m trying to avoid breaking all existing usages.

https://github.com/dlang/dmd/blob/master/src/dmd/backend/global.d#L335

@jacob-carlborg
Copy link
Contributor Author

@thewilsonator rebased.

@thewilsonator
Copy link
Contributor

Hmm, it seems to be segfaulting on windows? No idea why though.

See e.g. https://auto-tester.puremagic.com/show-run.ghtml?projectid=1&runid=3231832&isPull=true

@thewilsonator
Copy link
Contributor

Also with the typeForCppMangling() done as part of semantic it seems that __dslice!T will be instantiated for every T[]. I think it would make more sense as a private method of TypeDArray so that it is done lazily only when typeForMangling(LINK.cpp) is called.

@thewilsonator
Copy link
Contributor

@WalterBright ahh I get you.

Passing arrays as a special __dslice they have to add a special code for on the C++ side is not going to go over well

The point of its constructor is that given

void foo(__dslice<int> a);

you can pass a std::vector, a std::span, a gsl::span, std::string_view or any custom container U, that supports U u; size_t i = u.size(); T* = u.data(); to it.

Likewise

template<typename T struct __dslice
{
    size_t length; T* ptr;
...
operator gsl::span<T>(){ return gsl::span<T>(ptr,length);}
};
__dslice<int> bar();
...
gsl::span<int> c = bar();

works. This means that you don't ever have to use __dslice directly, use agsl::span/std::span which the C++ core guidelines encourage people to use.

We can synthesise D T[] wrappers for any code that uses gsl::span<T> or std::span<T>

D arrays don't have any corresponding type in C++. Instead we mangle
it as a templated struct with the name `__dslice`, i.e.
`struct __dslice(T)`, where `T` is the element type of the array. For
an array of ints it would be mangled as the following type:
`__dslice!int`.
@jacob-carlborg
Copy link
Contributor Author

@thewilsonator rebased.

@thewilsonator
Copy link
Contributor

Thanks. @WalterBright can we please get the green light for this?

@kinke
Copy link
Contributor

kinke commented Nov 10, 2018

I don't see how you can simply discard the ABI mismatch on Win64. If the purpose is direct slices interop between D and C++, but it doesn't work on all platforms, it's useless to me.

Edit: I think that at least for LDC, there are potential ABI mismatches for non-Windows x86_64 targets as well. If there's only one available GP register left, the slice's size is put in there, and the ptr is pushed on the stack, IIRC. This isn't done for a regular struct, which is either passed completely in registers or on the stack.

// D slices as parameters in extern(C++) functions

struct Test18716 {}
struct __dslice(T) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests needs to be in runnable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ABI doesn't match, so there's no point, at least not yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is exactly that the ABI does not match.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following. This whole PR is moot as long as the ABI problem exists.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I was just saying that it's not obvious, since there are tests, and the PR is green.
But IMO, this PR should ultimately be accepted, but it looks like you fear it to be vetoed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It should, but Walter hasn't responded to the reasoning why his concerns are unfounded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not going to spend time on moving the tests if/until I fix the ABI issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Walter did already vetoed it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But not sensibly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#11073 contains a runnable_cxx test (if someone wants to use it)

@thewilsonator
Copy link
Contributor

Sorry for the confusion, I meant green light as in he's be happy with the feature, so that it would then not be a waste of time fixing ABI issues for a PR that would be rejected anyway.

@jacob-carlborg jacob-carlborg added the WIP Work In Progress - not ready for review or pulling label Nov 12, 2018
@jacob-carlborg
Copy link
Contributor Author

WIP due to missing ABI compatibility.

@kinke
Copy link
Contributor

kinke commented Aug 29, 2020

I'm not familiar with the DMD ABI details, but at least for LDC, a half-feasible option would be mangling a slice parameter as 2 separate C++ parameters - size_t + T*. For Itanium, that should work, as the return type isn't part of the mangle; MSVC++ does encode it, and well, there's only a single return value. :/

@Imperatorn
Copy link
Contributor

Any update on this?

@RazvanN7
Copy link
Contributor

@WalterBright how should we proceed here?

@ljmf00
Copy link
Member

ljmf00 commented Mar 22, 2022

Ping @WalterBright . This is a good step to make the compiler strlen-free.

@ibuclaw
Copy link
Member

ibuclaw commented Mar 22, 2022

Ping @WalterBright . This is a good step to make the compiler strlen-free.

How? Many things are still being passed to the front-end from C++ land.

@ljmf00
Copy link
Member

ljmf00 commented Apr 18, 2022

Ping @WalterBright . This is a good step to make the compiler strlen-free.

How? Many things are still being passed to the front-end from C++ land.

I'm sure there is a lot of benefits on make this step, especially making the compiler use more memcpy's and less strcpy's, that surely benefits better from SIMD implementations. The strlen part is another bonus that basically removes some O(N) strlen to O(1) accesses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Fix Enhancement Needs Rebase Needs Work stalled WIP Work In Progress - not ready for review or pulling
Projects
None yet
Development

Successfully merging this pull request may close these issues.