Dynamic loading for Java #38

eliotmoss · 2015-07-21T15:39:13Z

So Adam and I have run into an interesting question about how to do dynamic loading for Java. The thing is, one does not know all the details of a class in advance. Therefore, it is hard to give things signatures. Consider, for example, the vtable. We need to have Mu types for all the classes mentioned in all the methods -- the vtable will be a struct of function pointers, each pointer specifically typed. But that would force eager loading of the entire universe to figure out the types!

The only alternative seems to be to refcast all over the place at run time. Is that the intent? (Coming from Java I had a (mistaken) bias that this involves a cost, but I see on referring to the spec that refcast does not involve any run-time work.)

wks · 2015-07-21T17:17:27Z

> We need to have Mu types for all the classes mentioned in all the methods -- the vtable will be a struct of function pointers, each pointer specifically typed.

> The only alternative seems to be to refcast all over the place at run time. Is that the intent?

Yes. It is the "proper" way in Mu. The REFCAST instruction can cast any function reference (func<sig1>) to another function reference with a different signature (func<sig2>). It does not involve any run-time checking. This approach weakens the typing at the Mu IR level, but Mu never does type checking for references (including function references) at run time anyway.

Alternatively Java can be implemented in "the Python way" where dictionary look-ups happen everywhere at run time, but that would be terribly inefficient.

P.S. Previously I proposed that the Mu type system should not have ref<T> or func<sig>, only the raw ref and func types. Under that design, instructions would have to supply the referent type or signature themselves when addressing (deriving internal references), accessing (load/store) and calling. This proposal has been rejected many times because the type parameter provides more static information than a raw type does. We also worried that losing the type parameter <T> may make verification more difficult, though I am not certain about this claim.

Here is how I think a vtable can be implemented:

We can implement the vtable as an array (or hybrid) of function pointers, where each pointer has a generic signature (like void (*)() in C) rather than a concrete one (like Object* (*)(int, double, String*) in C).

We can define the vtable like this:

.funcsig @generic_signature = @void ()
.typedef @generic_function = func<@generic_signature>
.typedef @vtable = hybrid <@size_t @generic_function>

Assume there is a virtual call in A.foo:

class A {
  void foo() {
    B b = ...;
    Object rv = b.bar(arg1, arg2, arg3);
  }
}

class B {
  public Object bar(int a, double b, String c) {...}
}

Assume we have loaded A but know nothing about class B at this moment. The .class file of class A contains a method descriptor that gives the signature of the callee b.bar (e.g. (IDLjava/lang/String;)Ljava/lang/Object;). From that descriptor, a signature for the callee can be generated when compiling class A to Mu IR:

.funcsig @b_bar_sig_as_seen_by_A_foo = @refvoid (
    @refvoid // "this" reference
    @i32 @double @refvoid) // other parameters

Here we use @refvoid to refer to anything because we may not know the object types yet. References can always be cast to another reference without checking.
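The descriptor-to-signature step above can be sketched in Java roughly as follows. This is only an illustration, not part of any Mu client: the class DescriptorToMuSig and its methods are hypothetical, and the Mu type names other than @i32, @double, @refvoid and @void (which appear in this thread) are assumptions. Every reference or array type is erased to @refvoid, so no class other than the one being compiled has to be loaded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns a JVM method descriptor into the text of a Mu .funcsig,
// erasing all reference types to @refvoid.
final class DescriptorToMuSig {
    // Parses one type starting at pos[0], advances pos[0] past it,
    // and returns the (assumed) Mu type name.
    static String parseType(String d, int[] pos) {
        char c = d.charAt(pos[0]++);
        switch (c) {
            case 'Z': case 'B': return "@i8";     // assumed name
            case 'C': case 'S': return "@i16";    // assumed name
            case 'I': return "@i32";
            case 'J': return "@i64";              // assumed name
            case 'F': return "@float";            // assumed name
            case 'D': return "@double";
            case 'V': return "@void";
            case 'L':
                pos[0] = d.indexOf(';', pos[0]) + 1; // skip the class name
                return "@refvoid";
            case '[':
                parseType(d, pos);                   // consume the element type
                return "@refvoid";
            default:
                throw new IllegalArgumentException("bad descriptor: " + d);
        }
    }

    static String toMuSig(String sigName, String descriptor, boolean isStatic) {
        int[] pos = {1};                             // skip '('
        List<String> params = new ArrayList<>();
        if (!isStatic) {
            params.add("@refvoid");                  // the "this" reference
        }
        while (descriptor.charAt(pos[0]) != ')') {
            params.add(parseType(descriptor, pos));
        }
        pos[0]++;                                    // skip ')'
        String ret = parseType(descriptor, pos);
        return ".funcsig " + sigName + " = " + ret
                + " (" + String.join(" ", params) + ")";
    }

    public static void main(String[] args) {
        System.out.println(toMuSig("@b_bar_sig_as_seen_by_A_foo",
                "(IDLjava/lang/String;)Ljava/lang/Object;", false));
    }
}
```

For the descriptor of b.bar this yields the same signature as written by hand above, on a single line.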

A vtable for class B must already exist even though class B has not been loaded yet, and it must have a slot assigned to the method b.bar. We must also declare (without defining) a Mu function for b.bar, so that the first call traps to the client. The vtable entry for b.bar must be filled in at load time. Just like the instruction set, the API can cast references (including function references), allocate objects (such as the vtable) and write fields (such as the vtable entry).

Then the virtual call mentioned above can be translated as:

%b = ... // the reference to a B instance

// get the vtable
%vtable = CALL <...> @get_vtable (%b) // This may be inlined.

// get the vtable entry
%bar = CALL <...> @get_vtable_entry (%vtable @INDEX_OF_BAR) // This may be inlined, too.
// %bar is func<@generic_signature>, i.e. func<void ()>

// cast the function reference to the appropriate signature
%bar_concrete = REFCAST <@generic_signature @b_bar_sig_as_seen_by_A_foo> %bar

// call the method
%rv = CALL <@b_bar_sig_as_seen_by_A_foo> %bar_concrete (...)
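As a rough analogy in plain Java, the vtable set-up and the call sequence above might look like the sketch below. All names here (LazyVTableDemo, GenericFn, BarFn, loadBAndDefineBar) are hypothetical. The stub lambda stands in for a declared-but-undefined Mu function that traps to the client, and patching the slot stands in for the client filling the vtable entry through the API. One important difference: Java checks the cast at run time, whereas REFCAST does not.

```java
final class LazyVTableDemo {
    // Stands in for func<@generic_signature>: a slot type with no usable signature.
    interface GenericFn {}

    // Stands in for @b_bar_sig_as_seen_by_A_foo, with reference types erased to Object.
    interface BarFn extends GenericFn {
        Object call(Object self, int a, double b, Object c);
    }

    static final int INDEX_OF_BAR = 0;
    static final GenericFn[] vtableOfB = new GenericFn[1];
    static int loads = 0; // counts how often the "client" had to load B

    // Stands in for the client's trap handler: load B, define bar, fill the slot.
    static BarFn loadBAndDefineBar() {
        loads++;
        BarFn real = (self, a, b, c) -> c + ":" + a + ":" + b;
        vtableOfB[INDEX_OF_BAR] = real;
        return real;
    }

    static {
        // Before B is loaded, the slot holds a stub that defers to the client.
        BarFn stub = (self, a, b, c) -> loadBAndDefineBar().call(self, a, b, c);
        vtableOfB[INDEX_OF_BAR] = stub;
    }

    // The call site in A.foo: fetch the entry, cast to the expected signature, call.
    static Object callBar(Object self, int a, double b, Object c) {
        GenericFn slot = vtableOfB[INDEX_OF_BAR];
        BarFn bar = (BarFn) slot; // checked in Java; unchecked with REFCAST in Mu
        return bar.call(self, a, b, c);
    }

    public static void main(String[] args) {
        System.out.println(callBar(new Object(), 1, 2.0, "x")); // first call goes through the stub
        System.out.println(callBar(new Object(), 3, 4.5, "y")); // later calls hit the real method
    }
}
```

After the first call the slot points at the real method, so later calls no longer pass through the stub.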

When B is actually loaded, if A.class was correctly compiled against class B, then the actual signature of B.bar must be the same as @b_bar_sig_as_seen_by_A_foo, so this call site is still valid. Mu considers two signatures the same if their parameter types are the same and their return types are the same.

One remaining problem: this approach does not help with field accesses. If field accesses are implemented as struct member accesses, then the layout of the class must be known when compiling the methods that access its fields.
