Benchmark registration shouldn't be coupled to `main.cpp` #39

travisdowns · 2018-03-12T21:53:02Z

Currently, every file that defines benchmarks needs to declare a benchmark registration method that is called explicitly in make_benches in main.cpp, which looks like:

template <typename TIMER>
GroupList make_benches() {

    GroupList groupList;

    register_default<TIMER>(groupList);
    register_loadstore<TIMER>(groupList);
    register_mem<TIMER>(groupList);
    register_misc<TIMER>(groupList);
    register_cpp<TIMER>(groupList);
    register_vector<TIMER>(groupList);
    register_call<TIMER>(groupList);
    register_oneshot<TIMER>(groupList);

    return groupList;
}

This is unfortunate since it means that otherwise independent lists of benchmarks have to be registered in a common place (increasing merge conflicts for independent code) and it also adds another step to adding a new benchmark file.

We should allow independent registration of benchmarks: ideally simply dropping in a new .cpp file that has benchmarks would be enough for it to be picked up. That probably means registration should use some kind of global constructor to register tests from the implementing .cpp file directly. Since the order of such calls isn't defined across translation units, we need to sort on benchmark name or something so that we have a consistent order in the benchmark list regardless of actual registration order.
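A minimal sketch of that push-style registration, assuming a hypothetical `Bench` struct, `registry()` accessor and `Registrar` helper (none of these are uarch-bench's actual types): each .cpp file defines a file-scope object whose constructor adds its benchmarks, and the list is sorted by name before use so the result doesn't depend on which translation unit happened to initialize first.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a benchmark; the real project has richer types.
struct Bench {
    std::string name;
    long (*func)();
};

// The registry is a function-local static, so it is constructed on first use
// even when that use happens inside another global's constructor.
inline std::vector<Bench>& registry() {
    static std::vector<Bench> benches;
    return benches;
}

// Defining a file-scope Registrar in a .cpp file registers a benchmark
// before main() runs, with no central list to edit.
struct Registrar {
    Registrar(const char* name, long (*func)()) {
        registry().push_back({name, func});
    }
};

// Registration order across translation units is unspecified, so sort by
// name to get a stable listing.
inline std::vector<Bench> sorted_benches() {
    std::vector<Bench> out = registry();
    std::sort(out.begin(), out.end(),
              [](const Bench& a, const Bench& b) { return a.name < b.name; });
    return out;
}

// What a benchmark file might contain (names are made up for illustration).
static long bench_vector_add() { return 2; }
static long bench_call_direct() { return 1; }
static Registrar reg_vector("vector/add", bench_vector_add);
static Registrar reg_call("call/direct", bench_call_direct);
```

With something like this, make_benches would only need to walk `sorted_benches()` instead of calling each `register_*` function by name.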


nemequ · 2018-03-14T20:13:15Z

You could look at GLib's Constructors for an idea of how to do this somewhat portably, but AFAIK there is just no guaranteed way to do this.

I think a better solution would be to just have each group compiled into a separate library which could then be dlopen()ed by uarch-bench. In addition to the ability to just drop a C file into uarch-bench and have it just work, it would also make it easy to load modules from other locations, allowing projects to keep uarch-bench tests in their trees. It would also make it easier to test different compilers since only the module in question would need to be recompiled.

travisdowns · 2018-03-14T20:24:31Z

@nemequ - I'm not following the problem here. I've used this pattern many times (note I'm talking about C++ here not C). The portable approach is just to use straightforward C++, isn't it? An object with a constructor that registers the benchmarks (push instead of the current pull approach).

Perhaps I wasn't clear in the description, but I don't anticipate any issue here really, this is just a task so I remember to do the work.

I also like the idea of loading additional benchmarks at runtime from a shared object, but that's really a different, bigger feature (and using C++ in "plugin" interfaces is kind of a mess last time I looked). I added issue #43 to track that.

nemequ · 2018-03-15T17:41:16Z

Ah right, C++. Sorry, not used to thinking in C++. I don't know about it being a "portable" approach (IIRC there can be some issues if you use them in shared libraries), but yeah it should be safe for what you're talking about here.

travisdowns · 2018-03-15T17:58:05Z

@nemequ - yeah there are definitely gotchas in C++, the big ones being that (a) the order of initialization between different compilation units isn't defined, so you can easily blow up if your globals refer to each other during initialization and (b) destruction is similarly messy, especially with shared libraries, and it's easy to accidentally access objects after they have been destroyed (the "easy out" here is often just to ensure the global object destructors never run at all: just "leak" them).
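Both gotchas have a common workaround, sketched here with hypothetical names (`NameList`, `names()`, `AddName`): put the shared global behind a function so it is constructed on first use, which sidesteps initialization order, and allocate it with `new` without ever deleting it, so no destructor runs at exit.

```cpp
#include <string>
#include <vector>

// Hypothetical shared state that several globals want to touch during
// their own construction.
struct NameList {
    std::vector<std::string> names;
};

// Construct on first use: whichever global calls names() first triggers the
// allocation. The object is intentionally leaked, so its destructor never
// runs and nothing can observe it in a destroyed state during shutdown.
NameList& names() {
    static NameList* list = new NameList;  // never deleted on purpose
    return *list;
}

// Globals like these could live in different .cpp files; each is safe
// because it reaches the list only through names().
struct AddName {
    explicit AddName(const char* n) { names().names.push_back(n); }
};
static AddName add_first("first");
static AddName add_second("second");
```

The leak is a single allocation that the OS reclaims at process exit, which is usually an acceptable price for a command-line tool.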

What I'm thinking of here is simple though and shouldn't run into those issues.

I didn't think much about C specifically, but yeah now that C benchmarks are supported it would be nice to come up with an approach there. One option would be to have a build step that looks for .c files and picks out the benchmark registration method and adds it to an autogenerated file that registers all the C benchmarks, or ... dunno.

nemequ · 2018-03-15T19:48:56Z

> I didn't think much about C specifically, but yeah now that C benchmarks are supported it would be nice to come up with an approach there. One option would be to have a build step that looks for .c files and picks out the benchmark registration method and adds it to an autogenerated file that registers all the C benchmarks, or ... dunno.

AFAICT the registration code currently needs to be C++ anyways, even if the tests themselves are in C, so that doesn't really matter right now.

An API to register stuff in C would be very nice for #43, but if you're dlopen()ing a module you can just run a function too, so you don't need a constructor.

At some point it wouldn't be too difficult to add a macro which would support constructors on Windows, most GCC/clang/icc configurations, and suncc (would actually be a good idea for portable-snippets), plus C++. But if #43 happens, I don't see why the tests built into uarch-bench couldn't use the same mechanism; then there would be no need to maintain multiple paths.
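For reference, the GCC/clang side of such a macro could look roughly like this. `UARCH_CONSTRUCTOR` is a made-up name, not something that exists in uarch-bench or portable-snippets; the sketch covers only compilers that accept `__attribute__((constructor))` plus a plain C++ fallback, and leaves out the Windows and suncc variants.

```cpp
// Hypothetical macro: declares and opens the definition of a function that
// runs before main().
#if defined(__GNUC__) || defined(__clang__)
// GCC, clang and compatible compilers; this form works in C as well as C++.
#define UARCH_CONSTRUCTOR(fn) \
    static void fn(void) __attribute__((constructor)); \
    static void fn(void)
#elif defined(__cplusplus)
// Plain C++ fallback: a file-scope object whose constructor calls fn.
#define UARCH_CONSTRUCTOR(fn) \
    static void fn(void); \
    namespace { struct fn##_runner { fn##_runner() { fn(); } } fn##_instance; } \
    static void fn(void)
#else
#error "no constructor mechanism known for this compiler"
#endif

// Zero-initialized at load time, so it is valid before any constructor runs.
static int registered_groups = 0;

// Illustrative use: the body runs once, before main().
UARCH_CONSTRUCTOR(register_my_group) {
    ++registered_groups;
}
```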

travisdowns · 2018-03-16T03:27:28Z

> An API to register stuff in C would be very nice for #43, but if you're dlopen()ing a module you can just run a function too, so you don't need a constructor... but if #43 happens I don't see why the tests built into uarch-bench couldn't use the same mechanism, then there is no need to maintain multiple paths.

Right, but you could only run one function or perhaps a fixed list of functions after dlopen, right? I mean it solves the problem if you put each group of benchmarks into its own shared library, but if you don't do that you still have the same problem within the library of how to have N separate .cpp or .c files defining benchmarks without registering them in a separate spot. If you could enumerate the available functions in a shared object, and perhaps call all that match some pattern, it would also solve this "inner" independence problem, but AFAIK there's no portable way to do that.

Just want to check that we're on the same page here. Of course, having different modules already makes the problem a lot better since you have at least one way to decouple things.

nemequ · 2018-03-16T04:16:46Z

I've been assuming one group per shared library, but if you'd prefer it would be fairly easy to do something like

#include <stdint.h>  /* for uint64_t */

typedef struct UarchCtx_ UarchCtx;
typedef struct UarchGroup_ UarchGroup;

void uarch_ctx_add_group(UarchCtx* ctx, UarchGroup* group);
UarchGroup* uarch_group_new(UarchCtx* ctx, const char* id, const char* description);
void uarch_group_add_bench(UarchGroup* group, const char* id, const char* description, long(* func)(uint64_t), int ops_per_iter);

I don't think it's too much to ask the modules to have a registration function which knows about all functions within that module.
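To make that concrete, here is one way the API above could be backed and used. Only the three function signatures come from the declarations above; the struct layouts, the `example_module_register` function and the benchmark bodies are placeholders invented for this sketch, and cleanup of the allocated groups is left out.

```cpp
#include <stdint.h>
#include <string>
#include <vector>

// One registered benchmark (hypothetical layout).
struct UarchBench {
    std::string id, description;
    long (*func)(uint64_t);
    int ops_per_iter;
};

// Hypothetical definitions for the opaque types.
struct UarchGroup_ {
    std::string id, description;
    std::vector<UarchBench> benches;
};
struct UarchCtx_ {
    std::vector<UarchGroup_*> groups;
};
typedef struct UarchCtx_ UarchCtx;
typedef struct UarchGroup_ UarchGroup;

extern "C" {

// The context is unused here; a fuller version might allocate from it.
UarchGroup* uarch_group_new(UarchCtx*, const char* id, const char* description) {
    return new UarchGroup{id, description, {}};
}

void uarch_ctx_add_group(UarchCtx* ctx, UarchGroup* group) {
    ctx->groups.push_back(group);
}

void uarch_group_add_bench(UarchGroup* group, const char* id, const char* description,
                           long (*func)(uint64_t), int ops_per_iter) {
    group->benches.push_back({id, description, func, ops_per_iter});
}

}  // extern "C"

// Placeholder benchmark bodies.
static long bench_add(uint64_t iters) { return static_cast<long>(iters); }
static long bench_mul(uint64_t iters) { return static_cast<long>(iters * 2); }

// The single registration function a module would expose; it knows about
// every benchmark in the module.
extern "C" void example_module_register(UarchCtx* ctx) {
    UarchGroup* group = uarch_group_new(ctx, "example", "Example benchmarks");
    uarch_group_add_bench(group, "add", "Placeholder add benchmark", bench_add, 1);
    uarch_group_add_bench(group, "mul", "Placeholder multiply benchmark", bench_mul, 1);
    uarch_ctx_add_group(ctx, group);
}
```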

If you dlopen() a module odds are decent a constructor won't be run automatically anyways, so the only thing I can really think of would be some build system magic to automatically generate a registration function using the file names to generate symbols (e.g., foo-bar.c would trigger a foo_bar_register() call to be generated).
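The name mapping and the generated file could be as simple as the following sketch. It is written in C++ only to match the rest of the project; a real build step might just as well be a script. `register_symbol`, `generate_registration_source` and `uarch_register_all` are hypothetical names, and the `UarchCtx*` parameter is borrowed from the declarations above.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Maps a source file name to its registration symbol,
// e.g. "foo-bar.c" -> "foo_bar_register".
std::string register_symbol(const std::string& filename) {
    // Drop any directory part and the extension.
    std::string::size_type slash = filename.find_last_of('/');
    std::string base = (slash == std::string::npos) ? filename : filename.substr(slash + 1);
    std::string::size_type dot = base.find_last_of('.');
    if (dot != std::string::npos) base.erase(dot);
    // Replace every character that is not a letter or digit with '_'.
    for (char& c : base) {
        if (!std::isalnum(static_cast<unsigned char>(c))) c = '_';
    }
    return base + "_register";
}

// Produces the text of a generated C file that declares and calls every
// module's registration function.
std::string generate_registration_source(const std::vector<std::string>& files) {
    std::string decls = "typedef struct UarchCtx_ UarchCtx;\n";
    std::string calls;
    for (const std::string& f : files) {
        const std::string sym = register_symbol(f);
        decls += "void " + sym + "(UarchCtx* ctx);\n";
        calls += "    " + sym + "(ctx);\n";
    }
    return decls + "\nvoid uarch_register_all(UarchCtx* ctx) {\n" + calls + "}\n";
}
```

A file whose name starts with a digit would still produce an invalid identifier, so a real generator would need to reject or prefix such names.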

Besides, I think the idea of just dropping a C/C++ file into a directory and having uarch-bench pick it up automatically is more attractive if you're modifying uarch-bench (as you currently have to) than if you're building something in your own source tree, especially when you consider that any build system magic would need to be rewritten, or at least customized, for your project.

travisdowns · 2018-03-16T04:51:29Z

> If you dlopen() a module odds are decent a constructor won't be run automatically anyways

Really, why? That would basically break any code that relies on this standard and widespread feature, which is probably most C++ code. It seems unlikely to me that this is a problem on almost any modern platform.

In any case, I really see the two things as orthogonal: there should be a way to load benchmark shared modules at runtime (or perhaps a way to embed uarch bench as a component inside your own application/module) and there should be a way to make the registration of benchmarks from C++ and C within uarch-bench or a module a bit more decoupled from having a master list.

I think each feature can more or less live or die on its own. I agree with you that for the modules approach it makes the most sense to call one known function after dlopen which registers all the benchmarks in a module. If there happens to be a constructor-based way to generate this list from separate .cpp files then it could be used to generate the list that this method returns, or it could be just an explicit list like we have today in main.cpp.

> I don't think it's too much to ask the modules to have a registration function which knows about all functions within that module.

Yeah, that's fair.

nemequ · 2018-03-19T17:42:57Z

> Really, why? That would basically break any code that relies on this standard and widespread feature, which is probably most C++ code. It seems unlikely to me that this is a problem on almost any modern platform.

You're probably right for this project. AFAIK ELF and PE both support constructors in the runtime linker. My understanding is that there are still some platforms where it isn't supported, but you're not likely to run into them on x86 anyways.

> In any case, I really see the two things as orthogonal: there should be a way to load benchmark shared modules at runtime (or perhaps a way to embed uarch bench as a component inside your own application/module) and there should be a way to make the registration of benchmarks from C++ and C within uarch-bench or a module a bit more decoupled from having a master list.

I would think you would want a single code path since it's easier to maintain, but if you're comfortable with two I don't mind.

That said, I believe there is still an issue with static libraries you should watch out for. Last time I checked, the linker would skip any static libraries for which you don't actually use any symbols, and a constructor doesn't count as usage (https://ofekshilon.com/2013/04/06/forcing-construction-of-global-objects-in-static-libraries/ comes up from a quick search). So, if you're thinking about creating static libraries for each module to keep code duplication minimal while still embedding the built-in tests, that may cause problems.

> I think each feature can more or less live or die on its own. I agree with you that for the modules approach it makes the most sense to call one known function after dlopen which registers all the benchmarks in a module. If there happens to be a constructor-based way to generate this list from separate .cpp files then it could be used to generate the list that this method returns, or it could be just an explicit list like we have today in main.cpp.

I don't see any reason to complicate it by using a global constructor, but I guess that would be up to each module.

travisdowns added enhancement help wanted labels Mar 12, 2018

nemequ mentioned this issue Mar 15, 2018

Constructor module nemequ/portable-snippets#18

