Add templated links between arbitrary datatypes (#257)
* Use more suitable name for header

* Make things work again (again) with c++17

* [wip] Start working on templated associations

* [wip] Almost complete implementation of Association

* [wip] Scaffolding of AssociationCollection

* [wip] AssociationCollection implementation

* [wip] unit tests for subset collection functionality

* [wip] Proper move semantics

* [wip] Add necessary functionality for SIO backend

* [wip] clang-tidy fixes

* Move non public headers to detail folder

* Fix tests after Frame

* [wip] Make everything compile again

* Add json output support for associations

* Generate all possible combinations for root dicts

This makes it easy for the users but compile times explode for the
dictionaries as we are instantiating N**2 templates in static variables
in order for dictionary generation to pick them up properly.

* Add writing of Association collection to Frame I/O tests

* Generate SIOBlocks instantiations for associations

Similar to root dictionary generation to make them available to the SIO
backend. Same caveats and concerns as with ROOT in this case.

* Make association read tests version dependent

* Fix include guards to comply with clang-tidy

* [wip] Make things compile again after rebase

* [wip] Make associations work with Root I/O

* [wip] Make SIO Frame reading (almost) work again

* "fix" legacy sio frame test

* Make the AssociationSIOBlock public

* [wip] Introduce macros for registering associations

* Make Association I/O work with python

* Avoid unnecessary string copies, add documentation

* Add templated get/set and structured bindings to Associations

* Introduce typedefs for consistency with other types

* Move all but the public one into detail directory

* Fix clang-tidy warning

* Default initialize weight to 1

* Simplify and fortify SFINAE logic for mutable associations

* Add markdown documentation describing usage and some implementation details

* Make things compile again after PR revival

* Fix some runtime errors and cleanup code

* Bump patch version and make all tests run

* Bring documentation up-to-date

* Code cleanup and addressing review comments

* Make sure to cleanup cloned associations properly

* Add functionality to be closer to Container named requirement

* Adapt version check

* Add missing header

* Switch to more canonical member type names

* Switch to podio docstring comment style

* Fix cloning to behave the same as other generated types

* Add tests for inequality operator

* Remove unnecessary NOLINT from tests

* Update type requirements to match generated code

* Make templated associations work the same as generated ones

* Add test for cloning without relations

* Add definition of inequality operator

* Fix typos in documentation

* Move template helpers to class and introduce a new one

* Switch to different types in association for tests

* Add test to make sure structured bindings work in loops

* Simplify and fortify some mutability checks

* Silence false-positives from clang-tidy

* Rename Association to Link after EDM4hep discussion

* Remove empty file

* Use proper preprocessor guards to avoid ROOT interference

* Ensure that JSON ADL is working correctly for downstream consumers

* Add typeName to Interface types

* Add tests for reading / writing links with interface types

* Make sure info for schema evolution is in buffers

* Make sure setFrom and setTo work for python bindings

Need a dedicated overload for the Mutable types again, otherwise
overload resolution for cppyy doesn't consider the implicit conversion
to Mutable as viable option to invoke these

* Make it possible to pass in interfaced types for setters

Removes the need for an explicit cast that would otherwise be necessary

* Make LinkData an explicit type

Makes backwards compatibility easier to achieve

* Bump versions for tests now that a new tag has been made

* Fix typo

* Make initialization work for older compilers

* Make sure to not expect a non-generated header for roundtrip tests

* Make sure cmake runs without ENABLE_SIO=ON

* Move generic preprocessor macros to separate header

* Try to reduce code duplication in macros

* Switch to unique_ptrs for managing the related objects

* Fix a few minor typos in documentation

* Fix typos in comments and docstrings

Co-authored-by: Mateusz Jakub Fila <[email protected]>

* Fix comment

---------

Co-authored-by: Mateusz Jakub Fila <[email protected]>
tmadlener and m-fila authored Sep 30, 2024
1 parent 2064274 commit 383c222
Showing 31 changed files with 2,260 additions and 13 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -8,7 +8,7 @@ project(podio)
#--- Version -------------------------------------------------------------------
SET( ${PROJECT_NAME}_VERSION_MAJOR 1 )
SET( ${PROJECT_NAME}_VERSION_MINOR 1 )
-SET( ${PROJECT_NAME}_VERSION_PATCH 0 )
+SET( ${PROJECT_NAME}_VERSION_PATCH 99 )

SET( ${PROJECT_NAME}_VERSION "${${PROJECT_NAME}_VERSION_MAJOR}.${${PROJECT_NAME}_VERSION_MINOR}.${${PROJECT_NAME}_VERSION_PATCH}" )

224 changes: 224 additions & 0 deletions doc/links.md
@@ -0,0 +1,224 @@
# Linking unrelated objects with each other
Sometimes it is necessary to build links between objects whose datatypes are
not related via a `OneToOneRelation` or a `OneToManyRelation`. These *external
relations* are called *Links* in podio, and they are implemented as a
templated version of the code that would be generated by the following yaml
snippet (in this case between generic `FromT` and `ToT` datatypes):

```yaml
Link:
  Description: "A weighted link between a FromT and a ToT"
  Author: "P. O. Dio"
  Members:
    - float weight // the weight of the link
  OneToOneRelations:
    - FromT from // reference to the FromT
    - ToT to // reference to the ToT
```
## `Link` basics
`Link`s are implemented as templated classes with a structure similar to that of
other podio generated classes: there are several layers, of which users only ever
interact with the *User layer*. This layer has the following basic classes:
```cpp
/// The collection class that forms the basis of I/O and also is the main entry point
template<typename FromT, typename ToT>
class LinkCollection;
/// The default (immutable) class that one gets after reading a collection
template<typename FromT, typename ToT>
class Link;
/// The mutable class for creating links before writing them
template<typename FromT, typename ToT>
class MutableLink;
```

Although the names of the template parameters, `FromT` and `ToT`, imply a
direction of the link, from a technical point of view nothing actually
enforces this direction, unless `FromT` and `ToT` are both of the same type.
Hence, links can effectively be treated as bi-directional, and one
combination of `FromT` and `ToT` should be enough for all use cases (see also
the [usage section](#how-to-use-links)).

For a more detailed explanation of the internals and the actual implementation
see [the implementation details](#implementation-details).

## How to use `Link`s
Using `Link`s is quite simple. In line with other datatypes that are
generated by podio, all the functionality becomes available by including the
corresponding `Collection` header. After that it is generally recommended to
introduce a type alias for easier usage. **As a general rule `Link`s need
to be declared with the default (immutable) types.** Trying to instantiate them
with `Mutable` types will result in a compilation error.

```cpp
#include "podio/LinkCollection.h"
#include "edm4hep/MCParticleCollection.h"
#include "edm4hep/ReconstructedParticleCollection.h"
// declare a new link type
using MCRecoParticleLinkCollection = podio::LinkCollection<edm4hep::MCParticle,
                                                           edm4hep::ReconstructedParticle>;
```

This can now be used exactly as any other podio generated collection, i.e.
```cpp
edm4hep::MCParticle mcParticle{};
edm4hep::ReconstructedParticle recoParticle{};
auto mcRecoLinks = MCRecoParticleLinkCollection{};
auto link = mcRecoLinks.create(); // create a link
link.setFrom(mcParticle);
link.setTo(recoParticle);
link.setWeight(1.0); // This is also the default value!
```

and similarly for getting the linked objects
```cpp
auto mcP = link.getFrom();
auto recoP = link.getTo();
auto weight = link.getWeight();
```

In the above examples the `From` and `To` in the method names imply a direction,
but it is also possible to use the templated `get` and `set` methods to set and
retrieve the linked objects via their type:

```cpp
link.set(mcParticle);
link.set(recoParticle);
auto mcP = link.get<edm4hep::MCParticle>();
auto recoP = link.get<edm4hep::ReconstructedParticle>();
auto weight = link.getWeight();
```

It is also possible to access the elements of a link via an index based
`get` (similar to `std::tuple`). In this case `0` corresponds to `getFrom`, `1`
corresponds to `getTo` and `2` corresponds to the weight. The main purpose of
this feature is to enable structured bindings:

```cpp
const auto& [mcP, recoP, weight] = link;
```
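For readers unfamiliar with how a class opts in to structured bindings, the following is a minimal, self-contained sketch of the general C++17 mechanism (an index based `get` plus specializations of `std::tuple_size` and `std::tuple_element`). `ToyLink`, `Hit`, `Cluster` and `sumOfIdsTimesWeight` are hypothetical names used purely for illustration; this is not the podio implementation, which may differ in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for two unrelated datatypes (not podio types)
struct Hit {
  int id;
};
struct Cluster {
  int id;
};

// A toy link holding copies of the two linked objects and a weight
template <typename FromT, typename ToT>
class ToyLink {
public:
  ToyLink(FromT from, ToT to, float weight = 1.0f) : m_from(from), m_to(to), m_weight(weight) {}

  FromT getFrom() const { return m_from; }
  ToT getTo() const { return m_to; }
  float getWeight() const { return m_weight; }

  // Index based access: 0 -> from, 1 -> to, 2 -> weight
  template <std::size_t I>
  auto get() const {
    static_assert(I < 3, "ToyLink has exactly three elements");
    if constexpr (I == 0) {
      return getFrom();
    } else if constexpr (I == 1) {
      return getTo();
    } else {
      return getWeight();
    }
  }

private:
  FromT m_from;
  ToT m_to;
  float m_weight;
};

// Specializing these two traits opts ToyLink into the tuple protocol
namespace std {
template <typename FromT, typename ToT>
struct tuple_size<ToyLink<FromT, ToT>> : std::integral_constant<std::size_t, 3> {};

template <std::size_t I, typename FromT, typename ToT>
struct tuple_element<I, ToyLink<FromT, ToT>> {
  using type = decltype(std::declval<ToyLink<FromT, ToT>>().template get<I>());
};
} // namespace std

// Unpacks a link via structured bindings and combines the pieces
inline float sumOfIdsTimesWeight(const ToyLink<Hit, Cluster>& link) {
  const auto& [hit, cluster, weight] = link;
  return static_cast<float>(hit.id + cluster.id) * weight;
}

// Small usage example
inline void runExample() {
  const ToyLink<Hit, Cluster> link{Hit{2}, Cluster{3}, 0.5f};
  assert(sumOfIdsTimesWeight(link) == 2.5f);
}
```

In this sketch the bindings are copies of the stored objects, because the toy `get` returns by value.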

The above three examples are three equivalent ways of retrieving the same things
from a `Link`. **The templated `get` and `set` methods are only available
if `FromT` and `ToT` are not the same type** and will lead to a compilation
error otherwise.
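One way such a restriction can be expressed is sketched below, assuming the type based accessors are constrained with `std::enable_if_t`. All names (`ToyLink`, `Track`, `Vertex`, `isUniquelyLinked`, `HasTypedGet`) are hypothetical, and the actual podio code may use a different mechanism to produce the compilation error.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for two unrelated datatypes (not podio types)
struct Track {
  int id{0};
};
struct Vertex {
  int id{0};
};

template <typename FromT, typename ToT>
class ToyLink {
  // True only if T is one of the two linked types and those two types differ
  template <typename T>
  static constexpr bool isUniquelyLinked =
      !std::is_same_v<FromT, ToT> && (std::is_same_v<T, FromT> || std::is_same_v<T, ToT>);

public:
  // Type based getter, only viable if T identifies exactly one of the members
  template <typename T, typename = std::enable_if_t<isUniquelyLinked<T>>>
  T get() const {
    if constexpr (std::is_same_v<T, FromT>) {
      return m_from;
    } else {
      return m_to;
    }
  }

  // Type based setter with the same constraint
  template <typename T, typename = std::enable_if_t<isUniquelyLinked<T>>>
  void set(T value) {
    if constexpr (std::is_same_v<T, FromT>) {
      m_from = value;
    } else {
      m_to = value;
    }
  }

private:
  FromT m_from{};
  ToT m_to{};
};

// Detects whether link.get<T>() is a valid expression
template <typename LinkT, typename T, typename = void>
struct HasTypedGet : std::false_type {};
template <typename LinkT, typename T>
struct HasTypedGet<LinkT, T, std::void_t<decltype(std::declval<const LinkT&>().template get<T>())>>
    : std::true_type {};

static_assert(HasTypedGet<ToyLink<Track, Vertex>, Track>::value);
// With identical types the type based getter is ambiguous and hence disabled
static_assert(!HasTypedGet<ToyLink<Track, Track>, Track>::value);

// Small usage example
inline void runExample() {
  ToyLink<Track, Vertex> link;
  link.set(Track{3});
  link.set(Vertex{7});
  assert(link.get<Track>().id == 3);
  assert(link.get<Vertex>().id == 7);
}
```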

### Enabling I/O capabilities for `Link`s

`Link`s do not have I/O support out of the box. This has to be enabled via
the `PODIO_DECLARE_LINK` macro (defined in the `LinkCollection.h`
header). If you simply want to be able to read / write `Link`s in a
standalone executable, it is enough to use this macro somewhere in the
executable, e.g. to enable I/O capabilities for the `MCRecoParticleLink`s
used above this would look like:

```cpp
PODIO_DECLARE_LINK(edm4hep::MCParticle, edm4hep::ReconstructedParticle)
```

The macro will also enable SIO support if `PODIO_ENABLE_SIO=1` is passed to
the compiler. This is done by default when linking against the
`podio::podioSioIO` library in CMake.

To enable I/O support for shared datamodel libraries, all required
combinations of types have to be declared via `PODIO_DECLARE_LINK` and
compiled into the library. This is necessary if you want to use
the python bindings, since they rely on dynamically loading the datamodel
libraries.
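The registration done by such a macro follows a common static-registration pattern, which the following self-contained sketch illustrates. `ToyFactory`, `TOY_PP_CONCAT` and `TOY_DECLARE_LINK` are hypothetical stand-ins and not part of podio; the sketch only assumes that the macro defines a uniquely named static variable whose initializer performs the registration. Note that `__COUNTER__` is a widely supported compiler extension rather than standard C++.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a buffer factory: maps a type name to a creation function
class ToyFactory {
public:
  using CreateFunc = std::function<std::string()>;

  static ToyFactory& instance() {
    static ToyFactory factory; // function-local static avoids initialization order issues
    return factory;
  }

  // Returns true if the name was not registered before
  bool registerType(const std::string& name, CreateFunc func) {
    return m_funcs.emplace(name, std::move(func)).second;
  }

  bool knows(const std::string& name) const { return m_funcs.count(name) > 0; }

private:
  std::map<std::string, CreateFunc> m_funcs;
};

// Two-level concatenation so that __COUNTER__ is expanded before pasting
#define TOY_PP_CONCAT_IMPL(x, y) x##y
#define TOY_PP_CONCAT(x, y) TOY_PP_CONCAT_IMPL(x, y)

// Defines a uniquely named static whose initializer registers the type pair
#define TOY_DECLARE_LINK(FromT, ToT)                                    \
  const static auto TOY_PP_CONCAT(REGISTERED_TOY_LINK_, __COUNTER__) = \
      ToyFactory::instance().registerType(#FromT "->" #ToT, [] { return std::string(#FromT "->" #ToT); });

// Hypothetical datatypes
struct Hit {};
struct Cluster {};

// Using the macro once somewhere in the program is enough
TOY_DECLARE_LINK(Hit, Cluster)

// Small usage example: the registration has happened during static initialization
inline void runExample() {
  assert(ToyFactory::instance().knows("Hit->Cluster"));
  assert(!ToyFactory::instance().knows("Cluster->Hit"));
}
```

Because the registration happens as a side effect of static initialization, it only takes effect if the translation unit containing the macro is actually part of the final executable or of a loaded shared library.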

## Implementation details

To provide an easier entry point into the details of the implementation, and to
make it easier to find things in the generated documentation, we give a brief
description of the main ideas and design choices here. With those it should be
possible to dive deeper if necessary, or to understand the template structure
that is visible in the documentation but should be fairly invisible in usage.
We will focus mainly on the user facing classes, as those deal with the most
complexity. The underlying layers are more or less what could be obtained by
generating them via the yaml snippet above and sprinkling some
`<FromT, ToT>` templates where necessary.

### File structure

The user facing `"podio/LinkCollection.h"` header essentially just
defines the `PODIO_DECLARE_LINK` macro (depending on whether SIO support
is desired and possible or not). All the actual implementation is done in the
following files:

- [`"podio/detail/LinkCollectionImpl.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/LinkCollectionImpl.h):
for the collection functionality
- [`"podio/detail/Link.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/Link.h):
for the functionality of a single link
- [`"podio/detail/LinkCollectionIterator.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/LinkCollectionIterator.h):
for the collection iterator functionality
- [`"podio/detail/LinkObj.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/LinkObj.h):
for the object layer functionality
- [`"podio/detail/LinkCollectionData.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/LinkCollectionData.h):
for the collection data functionality
- [`"podio/detail/LinkFwd.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/LinkFwd.h):
for some type helper functionality and some forward declarations that are used
throughout the other headers
- [`"podio/detail/LinkSIOBlock.h"`](https://github.com/AIDASoft/podio/blob/master/include/podio/detail/LinkSIOBlock.h):
for defining the SIOBlocks that are necessary to use SIO

As is visible from this structure, we did not introduce a `LinkData`
class, since that would effectively just be a `float` wrapped inside a `struct`.

### Default and `Mutable` `Link` classes

A quick look into the `LinkFwd.h` header will reveal that the default and
`Mutable` `Link` classes are in fact just alias templates for the
`LinkT` class, which takes a `bool Mutable` as third template argument. The
same approach is also followed by the `LinkCollectionIterator`s:

```cpp
template<typename FromT, typename ToT, bool Mutable>
class LinkT;
template <typename FromT, typename ToT>
using Link = LinkT<FromT, ToT, false>;
template <typename FromT, typename ToT>
using MutableLink = LinkT<FromT, ToT, true>;
```
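A small standalone sketch of this pattern, using the hypothetical names `ToyLinkT`, `ToyLink`, `MutableToyLink` and `canBeModified`, shows that both aliases refer to the same class template and that generic code can be written once for both flavours:

```cpp
#include <type_traits>

// One class template, with mutability encoded in a bool template parameter
template <typename FromT, typename ToT, bool Mutable>
class ToyLinkT {
public:
  static constexpr bool isMutable = Mutable;
};

// The two user facing names are alias templates fixing the third argument
template <typename FromT, typename ToT>
using ToyLink = ToyLinkT<FromT, ToT, false>;
template <typename FromT, typename ToT>
using MutableToyLink = ToyLinkT<FromT, ToT, true>;

// Hypothetical datatypes
struct Hit {};
struct Cluster {};

// Generic code can handle both flavours by deducing the bool parameter
template <typename FromT, typename ToT, bool Mutable>
constexpr bool canBeModified(const ToyLinkT<FromT, ToT, Mutable>&) {
  return Mutable;
}

static_assert(std::is_same_v<ToyLink<Hit, Cluster>, ToyLinkT<Hit, Cluster, false>>);
static_assert(!std::is_same_v<ToyLink<Hit, Cluster>, MutableToyLink<Hit, Cluster>>);
static_assert(MutableToyLink<Hit, Cluster>::isMutable);
```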

Throughout the implementation it is assumed that `FromT` and `ToT` always are the
default handle types. This is ensured through `static_assert`s in the
`LinkCollection` to make sure it can only be instantiated with those. The
`GetDefaultHandleType` helper templates are used to retrieve the correct type
from any `FromT` regardless of whether it is a mutable or a default handle type.
With this in mind, effectively all mutating operations on `Link`s are
constrained via [*SFINAE*](https://en.cppreference.com/w/cpp/language/sfinae)
using the following template structure (taking `setFrom` as an example):

```cpp
template <typename FromU,
          typename = std::enable_if_t<Mutable &&
                                      std::is_same_v<detail::GetDefaultHandleType<FromU>, FromT>>>
void setFrom(FromU value);
```

This is an SFINAE-friendly way to ensure that this definition is only viable if
the following conditions are met:
- The object this method is called on has to be `Mutable`. (first part inside the `std::enable_if`)
- The passed in `value` is either a `Mutable` or default class of type `FromT`. (second part inside the `std::enable_if`)
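The effect of these two conditions can be checked in a toy setting. The sketch below is self-contained and uses hypothetical stand-ins throughout: `Particle` and `MutableParticle` play the role of the default and mutable handle types, the local `GetDefaultHandleType` is a simplified re-implementation for illustration only, and `ToyLinkT` only has a `from` side.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Toy handle types: a default (immutable) handle and its mutable counterpart
struct Particle {
  int id{0};
};
struct MutableParticle {
  int id{0};
  operator Particle() const { return Particle{id}; } // implicit conversion to the default handle
};

// Hypothetical helper mapping any handle to its default handle type
template <typename T>
struct DefaultHandle {
  using type = T;
};
template <>
struct DefaultHandle<MutableParticle> {
  using type = Particle;
};
template <typename T>
using GetDefaultHandleType = typename DefaultHandle<T>::type;

template <typename FromT, bool Mutable>
class ToyLinkT {
public:
  // Only viable for mutable links and for arguments that map to FromT
  template <typename FromU,
            typename = std::enable_if_t<Mutable && std::is_same_v<GetDefaultHandleType<FromU>, FromT>>>
  void setFrom(FromU value) {
    m_from = value;
  }

  FromT getFrom() const { return m_from; }

private:
  FromT m_from{};
};

// Detects whether link.setFrom(arg) is a valid expression
template <typename LinkT, typename ArgT, typename = void>
struct CanSetFrom : std::false_type {};
template <typename LinkT, typename ArgT>
struct CanSetFrom<LinkT, ArgT, std::void_t<decltype(std::declval<LinkT&>().setFrom(std::declval<ArgT>()))>>
    : std::true_type {};

static_assert(CanSetFrom<ToyLinkT<Particle, true>, Particle>::value);
static_assert(CanSetFrom<ToyLinkT<Particle, true>, MutableParticle>::value);
static_assert(!CanSetFrom<ToyLinkT<Particle, false>, Particle>::value); // immutable link
static_assert(!CanSetFrom<ToyLinkT<Particle, true>, int>::value);       // unrelated type

// Small usage example
inline void runExample() {
  ToyLinkT<Particle, true> link;
  link.setFrom(MutableParticle{42});
  assert(link.getFrom().id == 42);
}
```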

In some cases the template signature looks like this

```cpp
template <bool Mut = Mutable,
          typename = std::enable_if_t<Mut && Mutable>>
void setWeight(float weight);
```

The reason to have a defaulted `bool` template parameter here is the same as the
one for having a `typename FromU` template parameter above: SFINAE only applies
to template parameters of the method itself, not to those of the enclosing
class. Using `Mut && Mutable` in the `std::enable_if` makes sure
that users cannot bypass the immutability by specifying a template parameter
themselves.
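That an explicitly specified template argument cannot be used to bypass the check can be verified with a small detection helper. The following sketch is self-contained and uses the hypothetical names `ToyLinkT`, `CanSetWeight` and `CanForceSetWeight`; it assumes the constraint is written with `std::enable_if_t`.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <bool Mutable>
class ToyLinkT {
public:
  // Mut makes the condition depend on a parameter of the method itself,
  // while Mutable keeps the class-level property in the condition
  template <bool Mut = Mutable, typename = std::enable_if_t<Mut && Mutable>>
  void setWeight(float weight) {
    m_weight = weight;
  }

  float getWeight() const { return m_weight; }

private:
  float m_weight{1.0f};
};

// Detects whether link.setWeight(1.0f) is a valid expression
template <typename LinkT, typename = void>
struct CanSetWeight : std::false_type {};
template <typename LinkT>
struct CanSetWeight<LinkT, std::void_t<decltype(std::declval<LinkT&>().setWeight(1.0f))>> : std::true_type {};

// Detects whether link.setWeight<true>(1.0f), i.e. an attempted bypass, is valid
template <typename LinkT, typename = void>
struct CanForceSetWeight : std::false_type {};
template <typename LinkT>
struct CanForceSetWeight<LinkT, std::void_t<decltype(std::declval<LinkT&>().template setWeight<true>(1.0f))>>
    : std::true_type {};

static_assert(CanSetWeight<ToyLinkT<true>>::value);
static_assert(!CanSetWeight<ToyLinkT<false>>::value);
// Explicitly passing Mut = true does not help on an immutable link
static_assert(!CanForceSetWeight<ToyLinkT<false>>::value);

// Small usage example
inline void runExample() {
  ToyLinkT<true> link;
  assert(link.getWeight() == 1.0f);
  link.setWeight(0.5f);
  assert(link.getWeight() == 0.5f);
}
```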
37 changes: 37 additions & 0 deletions include/podio/LinkCollection.h
@@ -0,0 +1,37 @@
#ifndef PODIO_LINKCOLLECTION_H
#define PODIO_LINKCOLLECTION_H

#include "podio/detail/LinkCollectionImpl.h"
#include "podio/detail/PreprocessorMacros.h"

#ifndef PODIO_ENABLE_SIO
#define PODIO_ENABLE_SIO 0
#endif

/// Macro for registering links at the CollectionBufferFactory by injecting the
/// corresponding buffer creation function.
#define PODIO_REGISTER_LINK_BUFFERFACTORY(FromT, ToT) \
  const static auto PODIO_PP_CONCAT(REGISTERED_LINK_, __COUNTER__) = \
      podio::detail::registerLinkCollection<FromT, ToT>(podio::detail::linkCollTypeName<FromT, ToT>());

/// Macro for registering the necessary SIOBlock for a Link with the SIOBlock factory
#define PODIO_REGISTER_LINK_SIOFACTORY(FromT, ToT) \
const static auto PODIO_PP_CONCAT(LINK_SIO_BLOCK_, __COUNTER__) = podio::LinkSIOBlock<FromT, ToT>{};

#if PODIO_ENABLE_SIO && __has_include("podio/detail/LinkSIOBlock.h")
#include "podio/detail/LinkSIOBlock.h"
/// Main macro for declaring links. Takes care of the following things:
/// - Registering the necessary buffer creation functionality with the
/// CollectionBufferFactory.
/// - Registering the necessary SIOBlock with the SIOBlock factory
#define PODIO_DECLARE_LINK(FromT, ToT) \
  PODIO_REGISTER_LINK_BUFFERFACTORY(FromT, ToT) \
  PODIO_REGISTER_LINK_SIOFACTORY(FromT, ToT)
#else
/// Main macro for declaring links. Takes care of the following things:
/// - Registering the necessary buffer creation functionality with the
/// CollectionBufferFactory.
#define PODIO_DECLARE_LINK(FromT, ToT) PODIO_REGISTER_LINK_BUFFERFACTORY(FromT, ToT)
#endif

#endif // PODIO_LINKCOLLECTION_H