Calculate stop to subshape mapping during import (#136)
* WIP: Move subshape request into `nigiri`

* WIP: Remove `trip_idx_t` from function signature

* WIP: Explore wrong shape for merged trips

* WIP: Fix missing end segment

* Fix trip index on merged trips

* Add check for stop order on same section

* WIP: Create struct to handle shapes data

* WIP: Swap to new shapes storage struct

* WIP: Format code

* WIP: Remove temporary cache

* WIP: Remove mutable cache

* WIP: Fix array variant

* WIP: Parametrize test

* Add missing header

* WIP: Calculate shape offsets per stop during load

* WIP: Use cache to improve import duration

This also adds files missing in last commit

* Add test for duplicated shape offsets

* Cleanup code

* Fix build errors

* Fix formatting

* Update progress bar

* Remove `constexpr` specifier

* Ensure mapping uses correct indices

* Make `shapes_storage` optional

Using a `shapes_storage` instance will now always calculate and store
shapes and shape offsets.

* WIP: Avoid creating an additional vector

* Add test for shared shapes

* Cleanup test input

* Move initializer used in `if`

* Simplify cache offset calculation

* Remove aliases used only once

* Use lambda expressions for offsets calculation cache

This will remove warnings about possibly unused functions in defined
function objects.

* Add vector to store duplicated shape offsets

* WIP: Store trip to shape mapping in shapes_storage

* Remove 'trip_shape_indices_' from 'timetable'

* Revert accidentally applied formatting

* Fix code style

* Fix typo

* fixup! Revert accidentally applied formatting

* Fix missing header

* Use uniform formatting library

* Simplify equals and hash functions

* Fix progress bar update

* Duplicate alias defined in osr

* Delete dead code

* WIP: Add prototype supporting block trips

* Fix assertions

* Update tests for changed function

This will also prepare the data set for multiple tests using a common
timetable.

* Handle trips without shape

* WIP: Add support for runs containing a trip subset

* WIP: Handle offsets of merged sub trips

* WIP: Add support for runs covering multiple trips

* Delete debug output

* WIP: Add support for shapes covering two trips

Notice that the connection stop will be processed twice for now.

* WIP: Fix duplicated connection stop

* WIP: Add support for many trips

* Fix offsets

* Support merged trips with and without shapes

* Simplify some code

* Fix includes

* Remove previous implementation

* Simplify shape offset calculation

* Use appropriate loop statement

* Remove not required namespace

* Improve test description

* Reduce duplicated code

* Fix missing `const`

* Fix formatting

* WIP: Prepare shape offset calculate by distance

* Simplify property access

* Fix test data

* Remove stop deduplication for merged trips

Stops connecting multiple trips in a journey leg will no longer be
merged. This will slightly improve code readability without having a
notable effect for most data sets.
Furthermore, it will support GTFS data sets that may use different
coordinates for stops connecting multiple trips.

* Calculate offsets based on distance traveled

* Add test for different traveled distances

* Fix formatting

* Fix out of bounds error

* Remove lambda function

* Swap checks for performance improvement

* Fix assertion

Fix offset, as 'stop_range_' is inclusive while 'range' is exclusive

* Prefer algorithms provided by 'utl'

* Use better variable name

* Fix missing namespace 'std'

* Replace unicode arrows

* Use explicit loop to replace 'std::for_each'

* Move expected values into assertions

* Remove not needed assertion

* Fix missing type conversion for interval shifts

* Use operations defined for interval

* Use enumeration

* Use unsigned integers for initialization

This is only applied for data types that are based on unsigned integers

* Use base type for offset calculation

* Revert "Fix assertion"

This reverts commit 99da3ed.

* Fix variable name

* Use 'interval::end()' implementation

Notice that using `end(interval)` will attempt to use `frun::end()`
instead.

* wip

* Fix code

* Add tests for mixed shape trips

* WIP: Simplify shape processing

* Cleanup code

* Fix behavior for single stops

* Format code

* Replace call to 'std::views::pairwise'

* Remove not supported 'std::views::pairwise'

* Use offset cache for shapes with distance traveled

* Remove calls to 'std::make_pair'

* Change temporary data structure

This will use a vector map to avoid an additional hash map.

* Simplify inserts

* Store median distance traveled

This allows minor errors within shape distance traveled

* Fix formatting

* Reduce memory usage

* Remove not required headers

* Rename variables to match intended usage

* Simplify check for valid distances traveled

* Fix offset error for multiple data sets

* Fix included headers

* Use trailing return type

* Improve variable names

* Remove no longer used function

* Remove no longer relevant information from test

* WIP: Change code to operate on segments

* Remove duplicated shape points

* Move duplicated code into shared lambda with state

* Add tests for interval intersection

* Remove not needed variable

* Use lvalue reference instead of rvalue reference

* Remove misleading alias

* Use default capture

* Update names

* Fix naming

* Store invalid starting point into constant

* Replace comment with assert

* Use alias

* Replace 'index' with 'idx'

* Remove not needed 'inline'

* Add brief explanation for stored median

* WIP: Setup tests for missing distances

* Avoid zero vectors for 'shape_dist_traveled'

* Move length check to the end

* Improve memory usage for empty shape_dist_traveled

As values must increase, only leading `0.0`s are allowed.

* Update test to use multiple leading `0.0`s

* Simplify check for valid distances

* Reduce memory usage when no shapes are used

* Mark unused variable

* Fix data type

* Delete obsolete compare function

* Simplify shape offset calculation

Assume visual errors will not be noticed by end users. Therefore some
optimization can be skipped.

* Update cista dependency

* Remove unused include

---------

Co-authored-by: Felix Gündling <[email protected]>
MichaelKutzner and felixguendling authored Oct 9, 2024
1 parent d4c4b88 commit 164480a
Showing 37 changed files with 1,284 additions and 131 deletions.
2 changes: 1 addition & 1 deletion .pkg
@@ -5,7 +5,7 @@
[cista]
url=git@github.com:felixguendling/cista.git
branch=master
-commit=f52a62c4d83377acd398227ab4fcd6c946bdbd70
+commit=f1358310262c347a8b4a533e5dd6184ec97ba637
[geo]
url=git@github.com:motis-project/geo.git
branch=master
7 changes: 4 additions & 3 deletions .pkg.lock
@@ -1,8 +1,8 @@
-8342248202595136248
+14235113949304861054
cista f52a62c4d83377acd398227ab4fcd6c946bdbd70
PEGTL 1c1aa6e650e4d26f10fa398f148ec0cdc5f0808d
-res 7d97784ba785ce8a2677ea77164040fde484fb04
-date d84b23ca2432e17f3f04a3e0cc96b096b99c39a2
+res b759b93316afeb529b6cb5b2548b24c41e382fb0
+date ce88cc33b5551f66655614eeebb7c5b7189025fb
googletest 7b64fca6ea0833628d6f86255a81424365f7cc0c
fmt dc10f83be70ac2873d5f8d1ce317596f1fd318a2
utl 77aac494c45d2b070e65fe712abc34ac74a91d0f
@@ -19,5 +19,6 @@ opentelemetry-proto 1624689398a3226c45994d70cb544a1e781dc032
abseil-cpp ba5240842d352b4b67a32092453a2fe5fe53a62e
protobuf d8136b9c6a62db6ce09900ecdeb82bb793096cbd
opentelemetry-cpp ec4aef6b17b697052edef5417825ad71947b2ed1
pugixml 60175e80e2f5e97e027ac78f7e14c5acc009ce50
unordered_dense 77e91016354e6d8cba24a86c5abb807de2534c02
wyhash 1e012b57fc2227a9e583a57e2eacb3da99816d99
5 changes: 2 additions & 3 deletions exe/import.cc
@@ -106,10 +106,9 @@ int main(int ac, char** av) {
assistance = std::make_unique<assistance_times>(read_assistance(f.view()));
}

-auto shapes = std::unique_ptr<shapes_storage_t>();
+auto shapes = std::unique_ptr<shapes_storage>{};
if (vm.contains("shapes")) {
-shapes =
-std::make_unique<shapes_storage_t>(create_shapes_storage(out_shapes));
+shapes = std::make_unique<shapes_storage>(out_shapes);
}

auto const start = parse_date(start_date);
13 changes: 11 additions & 2 deletions include/nigiri/common/interval.h
@@ -2,6 +2,7 @@

#include <cassert>
#include <concepts>
#include <cstdlib>
#include <algorithm>
#include <iterator>
#include <ostream>
@@ -56,12 +57,12 @@ struct interval {
};

template <typename X>
-interval operator+(X const& x) const {
+interval operator>>(X const& x) const {
return {static_cast<T>(from_ + x), static_cast<T>(to_ + x)};
}

template <typename X>
-interval operator-(X const& x) const {
+interval operator<<(X const& x) const {
return {static_cast<T>(from_ - x), static_cast<T>(to_ - x)};
}

@@ -79,6 +80,14 @@ struct interval {
return from_ < o.to_ && to_ > o.from_;
}

+interval intersect(interval const& o) const {
+if (overlaps(o)) {
+return {std::max(from_, o.from_), std::min(to_, o.to_)};
+} else {
+return {};
+}
+}
+
iterator begin() const { return {from_}; }
iterator end() const { return {to_}; }
friend iterator begin(interval const& r) { return r.begin(); }
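The interval changes in this file can be tried in isolation. The following is a minimal sketch, assuming a stripped-down stand-in for `interval<T>` that keeps only the members shown in the diff; the `operator==` is added here purely so results can be compared and is not taken from the commit.

```cpp
#include <algorithm>
#include <cassert>

// Simplified stand-in for nigiri's interval<T>: half-open range [from_, to_).
template <typename T>
struct interval {
  // Shift the whole interval to the right by x (formerly operator+).
  template <typename X>
  interval operator>>(X const& x) const {
    return {static_cast<T>(from_ + x), static_cast<T>(to_ + x)};
  }

  // Shift the whole interval to the left by x (formerly operator-).
  template <typename X>
  interval operator<<(X const& x) const {
    return {static_cast<T>(from_ - x), static_cast<T>(to_ - x)};
  }

  bool overlaps(interval const& o) const {
    return from_ < o.to_ && to_ > o.from_;
  }

  // Common part of both intervals, or an empty default interval if none.
  interval intersect(interval const& o) const {
    if (overlaps(o)) {
      return {std::max(from_, o.from_), std::min(to_, o.to_)};
    }
    return {};
  }

  // Illustration only: equality helper, not part of the diff.
  friend bool operator==(interval const& a, interval const& b) {
    return a.from_ == b.from_ && a.to_ == b.to_;
  }

  T from_{}, to_{};
};
```

With `interval<int>{2, 8}` and `interval<int>{5, 12}`, `intersect` yields `[5, 8)`. Two intervals that only touch, such as `[2, 8)` and `[8, 10)`, do not overlap and yield the default-constructed interval.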
3 changes: 3 additions & 0 deletions include/nigiri/common/sort_by.h
@@ -10,6 +10,9 @@ template <typename T>
void apply_permutation(std::vector<unsigned> const& permutation,
T const& orig,
T& vec) {
+if (orig.empty()) {
+return;
+}
for (auto i = 0U; i != permutation.size(); ++i) {
vec[i] = orig[permutation[i]];
}
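The early return lets a caller pass an optional column that was never filled. A plausible reading, not confirmed by the diff itself, is that `distance_traveled_` may stay empty while the other per-stop vectors are permuted. A minimal sketch of the guarded function follows; the size `assert` is an addition for illustration only.

```cpp
#include <cassert>
#include <vector>

// Reorders `orig` into `vec` so that vec[i] == orig[permutation[i]].
// An empty `orig` is treated as "column not present" and leaves `vec` alone.
template <typename T>
void apply_permutation(std::vector<unsigned> const& permutation,
                       T const& orig,
                       T& vec) {
  if (orig.empty()) {
    return;
  }
  // Illustration only: the original code does not check sizes.
  assert(orig.size() >= permutation.size() &&
         vec.size() >= permutation.size());
  for (auto i = 0U; i != permutation.size(); ++i) {
    vec[i] = orig[permutation[i]];
  }
}
```

Without the guard, a non-empty permutation applied to an empty `orig` would index out of bounds.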
5 changes: 3 additions & 2 deletions include/nigiri/loader/gtfs/load_timetable.h
@@ -2,6 +2,7 @@

#include "nigiri/loader/dir.h"
#include "nigiri/loader/loader_interface.h"
+#include "nigiri/shape.h"
#include "nigiri/types.h"

namespace nigiri {
@@ -23,14 +24,14 @@ void load_timetable(loader_config const&,
dir const&,
timetable&,
assistance_times* = nullptr,
-shapes_storage_t* = nullptr);
+shapes_storage* = nullptr);

void load_timetable(loader_config const&,
source_idx_t,
dir const&,
timetable&,
hash_map<bitfield, bitfield_idx_t>&,
assistance_times* = nullptr,
-shapes_storage_t* = nullptr);
+shapes_storage* = nullptr);

} // namespace nigiri::loader::gtfs
3 changes: 2 additions & 1 deletion include/nigiri/loader/gtfs/loader.h
@@ -1,6 +1,7 @@
#pragma once

#include "nigiri/loader/loader_interface.h"
+#include "nigiri/shape.h"

namespace nigiri::loader::gtfs {

@@ -12,7 +13,7 @@ struct gtfs_loader : public loader_interface {
timetable&,
hash_map<bitfield, bitfield_idx_t>&,
assistance_times*,
-shapes_storage_t*) const override;
+shapes_storage*) const override;
cista::hash_t hash(dir const&) const override;
std::string_view name() const override;
};
9 changes: 7 additions & 2 deletions include/nigiri/loader/gtfs/shape.h
@@ -2,6 +2,7 @@

#include <string_view>

+#include "nigiri/shape.h"
#include "nigiri/types.h"

namespace nigiri::loader::gtfs {
@@ -11,8 +12,12 @@ struct shape_state {
std::size_t last_seq_{};
};

-using shape_id_map_t = hash_map<std::string, shape_state>;
+struct shape_loader_state {
+hash_map<std::string, shape_state> id_map_{};
+vecvec<shape_idx_t, double> distances_{};
+shape_idx_t index_offset_;
+};

-shape_id_map_t parse_shapes(std::string_view const, shapes_storage_t&);
+shape_loader_state parse_shapes(std::string_view const, shapes_storage&);

} // namespace nigiri::loader::gtfs
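The new `distances_` member holds per-shape distance values, which the commit message says are used to "calculate offsets based on distance traveled". The actual implementation is not visible in this part of the diff. The following is only a sketch of one way such a mapping could work, assuming both inputs are sorted in ascending order; `offsets_by_distance` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: for each stop distance, find the index of the first
// shape point whose cumulative distance is at or beyond it.
// Assumes `shape_dist` is non-empty and both vectors are sorted ascending.
std::vector<std::size_t> offsets_by_distance(
    std::vector<double> const& shape_dist,
    std::vector<double> const& stop_dist) {
  assert(!shape_dist.empty());
  auto offsets = std::vector<std::size_t>{};
  offsets.reserve(stop_dist.size());
  auto it = shape_dist.begin();
  for (auto const d : stop_dist) {
    // Continue searching from the previous match to keep offsets monotonic.
    it = std::lower_bound(it, shape_dist.end(), d);
    if (it == shape_dist.end()) {
      it = std::prev(shape_dist.end());  // Clamp to the last shape point.
    }
    offsets.push_back(
        static_cast<std::size_t>(std::distance(shape_dist.begin(), it)));
  }
  return offsets;
}
```

Because the search resumes at the previous match, the whole mapping costs one pass over the shape plus a binary search per stop.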
15 changes: 15 additions & 0 deletions include/nigiri/loader/gtfs/shape_prepare.h
@@ -0,0 +1,15 @@
#pragma once

#include "nigiri/loader/gtfs/shape.h"
#include "nigiri/loader/gtfs/trip.h"
#include "nigiri/shape.h"
#include "nigiri/timetable.h"

namespace nigiri::loader::gtfs {

void calculate_shape_offsets(timetable const&,
shapes_storage&,
vector_map<gtfs_trip_idx_t, trip> const&,
shape_loader_state const&);

} // namespace nigiri::loader::gtfs
3 changes: 2 additions & 1 deletion include/nigiri/loader/gtfs/stop_time.h
@@ -10,6 +10,7 @@ namespace nigiri::loader::gtfs {
void read_stop_times(timetable&,
trip_data&,
locations_map const&,
-std::string_view file_content);
+std::string_view file_content,
+bool);

} // namespace nigiri::loader::gtfs
4 changes: 3 additions & 1 deletion include/nigiri/loader/gtfs/trip.h
@@ -15,6 +15,7 @@
#include "nigiri/loader/gtfs/shape.h"
#include "nigiri/loader/gtfs/stop.h"
#include "nigiri/timetable.h"
+#include "nigiri/types.h"

namespace nigiri::loader::gtfs {

@@ -98,6 +99,7 @@ struct trip {
std::vector<std::uint16_t> seq_numbers_;
std::vector<stop_events> event_times_;
std::vector<trip_direction_idx_t> stop_headsigns_;
+std::vector<double> distance_traveled_;

std::optional<std::vector<frequency>> frequency_;
bool requires_interpolation_{false};
@@ -126,7 +128,7 @@ trip_data read_trips(
timetable&,
route_map_t const&,
traffic_days_t const&,
-shape_id_map_t const&,
+shape_loader_state const&,
std::string_view file_content,
std::array<bool, kNumClasses> const& bikes_allowed_default);

3 changes: 2 additions & 1 deletion include/nigiri/loader/hrd/loader.h
@@ -2,6 +2,7 @@

#include "nigiri/loader/hrd/parser_config.h"
#include "nigiri/loader/loader_interface.h"
+#include "nigiri/shape.h"

namespace nigiri::loader::hrd {

@@ -14,7 +15,7 @@ struct hrd_loader : public loader_interface {
timetable& tt,
hash_map<bitfield, bitfield_idx_t>&,
assistance_times*,
-shapes_storage_t*) const override;
+shapes_storage*) const override;
cista::hash_t hash(dir const&) const override;
nigiri::loader::hrd::config config_;
};
3 changes: 2 additions & 1 deletion include/nigiri/loader/load.h
@@ -7,6 +7,7 @@

#include "nigiri/loader/build_footpaths.h"
#include "nigiri/common/interval.h"
+#include "nigiri/shape.h"
#include "nigiri/timetable.h"
#include "nigiri/types.h"

@@ -19,7 +20,7 @@ timetable load(std::vector<std::pair<std::string, loader_config>> const&,
finalize_options const&,
interval<date::sys_days> const&,
assistance_times* = nullptr,
-shapes_storage_t* = nullptr,
+shapes_storage* = nullptr,
bool ignore = false);

} // namespace nigiri::loader
3 changes: 2 additions & 1 deletion include/nigiri/loader/loader_interface.h
@@ -5,6 +5,7 @@

#include "nigiri/loader/assistance.h"
#include "nigiri/loader/dir.h"
+#include "nigiri/shape.h"
#include "nigiri/types.h"

namespace nigiri {
@@ -28,7 +29,7 @@ struct loader_interface {
timetable&,
hash_map<bitfield, bitfield_idx_t>&,
assistance_times*,
-shapes_storage_t*) const = 0;
+shapes_storage*) const = 0;
virtual cista::hash_t hash(dir const&) const = 0;
virtual std::string_view name() const = 0;
};
11 changes: 11 additions & 0 deletions include/nigiri/rt/frun.h
@@ -1,10 +1,16 @@
#pragma once

#include <functional>
#include <iosfwd>

#include "geo/latlng.h"

#include "nigiri/common/interval.h"
#include "nigiri/location.h"
#include "nigiri/rt/run.h"
#include "nigiri/shape.h"
#include "nigiri/stop.h"
#include "nigiri/types.h"

namespace nigiri {
struct rt_timetable;
@@ -136,6 +142,11 @@ struct frun : public run {
trip_idx_t trip_idx() const;
clasz get_clasz() const noexcept;

+void for_each_shape_point(
+shapes_storage const*,
+interval<stop_idx_t> const&,
+std::function<void(geo::latlng const&)> const&) const;
+

void print(std::ostream&, interval<stop_idx_t>);
friend std::ostream& operator<<(std::ostream&, frun const&);
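`for_each_shape_point` takes the storage as a pointer, so callers can pass `nullptr` when the import ran without shapes. Its body is not part of this excerpt. The sketch below only illustrates the general callback pattern, assuming a fallback to stop coordinates when no shape is available; `for_each_point` and the simplified `latlng` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Simplified stand-in for geo::latlng.
struct latlng {
  double lat_, lng_;
};

// Hypothetical helper: visit the shape points if a non-empty shape is given,
// otherwise fall back to the plain stop coordinates.
void for_each_point(std::vector<latlng> const* shape,
                    std::vector<latlng> const& stop_coordinates,
                    std::function<void(latlng const&)> const& callback) {
  auto const& points =
      (shape != nullptr && !shape->empty()) ? *shape : stop_coordinates;
  for (auto const& p : points) {
    callback(p);
  }
}
```

A caller that wants a polyline can collect the points with a lambda such as `[&](latlng const& p) { line.push_back(p); }`.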

32 changes: 19 additions & 13 deletions include/nigiri/shape.h
@@ -3,24 +3,30 @@
#include <filesystem>
#include <span>

+#include "cista/containers/pair.h"
+
#include "geo/latlng.h"

#include "nigiri/types.h"

-namespace nigiri {
-struct timetable;
-}
-
namespace nigiri {

-shapes_storage_t create_shapes_storage(
-std::filesystem::path const&,
-cista::mmap::protection = cista::mmap::protection::WRITE);
-
-std::span<geo::latlng const> get_shape(timetable const&,
-shapes_storage_t const&,
-trip_idx_t);
-
-std::span<geo::latlng const> get_shape(shapes_storage_t const&, shape_idx_t);
+struct shapes_storage {
+explicit shapes_storage(
+std::filesystem::path const&,
+cista::mmap::protection = cista::mmap::protection::WRITE);
+std::span<geo::latlng const> get_shape(shape_idx_t) const;
+std::span<geo::latlng const> get_shape(trip_idx_t) const;
+std::span<geo::latlng const> get_shape(trip_idx_t,
+interval<stop_idx_t> const&) const;
+shape_offset_idx_t add_offsets(std::vector<shape_offset_t> const&);
+void add_trip_shape_offsets(
+trip_idx_t, cista::pair<shape_idx_t, shape_offset_idx_t> const&);
+
+mm_vecvec<shape_idx_t, geo::latlng> data_;
+mm_vecvec<shape_offset_idx_t, shape_offset_t> offsets_;
+mm_vec_map<trip_idx_t, cista::pair<shape_idx_t, shape_offset_idx_t>>
+trip_offset_indices_;
+};

} // namespace nigiri
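The new struct stores, per trip, a shape index together with an index into a list of per-stop offsets. The `get_shape(trip_idx_t, interval<stop_idx_t> const&)` overload suggests that a stop range is translated into a slice of the shape's points. The definition lives in a source file outside this excerpt, so the following is only an in-memory sketch of that idea; `toy_shapes`, `point_range`, and `sub_shape` are hypothetical names, and plain index ranges stand in for `std::span`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct latlng {
  double lat_, lng_;
};

// Half-open range [begin_, end_) of indices into the shape's point list.
struct point_range {
  std::size_t begin_, end_;
};

// Hypothetical in-memory stand-in for one trip's shape and stop offsets.
struct toy_shapes {
  // Shape points covered by the half-open stop range [from_stop, to_stop):
  // from the first stop's offset up to and including the last stop's offset.
  point_range sub_shape(std::size_t from_stop, std::size_t to_stop) const {
    if (to_stop <= from_stop || to_stop > offsets_.size()) {
      return {0U, 0U};  // Empty or invalid stop range.
    }
    auto const first = offsets_[from_stop];
    auto const last = offsets_[to_stop - 1U];
    assert(first <= last && last < points_.size());
    return {first, static_cast<std::size_t>(last) + 1U};
  }

  std::vector<latlng> points_;          // Polyline of the shape.
  std::vector<std::uint32_t> offsets_;  // offsets_[stop] = index in points_.
};
```

For six shape points and stop offsets `{0, 2, 5}`, the stop range `[1, 3)` maps to the points `[2, 6)`, which is the segment between the second and third stop including both endpoints.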
3 changes: 0 additions & 3 deletions include/nigiri/timetable.h
@@ -386,9 +386,6 @@ struct timetable {
// Trip index -> all transports with a stop interval
paged_vecvec<trip_idx_t, transport_range_t> trip_transport_ranges_;

-// Trip index -> shape per trip
-vector_map<trip_idx_t, shape_idx_t> trip_shape_indices_;
-
// Transport -> stop sequence numbers (relevant for GTFS-RT stop matching)
// Compaction:
// - empty = zero-based sequence 0,1,2,...
7 changes: 6 additions & 1 deletion include/nigiri/types.h
@@ -112,6 +112,9 @@ using optional = cista::optional<T>;
template <typename Key, typename T, std::size_t N>
using nvec = cista::raw::nvec<Key, T, N>;

+template <typename K, typename V>
+using mm_vec_map = cista::basic_mmap_vec<V, K>;
+
template <typename T>
using mm_vec = cista::basic_mmap_vec<T, std::uint64_t>;

@@ -134,6 +137,9 @@ using route_idx_t = cista::strong<std::uint32_t, struct _route_idx>;
using section_idx_t = cista::strong<std::uint32_t, struct _section_idx>;
using section_db_idx_t = cista::strong<std::uint32_t, struct _section_db_idx>;
using shape_idx_t = cista::strong<std::uint32_t, struct _shape_idx>;
+using shape_offset_t = cista::strong<std::uint32_t, struct _shape_offset>;
+using shape_offset_idx_t =
+cista::strong<std::uint32_t, struct _shape_offset_idx>;
using trip_idx_t = cista::strong<std::uint32_t, struct _trip_idx>;
using trip_id_idx_t = cista::strong<std::uint32_t, struct _trip_id_str_idx>;
using transport_idx_t = cista::strong<std::uint32_t, struct _transport_idx>;
@@ -171,7 +177,6 @@ using attribute_combination_idx_t =
cista::strong<std::uint32_t, struct _attribute_combination>;
using provider_idx_t = cista::strong<std::uint32_t, struct _provider_idx>;

-using shapes_storage_t = mm_vecvec<shape_idx_t, geo::latlng>;
using transport_range_t = pair<transport_idx_t, interval<stop_idx_t>>;

struct trip_debug {
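The two new index types follow the file's existing pattern of tagging a `std::uint32_t` with a distinct struct so that different kinds of indices cannot be mixed up. The sketch below shows the general idea with a hand-written wrapper. It is not cista's implementation, and `strong`, `shape_idx`, and `shape_offset_idx` are illustrative names only.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Illustrative strong typedef: same underlying value type, distinct C++ type
// per tag.
template <typename T, typename Tag>
struct strong {
  constexpr explicit strong(T v) : v_{v} {}
  constexpr T value() const { return v_; }
  friend constexpr bool operator==(strong a, strong b) {
    return a.v_ == b.v_;
  }
  T v_;
};

using shape_idx = strong<std::uint32_t, struct shape_idx_tag>;
using shape_offset_idx = strong<std::uint32_t, struct shape_offset_idx_tag>;

// Mixing the two index kinds, or passing a raw integer, does not compile.
static_assert(!std::is_convertible_v<shape_idx, shape_offset_idx>);
static_assert(!std::is_convertible_v<std::uint32_t, shape_idx>);
```

With such types, a function declared as `get_shape(shape_idx)` cannot be called with a `shape_offset_idx` by accident, which matters once a struct stores both kinds side by side.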
1 change: 1 addition & 0 deletions src/abi.cc
@@ -20,6 +20,7 @@
#include "nigiri/rt/create_rt_timetable.h"
#include "nigiri/rt/gtfsrt_update.h"
#include "nigiri/rt/rt_timetable.h"
+#include "nigiri/shape.h"
#include "nigiri/timetable.h"
#include "nigiri/types.h"
