Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement new mission opaque ID protocol #20834

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

peterbarker
Copy link
Contributor

@peterbarker peterbarker commented May 24, 2022

This PR (now) fills in new extension fields in several messages, according to discussions here: mavlink/mavlink#2012

The MISSION_ACK when you have successfully uploaded a mission now contains an opaque ID for the loaded mission.
The MISSION_CURRENT message has three extra fields, one for each of the standard three MAV_MISSION_TYPEs (mission, rally, fence). We only support this field on mission at this point.
The MISSION_COUNT message also gets a new opaque_id field. A GCS can associate the downloaded mission with the opaque_id in the MISSION_COUNT message they receive back from the autopilot. The GCS MUST check the opaque ID hasn't changed after they have all the items if it wants to be sure the downloaded mission is intact!

Replaces #17370 - nothing changed, I just stuffed up when I went to reopen things.

@peterbarker
Copy link
Contributor Author

Relevant discussion: mavlink/mavlink#1171

Copy link
Contributor

@tridge tridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need GCS implementation and testing


checksum = 0;
const uint16_t count = mission.num_commands();
for (uint16_t i=1; i<count; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment on time to run on a F427 with at least 500 WPs.

@hamishwillee
Copy link
Contributor

@peterbarker When do you plan on running a test(s) with many waypoints to evaluate the cost of this operation?

A checksum generated on the vehicle is logically the safest way to check missions match, so if this is feasible, it would be a good thing to do. It also depends on how long it takes/whether it might be done in the background. While we'd like an instant update of mission changed, it it took 10 seconds, I'm not sure that this would be the end of the world if this was well understood by implementers.

If not, it would be good to know so we can look at other alternatives that meet the same use case. Various ideas have floated around:

  • The checksum is generated on GCS or companion and supplied to the vehicle in MISSION_ACK during upload (or a brand new mission item made for this purpose).
  • The vehicle emits a UUID rather supplied in the mission upload, probably as a new mission item. This allows a GCS to know if the mission is the one it just uploaded, and others to know it is not - and it can be cached. The mission item could be part of the plan, so a saved mission that was reloaded would be known to be the same.

We didn't like those options as much because we didn't want to mess with the existing mission items or mission messages, and because using a checksum means that you know two mission/plans are the same without having to trust some data stored in the mission. But if we need to look at these approaches it would be better to know sooner rather than later.

@peterbarker
Copy link
Contributor Author

peterbarker commented Sep 20, 2022

@peterbarker When do you plan on running a test(s) with many waypoints to evaluate the cost of this operation?

AP: Took 2.178000ms to checksum 373 points (5.839142ms/1000 points)

That's obviously not something we can do in the main thread! That's going to complicate the implementation.

Should be feasible to do the calculation in a thread - I'll see how it looks.

@peterbarker peterbarker force-pushed the pr/mission-checksum branch 2 times, most recently from 7054e46 to 8d53ee0 Compare September 21, 2022 05:53
@peterbarker
Copy link
Contributor Author

I've modified MAVProxy and pymavlink to use the checksum feature to avoid uploading a mission if it is identical to the one on the vehicle: ArduPilot/MAVProxy#1104

MANUAL> wp load Tools/autotest/ArduPlane_Tests/MISSION_CHECKSUM/text.txt
MANUAL> Loaded 373 waypoints from Tools/autotest/ArduPlane_Tests/MISSION_CHECKSUM/text.txt
Got COMMAND_ACK: REQUEST_MESSAGE: ACCEPTED
< MISSION_CHECKSUM {mission_type : 0, checksum : 1752152414}
Mission checksum mismatch
Got MISSION_ACK: TYPE_MISSION: ACCEPTED

MANUAL> 
MANUAL> 
MANUAL> 
MANUAL> Loaded 373 waypoint in 7.46s
Sent all 373 waypoints
Got MISSION_ACK: TYPE_MISSION: ACCEPTED
AP: Flight plan received
< MISSION_CHECKSUM {mission_type : 0, checksum : 4264322242}
Flight battery 100 percent

MANUAL> 
MANUAL> 
MANUAL> wp load Tools/autotest/ArduPlane_Tests/MISSION_CHECKSUM/text.txt
MANUAL> Loaded 373 waypoints from Tools/autotest/ArduPlane_Tests/MISSION_CHECKSUM/text.txt
Got COMMAND_ACK: REQUEST_MESSAGE: ACCEPTED
< MISSION_CHECKSUM {mission_type : 0, checksum : 4264322242}
Mission checksum match

@peterbarker
Copy link
Contributor Author

The patches in here are now updating the checksum progressively to avoid the scheduling problems.

@peterbarker
Copy link
Contributor Author

@hamishwillee we discussed this in today's DevCall.

This is now end--to-end working.

We're not sure of the utility of this in the face of mission-transfer-via-ftp, however. Could you summarize the arguments for this feature, please?

@hamishwillee
Copy link
Contributor

Thanks for taking the time to do this work.

MISSION_CHECKSUM is what tells you that the mission, geofence or rally point is different or has changed on the vehicle and needs to be re-synced.
It is required in order to support plan synchronization in vehicles, companions, and GCS, in particular in MAVLink networks where there are multiple components that might be making changes.

Using MAVFTP to upload/download a mission is a lot more efficient than the standard protocol, but:

  • not much use if you don't know the mission has changed.
  • is an expensive operation to make if you don't need to. That is particularly true in multiple GCS systems.

MAVFTP is great if you only have a single GCS and Vehicle that might affect plans. It's fast enough that on connection you might just decide that you'll either swamp the vehicle mission, or upload the vehicle mission.
Of course even if that takes 1 second, that's still longer than the time taken to get the checksum and decide you don't need to bother.

Upshot,

  • you need something like this if you want to efficiently make sure that multiple systems can remain synchronised with a mission. It doesn't have to be a checksum, but that is the "cleanest" way.
  • even if you can choose to always just upload/download a mission, this is more efficient.

@tridge
Copy link
Contributor

tridge commented Sep 28, 2022

@hamishwillee there should not be a tie between the GCS checksum and autopilot checksum. The checksums just won't match for ArduPilot, as we don't store all the mission item fields in ArduPilot, so the checksum calculated on the GCS for a mission file will not match the one from the autopilot.
If we're going to have a checksum it should be an opaque algorithm, do the GCS can tell it has changed, but cannot use it to tell if a file they have locally is the same as one on the autopilot
this will also make it a whole lot cheaper to compute, as we can directly CRC the memory mapper storage of the mission in the autopilot instead of reformatting all the time into the mavlink format

@tridge tridge removed the DevCallEU label Sep 28, 2022
@hamishwillee
Copy link
Contributor

@tridge Thanks very much. That's disappointing.

There's an additional complication; if the GCS can't infer the same checksum then it can't know that a particular emitted checksum is the result of a mission it just uploaded. Therefore it would have no way of knowing the checksum matched it.

To meet the use case of knowing "has the mission changed" either:

  1. The autopilot either has to tell the GCS that "this" particular checksum was set by it, or
  2. The GCS has to tell the autopilot what checksum it should use.

I kind of like option 2 because

  1. The GCS has more grunt to calculate a checksum and I presume is more likely to do so consistently across missions than different flight stacks.
  2. If you embed the checksum in a MAVCMD in the mission - such as MAV_CMD_CHECKSUM - then this will live in the mission on the autopilot whether it is uploaded using MAVFTP or the normal MAVLink protocol. Upload or download, the value persists.
  3. We could reasonably consider not using a checksum but instead using a UUID or similar. I prefer checksums as they allow two "like" missions to be the same, but a UUID would serve the purpose.

The rest of the process could be the same.

Might that work?

Re option 1 I'm not sure how you'd inform the GCS that it was from "X" system/component ID, and I'm not sure if you're using an opaque checksum of the autopilot mission you might not be just better off using UUIDs.

@timtuxworth
Copy link
Contributor

It does seem that this is "poor mans signing". For commercial/secure applications, the ideal solution is that the mission should be able to be cryptographically signed on the GCS and the signature validated on the autopilot. If there are data or algorithm issues with this, then those issues should be solved. This is the utopian solution, but the only way in the end to securely, reliably and completely solve the problem.

@hamishwillee
Copy link
Contributor

@timtuxworth The proposal and use case has nothing to do with signing (at least for me). It is merely a way to allow any system on the network that is interested in the plan on the vehicle to know if it has a matching version or needs to update.

I'm not convinced of the value of a signed mission plan. If there is a need for the drone to only accept trusted plans wouldn't it make more sense to establish trust with the supplier and supplying channel than overlay this requirement on any (every) particular data you want to share?

If you @tridge @peterbarker strongly feel this is a requirement then let me know, otherwise answer to suggestion at #20834 (comment) appreciated.

@peterbarker
Copy link
Contributor Author

@tridge Thanks very much. That's disappointing.

just to clarify with this current PR.

A GCS can't create a mission, checksum it, ask the autopilot what checksum it has and expect it to ever match what it came up with. ArduPilot can not round-trip the entire mission_item_int sequence as we don't store the vast majority of the data that can be present in MISSION_ITEM_INT.

What a GCS can do is checksum a mission it has downloaded from the autopilot, ask the autopilot what it's got on it and expect that to match. This is because once the mission has been round-tripped to the autopilot the GCS has a copy of the mission which has been filtered through the autopilot's storage mechanisms.

The suggestion that came out of the DevCall yesterday is to just go fully opaque - so the GCS never checksums anything, just has some sort of autopilot-generated token which can be used to uniquely identify a mission configuration (almost certainly a checksum).

Note that a mission can be modifed on the autopilot through many means - e.g. partial upload, scripting, custom patches - so adding a UUID in the sense of a randomly-generated number to e.g. MISSION_COUNT doesn't really make sense as something for the vehicle to store alongside the stored mission.

If we were to go down the "autopilot unique token (probably checksum)" route, we could potentially solve the race condition (vs other GCSs changing the mission) issues by having the GCS include an opaque session identifier as an extension in the MISSION_COUNT packet used to initiate the transfer. This will not help if the user transfers the file via ftp - but, then, ftp already has a CRC function built-in (which we could already use to avoid the FTP transfer if we did that implementation in the GCSs!)

@timtuxworth
Copy link
Contributor

timtuxworth commented Oct 5, 2022

@timtuxworth a lot of what you desire there is really outside the scope of what the autopilot can be expected to do - particularly in regards to workflows, which are more the purview of a GCS and/or fleet management software.

The use cases don't specify the solution @peterbarker - only the requirements. But if you think about the solution for these, it's hard to see how many of these requirements could be satisfied by the GCS only, without some cooperation (and changes) to how the autopilot does things. I'm hoping that these considerations can be taken into account. In the end the goal is to make "ArduPilot" (end to end solution, not just the FC) better.

Just a reminder for anyone joining the thread, I've created a forum post here https://discuss.ardupilot.org/t/mission-configuration-data-item/91382 that discusses these requirements. Perhaps I've tagged it wrong, it doesn't seem to be getting much attention :-(

@timtuxworth
Copy link
Contributor

timtuxworth commented Oct 5, 2022

Current 2022-10-05 17:05

New theory is that we will not have a mission specific to the checksum, rather inform the GCS of the ID of the mission currently being executed in an existing message. The ID would also be present in the final MISSION_ACK for a mission tansfer (presumably only upon success).

So does this mean there will also be a "MISSION_TYPE" mission item stored in the vehicle? (I'm assuming that, otherwise I don't see where you store the unique_id value and the other values you are suggesting).

If so, you really have just created a cut down version of the "Mission Configuration" I was suggesting, but having done that, why not create a true immutable "unique_id" for the mission, in addition to one that changes when the mission is updated? It would sure help in the long run.

@timtuxworth
Copy link
Contributor

timtuxworth commented Oct 5, 2022

@timtuxworth Wr.t. your last message above ...

It is likely that ArduPilot knows that it doesn't support certain parameters in certain commands and chooses not to store those. SO if the params have values in the GCS mission those will be thrown away. It is also possible that some mission items are accepted (don't fail an upload) but discarded. The obvious solution is to include the data even if you don't use it but that is costly.

The algorithm is documented in the message:

The checksum for a mission, geofence or rally point definition is run over each item in the plan in seq order (excluding the home location if present in the plan), and covers the following fields (in order): frame, command, autocontinue, param1, param2, param3, param4, param5, param6, param7.

If you run that algorithm and the GCS has values for the fields you will get a different result.

I don't know the history of this, but just reading this, it seem that if the definition algorithm is changed to only include fields that will be common on the FC and GCS, it should be possible to guarantee the same result for the same mission. What am I missing here?

@hamishwillee
Copy link
Contributor

If you run that algorithm and the GCS has values for the fields you will get a different result.

I don't know the history of this, but just reading this, it seem that if the definition algorithm is changed to only include fields that will be common on the FC and GCS, it should be possible to guarantee the same result for the same mission. What am I missing here?

The GCS does not know/should not need to know what parameters the autopilot implements for each and every command. If even one command in the mission has one parameter that is not stored the checksum breaks.

There are ways this might be fixed, but autopilots aren't willing to pay those either. For example, a flight stack might reject any command that has anything other than the "invalid value" for an unused parameter. That way the mission is either stored with a parameter or is invalid and could be ignored by the algorithm. But flight stacks/GCSs generally don't use the defaults properly and much might break while trying to enforce this.

@hamishwillee
Copy link
Contributor

@peterbarker Last addition in #20834 (comment) sounds good.

  • UID returned in MISSION_ACK as extension field. Current GCS knows the ID at that point and can latch it.
  • UID streamed in MISSION_CURRENT as extension field. If it doesn't match GCS value then you know your mission is out of date.
  1. I assume this generalise to MAVFTP based mission upload - i.e. the GCS would get the checksum from MAVFTP for the UID and this is what it would store. This is also what MAVLINK would put in MISSION_CURRENT?
  2. My only concern here w.r.t. the original MISSION_CHECKSUM is that we had separate ids for the parts of the plan - mission, geofence, rally point. The MISSION_ACK is per part of the plan, so you will get a separate ID for each part of the plan, but you only added one ID to MISSION_CURRENT. Thoughts?

@tridge tridge removed the DevCallEU label Oct 12, 2022
@peterbarker peterbarker changed the title Implement MISSION_CHECKSUM Implement new mission opaque ID protocol Jul 1, 2023
@peterbarker peterbarker force-pushed the pr/mission-checksum branch 2 times, most recently from 0c963ba to 9dd3fd4 Compare July 1, 2023 04:50
@peterbarker
Copy link
Contributor Author

Probably worth adding in some defines for this functionality. If we add support for Fence/Rally it will grow several hundred more bytes.

Cost:

Board               AP_Periph  blimp  bootloader  copter  heli  iofirmware  plane  rover  sub
Durandal                       1568   *           1568    1568              1568   1568   1568
HerePro             128                                                                   
Hitec-Airspeed      *                                                                     
KakuteH7-bdshot                888    *           888     888               880    888    912
MatekF405                      880    *           880     888               880    888    904
Pixhawk1-1M-bdshot             888                888     888               888    880    904
f103-QiotekPeriph   *                                                                     
f303-Universal      0                                                                     
iomcu                                                           *                         
revo-mini                      888    *           888     880               888    888    912
skyviper-v2450                                    888                                     

@hamishwillee
Copy link
Contributor

@peterbarker How's this progressing? Note that mavlink/mavlink#2029 is also in play which also affects MISSON_CURRENT.

@peterbarker
Copy link
Contributor Author

@peterbarker How's this progressing? Note that mavlink/mavlink#2029 is also in play which also affects MISSON_CURRENT.

So I rebased and started to look at this again.

We should rename rally_point to rally_id in the new fields.

Bigger problem is that this doesn't play well with MAVFTP transfer of items AFAICS. Can't transfer the opaque ID in the packed format - there's no extensibility here.

https://github.com/ardupilot/MAVProxy/blob/master/MAVProxy/modules/lib/mission_item_protocol.py#L1029

            magic2, dtype, options, start, num_items = struct.unpack("<HHHHH", buf)
.
.
        data = data[10:]
        mavmsg = mavutil.mavlink.MAVLink_mission_item_int_message
        item_size = mavmsg.unpacker.size
        while len(data) >= item_size:

We'd need to adjust the ArduPilot FTP code so that you can request the mission from a slightly different URL to get it with the opaque ID.

Copy link
Contributor

@tridge tridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this be done as an io callback, where it bails if the mission_change_counter changes, and we increment mission_change_counter on any change


const uint32_t mission_last_change_time_ms = mission.last_change_time_ms();

if (mission_last_change_time_ms == checksum_state.last_calculate_time_ms) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there could be more than one change in a ms, especially with scripting changing missions, or DDS

@tridge tridge removed the DevCallEU label Jul 31, 2024
@hamishwillee
Copy link
Contributor

hamishwillee commented Aug 1, 2024

@peterbarker

We should rename rally_point to rally_id in the new fields.

The current docs indicate this is MISSION_CURRENT.rally_points_id - what are you synchronized against?

This is also in PX4 now PX4/PX4-Autopilot#21839

Bigger problem is that this doesn't play well with MAVFTP transfer of items AFAICS. Can't transfer the opaque ID in the packed format - there's no extensibility here.

This is stored on the file system as a file right? With MAVFTP you can get the file CRC (CalcFileCRC32) from the vehicle, which implies that it is calculated on the vehicle.
So why can't GCS retrieve the file CRC as part of an upload of a plan and use some part of that for othe ID?
Similarly, ArduPilot can emit that same value.

Is the theory sound? It might not be complete, but this is the way I was hoping you'd handle it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

6 participants