Release 2019.04 - RC1 #113

Closed
danpetry opened this issue Apr 5, 2019 · 19 comments

Comments

@danpetry
Contributor

danpetry commented Apr 5, 2019

This issue is for discussion related to testing and bugfixing of Release Candidate 1 of the 2019.04 release.

To track testing, please use the tracking spreadsheet, which contains the list of tests, a column for your initials if you're testing something, and the pass/fail status for your test. This should hopefully be a better way of capturing information than a checklist and inline conversation; let me know if it's not.

https://docs.google.com/spreadsheets/d/18k5rijHnoFEC3AZA5Ur3_nRjwban-5na6R2bkDu2Jac/edit?usp=sharing

There is a "test failures and bugfixes" tab in the spreadsheet, which is there to capture the progress of bugfixes coming out of the tests. This is optional for you, as I'll keep it updated, but hopefully it will also be useful.

There are also folders for you to drop test artefacts in, if there are any:

https://drive.google.com/drive/folders/14EwQPQM7zN0Go5SJHBtIOBF6qN8L7vig?usp=sharing

@miri64
Member

miri64 commented Apr 9, 2019

Why is 1.1 marked as failed? As far as I can see, the linked Murdock output only fails on the tests for tests/pkg_c25519, but not when compiling any of those applications.

@danpetry
Contributor Author

danpetry commented Apr 9, 2019

Ok thank you for the clarification!

@jia200x
Member

jia200x commented Apr 10, 2019

Test 11.3 (LoRaWAN abp) fails. It freezes:

main(): This is RIOT! (Version: 2019.07-devel-HEAD)
All up, running the shell now
> loramac set nwkskey B74B805FAFC3E10B81E7A9015E67B43C
loramac set nwkskey B74B805FAFC3E10B81E7A9015E67B43C
> loramac set appskey 59A6C0AA0870E53CCAD8E4EB7647D10C
loramac set appskey 59A6C0AA0870E53CCAD8E4EB7647D10C
> loramac set rx2_dr 3
loramac set rx2_dr 3
> loramac join abp
loramac join abp
Join procedure succeeded!
> loramac tx hola
loramac tx hola
help

/* Crickets... */

EDIT: I will check what's going on there

@miri64
Member

miri64 commented Apr 10, 2019

Meta-comment: While I think that the spreadsheet could help with the syncing problem we faced in the past, it is a bit impractical in the regard that it doesn't provide links to the tasks, as the checklist did.

@miri64
Member

miri64 commented Apr 10, 2019

I'm investigating why 4.3 is failing

@miri64
Member

miri64 commented Apr 10, 2019

@danpetry regarding 4.7-8: did you compile the arduino-zero application with USEMODULE=xbee?

@danpetry
Contributor Author

@miri64 I used the automated scripts, haven't checked further yet

@miri64
Member

miri64 commented Apr 10, 2019

> I'm investigating why 4.3 is failing

RIOT-OS/RIOT#9523 introduced the regression. I'm trying to find out why later.

@danpetry
Contributor Author

> Meta-comment: While I think that the spreadsheet could help with the syncing problem we faced in the past, it is a bit impractical in the regard that it doesn't provide links to the tasks, as the checklist did.

Putting hyperlinks in now. Is this adequate? Can revert back to the checklist if spreadsheet is not helping overall

@miri64
Member

miri64 commented Apr 10, 2019

> Putting hyperlinks in now. Is this adequate? Can revert back to the checklist if spreadsheet is not helping overall

At least for the problem I pointed out it does. I'll open an issue to discuss the structure / tool for organizing the testing further. This way we get the discussion out of the way of the actual testing discussion.

@miri64
Member

miri64 commented Apr 10, 2019

Regarding 4.3

> RIOT-OS/RIOT#9523 introduced the regression. I'm trying to find out why later.

The problem is that with RIOT-OS/RIOT#9523 ping is now asynchronous. So instead of sending packets in an interval [RTT of previous packet] + [given interval] we now burst out the packets every 100ms pretty much precisely. However, the round-trip time of a 1KB packet in 6LoWPAN is ~140ms, so we are already filling up the packet buffer with a second packet while we still wait for the reply from the last (maybe even a third, I did not analyze it that deeply). Because of that the packet buffer fills up, resulting in not enough space for the reply. If I change the test parameters to 500 packets every 200ms, it works. So I'd say given the use case the test specification is broken, we just did not realize since the implementation used to be wrong as well (obviously 100ms + RTT are not 100ms ;-)).
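A rough sketch of the timing argument above, in Python. It is a simplified model, not RIOT code: it assumes a perfectly fixed send interval and a constant RTT, and the function names are made up for illustration. Only the 100 ms / 200 ms intervals and the ~140 ms RTT come from the comment.

```python
import math

# Approximate RTT of a 1 KB packet over 6LoWPAN, as quoted in the comment.
RTT_MS = 140


def old_effective_interval_ms(interval_ms, rtt_ms):
    """Spacing of the old synchronous ping: wait for the reply, then wait the interval."""
    return rtt_ms + interval_ms


def outstanding_requests(interval_ms, rtt_ms):
    """Echo requests still awaiting a reply right after a new one is sent.

    Assumes the asynchronous ping sends exactly every interval_ms and that
    every reply arrives exactly rtt_ms after its request.
    """
    return max(1, math.ceil(rtt_ms / interval_ms))


print(old_effective_interval_ms(100, RTT_MS))  # old behaviour: 240 ms between sends
print(outstanding_requests(100, RTT_MS))       # new behaviour at 100 ms: 2 in flight
print(outstanding_requests(200, RTT_MS))       # new behaviour at 200 ms: 1 in flight
```

Under these assumptions, a 100 ms interval always keeps a second request in flight before the first reply is back, while 200 ms leaves at most one outstanding, which matches the observation that 200 ms works.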

@jia200x
Member

jia200x commented Apr 10, 2019

Ok, LoRaWAN ABP test passes.
But the devaddr was wrong and the MAC layer tried to retry on DR0. This means the MAC layer was blocked for ~(50 * NUM_OF_RETRANS) seconds, which is quite a lot. I think we should slowly try to make it asynchronous (something similar to RIOT-OS/RIOT#11022).
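To get a feel for the estimate above, here is a small Python sketch of the arithmetic. The ~50 seconds per attempt is the rough figure from the comment; the retransmission counts in the loop are hypothetical example values, not the actual NUM_OF_RETRANS configured in the test.

```python
# Rough per-attempt blocking time on DR0, taken from the comment (an estimate).
SECONDS_PER_ATTEMPT = 50


def blocked_seconds(num_of_retrans, seconds_per_attempt=SECONDS_PER_ATTEMPT):
    """Approximate time the MAC layer stays blocked while retrying synchronously."""
    return seconds_per_attempt * num_of_retrans


# Hypothetical retransmission counts, for illustration only.
for retrans in (1, 4, 8):
    print(retrans, blocked_seconds(retrans))
```

Even a handful of retransmissions puts the blocking time in the range of minutes, which is the motivation for making the procedure asynchronous.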

@miri64
Member

miri64 commented Apr 10, 2019

> At least for the problem I pointed out it does. I'll open an issue to discuss the structure / tool for organizing the testing further. This way we get the discussion out of the way of the actual testing discussion.

See #120

@miri64
Member

miri64 commented Apr 10, 2019

> The problem is that with RIOT-OS/RIOT#9523 ping is now asynchronous. So instead of sending packets in an interval [RTT of previous packet] + [given interval] we now burst out the packets every 100ms pretty much precisely. However, the round-trip time of a 1KB packet in 6LoWPAN is ~140ms, so we are already filling up the packet buffer with a second packet while we still wait for the reply from the last (maybe even a third, I did not analyze it that deeply). Because of that the packet buffer fills up, resulting in not enough space for the reply. If I change the test parameters to 500 packets every 200ms, it works. So I'd say given the use case the test specification is broken, we just did not realize since the implementation used to be wrong as well (obviously 100ms + RTT are not 100ms ;-)).

Mhhh... the packet buffer doesn't seem to be the problem after all :-/ I still don't get any replies if I make it bigger, and it isn't even 50% full (note the position of the last used byte):

1554900331.956012;m3-101;> pktbuf
1554900331.956204;m3-101;packet buffer: first byte: 0x20001b08, last byte: 0x20004acc (size: 12228)
1554900331.957131;m3-100;packet buffer: first byte: 0x20001b08, last byte: 0x20004acc (size: 12228)
1554900331.957323;m3-100;  position of last byte used: 5832
1554900331.957611;m3-101;  position of last byte used: 5728
1554900331.958410;m3-100;~ unused: 0x20001b08 (next: 0, size: 12228) ~
1554900331.958703;m3-101;~ unused: 0x20001b08 (next: 0, size: 12228) ~
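The "not even 50% full" reading can be checked directly from the log. The Python sketch below parses the relevant lines (copied from the output above) and computes the used fraction per node. The regular expressions and the function name are assumptions made for this example and only match the line format shown here.

```python
import re

# Relevant lines copied from the log above (timestamp;node;message).
LOG = """\
1554900331.956204;m3-101;packet buffer: first byte: 0x20001b08, last byte: 0x20004acc (size: 12228)
1554900331.957131;m3-100;packet buffer: first byte: 0x20001b08, last byte: 0x20004acc (size: 12228)
1554900331.957323;m3-100;  position of last byte used: 5832
1554900331.957611;m3-101;  position of last byte used: 5728
"""

SIZE_RE = re.compile(r"first byte: (0x[0-9a-f]+), last byte: (0x[0-9a-f]+) \(size: (\d+)\)")
USED_RE = re.compile(r"position of last byte used: (\d+)")


def pktbuf_usage(log):
    """Return {node: highest used position / buffer size} for each node in the log."""
    sizes, used = {}, {}
    for line in log.splitlines():
        _timestamp, node, message = line.split(";", 2)
        size_match = SIZE_RE.search(message)
        used_match = USED_RE.search(message)
        if size_match:
            first = int(size_match.group(1), 16)
            last = int(size_match.group(2), 16)
            size = int(size_match.group(3))
            # Sanity check: the two addresses should span the reported size.
            if last - first != size:
                raise ValueError(f"inconsistent buffer size for {node}")
            sizes[node] = size
        elif used_match:
            used[node] = int(used_match.group(1))
    return {node: used[node] / sizes[node] for node in used}


for node, fraction in sorted(pktbuf_usage(LOG).items()):
    print(f"{node}: {fraction:.1%}")
```

Both nodes come out below 50% of the 12228-byte buffer, consistent with the conclusion that the packet buffer itself is not the bottleneck.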

@danpetry
Contributor Author

> @danpetry regarding 4.7-8: did you compile the arduino-zero application with USEMODULE=xbee?

Yes, this module is being compiled

@miri64
Member

miri64 commented Apr 10, 2019

It seems to be a resource problem after all. The reassembly buffer just runs full on both ends, and the at86rf2xx driver seems to have problems receiving everything thrown at it (handling both messages sent and received at once). When just sending UDP packets (that do not solicit a reply) at the same interval and size, everything is fine. So since the new ping6 command is actually not broken, but the system just can't handle the newer, faster way of pinging another node, I'd still suggest that we change the test parameters to something more careful. The test is about testing the fragmentation after all, not about how fast the fragmentation can work under stress.

@kb2ma
Member

kb2ma commented Apr 11, 2019

No issues with CoAP tests.

@miri64
Member

miri64 commented Apr 15, 2019

No issue in Task 10.

@danpetry
Contributor Author

Closing in favor of #124
