Calling set_rx_rate() doesn't effect on N310 with UHD 4.1 #486

Closed
andrepuschmann opened this issue Sep 21, 2021 · 9 comments

Comments

@andrepuschmann
Contributor

Issue Description

This is something of a continuation of #449, in the effort to get the N310 working reliably with srsRAN.

This time we've noticed that the sample rates aren't applied correctly. Perhaps the calls are made in the wrong order, or something similar is wrong. The result is that the rate stays at 1.92 MSps regardless of the configured value (usually a multiple of that).

Setup Details

  • N310
  • UHD_4.1.0.2-1-gceac1bdd

Expected Behavior

Rx rate to get applied as specified.

Actual Behaviour

With the srsRAN benchmark_radio testcase (which doesn't do any DSP and just uses the radio module to talk to the radio), the issue becomes clear when looking at the timestamps. The app receives 1 ms worth of samples per call, i.e. 15360 samples when sampling at 15.36e6. We expect the Rx timestamp to advance by 0.001 s in this case, but it doesn't. The rate stays at 1.92e6.

$ ./lib/src/radio/test/benchmark_radio -s 15.36e6 -t 1 -g 20 -v
duration=1.000000
srate=15360000.000000
Instantiating objects and allocating memory...
mlockall: Cannot allocate memory
Initialising instances...
Opening 1 channels in RF device= with args=default
[INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_4.1.0.2-1-gceac1bdd
[INFO] [LOGGING] Fastpath logging disabled at runtime.
Opening USRP channels=1, args: type=n3xx,master_clock_rate=122.88e6
[INFO] [UHD RF] RF UHD Generic instance constructed
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=10.12.1.187,type=n3xx,product=n310,serial=317F537,fpga=HG,claimed=False,addr=10.12.1.187,master_clock_rate=122.88e6
[WARNING] [MPM.RPCServer] A timeout event occured!
[INFO] [MPM.PeriphManager] init() called with device args `fpga=HG,master_clock_rate=122.88e6,mgmt_addr=10.12.1.187,product=n310,clock_source=internal,time_source=internal'.
default handler->rx_rate 15360000.000000
default handler->set_tx_rate 15360000.000000
Start capturing 1000 sub-frames of 15360 samples (approx. 1s) ...
rf_uhd_recv_with_time_multi() nsamples=15360
rf_uhd_recv_with_time_multi() nsamples=15360
rf_uhd_recv_with_time_multi() nsamples=15360
rf_uhd_recv_with_time_multi() nsamples=15360
rf_uhd_recv_with_time_multi() nsamples=15360
[INFO]: Timestamp gap (964527 samples) detected! Frame 5/1000. ts=0.260983748+0.063794751=0.324778499
rf_uhd_recv_with_time_multi() nsamples=15360
rf_uhd_recv_with_time_multi() nsamples=15360
rf_uhd_recv_with_time_multi() nsamples=15360
[INFO]: Timestamp gap (968441 samples) detected! Frame 8/1000. ts=0.326778499+0.064049520=0.390828019
...
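
A minimal sketch of the timestamp arithmetic behind the log above. It is not the srsRAN implementation; the function names are hypothetical, and the gap formula is an assumption that happens to reproduce the two gap sizes printed in the log when fed the timestamp deltas shown there.

```python
def expected_advance_s(nsamples: int, srate_hz: float) -> float:
    """Time the Rx timestamp should advance after receiving nsamples."""
    return nsamples / srate_hz


def gap_in_samples(ts_delta_s: float, nsamples: int, srate_hz: float) -> int:
    """Samples missing between two consecutive receive calls.

    ts_delta_s is the observed timestamp difference between the calls.
    Assumption: the gap is the observed advance (in samples at the
    configured rate) minus the number of samples actually delivered.
    """
    return round(ts_delta_s * srate_hz) - nsamples


if __name__ == "__main__":
    srate = 15.36e6
    nsamples = 15360

    # 15360 samples at 15.36 MS/s span 1 ms ...
    print(expected_advance_s(nsamples, srate))
    # ... but would span 8 ms if the device were still sampling at 1.92 MS/s.
    print(expected_advance_s(nsamples, 1.92e6))

    # Timestamp deltas taken from the two "Timestamp gap" log lines.
    print(gap_in_samples(0.063794751, nsamples, srate))
    print(gap_in_samples(0.064049520, nsamples, srate))
```

Note that the gaps in the log (roughly 64 ms) are much larger than the 8 ms a plain 1.92 MS/s rate would explain, so this arithmetic alone does not identify the cause.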

Steps to reproduce the problem

I couldn't reproduce the issue with UHD's benchmark_rate, so it seems to be an issue with how we call the API. But again, the same code works nicely with an X310, and UHD doesn't complain about anything.

@mbr0wn mbr0wn added the bug label Sep 21, 2021
@mbr0wn
Contributor

mbr0wn commented Sep 21, 2021

@andrepuschmann Is this a patched version of benchmark_radio? Or can we clone srsRAN and try this directly ourselves?

@andrepuschmann
Contributor Author

> @andrepuschmann Is this a patched version of benchmark_radio? Or can we clone srsRAN and try this directly ourselves?

It should be identical to https://github.com/srsran/srsRAN/blob/master/lib/src/radio/test/benchmark_radio.cc; I don't recall any major changes. I just decorated the function calls a bit to print values.

@manderseck
Contributor

@andrepuschmann I could reproduce the issue with the latest UHD and srsRAN masters. I'll see if I can find what the difference between the devices is. I can confirm that it works properly with the B210 and X300, but the N310 shows the timestamp gaps.

@mbr0wn
Contributor

mbr0wn commented Sep 23, 2021

@manderseck Can you check the order in which the rate is set, the streamers are created, etc.? Any order should work, but I have a feeling this is related.
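
One way to narrow this down is to read the rate back right after setting it, and again after each later step such as creating a streamer. The sketch below assumes only that the radio object offers `set_rx_rate(rate, chan)` and `get_rx_rate(chan)`, as UHD's multi_usrp interface does. `FakeUsrp` and `set_and_verify_rx_rate` are hypothetical names used so the example runs without hardware; they are not part of UHD or srsRAN.

```python
class FakeUsrp:
    """Hypothetical stand-in for a radio object; not part of UHD."""

    def __init__(self, honours_requests: bool = True):
        self._rate_hz = 1.92e6  # pretend power-on rate
        self._honours_requests = honours_requests

    def set_rx_rate(self, rate_hz: float, chan: int = 0) -> None:
        # A device stuck in a bad state would silently keep its old rate.
        if self._honours_requests:
            self._rate_hz = rate_hz

    def get_rx_rate(self, chan: int = 0) -> float:
        return self._rate_hz


def set_and_verify_rx_rate(usrp, rate_hz: float, chan: int = 0,
                           rel_tol: float = 1e-6) -> float:
    """Set the Rx rate, read it back, and fail loudly on a mismatch."""
    usrp.set_rx_rate(rate_hz, chan)
    actual_hz = usrp.get_rx_rate(chan)
    if abs(actual_hz - rate_hz) > rel_tol * rate_hz:
        raise RuntimeError(
            f"requested {rate_hz:.0f} S/s on channel {chan}, "
            f"device reports {actual_hz:.0f} S/s"
        )
    return actual_hz


if __name__ == "__main__":
    print(set_and_verify_rx_rate(FakeUsrp(), 15.36e6))
    try:
        set_and_verify_rx_rate(FakeUsrp(honours_requests=False), 15.36e6)
    except RuntimeError as err:
        print("mismatch:", err)
```

A read-back only shows what the driver reports. Comparing it with the timestamp advance per received block would show whether the hardware actually follows.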

@andrepuschmann
Contributor Author

Hey, just wondering if you guys have had a chance to investigate further and have a time estimate for a possible fix? We'd be happy to test here.
Thanks

@manderseck
Contributor

Sorry for the delay, I had a few other things on my plate but I have this issue on my list for tomorrow (Tuesday). Will get back to you as soon as I know more.

@manderseck
Contributor

@andrepuschmann When I confirmed that I could reproduce the issue, I saw an overrun, which of course also causes gaps. The overrun was caused by an interface that was too slow (I had used the RJ45 connector). When I continued the investigation, I first tried smaller multiples of 1.92 MS/s, which worked well. Using the SFP port (with an adapter to RJ45), I'm able to get the N310 running your test at 15.36 MS/s (17.28 MS/s was possible, too):

$ ./lib/src/radio/test/benchmark_radio -s 15.36e6 -t 1 -g 20 -v -a addr=192.168.10.2
Instantiating objects and allocating memory...
mlockall: Cannot allocate memory
Initialising instances...
Opening 1 channels in RF device= with args=addr=192.168.10.2
[INFO] [UHD] linux; GNU C++ version 7.5.0; Boost_106501; UHD_4.1.0.2-0-g01575510
[INFO] [LOGGING] Fastpath logging disabled at runtime.
Opening USRP channels=1, args: addr=192.168.10.2,type=n3xx,master_clock_rate=122.88e6
[INFO] [UHD RF] RF UHD Generic instance constructed
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=192.168.10.2,type=n3xx,product=n310,serial=3195687,fpga=HG,claimed=False,addr=192.168.10.2,master_clock_rate=122.88e6
[INFO] [MPM.PeriphManager] init() called with device args `fpga=HG,master_clock_rate=122.88e6,mgmt_addr=192.168.10.2,product=n310,clock_source=internal,time_source=internal'.
Start capturing 1000 sub-frames of 15360 samples (approx. 1s) ...
Finished streaming with 0 gaps, 0 late timestamps, 0 overflows, 0 underflow...
Tearing down...
Ok!

This looks the same in UHD 4.1 and in the latest master. Therefore I assume that setting the tx_rate works; otherwise I would expect to see timing gaps in all scenarios where the tx_rate != default. Can you confirm that your N310 is connected via the SFP port? Do you always see gaps and overruns, or only occasionally?
Great test, by the way: it looks not only at the overruns but also at the timestamps!
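
For the interface question, a rough estimate of the sample payload each rate puts on the link can help. This is a sketch under stated assumptions: samples travel as sc16 (4 bytes per complex sample, UHD's usual over-the-wire format), and packet headers and protocol overhead are ignored, so the real requirement is somewhat higher. The function name is hypothetical.

```python
BYTES_PER_SAMPLE_SC16 = 4  # assumption: 16-bit I + 16-bit Q on the wire


def payload_mbit_per_s(srate_hz: float, channels: int = 1,
                       bytes_per_sample: int = BYTES_PER_SAMPLE_SC16) -> float:
    """Raw sample payload in Mbit/s for one direction, without overhead."""
    return srate_hz * channels * bytes_per_sample * 8 / 1e6


if __name__ == "__main__":
    # Rates mentioned in this thread; the last one uses two channels.
    for srate_hz, channels in [(1.92e6, 1), (15.36e6, 1),
                               (17.28e6, 1), (30.72e6, 2)]:
        mbit = payload_mbit_per_s(srate_hz, channels)
        print(f"{srate_hz / 1e6:6.2f} MS/s x {channels} ch -> {mbit:8.2f} Mbit/s")
```

These figures are a lower bound on link load only. They say nothing about what throughput a particular port on the device can sustain in practice.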

@andrepuschmann
Contributor Author

This is very weird. I've always used the 10GigE over SFP port with an optical fiber, on two different systems, and it never worked for rates above 1.92 Msps. I've now tried a few more things here and there, played with other UHD examples, and all of a sudden the same calls, which had never worked before, just work fine. I have no idea why, though.
So this is a 2 channel tx/tx with 30.72 Msps:

$ ./lib/src/radio/test/benchmark_radio -s 30.72e6 -t 1 -g 20 -v -p 2 -x
Instantiating objects and allocating memory...
mlockall: Cannot allocate memory
Initialising instances...
Opening 2 channels in RF device= with args=default
[INFO] [UHD] linux; GNU C++ version 7.5.0; Boost_106501; UHD_4.1.0.2-0-g8ce6e64f
[INFO] [LOGGING] Fastpath logging disabled at runtime.
Opening USRP channels=2, args: type=n3xx,master_clock_rate=122.88e6
[INFO] [UHD RF] RF UHD Generic instance constructed
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=192.168.10.2,type=n3xx,product=n310,serial=3195684,fpga=HG,claimed=False,addr=192.168.30.2,master_clock_rate=122.88e6
[INFO] [MPM.PeriphManager] init() called with device args `fpga=HG,master_clock_rate=122.88e6,mgmt_addr=192.168.10.2,product=n310,clock_source=internal,time_source=internal'.
[INFO] [MULTI_USRP]     1) catch time transition at pps edge
[INFO] [MULTI_USRP]     2) set times next pps (synchronously)
[WARNING] [0/Radio#0] Attempting to set tick rate to 0. Skipping.
[WARNING] [0/Radio#0] Attempting to set tick rate to 0. Skipping.
Setting manual TX/RX offset to 0 samples
Start capturing 1000 sub-frames of 30720 samples (approx. 1s) ...
Finished streaming with 0 gaps, 0 late timestamps, 0 overflows, 0 underflow...
Tearing down...
Ok!

But I honestly have no clue what might have caused this. Could the N310 have been in a weird state in which it didn't accept rate changes? Also, in your output above, I noticed that your dev_addr and mgmt_addr are the same. I have the ARM connected as well. Does this make a difference?

@manderseck manderseck removed the bug label Oct 5, 2021
@manderseck
Contributor

manderseck commented Oct 5, 2021

@andrepuschmann I'm glad the issue is resolved, even though we don't know exactly what caused it initially.
Regarding addr vs. mgmt_addr: it makes sense that they are the same in my output, as this is the only network connection I used. They were the same in your original output, too. I would have to check what implications different IPs in those two places could have. I'll close this issue, as it seems it's not a bug. Thanks anyway for reporting this and sharing your test with us.
