Based on IRTT (Isochronous Round-Trip Tester)
IRTT measures round-trip time, one-way delay and other metrics using UDP packets sent on a fixed period, and produces both user and machine parseable output.
IRTT has reached version 0.9.1. I would appreciate any feedback, which you can send under Issues. However, it could be useful to first review the Roadmap section of the documentation before submitting a new bug or feature request.
- Motivation
- Goals
- Features
- Limitations
- Installation
- Documentation
- Frequently Asked Questions
- Roadmap
- Changes
- Thanks
Latency is an under-appreciated metric in network and application performance. As of this writing, many broadband connections are well past the point of diminishing returns when it comes to throughput, yet that’s what we continue to take as the primary measure of Internet performance. This is analogous to ordinary car buyers making top speed their first priority.
There is a certain hard to quantify but visceral “latency stress” that comes from waiting in expectation after a web page click, straining through a delayed and garbled VoIP conversation, or losing at your favorite online game (unless you like “lag” as an excuse). Those who work on reducing latency and improving network performance characteristics beyond just throughput may be driven by the idea of helping relieve this stress for others.
IRTT was originally written to improve the latency and packet loss measurements for the excellent Flent tool, but should be useful as a standalone tool as well. Flent was developed by and for the Bufferbloat project, which aims to reduce "chaotic and laggy network performance," making this project valuable to anyone who values their time and sanity while using the Internet.
The goals of this project are to:
- Accurately measure latency and other relevant metrics of network behavior
- Produce statistics via both human and machine parseable output
- Provide for reasonably secure use on both public and private servers
- Support small enough packet sizes for VoIP simulation
- Support relevant socket options, including DSCP
- Use a single UDP port for deployment simplicity
- Keep the executable size small enough for use on embedded devices
- Provide an API for embedding and extensibility
- Measurement of:
- RTT (round-trip time)
- OWD (one-way delay), given external clock synchronization
- IPDV (instantaneous packet delay variation), usually referred to as jitter
- Packet loss, with upstream and downstream differentiation
- Out-of-order (measured using late packets metric) and duplicate packets
- Bitrate
- Timer error, send call time and server processing time
- Statistics: min, max, mean, median (for most quantities) and standard deviation
- One nanosecond time precision on Linux and OS/X, and 100ns on Windows
- Robustness in the face of clock drift and NTP corrections through the use of both wall and monotonic clocks
- Binary protocol with negotiated format for test packet lengths down to 16 bytes (without timestamps)
- HMAC support for private servers, preventing unauthorized discovery and use
- Support for a wide range of Go supported platforms
- Timer compensation to improve sleep send schedule accuracy
- Support for IPv4 and IPv6
- Public server protections, including:
- Three-way handshake with returned 64-bit connection token, preventing reply redirection to spoofed source addresses
- Limits on maximum test duration, minimum interval and maximum packet length, both advertised in the negotiation and enforced with hard limits to protect against rogue clients
- Packet payload filling to prevent relaying of arbitrary traffic
- Output to JSON
- An available SmokePing probe (code)
See the LIMITATIONS section of the irtt(1) man page.
To install IRTT manually or build from source, you must:
- Install Go
- Install irtt:
go get -u github.com/samiemostafavi/irtt/cmd/irtt
- For convenience, copy the `irtt` executable, which should be in `$HOME/go/bin`, or `$GOPATH/bin` if you have `$GOPATH` defined, to somewhere on your `PATH`.
If you want to build the source for development, you must also:
- Install the `pandoc` utility for generating man pages and HTML documentation from their markdown source files. This can be done with `apt-get install pandoc` on Debian flavors of Linux or `brew install pandoc` on OS/X. See the Pandoc site for more information.
- Install the `stringer` utility by doing `go get -u golang.org/x/tools/cmd/stringer` and `go install golang.org/x/tools/cmd/stringer`. This is only necessary if you need to re-generate the `*_string.go` files that are generated by this tool, otherwise the checked in versions may also be used.
- Use `build.sh` to build during development, which helps with development related tasks, such as generating source files and docs, and cross-compiling for testing. For example, `build.sh min linux-amd64` would compile a minimized binary for Linux on AMD64. See `build.sh` for more info and a "source-documented" list of platforms that the script supports. See this page for a full list of valid GOOS GOARCH combinations. `build.sh install` runs Go's install command, which puts the resulting executable in `$GOPATH/bin`.
If you want to build from a branch, you should first follow the steps above, then from the `github.com/samiemostafavi/irtt` directory, do:
- `git checkout branch`
- `go get ./...`
- `go install ./cmd/irtt` or `./build.sh` and move the resulting `irtt` executable to the install location
Building for iOS:
I have no way to verify this, but I received a report that the following is "close to but not quite the right command" to cross-compile for iOS:
GOOS=ios GOARCH=arm64 IPHONEOS_DEPLOYMENT_TARGET=14.0 CGO_ENABLED=1 CGO_CFLAGS="-arch arm64 -isysroot `xcrun --sdk iphoneos --show-sdk-path` -mios-version-min=10.0" CGO_LDFLAGS="-arch arm64 -isysroot `xcrun --sdk iphoneos --show-sdk-path`" go build -o irtt cmd/irtt/main.go
Please file an issue if you get this working so I can update the doc.
After installing IRTT, see the man pages and their corresponding EXAMPLES sections to get started quickly.
-
Why not just use ping?
Ping may be the preferred tool when measuring minimum latency, or for other reasons. IRTT's reported mean RTT is likely to be a bit higher (on the order of a couple hundred microseconds) and a bit more variable than the results reported by ping, due to the overhead of entering userspace, together with Go's system call overhead and scheduling variability. That said, this overhead should be negligible at most Internet RTTs, and there are advantages that IRTT has over ping when minimum RTT is not what you're measuring:
- In addition to round-trip time, IRTT also measures OWD, IPDV and upstream vs downstream packet loss.
- Some device vendors prioritize ICMP, so ping may not be an accurate measure of user-perceived latency.
- IRTT can use HMACs to protect private servers from unauthorized discovery and use.
- IRTT has a three-way handshake to prevent test traffic redirection from spoofed source IPs.
- IRTT can fill the payload (if included) with random or arbitrary data.
- On Windows, ping has a precision of 0.5ms, while IRTT uses high resolution timer functions for a precision of 100ns (high resolution wall clock only available on Windows 8 or Windows 2012 Server and later).
Also note the following behavioral differences between ping and IRTT:
- IRTT makes a stateful connection to the server, whereas ping is stateless.
- By default, ping waits for a reply before sending its next request, while IRTT keeps sending requests on the specified interval regardless of whether or not replies are received. The effect of this, for example, is that a fixed-length pause in server packet processing (with packets buffered during the pause) will look like a single high RTT in ping, and multiple high then descending RTTs in IRTT for the duration of the maximum RTT.
-
Is there a public server I can use?
There is a test server running at `irtt.heistp.net` with an HMAC key of `irttuser`. Please do not abuse it. To restrict bandwidth, the minimum interval is set to 100ms, the max length to 256 bytes, and the max duration to 60 seconds. Example usage: `irtt client --hmac=irttuser irtt.heistp.net`
-
How do I run the IRTT server at startup?
This depends on your OS and init system, but see:
-
Why can't the client connect to the server, and instead I get `Error: no reply from server`?
There are a number of possible reasons for this:
- You've specified an incorrect hostname or IP address for the server.
- There is a firewall blocking packets from the client to the server. Traffic must be allowed on the chosen UDP port (default 2112).
- There is high packet loss. By default, up to four packets are sent when the client tries to connect to the server, using timeouts of 1, 2, 4 and 8 seconds. If all of these are lost, the client won't connect to the server. In environments with known high packet loss, the `--timeouts` flag may be used to send more packets with the chosen timeouts before abandoning the connection.
- The server has an HMAC key set with `--hmac` and the client either has not specified a key or it's incorrect. Make sure the client has the correct HMAC key, also specified with the `--hmac` flag.
- You're trying to connect to a listener that's listening on an unspecified IP address, but reply packets are coming back on a different route from the requests, or not coming back at all. This can happen in network environments with [asymmetric routing and a firewall or NAT](https://www.cisco.com/web/services/news/ts_newsletter/tech/chalktalk/archives/200903.html). There are several possible solutions to this:
- Change your network configuration to avoid the problem.
- Have the IRTT server listen on specific addresses with the `-b` flag.
- Use the `--set-src-ip` flag on the server, which explicitly sets the source address on all reply packets from listeners on unspecified IP addresses to the destination address that the request was received on. The only reason this is not done by default is to avoid the extra per-packet heap allocations required by the `golang.org/x/net` package to do so.
-
Why is the send (or receive) delay negative or much larger than I expect?
The client and server clocks must be synchronized for one-way delay values to be meaningful (although, the relative change of send and receive delay may be useful to look at even without clock synchronization). Well-configured NTP hosts may be able to synchronize to within a few milliseconds. PTP (Linux implementation here) is capable of much higher precision. For example, using two PCEngines APU2 boards (which support PTP hardware timestamps) connected directly by Ethernet, the clocks may be synchronized within a few microseconds.
Note that client and server synchronization is not needed for either RTT or IPDV (even send and receive IPDV) values to be correct. RTT is measured with client times only, and since IPDV is measuring differences between successive packets, it's not affected by time synchronization.
-
Why is the receive rate 0 when a single packet is sent?
Receive rate is measured from the time the first packet is received to the time the last packet is received. For a single packet, those times are the same.
-
Why does a test with a one second duration and 200ms interval run for around 800ms and not one second?
The test duration is exclusive, meaning requests will not be sent exactly at or after the test duration has elapsed. In this case, the interval is 200ms, and the fifth and final request is sent at around 800ms from the start of the test. The test ends when all replies have been received from the server, so it may end shortly after 800ms. If there are any outstanding packets, the wait time is observed, which by default is a multiple of the maximum RTT.
-
Why is IPDV not reported when only one packet is received?
IPDV is the difference in delay between successfully returned replies, so at least two reply packets are required to make this calculation.
-
Why does wait fall back to fixed duration when duration is less than RTT?
If a full RTT has not elapsed, there is no way to know how long an appropriate wait time would be, so the wait falls back to a default fixed time (default is 4 seconds, same as ping).
-
Why can't the client connect to the server, and I either see `[Drop] [UnknownParam] unknown negotiation param (0x8 = 0)` on the server, or a strange message on the client like `[InvalidServerRestriction] server tried to reduce interval to < 1s, from 1s to 92ns`?
You're using a 0.1 development version of the server with a newer client. Make sure both client and server are up to date. Going forward, the protocol is versioned (independently from IRTT in general), and is checked when the client connects to the server. For now, the protocol versions must match exactly.
-
Why don't you include median values for send call time, timer error and server processing time?
Those values aren't stored for each round trip, and it's difficult to do a running calculation of the median, although this method of using skip lists appears to have promise. It's a possibility for the future, but so far it isn't a high priority. If it is for you, file an Issue.
-
I see you use MD5 for the HMAC. Isn't that insecure?
MD5 should not have practical vulnerabilities when used in a message authentication code. See this page for more info.
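For reference, an HMAC over MD5 can be computed and checked with Go's standard library as sketched below. This only demonstrates the primitive; the function names `sign` and `verify` are hypothetical, and how IRTT places the HMAC within its packets is not shown here.

```go
package main

import (
	"crypto/hmac"
	"crypto/md5"
	"fmt"
)

// sign returns the 16-byte HMAC-MD5 tag of msg under key.
func sign(key, msg []byte) []byte {
	mac := hmac.New(md5.New, key)
	mac.Write(msg) // Write on a hash.Hash never returns an error
	return mac.Sum(nil)
}

// verify recomputes the tag and compares it in constant time.
func verify(key, msg, tag []byte) bool {
	return hmac.Equal(sign(key, msg), tag)
}

func main() {
	key := []byte("irttuser")
	msg := []byte("example packet bytes")

	tag := sign(key, msg)
	fmt.Printf("tag: %x\n", tag)
	fmt.Println("correct key:", verify(key, msg, tag))
	fmt.Println("wrong key:  ", verify([]byte("other"), msg, tag))
}
```

The security of the construction rests on the secrecy of the key rather than on the collision resistance of MD5, which is why known MD5 collision attacks do not carry over directly.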
-
Are there any plans for translation to other languages?
While some parts of the API were designed to keep i18n possible, there is no support for i18n built in to the Go standard libraries. It should be possible, but could be a challenge, and is not something I'm likely to undertake myself.
-
Why do I get `Error: failed to allocate results buffer for X round trips (runtime error: makeslice: cap out of range)`?
Your test interval and duration probably require a results buffer that's larger than Go can allocate on your platform. Either raise your test interval or lower your duration, so that fewer round trips need to be stored. See the following additional documentation for reference: In-memory results storage, `maxSliceCap` in slice.go and `_MaxMem` in malloc.go.
-
Why is little endian byte order used in the packet format?
As with Google's protobufs, this was chosen because the vast majority of modern processors use little-endian byte order. In the future, packet manipulation may be optimized for little-endian architectures by doing conversions with Go's unsafe package, but so far this optimization has not been shown to be necessary.
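A portable way to read and write little-endian fields, without the unsafe package, is Go's `encoding/binary`. The sketch below shows the byte layout for a 64-bit value; the helper names are hypothetical and the example does not reflect IRTT's actual packet fields.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeUint64LE writes v into a new 8-byte slice, least significant byte first.
func encodeUint64LE(v uint64) []byte {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, v)
	return buf
}

// decodeUint64LE reads a little-endian 64-bit value from the first 8 bytes of b.
func decodeUint64LE(b []byte) uint64 {
	return binary.LittleEndian.Uint64(b)
}

func main() {
	buf := encodeUint64LE(0x0102030405060708)
	fmt.Printf("% x\n", buf)
	fmt.Printf("%#x\n", decodeUint64LE(buf))
}
```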
-
Why does `irtt client` use `-l` for packet length instead of following ping and using `-s` for size?
I felt it more appropriate to follow the RFC 768 term length for UDP packets, since IRTT uses UDP.
-
Why is the virt size (vsz) memory usage for the server so high in Linux?
This has to do with the way Go allocates memory, but should not cause a problem. See this article for more information. File an Issue if your resident usage (rss/res) is high or you feel that memory consumption is somehow a problem.
-
Why doesn't the server start on Linux when the kernel parameter `ipv6.disable=1` is set?
By default, IRTT tries to listen on both IPv4 and IPv6 addresses, and for safety, the server shuts down if there are failures on any of the listeners for any of the addresses. In this case, the server may be started with the `-4` flag.
-
Why don't you make use of `x` library?
We need to keep the executable size as small as possible for embedded devices, and most external libraries are not compatible with this.
See CHANGES.md.
Planned for v0.9.2...
- Refactor lconn, and make the srcSrcIP/ecn flag stuff independent.
- Solidify TimeSource, Time and new Windows timer support:
- Add --timesrc to client and server
- Fall back to Go functions as necessary for older Windows versions
- Make sure all calls to TimeSource.Now pass in only needed clocks
- Find a better way to log warnings than fmt.Fprintf(os.Stderr) in timesrc_win.go
- Rename Time.Mono to Monotonic, or others from Monotonic to Mono for consistency
- Document 100ns resolution for Windows
- Improve diagnostic commands:
- Change bench command to output in columns
- Rename sleep command to timer and add --timesrc, --sleep, --timer and --tcomp
- Rename timer command to resolution and add --timesrc
- Rename clock command to drift and add --timesrc
- Add a `late` flag to RoundTrip
- Measure and document local differences between ping and irtt response times
- Sync Debian package to history re-write and create backports version for Debian stable
- Add `report` command, or similar, to print results from an existing JSON file
Planned for v1.0.0...
- Refactor handshake params to use signed values and straight bytes as appropriate.
- Improve client output flexibility:
- Allow specifying a format string for text output with optional units for times
- Add format abbreviations for CSV, space delimited, etc.
- Add a subcommand to the CLI to convert JSON to CSV
- Add a way to disable per-packet results in JSON
- Add a way to keep out "internal" info from JSON, like IP and hostname, and a subcommand to strip these out after the JSON is created
- Add more info on outliers and possibly a textual histogram
- Refactor packet manipulation to improve readability, prevent multiple validations and support unit tests
- Add DSCP text values and return an error when ECN bits are passed to --dscp
- Improve open/close process:
- Do Happy Eyeballs (RFC 8305) to better handle multiple address families and addresses
- Make timeout support automatic exponential backoff, like 4x15s
- Repeat close packets until acknowledgement, like open
- Include final stats in the close acknowledgement from the server
- Improve robustness and security of public servers:
- Add bitrate limiting
- Limit open requests rate and coordinate with sconn cleanup
- Add separate, shorter timeout for open
- Specify close timeout as param from client, which may be restricted
- Add per-IP limiting
- Add a more secure way than cmdline flag to specify --hmac
- Stabilize API:
- Minimize exposed functions (remove timer, timer comp, etc)
- Always return instance of irtt.Error? If so, look at exitOnError.
- Use error code (if available) as exit code
- Improve induced latency and jitter:
- Use Go profiling, scheduler tracing, strace and sar
- Do more thorough tests of `chrt -r 99`, `--thread` and `--gc`
- Find or file issue with Go team over scheduler performance, if needed
- Prototype doing thread scheduling or socket i/o for Linux in C
- Show actual size of header in text and json, and add calculation to doc
Collection area...
- Add ping-pair-like functionality
- Add UDP-lite support to allow partially damaged packets to be received
- Add different server authentication modes:
- none (no conn token in header, for minimum packet sizes during local use)
- token (what we have today, 64-bit token in header)
- nacl-hmac (hmac key negotiated with public/private key encryption)
- Implement graceful server shutdown with sconn close
- Implement zero-downtime restarts
- Add a Scheduler interface to allow non-isochronous send schedules and variable packet lengths
- Find some way to determine packet interval and length distributions for captured traffic
- Determine if asymmetric send schedules (between client and server) required
- Add an overhead test mode to compare ping vs irtt
- Add client flag to skip sleep and catch up after timer misses
- Add seqno to the Max and maybe Min columns in the text output
- Prototype TCP throughput test and compare straight Go vs iperf/netperf
- Support a range of server ports to improve concurrency and maybe defeat latency "slotting" on multi-queue interfaces
- Add more unit tests
- Add support for load balanced conns (multiple source addresses for same conn)
- Use unsafe package to speed up packet buffer manipulation
- Add encryption
- Add estimate for HMAC calculation time and correct send timestamp by this time
- Implement web interface for client and server
- Set DSCP per-packet, at least for IPv6
- Add NAT hole punching
- Use a larger, internal received window on the server to increase up/down loss accuracy
- Allow specifying two out of three of interval, bitrate and packet size
- Calculate per-packet arrival order during results generation using timestamps
- Make it possible to add custom per-round-trip statistics programmatically
- Allow Server to listen on multiple IPs for a hostname
- Prompt to write JSON file on cancellation
- Open questions:
- What do I do for IPDV when there are out of order packets?
- Does exposing both monotonic and wall clock values, as well as dual timestamps, open the server to any timing attacks?
- Is there any way to make the server concurrent without inducing latency?
- Should I request a reserved IANA port?
Many thanks to both Toke Høiland-Jørgensen and Dave Täht from the Bufferbloat project for their valuable advice. Any problems in design or implementation are entirely my own.