
Week 30: Finishing the naive Load Balancer #5

Open
mrdude opened this issue Mar 22, 2017 · 2 comments
mrdude commented Mar 22, 2017

@twood02

What have I been doing?

I've mostly been working on fixing bugs in my naive load balancer implementation. This "load balancer" just forwards all connections to the first backend. As of commit 0d7865, this is mostly done: Athena will happily accept and forward Vegeta's connections for a while, but after roughly 20k packets Vegeta starts reporting that its connections are being rejected. I don't know what's going on yet; I'm still scanning the pcaps in Wireshark.
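For illustration, the selection logic in this policy boils down to something like the sketch below (the struct and function names are placeholders, not Athena's actual code):

```c
#include <stdint.h>

#define MAX_BACKENDS 16

/* Placeholder types; the real Athena structures likely look different. */
struct backend {
    uint32_t ip;    /* backend IPv4 address */
    uint16_t port;  /* backend TCP port */
};

static struct backend backends[MAX_BACKENDS];

/* "Naive" policy: ignore load entirely and map every new connection
 * (i.e. every incoming SYN) to the first backend. */
static struct backend *
naive_select_backend(void)
{
    return &backends[0];
}
```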

Milestones

  • TCP replay
  • add naive LB - need to finish debugging
    • add a SELECTING_BACKEND state to the main state machine
    • allow SELECTING_BACKEND to queue up packets from the client (see the sketch after this list)
  • pluggable LBs - allow the load balancer to be specified on the command line
    • round robin
    • least connections
    • smart queue time algorithm
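Rough sketch of how the SELECTING_BACKEND state could queue client packets -- the type names and the fixed-size pending queue are assumptions, not the real state machine:

```c
#include <stdint.h>

#define MAX_PENDING 32

struct backend;   /* as in the earlier sketch */
struct packet;    /* opaque handle for whatever packet type Athena uses */

/* Hypothetical connection states; SELECTING_BACKEND buffers client
 * packets until a backend has been chosen, then flushes them. */
enum conn_state {
    CONN_CLOSED,
    CONN_SELECTING_BACKEND,
    CONN_ESTABLISHED,
};

struct conn {
    enum conn_state state;
    struct backend *backend;              /* set once a backend is chosen */
    struct packet *pending[MAX_PENDING];  /* queued while selecting */
    int num_pending;
};

/* Buffer a client packet while no backend has been selected yet. */
static int
conn_queue_pending(struct conn *c, struct packet *pkt)
{
    if (c->num_pending >= MAX_PENDING)
        return -1;   /* queue full; caller has to drop or back-pressure */
    c->pending[c->num_pending++] = pkt;
    return 0;
}
```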

What am I doing this week?

Once the naive LB implementation works, I'm going to move on to implementing pluggable load balancers. By the end of this week, I want to have implementations for Naive, Round Robin, and Least Connections done.
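One way this could be structured -- a sketch under my own assumptions, not a final design -- is a function-pointer table keyed by the name passed on the command line (the flag name, struct fields, and function names below are all made up):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct backend {
    uint32_t ip;
    uint16_t port;
    int active_conns;   /* open connections, used by least-connections */
};

struct lb_policy {
    const char *name;
    struct backend *(*select)(struct backend *b, int n);
};

static struct backend *select_naive(struct backend *b, int n)
{
    (void)n;
    return &b[0];                /* always the first backend */
}

static struct backend *select_round_robin(struct backend *b, int n)
{
    static int next;
    struct backend *pick = &b[next];
    next = (next + 1) % n;       /* cycle through backends in order */
    return pick;
}

static struct backend *select_least_conns(struct backend *b, int n)
{
    struct backend *best = &b[0];
    for (int i = 1; i < n; i++)
        if (b[i].active_conns < best->active_conns)
            best = &b[i];        /* fewest open connections wins */
    return best;
}

static const struct lb_policy policies[] = {
    { "naive",       select_naive },
    { "round-robin", select_round_robin },
    { "least-conns", select_least_conns },
};

/* Look up a policy by the name given on the command line,
 * e.g. something like --lb=round-robin. */
static const struct lb_policy *lb_policy_by_name(const char *name)
{
    for (size_t i = 0; i < sizeof(policies) / sizeof(policies[0]); i++)
        if (strcmp(policies[i].name, name) == 0)
            return &policies[i];
    return NULL;
}
```

With a table like this, adding another algorithm later is just another row.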

Once I have all of these algorithms implemented, I can compare their performance using Vegeta's reported stats. This will be good to have in my presentation; I can create a graph comparing 95th-percentile latency for each algorithm.

Other Misc TODO items

  • figure out how to get ATH_ASSERT() working
  • clean up code so it follows the style guide
  • (maybe) add a TIME_WAIT state to the state machine -- keep forwarding connections for 4 minutes after a RST?

Potential Roadblocks

In the current implementation, Athena assumes that it has the same IP as its load balancer backends. Because of this, Athena doesn't have to understand ARP; it just blindly forwards any non-TCP packets it gets, and the networking stack in the backend's kernel handles everything else.
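Concretely, the pass-through behavior amounts to something like this (constants inlined, names illustrative):

```c
#include <stdint.h>

enum action { FORWARD_UNCHANGED, RUN_LB_STATE_MACHINE };

/* Only TCP-over-IPv4 goes through the load-balancing state machine;
 * everything else (ARP, ICMP, UDP, ...) is forwarded untouched. */
static enum action
classify(uint16_t ethertype, uint8_t ip_proto)
{
    /* 0x0800 = IPv4 ethertype, 6 = TCP protocol number */
    if (ethertype != 0x0800 || ip_proto != 6)
        return FORWARD_UNCHANGED;
    return RUN_LB_STATE_MACHINE;
}
```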

Ideally, one would be able to run Athena on a server as a reverse proxy for a cluster of web servers. This would require Athena to: 1) respond to ARP requests for its IP, and 2) read incoming ARP packets so that Athena can patch the ethernet headers (as well as the TCP and IP headers) while routing.

I'd really rather not have to write code to understand ARP right now; considering how long it took me to iron out the bugs in TCP replay, I don't have time to get Athena to understand another protocol (even one as relatively straightforward as ARP). For the time being, Athena is going to assume that it shares an IP with its backends. If I still have time after implementing the rest of my intended milestone features, I'll add ARP support to Athena.
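For whenever I do get to it, requirement 1) above would look roughly like the sketch below (field layout per RFC 826 for Ethernet/IPv4; the function itself and how the frame gets sent back out are assumptions, not existing Athena code):

```c
#include <arpa/inet.h>   /* htons, ntohs */
#include <stdint.h>
#include <string.h>

/* Minimal ARP body for Ethernet/IPv4 (RFC 826). */
struct arp_eth_ipv4 {
    uint16_t htype;       /* 1 = Ethernet */
    uint16_t ptype;       /* 0x0800 = IPv4 */
    uint8_t  hlen, plen;  /* 6, 4 */
    uint16_t oper;        /* 1 = request, 2 = reply */
    uint8_t  sha[6];      /* sender MAC */
    uint8_t  spa[4];      /* sender IP */
    uint8_t  tha[6];      /* target MAC */
    uint8_t  tpa[4];      /* target IP */
} __attribute__((packed));

/* Rewrite a request for our IP into a reply in place.
 * Returns 0 if the frame should be sent back out, -1 if it should be
 * ignored. The caller would also need to fix up the Ethernet header. */
static int
arp_answer_request(struct arp_eth_ipv4 *arp,
                   const uint8_t my_mac[6], const uint8_t my_ip[4])
{
    if (ntohs(arp->oper) != 1 || memcmp(arp->tpa, my_ip, 4) != 0)
        return -1;                    /* not a request for our IP */
    arp->oper = htons(2);             /* request -> reply */
    memcpy(arp->tha, arp->sha, 6);    /* reply targets the requester */
    memcpy(arp->tpa, arp->spa, 4);
    memcpy(arp->sha, my_mac, 6);      /* we are the sender now */
    memcpy(arp->spa, my_ip, 4);
    return 0;
}
```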

mrdude self-assigned this Mar 22, 2017
mrdude added a commit that referenced this issue Mar 22, 2017

mrdude commented Mar 22, 2017

Here are the pcaps I've been looking at: nn27, nn29. Nimbnode27 has an IP of 11.0.0.27 and hosts the backend webservers at 11.0.0.27:81, :85, :90, :95, and :100. Nimbnode29 has an IP of 11.0.0.29 and sends the client requests. Athena runs on Nimbnode28 and routes all packets that pass between nn27 and nn29.

I've been looking at them in Wireshark. Everything seems to be working nicely until this point:

Wireshark screencap (link to larger image)

Wireshark notes that TCP port numbers are beginning to be reused; I suspect that this is causing Athena to get confused about connection states.


twood02 commented Mar 22, 2017

Yes, I was going to ask about this -- at some point ports will be reused, so if you aren't clearing out old connections that is likely to be an issue. Detecting the close of a connection can be a bit tricky (ordering of FIN/RST isn't always consistent). For your purposes it may be fine to just detect when a port is being reused (new SYN) and recognize that means you need to reset your state machine for that connection.
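Roughly (names made up, just to illustrate the check):

```c
#include <stdbool.h>
#include <stdint.h>

/* A SYN arriving on a 4-tuple that is already tracked (and not closed)
 * means the client reused the port, so the old per-connection state
 * should be discarded and the state machine restarted from scratch. */
struct flow_key {
    uint32_t client_ip, server_ip;
    uint16_t client_port, server_port;
};

enum conn_state { CONN_CLOSED, CONN_SELECTING_BACKEND, CONN_ESTABLISHED };

struct conn {
    struct flow_key key;
    enum conn_state state;
};

static bool
syn_means_port_reuse(const struct conn *existing, bool is_syn)
{
    /* existing == NULL means this 4-tuple has never been seen */
    return is_syn && existing != NULL && existing->state != CONN_CLOSED;
}
```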
