In this lab you will build a fault-tolerant key/value storage service using your Raft library from lab 2. Your key/value service will be a replicated state machine, consisting of several key/value servers that use Raft to maintain replication. Your key/value service should continue to process client requests as long as a majority of the servers are alive and can communicate, in spite of other failures or network partitions.
The service supports three operations: Put(key, value), Append(key, arg), and Get(key). It maintains a simple database of key/value pairs. Put() replaces the value for a particular key in the database, Append(key, arg) appends arg to key's value, and Get() fetches the current value for a key. An Append to a non-existent key should act like Put. Each client talks to the service through a Clerk with Put/Append/Get methods. A Clerk manages RPC interactions with the servers.
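To make the operation semantics concrete, here is a minimal single-copy sketch in Go. The KVStore type and its constructor are hypothetical names used only for illustration and are not part of the lab skeleton; returning the empty string from Get for a missing key is also an assumption.

```go
package main

import "fmt"

// KVStore is a hypothetical in-memory database used only to illustrate
// the operation semantics; the lab skeleton does not define this type.
type KVStore struct {
	data map[string]string
}

func NewKVStore() *KVStore {
	return &KVStore{data: make(map[string]string)}
}

// Put replaces the value stored under key.
func (s *KVStore) Put(key, value string) {
	s.data[key] = value
}

// Append appends arg to the value stored under key. A missing key reads
// as "", so Append to a non-existent key behaves like Put.
func (s *KVStore) Append(key, arg string) {
	s.data[key] += arg
}

// Get returns the current value for key ("" for a missing key, which is
// an assumption of this sketch).
func (s *KVStore) Get(key string) string {
	return s.data[key]
}

func main() {
	s := NewKVStore()
	s.Append("k", "a") // acts like Put("k", "a")
	s.Append("k", "b")
	s.Put("j", "x")
	fmt.Println(s.Get("k"), s.Get("j")) // prints: ab x
}
```

The replicated service has to behave, from the Clerk's point of view, as if it were this one copy.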
Your service must provide strong consistency to application calls to the Clerk Get/Put/Append methods. Here's what we mean by strong consistency. If called one at a time, the Get/Put/Append methods should act as if the system had only one copy of its state, and each call should observe the modifications to the state implied by the preceding sequence of calls. For concurrent calls, the return values and final state must be the same as if the operations had executed one at a time in some order. Calls are concurrent if they overlap in time, for example if client X calls Clerk.Put(), then client Y calls Clerk.Append(), and then client X's call returns. Furthermore, a call must observe the effects of all calls that have completed before the call starts (so we are technically asking for linearizability).
Strong consistency is convenient for applications because it means that, informally, all clients see the same state and they all see the latest state. Providing strong consistency is relatively easy for a single server. It is harder if the service is replicated, since all servers must choose the same execution order for concurrent requests, and must avoid replying to clients using state that isn't up to date.
This lab has two parts. In part A, you will implement the service without worrying that the Raft log can grow without bound. In part B, you will implement snapshots (Section 7 in the paper), which will allow Raft to garbage collect old log entries. Please submit each part by the respective deadline.
Do a git pull to get the latest lab software.
We supply you with skeleton code and tests in src/kvraft. You will need to modify kvraft/client.go, kvraft/server.go, and perhaps kvraft/common.go.
To get up and running, execute the following commands:
$ cd ~/CS443
$ git pull
...
$ cd src/kvraft
$ GOPATH=~/CS443
$ export GOPATH
$ go test
...
$
Each of your key/value servers ("kvservers") will have an associated Raft peer. Clerks send Put(), Append(), and Get() RPCs to the kvserver whose associated Raft is the leader. The kvserver code submits the Put/Append/Get operation to Raft, so that the Raft log holds a sequence of Put/Append/Get operations. All of the kvservers execute operations from the Raft log in order, applying the operations to their key/value databases; the intent is for the servers to maintain identical replicas of the key/value database.
A Clerk sometimes doesn't know which kvserver is the Raft leader. If the Clerk sends an RPC to the wrong kvserver, or if it cannot reach the kvserver, the Clerk should re-try by sending to a different kvserver. If the key/value service commits the operation to its Raft log (and hence applies the operation to the key/value state machine), the leader reports the result to the Clerk by responding to its RPC. If the operation failed to commit (for example, if the leader was replaced), the server reports an error, and the Clerk retries with a different server.
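The retry behavior described above might be sketched as follows. This is an illustration under stated assumptions, not the skeleton's Clerk: the Reply type, its WrongLeader field, and the Server function type are hypothetical stand-ins for real RPC calls, and a real Clerk would likely pause briefly after trying every server once.

```go
package main

import "fmt"

// Reply is a hypothetical RPC reply. WrongLeader stands in for whatever
// error the server reports when it is not the leader or when the
// operation failed to commit.
type Reply struct {
	WrongLeader bool
	Value       string
}

// Server models one RPC endpoint. ok == false models an unreachable
// server or a lost reply.
type Server func(key string) (reply Reply, ok bool)

// Clerk remembers which server last answered as leader so that later
// calls try that server first.
type Clerk struct {
	servers []Server
	leader  int
}

// Get retries until some server accepts the request as leader.
func (ck *Clerk) Get(key string) string {
	for {
		reply, ok := ck.servers[ck.leader](key)
		if ok && !reply.WrongLeader {
			return reply.Value
		}
		// Wrong server or unreachable: move on to the next one.
		ck.leader = (ck.leader + 1) % len(ck.servers)
	}
}

func main() {
	unreachable := func(string) (Reply, bool) { return Reply{}, false }
	follower := func(string) (Reply, bool) { return Reply{WrongLeader: true}, true }
	leader := func(key string) (Reply, bool) { return Reply{Value: "value-of-" + key}, true }

	ck := &Clerk{servers: []Server{unreachable, follower, leader}}
	fmt.Println(ck.Get("x"), ck.leader) // prints: value-of-x 2
}
```

Caching the leader index is a design choice, not a requirement: it avoids probing every server on each call while the leader is stable.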
Your first task is to implement a solution that works when there are no dropped messages, and no failed servers.
You'll need to add RPC-sending code to the Clerk Put/Append/Get methods in client.go, and implement PutAppend() and Get() RPC handlers in server.go. These handlers should enter an Op in the Raft log using Start(); you should fill in the Op struct definition in server.go so that it describes a Put/Append/Get operation. Each server should execute Op commands as Raft commits them, i.e. as they appear on the applyCh. An RPC handler should notice when Raft commits its Op, and then reply to the RPC.
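One possible shape for the Op struct and the apply loop is sketched below. All field names, the waiters map, and the simplified ApplyMsg type are assumptions made for this example; the skeleton leaves Op empty and your Raft defines the real ApplyMsg. Locking is omitted because the sketch runs in a single goroutine, whereas a real server shares this state between RPC handlers and the apply loop.

```go
package main

import "fmt"

// Op describes one client operation as it would be stored in the Raft
// log. The field names are assumptions of this sketch.
type Op struct {
	Kind  string // "Put", "Append", or "Get"
	Key   string
	Value string // unused for Get
}

// ApplyMsg is a simplified stand-in for the message Raft sends on applyCh.
type ApplyMsg struct {
	CommandIndex int
	Command      interface{}
}

// KVServer holds the database plus, per log index, a channel on which a
// waiting RPC handler expects the result (a hypothetical mechanism).
type KVServer struct {
	data    map[string]string
	waiters map[int]chan string
}

// applyLoop executes committed operations in log order and wakes up the
// handler, if any, that is waiting on each index.
func (kv *KVServer) applyLoop(applyCh <-chan ApplyMsg) {
	for msg := range applyCh {
		op, ok := msg.Command.(Op)
		if !ok {
			continue
		}
		switch op.Kind {
		case "Put":
			kv.data[op.Key] = op.Value
		case "Append":
			kv.data[op.Key] += op.Value
		}
		if ch, found := kv.waiters[msg.CommandIndex]; found {
			ch <- kv.data[op.Key]
			delete(kv.waiters, msg.CommandIndex)
		}
	}
}

func main() {
	kv := &KVServer{data: map[string]string{}, waiters: map[int]chan string{}}
	done := make(chan string, 1) // buffered so the apply loop never blocks
	kv.waiters[3] = done

	applyCh := make(chan ApplyMsg, 3)
	applyCh <- ApplyMsg{CommandIndex: 1, Command: Op{Kind: "Put", Key: "k", Value: "a"}}
	applyCh <- ApplyMsg{CommandIndex: 2, Command: Op{Kind: "Append", Key: "k", Value: "b"}}
	applyCh <- ApplyMsg{CommandIndex: 3, Command: Op{Kind: "Get", Key: "k"}}
	close(applyCh)

	kv.applyLoop(applyCh)
	fmt.Println(<-done) // prints: ab
}
```

A handler that waits on a log index should also confirm that the Op committed at that index is the one it submitted, since a different leader may have placed another entry there.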
You have completed this task when you reliably pass the first test in the test suite: "One client". You may also find that you can pass the "concurrent clients" test, depending on how sophisticated your implementation is.
Your kvservers should not directly communicate; they should only interact with each other through the Raft log.
Add code to cope with duplicate Clerk requests, including situations where the Clerk sends a request to a kvserver leader in one term, times out waiting for a reply, and re-sends the request to a new leader in another term. The request should always execute just once. Your code should pass the go test -run Lab3 tests.
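One common way to get execute-once behavior is to tag each request with a client identifier and a per-client sequence number, and to have the state machine remember the highest sequence number it has applied for each client. The sketch below shows that idea only; ClientID, Seq, and StateMachine are hypothetical names, and it assumes each Clerk has at most one outstanding request and numbers its requests starting from 1.

```go
package main

import "fmt"

// Op carries a client identifier and a per-client sequence number so the
// state machine can recognize a re-sent request (names are assumptions).
type Op struct {
	ClientID int64
	Seq      int
	Kind     string // "Put", "Append", or "Get"
	Key      string
	Value    string
}

// StateMachine is the replicated state: the database plus, per client,
// the highest sequence number already applied.
type StateMachine struct {
	data    map[string]string
	lastSeq map[int64]int
}

func NewStateMachine() *StateMachine {
	return &StateMachine{data: map[string]string{}, lastSeq: map[int64]int{}}
}

// Apply executes op unless it is a write that was already applied, and
// returns the current value for op.Key. Re-reading on a retried Get is
// harmless because the earlier attempt never returned to the caller.
func (sm *StateMachine) Apply(op Op) string {
	if op.Kind != "Get" && op.Seq > sm.lastSeq[op.ClientID] {
		switch op.Kind {
		case "Put":
			sm.data[op.Key] = op.Value
		case "Append":
			sm.data[op.Key] += op.Value
		}
		sm.lastSeq[op.ClientID] = op.Seq
	}
	return sm.data[op.Key]
}

func main() {
	sm := NewStateMachine()
	first := Op{ClientID: 7, Seq: 1, Kind: "Append", Key: "k", Value: "a"}
	sm.Apply(first)
	sm.Apply(first) // the same request committed again under a new leader
	sm.Apply(Op{ClientID: 7, Seq: 2, Kind: "Append", Key: "k", Value: "b"})
	fmt.Println(sm.Apply(Op{ClientID: 7, Seq: 3, Kind: "Get", Key: "k"})) // prints: ab
}
```

Because the check happens when an entry is applied rather than when the RPC arrives, every replica makes the same decision about each log entry.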
Your code should now pass the Lab3 tests, like this:
$ go test -run Lab3
Test: one client (Lab3) ...
  ... Passed -- 15.1 5 12882 2587
Test: many clients (Lab3) ...
  ... Passed -- 15.3 5 9678 3666
Test: unreliable net, many clients (Lab3) ...
  ... Passed -- 17.1 5 4306 1002
Test: concurrent append to same key, unreliable (Lab3) ...
  ... Passed -- 0.8 3 128 52
Test: progress in majority (Lab3) ...
  ... Passed -- 0.9 5 58 2
Test: no progress in minority (Lab3) ...
  ... Passed -- 1.0 5 54 3
Test: completion after heal (Lab3) ...
  ... Passed -- 1.0 5 59 3
Test: partitions, one client (Lab3) ...
  ... Passed -- 22.6 5 10576 2548
Test: partitions, many clients (Lab3) ...
  ... Passed -- 22.4 5 8404 3291
PASS
ok kvraft 96.2s
The numbers after each Passed are real time in seconds, number of peers, number of RPCs sent (including client RPCs), and number of key/value operations executed (Clerk Get/Put/Append calls).
Before submitting, please run the tests for part A one final time. Some bugs may not appear on every run, so run the tests multiple times.
If a Raft-based key/value server reboots it should resume service where it left off. You should pass the Challenge2 tests, like this:
$ go test -run Challenge2
Test: restarts, one client (Challenge2) ...
  ... Passed -- 19.7 5 13978 2821
Test: restarts, many clients (Challenge2) ...
  ... Passed -- 19.2 5 10498 4027
Test: unreliable net, restarts, many clients (Challenge2) ...
  ... Passed -- 20.5 5 4618 997
Test: restarts, partitions, many clients (Challenge2) ...
  ... Passed -- 26.2 5 9816 3907
Test: unreliable net, restarts, partitions, many clients (Challenge2) ...
  ... Passed -- 29.0 5 3641 708
Test: unreliable net, restarts, partitions, many clients, linearizability checks (Challenge2) ...
  ... Passed -- 26.5 7 10199 997
PASS
ok kvraft 141.1s
As things stand now with your lab code, a rebooting server replays the complete Raft log in order to restore its state. However, it's not practical for a long-running server to remember the complete Raft log forever. Instead, you'll modify Raft and kvserver to cooperate to save space: from time to time kvserver will persistently store a "snapshot" of its current state, and Raft will discard log entries that precede the snapshot. When a server restarts (or falls far behind the leader and must catch up), the server first installs a snapshot and then replays log entries from after the point at which the snapshot was created. Section 7 of the extended Raft paper outlines the scheme; you will have to design the details.
You should spend some time figuring out what the interface will be between your Raft library and your service so that your Raft library can discard log entries. Think about how your Raft will operate while storing only the tail of the log, and how it will discard old log entries. You should discard them in a way that allows the Go garbage collector to free and re-use the memory; this requires that there be no reachable references (pointers) to the discarded log entries.
The tester passes maxraftstate to your StartKVServer(). maxraftstate indicates the maximum allowed size of your persistent Raft state in bytes (including the log, but not including snapshots). You should compare maxraftstate to persister.RaftStateSize(). Whenever your key/value server detects that the Raft state size is approaching this threshold, it should save a snapshot, and tell the Raft library that it has snapshotted, so that Raft can discard old log entries. If maxraftstate is -1, you do not have to snapshot.
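The size check can be a small pure function. In the sketch below, shouldSnapshot is a hypothetical helper, and treating 90% of maxraftstate as "approaching the threshold" is an arbitrary assumption; in a real server the second argument would come from persister.RaftStateSize().

```go
package main

import "fmt"

// shouldSnapshot reports whether the persisted Raft state is close enough
// to the limit that the server should snapshot now. The 90% margin is an
// assumption of this sketch, not something the lab prescribes.
func shouldSnapshot(maxraftstate, raftStateSize int) bool {
	if maxraftstate == -1 {
		return false // snapshotting is disabled
	}
	// Integer form of raftStateSize >= 0.9 * maxraftstate.
	return raftStateSize*10 >= maxraftstate*9
}

func main() {
	fmt.Println(shouldSnapshot(-1, 1<<20)) // prints: false
	fmt.Println(shouldSnapshot(1000, 899)) // prints: false
	fmt.Println(shouldSnapshot(1000, 900)) // prints: true
}
```

Leaving some margin below the limit gives the server room for the entries that arrive between the check and the moment Raft actually trims its log.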
Your raft.go probably keeps the entire log in a Go slice. Modify it so that it can be given a log index, discard the entries before that index, and continue operating while storing only log entries after that index. Make sure you pass all the Raft tests after making these changes.
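One way to store only the tail of the log is to keep a base index alongside the slice and translate every log index through it. The sketch below is an illustration under assumptions, not a required design: the Log type, its method names, and the convention that entries[0] is a placeholder for the last entry covered by the snapshot are all hypothetical. The tail is copied into a fresh slice so that the old backing array, and with it the discarded entries, becomes unreachable.

```go
package main

import "fmt"

// Entry is a simplified log entry.
type Entry struct {
	Term    int
	Command interface{}
}

// Log stores only the tail of the Raft log. entries[0] is a placeholder
// for the entry at index base (the last entry covered by the snapshot),
// kept so that its term remains available for consistency checks.
type Log struct {
	base    int
	entries []Entry
}

// At returns the entry at the given log index (must be >= base).
func (l *Log) At(index int) Entry { return l.entries[index-l.base] }

// LastIndex returns the log index of the newest entry.
func (l *Log) LastIndex() int { return l.base + len(l.entries) - 1 }

// DiscardThrough drops every entry up to and including index, keeping
// only a command-free placeholder for index itself.
func (l *Log) DiscardThrough(index int) {
	if index <= l.base || index > l.LastIndex() {
		return
	}
	// Copy instead of re-slicing: a re-slice would keep the old backing
	// array, and therefore the discarded entries, reachable.
	tail := make([]Entry, l.LastIndex()-index+1)
	copy(tail, l.entries[index-l.base:])
	tail[0].Command = nil
	l.entries = tail
	l.base = index
}

func main() {
	l := &Log{entries: []Entry{{Term: 0}, {Term: 1}, {Term: 1}, {Term: 2}, {Term: 2}, {Term: 3}}}
	l.DiscardThrough(3)
	fmt.Println(l.base, len(l.entries), l.At(4).Term, l.LastIndex()) // prints: 3 3 2 5
}
```

Routing every access through At and LastIndex means the rest of the Raft code keeps using absolute log indexes and does not need to know how much of the log has been discarded.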
Modify your kvserver so that it detects when the persisted Raft state grows too large, and then hands a snapshot to Raft and tells Raft that it can discard old log entries. Raft should save each snapshot with persister.SaveStateAndSnapshot() (don't use files). A kvserver instance should restore the snapshot from the persister when it re-starts.
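A snapshot is just the serialized service state. The sketch below uses the standard encoding/gob package to round-trip a hypothetical snapshot struct; the type and helper names are assumptions, and the lab's own GOB wrapper could be substituted. Including the duplicate-detection table (here LastSeq) alongside the database is a design choice worth considering, since a server restored from a snapshot otherwise forgets which requests it has already executed.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// snapshot is a hypothetical container for everything the service needs
// in order to resume. Fields must be exported for gob to encode them.
type snapshot struct {
	Data    map[string]string
	LastSeq map[int64]int // per-client duplicate-detection state (an assumption)
}

// encodeSnapshot serializes the service state into bytes that could be
// handed to Raft for persistence.
func encodeSnapshot(data map[string]string, lastSeq map[int64]int) ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(snapshot{Data: data, LastSeq: lastSeq})
	return buf.Bytes(), err
}

// decodeSnapshot restores the state from bytes read back after a restart.
func decodeSnapshot(b []byte) (snapshot, error) {
	var s snapshot
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&s)
	return s, err
}

func main() {
	b, err := encodeSnapshot(map[string]string{"k": "v"}, map[int64]int{7: 3})
	if err != nil {
		panic(err)
	}
	s, err := decodeSnapshot(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.Data["k"], s.LastSeq[7]) // prints: v 3
}
```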
Modify your Raft leader code to send an InstallSnapshot RPC to a follower when the leader has discarded the log entries the follower needs. When a follower receives an InstallSnapshot RPC, your Raft code will need to send the included snapshot to its kvserver. You can use the applyCh for this purpose, by adding new fields to ApplyMsg. Your solution is complete when it passes all of the Lab 3 tests.
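The following sketch shows one way a kvserver might consume both commands and snapshots from a single channel. The three Snapshot fields added to ApplyMsg, the Put command type, and the handle method are hypothetical names chosen for this example; the index comparisons guard against a snapshot or a command that is older than what the server has already applied.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// ApplyMsg carries either a committed command or a snapshot.
type ApplyMsg struct {
	CommandValid bool
	Command      interface{}
	CommandIndex int

	// Hypothetical additions for snapshot delivery.
	SnapshotValid bool
	Snapshot      []byte
	SnapshotIndex int
}

// Put is a minimal command type for this sketch.
type Put struct{ Key, Value string }

type KVServer struct {
	data        map[string]string
	lastApplied int
}

// encodeState serializes the database with encoding/gob.
func encodeState(data map[string]string) []byte {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(data); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// handle processes one message from applyCh.
func (kv *KVServer) handle(msg ApplyMsg) {
	switch {
	case msg.SnapshotValid:
		if msg.SnapshotIndex <= kv.lastApplied {
			return // stale snapshot: local state is already newer
		}
		var data map[string]string
		if err := gob.NewDecoder(bytes.NewReader(msg.Snapshot)).Decode(&data); err != nil {
			return
		}
		kv.data = data
		kv.lastApplied = msg.SnapshotIndex
	case msg.CommandValid:
		if msg.CommandIndex <= kv.lastApplied {
			return // already reflected in an installed snapshot
		}
		if p, ok := msg.Command.(Put); ok {
			kv.data[p.Key] = p.Value
		}
		kv.lastApplied = msg.CommandIndex
	}
}

func main() {
	kv := &KVServer{data: map[string]string{"old": "x"}, lastApplied: 2}
	snap := encodeState(map[string]string{"a": "1"})
	kv.handle(ApplyMsg{SnapshotValid: true, Snapshot: snap, SnapshotIndex: 10})
	kv.handle(ApplyMsg{CommandValid: true, Command: Put{"a", "stale"}, CommandIndex: 9})
	kv.handle(ApplyMsg{CommandValid: true, Command: Put{"b", "2"}, CommandIndex: 11})
	fmt.Println(kv.data["a"], kv.data["b"], kv.lastApplied) // prints: 1 2 11
}
```

Delivering snapshots on the same channel as commands keeps them ordered relative to the log entries around them, which is what makes the index comparisons meaningful.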
The maxraftstate limit applies to the GOB-encoded bytes your Raft passes to persister.SaveRaftState().
Your code should pass the Challenge3 tests (as in the example here) as well as the Lab3 tests.
$ go test -run Challenge3
Test: InstallSnapshot RPC (Challenge3) ...
  ... Passed -- 1.5 3 163 63
Test: snapshot size is reasonable (Challenge3) ...
  ... Passed -- 0.4 3 2407 800
Test: restarts, snapshots, one client (Challenge3) ...
  ... Passed -- 19.2 5 123372 24718
Test: restarts, snapshots, many clients (Challenge3) ...
  ... Passed -- 18.9 5 127387 58305
Test: unreliable net, snapshots, many clients (Challenge3) ...
  ... Passed -- 16.3 5 4485 1053
Test: unreliable net, restarts, snapshots, many clients (Challenge3) ...
  ... Passed -- 20.7 5 4802 1005
Test: unreliable net, restarts, partitions, snapshots, many clients (Challenge3) ...
  ... Passed -- 27.1 5 3281 535
Test: unreliable net, restarts, partitions, snapshots, many clients, linearizability checks (Challenge3) ...
  ... Passed -- 25.0 7 11344 748
PASS
ok kvraft 129.114s
Please post questions on KLMS.