In-Memory Key-Value Store
In-Memory Key-Value Store
For this assignment, you will implement a simple distributed key-value store. This will be used as the storage system for your crawler and your analytics platform later on. The basic functionality is quite simple: clients will be able to PUT new values into the store, and GET existing values from the store. The main challenge is coordination: the store will be sharded across multiple worker nodes, and each node will store the values for keys in a certain range. We will use consistent hashing to determine the key ranges: each worker will have an ID, and will be responsible for all the keys between its ID and the next-higher worker ID. As usual, the worker with the highest ID will take care of any keys that are higher than its ID, or lower than the lowest ID. For instance, if there are three workers with IDs banana, fig, and pear, the keys coconut, guava, and tangerine would be stored on the first, second, and third worker, respectively. The key apple would be stored on the third worker as well. This approach requires a way to keep track of system membership. We will use a single coordinator node for this. The worker nodes will periodically check in with the coordinator, and the coordinator will keep a list of currently active workers and their IDs. Clients can download this information from the coordinator, and then contact the relevant workers directly when they want to issue a GET or PUT. This kind of coordination is needed for many kinds of distributed systems, including the analytics framework we will build in later assignments, so we will make this functionality reusable, and factor it out into a separate package. To keep things simple, we will keep the keys and values entirely in memory for now; we will add persistence in the next assignment. Also, we will not implement any kind of replication, so even a single worker failure will cause some unavailability. All the communication will be done via HTTP, using the web server from HW1ā3. We will make a reference implementation of the web server available for now, and your submission should use this reference implementation, so that your HW4 score wonāt be affected by bugs in your HW1ā3 solution. However, the goal very much is to use your own code for the project, so you may want to try your HW4 solution with your HW3 code as well, just in case it uncovers any additional bugs. The key-value store will be used again later, for your project, so we strongly recommend that you aim for robust, well-documented code (and not ājustā code that passes our test cases), and that you fix any problems that the autograder finds. As in the earlier assignments, please do use Google, answers from the discussion group, the Java API reference and, if necessary, a good Java book to solve simple problems on your own. If none of these steps solve your problem, please post on the discussion group, and we will be happy to help!
Requirements
Please start by downloading the HW4 package from http://cis5550.net/hw4.zip. This contains a README file with a short questionnaire, an Eclipse project definition (which you can ignore if you are not using Eclipse), an interface (KVS), and three classes (HTTP, Row, and KVSClient). It also contains a simple implementation of HW3, as a file called lib/webserver.jar. Your solution must meet the following requirements:
Generic coordinator/worker: Your solution should contain four classes: cis5550.kvs.Worker, cis5550.kvs.Coordinator, cis5550.generic.Coordinator, and cis5550.generic.Worker. The KVS classes should extend the generic classes, and the latter should contain as much of the functionality below as possible, so it can be reused in HW6. In particular, the generic coordinator should contain 1) a static function called getWorkers that returns the current workers as a vector of ip:port strings; 2) a static function called workerTable that returns, as a String, the HTML table with the list of workers as described below; and 3) a static function called registerRoutes that creates routes for the /ping and /workers routes (but not the / route) below. The generic worker should have a static function called startPingThread that creates a thread that makes the periodic /ping requests, as described below.
Coordinator: Your cis5550.kvs.Coordinator should accept a single command-line argument: the port number on which to run the web server. The application should maintain a list of active worker nodes, with an ID, an IP address, and a port number for each worker, and it should have make two functions available via HTTP GET requests. A GET to /ping?id=x&port=y should add an entry for the worker with ID x, port number y, and the IP address the request originated from; if an entry for x already exists, its IP and port should be updated. The request should return a 400 error if the ID and/or the port number are missing, and the string OK otherwise. When there are k active workers, a GET to /workers should return k + 1 lines of text, separated by LFs; the first line should contain k, and each of the following lines should contain an entry for a different worker, in the format id,ip:port. When a worker has not made a /ping request within the last 15 seconds, it should be considered inactive, and be removed from the list. In addition to the above two methods, your coordinator should also accept GET requests for /, and such requests should return a HTML page with a table that contains an entry for each active worker and lists its ID, IP, and port. Each entry should have a hyperlink to http://ip:port/, where ip and port are the IP and port number of the corresponding worker.
Worker: Your cis5550.kvs.Worker should accept three command-line arguments: 1) a port number for the worker, 2) a storage directory, and 3) the IP and port of the coordinator, separated by a colon (š. When started, this application should look for a file called id in the storage directory; if it exists, it should read the workerās ID from this file, otherwise it should pick an ID of five random lower-case letters and write it to the file. It should make a /ping request to the coordinator every five seconds, and it should, via a web server that runs on the port number from the command line, make two functions available via HTTP: 1) A PUT to /data/// should set column C in row R of table T to the (possibly binary) data in the body of the request, and 2) a GET for /data/// should return the data in column C of row R in table T if the table, row, and column all exist; if not, it should return a 404 error. In both cases, the angular brackets denote arguments and should not appear in the URL; for instance, a GET to /data/foo/bar/xyz should return the xyz column of row bar in table foo. Row keys, column keys, and table names should be case-sensitive.
Compatibility: Your solution must work with the unmodified reference implementation in lib/webserver.jar, which we will use for grading.
Packaging: Your solution should be in a directory called HW4, which should contain 1) the README file from the HW4 package, with all the fields filled in; 2) the file webserver.jar in a subdirectory called lib; and 3) a subdirectory called src with all of your HW4 source code, including the files we provided, and in the directory structure Java expects (with subdirectories for packages). Please do not include your HW1/2/3 code. Your solution must compile without errors if, from the HW4 folder you run javac -cp lib/webserver.jar --source-path src src/cis5550/kvs/Coordinator.java and javac -cp lib/webserver.jar --source-path src src/cis5550/kvs/Worker.java. Please do try this before you submit! Submissions that fail this basic check will receive a zero.
Extra credit
If you like, you can implement the following additional features for some extra credit. If you do, please indicate this in the README file you submit!
Add a conditional PUT (+5 points): Extend the PUT route so that it accepts two optional query parameters, ifcolumn and equals. If both are present, it should check whether the column whose name is specified in ifcolumn exists (within the same row) and has the value that is specified in equals, and it should execute the PUT only if this is the case; if not, it should return FAIL instead of OK.
Support versioning (+10 points): Instead of keeping only the current values for each row, keep the previous values as well, and give a version number to each. The first value that is PUT into a row should be version 1, and each PUT after that should increase the version by 1. (Notice that version numbers are attached to entire rows, not to individual columns!) Add a Version: xxx header to the PUT and GET responses to specify the version xxx that has been assigned or is being returned, respectively, and add a query parameter version=xxx to the GET that can be used to request a specific version, instead of the most recent version.
