rolandweber / pityoulish

A collection of programming/debugging exercises to support a course on Distributed Systems.

License: Creative Commons Zero v1.0 Universal

Java 72.75% HTML 24.63% CSS 2.30% Makefile 0.20% Dockerfile 0.12%

distributed-systems exercise classroom debugging sockets java-rmi

pityoulish's Issues

Exception info in TLV Error Responses

Enhance the TLV error response of the Binary Protocol to include exception information in addition to an error text. No stack traces, but class and message of nested exceptions. This requires updates of:

protocol specification (html)
server-side response building
client-side response parsing
client-side output formatting

Instructor's Notes for the Sockets exercise

Use Java Logging to write a server log file

Employ java.util.logging across the Message Board code.

Exceptions should be logged where they are thrown (or created), not where they are caught.

On the server, direct most log output into a file, to keep the console clean. It's not helpful for the classroom exercise to see stack traces of ProtocolException there.

Don't use class names as logger names on the server. It would interfere with obfuscation (#25).

implemented:

The servers and FtB clients for Sockets and Java RMI are now enabled for logging. Call java with -Dpityoulish.log=file to generate a log file in the current directory. Call it with -Dpityoulish.log= to enable standard Java log settings, level INFO or higher to the console. If the property is not set, the programs assume -Dpityoulish.log=silent and will log only level SEVERE to the console. Exceptions from invalid requests are logged at level WARNING and will therefore not be printed by default.
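
As a rough illustration of the switch described above, the following sketch maps the value of the pityoulish.log property to a java.util.logging setup. The class name, the logger name, the log file name, the file log level, and the treatment of unrecognized values are assumptions made for this example; they are not taken from the Pityoulish code.

```java
import java.io.IOException;
import java.util.logging.ConsoleHandler;
import java.util.logging.FileHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical helper, not part of Pityoulish.
final class LogSetupSketch {

    // Deliberately not a class name, so obfuscation would not affect it.
    static final String LOGGER_NAME = "pityoulish.sketch";

    /** @param mode value of -Dpityoulish.log, or null if the property is not set */
    static Logger configure(String mode) throws IOException {
        Logger logger = Logger.getLogger(LOGGER_NAME);
        if (mode != null && mode.isEmpty()) {
            return logger; // "-Dpityoulish.log=": keep the standard Java log settings
        }
        // Detach from the root logger's default console handler.
        logger.setUseParentHandlers(false);
        for (Handler old : logger.getHandlers()) {
            logger.removeHandler(old);
        }
        Handler handler;
        if ("file".equals(mode)) {
            // Assumed file name, created in the current directory.
            handler = new FileHandler("pityoulish-sketch.log");
            handler.setFormatter(new SimpleFormatter());
            handler.setLevel(Level.ALL);   // assumption: log everything to the file
            logger.setLevel(Level.ALL);
        } else {
            // Not set, "silent", or (assumption) any other value.
            handler = new ConsoleHandler();
            handler.setLevel(Level.SEVERE);
            logger.setLevel(Level.SEVERE);
        }
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) throws IOException {
        Logger log = configure(System.getProperty("pityoulish.log"));
        log.warning("invalid request");      // suppressed in silent mode
        log.severe("server socket failed");  // printed in every mode of this sketch
    }
}
```

In this sketch, a WARNING for an invalid request only shows up with -Dpityoulish.log=file or with the standard settings, which matches the behavior described above.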

Factory interface for ResponseParser.Visitor

A Message Board client that connects to multiple servers in parallel (see #29) should be able to parse multiple responses in parallel. The Visitor interface is inherently single-threaded. Introduce a factory pattern, so that visitors can be instantiated for each request. The factory implementation may still return the same instance over and over, if parallel processing is not required.
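
A minimal sketch of the factory idea, assuming a much simplified visitor. VisitorSketch is a hypothetical stand-in for ResponseParser.Visitor, whose real methods are not shown here; all other names are invented for the example as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ResponseParser.Visitor; the real interface differs.
interface VisitorSketch {
    void visitText(String text);
}

// The proposed factory: called once per request.
interface VisitorFactorySketch {
    VisitorSketch newVisitor();
}

// A stateful visitor, which is why one instance should not serve parallel requests.
final class CollectingVisitor implements VisitorSketch {
    final List<String> texts = new ArrayList<>();

    @Override
    public void visitText(String text) {
        texts.add(text);
    }
}

final class VisitorFactories {
    private VisitorFactories() {}

    /** A fresh visitor for every request, suitable for parallel parsing. */
    static VisitorFactorySketch perRequest() {
        return CollectingVisitor::new;
    }

    /** The same instance over and over, sufficient for a single-threaded client. */
    static VisitorFactorySketch shared(VisitorSketch instance) {
        return () -> instance;
    }
}
```

A single-threaded client could keep its current visitor and wrap it with shared(...), so only the parallel client pays for new instances.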

Tutorial: Prepare for Java Exercises

Based on a local JAR. Need instructions for:

running the JAR
testing the available functions
locating where code is missing
working with the source code
creating a fixed JAR

No specific IDE. In the worst case, students have to call javac, jar, and java with classname from the command line.

NLS Enablement (light)

Move texts to properties files to enable translation.
Define message catalogs (enums?) with message codes for error/info messages.
Probably need a helper class for looking up the properties, to avoid duplication in the catalogs.

This is just about texts, no consideration of text direction, and with the server locale only.
However, there should be comments explaining how to handle client-provided locales.

Localize Message Board with Java Sockets for German

Provide German localization for:

src/main/java/pityoulish/mbclient/CatalogData.properties
src/main/java/pityoulish/msgboard/CatalogData.properties
src/main/java/pityoulish/sockets/client/CatalogData.properties
src/main/java/pityoulish/sockets/follow/CatalogData.properties
src/main/java/pityoulish/sockets/server/CatalogData.properties
src/main/java/pityoulish/tickets/CatalogData.properties

Message Board with gRPC

Implement the Message Board with remote procedure calls, using gRPC rather than Java RMI. The latter is specific to Java, the former language agnostic.

Initially, this should be a Java exercise, to replace the Java RMI exercise in the classroom. Clients or servers in other languages can be added later.

The Java RMI exercise doesn't expose much of the RMI infrastructure, because Java has become too convenient. The second part about direct messages is artificial. Remote calls are more often used in a client-server scenario, without even callbacks from the server to the client.
Therefore, keep the gRPC exercise closer to the Sockets exercise. If possible, omit the marker of the listMessages call from the API definition. The students have to modify the API definition and re-generate some classes. Of course that will work only if the server can handle requests for both API variants.

Housekeeping in the ticket manager

The DefaultTicketManager does not remove expired tickets from its internal data structures. Not even if the same user obtains a new ticket. The ticket manager should occasionally scan for expired tickets and remove them.
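
The scan itself could look roughly like the sketch below. It assumes that tickets can be reduced to a token with an expiry timestamp; TicketExpirySketch and its methods are hypothetical and do not reflect the actual data structures of DefaultTicketManager.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical store, reduced to what the cleanup needs.
final class TicketExpirySketch {

    // token -> expiry time in milliseconds
    private final Map<String, Long> expiryByToken = new ConcurrentHashMap<>();

    void register(String token, long expiresAtMillis) {
        expiryByToken.put(token, expiresAtMillis);
    }

    int size() {
        return expiryByToken.size();
    }

    /** Removes all tickets that expired at or before the given time, returns how many. */
    int purgeExpired(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = expiryByToken.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() <= nowMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

Passing the current time as an argument keeps the scan easy to unit-test. Whether it is triggered from a timer or piggybacked on ticket requests is left open here.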

CatalogHelper fallback when string is empty

The CatalogHelper class implements fallbacks if a property is not found. If a property is found but yields an empty string, these fallbacks are not used. Reconsider the fallback behavior for the following cases:

format with parameters
format without parameters
lookup

The first case is an obvious error in the catalog.
In the other cases, the empty string could be intentional.

Implement unit tests for the expected behavior.
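
One possible policy for the three cases, as a sketch for discussion: an empty pattern counts as a catalog error only when parameters are passed, and is otherwise returned as is. The class CatalogFallbackSketch, its method signatures, and the fallback text format are assumptions; the real CatalogHelper API may look different.

```java
import java.text.MessageFormat;
import java.util.Arrays;
import java.util.Properties;

// Hypothetical helper illustrating one fallback policy.
final class CatalogFallbackSketch {
    private CatalogFallbackSketch() {}

    /** Lookup: only a missing property triggers the fallback, "" is kept. */
    static String lookup(Properties catalog, String key) {
        String value = catalog.getProperty(key);
        return (value != null) ? value : fallback(key);
    }

    /** Format: "" with parameters is treated as an error, "" without is kept. */
    static String format(Properties catalog, String key, Object... params) {
        String pattern = catalog.getProperty(key);
        boolean missing = (pattern == null);
        boolean emptyWithParams = !missing && pattern.isEmpty() && params.length > 0;
        if (missing || emptyWithParams) {
            return fallback(key, params);
        }
        return MessageFormat.format(pattern, params);
    }

    // Assumed fallback format: the key, followed by the parameters if there are any.
    private static String fallback(String key, Object... params) {
        return (params.length == 0) ? key : key + Arrays.toString(params);
    }
}
```

The unit tests requested above would pin down exactly these decisions, so the policy can be changed in one place if the empty string turns out not to be intentional.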

Follow the Board, with Sockets & TLV

Implement a client that periodically calls "List Messages" with the TLV-based protocol. It comes in handy during the classroom exercise, if the server doesn't print the content of messages put on the board. Compare with #51 and #48.

Originally intended as a homework exercise, to be implemented in a language of the student's choosing. Ideally, the sample solution would not have been in Java, but rather in node.js or Python.

Integrate log files and log configuration into exercises?

Does it make sense to investigate log files and change log levels for an exercise? Might be distracting for the classroom exercises, but how about a tutorial? Could become part of a second tutorial with advanced topics, for interested students.

Depends on a client-side version of #21.

Instructor's Notes for the Java RMI exercise

Instructions
Server JAR
...with stubs and skeletons?

Instructor's Notes for the generic Classroom Setup

follow-up to #19

Java RMI: Automatically generate Direct Messages

Periodically send a direct message from the server to all published outlets. Similar to #17.

Automatically remove unreachable outlets? The server-side implementation could use a local method to unpublish without a ticket, which is not provided in the remote API. Such cleanup could also be reported as a system message on the message board.

Package description for pityoulish.sockets.client

There will be some redundancy with the Code Overview in the exercise, but that's OK.
The JavaDocs should speak for themselves, independent of the exercise.

Implement a non-trivial sequencer for the Message Board

Plain numbers from the TrivialSequencerImpl are too simple.

Classroom: Message Board with Java RMI

Provide a JAR with Remote API and a client implementation with holes in the code.

Restrict ticket replacements

Restrict how often a ticket may be replaced. For example, allow two replacements but deny the third. Returning the ticket and obtaining a new one with separate requests should still be possible. Replacing the newly obtained ticket should also be possible again.

There's no particular reason for this. It's just a bit of fun programming to introduce another error situation that students may encounter.
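
A sketch of the counting rule, assuming that a replacement yields a new ticket object which inherits the count of its predecessor. TicketSketch and the limit constant are invented for this example; the real ticket classes are not shown here.

```java
// Hypothetical ticket, reduced to the replacement counter.
final class TicketSketch {

    // Two replacements are allowed, the third is denied.
    static final int MAX_REPLACEMENTS = 2;

    private final String username;
    private final int replacements;

    private TicketSketch(String username, int replacements) {
        this.username = username;
        this.replacements = replacements;
    }

    /** Obtaining a ticket with a separate request starts counting from zero. */
    static TicketSketch obtain(String username) {
        return new TicketSketch(username, 0);
    }

    /** Replacing a ticket carries the count over to the new ticket. */
    TicketSketch replace() {
        if (replacements >= MAX_REPLACEMENTS) {
            throw new IllegalStateException(
                "ticket for " + username + " was already replaced " + replacements + " times");
        }
        return new TicketSketch(username, replacements + 1);
    }

    int replacements() {
        return replacements;
    }
}
```

Because the counter lives in the ticket rather than with the user, returning the ticket and obtaining a new one resets it, as required above.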

Use ByteBuffer for request parsing (Sockets)

The RequestParser on the server side expects a byte[], while response building returns a ByteBuffer. This leads to an ugly mix of both in RequestHandler.handle. Refactor the request parsing to use ByteBuffer, too.

The client side (written later) already uses ByteBuffer consistently.

More output from the Message Board Server with Sockets

Need to see granted and returned tickets, and messages written, for the classroom exercise.

Generate system messages for the Message Board

Automatically generate system messages, like:

There are X messages on the board.
The time is XX:XX.
The last user message was posted XX minutes ago.
X user messages have dropped from the board so far.
... more useless statistics ...

Depends on issue #16: support slots for system messages.

Note that the default implementation MixedMessageBoard is not thread-safe. Either use a thread-safe wrapper (similar to java.util.Collections.synchronizedSortedMap), or trigger message generation synchronously from the SimplisticSocketHandler. The purpose of this task is to see changes of the board messages, even if testing on a local system with a single user. At the same time, the board should not flow over from pointless system messages, so these should use only a few slots and replace older system messages.

Boundary checks when parsing TLVs on the client side

TLVResponseParserImpl has some comments about missing boundary checks. Review the code and add missing checks. Maybe the TLV utility classes can be improved to simplify such checks?

Support slots in the MixedMessageBoard

Currently, the MixedMessageBoard supports only system messages without a slot. Implement slots, which means that system messages can be replaced and deleted. Cover the new functionality with unit tests.

Return some errors without exception (Sockets)

The application logic in pityoulish.sockets.server.MsgBoardRequestHandlerImpl should not have to report all errors by throwing a ProtocolException. Application-level errors are not protocol errors. The handler should have a way to return a plain error message, which is sent back to the client without an exception classname.

See a comment in MsgBoardRequestHandlerImpl.putMessage for ideas. There might also be cases in other handler methods where an error message without exception class would be appropriate. Maybe return a status object, which may indicate an error with an error message, or success with a genericized return value. That would better match the protocol specification, too.
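
The status object could be as small as the following sketch. The name Outcome and its methods are assumptions; nothing like it is confirmed to exist in the code base.

```java
// Hypothetical status object: either a value or a plain error message.
final class Outcome<T> {

    private final T value;
    private final String errorMessage;

    private Outcome(T value, String errorMessage) {
        this.value = value;
        this.errorMessage = errorMessage;
    }

    static <T> Outcome<T> success(T value) {
        return new Outcome<>(value, null);
    }

    static <T> Outcome<T> error(String message) {
        if (message == null) {
            throw new IllegalArgumentException("error message must not be null");
        }
        return new Outcome<>(null, message);
    }

    boolean isError() {
        return errorMessage != null;
    }

    /** The return value of a successful call; asking for it after an error is a bug. */
    T value() {
        if (isError()) {
            throw new IllegalStateException("no value, error was: " + errorMessage);
        }
        return value;
    }

    /** The plain error text to send back to the client, or null on success. */
    String errorMessage() {
        return errorMessage;
    }
}
```

A handler method could then return Outcome.error("...") for an application-level problem and keep ProtocolException for genuine protocol violations.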

Java RMI Server does not report exceptions

The console output doesn't show exceptions thrown by the remotely called methods. Options:

Report the exception where it is thrown. Related to issue #21.
Implement wrappers that delegate the remote calls and log exceptions that come up.

The first option is tedious to implement, but an effort that needs to be spent for issue #21 anyway. The second option could mislead students into thinking that it is necessary, or at least sensible, to implement wrappers for remotely callable objects. Somehow, I prefer the first one.

Encapsulate logic to detect the end of PDUs received via sockets

Classes SimplisticSocketHandler on the server and MsgBoardClientHandlerImpl on the client are supposed to be independent of the wire format. They use helper objects for parsing the received PDUs. However, both classes implement some limited TLV parsing in order to detect when a full PDU is received. Ideally, these classes should work without changes if the wire format is changed, for example to JSON instead of TLVs.
Refactor the code to detect PDU boundaries. Implement a helper that can be used on both the server and the client. Keep the interface independent of the current TLV encoding. While TLVs provide the length of the PDU in the header, other formats might not. The helper should work equally well with blocking IO and asynchronous NIO.
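
To make the idea concrete, here is a sketch of such a helper with two implementations. All names are hypothetical. The length-prefixed variant assumes a made-up 4-byte header (2 bytes type, 2 bytes big-endian value length), which is NOT the header format of the Pityoulish protocol specification; the line-delimited variant stands for formats such as one JSON document per line.

```java
import java.nio.ByteBuffer;

// Hypothetical helper interface, independent of the wire format.
interface PduFramer {
    /**
     * Inspects the readable bytes between position and limit without consuming them.
     * @return the length of the first complete PDU, or -1 if more bytes are needed
     */
    int completePduLength(ByteBuffer received);
}

// Assumed header layout for illustration only: 2 bytes type, 2 bytes length.
final class LengthPrefixedFramer implements PduFramer {
    static final int HEADER_LENGTH = 4;

    @Override
    public int completePduLength(ByteBuffer received) {
        if (received.remaining() < HEADER_LENGTH) {
            return -1; // header itself is incomplete
        }
        // Absolute read, so the position of the buffer stays untouched.
        int valueLength = received.getShort(received.position() + 2) & 0xFFFF;
        int total = HEADER_LENGTH + valueLength;
        return (received.remaining() >= total) ? total : -1;
    }
}

// A format without a length field: the PDU ends at the first newline.
final class LineDelimitedFramer implements PduFramer {
    @Override
    public int completePduLength(ByteBuffer received) {
        for (int i = received.position(); i < received.limit(); i++) {
            if (received.get(i) == '\n') {
                return i - received.position() + 1;
            }
        }
        return -1;
    }
}
```

Since the helper only looks at a buffer of bytes received so far, the same object can be called after each blocking read as well as from an NIO read callback.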

Classroom: Message Board with Sockets & TLV, in Java

Message Board with a TLV-based, binary protocol over socket connections.
Provide a protocol description and a client JAR with holes in the code.

Refactor to avoid client code duplication

Some of the code for the Sockets client and Java RMI client is identical, or nearly so:

MsgBoardCommandDispatcher class and MsgBoardClientHandler interface
*BackendHandler interface and *Impl class, the handling of hostname and port
Catalog class and CatalogData properties, the command-line argument messages
some of the console output messages, too

Package name? pityoulish.cmdline.mbclient would match the scope. But #11 introduces a common server-side package which is not specific to command-line arguments. So maybe pityoulish.mbclient here and pityoulish.mbserver there?

Consolidate the duplicated code into a common client utility package.
Update the Code Overview sections of the exercise descriptions.

Classroom: Message Board with HTTP & JSON

~~Provide client code in node.js (JavaScript) with holes.~~

For consistency of the classroom exercises, provide a command-line client in Java. A browser-based UI with JavaScript or TypeScript would be a nice addition though.

Java RMI: print success messages

The Java RMI client and outlet programs should print success messages after operations that have no other output, like "return ticket" or "put message". Lesson learned from #48.

Not a problem for the sockets client, because that always prints the TLVs which are sent and received.

Explicit input validation on the server

Input data sent by the client should be validated explicitly: Text, Originator, Marker, Ticket.
The initial implementation of the Message Board server with sockets just passes these values to server-side APIs and relies on implicit validation of arguments. Except for the Limit, which is already validated explicitly because the binary protocol specifies the valid range.

Implement explicit validation logic that can be re-used for Java RMI as well. Re-usable logic cannot throw a ProtocolException, because that class is specific to the Sockets implementation. Actual validation can take place in the generic server-side code (pityoulish.msgboard and pityoulish.tickets), but there should be an API which is independent of those. This allows enforcing additional constraints in the protocol layer.

Prevent hard-coding of server name or IP

During the classroom exercises (Sockets, Java RMI), students could hard-code the server name instead of using the respective command-line argument, without encountering an error. With just one server machine in the network this is hard to prevent.

Improve the comments in the code? Change the server IP address during the exercise? Run two servers in parallel?

Detected during Sockets exercise 2016 (#38).

Classroom: Direct Messages with Java RMI

Enhancement to the Message Board with Java RMI.
Let students register their own, remotely-callable objects.

Custom style for JavaDocs

The generated JavaDocs contain a style sheet from the (Open)JDK used to generate them. Personally, I never liked the change from JDK 6 to JDK 7 styles, because the fonts became so much smaller. And I'm not too fond of the differently licensed style sheet that ends up in the binaries. Replace it with a custom "Pityoulish" style sheet.

IMPORTANT: The custom style sheet must be developed from scratch to be put into the public domain. We cannot use the style sheet from a JDK and make modifications to it, because the license of the original file would apply. There's no license notice in it, but somebody does own the copyright, and some kind of license applies. If the file is modified and published with a different license outside the context of the (Open)JDK, the copyright owner could rightfully complain.

Feedback: Tutorial 2016

Students needed between 1 and 2 hours to work through the Tutorial.

For a 1-hour student, it was too much effort for too little coding... a developer should know that stuff.

Other students, who program less frequently in Java, liked the detailed descriptions, to get into the swing again.

"Towards the end, the comments in the code were more helpful than the description [in the Tutorial]."

All in all, the Tutorial has achieved its purpose. I noticed only one group that encountered technical problems with their IDE. This might have been caused by creating the project differently than they did while working through the Tutorial.

Create temporary GitHub repos for exercises?

Some students prefer to clone a GitHub repo rather than downloading a JAR with included source code. Lesson learned from #48. Could create separate repositories with the processed source code for each exercise, to be deleted when the term is over.

Requires updates of the tutorial and the exercise descriptions, because there are two options to get at the source code. Does it require an Ant build file as well? Additional explanations for generating a runnable JAR? IDE-specific adjustments? The exercise description is not in Markdown format, so it isn't automatically rendered with the GitHub repo.

Obfuscate server JARs

When programming at home (for example #5), students need to run their own server. Unfortunately, experience has shown that some students are likely to load a server JAR straight into their IDE. And not even into a separate project, but mixed up with the client classes.

Reduce the incentive for and impact of such behavior by providing obfuscated server JARs for home programming.

ProGuard is a free obfuscator and comes with an Ant task.

Package description for pityoulish.sockets.server

Similar to #24, but for the server.

Update Readme

Update the Readme to reflect the experience of the first iteration.

Revise ticket format

With the current username@random format, a student has mixed up username and ticket. See #38.

Refactor MsgBoardClientHandlerBase/Impl classes

The base class doesn't need the SocketBackendHandler. I put it there for convenience, assuming that a client would only ever connect to a single server. But with an asynchronous handler implementation, or a multi-threaded synchronous implementation, it would be possible to query several message boards in parallel. It's also questionable whether such implementations would use the command-line related SocketBackendHandler interface at all. Other helpers of the base class would still be useful though.

Move the socketBackend attribute to the implementation class.

Feedback: Sockets 2016

Before the exercise, I talked about message formats. Tried to compare text-based formats (HTTP) with binary formats. Introduced TLVs on a high level. We invented an example out of the blue. Bad idea, didn't work.

After 60 minutes of coding, students were not done and wanted to continue. I stopped after 80 minutes of coding. Levels of completion:

everybody managed to connect
almost everybody could parse the returned token
about half added the marker (or were working on that?)
one group replaced tokens, another was working on that

About three quarters of the students afterwards considered the time as well spent.

Many students had problems with the "Think!" step. I noticed students trying to code without looking at other places where similar things are done, without checking standard JavaDocs on how to use a class, or without looking for locally available data. Several students failed to give their IDE project a decent name - I'll never understand these youngsters.

Some students mixed up tickets and usernames. They used only the last part of the ticket instead of the full ticket. Or they tried to return a ticket by passing the username instead of the whole ticket. Change the ticket format?

One group encountered problems with their IDE, IntelliJ. The project was configured for Java 1.3 instead of 7 or higher. The Main class was "not found" on execution. There were no such problems during the Tutorial. During the exercise, they unzipped the JAR file and created the IDE project from the files in the filesystem, rather than by importing the JAR. This might have been the difference from what they did in the Tutorial.

On the first attempt to request a ticket, the ticket is not shown because of the missing code. When trying again, the server responds with an error instead of a ticket, so the problem cannot immediately be reproduced. Requesting a second ticket is not possible, because the server tracks the client IP addresses. Returning the ticket is possible, but you have to read it from the server screen. In the worst case, students have to wait a few minutes for their ticket to time out.

Refactor code for server-side console output

Since issue #20, class pityoulish.sockets.server.RequestHandlerImpl has a set of describe* methods that print information about requests and responses to System.out. Define an interface for these methods, and move their implementation to a separate class.

Compare with FormattingVisitorImpl on the client side.

Message Board with Transport Layer Security (TLS)

Adapt the Message Board with Sockets implementation to use TLS instead of plain sockets.
Use self-signed certificate(s) on the server, make sure the clients validate.

Classroom? Homework? Optional?

Failover when connecting sockets

The SocketBackendHandler currently resolves the target hostname to a single IP address and connects there, either synchronously or asynchronously. Instead, look up all IP addresses for the hostname and try them in a loop. If none of the addresses responds, throw an exception and use Throwable.addSuppressed to attach the exceptions for all the failed attempts.
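
For the synchronous case, the loop could look roughly like this sketch. FailoverConnector and its method names are hypothetical and not part of the SocketBackendHandler interface; the connect timeout parameter is an addition for the example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for the blocking case.
final class FailoverConnector {
    private FailoverConnector() {}

    /** Looks up all addresses of the host and tries them in order. */
    static Socket connectAny(String hostname, int port, int timeoutMillis) throws IOException {
        List<InetAddress> addresses = Arrays.asList(InetAddress.getAllByName(hostname));
        return connectAny(addresses, port, timeoutMillis);
    }

    /** Tries the given addresses in order, returns the first connected socket. */
    static Socket connectAny(List<InetAddress> addresses, int port, int timeoutMillis)
            throws IOException {
        IOException failure =
            new IOException("none of " + addresses + " accepted a connection on port " + port);
        for (InetAddress address : addresses) {
            Socket socket = new Socket();
            try {
                socket.connect(new InetSocketAddress(address, port), timeoutMillis);
                return socket;
            } catch (IOException e) {
                // Remember why this address failed, then move on to the next one.
                failure.addSuppressed(e);
                try {
                    socket.close();
                } catch (IOException closeFailure) {
                    e.addSuppressed(closeFailure);
                }
            }
        }
        throw failure;
    }
}
```

Splitting the lookup from the loop is one way to share the lookup code with the asynchronous case, which would walk the same address list from its completion handlers.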

Note that this is a place where the students have to fill a gap, so the gap will have to be revised. Lookup code should be shared between the synchronous and asynchronous cases. Ideally, error handling code is shared as well. But some copy'n'paste from the asynchronous to the synchronous case is acceptable.

For testing the scenario, play around with /etc/hosts or the Windows equivalent. Define a hostname there, with some IP addresses pointing to wrong hosts or nowhere. Use debug output to verify that connection attempts fail before the correct host is contacted.

Tweak the SimpleSequencerImpl

Besides the instance-specific prefix, SimpleSequencerImpl translates numbers into letters as follows:

0 -> a
1 -> b
25 -> z
26 -> ba

That's consistent, because 'a' represents zero and therefore never appears in the lead position. But it would be cooler if it did, like...

26 -> aa
51 -> az
52 -> ba
701 -> zz
702 -> aaa

Mathematically, the leading letter is interpreted as 1-26, while subsequent letters are interpreted as 0-25. Sounds easy, but is easy to mess up too.

The examples above interpret a single letter as 0-25, even though it's a leading letter. There is no need to encode the number 0, so this would also work:

1 -> a
26 -> z
27 -> aa
52 -> az
53 -> ba

The charming thing about those tweaks is that the comparator doesn't have to be changed at all. Longer identifiers correspond to higher numbers, and alphabetical sorting is correct for identifiers with the same length.
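
The first tweak (0 -> a, 26 -> aa, 702 -> aaa) boils down to bijective base-26 numbering with an offset of one. The sketch below shows the encoding and the unchanged comparison rule; the class LetterSequenceSketch is hypothetical and ignores the instance-specific prefix of SimpleSequencerImpl.

```java
import java.util.Comparator;

// Hypothetical encoder for the tweaked numbering, without the instance prefix.
final class LetterSequenceSketch {
    private LetterSequenceSketch() {}

    /** Shorter identifiers first, alphabetical order within the same length. */
    static final Comparator<String> ORDER =
        Comparator.comparingInt(String::length).thenComparing(Comparator.naturalOrder());

    /** 0 -> a, 25 -> z, 26 -> aa, 51 -> az, 52 -> ba, 701 -> zz, 702 -> aaa */
    static String encode(long number) {
        if (number < 0) {
            throw new IllegalArgumentException("negative number: " + number);
        }
        StringBuilder letters = new StringBuilder();
        long n = number + 1;  // shift, so that every letter position counts 1-26
        while (n > 0) {
            n--;              // back to 0-25 for picking the letter
            letters.append((char) ('a' + (n % 26)));
            n /= 26;
        }
        return letters.reverse().toString();
    }

    public static void main(String[] args) {
        for (long number : new long[] {0, 25, 26, 51, 52, 701, 702}) {
            System.out.println(number + " -> " + encode(number));
        }
    }
}
```

The second variant, which starts at 1 -> a, is the same function called with number - 1.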

Feedback: Java RMI 2016

tl;dr: The session ran completely out of time. I have to prepare totally differently next year. See further below for my ideas.

I started by explaining the steps for development in general: define API, generate stubs and skeletons, export objects, use the registry. Over 30 minutes of dry talk, and by the time the students got to that point in the programming session, they probably didn't remember a thing. Doesn't work.

I let the programming session run close to the end of the lesson, almost 120 minutes. After 45 minutes, the first group started on the second part, while others were still struggling with locating the registry. After 60 minutes, about half of the students were working on the second part. Nobody completed both parts, but nobody wanted to continue either. One student asked about the server later, to complete the exercise at home. During the last minutes, I gathered feedback:

Some hands-on programming is useful, compared to just listening. (1)
It didn't feel like applying the things I had explained about remote calls. (2)
I agree. The theory was about stubs, skeletons, and sending messages. But all that happens under the cover. The programming was about looking up objects in the registry, and exporting one.
It didn't feel like programming at all, just debugging. (2)
Yes, that's how the sessions are designed. In real life, a lot of programming is debugging, too. But I thought the students would get the big picture while looking at the pieces. That's not the case.
How about programming an application from scratch? (1)
Takes too long for the classroom. Could do it as homework, but that would have to be voluntary. The programming is not relevant for the grade, and I don't want the minimum-effort students to spend their preparation time on programming instead of learning the relevant material. Actually, I don't want to spend my preparation time on a completely optional programming exercise either.

During the programming session, two students asked about the meaning of the marker. Don't know how many others were wondering without asking. I thought that the functionality of the Message Board would be familiar by now, from the sockets exercise. Not so.

One student cloned the GitHub repo with the full source, instead of downloading the JARs with the processed sources. Should I set up temporary GitHub repos with the processed source for the exercises? #50

Several students were struggling with RegistryBackendHandlerImpl.getTicketIssuer(), without noticing getMessageBoard() just a few lines above - although it was right there on their screen. Not sure if any of them looked at the JavaDocs of the interface and noticed the similar purpose of these methods. No idea how to overcome that.

Not everybody was familiar with typecasts. At least one IDE showed an auto-correct option to add the typecast. But it's not the idea of the exercise to select from auto-correct options.

The programs should print a success message for operations that have no other output, like "return ticket" and "put message". #49

There were problems with desktop firewalls on corporate laptops. The firewall blocked the remote calls from the server to the outlet, so it wasn't possible to publish the outlet. The program just hung for minutes; apparently the server didn't even get an error when trying to connect. For some, but not all, it worked when exporting the outlet on port 21.

For 2017...

I must present the functionality of the Message Board in class. Week 1, or at the latest before the sockets exercise. Students have too much other information to digest during the programming sessions. I have to give them the big picture before (or after).

I should approach the RPC/RMI topic from a completely different angle. Instead of theory, start with the programming exercise. Guide the students by switching between explanations and programming. In exercise order, rather than chronologically:

There's an object on the server, it's in the registry, look it up and make a call. -> programming
Export your own object and publish it on the server. -> programming
Call other students' objects, handle errors. -> programming
Clean up your object to exit cleanly. -> programming

Don't make the programming/debugging blocks too small. Students should have enough time to dive into the code, they shouldn't just wait until the solution is presented.

Explain background and theory of stubs, skeletons, generating both, serializing data,... afterwards. Maybe even the week after.

Message Board Server with Sockets using Java NIO

Implement the server side with asynchronous IO in Java. Most of the helpers are agnostic and can be used for both synchronous and asynchronous IO. The server should be able to serve more than one request over the same connection.
The SimplisticSocketHandler cannot do that, because it uses synchronous IO with a single thread. That means it couldn't wait for a second request on an existing connection, and for a new connection on the server socket, at the same time. That's easily done with asynchronous IO.

Unit tests with JMockit

Use JMockit to stub objects in unit tests.

Test candidate for first use: pityoulish.sockets.server.RequestHandlerImpl. It expects three different helper objects. The logic in the class is about calling these helpers in the correct order, feeding the output from one as input to the next.