Traveling Salesman: 2009

Monday, November 9, 2009

PRES

The first thing I thought when I read that this was going to talk about to recreate bugs in a multi-threaded system was the previous paper CHESS. In CHESS, the system was basically trying all possible thread interleavings (being smart about it of course) in order to replicate the dreaded “Heisenbug”. PRES on the other hand has an slightly different approach. Instead of trying to achieve all possible thread interleavings, PRES trys to use previously known interleavings or execution sketches as guidelines and feedback from failed replays to get closer to reproducing a target concurrency bug.

Armstrong Chapter 6

We have finally come to the end of the Armstrong thesis. I thought this chapter did a very good job explaining how everything we have read up till now connects so well together. I liked how the supervisor structure that was talked about in the previous chapter was used to add a hierarchy to a parallel structure to protect it from faults.

Overall I really enjoyed reading the thesis as a whole, even though some of the code examples required me to stare at it for quite sometime in order to really know what was happening.

Loop Parallelism, Task Queue, Graph Implementation Strategy and Pipeline, Geometric Decomposition and Data Parallelism

This is one of the most practical patterns I have read about parallelizing code. Its premise is so simple. It starts out by identifying key loops within a program. Then you analyze how that loop handles data and its dependencies. The next step is the heart of the pattern and the most work, in my opinion, to identify key loop transforms. The author provides a list of common transforms that can be used on most occasion. The programmer will then keep applying these transforms, ensuring that serial equivalence at each step.

The Task Queue pattern can be defined as a “mechanism to synchronously distribute a sequence of tasks among parallel threads of execution”. The idea is that problem is broken down into smaller “tasks” which are then placed in a queue. Parallel threads will then take tasks off the queue and perform computations on them.

The graph implementation strategy paper seemed to talk more about how to partition the graph than how to incorporate parallelism by exploiting the concurrency in those graphs.

Pipeline is a very simple pattern that achieves parallelism by dividing the operations or the data into stages that can be acted on in parallel. It seems very familiar to the pipes and filters pattern. I have not personally used this pattern, but I understand why this pattern is widely used.

I don't think there is much to say about the Data Parallelism pattern. I just thought it was weird that it was only 3 sentences long.

Monday, November 2, 2009

Armstrong Chapter 5

Let's face it. No one can write perfect code. The best we can hope for is to write fault tolerant code and even that is difficult. One of the first things that needs to be understood by the programmer is what exactly is an error. There are also different levels of severity for errors. Some errors can be handled by the function that caused the error, while others may need to know more information in order to fix the problem. In these cases the programmer will just throw an exception and let the supervision trees handle where to take the problem next. Supervision trees are hierarchies of processes where the parent has an OR/AND relationship with its children. If a parent receives an exception from one of its children, depending on its relationship, it may restart one or all of its children processes.

Task Parallelism, Discrete Events and Recursive Splitting

Well this no better way to start off this blog than to answer the first question.

1)When you first saw this pattern title, did you feel like you knew the subject beforehand? After reading the pattern, did you confirm your impression or did it surprise you? How?

I have to be honest I did think that I knew what the subject was and I was surprised. I thought that is was a simple pattern of trying to add parallelism to sets of independent data. It was very interesting to see how this sort of pattern can be used to solve a wide range of problems. It would have been nice for me if they gave a simple code example to flush out the pattern a little better.

Recursive splitting was a very well written straight forward pattern. The examples they gave also helped clear up any confusion on what the pattern was actually doing.

Wednesday, October 21, 2009

Dense Linear Algebra, Graph Algorithms, and Monte Carlo

I have never used any of these patterns. I also never heard of them specificity, but I knew of the ideas behind the first two patterns. The Linear Algebra pattern allows for the efficient calculation of matrix multiplication. It does this by using outer products and improves on spacial locality by splitting up the matrices into blocks. Which lends to the caveat of this pattern. In order to use it, the programmer needs to be able to control how the matrices are laid out in memory.

I am currently in the Algorithm course here and I have seen that lots of problems can be easily solved, as well as visualized, with the help of graphs. There are many ways in order to store graphs in memory depending on the type of graph and this paper does a pretty good job at explaining it.

The monte carol pattern is the one I never heard of before and its design is an interesting one, if not simplistic. How it works is that is runs many different independent experiments in parallel, then it computes the summary of all the results.

Monday, October 19, 2009

Armstrong Chapter 2

This chapter focus on fault tolerant systems with an emphasis on switching equipment. The paper says in order to be fault-tolerant, the systems needs to be fault-isolated, which means a failure in one part of the system will not effect the whole system. This is achieved in the paper by running each modules in separate processes and the communication between these processes is read-only. This way an error in one process will not infect another. There is a lot of overhead when this is done with OS processes.

I am reminded of the other paper in the book about fault-tolerant operating systems, Guardian. Guardian's system was about redundancy. It had duplicated hardware as well as software processes. This is more of trying to isolate systems as to not interfere with one another.

Map Reduce & EBII

Map Reduce & Event Based Implicit Invocation

Is map reduce related to any previous patterns we've read? How are they the same, how are they different?

It is similar to the Java Fork/Join Framework. A problem can be split into smaller parts that can be done in parallel to improve efficiency. At the end they are combined to form the final answer. In the java fork/join framework the worker threads could also steal jobs from other workers in order to improve the efficiency even more.

How should failures and errors be handled? Should the framework handle errors, or the application developer?

I think it should be on the application developer to handle the errors. Just like how the application developer uses the simple framework to define a map and reduce function, they should also add error handling.

Are there any interesting uses of map reduce anybody has come across?
I only first heard about Map reduce last year and have never seen it used in my day to day job.

Event-based implicit invocation

How is this pattern related to the well-documented observer pattern and publisher/subscriber patterns? Are they the same? How are they different?

This pattern is very similar to the observer pattern. They both have a list of “listeners” that will get called when a certain event happens. In this implementation, it is possible that the announcer can call a component directly when an event happens, but that component should have the ability to ignore the message. In the pure form of this pattern, the announcer does not know exactly if anyone is listening to the event.

Are there any good real world examples of this pattern, besides what the authors mention?
The new google wave probably will use a pattern similar to this. Google wave is "a personal communication and collaboration tool” that will merge email, IM, wiki, and social networks together. I am currently waiting to try out the beta. (Crosses Fingers)

Wednesday, October 14, 2009

Chess

Am I the only one who thought this paper was on a chess architecture? Even though it wasn't about that, I still enjoyed this paper. I, like many others, have come across these “Heisenbugs” and have tried my hardest to debug them. Just recently I have fixing a bug that I had to run 30 min of the same test over and over again to reproduce it. I definitely could of have used a system like this to help me out. However Chess itself could not help me since the program was in Java. Since this was developed by Microsoft, it is only usable on Win32, .Net, and the Singularity research platform. It would be interesting to see this sort of technique used in other platforms like Java.

It was also interesting to read about the search algorithm used to determine the next state to check. Sine a thread could potentially be preempted at anytime, the number of thread interleaving can be quite large, if not infinite. It turns out however that most bugs can be reproduced with only a few threads and that the preempting doesn't factor in as much as some people think it would.

Tuesday, October 13, 2009

BA Chapter 14

I never used smalltalk until last semester. Once I got over the hurdle of the syntax I could see that this was a powerful language. Since I already knew smalltalk before reading this chapter, I didn't get anything out of the parts that talked about the syntax of smalltalk. I see a little benefit of discussing the syntax in the context of the structure of the language, but it seems to me that this part would be better off in a section to teach smalltalk. The paper just did too much on this part.
One of the main things about Smalltalk is that everything is an object and they just send messages to each other to communicate. Java also uses this line of thinking but does not go to the extent that Smalltalk does. Java does have primitive types that are not objects, but those primitives can be encapsulated into Objects (Integer holding a primitive int for example).

BA Chapter 13

I have hardly worked with a functional language before so I liked how this paper laid out both with examples of each. It made it easy to compare what I already know about the OO approach and see how it differs from the functional approach.

What types of problems lend themselves to an Object-Oriented Solution?
If you want to simulate objects and their relationships to one another, then the OO solution is the way to go.

What types of problems lend themselves to a Functional Solution?
A problem of the mathematical verity lends itself well to the functional solution. These problems will calculate a result and return it and they also don't care about the global state.

BA Chapter 12

I would like to make a comment about something the author said about the social and coordinative aspects of Free Software, “Communication is key, and it is hampered by time zones, cultural barriers, more or less reflected preferences and prejudices, and also simply by physical distance”. While I agree that these are problems, I don't think these are limited to Free/Open source software. I have run into these problems at my job since at one point we had to communicated with a team in India. I am sure that they are more of a problem in the Free/Open environment, but they do exist in the corporate world.

I also found the KDE motto of “those who do the work decide” to be a interesting one. There can still be arguments/disagreements of what direction the program should go, but I feel that developers will have a better time doing work they believe in and not work they were told to do.

Wednesday, September 30, 2009

Java Fork/Join

Programmers have been lucky the in past. Since processor speed has been increasing, their code would run faster without modification. Now the focus is on multi-core systems so the performance increase is not just given to the programmer. They have to modify their code to work on multiple cores in order to increase performance.

What I thought was the most interesting part of this framework is the idea of stealing work. I really never thought about it before but it makes perfect sense. Instead of waiting to be told to do a task, the process just takes some work from another. However this process is costly because of the synchronization issue so it is still important to balance the work among the available workers.

I have never worked directly with worker thread pools in my work. I have used systems that have them but I never touched that part of the code or needed to add another worker. I can definitely see the the benefit of using this framework on small tasks.

BA Chapter 11

For the first two years of my undergrad, I used emacs to code projects. I admit I used it in the fashion the author discourages. I opened up a new emacs sessions for every file I wanted to modify. I didn't know all the tricks that emacs offered at the time and even when I did find out I had already switched to vi and didn't feel the need to go back.

During the first part of this chapter, it lists all the different things that emacs can do. I was amazed with all the different functions that emacs has. I had no idea. I just thought it was a powerful text editor. Also with all this functionality, a lot of different developers working on the code, and its age, one would expect it to be a mess at the code level. It is not however. It is just a great testament to how well emacs was designed in the first place that it survived this long and is still under development.

Monday, September 21, 2009

BBOM (Big Ball of Mud)

“This paper examines this most frequently deployed of software architectures: the BIG BALL OF MUD.” This is a very sad but true fact. I didn't even know the definition of the big ball of mud pattern when I read this line in the beginning of the paper, but I had a good idea where it was going.

I have to say have done all of the patterns discussed in this paper. Mostly in my undergrad education. The one that I used most was throwaway code. Even though it has been since drilled into my head “Don't use a prototype in the final product”, it is just so easy for a small school project (that is not on architecture) to use some throwaway prototype as a part of the final product.

Moving from academia to corporate america I have seen big ball of mud patterns used less. I definite agree with the author on how to avoid this is code reviews and refactoring. The product I am working on now has very strict guidelines on what you need to do when you are changing code. It must be code reviewed by at least one senior programmer and it has to be run through a static code analysis tool to check for “code smells”. Of course this wasn't always in place so the big ball of mus is still in there.

BA Chapter 8

I found this to be an interesting read, but it didn't seem to fit in with the other book chapters. Then again an argument can be made that we learn more from mistakes than successes. Guardian was Tandem's solution for a fault tolerant system developed in the 1970's. They eliminated single points of failure by duplicating hardware to make a extremely reliable system.

I found it interesting that one of the big things about operating systems today is how secure they are, but Guardian didn't really put too much thought into security. Of course their first objective was to design it to be fault tolerant, but I still think a little more emphasis could have been placed in security. They used a privileged bit to distinguish privileged code. It seems to me that this could be very easy to get around.

Wednesday, September 16, 2009

Layered Architecture

There really is no better example for a layered architecture than TCP/IP. This pattern involves stacking different services on top of each other, each performing a task. If the current layer can't not handle the request, it sends it further down the line until it can. Then the final answer is sent back up the stack. Another form of this architecture is the bottom-up approach. This is where the information flows from the lower levels, like a driver for example, up the stack getting translated and modified along the way, til it reaches the user. Of course it is possible that the information may not go through all the layers in order to get serviced in the top-down or the bottom-up approach.

There are ten steps for refining the definition of a layered architecture.

1. Define the abstraction criterion
2. Determine the number of abstraction layers
3. Name the layers and assign tasks to each of them
4. Specify the services
5. Refine the layering
6. Specify an interface for each layer
7. Structure individual layers
8. Specify the communication between adjacent layers
9. Decouple adjacent layers
10. Design an error-handling strategy

I never had to refactor code to use a layered architecture, but having this sort of high level check list would be very handy to get started. I still feel that it could still take lot of careful work in order to make sure that you have done everything properly.

BA Chapter 7

The virtual machine running a guest operating system no longer has the kernel as the most privileged software running. That is now the hypervisor. Before Xen, in order for the kernel to execute privileged instructions they would have to modified. Since some of these instructions could cause an error, the hypervisor could trap it and emulate the instruction and return the result. However some instructions that fail silently without a trap to the hypervisor. To handle these cases, it was necessary to scan the OS code at runtime and replacing those code sections with code that calls the hypervisor. This can hinder performance but keeps the guest OS thinking that it is still the most privileged, thus maintaining perfect compatibility.

Paravirtualization is the process that Xen uses to handle the privileged instructions. Instead of scanning the code at runtime, the operating system's code was rewritten to allow it to run in user mode. As the hypervisor is now the most privileged at the supervisor level. They also introduced a hypercall, which is like a system call to the hypervisor when the kernel is executing code that needs to be run in supervisor mode. Since all this work was all done before hand, the performance compared to the previous method is a lot better. The only problem is that they would have to convince maintainers of proprietary systems to make these changes for you. Which is another benefit for open source operating systems such as Linux.

Monday, September 14, 2009

Pipes and Filters

One of the first systems I thought about when reading this paper was Yahoo! Pipes. I stumbled across this a few years ago and never really used it, but I can't argue with the power this sort of system provides. I also really never heard any huge buzz about it. It is a graphical editor to create data mashups that aggregate feeds and websites.

A system that would not benefit from using this pattern are applications that depend on throughput. It is possible that a system has to push data through hundreds of filters and pipes. That could potentially have a negative effect on the throughput of the application.

As for benefits of parallelism, as long as the data sets being worked on are independent you can see a benefit. That way each one can be worked on safely in parallel and then combined at the end.

BA Chapter 6

This chapter was very detailed. I think a little too detailed. However it was still interesting to learn about what happens behind the scenes of something I use everyday. I sometimes think about writing an application for Facebook, but I either don't have time or lose interest.

One of the main reasons why social networking sites, like Facebook, are very helpful to other applications is that is a very easy was to extend the functionality of them with little extra work. Facebook has lots of data on lots of different users and this information is very valuable. The book example in this chapter is a good one. The information found in Facebook could point me to another user with the same interests as me and new authors and books that I never heard of.

FQL was used to optimize service calls by reducing the actual number of services calls needed to obtain data. It did this in a way that is familiar to many developers. A query request is returned as tables and fields just like SQL. It also goes through the security services to ensure the data is not being accessed by unauthorized uses.

Wednesday, September 9, 2009

Excerpts from Christopher Alexander

When I was reading the part about why farmers build a standard barn I couldn't stop thinking about using software pattern X to solve problem Y. And my reason has always been the same as the barn example, I know it works and I don't have a reason to change it. Then I got to thinking that even though I used to same pattern over and over again to solve the same problem, it is, just like those barns, unique in their own way.

My way of describing a Pattern Language is a set of vague requirements that need to be met. Normally good requirements for a system/architecture are not that vague so that is one way of setting these requirements apart. As long as these requirements are met, you fulfill the pattern. But since they are not specificity described to the user, they have their own vision of what is needs to be in their heads. I just found all this fascinating that even though I was reading about building actual structures, in my head I was thinking about software and it still all mades sense.

BA Chapter 5

REST is very useful as a front facing interface to a web service. By only having 4 commands, get, put, post, and delete, it allows for a simple mechanism for accessing data. It is a “pay no attention to the man behind the curtain” aspect. The requester should only care that they get a response from the query. And since the design is stateless, which means that all the info needed to handle the request is in the request itself, it makes it even more portable. It reminds me a lot of SQL statements. Get is a select, post is an insert, put is an update, and delete is ….. well delete. Even though there are more SQL statements. They all boil down to doing three things, creation, modification and deletion which is all you can really do with data.

The author seems to think that they can rely on already in place security features to prevent data access from unauthorized parties. I don't really agree with that unless he implies that each request and reply is using SSL encryption. I do agree with him that a simple system will be easier to find and plug security holes than a more complicated one. Then again a complicated one might make it harder to exploit those holes to begin with.

Tuesday, September 8, 2009

BA Chapter 4

This is my favorite paper so far. It talked about the design workflow of a team developing software for multiple photo studios. Since they did early requirements gathering, and really understood those requirements. They were able to not only building a great system, but the right system. I think that one of the main reasons that they were able to build a great system was because they had a good understanding of what exactly needed to be built. No matter how robust an architecture is, if it isn't what the customer wanted, then you have failed.

It was interesting reading about all the good decisions that this team made during development, but I also think it would have been helpful to know some of the bad decisions that they made. Now if they didn't make any, thats great and we should always strive to make best decisions. But there are usually factors that are beyond your control that you have to make a quick decision without all the information. The business climate was never really addressed in this paper. I would have liked to know more about that.

Sunday, September 6, 2009

BA Chapter 3

I would like to mention other designs of MMO's that are different than the popular World of Warcraft. First off, World of Warcraft has many different servers in many different regions. Players connect to the server of their choice and interact only with other people connected on the same server.

Guild Wars, a monthly payment free MMO, only allows a player to connect with one “server”, but you can only interact with other players in certain places. A city or village for example. Each city is divided into districts. A district is a copy of the city on a different physical server. So when the population of a district reaches its limit, it will push incoming players to that town into another district.

Eve Online, a sci-fi based MMO, also only allows players to connect to one “server”. Each star-system that the player can visit is a separate process on a server. With some high-load systems being given a server all to themselves and many low-load systems being combined and run on servers together. These "SOL Servers" are tied into EVE's main database server where changes to the game take place. Since players need to move between solar systems, they are connected to proxy servers which keep track of which SOL server the player is on. I had to some research in to the server architecture for this game and found this diagram to help out.

Boxology

Even though I thought this paper had an interesting premise, it was not that interesting to read. It really got the feel of a reference book used by designers to figure out how to build their architecture, and reading a reference book always seems dry to me. It is a great attempt at trying to classify high level patterns, a check list for designers to go by when they are designing a system. If a system can be classified as a certain type then you can determine common control and data flow issues that go along with those patterns. However, as other people have pointed out, I don't think it is very useful to try to apply these patterns to a broad system.

Tuesday, September 1, 2009

Beautiful Architecture Chapter 2: A Tale of Two systems

Even though I have only been in the professional software field for a little over 2 years, I have already seen some of the problems associated with a “Messy Metropolis”. Not on the same scale as the book, but the problems were the same. If only all software is designed with care just like “Design Town”, it would make our jobs a lot easier. But the sad truth we all will probably all have at least one “Messy Metropolis” story by the time we retire.

The author does point out that business pressure was one of the factors that made Metropolis fail, but as Will pointed out, didn't really go into the business pressure of the successful case. Maybe the author feels that since the project was a success, that either the pressure was very little or non existent. Now maybe he has a point that the business pressure in the successful project must have been manageable since it did get done and got done with good design. That still does not mean that the team was not under any stress. I am sure there were still late nights to make a deadline.

I also liked the concept of YAGNI. I was working on a group project once and one member insisted on added functions that, if we needed them, would make things easier. We ended up not using them and that engineer wasted valuable time implementing those features and that time could of spent fixing bugs.

Tuesday, August 25, 2009

Test Post

test