Several commenters pointed out the connection to Bayesian networks. I think they are right, and the network reliability problem is a very special case of Bayesian inference. However, so far this hasn’t seemed to help very much, since the things I can find about algorithms for Bayesian inference are either too general (e.g. allowing arbitrary functions at nodes) or too specific (e.g. only working for certain kinds of trees). So I’m going to put aside Bayesian inference for now; perhaps later I can come back to it.
In any case, Derek Elkins also made a comment which pointed to exactly what I wanted to talk about next.
Consider the related problem of computing the reliability of the single most reliable path from $s$ to $t$ in a network. This is really just a disguised version of the shortest path problem, so one can solve it using Dijkstra’s algorithm. But I want to discuss a more general way to think about solving it, using the theory of star semirings. Recall that a semiring is a set with two associative binary operations, “addition” and “multiplication”, which is a commutative monoid under addition, a monoid under multiplication, and where multiplication distributes over addition and $a \cdot 0 = 0 \cdot a = 0$. A star semiring is a semiring with an additional operation $(-)^*$ satisfying $a^* = 1 + a a^* = 1 + a^* a$. Intuitively, $a^* = 1 + a + a^2 + a^3 + \dots$ (though $a^*$ can still be well-defined even when this infinite sum is not; we can at least say that if the infinite sum is defined, they must be equal). If $S$ is a star semiring, then the semiring of $n \times n$ matrices over $S$ is also a star semiring; for details see Dolan (2013), O’Connor (2011), Penaloza (2005), and Lehmann (1977). In particular, there is a very nice functional algorithm for computing $M^*$, with time complexity $O(n^3)$ (Dolan 2013). (Of course, this is slower than Dijkstra’s algorithm, but unlike Dijkstra’s algorithm it also works for finding shortest paths in the presence of negative edge weights, in which case it is essentially the Floyd-Warshall algorithm.)
Now, given a graph $G = (V, E)$ and a labelling $\varphi$ assigning a probability to each edge, define the $|V| \times |V|$ adjacency matrix $M_G$ to be the matrix of edge probabilities, that is, $(M_G)_{uv} = \varphi(u,v)$. Let $([0,1], \max, 0, \times, 1)$ be the star semiring of probabilities under maximum and multiplication (where $a^* = 1$, since $1 = \max(1, a \times 1)$). Then we can solve the single most reliable path problem by computing $M_G^*$ over this semiring, and reading off the entry in row $s$, column $t$. If we want to find the actual most reliable path, and not just its reliability, we can instead work over a semiring whose elements are probabilities paired with paths. You might enjoy working out what the addition, multiplication, and star operations should be, or see O’Connor (2011).
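To make the max-times idea concrete, here is a minimal sketch in Python. It is not the functional algorithm from Dolan (2013); it is the familiar Floyd-Warshall triple loop with (max, ×) substituted for (min, +), which computes the same closure for this particular semiring. The function name `most_reliable`, the node numbering, and the edge probabilities in the usage note are all made up for illustration.

```python
def most_reliable(n, edges):
    """All-pairs reliability of the single most reliable path.

    n     -- number of nodes, labelled 0 .. n-1 (an assumption of this sketch)
    edges -- dict mapping (u, v) to the probability of the edge u -> v
    """
    # Start from the adjacency matrix, with 1 on the diagonal for the empty path.
    best = [[1.0 if u == v else 0.0 for v in range(n)] for u in range(n)]
    for (u, v), p in edges.items():
        best[u][v] = max(best[u][v], p)

    # Floyd-Warshall, with max playing the role of "addition"
    # and multiplication of probabilities playing the role of "multiplication".
    for k in range(n):
        for i in range(n):
            for j in range(n):
                best[i][j] = max(best[i][j], best[i][k] * best[k][j])
    return best
```

For example, with hypothetical edges `{(0, 1): 0.5, (1, 2): 0.9, (0, 2): 0.4}`, the entry `most_reliable(3, edges)[0][2]` is the reliability of the two-hop route, since 0.5 × 0.9 beats the direct edge.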
In fact, as shown by O’Connor and Dolan, there are many algorithms that can be recast as computing the star of a matrix, for an appropriate choice of semiring: for example, (reflexive-)transitive closure; all-pairs shortest paths; Gaussian elimination; dataflow analysis; and solving certain knapsack problems. One might hope that there is similarly an appropriate semiring for the network reliability problem. But I have spent some time thinking about this and I do not know of one.
Consider again the simple example given at the start of the previous post:
For this example, we computed the reliability of the network by computing the probability $p$ of the upper path and the probability $q$ of the lower path, and then combining them as $p + q - pq$, the probability of success on either path less the double-counted probability of simultaneous success on both.
Inspired by this example, one thing we might try would be to define operations $p \land q = pq$ and $p \lor q = p + q - pq$. But when we go to check the semiring laws, we run into a problem: distributivity does not hold! $p \land (q \lor r) = p(q + r - qr) = pq + pr - pqr$, but $(p \land q) \lor (p \land r) = pq + pr - p^2 qr$. The problem is that the addition operation $p \lor q = p + q - pq$ implicitly assumes that the events with probabilities $p$ and $q$ are independent: otherwise the probability that they both happen is not actually equal to $pq$. The events with probabilities $pq$ and $pr$, however, are not independent. In graph terms, they represent two paths with a shared subpath. In fact, our example computation at the beginning of the post was only correct since the two paths from $s$ to $t$ were completely independent.
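The failure of distributivity is easy to check numerically. The sketch below takes “multiplication” to be the product of two probabilities and “addition” to be the sum minus the product, as suggested by the example; the names `p_and` and `p_or` and the choice of 0.5 for every probability are purely illustrative.

```python
def p_and(p, q):
    """Candidate "multiplication": both of two independent events happen."""
    return p * q


def p_or(p, q):
    """Candidate "addition": at least one of two independent events happens."""
    return p + q - p * q


p = q = r = 0.5

lhs = p_and(p, p_or(q, r))            # p(q + r - qr)
rhs = p_or(p_and(p, q), p_and(p, r))  # pq + pr - (pq)(pr)

print(lhs, rhs)  # 0.375 0.4375
```

The two sides differ by exactly the gap between $pqr$ and $p^2qr$: the right-hand side treats the shared factor $p$ as if it were two independent events.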
We can at least compute the reliability of series-parallel graphs whose terminals correspond with $s$ and $t$:
In the second case, having a parallel composition of graphs ensures that there are no shared edges between them, so the two events being combined are indeed independent.
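As a rough sketch of how this works out in code, one can represent a series-parallel graph as a small expression tree and evaluate it bottom-up: a single edge contributes its own probability, series composition multiplies, and parallel composition uses the sum-minus-product rule. The class names `Edge`, `Series`, and `Parallel`, the function `sp_reliability`, and the probabilities in the usage note are my own inventions for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Edge:
    prob: float  # probability that this single edge works


@dataclass
class Series:
    first: "SP"   # the target terminal of `first` is the source terminal of `second`
    second: "SP"


@dataclass
class Parallel:
    upper: "SP"   # both subgraphs share the same two terminals,
    lower: "SP"   # but have no edges in common


SP = Union[Edge, Series, Parallel]


def sp_reliability(g: SP) -> float:
    """Probability that a message gets from one terminal of g to the other."""
    if isinstance(g, Edge):
        return g.prob
    if isinstance(g, Series):
        # Both halves must succeed.
        return sp_reliability(g.first) * sp_reliability(g.second)
    if isinstance(g, Parallel):
        # At least one branch must succeed; the branches are independent.
        p, q = sp_reliability(g.upper), sp_reliability(g.lower)
        return p + q - p * q
    raise TypeError(f"not a series-parallel graph: {g!r}")
```

For instance, `sp_reliability(Parallel(Series(Edge(0.5), Edge(0.9)), Edge(0.7)))` evaluates a two-edge upper path in parallel with a single lower edge, using made-up probabilities.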
Of course, many interesting graphs are not series-parallel. The simplest graph for which the above does not work looks like this:
Suppose all the edges have probability . Can you find the reliability of this network?
More in a future post!
Dolan, Stephen. 2013. “Fun with Semirings: A Functional Pearl on the Abuse of Linear Algebra.” In ACM SIGPLAN Notices, 48:101–10. 9. ACM.
Lehmann, Daniel J. 1977. “Algebraic Structures for Transitive Closure.” Theoretical Computer Science 4 (1). Elsevier: 59–76.
O’Connor, Russell. 2011. “A Very General Method for Computing Shortest Paths.” http://r6.ca/blog/20110808T035622Z.html.
Penaloza, Rafael. 2005. “Algebraic Structures for Transitive Closure.” http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.7650.
I make no particular guarantees about anything; e.g. there is a crufty, complicated shake script that builds everything, but it probably doesn’t even compile with the latest version of Shake.
There are some obvious next steps, for which I have not the time:
All the material is licensed under a Creative Commons Attribution 4.0 International License, so go wild using it however you like, or working on the above next steps. Pull requests are very welcome, and I will likely give out commit access like candy.
This morning Kenny Foner pointed out to me this tweet by Gabriel Gonzales, asking why there isn’t a default Arbitrary instance for types implementing Generic. It reminded me that I’ve been meaning for a while now (years, in fact!) to get around to packaging up some code that does this.
As several pointed out on Twitter, this seems obvious, but it isn’t. It’s easy to write a generic Arbitrary instance, but hard to write one that generates a good distribution of values. The basic idea is clear: randomly pick a constructor, and then recursively generate random subtrees. The problem is that this is very likely to either blow up and generate gigantic (even infinite) trees, or to generate almost all tiny trees, or both. I wrote a post about this three years ago which illustrates the problem. It also explains half of the solution: generate random trees with a target size in mind, and throw out any which are not within some epsilon of the target size (crucially, stopping the generation early as soon as the tree being generated gets too big).
However, I never got around to explaining the other half of the solution: it’s crucially important to use the right probabilities when picking a constructor. With the wrong probabilities, you will spend too much time generating trees that are either too small or too big. The surprising thing is that with exactly the right probabilities, you can expect to wait only $O(n)$ time before generating a tree of size (approximately^{1}) $n$.^{2}
So, how does one pick the right probabilities? Essentially, you turn the generic description of your data type into a mutually recursive system of generating functions, and (numerically) find their radii of convergence, when thought of as functions in the complex plane. Using these values it is straightforward to compute the right probabilities to use. For the intrepid, this is explained in Duchon et al.^{3}
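Here is a minimal sketch of the two halves working together, in Python rather than Haskell, and for one hard-coded type rather than anything generic. It assumes plain binary trees whose size is the number of internal nodes; for that particular type the right probability of choosing a leaf works out to 1/2, which makes the expected number of children per node exactly one. The function names, the preorder-list representation, and the `tolerance` parameter are all choices of this sketch.

```python
import random


def sample_preorder(rng, max_size, p_leaf=0.5):
    """Try once to generate a binary tree, as a preorder list of "node"/"leaf".

    Gives up and returns None as soon as the tree has more than
    max_size internal nodes, instead of finishing an oversized tree.
    """
    shape = []
    pending = 1  # number of subtrees that still have to be generated
    size = 0     # number of internal nodes generated so far
    while pending > 0:
        if rng.random() < p_leaf:
            shape.append("leaf")
            pending -= 1
        else:
            size += 1
            if size > max_size:
                return None  # too big: stop early
            shape.append("node")
            pending += 1     # one subtree consumed, two children added
    return shape


def sample_tree(rng, target, tolerance):
    """Rejection-sample until the size lands within tolerance of target."""
    while True:
        shape = sample_preorder(rng, target + tolerance)
        if shape is not None and shape.count("node") >= target - tolerance:
            return shape
```

A call such as `sample_tree(random.Random(0), 100, 10)` returns a tree with between 90 and 110 internal nodes. Changing `p_leaf` away from 1/2 illustrates the problem described above: larger values make almost every attempt tiny, and smaller values make most attempts hit the size cap.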
I have some old Haskell code from Alexis Darrasse which already does a bunch of the work. It would have to be updated a bit to work with modern libraries and with GHC.Generics, and packaged up to go on Hackage. I won’t really have time to work on this until the summer, but if anyone else is interested in working on this, let me know! I’d be happy to send you the code and provide some guidance in figuring it out.
The constant factor depends on how approximate you are willing to be.↩
I wanted to put an exclamation point at the end of that sentence, because this is really surprising. But it looked like factorial. So, here is the exclamation point: !↩
Duchon, Philippe, et al. “Boltzmann samplers for the random generation of combinatorial structures.” Combinatorics Probability and Computing 13.4-5 (2004): 577-625.↩
Suppose that when a router receives a message on an incoming connection, it immediately resends it on all outgoing connections. For nodes $s$ and $t$, let $P(s,t)$ denote the probability that, under this “flooding” scenario, at least one copy of a message originating at $s$ will eventually reach $t$.
For example, consider the simple network shown below.
A message sent from $s$ along the upper route through the intermediate node has some probability $p$ of arriving at $t$, namely the product of the probabilities of the edges along that route. By definition a message sent along the bottom route has some probability $q$ of arriving at $t$. One way to think about computing the overall probability is to compute the probability that it is not the case that the message fails to traverse both links, that is, $1 - (1 - p)(1 - q)$. Alternatively, in general we can see that $1 - (1 - p)(1 - q) = p + q - pq$, so we can compute the same value that way as well. Intuitively, since the two events are not mutually exclusive, if we add them we are double-counting the situation where both links work, so we subtract the probability of both working.
The question is, given some graph and some specified nodes $s$ and $t$, how can we efficiently compute the probability $P(s,t)$ that a message sent from $s$ reaches $t$? For now I am calling this the “network reliability problem” (though I fully expect someone to point out that it already has a name). Note that it might make the problem a bit easier to restrict to directed acyclic graphs; but the problem is still well-defined even in the presence of cycles.
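Before looking for an efficient method, it helps to have an obviously correct (and obviously inefficient) baseline to compare against. The sketch below enumerates every combination of working and failed edges, weights each combination by its probability, and adds up the weights of the combinations in which the target is reachable. It takes time exponential in the number of edges, so it is only useful for tiny graphs. The function name `brute_force_reliability`, the edge-list format, and the probabilities in the usage note are assumptions made for illustration.

```python
from itertools import product


def brute_force_reliability(edges, s, t):
    """Exact probability that t is reachable from s.

    edges -- list of (u, v, prob) triples for directed edges u -> v,
             each working independently with probability prob
    """
    total = 0.0
    for working in product([False, True], repeat=len(edges)):
        weight = 1.0  # probability of this exact pattern of working edges
        adjacency = {}
        for (u, v, prob), up in zip(edges, working):
            weight *= prob if up else 1.0 - prob
            if up:
                adjacency.setdefault(u, []).append(v)

        # Depth-first search over the edges that are working.
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in adjacency.get(u, []):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)

        if t in seen:
            total += weight
    return total
```

On a hypothetical two-route network `[("s", "r", 0.5), ("r", "t", 0.9), ("s", "t", 0.7)]`, the result agrees with the formula $p + q - pq$ for $p = 0.45$ and $q = 0.7$. Cycles and multiple edges between the same pair of nodes need no special handling.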
This problem turned out to be surprisingly more difficult and interesting than it first appeared. In a future post or two I will explain my solution, with a Haskell implementation. In the meantime, feel free to chime in with thoughts, questions, solutions, or pointers to the literature.
Let $M$ be a monoid, and let $M^*$ denote the subset of elements of $M$ which actually have an inverse. Then it is not hard to show that $M^*$ is a group: the identity is its own inverse and hence is in $M^*$; it is closed under the monoid operation since if $a$ and $b$ have inverses then so does $ab$ (namely, $b^{-1}a^{-1}$); and clearly the inverse of every element in $M^*$ is also in $M^*$, because being an inverse also implies having one.
Now let , where the operation is multiplication, but the coefficients and are reduced modulo 3. For example, . This does turn out to be associative, and is clearly commutative; and is the identity. I wrote a little program to see which elements have inverses, and it turns out that the three elements with do not, but the other six do. So this is an Abelian group of order 6; but there’s only one such group, namely, the cyclic group . And, sure enough, turns out to be generated by and .
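A little program of this kind can be sketched generically: given the elements of a finite monoid, its operation, and its identity, search for the elements that have a two-sided inverse. To keep the sketch self-contained it is run here on a different, hypothetical example (the integers modulo 12 under multiplication) rather than on the monoid discussed above; the function names are likewise my own.

```python
def units(elements, op, identity):
    """Elements of a finite monoid that have a two-sided inverse."""
    return [
        a for a in elements
        if any(op(a, b) == identity and op(b, a) == identity for b in elements)
    ]


def is_closed(subset, op):
    """Check that combining any two members stays inside the subset."""
    members = set(subset)
    return all(op(a, b) in members for a in subset for b in subset)


# Hypothetical example monoid: integers modulo 12 under multiplication.
n = 12


def mul_mod(a, b):
    return (a * b) % n


group = units(range(n), mul_mod, 1)
print(group)                      # [1, 5, 7, 11]
print(is_closed(group, mul_mod))  # True
```

Swapping in a different element list and operation is all it takes to try another monoid.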
I had never been to Vancouver before; it seems like a beautiful and fun city. One afternoon I skipped all the talks and went for a long hike—I ended up walking around the entire perimeter of the Stanley Park seawall, which was gorgeous. The banquet was at the (really cool) aquarium—definitely the first time I have eaten dinner while being watched by an octopus.
Instead of staying in the conference hotel, four of us (me, Ryan Yates, Ryan Trinkle, and Michael Sloan) rented an apartment through AirBnB.^{1} The apartment was really great: it ended up being cheaper per person than sharing two hotel rooms, and it was a lot of fun to have a comfortable place to hang out in the evenings, where we could sit around in our pajamas and talk, or write code, or whatever, without having to be “on”.
I met some new people, including Aahlad Gogineni from Tufts (along with another Tufts student whose name I unfortunately forget); Zac Slade and Boyd Smith from my new home state of Arkansas; and some folks from Vancouver whose names I am also blanking on at the moment. I also met in person, for the first time, a few people I had previously only communicated with electronically, like Rein Heinrich and Chris Smith.
I also saw lots of old friends—way too many to list. It once again reminded me how thankful I am to be part of such a great community. Of course, the community is also far from perfect; towards that end I really enjoyed and appreciated the ally skills tutorial taught by Valerie Aurora (which probably deserves its own post).
Here are just a few of my favorite talks:
I can’t really say the tribute to Paul Hudak was one of my “favorites”, since I would have much preferred to have Paul still with us instead! But I thought John Hughes and John Peterson did a great job. Paul will live on through the many, many people he has loved and inspired.
The FARM keynote by Fabienne Serriere was wonderful: funny, erudite, astounding, and inspiring.
I really enjoyed Mary Sheeran’s keynote, Hardware Design and Functional Programming: Still Interesting after All These Years. She did a great job of presenting some of the history and current and future challenges of the area in a way that was accessible and engaging.
Kenny Foner’s talk, Getting a Quick Fix on Comonads, was fantastic.^{2}
Dan Piponi’s presentation of his Moodler project was a lot of fun. I love his use of digital technology to enable, rather than move away from, an analog/physical interface.
I had a lot of great discussions relating to diagrams. For example:
I talked with Alan Zimmerman about using his and Matthew Pickering’s great work on ghc-exactprint with an eye towards shipping future diagrams releases along with an automated refactoring tool for updating users’ diagrams code.
After talking a bit with Michael Sloan I got a much better sense for the ways stack can support our development and CI testing process.
I had a lot of fun talking with Ryan Yates about various things, including projecting diagrams from 3D into 2D, and reworking the semantics and API for diagrams’ paths and trails to be more elegant and consistent. We gave a presentation at FARM which seemed to be well-received.
I got another peek at how well Idris is coming along, including a few personal demonstrations from David Christiansen (thanks David!). I am quite impressed, and plan to look into using it during the last few weeks of my functional programming course this spring (in the past I have used Agda).
If I had written this as soon as I got back, I probably could have remembered a lot more; oh well. All in all, a wonderful week, and I’m looking forward to Japan next year!
Yes, I know that hotel bookings help pay for the conference, and I admit to feeling somewhat conflicted about this.↩
I asked him afterwards how he made the great animations in his slides, and sadly it seems he tediously constructed them using PowerPoint. Someday, it will be possible to do this with diagrams!↩
I prepared three questions for the exam. The first was fairly simple (“explain algorithm X and analyze its time complexity”) and I actually told the students ahead of time what it would be—to help them feel more comfortable and prepared. The other questions were a bit more open-ended:
The second question was of the form “I want to store X information and do operations Y and Z on it. What sorts of data structure(s) might you use, and what would be the tradeoffs?” There were then a couple rounds of “OK, now I want to add another operation W. How does that change your analysis?” In answering this I expected them to deploy metrics like code complexity, time and memory usage etc. to compare different data structures. I wanted to see them think about a lot of the different data structures we had discussed over the course of the semester and their advantages and disadvantages at a high level.
The final question was of the form “Here is some code. What does it do? What is its time complexity? Now please design a more efficient version that does the same thing.” With some students there was enough time to have them actually write code, with other students I just had them talk through the design of an algorithm. This question got more at their ability to design and analyze appropriate algorithms on data structures. The algorithm I asked them to develop was not something they had seen before, but it was similar to other things they had seen, put together in a new way.
Overall I was happy with the questions and the quality of the responses they elicited. If I do this again I would use similar sorts of questions.
You might well be wondering how long all of this took. I had about 30 students. I planned for the exam to take 30 minutes, and blocked out 45-minute chunks of time (to allow time for transitioning and for the exam to go a bit over 30 minutes if necessary; in practice the exams always went at least 40 minutes and I was scrambling at the end to jot down final thoughts before the next students showed up). I allowed them to choose whether to come in by themselves or with a partner (more on this later). As seems typical, about 1/3 of them chose to come by themselves, and the other 2/3 in pairs, for a total of about 20 exam slots. 20 slots at 45 minutes per slot comes out to 15 hours, or 3 hours per day for a week.
This might sound like a lot, but if you compare it to the time required for a traditional written exam it compares quite favorably. First of all, I spent only two or three hours preparing the exam, whereas I estimate I would easily spend 5 or 10 hours preparing a written exam, since a written exam has to be very precise in explaining what is wanted and in trying to anticipate potential questions and confusions. When you are asking the questions in person, it is easy to just clear up these confusions as they arise. Second, I was mostly grading students during their exam (more on this in the next section) so that by about five minutes after the end of their slot I had their exam completely graded. With a written exam, I could easily have spent at least 15 hours just grading all the exams.
So overall, the oral exam took up less of my time, and I can tell you, hands down, that my time was spent much more enjoyably than it would have been with a written exam. It was really fun to have each student come into my office, to get a chance to talk with them individually (or as a pair) and see what they had learned. It felt like a fitting end to the semester.
In order to assess the students, I prepared a detailed rubric beforehand, which was really critical. With a written exam you can just give the exam and then later come up with a rubric when you go to grade them (although I think even written exams are usually improved by coming up with a rubric beforehand, as part of the exam design process—it helps you to analyze whether your exam is really assessing the things you want it to). For an oral exam, this is impossible: there is no way to remember all of the responses that each student gives, and even if you write down a bunch of notes during or after each exam, you would probably find later that you didn’t write down everything that you should have.
In any case, it worked pretty well to have a rubric in front of me, where I could check things off or jot down quick notes in real time.
People are often surprised when I say that I allowed the students to come in pairs. My reasons were as follows:
Overall I was really happy with the result. Many of the students had been working with a particular partner on their labs for the whole semester and came to the exam with that same partner. For quite a few pairs this obviously worked well for them: it was really fun to watch the back-and-forth between them as they suggested different ideas, debated, corrected each other, and occasionally even seemed to forget that I was in the room.
One might worry about mismatched pairs, where one person does all of the talking and the other is just along for the ride. I only had this happen to some extent with one or two pairs. I told all the students up front that I would take points off in this sort of situation (I ended up taking off 10%). In the end this almost certainly meant that one member of the pair still ended up with a higher grade than they would have had they taken the exam individually. I decided I just didn’t care. I imagine I might rethink this for an individual class where there were many of these sorts of pairings going on during the semester—but in that case I would also try to do something about it before the final exam.
Another interesting social aspect of the process was figuring out what to do when students were floundering. One explicit thing one can do is to offer a hint in exchange for a certain number of points off, but I only ended up using this explicit option a few times. More often, after the right amount of time, I simply guided them on to the next part, either by suggesting that we move on in the interest of time, or by giving them whatever part of the answer they needed to move on to the next part of the question. I then took off points appropriately in my grading.
It was difficult figuring out how to verbally respond to students: on the one hand, stony-faced silence would be unnatural and unnerving; on the other hand, responding enthusiastically when they said something correct would give too much away (i.e. by the absence of such a response when they said something incorrect). As the exams went on I got better (I think) at giving interested-yet-non-committal sorts of responses that encouraged the students but didn’t give too much away. But I still found this to be one of the most perplexing challenges of the whole process.
One might wonder how much of the material from an entire semester can really be covered in a 30-minute conversation. Of course, you most certainly cannot cover every single detail. But you can actually cover quite a lot of the important ideas, along with enough details to get a sense for whether a student understands the details or not. In the end, after all, I don’t care whether a student remembers all the details from my course. Heck, I don’t even remember all the details from my course. But I care a great deal about whether they remember the big ideas, how the details fit, and how to re-derive or look up the details that they have forgotten. Overall, I am happy with the way the exam was able to cover the high points of the syllabus and to test students’ grasp of its breadth.
My one regret, content-wise, is that with only 30 minutes, it’s not really possible to put truly difficult questions on the exam—the sorts of questions that students might have to wrestle with for ten or twenty minutes before getting a handle on them.
Would I do this again? Absolutely, given the right circumstances. But there are probably a few things I would change or experiment with. Here are a few off the top of my head:
Again, I’m happy to answer questions in the comments or by email. If you are inspired to also try giving an oral exam, let me know how it goes!