Archive for the ‘thoughts’ Category


Better living through parallelism

Judging by the interest to my last actors post, I thought I’d throw up a piece of code that uses actors anonymously to parallelise a long running operation. Not every operation can be parallelised, most things we work on tend to be fairly sequential. However, sometimes if you can split up the work to perform an operation, you can get some serious bang for your buck.

Let’s take a pretty interesting little problem as an example, working out pi. I first encountered this problem in a job interview a while back (yeah, I thought it was out there too). It’s an example of what’s known as a Monte Carlo Simulation. The general way to work this out is described in detail in this Google Code University tutorial on parallel programming (worth taking a look).

The genaral approach works on the basis of this diagram.

The steps are as follows:

  1. generate points randomly in the square
  2. work out whether each point falls into the circle (distance from centre is less than the radius, use Pythagoras here)
  3. the proportion of points in the circle to those in the square is approximately the same as that of the areas of the two shapes; transpose the area formulas (I’ll show it in the code) to work out an approximate value of pi

It’s a nice little problem, because the more points you generate, the better the estimate gets. It’s also easily parallelisable as the first two steps can be performed repeatedly by anyone. Let’s do a first pass in a single thread:

[scala]
package net.jakubkorab.pi_evaluator

import scala.actors._
import scala.actors.Actor._
import scala.math._

object PiEvaluator {
val sideLength : Double = 1
val radius : Double = sideLength / 2
val totalPointsToSample : Long = 10000

def isRandomPointWithinCircleRadius() : Boolean = {
val x = (random * sideLength) – radius
val y = (random * sideLength) – radius
val hypotenuse = sqrt(pow(abs(x), 2) + pow(abs(y), 2)) // pythagoras
hypotenuse <= radius
}

def pointsInCircle(pointsToSample : Long) : Long = {
(1L to pointsToSample)
.map((i : Long) => isRandomPointWithinCircleRadius())
.foldLeft(0L) { (pointCount : Long, pointInCircle : Boolean) =>
if (pointInCircle) pointCount + 1 else pointCount
}
}

def approximatePi(pointsEvaluated : Long, pointsInCircle : Long) = {
println("After " + pointsEvaluated
+ " samples, the number of points in a circle was " + pointsInCircle)

val areaOfSquare = sideLength * sideLength
// areaOfCircle/areaOfSquare =~ pointsInCircle/pointsEvaluated, so
val areaOfCircle = pointsInCircle * areaOfSquare / pointsEvaluated
// areaOfCircle = pi * r^2, so
val approximatePi = areaOfCircle / pow(radius, 2)
println("Pi is approx: " + approximatePi)
}

def main(args : Array[String]) = {
approximatePi(totalPointsToSample, pointsInCircle(totalPointsToSample))
}
}
[/scala]

At line 20, there’s a nice little bit of map-reduce action going on (fold is an equivalent name). What’s going on is that for each number, we’re generating a random number an determining whether it’s in a circle. Then we fold/reduce the list of booleans returned by the map function to give a count of all the points that were generated within the radius of the circle. It’s a super useful little programming construct.

We then call approximatePi() to do our maths for us.

Running this (I’m doing it through SBT, that’s why the [Info] outputs) we get:

After 10000 samples, the number of points in a circle was 7883
Pi is approx: 3.1532
[info] == run ==
[success] Successful.
[info]
[info] Total time: 5 s, completed 30-Jan-2011 13:45:46


Not bad, but we need more points to get a better result. We know that pi should be around 3.14159, but we’re not quite there yet. Scaling up our numbers:

Samples Time (s) Pi
10,000 5 3.1532
100,000 5 3.13676
1,000,000 7 3.141188
10,000,000 20 3.1424864
100,000,000 189 3.14139356


Ok, we’re approaching our known value, but this approach has shown some stress. It’s not scaling too well. It’s slowing right down, but my processor (Intel Core2 Duo T9300 @ 2.5GHz) is showing only ~60% usage. If we could break down the point generation among a number of threads, and then tally up the result at the end, we could make better use of our hardware. Enter fork-join.

Fork-join is probably one of the most common things that you’d want to do with concurrency. It’s a bit of a pain in the butt in Java (task executor framework, futures etc.). It really shouldn’t be. There’s some excellent work going on around a framework for this, and map-reduce too, for future versions of the language. In the meantime, you need to jump through hoops – cue fishing through Concurrency in Practice. In Scala though, it’s really easy using actors. So we rewrite main:

[scala]
val workers = 10
def main(args : Array[String]) = {
// break down the job between worker – there’s a rounding issue here, but never mind
val samplesPerWorker = totalPointsToSample / workers
1 to workers foreach { (workerNumber : Int) =>
val worker = actor {
receive {
case pointsToEvaluate : Long => {
sender ! (pointsToEvaluate, pointsInCircle(pointsToEvaluate)) // reply with a tuple
}
}
}
worker ! samplesPerWorker
}

var workersReplied = 0
var totalPointsEvaluated = 0L
var totalPointsInCircle = 0L
while (workersReplied < workers) {
receive {
case (pointsEvaluated : Long, pointsInCircle : Long) => {
workersReplied += 1
totalPointsEvaluated += pointsEvaluated
totalPointsInCircle += pointsInCircle
if (workersReplied % 100 == 0) {
println(workersReplied + " workers replied")
}
}
}
}
approximatePi(totalPointsEvaluated, totalPointsInCircle)
}
[/scala]

Samples Workers Time (s) Pi
10,000,000 10 18 3.1427624
100,000,000 10 64 3.1418042


64s is a lot better than our original 189s! And this effect just gets more pronounced the more processors you have (where the optimal number of actors matches the number of processors; at least it should be – feel free to play around with this).

In the second version of main, the actor keyword/block is actually a method on the Actor object that creates an anonymous actor instance and instantly starts it. We ping some messages to it, and wait for the results using receive().

Easy, and you can apply it for a ton of different tasks.

Why I dig Scala: Concurrency and the Dining Philosophers

I am occasionally asked what the big deal is about Scala. For me, to decide whether a programming language is worthwhile is dependent on two practical questions: does it aid comprehension, and does it reduce code. The two are not necessarily interchangeable. Terseness, after all, does nothing to aid comprehension. Scala scores points on both counts. It also has a sweet spot that I haven’t encountered elsewhere, that is that it lends itself to concurrent programming in a way which is easy to reason about, and therefore get right. It does this through a set of supporting language features that combined allows us to code at a higher level of abstraction: a leaning towards immutability, functional constructs (such as closures), as well as a familiar way to model a domain via object orientation.

Put together, it’s massively powerful. As it runs on the JVM you could code Scala concurrency the same way as Java, with the bog-standard tools such as wait/notify/notifyAll or the JDK 5+ concurrency libraries. You might get some shorter code, but it misses the point. Scala comes with an implementation of the Erlang-inspired actors model out of the box, which lets you deal with the problems of concurrency in a manner that is much easier to reason about. Actors aren’t a language construct, but a library that makes use the underlying platform (the JVM) and Scala’s language features to provide a much simpler mental model for us to deal with. Actors are “like” threads (not really, but close enough for a starting point) that send and receive messages to and from other actors. How does this aid comprehension? Synchronous and asynchronous messages are very simple to reason about, and IMHO much more straightforward than the Java concurrency libraries (compare the actors documentation to the Trains Book).

Consider the classic computer science concurrency problem, the Dining Philosophers. A number of philosophers sit down at a round table to do some eating and thinking. Each philosopher brings with him a single chopstick that he places on his right hand side. So you have X philosophers and X chopsticks. To eat, a philosopher must pick up the chopsticks on his left and right sides. Leaving aside hygiene issues, it’s a cool toy problem around resource contention. So, how would you do this in Scala? The mental leap to be made is that “everything is an actor”. Given a number of philosophers dining at a table, it’s quite nicely modelled if you think about both the philosophers and the table as actors that pass messages beteen each other. If you want to see the whole file (~150 lines), it’s available here.

Firstly the messages that we’re going to be passing around:

[scala language=”scala” wraplines=”false” firstline=”1″]package net.jakubkorab.philosophers

import messages._
import scala.actors._
import scala.actors.Actor._
import scala.math._

package messages {
class Chopstick(val position : Int)

object Side extends Enumeration {
type Side = Value
val Left, Right = Value

def randomSide() = { Side(floor(Side.values.size * random).intValue) }
def otherSide(side : Side.Value) = { Side.values.find{_ != side}.get }
}

sealed abstract class Message

abstract class TableMessage() extends Message
case class AllFinished() extends TableMessage

abstract class ChopstickResponse() extends TableMessage
case class ChopstickAvailable(val chopstick : Chopstick) extends ChopstickResponse
case class ChopstickUnavailable() extends ChopstickResponse

abstract class DinerMessage() extends Message
case class RequestChopstick(val philosopher : Philosopher, val side : Side.Value) extends DinerMessage
case class ReplaceChopstick(val chopstick : Chopstick) extends DinerMessage
case class CouldNotEatAnotherBite(val guest : String) extends DinerMessage
}[/scala]

Messages don’t actually need to be of any particular type, I just like thinking of that sort of thing in a hierarchy. All of the messages that I’ll pass around are subclasses of Message.

Now for our philosophers:
[scala language=”scala” wraplines=”false” firstline=”34″]class Philosopher(val name : String, val wordsOfWisdom : String) extends Actor {
var table : Actor = null
var seatedAt : Int = -1

private var timesLeftToEat = 3
override def act() = {
while (timesLeftToEat > 0) {
think()
val side = Side.randomSide // pick a chopstick to use first
say("Requesting chopstick 1 on " + side)
table !? (1000, RequestChopstick(this, side)) match {
case Some(ChopstickAvailable(chopstick1 : Chopstick)) => {
pause() // put in a delay so we can see actors switching
val otherSide = Side.otherSide(side) // request the other
say("Requesting chopstick 2 on " + otherSide)
table !? (100, RequestChopstick(this, otherSide)) match {
case Some(ChopstickAvailable(chopstick2 : Chopstick)) => {
eat()
pause()
table ! ReplaceChopstick(chopstick1) // return chopsticks
pause()
table ! ReplaceChopstick(chopstick2)
}
case Some(ChopstickUnavailable()) => {
say("No " + otherSide + " chopstick");
table ! ReplaceChopstick(chopstick1);
}
case None => { say("None"); table ! ReplaceChopstick(chopstick1) }
}
}
case Some(ChopstickUnavailable()) => { say("No " + side + " chopstick") } // no luck getting a chopstick
case None => say("None")
}
}
react {
case AllFinished => { say(wordsOfWisdom); exit }
}
}

private def think() = { say("Hmm"); pause() }
private def eat() = {
say("Nom nom");
timesLeftToEat -= 1
if (timesLeftToEat == 0) {
table ! CouldNotEatAnotherBite(name)
}
}
private def say(s : String) = { println(name + ": " + s) }
private def pause() = { Thread.sleep(ceil(random * 1000).intValue) }
}
object Philosopher {
def apply(name : String, wordsOfWisdom : String) = new Philosopher(name, wordsOfWisdom)
}[/scala]

As I said, it’s pretty straightforward if you think of an actor as a Thread. Think of act() as the equivalent of Runnable#run(). Philosophers will be instantiated with a name and some words of wisdom they’ll come up with. Once they’re sat at a table, they’ll receive an instance of table for them to communicate with and a place where they’re sitting. Messages are sent either asynchronously to the table using the ! method, or synchronously using !? (in which case the number that follows is a timeout). The syntax may be unfamiliar, but I think it reads pretty easily even to those unfamiliar with Scala. I won’t go through it in detail. A philosopher sends a chopstick request to the table and gets a response, either that a chopstick is available, or that it’s unavailable. Pretty straightforward.

So, now the table.
[scala language=”scala” wraplines=”false” firstline=”88″]class Table(val philosophers : Set[Philosopher]) extends Actor {
if (philosophers.size < 2) throw new IllegalArgumentException("At least 2 philosophers must dine together")
var chopsticks = new Array[Chopstick](philosophers.size)
var location = 0
philosophers.foreach { philosopher =>
chopsticks(location) = new Chopstick(location) // lay the cutlery
philosopher.seatedAt = location
location += 1
}
var guestsEating = philosophers.size

override def act() = {
println("Starting the meal")
philosophers.foreach { philosopher => philosopher.table = self; philosopher.start } // let’s go
while (true) {
receive {
case RequestChopstick(philosopher : Philosopher, side : Side.Value) => giveChopstickIfAvailable(philosopher, side)
case ReplaceChopstick(chopstick : Chopstick) => replaceChopstick(chopstick)
case CouldNotEatAnotherBite(guest : String) => guestFinished(guest)
}
}
}

private def giveChopstickIfAvailable(philosopher : Philosopher, side : Side.Value) = {
var index = if (side == Side.Right) philosopher.seatedAt else philosopher.seatedAt – 1
if (index < 0) { index = philosophers.size – 1 } // get the one on the end of the array

val chopstick = chopsticks(index)
if (chopstick == null) {
println("No chopstick available at " + index)
sender ! ChopstickUnavailable() // sender, not philosopher!
} else {
chopsticks(index) = null
sender ! ChopstickAvailable(chopstick)
}
}

private def replaceChopstick(chopstick : Chopstick) = {
chopsticks(chopstick.position) = chopstick
}

private def guestFinished(guest : String ) = {
println(guest + " is done")
guestsEating -= 1
if (guestsEating == 0) {
philosophers.foreach {_ ! AllFinished}
println("All done")
exit
}
}
}[/scala]

The role of the table is to manage the resources, in this case the chopsticks. Calling start() on an actor is analogous to Thread#start().

And now, to kick it all off, let’s stick some philosophers on a table. I have chosen the Greco-Roman Stoics for their easy going approach to life, but any school of thought will do. Chinese philosophers may have been more appropriate to the cutlery. My example, my choice.

[scala language=”scala” wraplines=”false” firstline=”140″]object PhilosophersLauncher {
def main(args : Array[String]) = {
val table = new Table(
Set(Philosopher("Seneca the Younger", "The point is, not how long you live, but how nobly you live."),
Philosopher("Epictetus", "Freedom is secured not by the fulfilling of men’s desires, but by the removal of desire." ),
Philosopher("Marcus Aurelius", "Everything is right for me, which is right for you, O Universe."),
Philosopher("Zeno of Citium", "Shit happens.")) // one of his lesser known ones
).start
}
}[/scala]

So, does it work?

Starting the meal
Seneca the Younger: Hmm
Epictetus: Hmm
Marcus Aurelius: Hmm
Seneca the Younger: Requesting chopstick 1 on Right
Marcus Aurelius: Requesting chopstick 1 on Right
Epictetus: Requesting chopstick 1 on Right
Marcus Aurelius: Requesting chopstick 2 on Left
No chopstick available at 1
Marcus Aurelius: No Left chopstick
Marcus Aurelius: Hmm
Epictetus: Requesting chopstick 2 on Left
No chopstick available at 0
Epictetus: No Left chopstick
Epictetus: Hmm
Marcus Aurelius: Requesting chopstick 1 on Left
Seneca the Younger: Requesting chopstick 2 on Left
Seneca the Younger: Nom nom
Marcus Aurelius: Requesting chopstick 2 on Right
Marcus Aurelius: Nom nom
Epictetus: Requesting chopstick 1 on Left
No chopstick available at 0
Epictetus: No Left chopstick
Epictetus: Hmm
...
Epictetus: Requesting chopstick 1 on Left
Zeno of Citium: Requesting chopstick 1 on Left
Epictetus: Requesting chopstick 2 on Right
Epictetus: Nom nom
Epictetus is done
Zeno of Citium: Requesting chopstick 2 on Right
Zeno of Citium: Nom nom
Zeno of Citium: Hmm
Zeno of Citium: Requesting chopstick 1 on Right
Zeno of Citium: Requesting chopstick 2 on Left
Zeno of Citium: Nom nom
Zeno of Citium is done
All done
Marcus Aurelius: Everything is right for me, which is right for you, O Universe.
Epictetus: Freedom is secured not by the fulfilling of men's desires, but by the removal of desire.
Seneca the Younger: The point is, not how long you live, but how nobly you live.
Zeno of Citium: Shit happens.


Yup. I think it’s pretty easy to make sense of all this. You can easily reason about what happens when, just by drawing a sequence diagram. Consider the backdown strategy when a philosopher can’t get hold of a chopstick:

Zeto->Table: RequestChopstick(Left)
activate Table
Table-->Zeto: ChopstickAvailable(C0)
deactivate Table

Epictetus->Table: RequestChopstick(Left)
activate Table
Table-->Epictetus: ChopstickAvailable(C1)
deactivate Table

Zeto->Table: RequestChopstick(Right)
activate Table
Table-->Zeto: ChopstickUnavailable()
deactivate Table
Zeto->Table: ReplaceChopstick(C0)

note over Zeto: Sleeps for a bit before trying again

Epictetus->Table: RequestChopstick(Right)
activate Table
Table-->Epictetus: ChopstickAvailable(C0)
deactivate Table
Epictetus->Epictetus: Eat
Epictetus->Table: ReplaceChopstick(C1)
Epictetus->Table: ReplaceChopstick(C0)


So, actors are cool, and in Scala they are easy to reason about due to the syntax and ability to mix in OO concepts. For a comparison, check out the equivalent in Java using semaphores. The Scala version is far less code and much easier to comprehend. And if you think that’s cool, check out Akka.

Tips on the London Eye

Pretty much everyone who comes to London goes on to the London Eye. If you want to save yourself some time and money, book tickets online at their website to get a 10% discount, and you get to skip the ticket queue when you turn up to collect your tickets. Great way to save yourself 15 minutes.

The Moustache is Back!

During Movember (the month formerly known as November) I’m growing a Mo. That’s right I’m bringing the moustache back, because I want to help tackle men’s health issues and fight prostate cancer.The Movember rules are simple. I start out on the 1st clean shaven, and for the rest of the month grow the most awesome moustache humanly possible within one month!Here’s where you come in. You help out with a donation to help beat the most common form of cancer among men, and in return on the 30th of October I let you vote on the shape of my upcoming soup-strainer. Here’s where it gets interesting. The higher the total, the bigger the choice. Here’s the breakdown:

  • £0-500 – Not bad. The tache looks respectable, even pretty good in a pastiche kind of way.
    • Errol Flynn/Clark Gable
    • Johnny Depp
  • £501-1000 – Pretty good. Serious face-caterpillar time.
    • Borat
    • Tom Selleck
  • £1001-1500 – Now we’re cooking with gas. I either get mistaken for:
    • Merv Hughes
    • Ron Jeremy
  • £1501-2000 – Smoking. Getting out the tache comb and wax for some serious swash buckling.
    • Salvador Dali
    • The “D’Artagnan” – complete with triangular goatee.
  • £2000+ Facezilla destroys Tokyo!
    • General Ambrose Everett Burnside

But wait! There’s more!
Every day in Movember I’ll take a photo of the tache progression, and animate it so you can see the magic happen in internet time at the end.
As you can see, every little bit helps. So you get to have a bit of a laugh for a month and get to feel good for helping out with a great cause. “But Jake? How do I get in on this sweet deal?” I hear you say. It’s real easy…

  1. Go to https://www.movember.com/uk/donate and donate online using your credit card or PayPal account.
  2. Email me at jakekemail-movember at yahoo dot com dot au.

I will then add you to the “special” mailing list that lets you… Pimp My Face.
The money raised by Movember is donated to the good people at The Prostate Cancer Charity which will have an enormous impact on many men’s lives, and the awareness will help us to fight prostate cancer on every front – through research, support, information and campaigning.

Did you know…

  • Prostate cancer is the most common cancer in men in the UK. 35,000 men are diagnosed every year and one man dies every hour.
  • 1 in 11 UK men will be diagnosed in their lifetime .

For those that have supported Movember in previous years you can be very proud of the impact it has had and can check out the details at:
[ Fundraising Outcomes ].

Movember culminates at the end of month Gala Partés. If you would like to be part of this great night you’ll need to purchase a [Gala Parté Ticket].

And if you want to link back to this blog, that would be cool too.

Conference organizing no more

After taking a step back to have a think about my situation, I have regretfully pulled out of organizing the IJTC conference in Dublin this year. Putting together a conference is a hugely involved activity and I just do not have the time to spare on it this year. I wish the remaining organizers the best of luck and hope the event works out to be a success.

Irish Java Technologies Conference 2008

It’s that time of the year again. We are just starting to ramp up getting this year’s conference in Dublin together for November. A good few months this time, as opposed to our insane 3 month schedule last year. There are a bunch of themes that we are looking to cover in this year’s line-up, but the crux of it will be around addressing common problems and how the tools can help to support that effort, rather than being a bog-standard tech showcase. It’s kind of a reverse point of view from what other other events take.

It was a great buzz putting it all together last year, and I know this one’s going to be even more fun, both for the delegates and speakers. We already have our scheming hats on :)

Who reads your blog?

So, why no blog posts lately? Most of the interesting stuff that folks blog about are those issues and ideas which have currency, those at the forefront at their minds. Tech blogs too deal with the everyday. Issues that we have come across, interesting ideas, problems and techniques. However, in this day of corporate non-disclosure agreements and overly keen security departments sometimes it’s just not prudent to scratch that blogging itch, regardless of how tangential the topic might be.

On the flipside, if you didn’t know, Eclipse 3.4 aka Ganymede is out! With a whole bunch of new goodies as standard. After my first cursory test drive, the concensus is… very nice.

Teenage Knife Crime in London

An unfortunately pessimistic, yet topical, post this time. You only have to walk past the news stands any given day to see the topic of the month. Knife crime is increasing, with the victims typically being teenagers. What were a couple of isolated incidents now appear to be accelerating into a sustained trend. Anyone who has read Malcom Gladwell’s Tipping Point would not be surprised to see the similarities between the case studies described, and what’s going on. Ironically, bringing media attention to the issue, and staging marches in unity against knife crime only serves to validate this behaviour as an appropriate way to resolve disputes. Seeing one’s peers behaving in a particular way serves to validate that behaviour as an acceptable form of expression. The current trend will only be stopped by applying lessons from past equivalents. In the meantime, we can unfortunately expect it to accelerate.

The Long Tail of Java Tools

I went to an awesome session yesterday evening that did a rapid fire listing of small tools that you should know about if you are working with Java. The breakdown is on the Dublin JUG site.

Dynamic Language Smackdown

I just came out of a session comparing scripting (but probably better described as dynamic) languages. Groovy, Ruby, Python and Scala went head to head in three rounds; desktop app, web app and freestyle. Whatever language you follow, the possibilities for use are awesome with excellent features in each language ecosystem. While each compile down to Java bytecode Groovy really does provide developers with the most natural integration, Ruby has awesome libraries, and Scala’s concurrent programming model is compelling (and will probably be one I will be following up). The diversity in this space is awesome and Java developers really should keep abreast of developments in this area. The ease with which web apps in particular are developed in all three leave traditional frameworks in the dust.