logji: July 2010

I was working on a problem today where some of the computations could fail, but degrade gracefully while providing information about how exactly they failed so that clients could choose whether or not to use the result. This is what I came up with:

/**
 * A monad used to entrain computations on a value that might fail.
 * This is distinct from Either in that it allows computations to continue
 * in the presence of errors, with the possibility of a degraded result.
 * The user of the result can then decide its suitability for use based upon
 * any errors that might be returned. Errors must form a semigroup.
 */
sealed trait Attempt[E, +A] {
  def value: A
  def map[B](f: A => B): Attempt[E, B]
  def flatMap[B](f: A => Attempt[E, B]): Attempt[E, B]

  def error: Option[E]
  def either: Either[E, A]
}

case class Success[E, +A](value: A) extends Attempt[E, A] {
  def map[B](f: A => B): Attempt[E, B] = Success(f(value))
  def flatMap[B](f: A => Attempt[E, B]): Attempt[E, B] = f(value)
  def error = None
  def either = Right(value)
}

case class Failure[E, +A](err: E, value: A)(implicit append: (E, E) => E) extends Attempt[E, A] {
  def map[B](f: A => B): Attempt[E, B] = Failure(err, f(value))
  def flatMap[B](f: A => Attempt[E, B]): Attempt[E, B] = f(value) match {
    case Failure(e, b) => Failure(append(e, err), b)
    case Success(b)   => Failure(err, b)
  }

  def error = Some(err)
  def either = Left(err)
}

Pretty trivial, but maybe occasionally useful. Here's how it looks in my code:

  def mergeComponents(f: String => JSONReport): Attempt[List[String], JSONReport] = {
    components.map(f).map(_.mergeComponents(f)).foldLeft[Attempt[List[String], JSONReport]](Success(this)) {
      (result, comp) => for {
        cur <- result
        jr  <- comp
        properties     <- mergeProperties(cur, jr)
        queries        <- mergeQueries(cur, jr)
        dataAliases    <- mergeAliases(cur, jr, queries.values)
        dataTransforms <- mergeTransforms(cur, jr, dataAliases.values)
        dataMailers    <- mergeMailers(cur, jr, dataTransforms.values)
      } yield {
        JSONReport(
          cur.reportId, cur.active, cur.version, properties,
          queries, dataAliases, dataTransforms, dataMailers,
          cur.dataRange, Nil
        )
      }
    }

Here each of the various mergeX methods return an Attempt[List[String], X] where X is something needed to build the ultimate report. Here I'm just aggregating lists of strings describing the errors, but of course any type for which (E, E) => E is defined would work.

Attempt. For all those times where you want a monad that says, "Hey, I maybe couldn't get exactly what you asked for. Maybe it's little broken, maybe it won't work right, but this the best I could do. You decide."

EDIT:

After a bit of thinking, I realized that this monad is really more general than being simply related to success or failure - it simply models a function that may or may not produce some additional metadata about its result. Then a lightbulb went off, and quick google search confirmed... yup, I just reinvented the writer monad. It's not *exactly* like Writer, because it just requires a semigroup for E instead of a monoid, and the presence of a "log" is optional, so maybe it's better suited than Writer for a few instances.

The one really nice thing about rediscovering well-known concepts is that doing the derivation for yourself, in the context of some real problem, gives you a much more complete understanding of where the thing you reintevented is applicable!

Inspired by michid's example in the comments below, here's a simpler example that demonstrates some utility his doesn't quite capture.

implicit val append = (l1: List[String], l2: List[String]) => l1 ::: l2

def succ(s: String) = Success[List[String], String]("result: success")
val fail: String => Attempt[List[String], String] = { 
  var count = 0 
  (s: String) => {
    count += 1
    Failure(List("failure " + count), s + " then failure")
  }
}

val r = for {
  x <- succ("here we go!")
  y <- fail(x)
  z <- fail(y)
} yield z

println(r)

This results in:

Failure(List(failure 2, failure 1),result: success then failure then failure)

I've been working a bit with Hadoop lately. As many will know, there are essentially two versions of Hadoop that coexist within the Hadoop API; a deprecated legacy API and a newer one that is not nearly so feature-complete. It's a source of confusion to say the least.

Earlier today, I asked in the #hadoop channel on freenode, why were the breaking changes not simply made on a separate branch, the major version number incremented, and so forth such that everyone could move forward? The answers that I got back illuminated a troublesome trend that I think is worth considering for those projects (like my beloved Scala) that are moving towards professional support models of funding.

During the discussion, I commented that there is an unfortunate reality that the largest companies, who may provide the greatest support any given open-source project are also those who will likely be most conservative with respect to change. So as soon as open-source projects grow large enough that companies selling commercial support for those projects survive, there is an immediate disincentive for the support companies to make breaking changes. And stagnation ensues.

The response was startling. People who had been involved with numerous prominent businesses and projects immediately replied with stories of their own projects suffering such a fate, sometimes to the point that the project died entirely.

I think that this "success effect" can be traced to a single, simple cause, with a single, simple solution. Companies (and the individuals that make them up) need to stop thinking of mission-critical software as something that is ever complete. Marketing is never complete; sales is never done with their job, purchasing and HR are never done, nor is accounting. Software is no different - as engineers, we do not (and should not purport to) produce solutions to the problems that companies have; instead, we automate and support the operation of businesses so that businesses can do more, serve more customers, and in the end create more wealth. A project is finished when nobody is using it; never before then.

logji

Thursday, July 29, 2010

A monad for failure-tolerant computations.

Thursday, July 15, 2010

An unfortunate situation.

About Me

Blog Archive