Monday, January 30, 2012

starting to maybe get the point of node.js

Listening to Glenn Block talk about node.js at Technology and Friends, I noticed something interesting about the philosophy behind node.js.

Threads are hard. Threads carry overhead, but more significantly, using them requires the developer to hold more stuff in their head. Node.js has a philosophy that leads to asynchronous processing without multiple threads.

Glenn explains how the node.js server runs your code on a single thread, but all the APIs are asynchronous, using callbacks. All the incoming requests go onto one queue. Whenever processing a request requires calling a service or doing I/O or anything that may take some time, that call is started asynchronously, and when it completes, its callback goes onto the same queue. Because there's only one thread, developers don't have to think about multithreading. Because of the asynchronous APIs with callbacks, the single thread doesn't block waiting for anything. Concurrent performance without threading.
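To make the queue idea concrete, here is a toy model of it. This is a sketch of the concept, not how node.js is implemented: the `queue` array, `fakeAsyncRead`, and `handleRequest` are all names I made up, and in real node.js the slow work would finish outside the JavaScript thread before its callback is queued.

```javascript
// Toy model of a single-threaded event loop (illustration only).
const queue = [];
const log = [];

function enqueue(task) {
  queue.push(task);
}

// Hypothetical stand-in for a slow call: it returns immediately and
// the callback is put on the queue to run later.
function fakeAsyncRead(name, callback) {
  enqueue(() => callback("data for " + name));
}

// A request handler that never blocks: it starts the slow call and returns.
function handleRequest(id) {
  log.push("start " + id);
  fakeAsyncRead(id, (data) => log.push("finish " + id + " with " + data));
}

// Two incoming requests land on the queue.
enqueue(() => handleRequest("A"));
enqueue(() => handleRequest("B"));

// The single thread: take one task at a time until the queue is empty.
while (queue.length > 0) {
  const task = queue.shift();
  task();
}

console.log(log.join("\n"));
// start A
// start B
// finish A with data for A
// finish B with data for B
```

Request B starts before request A finishes, even though only one task ever runs at a time.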

Here's a speculation. This idea is implemented in JavaScript because the restrictions of the browser and the mechanics of JavaScript have driven us toward this event-driven, callback model in that language. It is natural, then, to take this idea to the server in the same language. Is this model, with a single thread and a queue of incoming requests and callbacks to process, implemented in any other languages in different frameworks?

Friday, January 27, 2012

Encapsulation in JavaScript

This may be basic JavaScript:TGP, but I'm just getting the hang of it. I asked my JS peeps the other day, "If I have two functions and they each need access to one internal function, how do I encapsulate that?"

If I need to define one function that uses a nested function, all is well and good:

function validateSomeField() {
    function internallyNeededFunction() {
        ...
    }
    .... stuff that uses internallyNeededFunction ..
    return result;
}

$.validator.addMethod("handyValidation", validateSomeField);

However, if I want to define one more function that needs the same internal function, without exposing the internal method to the rest of the world, it gets a lot more complicated:

var thingThatKnowsHowToValidateSomeField = function() {
    var internallyNeededFunction = function() {
        ...
    };
    return {
        validateSomeField: function() {
            .... stuff that uses internallyNeededFunction ..
            return result;
        },
        produceErrorMessage: function() {
            .... stuff that uses internallyNeededFunction ..
            return result;
        }
    };
}();

$.validator.addMethod("handyValidation", thingThatKnowsHowToValidateSomeField.validateSomeField);
... and later I can pass the second useful function ...
messages: {
    handyValidation: thingThatKnowsHowToValidateSomeField.produceErrorMessage
}

So, in order to encapsulate that internal method in a place where two functions could access it, I had to store the result of an anonymous function that returns an object that contains the two functions I want to publicly expose. That way the internal function can be local to the anonymous function.
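Here is the same shape with the blanks filled in so it can actually run. Everything specific in it is invented for illustration: `zipValidator`, `digitsOnly`, and the five-digit rule are not from my real code.

```javascript
// Hypothetical example: two public functions share one private helper.
var zipValidator = (function () {
  // Private: visible only inside this anonymous function.
  function digitsOnly(value) {
    return String(value).replace(/\D/g, "");
  }

  // Only these two functions are exposed to the rest of the world.
  return {
    validate: function (value) {
      return digitsOnly(value).length === 5;
    },
    errorMessage: function (value) {
      return "Expected 5 digits, got " + digitsOnly(value).length;
    }
  };
}());

console.log(zipValidator.validate("12345"));    // true
console.log(zipValidator.errorMessage("12-3")); // Expected 5 digits, got 3
console.log(typeof digitsOnly);                 // undefined
```

The last line is the point: outside the anonymous function, the helper does not exist.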

... and then, Jay Harris showed me how to do this correctly.

function SomeFieldValidationClass() {
    if (!(this instanceof SomeFieldValidationClass)) {
        return new SomeFieldValidationClass();
    }

    var privateMethod = function() {
        ....
    };

    this.validateSomeField = function() {
        .... stuff that uses privateMethod ...
        return result;
    };

    this.produceErrorMessage = function() {
        .... stuff that uses privateMethod ...
        return result;
    };

    return this;
}

var thingThatKnowsHowToValidateSomeField = new SomeFieldValidationClass();

$.validator.addMethod("handyValidation", thingThatKnowsHowToValidateSomeField.validateSomeField);
... and later I can pass the second useful function ...
messages: {
    handyValidation: thingThatKnowsHowToValidateSomeField.produceErrorMessage
}

This creates a class. In JavaScript a class is a function, which acts as a constructor when you call it with "new". Private functions defined within the constructor are accessible only from privileged methods, which are defined on "this" in the constructor. The more conventional way to add methods to a class is to add them to the prototype, but then they wouldn't have access to the secret stuff in the constructor. The prototype technique also didn't work here because the validator framework calls my methods using ".call(...)" and passing in a "this" which is not an instance of my class, so anything a method looks up on "this" is missing.
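A small sketch of that last problem. The names (`Greeter`, `prefix`, `punctuation`) are hypothetical, and the framework is simulated by calling the methods with `.call` and an unrelated object as "this".

```javascript
// Hypothetical names throughout; the framework call is simulated with .call.
function Greeter() {
  // Private: reachable only through the closure.
  var prefix = function (name) {
    return "Hello, " + name;
  };

  // Privileged method: relies on the closure and never touches `this`.
  this.greet = function (name) {
    return prefix(name) + "!";
  };

  // Instance data that the prototype method looks up through `this`.
  this.punctuation = "?";
}

// Prototype method: cannot see `prefix`, and depends on `this`.
Greeter.prototype.ask = function (name) {
  return name + this.punctuation;
};

var greeter = new Greeter();
var someOtherThis = {}; // stands in for whatever a framework passes as `this`

console.log(greeter.greet.call(someOtherThis, "Ann")); // Hello, Ann!
console.log(greeter.ask.call(someOtherThis, "Ann"));   // Annundefined
```

The privileged method keeps working because it gets everything from the closure. The prototype method breaks because the "this" it was handed has no punctuation property.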

Wednesday, January 25, 2012

Consistency in Scala

One of the goals of Scala is increased consistency compared to Java.
Here are three high-level consistencies in Scala:

  • Every value is an object. There are no primitives in the Scala language. Even its version of void (Unit) has a value that is an object. Nil is an object. Arrays are objects. Everything is an object.
  • Every statement is an expression. Aside from declarations such as val, def, and import, every line of code returns something, if only Unit. Every "if" statement, every block, everything is treated as an expression.
  • Every operation is a method. There are no primitive functions; even + is a method on a number object. And because methods can be passed as values in Scala (the compiler wraps them in function objects), this means every operation can also be treated as an object.

The result of this consistency is a collapse of distinction between the smallest parts of the language and much larger, constructed parts -- hence, a SCAlable LAnguage.

For-confusion in Scala

In Scala, the "for" structure can throw off the Java programmer. For one, it looks enough like a Java "for" loop for a Java programmer to make a guess on how it works -- but that guess is likely to be wrong. For another, the for-loop is implemented in Scala as a special case of the for-comprehension, and its behaviour and purpose are quite different.

This post tackles the second of the two causes for confusion.
What is the difference between a for-loop and a for-comprehension? How can we tell them apart?

We have a sequence:
val seq = Seq(1,2,4)


Here's a super-simple for-loop, and its REPL output:
scala> for (s <- seq) println ("number " + s)
number 1
number 2
number 4


Here's a super-simple for-comprehension:
scala> for (s <- seq) yield 2 * s
res15: Seq[Int] = List(2, 4, 8)


The first part, for (s <- seq), is identical. However, these two are different constructs. Most blatantly, a for-loop returns Unit and a for-comprehension returns a sequence.

The for-loop is an imperative construct; it executes a series of statements. In this aspect it is the same as a for-loop in Java. You can put curly braces around the body of the for-loop and add as many statements as you like in there. This for-loop is implemented under the covers as a call to foreach on the sequence.

The for-comprehension is a functional construct; it performs a translation. The body of a for-comprehension is exactly one expression, and it begins with yield. You can't put curly braces around the yield itself; you'll get "illegal start of statement." (Braces after yield are fine, since a block is just another expression.) Conceptually, each time yield is executed, an element is added to the output sequence. Under the covers, this simple for-comprehension is implemented through the map method on the sequence. map is very familiar to functional programmers.

Keep in mind that since the for-loop is a special case of the for-comprehension in Scala, all the power of the data generation and filtering in a for-comprehension is available to the for-loop. These simple examples do not illustrate the power of each of these constructs; my purpose is to clarify the differences. When you use a for-loop, you're doing imperative programming, familiar to OO developers. When you use a for-comprehension, you're doing functional programming. Scala supports both paradigms, which is great, but watch out for confusion.

Sunday, January 8, 2012

Familiarity v readability

The topic of discussion at the office today (yes, it's Sunday) is Languages, Verbosity, and Java by Dhanji R. Prasanna, which purports to extol the clarity of Java compared to more expressive languages like Ruby, Python, and Scala. What it really says is: "Languages that are familiar are more readable."

Of course languages that are familiar to us are easier to read - in consequence, we have a strong bias toward the languages we know. How strong is this bias? Let's look at some examples from Prasanna's article.

The intro specifies that he loves Java with his whole heart. This pretty much determines the outcome for what language he is going to find the most readable.

As an example of Scala's syntax, he writes sum as a left fold across a list, passing the _+_ operator as a function. Um, hello? If I'm going to sum a list of numbers in Scala, I'm going to use the "sum" method on List. This method is much more readable than Java's syntax of a for loop. This example is completely unfair. No language can prevent the programmer from doing a simple operation in a cryptic fashion.

Another, more reasonable Scala v Java example: he finds
string.exists(_.isDigit)
to be less readable than seven lines of Java. This is determined entirely by familiarity. If you're used to asking a list whether it contains an item that satisfies a particular condition, then it won't disturb you to find this answer in one line with the "exists" method.

This also gets to a point where Prasanna and I disagree: what is the goal when reading the code -- is it to know exactly what is being executed, or to know what the program is doing?

Prasanna points out, correctly, that it's easier to tell exactly what is happening in Java code, whether a particular member is a field or a method, for instance. I could nitpick his examples, but... okay, let's nitpick an example. His assertion is that a Java expression is readable without access to context. He claims that
happy happy(happy happy) {
    happy.happy.happy(happy);
}

is completely clear. I dispute that. (To be extra nitpicky, it isn't valid Java; he forgot the return.) It could be a method call on a happy field of a happy field of a happy variable. Or, it could be a static method call on a happy class in a happy.happy package. Or a static method call on a happy inner class of a happy class in a happy package. Some of what we call "readability of Java" is dependent on the conventions and coding standards that we use without even thinking about them. We don't expose public fields, package names start with com or org, etc.

Having disputed whether Java makes clear exactly what's happening in a particular expression or statement (even though overall I agree with him that it does make it more clear than Scala or Groovy), I will now dispute whether we care.

The point of Scala and many dynamic languages is that we shouldn't care whether a particular member is a field or a method. If it yields the value we want, why should we care? Exactly like the Java for loop that is replaced by a one-line method call on a list, Scala lets us hide the implementation details. We don't need to worry about how we determine whether a list contains a particular item, and we don't need to worry about whether we're accessing a method or a field. Work at a higher level of abstraction. That's really where the efficiency increase comes from. It has nothing to do with typing fewer characters. It has everything to do with how much stuff we have to hold in our heads. If we can stop worrying about how we flip through the list, about what's going on under the covers in that expression, then we can hold more of the higher-level code in RAM. Then we can be better readers and writers of code.

I agree that Scala has more ambiguity than Java. The result is a higher level of abstraction, and therefore efficiency.

Admittedly, when you're ferreting out a bug, it helps to know exactly where a line of code will go. We have debuggers for that.

Prasanna also extols Java for being similar to C++, "which is buried deep within the collective consciousness of most programmers." No -- it is familiar to him and to his friends. There's a whole generation of programmers who've never used C or C++. When you're defending a language based on its resemblance to its historic ancestors, you're making my point for me. Familiarity is not a property of the language; it is a property of the current state of your brain.

The CoffeeScript example he gives, of a space making the difference between creating a new object and not creating one, is a language flaw if it is accurate; I agree with him there. All languages have them, and programmers develop conventions to circumvent them.

Finally, in the concluding section, my point is made when he says that Scheme is incredibly expressive and readable. Really? Scheme is readable? It is one of his favorite languages -- therefore it is familiar to him, and therefore it is readable.

If you've ever gone from using Windows to using Mac, everything is harder for a few days. If you choose to adjust your user model, then you'll find the Mac perfectly usable. Programming languages are the same way, although it takes more than a few days to make friends with a language, especially one as deep as Scala. We as programmers have the capability of learning new idioms and new ways of looking at concepts and constructs. Let's not blame the language for our lack of familiarity. Let's learn instead.