I'm proposing a new programming language, called Guava, for Genuinely Usable Java. Guava would be very like Java, but would be designed for usability by learners, not for safety in the hands of experts. This post is to suggest some ideas about the motivation for Guava, and to lay out some of the principles that would guide its development.
I've been talking to friends for many years about the problems with using Java as a language for teaching programming. Java is a difficult language that requires the programmer to keep many details in mind simultaneously in order to produce working programs. I just finished teaching our CS2 course, which teaches Java and data structures. While the students loved learning a "real" programming language, we all found it frustrating to deal with the many aspects of Java that make it a very difficult language for beginners.
There are many languages that are much nicer as a first programming language. Scheme has the advantages of grace, elegance, and one of the best books ever written about computer science. Python has the advantage of simplicity, and real-world power. C has the "advantage" of being close to the machine. (I am among those who believe that if the fundamental power of computer science is abstraction, we ought to begin with some deep abstractions!) However, most programmers are likely to eventually end up writing programs in Java, and all three of these languages suffer as beginning languages for someone whose trajectory is toward Java. (It is appropriate to continue the argument about whether Java is the "right" language for most programs to be written in. Certainly Java is neither elegant enough nor extensible enough that we should believe it will survive longer than the usual ten or twenty years of dominance. But: we academics need to recognize that we don't get to make that choice. Java is what our students will have to program in, at least for the first part of their careers.)
Scheme is awkward as a first language for eventual Java programmers because the functional approach in Scheme does not prepare students well for the imperative style of Java. We may all love the style, but taking a detour before starting down the imperative path is an ivory tower exercise. Python is much more similar to Java -- but has several unusual syntax choices that get in the way of easily migrating to Java. Further, once students have learned Python, they're going to be frustrated with the strictures of Java programming. Better would be a path that would let them write much of their program in a language at the same level as Python, while having access to full-featured Java for those parts of their program that benefit from the rigor.
Those of you who have explored Groovy are already thinking that you have the answer. Groovy goes a long way to solve these problems. It runs on the Java VM, and smoothly interacts with Java programs. Classes can be written in either Groovy or Java and interact smoothly with classes written in the other language. But, Groovy makes several bizarre syntax choices that will make the transition to Java more difficult for students -- such as freedom from semicolons, except of course when they're needed -- and has put many of us off with frustrating run-time errors caused by unnecessarily complicated language semantics. Groovy may well eventually be the solution, but for now I find it a frustrating waypoint on the path to a truly usable dialect of Java.
Of course, there are many other languages available for the Java VM. One I particularly like is Scala, which has many elegant language features from modern language theory. But Scala is not really a language that wants to provide a graceful transition to Java. Scala is more useful as a demonstration of how powerful and elegant features from OCaml could be brought to Java.
I propose a new dialect of Java, called Guava, for Genuinely Usable Java. (Yes, Guava isn't really an acronym, unless you think it's kind of cool that it drops the J in Java, as a sign that usability often means removing features instead of adding them.) (There is already a language called Guava, discussed in a 2000 SIGPLAN paper, but I don't see more recent articles on it, so I think we should take over the name.) (While we're going crazy on parens: I'm not sure whether Guava will really qualify as a "dialect" of Java, since it will in many ways be a very different language. In particular, Guava will allow functions outside of a class, code outside of any function, and higher-order functions (functions as arguments to other functions), none of which Java allows. However, Guava will be in a very deep way Java-like: Guava language syntax will map directly to Java syntax so learners can easily move back and forth between the two languages.)
Note that criticising the usability of existing languages for teaching beginners is not the same as criticising the languages directly. It may well be that Java has made the right trade-offs for expert programmers who are willing to devote years to developing mastery. It is possible that Guava will only ever be used by people learning to program, or by people writing very simple programs, such as are now written in scripting languages. Perhaps all important Guava programs will eventually be migrated to true Java. That's okay! Guava is intended to be a simpler language usable for non-experts, with a direct path to deployment in full Java.
There are many careful decisions to be made on the path to Guava. These decisions will all be based on three core principles that form The Guava Manifesto:
1) Guava will be compatible with Java. Guava syntax will be like Java syntax. Guava will use exactly the Java object hierarchy, including Java Strings, Arrays, Lists, and Maps. Guava classes and Java classes will be completely interoperable.
2) Guava will support functional programming. Programmers will be able to pass functions as arguments without having to know how to create anonymous inner classes.
3) Guava will simplify the Java type system, transparently. Guava programs will be strongly typed. The difference between primitive and "boxed" types will be invisible to the programmer. Types that are developed in Guava will automatically be able to be compared and stored in maps with the expected semantics.
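To make principles 2 and 3 concrete, here is a minimal sketch of the plain-Java boilerplate they are meant to hide. The class and method names (ManifestoSketch, Point, sortByLength, labelAt) are made up for illustration; nothing here is Guava syntax, which has not been designed yet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ManifestoSketch {

    // Hypothetical value type, used only for this illustration.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Principle 3: without these two overrides, two Points with the
        // same coordinates would be treated as different map keys.
        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Point)) {
                return false;
            }
            Point p = (Point) other;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    // Principle 2: to pass "compare by length" to the sort routine, the
    // programmer has to wrap it in an anonymous inner class.
    public static List<String> sortByLength(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        Collections.sort(sorted, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return sorted;
    }

    // Principle 3: storing and finding a user-defined type in a map.
    public static String labelAt(int x, int y) {
        Map<Point, String> labels = new HashMap<>();
        labels.put(new Point(1, 2), "home");
        // A fresh Point with equal coordinates finds the entry only
        // because equals and hashCode are overridden above.
        return labels.get(new Point(x, y));
    }

    public static void main(String[] args) {
        System.out.println(sortByLength(Arrays.asList("pear", "fig", "banana")));
        System.out.println(labelAt(1, 2));

        // Principle 3: boxed values compare by reference with ==, so
        // beginners must remember to use equals instead.
        Integer a = 1000;
        Integer b = 1000;
        System.out.println(a.equals(b)); // value comparison
    }
}
```

A beginner who only wants to "sort by length" or "look up a point" has to learn interfaces, inner classes, overriding, and hashing first. The manifesto's claim is that Guava should make all of that invisible while compiling to exactly this kind of Java.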
Of course, these principles don't answer all -- or even most -- questions about how the language ought to work. For instance, should Guava provide operator syntax for commonly used operations, like List or Map operations? On the one hand, beginners would benefit from simpler syntax than Java provides. On the other hand, directly using the Java "everything looks like a method call" style would provide a smoother transition to Java.
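For reference, this is roughly what the "everything looks like a method call" style looks like in plain Java for a simple word count. The names MethodCallStyle, countWords, and firstWord are hypothetical, and the operator forms mentioned in the comments are only one guess at what a simpler syntax might be, not a proposal.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MethodCallStyle {

    // Count how often each word occurs, using only method calls.
    public static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            // An operator syntax might read something like counts[w];
            // that form is an assumption made for this illustration.
            Integer seen = counts.get(w);
            counts.put(w, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    // List indexing is also a method call rather than words[0].
    public static String firstWord(List<String> words) {
        return words.get(0);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a");
        System.out.println(countWords(words).get("a"));
        System.out.println(firstWord(words));
    }
}
```

The trade-off is visible even in this small case: get and put are noisier than an indexing operator would be, but a student who learns them has nothing to unlearn when moving to Java.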
I'm not certain I've articulated the best set of principles yet. What do you think?
John
Why not Rhino JavaScript?
Hi,
Rhino (http://www.mozilla.org/rhino/) is a JavaScript implementation written in Java. It satisfies all core principles, in fact:
1) Rhino can use underlying Java classes, data structures, and libraries, and vice versa.
2) JavaScript supports functional programming, and functions are first-class objects in JavaScript.
3) Rhino is weakly typed for native JavaScript types, but strongly typed for imported Java objects.
Moreover, JavaScript is another language that students will have to program in.
Naw
I don't think it's necessary to teach students "useful" languages as part of a CS curriculum: CS students should be exposed to many different languages in order to develop the *principles* of programming rather than learning the quirks of any particular language. In fact, I'd force this issue by using only esoteric, non-"useful" languages until the later part of the sequence.
Teach Useful Stuff
Reid:
Interesting that we disagree so much on this issue, since I usually find your perspective valuable. I have two reasons I think your approach is wrong here, though:
1) Academics are lousy at predicting what the important principles of programming are. We'll be more likely to be helpful if we teach principles that are known to be practical.
2) Computer science is one of the most accessible of the academic disciplines. Talented, young students can participate actively in research and practice very quickly, if they are introduced to the right tools. Let's let them play in the real world right away!
John
Consider the multi-paradigm language Oz (and Mozart)
After reading through your analysis of Scheme, Python, etc. it seems to me that what you are really in search of is a practical, multi-paradigm language. Have you considered Oz?
The language Oz and its environment Mozart are the basis for the great textbook Concepts, Techniques, and Models of Computer Programming (Peter Van Roy and Seif Haridi), which many believe will take the place of SICP as The Canonical Computer Science text. Everything is freely available, plus there are numerous examples of practical applications and research artifacts that demonstrate its "real world" value.
Thoughts on Oz...
Having used Oz (in the graduate-level PL course at my undergrad institution, where the text was CTMCP), I don't think it's a good language to throw at our beginning students.
It's an excellent language for advanced students to explore programming language concepts (hence it was a brilliant choice for the course I used it in), but it's got a few too many weird corners. It feels very much like a research language, and as such is good for exploring and experimenting, but doesn't, in my opinion, do much to prepare programmers for 95% of actual programming.
For the public record: Oz is a dynamically-typed higher-order logic programming language with syntactic features to make it readily usable in a functional style, and additional features for object orientation, lazy evaluation, etc. It particularly shines in concurrent programming - it uses a dataflow-based declarative concurrency model that, so long as you stay declarative, makes race conditions impossible. Thread synchronization is done via unification of unbound variables.
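For readers who think in Java, a very rough analogy for a dataflow variable is a single-assignment future: a reader blocks until some other thread binds the value, and a second binding is rejected. The sketch below uses java.util.concurrent.CompletableFuture with made-up names (DataflowSketch, readAfterBind, secondBindRejected). It is not Oz, and it does not model unification or logic programming at all.

```java
import java.util.concurrent.CompletableFuture;

class DataflowSketch {

    // The reader waits on an unbound variable; the producer thread binds it.
    public static int readAfterBind() throws InterruptedException {
        final CompletableFuture<Integer> x = new CompletableFuture<>();

        Thread producer = new Thread(new Runnable() {
            @Override
            public void run() {
                x.complete(42); // bind the variable
            }
        });
        producer.start();

        int value = x.join(); // blocks until x has been bound
        producer.join();
        return value;
    }

    // Once bound, the value cannot be changed: complete returns false
    // when the future already holds a value.
    public static boolean secondBindRejected() {
        CompletableFuture<Integer> x = new CompletableFuture<>();
        x.complete(1);
        return !x.complete(2) && x.join() == 1;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readAfterBind());
        System.out.println(secondBindRejected());
    }
}
```

Because the reader never sees an intermediate state, the result does not depend on thread scheduling, which is the flavor of the "no race conditions while you stay declarative" property described above. In Oz this behavior is built into every variable rather than opted into through a library class.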
I think the concepts in Oz represent some useful directions for programming languages, and I hope that many of its insights are carried into the next Big Language, but at this point, I think that it's something best saved for their junior, senior, or graduate years.
Incidentally, though, I've been thinking similar things as John Riedl on a usable Java for teaching purposes. I'd tweak a few things (I'm a little more willing to dispense with or abstract the Java collections framework, for example), but I think there's a lot of room for such a language. And not necessarily tied to the JVM - why couldn't such a language also have backends for the CLR and LLVM (and thus native code)?
Oz?
John:
Sounds very interesting. I'll take a look at it.
My current thinking is that I'm not looking for a practical multi-paradigm language, but rather for a language that supports smooth transition to Java, while being accessible to beginners. (Which Java fails at, badly.) Of course, the desire for Java is based on the assumption that Java is an important practical tool for computer scientists, and that students have to eventually learn it. That assumption will eventually be wrong.
John