Programmable toaster ovens

"In contrast the Java community has stagnated, stuck with a language designed in a hurry to power toasters. "

Random image of the Banana Jr. in 'Bloom County': when its owner phoned the support hotline because it was misbehaving, the computer was put on the phone, and the technician threatened to have its chips used for programmable toaster ovens.

So much of Java seems to be very same-y: putting a web front end on a database, with EJB for scaling out. Tuple spaces are interesting, but C4I systems are blackboard based, and some of those were fielded ten or more years ago. It went mainstream in that market a decade ago.




Better GUIs

Every now and then I go and have a look back at Java land. Yesterday I found this blog entry
John O'Conner's Blog: Better GUIs are one step closer.

It's very painful making good layouts in Java. For my last large Java UI project, which ported a large mainframe ISPF application to run as a desktop application on PCs, I ended up implementing a layout and look-and-feel with most of the CSS box model on top of Swing. I would have used XUL for it, but it was a Java shop and adding another platform was too political. There already are CSS look-and-feels in Java, so it shouldn't still be an issue getting things lined up right.

Anyway, I tried to do the same thing in XUL, so that it looks like this on a Mac:


Now, this took a little while, as I had to remember that you have to specify widths to ensure each flexible box ends up the same size, but the declarative syntax means you don't really need an IDE and a graphical editor, and the full CSS support means you can skin things if you like.

But that's not the real problem. Better GUIs are not just better aligned GUIs - they are concerned with user experience. And if you have to produce that amount of code just to align your fields, then you won't be agile enough to respond to your users.




How fast is Java?

Link: http://blogs.sun.com/roller/page/dagastine?entry=java_is_faster_than_c

This is in response to David Dagastine's use of the SciMark numeric benchmark, where he finds that there's very little difference in speed between Sun's JVM and native C compiled with Visual C++ 6. The benchmark uses static methods to manipulate arrays of primitive data types, primarily doubles.

This agrees with my experience for numerical code. When I was lead software engineer in the technical computing group of the world's largest aerospace company, I moved several numerical projects from Fortran or C++ to Java. At the time, one of the major reasons was bugs in VC's optimisation code which meant the results were wrong, and discrepancies between release and debug builds that made C++ bugs harder to track, whereas Java had performance within 10% and much stronger guarantees of its results.

But MS got round to fixing the bugs, and in VS 2005 appear to have made some major gains in performance.

On my laptop (WinXP SP2, AMD Athlon 3400+), comparing Java 1.6.0-beta with VS 2005 Express gives 388.67 and 631.12 MFLOPS respectively, which is a much bigger difference than I observed between Java 1.4 and VS6. The SciMark code is portable C, so it doesn't use C++ intrinsics, which can give an order of magnitude speed up for certain code (though since it's working in double precision you probably wouldn't get that much improvement, for quite a lot less readability, as you can only process 2 doubles at a time with SSE2, but can process 4 floats, and so keep whole vectors in single registers).

So I'd disagree that Java 1.6 is faster than C/C++ in this case on two counts: firstly, it's significantly slower than the C code it was benchmarked against, and secondly, SciMark is not using current C++ techniques for optimising numeric code, which any serious numerical library would.

The SciMark code is optimised to a similar degree on both C and Java platforms, but the nature of the Java language means that there are further optimisations which are impossible (due to not having access to intrinsics or inline assembler), and many more that go against Java's intent of being a safe object oriented programming language (such as passing around char arrays rather than String objects).

I've observed before that, without any kind of meta-programming facility, writing optimised numerical Java is very painful (though I haven't had cause to try again - even with better function inlining, you have to copy all the parameters you need, and construct an object for any result which is more complicated than a single type, so it doesn't impact on the complexity much). For small projects, you can do it by hand; for others you have to either use a code generator from a higher level model, or just put up with it running slow. It would be nice if having to hard code optimisations disappeared from Java code as Sun improves the JVM's capabilities. For example, copying all state out of an object into local variables, then performing a tight loop, then copying back gives a measurable performance boost. The same applies for nested loops on multidimensional arrays - you hoist the reference to the row. Currently you have to write such tedious code by hand, though Fortress seems to favour such optimisations in its transactions, so may be a better way ahead for numeric code.
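A minimal sketch of the hand optimisation described above: copy state into locals, hoist the row reference, then copy the result back. The class and method names are hypothetical, made up for illustration rather than taken from SciMark, and no timing claim is attached to them.

```java
// Hypothetical example class; names are illustrative only.
final class GridSummer {
    private final double[][] grid;
    private double total;

    GridSummer(double[][] grid) {
        this.grid = grid;
    }

    // Straightforward version: the field 'total' and the row grid[i]
    // are re-read through the object on every inner iteration.
    void sumNaive() {
        total = 0.0;
        for (int i = 0; i < grid.length; i++) {
            for (int j = 0; j < grid[i].length; j++) {
                total += grid[i][j];
            }
        }
    }

    // Hand-optimised version: state is copied into locals, the row
    // reference is hoisted out of the inner loop, and the result is
    // written back to the object once at the end.
    void sumHoisted() {
        final double[][] g = grid;      // copy state out of the object
        double t = 0.0;
        for (int i = 0; i < g.length; i++) {
            final double[] row = g[i];  // hoist the reference to the row
            final int n = row.length;
            for (int j = 0; j < n; j++) {
                t += row[j];
            }
        }
        total = t;                      // copy back
    }

    double total() {
        return total;
    }
}
```

Both methods add the elements in the same order, so they should give identical results; the second is simply the version you end up writing by hand when the optimiser doesn't do it for you.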

For something a little closer to Java's enterprise use, say date formatting functions, the difference between Java6 and C++ in small benchmarks is closer to a factor of 7 slower, for example see this thread.

However, what I learnt in that thread isn't that if you port C to Java it runs seven times slower (usually it's not as bad as that, and as the SciMark above shows - a good programmer can write Fortran in any language and it'll run quickly), but rather that even good Java programmers won't think of solving the problem in the faster way, but use Java idioms instead.

The idiomatic Java way of solving that problem was twenty times slower than the C code. If you're used to thinking that StringBuilder is faster than StringBuffer is faster than concatenating String, you won't write the sort of code that works fast, however good the VM gets. The granularity at which safety is assured in Java - immutable objects - means you can't optimise across method calls. You need to wrap up a string as an immutable object to protect it from modification, rather than proving it's const via the type system.
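To illustrate the difference in idiom, here is a sketch that formats a date as YYYY-MM-DD in two ways. It is not the code from the thread mentioned above; the class name, method names and the choice of format are assumptions made for the example, and it assumes a year between 0 and 9999 with a valid month and day.

```java
import java.util.Locale;

// Hypothetical example; not the benchmark from the thread.
final class IsoDateWriter {
    // Idiomatic Java: parses the format string and allocates
    // a new immutable String on every call.
    static String idiomatic(int year, int month, int day) {
        return String.format(Locale.ROOT, "%04d-%02d-%02d", year, month, day);
    }

    // C-style: writes ten characters into a caller-supplied buffer
    // starting at 'pos' and returns the position after the last one.
    // Nothing is allocated, but nothing protects the buffer either.
    static int cStyle(char[] buf, int pos, int year, int month, int day) {
        buf[pos++] = (char) ('0' + (year / 1000) % 10);
        buf[pos++] = (char) ('0' + (year / 100) % 10);
        buf[pos++] = (char) ('0' + (year / 10) % 10);
        buf[pos++] = (char) ('0' + year % 10);
        buf[pos++] = '-';
        buf[pos++] = (char) ('0' + month / 10);
        buf[pos++] = (char) ('0' + month % 10);
        buf[pos++] = '-';
        buf[pos++] = (char) ('0' + day / 10);
        buf[pos++] = (char) ('0' + day % 10);
        return pos;
    }
}
```

The second method is the kind of thing a C programmer would write without thinking, and the kind of thing Java's String-based APIs steer you away from.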

Java is designed to be a safe, object oriented language. Fast code either operates above this level - for example, using a DSL to specify that you want to multiply a matrix row by a factor, and having it generate the code inline for that, working on the primitive types - or below this level, as C can and as the generated code would. Having to work only with objects with a statically typed interface means you don't have the flexibility to get a high level view of the logic, and get to feel the bits between your toes if you want to.

The granularity problem in Java also applies to array bounds and other checks - much of what can be known at compile time is lost by the Java type system. The attempt to fix this with generics didn't get far in terms of capability - you can only specify the generic type as far as the current Java type system allows, and the information is erased at runtime, so the JVM can't selectively remove redundant checks (other than checkcast). Traditionally, this hit you in loops of a known number of iterations against an array of known length – the array access instruction would still check the bounds. That one may have been mitigated, but only by means of specifically getting the JVM to look for that particular pattern. Traits allow you to provide a general mechanism. For example, a method which can only take a square matrix only needs to check its arguments if they are not known to be square at compile time, which depends where the method's being called from, so requires a backwards possible-caller analysis, and different code to be generated based on the results. Having an NxM matrix with a trait for squareness, also square matrix subclasses with the trait fixed, and knowing this trait doesn't change after the object is created would allow the check to be done both at compile and run time as appropriate. I'm not aware of any language yet that has that level of sophistication of traits inference. (I also don't like the compiler's lack of escape analysis for generics - if you've only ever put a Foo into a list, then you shouldn't have to tell it that it's a list of Foo. But that's for another time, as I'm not doing much Java programming in the real world now, and getting on to how traits can improve dynamic language performance is another post.)
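The square matrix case can be approximated by hand in Java, which shows both the idea and its limits. In the sketch below (all names hypothetical), a SquareMatrix subclass fixes the trait at construction, and an overload taking that static type skips the check. This is only a manual stand-in: the programmer has to write both overloads, and a square Matrix held in a Matrix-typed variable still pays for the runtime check, which is exactly what a possible-caller analysis would avoid.

```java
// Hypothetical classes illustrating a 'squareness' trait; not a real library.
class Matrix {
    final int rows;
    final int cols;
    private final double[] data; // row-major storage

    Matrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    // Dimensions are final, so this answer never changes after construction.
    final boolean isSquare() {
        return rows == cols;
    }

    final double get(int r, int c) {
        return data[r * cols + c];
    }

    final void set(int r, int c, double v) {
        data[r * cols + c] = v;
    }
}

// Subclass with the trait fixed: every instance is square by construction.
final class SquareMatrix extends Matrix {
    SquareMatrix(int n) {
        super(n, n);
    }
}

final class Traces {
    // Squareness unknown at compile time: check at run time.
    static double trace(Matrix m) {
        if (!m.isSquare()) {
            throw new IllegalArgumentException("matrix is not square");
        }
        return sumDiagonal(m);
    }

    // Squareness proven by the static type: no check needed.
    static double trace(SquareMatrix m) {
        return sumDiagonal(m);
    }

    private static double sumDiagonal(Matrix m) {
        double t = 0.0;
        for (int i = 0; i < m.rows; i++) {
            t += m.get(i, i);
        }
        return t;
    }
}
```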

So I don't think the JVM will ever be the fastest thing. Java's too much in the middle ground:

  • The language lacks the meta-programming facilities that hide the tedium of hand optimisations

  • The language lacks access to low level instructions that allow optimisations

  • There is no concept of a cross-object inference to prove unsafe operations cannot happen, rather it relies on information hiding to prevent them, which forces conservative restrictions and redundancy

  • Java idioms use object level safety and synchronised blocks, rather than traits and transactions. This prevents many possible optimisations based on direct access, thread-safe caching, and often requires redundant checks on information already known.

There were good reasons for these decisions in the evolution of Java, but the effects of them won't ever go away, and as long as it's the Java virtual machine, the JVM will have some compromise on its performance.

But it never was intended to compete with Fortran, and is fast enough for many uses.




Art, state of, one year on from last year (the).

My previous frustrations with XMLHttpRequest, and more recently finding that DeltaV didn't appear to be supported even in Firefox at work, may be eased if a bit of sensible flexibility gets the W3C spec to conform to the HTTP RFC's extension-method = token, rather than a vendor-specific white-list.
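For reference, the token rule is simple enough to check in a few lines. The sketch below follows the HTTP/1.1 definition (one or more US-ASCII characters, excluding control characters and separators); the class and method names are made up for this example and are not part of any browser or library API.

```java
// Hypothetical helper: checks a method name against HTTP/1.1's 'token' rule.
final class HttpToken {
    // The separators listed in the HTTP/1.1 grammar, including space and tab.
    private static final String SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    static boolean isToken(String s) {
        if (s.isEmpty()) {
            return false; // a token needs at least one character
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Reject control characters, DEL, non-ASCII, and separators.
            if (c <= 31 || c >= 127 || SEPARATORS.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }
}
```

Under this rule a DeltaV method name such as VERSION-CONTROL is just as valid as GET, which is the point: a client that accepts any token needs no list of approved methods.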

I'd still really like a browser based, version controlled, graph drawing tool for modelling and knowledge capture, but with the WhatWG's canvas and support for SVG in Firefox stable enough that I'm writing production code based on it, and the real possibility of single page applications such as this wiki using Amazon Simple Storage Service, I'm thinking of retiring the Java based, server-side image code of my LapisBlue, which I never got round to connecting to a versioned store anyway.

So I'm thinking of retiring LapisBlue, since I'm paying monthly for a full featured server solution that's not getting any use, whereas I can pay for a tiny amount of data storage and get the clients to do the rendering work now. Though proper version control would be nice, saving deltas or labelled versions to S3 should also be possible, more fun than configuring a Tomcat installation that pulls in a thousand or so libraries, and not reliant on extension methods as Subversion's DeltaV implementation is. What you lose is a queryable database, but I'm thinking of using it for a pattern wiki rather than anything else.

In other news, I got rather excited over the weekend thinking about using SSE for a faster 'byte' code interpreter, and resurrecting kin - my toy language for graph matching based code generators to generate simulation models defined generically on traits, which I'd partly implemented on the JVM - as a scripting language plugin for the gecko platform. If you can SIMD the graph matching, and maybe also either SIMD the bytecode scripting, or (since kin uses pure visitor functions a lot) use SIMD optimised blocks with scripting, you may get close to Java performance without having to track Sun's generics cruft.

It's still easier for me to write Java than C++, especially when you need to use libraries - each library having its own code conventions and memory management model - or Lisp for that matter (since I've done far more Java than Lisp in anger), but for many things JavaScript's good enough. The only things I've found this year that I've written in Java have been something to test an algorithm for work, which could have been written in anything really, and an annealing based graph layout, which ran too slow in JS to be useable. But annealing graphs is exactly what kin would be suited to, and it would be designed to parallelise it, so it may be that the Java world gets even smaller for me.

I'm not sure how useful web-based simulation tools would be, and suspect a good enough interpreter + a really, really good code generator would be a better match to a lot of the problems I like thinking about than trying to do anything like Sun's Hotspot, brilliant though it is.

Third point of this summary - I'm also excited about building distributed clusters of collaborating applications and services on top of xmpp. It's something I've been pushing at work, and I've got enough of the infrastructure there that the rest of my team are starting to play with it, building models and connecting them to XUL UIs with xmpp pub-sub. I've got till mid June to build it out to a full distributed system with service discovery, which means a mix of quite easy xml binding and some fairly hard concurrency work to get the simulators' execution model and the pubsub code working well without excessive threads or critical sections.

Oh, and I'm going to XTech2006 next month. It's nice to be working for somewhere that's not too stingy to send its people away again.

