2018-02-25

PureScript and Google Closure notes

Over the last couple of weeks I have played around with PureScript and Google Closure. The interesting part is of course advanced compilation. It doesn’t work out of the box and there are some hurdles to getting it to work even in principle. I’m not going to work on it in the near future, but I want to write down some observations, just in case.

The reason I looked at Closure at all is that I keep thinking of ways to generate better JavaScript code from PureScript. We could skip emitting dictionaries for classes with no methods. We do not need an object as a dictionary if the class has only one method; we could pass the function itself. We could specialize if the type is known. General inlining. Common subexpression elimination. Whatnot. Generating optimal code is a huge time sink. It would be nice if we could rely on Closure to do all the boring optimizations and only do cool language-specific optimizations in PureScript.
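To make the single-method idea concrete, here is a hand-written sketch. It is not actual purs output, and the names showIntDict, showInt, describeWithDict and describeWithFn are made up for illustration. It contrasts passing a dictionary object with passing the one method directly.

```javascript
// Hand-written illustration, not real compiler output.
// All names here are hypothetical.

// Dictionary-passing style: the instance is an object,
// even though the class has only a single method.
var showIntDict = {
  show: function (n) { return String(n); }
};

var describeWithDict = function (dictShow) {
  return function (x) {
    return "value: " + dictShow.show(x);
  };
};

// Possible alternative for single-method classes:
// pass the method itself and skip the object.
var showInt = function (n) { return String(n); };

var describeWithFn = function (show) {
  return function (x) {
    return "value: " + show(x);
  };
};

console.log(describeWithDict(showIntDict)(42));
console.log(describeWithFn(showInt)(42));
```

Both calls compute the same string. The second version saves one property lookup per call and one object allocation per instance.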

Closure as a bundler only (simple optimizations)

In addition to its own goog.module-based module system, Closure nowadays also handles CommonJS modules and ES6 modules. The PureScript compiler purs produces CommonJS modules in the output/ folder by default. This is how to call Closure to produce a single JavaScript file bundle.js with all modules bundled into one:

closure-compiler --process_common_js_modules --module_resolution NODE --dependency_mode STRICT --entry_point main.js --js main.js --js 'output/**.js' --js_output_file bundle.js

Note the --entry_point main.js argument. PureScript does not actually generate code that invokes any main function. Assuming you have a module Main, this is a main.js file that actually causes side effects to happen:

require('output/Main').main();

There can be multiple entry points. I think these are just modules assumed to be called from external code, and Closure will omit bundling any modules that are not transitively required by any entry point.

You might want to add --isolation_mode IIFE which wraps the output in an immediately called function and enables further optimizations.

This seems to work okay. It is not particularly fast (about the same as browserify). It is also not particularly good at reducing code size. In fact, in my tests it has been much better to use purs bundle and use Closure with simple optimizations on the resulting bundle. However, I think purs bundle cannot deal with npm dependencies. Closure does find modules in node_modules.

Advanced optimizations

Just adding -O ADVANCED to the above command does not work.

Constructors

Advanced optimizations require every function that is used as a constructor (that is, called with the new keyword) to be annotated with /** @constructor */. PureScript generates constructors for typeclass dictionaries (I think) and datatypes.

I hacked the PureScript code generation to add @constructor annotations to capitalized functions. Note: the constructor annotation needs to be on the binder, not the function itself. This took me ages to figure out.

// This is okay
/** @constructor */
var Foo = function (...) { ... };

// This is okay
/** @constructor */
function Foo(...) { ... }

// This is NOT okay
var Foo = /** @constructor */ function ( ... ) { ... };

Records

The obvious problem, and what makes Closure advanced compilation so great and so difficult, is records. Just briefly: Closure renames record fields consistently if you use the non-string syntax for construction and projection. In {field: 5}.field, Closure would rename field to a, or something. Well, this example would optimize to just 5, but you get the idea. However, Closure does not rename fields that are introduced with the string syntax {"field": 5} or projected using the string/array syntax record["field"]. Now, if you mix these you are in trouble, because the field you are trying to project as "field" will likely have been renamed to a. The solution in JavaScript and ClojureScript is basically: people mostly use the non-string syntax because it’s shorter anyway, and you just have to be a bit careful to be consistent when you use strings.
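The mixing problem can be shown without running Closure at all. The sketch below simulates the renaming by hand: the "renamed" functions are what I would expect the output to look like in spirit, not something the compiler actually printed, and all function names are made up.

```javascript
// Source as written: dot syntax for construction,
// string syntax for projection.
var makeRecord = function () { return { field: 5 }; };
var readByString = function (r) { return r["field"]; };

// Hand-simulated effect of property renaming (an assumption about
// the shape of the output, not real compiler output): the dot-syntax
// `field` became `a`, the string key "field" was left alone.
var makeRecordRenamed = function () { return { a: 5 }; };
var readByStringRenamed = function (r) { return r["field"]; };

console.log(readByString(makeRecord()));               // 5
console.log(readByStringRenamed(makeRecordRenamed())); // undefined
```

The program still runs after renaming, it just silently reads a property that no longer exists.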

PureScript has an interesting challenge: RowToList and all the record stuff that builds on it like records, simple-json, and so forth. I think in theory the set of record labels that will possibly be projected is computable at compile time and with enough compiler smarts we could generate specialized getters and setters to enable consistent non-string label records. In practice, it’s pretty hopeless right now. I think the way to go would be to always emit record construction and projection using the string syntax.
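As a rough sketch of what "always use the string syntax" would mean for generated code: construction and projection both quote the label, so a generic getter that receives the label as a runtime string (the kind of thing RowToList-based libraries need) keeps working. The names person and unsafeGet are hypothetical and not taken from any library.

```javascript
// Record construction with quoted labels: not subject to renaming.
var person = { "name": "Ada", "age": 36 };

// Statically known projection, also with the string syntax.
var personName = person["name"];

// Hypothetical generic getter: the label only exists as a runtime
// string, so it can never be rewritten consistently by a renamer.
var unsafeGet = function (label) {
  return function (rec) {
    return rec[label];
  };
};

console.log(personName);
console.log(unsafeGet("age")(person));
```

The price is that record labels never get shortened, which is the trade-off described above.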

Note: the same is not true for datatypes and typeclass dictionaries. These should use non-string syntax. Their runtime representation is meant to be hidden from JavaScript anyway, so not even FFI code should be affected (much).
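For comparison, this is roughly the shape of a compiled Maybe-like datatype, simplified and written from memory rather than copied from purs output. The fields use the non-string syntax (value0), so Closure is free to rename them as long as nothing outside touches them. fromMaybe is written here by hand for the example.

```javascript
// Simplified sketch of a compiled datatype, written from memory.

/** @constructor */
var Nothing = function () {};
Nothing.value = new Nothing();

/** @constructor */
var Just = function (value0) {
  this.value0 = value0; // dot syntax: eligible for renaming
};
Just.create = function (value0) {
  return new Just(value0);
};

// Pattern matching via instanceof and dot-syntax projection.
var fromMaybe = function (def) {
  return function (m) {
    if (m instanceof Just) {
      return m.value0;
    }
    return def;
  };
};

console.log(fromMaybe(0)(Just.create(7)));
console.log(fromMaybe(0)(Nothing.value));
```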

I thought of another approach. We could have different kinds of labels, those that get renamed and those that don’t. PureScript already has string syntax for labels (for labels with names of keywords and similar parser-related reasons). I suppose one could hijack those to have basically the same distinction as in Closure, where plain labels are subject to renaming, and string labels are guaranteed to be preserved. This is a massive change to the language though, and almost definitely not worth it now.

Other thoughts

One can annotate JavaScript with types that are used by Closure mostly for “typechecking” or at least linting and, supposedly, optimizations. While we could annotate at least some simple types, it is not clear how much would be gained.

The PureScript culture uses fairly little foreign code and it tends to be well-isolated into a few FFI files. This could make writing/checking “externs” files a lot nicer than in less disciplined languages and communities.
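An externs file is just JavaScript with declarations and type annotations and no real implementation, passed to the compiler with --externs. A minimal sketch could look like this; hostLog and hostInfo are hypothetical names standing in for whatever the FFI files of a project actually reference.

```javascript
// externs.js (hypothetical): declarations only, bodies stay empty.
// Names declared here are treated as external and are not renamed.

/**
 * Hypothetical host function called from an FFI file.
 * @param {string} message
 * @return {undefined}
 */
var hostLog = function (message) {};

/**
 * Hypothetical host object whose property names must be preserved.
 * @type {{version: string}}
 */
var hostInfo;
```

With foreign code confined to a few FFI files, a file like this would only need to cover the names those files use.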

Conclusions

Closure with simple optimizations is not much better than other bundlers/minifiers. If you can, use purs bundle first, then run Closure on the result.

My codegen hacks to the PureScript compiler made it possible to compile some very simple experiments (that did not use records) with Closure’s advanced optimizations to really, really good JavaScript. No curried functions. All typeclass indirection gone. Pretty much perfect. It will not work as well for more abstract code than just printing results of arithmetic expressions, of course.

I don’t think it would be too hard to either change PureScript itself, or write an alternative backend, to produce Closure-compatible code. I still think the easy way to better performance is through using Closure, rather than reimplementing optimizations in the PureScript compiler.