PureScript and Google Closure notes
I played around with PureScript and Google Closure over the last couple of weeks. The interesting part is of course advanced compilation. It doesn’t work out of the box and there are some hurdles to getting it to work even in principle. I’m not going to work on it in the near future, but I want to write down some observations, just in case.
The reason I have looked at Closure at all is that I keep thinking of ways to generate better JavaScript code from PureScript. We could skip emitting dictionaries for classes with no methods. We do not need an object as a dictionary if the class has only one method; we could pass the function itself. We could specialize if the type is known. General inlining. Common subexpression elimination. Whatnot. Generating optimal code is a huge time sink. It would be nice if we could rely on Closure to do all the boring optimizations and only do cool language-specific optimizations in PureScript.
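To make the single-method idea concrete, here is a hand-written sketch. All names (showInt, showTwiceDict, showTwiceFn) are made up for illustration and this is not actual compiler output; it only shows the rough shape of the code before and after such an optimization.

```javascript
// Hypothetical dictionary for a class with a single method, roughly the
// shape a dictionary-passing encoding produces (illustrative names only).
var showInt = {
  show: function (n) {
    return String(n);
  }
};

// Dictionary-passing style: the function receives the whole dictionary
// and looks the method up on the object at each use.
var showTwiceDict = function (dictShow) {
  return function (x) {
    return dictShow.show(x) + dictShow.show(x);
  };
};

// Single-method variant: the method itself is passed, no object needed.
var showTwiceFn = function (show) {
  return function (x) {
    return show(x) + show(x);
  };
};

console.log(showTwiceDict(showInt)(4));    // "44"
console.log(showTwiceFn(showInt.show)(4)); // "44"
```

Both versions compute the same result; the second just avoids allocating the dictionary object and looking up a property on it.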
Closure as a bundler only (simple optimizations)
In addition to its own goog.module-based module system, Closure nowadays also handles CommonJS modules and ES6 modules. The PureScript compiler purs produces CommonJS modules in the output/ folder by default. This is how to call Closure to produce a single JavaScript file bundle.js with all modules bundled into one:
closure-compiler --process_common_js_modules --module_resolution NODE --dependency_mode STRICT --entry_point main.js --js main.js --js 'output/**.js' --js_output_file bundle.js
Note the --entry_point main.js argument. PureScript does not actually generate code that invokes any main function. Assuming you have a module Main, this is a main.js file that actually causes side effects to happen:
require('output/Main').main();
There can be multiple entry points. I think these are just modules assumed to be called from external code, and Closure will omit bundling any modules that are not transitively required by any entry point.
You might want to add --isolation_mode IIFE, which wraps the output in an immediately called function and enables further optimizations.
This seems to work okay. It is not particularly fast (about the same as browserify). It is also not particularly good at reducing code size. In fact, in my tests it has been much better to use purs bundle and then run Closure with simple optimizations on the resulting bundle. However, I think purs bundle cannot deal with npm dependencies. Closure does find modules in node_modules.
Advanced optimizations
Just adding -O ADVANCED to the above command does not work.
Constructors
Advanced optimizations require every function that is used as a constructor (that is, called with the new keyword) to be annotated with /** @constructor */. PureScript generates constructors for typeclass dictionaries (I think) and datatypes.
I hacked the PureScript code generation to add @constructor annotations to capitalized functions. Note: the constructor annotation needs to be on the binder, not the function itself. This took me ages to figure out.
// This is okay
/** @constructor */
var Foo = function (...) { ... };
// This is okay
/** @constructor */
function Foo(...) { ... }
// This is NOT okay
var Foo = /** @constructor */ function ( ... ) { ... };
Records
The obvious problem, and what makes Closure advanced compilation so great and so difficult, is records. Just briefly: Closure renames record fields consistently if you use the non-string syntax for construction and projection. Closure would rename field in {field: 5}.field to a, or something. Well, this example would optimize to just 5, but you get the idea. However, Closure does not rename fields that are introduced with the string syntax {"field": 5} or projected using the string/array syntax record["field"]. Now, if you mix these you are in trouble, because the "field" you are trying to project will likely have been renamed to a. The solution in JavaScript and ClojureScript is basically: people mostly use the non-string syntax because it’s shorter anyway, and you just have to be a bit careful to be consistent when you use strings.
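The mismatch can be reproduced by hand without running Closure at all. The sketch below simulates the renaming manually; the function names and the renamed label a are illustrative assumptions, not real Closure output.

```javascript
// Source as written: the record is built with the non-string syntax,
// so its label is a candidate for renaming.
var makeRecord = function () {
  return { field: 5 };
};

// Hand-simulated result of renaming `field` to `a`. This is NOT real
// Closure output, only the shape of what consistent renaming does.
var makeRecordRenamed = function () {
  return { a: 5 };
};

// A dot projection is renamed together with the construction site.
var projectDot = function (r) {
  return r.a;
};

// A string projection is left untouched and now misses the field.
var projectString = function (r) {
  return r["field"];
};

console.log(projectDot(makeRecordRenamed()));    // 5
console.log(projectString(makeRecordRenamed())); // undefined
```

Nothing fails loudly here: the string projection silently yields undefined, which is what makes mixed usage so unpleasant to debug.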
PureScript has an interesting challenge: RowToList and all the record stuff that builds on it, like records, simple-json, and so forth. I think in theory the set of record labels that will possibly be projected is computable at compile time, and with enough compiler smarts we could generate specialized getters and setters to enable consistent non-string label records. In practice, it’s pretty hopeless right now. I think the way to go would be to always emit record construction and projection using the string syntax.
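As a rough sketch of why the all-strings approach is safe: generic record code receives its labels as runtime strings, so it only keeps working if the fields themselves were never renamed. The helper unsafeGet and the person record below are hypothetical stand-ins, not code taken from any particular library or from the compiler.

```javascript
// Hypothetical FFI helper in the style of a generic record library:
// the label arrives as a runtime string.
var unsafeGet = function (label) {
  return function (rec) {
    return rec[label];
  };
};

// Hypothetical codegen output: string syntax for construction...
var person = { "name": "Ada", "age": 36 };

// ...and string syntax for projection, so Closure leaves both alone.
var age = person["age"];

// Generic traversal over a list of labels, as RowToList-driven code
// would do at runtime.
var labels = ["name", "age"];
var values = labels.map(function (label) {
  return unsafeGet(label)(person);
});

console.log(age);    // 36
console.log(values); // [ 'Ada', 36 ]
```

The price is that record labels are never shortened, so records get no size benefit from advanced compilation.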
Note: the same is not true for datatypes and typeclass dictionaries. These should use non-string syntax. Their runtime representation is meant to be hidden from JavaScript anyway, so not even FFI code should be affected (much).
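For comparison, this is roughly what a datatype such as Maybe could look like with non-string fields and the constructor annotation on the binder. The shape is an approximation of the generated code written from memory, and fromMaybe is written by hand for illustration.

```javascript
/** @constructor */
var Just = function (value0) {
  // Non-string field: free to be renamed, since only generated code
  // is supposed to look inside.
  this.value0 = value0;
};
Just.create = function (value0) {
  return new Just(value0);
};

/** @constructor */
var Nothing = function () {};
Nothing.value = new Nothing();

// Pattern matching compiles to instanceof checks plus dot projections,
// so construction and projection get renamed consistently.
var fromMaybe = function (def) {
  return function (m) {
    if (m instanceof Just) {
      return m.value0;
    }
    return def;
  };
};

console.log(fromMaybe(0)(Just.create(7))); // 7
console.log(fromMaybe(0)(Nothing.value));  // 0
```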
I thought of another approach. We could have different kinds of labels, those that get renamed and those that don’t. PureScript already has string syntax for labels (for labels with names of keywords and similar parser-related reasons). I suppose one could hijack those to have basically the same distinction as in Closure, where plain labels are subject to renaming, and string labels are guaranteed to be preserved. This is a massive change to the language though, and almost definitely not worth it now.
Other thoughts
One can annotate JavaScript with types that are used by Closure mostly for “typechecking” or at least linting and, supposedly, optimizations. While we could annotate at least some simple types, it is not clear how much would be gained.
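For a simple case, such an annotation might look like the following. The function addInt is a hypothetical example of a curried primitive, not actual compiler output, and whether Closure would do anything useful with the type is exactly the open question.

```javascript
/**
 * Curried integer addition with Closure-style type annotations.
 * @param {number} x
 * @return {function(number): number}
 */
var addInt = function (x) {
  return function (y) {
    // `| 0` truncates the result to a 32-bit integer.
    return (x + y) | 0;
  };
};

console.log(addInt(2)(3)); // 5
```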
The PureScript culture uses fairly little foreign code and it tends to be well-isolated into a few FFI files. This could make writing/checking “externs” files a lot nicer than in less disciplined languages and communities.
Conclusions
Closure with simple optimizations is not much better than other bundlers/minifiers. If you can, use purs bundle first, then run Closure on the result.
My codegen hacks to the PureScript compiler made it possible to compile some very simple experiments (that did not use records) with Closure’s advanced optimizations to really, really good JavaScript. No curried functions. All typeclass indirection gone. Pretty much perfect. It will not work as well for more abstract code than just printing results of arithmetic expressions, of course.
I don’t think it would be too hard to either change PureScript itself, or write an alternative backend, to produce Closure-compatible code. I still think the easy way to better performance is through using Closure, rather than reimplementing optimizations in the PureScript compiler.