Saturday, October 13, 2007

Type-Converting Operator Considered Ridiculous

I have come to question the wisdom of the type-converting == operator in JavaScript. The algorithm (defined in §11.9.3 of the ECMA-262 v3 spec) requires no fewer than 22 steps to execute, and produces such amazing intransitivities as:

>>> '0' == 0
true

>>> 0 == ''
true

>>> '' == '0'
false

and random quirkiness such as:

>>> ' \t\r\n' == 0
true

but:

>>> null == 0
false

>>> null == false
false

>>> null == ''
false

>>> null == undefined
true

Do these behaviors strike anyone else as, well, a bit off? In the first example (discussed elsewhere), the intransitivity of the equality operator is the result of wanting an equality operator that plays well within a weakly-typed language. If you want to be able to compare numbers to their string representations as though they were of the same type, while still preserving string comparison, this is the logical result. The second example is a result of the ToNumber algorithm, described in §9.3ff. The conversion of whitespace to the number value 0 is strange, but defined in the spec (in fact, it's a result of the same algorithm that dictates Number(' ') => 0). The third example seems bizarre as well; I'm not sure why null should be treated differently than 0 or false, but the same as undefined, in such expressions. (Mechanically, it's because the comparison algorithm never coerces null to a number: null compares equal to undefined and to itself, and falls straight through to false against everything else.) Articles from otherwise reputable sources often describe all of these values in terms of "truthiness" and "falsiness", glossing over exactly these distinctions.

I'm not advocating that JS transition into a strongly-typed system, but some of these behaviors are quite subtle. Given that the vast majority of JavaScript programmers seem to be unaware of the strict-equality operator === (and its negative counterpart, !==), it's no wonder behaviors like these cause confusion. See also this article for even more pitfalls, this time with arithmetic operators combining string and numeric types.
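For contrast, here are the same comparisons under strict equality (a quick console check; === never converts types, so any mixed-type comparison is simply false):

>>> '0' === 0
false

>>> '' === '0'
false

>>> ' \t\r\n' === 0
false

>>> null === undefined
false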

So as Crockford says, use the strict comparison operator === unless you know what you're doing.

Thursday, October 4, 2007

All the Code Security You Need

Lots of beginning JS coders seem to have this idea that the web is all about having your cake and eating it too. I want to participate in the glorious Internet revolution, I just don't want to share my ideas or information with anybody else. I'm writing rich internet applications that are so great, just so unique, that I want to make sure nobody else can possibly look at my code and steal all those great juicy super-awesome proprietary ideas I've got. And maybe, just for bragging rights, I want my files to look HUGE as well. I've mocked up a handy little device that can take care of both tasks at once. Behold, the decompressJS module:

var decompressJS = (function () {
    var encoding = [' ', '\t', '\n', '\r'];
    var revEncoding = (function () {
        var ec = {};
        for (var i=0; i<encoding.length; i++) {
            ec[encoding[i]] = i;
        }
        return ec;
    })();

    return {
        encode: function (codeText) {
            var out = [], cc, i=0, len=codeText.length;

            while (i<len) {
                cc = codeText.charCodeAt(i++);

                // each whitespace char carries two bits, so four
                // output chars cover the low 8 bits of each source
                // char, which is plenty for ASCII source code
                for (var j=0; j<8; j+=2) {
                    out.push(encoding[(cc>>j)&3]);
                }

            }
            return out.join('');
        },

        decode: function (encodedText) {
            var out = [], cur, i=0, len=encodedText.length;
            while (i<len) {
                cur = 0;
                for (var j=0; j<8; j+=2) {
                    cur += revEncoding[encodedText.charAt(i++)]<<j;
                }
                out.push(String.fromCharCode(cur));
            }
            return out.join('');
        }
    };

})();

Like any mature programming paradigm, my decompressJS can even effectively encode and decode itself: the above 1225-character-long code example (including whitespace) decompresses neatly into an all-whitespace string of no fewer than 4900 characters! Just run your code through this baby and it'll come out the other end fully converted into invisible whitespace characters! Transfer your JS code over the wire looking like nothing but an empty, all-whitespace file! Magically increase your file size by a factor of 4! Impress your colleagues with your invisible code! Who's laughing now, code stealers?
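If you want to try it yourself, a quick round trip looks something like this (the variable names here are mine, not part of the module):

var source = 'var answer = 42;';
var hidden = decompressJS.encode(source);    // four times as long, all whitespace
decompressJS.decode(hidden) === source;      // true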

Friday, July 27, 2007

Glorious links

BoingBoing's been on a good tear recently. Better than usual. For those of you whose eyes have glazed over, go back and re-read these posts:

Tuesday, July 17, 2007

All the Templating You Need

function replicate(data, template) {
    var indices = {};
    var i=0, item;

    // build direct map of column names -> column index
    while ((item = data.columns[i])) indices[item] = i++;

    return data.rows.map(function (row) {
        return template.replace(/__\$\{(.+?)\}__/g, function (str, keyword) {
            return (keyword in indices)? row[indices[keyword]] : '';
        });
    }).join('\n');
}

var data = {
    'columns': ['adj', 'noun'],
    'rows': [
        ['main', 'man'],
        ['leading', 'lady'],
        ['green', 'dreams']
    ]
};

var template = '<p>__${adj}__ __${noun}__</p>';
replicate(data, template);
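
For reference, the call above should produce something like the following (assuming an Array.prototype.map implementation is available, as in Firefox):

<p>main man</p>
<p>leading lady</p>
<p>green dreams</p>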


Peter Michaux has some nice ideas about keeping the JSON format DRY, so that the data returned resembles something more like a list of Python tuples. (Python is probably also the single language that has most helped me understand efficient JavaScript patterns.)

Client-side transforms - converting an XML or JSON response into HTML on the client, to save server bandwidth and processing time - are a key part of modern web apps, but I'm not sure about a transform system that implements full-blown JavaScript logic. Branching and looping can be implemented easily in transforming functions; several templates can be used and plugged into each other, leading to nested data structures in the response. (Hopefully, time permitting, I'll get to demonstrate how that works soon.)

innerHTML may not be a part of any standard, but there's no reason why it shouldn't be. Sometimes we need to interact with the DOM as a tree, sometimes it's more useful to unleash JavaScript's string parsing and regex power on it.

Monday, July 9, 2007

One-line CSS minifier

CSS minification in one line:

$ cat sourcefile.css | sed -e 's/^[ \t]*//g; s/[ \t]*$//g; s/\([:{;,]\) /\1/g; s/ {/{/g; s/\/\*.*\*\///g; /^$/d' | sed -e :a -e '$!N; s/\n\(.\)/\1/; ta' >target.css

With comments:

$ cat sourcefile.css | sed -e '
s/^[ \t]*//g;         # remove leading space
s/[ \t]*$//g;         # remove trailing space
s/\([:{;,]\) /\1/g;   # remove space after a colon, brace, semicolon, or comma
s/ {/{/g;             # remove space before an opening brace
s/\/\*.*\*\///g;      # remove comments
/^$/d                 # remove blank lines
' | sed -e :a -e '$!N; s/\n\(.\)/\1/; ta # remove all newlines
' > target.css

Using this script, I was able to chop about 29% (10K) off our master.css file. It assumes that lines that should end in semicolons actually do. It may not play well with certain freakish outdated CSS hacks. Use at your own risk, and always test thoroughly before releasing into the wild.
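
For illustration, a small hand-made example (not from our master.css): a rule set like

/* masthead */
h1, .title {
    color: red;
    margin: 0;
}

comes out the other end as

h1,.title{color:red;margin:0;}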

Saturday, July 7, 2007

The Problem with SlickSpeed

For the past month or so, there's been a lot of noise about the SlickSpeed Selectors Test Suite. Since I'm in the market for a good selector engine for Zillow, and since it's a bit of a rite of passage (a front-end web dev's equivalent of compiler authoring?), I wrote my own, to see how well I could do and to see how it stacks up to the rest.

So of course, I modified the suite (under its MIT license) to test my little attempt as well. I was pleased with my initial results, but found the test document that comes packaged with the suite to be a little simplistic. Not enough variety or depth of nesting; the resulting DOM structures don't really resemble what I look at on a daily basis at work. I wanted to measure performance in the wild. So I replaced Shakespeare's As You Like It with the Home Details Page for Zillow.com, perhaps the most complex page on the site. Among other things, it includes a photo gallery, a Flash map, a Live Local-based Bird's Eye View map, a chart widget, several ads, tables, etc.

You can see the results for yourself here.

As it turns out, according to SlickSpeed, my engine outperforms all but 2 of the other engines on Firefox, and is the best performer on IE7.

So my misgivings about the nature of the document wandered over to the construction of the queries. The given queries perform a "breadth" pass, but they don't really provide a "depth" pass covering all manner of combinations of possible selectors, so I wrote my own addition to the suite that picks random elements from the DOM and generates a matching CSS1 selector for each. You can see the dynamic test suite here.
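A minimal sketch of that kind of generator (my own reconstruction for illustration, not the code actually used in the modified suite):

// Pick a random element, then walk up the tree building a simple
// CSS1 selector (tag, class, id, and descendant combinators only).
function randomSelector(doc) {
    var all = doc.getElementsByTagName('*');
    var el = all[Math.floor(Math.random() * all.length)];
    var parts = [];
    while (el && el.nodeType === 1 && el !== doc.body) {
        if (el.id) {
            parts.unshift('#' + el.id);
            break;                             // an id anchors the selector
        }
        var cls = (el.className.match(/\S+/) || [])[0];
        parts.unshift(el.tagName.toLowerCase() + (cls ? '.' + cls : ''));
        el = el.parentNode;
    }
    return parts.join(' ');
}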

Now, my Element.select engine's performance is fair to decent, but it's no longer the front-runner. Unless I can iron out the kinks, I might look into Ext's engine, especially since it fits nicely into the Yahoo! UI library we use at Zillow.

On the other hand, my Element.select engine is stand-alone and does not provide any other services or dependencies. It's a whopping 6KB (minified), but I wouldn't recommend the use of a CSS query engine for anything short of a full-scale web application anyway.

Some thoughts, though: for reasons that should be self-explanatory, it appears that all of the CSS engine makers are optimizing for Firefox. And once again, Opera's JavaScript engine (named Linear B) and DOM implementation beat out all the rest. Performance on IE looks to be poorer all around. The Internet Explorer team certainly has its work cut out for it, not only in improving DOM and JScript performance and its developer tools (a decent profiler, and a debugger that isn't attached to Visual Studio, would be nice), but also in winning over a hostile developer community. I guess that's what happens when the maker of the World's Number One Browser shuts down its development team for five years.

Prototype and MooTools appear to compile CSS selectors into XPath expressions for Firefox's (and Opera's) built-in XPath evaluator (too bad IE forgot to allow MSXML's XPath engine to apply to the HTML DOM). While the DOM performance of these XPath-based implementations is fantastic, it also helps underline the difference in end-user experience between browsers. Let's hope users take notice of how much faster the leading non-IE browsers are in comparison; it's hard to win users over on the basis of standards compliance alone.
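The basic trick looks roughly like this (a simplified sketch of my own, not Prototype's or MooTools' actual code; it handles only tag and class selectors joined by descendant combinators):

function queryByXPath(selector, root) {
    // translate e.g. "div.post a" into
    // ".//div[contains(concat(' ', @class, ' '), ' post ')]//a"
    var parts = selector.split(/\s+/), steps = [];
    for (var i = 0; i < parts.length; i++) {
        var m = parts[i].match(/^([\w*]+)?(?:\.([\w-]+))?$/);
        var step = m[1] || '*';
        if (m[2]) {
            step += "[contains(concat(' ', @class, ' '), ' " + m[2] + " ')]";
        }
        steps.push(step);
    }
    var result = document.evaluate('.//' + steps.join('//'),
        root || document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
    var nodes = [];
    for (var j = 0; j < result.snapshotLength; j++) {
        nodes.push(result.snapshotItem(j));
    }
    return nodes;
}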

If nothing else, I hope my modified SlickSpeeds will help CSS query engine developers focus on what's important: CSS1 queries. The time scores at the bottom of the SlickSpeed test skew heavily toward obscure pseudoclass and attribute selectors that I, for one, will rarely use. It's the meat-and-potatoes tag, class, and ID selectors that really count.

Sunday, July 1, 2007

Sicko

As we all know, Google has great power. It's power that comes from the masses: utilizing and channeling the activities, ideas and opinions of millions via the Web. All that information and trust capital can be a powerful tool for sustaining an environment that encourages democracy. Or not.

Update:

Apparently now, for some at Google, democracy is available to the highest bidder.

But the more important point, since I doubt that too many people care about my personal opinion, is that advertising is an effective medium for handling challenges that a company or industry might have. You could even argue that it's especially appropriate for a public policy issue like healthcare. Whether the healthcare industry wants to rebut charges in Mr. Moore's movie, or whether Mr. Moore wants to challenge the healthcare industry, advertising is a very democratic and effective way to participate in a public dialogue.
That is Google's opinion....

Web-Native UX

I'd like to address something many in the User Experience community would rather avoid, since many times it may interfere with deploying the latest cool widget or Ajax technique that comes down the pike. I want to talk about User Experience Consistency and the web. Because while standards bodies have come into being to coordinate the development of the cornerstone technologies from which we build interfaces, and any web developer worth her salt pays attention to valid, semantic markup and the very latest in CSS techniques and the newest developments in unobtrusive scripting and REST and microformats, the pace of development in web-wide usability standards has been glacial at best.

I bring this up because I've been noticing a slower-than-expected adoption rate for highly-usable, widget-heavy, responsive, dynamic, configurable, powerful web applications. My source? Purely anecdotal and completely unscientific: friends, family, and even coworkers at Zillow, who express frustration and antipathy toward websites for even minor perceived flaws, while clunky interfaces on other, more primitive sites are tolerated and even preferred over their more elegant "web-application-y" counterparts. With the exception of certain Google and Yahoo! applications, many powerful, innovative web apps are simply being ignored. In the rush to push the browser to its limits, it's easy to lose sight of the end goal: making routine tasks easier for end users, in the most straightforward way possible.

Web developers are a sensitive bunch - the profession long disregarded in the eyes of "serious" programmers. Ajax was to change all of that. And with impressive things now being done in one of the most challenging software development environments, the front-end of web development has finally been able to attract some formidable talent away from server-side, OS, and game development. For the better, I should think, the Web has been gaining ground, not only as a place to exchange information, but as a valid, full-fledged platform for software development. That's the idea behind all those standards out there: eventually, if we clap our hands and work hard enough, the Web might supplant desktop-native applications for all but a few specialized purposes. Soon, your computer will connect into the World Wide Continuum; your data will mingle freely with the data of billions around the world on an indistinguishable platform of desktop/web-hybrid applications, the social utopia of the Web will supplant thick-client, rugged-individualist desktop computing, the Singularity will occur, and we will all live in happy harmony with the universe.

The fact of the matter is, not nearly enough consistency and code reuse is happening on the web. To an extent, that's good. I'd like to see the web remain a wild place that functions as laboratory as much as controlled platform. But too often, problems are approached as though they've never been addressed before.

Sure, there are attempts at usability standards out there. But web usability is complicated, and in spite of the best attempts of several JavaScript libraries, nobody - not even Google - builds web interfaces as slick and consistent as OS-native applications.

When I'm designing an interface, I try to take into consideration three primary concerns:

Familiarity.
If I haven't seen it before, I don't know what it does, I don't want to use it, and I may not even recognize it as part of the UI. This becomes a huge obstacle for innovative interface development - more on that later.
Consistency.
If a widget looks more than 70% similar to something else, I will expect the two to behave the same. This has ramifications beyond your own website (duh).
Ease of use.
This is a big umbrella, encompassing everything from accessibility to ergonomics to "enjoyment": does my slider have a big enough click target? Can I elect to use the keyboard to control this thingie, or am I stuck with the mouse? Do I find myself repeating the same action for common tasks? Is my path through common tasks streamlined and foolproof?

Back in the dark ages, before DHTML graduated from the shadows of image rollovers, web interfaces were largely built out of browser-native form elements and links to more pages. Usability was a minimal concern, because layouts were simpler and interaction models were much less ambitious. Form interfaces were easy to manipulate, since they were largely rendered with OS-native widgets and behaved comparably to their desktop-app counterparts. Links were all blue and underlined, and they all took you to a new page. Now that we've graduated from a website-based to a web-application-based web, however, many users haven't followed along. Widgets that don't look like text-input boxes can be hard to spot; a recent usability study at Zillow found as much. Part of the problem may be that many users simply haven't been exposed to enough web applications to expect anything other than straightforward input controls to respond to input events. I'd like to think that's part of our responsibility as web developers: to challenge our users to explore, experiment, and discover. But it's also our responsibility to keep the guesswork out of our interfaces.

Web usability is a moving target. I don't have answers right now to many of these questions, but I'll be discussing them as they come up. This post was to survey the territory; I hope to be able to explore aspects of this issue in greater detail soon. I'll also be writing about the technical details of implementing a large-scale interface architecture that balances web standards with friendly, usable design. I believe in powerful, flexible user interfaces, but only inasmuch as they empower the user. Gratuitous lightboxes are not welcome!

Saturday, June 30, 2007

The Blogging Professional

It is a melancholy experience for a professional mathematician to find himself writing about mathematics. The function of a mathematician is to do something, to prove new theorems, to add to mathematics, and not to talk about what he or other mathematicians have done. Statesmen despise publicists, painters despise art-critics, and physiologists, physicists, or mathematicians have usually similar feelings: there is no scorn more profound, or on the whole more justifiable, than that of the men who make for the men who explain. Exposition, criticism, appreciation, is work for second-rate minds.

-- G. H. Hardy (PDF link)

Saturday, June 23, 2007

Crockford's revisions to ECMAScript

The ECMAScript standard is undergoing growing pains right now, and for the first time since 1999, it looks like major revisions to the language may take effect soon. Douglas Crockford has recommendations of his own. While I don't have any strong feelings about the new features he requests (they can mostly be implemented in the JavaScript we already have), his Corrections are mostly sane things that will remove a lot of the suckiness from the current standard:

  1. Reserved words
  2. Standardize implementation of trailing commas in object literals and arrays
  3. Make arguments object a true array
  4. Tail recursion optimization
  5. Deprecate primitive type wrappers, the with statement, semicolon insertion, arguments.callee, typeof, and eval.
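
Item 2, for instance, covers cases like these (my examples; the exact behavior varies by browser version):

var list = ['a', 'b', 'c',];   // length 3 in Firefox, but 4 in IE's JScript
var opts = { x: 1, y: 2, };    // a flat-out syntax error in some versions of IE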

It looks like Brendan Eich is well aware of Crockford's longstanding laments, particularly against the with statement and the reserved words restrictions, and has addressed them in the draft spec for ECMA-262 v4.

Crockford objects to the restriction on using reserved words as property names in object literals, considering it unnecessary. By his argument, usage such as:

var o = {
    class: 'style', 
    with: function (n) { return n+'a top hat'; }
};

will preserve the original use of these names, as the names are merely being used for scoped member properties; they cannot be confused in this context, syntactically, with the "official" uses of these keywords. (This is, apparently, why the JSON spec requires double-quoted property names.) However, it seems this restriction is in place mainly to make possible the use of the with keyword; consider:

with (o) {
   class = 'none';
   with(class);
}

If using these keywords were legal above, their use inside a with statement, as here, would invite syntax hell.

I for one would be happy to see the demise of the with keyword. It breaks with JavaScript's function-level scoping and creates more problems than it's meant to solve. It "feels" uncharacteristic of the language, since it promotes an object's properties from "property" status to "variable" status, blending them indiscriminately into the surrounding scope. JavaScript has enough trouble affording decent namespacing as it is; this "feature" smacks of bad practice.
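
The classic illustration of that blending (my example, not Crockford's):

var width = 640;                 // an outer variable
function resize(el, w) {
    with (el.style) {
        width = w + 'px';        // el.style.width? or the outer width?
    }                            // you can't tell without knowing exactly
}                                // which properties el.style defines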

More than adding new features, we could use a "trimming-down" phase for the language. While Mozilla toils at turning JavaScript into Python, and other power players at ECMA (presumably MS?) seek to remake the spec in the image of Java or C#, I'm mostly happy with the language the way it is - just please, for God's sake, fix the problems that are already there before bounding into a new round of ill-conceived feature bloat.

We front-end developers are lucky to be enjoying a golden age of JavaScript. But clouds of confusion and incompatibility are on the horizon. I don't see a compelling reason to add Java-style classical inheritance to the language, and I see even less chance of the vendors agreeing to ship timely implementations of this new spec.

Surely wiser heads than mine are laboring on this problem. But I guess I'm just not seeing that JavaScript falls short in ways that warrant changing it beyond recognition.

Tuesday, June 5, 2007

Code Modularization and CSS Queries

On his personal blog, Yahoo! front-end engineer Nicholas Zakas takes on the prevalence of the CSS query engine in client-side JavaScript libraries. The flexibility of a quick shorthand syntax for accessing HTML elements and describing structure, without resorting to the conventional DOM methods, has always appealed to me, so I wasn't sure what he was missing.

Granted, the recent proliferation of JS libraries in general (and query engines in particular) has been rather excessive, no doubt tracking a boom-and-bust path. I'm a fan of all of these efforts, though I wouldn't use most of them myself - they represent attempts to impose order on the chaotic soup of client-side code we've been dealing with since 1996. Each one - Prototype's dubious extensions to native objects, jQuery's wrapping of DOM elements and telltale method chaining, MochiKit's Python-like syntax - represents a new experiment in turning the best approximation we've ever had of a cross-OS, open-source application platform - the web browser - into the full-fledged application framework it's destined to be. Even if not all of these attempts have been useful, they've at least been instructive.

After reading his replies to comments to his post, and after an informal conversation with Joel Webber at the recent Google Developer Day, I realized that client-side techniques vary widely across organizations.

At Zillow, for example, we go to great lengths to separate our HTML, CSS, and JavaScript. Since our site is fairly complex and design-heavy, and code iterations happen with such frequency, this modularization is key to keeping sane. So we have a set of rules we follow:

  1. All pieces of client-side code have a well-defined means for interacting with each other. For CSS -> HTML, it's CSS selectors. For JS -> HTML, it's the DOM. For JS -> CSS, it's classnames. For HTML -> JS, it's the DOM events model.
  2. Do not mix HTML, CSS, and JavaScript. No inline styles, no inline event handlers, no creating elements via JavaScript.

You see where I'm going with this - HTML, CSS, and JavaScript are to exist independently of one another and to "flip switches" rather than assemble things by hand. For the same reason you wouldn't write code like

function makeBoldRed(el) {
    el.style.fontWeight = 'bold';
    el.style.color = 'red';
}

but rather

Element.addClass(el, CSS.ERROR_MESSAGE);

I prefer to use CSS-style selectors to bridge the gap, allowing JavaScript to refer to HTML using the same syntax its presentation counterpart does.
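
A sketch of the kind of wiring I mean, using the Element.select engine mentioned above (the selector here is made up for illustration):

// Address the same elements the stylesheet does, and "flip the switch"
// by toggling a classname rather than styling anything by hand.
var messages = Element.select('form.signup p.validation-message');
for (var i = 0; i < messages.length; i++) {
    Element.addClass(messages[i], CSS.ERROR_MESSAGE);
}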

However, it appears that some large companies are generating HTML through JavaScript. This can work too, but in that case your goals are going to be much different. Once you start working this way, you risk losing accessibility and graceful degradation. You also lose the ability to develop and test presentation-layer components independently of one another, which can lead to a poorer design or, at the very least, longer development and testing time. Another problem is debugging: if you permit your script to tinker with your HTML, or if your script depends on a fixed HTML structure to function properly, design changes can lead to interminable headaches when a complex script's dynamic change to the DOM goes undetected.

Sure, CSS queries can be performance bottlenecks. But usually they aren't, at least not when you're using a thoroughly-tested engine like those available in the major JS libraries. A minor performance hit - most query engines can perform the most complex queries in under 5 ms, depending on your machine - is a small price to pay for the flexibility of modular code.

On a side note, I was most intrigued by Joel Webber's description of the Google Web Toolkit's function: to add a compile-time step to optimize and inline JavaScript code. I can appreciate that. Certainly, a well-written compiler will be able to optimize code to a greater degree than a meticulous programmer. But why Java? Why apply this essentially alien style of programming to the client? This question remains outstanding for me.