Communicating
I don't write a ton about programming. I think I've mentioned this before. The scope of it is so vast, and the requirements of any project can vary so wildly in cost and features, and what needs to be prioritized, it makes it difficult to pin anything down as a real best practice. I could write about my personal implementations of things, and how I solved this or that problem, but APIs and tools change so quickly, most blog posts with fine details like that are irrelevant within months, and their existence in an ecosystem that's moved on just makes it harder to find relevant information.
I thought I'd write a little bit about the nature of communication and code.
I think someone on the Data Stories podcast noted that people who are perceived as smarter don't write with very long words. When I was in high school, studying for the SATs and whatnot, my average word length in my speech and writing was probably much higher. When you're surrounded by educated people, who are also studying for the SATs, that makes a certain amount of sense, but in almost any other context, you're losing clarity. If there's a chance that the words you're using to communicate won't be understood by your target audience, you should choose different words. This happens so often with some words that they're effectively out of my vocabulary. The point of communication is to clearly convey an idea.
As I've gotten older, my writing style has simplified, and become perhaps more generic. I feel less compelled to dazzle with obscure words and turns of phrase, and more compelled to just try to transmit what I think is wisdom in the clearest way that I can.
The same has happened to code that I write. I used to think that people who wrote code that used obscure operators and clever bitwise operations were real experts to be admired and emulated. Now, I think they're just assholes.
Here's some CoffeeScript I found in something I forked recently:
randint = (min, max) ->
  min + (Math.random() * (max - min + 1) | 0)
You can't see it, but there are other functions in the same file with names like mixPerm and randOri, so this is breaking an internal convention of lowerCamelCase for function names, but let's assume that this is supposed to generate a random int(eger). We assume, because there are no comments. Is it inclusive or exclusive of min and max? Still no comments, and we can't really assume here, because number ranges are generally pretty sensitive in programming, so we have to parse the code.
Okay, it's taking min and adding something to it, and CoffeeScript implicitly returns it.
Math.random() returns a random number between 0 (inclusive) and 1 (exclusive). That's basic JavaScript API knowledge.
(max - min + 1) is the range between min and max, plus 1, for some reason.
| is the bitwise-OR operator, which compares its two operands' corresponding pairs of bits. 0b0100 | 0b1010 is 0b1110, for example.
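To make that concrete, here's the same example as a few lines of plain JavaScript. The toBits4 helper is something I made up for display purposes; it isn't part of the code being discussed.

```javascript
// Hypothetical display helper: show a small number as a 4-bit binary string.
function toBits4(n) {
  return n.toString(2).padStart(4, "0");
}

const a = 0b0100;
const b = 0b1010;

// A bit in the result is 1 if it is 1 in either operand.
const combined = a | b;

console.log(toBits4(a));        // 0100
console.log(toBits4(b));        // 1010
console.log(toBits4(combined)); // 1110
```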
What is the bitwise-OR operator doing here? Does it have operator precedence over multiplication? I don't know offhand. Do you? Either way, it could be rewritten with more parens to make it explicitly understood, even by people who can't find or don't know to look for a precedence reference:
randint = (min, max) ->
  min + ((Math.random() * (max - min + 1)) | 0)
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence
Looks like | doesn't have precedence over *, so we can take Math.random() * (max - min + 1) as one operand, which is a random number between 0 (inclusive) and one more than the difference between min and max (exclusive). Then what? We do a bitwise-OR against 0.
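If you'd rather check the precedence than look it up, a tiny experiment works too. This is just a sketch with numbers I picked so that the two possible groupings give different answers.

```javascript
// If | bound tighter than *, the first line would mean (1 | 2) * 4 = 12.
// Because * binds tighter, it means 1 | (2 * 4) = 1 | 8 = 9.
const implicit = 1 | 2 * 4;
const multiplyFirst = 1 | (2 * 4);
const orFirst = (1 | 2) * 4;

console.log(implicit);      // 9
console.log(multiplyFirst); // 9
console.log(orFirst);       // 12
```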
What's the bit representation of 0? What is this code actually doing?
Turns out that almost all of the JS bitwise operators only work on 32-bit signed integers, in two's complement format, so the operands are all implicitly converted.
0 becomes 0b00000000000000000000000000000000
Math.random() * (max - min + 1) becomes whatever that result is after you convert to an integer and chop off any significant bits it had past 32.
Doing a bitwise-OR against all bits being 0 is an identity operation, so this piece of code is just about getting that implicit conversion, basically trying to round any floating point result down. There's a Math library function that does that (Math.floor()), but it does it without losing all of the significant bits past 32. -2,147,483,648 and 2,147,483,647 are the minimum and maximum values that can be expressed as a 32-bit signed integer. That's a ceiling on the order of 10^9, versus roughly 10^308 for the JS Number type.
3000000000.12345 | 0 evaluates to -1294967296, while Math.floor(3000000000.12345) gives you the expected 3000000000.
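Here's that comparison as something you can paste into a console. The variable names are mine. I also added a negative input, which the original code never produces as long as max is at least min, but it shows one more way the two approaches differ: | 0 drops the fraction (truncating toward zero), while Math.floor() always rounds down.

```javascript
// Past the 32-bit signed range, | 0 wraps around; Math.floor() does not.
const big = 3000000000.12345;
const bigViaOr = big | 0;
const bigViaFloor = Math.floor(big);
console.log(bigViaOr);    // -1294967296
console.log(bigViaFloor); // 3000000000

// Inside the range, for non-negative input, the two agree.
const small = 41.9;
const smallViaOr = small | 0;
const smallViaFloor = Math.floor(small);
console.log(smallViaOr);    // 41
console.log(smallViaFloor); // 41

// For negative input, they disagree even inside the range.
const negative = -1.5;
const negativeViaOr = negative | 0;
const negativeViaFloor = Math.floor(negative);
console.log(negativeViaOr);    // -1
console.log(negativeViaFloor); // -2
```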
So, to review, to understand what this one line of code does, you have to know the operator precedence of the rarely used bitwise-OR operator, and then the implementation detail that it casts to 32-bit signed integers in JavaScript, and the other implementation detail that 0 is expressed with all 0 bits because of two's complement encoding. And to avoid errors, you have to know the limits of a 32-bit signed integer, and never call this function with a larger range than a little over 2 billion.
And for what? So you don't have to type Math.floor()? Maybe so you can feel clever?
Don't be an asshole.
As a bonus, the cast to an integer occurs in the wrong place if you're not assuming that min and max are integers. randint(0.1, 5) will give you results like 3.1, because min isn't in the cast to int.
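To see this without relying on luck, here's a JavaScript port of the original one-liner with one change of my own: an optional random parameter (hypothetical, not in the original) so a fixed value can stand in for Math.random().

```javascript
// JavaScript port of the CoffeeScript one-liner. The `random` parameter is a
// hypothetical addition so that the result is repeatable.
function randint(min, max, random = Math.random) {
  return min + (random() * (max - min + 1) | 0);
}

const half = () => 0.5;
const nearlyOne = () => 0.99;

// With integer bounds, the result is an integer inside [min, max].
const integerCase = randint(0, 5, half);
console.log(integerCase);

// With a fractional min, only the random part is cast to an integer,
// so min's fraction leaks into the result.
const fractionalCase = randint(0.1, 5, half);
console.log(Number.isInteger(fractionalCase)); // false

// The result can even land above max.
const overshootCase = randint(0.1, 5, nearlyOne);
console.log(overshootCase > 5); // true
```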
Also, maybe somewhere down the line, someone decides that their internal implementation of JS needs different number encoding, so that 0 isn't all 0 bits, and then the code just does something totally different. If you'd just stuck to the Math library call, which will be necessarily updated to work with whatever the new encoding is, you'd be fine. While this is unlikely in this particular case, this is always a risk when you rely on implicit behavior, rather than being more explicit in the intent of your code. And either way, relying on implicit behavior without justifying it in comments, so someone reading can at least understand your intent, is just rude.
When you write code, you're serving two masters. You need to tell the computer what to do, but you also need to communicate with the people who have to maintain it (it could be you). Of these, you should always optimize for the latter.
A good compiler/interpreter takes your verbose, explicit, easy-to-understand code, and turns it into something performant for the computer via inlining, folding, branch optimizations, or whatever. This stuff will just always be improving the longer a language is around, totally passively from your perspective. Your very old code was just suddenly performing better when they rolled out V8 for JavaScript and people started using newer browsers.
On the other hand, people's understanding of your code will always be deteriorating the longer they go after parsing it, and it starts from nothing for new readers. To easily maintain code, you have to be able to go from nothing to full understanding in the least amount of time possible, with the lowest mental load possible, and to do that, you need explicit behavior, good names, comments explaining intent, and baby steps. Just because you can write a function in one line doesn't mean you should.
Here's another implementation of randint:
randInt = (minIntInclusive, maxIntInclusive) ->
  range = maxIntInclusive - minIntInclusive
  randIntInRangeInclusive = Math.floor(Math.random() * (range + 1))
  return minIntInclusive + randIntInRangeInclusive
It looks gross. The variable names are really long, but see if you can misunderstand it.
It still breaks if you pass a non-integer for the first argument, but when it's named minIntInclusive, it's clear that it was expecting an int. You could make it throw an error if you want, and need this to be more robust, but that should always be to supplement obvious expectations, because people shouldn't have to read the source of your function, if they can understand how to use it from its signature and, failing that, its docs.
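If you did want the more robust version, a minimal sketch in plain JavaScript might look like the following. The error types and messages are my own choices, and the check that min doesn't exceed max is an extra assumption about what "robust" should mean here.

```javascript
function randInt(minIntInclusive, maxIntInclusive) {
  // Supplement the names with a hard check: both bounds must be integers.
  if (!Number.isInteger(minIntInclusive) || !Number.isInteger(maxIntInclusive)) {
    throw new TypeError("randInt expects integer bounds");
  }
  // Assumption: a reversed range is treated as a caller error.
  if (minIntInclusive > maxIntInclusive) {
    throw new RangeError("randInt expects minIntInclusive <= maxIntInclusive");
  }

  const range = maxIntInclusive - minIntInclusive;
  const randIntInRangeInclusive = Math.floor(Math.random() * (range + 1));
  return minIntInclusive + randIntInRangeInclusive;
}

console.log(randInt(1, 6)); // some integer from 1 to 6, like a die roll

try {
  randInt(0.1, 5);
} catch (error) {
  console.log(error.message); // randInt expects integer bounds
}
```

The checks don't replace the names. They only back up what the signature already tells you.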