Here's an old part of the Guile manual that explains the technique. For example you could take values whose lowest bit is 1 to be a pointer to the heap. Any value that ends in 0 could be an integer, represented directly in the bits of the pointer. (If your pointer tag is not 0, you will have to mask off the low bits before dereferencing a pointer..)
The size of a tagged pointer is the size of a pointer, so 32 bits on a 32-bit machine, and 64 on a 64-bit machine. V8 doesn't actually use 63-bit smi values on 64-bit machines, AFAIK.
Sayre does a good job of explaining things, but to recap, there are about 253 bit patterns that represent not-a-number values for IEEE-754 double-precision floats, but that only one of those is actually emitted by current hardware and software. So cheeky implementations are free to use the other patterns for their own purposes.
What I always wondered when I heard about this strategy was how 53 bits were possibly enough, for a 64-bit machine. How are you going to fit a pointer in that space?
Well it turns out that on x64-64 systems, you only have a 48-bit address space, currently anyway. The shape of that space is quite peculiar, too. I don't know whether Linux gives out addresses in the upper half, but Windows does not, though Solaris does.
Anyway, the other way of looking at NaN-boxing is that there are 264 - 248 values in a pointer that aren't being used, and we might as well stuff something in them, and hey, the whole space of valid double precision numbers fits!
It's convenient too that numbers in JS are specified as doubles. (Even still, all implementations define separate "small integer" types, for use in loops and indexing and such; integer operations are faster when you can use them, too.)
So, when you choose to do nan-boxing, you basically choose to do one of two things: you favor pointers, or you favor doubles. To favor pointers means that you recognize pointers as having initial (most-significant) 0 bits; if the initial bits are not 0, then it's a double, and you have to add or subtract a bit pattern to get to the double value.
Favoring doubles means that pointers are left as NaN values, so their initial bits are all ones (or the sign bit can be unset, it doesn't matter), and to unpack a pointer, you have to rotate the double space around.
Amusingly, there is a third option as well. For 32-bit machines, you can address the second word in the double directly, so there is no need to mask off anything. This is the JSVALUE32_64 case mentioned in the JSValue code I linked to above.
JSC chose to favor pointers, and as the first JS implementation to nan-box, got to call their choice "nan-boxing". Mozilla chose to favor doubles, and so made up the silly name "nun-boxing".
get thee to a nun boxery?
So you're implementing a language. Should you nan-box or not? I can't say in your case but I can give a couple of examples.
NaN-boxing has the obvious advantage of not allocating doubles on the heap. This reduces cache pressure, GC pressure, and such. That's why Moz and JSC chose it.
V8 on the other hand has not chosen it, at least not yet, anyway. I think that one reason is because especially on embedded devices, and to an extent on ia32, passing around 64-bit values is a big lose. It's so bad that I understand that Mozilla actually passes these values by reference instead of by value in some places, on 32-bit systems only.
But the real reason that V8 doesn't nan-box I think is that they have a compiler that is able to unbox values in registers and in temporary stack locations, both as int32 values and as double values. This works for loops and such things in a function, and is obviously quite efficient, as you don't need to box and unbox at all. Passing doubles as arguments or return values however does allocate them on the heap, as far as I can tell anyway.
Also there is the hackery that with NaN-boxing, you assume things about your OS's memory management. A language implementation should be able to get around this with use of mmap at specific addresses, but still, it's tricky.
I looked at doing nan-boxing for Guile, and eventually decided against it. Guile has a few more constraints that V8 does not have. Firstly it is very portable, so it can't rely the address range constraints that x86-64 has, not without some hackery. It's also portable to the low end, like V8, so 64-bit values are a lose there.
But the killer is the conservative GC on 32-bit systems. If you represent integers with the tag in the high bits, as you would with nan-boxing, then the lower 32 bits are not distinguishable from a pointer, so they can cause the GC to retain heap objects for longer than it should. With tagged pointers, the low-bit masking lets the GC know that an integer is not a pointer. If you were to reintroduce low-bit tagging to a nan-boxed system, you've lost much of the advantage of it. Furthermore if you want to support more than 32 bits of precision in a fixnum, then on a 32-bit system you need to play games with sign extension of a 48-bit value to a 64-bit value, and for all word sizes it looks like the overhead will be significantly higher than tagging.
Finally I think that if we're really going to shoot for the moon, we should implement something like V8's Crankshaft compiler that does decent type inference, to allow unboxed values in registers and on the stack. (For readers of source code, "crankshaft" is basically the "hydrogen" flow-graph code plus a "lithium" assembler. Confusing, right?)
Well that's all for now. If you corrections, or an idea for a further topic, let me know in the comments. Thanks!