There isn't any operation you could substitute that would be "faster" than multiplication. You could subtract one from the exponent field using an integer operation, but that doesn't handle zero, infinity, NaN, or subnormal results correctly, so you'd have to fix those cases up. As soon as you need more than one instruction, a floating-point multiply is significantly faster.
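To make the special-case problem concrete, here is a minimal sketch that assumes the multiplication in question is by 0.5 and that `float` is IEEE 754 binary32. The helper names `halve_by_exponent`, `halve_by_multiply`, and `trick_disagrees` are made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <float.h>    /* FLT_MIN, handy for trying the subnormal case */
#include <math.h>     /* isnan, INFINITY, NAN */
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 binary32 value. */
static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Hypothetical helper: "halve" x by decrementing its biased exponent field.
   No fix-ups are applied, which is exactly the problem. */
static float halve_by_exponent(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined type pun */
    bits -= UINT32_C(1) << 23;        /* exponent field starts at bit 23 */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* The plain multiply, which is correct for every input. */
static float halve_by_multiply(float x) {
    return x * 0.5f;
}

/* Returns 1 when the bit trick gives a different answer than the multiply. */
static int trick_disagrees(float x) {
    float a = halve_by_exponent(x);
    float b = halve_by_multiply(x);
    if (isnan(a) && isnan(b)) return 0;   /* treat NaN == NaN as agreement */
    return a != b;
}
```

For an ordinary input such as `8.0f` the two agree. For `0.0f` the borrow runs through the exponent into the sign bit and yields negative infinity, `INFINITY` and `NAN` both turn into large finite numbers, and `FLT_MIN` collapses to zero instead of producing a subnormal.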

Floating-point addition, subtraction, multiplication, and fused multiply-add are all among the fastest operations on modern cores. In particular, they are *fully pipelined*, meaning that one or more of them can begin every single cycle, and they have a latency of just 3 to 5 cycles on typical hardware designs (the world seems to be settling on a uniform 4 cycles for them, though some 3-cycle adders are around).
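As a rough illustration of what "fully pipelined" buys you, compare a sum with a single dependent chain of adds against one with several independent chains. This is only a sketch: the function names are invented, four accumulators is an arbitrary choice, and the actual speedup depends on the core and on what the compiler does with the loop.

```c
#include <stddef.h>

/* One accumulator: every add needs the previous result, so each
   iteration has to wait out the full add latency. */
static float sum_one_chain(const float *v, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += v[i];
    return s;
}

/* Four independent accumulators: the adds in one iteration do not depend
   on each other, so a pipelined adder can overlap them. */
static float sum_four_chains(const float *v, size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; ++i)   /* leftover elements */
        s0 += v[i];
    return (s0 + s1) + (s2 + s3);
}
```

The two versions add in a different order, so their results can differ in the last bits for general inputs. That is also why a compiler will not make this transformation for you unless you allow reassociation (for example with `-ffast-math` on GCC or Clang).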

Floating-point division and square root are somewhat slower, but not a lot slower; on recent Intel and Apple cores, one of these operations can begin every 2 or 3 cycles, and the total latency is in the neighborhood of ten cycles. So they are more expensive than the other operations, but cheap enough that avoiding them isn't usually worth the effort if it means using more instructions.