1) What does 'expensive' mean in this context?
Expensive has the usual meaning ($). For an integrated circuit, price depends on circuit size, which is directly related to the number and size of its transistors. It happens that an "expensive" memory requires more area on an integrated circuit.
In practice, several technologies are used to implement memory.
Registers are used inside the processor. They are built from logic devices called latches, and their main quality is speed: they must allow two reads and one write per cycle. To that end, their transistors are sized for strong drive. It depends on the actual design, but typically a bit of memory requires ~10 transistors in a register.
Static memory (SRAM) is designed as a matrix of simplified flip-flops, with 2 inverters per cell, and requires only 6 transistors per stored bit. Moreover, because static memory is meant for bulk storage, its transistors are made smaller than those in registers to increase the number of bits per unit area. SRAM is used in cache memory.
Dynamic memory (DRAM) uses only a single transistor per bit, acting as a capacitance that stores the value. The capacitance is either charged or discharged to represent a 1 or a 0. Though extremely economical, this technique cannot be very fast, especially when a large number of cells is involved, as in present DRAM chips. To improve capacity (number of bits in a given area), transistors are made as small as possible, and complex analog circuitry is used to detect small voltage variations and speed up reading the cell content. Moreover, a read destroys the cell content and must be followed by a write. Finally, the capacitance leaks, so data must be periodically rewritten to ensure data integrity. Taken together, this makes DRAM a slow device, with an access time of 100-200 processor cycles, but it provides extremely cheap physical memory.
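To make the per-bit figures above concrete, here is a rough sketch that counts storage-cell transistors for a given capacity. The table and the function name `transistors_needed` are only illustrative, the register figure is the approximate ~10 quoted above, and the count ignores decoders, sense amplifiers and other peripheral circuitry.

```python
# Transistors per stored bit, using the figures quoted in this answer.
TRANSISTORS_PER_BIT = {
    "register": 10,  # approximate, depends on the actual design
    "sram": 6,
    "dram": 1,
}


def transistors_needed(kind: str, n_bytes: int) -> int:
    """Count storage-cell transistors only (peripheral circuitry is ignored)."""
    return TRANSISTORS_PER_BIT[kind] * n_bytes * 8


if __name__ == "__main__":
    # Compare the three technologies for 1 KiB of storage.
    for kind in TRANSISTORS_PER_BIT:
        print(kind, transistors_needed(kind, 1024))
```

The transistor count understates the real gap in area: as noted above, transistors are also made smaller in SRAM than in registers, and smaller still in DRAM.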
2) Why are faster areas of memory like registers more expensive?
Processors rely on a memory hierarchy, and different levels of the hierarchy have specific constraints. To make a cheap memory, you need small transistors to reduce the area required to store a bit. But for electrical reasons, a small transistor is a poor generator that cannot provide enough current to drive its output rapidly. So, while the underlying technology is similar, design choices differ in different parts of the memory hierarchy.