A cache hierarchy is a memory-system design in computer architecture that organizes multiple levels of cache, each with a different size and speed, between the CPU and main memory. The smaller, faster caches sit closest to the processor and hold frequently accessed data, which reduces the average time needed to fetch data compared with going to main memory on every access.
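The benefit of keeping hot data in the faster levels can be quantified with the standard average memory access time (AMAT) calculation. Below is a minimal sketch; all latencies and miss rates are illustrative assumptions, not figures from any particular CPU:

```python
def amat(hit_time, miss_rate, miss_penalty):
    # Average Memory Access Time: the cost of a hit at this level,
    # plus the fraction of misses times the cost of the next level.
    return hit_time + miss_rate * miss_penalty

# Assumed latencies (CPU cycles) and miss rates for a 3-level hierarchy.
L1_HIT, L1_MISS = 4, 0.10
L2_HIT, L2_MISS = 12, 0.05
L3_HIT, L3_MISS = 40, 0.01
DRAM = 200  # main-memory access latency

# Work inward-out: an L3 miss falls through to DRAM, an L2 miss to L3, etc.
l3 = amat(L3_HIT, L3_MISS, DRAM)   # 40 + 0.01 * 200  = 42.0
l2 = amat(L2_HIT, L2_MISS, l3)     # 12 + 0.05 * 42.0 = 14.1
l1 = amat(L1_HIT, L1_MISS, l2)     # 4  + 0.10 * 14.1 = 5.41

print(f"Effective access time: {l1:.2f} cycles vs {DRAM} cycles for DRAM alone")
```

Even with modest hit rates, the effective access time collapses from hundreds of cycles to a handful, which is exactly the payoff the hierarchy is designed to deliver.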