A lot of people said the Xbox was designed both to do the operating system switching fast and to go down in price quickly. The ESRAM, for example, is likely to go down in price much quicker than GDDR5.
The ESRAM is on the APU - that's why they went with slower, bigger ESRAM rather than a daughter die like on 360 (with a daughter die they could have had twice as much, and faster). They wanted a solution that cost-reduced more neatly than 360. So it won't drop in cost any faster than Sony's APU - you'll get a little bit of cost savings as yields improve, and then a big saving when the process node shrinks (but that will be a while away).
The on-chip ESRAM is the most confusing part for me. It reduces space for the GPU (so reduces the absolute power of the machine), takes a huge amount of die space (because it needs to be ESRAM rather than EDRAM), and still isn't big enough to hold 1080p/60 buffers easily.
A daughter die could have had 64MB EDRAM, giving enough space for 1080p/60, and freed up the APU die for more GPU power. That setup could still have had the 8GB DDR3 and potentially been more powerful than PS4, with the downside of being more complex to cost-reduce later in the generation.
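For anyone who wants to check the buffer-size point, here's a rough back-of-envelope sketch. The numbers in it are my assumptions, not anything official: 4 bytes per pixel per render target, a deferred-style setup with four colour targets plus one depth target, no MSAA, and the Xbox One's 32MB of ESRAM compared against the hypothetical 64MB pool. Note the "60" doesn't change the buffer size, only how often you have to fill it.

```python
# Back-of-envelope render target sizes at 1080p.
# Assumptions (illustrative, not from any official spec sheet):
#   - 4 bytes per pixel per render target
#   - a deferred-style setup: 4 colour targets + 1 depth target
#   - no MSAA
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 4
MIB = 1024 * 1024


def target_mib(width=WIDTH, height=HEIGHT, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of one full-resolution render target in MiB."""
    return width * height * bytes_per_pixel / MIB


def total_mib(num_targets):
    """Combined size of several full-resolution targets in MiB."""
    return num_targets * target_mib()


def fits(num_targets, pool_mib):
    """True if the targets fit entirely inside a fast memory pool."""
    return total_mib(num_targets) <= pool_mib


if __name__ == "__main__":
    print(f"one 1080p target: {target_mib():.1f} MiB")
    print(f"five targets:     {total_mib(5):.1f} MiB")
    print(f"fits in 32 MiB:   {fits(5, 32)}")
    print(f"fits in 64 MiB:   {fits(5, 64)}")
```

Under those assumptions one target is just under 8 MiB, so five of them overflow 32 MiB but sit comfortably in 64 MiB. A leaner setup (fewer targets, or some of them kept in DDR3) would fit, which is why I say "easily" rather than "at all".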
Of course, they now have quite a bit of short-term pain for those decisions.