The hardware setup for the console itself likely has nothing to do with Kinect (at least, not directly). It's more likely that MS took the safe route to guarantee 8GB of memory early in the console's development, which led them to the comparative GPU deficiency (need 8GB early -> use DDR3 -> need fast cache to alleviate bandwidth constraints -> on-die ESRAM -> less space for a bigger GPU -> ultimately weaker system compared to PS4).
Pretty sure that without the Kinect camera they would have had the funds to drop in a 79xx/67x-class GPU pretty easily. Remember that GTX 670s with lower clock speeds go into gaming laptops. That's right, the mobile version is essentially the same chip, just on a smaller board and downclocked.
This wasn't a space issue. This was MS choosing to spend money on the Kinect camera and bundling it in, thus the $500 price point. If anything, they could've just used a 3 GB GDDR5 setup for the GPU and 4-8 GB of DDR3 for the system. Yes, it isn't as flexible as pooled memory, but it's a helluva lot faster than being held back by ESRAM.
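To put rough numbers on that, here's a quick back-of-the-envelope sketch of theoretical peak bandwidth (bus width times transfer rate). The DDR3 and PS4 figures are the commonly quoted specs; the 384-bit, 5.5 GT/s setup for the 3 GB GDDR5 pool is my own assumption, modeled on a 79xx-class desktop card, not anything MS ever announced. The names in the code are just for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfers_mt_s: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    bytes per transfer (bus width / 8) times millions of transfers per
    second, divided by 1000 to go from MB/s to GB/s.
    """
    return bus_width_bits / 8 * transfers_mt_s / 1000


# (bus width in bits, transfer rate in MT/s)
configs = {
    # Commonly quoted Xbox One main memory: DDR3-2133 on a 256-bit bus
    "Xbox One DDR3": (256, 2133),
    # Commonly quoted PS4 memory: GDDR5 at 5.5 GT/s on a 256-bit bus
    "PS4 GDDR5": (256, 5500),
    # ASSUMPTION: hypothetical dedicated 3 GB GDDR5 pool, 384-bit at 5.5 GT/s
    "Hypothetical 3 GB GDDR5": (384, 5500),
}

for name, (bus_bits, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gb_s(bus_bits, rate):.1f} GB/s")
```

These are peak numbers only. Real-world throughput is lower, and the ESRAM adds bandwidth for whatever fits in its 32 MB, but it shows how big the gap to main memory is.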
Hindsight is always 20/20, but I believe that from the start MS could have had the most powerful console and still been competitive at the higher price point. By choosing to bundle in Kinect 2.0, I think they've shot themselves in the foot. After how forward-looking and powerful the original Xbox and the 360 were, I am disappointed, to say the least.