By background noise I mean ambient noise: a computer or a fridge, for example.
Let's set up a thought experiment. You have one speaker and one headphone cup in a room with no background noise. You place the speaker in front of you and to your right. We'll ignore delay for now. You play music from it at 80dB, measured at the right ear. Let's assume the left ear hears the same signal at 70dB, though that might be a bit quiet. Now you place the headphone cup on your right ear and play it at 80dB. (At this stage, do you still have the speaker on? If so, this test is not valid.) How loud do you expect the left ear to hear it? Let's say 50dB, but I think that's optimistic. So we can presume that the effect is much weaker on the headphone, and imaging is weaker as well. (Even with the speaker turned off, the "imaging" is not "weaker"; it is skewed too far to the right of the engineer's intent. If the speaker is still on, there are too many concurrent variables and the test is not valid.)
Now increase the background noise (the ambient volume, whatever you want to call it) to 50dB. The speaker's crossfeed is still 20dB above the noise. The headphone's crossfeed is equal to the noise. It's not just weaker, it's potentially gone. Turn the volume down 10dB, and the speaker's crossfeed is still 10dB above the noise, while the headphone's is 10dB below it. If imaging relied on this, low listening volume with high ambient volume would have little or no imaging, and imaging would increase dramatically as the difference increased. (This doesn't need a thought experiment. If you want to introduce significant background noise, take any open headphone out into busy traffic, regardless of isolation - I doubt soundstage or even imaging can be reproduced without compromise. Also, absolute fidelity is significantly compromised when playback volume strays too far from the original performance volume - volumes should be matched with a decibel meter to keep the reference accurate.)
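The arithmetic in the thought experiment can be written out as a short Python sketch. The 10dB and 30dB drops between ears are the guesses made above, not measurements, and `margin_over_noise` is just a name made up for this illustration.

```python
def margin_over_noise(source_db, interaural_drop_db, noise_db):
    """Level of the signal at the far ear minus the ambient noise level, in dB."""
    return (source_db - interaural_drop_db) - noise_db

# Figures from the thought experiment; the drops are guesses, not measurements.
SPEAKER_DROP_DB = 10    # 80dB at the right ear, 70dB at the left
HEADPHONE_DROP_DB = 30  # 80dB at the right ear, 50dB at the left
NOISE_DB = 50

for volume_db in (80, 70):
    speaker = margin_over_noise(volume_db, SPEAKER_DROP_DB, NOISE_DB)
    headphone = margin_over_noise(volume_db, HEADPHONE_DROP_DB, NOISE_DB)
    print(f"{volume_db}dB: speaker {speaker:+d}dB, headphone {headphone:+d}dB")
# 80dB: speaker +20dB, headphone +0dB
# 70dB: speaker +10dB, headphone -10dB
```

A margin of zero or less means the signal reaching the far ear is at or below the ambient noise, which is the case where the effect is "potentially gone".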
What you're implying is pseudo-crossfeed. One ear hears a delayed and quieter version of the other channel. That's crossfeed. This is what creates imaging. You implied that headphones with low isolation image better. I presumed that was because the crossfeed effect would be stronger. What did you mean otherwise?
Crossfeed is when a delayed crosstalk is introduced into the signal in the way speakers do it; this requires a crossfeed circuit or crossfeed algorithm. I do not know whether, in the above scenario, a difference an order of magnitude lower can still be described as crossfeed. It is definitely a delayed crosstalk and, in my opinion, at a level that will be perceptible to a perfectly healthy individual.
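To make "delayed crosstalk" concrete, here is a rough Python sketch of the bare idea: each output channel gets a delayed, quieter copy of the opposite channel mixed in. It is not any particular product's circuit or algorithm. The function name and the default delay and attenuation are assumptions picked for illustration (13 samples is about 0.3 ms at 44.1 kHz), and practical crossfeed designs usually also filter the crossfed signal, which this sketch leaves out.

```python
def crossfeed(left, right, delay_samples=13, attenuation_db=10.0):
    """Mix a delayed, attenuated copy of each channel into the other.

    left and right are equal-length lists of float samples.
    The defaults are illustrative assumptions, not recommended values.
    """
    gain = 10 ** (-attenuation_db / 20)  # dB to linear amplitude

    def delayed(channel, i):
        # Sample from delay_samples earlier, or silence before the start.
        j = i - delay_samples
        return channel[j] if j >= 0 else 0.0

    out_left = [left[i] + gain * delayed(right, i) for i in range(len(left))]
    out_right = [right[i] + gain * delayed(left, i) for i in range(len(right))]
    return out_left, out_right


# A single click in the right channel only.
left = [0.0] * 20
right = [1.0] + [0.0] * 19
out_left, out_right = crossfeed(left, right, delay_samples=13, attenuation_db=20.0)
print(out_left[13])  # the click shows up in the left channel 13 samples later, 20dB quieter
```

With no crossfeed at all, the left channel in this example would stay silent, which is the fully isolated headphone case.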
Your hypothesis is just a hypothesis, just like mine. It has to stand up to testing, reason, and logic. You have to be able to defend it.
For me to defend it, you need to point out a flaw in my reasoning and logic, which you have not done.