I spent much of today fighting the VHDL compiler, which was adamant that, even though I'd tested my design in the simulator, it wasn't going to compile a process using both rising and falling edges of the clock. I split this up into 2 processes and the problem went away. The reason I need both edges is that with SPI (at least in the mode I'm using), the data is written on the rising edge and received on the falling edge...
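For anyone hitting the same complaint, here is a minimal sketch of what the two-process split might look like. It is not my actual design: the entity name, ports and signal names (spi_shift_sketch, load, tx_reg, rx_reg and so on) are all made up for illustration, and it assumes MSB-first shifting. The point is only that each register is driven from exactly one process, and each process uses exactly one clock edge.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: names and ports are illustrative only.
entity spi_shift_sketch is
  port (
    clk        : in  std_logic;                     -- SPI bit clock
    load       : in  std_logic;                     -- high for one clock to load a byte to send
    write_data : in  std_logic_vector(7 downto 0);  -- byte coming from the CPU
    read_data  : out std_logic_vector(7 downto 0);  -- bits received so far
    mosi       : out std_logic;
    miso       : in  std_logic
  );
end entity spi_shift_sketch;

architecture rtl of spi_shift_sketch is
  signal tx_reg : std_logic_vector(7 downto 0) := (others => '0');
  signal rx_reg : std_logic_vector(7 downto 0) := (others => '0');
begin
  -- Process 1: the write side, clocked on the rising edge only.
  write_proc : process (clk)
  begin
    if rising_edge(clk) then
      if load = '1' then
        tx_reg <= write_data;
      else
        tx_reg <= tx_reg(6 downto 0) & '0';  -- shift out, MSB first (assumption)
      end if;
    end if;
  end process write_proc;

  -- Process 2: the read side, clocked on the falling edge only.
  read_proc : process (clk)
  begin
    if falling_edge(clk) then
      rx_reg <= rx_reg(6 downto 0) & miso;   -- shift in, MSB first (assumption)
    end if;
  end process read_proc;

  -- Neither register is assigned from both processes.
  mosi      <= tx_reg(7);
  read_data <= rx_reg;
end architecture rtl;
```

A single process that tests both rising_edge(clk) and falling_edge(clk) simulates fine, but most synthesis tools have no flip-flop to map it onto, which is presumably why the simulator was happy and the compiler was not.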
Things then returned bogus data (but always the same bogus data). While debugging this, I realised that I was changing the sense of din and dout about 5 times between the CPU and the SPI output port, so I decided to just rename everything to read and write. There obviously was a logic mistake somewhere in there, as now everything almost worked... :)
Well, not quite, but I could work around the problem:
As you can see from the line marked KLUDGE, I have to do a bogus read, as the IN returns the previous byte transferred, not the next one. That was a bit annoying, as I'd done my calculations and should be able to fit exactly 8 cycles into 2 T-states, but again due to a fix I had to make to the T80 core, the OUT is actually only active for 1 T-state before the result is sampled... :(
Anyway, now that this works, I can start uploading images to the flash memory... :)