Tuesday, 7 April 2015

Breaking Baud video

Breaking Baud... just realised that it's almost a year after the fact and I've still not uploaded a proper capture of the final version of Breaking Baud. So, the best quality capture is still the official livestream... :)

Saturday, 4 April 2015

Seminar at Revision 2015

Eeeks, so I realised that I've been too busy to post a single blog entry since Revision last year. Ooops! Anyway, today I presented a seminar at this year's Revision called "Why you should make an emulator". I'll provide a YouTube link when available, but for now you can check out the slide deck and just imagine that it's the best presentation you've ever heard! :)

Tuesday, 22 April 2014

Breaking Baud - 2nd place in the Revision Oldskool demo competition

So, one of the reasons I've not been making much progress recently on the FPGA is that I've spent the last couple of months working on a new tape turbo load system for the Amstrad CPC.

The result is here (warning, don't turn up too loud until you get a feel for the volume of the loading sounds!):

I'm proud to say it came 2nd in the Oldskool competition, stacked up against some very worthy competition. Many thanks to JulijanaM and rexbeng for their beautiful artwork and McKlain for his amazing tunes!

Saturday, 11 January 2014

Soft horizontal scrolling on the CPC FPGA

So, the previous post was a bit light on explanation. That's because it wasn't quite working properly. I've now had a chance to fix it up and it's a lot better, so I've just updated the picture. So, what does it represent?

Perhaps the best explanation is by way of a quick demo:

Basically, it makes the CPC FPGA the easiest CPC hardware to use on an LCD TV. Normally, smooth horizontal scrolling looks awful on an LCD because an LCD reacts to the signal differently from a CRT. Essentially this is because the PLL in an LCD is tuned to the higher frequencies that monitors are usually used at, and so it reacts to a moving horizontal sync pulse more rapidly.

Moving the sync pulse is a pretty bad thing to do as it's technically a non-conformant video signal - it's just that it works fine with the CPC monitor... and now with the CPC FPGA it works fine on an LCD TV too! :)

Saturday, 4 January 2014

Tuesday, 31 December 2013

Rewriting sidecar2 - much better JTAG speed now

One of the key features of my FPGA board is the coprocessor - an Atmel ATmega32u2 running code I'd codenamed "sidecar". This is responsible for the USB interface to the PC and disk emulation.

Up until this point, my USB code was quite hacky - based on the LUFA USBtoSerial example, it had a USB-serial bridge and also allowed JTAG operations using USB control messages. Unfortunately, the chip I'd chosen only supports a maximum of 4 endpoints and a virtual serial port requires 3 endpoints itself (actually, the FTDI chip only uses 2, but it consequently needs a driver as it's not a standard USB CDC device), so there was little choice in the matter. Control messages seemed to work for the most part, but I'd often get random disconnects whilst programming the flash chip (originally, it was about 1 failure every 5 flash cycles so I wasn't too worried). As the code I was running on the Atmel got more and more complicated (it also handles FDC emulation and the SD card), this failure rate increased to the point where the majority of reflash cycles were failing.

So, this was annoying and I was also getting fairly poor performance from my JTAG interface - it'd take about 20 seconds to flash the PROM for instance, which for a 200KB file is pretty slow! In the meantime, I'd decided to research booting the FPGA over JTAG as then at least I wouldn't be pummelling the PROM with all these erase-write cycles.

So, as you know from the previous post, I've had a lot of difficulty with this. Ultimately it turns out that, contrary to the Xilinx docs, JTAG mode isn't always available: as soon as you start configuring the chip, it resamples M0, M1 and M2 and reads its data from whichever source they select (although unreliably, as it seems to use the JTAG TCK as its sample clock, but it's still feeding the PROM CCLK at a much faster rate). So, I hooked up a spare data pin from the Atmel to the M0 and M2 pads (which, fortunately, I'd had the presence of mind to bring out to a jumper as I'd always planned to support JTAG boot). This didn't work either.

See the jumper wire in red

By chance I re-read the documentation and realised these pins are actually 2.5V level, and so feeding them 3.3V from the Atmel, whilst not damaging the chip (fortunately), was causing erratic behaviour. I discovered that if I left the pins floating (I'd previously been pulling them to GND for PROM boot) then JTAG boot worked, as the Xilinx chip actually contains pull-ups for these configuration pins. So far so good. However, I actually want the default option to be PROM boot and the Atmel takes a little time on boot before it can pull the pins low, so now the FPGA remains stubbornly uninitialised on power on. I tried having a pull-down resistor on these pins, but it fought with the pull-ups and put the voltage into the "not quite high, not quite low" territory, and whilst I could probably have won the battle by using a really low resistance pull-down, I didn't want the current loss when the Atmel was driving this high...
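To put rough numbers on that pull-up versus pull-down fight, here's a minimal sketch that treats the two resistors as a simple voltage divider. The 2.5V rail and the 3.3V Atmel drive level come from the story above, but the resistor values are purely hypothetical placeholders (I haven't quoted the real internal pull-up value), so treat the output as an illustration of the trade-off rather than a measurement.

```python
def divider_voltage(v_supply, r_pullup, r_pulldown):
    """Voltage on a pin sitting between a pull-up to v_supply and a pull-down to GND."""
    return v_supply * r_pulldown / (r_pullup + r_pulldown)


def pulldown_current_ma(v_drive, r_pulldown):
    """Current (in mA) burnt in the pull-down while something drives the pin to v_drive."""
    return v_drive / r_pulldown * 1000.0


# ASSUMPTION: 10k is a made-up stand-in for the FPGA's internal pull-up.
R_PULLUP = 10_000

for r_pulldown in (10_000, 1_000):  # hypothetical external pull-down choices
    volts = divider_voltage(2.5, R_PULLUP, r_pulldown)
    milliamps = pulldown_current_ma(3.3, r_pulldown)
    print(f"pull-down {r_pulldown} ohm: pin idles at {volts:.2f} V, "
          f"{milliamps:.2f} mA wasted when driven high")
```

With equal resistors the pin idles at half the rail, which is exactly the "not quite high, not quite low" zone. A pull-down ten times stronger wins the fight, but it costs ten times the current whenever the pin is driven high.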

Despite all this, even though I could now boot via JTAG, this process was still unreliable - the USB stack was still crashing randomly, and so the whole thing was really frustrating. I got so annoyed that I kind of ignored the project until this week of holiday over Christmas. I'd wanted to do the USB stuff by interrupt, but the LUFA docs specifically mention how support for this was removed and how polling was the only option. However, my curiosity was piqued when I found Jim Paris' source for lufa-ftdi, which emulates the functionality of an FTDI 2-endpoint serial port and does so solely through interrupts. I took a look, learned a lot more about the Atmel USB stack and started rewriting my sidecar code in a similar way.

So now, I have a rock solid JTAG implementation using 2 endpoints, and because the whole process is optimised to use ping-pong buffers, I can do a 128-bit JTAG exchange per USB packet. Because these are bulk endpoints too, there can be more than one in a 1ms period, and because I'm using ping-pong buffers, I can be reading from a receive buffer (OUT), bit-banging the JTAG exchange and writing to the transmit buffer (IN), whilst the PC is still reading the previous result (IN) buffer. Pretty sweet. Now, I can write the PROM in 8 seconds and, what's more, the JTAG boot works reliably and boots in under 5 seconds.
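For anyone trying to picture the data flow, here's a small host-side model of it in Python. It is not the sidecar firmware. The packet layout (16 bytes of TMS bits followed by 16 bytes of TDI bits, LSB first, answered by 16 bytes of TDO bits) is an assumption made up for this sketch, and `PingPongEndpoint`, `DelayTap`, `bitbang_exchange` and `service_endpoints` are all hypothetical names.

```python
from collections import deque

BITS_PER_EXCHANGE = 128                # one exchange per USB packet, as described above
FIELD_BYTES = BITS_PER_EXCHANGE // 8   # 16 bytes per bit-field (assumed layout)


class PingPongEndpoint:
    """Toy model of a double-banked endpoint: at most two packets in flight."""

    def __init__(self):
        self._banks = deque()

    def has_free_bank(self):
        return len(self._banks) < 2

    def has_packet(self):
        return len(self._banks) > 0

    def write(self, packet):
        if not self.has_free_bank():
            raise BufferError("both banks are full")
        self._banks.append(packet)

    def read(self):
        return self._banks.popleft()


class DelayTap:
    """Hypothetical JTAG target: TDO is TDI delayed by one TCK, TMS is ignored."""

    def __init__(self):
        self._last_tdi = 0

    def clock(self, tms, tdi):
        tdo, self._last_tdi = self._last_tdi, tdi
        return tdo


def bitbang_exchange(packet, tap):
    """Clock 128 TMS/TDI bit pairs into the target, collecting 128 TDO bits."""
    tms_field = packet[:FIELD_BYTES]
    tdi_field = packet[FIELD_BYTES:2 * FIELD_BYTES]
    tdo_field = bytearray(FIELD_BYTES)
    for i in range(BITS_PER_EXCHANGE):
        byte, bit = divmod(i, 8)
        tms = (tms_field[byte] >> bit) & 1
        tdi = (tdi_field[byte] >> bit) & 1
        if tap.clock(tms, tdi):
            tdo_field[byte] |= 1 << bit
    return bytes(tdo_field)


def service_endpoints(out_ep, in_ep, tap):
    """One firmware step: handle a request only if a reply bank is free."""
    if out_ep.has_packet() and in_ep.has_free_bank():
        in_ep.write(bitbang_exchange(out_ep.read(), tap))
        return True
    return False


# The host queues two requests back to back; the second waits in the spare bank.
out_ep, in_ep, tap = PingPongEndpoint(), PingPongEndpoint(), DelayTap()
request = bytes(FIELD_BYTES) + bytes([0x01]) + bytes(FIELD_BYTES - 1)
out_ep.write(request)
out_ep.write(request)
while service_endpoints(out_ep, in_ep, tap):
    pass
first_reply = in_ep.read()
```

The point of the second bank is visible in `service_endpoints`: the next exchange can be bit-banged as long as one reply bank is free, even if the PC hasn't collected the previous reply yet. Work only stalls once both reply banks are full.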

I'm probably going to drop the serial port functionality, just because serial ports weren't really all that common on the CPC anyway, and I'm exploring the possibility of reconfiguring the USART into SPI-master mode and doing 8 bits of JTAG exchange at a time that way instead, which would further increase the speed.
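One wrinkle with that idea, sketched below under my own assumptions: an SPI shifter only clocks one outgoing data line (TDI) and one incoming line (TDO), so TMS would have to be an ordinary GPIO that stays put for all 8 clocks of a byte. A byte can therefore only take the fast path if its 8 TMS bits are all 0 or all 1, and anything else falls back to bit-banging. The function names here are hypothetical and this is a planning model only, not firmware.

```python
def plan_exchange(tms_field):
    """Pick a transfer method per byte: SPI needs TMS constant for all 8 clocks."""
    return ["spi" if b in (0x00, 0xFF) else "bitbang" for b in tms_field]


def spi_fraction(tms_field):
    """Fraction of the bytes in an exchange that could use the hardware shifter."""
    plan = plan_exchange(tms_field)
    if not plan:
        return 0.0
    return plan.count("spi") / len(plan)


# A 128-bit data shift where TMS only goes high on the very last bit
# (the usual way of leaving the shift state), LSB-first bit order assumed.
tms_field = bytes(15) + bytes([0x80])
print(plan_exchange(tms_field).count("bitbang"), "byte(s) still need bit-banging")
print(f"{spi_fraction(tms_field):.4f} of the bytes could go through SPI")
```

For long data shifts nearly everything qualifies for the fast path, which is where the speed-up would come from. Short state-machine navigation sequences, where TMS changes on most clocks, would still be bit-banged.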

Another casualty of this rewrite is that I've lost all my old FDC emulation and FAT code. The FDC code needed rewriting as its state machine was pretty buggy, but this is the next thing on the horizon now... :)