DMA successes with Verilog

DMA successes with Verilog

Jim Brain

I know it's neither new nor terribly interesting to some, but the recent Ultimax mode success bolstered my confidence to try a few more complex things, like CBM 64 DMA.

I've been impressed by the flexibility of the development cart I designed, as it required no more work to set up for DMA tests.

I started off by trying to DMA one byte (to address $8000), and when that worked, I then tried 256 bytes.  I hit a snag when trying to store a user-defined length and had to reimplement the code to use a state machine.

But, I have managed to get 1-256 bytes (user defined) to DMA from the cart to the 64.
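
For readers following along, here is a rough software model of what a length-counting DMA state machine could look like. It is only a sketch in Python, not the Verilog from the cart: the state names, the `DmaEngine` class, the test pattern, and the convention that a length register of 0 means 256 bytes are all assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()      # waiting for the 64 to start a transfer
    TRANSFER = auto()  # moving one byte per bus cycle
    DONE = auto()      # transfer finished, waiting to be re-armed


class DmaEngine:
    """Toy model of a cart-side DMA engine with a user-defined length."""

    def __init__(self, cart_data, dest=0x8000):
        self.cart_data = cart_data  # bytes held on the cartridge
        self.dest = dest            # first C64 address to write
        self.remaining = 0
        self.index = 0
        self.state = State.IDLE

    def start(self, length_reg):
        # Assumption: an 8-bit length register where 0 encodes 256 bytes.
        self.remaining = length_reg if length_reg else 256
        self.index = 0
        self.state = State.TRANSFER

    def step(self, c64_ram):
        """Advance one bus cycle; returns True if a byte was transferred."""
        if self.state is not State.TRANSFER:
            return False
        c64_ram[self.dest + self.index] = self.cart_data[self.index]
        self.index += 1
        self.remaining -= 1
        if self.remaining == 0:
            self.state = State.DONE
        return True


def run_transfer(length_reg):
    """Run one complete transfer and return (ram, cycles_used)."""
    ram = bytearray(0x10000)
    # Hypothetical test pattern standing in for the cart's data.
    engine = DmaEngine(bytes((i * 3 + 1) & 0xFF for i in range(256)))
    engine.start(length_reg)
    cycles = 0
    while engine.step(ram):
        cycles += 1
    return ram, cycles
```

Calling `run_transfer(4)` moves four bytes to $8000-$8003 in four cycles. The model ignores BA and badlines entirely, so it only shows the counting logic, not the bus timing.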

It's a small victory, but I always felt the DMA mode was a mystery in implementation details.

Jim



-- 
Jim Brain
[hidden email] 
www.jbrain.com

Re: DMA successes with Verilog

MiaM
Den Tue, 12 Jun 2018 00:35:33 -0500 skrev Jim Brain <[hidden email]>:

> I know it's neither new nor terribly interesting to some, but the
> recent Ultimax mode success bolstered my confidence to try a few more
> complex things, like CBM 64 DMA.
>
> I've been impressed by the flexibility of the development cart I
> designed, as it required no more work to set up for DMA tests.
>
> I started off by trying to DMA one byte (to address $8000), and when
> that worked, I then tried 256 bytes.  I hit a snag when trying to
> store a user defined length and had to reimplement the code to use a
> state machine.
>
> But, I have managed to get 1-256 bytes (user defined) to DMA from the
> cart to the 64.
>
> It's a small victory, but I always felt the DMA mode was a mystery in
> implementation details.

Nice!

Seems like your expansion might be able to have a REU compatible mode
in the future :)

--
(\_/) Copy the bunny to your mails to help
(O.o) him achieve world domination.
(> <) Come join the dark side.
/_|_\ We have cookies.


Re: DMA successes with Verilog

smf
In reply to this post by Jim Brain
Reminds me of the 90's when I started sketching out a DMA cart for the
C64 using a PC parallel port and a load of 74LS (buffers for the address
and data & then something more clever that waited for the bus to be
available before firing everything).

Unfortunately it's lost to time now (or fortunately because it saves my
embarrassment).

I'm not aware of a decent remote debugger for a real c64 still.

On 12/06/2018 06:35, Jim Brain wrote:

> I know it's neither new nor terribly interesting to some, but the
> recent Ultimax mode success bolstered my confidence to try a few more
> complex things, like CBM 64 DMA.
>
> I've been impressed by the flexibility of the development cart I
> designed, as it required no more work to set up for DMA tests.
>
> I started off by trying to DMA one byte (to address $8000), and when
> that worked, I then tried 256 bytes.  I hit a snag when trying to
> store a user defined length and had to reimplement the code to use a
> state machine.
>
> But, I have managed to get 1-256 bytes (user defined) to DMA from the
> cart to the 64.
>
> It's a small victory, but I always felt the DMA mode was a mystery in
> implementation details.
>
> Jim
>
>
>


Re: DMA successes with Verilog

Jim Brain
In reply to this post by MiaM


On June 12, 2018 at 11:12 AM Mia Magnusson <[hidden email]> wrote:



Nice!

Seems like your expansion might be able to have a REU compatible mode
in the future :)

Long in the future, perhaps.  I suspect that handling all of the functionality in an REU will thwart my efforts. 

To add content worthy of this group:  Looking at the functionality of the REU, I noticed the "swap" functionality, and I wonder if one could perform 2 actions in 1 half PHI2 cycle.  I think the DRAM is 200nS or better, and maybe with that or with 150nS DRAM, a cart could place the address and set R/W to be a read for the first half of the high PHI2 cycle, and then change to a write halfway through with new data on the data bus.  That would allow a swap to occur in half the time.  I am sure there's some reason why it won't work, but it seemed an idea worth considering.

Jim



Re: DMA successes with Verilog

David Wood-2
(TL;DR in third paragraph, a bit of extra info leads in.)

I looked closely into doubling the DMA transaction rate during PHI2 a long time ago and learned that RAS timing wasn't variable.  I did work out that you could change the column address within one PHI2 cycle by bumping Ultimax mode to disable CASRAM out from the PLA momentarily, but not the row address.  This is due to the C64 wiring RAS directly to the DRAMs and routing CAS through the PLA.

For what it's worth, the CAS line routes through the 74LS257s (on breadbins, that is) to select the A (high address byte) inputs while CAS is asserted.  This precludes accessing adjacent bytes of memory but does leave the possibility (however remote) of accessing data on two separate pages but at the same byte address in one PHI2 cycle.
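
To make the row/column point concrete, here is a small Python sketch that takes the multiplexing exactly as described in this message (low address byte as the row, high address byte as the column). The helper names are made up for illustration, and the real board may order the individual address bits differently.

```python
def row_col(address):
    """Split a 16-bit address: low byte = row, high byte = column (as described above)."""
    return address & 0xFF, (address >> 8) & 0xFF


def same_row(a, b):
    """True if two addresses share a row, so only the column (CAS) would need to change."""
    return row_col(a)[0] == row_col(b)[0]
```

Under that assumption `same_row(0x8010, 0x9010)` is true (same byte offset, different page), while `same_row(0x8000, 0x8001)` is false, which is why adjacent bytes cannot be reached without a new RAS.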

TL;DR -
A RMW operation should be possible without modifying anything but the r/w signal on the bus according to most FPM datasheets I've studied, but would require some pretty good timing to ensure the VIC-II has completed its CAS cycle since the cart has to run blind relative to RAS and CAS.  It's already capable of counting DOT cycles so that should be easy.

Also, some (very few) early breadbins had 300ns DRAMs in them.  My friend's first C64 was a sparkly unit and had 300ns RAMs in it.

-David


On Tue, Jun 12, 2018 at 2:31 PM, Jim Brain <[hidden email]> wrote:


On June 12, 2018 at 11:12 AM Mia Magnusson <[hidden email]> wrote:



Nice!

Seems like your expansion might be able to have a REU compatible mode
in the future :)

Long in the future, perhaps.  I suspect that handling all of the functionality in an REU will thwart my efforts. 

To add content worthy of this group:  Looking at the functionality of the REU, I noticed the "swap" functionality, and I wonder if one could perform 2 actions in 1 half PHI2 cycle.  I think the DRAM is 200nS or better, and maybe with that or with 150nS DRAM, a cart could place the address and set R/W to be a read for the first half of the high PHI2 cycle, and then change to a write halfway through with new data on the data bus.  That would allow a swap to occur in half the time.  I am sure there's some reason why it won't work, but it seemed an idea worth considering.

Jim




Re: DMA successes with Verilog

Jim Brain
On 6/12/2018 2:24 PM, David Wood wrote:
> TL;DR -
> A RMW operation should be possible without modifying anything but the
> r/w signal on the bus according to most FPM datasheets I've studied,
> but would require some pretty good timing to ensure the VIC-II has
> completed its CAS cycle since the cart has to run blind relative to
> RAS and CAS.  It's already capable of counting DOT cycles so that
> should be easy.
I thought the VIC-II did the CAS cycle during PHI2=low half of the
cycle.    I can put it on the LA tonight, but is there a diagram already
available showing the signals?

Jim



Re: DMA successes with Verilog

MiaM
Den Tue, 12 Jun 2018 15:08:02 -0500 skrev Jim Brain <[hidden email]>:

> On 6/12/2018 2:24 PM, David Wood wrote:
> > TL;DR -
> > A RMW operation should be possible without modifying anything but
> > the r/w signal on the bus according to most FPM datasheets I've
> > studied, but would require some pretty good timing to ensure the
> > VIC-II has completed its CAS cycle since the cart has to run blind
> > relative to RAS and CAS.  It's already capable of counting DOT
> > cycles so that should be easy.
> I thought the VIC-II did the CAS cycle during PHI2=low half of the
> cycle.    I can put it on the LA tonight, but is there a diagram
> already available showing the signals?

Not 100% sure how it is done, but there must be some "pause" with both
RAS and CAS high between VIC and CPU access to ram, so you shouldn't
count on having a full half period of the 1MHz clock for memory access,
but slightly less. But maybe that equals CAS asserted halfway
through a 1MHz half cycle?

--
(\_/) Copy the bunny to your mails to help
(O.o) him achieve world domination.
(> <) Come join the dark side.
/_|_\ We have cookies.


Re: DMA successes with Verilog

MiaM
In reply to this post by smf
Den Tue, 12 Jun 2018 18:59:54 +0100 skrev smf <[hidden email]>:
> Reminds me of the 90's when I started sketching out a DMA cart for
> the C64 using a PC parallel port and a load of 74LS (buffers for the
> address and data & then something more clever that waited for the bus
> to be available before firing everything).

I guess that the chances of getting any such setup to work are rather
high.

At least the parallel port on an Amiga can in theory supply data at
only 700kbyte/sec, so writing from another computer to the C64
wouldn't even need handshaking if the picture is disabled so there
wouldn't be any bad lines.
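
The arithmetic behind that claim can be checked quickly. The Python sketch below assumes that, with the screen off, the cart can write one byte per PHI2 cycle; the function name is made up for illustration, and the 700kbyte/sec figure is the theoretical one quoted above.

```python
PAL_PHI2_HZ = 985_248       # PAL C64 system clock
NTSC_PHI2_HZ = 1_022_727    # NTSC C64 system clock
AMIGA_PARALLEL_BYTES_PER_S = 700_000  # theoretical figure quoted above


def sender_outruns_bus(sender_rate, phi2_hz):
    """True if the sender delivers bytes faster than one per PHI2 cycle."""
    # Assumption: screen off, so every PHI2 cycle is available for one DMA write.
    return sender_rate > phi2_hz
```

On both PAL and NTSC machines the sender is slower than the bus, so under this assumption the C64 side never has to ask the sender to wait.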

> Unfortunately it's lost to time now (or fortunately because it saves
> my embarrassment).

Seems like many hardware projects were started in the 90's and never
finished and thus saved us from embarrassment ;) For example I tried
doing a ROM emulator for the Atari 2600 VCS video game, but never got
it to work. Afterwards I realized that a good reason for it not working
could have been that I took all the 74xx chips from known broken
boards ;) ;)

> I'm not aware of a decent remote debugger for a real c64 still.

That is an interesting use of the DMA mode.

--
(\_/) Copy the bunny to your mails to help
(O.o) him achieve world domination.
(> <) Come join the dark side.
/_|_\ We have cookies.


Re: DMA successes with Verilog

Luke Crook


On Tue, Jun 12, 2018 at 14:13 Mia Magnusson <[hidden email]> wrote:
Den Tue, 12 Jun 2018 18:59:54 +0100 skrev smf <[hidden email]>:
> Reminds me of the 90's when I started sketching out a DMA cart for
> the C64 using a PC parallel port and a load of 74LS (buffers for the
> address and data & then something more clever that waited for the bus
> to be available before firing everything).

I guess that the chances of getting any such setup to work is rather
high.

At least the parallell port on an Amiga can only in theory
supply data at 700kbyte/sec so writing from another computer to the C64
wouldn't even need hand shaking if picture is disabled so there
wouldn't be a

> I'm not aware of a decent remote debugger for a real c64 still.

That is an interesting use of the DMA mode.


This cart allows a remote computer to directly read/write to the C64's RAM via USB.

https://hackaday.io/project/388-c64-flash-cart






Re: DMA successes with Verilog

MiaM
Den Tue, 12 Jun 2018 14:26:54 -0700 skrev Luke Crook <[hidden email]>:
> This cart allows a remote computer to directly read/write to the
> C64's RAM via USB.
>
> https://hackaday.io/project/388-c64-flash-cart

Am I missing something?
It looks like it "only" has 16k double ported ram which you can
read/write remotely.

--
(\_/) Copy the bunny to your mails to help
(O.o) him achieve world domination.
(> <) Come join the dark side.
/_|_\ We have cookies.


Re: DMA successes with Verilog

David Wood-2
In reply to this post by Jim Brain


On Tue, Jun 12, 2018 at 4:08 PM, Jim Brain <[hidden email]> wrote:
On 6/12/2018 2:24 PM, David Wood wrote:
TL;DR -
A RMW operation should be possible without modifying anything but the r/w signal on the bus according to most FPM datasheets I've studied, but would require some pretty good timing to ensure the VIC-II has completed its CAS cycle since the cart has to run blind relative to RAS and CAS.  It's already capable of counting DOT cycles so that should be easy.
I thought the VIC-II did the CAS cycle during PHI2=low half of the cycle.    I can put it on the LA tonight, but is there a diagram already available showing the signals?

I'm not aware of one that anyone has officially published.  I think Zero-X had done a timing grab at high resolution when trying to figure out how memory corruption occurs during a VIC-II DMA restart though.  If that's still in the wild it could be used to gather information.

It's probably simpler to just grab your own sample.  Publishing what you find would be really interesting to see. :)

That said, there has to be a RAS/CAS sequence during PHI2=low so the VIC-II can read graphics data from memory.  During text mode I believe most accesses usually head for the ROM, but with a RAM charset or bitmap mode that wouldn't be the case.

A second RAS/CAS sequence has to occur during PHI2=high to complete a transaction for the CPU.

 

Jim




Re: DMA successes with Verilog

Segher Boessenkool
On Tue, Jun 12, 2018 at 05:46:22PM -0400, David Wood wrote:

> On Tue, Jun 12, 2018 at 4:08 PM, Jim Brain <[hidden email]> wrote:
> > I thought the VIC-II did the CAS cycle during PHI2=low half of the
> > cycle.    I can put it on the LA tonight, but is there a diagram already
> > available showing the signals?
>
> I'm not aware of one that anyone has officially published.  I think Zero-X had
> done a timing grab at high resolution when trying to figure out how memory
> corruption occurs during a VIC-II DMA restart though.  If that's still in
> the wild it could be used to gather information.
>
> It's probably simpler to just grab your own sample.  Publishing what you
> find would be really interesting to see. :)
>
> That said, there has to be a RAS/CAS sequence during PHI2=low so the VIC-II
> can read graphics data from memory.  During text mode I believe most
> accesses usually head for the ROM, but with a RAM charset or bitmap mode
> that wouldn't be the case.
>
> A second RAS/CAS sequence has to occur during PHI2=high to complete a
> transaction for the CPU.

CAS is active for approximately the first half of both PHI2 high and low.

Second line in
http://segher.ircgeeks.net/c64/plots/vic-old.html

(And see
http://segher.ircgeeks.net/c64/plots/vic.html
for why Gerrit made these plots: we were looking at the "grey pixel
noise" thing.  All these plots are his.)


Segher


Re: DMA successes with Verilog

Luke Crook
In reply to this post by MiaM


On Tue, Jun 12, 2018 at 14:41 Mia Magnusson <[hidden email]> wrote:
Den Tue, 12 Jun 2018 14:26:54 -0700 skrev Luke Crook <[hidden email]>:
> This cart allows a remote computer to directly read/write to the
> C64's RAM via USB.
>
> https://hackaday.io/project/388-c64-flash-cart

Am I missing something?
It looks like it "only" has 16k double ported ram which you can
read/write remotely.

Well, 16k is better than nothing :)




Re: DMA successes with Verilog

Jim Brain
In reply to this post by MiaM
On 6/12/2018 4:08 PM, Mia Magnusson wrote:
> Not 100% sure how it is done, but there must be some "pause" with both
> RAS and CAS high between VIC and CPU access to ram, so you shouldn't
> count on having a full half period of the 1MHz clock for memory access,
> but slightly less. But maybe that equals CAS asserted halfway
> through a 1MHz half cycle?
I should have been more precise.

I knew that there are two RAS/CAS cycles per 1MHz cycle (one for VIC
access and one for CPU/VIC access).  The scope shots show that on the CPU
cycle, CAS occurs about 1/3 of the way into the 500nS half cycle. That means
there is <300nS to perform all activities.  For 100 or 120nS DRAM, it might
work, but David's 300nS parts would fail for sure.
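
As a back-of-the-envelope check of those numbers, here is a Python sketch of the budget. It uses a deliberately crude model, and that model is an assumption: a read followed by a write is taken to cost two rated DRAM access times, and precharge, setup, and hold times are ignored. The function names are hypothetical.

```python
HALF_CYCLE_NS = 500   # nominal half of a 1 MHz PHI2 cycle
WINDOW_NS = 300       # upper bound quoted above ("<300nS")


def time_after_cas(half_cycle_ns=HALF_CYCLE_NS):
    """Time left in the half cycle if CAS falls about 1/3 of the way in."""
    return half_cycle_ns * (1 - 1 / 3)


def swap_fits(dram_speed_ns, window_ns=WINDOW_NS):
    """Crude test: do a read plus a write (two rated access times) fit in the window?"""
    return 2 * dram_speed_ns < window_ns
```

Under this model 100nS and 120nS parts fit and 150nS, 200nS, and 300nS parts do not, which lines up with the guess above. A real answer needs the DRAM datasheet timings and a scope capture.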

In other news, I now have 64kB reads working and 64kB writes working. 
No compare or swap, as I will need to once again modify the state
machine to add those.  If there is interest, I can put my Verilog up
somewhere.

Jim
>

--
Jim Brain
[hidden email]
www.jbrain.com



Re: DMA successes with Verilog

smf
In reply to this post by MiaM
On 12/06/2018 22:12, Mia Magnusson wrote:

> At least the parallel port on an Amiga can only in theory
> supply data at 700kbyte/sec so writing from another computer to the C64
> wouldn't even need hand shaking if picture is disabled so there
> wouldn't be any bad lines.

Halting both VIC and CPU would be interesting, but being able to access
memory while the C64 is running was the main aim & I don't think there
is a way to stop VIC externally.

Having the cartridge strobe the control signals at the right time and
then handshake to an Amiga to perform the next transfer on an interrupt
was the goal.




Re: DMA successes with Verilog

silverdr@wfmh.org.pl
In reply to this post by Jim Brain

> On 2018-06-13, at 07:06, Jim Brain <[hidden email]> wrote:
>
> In other news, I now have 64kB reads working and 64kB writes working.

How does it go in terms of timing / cycles?

>  No compare or swap, as I will need to once again modify the state machine to add those.  If there is interest, I can put my Verilog up somewhere.

I am highly interested in things DMA for 6502 based systems. Although I've chosen VHDL as my development "platform" rather than Verilog - to my understanding the main concepts remain fully valid across the two. With DMA I basically need consecutive writes for one of my projects and am very curious how fast and in groups of how many bytes could this be done in a given time frame.

--
SD! - http://e4aws.silverdr.com/



Re: DMA successes with Verilog

Jim Brain
On 6/13/2018 11:36 AM, [hidden email] wrote:
>> On 2018-06-13, at 07:06, Jim Brain <[hidden email]> wrote:
>>
>> In other news, I now have 64kB reads working and 64kB writes working.
> How does it go in terms of timing / cycles?

I've not put a real REU into the logic analyzer, but I assume it's the
same.  Any phi cycle where ba is high during the low half of the cycle
and we are wanting to perform a DMA activity is latched as phi goes high
  and if that is true, the code performs the DMA during the high half
cycle of PHI.  As the last PHI cycle goes low, the pointers are reloaded
  if needed, and the dma flag is reset.
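
The BA gating described above can be modelled in a few lines of Python. This is a sketch of the rule as stated in this post, not the actual Verilog: `dma_cycles` and its arguments are hypothetical names, and BA is represented as one sampled boolean per PHI2 cycle.

```python
def dma_cycles(ba_trace, bytes_wanted):
    """Count the PHI2 cycles needed to move bytes_wanted bytes.

    ba_trace: iterable of booleans, the BA level sampled during the
    low half of each PHI2 cycle (True = BA high, bus available).
    Returns the cycle count, or None if the trace ends first.
    """
    if bytes_wanted == 0:
        return 0
    moved = 0
    for cycle, ba_high in enumerate(ba_trace, start=1):
        # Latched as PHI2 goes high: BA was high and a DMA is pending.
        do_dma = ba_high and moved < bytes_wanted
        if do_dma:
            moved += 1  # one byte transferred during the high half
        if moved == bytes_wanted:
            return cycle
    return None
```

For example, `dma_cycles([True, False, False, True, True], 3)` takes five cycles to move three bytes, because the two cycles with BA low (as around a badline) are skipped.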



I can see on the logic analyzer when badlines happen, and your transfer
speeds are still constrained by that fact.  But, the idea of
transferring a block of bytes coming from a PC looks relatively easy to
implement if the 64 requests the transfer.  The issue, as I understand
it, is if you want to surreptitiously DMA data into the running 64
memory map, since you don't know where the 6510 is in its instruction
fetch/decode/action cycle, and pulling DMA low will corrupt CPU
activities in flight.



Gideon Z has a nice writeup on the issue.



Jim



Re: DMA successes with Verilog

smf
On 13/06/2018 18:53, Jim Brain wrote:
> The issue, as I understand
> it, is if you want to surreptitiously DMA data into the running 64
> memory map, since you don't know where the 6510 is in its instruction
> fetch/decode/action cycle, and pulling DMA low will corrupt CPU
> activities in flight.

I would have thought that there would be a safe point if you trigger an
nmi and switch to ultimax mode when it fetches the vector, have the
ultimax nmi handler just return and switch out of ultimax mode.

For remote debugging you need to be able to switch to ultimax mode and
provide code anyway, because you can't update $00/$01 with dma



Re: DMA successes with Verilog

silverdr@wfmh.org.pl
In reply to this post by Jim Brain

> On 2018-06-13, at 19:53, Jim Brain <[hidden email]> wrote:
>
> implement if the 64 requests the transfer.  The issue, as I understand
> it, is if you want to surreptitiously DMA data into the running 64
> memory map, since you don't know where the 6510 is in its instruction
> fetch/decode/action cycle, and pulling DMA low will corrupt CPU
> activities in flight.

Roger that. I am after something a little different though. Something like: the CPU puts some data into a buffer and goes about its other business. Once it returns, it finds the data processed by a DMA-capable circuit that reads the data left by the 6502/6510, processes it, and writes it back to the same (or another) RAM area. All that without actually stopping the CPU. I heard there were some DMA implementations that worked in such a way.

--
SD! - http://e4aws.silverdr.com/



Re: DMA successes with Verilog

Jim Brain
In reply to this post by smf
On 6/13/2018 3:47 PM, smf wrote:
> On 13/06/2018 18:53, Jim Brain wrote:
>>
>
> I would have thought that there would be a safe point if you trigger
> an nmi and switch to ultimax mode when it fetches the vector, have the
> ultimax nmi handler just return and switch out of ultimax mode.
If you can guarantee the NMI will occur (I think Marko showed that you
can trigger it and not ack an NMI, which would hold the NMI line down,
preventing NMI from working), then your idea would work.
>
>
>

--
Jim Brain
[hidden email]
www.jbrain.com

