
Comments (79)

DvdBoon avatar DvdBoon commented on June 17, 2024

Got some first results. Using the default MMU window of $80000000-$A0000000 did the trick. See http://amigafun.blogspot.nl/2015/12/picture-of-day-dualppc.html

Will push the updated code later.

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

Congrats on that, I saw the code, looks good to me. I really need to put back my A1200T together, now that I can test Sonnet with it.


DvdBoon avatar DvdBoon commented on June 17, 2024

A (number of) function(s) appear(s) to be broken on the 1200TX. The only other program I tried (Quake2) didn't work. It exits gracefully, complaining that soft_rend.dll is not available. It loads fine, however. I also see that the main program probes for a certain 68K port. I expect Run68K to be broken. Looking into it.


DvdBoon avatar DvdBoon commented on June 17, 2024

Investigation into the problem is hindered by debug tools not correctly working in the environment created by the PCI library. Monam crashes and COP cannot read the gfx or sonnet memory (put into place using the MMU by the pci.library).

Next to that some strange stuff is happening. The CyberPI program for example only outputs the pi result. The actual text before and after are not displayed. This is actually 68K code (Output() and Write()) called from the 68K part of the CyberPI program indicating that 68K lib function calling from gfx or sonnet mem is impaired, probably in combination with Z2 window shifting.

I don't think we'll see a working Sonnet card on the A1200 this year.


DvdBoon avatar DvdBoon commented on June 17, 2024

At the moment I am writing my own MMU setup code to replace the one in pci.library. I am chopping up the 512MB from $80000000-$A0000000 into 8MB pieces. Every 8MB piece will point to the Z2 memory at $200000-$A00000. Only one 8MB piece will be marked valid, though; the rest will be marked invalid. An access to an invalid 8MB piece will invoke the bus error handler. Inside this handler we'll shift the Z2 window, mark the new 8MB piece valid and the old one invalid. I will be using indirect page descriptors for VERY fast MMU switching. Both the supervisor and user MMU trees must be addressed.
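
The address arithmetic behind this scheme can be sketched as follows. This is an illustration in Python only (the thread itself contains no code): the constants are the ones named above, while the function names `window_index` and `z2_address` are hypothetical and do not come from pci.library or mmu.library.

```python
# Sketch only: constants are taken from the comment above,
# the function names are made up for illustration.
PCI_BASE = 0x80000000          # start of the 512MB logical PCI range
PCI_END = 0xA0000000           # exclusive end of that range
Z2_WINDOW = 0x00200000         # Mediator window in Zorro II space
WINDOW_SIZE = 8 * 1024 * 1024  # 8MB jumper setting


def window_index(addr):
    """Return which 8MB piece of the PCI range a logical address falls in."""
    if not PCI_BASE <= addr < PCI_END:
        raise ValueError("address outside the PCI range")
    return (addr - PCI_BASE) // WINDOW_SIZE


def z2_address(addr):
    """Return the Zorro II address the MMU would translate addr to."""
    window_index(addr)  # reuse the range check
    return Z2_WINDOW + (addr - PCI_BASE) % WINDOW_SIZE
```

With these numbers the 512MB range holds 64 pieces, and every piece translates to the same $200000-$A00000 target, which is why only one of them can be valid at a time.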

In theory, this all sounds good, but I don't know yet if this will interfere with the workings of the pci.library. Even with MMU=no as option the pci.library still installs a bus error hook on the A1200, probably to catch PCI memory accesses. I'm optimistic, though :-)


rkujawa avatar rkujawa commented on June 17, 2024

I wonder, would it be possible to use mmu.library for this? It is still actively developed and seems to be the standard system-friendly way for manipulating the MMU.

http://aminet.net/package/util/libs/MMULib


rkujawa avatar rkujawa commented on June 17, 2024

Also note that you should support 4MB pieces too, after all Mediator 1200 can be switched into 4MB window mode by jumper.


DvdBoon avatar DvdBoon commented on June 17, 2024

Yes, I am using the mmu.library for this. Maybe I should pester the author about being able to set multiple indirect pages at once. I now have to do a SetProperties call for every page (4k... and that for 512MB)... Only during setup, though.

The 4MB support will follow if this works for the 8MB window.


rkujawa avatar rkujawa commented on June 17, 2024

As I understand, @thorfdbg is the author of mmu.library, maybe he can comment here and give his opinion, since he has an account on GitHub 😉 .


DvdBoon avatar DvdBoon commented on June 17, 2024

Oh, well, I've sent him a mail already :-) It's a nice-to-have really, as I can just program a loop. It is needed during setup only, but it would shorten the setup time. Most of the setup time already goes into setting up the PPC MMU.


thorfdbg avatar thorfdbg commented on June 17, 2024

On 01.02.2016 13:06, Radosław Kujawa wrote:

As I understand, @thorfdbg https://github.com/thorfdbg is the author
of |mmu.library|, maybe he can comment here and give his opinion, since
he has an account on GitHub 😉 .

It's probably easier just to mail to this account. Not sure whether
github will reach me. But yes, thorfdbg is my github account.

Greetings,
Thomas


DvdBoon avatar DvdBoon commented on June 17, 2024

Hello Thomas,

Thank you for your answer. I was under the impression that using the
INDIRECT approach was faster than the SINGLE approach.
I was also under the impression for SetIndirectArray to work, you have to
install the pages first using SetProperties. As INDIRECT only takes one
DESCRIPTOR I need to call it for all the pages during setup. There is no
magic code inside SetProperties which updates the DESCRIPTOR per page used
when size > 1 page.

I think I know enough to first try this all with the INDIRECT approach. I
just need to call the SetProperties function 131071 times :-)

Let's see how that goes. Otherwise I'll do the single approach.

I see that the docs don't explain what 'lower' (a1) is for
SetPageProperties.

The end goal is to make PCI memory access seem transparent to programs on
the Mediator 1200TX. The whole 32-bit range can only be accessed using a
sliding window residing at $200000. In effect, I need to replace a similar
function now executed within the pci.library but which is buggy and not
friendly to other programs using the MMU.

Thanks again,

Dennis.

I am developing a library which needs to redirect memory calls from a
512MB block starting from $80000000 to Zorro 2 memory space. So I want
to point multiple blocks of 8MB inside that 512MB block to one 8MB block
at $200000.

I want to do this with MAPP_INDIRECT as I need to change the properties
of those 8MB blocks quickly. (One of them is valid, the rest will be
invalid. If invalid, bus error is handled -> mediator window is set on
invalid block -> block is made valid -> Previous 8MB block is made
invalid -> rte).

That's one possibility. MAPP_INDIRECT has one drawback, however, and
that is that the page modes cannot and will not be automatically
adjusted when a DMA transfer is running from such pages. A better
alternative is usually to map the pages as MAPP_SINGLE, which creates a
single descriptor per page, i.e. it disables the early termination page
descriptors (for the 030 and the 68851, of course).

Then, you can rather quickly change the modes by SetPageProperties().

There is one drawback, however, namely that your properties will be
overwritten as soon as the MMU table in this area is rewritten, so it's
usually best to set up the high-level page properties with SetProperties()
to something useful, and then alter the page properties one by one.
SetPageProperties is pretty quick and goes directly on the page
descriptor, by a per-CPU type dispatcher that avoids CPU-dependencies.

If this is not acceptable, you can also install a page-table access
handler with

AddContextHook(MADTAG_TYPE,MMUEH_PAGEACCESS,...)

and you will be called as soon as the high-level function leaks through
to the low-level and changes a setting there. That's the strategy
MuGuardianAngel uses, namely quickly changing the page access to
INVALID to those pages that should not be reached because they only
contain free memory.

In case you want to recover from page faults, you should also set
MAPP_REPAIRABLE. It tells the bus-error handler to collect additional
information for you that is otherwise not available.

Is it correct that I can only set one page at a time with INDIRECT and
MAPTAG_DESCRIPTOR using SetProperties? When I pass a logical address and
a size of 8MB, I would like it to increase MAPTAG_DESCRIPTOR by 4 (or 16,
depending on tag or flag?) for every page (so for 8MB/4k pages), so that
I don't need to call SetProperties for every page during setup
(=8MB/4k calls).

MAPP_INDIRECT works also on larger page sets, though it then acts
similar to MAPP_BUNDLED, i.e. the entire page block will go to the same
(single) descriptor. The Lib does not try to "smartly" adjust the
position of the indirect descriptor.

If you need to adjust an entire array, you have to use...

I want to use SetIndirectArray in the bus error hook.

..exactly that. (-: It's a low-level function, i.e. on the same level as
SetPageProperties().

Or is this already possible and did I overlook it in the documents? Or
is there another method of quickly just mark pages invalid/valid.

As suggested above, I would go for SetPageProperties() which is usually
the safer way of getting it done. However, not knowing your constraints
and design goals, it's a bit hard to answer.


thorfdbg avatar thorfdbg commented on June 17, 2024

On 01.02.2016 20:17, DvdBoon wrote:

Hello Thomas,

Thank you for your answer. I was under the impression that using the
INDIRECT approach was faster than the SINGLE approach.

The question is "how fast" is "fast enough". It would probably be
helpful to know what you are attempting and what the expected latency
is. So for example, how often do you need to adjust pages?
SetPageProperties is not overly slow (unlike the high-level functions).

A second question you need to answer yourself is whether the area you
are going to remap is ever touched by a DMA transfer. If so, then
indirect descriptors will cause trouble.

I was also under the impression for SetIndirectArray to work, you have to
install the pages first using SetProperties.

Yes, as always. You need to set the MAPP_INDIRECT flag for that.

As INDIRECT only takes one
DESCRIPTOR I need to call it for all the pages during setup. There is no
magic code inside SetProperties which updates the DESCRIPTOR per page used
when size > 1 page.

It will always point to the same descriptor for all pages in the area,
this is correct.

I think I know enough to first try this all with the INDIRECT approach. I
just need to call the SetProperties function 131071 times :-)

Not necessarily. You can also first set the pages to point all to the
same indirect page and then call SetPagePropertiesA() to adjust the
pointer. MAPTAG_DESCRIPTOR is the tag that defines the target descriptor.

Let's see how that goes. Otherwise I'll do the single approach.

I see that the docs don't explain what 'lower' (a1) is for
SetPageProperties.

It's the logical address for which the mapping has to be modified.

The end goal is to make PCI memory access seem transparent to programs on
the Mediator 1200TX. The whole 32-bit range can only be accessed using a
sliding window residing at $200000. In effect, I need to replace a similar
function now executed within the pci.library but which is buggy and not
friendly to other programs using the MMU.

I see. So in essence, a write into the PCI area (logical address) has to
be remapped to go into the window at $200000 instead, and for every
potential PCI access you have to perform a remap? However, how do you
decide which window of the PCI memory to map in? And, forgive me the
silly question, why is the memory not accessed in the window directly
with its hardware address, thus why the back-and-forth of physical and
logical mapping? After all, a program can only access a single PCI
device at a time due to windowing in first place, do I get this right?

If I may: It is probably easier not to use descriptors in first place.
The library can give you access to the CPU pipeline, i.e. you can get
access to the data the code has written, and the data the code has just
read. You can define an exception hook that picks up the data that has
just been written, its size, and perform the transfer to the target
address within the window manually.

The downside is that this does not work for movem, i.e. it can only
catch up byte, word or long-word reads or writes.

If nothing helps, I can probably add another flag to SetPageProperties()
to have "non-bundled" indirect descriptor setup, though this might take
a while until I have an implementation.

Let me know how it goes and whether I can do something for you to
support you further.

Greetings,
Thomas


DvdBoon avatar DvdBoon commented on June 17, 2024

Hello,

2016-02-01 21:04 GMT+01:00 Thomas Richter [email protected]:

On 01.02.2016 20:17, DvdBoon wrote:

I see. So in essence, a write into the PCI area (logical address) has to
be remapped to go into the window at $200000 instead, and for every

For every access, that is correct.

potential PCI access you have to perform a remap? However, how do you

I need to do a remap if the access is outside the current 8MB window. The
Mediator can hardware remap/mirror an 8MB window of PCI memory to $200000.
No remap is needed as long as subsequent accesses stay within this window. As
soon as there is an access outside this window, I need to remap.

decide which window of the PCI memory to map in? And, forgive me the

By looking at the upper bits of the logical address. These bits are used to
move the Mediator hardware remap window to the correct PCI 32 bit address.
Then I need to remap the PCI addresses to this window at $200000 using the
68K MMU.

Example:

If I access $80000000, the Mediator window needs to move to this address so
that the window at $200000 reflects PCI memory at $80000000-$80800000 (it's
how the Mediator works).
Then I remap, using the MMU, accesses to $80000000-$80800000 to
$200000-$A00000. Every access within $80000000-$80800000 is now
transparently handled.

Then an access to $90000000 happens. Now I need to configure the Mediator
to show/mirror $90000000-$90800000 at the $200000 window and tell the MMU
to translate these addresses to $200000.
All this needs to be as fast as possible. I hope that makes it a bit clearer :-)
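
The example above can be modelled as a small state machine. This is a toy simulation in Python, not handler code: the class name `SlidingWindow` and its fields are hypothetical, and the two hardware steps (programming the Mediator, flipping the MMU descriptors) are only represented by a comment.

```python
class SlidingWindow:
    """Toy model of the bus-error path: one valid window, all others invalid."""

    def __init__(self, window_size=8 * 1024 * 1024):
        self.window_size = window_size
        self.current_base = None  # PCI base currently mirrored at $200000
        self.slides = 0           # how many times the window had to move

    def access(self, addr):
        """Translate a logical PCI address, sliding the window on a miss."""
        base = addr & ~(self.window_size - 1)  # round down to window boundary
        if base != self.current_base:
            # A real handler would, at this point, program the Mediator,
            # mark the new piece valid, mark the old piece invalid, then rte.
            self.current_base = base
            self.slides += 1
        return 0x200000 + (addr - base)


w = SlidingWindow()
w.access(0x80000000)   # first access: the window moves to $80000000
w.access(0x80000100)   # same window: no slide needed
w.access(0x90000000)   # outside the window: slide to $90000000
```

In this model only accesses that leave the current window cost a slide, which matches the expectation that small, local 68K code should rarely trigger one.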

silly question, why is the memory not accessed in the window directly
with its hardware address, thus why the back-and-forth of physical and
logical mapping? After all, a program can only access a single PCI
device at a time due to windowing in first place, do I get this right?

I need to run 68K code from within PCI memory. As Amiga code is mostly
small, I don't expect a lot of switching of the MMU and Mediator window,
but that's why I need it to be as fast as possible.

If I may: It is probably easier not to use descriptors in first place.
The library can give you access to the CPU pipeline, i.e. you can get
access to the data the code has written, and the data the code has just
read. You can define an exception hook that picks up the data that has
just been written, its size, and perform the transfer to the target
address within the window manually.

The downside is that this does not work for movem, i.e. it can only
catch up byte, word or long-word reads or writes.

I need to run code inside the PCI memory. I think the above method is
indeed the best if you just shovel data around.

If nothing helps, I can probably add another flag to SetPageProperties()
to have "non-bundled" indirect descriptor setup, though this might take
a while until I have an implementation.

Let me know how it goes and whether I can do something for you to
support you further.

I will, thanks for all your comments so far!


DvdBoon avatar DvdBoon commented on June 17, 2024

Using loads and loads of indirect pages, one page at a time with SetProperties, takes too long, unfortunately (each call to SetProperties becomes slower and slower with every new page added).

I am guessing doing a range using SetProperties is MUCH more efficient. I still like the indirect approach a lot. Maybe just set up with a single indirect page descriptor using the mmu.library and then adjust the table with all the needed indirect page descriptors outside of the mmu.library.


DvdBoon avatar DvdBoon commented on June 17, 2024

Looking at the documentation of the 040/060, it may be easier to just do a normal remap for each 8 MB block. Set them all invalid except for the initial one and just change the 16 pointer-level table descriptors which make up each 8 MB block to mark them either valid or invalid when needed.

As far as I can see, the mmu.library does not (directly) support modification of tables.

This probably won't work on other MMUs.


rkujawa avatar rkujawa commented on June 17, 2024

If you want my opinion, I think doing anything that would rule out 68030 usage is not very wise. The 68030 is still hugely popular with Mediator owners (including me...). Maybe we could have separate methods of solving this problem for the 68030 and the 68040/68060.

I have to admit I just don't know enough about 68k MMUs to be of any help here. I only dug a bit in NetBSD's 68k MMU code but it is very complex (over 66kB) and in NetBSD every process is living in a separate virtual address space - it's a design completely different than AmigaOS.

Again, maybe @thorfdbg can add the necessary functions to mmu.library, since in one of the above comments he expressed the will to help with this.

I'd prefer to avoid messing with the MMU directly, I'm 100% sure that would cause further problems. Note that some people already have mmu.library installations, anything we do here would create a conflict. Additionally, various CPU-board specific libraries mess with the MMU (68040.library etc.), that's why mmu.library is providing own implementation of CPU libraries...


thorfdbg avatar thorfdbg commented on June 17, 2024

On 01.02.2016 22:29, DvdBoon wrote:

Using loads and loads of indirect pages using 1 page at the time with
SetProperties takes too long, unfortunately (each access to
SetProperties becomes slower and slower for every new page added).

Try to setup the properties in inverse order, i.e. start with the
highest page. This should give you linear performance instead of
quadratic running time.
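
Why the order matters is not spelled out here. One possible explanation, purely an assumption and not something confirmed about mmu.library's internals, is that the mappings live in an address-ordered list that is searched from the lowest address. The Python sketch below counts node visits under that hypothetical model; `insertion_steps` is a made-up name.

```python
def insertion_steps(addresses):
    """Count node visits when inserting into an ascending list scanned from the head.

    Hypothetical model only; it is not taken from mmu.library.
    """
    ordered = []
    steps = 0
    for addr in addresses:
        pos = 0
        # Walk from the lowest address until the insertion point is found.
        while pos < len(ordered) and ordered[pos] < addr:
            pos += 1
            steps += 1
        ordered.insert(pos, addr)
        steps += 1  # the insertion itself
    return steps


pages = [0x80000000 + i * 0x1000 for i in range(1000)]
ascending = insertion_steps(pages)
descending = insertion_steps(reversed(pages))
```

For 1000 pages this model counts 500500 visits in ascending order against 1000 in descending order, which is the quadratic-versus-linear difference described above.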

I am guessing doing a range using SetProperties is MUCH more efficient.
I still like the indirect approach a lot. Maybe just set up with a
single indirect page descriptor using the mmu.library and then adjust
the table with all the needed indirect page descriptors outside of the
mmu.library.

I believe I still have to think of a good design for all this. The
current design is not exactly suitable for the use case you have. There
are a couple of issues, however: First, you cannot really control the
page size(s), that's up to the environment to select, and also hardware
dependent. Thus, even if I give you access to a higher level in the page
table, it is still unclear whether the page size is sufficient for you.
Or whether such a descriptor level exists in first place. The
Apollo/Vampire core will probably only have a linear (one-level) page
addressing model, so there will be no higher level at all.

Second, what happens if a context switch occurs and the tables are
exchanged. I need to carry modifications over. In principle possible if
you administrate the level yourself.

Third, what happens on DMA transfer. I need to cache-inhibit the
boundary pages due to the Amiga bus design, so one way or another the
library needs to know where your pages are and how to modify them.

Last, how to handle top-level modifications of the page layout.

I do not have an immediate answer to these questions at this point, and
coming up with a good design will probably take a while - even more so
as I'm busy with a lot of other things.

But anyhow, thanks for looking into this. I'm confident we'll come up
with something, even though it might probably take a bit longer than you
might have planned - sorry for that.

Greetings,
Thomas


DvdBoon avatar DvdBoon commented on June 17, 2024

All the help is appreciated :-). In the end, the only thing I really want is that I get notified when an access occurs outside the current Mediator window (being 8MB or 4MB depending on jumper) so I can slide this Mediator window to reflect the correct PCI memory range.

I constrain this to only PCI memory addresses $80000000-$A0000000 to make life easier. 512MB should be enough for now.

You say something about page sizes? I don't want to control them. I just noticed that I can also mark the pointer tables as invalid. As I see it, there are 128 root-level descriptors controlling 32MB each (which is too big); the next level down are pointer-level descriptors, which control 256KB each. So I need to mark 32 (not 16, I miscalculated...) of those as invalid to invalidate all the pages of one 8MB window (at least on the 040; I assume the 060 is the same, I don't know about the 030). This also assumes a page size of 4K; I haven't looked at what happens with 8K pages.

Or am I wrong here?
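
As a check on the arithmetic, here is a short Python sketch based on the 68040 translation tree as I understand it (7 root index bits, 7 pointer index bits, and the remaining bits split between page index and page offset). The function names are hypothetical, and the 68030 is not covered because its table layout is configurable.

```python
ROOT_BITS = 7     # 128 root-level descriptors
POINTER_BITS = 7  # 128 pointer-level descriptors per root entry


def coverage(page_size=4096):
    """Return (bytes per root descriptor, bytes per pointer descriptor)."""
    page_bits = page_size.bit_length() - 1                  # 12 for 4K, 13 for 8K
    index_bits = 32 - ROOT_BITS - POINTER_BITS - page_bits  # 6 for 4K, 5 for 8K
    per_pointer = (1 << index_bits) * page_size
    per_root = (1 << POINTER_BITS) * per_pointer
    return per_root, per_pointer


def pointer_descriptors_per_window(window_size, page_size=4096):
    """How many pointer-level descriptors span one Mediator window."""
    return window_size // coverage(page_size)[1]
```

Under these assumptions each root descriptor covers 32MB and each pointer-level descriptor covers 256KB for both 4K and 8K pages, so an 8MB window spans 32 pointer-level descriptors and a 4MB window spans 16.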

I'll try the reverse addressing you mentioned first to stay mmu.library compliant.


DvdBoon avatar DvdBoon commented on June 17, 2024

As an addition: (if the reverse addressing does not work)

So I am thinking about doing 64 (in the case of 8MB) remaps of 8MB each with every one pointing to $200000 and marked as being valid. The level above it (the 32 table pointers for each 8MB window) I'll mark as invalid.

When a hit occurs, I set the correct window (setting the (U)DT bits) on valid and slide the mediator window to the correct PCI memory address. The window we slide away from we mark as invalid.

No need to use INDIRECT. Just need access to the pointer tables.


thorfdbg avatar thorfdbg commented on June 17, 2024

On 02.02.2016 10:44, DvdBoon wrote:

As an addition: (if the reverse addressing does not work)

So I am thinking about doing 64 (in the case of 8MB) remaps of 8MB each
with every one pointing to $200000 and marked as being valid. The level
above it (the 32 table pointers for each 8MB window) I'll mark as invalid.

When a hit occurs, I set the correct window (setting the (U)DT bits) on
valid and slide the mediator window to the correct PCI memory address.
The window we slide away from we mark as invalid.

No need to use INDIRECT. Just need access to the pointer tables.

Which you don't. (-: The problem is really that I cannot even ensure
that there are pointer tables in first place. The current abstraction
is really a list of pages, not a tree, and it might really happen that
the MMU table is not a tree at all. As said, I need to think about it
how to abstract this and come back to you.

Greetings,
Thomas


DvdBoon avatar DvdBoon commented on June 17, 2024

For my information, you mean that at the mmu.library level there is no tree, just a list of pages? OK, I get it now, I think :-) I could set it all up using the mmu.library and then directly poke at the pointer tables in hardware, but an outside RebuildTree would wipe all of that out.

I'll come back here once I have tried the INDIRECT approach starting at the top of the address range.


DvdBoon avatar DvdBoon commented on June 17, 2024

Setting the properties in reverse order worked. Only the RebuildTree takes forever now.


thorfdbg avatar thorfdbg commented on June 17, 2024

On 02.02.2016 11:14, DvdBoon wrote:

I'll come back to here when I tried the INDIRECT approach with starting
at the top of the address range.

As a related question: How large is the window you want to map in? Does
it change in size? And how many different configurations would you need
to support?

Thanks,
Thomas


rkujawa avatar rkujawa commented on June 17, 2024

The Mediator 1200 window into PCI memory space is 4MB or 8MB, depending on hardware configuration (a jumper, actually). The size does not change.

If you want, I can give you a detailed description of how it works.


DvdBoon avatar DvdBoon commented on June 17, 2024

I'd like to map 64 8 MB windows or 128 4 MB windows to 1 static 8 or 4 MB window. If all windows are in place, just switching them on and off (valid/invalid) would be the fastest, I think.

OR

I'd like to map 1 8 MB window or 1 4 MB window to 1 static 8 MB or 4 MB window, if creating and destroying a single remap on the fly is fast enough. In this case the whole 512 MB is invalid except for the 8 or 4 MB remap which is created on the fly. The previous one is destroyed (made invalid again).

These 8 MB or 4 MB windows are on an 8 or 4 MB boundary, starting from $80000000 up to $A0000000.

The static window is $200000-$A00000 (Mediator Z2/PCI memory window; it can also be $200000-$600000 in the case of 4 MB).

The window size is determined at power up (using a jumper on the mediator) and does not change during operation.

Extra info as to why the 512 MB:

512 MB is wanted as to address 256 MB of Radeon GFX memory and 128 MB Sonnet 7200 memory (or 128 Radeon / 256 Sonnet or 32 Voodoo / 256 Sonnet etc etc). These must be on their own 256 MB and 128 MB etc boundary so actually at least 128 MB is wasted. (small part in use by PCI config registers). This gfx/sonnet memory (~360 MB) has to hold PPC and 68K code.

So for example:
$80000000-xxx PCI config space (BARs etc). not much else, mostly within the first 128k
$88000000-$90000000 Sonnet memory 128 MB
$90000000-$A0000000 Radeon Gfx 256 MB
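
The boundary requirement follows from how PCI memory BARs work: a region of a given power-of-two size sits on a multiple of that size. A small Python check of the example layout above (the dictionary and function names are hypothetical, the addresses and sizes are the ones from the example):

```python
MB = 1024 * 1024

# Example layout from the comment above: name -> (base address, size in bytes).
REGIONS = {
    "sonnet": (0x88000000, 128 * MB),
    "radeon": (0x90000000, 256 * MB),
}


def is_naturally_aligned(base, size):
    """True if the base address is a multiple of the region size."""
    return base % size == 0


def unused_bytes(range_size=512 * MB):
    """Bytes of the 512MB range not claimed by the two large regions."""
    return range_size - sum(size for _, size in REGIONS.values())
```

Both example regions pass the alignment check, and the two together leave 128MB of the 512MB range for everything else, in line with the "at least 128 MB is wasted" estimate.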


thorfdbg avatar thorfdbg commented on June 17, 2024

On 02.02.2016 22:34, Radosław Kujawa wrote:

The Mediator 1200 window into PCI memory space is 4MB or 8MB, depending
on hardware configuration (a jumper, actually). The size does not change.

If you want, I can give you a detailed description of how it works.

I believe I understand how it works. I'm just trying to get the
operating parameters straight to come up with a useful design for it.

Part of the problem is that you would need to do the switch under
uncontrolled conditions. The mmu.library can catch an invalid access
just fine, and react on that by calling user level code to perform some
action on that. The call into user code from the bus error recovery is
completely transparent and does not require any higher level Os magic.

However, most high-level mmulib interface functions use semaphore
locking to be thread (or rather "task"-)safe. That is not a problem -
the problem is that a task that runs into a semaphore will necessarily
break the Forbid() or Disable() state of exec (naturally, what else can
it do), and this implies that the task that tried to make the invalid
page access, and fixing that by a higher level accessor function would
potentially need to be halted because some other task is working on the
mmu tables at the very same time. And this "halting" might potentially
break a protocol the first task is implementing.

IOWs, the only chance to get the page swap done is by low-level
functions, and that's the part which is a bit "touchy" because there is
no abstraction of "page level" in the library, and because high-level
definitions can overwrite low-level definitions at any time. (And there
cannot be a page-level abstraction because it is quite likely that some
future extensions or developments will not have a tree-based MMU).

Greetings,
Thomas


rkujawa avatar rkujawa commented on June 17, 2024

@DvdBoon

I'd strongly suggest using the method discussed in this thread to access the PCI memory space only. Remember that the window within Z2 memory space (0x200000-0xA00000) is only for accessing the PCI memory space.

To access the PCI configuration space through this MMU mapping, we'd also need to mess with the second Mediator board, the one within Z2 I/O space. This board is internally divided into two 64kB spaces. However, the first 64kB is used to access Mediator bridge setup registers. The second 64kB space is shared between PCI configuration and PCI I/O space. Whether you see config space or I/O space there depends on register at offset 0x7 within the bridge setup space.

Long story short, it would make things even more complex. I'm afraid touching it will create further problems, as only pci.library knows the current state of this area.

On a side note, I really should write an article about low level Mediator programming...


DvdBoon avatar DvdBoon commented on June 17, 2024

I think I made things confusing with saying config space. I meant special registers set up by the PCI BARs like the configuration block (EUMB) of the Sonnet which is placed in PCI memory (MEMSPACE, ROMSPACE), not config memory (IOSPACE).

Anyway, I don't need to mess with that if the pci.library is running and sets it all up.


DvdBoon avatar DvdBoon commented on June 17, 2024

Question @rkujawa

0xAddress >> 0x10 & 0xFF80 = value to store at $EA0002 to slide window (byte-swapped)
But if I read $EA0002 I see the value 0x2f90, which gives 0x902f after byte-swapping. This value is not 8 MB or 4 MB aligned. Does pci.library not do & 0xFF80 (or & 0xFFC0 for 4 MB)? Or do I also need to AND again after reading and byte-swapping?
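
The formula and the read-back question can be written out as a short Python sketch. It only does the arithmetic and touches no hardware; the function names are hypothetical, and masking the read-back value is a guess that the low bits are don't-care, not documented Mediator behaviour.

```python
def bswap16(value):
    """Swap the two bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def window_register_value(pci_addr, mask=0xFF80):
    """Value to store at $EA0002: byte-swapped (addr >> 16) & mask.

    Use mask=0xFF80 for the 8MB window and mask=0xFFC0 for the 4MB window.
    """
    return bswap16((pci_addr >> 16) & mask)


def decode_readback(raw, mask=0xFF80):
    """Byte-swap a value read from the register and drop the low bits.

    Assumption: the bits outside the mask carry no window information.
    """
    return (bswap16(raw) & mask) << 16
```

With the value from the question, `decode_readback(0x2F90)` yields 0x90000000, i.e. the Radeon range in the example memory map, which would fit the guess that the low bits can simply be masked off.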


rkujawa avatar rkujawa commented on June 17, 2024

0xAddress >> 0x10 & 0xFF80 = value to store at $EA0002 to slide window (byte-swapped)

That's what I did in the Mediator 1200 NetBSD driver and it worked, but...

But if I read $EA0002 I see a value 0x2f90. which gives after byte-swapping 0x902f. This value is not 8 MB or 4 MB aligned. pci.library does not do & 0xFF80 (or & 0xFFC0 for 4 MB)?

To be honest, I reverse engineered pci.library back in 2012 and I don't remember now what exactly it does. I suspect two possibilities:

  • Mediator doesn't actually need an aligned address in the window register. Unlikely but possible...
  • This register is probably not supposed to be read, or if it is, you should AND the result with 0xFF80/0xFFC0? I'd guess it represents the state of all PCI address lines not connected directly to the window area. So only the highest 9 bits of the PCI address are in fact connected to this register (in the case of an 8MB window), bits 23-31. Of course they are stored in a 16-bit register, so they become bits 7-15. But do the low bits 0-5 have any meaning? I don't know.

Or do I also need to AND again after reading and byteswapping?

Most likely, but you should probably do some experiments with cleanly booted system (i.e. no pci.library running). Write some values to this register, read them back, see what you get.


DvdBoon avatar DvdBoon commented on June 17, 2024

I hope to have some proof-of-concept code ready later this evening, just to test whether this approach will work. It is now a hybrid of using the mmu.library (detecting the MMU type and page size, setting up the MMU, adding a segfault handler and activating it) and direct poking in the table pointers to set stuff invalid/valid very quickly (bit 1 of the UDT bits).

Preliminary tests are promising so far. I haven't written a complete segfault handler yet, but the set-up is complete.

If the pci.library does not do 4 MB/8 MB alignment I also need to take over the gfx window handling by the pci.library (getting a step closer to eliminating its use altogether almost).

I hope that it will be 100% mmu library compliant in the future. For now, I probably need a hook on RebuildTree to detect tree rebuilds which will reset the UDT bits probably.


thorfdbg avatar thorfdbg commented on June 17, 2024

On 03.02.2016 13:16, DvdBoon wrote:

I hope to have some proof of concept code ready later this evening just
to test whether this approach will work. It is a hybrid now of using the
mmu.library (detecting mmu type, page-size, setting up the MMU, adding
an segfault handler and activating it) and direct poking in the
table-pointers to set stuff invalid/valid very quickly (bit 1 of the UDT
bits).

Please don't do that. It might seem to work, but it will not, and it
will cause a lot of problems. The bottom page table layout of a
mmu.library generated mmu table is more than just the table
descriptors. If you allocate your page table yourself, the additional
information the mmu.library keeps at the page level is not there, and
hence the library will overwrite innocent memory.

Just to let you know what is there in addition:

*) The page level contains for each page a DMA cache disable counter.
This counter is incremented each time a DMA transfer is initiated, and
then causes the corresponding page to go into cache-inhibit. (more or
less). It is decremented when the page goes out of DMA. If the counter
reaches zero, the original cache state is restored.

*) The page level also includes (besides the native MMU descriptor)
the abstract descriptor you read by GetPagePropertiesA(). This abstract
descriptor also keeps the cache state the page should have according
to the library, but which it cannot have due to pending DMA transfers.

*) An additional user-pointer that can be deployed for any type of VM
application.
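
The counting rule from the first item can be modelled in a few lines of C. This is only a sketch of the behaviour as described here; the structure layout and all names are hypothetical, since the real page-level layout of the mmu.library is undocumented.

```c
#include <assert.h>
#include <stdint.h>

enum { CACHE_INHIBIT = 0, CACHE_COPYBACK = 1 };

/* Hypothetical per-page bookkeeping, NOT the real mmu.library layout. */
struct page_info {
    uint32_t native_descriptor; /* what the MMU hardware reads */
    uint32_t wanted_cache_mode; /* cache mode the page should have */
    uint32_t active_cache_mode; /* cache mode currently in effect */
    uint16_t dma_count;         /* number of DMA transfers in flight */
    void    *user_data;         /* free for VM applications */
};

/* A DMA transfer starts: count it and force the page cache-inhibited. */
static void dma_begin(struct page_info *p)
{
    p->dma_count++;
    p->active_cache_mode = CACHE_INHIBIT;
}

/* A DMA transfer ends: restore the wanted mode once no transfer is left. */
static void dma_end(struct page_info *p)
{
    assert(p->dma_count > 0);
    if (--p->dma_count == 0)
        p->active_cache_mode = p->wanted_cache_mode;
}
```

The point of the model is that the wanted cache mode and the active cache mode are separate values, which is why a table built outside the library lacks information the library expects to find.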

So, in other words: No, you cannot simply replace branches of the MMU
table with your own branches and hope this works. It will not.

Please do not attempt this.

How exactly the page table is populated and where this information is
stored is an implementation detail. It is not documented, and might
change at any time.

As said, I need to come up with a design for your problem, but replacing
table pointers does not work.

Greetings,
Thomas

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

In short, as I am on the move: I am not replacing table pointers. I set
them up as valid using the mmu.library and then directly (using urp, srp)
toggle a bit in the UDT to mark all except one window as invalid. But for
now I guess I'll give it a rest :-)

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

I meant I build the whole tree using mmu library with the PAGES as being valid and toggle bits in the TABLE POINTERS (a level above the pages). But directly IN the tree.

Just to make things clear (I hope).

from sonnetamiga.

thorfdbg avatar thorfdbg commented on June 17, 2024

On 03.02.2016 at 15:31, DvdBoon wrote:

I meant I build the whole tree using mmu library with the PAGES as being
valid and toggle bits in the TABLE POINTERS (a level above the pages).
But directly IN the tree.

Yes, same problem. The page level needs to be built by the library or
you're going to be in deep trouble. A "quick and dirty trick"(tm) would
be to create not one context, but eight MMU contexts and exchange their
upper level pointers between the contexts. This still does not keep the
upper and the lower abstraction level in sync, but at least it gives the
library a valid MMU table to allow switching.

Allocating parts of the MMU table itself is a bad idea(tm) as I
already said.

Anyhow, I don't see the need for any rush at this point, so I would
avoid any hacky approach at all. There needs to be a solution that is
cleanly integrated into the overall design and not a work-around.

Greetings,
Thomas

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

Anyhow, I don't see the need for any rush at this point, so I would
avoid any hacky approach at all.

Agreed about that, we already have enough hacks and workarounds.

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

Hey @thorfdbg did you think of a clean solution that could be implemented using mmu.library? A month has passed and the discussion has kind of stalled ;).

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Had a quick look today and found more complications. Because the Mediator by default puts the extra memory above $80000000, some of the WarpOS programs fail in general. For example, CyberPi loads an address to be used in a DOS Write() and does a bmi check on it. This branch will now always be taken, so nothing is printed when the string is in an area >$80000000.
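
A small C sketch of why that check misfires: a bmi branches when bit 31 (the sign bit) of the tested long word is set, which is true for every address from $80000000 upwards. The function name is made up for this example.

```c
#include <stdint.h>

/* Mirrors what a 68K bmi tests after loading a long word: is bit 31 set? */
static int looks_negative(uint32_t addr)
{
    return (addr & 0x80000000u) != 0;
}
```

So a pointer such as 0x90001234 is indistinguishable from a negative error code, while anything below 0x80000000 passes the check.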

from sonnetamiga.

thorfdbg avatar thorfdbg commented on June 17, 2024

Hi folks,

no, I haven't forgotten you. I'm unfortunately just very busy these days
and have only rarely enough time to work on the Amiga.

So yes, I did some work on the mmu.library and I believe that I have now
a design that might work. At this time, it is only implemented, not
tested. I will do some elementary tests this and next week, though
development may continue at the same glacial speed it did in the past -
sorry again for this.

So here is how it will work:

  1. Build N additional MMU contexts where each defines the physical to
    logical mapping for the memory window of the Mediator target window(s).

  2. Reserve an additional four byte (one long word) pointer that will
    keep the active window. Set this to NULL initially. This is a
    "MMUContext **".

  3. In the default MMU context, mark the range into which the windows
    will appear as MAPP_WINDOW. This will be a MMU property flag in the next
    version. Additionally, one tag has to be set to the pointer above to
    identify the window (there can be an arbitrary number of windows and this
    pointer identifies the window).

  4. Call "RebuildTree()". This will build a new MMU tree with the
    "window" area as invalid.

  5. When you need to change the memory layout, install one of the N
    additional contexts with MapWindow(). This call is interrupt and
    supervisor-callable. So you probably want to install a page access
    handler in the default context that registers the access and then calls,
    as reaction MapWindow().
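
To make step 5 a bit more concrete, here is a plain C model of the decision the page access handler has to make. It does not call the mmu.library at all; the region base, the 8MB window size, the number of contexts and every name in it are assumptions chosen for the example.

```c
#include <stdint.h>

#define WINDOW_SIZE 0x00800000u  /* assumed: 8MB Z2 window */
#define REGION_BASE 0x40000000u  /* assumed start of the windowed range */
#define NUM_WINDOWS 16u          /* assumed: 128MB / 8MB contexts */
#define NO_WINDOW   (-1)

/* Stand-in for the "active window" long word from step 2. */
static int active_window = NO_WINDOW;

/* Which of the N contexts covers this address, or NO_WINDOW if none does. */
static int window_for_address(uint32_t addr)
{
    if (addr < REGION_BASE)
        return NO_WINDOW;
    uint32_t index = (addr - REGION_BASE) / WINDOW_SIZE;
    return index < NUM_WINDOWS ? (int)index : NO_WINDOW;
}

/* Model of the access handler: returns 1 if the access can be retried. */
static int handle_access_fault(uint32_t fault_addr)
{
    int wanted = window_for_address(fault_addr);
    if (wanted == NO_WINDOW)
        return 0;              /* a genuine fault outside the windowed range */
    active_window = wanted;    /* the real handler would call MapWindow() here */
    return 1;
}
```

In this model a fault at 0x41000000 selects context 2, and a fault outside the range is passed on untouched.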

There are a couple of restrictions, though. The additional contexts and
the mapping of the default context must "fit" together to make this
possible, i.e. the separation of the contexts into "nodes" must be the
same, at the same addresses. This is something I probably still have to
think about.

Nor do I know whether "MapWindow" will be fast enough. It is not
exactly a long function, but it might be possible to come up with an
additional call that pre-computes some of the internal values and stores
them in an opaque structure for easier and faster reuse.

DMA transfers, however, will interact fine even when the mapping
changes, so the system remains consistent. The system should hence
interact nicely with the rest of the MMU system.

I'll probably think about the quirks again, and it will still take a
while until I'll have a workable version for testing, and probably a
demo, but I just want to let you know where I am, approximately, and
give you a chance to give some feedback on this if this design does not
fit with your application.

Happy Easter!

Greetings,
Thomas

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Happy Easter Thomas!

Thanks for the update. Speed will indeed be crucial in the end. I still suspect that the Mediator 8MB window can be split into 2x4MB, with one half kept on the gfx memory, because when I am in COP I can read the addresses (after envi.m) at $80000000 (default window) and at $90000000 (gfx memory), but not at, let's say, $88000000. It would make things a bit easier. (COP is the only MMU program that works on my system when MMU=yes is set for the Mediator.)

As said, one of the things I thought was the result of pci.library/sonnet.library not working properly was actually an artifact of the code/data being in the negative (if you look at addresses as signed values) address range (>$80000000). I'm not sure how this affects some of the other WarpOS programs. I have to look further into this. I would rather use, let's say, the $40000000-$60000000 range.

Focus on my side is to get the virtual signal pool going (PPC and 68K sharing the same signals) before I shift my attention back to the A1200 (I've contacted Sam Jordan for this). So take all the time you need; you're not slower than my speed of coding ;-)

Regards,

Dennis.

from sonnetamiga.

thorfdbg avatar thorfdbg commented on June 17, 2024

Hi David,

short question: Did my update of the mmu.library from last week reach
you? I'm not sure whether this email address of yours allows the attachment
of binaries.

Greetings,
Thomas

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Hello,

If by David you mean me, then no. I have not seen any binaries. Which e-mail address did you use?

Regards,

Dennis

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Coincidentally, I tried some stuff yesterday with manually moving the window while in an interrupt (I am guessing it is done automatically, by another interrupt, when running normal code) and got some strange results, like crashing when just letting the program run, while it finishes correctly when a certain amount of delay is inserted... (all without mmu.library, btw).

from sonnetamiga.

thorfdbg avatar thorfdbg commented on June 17, 2024

On 10.05.2016 at 13:57, DvdBoon wrote:

Hello,

If by David you mean me, then no. I have not seen any binaries. Which
e-mail address did you use?

This one - on github. You probably want to send me your private mail (or
reply by that) so we don't have to go through github for communication.

Greetings,
Thomas

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

It's dennsvdboon at gmail

from sonnetamiga.

thorfdbg avatar thorfdbg commented on June 17, 2024

On 10.05.2016 at 14:34, DvdBoon wrote:

You should have mail now. Let me know whether this worked.

Greetings,
Thomas

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Sorry Thomas, I made an error. It's dennisvdboon. The 'i' was missing.

from sonnetamiga.

thorfdbg avatar thorfdbg commented on June 17, 2024

On 10.05.2016 at 15:24, DvdBoon wrote:

Sorry Thomas, I made an error. It's dennisvdboon. The 'i' was missing.

Ok, third try. (-: Please check.

Greetings,
Thomas

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Received an update of the mmu.library from Thomas. I'll be trying to get some results on my A1200 this weekend.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

The PPC now starts up on the A1200 without the use of the MMU routines from the mediator (ENV:Mediator/MMU = No). It sets up its memory block of 128MB and reports it to the system. The next step is to use Thor's MMU library to shift the Z2 window around within this memory block when needed. I foresee a problem with 68K code from Z2-Window1 trying to access data from Z2-Window2. Both code and data needed by that code need to be located in 1 window. Any ideas welcome.

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

As we know, AmigaOS exec will by default allocate the memory with the highest priority first. Typically, this is the memory on the turbo board, unless we are talking about a hunk marked explicitly as Sonnet PPC/Mediator DMA. If we have all hunks in the executable marked correctly, chances of such problems are minimal. But then, if we use some automated method to patch the binary and just add the Sonnet memory extended attribute to every hunk... Of course 68k code might also end up in Sonnet memory.

Second thing... I was under impression that MMU can handle the situation where some service routine has to be executed to fetch both the code and data needed by the running application. If some 68k task is running on top of this virtual 128M address space, why does it care how often are we changing the window position? It only sees the virtual addresses. Maybe I'm missing something here?

Besides, Mediator's memory management routines do seem to work in this exact situation?

from sonnetamiga.

thorfdbg avatar thorfdbg commented on June 17, 2024

Hi folks,

As we know, AmigaOS exec will by default allocate the memory with
highest priority first. Typically, this is the memory on turbo board.
Unless we are talking about hunk marked explicitly as Sonnet
PPC/Mediator DMA. If we have all hunks in the executable marked
correctly, chances of such problems are minimal. But then, if we use
some automated method to patch the binary and just add the Sonnet memory
extended attribute to every hunk... Of course 68k code might also end up
in Sonnet memory.

Again, placing memory that is mapped by the MMU is a bad idea, and I
would highly recommend not doing so. The problem is DMA. Such memory
cannot be reached by DMA safely, even with all MMU library precautions
on exactly this matter. The problem is that many DMA devices do not use
OS functions correctly to translate logical addresses (as seen by the
CPU) to physical addresses (as seen by the DMA device).

The problem becomes even worse in case the memory can go away any time,
even under the feet of a device performing active DMA.

Second thing... I was under impression that MMU can handle the situation
where some service routine has to be executed to fetch both the code and
data needed by the running application.

The MMU is not the problem, the processor is. As far as I remember, the
68060 uses a restart model, i.e. in case it detects that an instruction
may create a bus error on its execution, it will first trigger the
access fault, and when returning, will execute the instruction again. As
far as I remember, there is no internal state in the 68060 (unlike
earlier models), so it has to re-fetch the instruction again. The cache
will, of course, be flushed as part of the window swap.

IOW, in such a case, you end up with a ping-pong of bus-errors, and
hence a dead-lock.

The 68030 is much more microcode driven than the 68060, and there it may
actually work because the instruction pipeline is stored as part of the
exception stackframe, but on the 68060, this is not the case - it's much
more simplified and streamlined.

If some 68k task is running on
top of this virtual 128M address space, why does it care how often are
we changing the window position? It only sees the virtual addresses.
Maybe I'm missing something here?

You're probably missing the situation where the data an instruction
addresses is in a window different from the one where the instruction
itself is.

Besides, Mediator's memory management routines do seem to work in this
exact situation?

I highly doubt this. I would rather believe that they simply ignore
the problem.

Greetings,
Thomas

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

I'm open for any other ideas. The way it is set up now on the A3000/A4000:

PPC sets up physical memory from 0x0 - 0x10000000
PPC MMU remaps this to logical 0x70000000 - 0x80000000
Any PPC access to logical 0x70000000 is relocated to physical 0x0

The Sonnet ATU (address translation unit) maps 0x70000000 in PCI memory to physical 0x0 in Sonnet memory. Sonnet memory is now visible in PCI memory starting from 0x70000000. Any access from the 68K to 0x70000000 actually gets through to Sonnet memory 0x0 thanks to the Sonnet ATU. Any access from the PPC to 0x70000000 gets remapped to 0x0 thanks to the PPC MMU. So both the PPC and the 68K actually access 0x0 when addressing 0x70000000.

The Z3 window is from 0x60000000-0x80000000 and by default it also points to the same addresses (so 0x70000000 is actually 0x70000000). The MMU is not used by the pci.library (not needed).

The Z2 window is from 0x200000-0xa00000. It is mostly in a state where it points to gfx mem/gfx hardware registers, courtesy of the gfx driver. (0x200000 points mostly to either 0x80000000 or 0x90000000; addresses are a little bit different than on Z3.) It can move to other PCI addresses if need be. Let's for the sake of argument say that, again, Sonnet memory is at 0x70000000. So if we want to access 0x70000000 as in our previous example, we need to shift the Z2 window to this address and access 0x200000, OR access 0x70000000 with the 68K MMU redirecting 0x70000000 to 0x200000. The latter keeps addressing consistent for both processors (pointers etc.), and it is what I want to achieve with the mmu.library.
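
The address arithmetic in the last few paragraphs can be sketched in C as follows. It uses the example values from above (Sonnet memory at 0x70000000, an 8MB Z2 window at 0x200000) and additionally assumes that the window has to be placed on a boundary of its own size. The function names are made up for this example.

```c
#include <stdint.h>

#define SONNET_LOGICAL_BASE 0x70000000u  /* where both CPUs see Sonnet memory */
#define Z2_WINDOW_BASE      0x00200000u  /* 68K side of the Mediator window */
#define Z2_WINDOW_SIZE      0x00800000u  /* 8MB */

/* Offset into Sonnet memory, i.e. its physical address, since that starts at 0x0. */
static uint32_t sonnet_physical(uint32_t logical)
{
    return logical - SONNET_LOGICAL_BASE;
}

/* PCI address the Z2 window must be moved to so that it covers `logical`
   (assumes the window is aligned to its own size). */
static uint32_t z2_window_target(uint32_t logical)
{
    return logical & ~(Z2_WINDOW_SIZE - 1u);
}

/* Address inside the Z2 window at which the 68K then finds `logical`. */
static uint32_t z2_alias(uint32_t logical)
{
    return Z2_WINDOW_BASE + (logical & (Z2_WINDOW_SIZE - 1u));
}
```

For 0x70923456 this gives a window target of 0x70800000 and a Z2 alias of 0x323456. The MMU redirection does the second step transparently, so the program keeps using 0x70923456.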

WarpOS programs get loaded inside this 0x70000000-0x80000000 area, as we cannot readily distinguish 68K code from PPC code (except maybe for the first code hunk, which is always 68K). So the Z2 window needs to shift to different spots within 0x70000000-0x80000000 for the 68K to execute code or fetch data if need be.

Indeed, maybe loading data from disk by WarpOS programs would give problems... But it would normally go through the 68K DOS Read(). I don't know how DMA behaves in this case.

A couple of things:

Data is mostly manipulated by the PPC (for speed, why else have a PPC) and not the 68K.
68K code is kept to a minimum (mostly startup).
Mixed 68K/PPC libraries could be a problem.

I'll know after trying :-)

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

I'm sure the issues Thor is mentioning are valid, but this discussion is in context of Mediator 1200 series only (we don't need to mess with MMU setup on big box Mediators). In case of Mediator 1200 we don't have to worry much about DMA. Nothing besides the CPU will access Mediator 1200. Even if some A1200 turbo boards have DMA-capable devices (like SCSI controllers), they typically only DMA to on-board memory on turbo card (no one ever expected that something will sit between the turbo board and main board, which has only chip RAM). Of course I could imagine some badly written driver might try that, but I think the risk is minimal. It would be a problem if we were talking about A3000/A4000, where DMA can happen to memory located anywhere.

In context of Mediator itself, DMA can happen on the PCI bus only. Since PCI bus has separate address space (shared only through the window with 68k space), it shouldn't bother us. Such DMA-capable driver has to be specially written for the Mediator and would never use 68k addresses (since PCI device can't access 68k side at all).

Btw. On a side note, I'm sure that you know but I wanted to stress that Z2 window address can change (it's not hardcoded to be at 0x200000, but is subject to normal AutoConfig mechanism). Of course when the window is 8MB there's just no more space available so it does have to fit at 0x200000 or it won't work at all.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Just wanted to show (hopefully) how it works on the Z3 Amigas and what the problems are when transposing this on the A1200.

Also, concerning AutoConfig, only the last few updates were regarding correct error handling and giving out error messages ;-) I still need to add window size checking, for example.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Letting some time pass between various attempts to get it working on the A1200 led to the realization that maybe the 8MB window is actually made up of 2x 4MB windows.

And indeed, some testing showed that this actually is the case. I successfully pointed the first 4MB window to 0x98000000-0x983fffff and the second 4MB window to 0x99000000-0x993fffff (so 0x200000 pointed to 0x98000000 and 0x600000 to 0x99000000). @rkujawa It was indeed bit 12 which is the selector.

This makes it a lot easier regarding code and data in different windows.
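
As a sketch, the resulting translation with two independent 4MB halves looks like this in C. The two targets are the ones from the test above; the function and variable names are hypothetical.

```c
#include <stdint.h>

#define Z2_BASE   0x00200000u
#define HALF_SIZE 0x00400000u   /* 4MB */

/* PCI targets of the two halves, using the values from the test above. */
static uint32_t half_target[2] = { 0x98000000u, 0x99000000u };

/* Translate a 68K address in the Z2 window to the PCI address it reaches.
   Returns 0 if the address is outside the 8MB window. */
static uint32_t z2_to_pci(uint32_t m68k_addr)
{
    if (m68k_addr < Z2_BASE || m68k_addr >= Z2_BASE + 2u * HALF_SIZE)
        return 0;
    uint32_t offset = m68k_addr - Z2_BASE;
    return half_target[offset / HALF_SIZE] + offset % HALF_SIZE;
}
```

With one half parked on the code and the other on the data, an instruction and its operand no longer have to share a single window position.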

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

It's certainly possible. I must have never noticed the problem on NetBSD, as if I remember correctly, I mostly tested with 4MB window.

However, I am a bit worried about trying to change the window position while bypassing pci.library. It might lead to some breakage of Mediator drivers. Or these drivers might try to change the window position when we least expect it.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Drivers change the window by
Disable()
origwindow=GetZorroWindow()
SetZorroWindow(driverspace)
Stuff()
SetZorroWindow(origwindow)
Enable()

The Voodoo driver does this using some sort of VBlank interrupt, as far as I
can see. Even if the Disable() / Enable() pair is missing, I plan on
patching SetZorroWindow to track changes. If SetZorroWindow is called, the
patch should invalidate all Sonnet memory in the 68K MMU. Something like that.
I'm not there yet in development.
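
A self-contained C model of the driver sequence and of the planned tracking patch. Nothing here calls the pci.library; get_window() and set_window() are stand-ins for the calls named above, whose real signatures are not known here, and the window register and the "mapping valid" flag are plain variables.

```c
#include <stdint.h>

/* Simulated window register and "Sonnet mapping is valid" flag. */
static uint32_t zorro_window = 0x80000000u;
static int sonnet_mapping_valid = 1;

/* Stand-ins for the pci.library calls (hypothetical signatures). */
static uint32_t get_window(void) { return zorro_window; }
static void set_window(uint32_t target) { zorro_window = target; }

/* The planned patch: every window change first invalidates the Sonnet mapping. */
static void patched_set_window(uint32_t target)
{
    sonnet_mapping_valid = 0;
    set_window(target);
}

/* Driver-style access: save, move, work, restore. */
static void with_window(uint32_t target, void (*work)(void))
{
    uint32_t saved = get_window();  /* a driver would call Disable() first */
    patched_set_window(target);
    work();
    patched_set_window(saved);      /* followed by Enable() */
}

/* Example callback that records which window it ran under. */
static uint32_t seen_window;
static void record_window(void) { seen_window = get_window(); }
```

After with_window(0x90000000, record_window) the window is back at its old position, but the flag stays cleared, so the next access to Sonnet memory has to fault and re-establish the mapping.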

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

As far as I can see, the changes to the mmu.library do what they are supposed to do. I can set up a memory range, for example 192MB at 0x40000000 for the Sonnet, and redirect it dynamically to the Mediator Z2 window at 0x200000. So that is good news. With FileX I can look at an arbitrary address range within that window. I can see the PPC debug output and also see the sonnet libbase and functions within this range. So the library is also correctly set up in Sonnet space.

The bad news is that shifting the Mediator window seems to interfere with the Voodoo driver. Gadgets and icons are not updated, for example. Other things like the FileX window contents are updated.

Looking a bit closer at the pci library I noticed that when the MMU is used by the pci library, the drivers only use a 4MB window, leaving the second 4MB window (at 0x600000) probably for the MMU.

Needs some more investigation.

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

I wonder if in this situation it would be better to use pci.library function for changing window position? At least that should solve the problem with other drivers using pci.library?

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

I am using the functions of the pci.library at the moment. I'm not sure if I am using them correctly, though, because I lack the documentation. I can have a look at other drivers. And have another look at the bus exception handler of the pci.library.

What I meant with the above post is to let the Mediator think it has a 4MB window and directly manipulate the second window with the sonnet.library. I think that is what happens when the option MMU=yes is set. Also, with that option the pci.library installs its own bus exception handler and manipulates the MMU directly. We don't want that. Like I said, I am now manipulating the full 8MB window using the pci.library functions with MMU=no in ENV/Mediator.

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

Understood. I thought you were now messing with the window register directly. I hoped that with the new version of pci.library we'd be able to avoid patching it.

I agree with the idea to use a 4MB window and try to manipulate the second one with sonnet.library. Using the pci.library's MMU mode is problematic, also agreed about that. It's just that when MMU=no, the library assumes that it can present the whole 8MB window space to the drivers... Which again is problematic if we want to steal half of that.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

I'm going to revisit this soon. As it stands now, there are two options. One through the mmu.library (with MMU=no) and one through the pci.library (with MMU=yes).

A recent development confirmed some of the above assumptions regarding the pci.library, so I want to retry that. I think that I now know why the earlier attempts were not successful.

During the Sonnet interrupt, I have to move the window myself. Normally this is done by the pci.library when the MMU option is enabled, but during interrupts the programmer has to do it.

Looking back to the earliest code the sonnet interrupt did move the window but not always when needed. The interrupt calls functions like putmsg, getmsg and replymsg and these functions can address memory ranges which are not within the same Z2 window thus getting the wrong info.

The solution would be to write those functions as part of the interrupt, such that the Z2 window gets moved to the right position (address of the message, address of the msglist, address of predecessor, address of successor and address of msgport for example).

Fingers crossed.

I already noticed that some WarpOS software depends on some output being negative when a failure occurred. With the address range above 0x80000000 (and thus negative when signed), some of the programs don't work correctly. CyberPi, for example.

from sonnetamiga.

rkujawa avatar rkujawa commented on June 17, 2024

wXR at EAB added a bounty for this issue:
https://www.bountysource.com/issues/28897993-get-the-sonnet-library-working-on-the-1200tx-mediator

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

PCI memory seems to be copyback. I need it to be cache inhibited like on the A3000/A4000. Maybe in the future this will change. Also, the address at which the WarpOS programs are running on the A1200 (>2GB range) does not help. I'm trying to get something in place that will bring it <1GB range (the max range the sonnet can install its memory into).

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

I have a pci.library now which sets PCI memory as cache inhibited. I'll try to move all Zorro window conflicting stuff out of the interrupt and into the master control process (supervisor versus user mode) and see if that improves things.

The >2 gb addresses issue still needs to be resolved too.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Again, no luck. Programs just halt after a while. Can be immediately, can be after 10 seconds. There is no crash. Both the 68K and the PPC task go to a wait status.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Seems that it is related to the time between putting data in the queue that needs to be transferred to the PPC. When a message pointer is put in a certain register of the mpc107 this pointer is automatically transferred to a circular FIFO on the PPC side and an interrupt is raised. When writing two values into this register close to each other time-wise, the first message gets lost as it is not transferred to the ppc.

I saw the same kind of behavior with the big box Mediator while reading from this register. It should give unique pointers, but sometimes the PPC side was not updated quickly enough and two consecutive pointers were equal. A simple compare and reread sufficed. Here it is a bit more difficult, as I cannot test whether the message was updated correctly while writing. I might have to organize the FIFO manually and just use a simple interrupt when done.
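
The compare-and-reread idea for the reading side can be sketched in C like this. The register is simulated by a function pointer so the snippet runs anywhere; the retry limit and all names are assumptions made for the example.

```c
#include <stdint.h>

typedef uint32_t (*reg_read_fn)(void);

/* Read the next pointer from the queue register. If it equals the previous
   one, assume the other side was not updated yet and read again.
   Gives up after max_retries and returns the last value read. */
static uint32_t read_unique(reg_read_fn read_reg, uint32_t previous, int max_retries)
{
    uint32_t value = read_reg();
    while (value == previous && max_retries-- > 0)
        value = read_reg();
    return value;
}

/* Simulated register for trying this out: it repeats the first value once. */
static const uint32_t fake_values[] = { 0x1000u, 0x1000u, 0x1040u, 0x1080u };
static unsigned fake_index = 0;

static uint32_t fake_read(void)
{
    uint32_t v = fake_values[fake_index];
    if (fake_index < 3u)
        fake_index++;
    return v;
}
```

This only helps for reads, where a stale value is detectable. For writes there is nothing to compare against, which is why a manually organized FIFO is the alternative mentioned above.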

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

There are more latency problems. Every message that goes through the ports (the 2 PCI registers on the PCI memory side, OFQPR and IFQPR) has a chance of being missed on the A1200 Mediator.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Looks like both the reading of free messages to send to the PPC and the reading of messages sent from the PPC (so in both cases the 68K reading from the EUMB registers in PCI space) are affected. Now I also re-read when a duplicate is found among the messages sent from the PPC, and Voxelspace has been running for 2 hours now.

Not sure if writing to the registers is affected. I already rewrote the sending of messages to PPC, but releasing used messages is also done by writing to these EUMB registers by the 68K. However, there is a pool of 4K messages so any problems here will show up MUCH later.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Received an updated pci.library where the Mediator automatically adjusts the Zorro window in the 0x10000000-0x20000000 range as well. The strange behaviour of some programs (Quake quitting on a memory allocation error, CyberPi not printing any printf text, Voxelspace not showing its info window) seems to be resolved.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

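move.l #$89000000,a0    ; a0 = first test address
move.l #$8B000000,a1    ; a1 = second test address, 32MB higher
move.l #$ABCDABCD,(a0)  ; write a test pattern to the first address
move.l (a0),(a1)        ; copy it memory-to-memory to the second address
rts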

does not work with the current pci.library version. Versions 11.0 and 12.0 worked with the above program. Back to Elbox.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Got a fixed library from Elbox and on the A1200 the Sonnet memory is now placed at 0x20000000 and up. This has solved some of the issues like no text output in WarpOS programs due to addresses being negative when >0x80000000.

Simple stuff (the demos and tools from the WarpOS distribution) seems to work now on my Amiga with the latest build. So the hanging which happened intermittently was also caused by the previous version of the pci.library.

Voxelspace however seems twice as slow as normal. QuakeWOS (the software 3D version) crashes after loading AHI. There seems to be some memory trashing somewhere.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Fixed the memory error. Now another error popped up while starting QuakeWOS. Some jump to zero page.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

If the Amiga OS handles anything which is in Sonnet memory, there is a chance of a crash. For example, the Wait() function inside the library crashes when entered with the 68K stack pointer pointing to Sonnet memory. I think this has something to do with the Supervisor context of the 68K MMU. The Wait() function itself calls Switch(), which enters supervisor mode.

Placing a StackSwap() with a normal Amiga memory stack pointer before and after the Wait() fixes this issue, but any switch to Supervisor mode with for example the 68K stack pointing to Sonnet Memory or the task structure itself residing in Sonnet memory can (not always) lead to a crash.

I have to contact Elbox to see if their library uses the Supervisor context MMU (correctly). I know it is fully functional in mmu.library. Maybe consider again to try to implement the mmu.library.

A complete rewrite of the AllocMem() patch is also a posibilty. 68K stack, 68K task structures needs to go to 68K memory, PPC segments, PPC library/device bases need to go to Sonnet memory.
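The placement rule for such a rewritten patch could be sketched roughly as below. This is only an illustration: the `AllocKind` and `MemoryPool` names are hypothetical, and how a real AllocMem() patch would recognise what an allocation is for is not covered here.

```c
/* Hypothetical allocation categories; the names are illustrative only. */
typedef enum {
    ALLOC_68K_STACK,
    ALLOC_68K_TASK,
    ALLOC_PPC_SEGMENT,
    ALLOC_PPC_LIBRARY_BASE
} AllocKind;

typedef enum { POOL_68K_MEMORY, POOL_SONNET_MEMORY } MemoryPool;

/* Anything the 68K OS touches in supervisor mode stays in 68K memory;
   PPC-side data goes to Sonnet memory. */
static MemoryPool choose_pool(AllocKind kind)
{
    switch (kind) {
    case ALLOC_68K_STACK:
    case ALLOC_68K_TASK:
        return POOL_68K_MEMORY;
    case ALLOC_PPC_SEGMENT:
    case ALLOC_PPC_LIBRARY_BASE:
        return POOL_SONNET_MEMORY;
    }
    return POOL_68K_MEMORY; /* conservative default for unknown kinds */
}
```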

Seems more more and harder to implement, though.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

Latest build adds more compatibility. Stack and task structures are now forced to 68K memory. All the demos from the WarpOS v4 package now work. FlashMandel works. WarpRace works. Voxelspace increased in speed from 80 to 100FPS at standard 320x240. Context switches are being measured at 400us. (In comparison, the A3000 does 125FPS/200us.)

QuakeWOS crashes or hangs inside input.device. Same with Quake2 and ADoomWOS. The input.device tries to traverse a list whose data is in Sonnet memory. Somehow the memory is not there and it loads $FFFFFFFF into address registers.

Looking into Supervisor mode and/or an interrupt being installed by those games. Asking Elbox to fix the Supervisor MMU context is also still an option.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

QuakeWOS was hanging because data was not loaded correctly. The first 16 bytes of files are skipped if the buffer of Read() is in PPC memory. This has been fixed for now by intercepting the Read(), putting the buffer in FAST RAM and then copying it to PPC memory. Now QuakeWOS runs with sound off.
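The workaround amounts to a bounce buffer. Below is a minimal, portable sketch of the idea, not the actual patch: `ReadFn`, `bounce_read` and `demo_read` are hypothetical names, and malloc() stands in for a FAST RAM allocation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the underlying read call: fills `buffer` with up
   to `length` bytes and returns the number of bytes read, or -1 on error. */
typedef long (*ReadFn)(void *handle, void *buffer, long length);

/* Read into a temporary buffer in ordinary memory first, then copy the
   result to `dest` (which would be the PPC-memory buffer). */
static long bounce_read(ReadFn read_fn, void *handle, void *dest, long length)
{
    void *bounce;
    long actual;

    if (length <= 0)
        return 0;
    bounce = malloc((size_t)length);   /* assumption: FAST RAM in the real patch */
    if (bounce == NULL)
        return -1;
    actual = read_fn(handle, bounce, length);
    if (actual > 0)
        memcpy(dest, bounce, (size_t)actual);
    free(bounce);
    return actual;
}

/* Demo reader: treats `handle` as a NUL-terminated string and copies from it. */
static long demo_read(void *handle, void *buffer, long length)
{
    const char *src = handle;
    long available = (long)strlen(src);
    long n = available < length ? available : length;

    memcpy(buffer, src, (size_t)n);
    return n;
}
```

The extra copy costs some time on every read, which is presumably why this is described as a fix "for now".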

With sound on it still hangs during sound initialization.

Most apps work now. Audio does not work. Warp3D needs testing.

from sonnetamiga.

DvdBoon avatar DvdBoon commented on June 17, 2024

The assembly version of the library will not support the 1200TX mediator beyond what is implemented now. Refer to https://github.com/Sakura-IT/PowerPCAmiga for possible support in the new C version in the future.

from sonnetamiga.
