I am disassembling the Apple IIc ROM Rev3 (345-0445-A), bank 1. Starting in $C800, I a

Problems with multi-bank references (6502bench issue, closed)

baldengineer commented on August 20, 2024

Problems with multi-bank references

from 6502bench.

Comments (10)

fadden commented on August 20, 2024

I figured it was something like that. I had a similar issue while fiddling with Metroid (https://6502disassembly.com/nes-metroid/)... 8 banks of ROM, 7 mapped to the same address, code in the first bank referencing entry points in 5 out of the other 7.

Some issues can be resolved by setting the address regions appropriately. Sometimes you have to set the operand symbols explicitly, because there are multiple identical addresses. SourceGen treats the address map as a tree and does a depth-first search, but that doesn't disambiguate all situations.

Another example: ProSel's CAT.DOCTOR (https://6502disassembly.com/a2-prosel8/) does a bunch of relocations, so that map got pretty interesting. FWIW, the ASCII-art address map is generated by View > Show Address Map.

fadden commented on August 20, 2024

I've been thinking about the issue of references to overlapping banks. The problem at hand is that there is a reference to an address (such as $c80e) that exists in more than one place. The difficulty is that the disassembler's code analyzer wants to map that address to a file offset. There are three basic scenarios:

1. The target offset doesn't exist in the file. For your project, $c80e can only be mapped to the offset of the wrong $c80e, because it's the only place that address exists.
2. The target offset does exist in the file. Multi-bank situations are easy to handle when the only references are to locations within the same address region, because the address-to-offset resolver has a notion of scoping rules, and will bind to the closest thing. It's more difficult to handle when the references cross banks, because there's no way to tell the code analyzer which of the locations to use.
3. The target offset exists in multiple places. This was the case in Metroid, where the "game engine" chunk made calls to $80b0, which existed in 5 of the 7 ROM banks mapped into $8000-bfff. (Annoyingly, the bank that the code analyzer decided to use as the target wasn't one that had an entry point at $80b0, leading to a situation like we have here.)
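
To make these scenarios concrete, here is a minimal Python sketch of an address-to-offset lookup. It is not SourceGen's resolver: the `Region` class, the flat region list (SourceGen uses a tree), the bank layout, and the "prefer the referencing region, else take the first match" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One chunk of the file mapped at a 6502 address (hypothetical model)."""
    name: str
    file_offset: int  # where the chunk starts in the file
    address: int      # where the chunk is mapped in memory
    length: int

    def contains_addr(self, addr):
        return self.address <= addr < self.address + self.length

    def contains_offset(self, offset):
        return self.file_offset <= offset < self.file_offset + self.length

    def offset_of(self, addr):
        return self.file_offset + (addr - self.address)


def candidate_offsets(regions, addr):
    """Every (region name, file offset) pair the address could refer to."""
    return [(r.name, r.offset_of(addr)) for r in regions if r.contains_addr(addr)]


def resolve(regions, source_offset, addr):
    """Pick one offset: same region as the reference if possible, else first match."""
    matches = [r for r in regions if r.contains_addr(addr)]
    if not matches:
        return None  # the address is not in the file at all
    for r in matches:
        if r.contains_offset(source_offset):
            return r.offset_of(addr)  # same-bank reference: unambiguous
    return matches[0].offset_of(addr)  # cross-bank reference: an arbitrary guess


# Assumed layout: three 16 KB banks sharing $8000, plus a fixed bank at $C000.
REGIONS = [
    Region("bank0", 0x0000, 0x8000, 0x4000),
    Region("bank1", 0x4000, 0x8000, 0x4000),
    Region("bank2", 0x8000, 0x8000, 0x4000),
    Region("fixed", 0xC000, 0xC000, 0x4000),
]

print(candidate_offsets(REGIONS, 0x80B0))      # three candidates, one per bank
print(hex(resolve(REGIONS, 0x4100, 0x80B0)))   # reference from bank1 -> 0x40b0
print(hex(resolve(REGIONS, 0xC100, 0x80B0)))   # reference from fixed bank -> 0xb0
```

The last call shows the problem case: a reference from the fixed bank has no "closest" match, so the sketch falls back to the first bank, whether or not that bank really has an entry point at the address.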

Ideally it would be possible to add something to the operand that told it which of the various addresses were the correct ones, so that the code analyzer could automatically visit all of them. The operand editor would need to have a list of checkboxes, one per potential target offset. In practice this is probably more confusing and more work than just adding a code start tag at those offsets, and would be difficult to maintain if the address map was updated. The one clear advantage it has is that the References list would be correct.

In theory we could use a symbol specified for the operand as a signal. If the operand is given a symbol that is defined in a different part of the address map, we could start the offset resolution process in that region instead of the instruction's region. This doesn't help with multiple targets though, and I'm not sure how this would affect existing behavior. (Also, we don't normally apply labels until after the code analyzer runs.)

A simpler approach would be to add a "do not follow" checkbox for absolute branch instructions (JMP/JSR). If set, the code analyzer simply doesn't follow the trail. For this project, the box could be checked on the various JMP instructions to eliminate the mid-instruction execution seen in the initial problem report. This isn't ideal, but it's fairly straightforward, and eliminates the annoying multi-path code issue.
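
The effect of such a flag can be sketched with a toy code tracer. This is a hypothetical Python illustration, not SourceGen's analyzer: the `trace` function, the `no_follow` set standing in for the checkbox, the five-entry opcode table, and the byte sequence at $1000 are all made up for the example.

```python
# opcode -> (mnemonic, instruction length); only what this sketch needs
OPCODES = {
    0x20: ("JSR", 3),
    0x4C: ("JMP", 3),
    0x60: ("RTS", 1),
    0xA9: ("LDA#", 2),
    0xEA: ("NOP", 1),
}


def trace(data, base, entry, no_follow=frozenset()):
    """Return the set of addresses the tracer treats as instruction starts."""
    visited = set()
    work = [entry]
    while work:
        pc = work.pop()
        while base <= pc < base + len(data) and pc not in visited:
            idx = pc - base
            if data[idx] not in OPCODES:
                break
            name, length = OPCODES[data[idx]]
            if idx + length > len(data):
                break  # instruction would run off the end of the data
            visited.add(pc)
            if name in ("JMP", "JSR"):
                target = data[idx + 1] | (data[idx + 2] << 8)
                if pc not in no_follow:
                    work.append(target)  # follow the branch later
                if name == "JMP":
                    break  # execution does not continue past a JMP
            elif name == "RTS":
                break
            pc += length
    return visited


# $1000: JSR $1006   $1003: JMP $1007   $1006: LDA #$EA   $1008: RTS
# The JMP stands in for a cross-bank jump whose target, in this file,
# lands on the operand byte of the LDA.
CODE = bytes([0x20, 0x06, 0x10, 0x4C, 0x07, 0x10, 0xA9, 0xEA, 0x60])

followed = trace(CODE, 0x1000, 0x1000)
skipped = trace(CODE, 0x1000, 0x1000, no_follow={0x1003})
print(sorted(hex(a) for a in followed))  # includes 0x1007, a mid-instruction start
print(sorted(hex(a) for a in skipped))   # 0x1007 is no longer visited
```

With the JMP at $1003 marked, the LDA is decoded only once and the overlapping path disappears, at the cost of the tracer knowing nothing about the real target.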

I've added a TO DO list item for this.

fadden commented on August 20, 2024

The "attr" column for $c80e has a ">", which means $c80e (the 3rd byte of the LDX) is a branch target. If you select the line, the References window in the top left will tell you what is referencing that address. Because of the branch, the disassembler sees a path through the code that passes through the end of the opcode.

$c80e is interpreted as a two-byte LDY, which eats into the 3-byte LDY at $c80f, creating another mid-instruction opcode. This propagates for a bit, gets a break at the CLD, then starts up again when something branches into $c816.
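
The way a mid-instruction entry propagates and then resynchronizes can be reproduced with a linear decoder. In the Python sketch below the byte values are invented to fit the description above (a 3-byte LDX at $c80c, a 3-byte LDY at $c80f, then a CLD, assumed here to sit at $c812); they are not taken from the actual ROM, and `decode_from` is a hypothetical helper.

```python
# opcode -> (mnemonic, instruction length); only what this sketch needs
LENGTHS = {
    0xAE: ("LDX abs", 3),
    0xAC: ("LDY abs", 3),
    0xA0: ("LDY #imm", 2),
    0x10: ("BPL rel", 2),
    0xD8: ("CLD", 1),
}


def decode_from(data, base, start):
    """Decode straight through the bytes, beginning at address `start`."""
    out = []
    pc = start
    while pc - base < len(data):
        name, length = LENGTHS[data[pc - base]]
        out.append((pc, name))
        pc += length
    return out


BASE = 0xC80C
# Illustrative bytes only: LDX $A000 / LDY $C810 / CLD
BYTES = bytes([0xAE, 0x00, 0xA0, 0xAC, 0x10, 0xC8, 0xD8])

normal = decode_from(BYTES, BASE, 0xC80C)   # the intended instruction stream
shifted = decode_from(BYTES, BASE, 0xC80E)  # entering at the LDX's third byte

for addr, name in normal:
    print(f"normal  ${addr:04X} {name}")
for addr, name in shifted:
    print(f"shifted ${addr:04X} {name}")

# Addresses where both streams agree on an instruction start
resync = sorted({a for a, _ in normal} & {a for a, _ in shifted})
print([hex(a) for a in resync])
```

In this made-up sequence the shifted stream reads a two-byte LDY at $c80e, then a two-byte BPL at $c810, and lines up with the intended stream again at the one-byte CLD.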

The downward arrows are there to let you know that there are instructions with opcodes in the middle of them. Some code does this deliberately; there's an example near the bottom of https://6502bench.com/sgtutorial/odds-ends.html .

If you want to send me the project (e-mail or attach here) I may be able to tell you more.

baldengineer commented on August 20, 2024

Ah okay, I did not know what the > in attributes meant. (And I completely missed the references panel!)

The issue is that there are JMPs to bank 2, which the analyzer is linking together.

I haven't figured out how to handle the 2nd bank yet, but at least I understand the overall behavior better now.

Thanks!

baldengineer commented on August 20, 2024

Thanks! I'll take a look at how those projects are structured. I know I need to figure this out, but it's not my primary focus for doing this exercise. So, I'll come back and probably have more questions. :)

FWIW, here are the project files I have been working on. The "bank 1" project is my third start of this process. I keep learning the things I did wrong midway through. :)

fadden commented on August 20, 2024

A few notes that may be helpful...

You don't need to put the .sym65 files in the SourceGen RuntimeData directory. You can just put them next to your project file, and use "add symbol files from project" instead of "add symbol files from runtime". (They also shouldn't be copyright faddenSoft, since you wrote them.) This would remove the manual installation step from the download.

The relevant section of the manual (hit F1 and find the "Platform Symbol Files (.sym65)" section, or open RuntimeData/Help/advanced.html) has some additional details on the .sym65 format that may be helpful. For example, addresses and constants are specified differently, so the address resolver doesn't try to use the constants, and you can specify different symbols for read vs. write operations on memory-mapped I/O locations.

I couldn't really play with the project because I didn't find a ROM binary. The html output looks like you're making good progress.

The stretch at $c780 caught my eye because of the 24-bit math:

C780: 8D 28 C0     swrti           ADR     ROMBANK+$BF6865   ;RTI to the other bank

Looks like that got turned into data rather than code (those are alternating STA/JMP). A couple of them are referenced with JSRs, the others might need code start tags on the $8Ds.

Mapping the chunk at $c000-c0ff to a different address (or no address at all) might be necessary if you want the address resolver to find the project/platform symbols for the I/O addresses. SourceGen prioritizes in-file addresses over external addresses.

baldengineer commented on August 20, 2024

Thanks for the follow-up. Regarding the license, oops! I intended to clean those files up before sharing them (and then forgot.)

I'll review the manual section on symbol files again. Your instructions make more sense now that I've created and used them a bit.

The data around $C780 is there because I flattened the code in that area into inline data. It is all code that jumps to the second bank, so effectively ignoring it makes the rest of the project easier to read. (Side note: for the immediate goal I am trying to accomplish, I just need to see when code jumps to that block. It's just a jump table anyway, so I only care when I see things going to that address range. Those are the jumps I need to patch around.)

I'll keep working to understand how to do address mapping. Thanks again!

fadden commented on August 20, 2024

Ah...

C760: 4C 0E C8                     ADR     fixlc+$C7463E

That's what caused the problem in the initial report.
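
Both listing lines quoted in this thread show the same symptom: three code bytes formatted as a 24-bit address. Decoding them by hand as absolute-operand 6502 instructions gives the intended reading. A small Python sketch, using a hypothetical `decode_abs` helper that knows only three opcodes:

```python
# Absolute-addressing opcodes relevant here (standard 6502 encodings)
ABS_OPCODES = {0x8D: "STA", 0x4C: "JMP", 0x20: "JSR"}


def decode_abs(raw):
    """Decode a 3-byte instruction: opcode, then a little-endian 16-bit address."""
    op, lo, hi = raw
    return f"{ABS_OPCODES[op]} ${(hi << 8) | lo:04X}"


print(decode_abs(bytes.fromhex("8D28C0")))  # the bytes at C780 -> STA $C028
print(decode_abs(bytes.fromhex("4C0EC8")))  # the bytes at C760 -> JMP $C80E
```

The second result is the jump to $c80e that the analyzer resolved to the only $c80e present in the file.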
