Hi, Thanks for the great product! I have been using it for couple of years. I thin

Add symbol output about customasm HOT 10 CLOSED

hlorenzi commented on May 14, 2024

Add symbol output

from customasm.

Comments (10)

hlorenzi commented on May 14, 2024

I'm so glad you're finding it useful! As I understand it, you're using it for teaching at a university? I'd love to know more details!

A symbol file output was definitely in the plans. Do you have a particular need for that syntax? I was thinking something like this:

start = 0x0
main = 0xfa
main.sublabel = 0x2abc

...which would mirror the source code syntax (except that sublabels have to be written in full), and should still be easy for other people to write a regex to extract values.
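As a rough illustration of the "write a regex" point, here is a minimal sketch of how a tool might read such a file. The pattern and the `parse_symbols` helper are assumptions based only on the three sample lines above, not a documented customasm grammar.

```python
import re

# Matches lines such as "main.sublabel = 0x2abc". This pattern is an
# assumption derived from the sample lines, not an official grammar.
SYMBOL_LINE = re.compile(r"^\s*([A-Za-z_][\w.]*)\s*=\s*(0x[0-9A-Fa-f]+)\s*$")


def parse_symbols(text):
    """Return a dict mapping each symbol name to its integer value."""
    symbols = {}
    for line in text.splitlines():
        match = SYMBOL_LINE.match(line)
        if match:
            symbols[match.group(1)] = int(match.group(2), 16)
    return symbols


sample = """start = 0x0
main = 0xfa
main.sublabel = 0x2abc
"""
symbols = parse_symbols(sample)
```

Lines that do not match the pattern are simply skipped, so comments or blank lines in the file would not break the reader.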

Also, could you please check the LogiSim output formats? I've tried reading LogiSim's documentation online, and adapting the code from your fork, but since I've never used that app, I'm not sure if the outputs are correct.

milanvidakovic commented on May 14, 2024

Hi, yes, I am using it in two demonstration projects for my students: 1) a computer built in LogiSim (yes, your LogiSim formats are correct) and 2) an FPGA-based computer (this one would use the symbols if/when you add them). My students were impressed to see a custom assembler for both platforms. Thanks again for that!
Regarding the symbol format, your proposal is fine. I don't need any particular syntax; yours is perfectly usable for me.

hlorenzi commented on May 14, 2024

I've added new command line options -s and --symbol for this!
Currently it outputs address labels as well as any variables you defined (with =). Let me know if this is a problem.

milanvidakovic commented on May 14, 2024

Thank you! I will start on the debugger part of my emulator to introduce symbols. You have made excellent software!

milanvidakovic commented on May 14, 2024

Hi,
I have just implemented symbols in my FPGA emulator and it works (almost) perfectly! Thanks again for this great assembler!

However, I have encountered a funny problem: sometimes in my assembler code, I just enter the number, not the label:
mov.w r0, 25
However, in the same code, I have a variable (VK_P, for example) which has the value of 25, so my debugger makes a wrong translation:
mov.w r0, VK_P

Is it possible to extend your symbol file with a list of the addresses where each symbol is actually used?

If there is a list of addresses where the VK_P symbol was actually used, then my debugger would know not to put the VK_P symbol in the line where it was not used.

For example:
draw_next_line = 0xb946 (0xb230, 0xb2a4)

The example above would mean that the symbol draw_next_line has the actual value of 0xb946, and it was used at the addresses: 0xb230 and 0xb2a4. Those addresses could be the actual addresses in the assembled code where the label draw_next_line existed and was used.
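To make the proposed line format concrete, here is a minimal sketch of how a debugger might parse it. The format is only the proposal from this comment, not something customasm is known to emit, and `parse_symbol_with_usages` is a hypothetical helper name.

```python
import re

# Matches "name = 0xVALUE" with an optional "(0xADDR, 0xADDR, ...)" suffix.
# The suffix syntax is the proposal from this thread, not an existing format.
USAGE_LINE = re.compile(
    r"^\s*([A-Za-z_][\w.]*)\s*=\s*(0x[0-9A-Fa-f]+)\s*(?:\(([^)]*)\))?\s*$"
)


def parse_symbol_with_usages(line):
    """Return (name, value, usage_addresses), or None if the line does not match."""
    match = USAGE_LINE.match(line)
    if match is None:
        return None
    name, value, usages = match.groups()
    addresses = []
    if usages:
        addresses = [int(part.strip(), 16) for part in usages.split(",")]
    return name, int(value, 16), addresses


entry = parse_symbol_with_usages("draw_next_line = 0xb946 (0xb230, 0xb2a4)")
```

A line without the parenthesized list still parses and yields an empty usage list, so the same reader would also accept the current symbol output.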

I don't know if I was clear enough, so please ask me for more details about this feature request.

hlorenzi commented on May 14, 2024

Yeah, I had the feeling this would be a problem... I'm not exactly sure how to solve this right now.

My first instinct was to make the symbol output differentiate between address labels and variables, like so:

; labels
draw_next_line = 0xb946
some_other_label = 0x8080

; variables
VK_P = 0x19
some_var = 0x55

...but you'd probably want the debugger to show variable names as well as address labels, when you actually use them. So this distinction might not be very useful in your case.

About your solution involving usage listings, I think it might not completely solve the issue in the general case? Imagine I had some kind of instruction like add r0, 25, VK_P -- you wouldn't be able to tell which 25 was the one that actually used a label. That said, this might be the best solution so far.
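The ambiguity can be sketched as follows, assuming the usage list records only the address of the instruction, so the debugger has to match operands by value. `candidate_operands` is a hypothetical helper, and the operand lists are illustrative decodings of the two instructions.

```python
def candidate_operands(operand_values, symbol_value):
    """Return the indices of the operands whose value equals the symbol's value."""
    return [
        index
        for index, value in enumerate(operand_values)
        if value == symbol_value
    ]


VK_P = 25

# mov.w r0, VK_P has one immediate operand, so there is a single candidate.
mov_candidates = candidate_operands([25], VK_P)

# add r0, 25, VK_P has two immediates with the same value: both match.
add_candidates = candidate_operands([25, 25], VK_P)
```

With one immediate the match is unique, but with two equal immediates the value alone cannot tell the debugger which operand came from the symbol.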

Another idea is to use the annotated output format, which lists addresses, bytes, and the source code excerpts that generated them. This annotated file should be easy to parse, too. What do you think?

milanvidakovic commented on May 14, 2024

Hi,
here are my thoughts:
Some variables and labels can occur several times in the executable. For instance:

VIDEO_A = VIDEO + 15*160
VIDEO_C = VIDEO_A + 1
mov.w r1, hello  		; r1 holds the address of the "HELLO WORLD" string
mov.w r2, VIDEO_C		; r2 points to the character part of the video memory
mov.w r4, VIDEO_A		; r4 points to the attribute part of the video memory
mov.w r5, VIDEO_A		; r5 points to the attribute part of the video memory
...
hello: 
#str "Hello\0"

The resulting executable would look like this:
location: content

0xB014: 01 C0 00 00 B0 7C	; mov.w r1, hello (0x0000B07C)
0xB01A: 02 C0 00 00 0D 61	; mov.w r2, VIDEO_C  (0x00000D61)
0xB020: 04 C0 00 00 0D 60	; mov.w r4, VIDEO_A  (0x00000D60)
0xB026: 05 C0 00 00 0D 60	; mov.w r5, VIDEO_A  (0x00000D60)
...
0xB07C: 48 65 6C 6C 6F 00	; label hello: "Hello\0"

01 C0, 02 C0, 04 C0 and 05 C0 are opcodes for the mov.w rX, number instructions

Now, the symbols are:
variables:

VIDEO = 0x400
VIDEO_A = 0xd60
VIDEO_C = 0xd61

labels:
hello = 0xb07c

The occurrences of symbols in the executable are at the following addresses:

hello = 0xb016				; it goes from 0xb016 to 0xb019 (four bytes, 32 bits)
VIDEO_A = 0xb022, 0xb028		; VIDEO_A is used in two mov.w instructions
VIDEO_C = 0xb01c			; VIDEO_C is used in one mov.w instruction

So, whenever your assembler replaces a label or variable with the actual number, it would be nice if it wrote down the exact address where that symbol was used, in a format similar or identical to the one above.

Is it possible for your code to write down the exact address of each occurrence of a symbol? For example, in the code above, the mov.w r4, VIDEO_A instruction is written at address 0xB020, with the following bytes starting at that address: 04 C0 00 00 0D 60.
So the actual number, 00 00 0D 60, starts at 0xB022. You can see that in my proposal:
VIDEO_A = 0xb022, 0xb028 ; VIDEO_A is used in two mov.w instructions, at two memory locations.
Regarding your example:
add r0, 25, VK_P
that should not be a problem either, since VK_P would be stored in memory at a certain address, just like in my examples above. I don't have that particular instruction, but let's suppose that I have it:

VK_P = 0x51
add r0, 25, VK_P

The executable would be like this:
0xB000: 00 50 00 00 00 19 00 00 00 51
00 50 would be the add opcode, 00 00 00 19 would be the number 25, and 00 00 00 51 would be VK_P.
The VK_P symbol would appear at 0xB006.
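The operand addresses worked out above follow from simple offset arithmetic. Here is a minimal sketch, assuming a 2-byte opcode followed by 4-byte immediates as in the listings; `operand_address` is a hypothetical helper, not part of customasm.

```python
OPCODE_SIZE = 2     # bytes, as in the "04 C0" and "00 50" opcodes above
IMMEDIATE_SIZE = 4  # bytes, 32-bit immediates such as "00 00 0D 60"


def operand_address(instruction_address, operand_index=0):
    """Address of the given immediate operand inside an instruction."""
    return instruction_address + OPCODE_SIZE + operand_index * IMMEDIATE_SIZE


# Occurrence table for the listing above, keyed by operand address.
occurrences = {
    operand_address(0xB014): "hello",
    operand_address(0xB01A): "VIDEO_C",
    operand_address(0xB020): "VIDEO_A",
    operand_address(0xB026): "VIDEO_A",
}

# Hypothetical add r0, 25, VK_P at 0xB000: VK_P is the second immediate.
vk_p_address = operand_address(0xB000, operand_index=1)
```

A debugger holding such a table could show a symbol name only when the operand's own address appears in it, regardless of the operand's value.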

Huh, so many words. I hope that I was at least a bit clear about this topic.

hlorenzi commented on May 14, 2024

I've been thinking a lot about this, but I still haven't had the time to work on a solution. The crux of the problem seems to be keeping track of named variables as they go through arbitrarily complex expressions in the body of an instruction definition (since those can even be blocks of code with multiple expressions in sequence). Perhaps something can be done for the simpler cases of a single expression with clean variable usage.

milanvidakovic commented on May 14, 2024

Thanks for still thinking about this. I agree that the simplest cases should be done. That certainly works for me.

hlorenzi commented on May 14, 2024

I'll close this since there's now an annotated output format which should do more-or-less what you're describing! Feel free to open this issue again if you still need help!
