craigthomas / cocoassembler
A Tandy Color Computer 1, 2, and 3 assembler written in Python
License: MIT License
Describe the enhancement
The SETDP pseudo-operation is used to tell the assembler to optimize code. Specifically, the SETDP mnemonic allows the programmer to specify what the contents of the direct page register are at assembly time. The assembler can then look for any addresses during assembly whose most significant byte matches the value stored in the direct page register. It will then transform those statements from their extended addressing mode equivalents into direct addressing mode statements instead.
For example, normally:
LDA $0F01
Would generate the following machine code:
B6 0F 01
However, if the direct page were loaded with $0F, and the SETDP pseudo-operation were implemented, then:
SETDP $0F
LDA $0F01
Would generate the following machine code:
96 01
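The transformation described above can be sketched in Python as follows. This is only an illustration: the function name encode_lda and its direct_page parameter are hypothetical and do not come from the assembler's source. The opcodes are the standard 6809 LDA opcodes ($B6 extended, $96 direct).

```python
def encode_lda(address, direct_page=None):
    """Return machine code bytes for LDA with a 16-bit address operand.

    direct_page is the value supplied by a (hypothetical) SETDP statement,
    or None when no SETDP is in effect.
    """
    high, low = (address >> 8) & 0xFF, address & 0xFF
    if direct_page is not None and high == direct_page:
        return bytes([0x96, low])       # LDA, direct addressing
    return bytes([0xB6, high, low])     # LDA, extended addressing


print(encode_lda(0x0F01).hex(" ").upper())        # B6 0F 01
print(encode_lda(0x0F01, 0x0F).hex(" ").upper())  # 96 01
```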
Implement functionality that will let a user extract a file from a DSK file and save it to a CAS file. The CAS file can be a new, blank cassette file, or it can be an existing CAS image file, in which case the extracted file will be appended to the CAS file.
Implement the ASRD 6309 instruction mnemonic in the assembler. Performs an arithmetic shift right of the double-byte register D, storing the result in D.
Inherent - $1047, 2 bytes
Example:
ASRD
Many assemblers allow for conditional assembly of statement blocks. This issue will implement conditional assembly by adding a COND pseudo operation that defines the start of a conditional block, and ENDC which defines the end of a conditional block. Conditionals come in the form of:
COND <expression>
<statements>
ENDC
Where:
<expression> is a traditional expression that contains two values separated by an operation.
<statements> are the assembly language statements to be included in the conditional block.
ENDC ends the conditional block.
The conditional block is only assembled if the <expression> results in a non-zero value. Conditional blocks cannot be nested.
Describe the bug
When an expression shows up in the left hand side of an indexed program counter relative expression, it is not resolved correctly.
To Reproduce
Create a file called test.asm and add the following contents to it:
ORG $3F00
TEMP EQU $0001
START STX 1+TEMP,PCR
END START
python3 assembler.py test.asm --print
-- Assembled Statements --
$3F00 ORG $3F00 ;
$3F00 TEMP EQU $0001 ;
$3F00 AF8C00 START STX 1+TEMP,PCR ;
$3F02 END START ;
Expected behavior
Correct output should be:
-- Assembled Statements --
$3F00 ORG $3F00 ;
$3F00 TEMP EQU $0001 ;
$3F00 AF8D0002 START STX 1+TEMP,PCR ;
$3F04 END START ;
Additional Context
Currently the output of AF8C00 is exhibiting the problem described in issue #51. The correct instruction and addressing mode output of AF8D should be fixed by that issue first. The expression resolution problem can then be fixed.
A feature of Disk Extended Color Basic (DECB) is that it prefers to use granules that are closer to the file allocation table and directory entries (track 17) before using granules that are closer to the outer edges of the disk. This makes sense, since it means stepping the read/write head less frequently after reading the file system data. Currently, the assembler will do a search for free granules starting at granule 0 (the first track). While this isn't an issue when using virtual files, on real media, there may be a noticeable delay when loading data from disk, since the read/write head needs to step a lot in order to read the granule data. The purpose of this feature request is to enable the same behavior in the assembler for populating granules that is seen when using DECB.
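One way to approximate this preference is to search granules in order of their distance from track 17. The sketch below assumes a standard 35-track disk with 68 granules, two granules per track, track 17 reserved for the FAT and directory, and $FF marking a free granule in the FAT. The function names are hypothetical, and the exact order DECB itself uses may differ.

```python
DIRECTORY_TRACK = 17
GRANULES_PER_TRACK = 2
TOTAL_GRANULES = 68
FREE_GRANULE = 0xFF


def granule_track(granule):
    """Map a granule number to its track, skipping the directory track."""
    track = granule // GRANULES_PER_TRACK
    return track + 1 if track >= DIRECTORY_TRACK else track


def granule_search_order():
    """Granule numbers sorted by distance from the directory track."""
    return sorted(
        range(TOTAL_GRANULES),
        key=lambda g: (abs(granule_track(g) - DIRECTORY_TRACK), g),
    )


def first_free_granule(fat):
    """Return the free granule closest to track 17, or None if the disk is full."""
    for granule in granule_search_order():
        if fat[granule] == FREE_GRANULE:
            return granule
    return None


print(granule_search_order()[:8])  # [32, 33, 34, 35, 30, 31, 36, 37]
```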
Add the NAM mnemonic to the list of pseudo operations. This will allow the programmer to set the name of the program within the assembly listing so that the --name switch does not need to be passed when saving to disk or cassette images.
Describe the bug
Immediate negative integer values not translated correctly.
To Reproduce
Steps to reproduce the behavior:
Create a file called test.asm:
ORG $0E00
START CMPB #-2
END START
Run the assembler on the test file as follows:
python3 assembler.py --print test.asm
Output will be as follows:
-- Assembled Statements --
$0E00 ORG $0E00 ;
$0E00 C102 START CMPB #-2 ;
$0E02 END START ;
Expected behavior
The assembled statement for CMPB #-2 should not be C1 02. The operand should be the two's complement representation of -2, which is FE. The output below should be correct:
-- Assembled Statements --
$0E00 ORG $0E00 ;
$0E00 C1FE START CMPB #-2 ;
$0E02 END START ;
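The expected operand byte can be derived by masking the negative value to 8 bits. A minimal sketch, using a hypothetical helper name that is not part of the assembler:

```python
def to_twos_complement(value, bits=8):
    """Encode a signed or unsigned integer into the given number of bits."""
    lowest, highest = -(1 << (bits - 1)), (1 << bits) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)


# CMPB immediate is opcode $C1, followed by the 8-bit operand
print(f"C1{to_twos_complement(-2):02X}")  # C1FE
```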
In order to be useful, assembled programs need to be stored in a format that contains the machine code offset where the program should be stored and executed from. One such format is the DSK (virtual disk) format. The purpose of this issue is to add a FileUtility package that the assembler can call on to save the assembled statements as a binary file on a DSK file. In this issue, only JV1 style virtual disk types need to be supported. Only Disk Basic filesystem formats need to be supported. The first pass of this issue is to create a new DSK file with a freshly initialized filesystem on it, and the binary file placed in the DSK image. This issue requires issue #8 to be completed first.
Describe the enhancement
The assembler currently handles indexed addressing modes for 8-bit and 16-bit constant offsets correctly. However, it does not implement any optimizations for 5-bit offsets. According to the Motorola data sheet for the 6809, constant 5-bit offset values in the range of -16 to +15 can be stored directly in the post-byte. Currently, the assembler treats any 5-bit offset the same way it treats an 8-bit offset, thus adding an additional byte to the operation. As an example:
LDA 5,Y
Generates the following code:
A6 A8 05
If implemented, then the 5-bit constant offset indexed optimization would instead produce:
A6 25
This saves one byte each time a 5-bit constant offset is used.
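For reference, a 5-bit constant offset post-byte has the form 0RRnnnnn, where RR selects the index register (X, Y, U, or S) and nnnnn is the two's complement offset. A rough sketch of building it, with a hypothetical function name and register table that are not taken from the assembler:

```python
REGISTER_BITS = {"X": 0b00, "Y": 0b01, "U": 0b10, "S": 0b11}


def five_bit_post_byte(offset, register):
    """Build the indexed post-byte for a constant offset in -16..+15."""
    if not -16 <= offset <= 15:
        raise ValueError(f"{offset} does not fit in a 5-bit offset")
    return (REGISTER_BITS[register] << 5) | (offset & 0x1F)


print(f"{five_bit_post_byte(5, 'Y'):02X}")   # 25
print(f"{five_bit_post_byte(-1, 'X'):02X}")  # 1F
```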
Implement the BAND 6309 instruction mnemonic in the assembler. Logically ANDs the specified bit in the A, B, or CC register with a bit in memory. The result is stored in the source register. Direct addressing mode only. The first two bytes of the instruction are the instruction code, the next byte is a postbyte, and the last byte is the least significant byte of the address.
Direct - $1130, 4 bytes
Example:
BAND B,2,4,$40
The above would AND bit 4 of B with bit 2 of DP:40, storing the result in B. Note the strange order here - following B you specify the bit in the memory location, followed by the bit in the register. The resulting machine code would be:
11 30 A2 40
The postbyte is composed of the following sections:
00 = CC, 01 = A, 10 = B, 11 = invalid
Add functionality that will allow a user to extract all files from a CAS image file. The CAS image will be scanned, and the files extracted and saved to the host computer. The completion of this issue requires issue #8 to be completed first.
The RMB operation will set aside the specified number of bytes at the location where the RMB is defined. This means that if the user specifies:
RMB $8
then 8 bytes will be inserted at the specified location, all containing a zero value.
Implement the ANDD 6309 instruction mnemonic in the assembler. Performs a logical AND of a double-byte value with register D, storing the result in D.
Immediate - $1084, 4 bytes
Direct - $1094, 3 bytes
Indexed - $10A4, 3+ bytes
Extended - $10B4, 4 bytes
Example:
ANDD #$1010
A new fileutil package should be introduced that will handle saving the assembled contents of a program to and from various disk and tape file formats. The cocoasm package will call the fileutil package to perform any I/O routines.
Describe the bug
Negative values within the operands cause an assembly error.
To Reproduce
Create a file called negtest.asm:
NAM NEGTEST
ORG $0600
BEGIN STD -1,X
END BEGIN
python3 ./assemble.py --print negtest.asm
Will result in:
[-1] is an invalid value
$ BEGIN STD -1,X
Expected behavior
With the print statement in place, should provide the following output:
-- Assembled Statements --
$0000 NAM NEGTEST ;
$0600 ORG $0600 ;
$0600 ED1F BEGIN STD -1,X ;
$0602 END BEGIN ;
Implement the ADDR 6309 instruction mnemonic in the assembler. Adds contents of a source register to the contents of the destination register. All registers except for Q and MD are allowed.
Immediate - $1030, 3 bytes
The command works with the following syntax:
ADDR r0,r1
Where r0 is the source and r1 is the destination. The source is stored as the high nibble of the operand byte, and the destination is stored as the low nibble of the operand byte. Valid source and destination values are:
0000 - D
0001 - X
0010 - Y
0011 - U
0100 - S
0101 - PC
0110 - W
0111 - V
1000 - A
1001 - B
1010 - CC
1011 - DP
1110 - E
1111 - F
Example:
$10 30 8E ADDR A,E
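The operand byte in the example can be reproduced from the register table above. The sketch below uses hypothetical names and is not taken from the assembler's source:

```python
REGISTER_CODES = {
    "D": 0b0000, "X": 0b0001, "Y": 0b0010, "U": 0b0011,
    "S": 0b0100, "PC": 0b0101, "W": 0b0110, "V": 0b0111,
    "A": 0b1000, "B": 0b1001, "CC": 0b1010, "DP": 0b1011,
    "E": 0b1110, "F": 0b1111,
}


def inter_register_post_byte(source, destination):
    """Pack the source into the high nibble and the destination into the low nibble."""
    return (REGISTER_CODES[source] << 4) | REGISTER_CODES[destination]


# ADDR A,E: instruction code $1030 followed by the operand byte
print(f"10 30 {inter_register_post_byte('A', 'E'):02X}")  # 10 30 8E
```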
Describe the bug
When using direct addressing mode, no bytes seem to be created. I am using ADCA as an example, but the issue seems to affect all direct addressing mode mnemonics.
To Reproduce
Steps to reproduce the behavior:
Create a file called test.asm:
ORG $1000
ADCA <$9F
python assembler.py --print test.asm
Expected behavior
With the print parameter in place, it should produce this output:
-- Assembled Statements --
$1000 ORG $1000 ;
$1000 999F ADCA <$9F
Actual behavior
I am getting no byte codes at all:
-- Assembled Statements --
$1000 ORG $1000 ;
$1000 ADCA <$9F
Describe the bug
RTS statements are skipped during assembly in some instances.
To Reproduce
Steps to reproduce the behavior:
Create a file called rtsbug.asm:
NAM RTSBUG
ORG $0E00
START LDA #$01 ;
RTS ;
END START
python3 assembler.py --print rtsbug.asm
The RTS statement will not be compiled:
-- Assembled Statements --
$0000 NAM RTSBUG ;
$0E00 ORG $0E00 ;
$0E00 8601 START LDA #$01 ;
$0E02 RTS ; ;
$0E02 END START ;
Expected behavior
Expected behavior results in the following output:
-- Assembled Statements --
$0000 NAM RTSBUG ;
$0E00 ORG $0E00 ;
$0E00 8601 START LDA #$01 ;
$0E02 39 RTS ; ;
$0E03 END START ;
Describe the bug
16-bit instructions using program counter relative addressing should use 16-bit offsets by default. Currently only LEAX, LEAY, LEAS and LEAU use the correct behavior. Program counter indexed relative addressing modes for instructions STX, STU, STD, ADDD, CMPX, LDD, LDX, LDU, SUBD, CMPD, CMPS, CMPU, LDS, CMPY, LDY, STS, and STY need to be fixed.
To Reproduce
Create a file called test.asm and add the following contents:
ORG $3F00
START STX 1,PCR
LDA #$0A
NEXT RTS
END START
python3 assembler.py test.asm --print
-- Assembled Statements --
$3F00 ORG $3F00 ;
$3F00 AF8C01 START STX 1,PCR ;
$3F03 860A LDA #$0A ;
$3F05 39 NEXT RTS ;
$3F06 END START ;
Expected behavior
Correct output should be:
-- Assembled Statements --
$3F00 ORG $3F00 ;
$3F00 AF8D0001 START STX 1,PCR ;
$3F04 860A LDA #$0A ;
$3F06 39 NEXT RTS ;
$3F07 END START ;
In order to be useful, assembled programs need to be stored in a format that contains the machine code offset where the program should be stored and executed from. One such format is the CAS (cassette tape) format. The purpose of this issue is to update the fileutil package that the assembler can call on to save the assembled statements to a cassette tape file. This issue requires issue #8 to be completed first.
Add the AIM instruction. Performs a logical AND of an 8-bit immediate value with the contents of a memory byte and stores the result in that memory byte. Meant to collapse the LDA, ANDA, and STA 6809 operations into a single instruction.
AIM:
Direct - $02, 3 bytes
Indexed - $62, 3+ bytes
Extended - $72, 4 bytes
The command works with the following syntax:
AIM #$0E;$FFFE
Note the semi-colon used to separate the immediate value and the memory address.
Currently, FDB only allows one double byte to be defined per line, so if multiple double bytes need to be specified, multiple lines need to be inserted - one FDB definition per line:
VAR FDB $4832
FDB $4532
FDB $4C32
FDB $4C32
FDB $4F32
In some assembly language listings, the FDB pseudo operation allows for multiple double bytes to be defined:
VAR FDB $4832,$4532,$4C32,$4C32,$4F32
It would be more convenient to allow for this condensed version.
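A minimal sketch of how a comma-separated FDB operand could be turned into bytes. It assumes every value is a hexadecimal literal with a $ prefix; the function name assemble_fdb is hypothetical and not part of the assembler:

```python
def assemble_fdb(operand):
    """Convert a comma-separated list of $-prefixed hex words to big-endian bytes."""
    output = bytearray()
    for item in operand.split(","):
        value = int(item.strip().lstrip("$"), 16)
        output += value.to_bytes(2, "big")  # raises OverflowError above $FFFF
    return bytes(output)


print(assemble_fdb("$4832,$4532,$4C32,$4C32,$4F32").hex().upper())
# 483245324C324C324F32
```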
Implement the BEOR 6309 instruction mnemonic in the assembler. Logically XORs the specified bit in the A, B, or CC register with a bit in memory. The result is stored in the source register. Direct addressing mode only. The first two bytes of the instruction are the instruction code, the next byte is a postbyte, and the last byte is the least significant byte of the address.
Direct - $1134, 4 bytes
Example:
BEOR B,2,4,$40
The above would XOR bit 4 of B with bit 2 of DP:40, storing the result in B. Note the strange order here - following B you specify the bit in the memory location, followed by the bit in the register. The resulting machine code would be:
11 34 A2 40
The postbyte is composed of the following sections:
00 = CC, 01 = A, 10 = B, 11 = invalid
Additional unit tests are needed to cover functionality in classes Operands, Statement, Program, and Instruction. Additional integration tests are also necessary to ensure top to bottom validity of the assembler. See CodeCov report for information on where unit tests are necessary.
Add the ability to detect whether a cassette file already exists, and allow for appending to the file instead of blindly overwriting it.
Implement the ANDR 6309 instruction mnemonic in the assembler. Performs a logical AND with a source and destination register, storing the result in the destination register.
Immediate - $1034, 3 bytes
Register codes:
D - 0000
X - 0001
Y - 0010
U - 0011
S - 0100
PC - 0101
W - 0110
V - 0111
A - 1000
B - 1001
CC - 1010
DP - 1011
E - 1110
F - 1111
Example:
ANDR A,B
Extend the ADD mnemonic to allow for E and F variants. Adds the byte at the specified memory location to the register, and stores the result in the register.
ADDE:
Immediate - $118B, 3 bytes
Direct - $119B, 3 bytes
Indexed - $11AB, 3+ bytes
Extended - $11BB, 4 bytes
ADDF:
Immediate - $11CB, 3 bytes
Direct - $11DB, 3 bytes
Indexed - $11EB, 3+ bytes
Extended - $11FB, 4 bytes
The command works with the following syntax:
ADDE #$7A
Example:
$11 8B 7A ADDE #$7A
Describe the bug
The file utility is unable to read files that do not have preamble and postamble data associated with their file content. Preamble and postamble content is meant to describe binary files. It contains information relating to the length of the binary file, the load address, and the executable address of the binary after it has been loaded into memory. Currently, the file_util.py script assumes that all files have preamble and postamble content. As a result, when a BASIC file or other data file is read, it is dismissed as malformed, since the script cannot determine how many bytes are in the file (and it also sees the preamble header as being incorrect).
The fix needs to occur in the list_files function on the DiskFile class in cocoasm/virtualfiles/disk.py around line 140 (look for a TODO block). The function needs to check the file type to see if it is an object file. If not, then it needs an alternate way to calculate file length. Most notably, the FAT for the final granule records how many sectors are in use. The directory entry record returns how many bytes are used in the last sector of the last granule in the file. For non-object files, these two pieces of information can be used to obtain the total length of the file (number of data bytes to read). This should effectively let the file utility read the files on disk and properly extract them to cassette files or binary files.
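The length calculation for non-object files could look roughly like the sketch below. It assumes 256-byte sectors, 9 sectors per granule, and that the caller has already walked the FAT chain to count the granules and read the number of sectors used (1 to 9) from the final FAT entry. The function name and parameters are hypothetical and do not reflect the actual DiskFile API.

```python
SECTOR_SIZE = 256
SECTORS_PER_GRANULE = 9


def non_object_file_length(granule_count, sectors_in_last_granule, bytes_in_last_sector):
    """Total data bytes in a file, given its FAT chain and directory entry values."""
    full_sectors = (granule_count - 1) * SECTORS_PER_GRANULE
    full_sectors += sectors_in_last_granule - 1
    return full_sectors * SECTOR_SIZE + bytes_in_last_sector


# Two granules, three sectors used in the last one, 100 bytes in the last sector
print(non_object_file_length(2, 3, 100))  # 2916
```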
To Reproduce
Use a DSK file image that contains a BASIC file or non-machine language file.
python3 file_util.py --list diskimage.dsk
Expected behavior
The BASIC file should be listed in the disk contents.
Implement the ADCR 6309 instruction mnemonic in the assembler. Adds contents of a source register, plus the carry flag, to the contents of the destination register. All registers except for Q and MD are allowed.
Immediate - $1031, 3 bytes
The command works with the following syntax:
ADCR r0,r1
Where r0 is the source and r1 is the destination. The source is stored as the high nibble of the operand byte, and the destination is stored as the low nibble of the operand byte. Valid source and destination values are:
0000 - D
0001 - X
0010 - Y
0011 - U
0100 - S
0101 - PC
0110 - W
0111 - V
1000 - A
1001 - B
1010 - CC
1011 - DP
1110 - E
1111 - F
Example:
$10 31 8E ADCR A,E
Describe the bug
When program counter relative indexing is used within a program, the assembler will by default use a 16-bit relative offset instead of checking to see if an 8-bit offset would be appropriate.
To Reproduce
Create a file called pcrtest.asm:
NAM PCRTEST
ORG $0600
VAR FCB 0
BEGIN LDA $FF
STY VAR,PCR
END BEGIN
python3 ./assemble.py --print pcrtest.asm
-- Assembled Statements --
$0000 NAM PCRTEST
$0600 ORG $0600
$0600 00 VAR FCB 0
$0601 96FF BEGIN LDA $FF
$0603 10AF8DFFFE STY VAR,PCR
$0608 END BEGIN
Expected behavior
With the print statement in place, should provide the following output:
-- Assembled Statements --
$0000 NAM PCRTEST
$0600 ORG $0600
$0600 00 VAR FCB 0
$0601 96FF BEGIN LDA $FF
$0603 10AF8CF9 STY VAR,PCR
$0607 END BEGIN
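The expected 8CF9 bytes follow from measuring the offset from the address of the next instruction. The sketch below shows one way to choose between the 8-bit ($8C) and 16-bit ($8D) program counter relative post-bytes as this issue requests. The function name and parameters are hypothetical.

```python
def pcr_operand_bytes(target, instruction_address, opcode_length):
    """Return post-byte plus offset bytes for a program counter relative operand."""
    # 8-bit form: opcode bytes + post-byte + one offset byte
    offset = target - (instruction_address + opcode_length + 2)
    if -128 <= offset <= 127:
        return bytes([0x8C, offset & 0xFF])
    # 16-bit form: opcode bytes + post-byte + two offset bytes
    offset = target - (instruction_address + opcode_length + 3)
    return bytes([0x8D]) + (offset & 0xFFFF).to_bytes(2, "big")


# STY VAR,PCR at $0603 with VAR at $0600; the STY indexed opcode $10AF is two bytes
print(pcr_operand_bytes(0x0600, 0x0603, 2).hex().upper())  # 8CF9
```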
Currently, the assembler prints out lines that are greater than 120 characters wide when using the --print or --symbols switches. Historically, console windows have been limited to 80 or 100 characters wide when initialized, unless users specify an override. In order for the output of the assembler to be more readable, the output width should be truncated to a reasonable size - approximately 100 or 120 characters. Additionally, a switch should allow users to specify how wide they want the output to be when printing to the screen.
The purpose of this issue is to add the ability to append an assembled program to an existing DSK image. Only Disk Basic filesystems need to be supported. The FileUtility class will need to check to make sure the disk is properly formatted, and that enough free granules exist to add the file to the disk. The directory structure will also need to be updated to contain the file. This Issue requires issue #6 to be completed before being implemented.
Similar in nature to a CAS file, implement functionality that will save the file as a WAV file so that the user can play back the file and use cassette IO on a CoCo to load the file.
Implement the ADDW 6309 instruction mnemonic in the assembler. Adds contents of double byte to the W register.
Immediate - $108B, 4 bytes
Direct - $109B, 3 bytes
Indexed - $10AB, 3+ bytes
Extended - $10BB, 4 bytes
Example:
ADDW #$1010
Describe the bug
When a PSHU or PULU instruction is entered, varying error messages are thrown regarding their operands. For example a PSHU A throws an error that [A] not in symbol table. Similarly, adding additional registers throws other errors, for example when PSHU A,B is entered, it throws an Instruction [PSHU] does not support indexed addressing error.
To Reproduce
Steps to reproduce the behavior:
Create a file called pshu_test.asm and add the following instruction to it:
PSHU A,B
python3 ./assembler.py pshu_test.asm
Expected behavior
The file should assemble correctly.
Additional context
This is probably due to the fact that the two instructions (PSHU and PULU) are not tagged as being is_special in the instruction table, and accordingly are not included with PSHS and PULS in the SpecialOperand class in the operands.py file.
Describe the bug
Loading D with an 8-bit value results in a printing error when the assembly source is printed during compilation. The actual compiled source is correct, but the print values are incorrect.
To Reproduce
Steps to reproduce the behavior:
Create a file called lddtest.asm:
NAM LDDBUG
ORG $0600
START LDD #$1
END START
python3 assembler.py --print lddbug.asm --bin_file lddbug.bin
-- Assembled Statements --
$0000 NAM LDDBUG ;
$0600 ORG $0600 ;
$0600 CC01 START LDD #$1 ;
$0603 END START ;
Note that the LDD #$1 is compiled to CC01 when it should be CC0001. The size of the instruction is correct however, as the next statement begins at $0603, indicating that the instruction plus operands consumed 3 bytes. However, when you view the file with a hex editor, you will see that it is two bytes long, containing the sequence CC 01. This means that the 16-bit value is being ignored in favor of an 8-bit value instead.
Expected behavior
Expected that the resultant output would be:
-- Assembled Statements --
$0000 NAM LDDBUG ;
$0600 ORG $0600 ;
$0600 CC0001 START LDD #$1 ;
$0603 END START ;
The resultant output file should contain the sequence:
CC 00 01
Describe the bug
Character literals are not accepted as operands for immediate addressed instructions. Character literals are usually denoted with 'C for example, representing the letter C. This saves having to convert the character literal to a numeric value manually.
To Reproduce
Create a file called literal.asm with the following code:
NAM LITERAL
ORG $0600
START LDA #'C
END START
python3 assembler.py --print literal.asm
Invalid operand value
line: START LDA #'C
Expected behavior
The following output should occur from the assembler:
$0600 ORG $0600
$0600 8643 START LDA #'C
$0602 END START
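Resolving a character literal amounts to taking the character's ASCII code. A small sketch, with a hypothetical function name, that accepts the 'C form alongside hexadecimal and decimal immediates:

```python
def immediate_value(operand):
    """Resolve an immediate operand such as #'C, #$43, or #67 to an integer."""
    if not operand.startswith("#"):
        raise ValueError(f"[{operand}] is not an immediate operand")
    body = operand[1:]
    if body.startswith("'") and len(body) == 2:
        return ord(body[1])          # character literal
    if body.startswith("$"):
        return int(body[1:], 16)     # hexadecimal literal
    return int(body)                 # decimal literal


# LDA immediate is opcode $86
print(f"86{immediate_value(chr(35) + chr(39) + 'C'):02X}")  # 8643
```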
Implement functionality that will let a user extract a file from a CAS file, and save it to a virtual DSK file. The DSK file can be a new, blank disk, or it can be an existing disk, in which case the operation will append to the file allocation table, and save it to the disk (assuming there is enough space).
Add functionality that will allow a user to extract a file from a CAS image file. The CAS image will be scanned, and the file extracted and saved to the host computer. The completion of this issue requires issue #8 to be completed first.
There are currently several pathways through the codebase that do not have adequate test coverage. Prior to Release 1.0.0, tests should be implemented to cover missing LOC to ensure that future additions do not cause regressions or inadvertent bugs.
Prior to Release 1.0.0, README documentation should be completed to ensure that users of the project know how to use the assembler, as well as understand the mnemonics used by the assembler.
Implement the ADCD 6309 instruction mnemonic in the assembler. Adds the contents of a double-byte memory value, plus the carry flag, to register D, and stores the result in D.
Immediate - $1089, 4 bytes
Direct - $1099, 3 bytes
Indexed - $10A9, 3+ bytes
Extended - $10B9, 4 bytes
Example:
ADCD #$1010
Functionality should be added that will allow a user to save any arbitrary file to a DSK image. This issue requires issue #8 to be completed first.
A macro is a block of assembly language statements that can be defined and then inserted into an assembly language program at any location. A macro is defined by a MACRO and ENDM pair. A macro must also have a label. The macro is called within the assembly language routine by using the symbol name as if it were any other mnemonic. Values can be passed to macros within the operand field when specifying the macro. Passed values are specified within the macro definition with the \ character and the number for the value (the first passed value will be \0, the next \1, etc). For example:
TEST MACRO
LDA \0
ENDM
ORG $0E00
TEST #$40
END
Macros cannot be nested.
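As a rough illustration of the expansion step, the sketch below collects macro bodies in one pass and substitutes \0, \1, and so on when the macro name appears as a mnemonic. It splits lines on whitespace, so it does not handle labels on the calling line; the function name is hypothetical and this is not the assembler's implementation.

```python
import re


def expand_macros(lines):
    """Remove MACRO/ENDM definitions and expand macro calls in place."""
    macros, output, current = {}, [], None
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "MACRO":
            current = fields[0]              # label names the macro
            macros[current] = []
        elif fields and fields[0] == "ENDM":
            current = None
        elif current is not None:
            macros[current].append(line)     # body line of the open macro
        elif fields and fields[0] in macros:
            args = fields[1].split(",") if len(fields) > 1 else []
            for body in macros[fields[0]]:
                output.append(
                    re.sub(r"\\(\d)", lambda m: args[int(m.group(1))], body)
                )
        else:
            output.append(line)
    return output


source = ["TEST MACRO", " LDA \\0", " ENDM", " ORG $0E00", " TEST #$40", " END"]
print(expand_macros(source))  # [' ORG $0E00', ' LDA #$40', ' END']
```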
Functionality should be added to allow a user to extract a file from a DSK image. The DSK image will be scanned for the filename in question, and its contents will be extracted and saved to the host computer. This issue requires issue #8 to be completed first.
The purpose of this feature request is to allow for binary numbers to be represented as numeric literals. Right now, both decimal and hex values are allowed:
LDA #$80 ; Loads 128 into A
LDA #128 ; Loads 128 into A
Binary values should be allowable with a prefix of %:
LDA #%10000000 ; Loads 128 into A
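Parsing the three literal forms could be sketched as follows. The function name is hypothetical and not taken from the assembler:

```python
def parse_numeric_literal(text):
    """Parse $-prefixed hex, %-prefixed binary, or plain decimal literals."""
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.startswith("%"):
        return int(text[1:], 2)
    return int(text, 10)


for literal in ("$80", "128", "%10000000"):
    print(literal, parse_numeric_literal(literal))  # each resolves to 128
```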
Currently, FCB only allows one byte to be defined per line, so if multiple bytes need to be specified, multiple lines need to be inserted - one FCB definition per line:
VAR FCB $48 ; HELLO
FCB $45
FCB $4C
FCB $4C
FCB $4F
In some assembly language listings, the FCB pseudo operation allows for multiple single bytes to be defined:
VAR FCB $48,$45,$4C,$4C,$4F ; HELLO
It would be more convenient to allow for this condensed version.
Implement the ASLD 6309 instruction mnemonic in the assembler. Performs an arithmetic shift left of the double-byte register D, storing the result in D.
Inherent - $1048, 2 bytes
Example:
ASLD
In certain circumstances, a user may issue an instruction such as:
LBCC FUNCTION
This instructs the CPU to perform a long branch to where FUNCTION is located in memory. Long branches are useful since they can be used to branch to anywhere in the 64K memory space. However, if FUNCTION is within +129 bytes or -126 bytes of the current instruction, then we can transform the LBCC to a BCC instead, saving several cycles, as well as two bytes of instruction op-codes.
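The range check follows from the fact that a short branch is two bytes long and its signed 8-bit offset is measured from the address of the next instruction. A minimal sketch, with a hypothetical function name:

```python
def can_use_short_branch(branch_address, target_address):
    """True if target is reachable with a two-byte short branch at branch_address."""
    offset = target_address - (branch_address + 2)
    return -128 <= offset <= 127


print(can_use_short_branch(0x1000, 0x1000 + 129))  # True
print(can_use_short_branch(0x1000, 0x1000 + 130))  # False
```

Note that shortening one branch moves every later statement, so an assembler would likely need to re-check other branches after each transformation.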