Comments (9)
Hi :)
Thanks for checking out my code and thanks for commenting on it. I did actually consider using an optimization like you suggest, but it resulted in larger code size for x86_64, Atmel Mega16 and ARM-Cortex M3 so I decided against it. At least I can't get it any smaller ;)
The size difference for doing what you suggest above using my toolchain (mentioned in the README) can be seen below, where aes2.c is the modified version. It seems the "naive" code using multiplication compiles to a smaller text section on all three targets.
I haven't had the time to check the assembly output or benchmark it to see if there is any measurable performance difference and I've only got access to x86 and ARM platforms at the moment...
You do raise a valid point though, especially for platforms with slow multiply operations, but since one of the priorities of my library is small code size rather than speed, I think I will close this issue. If you can come up with a rewrite that makes the binary smaller than it is now, I'll gladly accept a pull request :)
$ arm-none-eabi-gcc -mthumb -Os -c aes.c ; size aes.o
   text    data     bss     dec     hex filename
   1883       0     204    2087     827 aes.o
$ arm-none-eabi-gcc -mthumb -Os -c aes2.c ; size aes2.o
   text    data     bss     dec     hex filename
   1903       0     204    2107     83b aes2.o
$ avr-gcc -Os -c aes2.c ; size aes2.o
   text    data     bss     dec     hex filename
   2817       0     198    3015     bc7 aes2.o
$ avr-gcc -Os -c aes.c ; size aes.o
   text    data     bss     dec     hex filename
   2687       0     198    2885     b45 aes.o
$ gcc -Os -c aes.c ; size aes.o
   text    data     bss     dec     hex filename
   2760       0     224    2984     ba8 aes.o
$ gcc -Os -c aes2.c ; size aes2.o
   text    data     bss     dec     hex filename
   2818       0     224    3042     be2 aes2.o
from tiny-aes-c.
The two solutions below produce the same binary size. I'm guessing the assembly output is the same as well, but I'm too lazy to objdump it and check..
for(i = 0; i < Nk*4; i += 4)
{
  RoundKey[i + 0] = Key[i + 0];
  RoundKey[i + 1] = Key[i + 1];
  RoundKey[i + 2] = Key[i + 2];
  RoundKey[i + 3] = Key[i + 3];
}
i = Nk;
...
for(i = 0; i < Nk; ++i)
{
  RoundKey[(i * 4) + 0] = Key[(i * 4) + 0];
  RoundKey[(i * 4) + 1] = Key[(i * 4) + 1];
  RoundKey[(i * 4) + 2] = Key[(i * 4) + 2];
  RoundKey[(i * 4) + 3] = Key[(i * 4) + 3];
}
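For anyone who wants to convince themselves that the two loops really do the same thing before comparing object sizes, here is a minimal sketch that runs both on the same key and compares the results. It assumes Nk is 4 (AES-128); the function names copy_key_stride, copy_key_mul and variants_agree are made up for this example and are not part of the library.

```c
#include <stdint.h>
#include <string.h>

#define Nk 4 /* assumption: AES-128, i.e. a key of four 32-bit words */

/* Variant 1: step the byte index by 4, no multiplication in the body. */
void copy_key_stride(uint8_t *RoundKey, const uint8_t *Key)
{
    unsigned i;
    for (i = 0; i < Nk * 4; i += 4) {
        RoundKey[i + 0] = Key[i + 0];
        RoundKey[i + 1] = Key[i + 1];
        RoundKey[i + 2] = Key[i + 2];
        RoundKey[i + 3] = Key[i + 3];
    }
}

/* Variant 2: step the word index by 1 and multiply by 4 for each access. */
void copy_key_mul(uint8_t *RoundKey, const uint8_t *Key)
{
    unsigned i;
    for (i = 0; i < Nk; ++i) {
        RoundKey[(i * 4) + 0] = Key[(i * 4) + 0];
        RoundKey[(i * 4) + 1] = Key[(i * 4) + 1];
        RoundKey[(i * 4) + 2] = Key[(i * 4) + 2];
        RoundKey[(i * 4) + 3] = Key[(i * 4) + 3];
    }
}

/* Returns 1 if both variants copy an arbitrary test key identically. */
int variants_agree(void)
{
    uint8_t key[Nk * 4], a[Nk * 4], b[Nk * 4];
    unsigned i;

    for (i = 0; i < Nk * 4; ++i) {
        key[i] = (uint8_t)(i * 17u + 3u); /* arbitrary test pattern */
    }
    memset(a, 0x00, sizeof a); /* different fill values, so a skipped */
    memset(b, 0xFF, sizeof b); /* byte would show up in the comparison */

    copy_key_stride(a, key);
    copy_key_mul(b, key);

    return memcmp(a, b, sizeof a) == 0 && memcmp(a, key, sizeof a) == 0;
}
```

This only shows the two forms behave the same; whether they also compile to the same instructions still needs a look at the disassembly.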
from tiny-aes-c.
Some compilers will identify a multiply by a power of 2 and replace it with a shift. Also, the Cortex-M series has a multiply-accumulate instruction, so (i * 4) + 1 could be single cycle. @blackswords you could define MUL4 (or, more generally, MUL2N) as a macro, try it both ways in your environment and see.
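A rough sketch of what such a macro pair could look like is below. The names MUL4 and MUL2N come from the suggestion above; the USE_SHIFT_MUL switch and the word_offset helper are hypothetical and only there to make the two forms easy to swap. It assumes unsigned operands, since left-shifting negative values is not well defined in C.

```c
#include <stddef.h>

/* Hypothetical build switch, not part of tiny-aes-c:
   define USE_SHIFT_MUL to select the explicit shift form. */
#ifdef USE_SHIFT_MUL
#define MUL2N(x, n) ((x) << (n))        /* x * 2^n written as a shift   */
#else
#define MUL2N(x, n) ((x) * (1u << (n))) /* x * 2^n written as a multiply */
#endif

#define MUL4(x) MUL2N((x), 2)

/* Example use: byte offset of the i-th 32-bit word plus a byte index. */
size_t word_offset(size_t i, size_t byte)
{
    return MUL4(i) + byte;
}
```

Building once with -DUSE_SHIFT_MUL and once without, then running size on both objects, should show whether your compiler treats the two forms differently.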
from tiny-aes-c.
Well, I did some testing with my compiler and I noticed something very interesting. When compiling for an ARM Cortex-M0 (which doesn't handle multiplications in hardware), all integer multiplications are converted to a combination of left shifts and additions. A library call is only issued for divisions.
I'm glad to see that my compiler is not as dumb as I thought. I always preferred coding things explicitly over making assumptions about what the compiler will do. But I learnt today that it's in fact a good thing to look at the code produced by the compiler. I won't bother with integer multiplication from now on.
So, with a decent compiler, you can leave your code as it is without compromising the execution speed.
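To illustrate the general idea of replacing a multiply with shifts and additions, here is a small sketch. It is not the code any particular compiler emits, just the same arithmetic written out by hand; the names mul10 and mul_shift_add are made up for the example.

```c
#include <stdint.h>

/* Multiply by the constant 10 using shifts only: 10*x = 8*x + 2*x. */
uint32_t mul10(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* General shift-and-add multiply: for every set bit n in k, add x * 2^n.
   Results wrap modulo 2^32, exactly like a normal uint32_t multiply. */
uint32_t mul_shift_add(uint32_t x, uint32_t k)
{
    uint32_t result = 0;

    while (k != 0) {
        if (k & 1u) {
            result += x; /* this bit of k contributes the current x * 2^n */
        }
        x <<= 1;         /* move on to x * 2^(n+1) */
        k >>= 1;
    }
    return result;
}
```

For a constant factor the compiler knows which bits are set at build time, so it can emit just the handful of shifts and adds that are needed, without any loop.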
One last comment: it may be a good idea to use the preprocessor to switch the debugging features (printf & scanf) on or off, because on small chips they can consume a lot of useful memory when not needed.
from tiny-aes-c.
That's what I thought :) GCC optimizes very aggressively in my experience, and your results seem to confirm that.
Regarding the printf, it's only included in the test-file, which I think is easily portable to something else :)
I use this macro (with the Rowley Crossworks IDE, hence the non-standard cross_studio_io etc.).
It makes it easy to switch output on/off between rebuilds.
#if (PRINT_DEBUG_MESSAGES == 1)
  /* STM32 */
  #if defined(__ARM_ARCH_7M__)
    #include <cross_studio_io.h>
    #define dbg_printf(...) debug_printf(__VA_ARGS__)
  /* RM42, _WIN32_, ... */
  #else
    #include <stdio.h>
    #define dbg_printf(...) printf(__VA_ARGS__)
  #endif
#else
  #define dbg_printf(...) /* macro expands to nothing! */
#endif
debug_printf is easily replaceable with some other variadic debug function :)
from tiny-aes-c.
Hi @blackswords - I use the Linaro gcc compiler as well - @kokke you should check it out. And you are correct, the M0 doesn't have HW multiply, I forgot about that. I'm actually looking at using this AES on a much smaller 8/16-bit platform that unfortunately doesn't have as nice a compiler. The code is written kind of like assembly in C, so I think most compilers should treat it pretty well.
from tiny-aes-c.
I should mention that on some older, obscure compilers, variadic macros (and indeed even va_args) aren't well supported, so I typically resort to dbg_printf1(), dbg_printf2(), etc., where the number is the argument count.
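A minimal sketch of that fixed-arity approach could look like the following. The names dbg_printf1, dbg_printf2 and PRINT_DEBUG_MESSAGES are taken from the comments in this thread; the macro bodies, dbg_printf3, and the default of 1 for the switch are assumptions made for the example, with plain printf standing in for whatever debug output function the platform provides.

```c
#include <stdio.h>

/* Assumed default for this sketch: debug output enabled. */
#ifndef PRINT_DEBUG_MESSAGES
#define PRINT_DEBUG_MESSAGES 1
#endif

#if (PRINT_DEBUG_MESSAGES == 1)
/* One macro per argument count, so no variadic macro support is needed. */
#define dbg_printf1(fmt, a)       printf((fmt), (a))
#define dbg_printf2(fmt, a, b)    printf((fmt), (a), (b))
#define dbg_printf3(fmt, a, b, c) printf((fmt), (a), (b), (c))
#else
/* Disabled: each call collapses to a no-op statement. */
#define dbg_printf1(fmt, a)       ((void)0)
#define dbg_printf2(fmt, a, b)    ((void)0)
#define dbg_printf3(fmt, a, b, c) ((void)0)
#endif
```

A call site then reads dbg_printf2("round %d of %d\n", r, Nr); with the switch set to 0 the arguments are never evaluated, so they should not have side effects the program relies on.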
from tiny-aes-c.
@revlon What 8/16 bit platform are you referring to? I know some of them have a GCC port
from tiny-aes-c.
@revlon I used this library in OFB mode to implement encryption in a Z-Wave SoC using a 16bit Keil CX51 8051 compiler. Horrible horrible stuff man, but it had about the same RAM usage, can't remember ROM/FLASH size to be honest. Except for a few compiler specific things - IIRC writing __flash in front of the const arrays to properly place them in ROM - this code compiled out of the box. Worked out of the box on a PIC18 as well. I think it's easily portable to whatevs.
from tiny-aes-c.