Giter VIP home page Giter VIP logo

Comments (23)

chipaca avatar chipaca commented on July 22, 2024

Out of curiosity, which libc functions take or return ints? (turns out it's hard to get a list of that from google).

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

Checking closer, I don't see careful handling:

machine_int_t mp_obj_get_int(mp_obj_t arg);
int ord = mp_obj_get_int(o_in);

Careful handling for that would require explicit cast. And one may only wonder why -Wall in gcc doesn't include truncation warning. It takes -Wconversion to enable it.

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

Out of curiosity, which libc functions take or return ints? (turns out it's hard to get a list of that from google).

Like, you want complete list? ;-) (Almost) any C standard lib func which takes integer value, takes it as type int, even memchr is defined as memchr (void *__s, int __c, size_t __n). Most trivial one which returns int is strlen().

from micropython.

hagenkaye avatar hagenkaye commented on July 22, 2024

not really, functions such as strlen return size_t

Generally, on a 32 bit machine size_t would be 32 bits while on a 64 bit machine size_t would be 64 bits. I say generally, because even these cases are not 100% true at all times.

The C library defines most functions with these types of types instead of ints for portability. Also, the c sizeof() operator returns a size_t.

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

On my machine, it's size_t strlen().

Yes, I have not been careful enough with the distinction between int, unsigned int, machine_int_t and machine_uint_t. Whenever I have to write one of these in a type decl or defn, I stress about what the right thing to do is.

STM is 32-bit, so int, size_t and void* are all the same size for that port, so I wasn't super concerned about it. But I agree it needs to be sorted out.

Well, as suggested by @pfalcon, I would say use machine_int_t for cases where you are getting the value of a small int and doing small int arithmetic.

Don't you anyway have the problem on 64-bit machines using int, that it's not 64 bits?

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

not really, functions such as strlen return size_t

Good catch, I told that I don't do 64 bits and still partly live in happy world of K&R C which didn't have stuff like size_t ;-).

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

Ok, then if we need to deal with machine_int_t, then we need to be able to printf it in any port: #56.

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

Also, there's a fundamental problem with float vs double in printf. In C, for variadic arguments, you can only have doubles, and so in printf you can only print doubles. Problem: in the STM port we want to use only floats, since that's what the hardware supports. Really trick to get around (and gives some warnings currently for the stm port).

One solution I thought of was to pass pointers to floats for printf, and implement a "print-pointer-to-float" format specifier.

(Really should open another issue for this, but it's not high priority.)

from micropython.

hagenkaye avatar hagenkaye commented on July 22, 2024

perhaps this is helpful-

http://en.wikipedia.org/wiki/C_data_types#stdint.h

intptr_t is defined as an int that can hold a pointer in C99

Perhaps some other helpful info-

http://en.wikibooks.org/wiki/C_Programming/C_Reference/stdint.h#Integers_wide_enough_to_hold_pointers

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

From #683 :

I thought the issue was that int is used where it should be machine_int_t?

Yes.

Using int and then later promoting to machine_int_t is fine, no?

If it's used to hold value known to be strictly int, it's fine, in all other cases - no, as it can lead to truncation issues (we know that sizeof(int) <= sizeof(machine_int_t)).

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

If it's used to hold value known to be strictly int, it's fine

If a number starts out life as an int, and is manipuated using ints, then it's fine. Of course, this may overflow, but that's a separate issue (since using a machine_int_t might also overflow).

in all other cases - no

What other cases? Only if there was at some point a conversion from machine_int_t down to int is there a problem.

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

Of course, this may overflow, but that's a separate issue (since using a machine_int_t might also overflow).

But they overflow differently.

What other cases? Only if there was at some point a conversion from machine_int_t down to int is there a problem.

I see it differently - all Python values are (at least) machine_int_t, all Python length are machine_int_t. That means that all values (and lengths) should be manipulated as machine_int_t. The only way int can be used is if it can be guaranteed it won't overflow (which is often hard).

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

But they overflow differently.

Until we have a decent amount of checking for int/machine_int_t overflow, I think the distinction you are making is a sub-leading effect.

If it's going to overflow, that's bad. If we use int then it'll overflow earlier (on 64 bit), so using machine_int_t will help a little, but won't solve the true problem.

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

If we use int then it'll overflow earlier (on 64 bit), so using machine_int_t will help a little, but won't solve the true problem.

The issue, if the proper type is used, it might not overflow at all. And it's rather expected that 64-bit build should operate of values of ~64 bit magnitude without overflow (and operate on objects of similar length).

If you say it can wait sure, I agree. But the issue does exist. And again,such issues are scholastic and boring, and become real issues only when CVE's are issued for them ;-).

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

I'm happy to start going through the core code and ironing out these issues. I won't really understand exactly what to do until I have a few good examples.

Start with obj.c/mp_get_index: there are 3 cases of ints: len passed in, i used internally, and the returned value. Now, the returned value should definitely be changed to a machine_int_t. But what about the signedness of all these? Input len is always non-negative, and i could be negative, while the return value is guarandeed to be non-negative.

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

Well, the simplest solution would be to just ban usage of int/uint (except for interfacing with "external APIs" (like C stdlib)). But I don't think that's good solution ;-) (but it might be).

Signedness (#683) is first approximation can stay as is. Trying to clean it up either, we'd need to: 0) check if it's defined how C performs signed vs unsigned. I'd say that most people intuitively expect it to be done as unsigned then (well, I surely do), but I'm not sure it's actually defined.

Assuming we can't rely on C sign-mix comparisons, we either need to track signed vs unsigned value in separate variables (not C way at all), or to cast it explicitly in comparisons. For example, in https://github.com/micropython/micropython/blob/master/py/obj.c#L342, i is known to be non-negative, so should be case to machine_int_t.

And as I already hinted, I'd really suggest to consider renaming machine_int_t to (for example) mp_int_t.

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

And as I already hinted, I'd really suggest to consider renaming machine_int_t to (for example) mp_int_t.

I don't recall your hint, but that sounds sensible. Also we can probably remove machine_ptr_t and machine_const_ptr_t, since they are always (true for all machines?) void *, and are only used to define mp_obj_t and const version.

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

I mentioned that people use "int" because "machine_int_t to" is just too long. ;-)

Yes, sounds good about the other types.

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

Okay, this is getting old, and I guess we came to a consensus: rename machine_int_t to mp_int_t (and uint version), and try to use that everywhere instead of int/uint.

If no further objections I'll make this change.

from micropython.

pfalcon avatar pfalcon commented on July 22, 2024

+1 for rename. I'm not sure if you intend to just do blind search&replace s/int/mp_int_t/ - one of the reason this is getting old is because some people might think that it's kind of "optimization" for 64-bit to use int is it's known that value is small enough. Are you ready to give that up? ;-)

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

On stmhal, it's 32 bit only, so we're not giving up anything :)

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

Renamed machine to mp in 40f3c02, and started converting int/uint to mp_int_t/mp_uint_t in 54eb4e7 and 3816182. The latter change needs to be done by hand, since there are occurrences of int/uint in comments and strings that should not be changed.

from micropython.

dpgeorge avatar dpgeorge commented on July 22, 2024

Almost all of the cases of int/uint in uPy core header files are now changed to mp_int_t/mp_uint_t where appropriate. And a lot of the cases in the .c source have also been fixed.

For future reference: you can use int/uint, but only if you know that your number will fit in 16 bits (since that's the minimum int/uint is guaranteed to be). In most cases you will use mp_int_t/mp_uint_t.

from micropython.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.