Comments (32)
I think they are using a JS proxy, so that from the JS point of view the python dict looks like an object, but mutating the object also mutates the python dict.
PythonMonkey dev here: Yup - PythonMonkey passes the underlying data by reference for dictionaries and objects! Thanks for pointing out that behaviour with the number keys. There are some other edge cases like that at the moment, and we're working on ironing them out!
If anyone has any questions about PythonMonkey or just wants to chat, just send an email to [email protected] to get in touch with the devs!
Also, it's really cool to see one of our projects mentioned here!!!! We're huge fans of Pyodide and it plays an important role in our main product at Distributive (Distributed computing using JS/WASM engines)
from pyodide.
Opened PR #4576.
from pyodide.
Random idea for "fixing" the weird behavior when mixing string and non-string keys. We could introduce the following rules:
- non-string keys are completely hidden and inaccessible from the JS world
- if a python dict contains any non-string keys, the JS world cannot mutate it (the problem is that this is non-trivial to check and potentially expensive)
Basically, if a python dict contains only string keys, it's isomorphic to a JS object and they can freely interact. If it contains other keys, we try to prevent the JS world from messing with it. But it's also 1:52 am for me, so I might be saying nonsense :)
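To make the cost of that check concrete, here is a minimal sketch in plain JavaScript. It assumes a `Map` as a stand-in for the Python dict (the real check would live on the Python side), and `hasOnlyStringKeys` is a hypothetical helper name, not an existing API.

```javascript
// Hypothetical helper: a Map stands in for a Python dict here.
function hasOnlyStringKeys(dictLike) {
  for (const key of dictLike.keys()) {
    // Every key has to be visited, so the check is linear in the dict size.
    if (typeof key !== "string") return false;
  }
  return true;
}

const safe = new Map([["foo", 1], ["bar", 2]]);
const mixed = new Map([["foo", 1], [100, "hello"]]);

console.log(hasOnlyStringKeys(safe));  // true
console.log(hasOnlyStringKeys(mixed)); // false
```

The scan would have to be repeated (or cached and invalidated) before each mutation from JS, which is where the "potentially expensive" concern comes from.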
from pyodide.
Hi, apologies for the naive small-brained fumblings, bear with me here...
I think this suggestion is a good one.
Why?
- It doesn't break anything (folks [i.e. PyScript and other Pyodide users] can provide an optional alternative converter that works for their particular use case - Pyodide just keeps on working as before should such an alternative not be given).
- While I understand a `map` might be a more correct mapping (no pun intended), all the JS APIs and modules I know (and I suspect JS users in general) expect an object literal and NOT a map. As a result there is an added burden of conversion in user-created code that it would be nice to (optionally) remove for folks, if given the opportunity to do so.
- Let many flowers bloom etc etc etc
I know @tedpatrick pointed out this situation pretty much in his first week working with Pyodide, and he and @fpliger started work on a "hack" around it. I believe @WebReflection's suggestion in this issue of an optional parameter would help folks avoid having to create such hacks and give everyone a single and easy to understand way to flex Pyodide's power to their own use-cases.
I want to be very clear here: I care very deeply about Pyodide as a project upstream to one of my own (PyScript), and I don't want folks in such upstream projects to feel burdened by me. Rather, I want to engage in a supportive yet open manner, especially when I'm unsure why there's not been any movement on an issue. (I'm 99.9% certain it's because of time constraints - I have huge empathy for that situation).
So I have two questions:
- What can we do to engage, collaboratively and in a friendly spirit of moving things forward, so progress is made on addressing this situation so Pyodide can grow..?
- If we (folks in PyScript) provided a PR that implements the proposed behaviour in as small and self-contained manner as possible, would such a PR be considered for merging by Pyodide maintainers..?
Best wishes to all.
from pyodide.
Sorry for the late reply! I also think it would be useful to convert dict to object, given how widely objects are used in the JS ecosystem. But I don't want to introduce a breaking change for this right now, so I am +1 for adding `defaultDictConverter` (or whatever we will call it).
So, unless @hoodmane objects, I am open to accepting a PR from the PyScript folks implementing this.
from pyodide.
You can access Module and js APIs inside python2js so probably you can start from there.
from pyodide.
Thanks for all the proposals @WebReflection. I'm not sure; I'll have to think about it more and get back to you. This `LiteralMap` suggestion is an interesting idea.
from pyodide.
P.S. any Map operation on that instance would work, so `get`, `set`, or `has` work out of the box but should never be revealed as an object literal entry unless explicitly re-defined ... the `in` operator would work too, so as it is, my latest code should cover all the details and I think you could swap the current Map to be that LiteralMap and see all your tests green already!
from pyodide.
In a nutshell: https://medium.com/@willkantorpringle/pythonmonkey-javascript-wasm-interop-in-python-using-spidermonkey-bindings-4a8efce2e598#f7a6
from pyodide.
I played a bit with pythonmonkey.
They are not converting python dicts to JS objects, even if they have a meme which says so. They even write it explicitly a few lines below the meme:
For example, when a Python dictionary is passed to a JavaScript function, a "proxy" object is created.
So, this is very different than pyodide's `to_js(..., dict_converter=js.Object.fromEntries)`, which creates a copy.
However, I think they are doing something interesting compared to pyodide's PyProxy: I think they are using a JS proxy, so that from the JS point of view the python dict looks like an object, but mutating the object also mutates the python dict.
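The difference between the two approaches can be sketched in plain JavaScript. This is only an illustration under assumptions: a `Map` stands in for the Python dict, and the proxy handler below is a hypothetical minimal one, not PythonMonkey's or Pyodide's actual implementation.

```javascript
// A Map stands in for the Python dict; this is not PythonMonkey's or Pyodide's code.
const pyDict = new Map([["a", 1]]);

// Copy semantics, as with dict_converter=Object.fromEntries:
const copy = Object.fromEntries(pyDict);
copy.b = 2;                       // only the copy changes
console.log(pyDict.has("b"));     // false

// Proxy semantics: property access is forwarded to the backing Map.
const view = new Proxy({}, {
  get: (_target, key) => pyDict.get(key),
  set: (_target, key, value) => { pyDict.set(key, value); return true; },
  has: (_target, key) => pyDict.has(key),
});
view.c = 3;                       // mutates pyDict through the proxy
console.log(pyDict.get("c"));     // 3
console.log(view.a);              // 1
```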
This is the code which I used to play with it:
from pythonmonkey import eval as js_eval
import pythonmonkey as pm

make_jsobj = js_eval(r"""
(x) => {
    return {};
};
""")

identity = js_eval(r"""
(x) => {
    return x;
};
""")

add_some_keys = js_eval(r"""
(x) => {
    x.foo = 42;
    x['bar'] = 43;
    x[100] = 44;
};
""")
JS objects are also not converted to python dicts: they become proxies when they enter the python world:
>>> obj = make_jsobj()
>>> obj
{}
>>> type(obj)
<class 'pythonmonkey.JSObjectProxy'>
If you pass a python dict to JS and back, the round-trip preserves the identity, which is very good:
>>> d = {}
>>> d2 = identity(d)
>>> type(d2)
<class 'dict'>
>>> d is d2
True
The fact that python dicts are proxied becomes very apparent if we try to mutate the object in JS land: the changes are reflected to the python dict (good):
>>> add_some_keys(d)
>>> type(d)
<class 'dict'>
>>> d
{'foo': 42.0, 'bar': 43.0, '100': 44.0}
However, there are still very weird corner cases. For example, consider this:
>>> d3 = {100: 'hello'}
>>> add_some_keys(d3)
>>> d3
{100: 'hello', 'foo': 42.0, 'bar': 43.0, '100': 44.0}
Notice what happened: when you set a numeric property on a JS obj, the property is always converted into a string, so `x[100] = 44` is effectively equivalent to `x['100'] = 44`.
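The key coercion is standard JavaScript behaviour and can be reproduced without PythonMonkey. The short sketch below uses only built-in objects; the `Map` at the end is included as a contrast, since it keeps `100` and `"100"` apart the way a Python dict does.

```javascript
const x = {};
x[100] = 44;                            // the number 100 is coerced to the string "100"

console.log(Object.keys(x));            // [ '100' ]
console.log(x["100"]);                  // 44
console.log(typeof Object.keys(x)[0]);  // string

// A Map, by contrast, keeps 100 and "100" as distinct keys, like a Python dict.
const m = new Map([[100, "hello"], ["100", "world"]]);
console.log(m.size);                    // 2
```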
This quickly leads to more unexpected things:
>>> get_100 = js_eval(r"""
... (x) => {
... return x[100];
... };
... """)
>>>
>>> d = {100: 'hello'}
>>> print(get_100(d)) # I believe this is `undefined` converted to None
None
>>>
>>> d2 = {100: 'hello', '100': 'world'}
>>> get_100(d2)
'world'
And `Object.keys` becomes even more weird (NOTE: `Object.entries` segfaults):
>>> log_keys = js_eval(r"""
... (x) => {
... console.log(Object.keys(x));
... };
... """)
>>> log_keys(d2)
[ '100', '100' ]
And even more:
```pycon
>>> class MyClass:
...     pass
...
>>> d = {MyClass(): 1}
>>> log_keys(d)
[ 'undefined' ]
```
I have mixed feelings about the pythonmonkey solution:
- it solves the "python-dicts-used-as-js-literals" problem (VERY GOOD)
- it preserves mutability (good)
- it preserves identity on a round trip (good)
- it introduces very weird corner cases (bad, of course)
As usual, my position is still that a "perfect solution" cannot exist because python dicts and js objects are not isomorphic but, all considered, I think this is an idea worth exploring. Thanks @WebReflection for discovering it.
@hoodmane I'm curious to know your opinions
from pyodide.
For the "using JS objects as Python dictionaries" side we have `as_object_map`, which has been a useful addition for the "objects as drop-in replacements for dictionaries" crowd. There is also an unstable hack for enabling this as the global default, which some people have apparently been using:
from _pyodide_core import jsproxy_typedict
from js import Object
jsproxy_typedict[0] = type(Object.new().as_object_map())
I think adding a "using Python dictionaries as JS objects" analog of this would be a good next step.
from pyodide.
we also have `o.as_object_map(hereditary=True)` so if `o = run_js("{a: {b: 2}}")` then `o.as_object_map(hereditary=True)["a"]["b"]` works. But what we can't do is deal with `o = run_js("{a: [{b: 2}]}")` where we want `o["a"][0]["b"]` because `hereditary=True` doesn't make its way through the list. This should ideally be fixed...
from pyodide.
I'm not sure I fully understand how `as_object_map` works, but let me add a remark:
There is also an unstable hack for enabling this as the global default, which some people have apparently been using:
I know that this is a hack, but in case you are considering turning it into a real, configurable option, I suggest being very careful, because it's a slippery slope.
Hopefully, python-in-the-browser will be a success and we will see an explosion of python+js libraries: but if the Python<=>JS mapping depends on a global option, it's very easy to end up in a situation in which half of those libs work only with "option 1" and the other half only with "option 2", making them mutually incompatible.
from pyodide.
There are no active plans to turn it into a stable option for exactly this reason. But I'm also not planning to intentionally break it. But if it keeps working as a not-stable thing for long enough, it could have the same ecosystem-bifurcating effect anyway...
from pyodide.
I keep asking this without answers … which projects pass Maps to JS signatures without converting these to literals? I'd love to see data to discuss "the future with issues"
from pyodide.
long time no see, but this request is still open ...
MicroPython converts Python dictionaries to JS object literals
PythonMonkey does the same ... quoting:
Python Lists and Dicts behave as Javacript Arrays and Objects, and vice-versa, fully adapting to the given context.
it's irrelevant for users if proxies are used or not, Proxies are inevitable but also born and meant to be a transparent layer for the running program itself.
Accordingly ... as no modern JS API accepts a Map as a configuration object, init argument, or anything else, can we please focus on the required feature and answer yes/no so that we can eventually monkey patch that ourselves?
Thank you!
from pyodide.
Thanks @ryanking13, the idea was indeed to not break anything at all, just propagate a different default converter if, and only if, specified while bootstrapping. All details about whether it is deep or not are also left to the user bootstrapping, so that nothing would change on the Pyodide side but we have an entry point that one day, maybe, could become the default.
I'll have a look at pyodide internals to see if such a proposal can be minimal, non-breaking, and reasonable enough, hoping that @hoodmane would eventually review and approve it too.
from pyodide.
I went this far and managed to obtain a default converter on the JS side, here the patch:
diff --git a/src/core/pyproxy.ts b/src/core/pyproxy.ts
index 502950a7..376ec9e5 100644
--- a/src/core/pyproxy.ts
+++ b/src/core/pyproxy.ts
@@ -689,7 +689,7 @@ export class PyProxy {
depth = -1,
pyproxies = undefined,
create_pyproxies = true,
- dict_converter = undefined,
+ dict_converter = Module.defaultDictConverter,
default_converter = undefined,
}: {
/** How many layers deep to perform the conversion. Defaults to infinite */
diff --git a/src/js/pyodide.ts b/src/js/pyodide.ts
index 3f3545ae..52a91852 100644
--- a/src/js/pyodide.ts
+++ b/src/js/pyodide.ts
@@ -39,6 +39,7 @@ export type ConfigType = {
_node_mounts: string[];
env: { [key: string]: string };
packages: string[];
+ defaultDictConverter?: (array: Iterable<[key: string, value: any]>) => any;
};
/**
@@ -152,6 +153,11 @@ export async function loadPyodide(
* while Pyodide bootstraps itself.
*/
packages?: string[];
+ /**
+ * The default `dict_converter` to use via explicit `to_js` or whenever
+ * Python dictionaries are passed to a JS callback.
+ */
+ defaultDictConverter?: (array: Iterable<[key: string, value: any]>) => any;
/**
* Opt into the old behavior where PyProxy.toString calls `repr` and not
* `str`.
@@ -192,6 +198,7 @@ export async function loadPyodide(
Module.print = config.stdout;
Module.printErr = config.stderr;
Module.arguments = config.args;
+ Module.defaultDictConverter = config.defaultDictConverter;
const API = { config } as API;
Module.API = API;
diff --git a/src/js/types.ts b/src/js/types.ts
index c812d28b..b972b988 100644
--- a/src/js/types.ts
+++ b/src/js/types.ts
@@ -261,6 +261,7 @@ export interface Module {
module: WebAssembly.Module,
) => void,
) => void;
+ defaultDictConverter?: (array: Iterable<[key: string, value: any]>) => any;
}
type LockfileInfo = {
However I wouldn't know what's the best way to reflect that on the python2js side, as that needs to be built and it's not clear to me if I can tell it somehow to use the very same `Module.defaultDictConverter` passed at `loadPyodide` time, so any hint/help would be more than welcome.
That being said, building via conda was extremely easy and straightforward to do, but I was missing a single xxd executable which I've manually added on my Linux distro, and after that everything went super smooth: kudos to that!
from pyodide.
@WebReflection bravo!
If I understand correctly, you're saying the Python -> JS handler is compiled and so cannot be given as a handler function at runtime on initialisation of Pyodide.
Hmm. This is problematic. Perhaps we can brainstorm approaches..?
from pyodide.
Hmm. This is problematic. Perhaps we can brainstorm approaches..?
As I am not familiar with the C++ side of affairs in here, my thinking is that through the WASM exposed methods it's possible to set different things, for instance the string handling part, but I wouldn't know where to store that on the C++ side as a reference to check when the `dict_converter` is not explicitly passed.
I don't want to mess up that code especially, but I'll try to see if I can reflect other methods able to change the WASM behavior.
from pyodide.
I'm still -1 on this because of what @antocuni said, which I think is a critical flaw:
but if the Python<=>JS mapping depends on a global option, it's very easy to end up in a situation on which half of those libs work only with "option 1" and the other half only with "option 2", making them mutually incompatible.
I agree that accepting a patch implementing behavior like this would be irresponsible. I've been thinking about what to do for you guys, we need to come up with a design that works for you but does not have the drawback.
from pyodide.
@hoodmane from the little I've read of the code it looks to me, and please correct me if I am wrong, that the `Module` is created fresh for each `loadPyodide(...)` invocation, guaranteeing that if one pyodide was bootstrapped in a different way, other pyodide instances won't be affected.
If I am already wrong in here, I fully agree that if the patch ends up poisoning everything else before or after, we need a better idea and you can stop reading already
If I understand your concern correctly, and you are still here reading, you are saying that foreign packages might break unexpectedly if we bootstrap pyodide with a different default converter and they actually trust the resulting object on the JS side is a Map and not an object literal.
I think there are at least 2 scenarios to consider here though:
- a JS function expects a PyProxy to deal with the real Python reference, and changes in there should also be reflected back to the Python world
- a JS function expects JS variables, which I believe is 90% of the cases ...
As an example, pyodide internally uses `Object.fromEntries` to pass an object literal to the Request (fetch) API and that's what we're after: every JS API expects JS values to deal with, but all Web APIs expect object literals and usually no API expects a Map, if not for internal usage.
So let's consider this basic code:
globalThis.log = dict => {
  console.log({...dict});
};
pyodide.runPython('import js; js.log({"ok": True})');
In this really basic scenario the `value` will be a Proxy with an Object handler and nothing would work.
To make it work as expected for APIs that expect objects to configure or pass values/references around, we need `to_js`, which by default passes a map.
Now we do that:
pyodide.runPython('from pyodide.ffi import to_js;import js; js.log(to_js({"ok": True}))');
As a result, we have an empty object in the most simple JS utility that expects a read-only literal.
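The empty result follows from how spread works on a `Map`, and it can be reproduced without Pyodide. In this sketch a hand-built `Map` stands in for what `to_js` returns by default, as described above.

```javascript
// Stand-in for the Map that to_js returns for a dict by default.
const asMap = new Map([["ok", true]]);

// Map entries are not own enumerable properties, so spread copies nothing.
console.log({ ...asMap });               // {}
console.log(Object.keys(asMap).length);  // 0

// Converting first, as dict_converter=Object.fromEntries would, gives a usable literal.
const asObject = Object.fromEntries(asMap);
console.log({ ...asObject });            // { ok: true }
```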
Enter the Workers world
As a Proxy cannot survive a `postMessage` dance, we need to intercept proxies and inevitably convert these into structured-clone friendly primitives, and there the 1:1 relation which makes mutability possible is kinda lost.
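A small sketch of that limitation, assuming a runtime that provides the global `structuredClone` (modern browsers, Node 17 or later), which applies the same structured clone algorithm `postMessage` relies on. The proxy here wraps a plain object rather than a real PyProxy.

```javascript
const proxied = new Proxy({ ok: true }, {});

let cloneError = null;
try {
  structuredClone(proxied);          // proxies are rejected by structured clone
} catch (err) {
  cloneError = err;
}
console.log(cloneError !== null);    // true

// Copying into a plain object first makes the value cloneable,
// but the clone no longer shares state with the original.
const plain = { ...proxied };
const cloned = structuredClone(plain);
cloned.ok = false;
console.log(plain.ok);               // true
```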
On the other hand, we never had an issue with any package to date by passing those references as copies and as object literals, not Map, because all users' code and packages using JS APIs will eventually use `Object.fromEntries` as the `dict_converter` field, just like pyodide does already, while passing a Map copy from a PyProxy will happily break everything we've demoed to date and that's working.
Experimental, maybe?
Assuming you are following the logic here, where mutating a received Python reference is rare while expecting object literals is natural in the JS world and for most common JS APIs, if not all of them on the Web, could we maybe flag the proposed change as `experimentalDefaultDictConverter` and actually see what breaks and what suddenly works out of the box instead?
I think we would be overly happy to provide that data or finally realize that it was a bad idea altogether, but if that's not the case, I think pyodide users will finally, eventually, win once this flag becomes common and battle-tested enough, and we see there's no side-effect around it:
- Python references that are meant to be mutated will still work as expected. As a matter of fact, these never want you to use `to_js`, or that relation is lost regardless of what we decide in here
- JS APIs that won't likely ever expect a Python reference, or won't likely mutate foreign objects just "because", will work without issues
Alternatively
We could at least have PyProxy for objects work also as object literals by skipping `Map` methods and yet allow direct field access behind the scenes (that is, `obj.a` working just like `obj.get('a')` via Proxy) and add an `ownKeys` and an `in` trap so that spreading these will work and `in` operations will work too ... could this be an option?
Thanks!
from pyodide.
@hoodmane as code might speak a thousand words, this is my idea around the current Map returned by default when `to_js` is used:
const descriptor = (value, enumerable) => ({
  value,
  enumerable,
  writable: enumerable,
  configurable: true,
});

class LiteralMap {
  constructor(...args) {
    return new Proxy(new Map(...args), this);
  }
  getOwnPropertyDescriptor(map, key) {
    if (map.has(key)) return descriptor(map.get(key), true);
    if (key in map) return descriptor(map[key], false);
  }
  getPrototypeOf() {
    return this;
  }
  has(map, key) {
    return map.has(key) || key in map;
  }
  get(map, key, proxy) {
    const proxied = map.has(key);
    const value = proxied ? map.get(key) : map[key];
    return typeof value === 'function' ? value.bind(proxied ? proxy : map) : value;
  }
  set(map, key, value) {
    return !!map.set(key, value);
  }
  ownKeys(map) {
    return [...map.keys()];
  }
}
It can be improved with an `isProxy` static utility or something, if needed, but all of this would work out of the box:
const lm = new LiteralMap([['a', 1], ['b', 2]]);
console.assert(lm.a === 1);
console.assert(lm.get('b') === 2);
lm.c = 3;
lm.set('d', 4);
console.log({...lm}); // {a: 1, b: 2, c: 3, d: 4}
console.log(lm instanceof LiteralMap); // true
lm.method = function () {
  console.assert(this === lm);
  console.log(this.a); // 1
};
lm.method();
The idea is that this Map would not break current code assuming there is a map, and new code assuming it's an object literal should transparently work out of the box ... the `dict_converter` would still apply without pyodide needing to change much code behind the scenes.
Is this approach somehow welcomed?
from pyodide.
@hoodmane thanks for considering that ... if the `instanceof Map` operation is one of your concerns, you can just add `extends Map` to the `class` definition and nothing should break on brand-checking after.
class LiteralMap extends Map {
  constructor(...args) {
    return new Proxy(new Map(...args), super());
  }
  // ... the rest of the code ...
}
edit that would create a duplicated Map but if that's the issue I can find better ways to avoid such issue
edit2 ... formerly, as I've been too smart up there:
const { prototype } = Map;
const { getOwnPropertyDescriptor } = Object;

const getOwnDescriptor = value => ({
  value,
  enumerable: true,
  writable: true,
  configurable: true
});

const handler = {
  getOwnPropertyDescriptor(map, key) {
    if (map.has(key)) return getOwnDescriptor(map.get(key));
    if (key in map) return getOwnPropertyDescriptor(prototype, key);
  },
  get(map, key, proxy) {
    const proxied = map.has(key);
    const value = proxied ? map.get(key) : map[key];
    return typeof value === 'function' ?
      value.bind(proxied ? proxy : map) :
      value;
  },
  has: (map, key) => map.has(key) || key in map,
  set: (map, key, value) => !!map.set(key, value),
  ownKeys: map => [...map.keys()],
};

class LiteralMap extends Map {
  constructor(...args) {
    return new Proxy(super(...args), handler);
  }
}
This allows both brand checks on `lm instanceof Map` and `lm instanceof LiteralMap` with actually terser code too.
from pyodide.
It does seem like it could be a good best-of-both-worlds fix. I haven't come up with any downsides yet. We'll probably want to return keys from the `Map.prototype` in favor of keys from the dictionary, otherwise dictionaries with keys like `"get"` will cause trouble.
@ryanking13 what do you think?
from pyodide.
We'll probably want to return keys from the Map.prototype in favor of keys from the dictionary, otherwise dictionaries with keys like "get" will cause trouble.
that won't play nice on `{...spread}` because that is not an operation people do on Map instances, so I think the way it is now works for, to use your words, the best-of-both-worlds scenarios ... but that was my idea in general:
- do not break anything expecting a map
- provide a consumable Proxy for all other cases
Details are evil only if a passed dictionary contains `get` or `set` or others out there, but then again those would also break already and be a not-cool-python-dictionary in doing so, as that would be super ambiguous in Python too ... for any other configuration object, `get` and `set` could still be a valid object literal entry to consume in the JS world, so we're good there with use and edge cases.
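The name clash being discussed can be shown with a reduced handler. This is a simplified sketch in the spirit of the handlers posted earlier in the thread (dictionary keys take precedence over `Map.prototype` members), not the published `literal-map` implementation.

```javascript
// Simplified handler: dictionary keys win over Map.prototype members.
const handler = {
  get(map, key) {
    if (map.has(key)) return map.get(key);
    const value = map[key];
    return typeof value === "function" ? value.bind(map) : value;
  },
  has: (map, key) => map.has(key) || key in map,
};

// Without a clashing key, the Map API is still reachable through the proxy.
const plainDict = new Proxy(new Map([["a", 1]]), handler);
console.log(typeof plainDict.get);   // function
console.log(plainDict.get("a"));     // 1

// With a "get" entry, the dictionary value shadows Map.prototype.get.
const clashing = new Proxy(new Map([["get", "GET"], ["a", 1]]), handler);
console.log(typeof clashing.get);    // string
console.log(clashing.a);             // 1
```

Code that treats `clashing` as a Map and calls `clashing.get("a")` would throw, while code that treats it as an object literal reads the `get` entry as expected. That is the trade-off between the two precedence orders debated here.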
from pyodide.
to be fully explicit on my latest suggestion, this is the class: https://github.com/WebReflection/literal-map#readme
from pyodide.
Thanks for suggesting and testing `LiteralMap` @WebReflection! Yes, I think this is the best compromise we can come up with to satisfy our mutual wishes.
We'll have to be careful to identify and document where this LiteralMap behaves differently than an Object, and where people might get unexpected results, but I don't think that's a blocker for adopting this. We already know when there are differences between the corresponding types in Python and JavaScript, and I think Pyodide's users will understand this too.
from pyodide.
@ryanking13 I am working "as we speak" on a 100% code-covered module with better logic that should never fail expectations. The base code there is not perfect and there is actually a possible issue with one trap and a missing `deleteProperty` trap too ... I am on it though and I will ping you once it's published so you can easily test and/or contribute and/or tell me which part works and which one doesn't. I hope this is fine with you.
from pyodide.
There we go: https://github.com/WebReflection/literal-map#readme
This could be the default converter for `to_js` if no different `dict_converter` is specified ... basically instead of a Map it returns a LiteralMap and that's pretty much it and it should "just work" ™
/cc @hoodmane @ryanking13
If you find anything that could be improved in the README feel free to suggest either in here or via a PR.
All functionalities have been tested and fully code-covered but if I've missed something or there's something that should work differently I am happy to improve there.
If this is successfully tested and used in Pyodide I will be happy to flag it as `v1.0.0`, as I think there's really no need for maintenance or changes since the primitives behind it are pretty robust/stable, so maybe that versioning would hint that the utility/helper is actually done.
from pyodide.
Just wanted to add what a pleasure it has been watching this solution take shape. Bravo to everyone involved.
from pyodide.
FYI I have published a `v0.2.0` of the module after brainstorming myself how to circumvent the fact that in Python `dict.get(...)` and `dict["get"]` are a very well established pattern while in JS they all result in `ref["get"]` access, from a Proxy point of view.
I hope this version with extra utilities helps deciding that everything is always under control but no leaks of the underlying map happen: https://github.com/WebReflection/literal-map?tab=readme-ov-file#differently-from-python-dictionaries
The "nitty-gritty" of the solution is that the class itself can forward to the hidden map all operations in a safe way.
from pyodide.