Giter VIP home page Giter VIP logo

Comments (19)

MestreLion avatar MestreLion commented on August 22, 2024 2

The fix itself is quite simple:

class Root(Compound):
    __slots__ = (
        'root_name',
    )

    # Optional, but safer than relying on .parse() to be always invoked
    def __init__(self, root_name: str = "", *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.root_name: str = root_name

    # Delete all other properties and its setters
    # or leave a temporary "DeprecationWarning":
    @property
    def root(self):
        print("Root.root is deprecated, just access its contents directly") 
        return self

    @classmethod
    def parse(cls, buff, byteorder='big'):
        tag_id = read_numeric(BYTE, buff, byteorder)
        if not tag_id == cls.tag_id:
            # Possible issues with a non-Compound root:
            # - root_name might not be in __slots__(), and thus not assignable
            # - not returning a superclass might be surprising and have side-effects
            raise TypeError("Non-Compound root tags is not supported:"
                            f"{cls.get_tag(tag_id)}")
        name = read_string(buff, byteorder)
        self = super().parse(buff, byteorder)
        self.root_name = name
        return self

    def write(self, buff, byteorder='big'):
        write_numeric(BYTE, self.tag_id, buff, byteorder)
        write_string(getattr(self, 'root_name', "") or "", buff, byteorder)
        super().write(buff, byteorder)

The biggest challenging in implementing this API breakage considerations. But if needed I'd happily provide a PR

from nbtlib.

vberlier avatar vberlier commented on August 22, 2024 1

The documentation, at least at the time, was a bit fuzzy. It said that nbt files are implicitly always compound tags. In python since we have dictionaries nbtlib doesn't see nbt compounds as a collection of named tags, where each tag has an intrisic name, but rather as a proper dictionary where tags don't have any intrisic name. This makes the interpretation of the root compound ambiguous. Since in this implementation tags don't have any intrisic names, we can make the assumption that the file is an implicit compound in the sense that we can assume it's prefixed with the TAG_Compound marker. This makes the root name a key of the implicit compound.

That's the reasoning that led to the way this is currently handled, but I know this just ends up adding unnecessary noise so I'd be happy to get rid of it. But yeah due to API breakage this would need to be published in a 2.0 release.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024 1

In my mind it didn't make much sense to keep Root around as you can just as well use File.parse() for extracting chunks in .mca files. I guess the awkward thing would be the useless save() and load() methods.

The useless (and potentially misleading) save and load in chunks are the exact reasons why I wanted a separate class for non-file root compounds.

The idea of making a Root class that could be specialized for all tag types is really neat but it's probably over-engineered for something that will realistically only be used for compound tags.

I agree. Specially now that we have "official" docs saying it's always Compounds. Not only realistically, but per the spec. But still... oh my... so tempting...

I have an idea that would probably be simpler to implement and just as flexible.

Sweet! I'm listing...

I could make it so that some_tag.write(fileobj, root="hello") prefixes the actual value with the tag name.

... tag name and tag type too, don't forget it. Hum, I like it. Gives me a shiver when I see a tag handles its own name (pymclevel-style of things which I mentioned in another issue), but this would be a honorable, worthwhile exception.

Or an additional pair of methods available for all tags like parse_root() and write_root() that would handle the root name and then delegate to the regular parse() and write() methods. [...]

Ewww, noooo, no, no. nbtlib is too elegant for this.

In the end, I think the most straight-forward solution is probably the best here, I'll restore the Root class.

Thanks!!!! :-D

(And if I fail to resist the temptation, I in the future extended Root with a meta like List does (I really like that pattern you did) to convert Root from a hardcoded Compound to a "dynamic mixin" class that could inherit from any class (and from itself, so Root[Compound] is stll a plain Root too))

But I will resist this temptation. There are none non-compound roots in the wild. It's in the damn spec. It's useless. I'm better than that.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

While I'm still digesting your great points, and elaborating an extended answer, I've noticed you dropped Root class as a whole. Just to note this wasn't the proposed fix to the spurious Compound. File, as it is, will still create one.

And Root is still welcome as a class, because some tags, namely the chunks in an mca file, are not a File but still not a regular Compound.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

There are 2 important and distinct design choices involved here, both are open to interpretation by the "official" docs:

1 - Who "owns" a tag's name? Itself or its parent?
2 - Is the root tag of an NBT data always a Compound?

For 1, some libraries, such as the old pymclevel and its Amulet-NBT successor, adopted the "tags have their own names" direct interpretation. It makes the API really clunky and awkward to use, having to set tag.name and the infamous tag.value everywhere.

nbtlib OTOH, thankfully adopted another interpretation: like in Python dicts, a tag's name is defined in (and by) its parent. A tag itself is unaware of its own name. Tags in a list don't even have names. And that's true in the NBT format too. It was a wise design decision that I'm really glad you did, it makes working with nbtlib a real joy, with its seamless and fluid integration with Python's builtins

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

But a tag not owning its own name leaves nbtlib with a (small) problem: What about the root? In NBT, the root tag (whatever type it is), does have a name. Minecraft might always set it to an empty string, but the data is still there and it has to be parsed.

In nbtlib's data model, that name belongs to the parent. But the root tag has no parent, so what to do with the parsed name? I see 2 options:

  • Discard it. It's always blank anyway. Assume so when writing. It'll work, but it feels uncomfortable to discard data, right?
  • Keep it. Somewhere. Just for the sake of using it again on write(). But where? That was the idea behind the Root class: a wrapper on Compound that simply stores the parsed name. A plain, read/write self.root_name: str attribute.

In my library (and in the example above), I did the latter. Root exists solely for parsing the initial NBT bytes and storing the root name.

Can that name be saved at File instead? Sure. But that would "force" all root tags to be a file. That's fine for .dat files. But a chunk in an mca file is not a file, not it a sense you could Chunk.load(otherchunk.filename). Chunks should not have self.filename. So it makes sense to have an intermediate Root between Fileand Compound, if even just for the sake of allowing a Compound to store a root_name while not having a filename attribute or a .load() method

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

So handling a root's name while keeping nbtlib's data model is easy, and can be addressed in many ways. Now for the hard part, the other major design decision:

  • Is the root tag always a Compound?

In practice, yes. And virtually all libraries assume so, in one way or the other. nbtlib assumed so when it created the "extra" Compound in File.parse(): by invoking super().parse(buff, ...) prior to parsing any bytes, it will create a Compound that thinks the root tag is its (only) child (whatever the type root actually is). On write(), that extra Compound will do what Compounds do: loop and write only their children, not themselves. And thus that (sole) child, the actual root tag, gets written, and everything works fine.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

So, NBT-wise, nbtlib is not assuming the root is a compound. If it were something else, it would be handled properly and losslessly. But to do so it wraps the real root in that extra Compound, exposing that awkward tag[''][...] (or tag.root[...]) to clients, something this issue is proposing to eliminate. We should be able to handle a root tag like any other tag, the return value of File.load(...) should be a tag we can directly say tag[...]. Yes, it would be nameless, like all tags are. If you want to retrieve the original name, for whatever reason, go elsewhere. tag.root_name for example. But the root's name should not be an item of the the tag, being root should not create that extra layer.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

In order to avoid the extra layer and preserve the root name, we must parse it separately from other tags, as each tag's .parse() assume its own name (and tag type!) was already processed by its parent. So someone, File itself or someone else, must read the initial NBT bytes and:

  • Parse the root name, then save or discard it
  • Parse the root tag type, then either:
    • Impose it must be a Compound. That's the Amulet approach. And mine too, in an extent. But I'm not happy with it.
    • Discard it, and on write() assume it was a Compound. Ouch! But, if you did the above, that's no problem.
    • Let it be anything, without creating an extra layer. That would be amazing, but hard to do. Keep reading...
  • Invoke and return the type's .parse(). Seems easy, right?

Problem is... a tag itself not only does not know its own name, it also does not write its own tag type on .write(). So whatever tag was returned by File.load(), it is unable to write itself as a root tag! root.write() would be a broken NBT if root was an ordinary tag. Writing root's tag type is somebody else's job. Its parent's job. And for root, who is that?

We need a way to save a tag, any tag, as root. That is, to write its own type and its own name, unlike other tags. So a root, conceptually, is not an entity by itself, but rather a trait. If we allow non-compounds as root, then "rootness" should be inherited by any tag. Root(or File) in this case is not a Compound subclass, but rather a Mixin class.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

In this scenario, class File/Root inheriting from Compound becomes a problem. It means the root tag is always a Compound. And if we go that road, life is easy and removing the extra layer is simple:

  • File|Root.parse() reads the root tag type, aborts if not a Compound
  • Read and save the name.
  • write() the saved name and its own tag type
  • invoke super().parse(), which is actually Compound.parse(), which will loop and write its children

Life is then good and simple. But we're imposing a restriction on NBT. Spec implicitely says its fine, but technically the NBT data does allow more.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

We need a way to save a tag, any tag, as root. That is, to write its own type and its own name, unlike other tags. So a root, conceptually, is not an entity by itself, but rather a trait. If we allow non-compounds as root, then "rootness" should be inherited by any tag. Root(or File) in this case is not a Compound subclass, but rather a Mixin class.

Just had this crazy (or brilliant) idea: why not make Root behave like your List? I.e., indexing it would create a new class on-the-fly. Just like List[Int] means "A List of type (or "flavor") Int", so would Root[Int] would mean "A Root (tag) of type Int". And there you go: any tag can now be a root tag!

  • Root's metaclass, like List, would restrict the subclasses types to only concrete tags (i.e. no Base, Array, Numeric, or similar generic superclasses)
  • It could, like List, keep a dictionary of the already created subclasses, and reuse them. So all Root[Compound]s would be the same class.
  • Root, as a parent, would be responsible for:
    • Initial NBT parsing of tag type and name in .parse()
    • Create (or reuse) a Root[type] class and instantiate it. tag_type = X (or similar name) would be a class attribute
    • Store the root name in the a root_name instance attribute
    • On .write(), use root_name and tag_type to write the initial bytes, then delegate to super().write()
  • It would not have load() or save() or filename(). that's a task for File
  • File could inherit from Root (base, without any index/subtype)
  • Root, unlike List, would not allow direct instantiation without indexing a subclass
  • It could have an __init__(*args, root_name: str = "", **kwargs) that pass on all other arguments to super()

So, what do you think of this idea?

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

I've just found this:

http://web.archive.org/web/20110723210920/http://www.minecraft.net/docs/NBT.txt

An NBT file consists of a single GZIPped Named Tag of type TAG_Compound.

Decoding example:
(Use http://www.minecraft.net/docs/test.nbt to test your implementation)
First we start by reading a Named Tag.
After unzipping the stream, the first byte is a 10. That means the tag is a TAG_Compound (as expected by the specification).

Which I guess settles the question. No need for Root to be a mixin (although I still believe it would be nice to), it can directly inherit Compound, and File can assume so when reading and writing.

That said, can I make 2 requests?

  • Can you resurrect the Root class? It's a very useful class to distinguish between some NBT data from an actual file.
  • Can you reopen this issue while we discuss this?

from nbtlib.

vberlier avatar vberlier commented on August 22, 2024

In my mind it didn't make much sense to keep Root around as you can just as well use File.parse() for extracting chunks in .mca files. I guess the awkward thing would be the useless save() and load() methods.

The idea of making a Root class that could be specialized for all tag types is really neat but it's probably over-engineered for something that will realistically only be used for compound tags. I have an idea that would probably be simpler to implement and just as flexible. I could make it so that some_tag.write(fileobj, root="hello") prefixes the actual value with the tag name. Or an additional pair of methods available for all tags like parse_root() and write_root() that would handle the root name and then delegate to the regular parse() and write() methods.

Of course the issue is that you could still call parse() or write() directly by mistake, so it's a weaker alternative to the Root class. In the end, I think the most straight-forward solution is probably the best here, I'll restore the Root class.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

Full disclaimer here: in the meantime since I've opened this 12 days ago, I've played with some ideas about the chunk/region relationship that are completely unrelated to this File/Root discussion.

I'm trying to decouple Region from the NBT implementation, so Region acts more like a storage system delivering raw NBT data for someone else to instantiate as a Chunk/Compound, rather than the first class that Dimension holds, meaning API-wise Chunk would "talk" directly to World/Dimension: A Dimension would be a dictionary** of Chunks, not regions, using absolute (cx, cz) chunk coordinates, and the Anvil .mca files would a Chunk implementation detail.

*: well, not Chunks, but rather "[category][Chunk]", as of 1.17 we have region/, entities/ and poi/ subdirs.

In this new model, it does make sense for a Chunk to have a .save() and .load(), a specialized one that invokes Region's (cached) parsers for loading and saving.

This is all to say that, in the end, it is possible that my Chunks end up inheriting from File, so I won't use Root after all, lol.

But I still think they have a reason to exist, both conceptually and in my heart :)

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

I've been silent as I've spent the last 10 weeks or so playing Minecraft instead of coding for it. Well, actually 10% playing and 90% creating mods and resourcepacks. Inevitably I stumbled on MCP and Minecraft's source code, and it gave many useful answers and insights:

  • Root/File is always a Compound. Strictly typed as such in arguments and return types of NbtIo.read/write() and *Compressed() counterparts, throwing IOException on read if not (tag instanceof CompoundTag)
  • Root/File tag is always unnamed . NbtIo.read() simply discards whatever name is in the file data with a data.readUTF() call without any assignments, and NbtIo.write() calls data.writeUTF("") as name.

from nbtlib.

Netherwhal avatar Netherwhal commented on August 22, 2024

I am seeing this error:

TypeError: Non-Compound root tags is not supported: <class 'nbtlib.tag.End'>

on a couple of player.dat files that I know are not corrupt.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

@Netherwhal , care to post some example files? I'm assuming they're empty players, correct?

In any case, I believe this would be better handled as a separate issue

from nbtlib.

Netherwhal avatar Netherwhal commented on August 22, 2024

@MestreLion / @vberlier - no those are proper nbt files that I can open without any issues with other tools.

I am happy to pay a commission for this to be fixed in this library.

from nbtlib.

MestreLion avatar MestreLion commented on August 22, 2024

@Netherwhal , what version are you using? I've tested latest 2.0.4 from Pypi and it handles your test case just fine:

~/minecraft/test $ venv
Cache entry deserialization failed, entry ignored
Collecting pip
  Using cached https://files.pythonhosted.org/packages/08/e3/57d4c24a050aa0bcca46b2920bff40847db79535dc78141eb83581a52eb8/pip-23.1.2-py3-none-any.whl
Collecting setuptools
  Using cached https://files.pythonhosted.org/packages/c7/42/be1c7bbdd83e1bfb160c94b9cafd8e25efc7400346cf7ccdbdb452c467fa/setuptools-68.0.0-py3-none-any.whl
Collecting wheel
  Using cached https://files.pythonhosted.org/packages/61/86/cc8d1ff2ca31a312a25a708c891cf9facbad4eae493b3872638db6785eb5/wheel-0.40.0-py3-none-any.whl
Installing collected packages: pip, setuptools, wheel
  Found existing installation: pip 9.0.1
    Uninstalling pip-9.0.1:
      Successfully uninstalled pip-9.0.1
  Found existing installation: setuptools 39.0.1
    Uninstalling setuptools-39.0.1:
      Successfully uninstalled setuptools-39.0.1
Successfully installed pip-23.1.2 setuptools-68.0.0 wheel-0.40.0

(venv) ~/minecraft/test $ pip install nbtlib
Collecting nbtlib
  Downloading nbtlib-2.0.4-py3-none-any.whl (28 kB)
Collecting numpy (from nbtlib)
  Using cached numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Installing collected packages: numpy, nbtlib
Successfully installed nbtlib-2.0.4 numpy-1.24.4

(venv) ~/minecraft/test $ python
Python 3.8.0 (default, Feb 28 2023, 16:22:29) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nbtlib
>>> data = nbtlib.load('../mcworldlib/data/6d26bff4.dat')
>>> type(data)
<class 'nbtlib.nbt.File'>
>>> len(data)
55

Maybe this was fixed already by eda4d32 ?

from nbtlib.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.