
Comments (7)

stephenberry commented on May 24, 2024

@Nukoooo, I think I fixed the bug with #670. The issue was in hashing unicode characters: I was casting char bytes directly to uint64_t. Where char is signed, the bytes of multi-byte UTF-8 sequences (0x80 and above) are negative, so they sign-extend to enormous values. The fix was to use uint64_t(uint8_t(character)), so that the uint64_t value is between 0 and 255 as intended.
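The cast described above can be illustrated with a minimal sketch. This is not Glaze's hashing code: widen_sign_extended, widen_fixed, and max_widened are hypothetical helpers, and signed char is used to model a platform where plain char is signed (whether plain char is signed is implementation-defined).

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical helpers for illustration only; not Glaze's actual hashing code.

// Models the problematic cast on a platform where plain char is signed:
// a negative byte is sign-extended when converted to a 64-bit unsigned value.
constexpr std::uint64_t widen_sign_extended(signed char c) {
    return std::uint64_t(c);
}

// The fix quoted above: go through uint8_t first so the result is 0..255.
constexpr std::uint64_t widen_fixed(char c) {
    return std::uint64_t(std::uint8_t(c));
}

// Largest widened value over all bytes of a key, using the fixed cast.
constexpr std::uint64_t max_widened(std::string_view key) {
    std::uint64_t max = 0;
    for (char c : key) {
        const std::uint64_t v = widen_fixed(c);
        if (v > max) max = v;
    }
    return max;
}

// 0xE9 is the first UTF-8 byte of "间"; stored in a signed char it is -23.
static_assert(widen_sign_extended(static_cast<signed char>(-23)) == 0xFFFFFFFFFFFFFFE9ULL);
static_assert(widen_fixed('\xE9') == 0xE9);
```

With the intermediate uint8_t cast, every byte of a UTF-8 key stays in the 0 to 255 range regardless of whether the platform's char is signed.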

If you run into more issues, please let me know. And, thanks so much for pointing this out!

from glaze.

stephenberry commented on May 24, 2024

Thanks for this bug report. Could you share some of the unicode keys that are failing for you? I haven't tested much with unicode keys, so I'll have to look into it and add more unit tests.


Nukoooo commented on May 24, 2024

Sure, here is the struct for one of my CSV files:

struct FishRecord
{
    float Duration;
    float FishSize;
    std::uint8_t Amount;

    std::string FishBaitName;
    std::string SurfaceSlapFishName;
    std::string MoochFishName;
    std::string BuffName;
    std::string FishingSpotPlaceName;

    std::string BiteTypeName;
    std::string CaughtFishName;
    std::string HooksetName;
};

template <>
struct glz::meta<FishRecord>
{
    using T = FishRecord;
    static constexpr auto value = object("上钩的鱼", &T::CaughtFishName, //
                                         "间隔", &T::Duration, //
                                         "尺寸", &T::FishSize, //
                                         "数量", &T::Amount, //
                                         "鱼饵", &T::FishBaitName, //
                                         "拍水的鱼", &T::SurfaceSlapFishName, //
                                         "以小钓大的鱼", &T::MoochFishName, //
                                         "Buff", &T::BuffName, //
                                         "钓场", &T::FishingSpotPlaceName, //
                                         "咬钩类型", &T::BiteTypeName, //
                                         "提钩类型", &T::HooksetName //
    );
};


Nukoooo commented on May 24, 2024

I went ahead and tested a little more, and "it fails when there are one or more unicode keys in the struct" doesn't seem to be the whole story.
This struct has more than two unicode keys, and it compiles:

template <>
struct glz::meta<unicode_keys_t>
{
    using T = unicode_keys_t;
    static constexpr auto value = object("简体汉字", &T::field0, // simplified chinese characters
                                         "漢字寿限無寿限無五劫", &T::field1, // traditional chinese characters / kanji
                                         "こんにちはむところやぶらこうじのぶらこうじパイポパイポパイポのシューリンガンシューリンガンのグーリンダイグーリンダイのポンポコピーのポンポコナの", &T::field2, // hiragana and katakana
                                         "한국인", &T::field3, // korean
                                         "русский", &T::field4, // cyrillic
                                         "สวัสดี", &T::field5, // thai
                                         "english", &T::field6 
    );
};

However, when I replace the key for field2 with こんにちはむところやぶら, which just drops most of the kana, it fails to compile.


stephenberry commented on May 24, 2024

@Nukoooo, I'm looking into this now. I'm wondering how you're using these structures with CSV reading. Glaze is designed to read CSV into containers like std::vector<std::string> or std::deque<float>. Are you trying to read into a vector of structures, like std::vector<FishRecord>? That is a reasonable request, but it is not currently supported. Could you provide an example CSV for one of these use cases? I'm curious why things sometimes work for you. I converted the member types to std::vector<int> and was not able to reproduce the unicode issues you are having, so there's something going on with the structure.

If you can clarify your use case a little bit more I think we can resolve this. Thanks!


Nukoooo commented on May 24, 2024

Ah, I just realized that the struct I provided was the one I used for testing. Sorry about that.

"Are you trying to read into a structure, like std::vector<FishRecord>?"

No, that struct was for testing with JSON. I originally thought the problem was caused by the CSV format, so I tried removing std::vector from the struct.

I was reading into std::vector<float>, std::vector<std::uint8_t>, std::vector<std::string> for user-readable data and std::vector<std::int32_t>, std::vector<bool> for raw data:

struct FishRecord
{
    std::vector<float> Duration;
    std::vector<float> FishSize;
    std::vector<std::uint8_t> Amount;

    std::vector<std::string> FishBaitName;
    std::vector<std::string> SurfaceSlapFishName;
    std::vector<std::string> MoochFishName;
    std::vector<std::string> BuffName;
    std::vector<std::string> FishingSpotPlaceName;

    std::vector<std::string> BiteTypeName;
    std::vector<std::string> CaughtFishName;
    std::vector<std::string> HooksetName;
    std::vector<std::string> IsLargeSizeName;
    std::vector<std::string> IsCollectableName;
};

template <>
struct glz::meta<FishRecord>
{
    using T = FishRecord;
    static constexpr auto value = object("上钩的鱼", &T::CaughtFishName, //
                                         "间隔", &T::Duration, //
                                         "尺寸", &T::FishSize, //
                                         "数量", &T::Amount, //
                                         "鱼饵", &T::FishBaitName, //
                                         "拍水的鱼", &T::SurfaceSlapFishName, // 
                                         "以小钓大的鱼", &T::MoochFishName, //
                                         "Buff", &T::BuffName, //
                                         "钓场", &T::FishingSpotPlaceName, //
                                         "咬钩类型", &T::BiteTypeName, //
                                         "提钩类型", &T::HooksetName, //
                                         "大尺寸", &T::IsLargeSizeName, //
                                         "收藏品", &T::IsCollectableName //
    );
};

struct FishRecordRaw
{
    std::vector<float> Duration;
    std::vector<float> FishSize;
    std::vector<std::uint8_t> Amount;

    std::vector<std::int32_t> FishBaitIds;
    std::vector<std::int32_t> SurfaceSlapFishIds;
    std::vector<std::int32_t> MoochFishIds;
    std::vector<std::int32_t> BuffFlags;
    std::vector<std::int32_t> FishingSpotPlaceNameId;

    std::vector<std::uint8_t> BiteType;
    std::vector<std::uint32_t> CaughtFishId;
    std::vector<std::uint8_t> Hookset;
    std::vector<bool> IsLargeSize;
    std::vector<bool> IsCollectable;
};

template <>
struct glz::meta<FishRecordRaw>
{
    using T = FishRecordRaw;
    static constexpr auto value = object(
                                         "上钩的鱼", &T::CaughtFishId, //
                                         "间隔", &T::Duration, //
                                         "尺寸", &T::FishSize, //
                                         "数量", &T::Amount, //
                                         "鱼饵", &T::FishBaitIds, //
                                         "拍水的鱼", &T::SurfaceSlapFishIds, // 
                                         "以小钓大的鱼", &T::MoochFishIds, //
                                         "Buff", &T::BuffFlags, //
                                         "钓场", &T::FishingSpotPlaceNameId, //
                                         "咬钩类型", &T::BiteType, //
                                         "提钩类型", &T::Hookset, //
                                         "大尺寸", &T::IsLargeSize, //
                                         "收藏品", &T::IsCollectable //
    );
};

The user-readable CSV data:

上钩的鱼,间隔,尺寸,数量,鱼饵,拍水的鱼,以小钓大的鱼,Buff,钓场,咬钩类型,提钩类型,大尺寸,收藏品
珊瑚蝶,8.365,12.5,1,万能拟饵,无,无,撒饵,利姆萨·罗敏萨下层甲板,重杆,提钩,否,否
海港鲱,7.586,11.1,1,万能拟饵,无,无,撒饵,利姆萨·罗敏萨下层甲板,重杆,提钩,否,否
猛犸章鱼,26.044,55.6,1,万能拟饵,无,海港鲱,撒饵,利姆萨·罗敏萨下层甲板,鱼王杆,提钩,否,否

The raw data:

上钩的鱼,间隔,尺寸,数量,鱼饵,拍水的鱼,以小钓大的鱼,Buff,钓场,咬钩类型,提钩类型,大尺寸,收藏品
4876,8.365,12.5,1,29717,0,0,2,29,2,1,0,0
4874,7.586,11.1,1,29717,0,0,2,29,2,1,0,0
7707,26.044,55.6,1,29717,0,4874,2,29,3,1,0,0
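
To make the column-wise layout concrete, here is a minimal hand-rolled sketch of how a header row with unicode keys maps onto one vector per column. It does not use Glaze and is not how Glaze parses CSV: split_line, read_columns, and Columns are hypothetical names, and the parser assumes comma-separated cells with no quoting, which holds for the sample data above.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of Glaze: splits one CSV line on commas.
// Assumes no quoted fields and no trailing empty cell.
inline std::vector<std::string> split_line(const std::string& line) {
    std::vector<std::string> cells;
    std::stringstream ss(line);
    std::string cell;
    while (std::getline(ss, cell, ',')) cells.push_back(cell);
    return cells;
}

// Column name (raw UTF-8 bytes) -> all cells of that column, in row order.
using Columns = std::unordered_map<std::string, std::vector<std::string>>;

// Reads a column-wise CSV: the first line is the header, and every
// following line contributes one cell to each column.
inline Columns read_columns(const std::string& csv) {
    Columns columns;
    std::stringstream ss(csv);
    std::string line;
    if (!std::getline(ss, line)) return columns;
    const std::vector<std::string> header = split_line(line);
    for (const auto& name : header) columns[name];  // create empty columns
    while (std::getline(ss, line)) {
        if (line.empty()) continue;
        const std::vector<std::string> cells = split_line(line);
        for (std::size_t i = 0; i < header.size() && i < cells.size(); ++i) {
            columns[header[i]].push_back(cells[i]);
        }
    }
    return columns;
}
```

The keys are compared as plain byte strings, so a unicode header such as 间隔 works as long as the lookup key uses the same encoding as the file.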


stephenberry commented on May 24, 2024

Thanks for this. I made some progress today figuring it out. It has to do with the naive_map and unicode. Will hopefully have a fix soon.

