
scnlib's People

Contributors

adembudak, agga, cjvaughter, danra, ednolan, eliaskosunen, jiayuehua, matbech, pawelwod, phoebe-leong, superwig, uilianries, verri, xvitaly

scnlib's Issues

Prepare for 0.2

  • Resolve #9 (sort of)
  • Fix interop with std::string_view (fixed in 0d5929c)
  • Investigate vscan overload range reference qualifiers
  • Clean up CMake
  • Clean up directories (fixed in c004db3)
  • Fix warnings: -Wunused in benchmark/bench_int.cpp and in test/usertype.cpp (fixed in a087ca8)
  • Fix 64-bit builds in VS 2019 over at Appveyor (fixed in 1b7c2df)
  • Finish docs on input ranges

May need to be postponed:

Function requirements

//Add a function similar to the following

std::string strRepos = "https://github.com/fmtlib/fmt";
std::string strUrl;
std::string strProto;
std::string strUserName;
std::string strProject;
scn::scan(strRepos , "{0}://{1}/{2}/{3}", strProto, strUrl, strUserName, strProject);
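For comparison, the split that the requested call would perform can be sketched with the standard library alone. This is only an illustration of the expected outputs, not scnlib code; `repo_url` and `parse_repo_url` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical holder for the four fields named in the request above.
struct repo_url {
    std::string proto, host, user, project;
};

// Splits "<proto>://<host>/<user>/<project>".
// Returns false if the input does not have that shape.
inline bool parse_repo_url(std::string_view in, repo_url& out) {
    constexpr auto npos = std::string_view::npos;
    const auto scheme_end = in.find("://");
    if (scheme_end == npos) return false;
    const auto host_begin = scheme_end + 3;
    const auto host_end = in.find('/', host_begin);
    if (host_end == npos) return false;
    const auto user_end = in.find('/', host_end + 1);
    if (user_end == npos) return false;
    out.proto = std::string(in.substr(0, scheme_end));
    out.host = std::string(in.substr(host_begin, host_end - host_begin));
    out.user = std::string(in.substr(host_end + 1, user_end - host_end - 1));
    out.project = std::string(in.substr(user_end + 1));
    return true;
}
```

For the input above this would give `https`, `github.com`, `fmtlib` and `fmt`.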

How to scan from std::string

With an older version of this library, I used to do

std::string str = ...;
auto stream = scn::make_stream(str);
auto ret = scn::scan(stream, ...);

After updating to the latest, make_stream is no longer present (although it still appears in some test files).

What is the canonical way to parse from a std::string? Is there a reason this doesn't work (I get verbose template errors)?

std::string str = ...;
auto ret = scn::scan(str, ...);

I had some success with

std::string str = ...;
auto ret = scn::scan(std::string_view(str), ...);

but that seems unnecessarily verbose and there might be some issues with AppleClang on macOS.

Before I dig deeper to find the error, I wanted to check how this is supposed to work. Maybe scanning from std::string warrants a prominent example (in the Readme; the docs seem to be down).

How to scan to customized types correctly.

Scanning a single value into a customized type is shown in test/usertype.cpp.
But what if I want to scan multiple values?
If a user_type is among the arguments, the argument indexing goes wrong.

# usertype.cpp #68

        std::string s1;
        std::string s2;
        auto ret = scn::scan("123 [4, 20] 1234", "{} {} {}", s1, ut, s2);
        CHECK(ret);
        CHECK(ut.val1 == 4);
        CHECK(ut.val2 == 20);
        CHECK(s1 == "123");
        CHECK(s2 == "1234");

expected: the test passes.
current:

[doctest] doctest version is "2.3.4"
[doctest] run with "--help" for options
===============================================================================
..\..\..\test\usertype.cpp(90):
TEST CASE:  user type<user_type>
  regular

..\..\..\test\usertype.cpp(75): ERROR: CHECK( s2 == "1234" ) is NOT correct!
  values: CHECK(  == 1234 )

===============================================================================
..\..\..\test\usertype.cpp(90):
TEST CASE:  user type<user_type2>
  regular

..\..\..\test\usertype.cpp(75): ERROR: CHECK( s2 == "1234" ) is NOT correct!
  values: CHECK(  == 1234 )

===============================================================================
[doctest] test cases:      3 |      1 passed |      2 failed |      0 skipped
[doctest] assertions:     21 |     19 passed |      2 failed |
[doctest] Status: FAILURE!

'NOMINMAX': macro redefinition

# file.cpp 32:
#elif SCN_WINDOWS
#define WIN32_LEAN_AND_MEAN
#define NOMINMAX  // no minmax redefinition
#include <Windows.h>
#undef NOMINMAX
#undef WIN32_LEAN_AND_MEAN
#endif

This should first check whether NOMINMAX is already defined.
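One possible shape for such a guard is sketched below. The `EXAMPLE_UNDEF_*` helper macros are hypothetical names used only to remember which macros this file defined itself, and `_WIN32` stands in for the library's own platform check.

```cpp
#include <algorithm>
#include <cassert>

// Sketch only: the EXAMPLE_UNDEF_* names are hypothetical, not scnlib's.
#if defined(_WIN32)
#  ifndef WIN32_LEAN_AND_MEAN
#    define WIN32_LEAN_AND_MEAN
#    define EXAMPLE_UNDEF_LEAN_AND_MEAN
#  endif
#  ifndef NOMINMAX
#    define NOMINMAX  // only defined here if the user has not done so already
#    define EXAMPLE_UNDEF_NOMINMAX
#  endif
#  include <Windows.h>
// Undefine only what this file defined, leaving user definitions alone.
#  ifdef EXAMPLE_UNDEF_NOMINMAX
#    undef NOMINMAX
#    undef EXAMPLE_UNDEF_NOMINMAX
#  endif
#  ifdef EXAMPLE_UNDEF_LEAN_AND_MEAN
#    undef WIN32_LEAN_AND_MEAN
#    undef EXAMPLE_UNDEF_LEAN_AND_MEAN
#  endif
#endif

// With NOMINMAX in effect, <Windows.h> does not define min/max macros,
// so std::min keeps working after the include.
inline int smaller_of(int a, int b) { return std::min(a, b); }
```

This avoids the redefinition warning and also avoids undefining a NOMINMAX that the user set on the command line.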

Explore scanning at compile-time

Perhaps CTRE-esque

int i;
scn::scan<"{}">(stream, i);

Or do we only limit ourselves to compile-time format string checking (fmt-esque):

int i;
scn::scan(stream, SCN_STRING("{}"), i);
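As a rough idea of what the second option (format string checking only) involves, here is a minimal standard-library sketch. It is not scnlib code: `count_placeholders` and `format_matches_args` are hypothetical helpers, and the sketch assumes C++17 and only recognizes plain `{}` placeholders (no `{0}`, no `{:c}`, no escaped braces).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Counts plain "{}" placeholders in a format string at compile time.
constexpr std::size_t count_placeholders(std::string_view fmt) {
    std::size_t count = 0;
    for (std::size_t i = 0; i + 1 < fmt.size(); ++i) {
        if (fmt[i] == '{' && fmt[i + 1] == '}') {
            ++count;
            ++i;  // skip the closing brace
        }
    }
    return count;
}

// True if the number of placeholders equals the number of argument types.
template <typename... Args>
constexpr bool format_matches_args(std::string_view fmt) {
    return count_placeholders(fmt) == sizeof...(Args);
}

static_assert(format_matches_args<int>("{}"),
              "one placeholder for one argument");
static_assert(!format_matches_args<int>("{} {}"),
              "a surplus placeholder is rejected at compile time");
```

A real implementation would also have to validate format specifiers against each argument type, which is where most of the work lies.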

scn::owning_file + scn::getline() 1000x slower than STL in release and 3000x slower in debug

I am reading 962 lines of text from a file, as 3 comma separated strings.

Source code (note I elided some error checking for clarity):
STL:

while (true)
{
    std::getline(playbackFile, timestampStr, ',');
    if (playbackFile.eof())
        break;

    std::getline(playbackFile, instrumentName, ',');

    std::getline(playbackFile, str);

    playbackStrings_.emplace_back(timestampStr, instrumentName, str);
}

scn:

while (true)
{
    result = scn::getline(result.range(), timestampStr, ',');
    if (result.error() == scn::error::end_of_range)
        break;

    result = scn::getline(result.range(), instrumentName, ',');
    result = scn::getline(result.range(), str);

    playbackStrings_.emplace_back(timestampStr, instrumentName, str);
}

std::istream + std::getline, debug:

Load took 13ms202us

std::istream + std::getline, release:

Load took 2ms457us

scn::owning_file + scn::getline, debug:

Load took 47s636ms

scn::owning_file + scn::getline, release:

Load took 3s223ms

So that's a pretty massive difference.

I tried to use scn::mapped_file, but that failed because of CRLF line endings (Windows vs Linux): scn::getline() returns something like "foo\r". There doesn't seem to be an easy fix for this, as the optional "separator" params are all char rather than strings, so you can't pass "\r\n".
If I comment out enough checks, I can get it to run, and that is indeed fast:
scn::mapped_file + scn::getline, debug:

Load took 12ms843us

scn::mapped_file + scn::getline, release:

Load took 864us

So it's a choice between unusable due to CRLF or unusable due to being 1000x-3000x slower than std.

For now I'm sticking with STL (I don't actually need scnlib for this particular thing, I just wanted to try it out as I may need more sophisticated parsing in future). Please let me know if I'm doing anything dumb here, or there's a solution I've missed - thanks.
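A common way around the single-character separator limitation is to split on '\n' and drop a trailing '\r' afterwards. The sketch below shows that post-processing step with std::getline, since that is the only API assumed here; `getline_crlf` is a hypothetical helper name, and the same trimming could be applied to a string filled by any other line reader.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads one '\n'-terminated line and removes a trailing '\r', so files
// with CRLF and LF line endings produce the same strings.
inline bool getline_crlf(std::istream& in, std::string& line) {
    if (!std::getline(in, line)) {
        return false;  // end of input or stream error
    }
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();
    }
    return true;
}
```

Only the last field of each record needs this treatment in the loop above, because the first two fields end at a comma.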

clang-cl v13.0.0 (msvcrt runtime). H/w: Core i9 12900K @ 4.9GHz + PCI4 M2 SSD @7GB/s.

scn::scan/prompt from stdin with literal text does not work

When trying to scan with a format string that has a literal at the start, a debug assertion fires:

File: C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.26.28801\include\xstring
Line: 1853

Expression: cannot seek string iterator before begin

Code I used, adapted from the examples (using scnlib 0.3):

#include <scn/scn.h>
#include <iostream>

int main() {
    int i = 0;
    if (scn::prompt("Hi there! What's your favorite number? ", "uhh, {}", i)) {
        std::cout << "Ooh, " << i << ", interesting!\n";
    }
    else {
        std::cout << "That doesn't look like any number I've ever seen.\n";
    }
}

Add support for fixed length arrays output variables

I was expecting something like this to work:

char bla[8];
scn::scan("input", "{}", bla);

However I'm getting this compiler error:
scnlib\include\scn\detail\args.h(118,53): error C2079: 's' uses undefined struct 'scn::v0::scanner<char,T,void>'

I'm aware that I can use a string_view output argument as a workaround:

string_view bla;
scn::scan("input", "{}", bla);
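Until arrays are supported directly, the string_view workaround can be followed by a bounded copy into the fixed-length buffer. The helper below is a hypothetical standard-library sketch (C++17), not part of scnlib; it refuses input that would not fit together with the terminating null character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Copies `word` into `out` and null-terminates it.
// Returns false (leaving `out` untouched) if the word does not fit.
template <std::size_t N>
bool copy_to_array(std::string_view word, char (&out)[N]) {
    if (word.size() >= N) {
        return false;  // no room for the terminating '\0'
    }
    std::memcpy(out, word.data(), word.size());
    out[word.size()] = '\0';
    return true;
}
```

The explicit size check is the design point: a direct array overload in the library would need a similar rule for input longer than the array.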

scan_list_ex() issue

This is pretty much an exact example from the docs:

std::vector<std::string> vec;
auto result = scn::scan_list_ex("123, 456,\n789", vec, scn::list_separator_and_until(',', '\n'));

The (unexpected) result is that the vector contains 2 elements "123," and "456," (ie the comma is in the result string).

If I change the input string to "123,456,\n789" (ie remove the space) the output is 1 element of "123,456,". So it looks a lot like it's just splitting at the whitespace.
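To make the expectation concrete: assuming the intended result is the two elements "123" and "456" (separator ',', stop at '\n', whitespace skipped), the behavior can be modeled with the standard library as follows. `split_list` is a hypothetical helper and says nothing about how scnlib implements scan_list_ex.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>
#include <vector>

// Splits `input` at `separator`, stops at the first `until` character,
// skips other whitespace, and drops empty elements.
inline std::vector<std::string> split_list(std::string_view input,
                                           char separator, char until) {
    std::vector<std::string> out;
    std::string current;
    for (char ch : input) {
        if (ch == until) {
            break;
        }
        if (ch == separator) {
            if (!current.empty()) {
                out.push_back(current);
            }
            current.clear();
        } else if (!std::isspace(static_cast<unsigned char>(ch))) {
            current.push_back(ch);
        }
    }
    if (!current.empty()) {
        out.push_back(current);
    }
    return out;
}
```

Under this model both input strings from the report yield the same two elements, with no commas in them.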

I'm using v1.0 (latest available in vcpkg).

getline() doesn't accept std::string as output

Platform: MSVC

std::string_view line;
auto ret = scn::getline("aa\nbb", line);

error log:

usertype.cpp
C:\projects\scnlib\include\scn\detail/scan.h(354): error C2039: 'clear': is not a member of 'std::basic_string_view<char,std::char_traits<char>>'
c:\program files (x86)\microsoft visual studio\2019\community\VC\Tools\MSVC\14.24.28314\include\xstring(1709): note: see declaration of 'std::basic_string_view<char,std::char_traits<char>>'
C:\projects\scnlib\include\scn\detail/scan.h(424): note: see reference to function template instantiation 'scn::v0::scan_result<scn::v0::basic_string_view<char>,scn::v0::wrapped_error> scn::v0::detail::getline_impl<T,String,CharT>(WrappedRange &,String &,CharT)' being compiled
        with
        [
            T=scn::v0::detail::range_wrapper<scn::v0::string_view>,
            String=std::string_view,
            CharT=char,
            WrappedRange=scn::v0::detail::range_wrapper<scn::v0::string_view>
        ]
C:\projects\scnlib\include\scn\detail/scan.h(445): note: see reference to function template instantiation 'scn::v0::scan_result<scn::v0::basic_string_view<char>,scn::v0::wrapped_error> scn::v0::getline<const char(&)[6],String,char>(Range,String &,CharT)' being compiled
        with
        [
            String=std::string_view,
            Range=const char (&)[6],
            CharT=char
        ]
C:\projects\scnlib\test\usertype.cpp(139): note: see reference to function template instantiation 'scn::v0::scan_result<scn::v0::basic_string_view<char>,scn::v0::wrapped_error> scn::v0::getline<const char(&)[6],std::string_view,char>(Range,String &)' being compiled
        with
        [
            Range=const char (&)[6],
            String=std::string_view
        ]
c:\program files (x86)\microsoft visual studio\2019\community\VC\Tools\MSVC\14.24.28314\include\chrono(632): note: see reference to class template instantiation 'std::chrono::duration<double,std::ratio<1,1>>' being compiled
c:\program files (x86)\microsoft visual studio\2019\community\VC\Tools\MSVC\14.24.28314\include\chrono(178): note: see reference to class template instantiation 'std::chrono::duration<__int64,std::nano>' being compiled
c:\program files (x86)\microsoft visual studio\2019\community\VC\Tools\MSVC\14.24.28314\include\chrono(610): note: see reference to class template instantiation 'std::chrono::time_point<std::chrono::steady_clock,std::chrono::nanoseconds>' being compiled
C:\projects\scnlib\include\scn\detail/scan.h(355): error C2039: 'resize': is not a member of 'std::basic_string_view<char,std::char_traits<char>>'
c:\program files (x86)\microsoft visual studio\2019\community\VC\Tools\MSVC\14.24.28314\include\xstring(1709): note: see declaration of 'std::basic_string_view<char,std::char_traits<char>>'
C:\projects\scnlib\include\scn\detail/scan.h(368): error C2039: 'pop_back': is not a member of 'std::basic_string_view<char,std::char_traits<char>>'
c:\program files (x86)\microsoft visual studio\2019\community\VC\Tools\MSVC\14.24.28314\include\xstring(1709): note: see declaration of 'std::basic_string_view<char,std::char_traits<char>>'

Float parser accesses invalid heap memory

Hello,

I have stumbled on this behavior when troubleshooting a completely different problem with my code using address sanitizer. It seems that when there is no default locale, scn's float parser accesses invalid memory. This happens during calls to std::setlocale that override and restore locale (for instance here). Obviously, if there is no locale to restore this does not make too much sense.

The bug is present in all scn versions up to the latest. It does not really disrupt program execution, however it correctly triggers ASAN's hooks, so it could potentially lead to undefined behavior in the right circumstances. At the moment, the bug can be mitigated by ensuring that a default locale is always configured (e.g. by running the program in question with LC_NUMERIC=C).

I attach a (partially censored) ASAN report: asan.log

runtime performance comparison with boost::spirit::x3

At first glance this library seems to be targeting simple pattern parsing and not grammar parsing (which is what I need most of the time), but simple parsing can also be done with boost spirit x3, so I use that library for more or less all string parsing.

Since this library is now the basis of a C++ standard proposal, I'd be interested in its runtime performance compared to the library that I would like to see as the basis of a standardization effort: boost::spirit::x3.

I wasn't able to find an example but my basic pattern is usually not a single item but a list of items of unknown length so comparison with something like this would be nice:

#include <boost/config/warning_disable.hpp>
#include <boost/spirit/home/x3.hpp>

#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::string str{"1.3, 4.5, 20"};

    using boost::spirit::x3::double_;
    using boost::spirit::x3::phrase_parse;
    using boost::spirit::x3::ascii::space;
    
    std::vector<double> my_vector;

    bool r = phrase_parse(
        str.begin(),
        str.end(),
        double_ >> *(',' >> double_),
        space,
        my_vector
    );
    if (!r) // fail if we did not get a full match
        return 1;

    std::cout << "-------------------------\n";
    std::cout << "Parsing succeeded\n";

    for(auto f : my_vector)
        std::cout << f << ",\n";

    return 0;
}

This can be executed inside https://wandbox.org/

Feature request: Simple way to consider remaining input a failure

I often need to scan a string and check that the entire string was consumed. I can do:

int num;
auto result = scn::scan(filename, "item-{}.txt", num);
if (result && result.range().empty()) {
    // Filename is valid
}

but I would rather type something like:

int num;
if (scn::scan_all(filename, "item-{}.txt", num)) {
    // Filename is valid
}

Could this be considered (or is there some other simple way to do this that I am missing)?
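For reference, the same "whole input must match" check is a known idiom with std::sscanf and `%n`. The sketch below is a standard-library analogue of the requested behavior, not scnlib code, and `parse_item_filename` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Succeeds only if `filename` matches "item-<int>.txt" with nothing left over.
inline bool parse_item_filename(const char* filename, int& num) {
    int consumed = -1;  // stays -1 if the match stops before reaching %n
    const int converted =
        std::sscanf(filename, "item-%d.txt%n", &num, &consumed);
    return converted == 1 && consumed >= 0 &&
           static_cast<std::size_t>(consumed) == std::strlen(filename);
}
```

A `scan_all`-style function would fold the "nothing left over" comparison into the library call, which is exactly the convenience asked for here.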

int8_t and uint8_t support: unify char_scanner and integer_scanner

Currently, char and wchar_t are scanned by char_scanner, and other integral types are scanned by integer_scanner.
char_scanner reads a single code unit from the input range, and integer_scanner scans an integral value (parsing it from the input range).

int8_t (signed char) and uint8_t (unsigned char) are distinct types from char, but are currently not supported by either char_scanner or integer_scanner. Scanning 8-bit integers should be supported.

Proposed solution

  • Remove char_scanner, use integer_scanner for all integral types
  • Add {:c} option to integer_scanner, that scans a code unit
  • Default behavior for char: {:c} (scan a code unit), int8_t and uint8_t: {} (scan an integer)
// Proposed solution for v1.1
uint8_t a;
int8_t b, c;
char d, e;
auto ret = scn::scan("1 2 3 4 5", "{} {} {:c} {} {:c}", a, b, c, d, e);
// a == 1
// b == 2
// c == '3' (0x33)
// d == '4' (0x34)
// e == '5' (0x35)

Workaround for v1.0

// Workaround for v1.0
int8_t val;
// Scan an int
int a;
auto ret = scn::scan("123", "{}", a);
// Check limits
if (ret && a <= std::numeric_limits<int8_t>::max() && a >= std::numeric_limits<int8_t>::min()) {
    // valid int8_t scanned
    val = static_cast<int8_t>(a);
} else {
    // error
}

Invalid read after `setlocale`

I've just tracked down an invalid read with ASAN ("heap-use-after-free"), to the setlocale calls in scn::detail::read_float_impl. E.g.:

scnlib/src/reader.cpp

Lines 92 to 97 in 0a2a4ba

const auto loc = std::setlocale(LC_NUMERIC, nullptr);
std::setlocale(LC_NUMERIC, "C");
errno = 0;
double d = std::strtod(str, &end);
chars = static_cast<size_t>(end - str);
std::setlocale(LC_NUMERIC, loc);

It's not entirely clear from cppreference.com, but I believe that the value returned by std::setlocale(LC_NUMERIC, nullptr) is invalidated by the subsequent std::setlocale(LC_NUMERIC, "C") call. It somehow only made my Windows CI tests crash, but it is pretty consistent there, and ASAN also triggers on it on Linux.

ASAN stops complaining when I make the following change:

@@ -62,12 +62,12 @@ namespace scn {
             static expected<float> get(const wchar_t* str, size_t& chars)
             {
                 wchar_t* end{};
-                const auto loc = std::setlocale(LC_NUMERIC, nullptr);
+                const auto loc = std::string(std::setlocale(LC_NUMERIC, nullptr));
                 std::setlocale(LC_NUMERIC, "C");
                 errno = 0;
                 float f = std::wcstof(str, &end);
                 chars = static_cast<size_t>(end - str);
-                std::setlocale(LC_NUMERIC, loc);
+                std::setlocale(LC_NUMERIC, loc.c_str());
                 if (errno == ERANGE) {
                     errno = 0;
                     return error(error::value_out_of_range,

If you wish, I could make a PR to fix this?
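The same save-and-restore pattern, written as a standalone function, could look like the sketch below. It assumes nothing beyond the C and C++ standard libraries; `strtod_in_c_locale` is a hypothetical name, and the null check is an extra precaution that the diff above does not have, in case setlocale returns a null pointer.

```cpp
#include <cassert>
#include <cerrno>
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <string>

// Parses a double with LC_NUMERIC temporarily set to "C".
// `chars` receives the number of characters consumed.
inline double strtod_in_c_locale(const char* str, std::size_t& chars) {
    // Copy the locale name: the string returned by setlocale may be
    // overwritten by a later setlocale call, so the raw pointer must not
    // be kept across the call below.
    const char* current = std::setlocale(LC_NUMERIC, nullptr);
    const std::string saved = current ? current : "C";

    std::setlocale(LC_NUMERIC, "C");
    char* end = nullptr;
    errno = 0;
    const double value = std::strtod(str, &end);
    chars = static_cast<std::size_t>(end - str);

    // Restore from the owned copy, not from the possibly stale pointer.
    std::setlocale(LC_NUMERIC, saved.c_str());
    return value;
}
```

This still mutates global locale state and is therefore not thread-safe; it only removes the read of a stale pointer.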


That being reported, a related question:
One of the reasons I am very keen on using scn is because it provides locale-independent float-to-string conversions, and I had hoped it would stay away from the ugly global setlocale state. So kind of related to #24, would you be interested in getting rid of these setlocale calls?

I have a bit of a hackish solution for the moment, tapping into the low-level parts of iostream, but it does manage to not call setlocale. And afborchert/fmt-scan seems to be doing something similar (though probably much better than me). Maybe such a solution can be hidden inside scnlib? Compared to the suggestions in #24, these wouldn't require external dependencies.
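A much simpler (and probably slower) variant of the iostream idea is to parse through a string stream imbued with the classic locale, which never touches the global C locale. This is only a sketch under that assumption; `parse_double_classic` is a hypothetical name and is not what either project mentioned above does.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Parses a double using the classic "C" locale of this stream only.
// Neither std::setlocale nor the global std::locale is modified.
inline bool parse_double_classic(const std::string& text, double& out) {
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    return static_cast<bool>(in >> out);
}
```

Constructing a stream per call is costly, so this illustrates the locale handling rather than a fast path.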

Prepare for 0.1

The most glaring issue is the lack of documentation, but some other minor issues also need to be fixed before release.

Unable to use scn::scan with string_view under MSVC.

The code from README is unable to compile under MSVC (Visual Studio 2019 16.4.2):

#pragma once
#include <string>
#include <string_view>
#include <scn/scn.h>
#include <iostream>

using namespace std;
using namespace std::string_literals;
using namespace std::string_view_literals;

int main()
{
    std::string_view str = "Hello world!"sv;

    std::string word;
    scn::scan(str, "{}", word);

    std::cout << word << '\n'; // Will output "Hello"
    std::cout << str << '\n';  // Will output " world!"
}

Error Output:

1>Source.cpp
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\range.h(77,1): error C2440: 'return': cannot convert from 'initializer list' to 'std::basic_string_view<char,std::char_traits<char>>'
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\range.h(77,1): message : No constructor could take the source type, or constructor overload resolution was ambiguous
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\range.h(121): message : see reference to function template instantiation 'std::basic_string_view<char,std::char_traits<char>> scn::v0::detail::reconstruct<char,std::char_traits<char>,std::_String_view_iterator<_Traits>,std::_String_view_iterator<_Traits>>(scn::v0::detail::reconstruct_tag<std::basic_string_view<char,_Traits>>,Iterator,Sentinel)' being compiled
1>        with
1>        [
1>            _Traits=std::char_traits<char>,
1>            Iterator=std::_String_view_iterator<std::char_traits<char>>,
1>            Sentinel=std::_String_view_iterator<std::char_traits<char>>
1>        ]
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\range.h(119): message : while compiling class template member function 'std::basic_string_view<char,std::char_traits<char>> scn::v0::detail::range_wrapper<std::string_view &>::range(void) const'
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\range.h(126): message : see reference to function template instantiation 'std::basic_string_view<char,std::char_traits<char>> scn::v0::detail::range_wrapper<std::string_view &>::range(void) const' being compiled
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\range.h(263): message : see reference to class template instantiation 'scn::v0::detail::range_wrapper<std::string_view &>' being compiled
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\range.h(261): message : while compiling class template member function 'scn::v0::detail::range_wrapper<std::string_view &> scn::v0::detail::_wrap::fn::operator ()<std::string_view&>(Range) noexcept const'
1>        with
1>        [
1>            Range=std::string_view &
1>        ]
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\scan.h(32): message : see reference to class template instantiation 'scn::v0::detail::range_wrapper_for<Range>' being compiled
1>        with
1>        [
1>            Range=std::string_view &
1>        ]
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\scan.h(32): message : see reference to alias template instantiation 'scn::v0::detail::range_wrapper_for_t<std::basic_string_view<char,std::char_traits<char>>&>' being compiled
1>C:\Users\Weizehua\source\repos\test_scnlib\test_scnlib\Source.cpp(15): message : see reference to class template instantiation 'scn::v0::detail::scan_result_for_range<std::string_view &,scn::v0::wrapped_error>' being compiled
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\scan.h(73): message : see reference to alias template instantiation 'scn::v0::detail::scan_result_for_range_t<std::string_view&,scn::v0::wrapped_error>' being compiled
1>C:\projects\vcpkg\installed\x64-windows\include\scn\detail\scan.h(135): message : see reference to function template instantiation 'scan_result_for_range<Range,scn::v0::wrapped_error>::type scn::v0::scan(Range &&,scn::v0::detail::default_t,Args &...)' being compiled
1>Done building project "test_scnlib.vcxproj" -- FAILED.

It seems std::basic_string_view doesn't accept a pair of iterators as constructor arguments.
std::basic_string_view can only be constructed from (const char*, size_t), so this form is acceptable:

// range.h #77:
// distance uses end - begin, which is unsupported for std::string_view with different range in DEBUG mode.
            return {&*begin, static_cast<size_t>(std::to_address(end) - std::to_address(begin))};
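For compilers without std::to_address (it is a C++20 facility), the same reconstruction can be sketched in C++17 as below. `view_from_iterators` is a hypothetical helper; it assumes both iterators come from the same std::string_view and handles the empty range separately so that the end iterator is never dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Rebuilds a string_view from two iterators into the same view using the
// (pointer, length) constructor.
inline std::string_view view_from_iterators(
    std::string_view::const_iterator first,
    std::string_view::const_iterator last) {
    const auto length = static_cast<std::size_t>(last - first);
    if (length == 0) {
        return {};  // avoid dereferencing a possible end iterator
    }
    return {&*first, length};
}
```

With the README input, taking the view from the sixth character to the end gives " world!".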

BTW, I guess the test case inside string_view.cpp should construct a run-time string (std::string, raw string buffer, etc.) instead of passing a compile-time string literal as the first argument to scn::scan, because nobody needs a library that only scans from compile-time strings.

small-vector test failure when building with GCC 11

small-vector test failure when building with GCC 11.

GCC version 11.0.0.

Log:

+ /usr/bin/ctest --output-on-failure --force-new-ctest-process -j48
Test project /builddir/build/BUILD/scnlib-0.4/x86_64-redhat-linux-gnu
      Start  1: test
      Start  2: empty
      Start  3: util
      Start  4: small-vector
      Start  5: string-view
      Start  6: reader
      Start  7: range
      Start  8: locale
      Start  9: std-string-view
      Start 10: pmr-string
      Start 11: result
      Start 12: istream
      Start 13: tuple-return
      Start 14: char
      Start 15: integer
      Start 16: float
      Start 17: string
      Start 18: buffer
      Start 19: bool
      Start 20: usertype
      Start 21: list
      Start 22: each-integer
      Start 23: each-char
      Start 24: file
 1/24 Test  #1: test .............................   Passed    0.02 sec
 2/24 Test  #2: empty ............................   Passed    0.02 sec
 3/24 Test  #3: util .............................   Passed    0.02 sec
 4/24 Test  #5: string-view ......................   Passed    0.02 sec
 5/24 Test  #6: reader ...........................   Passed    0.02 sec
 6/24 Test  #7: range ............................   Passed    0.01 sec
 7/24 Test  #8: locale ...........................   Passed    0.01 sec
 8/24 Test  #9: std-string-view ..................   Passed    0.01 sec
 9/24 Test #10: pmr-string .......................   Passed    0.01 sec
10/24 Test #11: result ...........................   Passed    0.01 sec
11/24 Test #12: istream ..........................   Passed    0.01 sec
12/24 Test #13: tuple-return .....................   Passed    0.01 sec
13/24 Test #14: char .............................   Passed    0.01 sec
14/24 Test #15: integer ..........................   Passed    0.01 sec
15/24 Test #16: float ............................   Passed    0.01 sec
16/24 Test #17: string ...........................   Passed    0.01 sec
17/24 Test #18: buffer ...........................   Passed    0.01 sec
18/24 Test #19: bool .............................   Passed    0.01 sec
19/24 Test #20: usertype .........................   Passed    0.01 sec
20/24 Test #21: list .............................   Passed    0.01 sec
21/24 Test #23: each-char ........................   Passed    0.00 sec
22/24 Test #24: file .............................   Passed    0.00 sec
23/24 Test #22: each-integer .....................   Passed    0.07 sec
24/24 Test  #4: small-vector .....................Subprocess aborted***Exception:   0.10 sec
free(): invalid size
[doctest] doctest version is "2.4.4"
[doctest] run with "--help" for options
===============================================================================
../test/small_vector.cpp:263:
TEST CASE:  small_vector<signed char>
  size+value construct stack
../test/small_vector.cpp:97: ERROR: CHECK( vec.is_small() ) is NOT correct!
  values: CHECK( false )
../test/small_vector.cpp:99: ERROR: CHECK( vec.capacity() == 64 ) is NOT correct!
  values: CHECK( 3038287259199220266 == 64 )
===============================================================================
../test/small_vector.cpp:263:
TEST CASE:  small_vector<signed char>
  accessors stack
../test/small_vector.cpp:263: FATAL ERROR: test case CRASHED: SIGABRT - Abort (abnormal termination) signal
===============================================================================
../test/small_vector.cpp:263:
TEST CASE:  small_vector<signed char>
DEEPEST SUBCASE STACK REACHED (DIFFERENT FROM THE CURRENT ONE):
  accessors stack
===============================================================================
[doctest] test cases:  1 |  0 passed | 1 failed | 5 skipped
[doctest] assertions: 41 | 39 passed | 2 failed |
[doctest] Status: FAILURE!
96% tests passed, 1 tests failed out of 24
Total Test time (real) =   0.11 sec
The following tests FAILED:
	  4 - small-vector (Subprocess aborted)

Some tests failed on Big Endian architecture

Test project /builddir/build/BUILD/scnlib-1.0/redhat-linux-build
      Start  1: test
      Start  2: empty
 1/30 Test  #1: test .............................   Passed    0.00 sec
      Start  3: fwd
 2/30 Test  #2: empty ............................   Passed    0.00 sec
      Start  4: util
 3/30 Test  #3: fwd ..............................   Passed    0.00 sec
      Start  5: small-vector
 4/30 Test  #4: util .............................   Passed    0.00 sec
      Start  6: string-view
 5/30 Test  #5: small-vector .....................   Passed    0.00 sec
      Start  7: reader
 6/30 Test  #6: string-view ......................   Passed    0.00 sec
      Start  8: range
 7/30 Test  #7: reader ...........................   Passed    0.00 sec
      Start  9: locale
 8/30 Test  #8: range ............................   Passed    0.00 sec
      Start 10: std-string-view
 9/30 Test  #9: locale ...........................   Passed    0.00 sec
      Start 11: pmr-string
10/30 Test #10: std-string-view ..................   Passed    0.00 sec
      Start 12: wrap
11/30 Test #11: pmr-string .......................   Passed    0.00 sec
      Start 13: utf8
12/30 Test #12: wrap .............................   Passed    0.00 sec
      Start 14: utf16
13/30 Test #13: utf8 .............................   Passed    0.00 sec
      Start 15: result
14/30 Test #14: utf16 ............................   Passed    0.00 sec
      Start 16: istream
15/30 Test #15: result ...........................   Passed    0.00 sec
      Start 17: format
16/30 Test #17: format ...........................   Passed    0.00 sec
      Start 18: tuple-return
17/30 Test #16: istream ..........................   Passed    0.01 sec
      Start 19: char
18/30 Test #18: tuple-return .....................   Passed    0.00 sec
      Start 20: integer
19/30 Test #20: integer ..........................***Failed    0.00 sec
[doctest] doctest version is "2.4.8"
[doctest] run with "--help" for options
===============================================================================
/builddir/build/BUILD/scnlib-1.0/test/integer.cpp:822:
TEST CASE:  consistency
  simple
/builddir/build/BUILD/scnlib-1.0/test/integer.cpp:822: ERROR: test case THREW exception: exception thrown in subcase - will translate later when the whole test case has been exited (cannot translate while there is an active exception)
===============================================================================
/builddir/build/BUILD/scnlib-1.0/test/integer.cpp:822:
TEST CASE:  consistency
DEEPEST SUBCASE STACK REACHED (DIFFERENT FROM THE CURRENT ONE):
  simple
/builddir/build/BUILD/scnlib-1.0/test/integer.cpp:822: ERROR: test case THREW exception: basic_string::substr: __pos (which is 12884901888) > this->size() (which is 7)
===============================================================================
[doctest] test cases:  38 |  37 passed | 1 failed | 0 skipped
[doctest] assertions: 941 | 941 passed | 0 failed |
[doctest] Status: FAILURE!
      Start 21: float
20/30 Test #19: char .............................   Passed    0.01 sec
      Start 22: string
21/30 Test #22: string ...........................   Passed    0.00 sec
      Start 23: string-set
22/30 Test #21: float ............................***Failed    0.01 sec
[doctest] doctest version is "2.4.8"
[doctest] run with "--help" for options
===============================================================================
/builddir/build/BUILD/scnlib-1.0/test/floating.cpp:285:
TEST CASE:  consistency
  simple
/builddir/build/BUILD/scnlib-1.0/test/floating.cpp:285: ERROR: test case THREW exception: exception thrown in subcase - will translate later when the whole test case has been exited (cannot translate while there is an active exception)
===============================================================================
/builddir/build/BUILD/scnlib-1.0/test/floating.cpp:285:
TEST CASE:  consistency
DEEPEST SUBCASE STACK REACHED (DIFFERENT FROM THE CURRENT ONE):
  simple
/builddir/build/BUILD/scnlib-1.0/test/floating.cpp:285: ERROR: test case THREW exception: basic_string::substr: __pos (which is 17179869184) > this->size() (which is 9)
===============================================================================
[doctest] test cases:  13 |  12 passed | 1 failed | 0 skipped
[doctest] assertions: 247 | 247 passed | 0 failed |
[doctest] Status: FAILURE!
      Start 24: buffer
23/30 Test #23: string-set .......................   Passed    0.00 sec
      Start 25: bool
24/30 Test #25: bool .............................   Passed    0.00 sec
      Start 26: usertype
25/30 Test #24: buffer ...........................   Passed    0.01 sec
      Start 27: list
26/30 Test #26: usertype .........................   Passed    0.00 sec
      Start 28: each-integer
27/30 Test #27: list .............................   Passed    0.00 sec
      Start 29: each-char
28/30 Test #29: each-char ........................   Passed    0.00 sec
      Start 30: file
29/30 Test #30: file .............................   Passed    0.00 sec
30/30 Test #28: each-integer .....................   Passed    0.06 sec
93% tests passed, 2 tests failed out of 30

__host__ __device__ errors when compiling with Clang 9 in CUDA mode

When including <scn/tuple_return.h>, for example, Clang 9 in CUDA mode results in the following errors:

.../scnlib-master/include/scn/detail/locale.h:268:9: error: reference to __host__ function '~unique_ptr' in __host__ __device__ function
        basic_locale_ref() = default;
        ^
.../scnlib-master/include/scn/detail/util.h:663:13: note: '~unique_ptr' declared here
            ~unique_ptr() noexcept
            ^
.../scnlib-master/include/scn/detail/locale.h:336:20: note: called by 'get_default'
            return basic_locale_ref();
                   ^
.../scnlib-master/include/scn/detail/locale.h:336:20: error: reference to __host__ function '~basic_locale_ref' in __host__ __device__ function
            return basic_locale_ref();
                   ^
.../scnlib-master/src/locale.cpp:177:20: note: in instantiation of member function 'scn::v0::basic_locale_ref<char>::get_default' requested here
    template class basic_locale_ref<char>;
                   ^
.../scnlib-master/src/locale.cpp:177:20: note: '~basic_locale_ref' declared here

.../scnlib-master/include/scn/detail/locale.h:268:9: error: reference to __host__ function '~unique_ptr' in __host__ __device__ function
        basic_locale_ref() = default;
        ^
.../scnlib-master/include/scn/detail/util.h:663:13: note: '~unique_ptr' declared here
            ~unique_ptr() noexcept
            ^
.../scnlib-master/include/scn/detail/locale.h:336:20: note: called by 'get_default'
            return basic_locale_ref();
                   ^
.../scnlib-master/include/scn/detail/locale.h:336:20: error: reference to __host__ function '~basic_locale_ref' in __host__ __device__ function
            return basic_locale_ref();
                   ^
.../scnlib-master/src/locale.cpp:178:20: note: in instantiation of member function 'scn::v0::basic_locale_ref<wchar_t>::get_default' requested here
    template class basic_locale_ref<wchar_t>;
                   ^
.../scnlib-master/src/locale.cpp:178:20: note: '~basic_locale_ref' declared here

I tried and failed to reduce this to a minimal reproducing example to figure out exactly what is going on, but it compiles successfully if we comment out:

constexpr static basic_locale_ref get_default()
{
    return basic_locale_ref();
}

Perhaps we could exclude this function from compilation when Clang in CUDA mode is detected?

Running fuzzers in CI

scnlib's fuzzing suite can run in CI with ClusterFuzzLite.

ClusterFuzzLite can be set up so that only the fuzzers affected by a given PR run.

Is that something that is of interest to the scnlib project? If so, I will be happy to set it up.

Explore Unicode-support

The library should be smarter when it comes to non-ASCII or platform-specific wide encodings, for example UTF-8.

Skip an item

Is there a way to skip an item when using scnlib? For example, in scanf we can use * to skip an item:

int count;
scanf("%d%*c", &count);

%*c means that a char is read but not assigned. For example, for the input "30a" it assigns 30 to count, and 'a' is ignored.
Is there a similar way to do this in scnlib?

Unable to access documentation

Attempting to access the documentation results in:

Forbidden
You don't have permission to access this resource.

Apache/2.4.41 (Ubuntu) Server at scnlib.dev Port 443

Feature: compile time parsing of format

Similar to std::format, the format string should be parsed at compile time. This increases performance and adds validation of the format string at compile time.

Returning scanned values instead of passing by reference

Multiple people have asked why scn takes arguments by reference instead of returning a tuple, which forces the user to default-construct their arguments.

// What we have now
int i;
scn::input("{}", i);
// Hypothetical alternative
auto [err, i] = scn::input<int>("{}");

Building with packaged fast_float is broken

Building against the packaged version of the fast_float library is broken:

[17/81] : && /usr/bin/g++ -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64  -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -DNDEBUG -Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 test/CMakeFiles/test-test.dir/test.cpp.o -o test/test-test  -Wl,-rpath,/builddir/build/BUILD/scnlib-1.0/redhat-linux-build  libscn.so.0  -lfast_float && :
FAILED: test/test-test 
: && /usr/bin/g++ -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64  -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -DNDEBUG -Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 test/CMakeFiles/test-test.dir/test.cpp.o -o test/test-test  -Wl,-rpath,/builddir/build/BUILD/scnlib-1.0/redhat-linux-build  libscn.so.0  -lfast_float && :
/usr/bin/ld: cannot find -lfast_float
collect2: error: ld returned 1 exit status

fast_float is a header-only library, so asking the linker to link against it with -lfast_float cannot work: https://github.com/eliaskosunen/scnlib/blob/v1.0/CMakeLists.txt#L86-L87

find_package(FastFloat) and FastFloat::fast_float should be used.

scn::prompt requires extra newlines

Same code snippet as in #29:

#include <scn/scn.h>
#include <iostream>

int main() {
    int i = 0;
    if (scn::prompt("Hi there! What's your favorite number? ", "uhh, {}", i)) {
        std::cout << "Ooh, " << i << ", interesting!\n";
    }
    else {
        std::cout << "That doesn't look like any number I've ever seen.\n";
    }
}

When entering, for example, "uhh, 3" as the input, scn::prompt does not return until I enter an additional 7 newlines.
Tested with current master/0.4.

Improvements for embedded targets

Problem

This is a great library, but there are a few changes that would drastically improve support for embedded targets.
These ideas are mostly based on what has been done for fmt.

  • Conditional locale support would remove an enormous amount of code.
  • Conditional float, double, and long double support would provide another substantial reduction. Especially for targets without a floating-point unit.
  • Making the fast_float -> std::from_chars -> strtod fallback optional would help as well.
  • The current version of ARM GCC has partial support for std::from_chars (no floating-point), so SCN_HAS_FLOAT_CHARCONV misbehaves.
  • The bundled fast_float is missing two PRs that fix ARM GCC support (#122 & #123).

Testing

Targeting a Cortex-M4 and optimizing for size, any reference to scn increased my code size from ~78K to ~284K (+206K)!
After taking an ax to some of the code, removing locale and long double, it reduced to ~151K (+73K).
Removing the fallback got it down to ~136K (+58K).
A total reduction of 148K.

I believe that's about as good as it's going to get, and I'm happy to accept that size given it makes my application far simpler.
It's about double the size of fmt, but that comes with the territory.

Proposal

  • Equivalents to FMT_USE_FLOAT, FMT_USE_DOUBLE, and FMT_USE_LONG_DOUBLE
  • SCN_DEFAULT_LOCALE_* instead of hard-coded values for scn::v1::detail::locale_defaults so the user can override
  • SCN_USE_STATIC_LOCALE
    • Skip localization
    • Reject 'L' and 'n' format string flags at compile time, fail at runtime, or ignore
  • SCN_SKIP_FROM_CHARS
    • Skip std::from_chars and fallback directly to strtod
  • SCN_SKIP_STRTOD
    • Fail at runtime when strtod fallback would have occurred
  • I'm not sure how to detect partial std::from_chars support. There doesn't seem to be any indication except the documentation. SCN_SKIP_FROM_CHARS may be sufficient for now.
  • Update fast_float

I could potentially do a PR if I have time. Let me know your thoughts.

scn::scan returns a scan_result, which does not include a value() member

#include <iostream>
#include <scn/scn.h>

int main(int argc, char** argv)
{
    int a, b;
    float c, d;
    auto result = scn::scan("test 1 2 3 4", "test {} {} {} {}", a, b, c, d);
    std::cout << result.value();

    exit(0);
}

Compiling this example produces the following error (GCC 7.3):

[build] Scanning dependencies of target main
[build] [ 50%] Building CXX object CMakeFiles/main.dir/main.obj
[build] C:\Users\Amalia\Desktop\test\src\main.cpp: In function 'int main(int, char**)':
[build] C:\Users\Amalia\Desktop\test\src\main.cpp:9:25: error: 'class scn::v0::scan_result<scn::v0::basic_string_view<char>, scn::v0::wrapped_error>' has no member named 'value'
[build]      std::cout << result.value();
[build]                          ^~~~~
[build] mingw32-make.exe[2]: *** [CMakeFiles\main.dir\build.make:63: CMakeFiles/main.dir/main.obj] Error 1
[build] mingw32-make.exe[1]: *** [CMakeFiles\Makefile2:75: CMakeFiles/main.dir/all] Error 2
[build] mingw32-make.exe: *** [Makefile:83: all] Error 2

warning C4127: conditional expression is constant

I get the following warning with VS 2022 17.3:

1>C:\Projects\Libraries\scnlib\scnlib\src\locale.cpp(311,9): warning C4127: conditional expression is constant
1>C:\Projects\Libraries\scnlib\scnlib\src\locale.cpp(311,9): message : consider using 'if constexpr' statement instead
1>C:\Projects\Libraries\scnlib\scnlib\src\locale.cpp(311): message : while compiling class template member function 'bool scn::v1::detail::basic_custom_locale_ref<char>::is_alnum(scn::v1::span<const char>) const'
1>C:\Projects\Libraries\scnlib\scnlib\src\locale.cpp(538): message : see reference to class template instantiation 'scn::v1::detail::basic_custom_locale_ref<char>' being compiled

The fix is to replace several occurrences of:
if (sizeof(CharT) == 1) {
with
if constexpr (sizeof(CharT) == 1) {

PR: #61

Right way of reading a large file without performance loss

What is the right way of reading a large file without performance loss?

If I try something like:

  scn::owning_file file{"bench_stdio.txt", "r"};
  auto ran = scn::make_result(file);
  double x;
  while ((ran = scn::scan(ran.range(),"{}",x))) {
    v.push_back(x);
  }

Performance seems to be bad.

Can we do this better?

Scanning to char[]?

Without support for char[], we must construct at least one std::string before calling scn::scan if the input contains a string, which is annoying.

What's more, istream supports scanning into char[]. People (such as me) will get confused if scanning into char[] is unsupported when porting code from istream to scnlib.

Scanning with a width, or a custom separator

What if the input string is not separated by spaces?

    std::string_view v1, v2;
    auto ret = scn::scanf("42,43", "%2s,%2s", v1, v2);
// or
    auto ret = scn::scan("42,43", "{:2}{:2}", v1, v2);

Both fail, but sscanf works fine with the following code:

    char buf1[20], buf2[20];
    sscanf("42,43", "%2s,%2s", buf1, buf2);

Can't correctly use scn::input

First I would like to thank you for this amazing work.
Now, I'm having some problems with scn::cstdin().

I need to write code similar to this:

#include <fmt/format.h>
#include <fmt/ranges.h>
#include <scn/all.h>
#include <vector>

int main()
{
    fmt::print("Insert 2 numbers: ");
    auto size = 0;
    auto b = 0;
    auto res = scn::input("{} {}", size, b);
    if (not res) {
        fmt::print(stderr, "Bad input\n");
        return 1;
    }
    fmt::print("Now enter some values: ");
    auto v = std::vector<int>{};

    scn::scan_list(scn::cstdin().lock(), v);
    fmt::print("Entries: {}\n", v);
}

Now, if I compile it (with GCC 10.1):

[rbrugo@vim ~/test] $ g++ -std=c++20 scan.cpp -lfmt -lscn && ./a.out 
Insert 2 numbers: 1 2
a.out: /usr/local/include/scn/detail/file.h:538: scn::v0::basic_file<CharT>::view_type scn::v0::basic_file<CharT>::lock() [with CharT = char; scn::v0::basic_file<CharT>::view_type = scn::v0::basic_file_view<char>]: Assertion `(!is_locked()) && "Precondition violation"' failed.
Aborted (core dumped)

Studying the implementation a bit, I've found out that scn::cstdin() is locked by scn::input, and the lock is saved inside the res variable. So I tried to destroy res:

#include <fmt/format.h>
#include <fmt/ranges.h>
#include <scn/all.h>
#include <vector>

int main()
{
    fmt::print("Insert 2 numbers: ");
    auto size = 0;
    auto b = 0;
    {
        auto res = scn::input("{} {}", size, b);
        if (not res) {
            fmt::print(stderr, "Bad input\n");
            return 1;
        }
    }

    fmt::print("Now enter some values: ");
    auto v = std::vector<int>{};

    scn::scan_list(scn::cstdin().lock(), v);
    fmt::print(stderr, "Entries: {}\n", v);
}

And the output becomes this:

[rbrugo@vim ~/test] $ g++ -std=c++20 scan.cpp -lfmt -lscn && ./a.out 
Insert 2 numbers: 2 3
Now enter some values: 1 2 3 4 5
Segmentation fault (core dumped)

Am I using this functionality in a wrong way? I would check the documentation but the site seems to be offline or something.

Document or fix interaction of scn::prompt and result range()

Hi. Modifying the first example given in the README, I have:

#include <scn/scn.h>
#include <cstdio>

int main() {
    int i, j;
    // Read an integer from stdin
    // with an accompanying message
    auto res = scn::prompt("What's your favorite number? ", "{}", i);
    scn::scan(res.range(), "{}", j);
    printf("Oh, cool, %d!\n", i);
    printf("Oh, even cooler, %d!\n", j);
}
// Example invocation:
// What's your favorite number? 345 6789
// Oh, cool, 345!
// Oh, even cooler, 9!

My "678" got eaten unexpectedly!
Please make scnlib give it back, or, if this is expected, clearly document how range() behaves when scanning leftover input from the different possible source types.

I am using version 1.1.

Locale not respected when parsing floating-points

Hi! I discovered a locale-related parsing bug.

For floating-point values, the docs say:

First, there's a localization specifier:

  • n: Use decimal and thousands separator from the given locale
  • (default): Use . as decimal point and , as thousands separator

This suggests that floating-point parsing is locale-independent by default. That turns out not to be the case. After receiving bug reports from users with more outlandish environments, I investigated and found that the default floating-point parser uses the standard library's std::strto{f,d,ld} functions, which do depend on the locale. Furthermore, I found no trace of code that responds to the localization specifier mentioned above.

Here's an obligatory minimal failing example with decimal commas:

#include <string>
#include <iostream>
#include <scn/scn.h>

int main()
{
  double value_point;
  const std::string str_point{"123.456"};
  auto ret_point{scn::scan(str_point, "{}", value_point)};
  std::cout << "parsed decimal point: " << ret_point << ", value: " << value_point << std::endl;

  double value_comma;
  const std::string dec_comma{"123,456"};
  auto ret_comma{scn::scan(dec_comma, "{}", value_comma)};
  std::cout << "parsed decimal comma: " << ret_comma << ", value: " << value_comma << std::endl;

  // with LC_ALL=cs_CZ.UTF-8:
  //   parsed decimal point: 0, value: 0
  //   parsed decimal comma: 1, value: 123.456
  // ... this is the behavior I would expect from the format string "{:n}", not from "{}"
}
