mrix commented on August 16, 2024

Hi @jgadrow. Are you sure the lexer is the right place to decide the token type? You can do that at the parsing stage. The lexer should just split a string into tokens. If the identifier and string tokens look the same, why do you need both?

But if you really need it, you can decorate the StringSubLexer class like this:

<?php
class IdentifierSubLexer implements SubLexerInterface
{
    private $stringSubLexer;
    private $operatorSubLexer;

    public function __construct(SubLexerInterface $stringSubLexer, SubLexerInterface $operatorSubLexer)
    {
        $this->stringSubLexer = $stringSubLexer;
        $this->operatorSubLexer = $operatorSubLexer;
    }

    /**
     * @inheritdoc
     */
    public function getTokenAt($code, $cursor)
    {
        $stringToken = $this->stringSubLexer->getTokenAt($code, $cursor);
        if ($stringToken === null || !$stringToken->test(Token::T_STRING)) {
            return null;
        }

        $operatorToken = $this->operatorSubLexer->getTokenAt($code, $stringToken->getEnd());
        if ($operatorToken === null || !$operatorToken->test(Token::T_OPERATOR)) {
            return null;
        }

        return new Token(
            Token::T_IDENTIFIER,
            $stringToken->getValue(),
            $stringToken->getStart(),
            $stringToken->getEnd()
        );
    }
}

Please let me know if I've understood you correctly.

from rql-parser.

jgadrow commented on August 16, 2024

It's not a look-ahead operation but a look-behind operation. So it would be:

$operatorToken = $this->operatorSubLexer->getTokenAt($code, $cursor-1);

However, this assumes the previous token is exactly one character long, which is not always the case.

If we had access to the current TokenStream, we could just look at the last token on the stack (or even further if we had a use case for it).

class IdentifierSubLexer implements SubLexerInterface
{
    /**
     * @inheritdoc
     */
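    public function getTokenAt($code, $cursor, array $tokens)
    {
        // Look behind: an identifier may not directly follow an operator.
        $last = $tokens ? $tokens[count($tokens) - 1] : null;
        if (($last !== null && Token::T_OPERATOR === $last->getType())
            || !preg_match('/[a-z]([a-z0-9._-]|%[0-9a-f]{2})*/Ai', $code, $matches, 0, $cursor)
        ) {
            return null;
        }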

        return new Token(
            Token::T_IDENTIFIER,
            rawurldecode($matches[0]),
            $cursor,
            $cursor + strlen($matches[0])
        );
    }
}

It's a clean, dependable way to do a look-behind check which allows me to skip the regex operation entirely if I already know my token type isn't valid in this context.

mrix commented on August 16, 2024

So you just need to implement your own Lexer class, or override the existing one, so that you can pass $tokens to the getTokenAt method.

It would be the easiest solution in this case because I'm not allowed to break the stable API.
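A minimal sketch of what such a custom lexer could look like, assuming PHP 8. Every class and interface name below (SketchToken, ContextAwareSubLexer, PatternSubLexer, ContextAwareLexer) is a hypothetical stand-in written for illustration, not part of rql-parser; the real Lexer and SubLexerInterface may be structured differently.

```php
<?php
// Hypothetical stand-ins for illustration only; these are not rql-parser classes.
final class SketchToken
{
    public function __construct(
        public string $type,
        public string $value,
        public int $start,
        public int $end
    ) {}
}

interface ContextAwareSubLexer
{
    /** @param SketchToken[] $tokens tokens produced so far */
    public function getTokenAt(string $code, int $cursor, array $tokens): ?SketchToken;
}

final class PatternSubLexer implements ContextAwareSubLexer
{
    /** @param string $notAfter token type that must not directly precede this one ('' = no restriction) */
    public function __construct(
        private string $type,
        private string $pattern,
        private string $notAfter = ''
    ) {}

    public function getTokenAt(string $code, int $cursor, array $tokens): ?SketchToken
    {
        $last = $tokens ? $tokens[count($tokens) - 1] : null;
        // Look-behind: skip the regex entirely when the previous token rules this type out.
        if ($last !== null && $this->notAfter !== '' && $last->type === $this->notAfter) {
            return null;
        }
        // The /A modifier anchors the match at $cursor.
        if (!preg_match($this->pattern, $code, $m, 0, $cursor)) {
            return null;
        }
        return new SketchToken($this->type, $m[0], $cursor, $cursor + strlen($m[0]));
    }
}

final class ContextAwareLexer
{
    /** @param ContextAwareSubLexer[] $subLexers tried in order at each position */
    public function __construct(private array $subLexers) {}

    /** @return SketchToken[] */
    public function tokenize(string $code): array
    {
        $tokens = [];
        $cursor = 0;
        $length = strlen($code);
        while ($cursor < $length) {
            $token = null;
            foreach ($this->subLexers as $subLexer) {
                // The tokens collected so far are handed to every sub-lexer.
                $token = $subLexer->getTokenAt($code, $cursor, $tokens);
                if ($token !== null) {
                    break;
                }
            }
            if ($token === null) {
                throw new RuntimeException("Unexpected input at offset $cursor");
            }
            $tokens[] = $token;
            $cursor = $token->end;
        }
        return $tokens;
    }
}

$lexer = new ContextAwareLexer([
    new PatternSubLexer('operator', '/=/A'),
    new PatternSubLexer('identifier', '/[a-z][a-z0-9._-]*/Ai', 'operator'),
    new PatternSubLexer('string', '/[a-z0-9._-]+/Ai'),
]);
$types = array_map(fn (SketchToken $t) => $t->type, $lexer->tokenize('name=john'));
```

Here `name` is lexed as an identifier, while `john` falls through to the string sub-lexer because the previous token is an operator. The existing interface stays untouched; only the custom lexer and its own sub-lexers know about the extra argument.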

jgadrow commented on August 16, 2024

True, and this was an option I considered exploring (in the end, I got my front-end devs to agree to write a custom URL encoder to also encode the -_.~ characters). But there's no rule saying all open issues have to be implemented in v1.x.

Thus, I submitted this for possible inclusion in a v2.x where you would be able to alter the interface.
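For reference, the encoder workaround mentioned above could look roughly like this. It is sketched in PHP to match the rest of the thread (the actual front-end encoder would presumably be JavaScript), and the function name `encodeStrict` is made up for illustration. PHP's `rawurlencode` follows RFC 3986 and leaves `-`, `_`, `.` and `~` untouched, so they are percent-encoded in a second pass.

```php
<?php
// Hypothetical helper: percent-encode a value including the RFC 3986
// unreserved punctuation (- _ . ~) that rawurlencode() leaves as-is.
function encodeStrict(string $value): string
{
    return strtr(rawurlencode($value), [
        '-' => '%2D',
        '_' => '%5F',
        '.' => '%2E',
        '~' => '%7E',
    ]);
}

$encoded = encodeStrict('a-b_c.d~e f');
// rawurldecode() needs no changes: it decodes any %XX sequence.
$decoded = rawurldecode($encoded);
```

With the punctuation encoded on the client, those characters never appear raw in a value, so the lexer no longer has to tell them apart by context.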

Also, because I know text is a terrible way to convey tone, please know that I greatly appreciate the time you have put into this package! I considered a few different ones before deciding that of the available options I liked yours the best. :)
