
Comments (10)

badalsarkar commented on July 28, 2024

Yes. It does. Let me do some work.

from butterfly.

fabiocarvalho777 commented on July 28, 2024

Hi Fabio,

I have looked into all three options-

Java Core API
We are already using it.

Unix4j

It is not possible to match a regex across multiple lines directly. It doesn't offer the N command as in sed. We need to replace the \n with some special character and then do the matching. I found that Butterfly is already using this mechanism in the EolHelper class.

Apart from this, the other operations this library allows are translate, append, insert, and change. I believe Butterfly already implements all of these except translate, so I don't think this library would add much value.

Jawk

This basically compiles AWK scripts for the JVM, which means we can use the full power of AWK. I think we can use it in two ways:

  1. Provide a pre-defined API that uses AWK under the hood.
  2. Expose an API that takes an AWK script from the user and runs it using this library.

I am not sure which one would be more appropriate, or whether it is even appropriate to use this library given Butterfly's use cases.

Please let me know your thoughts.

Thanks for all your research. So, let's leave this issue as is for now. Since we can't use unix4j, the effort to build a sed-like TO would probably be too much at this moment.

Coincidentally, we already have an issue for an awk TO: #129. Let's continue the work under #129 instead of this issue. By the way, option number 2 ("Expose an API that takes an AWK script from the user and runs it using this library.") is the better one. Thanks Badal!!


badalsarkar commented on July 28, 2024

Hi Fabio,
I have started working on this one.


fabiocarvalho777 commented on July 28, 2024

Ok, thanks!!


badalsarkar commented on July 28, 2024

Hi,
Finally got some time as my exams are over.

I have been browsing the code base and design documents and found that Butterfly exposes some APIs for manipulating text. One of the classes is ReplaceText. I believe sed would be used to expose a public API similar to ReplaceText. Would you please provide a few use cases showing how sed would be used in Butterfly?

Thank you.


fabiocarvalho777 commented on July 28, 2024

Hello,

Yes, ReplaceText is similar to what a sed TO could offer. However, ReplaceText has some limitations. The most important one is that, when analyzing text, the provided regular expression is applied to a single line of text at a time (lines being delimited by line breaks).
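
To illustrate the limitation, here is a minimal sketch. It is not ReplaceText's actual implementation; the class and method names are hypothetical. It applies a regex one line at a time, so a pattern that needs to span a line break never matches.

```java
import java.util.regex.Pattern;

public class PerLineReplaceSketch {

    // Hypothetical helper: applies the regex to each line separately,
    // the way a line-oriented text replacement would.
    static String replacePerLine(String text, String regex, String replacement) {
        Pattern pattern = Pattern.compile(regex);
        String[] lines = text.split("\n", -1); // -1 keeps trailing empty lines
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            out.append(pattern.matcher(lines[i]).replaceAll(replacement));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "<dependency>\n  <scope>test</scope>\n</dependency>";
        // A pattern contained in one line is replaced as expected.
        System.out.println(replacePerLine(text, "test", "compile"));
        // A pattern spanning the line break never sees both lines at once,
        // so the text comes back unchanged.
        System.out.println(replacePerLine(text, "<dependency>\\s*<scope>", "X"));
    }
}
```

The second call is the kind of case a sed-like TO would need to handle.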

Instead of creating a brand new TO using sed, we could modify ReplaceText, enhancing it by using the sed library. However, that would probably break backward compatibility. So, it is better to add a brand new TO. We can call it RunSed.

Does it make sense?


badalsarkar commented on July 28, 2024

I looked at the library https://github.com/tools4j/unix4j. It doesn't have out-of-the-box support for evaluating a pattern over multiple lines. We have to manually replace the newline character with some special character and then do the matching. See the issue tools4j/unix4j#71 (comment).
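
A rough sketch of that workaround using only the Java core API, not unix4j and not Butterfly's EolHelper. The class name, method name, and placeholder character are assumptions for illustration, and the placeholder is assumed never to occur in the input.

```java
import java.util.regex.Pattern;

public class NewlinePlaceholderSketch {

    // Placeholder assumed never to occur in the input; chosen only for illustration.
    static final String PLACEHOLDER = "\u0001";

    static String replaceAcrossLines(String text, String regex, String replacement) {
        // 1. Collapse the text into one logical line.
        String singleLine = text.replace("\n", PLACEHOLDER);
        // 2. Run the regex over the collapsed text. The caller writes the
        //    placeholder wherever the pattern should cross a line break.
        String replaced = Pattern.compile(regex).matcher(singleLine).replaceAll(replacement);
        // 3. Restore the original line breaks.
        return replaced.replace(PLACEHOLDER, "\n");
    }

    public static void main(String[] args) {
        String text = "start\nmiddle\nend";
        String regex = "start" + PLACEHOLDER + "middle";
        System.out.println(replaceAcrossLines(text, regex, "merged"));
    }
}
```

The obvious drawback is that every pattern has to be written in terms of the placeholder rather than \n.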

The Java core API's Pattern class provides two flags, DOTALL and MULTILINE, which provide similar functionality. I have also found a library, Jawk, which processes awk scripts. I am looking at both of these options to see which will suit our needs best.
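
For reference, a small sketch of what the two flags do when the regex is run over the whole text instead of line by line. The class and method names are hypothetical; only java.util.regex.Pattern and its DOTALL and MULTILINE flags come from the JDK.

```java
import java.util.regex.Pattern;

public class PatternFlagsSketch {

    // DOTALL lets '.' match line terminators, so ".*?" can cross lines.
    static String replaceSpanningLines(String text, String regex, String replacement) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(text).replaceAll(replacement);
    }

    // MULTILINE makes '^' and '$' match at every line boundary,
    // not only at the start and end of the whole input.
    static String replaceAtLineStarts(String text, String regex, String replacement) {
        return Pattern.compile(regex, Pattern.MULTILINE).matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String text = "begin\nfoo\nend\ntail";
        // Replaces the three-line block from "begin" to "end" in one match.
        System.out.println(replaceSpanningLines(text, "begin.*?end", "BLOCK"));
        // Prefixes every line, because '^' now matches after each line break.
        System.out.println(replaceAtLineStarts(text, "^", "> "));
    }
}
```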

Please let me know if you have any feedback. 😄


badalsarkar commented on July 28, 2024

Hi Fabio,

I have looked into all three options-

Java Core API
We are already using it.

Unix4j

It is not possible to match a regex across multiple lines directly. It doesn't offer the N command as in sed. We need to replace the \n with some special character and then do the matching. I found that Butterfly is already using this mechanism in the EolHelper class.

Apart from this, the other operations this library allows are translate, append, insert, and change. I believe Butterfly already implements all of these except translate, so I don't think this library would add much value.

Jawk

This basically compiles AWK scripts for the JVM, which means we can use the full power of AWK. I think we can use it in two ways:

  1. Provide a pre-defined API that uses AWK under the hood.
  2. Expose an API that takes an AWK script from the user and runs it using this library.

I am not sure which one would be more appropriate, or whether it is even appropriate to use this library given Butterfly's use cases.

Please let me know your thoughts.


fabiocarvalho777 commented on July 28, 2024

I looked at the library https://github.com/tools4j/unix4j. It doesn't have out-of-the-box support for evaluating a pattern over multiple lines. We have to manually replace the newline character with some special character and then do the matching. See the issue tools4j/unix4j#71 (comment).

What a bummer. That issue 71 you found under unix4j represents our main motivation for a sed-like TO in Butterfly very well. It is unfortunate that unix4j doesn't support that.


fabiocarvalho777 commented on July 28, 2024

The Java core API's Pattern class provides two flags, DOTALL and MULTILINE, which provide similar functionality. I have also found a library, Jawk, which processes awk scripts. I am looking at both of these options to see which will suit our needs best.
Please let me know if you have any feedback. 😄

That is exactly what I would recommend next: look for an alternative to unix4j. Looks like you are doing it already :-)
By the way, if we go with a Java-based awk library, then I guess we will end up creating a TO for awk instead of sed, which is OK.

