Comments (9)

RadLikeWhoa commented on June 4, 2024

Hi Brian,

thanks for the nice feedback, great to hear. I've already replaced line 66 (don't know what went through my head there), but I can't quite figure out the correct RegEx to get rid of the punctuation.

Which characters would be affected, anyway?

And don't worry about being pedantic, it's great to have another eye on the code to spot the little things. :)

from countable.

freqdec commented on June 4, 2024

Hi Sacha,

Thinking further, the punctuation regExp will have to change according to language i.e. the regExp for the Spanish language will contain characters not necessary in the English language (inverted question mark for example).

It may be possible to create an uber regExp that covers most languages but you will never keep everyone happy! Here's a most terrible attempt at something that might work:

/['";:,.\/?¿\-!¡]/g

Good Luck!
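
As a rough sketch of the "uber regExp" idea, modern JavaScript engines support Unicode property escapes, so a single pattern can target punctuation across many languages. This is only an illustration, not something proposed in the thread: the names `anyPunctuation`, `samples`, and `stripped` are made up here, and the pattern assumes an engine that supports the `u` flag with `\p{...}`.

```javascript
// Assumption: the runtime supports Unicode property escapes (the "u" flag).
// \p{P} matches any character in the Unicode "Punctuation" general category,
// which covers ¿, ¡, guillemets and similar marks as well as ASCII punctuation.
const anyPunctuation = /\p{P}/gu;

// Hypothetical sample strings, chosen only for illustration.
const samples = ['¿Qué tal?', 'Hello, world!', '«Bonjour»'];

// Remove every punctuation character from each sample.
const stripped = samples.map((s) => s.replace(anyPunctuation, ''));

console.log(stripped);
```

Note that `\p{P}` also matches apostrophes and hyphens, so "don't" becomes "dont". That does not change a word count, but it would matter if the stripped text were shown to the user.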

epmatsw commented on June 4, 2024

If you want to remove punctuation, I think a better regex would be something like str.replace(/[^A-Za-z0-9 ]/g, ''). That should remove anything that's not a space, number, or letter.

On the other hand, I don't think that this is something that would be desirable. It's not very intuitive, and it makes it so that count.js output doesn't match Microsoft Word's count, which would probably be the standard you'd want to follow.

freqdec commented on June 4, 2024

Hi Will, your regExp will fail dramatically on any language that has accented characters.
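
To make the accented-character problem concrete, here is a small sketch comparing the ASCII-only pattern with a Unicode-aware one. The second pattern is an assumption added for illustration, not something suggested in the thread; the names `asciiOnly`, `unicodeAware`, and `sample` are hypothetical.

```javascript
// ASCII-only pattern from the comment above: removes everything except
// A-Z, a-z, 0-9 and the space character.
const asciiOnly = /[^A-Za-z0-9 ]/g;

// Hypothetical alternative (assumes support for the "u" flag): keep any
// Unicode letter, combining mark, or number, plus whitespace.
const unicodeAware = /[^\p{L}\p{M}\p{N}\s]/gu;

const sample = '¿Dónde está el café?';

// Accented letters are not in A-Za-z, so they are deleted along with the punctuation.
const asciiResult = sample.replace(asciiOnly, '');

// Only the punctuation marks are removed; the accented letters survive.
const unicodeResult = sample.replace(unicodeAware, '');

console.log(asciiResult);
console.log(unicodeResult);
```

For a pure word count both results still contain four words, but the ASCII-only version would split or drop words in text where a non-ASCII character stands alone, and it corrupts the text for any other use.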

RadLikeWhoa commented on June 4, 2024

Not entirely sure, but I think it would only matter if a character is preceded by a space (e.g. question mark or exclamation point in French). In that case, wouldn't it actually be safe to just remove those characters (plus the space), wherever needed?

freqdec commented on June 4, 2024

You are right! So this might work - looks for a space before a punctuation character and replaces them both...

.replace(/\s['";:,.\/?¿\-!¡]/g, '').split(/\s/).length
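
A runnable sketch of this idea might look like the following. It is not Countable's actual implementation: `countWords` is a hypothetical helper, and the lookahead is an added assumption so that an opening quote directly before a word (as in `say "hi"`) is not removed together with the space, which would merge two words into one. The hyphen is escaped because an unescaped `¿-!` inside a character class is read as an out-of-order range and is a syntax error.

```javascript
// Hypothetical helper for illustration; not part of any library API.
function countWords(text) {
  // Punctuation set taken from the thread, with "/" and "-" escaped.
  // Matches whitespace followed by one or more punctuation characters,
  // but only when that run is itself followed by whitespace or the end
  // of the string, i.e. when the punctuation stands alone.
  const loosePunctuation = /\s+['";:,.\/?¿\-!¡]+(?=\s|$)/g;

  const cleaned = text.replace(loosePunctuation, '').trim();

  // An empty string would otherwise split into [''] and count as one word.
  return cleaned === '' ? 0 : cleaned.split(/\s+/).length;
}

console.log(countWords('Comment ça va ?')); // 3
console.log(countWords('Hello , world !')); // 2
console.log(countWords('say "hi"'));        // 2
```

One known gap in this sketch: punctuation standing alone at the very start of the text has no preceding whitespace, so it is still counted as a word.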

epmatsw commented on June 4, 2024

Ah yeah, don't know what I was thinking really. Still, I think the Microsoft Word question is valid. A solitary punctuation mark is also treated as a word by wc. I don't think getting results that differ from both of those is a good idea.

[Screenshots: wordcount; Screen Shot 2013-03-14 at 8 05 49 AM]

RadLikeWhoa commented on June 4, 2024

Just tested how some other tools treat this situation. Google Docs, Drafts for iOS and iA Writer all ignore the punctuation and count your example as three words. I think it would be better to follow the lead of more recent projects like the aforementioned. I'll look into it later.

epmatsw commented on June 4, 2024

Well, that makes sense. I guess it's up to you haha.
