alejandrofrias / case-conversion

Python package ported from Davis Clark's CaseConversion Sublime Plugin

License: MIT License

Python 100.00%

case-conversion's Issues

regex dependency prevents vendorizing

Just wanted to let you know that the regex dependency prevented me from vendorizing case-conversion (it's not purely python). This isn't a big deal because I can fork and be on my way, but an alternative is to use python's unicodedata. I replaced your regexes with the following functions and all your tests passed. Granted, locally I cheated a little by replacing regex.compile with re.compile when searching for acronyms, but a function could wrap that logic as well.

** I did not test for performance differences and I also understand you may prioritize having regex over case-conversion being vendorize'able

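import unicodedata

try:
    unicode  # Python 2: the built-in already exists
except NameError:
    unicode = str  # Python 3: str is already a Unicode string


def _charIsSep(aChar):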
    return (
        not _charIsUpper(aChar)
        and not _charIsLower(aChar)
        and not _charIsNumberDecimalDigit(aChar)
    )

# _isSep vs '^[^\p{Ll}\p{Lu}\p{Nd}]$'
def _isSep(aString):
    return len(aString) == 1 and _charIsSep(unicode(aString))


def _charIsNumberDecimalDigit(aChar):
    return unicodedata.category(aChar) == "Nd"


def _charIsLower(aChar):
    return unicodedata.category(aChar) == "Ll"


def _charIsUpper(aChar):
    return unicodedata.category(aChar) == "Lu"

# _isUpper vs '^[\p{Lu}]$'
def _isUpper(aString):
    return len(aString) == 1 and _charIsUpper(unicode(aString))

# _isValidAcronym vs '^[\p{Ll}\p{Lu}\p{Nd}]+$'
def _isValidAcronym(aString):
    if len(aString) == 0:
        return False

    for aChar in aString:
        if _charIsSep(unicode(aChar)):
            return False

    return True

Consider: Basing internals on regex operations

The current implementation of the string conversion logic does not strike me as optimal.

Although it works, I wonder whether detecting the string case and acronyms, segmenting the input into substrings, and finally reassembling them is the most elegant approach. Basing these operations on regular expressions would likely yield much more succinct code.

One starting point:
https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case

This idea is to be explored.
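
As a rough illustration of the regex-based direction, here is a minimal sketch using only the standard `re` module, in the style commonly suggested for this problem. The function name `camel_to_snake` is hypothetical and not part of case-conversion, and the patterns only handle ASCII letters, so this does not replace the package's Unicode and acronym handling.

```python
import re

# Hypothetical helper for illustration; not part of case-conversion's API.
# ASCII-only: [A-Z] and [a-z] do not match non-ASCII letters.
_FIRST_PASS = re.compile(r"(.)([A-Z][a-z]+)")
_SECOND_PASS = re.compile(r"([a-z0-9])([A-Z])")


def camel_to_snake(name):
    # Pass 1: put "_" before a capitalized word such as "Bar" or "Error".
    partial = _FIRST_PASS.sub(r"\1_\2", name)
    # Pass 2: put "_" between a lowercase letter or digit and a capital.
    return _SECOND_PASS.sub(r"\1_\2", partial).lower()


print(camel_to_snake("fooBarHTTPError"))  # foo_bar_http_error
```

Two substitutions replace the detect, segment, and reassemble steps, at the cost of treating any run of capitals as a single word.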

Submit to Awesome List

This package fills a unique void, left by other packages.
Namely it does two things:

many string conversions
acronym detection

Hence it would be useful to make it known to the python community.
The best way would be an entry to the awesome list at:
https://github.com/vinta/awesome-python

This may be done, once more cases have been added and the internal structure has undergone further refactoring. The extensive tests provided speak for the quality of this package and may improve its acceptance.

TODO before ready for submission:

Consider: Smart title case

What can be borrowed from John Gruber's title case script?
How relevant is this for other cases than title case?

Explanation from https://daringfireball.net/2008/05/title_case:

It knows about small words that should not be capitalized. Not all style guides use the same list of words — for example, many lowercase with, but I do not. The list of words is easily modified to suit your own taste/rules:

my @small_words = qw(a an and as at but by en for if in of
                    on or the to v[.]? via vs[.]?);

(The only trickery here is that “v” and “vs” include optional dots, expressed in regex syntax.)

The script assumes that words with capitalized letters other than the first character are already correctly capitalized. This means it will leave a word like “iTunes” alone, rather than mangling it into “ITunes” or, worse, “Itunes”.
It also skips over any words with line dots; “example.com” and “del.icio.us” will remain lowercase.
It has hard-coded hacks specifically to deal with odd cases I’ve run into, like “AT&T” and “Q&A”, both of which contain small words (at and a) which normally should be lowercase.
The first and last word of the title are always capitalized, so input such as “Nothing to be afraid of” will be turned into “Nothing to Be Afraid Of”.
A small word after a colon will be capitalized.
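
To get a feel for how these rules could translate to Python, here is a simplified sketch. It is not a port of Gruber's script: the function name `smart_title` is hypothetical, words are split on whitespace only (so runs of whitespace collapse to single spaces), and the hard-coded hacks for cases like "AT&T" are replaced by the general "already has inner capitals" rule.

```python
# Small-word list taken from the Perl snippet above; "v" and "vs" are
# listed with and without the optional trailing dot.
SMALL_WORDS = {
    "a", "an", "and", "as", "at", "but", "by", "en", "for", "if", "in",
    "of", "on", "or", "the", "to", "v", "v.", "via", "vs", "vs.",
}


def smart_title(text):
    """Hypothetical, simplified take on the rules listed above."""
    words = text.split()
    last = len(words) - 1
    titled = []
    for i, word in enumerate(words):
        after_colon = i > 0 and words[i - 1].endswith(":")
        if any(ch.isupper() for ch in word[1:]):
            # Inner capitals ("iTunes", "AT&T"): assume already correct.
            titled.append(word)
        elif "." in word.strip("."):
            # Inner dots ("example.com"): leave untouched.
            titled.append(word)
        elif word.lower() in SMALL_WORDS and i not in (0, last) and not after_colon:
            # Small word that is not first, last, or right after a colon.
            titled.append(word.lower())
        else:
            titled.append(word[0].upper() + word[1:])
    return " ".join(titled)


print(smart_title("nothing to be afraid of"))  # Nothing to Be Afraid Of
```

The "leave words with inner capitals alone" rule looks like the part most relevant to cases other than title case, since it overlaps with acronym handling.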

Python implementation:
https://muffinresearch.co.uk/titlecasepy-titlecase-in-python/

Sublime Text package:
https://packagecontrol.io/packages/Smart%20Title%20Case

Consider: Adding CI tests

I believe adding Travis CI or Circle CI would improve quality gating.
This would also make good use of the extensive test coverage that is already provided.

Consider: Pytest

I do not have a strong preference for one test framework over another. However, I happen to use pytest for all my tests. Is there any reason (besides the additional dev dependency, maybe) we shouldn't use pytest?

Consider: Dropping python 2.7 support

@AlejandroFrias
I would really like to drop support for Python 2.
Considering that Python 2 reaches EOL on 1 January 2020 anyway, it is not worth the pain of maintaining both legacy and modern Python.

If you agree I am going to develop against Python 3.5+ only.

Import Error

I see this when importing case-conversion, python 3.5.2.

(asgt) ⋊> ~/e/vml-model-econbankrec on master ⨯ pip install -U case-conversion 
Requirement already up-to-date: case-conversion
Requirement already up-to-date: regex>=2016.2.25

(asgt) ⋊> ~/e/vml-model-econbankrec on master ⨯ python -c 'import case_conversion'  
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/brunsgaard/.virtualenvs/asgt/lib/python3.5/site-packages/case_conversion/__init__.py", line 1, in <module>
    from case_conversion import (
ImportError: cannot import name 'camelcase'

Consider: Type annotations

@AlejandroFrias
Should we harden the code base via type annotations?
I feel it adds reliability and robustness to a code base.
For example in case of heavy refactoring this can be hugely beneficial.

As is often said, type annotations make sense whenever tests do.
case-conversion comes with extensive test coverage thanks to your efforts, so type checking would build on that.

Related: #15

Add more case conversions

I believe this package fills a void in the current Python ecosystem. There are no other packages that offer conversion to many cases and also incorporate acronym detection. I believe we could make this package even more useful by adding more case conversions.

The current code allows for easy extension with new case conversions.
Maybe we could collect ideas for new case types in this issue before extending.

Currently supported:

camelCase
PascalCase
snake_case
dash-case
CONST_CASE
dot.case
separate words
slash/case
backslash\case

New case types:

Ada_Case (jdavisclark/CaseConversion#41)
Title_case
Capitalcase**
lowercase**
UPPERCASE**

** provided by the standard library, but it would be logical to incorporate them into the interface

README.md Usage section is outdated

The Usage section of the readme is somewhat outdated:

>>> import case_conversion
>>> case_conversion.snake("fooBarHTTPError")
'foo_bar_h_t_t_p_error'  # ewwww :(
>>> case_conversion.snake("fooBarHTTPError", acronyms=['HTTP'])
'foo_bar_http_error'  # pretty :)

The current API is now .snakecase() instead of .snake()

The acronyms parameter also no longer seems to be supported:

>>> case_conversion.snakecase("fooBarHTTPError", acronyms=['HTTP'])
'foo_bar_h_t_t_p_error'

Tested using case-conversion 2.1.0

Consider publishing on package control

@AlejandroFrias
It would be great to have this package available as a dependency for Sublime Text plugins.
If you are a Sublime Text user yourself, you may want to consider publishing this package on
packagecontrol.io, so that plugins can use its advanced string case conversion capabilities.

Consider: Pipenv or poetry

@AlejandroFrias

Sorry for my long absence. I got swept away by a mega project at work that ate up all my free time. Ever since, I have wanted to overhaul this package as we initially discussed. I am here now and want to invest my time to finally bring the Python ecosystem a good string conversion library.

In that spirit, I think we should consider switching to a modern package format like
Pipfile (pipenv) or pyproject.toml (poetry). What do you think about it?

Improve API

Consider an API like CommonRegex

Usage

>>> from case_conversion import CaseConverter
>>> parsed_text = CaseConverter("testString")

>>> parsed_text.snake
test_string
>>> parsed_text.pascal
TestString
>>> parsed_text.dot
test.string
>>> parsed_text.const
TEST_STRING
...

Alternatively, you can create a single CaseConverter instance and use it to convert multiple strings.

>>> converter = CaseConverter()
>>> converter.snake("testString")
test_string
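
To make the first usage form concrete, here is a minimal sketch of what such a class could look like. Everything in it is an assumption for illustration: `CaseConverter` does not exist in case-conversion, the word splitter is a simple ASCII-only regex, and acronym handling is left out entirely.

```python
import re

# Hypothetical word splitter: runs of capitals, capitalized or lowercase
# words, and digit runs. ASCII-only; an assumption for this sketch.
_WORDS = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


class CaseConverter:
    """Sketch of the proposed property-based API (not the real package)."""

    def __init__(self, text):
        self.words = _WORDS.findall(text)

    @property
    def snake(self):
        return "_".join(word.lower() for word in self.words)

    @property
    def pascal(self):
        return "".join(word.capitalize() for word in self.words)

    @property
    def dot(self):
        return ".".join(word.lower() for word in self.words)

    @property
    def const(self):
        return "_".join(word.upper() for word in self.words)


parsed_text = CaseConverter("testString")
print(parsed_text.snake)   # test_string
print(parsed_text.pascal)  # TestString
```

Parsing once in the constructor means each conversion only has to join the stored words. Supporting the second form (`converter.snake("testString")`) under the same names would need a different mechanism than plain properties.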

Method aliasing

Additionally such class based code structure would also allow for aliasing method names via decorators, which is much cleaner than the current implementation.
Example:

@alias('kebabcase', 'spinalcase')
def dashcase(text, acronyms=None):
    ...
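
One way such an `alias` decorator could work is sketched below. The helper names `alias` and `register_aliases` and the stand-in `dashcase` body are hypothetical; the sketch only shows the mechanism, not the package's conversion logic.

```python
def alias(*names):
    """Record alternative names on the decorated function."""
    def decorator(func):
        func._aliases = names
        return func
    return decorator


def register_aliases(cls):
    """Class decorator: expose every recorded alias as a class attribute."""
    # Copy the values first, because setattr mutates the class namespace.
    for attr in list(vars(cls).values()):
        for name in getattr(attr, "_aliases", ()):
            setattr(cls, name, attr)
    return cls


@register_aliases
class CaseConverter:
    @alias("kebabcase", "spinalcase")
    def dashcase(self, text, acronyms=None):
        # Stand-in body for illustration only: joins whitespace-separated words.
        return "-".join(text.lower().split())


converter = CaseConverter()
print(converter.kebabcase("foo bar"))  # foo-bar
```

All aliases point at the same function object, so there is a single implementation to maintain and document.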

See:

Auto-detect acronyms

Would it be an idea to automatically set detect_acronyms to True whenever a non-empty list of acronyms is passed? Currently a user must always specify both the acronyms and the boolean, which is unnecessary. Both arguments would only be needed when a user passes a non-empty list of acronyms but does not want to use them. In other words: acronyms are ignored only if detect_acronyms=False is passed explicitly; otherwise, any acronyms passed are detected automatically (none by default).
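
The proposed default could be sketched with a `None` sentinel, as below. The helper name `resolve_detect_acronyms` is hypothetical, and how the result would plug into the existing conversion functions is left open.

```python
def resolve_detect_acronyms(acronyms=None, detect_acronyms=None):
    """Hypothetical helper: decide whether acronym detection should run.

    detect_acronyms=None means "not specified by the caller".
    """
    if detect_acronyms is None:
        # Not specified: detect only if a non-empty list was passed.
        return bool(acronyms)
    # Specified explicitly: always respect the caller's choice.
    return detect_acronyms


print(resolve_detect_acronyms(acronyms=["HTTP"]))                         # True
print(resolve_detect_acronyms(acronyms=["HTTP"], detect_acronyms=False))  # False
print(resolve_detect_acronyms())                                          # False
```

Using `None` rather than `False` as the default is what lets the function tell "not passed" apart from an explicit opt-out.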

New release

Hello, thank you for the great library! Could you publish a new release? I'm particularly interested in the Http-Header-Case 🙂

Consider: Removing regex dependency

Investigate: Can we drop the sole dependency consisting of the alternative regex implementation?

IMHO case-conversion provides a small, well-defined piece of functionality. Hence I believe it should be possible to make it dependency-free.

UPDATE:
I understand that regex was introduced for handling unicode regexes. I am not suggesting we should sacrifice that unicode support but maybe it is worth striving for a dependency-free solution.

Open thread: What would you like case-conversion to be like?

This issue is to collect ideas, wishes, and complaints about case-conversion. We would really like to make case-conversion the foremost Python package for anything related to case conversion.
To do so, we would like some input from users to get a better idea of what to improve.

Some guiding questions:

How do you use it?
What feature would you like to see?
What s*cks?
Anything else you want to see?