casics / spiral
A Python 3 module that provides functions for splitting identifiers found in source code files.
License: GNU General Public License v3.0
The identifier [unloadAssemblies] has been divided as ['unload', 'Ass', 'embl', 'ies']
Hello,
I'm having a problem trying to use the Ronin splitter on Python 3.10.4.
The error message is:
AttributeError: module 'collections' has no attribute 'Iterable'
I think this is because the collections.Iterable alias has been deprecated since Python 3.3 and was removed in Python 3.10 (the class now lives only in collections.abc), as explained in this StackOverflow answer.
However, it is still used in the file spiral/utils.py:
if isinstance(el, collections.Iterable) and not isinstance(el, (str, bytes)):
A possible way to fix this problem and keep backward compatibility is explained here.
This solution worked for me. Now the Ronin splitter is working like a charm :)
If you want, I can provide a pull request with this solution!
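A minimal sketch of one backward-compatible fix, assuming the check sits inside a list-flattening helper. The function name `flatten` and its body are illustrative assumptions, not the actual contents of spiral/utils.py; only the `isinstance` condition is taken from the report above.

```python
try:
    # Python 3.3 and later: the abstract base classes live in collections.abc.
    from collections.abc import Iterable
except ImportError:
    # Older interpreters only expose the name on collections itself.
    from collections import Iterable


def flatten(items):
    """Yield the leaf elements of an arbitrarily nested iterable.

    Hypothetical helper for illustration. Strings and bytes are treated
    as leaves instead of being iterated character by character.
    """
    for el in items:
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            for sub in flatten(el):
                yield sub
        else:
            yield el
```

With the import written this way, the rest of the module can refer to `Iterable` directly and no longer depends on the alias that Python 3.10 removed.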
The bug is that Ronin may split the same identifier differently from run to run, depending on the order of the terms in the set common_terms_with_numbers.
I added md5sum to the set common_terms_with_numbers and then ran ronin.split("md5sum") several times.
The splitting results were sometimes ["md5sum"] and sometimes ["md5", "sum"].
I checked the code and found that the heuristic_split function in simple_splitters.py relies on the regular expression _exceptions_re.
_exceptions_re is generated from common_terms_with_numbers without considering the order of the terms in the set, and a regex alternation tries its alternatives from left to right and takes the first one that matches.
This means that if "md5" comes before "md5sum" in _exceptions_re, the split result is ["md5", "sum"]; if "md5sum" comes before "md5", the split result is ["md5sum"].
Solution: sort the terms by length, longest first, when generating _exceptions_re.
_exceptions_re = re.compile(r'(' + '|'.join(sorted(common_terms_with_numbers, key=len, reverse=True)) + ')', re.I)
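The effect of alternation order can be reproduced in isolation. The sketch below is not Spiral's code: the two-element `terms` list is a stand-in for common_terms_with_numbers, and `build_exceptions_re` is a hypothetical helper that mirrors how the pattern is assembled (with `re.escape` added as a precaution for terms containing regex metacharacters).

```python
import re

# Illustrative stand-in for common_terms_with_numbers, in a fixed order
# so that the behaviour is reproducible.
terms = ["md5", "md5sum"]


def build_exceptions_re(term_list):
    """Build a case-insensitive alternation from the terms, in the order given."""
    return re.compile("(" + "|".join(re.escape(t) for t in term_list) + ")", re.I)


# Shorter term first: the regex engine accepts "md5" and never tries "md5sum".
shorter_first = build_exceptions_re(terms)

# Longest term first: "md5sum" is tried before its prefix "md5".
longest_first = build_exceptions_re(sorted(terms, key=len, reverse=True))

first_match_unsorted = shorter_first.match("md5sum").group(1)
first_match_sorted = longest_first.match("md5sum").group(1)
```

Here `first_match_unsorted` is "md5" and `first_match_sorted` is "md5sum". Sorting by length fixes the outcome because a term can only be shadowed by a shorter term that is one of its prefixes.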