Comments (9)
This is the new behavior!
from abvd.
It is there explicitly to catch these errors.
Go to line 222 in pylexibank/providers/abvd.py and change add_lexemes to add_forms_from_value.
That is the first fix, as add_lexemes is deprecated.
Ah, and if you think that the ? is "normal" behavior, you can also fix this from within there by adding a line that checks whether the form actually is a question mark or not. I.e., you check entry.name and see if it is a valid entry.
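A minimal sketch of such a check, written as a standalone helper. The name is_real_form and the sample values are hypothetical and not part of pylexibank; the only assumption taken from the thread is that a bare ? marks a missing form.

```python
def is_real_form(name):
    """Return True if `name` holds something other than a missing-data marker.

    Hypothetical helper: treats None, empty/whitespace-only strings and a
    bare "?" as "no form".
    """
    if name is None:
        return False
    stripped = name.strip()
    return bool(stripped) and stripped != "?"


# Made-up values standing in for entry.name
entries = ["mata", "?", "  ", None, "lima"]
kept = [name for name in entries if is_real_form(name)]
```

In the provider code, a check like this would sit in front of the call that adds the forms, so that question-mark entries are skipped before they reach it.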
Ah, I get it now, sorry for not reading properly before. I guess something is wrong here: the entry should not make it past the clean_form command, which is called after splitting and yields an empty form, which is then NOT passed to add_form. And by default clean_form reacts specifically to ? as a character here.
So the fix we need is in the pylexibank code, in dataset.py:
def split_forms(self, item, value):
    if value in self.lexemes:  # pragma: no cover
        self.log.debug('overriding via lexemes.csv: %r -> %r' % (value, self.lexemes[value]))
    value = self.lexemes.get(value, value)
    return [self.clean_form(item, form)
            for form in split_text_with_context(value, separators='/,;')]
needs to be modified to:
def split_forms(self, item, value):
    if value in self.lexemes:  # pragma: no cover
        self.log.debug('overriding via lexemes.csv: %r -> %r' % (value, self.lexemes[value]))
    value = self.lexemes.get(value, value)
    forms = [self.clean_form(item, form)
             for form in split_text_with_context(value, separators='/,;')]
    # drop forms that clean_form rejected (None or empty)
    return [f for f in forms if f]
Or similar. This returns only the forms that are not None (or otherwise empty), and it is exactly those None values that are crashing the code right now.
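To see the effect of the filter in isolation, here is a small self-contained sketch. It is not the pylexibank implementation: clean_form below is a simplified stand-in that only assumes ? marks missing data, and re.split stands in for split_text_with_context (so bracketed context is not handled).

```python
import re

MISSING_DATA = {"?"}  # assumption: "?" is the only missing-data marker


def clean_form(form):
    """Simplified stand-in: return the stripped form, or None if unusable."""
    form = form.strip()
    if not form or form in MISSING_DATA:
        return None
    return form


def split_forms(value, separators="/,;"):
    """Split on the separators, clean each part, and drop rejected parts."""
    parts = re.split("[" + re.escape(separators) + "]", value)
    cleaned = [clean_form(part) for part in parts]
    # Without this filter, None entries would be handed on to the caller.
    return [form for form in cleaned if form]


forms_mixed = split_forms("mata / ? ; lima")
forms_missing = split_forms("?")
```

With the filter in place, a value consisting only of ? yields an empty list, so nothing is passed on to be added as a form.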
I'd say this is a good example of why it was good to make the behavior of add_lexemes more transparent.
Just linked this as a bug in pylexibank.
Fixed.