Hi @atoutou,

```python
pipe = nlu.load('xx.embed_sentence.labse', gpu=True)
pipe.print_info()
```

will print:
```
The following parameters are configurable for this NLU pipeline (You can copy paste the examples) :
>>> pipe['bert_sentence@labse'] has settable params:
pipe['bert_sentence@labse'].setBatchSize(8) | Info: Size of every batch | Currently set to : 8
pipe['bert_sentence@labse'].setIsLong(False) | Info: Use Long type instead of Int type for inputs buffer - Some Bert models require Long instead of Int. | Currently set to : False
pipe['bert_sentence@labse'].setMaxSentenceLength(128) | Info: Max sentence length to process | Currently set to : 128
pipe['bert_sentence@labse'].setDimension(768) | Info: Number of embedding dimensions | Currently set to : 768
pipe['bert_sentence@labse'].setCaseSensitive(False) | Info: whether to ignore case in tokens for embeddings matching | Currently set to : False
pipe['bert_sentence@labse'].setStorageRef('labse') | Info: unique reference name for identification | Currently set to : labse
>>> pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'] has settable params:
pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setExplodeSentences(False) | Info: whether to explode each sentence into a different row, for better parallelization. Defaults to false. | Currently set to : False
pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setStorageRef('SentenceDetectorDLModel_c83c27f46b97') | Info: storage unique identifier | Currently set to : SentenceDetectorDLModel_c83c27f46b97
pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setEncoder(com.johnsnowlabs.nlp.annotators.sentence_detector_dl.SentenceDetectorDLEncoder@7f47d7d6) | Info: Data encoder | Currently set to : com.johnsnowlabs.nlp.annotators.sentence_detector_dl.SentenceDetectorDLEncoder@7f47d7d6
pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setImpossiblePenultimates(['Bros', 'No', 'al', 'vs', 'etc', 'Fig', 'Dr', 'Prof', 'PhD', 'MD', 'Co', 'Corp', 'Inc', 'bros', 'VS', 'Vs', 'ETC', 'fig', 'dr', 'prof', 'PHD', 'phd', 'md', 'co', 'corp', 'inc', 'Jan', 'Feb', 'Mar', 'Apr', 'Jul', 'Aug', 'Sep', 'Sept', 'Oct', 'Nov', 'Dec', 'St', 'st', 'AM', 'PM', 'am', 'pm', 'e.g', 'f.e', 'i.e']) | Info: Impossible penultimates | Currently set to : ['Bros', 'No', 'al', 'vs', 'etc', 'Fig', 'Dr', 'Prof', 'PhD', 'MD', 'Co', 'Corp', 'Inc', 'bros', 'VS', 'Vs', 'ETC', 'fig', 'dr', 'prof', 'PHD', 'phd', 'md', 'co', 'corp', 'inc', 'Jan', 'Feb', 'Mar', 'Apr', 'Jul', 'Aug', 'Sep', 'Sept', 'Oct', 'Nov', 'Dec', 'St', 'st', 'AM', 'PM', 'am', 'pm', 'e.g', 'f.e', 'i.e']
pipe['deep_sentence_detector@SentenceDetectorDLModel_c83c27f46b97'].setModelArchitecture('cnn') | Info: Model architecture (CNN) | Currently set to : cnn
>>> pipe['document_assembler'] has settable params:
pipe['document_assembler'].setCleanupMode('shrink') | Info: possible values: disabled, inplace, inplace_full, shrink, shrink_full, each, each_full, delete_full | Currently set to : shrink
```
Setting the batch size with

```python
pipe['bert_sentence@labse'].setBatchSize(8)
```

should fix your problem. Let me know if it helps.
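Putting the pieces together, a minimal end-to-end sketch (assuming the `nlu` package and a working Spark NLP backend are installed; the model name and component key come from the output above, while the sample sentences and the printed column inspection are illustrative, not from the thread):

```python
import nlu

# Load the multilingual LaBSE sentence-embedding pipeline on GPU.
pipe = nlu.load('xx.embed_sentence.labse', gpu=True)

# Tune the embedder before predicting. A smaller batch size lowers peak
# GPU memory per step; a larger one can improve throughput if memory allows.
pipe['bert_sentence@labse'].setBatchSize(8)

# Capping the sequence length also bounds memory use per batch.
pipe['bert_sentence@labse'].setMaxSentenceLength(128)

# predict() accepts a string or a list of strings and returns a DataFrame
# with one embedding vector per sentence.
df = pipe.predict(['Hello world.', 'LaBSE produces multilingual sentence embeddings.'])
print(df.columns)
```

Since memory pressure scales roughly with batch size times sequence length times the 768-dimensional hidden size, reducing either knob is the usual first step when the GPU runs out of memory.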