Comments (4)
Hey @o1iv3r thanks for sharing! I'll try to reproduce soon and share an update here.
from text-generation-inference.
FYI, I think it might be a problem in the outlines library, which also fails for me with a large number of fields.
Hi!
The LLM has to worry about generating the JSON syntax as well as the field contents, and I think that's the issue. Grammar-constrained generation works really well with smaller schemas, the vast majority of the time. I have to admit I've never seen a schema this long, but the use case is absolutely something that should work effectively.
I've been doing some reading around schema-based generation and I came across this article from Lamini here... it looks like they present the pre-filled JSON structure to the LLM, which saves compute; all the LLM has to do is generate the field contents, so schema parsing can never fail this way.
@drbh I'm not totally sure about the current implementation in TGI, but I'm assuming the LLM is also generating the JSON structure right now. Is there scope to implement something like this going forward? I can see great benefit in this if so :)
Thanks.
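To illustrate the idea from the Lamini article: the JSON scaffold is fixed in advance and the model only fills in field values. This is a minimal sketch, not TGI's API; `generate` is a hypothetical stand-in for one constrained LLM call per field, stubbed out here.

```python
import json

def generate(field_name: str, field_type: str) -> str:
    # Hypothetical per-field LLM call; stubbed for illustration.
    return {"string": "example", "integer": "42"}[field_type]

def fill_schema(schema: dict) -> dict:
    """Build the output object key by key, so the JSON syntax itself
    is never left to the model and can never be malformed."""
    result = {}
    for name, spec in schema["properties"].items():
        raw = generate(name, spec["type"])
        result[name] = int(raw) if spec["type"] == "integer" else raw
    return result

schema = {
    "properties": {
        "title": {"type": "string"},
        "count": {"type": "integer"},
    }
}
print(json.dumps(fill_schema(schema)))
```

Because the keys and braces come from the schema rather than the model, a long schema only costs more per-field calls; it can't produce structurally invalid JSON.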
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.