ChatXinference
Xinference is a powerful and versatile library designed to serve LLMs, speech recognition models, and multimodal models, even on your laptop. It supports a variety of models compatible with GGML, such as chatglm, baichuan, whisper, vicuna, orca, and many others.
Overview
Integration details
| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |
|---|---|---|---|---|---|---|
| ChatXinference | langchain-xinference | โ | โ | โ | โ | โ |
Model features
| Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
|---|---|---|---|---|---|---|---|---|---|
| โ | โ | โ | โ | โ | โ | โ | โ | โ | โ |