serge-chat/serge
Forks: 407 · Stars: 5663 (updated 2024-10-18 17:06:34)
license: Apache-2.0
Language: Svelte
A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.
Latest release: 0.9.0 (2024-02-14 12:56:33)
Serge - LLaMA made easy 🦙
Serge is a chat interface crafted with llama.cpp for running GGUF models. No API keys, entirely self-hosted!
- 🌐 SvelteKit frontend
- 💾 Redis for storing chat history & parameters
- ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the python bindings
🎥 Demo:
⚡️ Quick start
🐳 Docker:
docker run -d \
  --name serge \
  -v weights:/usr/src/app/weights \
  -v datadb:/data/db/ \
  -p 8008:8008 \
  ghcr.io/serge-chat/serge:latest
🐙 Docker Compose:
services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:
Then, just visit http://localhost:8008. The API documentation is available at http://localhost:8008/api/docs
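As a quick connectivity check, the FastAPI-generated OpenAPI schema can be fetched programmatically. A minimal Python sketch, assuming the schema is served at `/api/openapi.json` (the FastAPI default location next to the `/api/docs` path — verify against your deployment):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8008"

def openapi_url(base: str = BASE_URL) -> str:
    """Build the OpenAPI schema URL (assumed FastAPI default next to /api/docs)."""
    return f"{base}/api/openapi.json"

def list_endpoints(base: str = BASE_URL) -> list[str]:
    """Fetch the schema and return the documented API paths, sorted."""
    with urllib.request.urlopen(openapi_url(base)) as resp:
        schema = json.load(resp)
    return sorted(schema.get("paths", {}))

if __name__ == "__main__":
    for path in list_endpoints():
        print(path)
```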
🌍 Environment Variables
The following environment variables are available:

Variable Name | Description | Default Value
---|---|---
SERGE_DATABASE_URL | Database connection string | sqlite:////data/db/sql_app.db
SERGE_JWT_SECRET | Key for auth token encryption. Use a random string | uF7FGN5uzfGdFiPzR
SERGE_SESSION_EXPIRY | Duration in minutes before a user must reauthenticate | 60
NODE_ENV | Node.js running environment | production
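These variables can be set in the Docker Compose file. A sketch extending the `serge` service from the quick start (values are illustrative — in particular, generate your own random SERGE_JWT_SECRET):

```yaml
services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    environment:
      - SERGE_DATABASE_URL=sqlite:////data/db/sql_app.db
      - SERGE_JWT_SECRET=change-me-to-a-random-string
      - SERGE_SESSION_EXPIRY=60
      - NODE_ENV=production
```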
🖥️ Windows
Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.
☁️ Kubernetes
Instructions for setting up Serge on Kubernetes can be found in the wiki.
🧠 Supported Models
Category | Models |
---|---|
Alfred | 40B-1023 |
BioMistral | 7B |
Code | 13B, 33B |
CodeLLaMA | 7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python |
Codestral | 22B v0.1 |
Gemma | 2B, 1.1-2B-Instruct, 7B, 1.1-7B-Instruct, 2-9B, 2-9B-Instruct, 2-27B, 2-27B-Instruct |
Gorilla | Falcon-7B-HF-v0, 7B-HF-v1, Openfunctions-v1, Openfunctions-v2 |
Falcon | 7B, 7B-Instruct, 11B, 40B, 40B-Instruct |
LLaMA 2 | 7B, 7B-Chat, 7B-Coder, 13B, 13B-Chat, 70B, 70B-Chat, 70B-OASST |
LLaMA 3 | 11B-Instruct, 13B-Instruct, 16B-Instruct |
LLaMA Pro | 8B, 8B-Instruct |
Mathstral | 7B |
Med42 | 70B, v2-8B, v2-70B |
Medalpaca | 13B |
Medicine | Chat, LLM |
Meditron | 7B, 7B-Chat, 70B, 3-8B |
Meta-LlaMA-3 | 3-8B, 3.1-8B, 3.2-1B-Instruct, 3-8B-Instruct, 3.1-8B-Instruct, 3.2-3B-Instruct, 3-70B, 3.1-70B, 3-70B-Instruct, 3.1-70B-Instruct |
Mistral | 7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca, Nemo-Instruct |
MistralLite | 7B |
Mixtral | 8x7B-v0.1, 8x7B-Dolphin-2.7, 8x7B-Instruct-v0.1 |
Neural-Chat | 7B-v3.3 |
Notus | 7B-v1 |
Notux | 8x7b-v1 |
Nous-Hermes 2 | Mistral-7B-DPO, Mixtral-8x7B-DPO, Mistral-8x7B-SFT |
OpenChat | 7B-v3.5-1210, 8B-v3.6-20240522 |
OpenCodeInterpreter | DS-6.7B, DS-33B, CL-7B, CL-13B, CL-70B |
OpenLLaMA | 3B-v2, 7B-v2, 13B-v2 |
Orca 2 | 7B, 13B |
Phi | 2-2.7B, 3-mini-4k-instruct, 3.1-mini-4k-instruct, 3.1-mini-128k-instruct, 3.5-mini-instruct, 3-medium-4k-instruct, 3-medium-128k-instruct |
Python Code | 13B, 33B |
PsyMedRP | 13B-v1, 20B-v1 |
Starling LM | 7B-Alpha |
SOLAR | 10.7B-v1.0, 10.7B-instruct-v1.0 |
TinyLlama | 1.1B |
Vicuna | 7B-v1.5, 13B-v1.5, 33B-v1.3, 33B-Coder |
WizardLM | 2-7B, 13B-v1.2, 70B-v1.0 |
Zephyr | 3B, 7B-Alpha, 7B-Beta |
Additional models can be requested by opening a GitHub issue. Other models are also available at Serge Models.
⚠️ Memory Usage
llama.cpp will crash if you don't have enough available memory to load the model.
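A rough back-of-the-envelope estimate (an approximation, not from the Serge docs) can help you check whether a model will fit before downloading it: quantized GGUF weights take roughly `parameters × bits-per-weight / 8` bytes, plus some overhead for the KV cache and runtime.

```python
def approx_model_ram_gb(
    n_params_billion: float,
    bits_per_weight: float = 4.5,  # typical for 4-bit quantization with overhead
    overhead_gb: float = 1.0,      # rough allowance for KV cache and runtime
) -> float:
    """Rough RAM (in GB) needed to load a quantized GGUF model.

    A rule of thumb only; actual usage varies with quantization
    scheme, context length, and batch size.
    """
    return n_params_billion * bits_per_weight / 8 + overhead_gb

# A 7B model at ~4.5 bits/weight needs on the order of 5 GB of free RAM.
print(f"7B:  ~{approx_model_ram_gb(7):.1f} GB")
print(f"13B: ~{approx_model_ram_gb(13):.1f} GB")
```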
💬 Support
Need help? Join our Discord
🧾 License
Nathan Sarrazin and Contributors. Serge is free and open-source software licensed under the MIT License and Apache-2.0.
🤝 Contributing
If you discover a bug or have a feature idea, feel free to open an issue or PR.
To run Serge in development mode:
git clone https://github.com/serge-chat/serge.git
cd serge/
docker compose -f docker-compose.dev.yml up --build
The development stack accepts a Python debugger session on port 5678. Example launch.json for VS Code:
{
"version": "0.2.0",
"configurations": [
{
"name": "Remote Debug",
"type": "python",
"request": "attach",
"connect": {
"host": "localhost",
"port": 5678
},
"pathMappings": [
{
"localRoot": "${workspaceFolder}/api",
"remoteRoot": "/usr/src/app/api/"
}
],
"justMyCode": false
}
]
}
Recent releases (data updated 2024-10-07 22:44:14):
2024-02-14 12:56:33 0.9.0
2024-01-02 11:56:46 0.8.2
2023-12-18 13:01:54 0.8.1
2023-12-18 11:25:55 0.8.0
2023-11-27 04:21:10 0.7.0
2023-11-24 09:36:12 0.6.0
2023-11-11 07:54:51 0.5.2
2023-11-01 21:00:28 0.5.1
2023-10-26 20:59:24 0.5.0
2023-09-18 05:26:06 0.4.1
Topics:
alpaca, docker, fastapi, llama, llamacpp, nginx, python, svelte, sveltekit, tailwindcss, web