Just another PoC with a local LLM and FastAPI
Built primarily on a MacBook Air M1 (2020) running macOS 14.4
- LaMini-Flan-T5-77M for primary execution testing (a loading sketch follows this list).
- Llama-2-7b-chat-hf planned for actual execution.
- Primary backend testing with FastAPI.
- Python formatter: Black.
- The weight file (308 MB) is git-ignored; it can be downloaded from this link.
- chathistory.txt logs all previous outputs from the soft tests.
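
A minimal sketch of loading LaMini-Flan-T5-77M with Hugging Face transformers; the prompt and generation settings here are illustrative assumptions, not the repo's exact code:

```python
# Sketch: load LaMini-Flan-T5-77M via a transformers pipeline.
# Swap the hub ID for the local model/ directory once the weights are downloaded.
from transformers import pipeline

generator = pipeline("text2text-generation", model="MBZUAI/LaMini-Flan-T5-77M")

# max_length is an assumed setting, not the repo's actual value.
reply = generator("What is FastAPI?", max_length=256)[0]["generated_text"]
print(reply)
```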
.
└── talktome/
    ├── README.md
    ├── main.py
    ├── stapp.py
    ├── index.html
    ├── model/
    │   ├── weight_file.bin
    │   └── utilities.extensions
    ├── assets/
    │   └── media_attached.extensions
    └── static/
        ├── index.html
        ├── script.js
        └── style.css
To build the Docker image, run: docker build -t lamini-app .
To run the Docker container: docker run -p 8000:8000 lamini-app
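
The build command assumes a Dockerfile at the repo root (not shown in the tree above); a possible minimal sketch, with the base image and dependency list as assumptions:

```dockerfile
# Sketch of a possible Dockerfile; the exact dependencies are assumptions.
FROM python:3.10-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir fastapi uvicorn transformers torch
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```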
Load the server:
uvicorn --host 0.0.0.0 main:app
Open:
index.html
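
The uvicorn command assumes main.py exposes a FastAPI instance named `app`. A minimal sketch of what such a file could look like; the `/chat` endpoint path and payload shape are assumptions:

```python
# Sketch of a minimal main.py; endpoint path and payload shape are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Prompt(BaseModel):
    text: str

@app.post("/chat")
def chat(prompt: Prompt):
    # In the real app this would call the LaMini pipeline.
    return {"reply": f"echo: {prompt.text}"}
```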
Run the Streamlit app:
python3.10 -m streamlit run stapp.py
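
A minimal sketch of what stapp.py might contain, assuming it forwards prompts to the FastAPI server; the endpoint URL and response keys are assumptions:

```python
# Sketch of a minimal stapp.py; URL and response keys are assumptions.
import requests
import streamlit as st

st.title("talktome")

prompt = st.text_input("Say something")
if prompt:
    resp = requests.post("http://localhost:8000/chat", json={"text": prompt})
    st.write(resp.json().get("reply", resp.text))
```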
Load the server with auto-reload for development:
uvicorn main:app --reload
Run the primary test:
python test.py
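
test.py is not shown in the tree above; a hypothetical smoke test against the running server could look like this (URL and payload are assumptions):

```python
# Hypothetical smoke test; the endpoint and payload are assumptions.
import requests

resp = requests.post("http://localhost:8000/chat", json={"text": "ping"})
assert resp.status_code == 200, resp.text
print(resp.json())
```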
TODO: fix the UI so that it shows the full chain of conversation.
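
One possible backend-side approach, sketched under the assumption that the API keeps conversation state in memory and returns the whole chain for the front end to render:

```python
# Sketch, not the repo's code: keep the conversation in memory and return
# the full history so the UI can display the chain without tracking state.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
history: list[dict] = []

class Prompt(BaseModel):
    text: str

@app.post("/chat")
def chat(prompt: Prompt):
    reply = f"echo: {prompt.text}"  # stand-in for the model call
    history.append({"user": prompt.text, "bot": reply})
    return {"reply": reply, "history": history}
```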