
Llama Coder

Llama Coder is a better, self-hosted GitHub Copilot replacement for Visual Studio Code. Llama Coder uses Ollama and codellama to provide autocomplete that runs entirely on your hardware. It works best on a Mac with an M1/M2/M3 chip or on a machine with an RTX 4090.

VS Code Plugin

Features

  • 🚀 As good as Copilot
  • ⚡️ Fast. Works well on consumer GPUs. An RTX 4090 is recommended for best performance.
  • 🔐 No telemetry or tracking
  • 🔬 Works with any language, coding or human

Recommended hardware

A minimum of 16GB of RAM is required; more is better, since even the smallest model takes 5GB of RAM. The best setup is a dedicated machine with an RTX 4090: install Ollama on that machine and point the endpoint in the extension settings at it to offload inference. The second best setup is a MacBook with an M1/M2/M3 chip and enough RAM (more is better, but roughly 10GB of headroom is enough). On Windows notebooks it runs well with a decent GPU, but a dedicated machine with a good GPU is recommended. A dedicated gaming PC is perfect.

Local Installation

Install Ollama on your local machine, then launch the extension in VS Code; everything should work out of the box.
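To confirm the local setup, you can ask Ollama's REST API which models have been pulled. A minimal sketch using the `/api/tags` route (the suggested `ollama pull` target is the extension's default model from the Models section; a running local Ollama is assumed):

```python
import json
import urllib.request

OLLAMA = "http://127.0.0.1:11434"  # Ollama's default address and port

def has_codellama(model_names: list[str]) -> bool:
    """True if any pulled model is a codellama variant."""
    return any(name.startswith("codellama") for name in model_names)

def list_local_models(base_url: str = OLLAMA) -> list[str]:
    """Return the names of models Ollama has pulled locally (GET /api/tags)."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

# Usage (requires a running local Ollama):
#   names = list_local_models()
#   if not has_codellama(names):
#       print("Missing model; run: ollama pull codellama:7b-code-q4_K_M")
```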

Remote Installation

Install Ollama on the dedicated machine and configure the endpoint in the extension settings. Ollama listens on port 11434 and binds to 127.0.0.1 by default; to make it reachable from other machines, set OLLAMA_HOST to 0.0.0.0.
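On Linux machines where the installer set up Ollama as a systemd service, one common way to export that variable (an assumption about your setup; any mechanism that puts OLLAMA_HOST in the ollama process environment works) is a drop-in override:

```
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
```

After `systemctl daemon-reload && systemctl restart ollama`, set the extension's endpoint to http://<server-ip>:11434.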

Models

Currently Llama Coder supports only Codellama. The model is quantized in different ways, but our tests show that q4 is the sweet spot for running the network. Bigger models perform better, so always pick the biggest model, at the highest-precision quantization, that your machine can fit. The default, codellama:7b-code-q4_K_M, should work everywhere; codellama:34b-code-q4_K_M is the best possible one.
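For context on why the -code variants are used: they support fill-in-the-middle completion, where the model receives the text before and after the cursor. A sketch of what a raw infill request to Ollama's /api/generate could look like (the <PRE>/<SUF>/<MID> token spelling follows Meta's Code Llama infilling format; treat the exact payload as an assumption, not the extension's literal request):

```python
import json

def infill_prompt(prefix: str, suffix: str) -> str:
    """Build a fill-in-the-middle prompt in Code Llama's infill format.

    The model generates the code that belongs between prefix and suffix.
    """
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

def infill_request(prefix: str, suffix: str,
                   model: str = "codellama:7b-code-q4_K_M") -> str:
    """JSON body for POST /api/generate; raw=True skips Ollama's prompt template."""
    return json.dumps({
        "model": model,
        "prompt": infill_prompt(prefix, suffix),
        "raw": True,
        "stream": False,
    })

# Example: ask for the code between a function header and its call site
body = infill_request("def add(a, b):\n    return ", "\n\nprint(add(2, 3))")
```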

Name                        RAM/VRAM  Notes
codellama:7b-code-q4_K_M    5GB
codellama:7b-code-q6_K      6GB       m
codellama:7b-code-fp16      14GB      g
codellama:13b-code-q4_K_M   10GB
codellama:13b-code-q6_K     14GB      m
codellama:34b-code-q4_K_M   24GB
codellama:34b-code-q6_K     32GB      m

  • m - slow on macOS
  • g - slow on older NVIDIA cards (pre-30xx)

Changelog

[0.0.8]

  • Improved DeepSeek support and language detection

[0.0.7]

  • Added DeepSeek support
  • Ability to change temperature and top p
  • Fixed some bugs

[0.0.6]

  • Fix ollama links
  • Added more models

[0.0.4]

  • Initial release of Llama Coder

Contributors

  • ex3ndr
  • wrapss
  • staff0rd
  • dre-on
