🦙 Deploying your own model with Ollama

Setting Up Your Model

For this guide, we'll use Llama 3.1 8B as our example model, and we'll set it up using Ollama, a platform that makes local development with open-source large language models straightforward.

Step 1: Install Ollama

  1. Visit the Ollama website or their GitHub repo.

  2. Download the installer for your operating system (MacOS, Windows, or Linux).

  3. If you're using Linux, you can run this command in your terminal:

    curl -fsSL https://ollama.com/install.sh | sh

The installation process usually takes a few minutes. If you have NVIDIA or AMD GPUs, Ollama will automatically detect them (make sure you have the drivers installed). Don't worry if you don't have a GPU – CPU-only mode works fine too, just a bit slower.

Step 2: Download a Model

  1. Check out the Ollama model library to see all the supported models.

  2. For this guide, we'll use Llama 3.1. To download it, open your terminal and run:

    ollama pull llama3.1

Step 3: Try Your Model Locally

  1. In your terminal, run:

    ollama run llama3.1

  2. Start chatting with your model! You now have a capable open-source model, roughly comparable to GPT-3.5 on many benchmarks, running completely on your own machine.
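Besides the interactive chat, Ollama serves an HTTP API on port 11434 that you can call programmatically. Here's a minimal sketch using only Python's standard library; it builds a request for the `/api/generate` endpoint (actually sending it assumes the Ollama server is running and `llama3.1` has been pulled):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "llama3.1") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )

req = build_request("Why is the sky blue?")
print(req.full_url)  # http://localhost:11434/api/generate

# With the Ollama server running, send it like this:
#   resp = urllib.request.urlopen(req)
#   print(json.loads(resp.read())["response"])
```

Setting `"stream": False` keeps the example simple; by default the endpoint streams one JSON object per token.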

Making Your Model Accessible

To use your model with the DecentAI mobile app when you're away from home, you'll need to make it accessible over the internet. We'll use ngrok for this.

Step 1: Set Up ngrok

  1. Visit ngrok.com/download and download ngrok for your operating system.

  2. Install ngrok following the provided instructions.

Step 2: Authenticate ngrok

  1. Go to https://dashboard.ngrok.com/get-started/setup and follow the instructions to get your auth token.

  2. In your terminal, run:

    ngrok config add-authtoken <your-auth-token>

Step 3: Deploy Your Model Online

  1. Ollama listens on port 11434 by default. To make it accessible, run this command in your terminal:

    ngrok http 11434

  2. ngrok will provide you with a public URL. Keep this URL handy – you'll need it for the next step.
