Most developers default to third-party APIs for large language models because local setup is a pain. The tradeoff: your data leaves your infrastructure, latency creeps in, and costs compound fast.
Ollama lets you run models like LLaMA and Mistral directly on your own hardware with minimal setup. But personal machines often lack the GPU power and memory these models actually need to perform. With customizable server configurations and a pre-built Ollama app image on a Kamatera VPS, you can deploy a dedicated LLM environment without touching complex manual installations.
This guide walks through deploying Ollama on Kamatera using that app image, then configuring it for secure, private model hosting. We will also cover model setup, API exposure, and basic optimization to get you to a production-ready setup.
Step-by-step guide
The first step in deploying Ollama on Kamatera is to log in to your Kamatera cloud management console. Once logged in, you’ll see the dashboard. This is where you’ll manage servers, networking, and storage.
Steps:
1. Expand the My Cloud option in the right-hand menu and choose Create New Server. Under the Create New Server options, choose your zone.
2. Scroll down, and under the Service Images tab, look for Ollama. Select it, then choose the latest version.
Select your server specifications
- Scroll down and enable the Detailed View toggle to check the pricing per configuration. Here’s where you’ll input your preferred server specifications.
- Choose the Type, CPU, RAM, and Disk storage. Click on the help button on the side to learn more about each setting. You can add more disk storage at any point in your server’s life.
- Enable the Daily Backup option to back up your server storage once a day.
- Enable the Management Services toggle if you want a managed server.
Configuring network and security settings
- Here you’ll choose either simple or advanced networking. This defines how your server connects to the internet and how much outbound data it can use per month.
- Simple Mode – choose between WAN or LAN.
- Advanced Mode – configure NIC #1 (Network Interface Card) directly:
  - WAN (selected): your server gets a public IP and is reachable from the internet.
  - The other options (LAN, Private, etc.) are only needed for internal or isolated setups.
- Now, let’s look at Advanced Configuration:
- Keep Server On Failure: This option controls what happens if something goes wrong, for example if a startup script fails while setting up your server. If it is OFF (the default), Kamatera automatically deletes the server when setup fails; you avoid paying for a broken setup, but you can’t debug what went wrong. If it is ON, Kamatera keeps the server running even if the initial setup fails, so you can log in, investigate the issue, fix problems manually, and retry setup if needed.
Recommended for most users:
Keep it OFF unless you’re running custom scripts or complex configurations. For basic setups like Ollama, the default (OFF) is usually fine.
- SSH key: Allows you to securely connect to the server without a password (recommended for better security).
- Skip Setting Password: If enabled, your server will only allow SSH key login, with no password access. Click “Generate Key” if you don’t already have one, or generate a key locally as shown below. Add further server notes and tags if necessary.
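If you prefer to generate a key on your own machine instead of using the console’s Generate Key button, a typical sequence looks like this (a minimal sketch; the key type, file path, and comment are up to you):
ssh-keygen -t ed25519 -C "kamatera-ollama"
cat ~/.ssh/id_ed25519.pub
Paste the printed public key into the SSH key field during server creation.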
Finalize deployment
- Finally, choose a strong password, select the number of servers, and input a server name.
- Once you have added all the details, choose your billing cycle and click the Create Server button.
You can check the server creation progress under Server > Task Queue. Once created, you will see your server under the Servers panel.
To open a remote console, click Connect and then Open Remote Console. This opens the server’s console in a separate window. To end the session, click Disconnect.
Accessing the Kamatera VPS
Once the server is created using the Ollama app image, you can access it via the Kamatera web console or SSH.
On first login, you’ll land in a standard Ubuntu terminal:
- Username: root
- Password: (the one you set during server creation)
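To connect over SSH instead of the web console, use the server’s public IP shown in the Servers panel:
ssh root@<your-server-ip>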
After logging in, you should see a system message indicating that Ollama has been installed successfully.
Verifying Ollama Installation
The pre-configured image installs and starts Ollama and NGINX automatically. Terminal output will confirm both services are active. To verify manually:
systemctl status ollama
You should see the service running without errors.
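Since the image also ships NGINX, you can check it the same way:
systemctl status nginx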
Checking available models
No models are preloaded. Run the following to confirm:
ollama list
An empty list is expected. You will need to pull a model before using the API or CLI.
Downloading a model
This guide uses Llama 3, which balances performance and resource usage well. Pull it with:
ollama pull llama3
Depending on the model version, the download may be several GBs and take a few minutes.
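If you want a specific parameter size rather than the default, you can usually pull it by tag; the available tags depend on the Ollama model library at the time, so treat this as an example:
ollama pull llama3:8b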
Running the model
Once downloaded, start an interactive session:
ollama run llama3
Enter queries directly at the prompt:
>>> Explain what a VPS is
A VPS (Virtual Private Server) is a virtualized server that provides dedicated resources...
Press Ctrl + D to exit.
When a model runs, Ollama loads it into memory and handles all inference locally on your VPS. No data is sent to external APIs, and the model stays cached for faster reuse.
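You can also pass a prompt directly on the command line for one-off, scriptable queries instead of an interactive session:
ollama run llama3 "Explain what a VPS is"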
Using Ollama via API
Ollama exposes a local API at http://localhost:11434. This is where its real value as a hosted service comes in: external applications can query your models directly.
Check that the API is responding:
curl http://localhost:11434/api/tags
Send a prompt:
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt": "Explain what a VPS is"
}'
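By default, /api/generate streams the answer back as a series of JSON objects. If you’d rather receive a single consolidated response, you can disable streaming:
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt": "Explain what a VPS is",
"stream": false
}'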
At this stage, the API is only accessible from within the server. The next section covers external access.
Enabling external access
By default, Ollama binds to localhost. To open it up:
Step 1: Update the service file
sudo nano /etc/systemd/system/ollama.service
Add the following under [Service]:
Environment="OLLAMA_HOST=0.0.0.0"
Step 2: Restart the service
sudo systemctl daemon-reload
sudo systemctl restart ollama
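You can confirm the new bind address from the server before opening the firewall; ss is included in standard Ubuntu images:
ss -tlnp | grep 11434
The output should show Ollama listening on 0.0.0.0:11434 rather than 127.0.0.1:11434.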
Step 3: Open port 11434
sudo ufw allow 11434
The API is now reachable externally at http://<your-server-ip>:11434. Test it:
curl http://<your-server-ip>:11434/api/tags
Exposing the API publicly without restrictions is a security risk. The next section covers how to lock it down.
Securing the deployment
Exposing Ollama on 0.0.0.0:11434 means anyone with your server IP can reach your model, which opens the door to misuse and unexpected resource consumption. Here are three ways to address that.
Option 1: Restrict access by IP
Allow only trusted IPs on port 11434:
sudo ufw allow from <your-ip-address> to any port 11434
sudo ufw deny 11434
Option 2: Use SSH tunneling (recommended for most users)
If you don’t need the API publicly accessible, skip opening the port entirely. SSH tunneling lets you reach the API from your local machine without exposing it to the internet:
ssh -L 11434:localhost:11434 root@<your-server-ip>
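With the tunnel open, requests to port 11434 on your local machine are forwarded to the server over SSH:
curl http://localhost:11434/api/tags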
Option 3: Add a reverse proxy
The Kamatera image includes NGINX, which you can configure to add authentication, route traffic, and enable HTTPS. This is the right path for production setups, but not required for basic usage.
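As a rough sketch of what that could look like, the server block below proxies requests to Ollama and protects them with HTTP basic auth; the domain name and credentials file are placeholders you would adapt, and for production you would still want HTTPS in front of it:
# hypothetical /etc/nginx/sites-available/ollama
server {
    listen 80;
    server_name ollama.example.com;   # placeholder domain

    location / {
        auth_basic "Ollama API";                     # require a username/password
        auth_basic_user_file /etc/nginx/.htpasswd;   # created with the htpasswd tool
        proxy_pass http://127.0.0.1:11434;           # forward to the local Ollama API
    }
}
Create the credentials file with htpasswd (from the apache2-utils package on Ubuntu), enable the site, and reload NGINX for the change to take effect.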
As a baseline, never leave the API publicly accessible without at least one of these controls in place.
Managing models and storage
Ollama stores downloaded models locally. With larger models, disk usage adds up fast.
List installed models:
ollama list
Remove a model you no longer need:
ollama rm llama3
As a rough guide: 7B models typically use a few GBs, while larger models can reach 10-30GB or more. Check your available disk space before pulling multiple models, and remove unused ones regularly to avoid running out of room on smaller VPS instances.
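To see what the models are actually consuming, check overall free space and the size of Ollama’s model directory. When Ollama runs as the systemd service installed by this image, models typically live under /usr/share/ollama/.ollama/models, though the exact path can vary:
df -h
du -sh /usr/share/ollama/.ollama/models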
Conclusion
Deploying Ollama on a Kamatera VPS gives you a private, self-contained LLM environment without the latency, cost, or data exposure that comes with third-party APIs.
This guide covered:
- Setting up a VPS using the Ollama app image
- Verifying the installation and pulling a model
- Running the model via CLI and API
- Enabling and securing external access
- Managing models and disk storage
You now have a working private LLM setup ready to integrate into your applications, with your data staying entirely within your own infrastructure.