
How to Run DeepSeek Locally

DeepSeek is an open-source LLM (Large Language Model) project that’s gaining attention for its strong performance and accessibility. If you’re looking to test or fine-tune DeepSeek on your local machine, this guide will walk you through the setup process.

Why Run DeepSeek Locally?

Running DeepSeek locally gives you full control over the model without relying on cloud services. It’s great for:

  • Testing custom prompts and workflows
  • Ensuring data privacy
  • Saving on API costs
  • Fine-tuning or experimenting with model behavior

Prerequisites

Before diving in, make sure your machine meets the following requirements (a quick GPU check script follows the list):

  • Operating System: Linux or macOS (Windows users should use WSL)
  • GPU: NVIDIA GPU with at least 16GB VRAM (for decent performance)
  • Python: 3.10 or later
  • CUDA & cuDNN: Installed and configured
  • Disk Space: 50GB+ (model files can be large)
  • Git and pip: Installed
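
To sanity-check the GPU side of the setup, a short PyTorch snippet like the one below can confirm that CUDA is visible and report available VRAM. This assumes PyTorch is already installed; the exact numbers will vary by machine.

import torch

if torch.cuda.is_available():
    name = torch.cuda.get_device_name(0)
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"GPU detected: {name} ({vram_gb:.1f} GB VRAM)")
else:
    print("No CUDA-capable GPU detected - check your drivers and CUDA install")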

Step 1: Clone the DeepSeek Repository

Open a terminal and run:

git clone https://github.com/deepseek-ai/DeepSeek-V2.git
cd DeepSeek-V2

If you’re using a different version (e.g., DeepSeek Coder), be sure to clone the appropriate repo.


Step 2: Set Up a Python Environment

Create a virtual environment to keep dependencies clean:

python3 -m venv venv
source venv/bin/activate

Then install dependencies:

pip install -r requirements.txt

Step 3: Download the Model Weights

Model weights are usually hosted on Hugging Face. Visit https://huggingface.co/deepseek-ai and choose the variant you want (7B or 67B, base or chat).

Download using git-lfs:

git lfs install
git clone https://huggingface.co/deepseek-ai/deepseek-llm-7b-base

Move the downloaded weights into a models/ folder inside your DeepSeek directory.
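
If you'd rather not use git-lfs, the huggingface_hub Python package offers an alternative download path. Here's a minimal sketch, assuming you've run pip install huggingface_hub; the local_dir path is just an example that matches the models/ layout above:

from huggingface_hub import snapshot_download

# Download the full model repo into the local models/ folder
snapshot_download(
    repo_id="deepseek-ai/deepseek-llm-7b-base",
    local_dir="models/deepseek-llm-7b-base",
)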


Step 4: Run the Model with Transformers

You can now load the model using Hugging Face’s transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Path to the weights you downloaded in Step 3
model_name = "./models/deepseek-llm-7b-base"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load in half precision so the 7B model fits comfortably on a 16GB GPU
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Tokenize a test prompt and move it to the same device as the model
inputs = tokenizer("Hello, DeepSeek!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

This will output a basic completion. You’re up and running!
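
If you grabbed the chat variant instead of the base model, it generally responds better when prompts are wrapped in its chat template. Here's a rough sketch using transformers' apply_chat_template, assuming the deepseek-llm-7b-chat weights sit in the same models/ folder:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "./models/deepseek-llm-7b-chat"  # chat variant (assumed path)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Wrap the user message in the model's chat template
messages = [{"role": "user", "content": "Explain what DeepSeek is in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=100)
# Strip the prompt tokens and print only the generated reply
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))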


Optional: Use Text Generation WebUI

For a user-friendly interface, consider running DeepSeek via oobabooga/text-generation-webui. It supports multiple LLMs, including DeepSeek, and provides a web interface with GPU acceleration.


Troubleshooting Tips

  • Out of Memory Errors? Try using 4-bit or 8-bit quantized models via bitsandbytes (a sketch follows this list).
  • CUDA Errors? Make sure your GPU drivers and CUDA toolkit versions are compatible.
  • Slow Load Times? Use torch.compile() or an inference acceleration library such as vLLM if supported.
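
For the out-of-memory case, here's a rough sketch of 4-bit loading with bitsandbytes and transformers' BitsAndBytesConfig, assuming pip install bitsandbytes is done; the parameter choices are illustrative, not tuned:

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4 bits on load
    bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
)

model_name = "./models/deepseek-llm-7b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",
)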

Final Thoughts

DeepSeek is a powerful LLM, and running it locally opens up a lot of flexibility for custom development and private experimentation. With a capable GPU and the right setup, you can have a local AI assistant that doesn’t rely on the cloud.

Let me know in the comments if you run into any issues or want a tutorial on fine-tuning DeepSeek!

