D-Lab's 2-hour introduction to running open-weight large language models (LLMs) locally on your computer for research applications.

D-Lab Local LLMs for Research

License: CC BY 4.0

This repository contains the materials for D-Lab's Local LLMs for Research workshop.

Workshop Overview

This workshop teaches you how to run open-weight large language models (LLMs) locally on your computer for research applications. You'll learn to use Ollama (a simple, powerful tool for running models like Llama, Qwen, and Mistral) to perform text classification, quality evaluation, and batch processing of research data, all without cloud APIs or usage limits.

Prerequisites

We recommend attending Python Fundamentals and Python Data Wrangling prior to this workshop. Basic familiarity with pandas and working with text data is helpful but not required.

Check out D-Lab's Workshop Catalog to browse all workshops, see what's running now, and review prerequisites.

Workshop Goals

In this workshop, we cover how to use local large language models (LLMs) for research tasks using Ollama. By the end of this workshop, you will be able to:

  • Install and configure Ollama to run open-weight LLMs on your laptop
  • Build text classifiers with structured outputs for research data
  • Use LLM-as-Judge techniques to evaluate text quality, detect bias, and assess arguments
  • Process research datasets efficiently with batch operations
  • Validate and quality-check LLM outputs for reproducible research
  • Apply best practices for prompt engineering and model selection
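As a taste of the LLM-as-Judge goal above, the prompt-and-parse pattern can be sketched with the standard library alone. Note that `build_judge_prompt` and `parse_judge_reply` are illustrative helper names, not functions from the workshop materials, and no model is actually called here:

```python
import json

def build_judge_prompt(text, criterion="argument quality"):
    """Construct an LLM-as-Judge prompt asking for a 1-5 score as JSON."""
    return (
        f"Rate the following text for {criterion} on a scale of 1 to 5. "
        'Reply only with JSON like {"score": 3, "reason": "..."}.\n\n'
        f"Text: {text}"
    )

def parse_judge_reply(raw_reply):
    """Validate the judge's JSON reply and return the integer score."""
    data = json.loads(raw_reply)
    score = int(data["score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    return score
```

In the workshop itself, the prompt would be sent to a local model via Ollama and the model's reply passed through the validator before being recorded.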

Why Local LLMs?

Running LLMs locally offers several advantages for research:

  • Privacy: Your data never leaves your machine—critical for sensitive research data
  • Cost: No per-token charges or subscription fees—run unlimited experiments
  • Reproducibility: Control exact model versions for reproducible results
  • No Rate Limits: Process large datasets without API throttling
  • Offline Work: Continue working without internet connectivity

Common Research Use Cases

This workshop covers practical applications including:

  • Text Classification: Categorize survey responses, social media posts, or documents
  • Content Analysis: Extract themes, sentiments, or entities from qualitative data
  • LLM-as-Judge: Evaluate argument quality, detect biases, assess text characteristics
  • Data Annotation: Generate training labels for machine learning projects
  • Quality Control: Check data consistency and identify anomalies
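The classification use case above typically works by asking the model for JSON and validating the reply against a known label set. A minimal standard-library sketch (the label set and the sample reply below are made up; in practice the reply would come from an Ollama call):

```python
import json

ALLOWED_LABELS = {"positive", "negative", "neutral"}  # hypothetical label set

def parse_classification(raw_reply):
    """Validate a model's JSON reply against the allowed label set."""
    data = json.loads(raw_reply)
    label = str(data.get("label", "")).lower()
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label

# Stand-in for a real model reply:
print(parse_classification('{"label": "Positive", "confidence": 0.9}'))  # -> positive
```

Rejecting out-of-vocabulary labels up front is what makes downstream analysis of the annotations trustworthy.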

Installation Instructions

Prerequisites

Before the workshop, you'll need to install:

  1. Python and Jupyter: We recommend Anaconda (Python 3.10 or higher)
  2. Ollama: The local LLM runtime (installation instructions below)

Step 1: Install Anaconda

Anaconda is package management software that makes it easy to run Python and Jupyter notebooks. If you would like to run Python on your own computer, complete the following steps prior to the workshop:

  1. Download and install Anaconda (Python 3.10 or higher) from the Anaconda website by clicking the "Download" button.

  2. Download the Local LLMs workshop materials:

    • Click the green "Code" button in the top right of the repository information.
    • Click "Download Zip".
    • Extract this file to a folder on your computer where you can easily access it (we recommend Desktop).
  3. Optional: if you're familiar with git, you can instead clone this repository by opening a terminal and entering the command git clone git@github.com:dlab-berkeley/Local-LLMs-Ollama.git.

Step 2: Install Ollama

Ollama is required to run the workshop materials. Install it before the workshop:

  1. Visit ollama.com/download
  2. Download the installer for your operating system (Mac, Windows, or Linux)
  3. Run the installer and follow the prompts
  4. Verify installation by opening Terminal/Command Prompt and typing: ollama --version

Mac/Linux: Ollama starts automatically after installation.
Windows: Look for Ollama in your system tray.
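If you prefer, you can also check for the ollama binary from Python. This is a small convenience sketch, not part of the workshop materials:

```python
import shutil
import subprocess

def ollama_installed():
    """Return True if the ollama binary is on the PATH."""
    return shutil.which("ollama") is not None

if ollama_installed():
    # Same check as typing `ollama --version` in the terminal
    result = subprocess.run(["ollama", "--version"], capture_output=True, text=True)
    print(result.stdout.strip())
else:
    print("Ollama not found; install it from ollama.com/download")
```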

Step 3: Download a Model

Before the workshop, download at least one model to save time:

ollama pull qwen2.5:3b

This downloads the Qwen 2.5 3B model (~2GB), which we'll use in the workshop. This is a one-time download and may take several minutes depending on your internet connection.

Hardware Requirements

  • Minimum: 8GB RAM, modern CPU (Intel/AMD/Apple Silicon)
  • Recommended: 16GB RAM, dedicated GPU (optional but speeds up processing)
  • Disk Space: ~5-10GB for models and dependencies

Note: You can run these workshops on most modern laptops. A GPU is not required, though it will improve performance.
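For sizing purposes, a common community rule of thumb (not from the workshop materials, and only a rough guide) is that a 4-bit-quantized model needs very roughly 0.5-0.75 GB of RAM per billion parameters, plus some overhead for context and the runtime:

```python
def approx_ram_gb(params_billion, gb_per_billion=0.65, overhead_gb=1.5):
    """Very rough RAM estimate for a 4-bit-quantized model (rule of thumb)."""
    return params_billion * gb_per_billion + overhead_gb

for size in (3, 7, 14):
    print(f"{size}B model: ~{approx_ram_gb(size):.1f} GB")
```

By this estimate a 3B model like qwen2.5:3b fits comfortably in 8GB of RAM, which matches the minimum requirement above.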

Run the Code

  1. Open the Anaconda Navigator application. You should see the green snake logo appear on your screen. Note that this can take a few minutes to load up the first time.

  2. Click the "Launch" button under "Jupyter Notebook" and navigate through your file system to the Local-LLMs-Ollama folder you downloaded above.

  3. Open the lessons folder, and click ollama_research_workshop.ipynb to begin.

  4. Press Shift + Enter (or Ctrl + Enter) to run a cell.

  5. The necessary packages for this workshop can be installed by running the first few cells of the notebook, or by running the following line in its own cell:

    %pip install ollama pandas tqdm pydantic

Note that all of the above steps can be run from the terminal, if you're familiar with how to interact with Anaconda in that fashion. However, using Anaconda Navigator is the easiest way to get started if this is your first time working with Anaconda.

Workshop Structure

The workshop is organized into the following sections:

  1. Getting Started with Ollama: Installation, setup, and downloading models
  2. Text Classification for Research: Building sentiment analyzers and custom classifiers with structured outputs
  3. LLM-as-Judge: Evaluating text quality, detecting bias, and assessing arguments
  4. Batch Processing Research Data: Efficiently processing datasets with error handling
  5. Best Practices and Tips: Temperature settings, prompt engineering, validation, and ethical considerations
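Section 4's batch-processing pattern (retries plus per-row error records) can be sketched as follows; classify_stub stands in for a real model call via the ollama client, and batch_classify is an illustrative helper rather than the workshop's actual code:

```python
def classify_stub(text):
    """Stand-in for a real model call (e.g. via the ollama Python client)."""
    if not text:
        raise ValueError("empty input")
    return "long" if len(text) > 20 else "short"

def batch_classify(texts, classify, max_retries=2):
    """Classify each text, retrying failures and recording errors per row."""
    results = []
    for text in texts:
        for attempt in range(max_retries + 1):
            try:
                results.append({"text": text, "label": classify(text), "error": None})
                break
            except Exception as exc:
                if attempt == max_retries:
                    results.append({"text": text, "label": None, "error": str(exc)})
    return results

rows = batch_classify(["short one", "a much longer piece of text here", ""], classify_stub)
print(rows)
```

Recording failures per row instead of aborting the whole run is what keeps long annotation jobs over real datasets resumable and auditable.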

Additional Resources

Troubleshooting

Ollama Installation Issues

  • Verify Ollama is running: type ollama --version in terminal
  • Mac/Linux: Ollama should start automatically; if not, run ollama serve
  • Windows: Check system tray for Ollama icon

Model Download Issues

  • Ensure stable internet connection for initial model download
  • Downloads are large (~2-5GB per model) and may take time
  • Check available disk space

Performance Issues

  • Start with smaller models (3B parameters) on older hardware
  • Close other applications to free up memory
  • Consider using CPU if GPU drivers are problematic

For additional help, visit D-Lab office hours or email dlab@berkeley.edu.

About the UC Berkeley D-Lab

D-Lab works with Berkeley faculty, research staff, and students to advance data-intensive social science and humanities research. Our goal at D-Lab is to provide practical training, staff support, resources, and space to enable you to use computational tools for your own research applications. Our services cater to all skill levels and no programming, statistical, or computer science backgrounds are necessary. We offer these services in the form of workshops, one-to-one consulting, and working groups that cover a variety of research topics, digital tools, and programming languages.

Visit the D-Lab homepage to learn more about us. You can view our calendar for upcoming events, learn about how to utilize our consulting and data services, and check out upcoming workshops. Subscribe to our newsletter to stay up to date on D-Lab events, services, and opportunities.

Other D-Lab Python Workshops

D-Lab offers a variety of Python workshops, catered toward different levels of expertise.

Foundational Workshops

Advanced Workshops

Related Workshops

Contributors

Acknowledgments

This workshop builds on the growing ecosystem of open-source AI tools and the work of the broader LLM community. Special thanks to the Ollama team for creating such an accessible tool for running local LLMs.
