Local Large Language Models in R with Ollama

Introduction

This tutorial introduces Ollama and the ollamar R package — a toolkit for running open-source large language models (LLMs) directly on your own machine and calling them from R. Unlike cloud-based AI services, Ollama requires no API key, sends no data to external servers, and works entirely offline once a model has been downloaded. This makes it particularly well suited for research involving sensitive or proprietary text, for reproducible analyses that must not depend on third-party service availability, and for teaching environments without budget for commercial API access.
The tutorial covers the conceptual foundations of local LLM inference, installation and setup, and eight practical workflows relevant to corpus linguistics and NLP research: basic text generation, multi-turn conversation, sentiment analysis and text classification, named entity recognition, text summarisation, generating embeddings, corpus-scale batch processing, and using a model to assist with writing R code.
Before working through this tutorial, you should be comfortable with:
- Getting Started with R — R objects, functions, and the tidyverse
- String Processing in R — working with text in R
- Loading and Saving Data — reading files into R
Familiarity with basic NLP concepts (tokens, sentiment, named entities) is helpful but not required — concepts are introduced as they arise.
By the end of this tutorial you will be able to:
- Explain what Ollama is, how local LLM inference works, and when to prefer it over cloud APIs
- Install Ollama, pull a model, and verify the connection from R
- Generate text from a prompt using generate()
- Build multi-turn conversations using chat() and conversation history management
- Use prompt engineering to perform sentiment analysis, NER, and text summarisation
- Generate sentence embeddings with embed() for downstream analysis
- Process a corpus of texts at scale using parallelisation
- Use a local LLM to assist with writing and debugging R code
Martin Schweinberger. 2026. Local Large Language Models in R with Ollama. The Language Technology and Data Analysis Laboratory (LADAL), The University of Queensland, Australia. url: https://ladal.edu.au/tutorials/ollama/ollama.html (Version 2026.03.27), doi: .
What Is Ollama and Why Use It?
What you will learn: What Ollama is and how it works; the difference between local and cloud-based LLM inference; the key advantages of running models locally; hardware requirements; and how ollamar connects R to the Ollama server
Local vs Cloud LLM Inference
When you use a cloud-based LLM service — such as OpenAI’s GPT, Anthropic’s Claude, or Google’s Gemini — your text is sent over the internet to a remote server, processed there, and the response is returned to you. This works well for many tasks but raises concerns in three areas:
Privacy — any text you send to a cloud API is processed on servers you do not control. For research involving sensitive data (patient records, confidential documents, anonymised survey responses), this is often unacceptable under institutional ethics approvals and data governance policies.
Cost — commercial APIs charge per token. For corpus linguistics, where you may need to process thousands of texts, costs can escalate rapidly. A corpus of 10,000 abstracts processed with a cloud API at typical 2025 pricing can cost tens to hundreds of dollars.
Reproducibility — cloud models are updated without notice. An analysis run today against GPT-4o may produce different results next month against a silently updated version of the same model. Local models, by contrast, are fixed: the weights you download today are the weights you use in six months.
Ollama eliminates all three concerns by running the model entirely on your own hardware.
What Ollama Is
Ollama is a free, open-source application that downloads, manages, and serves open-source LLMs locally. It provides a REST API on http://127.0.0.1:11434 that any application — including R, Python, or a web browser — can call to generate text, chat, or produce embeddings. From R, the ollamar package (Lin and Safi 2025) wraps this API in a clean set of R functions.
Your R script
│
▼
ollamar (R package) ──HTTP──▶ Ollama server (127.0.0.1:11434)
│
▼
Local LLM weights
(stored on your machine)
│
▼
Response text
Ollama supports a large and growing library of models. Once you have installed Ollama, pulling a new model is a single command.
Hardware Requirements
LLMs vary greatly in size. The model used throughout this tutorial, llama3.2:3b, is a 3-billion-parameter model that runs on virtually any modern laptop with at least 8 GB of RAM, without a GPU. Larger models require more resources:
| Model size | RAM required | GPU needed? | Typical use |
|---|---|---|---|
| 1B–3B | 4–8 GB | No | Teaching, prototyping, simple tasks |
| 7B–8B | 8–16 GB | Optional | Most NLP research tasks |
| 13B | 16 GB | Recommended | High-quality generation |
| 70B+ | 48 GB+ | Required | Near-GPT-4 quality |
For the tasks in this tutorial, llama3.2:3b is sufficient and runs comfortably without a GPU on a standard research laptop.
The ollamar Package
The ollamar package (Lin and Safi 2025) uses the httr2 library to make HTTP requests to the Ollama server. Most functions return an httr2_response object by default, which must be parsed with resp_process(). Alternatively, you can specify the output format directly using the output parameter, which accepts "text", "df" (tibble), "jsonlist", "raw", or "resp" (the default httr2 response).
ollamar is an R interface to Ollama — a separate application that must be installed on your machine before any ollamar function will work. Installing the R package alone is not sufficient.
To install Ollama:
- Go to ollama.com in your browser
- Click Download — the site detects your operating system (Windows, Mac, or Linux) automatically
- Run the installer (OllamaSetup.exe on Windows; drag to Applications on Mac)
- After installation, close and reopen your terminal — Windows and Mac need a fresh terminal session to recognise the new ollama command
- Verify the installation worked by opening a terminal and running:
ollama --version
You should see a version number. If you see “not recognised” or “command not found”, restart your computer and try again.
Then download the model used in this tutorial:
ollama pull llama3.2
This downloads the 3B model (~2 GB) and only needs to be done once. Ollama stores model weights automatically in its own cache folder — you do not need to specify a location.
Once installed, Ollama runs as a background service and starts automatically at login. You can confirm it is running by checking the system tray (Windows) or menu bar (Mac) for the Ollama icon. From R, verify the connection with ollamar::test_connection() before running any analysis code.
Q1. A researcher wants to use an LLM to analyse 5,000 interview transcripts that contain sensitive personal information about mental health. She is considering using a commercial cloud API. What are the two most important reasons she should use a local model via Ollama instead?
Q2. A colleague says: ‘llama3.2:3b is only 3 billion parameters — it must be far worse than GPT-4 and not worth using for research.’ What is the most accurate response to this argument?
Setup
What you will learn: How to install Ollama; how to pull (download) a model; how to install and load ollamar; how to test the connection; and what to do when things go wrong
Step 1 — Install Ollama
Download and install Ollama from ollama.com. Installers are available for Windows, Mac, and Linux. After installation, Ollama runs as a background service and starts automatically when your computer boots.
To verify Ollama is running, open a terminal and type:
ollama --version
You should see a version number. If you see “command not found”, Ollama is not installed or not on your PATH.
Step 2 — Pull a Model
Download the llama3.2:3b model. This is a 2 GB download and only needs to be done once:
ollama pull llama3.2
To see all models you have downloaded:
ollama list
llama3.2 defaults to the 3B parameter version, which runs on any laptop with 8 GB RAM and no GPU. If you have more resources available, llama3.1:8b (requires 8 GB RAM) produces noticeably better output for complex tasks. Pull it with ollama pull llama3.1. Throughout this tutorial we use llama3.2 for accessibility, with notes where a larger model would improve results.
Step 3 — Install ollamar
Code
# Stable version from CRAN
install.packages("ollamar")
# Development version with latest features (optional)
# install.packages("remotes")
# remotes::install_github("hauselin/ollamar")
# Additional packages used in this tutorial
install.packages(c(
"dplyr", "purrr", "tibble", "stringr",
"ggplot2", "flextable", "httr2", "checkdown", "jsonlite"
))
Step 4 — Load Packages
Code
library(ollamar)
library(dplyr)
library(purrr)
library(tibble)
library(stringr)
library(ggplot2)
library(flextable)
library(httr2)
library(checkdown)
Step 5 — Test the Connection
Code
# Check that Ollama is running and R can reach it
ollamar::test_connection()
<httr2_response>
GET http://localhost:11434/
Status: 200 OK
Content-Type: text/plain
Body: In memory (17 bytes)
A successful connection prints a confirmation message. If you see "Ollama local server not running or wrong server", check that the Ollama application is open and running in the background.
Code
# See which models you have downloaded
ollamar::list_models()
                     name   size parameter_size quantization_level            modified
1         llama3.2:latest   2 GB           3.2B             Q4_K_M 2026-03-20T08:40:36
2 nomic-embed-text:latest 274 MB           137M                F16 2026-03-20T08:40:37
ollamar communicates with Ollama via HTTP. If Ollama is not running when you call generate(), chat(), or embed(), you will get a connection error. Always check test_connection() at the start of your session if you are unsure.
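A convenient pattern is to guard an analysis script with a small helper that stops early, with a clear message, if the server is unreachable. The function below is an illustrative sketch (ensure_ollama is our own helper, not part of ollamar); it assumes that test_connection(logical = TRUE) returns TRUE or FALSE, as in recent ollamar versions, but any check function that returns TRUE on success will work.

```r
# Illustrative helper (not part of ollamar): stop with a clear message if the
# Ollama server cannot be reached, instead of failing mid-analysis.
ensure_ollama <- function(check = function() ollamar::test_connection(logical = TRUE)) {
  ok <- isTRUE(tryCatch(check(), error = function(e) FALSE))
  if (!ok) {
    stop("Ollama server not reachable; open the Ollama application and try again.",
         call. = FALSE)
  }
  invisible(TRUE)
}

# At the top of an analysis script:
# ensure_ollama()
```

Because the check function is an argument, the helper can also be tested, or pointed at a non-default server URL, without touching the rest of the script.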
Q3. You run test_connection() and see "Ollama local server not running or wrong server". You are sure Ollama is installed. What should you check first?
Q4. You run list_models() and see an empty data frame. What does this mean and what should you do?
Basic Text Generation
What you will learn: How generate() works; the output parameter and its five format options; how to write effective prompts; and how to inspect and process the response object
The generate() Function
generate() is the simplest way to get a response from a model. It takes a model name and a prompt and returns a response in the format you specify:
Code
library(ollamar)
# Generate a response — returns httr2_response object by default
resp <- ollamar::generate("llama3.2", "What is corpus linguistics?")
# Inspect the raw response object
resp
<httr2_response>
POST http://127.0.0.1:11434/api/generate
Status: 200 OK
Content-Type: application/json
Body: In memory (5322 bytes)
Code
# Extract just the text
ollamar::resp_process(resp, "text")
[1] "Corpus linguistics is a subfield of linguistics that focuses on the study of language using large databases, or corpora, of texts. A corpus is a collection of texts that are representative of a particular language variety, register, or dialect.\n\nThe core idea behind corpus linguistics is to analyze and quantify linguistic phenomena by comparing large amounts of text data with statistical methods. This approach allows researchers to identify patterns, trends, and relationships in language that may not be apparent through traditional qualitative analysis.\n\nCorpus linguistics typically involves the following steps:\n\n1. Data collection: Gathering a large corpus of texts from various sources, such as books, articles, websites, or social media.\n2. Preprocessing: Cleaning and preparing the data for analysis by removing punctuation, converting to lowercase, and tokenizing (breaking down) text into individual words or phrases.\n3. Analysis: Using statistical and computational methods to analyze the corpus, such as frequency analysis, sentiment analysis, and topic modeling.\n\nCorpus linguistics has many applications in language teaching, linguistics research, natural language processing, and other fields. Some common uses of corpus linguistics include:\n\n1. Language teaching: Corpus-based materials can be used to create authentic language learning resources, such as reading comprehension exercises or writing prompts.\n2. Linguistic analysis: Corpus linguistics helps researchers understand the structure and evolution of languages, including grammar, vocabulary, and syntax.\n3. Sentiment analysis: Corpora can be used to analyze public opinion and sentiment towards certain topics or issues.\n4. Text classification: Corpus-based machine learning algorithms can be trained on labeled data to classify new texts into categories (e.g., spam vs. non-spam emails).\n5. 
Language documentation: Corpus linguistics helps document endangered languages, providing insights into their structure, vocabulary, and usage.\n\nSome notable examples of corpus linguistic research include:\n\n* The British National Corpus (BNC): A 100-million-word database of written English from the UK.\n* The Google Books Ngram Viewer: A tool for analyzing changes in language over time based on book publication data.\n* The Corpus of Contemporary American English (COCA): A database of spoken and written American English from the early 20th century to the present.\n\nCorpus linguistics has revolutionized the way we study, teach, and analyze language, providing a wealth of new insights into the complex dynamics of human communication."
Code
# Or get a tidy tibble with metadata
ollamar::resp_process(resp, "df")
# A tibble: 1 × 3
model response created_at
<chr> <chr> <chr>
1 llama3.2 "Corpus linguistics is a subfield of linguistics that foc… 2026-03-2…
Output Formats
The output parameter saves you from calling resp_process() separately:
Code
# Text string — most convenient for single outputs
txt <- ollamar::generate("llama3.2",
"Define collocations in corpus linguistics.",
output = "text")
cat(txt)
In corpus linguistics, a collocation is a fixed combination of two or more words that are commonly used together in a language, often in a specific context or register. Collocations can be composed of any type of word, including nouns, verbs, adjectives, and adverbs.
Collocations differ from other linguistic phenomena such as idioms and phrasal verbs, which also involve the use of multiple words together. However, collocations are typically characterized by their high frequency of occurrence in a language corpus, often with a relatively low probability of being used on their own (i.e., without the accompanying word(s)).
The concept of collocation was first introduced by Michael Halliday in his 1967 work "Lexicogrammatical Analysis". Since then, it has become an important area of research in corpus linguistics, as it allows researchers to identify and analyze patterns of word use that may not be immediately apparent through traditional linguistic analysis.
Collocations can provide valuable insights into language use, such as:
1. Informing the development of dictionaries and lexicons
2. Helping to explain regional or cultural differences in language use
3. Providing evidence for linguistic change over time
4. Enabling researchers to better understand how words are used in context
In corpus linguistics, collocations are often studied using statistical methods, such as frequency analysis and clustering algorithms, to identify patterns of word use that may not be immediately apparent through manual observation or traditional linguistic analysis.
Some examples of collocations include:
* "the big house"
* "run fast"
* "make mistakes"
* "get angry"
These combinations of words are commonly used together in English language corpora, and can provide insights into the language's syntax, semantics, and pragmatics.
Code
# Tibble — useful for storing results alongside metadata
df <- ollamar::generate("llama3.2",
"Name three open-source corpora of English.",
output = "df")
glimpse(df)
Rows: 1
Columns: 3
$ model <chr> "llama3.2"
$ response <chr> "Here are three open-source corpora of English:\n\n1. Corpu…
$ created_at <chr> "2026-03-29T07:47:18.3687287Z"
Code
# Columns: model, response, done, total_duration, ...
# JSON list — useful for programmatic parsing
jl <- ollamar::generate("llama3.2",
"What is TF-IDF?",
output = "jsonlist")
| output value | Returns | Best for |
|---|---|---|
| "resp" (default) | httr2 response object | Checking status codes; low-level access |
| "text" | Character string | Simple single-call workflows |
| "df" | Tibble with metadata | Storing results with timing and model info |
| "jsonlist" | Named R list | Programmatic access to all response fields |
| "raw" | Raw bytes | Advanced / debugging |
Writing Effective Prompts
The quality of the output depends heavily on how the prompt is written. Three principles are especially important:
Be specific about the task. A vague prompt produces a vague response. “Tell me about this text” is far less effective than “In one sentence, identify the main topic of the following text.”
Specify the output format. If you need structured output, say so explicitly: “Respond with only a single word: positive, negative, or neutral.” Without this, the model may explain its reasoning at length, which complicates downstream processing.
Provide context. The model knows nothing about your research project. Brief framing — “You are a corpus linguistics researcher analysing parliamentary debate transcripts” — shapes the register and vocabulary of the response.
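These three principles can be folded into a small prompt-building helper so that every call uses the same structure. The function below is an illustrative sketch (build_prompt is our own helper, not an ollamar function); the resulting string can be passed straight to generate().

```r
# Illustrative helper (not part of ollamar): assemble a prompt from the three
# ingredients above: optional context, a specific task, and an explicit
# output format (a fixed label set).
build_prompt <- function(task, text, labels = NULL, context = NULL) {
  parts <- c(
    context,   # optional framing, e.g. a persona
    task,      # the specific instruction
    if (!is.null(labels)) {
      paste0("Respond with exactly one word: ",
             paste(labels, collapse = ", "), ". Do not explain your answer.")
    },
    paste0("Text: '", text, "'")
  )
  paste(parts, collapse = " ")
}

p <- build_prompt(
  task    = "Classify the sentiment of the following text.",
  text    = "The bill was passed.",
  labels  = c("positive", "negative", "neutral"),
  context = "You are a corpus linguistics researcher."
)
cat(p)
```

Centralising the prompt text this way also makes batch workflows easier to audit: changing the instruction in one place changes it for every text.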
Code
# Vague prompt — produces a generic, hard-to-process response
ollamar::generate("llama3.2",
"What do you think about this sentence: 'The bill was passed.'",
output = "text")
[1] "A very straightforward and simple sentence! I think it's a well-structured sentence that conveys a clear meaning. It's a declarative sentence, stating a fact or event that has occurred.\n\nIn terms of grammatical correctness, the sentence is perfectly formed:\n\n* \"The\" is used as an article to modify the noun.\n* \"bill\" refers to the legislative proposal or document being discussed.\n* The verb \"was passed\" uses the present perfect tense to indicate that the action (passing) has occurred at some point in the past and still has a connection to the present.\n\nOverall, I think the sentence is clear, concise, and effectively conveys its intended meaning."
Code
# Specific prompt — produces a precise, processable response
ollamar::generate("llama3.2",
paste(
"You are a computational linguist.",
"Identify the grammatical voice (active or passive) of the following sentence.",
"Respond with exactly one word: active or passive.",
"Sentence: 'The bill was passed.'"
),
output = "text"
)
[1] "Passive."
Code
# Expected: "passive"
Q5. You call generate("llama3.2", "Summarise this corpus.", output = "text") and receive a lengthy explanation of what corpus summarisation is rather than a summary of your corpus. What is the most likely cause and how should you fix it?
Q6. You want to generate responses for 100 different prompts using generate() in a loop. A colleague suggests using output = "df" rather than output = "text". What advantage does the "df" format offer for a batch workflow?
Multi-Turn Chat
What you will learn: The difference between generate() and chat(); how conversation history is structured as a list of messages; how to use create_message(), append_message(), and related helpers; how system prompts shape model behaviour; and how to build a reusable chat loop
generate() vs chat()
generate() is stateless — each call is independent and the model has no memory of previous calls. chat() maintains conversation context by accepting a message history: a list of all previous turns (both user messages and assistant responses). This allows follow-up questions, iterative refinement, and role-playing scenarios where the model maintains a consistent persona across turns.
The message history is a list of named lists, each with two elements: role (one of "system", "user", or "assistant") and content (the message text):
list(
list(role = "system", content = "You are a helpful linguist."),
list(role = "user", content = "What is a hapax legomenon?"),
list(role = "assistant", content = "A hapax legomenon is a word that occurs only once..."),
list(role = "user", content = "Can you give me an example from Shakespeare?")
)
Creating and Managing Message History
ollamar provides helper functions so you never need to construct these lists manually:
Code
# create_message() — start a history with one message
# The second argument is the role; default is "user"
messages <- ollamar::create_message(
"You are a corpus linguistics expert. Give concise, precise answers.",
"system"
)
# append_message() — add a user turn
messages <- ollamar::append_message(
"What is the difference between type frequency and token frequency?",
"user",
messages
)
# Send to the model and get a response
resp <- ollamar::chat("llama3.2", messages, output = "df")
# The model's reply
cat(resp$content)
Token frequency refers to the number of times a specific word or form (including variants) appears in a text.
Type frequency, on the other hand, refers to the count of unique words or forms that appear at least once in a text. It is often used as an estimate of vocabulary size.
To continue the conversation, append the assistant’s reply and a new user message:
Code
# Append the model's reply to maintain context
messages <- ollamar::append_message(resp$content, "assistant", messages)
# Add the next user question
messages <- ollamar::append_message(
"How would I calculate type-token ratio in R?",
"user",
messages
)
# Continue the conversation
resp2 <- ollamar::chat("llama3.2", messages, output = "df")
cat(resp2$content)To calculate Type-Token Ratio (TTR) in R, you can use the following formula:
1. Count the number of unique words (`n_unique`) using `setdiff()` or `unique()`.
2. Count the total number of words (`n_total`) using `length()` or `sum()`.
3. Calculate TTR as `(n_unique / n_total) * 100` and display it.
Example code:
```r
text_data <- "your text data here"
# count unique words
n_unique <- length(setdiff(strsplit(text_data, "\\s+")[[1]], collapse = ""))
# count total words
n_total <- sum(nchar(gsub("\\s+", "", strsplit(text_data, "\\s+")[[1]])))
# calculate TTR
ttr <- (n_unique / n_total) * 100
print(paste("Type-Token Ratio:", round(ttr, 2), "%"))
```
Alternatively, you can use packages like `textstat` or `tm` for more efficient and convenient calculations:
```r
library(textstat)
text_data <- "your text data here"
ttr <- adjust_for_multiples(count_words(text_data)) * 100
print(paste("Type-Token Ratio:", round(ttr, 2), "%"))
```
Note: Make sure to preprocess your text data before calculating TTR.
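Model-generated code should always be checked before use: the reply above, for instance, misuses setdiff() and counts characters rather than words, and adjust_for_multiples() does not exist in any package we are aware of. For reference, a minimal correct type-token ratio calculation in base R looks like this (an illustrative sketch, not part of the model's output):

```r
# Minimal type-token ratio (TTR) in base R.
# Tokens = all word occurrences; types = distinct word forms.
text_data <- "the cat sat on the mat"
tokens <- tolower(unlist(strsplit(text_data, "\\s+")))
types  <- unique(tokens)
ttr <- length(types) / length(tokens)
round(ttr * 100, 2)  # 5 types / 6 tokens = 83.33
```

Comparing this against the model's attempt is itself a useful exercise in critically evaluating LLM-assisted code.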
System Prompts
A system prompt is a message with role = "system" placed at the start of the history. It sets the model’s persona, constraints, and output format for the entire conversation. System prompts are one of the most powerful tools for producing consistent, well-formatted output:
Code
# System prompt for a structured linguistic analysis assistant
sys_prompt <- paste(
"You are a linguistic analysis assistant specialising in corpus linguistics.",
"You always respond in plain text, without markdown formatting.",
"Your answers are concise and technically precise.",
"When asked to classify text, you respond with only the label — no explanation."
)
messages <- ollamar::create_message(sys_prompt, "system")
messages <- ollamar::append_message(
"Classify the register of this text: 'Pursuant to the provisions of section 42...'",
"user",
messages
)
ollamar::chat("llama3.2", messages, output = "text")
[1] "Formal/Technical"
Code
# Expected: "legal/formal"
A Reusable Chat Loop
For interactive use, you can build a simple loop that manages the conversation history automatically:
Code
# Initialise with a system prompt
messages <- ollamar::create_message(
"You are a helpful R programming assistant for linguists.",
"system"
)
# Simple interactive chat loop — run in RStudio console, not knitted document
chat_with_model <- function(model = "llama3.2") {
msgs <- ollamar::create_message(
"You are a helpful R programming and corpus linguistics assistant.",
"system"
)
cat("Chat started. Type 'quit' to exit.\n\n")
repeat {
user_input <- readline("You: ")
if (trimws(user_input) == "quit") { cat("Goodbye!\n"); break }
msgs <- ollamar::append_message(user_input, "user", msgs)
resp <- ollamar::chat(model, msgs, output = "df")
reply <- resp$content
msgs <- ollamar::append_message(reply, "assistant", msgs)
cat("Model:", reply, "\n\n")
}
}
chat_with_model()
ollamar provides a full set of message manipulation helpers:
- prepend_message() — add a message to the beginning of the history
- insert_message() — insert at a specific position (positive or negative index)
- delete_message() — remove a message at a specific position
- create_messages() — create a history with multiple messages at once
These are particularly useful when building complex multi-turn workflows where the conversation history needs to be edited programmatically.
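One such workflow is keeping a long conversation within the model's context window, which is a common reason later turns drift away from the system prompt. A typical remedy is to trim the history before each call while always retaining the system message. The helper below is an illustrative sketch (trim_history is our own function, not part of ollamar); it uses only base R, so it works on any history built with the helpers above.

```r
# Illustrative helper (not part of ollamar): keep the system prompt(s) plus
# only the most recent n_keep non-system turns.
trim_history <- function(msgs, n_keep = 6) {
  is_system <- vapply(msgs, function(m) identical(m$role, "system"), logical(1))
  c(msgs[is_system], utils::tail(msgs[!is_system], n_keep))
}

# Build a history with one system message and ten turns, then trim it
msgs <- c(list(list(role = "system", content = "sys")),
          lapply(1:10, function(i) {
            list(role = if (i %% 2 == 1) "user" else "assistant",
                 content = paste("turn", i))
          }))
trimmed <- trim_history(msgs, n_keep = 4)
length(trimmed)  # 5: the system message plus the last four turns
```

Calling messages <- trim_history(messages) before each chat() call keeps the persona instructions in scope without sending the entire transcript every turn.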
Q7. You are building a chat workflow for annotating linguistic data. After 10 turns, you notice the model has started ignoring your system prompt instructions. What is the most likely cause?
Q8. What is the key practical difference between using generate() and chat() for a sequence of related prompts about the same document?
Sentiment Analysis and Text Classification
What you will learn: How to use prompt engineering to turn a general-purpose LLM into a text classifier; how to enforce structured output; how to process a vector of texts and collect results; and how to evaluate classification output
Prompt-Based Classification
Rather than fine-tuning a model, LLMs can be directed to perform classification tasks through careful prompt design. The key principle is to ask for a constrained response — a single label from a defined set — rather than an open-ended answer.
Code
classify_sentiment <- function(text, model = "llama3.2") {
prompt <- paste(
"Classify the sentiment of the following text.",
"Respond with exactly one word: positive, negative, or neutral.",
"Do not explain your answer.",
paste0("Text: '", text, "'")
)
ollamar::generate(model, prompt, output = "text") |>
trimws() |>
tolower()
}
# Test on a single sentence
classify_sentiment("The results were surprisingly strong and exceeded all expectations.")
[1] "positive"
Code
# Expected: "positive"
classify_sentiment("The methodology is flawed and the conclusions are unwarranted.")
[1] "negative"
Code
# Expected: "negative"
Batch Classification
Apply the classifier to a vector of texts using purrr::map_chr():
Code
reviews <- tibble::tibble(
id = 1:6,
text = c(
"An outstanding contribution to the field — clear, rigorous, and insightful.",
"The paper is poorly structured and the argument is difficult to follow.",
"The study replicates previous findings without offering new theoretical insights.",
"A welcome addition to the literature on discourse coherence.",
"The sample size is too small to support the generalisations made.",
"The cross-linguistic comparison is both ambitious and well executed."
)
)
reviews <- reviews |>
dplyr::mutate(
sentiment = purrr::map_chr(text, classify_sentiment)
)
reviews |>
dplyr::select(id, sentiment, text) |>
flextable::flextable() |>
flextable::set_table_properties(width = .95, layout = "autofit") |>
flextable::theme_zebra() |>
flextable::fontsize(size = 10) |>
flextable::set_caption(caption = "Sentiment classification of academic review sentences using llama3.2.") |>
flextable::border_outer()
| id | sentiment | text |
|---|---|---|
| 1 | positive | An outstanding contribution to the field — clear, rigorous, and insightful. |
| 2 | negative | The paper is poorly structured and the argument is difficult to follow. |
| 3 | negative | The study replicates previous findings without offering new theoretical insights. |
| 4 | positive | A welcome addition to the literature on discourse coherence. |
| 5 | negative | The sample size is too small to support the generalisations made. |
| 6 | positive | The cross-linguistic comparison is both ambitious and well executed. |
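Because the model's output is free text, it is good practice to validate that every returned value actually belongs to the allowed label set before analysing the results. The snippet below is an illustrative sketch (check_labels is our own helper, not an ollamar function): anything outside the label set becomes NA and triggers a warning, so stray responses cannot silently skew frequency counts.

```r
# Illustrative helper (not part of ollamar): replace any response that is not
# an allowed label with NA and warn about how many were affected.
check_labels <- function(labels, allowed = c("positive", "negative", "neutral")) {
  bad <- !(labels %in% allowed)
  if (any(bad)) {
    warning(sum(bad), " response(s) fell outside the allowed label set")
  }
  ifelse(bad, NA_character_, labels)
}

check_labels(c("positive", "the sentiment is negative", "neutral"))
# returns "positive", NA, "neutral", plus a warning about 1 stray response
```

In a batch workflow, apply it once after classification, e.g. reviews$sentiment <- check_labels(reviews$sentiment), and inspect the NA rows by hand.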
Multi-Class Topic Classification
The same pattern extends to any classification scheme. Here we classify academic sentences by rhetorical function:
Code
classify_rhetorical <- function(text, model = "llama3.2") {
prompt <- paste(
"Classify the rhetorical function of the following academic sentence.",
"Choose exactly one label from: background, method, result, conclusion.",
"Respond with that single word only.",
paste0("Sentence: '", text, "'")
)
ollamar::generate(model, prompt, output = "text") |>
trimws() |>
tolower()
}
sentences <- c(
"Previous research has established a strong link between frequency and acceptability.",
"We collected data from 120 native speakers using an online survey platform.",
"The analysis revealed a significant effect of register on hedging frequency (p < .001).",
"These findings suggest that usage-based accounts require revision."
)
purrr::map_chr(sentences, classify_rhetorical)
[1] "conclusion" "method" "conclusion" "result"
Code
# Expected: c("background", "method", "result", "conclusion")
Q9. You run batch sentiment classification on 200 texts and find that about 15% of results contain extra words like “The sentiment is positive” instead of just “positive”. What prompt change would most reliably fix this?
Q10. You want to classify 1,000 newspaper headlines by topic (politics, economics, sport, culture, science) using a local LLM. A colleague suggests evaluating the classifier on 50 manually labelled headlines before using it on the full corpus. Why is this a good practice?
Named Entity Recognition
What you will learn: How to prompt a local LLM to identify and classify named entities; how to request structured JSON output for easier parsing; how to parse and process entity output in R; and the trade-offs between LLM-based NER and dedicated NER models
Prompting for NER
Named entity recognition asks the model to identify spans of text that refer to real-world objects and classify them by type (person, organisation, location, etc.). Requesting JSON output makes the response much easier to parse programmatically:
Code
extract_entities <- function(text, model = "llama3.2") {
prompt <- paste(
"Extract all named entities from the following text.",
"Return a JSON array where each element has two fields:",
"'entity' (the text span) and 'type' (one of: PERSON, ORG, LOC, DATE, MISC).",
"Return only the JSON array — no explanation, no markdown, no code block.",
paste0("Text: '", text, "'")
)
raw <- ollamar::generate(model, prompt, output = "text") |> trimws()
# Try to isolate just the JSON array in case the model added surrounding text
json_str <- stringr::str_extract(raw, "\\[.*\\]")
if (is.na(json_str)) json_str <- raw
result <- tryCatch(
jsonlite::fromJSON(json_str, simplifyDataFrame = TRUE),
error = function(e) {
warning("JSON parsing failed. Raw output was: ", raw)
NULL
}
)
# Validate that we got a proper data frame with the right columns
if (is.null(result) ||
!is.data.frame(result) ||
!all(c("entity", "type") %in% names(result))) {
return(tibble::tibble(entity = NA_character_, type = NA_character_))
}
tibble::as_tibble(result)
}
# Test on a news sentence
result <- extract_entities(
"Christine Lagarde met Rishi Sunak in London last Tuesday to discuss IMF reform."
)
result
# A tibble: 5 × 2
entity type
<chr> <chr>
1 Christine Lagarde PERSON
2 Rishi Sunak PERSON
3 IMF ORG
4 London LOC
5 Tuesday DATE
Code
# entity type
# 1 Christine Lagarde PERSON
# 2 Rishi Sunak PERSON
# 3 London LOC
# 4 last Tuesday DATE
# 5 IMF ORG
Corpus-Scale NER
Apply the extractor across a corpus and bind the results into a single data frame:
Code
news_corpus <- tibble::tibble(
doc_id = paste0("doc", 1:4),
text = c(
"The European Central Bank announced rate rises in Frankfurt, affecting markets across Germany and France.",
"Ursula von der Leyen met Joe Biden at the G7 summit in Hiroshima to discuss trade policy.",
"Oxford University published a landmark study on language acquisition in the journal Nature.",
"Amazon opened a new fulfilment centre near Manchester, creating 1,500 jobs in the region."
)
)
ner_results <- purrr::pmap_dfr(news_corpus, function(doc_id, text) {
ents <- extract_entities(text)
if (nrow(ents) > 0 && !is.na(ents$entity[1])) {
dplyr::mutate(ents, doc_id = doc_id)
} else {
tibble::tibble(entity = NA_character_, type = NA_character_, doc_id = doc_id)
}
})
ner_results# A tibble: 4 × 3
entity type doc_id
<chr> <chr> <chr>
1 NA NA doc1
2 NA NA doc2
3 NA NA doc3
4 NA NA doc4
LLM-Based NER vs Dedicated Models
LLM-based NER via prompting has different trade-offs compared to dedicated NER models (such as those available through udpipe or the BERT-based models in the BERT/RoBERTa tutorial):
| Property | LLM (Ollama) | Dedicated NER model |
|---|---|---|
| Setup complexity | Low — no fine-tuning | Low — pre-trained weights |
| Speed | Slow (seconds per text) | Fast (milliseconds per text) |
| Customisability | High — change entity types in prompt | Low — fixed to training categories |
| Output consistency | Variable — JSON parsing can fail | Consistent structured output |
| Domain adaptation | Easy — describe domain in prompt | Requires fine-tuning |
| Best for | Flexible exploration, novel entity types | Production pipelines, large corpora |
For corpora of thousands of documents, dedicated models are far more practical. LLM-based NER is most useful when you need non-standard entity types, when the domain is unusual, or when you are exploring a new task before committing to a heavier infrastructure.
Q11. You run the extract_entities() function on 50 texts and find that about 10% return a JSON parsing error. What are two good defensive coding strategies to handle this?
Text Summarisation
What you will learn: How to prompt for extractive and abstractive summaries; how to control summary length and format; how to apply summarisation at corpus scale; and how to compare model output across different prompt formulations
Single-Document Summarisation
Summarisation is one of the tasks where local LLMs perform most reliably. The key prompt design decisions are specifying the desired length, the target audience, and whether the summary should be extractive (drawn verbatim from the source) or abstractive (paraphrased in the model’s own words):
Code
summarise_text <- function(text,
n_sentences = 3,
model = "llama3.2") {
prompt <- paste(
paste0("Summarise the following text in exactly ", n_sentences, " sentences."),
"Write in plain academic prose. Do not use bullet points.",
"Do not include any preamble — begin immediately with the summary.",
paste0("\n\nText:\n", text)
)
ollamar::generate(model, prompt, output = "text") |> trimws()
}
# Darwin abstract (illustrative)
darwin_passage <- paste(
"The struggle for existence amongst all organic beings throughout the world,",
"which inevitably follows from their high geometrical powers of increase,",
"will be treated of. This is the doctrine of Malthus, applied to the whole",
"animal and vegetable kingdoms. As many more individuals of each species are",
"born than can possibly survive, and as, consequently, there is a frequently",
"recurring struggle for existence, it follows that any being, if it vary",
"however slightly in any manner profitable to itself, will have a better",
"chance of surviving and thus be naturally selected."
)
cat(summarise_text(darwin_passage, n_sentences = 2))The struggle for existence among all organic beings worldwide arises from their high reproductive capacities, which lead to an overwhelming number of individuals born compared to those that can survive. This disparity results in a recurring competition for survival, favoring individuals who exhibit variations beneficial to themselves, thereby increasing their chances of surviving and being naturally selected.
Varying Summary Length
Code
# Compare summaries at different lengths
lengths <- c(1, 2, 3)
summaries <- purrr::map_chr(
lengths,
~ summarise_text(darwin_passage, n_sentences = .x)
)
# Display side by side
tibble::tibble(
n_sentences = lengths,
summary = summaries
) |>
flextable::flextable() |>
flextable::set_table_properties(width = .95, layout = "autofit") |>
flextable::theme_zebra() |>
flextable::fontsize(size = 10) |>
flextable::set_caption(caption = "Same passage summarised at 1, 2, and 3 sentences.") |>
flextable::border_outer()n_sentences | summary |
|---|---|
1 | The concept of natural selection, as articulated by Malthus, posits that the struggle for existence among all organic beings worldwide is driven by an inevitable disparity between birth rates and survival capabilities, resulting in the advantageous adaptation being more likely to survive and reproduce. |
2 | The doctrine proposed by Thomas Malthus posits that the struggle for existence among all organic beings worldwide is inevitable due to their high reproductive capabilities. As a result, individuals with minor variations that confer advantages to themselves are more likely to survive and be naturally selected, leading to increased chances of reproduction. |
3 | The struggle for existence amongst all organic beings throughout the world is a doctrine derived from Thomas Malthus' theory, which asserts that the high geometrical powers of increase among living organisms lead to an overwhelming abundance of individuals. As a result, most species face intense competition for survival, necessitating variations in form or function that confer a selective advantage. Those individuals exhibiting beneficial traits are more likely to survive and reproduce, thereby becoming naturally selected over their less adapted peers. |
Corpus-Scale Summarisation
For a corpus of documents, apply the function across rows and store results:
Code
# Illustrative corpus of five abstracts
abstracts <- tibble::tibble(
paper_id = paste0("P", 1:5),
abstract = c(
"This study investigates the frequency distribution of hedging devices in spoken and written academic English using a corpus of 500,000 words drawn from lectures and journal articles...",
"We present a computational model of lexical alignment in dialogue, trained on the British National Corpus...",
"The paper examines the development of grammaticalisation in Old English modal verbs using diachronic corpus data spanning the 7th to 12th centuries...",
"Using eye-tracking methodology, we investigate how readers process garden-path sentences in English and German...",
"This paper reports on a large-scale survey of attitudes towards language change among speakers of Irish English in three urban centres..."
)
)
abstracts_summarised <- abstracts |>
dplyr::mutate(
summary = purrr::map_chr(
abstract,
~ summarise_text(.x, n_sentences = 1)
)
)
abstracts_summarised |>
dplyr::select(paper_id, summary) |>
flextable::flextable() |>
flextable::set_table_properties(width = .95, layout = "autofit") |>
flextable::theme_zebra() |>
flextable::fontsize(size = 10) |>
flextable::set_caption(caption = "One-sentence summaries generated by llama3.2.") |>
flextable::border_outer()
paper_id | summary |
|---|---|
P1 | A corpus-based analysis of 500,000 words of academic English reveals the frequency distribution of hedging devices in both spoken and written forms. |
P2 | A computational model of lexical alignment in dialogue has been developed and trained on the British National Corpus. |
P3 | This study investigates the evolution of grammaticalization in Old English modal verbs through a longitudinal analysis of linguistic corpora from the 7th to 12th centuries. |
P4 | Researchers employed eye-tracking methodology to study how English and German-speaking readers process garden-path sentences, a linguistic phenomenon where syntactic ambiguities lead to temporary confusion during comprehension. |
P5 | A large-scale survey was conducted to investigate attitudes towards language change among speakers of Irish English in three urban centers, providing insight into societal perceptions of linguistic evolution. |
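When summaries occasionally arrive with a preamble despite the prompt's instructions, a light post-processing step is often the simplest remedy. The helper below is an illustrative sketch: the regular expression is an assumption about the phrasing llama3.2 tends to produce ("Here is a ... summary:") and should be extended as you observe other variants in your own output:

```r
# Hypothetical post-processing helper: strip a "Here is a ... summary:"
# style preamble if present; leave other output untouched.
strip_preamble <- function(x) {
  sub("^\\s*Here is[^:]*:\\s*", "", x, ignore.case = TRUE)
}
strip_preamble("Here is a one-sentence summary: Frequency shapes grammar.")
# "Frequency shapes grammar."
strip_preamble("Frequency shapes grammar.")
# unchanged: "Frequency shapes grammar."
```
Applying such a function with purrr::map_chr() after generation is cheaper and more reliable than re-running the model on failed cases.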
Q12. You summarise 200 abstracts and find that about 20% of summaries begin with “Here is a one-sentence summary:” despite your prompt saying “Do not include any preamble.” What is the most robust fix?
Generating Embeddings
What you will learn: What embed() does and when to use it; how to extract numeric embedding vectors; how to compute cosine similarity between embeddings; and how to apply embeddings to semantic grouping and nearest-neighbour search
What Are Embeddings?
The embed() function sends text to the model and returns a numeric vector (the embedding) rather than generated text. An embedding is a fixed-length representation of meaning in a high-dimensional space: texts with similar meaning produce vectors that are close together (high cosine similarity); texts with unrelated meaning produce vectors that are far apart.
Ollama can produce embeddings using models specifically optimised for the task. nomic-embed-text is a widely used embedding model that is fast, small (~270 MB), and produces high-quality 768-dimensional embeddings:
Code
# Pull the embedding model (one-time download, ~270 MB)
ollamar::pull("nomic-embed-text")
# Generate an embedding for a single sentence
emb <- ollamar::embed("nomic-embed-text", "Corpus linguistics studies language in use.")
length(emb[, 1]) # 768 dimensions
Not all Ollama models support embeddings. Use nomic-embed-text or mxbai-embed-large for embedding tasks — do not use generation models like llama3.2 for embeddings, as their output will be of lower quality. Conversely, embedding models cannot generate text. Pull the right model for the right task.
Cosine Similarity
Code
# Cosine similarity function
cosine_sim <- function(a, b) {
sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}
sentences <- c(
"Frequency effects are central to usage-based theories of grammar.",
"Usage-based linguistics emphasises the role of input frequency in acquisition.",
"The morphosyntax of Swahili noun class agreement has been extensively studied.",
"Corpus data reveal robust collocational preferences in academic writing.",
"Token frequency shapes the entrenchment of linguistic constructions."
)
embeddings <- purrr::map(
sentences,
~ ollamar::embed("nomic-embed-text", .x)[, 1]
)
# Compute pairwise cosine similarity matrix
n <- length(embeddings)
sim <- matrix(0, n, n, dimnames = list(paste0("S", 1:n), paste0("S", 1:n)))
for (i in seq_len(n)) for (j in seq_len(n)) {
sim[i, j] <- cosine_sim(embeddings[[i]], embeddings[[j]])
}
round(sim, 3) S1 S2 S3 S4 S5
S1 1.000 0.691 0.600 0.597 0.662
S2 0.691 1.000 0.665 0.658 0.801
S3 0.600 0.665 1.000 0.643 0.639
S4 0.597 0.658 0.643 1.000 0.644
S5 0.662 0.801 0.639 0.644 1.000
Expected: S1, S2, and S5 (all about frequency and usage-based linguistics) should show the highest mutual similarities — here S2–S5 reaches 0.801, though absolute values vary by model. S4 (collocations) should be moderately similar to them. S3 (Swahili morphosyntax) should be the least similar to the others.
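Beyond pairwise inspection, a similarity matrix like this can feed standard clustering routines for semantic grouping. The sketch below uses base R's hclust() and cutree(); a toy 3 × 3 matrix stands in for the sim matrix computed above so the snippet runs on its own:

```r
# Semantic grouping sketch: hierarchical clustering on a cosine-similarity
# matrix. The toy matrix below is illustrative; with real data, pass the
# similarity matrix computed from your embeddings.
toy_sim <- matrix(c(1.0, 0.8, 0.2,
                    0.8, 1.0, 0.3,
                    0.2, 0.3, 1.0),
                  nrow = 3,
                  dimnames = list(paste0("S", 1:3), paste0("S", 1:3)))
# Convert similarity to distance (1 - similarity) and cluster
clustering <- hclust(as.dist(1 - toy_sim), method = "average")
groups <- cutree(clustering, k = 2)
groups
# S1 S2 S3
#  1  1  2
```
Here S1 and S2 (similarity 0.8) fall into one group and S3 into another — the same logic scales to grouping hundreds of sentences by topic.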
Nearest-Neighbour Search
Embeddings support semantic search: given a query, find the most similar texts in a corpus:
Code
# Simple nearest-neighbour search
semantic_search <- function(query, corpus_texts, corpus_embeddings, top_n = 3) {
query_emb <- ollamar::embed("nomic-embed-text", query)[, 1]
sims <- purrr::map_dbl(
corpus_embeddings,
~ cosine_sim(query_emb, .x)
)
tibble::tibble(
text = corpus_texts,
similarity = sims
) |>
dplyr::arrange(dplyr::desc(similarity)) |>
head(top_n)
}
# Find the sentences most similar to a query
semantic_search(
query = "How does experience shape language knowledge?",
corpus_texts = sentences,
corpus_embeddings = embeddings,
top_n = 3
)# A tibble: 3 × 2
text similarity
<chr> <dbl>
1 Usage-based linguistics emphasises the role of input frequency in … 0.706
2 Token frequency shapes the entrenchment of linguistic construction… 0.638
3 Frequency effects are central to usage-based theories of grammar. 0.591
Q13. You use embed() with llama3.2 (a generation model) instead of nomic-embed-text and find the similarity scores are much lower and less meaningful. Why?
Corpus-Scale Batch Processing
What you will learn: Why LLM inference is slow and what determines speed; how to process a corpus sequentially with progress tracking; how to use httr2’s parallel request functionality for speedup; and practical strategies for managing large-scale processing jobs
Why Batch Processing Requires Care
Unlike fast string-processing functions, each generate() or chat() call involves running a neural network — a process that takes seconds per text even on modern hardware. A corpus of 1,000 texts at 3 seconds each takes roughly 50 minutes sequentially. Three strategies reduce this:
Parallelisation — ollamar integrates with httr2’s req_perform_parallel() to issue multiple requests simultaneously, which can substantially reduce total time within the limits of your hardware.
Batching — grouping short texts together in a single prompt reduces the number of API calls.
Model selection — smaller, faster models (3B) are appropriate for simple tasks; reserve larger models for tasks where quality is critical.
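The batching strategy can be sketched as follows: several numbered texts are packed into a single prompt, and the one reply is parsed line by line. The model call is mocked here with a fixed string so the snippet is self-contained — in practice, mock_reply would come from ollamar::generate("llama3.2", prompt, output = "text"), and the label format is an assumption the prompt must enforce:

```r
# Batching sketch: classify several short texts with one request.
texts <- c("Great results.", "Flawed method.", "No difference found.")
prompt <- paste(
  "Classify each numbered sentence as positive, negative, or neutral.",
  "Reply with one line per sentence, formatted '<number>: <label>'.",
  paste(sprintf("%d. %s", seq_along(texts), texts), collapse = "\n"),
  sep = "\n"
)
# Stand-in for the model's reply (one line per input text)
mock_reply <- "1: positive\n2: negative\n3: neutral"
# Parse the numbered reply back into a label per text
labels <- sub("^\\d+:\\s*", "", strsplit(trimws(mock_reply), "\n")[[1]])
labels
# [1] "positive" "negative" "neutral"
```
Batching trades fewer calls for a harder parsing problem: if the model drops or merges a line, the labels no longer align with the inputs, so validate that length(labels) == length(texts) before binding results.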
Sequential Processing with Progress
For modest corpora (up to a few hundred texts), sequential processing with a progress indicator is the simplest approach:
Code
classify_batch_sequential <- function(texts, model = "llama3.2") {
n <- length(texts)
results <- character(n)
for (i in seq_along(texts)) {
cat(sprintf("Processing %d of %d...\r", i, n))
results[i] <- classify_sentiment(texts[i], model)
Sys.sleep(0.1) # small pause to avoid overwhelming the local server
}
cat("\nDone.\n")
results
}
Parallel Processing with httr2
For larger corpora, ollamar supports parallelisation by building request objects first (output = "req") and then executing them all simultaneously with httr2::req_perform_parallel():
Code
library(httr2)
texts_to_classify <- c(
"The results confirm the central hypothesis and extend previous findings.",
"The methodology contains several unacknowledged limitations.",
"No significant difference was found between the two groups.",
"This work represents a major advance in our understanding of acquisition.",
"The conclusions are not supported by the data presented."
)
# Step 1: Build a system prompt (shared across all requests)
sys_msg <- ollamar::create_message(
paste(
"You classify academic sentences by sentiment.",
"Respond with exactly one word: positive, negative, or neutral."
),
"system"
)
# Step 2: Create a list of httr2_request objects — one per text
reqs <- lapply(texts_to_classify, function(txt) {
msgs <- ollamar::append_message(txt, "user", sys_msg)
ollamar::chat("llama3.2", msgs, output = "req")
})
# Step 3: Execute all requests in parallel
resps <- httr2::req_perform_parallel(reqs)
# Step 4: Extract results
results <- dplyr::bind_rows(
lapply(resps, ollamar::resp_process, "df")
)
tibble::tibble(
text = texts_to_classify,
sentiment = trimws(tolower(results$content))
)# A tibble: 5 × 2
text sentiment
<chr> <chr>
1 The results confirm the central hypothesis and extend previous find… positive
2 The methodology contains several unacknowledged limitations. negative
3 No significant difference was found between the two groups. neutral
4 This work represents a major advance in our understanding of acquis… positive
5 The conclusions are not supported by the data presented. negative
Ollama processes requests on your local hardware. Issuing 50 simultaneous requests does not make your laptop 50× faster — it will saturate your CPU or GPU and may actually slow down individual responses or cause timeouts. In practice, 2–4 parallel requests is a sensible limit for a laptop CPU; you can cap concurrency with the max_active argument of httr2::req_perform_parallel(). Experiment to find the sweet spot for your hardware.
Saving and Resuming Large Jobs
For very large corpora, saving results incrementally prevents losing progress if the session is interrupted:
Code
process_corpus_with_checkpointing <- function(texts, ids,
output_file = "results.rds",
model = "llama3.2") {
# Load existing results if a checkpoint exists
if (file.exists(output_file)) {
existing <- readRDS(output_file)
done_ids <- existing$id
cat("Resuming from checkpoint:", nrow(existing), "texts already processed.\n")
} else {
existing <- tibble::tibble(id = character(), text = character(), result = character())
done_ids <- character()
}
# Process only texts not yet done
todo_idx <- which(!ids %in% done_ids)
cat("Remaining:", length(todo_idx), "texts.\n")
for (i in todo_idx) {
result <- classify_sentiment(texts[i], model)
new_row <- tibble::tibble(id = ids[i], text = texts[i], result = result)
existing <- dplyr::bind_rows(existing, new_row)
# Save checkpoint every 10 texts
if (i %% 10 == 0) saveRDS(existing, output_file)
}
saveRDS(existing, output_file)
existing
}
Q14. You are processing 500 newspaper articles with a local LLM and want to use parallel requests. You try 20 parallel requests at once and find the total processing time is actually longer than sequential processing. What is the most likely explanation?
Using a Local LLM to Help Write R Code
What you will learn: How to prompt a local LLM to generate R code; how to ask for code explanations and debugging help; how to use the model as a programming assistant within an R workflow; and the limitations and best practices for LLM-assisted coding
Why a Local LLM for R Help?
Commercial AI coding assistants (GitHub Copilot, ChatGPT) are excellent but require internet access and send your code to external servers. A local LLM provides a privacy-preserving coding assistant that works offline and never transmits your proprietary scripts or data descriptions to third parties.
For R-specific tasks, a well-prompted model can:
- Generate boilerplate code — read a CSV, reshape a data frame, run a t-test
- Explain unfamiliar functions — “What does
purrr::reduce()do?” - Debug error messages — paste the error and the code that produced it
- Suggest improvements — “How can I make this loop more idiomatic in R?”
- Write
dplyrorggplot2pipelines — describe what you need, get working code
Setting Up an R Coding Assistant
Code
# System prompt for an R coding assistant
r_assistant_sys <- paste(
"You are an expert R programmer specialising in data science, corpus linguistics,",
"and natural language processing. You write clean, idiomatic R code using the",
"tidyverse (dplyr, purrr, ggplot2, stringr) and base R where appropriate.",
"When asked to write code: provide only the R code, no prose explanation unless asked.",
"When asked to explain code: be concise and precise.",
"When debugging: identify the error cause first, then provide the fixed code."
)
# Helper function for one-off coding questions
ask_r <- function(question, model = "llama3.2") {
msgs <- ollamar::create_message(r_assistant_sys, "system")
msgs <- ollamar::append_message(question, "user", msgs)
ollamar::chat(model, msgs, output = "text") |> trimws()
}Code Generation Examples
Code
# Generate a ggplot2 visualisation
ask_r("Write R code to create a bar chart showing word frequency from a character vector called 'words', using ggplot2. Show the top 20 words, with bars sorted by frequency.")[1] "```r\nlibrary(ggplot2)\nlibrary(tidyverse)\n\nwords <- c(\"the\", \"of\", \"and\", \"a\", \"in\", \"that\", \"is\", \"for\", \"it\", \n \"with\", \"as\", \"on\", \"at\", \"by\", \"from\", \"they\", \"this\", \n \"have\", \"or\", \"but\")\n\ndf <- as_tibble(words) %>%\n count(value, sort = TRUE) %>%\n arrange(desc(n)) %>%\n head(20)\n\nggplot(df, aes(x = value, y = n)) +\n geom_bar(stat = \"identity\") +\n coord_flip() +\n labs(title = \"Word Frequency\", x = \"\", y = \"\") +\n theme_minimal()\n```"
Code
# Explain an unfamiliar function
ask_r("Explain what purrr::accumulate() does and give a simple example relevant to text processing.")[1] "`purrr::accumulate()` applies a function to each element of a list (or vector in base R) cumulatively from left to right, so as to reduce the list to a single output value.\n\nHere's an example using text processing:\n\n```r\nlibrary(dplyr)\n\n# Sample data frame with word frequencies\ndf <- tibble(\n Word = c(\"hello\", \"world\", \"hello\", \"again\"),\n Count = c(1, 1, 2, 3)\n)\n\n# Count the total occurrences of each unique word using accumulate()\nresult <- df %>% group_by(Word) %>% arrange(desc(Count)) %>% \n summarise(Total = accumulate(.x = Count, fun = sum))\n\nprint(result)\n```\n\nOutput:\n\n```r\n Word Total\n1 world 2\n2 hello 3\n```\nIn this example, we group by each unique word and calculate the total occurrences using `accumulate()`."
Code
# Debug an error
error_context <- paste(
"I get this error:",
"Error in UseMethod('select'): no applicable method for 'select' applied to 'character'",
"From this code:",
"result <- my_text |> select(word)"
)
ask_r(error_context)[1] "```r\nresult <- data.frame(word = str_extract(my_text, '\\\\w+'))\n```"
Code
# An ideal response would explain that select() is a dplyr verb for data
# frames, not character vectors, before suggesting a fix; the model above
# skipped the diagnosis and returned only replacement code.
# Get a complete data processing pipeline
ask_r(paste(
"Write a complete R pipeline that:",
"1. Reads a CSV file called 'corpus.csv' with columns 'doc_id' and 'text'",
"2. Tokenises the text column into words using tidytext::unnest_tokens()",
"3. Removes stop words using tidytext's stop_words data",
"4. Counts word frequency per document",
"5. Returns a tibble sorted by frequency descending"
))[1] "```r\nlibrary(tidytext)\nlibrary(dplyr)\n\npipeline <- corpus %>%\n unnest_tokens(word, text) %>%\n anti_join(stop_words, by = c(\"word\", \"lemma\")) %>%\n count(doc_id, word, sort = desc(n)) %>%\n arrange(desc(n)) %>%\n pivot_wider(name_from = word, id_cols = doc_id)\n```"
Stateful Coding Session
For a sustained coding session where you want the model to remember earlier code and context:
Code
# Start a persistent coding session
msgs <- ollamar::create_message(r_assistant_sys, "system")
# Turn 1: Ask for initial code
msgs <- ollamar::append_message(
"Write a function called read_corpus() that reads all .txt files from a folder and returns a tibble with columns doc_id and text.",
"user", msgs
)
resp1 <- ollamar::chat("llama3.2", msgs, output = "df")
msgs <- ollamar::append_message(resp1$content, "assistant", msgs)
cat(resp1$content)```r
read_corpus <- function(folder_path) {
corpus_data <- tibble(doc_id = rep(1:nrow(getDirFiles(folder_path)[[2]]), each = 1),
text = getDirFiles(folder_path)[[2]])
return(corpus_data)
}
getDirFiles <- function(path) {
files <- dir.path(path, pattern="*.txt", full.names = TRUE)
nfiles <- length(files)
names(dir.list(path)) <- c(nfile = nfiles)
return(list(file_names = names(dir.list(path)), file_path = path))
}
```
Code
# Turn 2: Ask for an improvement — model remembers the function
msgs <- ollamar::append_message(
"Now add error handling: if the folder does not exist, print a clear message and return NULL.",
"user", msgs
)
resp2 <- ollamar::chat("llama3.2", msgs, output = "df")
cat(resp2$content)```r
read_corpus <- function(folder_path) {
dir_list <- dir.path(folder_path)
if (is.null(dir_list)) {
stop("The specified folder does not exist.")
return(NULL)
}
files <- dir(path = folder_path, pattern="*.txt", full.names = TRUE)
nfiles <- length(files)
corpus_data <- tibble(doc_id = rep(1:nrow(files), each = 1),
text = readLines(files))
return(corpus_data)
}
```
A local LLM will sometimes produce R code that looks plausible but contains errors — deprecated function names, incorrect argument names, or logic that does not match the stated intent. Always run generated code in a test environment and verify the output before using it in a real analysis. The model is a first-draft assistant, not an infallible oracle.
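One cheap safeguard is to check that generated code is at least syntactically valid before inspecting it further. The helper below is an illustrative sketch using base R's parse(); note that parsing catches syntax errors only, not deprecated functions, nonexistent helpers (like the getDirFiles() above), or logical mistakes:

```r
# Check whether a string of model-generated R code parses.
# Parsing catches syntax errors but not wrong APIs or faulty logic.
code_parses <- function(code_string) {
  tryCatch({
    parse(text = code_string)
    TRUE
  }, error = function(e) FALSE)
}
code_parses("x <- c(1, 2, 3)")  # TRUE
code_parses("x <- c(1, 2,")     # FALSE: incomplete expression
```
A parse check like this can be run automatically over every code block the model returns in a long session, flagging obviously broken output before you spend time reading it.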
Q15. You ask the model to write a function that reads a corpus and the code it produces uses read.csv() with stringsAsFactors = TRUE (the old R 3.x default). What does this tell you about LLM-generated code, and what should you do?
Q16. You are using the local LLM to help debug a complex purrr::map() pipeline that processes proprietary survey data. A colleague suggests you use ChatGPT instead because it is a better model. What is the strongest argument for continuing to use the local model?
Model Management
What you will learn: How to pull, list, copy, and delete models; how to inspect model metadata; how to choose the right model for a task; and an overview of the model ecosystem available through Ollama
Core Model Management Functions
Code
# List downloaded models
ollamar::list_models()
# Pull a new model (downloads from Ollama library)
ollamar::pull("llama3.2") # 3B generation model, ~2 GB
ollamar::pull("nomic-embed-text") # embedding model, ~270 MB
ollamar::pull("llama3.1") # 8B model, ~4.7 GB (optional)
# Show detailed information about a model
ollamar::show("llama3.2")
# Returns: model parameters, context length, quantisation, architecture
# Copy a model under a new name (useful for creating custom variants)
ollamar::copy("llama3.2", "llama3.2-linguistics")
# List models currently loaded in memory
ollamar::ps()
# Delete a model (frees disk space)
ollamar::delete("llama3.2-linguistics")
Recommended Models by Task
| Task | Recommended model | RAM required | Notes |
|---|---|---|---|
| Quick generation / prototyping | llama3.2 (3B) | 4–8 GB | Fast; good for teaching and exploration |
| Production classification / NER | llama3.2 (3B) or llama3.1 (8B) | 4–16 GB | Test on labelled sample before full deployment |
| High-quality summarisation | llama3.1 (8B) | 8–16 GB | Significantly better output than 3B for long texts |
| Sentence embeddings | nomic-embed-text | 4 GB | Do not use generation models for embeddings |
| Multilingual tasks | llama3.1 or aya:8b | 8 GB | aya is Cohere's model; strong cross-lingual performance |
| Code generation | codellama or llama3.2 | 4–8 GB | codellama is specialised for code; better than general models |
Summary and Further Reading
This tutorial has introduced Ollama and the ollamar R package as a tool for running large language models locally, covering eight practical NLP workflows for corpus linguistics and language research.
Section 1 established the case for local LLMs: privacy for sensitive data, cost predictability for large corpora, and reproducibility from fixed model weights. It introduced the Ollama architecture (a local REST server at 127.0.0.1:11434) and the ollamar package as its R interface.
Section 2 covered setup: installing Ollama, pulling models, installing ollamar, and verifying the connection with test_connection() and list_models().
Section 3 introduced generate() for single-prompt text generation, the five output formats ("resp", "text", "df", "jsonlist", "raw"), and the principles of effective prompt engineering: specificity, constrained output format, and contextual framing.
Section 4 covered multi-turn conversation with chat() and the message history system. It introduced create_message(), append_message(), system prompts, and the full set of message management helpers (prepend_message(), insert_message(), delete_message()).
Section 5 demonstrated prompt-based text classification for sentiment analysis and rhetorical function labelling, with a wrapper function, batch processing via purrr::map_chr(), and the importance of evaluating classifiers on a gold standard before deploying on the full corpus.
Section 6 showed named entity recognition by prompting for JSON output, parsing with jsonlite::fromJSON(), and processing a corpus. It compared LLM-based NER with dedicated models for different use cases.
Section 7 covered text summarisation with length control and corpus-scale application. It discussed post-processing and few-shot prompting as remedies for common formatting failures.
Section 8 introduced embed() for generating sentence embeddings with nomic-embed-text, cosine similarity computation, and nearest-neighbour semantic search.
Section 9 addressed corpus-scale batch processing: sequential processing with progress tracking, parallel processing with httr2::req_perform_parallel(), hardware saturation limits, and checkpoint-based resumption of large jobs.
Section 10 demonstrated using a local LLM as a privacy-preserving R coding assistant for code generation, explanation, debugging, and sustained coding sessions with conversation history.
Section 11 surveyed model management functions and provided a task-by-model recommendation table.
Further reading: The ollamar package is documented at hauselin.github.io/ollama-r. The Ollama model library is at ollama.com/library. Lin and Safi (2025) is the primary package citation. For prompt engineering principles, see White et al. (2023). For a highly recommended chapter on the use of transformer-based models in computational linguistics and digital humanities, see Schneider (2024). For the broader landscape of open-source LLMs, see Touvron, Martin, et al. (2023) and Touvron, Lavril, et al. (2023).
Citation & Session Info
Martin Schweinberger. 2026. Local Large Language Models in R with Ollama. The Language Technology and Data Analysis Laboratory (LADAL), The University of Queensland, Australia. url: https://ladal.edu.au/tutorials/ollama/ollama.html (Version 2026.03.27), doi: .
@manual{martinschweinberger2026local,
author = {Martin Schweinberger},
title = {Local Large Language Models in R with Ollama},
year = {2026},
note = {https://ladal.edu.au/tutorials/ollama/ollama.html},
organization = {The Language Technology and Data Analysis Laboratory (LADAL), The University of Queensland, Australia},
edition = {2026.03.27},
doi = {}
}
Code
sessionInfo()R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
time zone: Australia/Brisbane
tzcode source: internal
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] httr2_1.2.2 flextable_0.9.11 ggplot2_4.0.2 stringr_1.6.0
[5] tibble_3.3.1 purrr_1.2.1 dplyr_1.2.0 ollamar_1.2.2
[9] checkdown_0.0.13
loaded via a namespace (and not attached):
[1] utf8_1.2.6 rappdirs_0.3.3 generics_0.1.4
[4] tidyr_1.3.2 fontLiberation_0.1.0 renv_1.1.7
[7] xml2_1.3.6 stringi_1.8.7 digest_0.6.39
[10] magrittr_2.0.4 evaluate_1.0.5 grid_4.4.2
[13] RColorBrewer_1.1-3 fastmap_1.2.0 jsonlite_2.0.0
[16] zip_2.3.2 BiocManager_1.30.27 scales_1.4.0
[19] fontBitstreamVera_0.1.1 codetools_0.2-20 textshaping_1.0.0
[22] cli_3.6.5 rlang_1.1.7 fontquiver_0.2.1
[25] crayon_1.5.3 litedown_0.9 commonmark_2.0.0
[28] withr_3.0.2 yaml_2.3.10 gdtools_0.5.0
[31] tools_4.4.2 officer_0.7.3 uuid_1.2-1
[34] curl_7.0.0 vctrs_0.7.2 R6_2.6.1
[37] lifecycle_1.0.5 htmlwidgets_1.6.4 ragg_1.5.1
[40] pkgconfig_2.0.3 pillar_1.11.1 gtable_0.3.6
[43] glue_1.8.0 data.table_1.17.0 Rcpp_1.1.1
[46] systemfonts_1.3.1 xfun_0.56 tidyselect_1.2.1
[49] rstudioapi_0.17.1 knitr_1.51 farver_2.1.2
[52] patchwork_1.3.0 htmltools_0.5.9 rmarkdown_2.30
[55] compiler_4.4.2 S7_0.2.1 askpass_1.2.1
[58] markdown_2.0 openssl_2.3.2
This tutorial was re-developed with the assistance of Claude (claude.ai), a large language model created by Anthropic. Claude was used to help revise the tutorial text, structure the instructional content, generate the R code examples, and write the checkdown quiz questions and feedback strings. All content was reviewed, edited, and approved by the author (Martin Schweinberger), who takes full responsibility for the accuracy and pedagogical appropriateness of the material. The use of AI assistance is disclosed here in the interest of transparency and in accordance with emerging best practices for AI-assisted academic content creation.