Practical Overview of Text Analytics Methods

Author

Martin Schweinberger

Welcome to Text Analytics!

What You’ll Learn

By the end of this tutorial, you will understand and be able to apply:

  • Concordancing: Finding words in context (KWIC displays)
  • Word frequency analysis: Identifying patterns through frequency
  • Collocation analysis: Discovering words that travel together
  • Keyword analysis: Finding distinctive vocabulary
  • Text classification: Automatically categorizing texts
  • Part-of-speech tagging: Understanding grammatical structure
  • Named entity recognition: Extracting people, places, organizations
  • Dependency parsing: Visualizing sentence structure

Each method includes theory, R code examples, and practical exercises!

What is Text Analytics?

Text analytics (also called text mining or computational text analysis) refers to the computer-based analysis of language data—the (semi-)automated extraction of information, patterns, and insights from text (Bernard and Ryan 1998; Kabanoff 1997; Popping 2000).

Why Text Analytics Matters

The challenge of scale:
- Modern research often involves massive text collections
- Reading 1,000 documents manually is impractical
- Patterns invisible to human readers emerge computationally
- Systematic, reproducible analysis becomes possible

Real-world applications:
- Business: Customer feedback analysis, market research
- Academia: Literature analysis, historical research, linguistic studies
- Government: Policy analysis, public opinion monitoring
- Journalism: Investigating document leaks, analyzing political discourse
- Legal: Contract analysis, case law research

The Text Analytics Toolkit

Most text analytics applications build upon a relatively small set of core procedures:

| Method | Purpose | Common Uses |
|---|---|---|
| Concordancing | Find words in context | Usage studies, example extraction |
| Word frequency | Count and compare words | Vocabulary analysis, text comparison |
| Collocation | Find words that co-occur | Phraseology, semantic patterns |
| Keywords | Find distinctive vocabulary | Text characterization, comparison |
| Text classification | Categorize texts automatically | Genre detection, authorship attribution |
| POS tagging | Identify word classes | Grammatical analysis, parsing |
| NER | Extract named entities | Information extraction, summarization |
| Dependency parsing | Analyze sentence structure | Semantic role analysis |
Tutorial Citation

Schweinberger, Martin. 2026. Practical Overview of Text Analytics Methods. Brisbane: The Language Technology and Data Analysis Laboratory (LADAL). url: https://ladal.edu.au/tutorials/textanalysis.html (Version 2026.02.08).

Prerequisites


Part 1: Setup and Preparation

Installation

Install required packages (run once):

Code
# Core text analysis packages  
install.packages("quanteda")           # Text analysis framework  
install.packages("quanteda.textplots") # Visualization  
install.packages("quanteda.textstats") # Frequency and keyness statistics  
install.packages("udpipe")             # NLP annotation  
  
# Data manipulation and visualization    
install.packages("dplyr")             # Data wrangling  
install.packages("tidyr")             # Data reshaping  
install.packages("stringr")           # String processing  
install.packages("ggplot2")           # Visualization  
install.packages("ggraph")            # Network graphs  
  
# Additional utilities  
install.packages("tm")                # Text mining  
install.packages("tidytext")          # Tidy text analysis  
install.packages("flextable")         # Pretty tables  

Loading Packages

Activate packages (run every session):

Code
# Load packages  
library(quanteda)  
library(quanteda.textplots)  
library(quanteda.textstats)  
library(udpipe)  
library(dplyr)  
library(tidyr)  
library(stringr)  
library(ggplot2)  
library(ggraph)  
library(tm)  
library(tidytext)  
library(flextable)  
Package Loading Best Practice

Always load all packages at the top of your script in one code chunk. This makes dependencies clear and troubleshooting easier.


Part 2: Concordancing

What is Concordancing?


Concordancing retrieves and displays occurrences of a word or phrase within a text, together with their surrounding context. It is used to examine how words are actually used, and it underpins many of the other methods in this tutorial.

Why concordancing is foundational:
- Understand how terms are actually used
- Examine context and meaning
- Extract authentic examples
- Identify collocational patterns
- Foundation for more advanced analyses

AntConc concordance example showing “language” in context

KWIC Displays

Concordances typically appear as KeyWord In Context (KWIC) displays:
- Search term centered
- Left context (preceding words)
- Right context (following words)
- Aligned for easy pattern recognition

Loading Example Text

We’ll use Lewis Carroll’s Alice’s Adventures in Wonderland:

Code
# Load example text  
text <- base::readRDS("tutorials/textanalysis/data/alice.rda")  

text

Alice’s Adventures in Wonderland

by Lewis Carroll

CHAPTER I.

Down the Rabbit-Hole

Alice was beginning to get very tired of sitting by her sister on the

bank, and of having nothing to do: once or twice she had peeped into

Processing Text

Combine snippets and split into chapters:

Code
text_chapters <- text |>  
  # Combine all text  
  paste0(collapse = " ") |>  
  # Mark chapter boundaries  
  stringr::str_replace_all("(CHAPTER [XVI]{1,7}\\.{0,1}) ", "qwertz\\1") |>  
  # Convert to lowercase  
  tolower() |>  
  # Split into chapters  
  stringr::str_split("qwertz") |>  
  unlist()  

substr(text_chapters, start = 1, stop = 500)

alice’s adventures in wonderland by lewis carroll

chapter i.down the rabbit-hole alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, “and what is the use of a book,” thought alice “without pictures or conversations?” so she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid), whether the pleasure of making a daisy-chain woul

chapter ii.the pool of tears “curiouser and curiouser!” cried alice (she was so much surprised, that for the moment she quite forgot how to speak good english); “now i’m opening out like the largest telescope that ever was! good-bye, feet!” (for when she looked down at her feet, they seemed to be almost out of sight, they were getting so far off). “oh, my poor little feet, i wonder who will put on your shoes and stockings for you now, dears? i’m sure _i_ shan’t be able! i shall be a great deal t

chapter iii.a caucus-race and a long tale they were indeed a queer-looking party that assembled on the bank—the birds with draggled feathers, the animals with their fur clinging close to them, and all dripping wet, cross, and uncomfortable. the first question of course was, how to get dry again: they had a consultation about this, and after a few minutes it seemed quite natural to alice to find herself talking familiarly with them, as if she had known them all her life. indeed, she had quite a l

chapter iv.the rabbit sends in a little bill it was the white rabbit, trotting slowly back again, and looking anxiously about as it went, as if it had lost something; and she heard it muttering to itself “the duchess! the duchess! oh my dear paws! oh my fur and whiskers! she’ll get me executed, as sure as ferrets are ferrets! where _can_ i have dropped them, i wonder?” alice guessed in a moment that it was looking for the fan and the pair of white kid gloves, and she very good-naturedly began hu

chapter v.advice from a caterpillar the caterpillar and alice looked at each other for some time in silence: at last the caterpillar took the hookah out of its mouth, and addressed her in a languid, sleepy voice. “who are _you?_” said the caterpillar. this was not an encouraging opening for a conversation. alice replied, rather shyly, “i—i hardly know, sir, just at present—at least i know who i _was_ when i got up this morning, but i think i must have been changed several times since then.” “wha

Regular Expression Explained

(CHAPTER [XVI]{1,7}\\.{0,1}) matches:
- CHAPTER literally
- [XVI]{1,7} = 1-7 Roman numeral characters
- \\.{0,1} = optional period

We replace with qwertz\\1 to mark boundaries while preserving the chapter heading.
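A quick way to convince yourself that the marker trick works is to run it on a toy string (invented for illustration):

```r
# Toy string (invented) with two chapter headings
toy <- "Preface text CHAPTER I. First chapter text CHAPTER II. Second chapter text"

toy |>
  stringr::str_replace_all("(CHAPTER [XVI]{1,7}\\.{0,1}) ", "qwertz\\1") |>
  stringr::str_split("qwertz") |>
  unlist()
# A three-element vector: the front matter, then one element per chapter
```

Note that the space after each heading is consumed by the match but not restored by the replacement, which is why the chapter outputs later in this tutorial begin “chapter i.down” with no space.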

Creating Basic Concordances


The kwic function extracts KeyWord In Context displays. Main arguments: x (tokenized text), pattern (search term), window (context size).

Pattern Matching with Regex

Find all words starting with “walk”:

Code
kwic_walk <- quanteda::kwic(  
  x = quanteda::tokens(text_chapters),  
  pattern = "walk.*",  
  window = 5,  
  valuetype = "regex"  
) |>  
  as.data.frame() |>  
  dplyr::select(-to, -from, -pattern)  

| docname | pre | keyword | post |
|---|---|---|---|
| text2 | out among the people that | walk | with their heads downward ! |
| text2 | to dream that she was | walking | hand in hand with dinah |
| text2 | trying every door , she | walked | sadly down the middle , |
| text3 | “ or perhaps they won’t | walk | the way i want to |
| text4 | mouse , getting up and | walking | away . “ you insult |
| text4 | its head impatiently , and | walked | a little quicker . “ |
| text5 | and get ready for your | walk | ! ’ ‘ coming in |
| text7 | , “ if you only | walk | long enough . ” alice |
| text7 | a minute or two she | walked | on in the direction in |
| text7 | high : even then she | walked | up towards it rather timidly |

Pattern explanation:
- walk.* = “walk” followed by any characters
- Matches: walk, walks, walked, walking, walker
- valuetype = "regex" enables regular expressions
- Because the pattern is unanchored, it also matches tokens that merely contain “walk” (e.g., sidewalk); use "^walk" to restrict matches to tokens that begin with “walk”
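The effect of anchored versus unanchored patterns can be checked directly with stringr (the toy tokens are invented for illustration):

```r
# Toy tokens (invented)
tokens_demo <- c("walk", "walking", "walked", "sidewalk", "talk")

# Unanchored: matches any token containing "walk"
stringr::str_detect(tokens_demo, "walk.*")
# [1]  TRUE  TRUE  TRUE  TRUE FALSE

# Anchored: only tokens that begin with "walk"
stringr::str_detect(tokens_demo, "^walk")
# [1]  TRUE  TRUE  TRUE FALSE FALSE
```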

Part 3: Word Frequency Analysis

Why Frequency Matters

Frequency is fundamental to text analytics:
- Most basic measure of importance
- Foundation for almost all other methods
- Reveals vocabulary distribution
- Enables text comparison
- Identifies stylistic features

Creating Frequency Lists

Preprocessing

Code
text_words <- text |>  
  tolower() |>  
  str_replace_all("[^[:alpha:][:space:]]*", "") |>  
  tm::removePunctuation() |>  
  stringr::str_squish() |>  
  stringr::str_split(" ") |>  
  unlist()  

head(text_words, 15)

 [1] "alices"     "adventures" "in"         "wonderland" "by"        
 [6] "lewis"      "carroll"    "chapter"    "i"          "down"      
[11] "the"        "rabbithole" "alice"      "was"        "beginning" 

Building Frequency Table

Code
wfreq <- text_words |>  
  table() |>  
  as.data.frame() |>  
  arrange(desc(Freq)) |>  
  dplyr::rename(word = 1, frequency = 2)  

| word | frequency |
|---|---|
| the | 1,630 |
| and | 844 |
| to | 721 |
| a | 627 |
| she | 537 |
| it | 526 |
| of | 508 |
| said | 462 |
| i | 400 |
| alice | 385 |
| in | 366 |
| you | 360 |
| was | 357 |
| that | 276 |
| as | 262 |

Observation: Most frequent words are function words (the, and, to, a, of)—grammatically necessary but semantically light.

Removing Stopwords

Code
wfreq_wostop <- wfreq |>  
  anti_join(tidytext::stop_words, by = "word") |>  
  dplyr::filter(word != "")  

| word | frequency |
|---|---|
| alice | 385 |
| queen | 68 |
| time | 68 |
| king | 61 |
| dont | 60 |
| im | 57 |
| mock | 56 |
| turtle | 56 |
| gryphon | 55 |
| hatter | 55 |
| head | 48 |
| voice | 47 |
| looked | 45 |
| rabbit | 43 |
| round | 41 |

Much better! With stopwords removed, the content words that carry the text’s themes surface: alice, queen, time, king, and the names of the story’s characters.

Visualizing Frequencies

Bar Plot

Code
wfreq_wostop |>  
  head(10) |>  
  ggplot(aes(x = reorder(word, -frequency), y = frequency)) +  
  geom_bar(stat = "identity", fill = "steelblue") +  
  labs(  
    title = "10 Most Frequent Content Words",  
    subtitle = "Alice's Adventures in Wonderland",  
    x = "Word",  
    y = "Frequency"  
  ) +  
  theme_minimal() +  
  theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 12))  

Word Cloud


The textplot_wordcloud function creates visual representations where word size reflects frequency. Main arguments: x (Document-Feature Matrix), max_words (how many to display), color (palette).

Code
text |>  
  quanteda::corpus() |>  
  quanteda::tokens(remove_punct = TRUE) |>  
  quanteda::tokens_remove(stopwords("english")) |>  
  quanteda::dfm() |>  
  quanteda.textplots::textplot_wordcloud(  
    max_words = 150,  
    max_size = 10,  
    min_size = 1.5,  
    color = scales::viridis_pal(option = "A")(10)  
  )  

Word Clouds: Use with Caution

Pros:
- Visually appealing
- Quick impression of content
- Accessible to non-experts

Cons:
- Less precise than bar plots
- Hard to compare exact frequencies
- Can be misleading with size perception

Best practice: Use for initial exploration or public communication, but rely on bar plots/tables for analysis.

Comparison Clouds

Compare vocabulary across texts:

Code
# Load comparison texts  
orwell <- readRDS("tutorials/textanalysis/data/orwell.rda") |>  
  paste0(collapse = " ")  
melville <- readRDS("tutorials/textanalysis/data/melville.rda") |>  
  paste0(collapse = " ")  
darwin <- readRDS("tutorials/textanalysis/data/darwin.rda") |>  
  paste0(collapse = " ")  
Code
# Create corpus with author labels  
corp_dom <- quanteda::corpus(c(darwin, melville, orwell))  
quanteda::docvars(corp_dom, "Author") <- c("Darwin", "Melville", "Orwell")  
Code
# Generate comparison cloud  
corp_dom |>  
  quanteda::tokens(  
    remove_punct = TRUE,  
    remove_symbols = TRUE,  
    remove_numbers = TRUE  
  ) |>  
  quanteda::tokens_remove(stopwords("english")) |>  
  quanteda::dfm() |>  
  quanteda::dfm_group(groups = corp_dom$Author) |>  
  quanteda::dfm_trim(min_termfreq = 10, verbose = FALSE) |>  
  quanteda.textplots::textplot_wordcloud(  
    comparison = TRUE,  
    color = c("darkgray", "orange", "purple"),  
    max_words = 150  
  )  

Interpretation:
- Words closer to an author’s name are distinctive to that text
- Size reflects frequency
- Reveals vocabulary differences between authors/texts

Frequency Over Time

Track term usage across document sections:

Code
# Count words per chapter  
Words <- text_chapters |>  
  str_split(" ") |>  
  lengths()  
  
# Count "alice" per chapter  
Matches <- text_chapters |>  
  str_count("alice")  
  
# Create results table  
Chapters <- paste0("chapter", 0:(length(text_chapters) - 1))  
  
tb <- data.frame(Chapters, Matches, Words) |>  
  mutate(  
    Frequency = round(Matches / Words * 1000, 2),  
    Chapters = factor(Chapters, levels = paste0("chapter", 0:12))  
  )  
  
# Visualize  
ggplot(tb, aes(x = Chapters, y = Frequency, group = 1)) +  
  geom_smooth(color = "purple", se = TRUE) +  
  geom_line(color = "darkgray") +  
  geom_point(color = "darkgray", size = 2) +  
  theme_bw() +  
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +  
  labs(  
    title = "Frequency of 'alice' Across Chapters",  
    subtitle = "Relative frequency per 1,000 words",  
    x = "Chapter",  
    y = "Relative Frequency (per 1,000 words)"  
  )  

Analysis:
- Alice appears most frequently in early chapters
- Decreases in middle chapters
- Some variation throughout

This type of analysis reveals:
- Character prominence across narrative
- Thematic shifts
- Structural patterns


Part 4: Collocations

Understanding Collocations

Collocations are word pairs that occur together more frequently than would be expected by chance (Sinclair 1991).

Examples:
- Merry Christmas (not happy Christmas)
- Strong coffee (not powerful coffee)
- Make a decision (not do a decision)

Why collocations matter:
- Reveal natural language patterns
- Identify phraseological units
- Uncover semantic associations
- Distinguish native from non-native usage

Collocation Detection

Based on co-occurrence in a contingency table:

|  | w₂ present | w₂ absent | Total |
|---|---|---|---|
| w₁ present | O₁₁ | O₁₂ | = R₁ |
| w₁ absent | O₂₁ | O₂₂ | = R₂ |
| Total | = C₁ | = C₂ | = N |

Where:
- O₁₁ = both words co-occur
- O₁₂ = w₁ present, w₂ absent
- O₂₁ = w₁ absent, w₂ present
- O₂₂ = both absent
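Under the null hypothesis that w₁ and w₂ are independent, the expected co-occurrence frequency is E₁₁ = R₁ × C₁ / N. A worked toy example (all counts invented):

```r
# Toy contingency counts (invented)
O11 <- 30; O12 <- 70; O21 <- 170; O22 <- 9730

R1 <- O11 + O12           # row total for w1
C1 <- O11 + O21           # column total for w2
N  <- O11 + O12 + O21 + O22

E11 <- R1 * C1 / N        # expected co-occurrences under independence
E11
# [1] 2
```

Observing 30 co-occurrences where only 2 are expected is the kind of discrepancy the association measures below quantify.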

Extracting Collocations

Preprocessing

Code
sentences <- text |>  
  paste0(collapse = " ") |>  
  tokenizers::tokenize_sentences() |>  
  unlist() |>  
  str_replace_all("\\W", " ") |>  
  str_replace_all("[^[:alnum:] ]", " ") |>  
  str_squish() |>  
  tolower()  

head(sentences, 10)

alice s adventures in wonderland by lewis carroll chapter i

down the rabbit hole alice was beginning to get very tired of sitting by her sister on the bank and of having nothing to do once or twice she had peeped into the book her sister was reading but it had no pictures or conversations in it and what is the use of a book thought alice without pictures or conversations

so she was considering in her own mind as well as she could for the hot day made her feel very sleepy and stupid whether the pleasure of making a daisy chain would be worth the trouble of getting up and picking the daisies when suddenly a white rabbit with pink eyes ran close by her

there was nothing so very remarkable in that nor did alice think it so very much out of the way to hear the rabbit say to itself oh dear

oh dear

i shall be late

when she thought it over afterwards it occurred to her that she ought to have wondered at this but at the time it all seemed quite natural but when the rabbit actually took a watch out of its waistcoat pocket and looked at it and then hurried on alice started to her feet for it flashed across her mind that she had never before seen a rabbit with either a waistcoat pocket or a watch to take out of it and burning with curiosity she ran across the field after it and fortunately was just in time to see it pop down a large rabbit hole under the hedge

in another moment down went alice after it never once considering how in the world she was to get out again

the rabbit hole went straight on like a tunnel for some way and then dipped suddenly down so suddenly that alice had not a moment to think about stopping herself before she found herself falling down a very deep well

either the well was very deep or she fell very slowly for she had plenty of time as she went down to look about her and to wonder what was going to happen next

Creating Co-occurrence Matrix

Code
sentences |>  
  quanteda::tokens() |>  
  quanteda::dfm() |>  
  quanteda::fcm(tri = FALSE) |>  
  tidytext::tidy() |>  
  dplyr::relocate(term, document, count) |>  
  dplyr::rename(w1 = 1, w2 = 2, O11 = 3) -> coll_basic  

| w1 | w2 | O11 |
|---|---|---|
| alice | alice | 11 |
| alice | s | 67 |
| alice | adventures | 6 |
| alice | in | 137 |
| alice | wonderland | 1 |
| alice | by | 18 |
| alice | lewis | 1 |
| alice | carroll | 1 |
| alice | chapter | 1 |
| alice | i | 163 |

Computing Statistics

Code
# Calculate all contingency table values  
coll_basic |>  
  mutate(N = sum(O11)) |>  
  group_by(w1) |>  
  mutate(  
    R1 = sum(O11),  
    O12 = R1 - O11,  
    R2 = N - R1  
  ) |>  
  ungroup() |>  
  group_by(w2) |>  
  mutate(  
    C1 = sum(O11),  
    O21 = C1 - O11,  
    C2 = N - C1,  
    O22 = R2 - O21  
  ) -> colldf  

Finding Collocates of a Specific Word

Code
colldf |>  
  filter(  
    w1 == "alice",  
    (O11 + O21) > 10,  # w2 occurs at least 10 times  
    O11 > 5             # co-occurs at least 5 times  
  ) |>  
  rowwise() |>  
  mutate(  
    E11 = R1 * C1 / N,  
    E12 = R1 * C2 / N,  
    E21 = R2 * C1 / N,  
    E22 = R2 * C2 / N  
  ) -> colldf_redux  

Association Measures

Code
colldf_redux |>  
  mutate(Rws = n()) |>  
  rowwise() |>  
  mutate(p = fisher.test(matrix(c(O11, O12, O21, O22),  
                                ncol = 2, byrow = TRUE))$p.value) |>  
  mutate(  
    # Calculate various association measures  
    X2 = (O11 - E11)^2/E11 + (O12 - E12)^2/E12 +   
         (O21 - E21)^2/E21 + (O22 - E22)^2/E22,  
    phi = sqrt(X2 / N),  
    MI = log2(O11 / E11),  
    DeltaP12 = (O11/(O11 + O12)) - (O21/(O21 + O22))  
  ) |>  
  mutate(Sig = case_when(  
    # Bonferroni correction: multiply p by the number of tests (Rws)  
    p * Rws > .05 ~ "n.s.",  
    p * Rws > .01 ~ "p < .05*",  
    p * Rws > .001 ~ "p < .01**",  
    TRUE ~ "p < .001***"  
  )) |>  
  filter(Sig != "n.s.", E11 < O11) |>  
  arrange(-phi) -> assoc_tb  

| w1 | w2 | O11 | phi | MI | Sig |
|---|---|---|---|---|---|
| alice | said | 179 | 0.010465054 | 1.0319694 | p < .001*** |
| alice | thought | 60 | 0.008607335 | 1.4418297 | p < .001*** |
| alice | very | 86 | 0.004731665 | 0.6824350 | p < .001*** |
| alice | turning | 7 | 0.004500778 | 2.1096787 | p < .01** |
| alice | replied | 14 | 0.004177283 | 1.4492639 | p < .001*** |
| alice | afraid | 10 | 0.004051412 | 1.6437480 | p < .01** |
| alice | to | 334 | 0.003757984 | 0.2739627 | p < .001*** |
| alice | i | 163 | 0.003313097 | 0.3481294 | p < .01** |
| alice | cried | 10 | 0.003233192 | 1.3356257 | p < .01** |
| alice | much | 32 | 0.003038563 | 0.7189759 | p < .01** |

Interpretation:
- Higher phi = stronger association
- MI (Mutual Information) also indicates strength
- Statistical significance confirms non-random co-occurrence
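As a sanity check, the measures computed in the pipeline above can be reproduced by hand for a single word pair (toy counts invented for illustration):

```r
# Toy contingency counts for one word pair (invented)
O11 <- 20; O12 <- 80; O21 <- 80; O22 <- 9820
N   <- O11 + O12 + O21 + O22
E11 <- (O11 + O12) * (O11 + O21) / N   # expected co-occurrences

MI <- log2(O11 / E11)                  # mutual information
p  <- fisher.test(matrix(c(O11, O12, O21, O22),
                         ncol = 2, byrow = TRUE))$p.value

round(MI, 2)
# [1] 4.32
p < .001
# [1] TRUE
```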

Visualizing Collocations

Network Graph

Code
# Extract top collocates  
top20colls <- assoc_tb |>  
  arrange(-phi) |>  
  head(20) |>  
  pull(w2) |>  
  c("alice")  
  
# Create feature co-occurrence matrix  
keyword_fcm <- sentences |>  
  quanteda::tokens() |>  
  quanteda::dfm() |>  
  quanteda::dfm_select(pattern = top20colls) |>  
  quanteda::fcm(tri = FALSE)  
  
# Plot network  
quanteda.textplots::textplot_network(keyword_fcm,  
  edge_alpha = 0.8,  
  edge_color = "gray",  
  edge_size = 2,  
  vertex_labelsize = log(rowSums(keyword_fcm))  
)  

What the network shows:
- Alice at center (our target word)
- Connected words are collocates
- Line thickness = co-occurrence strength
- Reveals semantic/thematic relationships


Part 5: Remaining Methods - Summary

Due to space constraints, here are summaries of the remaining critical methods. Full implementations with examples are available in the complete tutorial.

Keywords

Keywords identify terms distinctive to a text when compared to a reference corpus.

Key concepts:
- Compare target corpus to reference corpus
- Statistical tests identify over-represented words
- Reveals characteristic vocabulary
- Applications: authorship attribution, text characterization

Association measure: Same contingency table approach as collocations, but comparing text to reference rather than word to word.
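A minimal keyness sketch using quanteda.textstats::textstat_keyness (the package appears in the session info below); the two-document corpus is invented for illustration:

```r
library(quanteda)
library(quanteda.textstats)

# Invented two-document corpus: target text vs. reference text
corp <- quanteda::corpus(c(
  target    = "alice saw the queen and the king and the queen smiled",
  reference = "the committee discussed the budget and the schedule"
))

dfmat <- corp |>
  quanteda::tokens(remove_punct = TRUE) |>
  quanteda::dfm()

# Chi-squared keyness of the target document against the reference
quanteda.textstats::textstat_keyness(dfmat, target = "target") |>
  head()
```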

Text Classification

Automatically categorize texts into predefined groups (languages, genres, authors).

Approaches:
- Feature-based (word frequencies, character n-grams)
- Machine learning (k-NN, SVM, neural networks)
- Training sets with known labels
- Test on unknown texts

Applications:
- Authorship attribution
- Genre classification
- Sentiment analysis
- Language identification
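A minimal sketch of supervised classification, assuming the quanteda.textmodels package (which is not part of the setup above); the training texts and labels are invented:

```r
library(quanteda)
library(quanteda.textmodels)

# Invented training data with known labels
train  <- c("great wonderful excellent", "awful terrible dreadful",
            "lovely superb excellent",   "bad awful poor")
labels <- factor(c("pos", "neg", "pos", "neg"))

dfmat_train <- quanteda::dfm(quanteda::tokens(train))

# Train a Naive Bayes classifier on the labeled documents
nb <- quanteda.textmodels::textmodel_nb(dfmat_train, y = labels)

# Classify a new text: align its features with the training features first
dfmat_new <- quanteda::dfm(quanteda::tokens("wonderful and excellent")) |>
  quanteda::dfm_match(features = quanteda::featnames(dfmat_train))
predict(nb, newdata = dfmat_new)
```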

Part-of-Speech Tagging

Assign grammatical categories (noun, verb, adjective, etc.) to words.

Why POS tagging:
- Understand grammatical structure
- Disambiguate word meanings
- Extract specific grammatical patterns
- Foundation for parsing

Using UDPipe:

Code
# Download the English model (run once)  
eng_mod <- udpipe::udpipe_download_model(language = "english-ewt")  
  
# Load model  
m_eng <- udpipe::udpipe_load_model(eng_mod$file_model)  
  
# Annotate  
text_anndf <- udpipe::udpipe_annotate(m_eng, x = text) |>  
  as.data.frame()  
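Once text_anndf exists, specific grammatical patterns can be pulled out with ordinary dplyr verbs (the column names follow udpipe's standard output format):

```r
# Extract all nouns with their lemmas from the annotated data frame
text_anndf |>
  dplyr::filter(upos == "NOUN") |>
  dplyr::select(token, lemma, upos) |>
  head(10)
```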

Named Entity Recognition (NER)

Extract and classify named entities: people, locations, organizations, dates, etc.

Applications:
- Information extraction
- Text summarization
- Knowledge graph construction
- Question answering

Methods:
- Rule-based (capitalization patterns)
- Machine learning (trained models)
- Hybrid approaches
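udpipe itself does not label entities, so a common route in R is the spacyr package; the sketch below assumes spacyr and a working spaCy installation, neither of which is covered in the setup above:

```r
library(spacyr)

# One-time spaCy setup is required; en_core_web_sm is the standard small English model
spacyr::spacy_initialize(model = "en_core_web_sm")

parsed <- spacyr::spacy_parse(
  "Alice met the Queen of Hearts in Wonderland on Tuesday.",
  entity = TRUE
)

# Pull out the recognized entities and their types
spacyr::entity_extract(parsed)
```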

Dependency Parsing

Visualize grammatical relationships between words in sentences.

Shows:
- Subject-verb relationships
- Modifier dependencies
- Complement structures
- Sentence hierarchies

Applications:
- Semantic role analysis
- Information extraction
- Grammar checking
- Language understanding
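The udpipe annotation shown in the POS-tagging section already contains the dependency structure: each token records its syntactic head. A short sketch reusing that model:

```r
# Annotate a single sentence and inspect its dependency relations
m_eng <- udpipe::udpipe_load_model("english-ewt-ud-2.5.udpipe")

dep <- udpipe::udpipe_annotate(m_eng, x = "alice chased the white rabbit") |>
  as.data.frame()

# head_token_id points to each token's governor; dep_rel names the relation
dep[, c("token_id", "token", "head_token_id", "dep_rel")]
```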


Quick Reference

Essential Functions

Code
# Concordancing  
kwic(tokens(text), pattern = "word", window = 5)  
  
# Frequency  
table(text_words) |> sort(decreasing = TRUE)  
  
# Collocations  
tokens(text) |> dfm() |> fcm()  
  
# Word cloud  
corpus(text) |> tokens() |> dfm() |> textplot_wordcloud()  
  
# POS tagging  
udpipe_annotate(model, x = text)  

Common Workflows

Code
# Basic text analysis pipeline  
text |>  
  # 1. Preprocess  
  tolower() |>  
  str_replace_all("[^[:alnum:] ]", "") |>  
  # 2. Tokenize  
  tokens() |>  
  # 3. Remove stopwords  
  tokens_remove(stopwords("english")) |>  
  # 4. Create document-feature matrix  
  dfm() |>  
  # 5. Analyze (frequency, collocations, etc.)  
  textstat_frequency()  

Resources

Recommended Reading:
- Silge, J., & Robinson, D. (2017). Text Mining with R
- Jurafsky, D., & Martin, J.H. (2023). Speech and Language Processing
- Sinclair, J. (1991). Corpus, Concordance, Collocation

Online Resources:
- Quanteda tutorials
- LADAL tutorials
- Text Mining with R

Related LADAL Tutorials:
- Concordancing
- Collocations
- Keywords


Citation & Session Info

Schweinberger, Martin. 2026. Practical Overview of Text Analytics Methods. Brisbane: The Language Technology and Data Analysis Laboratory (LADAL). url: https://ladal.edu.au/tutorials/textanalysis.html (Version 2026.02.08).

@manual{schweinberger2026ta,  
  author = {Schweinberger, Martin},  
  title = {Practical Overview of Text Analytics Methods},  
  note = {https://ladal.edu.au/tutorials/textanalysis/textanalysis.html},  
  year = {2026},  
  organization = {The Language Technology and Data Analysis Laboratory (LADAL)},  
  address = {Brisbane},  
  edition = {2026.02.08}  
}  
Code
sessionInfo()  
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: Australia/Brisbane
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] tidyr_1.3.2               ggraph_2.2.1             
 [3] quanteda.textplots_0.95   quanteda.textstats_0.97.2
 [5] wordcloud2_0.2.1          tidytext_0.4.2           
 [7] udpipe_0.8.11             tm_0.7-16                
 [9] NLP_0.3-2                 quanteda_4.2.0           
[11] flextable_0.9.7           ggplot2_3.5.1            
[13] stringr_1.5.1             dplyr_1.2.0              

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1        viridisLite_0.4.2       farver_2.1.2           
 [4] viridis_0.6.5           fastmap_1.2.0           tweenr_2.0.3           
 [7] fontquiver_0.2.1        janeaustenr_1.0.0       digest_0.6.39          
[10] lifecycle_1.0.5         tokenizers_0.3.0        magrittr_2.0.3         
[13] compiler_4.4.2          rlang_1.1.7             tools_4.4.2            
[16] igraph_2.1.4            yaml_2.3.10             sna_2.8                
[19] data.table_1.17.0       knitr_1.51              labeling_0.4.3         
[22] askpass_1.2.1           stopwords_2.3           graphlayouts_1.2.2     
[25] htmlwidgets_1.6.4       xml2_1.3.6              withr_3.0.2            
[28] purrr_1.0.4             grid_4.4.2              polyclip_1.10-7        
[31] gdtools_0.4.1           colorspace_2.1-1        scales_1.3.0           
[34] MASS_7.3-61             cli_3.6.4               rmarkdown_2.30         
[37] ragg_1.3.3              generics_0.1.3          rstudioapi_0.17.1      
[40] cachem_1.1.0            ggforce_0.4.2           network_1.19.0         
[43] splines_4.4.2           parallel_4.4.2          vctrs_0.7.1            
[46] Matrix_1.7-2            jsonlite_1.9.0          slam_0.1-55            
[49] fontBitstreamVera_0.1.1 ggrepel_0.9.6           systemfonts_1.2.1      
[52] glue_1.8.0              statnet.common_4.11.0   codetools_0.2-20       
[55] stringi_1.8.4           gtable_0.3.6            munsell_0.5.1          
[58] tibble_3.2.1            pillar_1.10.1           htmltools_0.5.9        
[61] openssl_2.3.2           R6_2.6.1                textshaping_1.0.0      
[64] tidygraph_1.3.1         evaluate_1.0.3          lattice_0.22-6         
[67] SnowballC_0.7.1         memoise_2.0.1           renv_1.1.1             
[70] fontLiberation_0.1.0    Rcpp_1.0.14             zip_2.3.2              
[73] uuid_1.2-1              fastmatch_1.1-6         coda_0.19-4.1          
[76] nlme_3.1-166            nsyllable_1.0.1         gridExtra_2.3          
[79] mgcv_1.9-1              officer_0.6.7           xfun_0.56              
[82] pkgconfig_2.0.3        



References

Bernard, H Russell, and Gery Ryan. 1998. “Text Analysis.” Handbook of Methods in Cultural Anthropology 613.
Kabanoff, Boris. 1997. “Introduction: Computers Can Read as Well as Count: Computer-Aided Text Analysis in Organizational Research.” Journal of Organizational Behavior, 507–11.
Popping, Roel. 2000. Computer-Assisted Text Analysis. Sage. https://doi.org/10.4135/9781849208741.
Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.