Rank Frequency Plot

Usage

rank_freq_mplot(text.var, grouping.var = NULL, ncol = 4, jitter = 0.2, log.freq = TRUE, log.rank = TRUE, hap.col = "red", dis.col = "blue", alpha = 1, shape = 1, title = "Rank-Frequency Plot", digits = 2, plot = TRUE)
rank_freq_plot(words, frequencies, plot = TRUE, title.ext = NULL, jitter.ammount = 0.1, log.scale = TRUE, hap.col = "red", dis.col = "blue")

Arguments

text.var
The text variable.
grouping.var
The grouping variables. Default NULL generates one word list for all text. Also takes a single grouping variable or a list of 1 or more grouping variables.
ncol
integer value indicating the number of columns in the facet wrap.
jitter
Amount of horizontal jitter to add to the points.
log.freq
logical. If TRUE, plots the frequencies on the natural log scale.
log.rank
logical. If TRUE, plots the ranks on the natural log scale.
hap.col
Color of the hapax legomenon points.
dis.col
Color of the dis legomenon points.
alpha
Transparency level of points (ranges between 0 and 1).
shape
An integer specifying the symbol used to plot the points.
title
Optional plot title.
digits
Integer; number of decimal places to round.
plot
logical. If TRUE, produces a rank-frequency plot.
words
A vector of words.
frequencies
A vector of frequencies corresponding to the words argument.
title.ext
The title extension that extends: "Rank-Frequency Plot ..."
jitter.ammount
Amount of horizontal jitter to add to the points.
log.scale
logical. If TRUE, plots both rank and frequency on a log scale.

Value

Returns a rank-frequency plot and a list of three dataframes:

WORD_COUNTS
The word frequencies supplied to rank_freq_plot or created by rank_freq_mplot.
RANK_AND_FREQUENCY_STATS
A dataframe of rank and frequencies for the words used in the text.
LEGOMENA_STATS
A dataframe displaying the percent hapax legomena and percent dis legomena of the text.
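
To make the rank and legomena statistics concrete, the following base R sketch computes comparable quantities by hand. It is an illustration only, not the package's implementation: the frequencies are made up, the variable names are hypothetical, ties are assumed to share the lowest rank, and the percentages are assumed to be taken over distinct words. The package may define any of these differently.

```r
# Hypothetical word frequencies; not taken from any qdap data set.
freq <- c(the = 5, cat = 2, sat = 2, on = 1, mat = 1, quietly = 1)

# Rank 1 goes to the most frequent word.
# Assumption: tied words share the lowest available rank.
word_rank <- rank(-freq, ties.method = "min")

# Assumption: percentages are computed over distinct words (types).
n_types   <- length(freq)
hapax_pct <- 100 * sum(freq == 1) / n_types  # words occurring exactly once
dis_pct   <- 100 * sum(freq == 2) / n_types  # words occurring exactly twice

data.frame(word = names(freq), freq = unname(freq), rank = unname(word_rank))
round(c(hapax = hapax_pct, dis = dis_pct), 2)
```

Here three of the six distinct words occur once and two occur twice, so under these assumptions half of the vocabulary is hapax legomena and a third is dis legomena.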

Description

rank_freq_mplot - Plot a faceted word rank versus frequencies by grouping variable(s).

rank_freq_plot - Plot word rank versus frequencies.
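
The idea behind a rank-frequency plot can be sketched in a few lines of base R without the package. This is a rough illustration under stated assumptions, not the code either function runs: the frequencies are invented, the variable names are hypothetical, and the coloring and jitter only mimic what the hap.col, dis.col and jitter arguments are described as doing.

```r
# Hypothetical frequencies, sorted from most to least frequent.
freq <- sort(c(the = 50, of = 30, and = 20, cat = 2, mat = 1, sat = 1),
             decreasing = TRUE)

# Assumption: tied words share the lowest available rank.
word_rank <- rank(-freq, ties.method = "min")

# Color hapax legomena red, dis legomena blue, everything else black.
point_col <- ifelse(freq == 1, "red", ifelse(freq == 2, "blue", "black"))

# Horizontal jitter separates tied points that would otherwise overlap.
set.seed(1)
x <- jitter(log(word_rank), amount = 0.1)

plot(x, log(freq), col = point_col,
     xlab = "Rank (log)", ylab = "Frequency (log)",
     main = "Rank-Frequency Plot (sketch)")
```

On log-log axes, text that roughly follows Zipf's law tends to fall near a straight, downward-sloping line, which is why both functions default to log scaling.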

Note

rank_freq_mplot uses the ggplot2 package, whereas rank_freq_plot uses base graphics. rank_freq_mplot is more general and flexible; in most cases it should be preferred.

References

Zipf, G. K. (1949). Human behavior and the principle of least effort. Cambridge, Massachusetts: Addison-Wesley. p. 1.

Examples

## Not run:
# rank_freq_mplot EXAMPLES:
x1 <- rank_freq_mplot(DATA$state, DATA$person, ncol = 2, jitter = 0)
ltruncdf(x1, 10)
x2 <- rank_freq_mplot(mraja1spl$dialogue, mraja1spl$person, ncol = 5,
    hap.col = "purple")
ltruncdf(x2, 10)
invisible(rank_freq_mplot(mraja1spl$dialogue, mraja1spl$person, ncol = 5,
    log.freq = FALSE, log.rank = FALSE, jitter = .6))
invisible(rank_freq_mplot(raj$dialogue, jitter = .5, alpha = 1/15))
invisible(rank_freq_mplot(raj$dialogue, jitter = .5, shape = 19, alpha = 1/15))

# rank_freq_plot EXAMPLES:
mod <- with(mraja1spl, word_list(dialogue, person, cut.n = 10,
    cap.list = unique(mraja1spl$person)))
x3 <- rank_freq_plot(mod$fwl$Romeo$WORD, mod$fwl$Romeo$FREQ, title.ext = 'Romeo')
ltruncdf(x3, 10)
ltruncdf(rank_freq_plot(mod$fwl$Romeo$WORD, mod$fwl$Romeo$FREQ, plot = FALSE), 10)
invisible(rank_freq_plot(mod$fwl$Romeo$WORD, mod$fwl$Romeo$FREQ, title.ext = 'Romeo',
    jitter.ammount = 0.15, hap.col = "darkgreen", dis.col = "purple"))
invisible(rank_freq_plot(mod$fwl$Romeo$WORD, mod$fwl$Romeo$FREQ, title.ext = 'Romeo',
    jitter.ammount = 0.5, log.scale = FALSE))
invisible(lapply(seq_along(mod$fwl), function(i){
    dev.new()
    rank_freq_plot(mod$fwl[[i]]$WORD, mod$fwl[[i]]$FREQ,
        title.ext = names(mod$fwl)[i], jitter.ammount = 0.5, log.scale = FALSE)
}))
## End(Not run)