Bag of Words

Usage

bag_o_words(text.var, apostrophe.remove = FALSE, ...)
unbag(text.var, na.rm = TRUE)
breaker(text.var)
word_split(text.var)

Arguments

text.var
The text variable.
apostrophe.remove
logical. If TRUE, removes apostrophes from the output.
na.rm
logical. If TRUE, NAs are removed before pasting.
...
Additional arguments passed to strip.
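
The effect of the two logical arguments can be illustrated with a minimal base R sketch. The helper names strip_apostrophes and paste_words are hypothetical and exist only for this illustration; they are not part of qdap and do not reproduce its internals.

```r
# Hypothetical helpers for illustration only; not qdap's implementation.

# Mimics the idea behind apostrophe.remove: drop apostrophes when asked.
strip_apostrophes <- function(words, apostrophe.remove = FALSE) {
  if (apostrophe.remove) gsub("'", "", words, fixed = TRUE) else words
}

# Mimics the idea behind na.rm: drop NAs before gluing words together.
paste_words <- function(words, na.rm = TRUE) {
  if (na.rm) words <- words[!is.na(words)]
  paste(words, collapse = " ")
}

words <- c("i'm", NA, "going", "home")

strip_apostrophes(words, apostrophe.remove = TRUE)  # "i'm" becomes "im"
paste_words(words)                                  # NA dropped before pasting
paste_words(words, na.rm = FALSE)                   # NA is pasted as the text "NA"
```

With na.rm = FALSE, paste() converts the missing value to the literal string "NA", which is usually why removing NAs first is the safer default.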

Value

bag_o_words - Returns a vector of stripped words.

unbag - Returns a string.

breaker - Returns a vector of stripped words and qdap recognized end marks (i.e., ".", "!", "?", "*", "-").

Description

bag_o_words - Reduces a text column to a bag of words.

unbag - Wrapper for paste(collapse=" ") to glue words back into strings.

breaker - Reduces a text column to a bag of words and qdap recognized end marks.

word_split - Reduces a text column to a list of vectors, each holding a bag of words and qdap recognized end marks (i.e., ".", "!", "?", "*", "-").
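
As a rough illustration of the difference between a plain bag of words and one that keeps end marks, the following base R sketch approximates both ideas. The functions bag_sketch and breaker_sketch are hypothetical names used only here; the real functions pass their text through strip and may differ in details such as case handling, digits, and hyphenated words.

```r
# Illustrative base R approximation; not qdap's implementation.

# Bag of words: lower-case, drop everything except letters and apostrophes,
# then split on runs of spaces.
bag_sketch <- function(text) {
  text <- tolower(text)
  text <- gsub("[^a-z' ]", " ", text)
  words <- unlist(strsplit(text, " +"))
  words[!is.na(words) & nzchar(words)]
}

# Bag of words plus end marks: pad each end mark with spaces so that it
# becomes its own token. Case is left untouched in this sketch.
breaker_sketch <- function(text) {
  text <- gsub("([.!?*-])", " \\1 ", text)
  tokens <- unlist(strsplit(text, " +"))
  tokens[!is.na(tokens) & nzchar(tokens)]
}

x <- c("I'm going home!", "Are you sure?")

bag_sketch(x)                         # words only, end marks dropped
breaker_sketch(x)                     # words with "!" and "?" as separate tokens
paste(bag_sketch(x), collapse = " ")  # the gluing step that unbag wraps
```

Because the sketch treats every "-" as an end mark, hyphenated words are split apart; whether that is desirable depends on the text being processed.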

Examples

## Not run:
# bag_o_words("I'm going home!")
# bag_o_words("I'm going home!", apostrophe.remove = TRUE)
# unbag(bag_o_words("I'm going home!"))
#
# bag_o_words(DATA$state)
# by(DATA$state, DATA$person, bag_o_words)
# lapply(DATA$state, bag_o_words)
#
# breaker(DATA$state)
# by(DATA$state, DATA$person, breaker)
# lapply(DATA$state, breaker)
# unbag(breaker(DATA$state))
#
# word_split(c(NA, DATA$state))
# unbag(word_split(c(NA, DATA$state)))
#
## End(Not run)