Check Text For Potential Problems


check_text(text.var, file = NULL)


text.var    The text variable.
file        A connection, or a character string naming the file to print to. If NULL, output prints to the console. Note that this is assigned as an attribute and passed to print.

Returns a list with the following reports of potential text faults:

  • non_character- Text that is non-character.
  • missing_ending_punctuation- Text with no endmark at the end of the string.
  • empty- Text that contains an empty element (i.e., "").
  • double_punctuation- Text that contains two qdap punctuation marks in the same string.
  • non_space_after_comma- Text that contains commas with no space after them.
  • no_alpha- Text that contains string elements with no alphabetic characters.
  • non_ascii- Text that contains non-ASCII characters.
  • missing_value- Text that contains missing values (i.e., NA).
  • containing_escaped- Text that contains escaped characters (see ?Quotes).
  • containing_digits- Text that contains digits.
  • indicating_incomplete- Text that contains endmarks that are indicative of incomplete/trailing sentences (e.g., ...).
  • potentially_misspelled- Text that contains potentially misspelled words.
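
As a rough illustration of what a few of these reports flag, the base R sketch below approximates five of the checks with simple regular expressions. The helper name sketch_checks and the patterns are assumptions made for this example. They are not qdap's implementation and may not agree with check_text on edge cases (for instance, empty strings are deliberately excluded from no_alpha here).

```r
# Hypothetical helper: base R approximations of a few check_text reports.
# Each element holds the indices of the strings that trigger that report.
sketch_checks <- function(text.var) {
    x <- as.character(text.var)
    list(
        empty = which(!is.na(x) & x == ""),
        missing_value = which(is.na(x)),
        # a comma immediately followed by a non-whitespace character
        non_space_after_comma = which(grepl(",[^[:space:]]", x)),
        # non-empty, non-missing strings without any alphabetic character
        no_alpha = which(!is.na(x) & x != "" & !grepl("[[:alpha:]]", x)),
        containing_digits = which(grepl("[[:digit:]]", x))
    )
}

x <- c("i like", "", NA, "they,were there", "?", "3;", "i 4like...")
str(sketch_checks(x))
```

Returning indices rather than the strings themselves keeps each report small and lets the caller look up the offending text in the original vector.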


Uncleaned text may result in errors, warnings, and incorrect results in subsequent analysis. check_text checks text for potential problems and suggests possible fixes. Potential text anomalies that are detected include: factors, missing ending punctuation, empty cells, double punctuation, non-space after comma, no alphabetic characters, non-ASCII characters, missing values, and potentially misspelled words.


The output is a list, but it prints as formatted output showing the potential problem elements, the accompanying text, and suggestions for fixing the text.
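
The list-plus-print behaviour described here follows the usual S3 pattern in R. The toy class below (toy_report, with the hypothetical constructor new_toy_report) is only a sketch of that pattern, assuming the destination is stored in a "file" attribute and picked up by the print method. It is not qdap's source.

```r
# Hypothetical constructor: a plain list with a class and a "file" attribute.
new_toy_report <- function(problems, file = NULL) {
    out <- structure(problems, class = "toy_report")
    attr(out, "file") <- file
    out
}

# The print method formats the list and sends it to the stored destination.
print.toy_report <- function(x, file = attr(x, "file"), ...) {
    lines <- unlist(lapply(names(x), function(nm) {
        c(paste0("== ", nm, " =="),
          paste("elements:", paste(x[[nm]], collapse = ", ")),
          "")
    }))
    # cat() treats file = "" as the console
    cat(lines, file = if (is.null(file)) "" else file, sep = "\n")
    invisible(x)
}

report <- new_toy_report(list(empty = 4L, missing_value = 5L))
report$empty      # still an ordinary list underneath
print(report)     # formatted report on the console

tmp <- tempfile(fileext = ".txt")
print(new_toy_report(list(empty = 4L), file = tmp))  # written to tmp instead
readLines(tmp)
```

Keeping the data as a plain list means individual reports stay accessible with $ or [[, while the class only changes how the object is displayed.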


## Not run:
x <- c("i like", "i want. thet them .", "I am ! that|", "", NA,
    "they,were there", ".", " ", "?", "3;", "I like goud eggs!",
    "i 4like...", "\\tgreat", "She said \"yes\"")
check_text(x)
print(check_text(x), include.text=FALSE)

y <- c("A valid sentence.", "yet another!")
check_text(y)
## End(Not run)

See also