
corpustools  

Managing, Querying and Analyzing Tokenized Text


Download and install corpustools package within the R console
Install from CRAN:
install.packages("corpustools")

Install from Github:
library("remotes")
install_github("cran/corpustools")

Install by package version:
library("remotes")
install_version("corpustools", "0.5.2")



Attach the package and use:
library("corpustools")
Maintained by
Kasper Welbers
First Published: 2017-10-03
Latest Update: 2025-07-07
Description:
Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.
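A minimal sketch of the workflow the description outlines, assuming the package is installed and using the `sotu_texts` demo data that ships with it. The query string `"terror*"` and the variable names are illustrative choices, not taken from this page; consult the package documentation for the full argument lists.

```r
library(corpustools)

# Build a tokenized corpus (tCorpus) from the bundled State of the Union texts.
# Token positions are kept, so hits can be traced back to their context.
tc <- create_tcorpus(sotu_texts, doc_column = "id", text_columns = "text")

# Lucene-like query: find tokens that start with "terror" (illustrative query).
hits <- search_features(tc, query = "terror*")
summary(hits)

# Keyword-in-context view of the same query, to help interpret the hits.
kwic <- get_kwic(tc, query = "terror*")
head(kwic)

# Export to a document-term matrix for use with other text analysis packages.
dtm <- get_dtm(tc, feature = "token")
dim(dtm)
```

The tCorpus object keeps the tokens and the document metadata together, which is why the same `tc` can be passed to the querying and the DTM export functions.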
How to cite:
Kasper Welbers (2017). corpustools: Managing, Querying and Analyzing Tokenized Text. R package version 0.5.2, https://cran.r-project.org/web/packages/corpustools. Accessed 10 Mar. 2026.
Previous versions and publish date:
0.3 (2017-10-03 15:55), 0.3.1 (2017-12-13 10:27), 0.3.3 (2018-04-20 13:46), 0.4.1 (2019-11-20 00:10), 0.4.2 (2020-01-23 14:00), 0.4.4 (2021-01-07 12:00), 0.4.5 (2021-01-13 11:50), 0.4.6 (2021-02-03 09:50), 0.4.7 (2021-02-28 18:50), 0.4.8 (2021-06-25 09:30), 0.4.9 (2022-01-23 20:32), 0.4.10 (2022-05-11 12:00), 0.5.1 (2023-05-08 11:50)
Functions, R codes and Examples using the corpustools R package
Some associated functions: add_multitoken_label . agg_label . agg_tcorpus . aggregate_rsyntax . as.tcorpus.default . as.tcorpus . as.tcorpus.tCorpus . backbone_filter . browse_hits . browse_texts . calc_chi2 . compare_corpus . compare_documents . compare_subset . corenlp_tokens . count_tcorpus . create_tcorpus . docfreq_filter . dtm_compare . dtm_wordcloud . ego_semnet . export_span_annotations . feature_associations . feature_stats . fold_rsyntax . freq_filter . get_dtm . get_global_i . get_kwic . get_stopwords . laplace . melt_quanteda_dict . merge_tcorpora . plot.contextHits . plot.featureAssociations . plot.featureHits . plot.vocabularyComparison . plot_semnet . plot_words . preprocess_tokens . print.contextHits . print.featureHits . print.tCorpus . refresh_tcorpus . require_package . search_contexts . search_dictionary . search_features . semnet . semnet_window . set_network_attributes . sgt . show_udpipe_models . sotu_texts . stopwords_list . subset.tCorpus . subset_query . summary.contextHits . summary.featureHits . summary.tCorpus . tCorpus-cash-annotate_rsyntax . tCorpus-cash-code_dictionary . tCorpus-cash-code_features . tCorpus-cash-context . tCorpus-cash-deduplicate . tCorpus-cash-delete_columns . tCorpus-cash-feats_to_columns . tCorpus-cash-feature_subset . tCorpus-cash-fold_rsyntax . tCorpus-cash-get . tCorpus-cash-lda_fit . tCorpus-cash-merge . tCorpus-cash-preprocess . tCorpus-cash-replace_dictionary . tCorpus-cash-search_recode . tCorpus-cash-set . tCorpus-cash-set_levels . tCorpus-cash-set_name . tCorpus-cash-subset . tCorpus-cash-subset_query . tCorpus-cash-udpipe_clauses . tCorpus-cash-udpipe_quotes . tCorpus . tCorpus_compare . tCorpus_create . tCorpus_data . tCorpus_docsim . tCorpus_features . tCorpus_modify_by_reference . tCorpus_querying . tCorpus_semnet . tCorpus_topmod . tc_plot_tree . tc_sotu_udpipe . tokenWindowOccurence . tokens_to_tcorpus . top_features . transform_rsyntax . udpipe_clause_tqueries . udpipe_quote_tqueries . 
udpipe_simplify . udpipe_spanquote_tqueries . udpipe_tcorpus . untokenize . 
Some associated R codes: RcppExports.R