corpustools  

Managing, Querying and Analyzing Tokenized Text


Download and install the corpustools package from the R console
Install from CRAN:
install.packages("corpustools")

Install from GitHub (the read-only CRAN mirror repository):
library("remotes")
install_github("cran/corpustools")

Install by package version:
library("remotes")
install_version("corpustools", "0.5.1")



Attach the package and use:
library("corpustools")
Maintained by
Kasper Welbers
First Published: 2017-10-03
Latest Update: 2023-05-08
Description:
Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.
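The features named in the description can be illustrated with a minimal sketch. The three toy texts below are invented for illustration and are not part of the package; the calls used (create_tcorpus, search_features, get_dtm) appear in the function list on this page, but the argument choices shown are only one plausible way to use them.

```r
library(corpustools)

# Three toy documents, made up for this sketch
texts <- c(
  "The war on terror dominated the speech.",
  "Taxes and jobs were the main topics.",
  "Terrorism and security were mentioned again."
)

# Build a tCorpus: one row per token, with token positions retained
tc <- create_tcorpus(texts)
head(tc$tokens)

# Lucene-like query: wildcard search on the token column
hits <- search_features(tc, query = "terror*")
hits$hits

# Export to a document-term matrix for use with other text analysis packages
dtm <- get_dtm(tc, feature = "token")
dim(dtm)
```

Because the token positions are kept in tc$tokens, each hit can be traced back to its document and position, which is what makes it possible to inspect matches in context.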
How to cite:
Kasper Welbers (2017). corpustools: Managing, Querying and Analyzing Tokenized Text. R package version 0.5.1, https://cran.r-project.org/web/packages/corpustools. Accessed 07 Apr. 2025.
Previous versions and publish date:
0.3 (2017-10-03 15:55), 0.3.1 (2017-12-13 10:27), 0.3.3 (2018-04-20 13:46), 0.4.1 (2019-11-20 00:10), 0.4.2 (2020-01-23 14:00), 0.4.4 (2021-01-07 12:00), 0.4.5 (2021-01-13 11:50), 0.4.6 (2021-02-03 09:50), 0.4.7 (2021-02-28 18:50), 0.4.8 (2021-06-25 09:30), 0.4.9 (2022-01-23 20:32), 0.4.10 (2022-05-11 12:00)
Functions, R code, and examples using the corpustools R package
Some associated functions: add_multitoken_label . agg_label . agg_tcorpus . aggregate_rsyntax . as.tcorpus.default . as.tcorpus . as.tcorpus.tCorpus . backbone_filter . browse_hits . browse_texts . calc_chi2 . compare_corpus . compare_documents . compare_subset . corenlp_tokens . count_tcorpus . create_tcorpus . docfreq_filter . dtm_compare . dtm_wordcloud . ego_semnet . export_span_annotations . feature_associations . feature_stats . fold_rsyntax . freq_filter . get_dtm . get_global_i . get_kwic . get_stopwords . laplace . melt_quanteda_dict . merge_tcorpora . plot.contextHits . plot.featureAssociations . plot.featureHits . plot.vocabularyComparison . plot_semnet . plot_words . preprocess_tokens . print.contextHits . print.featureHits . print.tCorpus . refresh_tcorpus . require_package . search_contexts . search_dictionary . search_features . semnet . semnet_window . set_network_attributes . sgt . show_udpipe_models . sotu_texts . stopwords_list . subset.tCorpus . subset_query . summary.contextHits . summary.featureHits . summary.tCorpus . tCorpus-cash-annotate_rsyntax . tCorpus-cash-code_dictionary . tCorpus-cash-code_features . tCorpus-cash-context . tCorpus-cash-deduplicate . tCorpus-cash-delete_columns . tCorpus-cash-feats_to_columns . tCorpus-cash-feature_subset . tCorpus-cash-fold_rsyntax . tCorpus-cash-get . tCorpus-cash-lda_fit . tCorpus-cash-merge . tCorpus-cash-preprocess . tCorpus-cash-replace_dictionary . tCorpus-cash-search_recode . tCorpus-cash-set . tCorpus-cash-set_levels . tCorpus-cash-set_name . tCorpus-cash-subset . tCorpus-cash-subset_query . tCorpus-cash-udpipe_clauses . tCorpus-cash-udpipe_quotes . tCorpus . tCorpus_compare . tCorpus_create . tCorpus_data . tCorpus_docsim . tCorpus_features . tCorpus_modify_by_reference . tCorpus_querying . tCorpus_semnet . tCorpus_topmod . tc_plot_tree . tc_sotu_udpipe . tokenWindowOccurence . tokens_to_tcorpus . top_features . transform_rsyntax . udpipe_clause_tqueries . udpipe_quote_tqueries . 
udpipe_simplify . udpipe_spanquote_tqueries . udpipe_tcorpus . untokenize
Some associated R code: RcppExports.R
[Chart: daily downloads for corpustools during the last 30 days, 03/08 to 04/06]


© Copyright since 2022. All rights reserved, rpkg.net. Based in Cambridge, Massachusetts, USA