
corpustools  

Managing, Querying and Analyzing Tokenized Text


Download and install the corpustools package from within the R console
Install from CRAN:
install.packages("corpustools")

Install from GitHub:
library("remotes")
install_github("cran/corpustools")

Install by package version:
library("remotes")
install_version("corpustools", "0.5.1")



Attach the package and use:
library("corpustools")
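The installation steps above can be combined into a guarded form that only installs when the package is missing. This is a minimal sketch using base R functions only; it assumes a CRAN mirror is configured and reachable.

```r
# Install corpustools from CRAN only if it is not already available.
if (!requireNamespace("corpustools", quietly = TRUE)) {
  install.packages("corpustools")
}

# Attach the package and report which version ended up installed.
library("corpustools")
print(packageVersion("corpustools"))
```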
Maintained by
Kasper Welbers
First Published: 2017-10-03
Latest Update: 2023-05-08
Description:
Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.
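The workflow described above (tokenize, query, export to a DTM) might look roughly like the following. This is a sketch rather than the package's canonical usage: it relies on the functions create_tcorpus, search_features, get_kwic and get_dtm and the bundled sotu_texts data listed on this page, and it assumes sotu_texts has "id" and "text" columns and that the argument names shown match the installed version.

```r
library("corpustools")

# Build a tokenized corpus (tCorpus) from the bundled example data.
# Assumption: sotu_texts has an "id" column and a "text" column.
tc <- create_tcorpus(sotu_texts, doc_column = "id", text_columns = "text")

# Add a preprocessed "feature" column next to the original tokens.
# The tCorpus is modified by reference, so no assignment is needed.
tc$preprocess(column = "token", new_column = "feature",
              lowercase = TRUE, remove_stopwords = TRUE)

# Lucene-like query: find tokens starting with "terror".
hits <- search_features(tc, query = "terror*")

# Keyword-in-context listing for the same query.
kwic <- get_kwic(tc, query = "terror*")

# Export a document-term matrix built from the preprocessed features.
dtm <- get_dtm(tc, feature = "feature")

print(head(hits$hits))
print(dim(dtm))
```

Because the original tokens are kept alongside the preprocessed features, query hits can still be read in their original wording.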
How to cite:
Kasper Welbers (2017). corpustools: Managing, Querying and Analyzing Tokenized Text. R package version 0.5.1, https://cran.r-project.org/web/packages/corpustools. Accessed 01 May. 2025.
Previous versions and publish date:
0.3 (2017-10-03 15:55), 0.3.1 (2017-12-13 10:27), 0.3.3 (2018-04-20 13:46), 0.4.1 (2019-11-20 00:10), 0.4.2 (2020-01-23 14:00), 0.4.4 (2021-01-07 12:00), 0.4.5 (2021-01-13 11:50), 0.4.6 (2021-02-03 09:50), 0.4.7 (2021-02-28 18:50), 0.4.8 (2021-06-25 09:30), 0.4.9 (2022-01-23 20:32), 0.4.10 (2022-05-11 12:00)
Functions, R codes and Examples using the corpustools R package
Some associated functions: add_multitoken_label . agg_label . agg_tcorpus . aggregate_rsyntax . as.tcorpus.default . as.tcorpus . as.tcorpus.tCorpus . backbone_filter . browse_hits . browse_texts . calc_chi2 . compare_corpus . compare_documents . compare_subset . corenlp_tokens . count_tcorpus . create_tcorpus . docfreq_filter . dtm_compare . dtm_wordcloud . ego_semnet . export_span_annotations . feature_associations . feature_stats . fold_rsyntax . freq_filter . get_dtm . get_global_i . get_kwic . get_stopwords . laplace . melt_quanteda_dict . merge_tcorpora . plot.contextHits . plot.featureAssociations . plot.featureHits . plot.vocabularyComparison . plot_semnet . plot_words . preprocess_tokens . print.contextHits . print.featureHits . print.tCorpus . refresh_tcorpus . require_package . search_contexts . search_dictionary . search_features . semnet . semnet_window . set_network_attributes . sgt . show_udpipe_models . sotu_texts . stopwords_list . subset.tCorpus . subset_query . summary.contextHits . summary.featureHits . summary.tCorpus . tCorpus-cash-annotate_rsyntax . tCorpus-cash-code_dictionary . tCorpus-cash-code_features . tCorpus-cash-context . tCorpus-cash-deduplicate . tCorpus-cash-delete_columns . tCorpus-cash-feats_to_columns . tCorpus-cash-feature_subset . tCorpus-cash-fold_rsyntax . tCorpus-cash-get . tCorpus-cash-lda_fit . tCorpus-cash-merge . tCorpus-cash-preprocess . tCorpus-cash-replace_dictionary . tCorpus-cash-search_recode . tCorpus-cash-set . tCorpus-cash-set_levels . tCorpus-cash-set_name . tCorpus-cash-subset . tCorpus-cash-subset_query . tCorpus-cash-udpipe_clauses . tCorpus-cash-udpipe_quotes . tCorpus . tCorpus_compare . tCorpus_create . tCorpus_data . tCorpus_docsim . tCorpus_features . tCorpus_modify_by_reference . tCorpus_querying . tCorpus_semnet . tCorpus_topmod . tc_plot_tree . tc_sotu_udpipe . tokenWindowOccurence . tokens_to_tcorpus . top_features . transform_rsyntax . udpipe_clause_tqueries . udpipe_quote_tqueries . 
udpipe_simplify . udpipe_spanquote_tqueries . udpipe_tcorpus . untokenize . 
Some associated R codes: RcppExports.R
Downloads during the last 30 days
[Chart: daily downloads for corpustools, 04/01 through 04/29]


© Copyright since 2022. All rights reserved, rpkg.net. Based in Cambridge, Massachusetts, USA