
tokenizers  

Fast, Consistent Tokenization of Natural Language Text


Download and install the tokenizers package from within the R console
Install from CRAN:
install.packages("tokenizers")

Install from GitHub:
library("remotes")
install_github("cran/tokenizers")

Install by package version:
library("remotes")
install_version("tokenizers", "0.3.0")



Attach the package and use:
library("tokenizers")
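
As a minimal sketch of what basic use can look like once the package is attached, the example below splits a short string into words and sentences. The sample text and variable names are illustrative assumptions; `tokenize_words()` and `tokenize_sentences()` are functions exported by tokenizers.

```r
library("tokenizers")

# Illustrative sample text (an assumption, not taken from the package docs)
text <- "The quick brown fox. It jumps over the lazy dog!"

# Each tokenizer returns a list with one element per input document
words <- tokenize_words(text)
sentences <- tokenize_sentences(text)

print(words[[1]])      # lowercased words; punctuation is stripped by default
print(sentences[[1]])  # one string per detected sentence
```

Because the result is a list, index it with `[[1]]` to get the tokens for the first (here, only) document.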
Maintained by
Lincoln Mullen
First Published: 2016-04-02
Latest Update: 2022-12-22
Description:
Convert natural language text into tokens. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, shingled characters, lines, Penn Treebank, and regular expressions, as well as functions for counting characters, words, and sentences, and a function for splitting longer texts into separate documents, each with the same number of words. The tokenizers have a consistent interface, and the package is built on the 'stringi' and 'Rcpp' packages for fast yet correct tokenization in 'UTF-8'.
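
The consistent interface described above can be sketched as follows, assuming a short made-up sample string. The example shows an n-gram tokenizer, two of the counting functions, and the text-splitting function; the variable names and the chunk size of 5 are illustrative choices only.

```r
library("tokenizers")

# Illustrative sample text (an assumption, not taken from the package docs)
text <- "The quick brown fox. It jumps over the lazy dog!"

# Shingled n-grams: n = 2 yields overlapping two-word tokens
bigrams <- tokenize_ngrams(text, n = 2)

# Counting helpers return one count per input document
n_words <- count_words(text)
n_sentences <- count_sentences(text)

# Split a longer text into smaller documents of chunk_size words each
chunks <- chunk_text(text, chunk_size = 5)

print(bigrams[[1]])
print(n_words)
print(n_sentences)
print(chunks)
```

The tokenizers take a character vector as their first argument and return a list, so swapping one tokenizer for another usually means changing only the function name and its specific options.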
How to cite:
Lincoln Mullen (2016). tokenizers: Fast, Consistent Tokenization of Natural Language Text. R package version 0.3.0, https://cran.r-project.org/web/packages/tokenizers. Accessed 21 Nov. 2024.
Previous versions and publish date:
0.1.0 (2016-04-02 23:34), 0.1.1 (2016-04-04 08:37), 0.1.2 (2016-04-14 18:19), 0.1.3 (2016-08-18 23:27), 0.1.4 (2016-08-29 22:59), 0.2.0 (2018-03-21 15:43), 0.2.1 (2018-03-29 22:07), 0.2.3 (2022-09-23 22:00)

© Copyright 2022 - present. All rights reserved, rpkg.net.  Based in Cambridge, Massachusetts, USA