
tokenizers  

Fast, Consistent Tokenization of Natural Language Text


Download and install the tokenizers package from the R console
Install from CRAN:
install.packages("tokenizers")

Install from GitHub:
library("remotes")
install_github("cran/tokenizers")

Install by package version:
library("remotes")
install_version("tokenizers", "0.3.0")



Attach the package and use:
library("tokenizers")
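
After installing, it can help to confirm which version R actually picked up. The following is a minimal sketch using base R only; the threshold "0.3.0" is taken from the version shown on this page, and the variable names are purely illustrative.

```r
# Check for the package without attaching it; requireNamespace() returns
# FALSE instead of raising an error when the package is missing.
is_installed <- requireNamespace("tokenizers", quietly = TRUE)

if (is_installed) {
  # packageVersion() returns a version object that compares correctly
  # against a version string.
  installed_version <- utils::packageVersion("tokenizers")
  meets_minimum <- installed_version >= "0.3.0"
  message("tokenizers ", as.character(installed_version),
          if (meets_minimum) " (0.3.0 or newer)" else " (older than 0.3.0)")
} else {
  message("tokenizers is not installed; see the install commands above.")
}
```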
Maintained by
Lincoln Mullen
First Published: 2016-04-02
Latest Update: 2022-12-22
Description:
Convert natural language text into tokens. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, shingled characters, lines, Penn Treebank, and regular expressions, as well as functions for counting characters, words, and sentences, and a function for splitting longer texts into separate documents, each with the same number of words. The tokenizers have a consistent interface, and the package is built on the 'stringi' and 'Rcpp' packages for fast yet correct tokenization in 'UTF-8'.
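
As a rough illustration of the consistent interface described above, the sketch below runs a few of the tokenizers on one short string. The sample text and variable names are made up for this example; the functions (tokenize_words, tokenize_sentences, tokenize_ngrams, count_words) are exported by the package, and each tokenizer returns a list with one element per input document.

```r
library(tokenizers)

# Illustrative input: a single document made of two short sentences.
text <- "The quick brown fox jumps. It lands softly."

# Word tokens; by default the text is lowercased and punctuation is stripped.
words <- tokenize_words(text)

# Sentence tokens; the original case and punctuation are kept.
sentences <- tokenize_sentences(text)

# Shingled n-grams; n = 2 yields overlapping two-word sequences.
bigrams <- tokenize_ngrams(text, n = 2)

# Word count for each input document.
n_words <- count_words(text)

# Every tokenizer returns a list, so the tokens for the first (and only)
# document are in element [[1]].
print(words[[1]])
print(sentences[[1]])
print(bigrams[[1]])
print(n_words)
```

Because every function takes a character vector and returns a list of the same length, switching from one kind of token to another usually means changing only the function name.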
How to cite:
Lincoln Mullen (2016). tokenizers: Fast, Consistent Tokenization of Natural Language Text. R package version 0.3.0, https://cran.r-project.org/web/packages/tokenizers. Accessed 22 Dec. 2024.
Previous versions and publish date:
0.1.0 (2016-04-02 23:34), 0.1.1 (2016-04-04 08:37), 0.1.2 (2016-04-14 18:19), 0.1.3 (2016-08-18 23:27), 0.1.4 (2016-08-29 22:59), 0.2.0 (2018-03-21 15:43), 0.2.1 (2018-03-29 22:07), 0.2.3 (2022-09-23 22:00)

© Copyright 2022 - present. All rights reserved, rpkg.net. Based in Cambridge, Massachusetts, USA