
doc2vec  

Distributed Representations of Sentences, Documents and Topics


Download and install the doc2vec package from within the R console
Install from CRAN:
install.packages("doc2vec")

Install from Github:
library("remotes")
install_github("cran/doc2vec")

Install by package version:
library("remotes")
install_version("doc2vec", "0.2.0")



Attach the package and use:
library("doc2vec")
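
A minimal usage sketch, assuming the package is installed. The toy corpus and the hyperparameter values below are made up for illustration; `paragraph2vec()` takes a data.frame with `doc_id` and `text` columns. On a corpus this small the resulting vectors are not meaningful; the point is only the shape of the workflow.

```r
library(doc2vec)

# Toy corpus (illustrative only): one row per document, lower-cased text,
# words separated by spaces, no newlines inside a document.
x <- data.frame(
  doc_id = c("doc_1", "doc_2", "doc_3"),
  text   = c("the cat sat on the mat",
             "the dog sat on the rug",
             "stocks fell as markets closed"),
  stringsAsFactors = FALSE
)

# Train a PV-DBOW model; min_count = 1 keeps every word of this tiny corpus.
model <- paragraph2vec(x, type = "PV-DBOW", dim = 10, iter = 20,
                       min_count = 1, threads = 1)

# Extract the document embeddings: one column per embedding dimension.
doc_emb <- as.matrix(model, which = "docs")
dim(doc_emb)
```

Switching `type` to "PV-DM" trains the distributed memory variant instead.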
Maintained by: Jan Wijffels
First Published: 2020-12-10
Latest Update: 2021-03-28
Description:
Learn vector representations of sentences, paragraphs or documents using the 'Paragraph Vector' algorithms, namely the distributed bag of words ('PV-DBOW') and the distributed memory ('PV-DM') models. The techniques in the package are detailed in the paper "Distributed Representations of Sentences and Documents" by Le and Mikolov (2014). The package also provides an implementation to cluster documents based on these embeddings using a technique called top2vec. Top2vec finds clusters in text documents by combining document and word embeddings with density-based clustering. It first embeds documents in the semantic space defined by the 'doc2vec' algorithm. Next, it maps these document embeddings to a lower-dimensional space using the 'Uniform Manifold Approximation and Projection' (UMAP) dimensionality-reduction algorithm and finds dense areas in that space using a 'Hierarchical Density-Based Clustering' technique (HDBSCAN). These dense areas are the topic clusters. Each cluster can be represented by a topic vector, which is an aggregate of the embeddings of the documents in that cluster. Words that lie close to a topic vector in the same semantic space are representative of the topic. More details can be found in the paper 'Top2Vec: Distributed Representations of Topics' by D. Angelov.
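
The last step of the description (topic vector as an aggregate of document embeddings, then nearby words as topic labels) can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the embeddings, the cluster labels and the word list are invented, the aggregate is taken to be the mean, and closeness is measured by cosine similarity.

```r
# Toy document embeddings (rows = documents, columns = dimensions).
# Values and cluster labels are hypothetical.
doc_emb <- rbind(
  doc_1 = c(1.0, 0.0),
  doc_2 = c(0.8, 0.2),
  doc_3 = c(0.0, 1.0),
  doc_4 = c(0.2, 0.8)
)
cluster <- c(1, 1, 2, 2)  # assumed output of a density-based clustering step

# Topic vector = mean of the document embeddings in each cluster.
topic_vec <- rowsum(doc_emb, group = cluster) / as.vector(table(cluster))

# Toy word embeddings living in the same space as the documents.
word_emb <- rbind(
  bicycle = c(0.9, 0.1),
  rental  = c(0.7, 0.3),
  network = c(0.1, 0.9)
)

# Cosine similarity between every row of a and every row of b.
cosine <- function(a, b) {
  a <- a / sqrt(rowSums(a^2))
  b <- b / sqrt(rowSums(b^2))
  a %*% t(b)
}
sim <- cosine(topic_vec, word_emb)

# Most representative word for each topic.
top_word <- colnames(sim)[apply(sim, 1, which.max)]
top_word
```

With these toy values the first topic is labelled "bicycle" and the second "network", because those word vectors point in the same direction as the respective cluster means.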
How to cite:
Jan Wijffels (2020). doc2vec: Distributed Representations of Sentences, Documents and Topics. R package version 0.2.0, https://cran.r-project.org/web/packages/doc2vec. Accessed 06 May 2025.
Previous versions and publish date:
0.1.0 (2020-12-10 10:00), 0.1.1 (2021-01-21 18:20)
Downloads during the last 30 days
[Chart: daily downloads for doc2vec, 04/06 to 05/04; vertical axis 0 to 90]

