#Toolbox


delicate pewter
#

Needful Things

#

HDBSCAN
HDBSCAN - Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection.

In practice this means that HDBSCAN returns a good clustering straight away with little or no parameter tuning -- and the primary parameter, minimum cluster size, is intuitive and easy to select.

HDBSCAN is ideal for exploratory data analysis; it's a fast and robust algorithm that you can trust to return meaningful clusters (if there are any).
https://github.com/scikit-learn-contrib/hdbscan

GitHub

A high performance implementation of HDBSCAN clustering.

#

Top2Vec is an algorithm for topic modeling and semantic search. It automatically detects topics present in text and generates jointly embedded topic, document and word vectors. Once you train the Top2Vec model you can:

Get number of detected topics.
Get topics.
Get topic sizes.
Get hierarchical topics.
Search topics by keywords.
Search documents by topic.
Search documents by keywords.
Find similar words.
Find similar documents.
Expose model with RESTful-Top2Vec
https://github.com/ddangelov/Top2Vec

GitHub

Top2Vec learns jointly embedded topic, document and word vectors.
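
A rough sketch of a few of the operations from the list above, following the calls shown in the Top2Vec README. `explore_topics` is a hypothetical helper name, and `speed="learn"` and `workers=4` are illustrative settings. It assumes a reasonably large corpus (the README trains on 20 Newsgroups) and a keyword that occurs in the model's vocabulary; tiny corpora or unknown keywords will likely raise an error.

```python
def explore_topics(documents, keyword):
    """Train Top2Vec on `documents` (a list of str) and run a few queries.

    Hypothetical helper for illustration; `keyword` must be a word the
    trained model has in its vocabulary.
    """
    # Imported lazily because top2vec is a heavy optional dependency.
    from top2vec import Top2Vec

    model = Top2Vec(documents, speed="learn", workers=4)

    # Number of detected topics and how many documents fall in each.
    num_topics = model.get_num_topics()
    topic_sizes, topic_nums = model.get_topic_sizes()

    # Top words per topic.
    topic_words, word_scores, topic_nums = model.get_topics(num_topics)

    # Topic closest to the keyword, then the documents nearest to that topic.
    _, _, _, nearest = model.search_topics(keywords=[keyword], num_topics=1)
    docs, doc_scores, doc_ids = model.search_documents_by_topic(
        topic_num=nearest[0], num_docs=5
    )

    # Words closest to the keyword in the joint embedding space.
    words, similarities = model.similar_words(
        keywords=[keyword], keywords_neg=[], num_words=10
    )

    return {
        "num_topics": num_topics,
        "topic_sizes": topic_sizes,
        "topic_words": topic_words,
        "nearest_topic": nearest[0],
        "example_doc_ids": doc_ids,
        "similar_words": words,
    }
```

Training is the slow part; once the model exists, the query calls return quickly.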

#

Neo4j Graph DB
Native graph storage, data science, ML, analytics, and visualization with enterprise-grade security controls to scale your transactional and analytical workloads – without constraints.
There's a free cloud instance
https://neo4j.com/pricing/

Graph Database & Analytics

Check out Neo4j pricing for AuraDB, our fully managed cloud graph database service, as well as our self-hosted graph database options, Enterprise & Community.

#

Docker

docker run --publish=7474:7474 --publish=7687:7687 neo4j:5.12.0
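
Of the two ports published above, 7474 serves the HTTP interface (Neo4j Browser) and 7687 serves the Bolt protocol that drivers connect to. Below is a minimal connectivity check, assuming the official `neo4j` Python driver is installed and the container is running locally. `ping_neo4j` and `BOLT_URI` are hypothetical names, and the user and password depend on how authentication was set up for the container.

```python
BOLT_URI = "bolt://localhost:7687"  # assumption: container published on localhost


def ping_neo4j(user, password, uri=BOLT_URI):
    """Run a trivial Cypher query and return its result (1 if all is well)."""
    # Imported lazily so this module loads even without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        # Fails early with a clear error if the server is unreachable.
        driver.verify_connectivity()
        with driver.session() as session:
            record = session.run("RETURN 1 AS ok").single()
            return record["ok"]
```

Calling `ping_neo4j("neo4j", "<your password>")` should return 1 once the database has finished starting up.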
#