Commit e621cede authored by Eelco van der Wel

smaller min topic size

parent 88918508
Pipeline #12558 passed with stage in 9 minutes and 39 seconds
Showing with 2 additions and 2 deletions
@@ -10,6 +10,7 @@ def compute_umap(embeddings: np.ndarray) -> np.ndarray:
         min_dist=0.0,
         metric='cosine',
         low_memory=False,
         random_state=42,
     )
     return umap.fit_transform(embeddings)
......
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple, Union
import hdbscan
import numpy as np
import torch
from keybert import KeyBERT
......
@@ -18,7 +18,7 @@ from .preprocessing import PreprocessedTweet, preprocess_tweets
from .schema import Cluster, ClusterEntry, TwitterTopicModel
from .utils import get_tweets
-MIN_TOPIC_SIZE = 8
+MIN_TOPIC_SIZE = 6
NUM_TOPIC_DESCRIPTORS = 3
DESCRIPTION_DIVERSITY = 0.3
......