Improving model by AmarskiyArtem · Pull Request #18 · SimulatorML/SpamKiller

AmarskiyArtem · 2024-03-21T21:01:41Z

No description provided.

AmarskiyArtem · 2024-03-21T21:06:19Z

        logger.info("Predicting...")
        pred_scores = []
-        name_features = ""
+        name_features = []


I changed this because previously a single shared string was returned for the whole dataset.

Oh, thanks
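To illustrate the fix discussed above, here is a minimal sketch of why a string accumulator merges the features of every message while a list keeps one entry per message. The function names and the sample feature strings are hypothetical and are not taken from the SpamKiller code; only the `""` versus `[]` initialization comes from the diff.

```python
# Illustrative sketch only: hypothetical helpers, not the project's predict().

def collect_as_string(per_message_features):
    # Old behaviour: one shared string for the whole dataset.
    name_features = ""
    for features in per_message_features:
        name_features += features
    return name_features


def collect_as_list(per_message_features):
    # New behaviour: one entry per message, aligned with the input rows.
    name_features = []
    for features in per_message_features:
        name_features.append(features)
    return name_features


# Hypothetical feature strings for three messages (the second fired no rules).
rows = ["[links] ", "", "[emoji] [stop_word] "]

merged = collect_as_string(rows)
per_message = collect_as_list(rows)
```

With the string accumulator it is no longer possible to tell which message triggered which rule; with the list, `per_message[i]` still corresponds to row `i`.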

sokolgood · 2024-04-09T20:28:24Z

            total_score += temp_score
            name_features += temp_name_features
        total_score_normalized = self._normalize_score(total_score, threshold=1)
+        if len(X.iloc[0, :]["text"].split()) < 2 and all(


I think that if you don't want to penalize short messages, you should just remove the _len_msg rule.

Hacks like this complicate the code and aren't logical.

I did it this way because short messages can be spam too, e.g. a Telegram link plus an image, or a message consisting of a single stop word. Most short messages are fine, though, and shouldn't be penalized. I relied on the train and test sets and some collected examples. This removed some false positives, while obvious spam was still blocked (FN did not grow). Simply removing the short-message rule brings no benefit.
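The idea described in this reply can be sketched roughly as follows. The condition after `all(` is truncated in the diff, so the check used here (no rule other than the length rule fired) is an assumption and not the PR's actual condition. The rule name `len_message`, the function name, and the dict-of-scores shape are hypothetical as well; only the `len(text.split()) < 2` test comes from the diff.

```python
# Hypothetical sketch: waive the short-message penalty only when no other
# rule fired. Not the PR's actual condition, which is truncated in the diff.

SHORT_RULE = "len_message"  # assumed rule name


def score_with_short_exemption(text, rule_scores):
    """rule_scores maps a rule name to the score that rule contributed."""
    total = sum(rule_scores.values())
    is_short = len(text.split()) < 2
    # True when every rule other than the length rule contributed nothing.
    only_length_rule_fired = all(
        score == 0 for name, score in rule_scores.items() if name != SHORT_RULE
    )
    if is_short and only_length_rule_fired:
        # Short and otherwise clean: drop the length penalty.
        return total - rule_scores.get(SHORT_RULE, 0)
    return total


clean_short = score_with_short_exemption("ok", {"len_message": 0.25, "links": 0.0})
spam_short = score_with_short_exemption("t.me/x", {"len_message": 0.25, "links": 0.5})
```

Under these assumptions a short, otherwise clean message loses its penalty, while a short message that also triggers another rule keeps its full score, which matches the stated goal of fewer false positives without more false negatives.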

sokolgood · 2024-04-09T20:29:42Z

        logger.info("Predicting...")
        pred_scores = []
-        name_features = ""
+        name_features = []


Oh, thanks

sokolgood · 2024-04-09T20:30:40Z

            total_score += temp_score
            name_features += temp_name_features
        total_score_normalized = self._normalize_score(total_score, threshold=1)
+        if len(message["text"].split()) < 2 and all(


Same thing here again.

What kind of messages is this aimed at? It really complicates the code, and I don't see the purpose yet.

sokolgood · 2024-04-09T20:31:45Z


        return score, feature

-    def _check_len_message(self, message):


You removed the rule from the validation model but left it in prod.

In that case, remove this rule from the production model as well.

AmarskiyArtem added 4 commits February 28, 2024 23:29

update gitignore

645163b

Add vpn to gpt promt

ae5a14c

add more dangerous emoji

e65c233

update rools

cbbe594

AmarskiyArtem commented Mar 21, 2024


sokolgood suggested changes Apr 9, 2024


Pinpupik force-pushed the main branch from 78704c7 to 6c2b377 Compare January 13, 2025 21:30

bmmjam force-pushed the main branch from 6c2b377 to 66c896b Compare January 14, 2025 04:04

AmarskiyArtem wants to merge 4 commits into SimulatorML:main from AmarskiyArtem:improving-model
