
Methods for Detoxification of Texts for the Russian Language (ruDetoxifier)

This repository contains the models and evaluation methodology for the task of detoxification of Russian texts. The original paper, “Methods for Detoxification of Texts for the Russian Language”, was presented at the Dialogue-2021 conference.

Inference Example

In this repository, we release our two best models, detoxGPT and condBERT (see Methodology for more details). You can try a detoxification inference example in this notebook or open it in Colab.
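A minimal inference sketch is shown below, assuming the released detoxGPT weights can be loaded as a standard Hugging Face causal language model; the checkpoint path and generation settings are placeholders, and the notebook above is the reference example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical local path to the released detoxGPT weights.
MODEL_PATH = "path/to/detoxGPT-checkpoint"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH)

# Feed a toxic sentence and let the model continue with a detoxified rewrite.
toxic_text = "..."  # input sentence to detoxify
inputs = tokenizer(toxic_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```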


Methodology

In our research, we tested several approaches:

Baselines

Simple baselines used for comparison: Duplicate (return the input unchanged), Delete (remove toxic words), and Retrieve (retrieve a similar non-toxic sentence from a corpus).

detoxGPT

Based on the ruGPT models. This method requires a parallel dataset for training. We tested the ruGPT-small, ruGPT-medium, and ruGPT-large models in several setups: zero-shot, few-shot, and fine-tuned (see the Results table).
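For the fine-tuned setup, each parallel (toxic, neutral) pair has to be serialised into a single training string for the causal language model. The sketch below shows one common way to do this; the separator and formatting here are an illustrative assumption, not the exact scheme used for detoxGPT.

```python
# Illustrative only: join a parallel (toxic, neutral) pair into one training
# string for causal-LM fine-tuning. The " === " separator is a placeholder.
def make_training_example(toxic: str, neutral: str, sep: str = " === ") -> str:
    return f"{toxic}{sep}{neutral}"

print(make_training_example("токсичное предложение", "нейтральный пересказ"))
```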

condBERT

Based on the BERT model. This method does not require a parallel dataset for training. One of the tasks on which the original BERT was pretrained – predicting the word that was replaced with the [MASK] token – suits the delete-retrieve-generate style transfer method. We tested RuBERT (DeepPavlov) and Geotrend pre-trained models in zero-shot and fine-tuned setups (see the Results table).
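The core idea can be illustrated with a plain masked-language-model query: mask the offending word and let BERT propose substitutes. The sketch below uses the multilingual BERT checkpoint purely for illustration; the actual condBERT pipeline additionally selects which tokens to mask and reranks candidates for non-toxicity.

```python
from transformers import pipeline

# Any BERT-style masked LM works for this illustration; the paper uses
# RuBERT (DeepPavlov) and Geotrend checkpoints.
fill_mask = pipeline("fill-mask", model="bert-base-multilingual-cased")

# Mask the word flagged as toxic and inspect the suggested replacements.
for candidate in fill_mask("Это [MASK] предложение."):
    print(candidate["token_str"], round(candidate["score"], 3))
```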


Automatic Evaluation

The evaluation consists of three types of metrics: style transfer accuracy (STA), content preservation (cosine similarity CS, word overlap WO, and BLEU), and language quality (perplexity PPL).

Finally, an aggregated metric (GM) combines these: the geometric mean of STA, CS, and inverse perplexity (1/PPL).
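As a sanity check, the sketch below computes GM under the assumption above (geometric mean of STA, CS, and 1/PPL); it reproduces the values in the Results table up to rounding.

```python
# Assumed aggregation: geometric mean of STA, CS, and the inverse of PPL
# (lower perplexity is better).
def geometric_mean_score(sta: float, cs: float, ppl: float) -> float:
    return (sta * cs / ppl) ** (1.0 / 3.0)

# e.g. the Delete baseline (STA=0.27, CS=0.96, PPL=263.55) gives ~0.10.
print(round(geometric_mean_score(0.27, 0.96, 263.55), 2))
```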

Launching

You can run the ru_metric.py script for evaluation. The fine-tuned weights for the toxicity classifier can be found here.
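For reference, the STA component boils down to checking how many outputs a toxicity classifier labels as non-toxic. The sketch below assumes the released classifier is a standard transformers sequence-classification checkpoint; the path and label name are placeholders, and ru_metric.py is the reference implementation.

```python
from transformers import pipeline

# Hypothetical path to the fine-tuned toxicity classifier weights.
clf = pipeline("text-classification", model="path/to/toxicity-classifier")

def style_transfer_accuracy(detoxified_sentences):
    # Fraction of outputs the classifier judges non-toxic; the "toxic" label
    # name is a placeholder for whatever label the released classifier uses.
    predictions = clf(detoxified_sentences)
    return sum(p["label"] != "toxic" for p in predictions) / len(predictions)
```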


Results

| Method | STA↑ | CS↑ | WO↑ | BLEU↑ | PPL↓ | GM↑ |
|---|---|---|---|---|---|---|
| **Baselines** | | | | | | |
| Duplicate | 0.00 | 1.00 | 1.00 | 1.00 | 146.00 | 0.05 ± 0.0012 |
| Delete | 0.27 | 0.96 | 0.85 | 0.81 | 263.55 | 0.10 ± 0.0007 |
| Retrieve | 0.91 | 0.85 | 0.07 | 0.09 | 65.74 | 0.22 ± 0.0010 |
| **detoxGPT-small** | | | | | | |
| zero-shot | 0.93 | 0.20 | 0.00 | 0.00 | 159.11 | 0.10 ± 0.0005 |
| few-shot | 0.17 | 0.70 | 0.05 | 0.06 | 83.38 | 0.11 ± 0.0009 |
| fine-tuned | 0.51 | 0.70 | 0.05 | 0.05 | 39.48 | 0.20 ± 0.0011 |
| **detoxGPT-medium** | | | | | | |
| fine-tuned | 0.49 | 0.77 | 0.18 | 0.21 | 86.75 | 0.16 ± 0.0009 |
| **detoxGPT-large** | | | | | | |
| fine-tuned | 0.61 | 0.77 | 0.22 | 0.21 | 36.92 | 0.23 ± 0.0010 |
| **condBERT** | | | | | | |
| DeepPavlov zero-shot | 0.53 | 0.80 | 0.42 | 0.61 | 668.58 | 0.08 ± 0.0006 |
| DeepPavlov fine-tuned | 0.52 | 0.86 | 0.51 | 0.53 | 246.68 | 0.12 ± 0.0007 |
| Geotrend zero-shot | 0.62 | 0.85 | 0.54 | 0.64 | 237.46 | 0.13 ± 0.0009 |
| Geotrend fine-tuned | 0.66 | 0.86 | 0.54 | 0.64 | 209.95 | 0.14 ± 0.0009 |

Data

The data folder contains all train datasets used in the experiments, the test data, and a naive example of a style transfer result.


Citation

If you find this repository helpful, feel free to cite our publication:

@article{DBLP:journals/corr/abs-2105-09052,
  author    = {Daryna Dementieva and
               Daniil Moskovskiy and
               Varvara Logacheva and
               David Dale and
               Olga Kozlova and
               Nikita Semenov and
               Alexander Panchenko},
  title     = {Methods for Detoxification of Texts for the Russian Language},
  journal   = {CoRR},
  volume    = {abs/2105.09052},
  year      = {2021},
  url       = {https://arxiv.org/abs/2105.09052},
  archivePrefix = {arXiv},
  eprint    = {2105.09052},
  timestamp = {Mon, 31 May 2021 16:16:57 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2105-09052.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Contacts

For any questions, please contact Daryna Dementieva via email or Telegram.