This project applies Longformer's attention mechanism to alireza7/ARMAN-MSR-persian-base in order to perform abstractive summarization on long Persian documents. The resulting model can accept inputs of up to 8K tokens (rather than 512 tokens).

The fine-tuned model is available on the Hugging Face Hub:
```python
from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
summarizer = pipeline(
    "summarization",
    model="zedfum/arman-longformer-8k-finetuned-ensani",
    tokenizer="zedfum/arman-longformer-8k-finetuned-ensani",
    device=0,  # GPU 0; set device=-1 to run on CPU
)

text_to_summarize = ""  # your Persian document (up to 8K tokens)
print(summarizer(text_to_summarize, min_length=5, max_length=512, truncation=True))
```