Overlay render background (#22)
* Add support for adding background to rendered boxes in text-render-helper.cpp

* Update localization strings and add support for appending to output file

* Refactor OCR filter properties callback in ocr-filter.cpp and include necessary headers in text-render-helper.cpp

* Update localization strings and add support for appending to output file
royshil committed Apr 12, 2024
1 parent 16c63d6 commit 395d7d2
Showing 18 changed files with 418 additions and 32 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -48,12 +48,14 @@ OCR Plugin enables many use cases for enhancing your stream or recording:
Available now:
- Add OCR Filter to any source with image or video output
- Choose from Scoreboard model or English, French, Spanish, German, Chinese, Japanese, Arabic, Turkish, Portugese, Hindi, Russian and Italian
- Output OCR result to an OBS Text Source
- Choose the segmentation mode: Word, Line, Page, etc.
- "Semantic Smoothing": getting more consistent outputs with higher accuracy and confidence by "averaging" several text outputs
- Timing/Running modes: per X-milliseconds
- Output OCR result to an OBS Text Source
- Output to a text file (with/out aggregation)
- Output formatting (with inja): e.g. "Score: {{score}}"
- Output text detection to image source
- Output text detection to image source (draws boxes, text, etc.)
- Output to settings (e.g. for other plugins to use as triggers)
- Binarization methods (threshold, Otsu, Triangle, adaptive)
- Image Dilation
- Rescale (optimal Tesseract performance is at 35 pixels / character)
@@ -62,7 +64,7 @@ Coming soon:
- More languages built-in (pretrained Tesseract models)
- Allowing external model files
- More output capabilities e.g. Parsing, websocket event, etc.
- Extracting text from complex image layouts
- Detection area selection (to prevent using Crop/Pad Filter)
- Different timing/run modes: per X-frames, image change, etc.
- Image stabilization
- Optical flow tracking for fast moving text
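The "Semantic Smoothing" item in the README list describes "averaging" several text outputs. One simple way to picture that is a majority vote over the last few OCR results, sketched below. This is an assumption for illustration only: the class name `MajoritySmoother` and the voting rule are hypothetical, and the plugin's actual smoothing is not shown in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical helper, not the plugin's implementation: keeps the last
// `window_size` OCR results and reports the most frequent one.
class MajoritySmoother {
public:
    explicit MajoritySmoother(std::size_t window_size)
        : window_size_(window_size)
    {
        assert(window_size > 0);
    }

    // Record a new OCR result and return the current majority text.
    std::string push(const std::string &text)
    {
        window_.push_back(text);
        if (window_.size() > window_size_)
            window_.pop_front();

        // Count how often each distinct text occurs in the window.
        std::map<std::string, std::size_t> counts;
        for (const std::string &t : window_)
            ++counts[t];

        // Start from the newest result so that ties favor fresh output.
        std::string best = text;
        std::size_t best_count = counts[text];
        for (const auto &entry : counts) {
            if (entry.second > best_count) {
                best = entry.first;
                best_count = entry.second;
            }
        }
        return best;
    }

private:
    std::size_t window_size_;
    std::deque<std::string> window_;
};
```

With a window of three, a single misread frame (for example "17" between two readings of "12") does not change the reported text, which is the kind of stability the README item describes.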
2 changes: 1 addition & 1 deletion buildspec.json
@@ -38,7 +38,7 @@
},
"name": "obs-ocr",
"displayName": "OBS OCR Plugin",
-"version": "0.0.6",
+"version": "0.0.7",
"author": "Roy Shilkrot",
"website": "https://github.com/occ-ai/obs-ocr",
"email": "[email protected]",
36 changes: 36 additions & 0 deletions data/locale/ar-SA.ini
@@ -0,0 +1,36 @@
OCRFilter="التصفية الضوئية للحروف"
PageSegmentationMode="نمط تجزئة الصفحة"
Language="اللغة"
EnableSmoothing="تفعيل التنعيم"
ConfThreshold="عتبة الثقة"
WordLength="طول الكلمة"
WindowSize="حجم النافذة"
CharWhitelist="قائمة الحروف المسموح بها"
UserPatterns="أنماط المستخدم"
OutputTextSource="مصدر النص الناتج"
NoOutput="لا يوجد ناتج"
UpdateTimer="تحديث المؤقت (مللي ثانية)"
AdvancedSettings="الإعدادات المتقدمة"
UpdateOnChange="التحديث فقط عند تغيير الصورة"
UpdateOnChangeThreshold="عتبة التغيير %"
OutputFormatting="تنسيق الناتج"
OutputTextDetectionMaskSource="مصدر قناع النص المكتشف"
SaveToFile="حفظ في ملف"
OutputFilePath="مسار ملف الناتج"
BinarizationMode="نمط التثنية"
BinarizationThreshold="عتبة التثنية"
BinarizationBlockSize="حجم الكتلة في التثنية"
PreviewBinarization="معاينة التثنية"
RescaleImage="تغيير حجم الصورة"
RescaleTargetSize="حجم الهدف لتغيير الحجم"
DilationIterations="تكرارات التوسيع"
ImageOutputOption="خيار ناتج الصورة"
DetectionBoxesMask="قناع صناديق الكشف"
TextRendering="عرض النص"
TextWithBackground="نص مع خلفية"
SelectPresets="اختيار الإعدادات المسبقة"
CharWhitelistPreset="إعداد قائمة الحروف المسموح بها مسبقًا"
NumericPunctuation="أرقام / علامات ترقيم"
OutputFlatten="تسطيح الناتج إلى سطر واحد"
OutputFileAppend="إضافة إلى الملف؟"
current_output="الناتج الحالي"
36 changes: 36 additions & 0 deletions data/locale/de-DE.ini
@@ -0,0 +1,36 @@
OCRFilter="OCR-Filter"
PageSegmentationMode="Seiten Segmentierungsmodus"
Language="Sprache"
EnableSmoothing="Glättung aktivieren"
ConfThreshold="Vertrauensschwelle"
WordLength="Wortlänge"
WindowSize="Fenstergröße"
CharWhitelist="Zeichen-Whitelist"
UserPatterns="Benutzermuster"
OutputTextSource="Quelle des Ausgabetextes"
NoOutput="Keine Ausgabe"
UpdateTimer="Aktualisierungs-Timer (ms)"
AdvancedSettings="Erweiterte Einstellungen"
UpdateOnChange="Nur bei Bildänderung aktualisieren"
UpdateOnChangeThreshold="Änderungsschwelle %"
OutputFormatting="Ausgabeformatierung"
OutputTextDetectionMaskSource="Quelle der Ausgabemaske"
SaveToFile="In Datei speichern"
OutputFilePath="Ausgabedateipfad"
BinarizationMode="Binarisierungsmodus"
BinarizationThreshold="Binarisierungsschwelle"
BinarizationBlockSize="Binarisierungsblockgröße"
PreviewBinarization="Binarisierungsvorschau"
RescaleImage="Bild skalieren"
RescaleTargetSize="Zielgröße skalieren"
DilationIterations="Dilatationsiterationen"
ImageOutputOption="Bildausgabeoption"
DetectionBoxesMask="Erkennungsboxen Maske"
TextRendering="Textüberlagerung"
TextWithBackground="Text mit Hintergrund"
SelectPresets="Voreinstellung auswählen"
CharWhitelistPreset="Zeichen-Whitelist-Voreinstellung"
NumericPunctuation="Numerisch / Interpunktion"
OutputFlatten="Ausgabe zu einer einzigen Zeile glätten"
OutputFileAppend="An Datei anhängen?"
current_output="Aktuelle Ausgabe"
6 changes: 6 additions & 0 deletions data/locale/en-US.ini
@@ -28,3 +28,9 @@ ImageOutputOption="Image Output Option"
DetectionBoxesMask="Detection Boxes Mask"
TextRendering="Text Overlay"
TextWithBackground="Text with Background"
+SelectPresets="Select Preset"
+CharWhitelistPreset="Character Whitelist Preset"
+NumericPunctuation="Numeric / Punctuation"
+OutputFlatten="Flatten Output to Single Line"
+OutputFileAppend="Append to File?"
+current_output="Current Output"
36 changes: 36 additions & 0 deletions data/locale/es-ES.ini
@@ -0,0 +1,36 @@
OCRFilter="OCR"
PageSegmentationMode="Modo de Segmentación de Página"
Language="Idioma"
EnableSmoothing="Habilitar Suavizado"
ConfThreshold="Umbral de Confianza"
WordLength="Longitud de Palabra"
WindowSize="Tamaño de Ventana"
CharWhitelist="Lista de Caracteres Permitidos"
UserPatterns="Patrones de Usuario"
OutputTextSource="Fuente de Texto de Salida"
NoOutput="Sin Salida"
UpdateTimer="Temporizador de Actualización (ms)"
AdvancedSettings="Configuración Avanzada"
UpdateOnChange="Actualizar Solo al Cambiar la Imagen"
UpdateOnChangeThreshold="Umbral de Cambio %"
OutputFormatting="Formato de Salida"
OutputTextDetectionMaskSource="Fuente de Máscara de Detección de Texto"
SaveToFile="Guardar en Archivo"
OutputFilePath="Ruta de Archivo de Salida"
BinarizationMode="Modo de Binario"
BinarizationThreshold="Umbral de Binario"
BinarizationBlockSize="Tamaño de Bloque de Binario"
PreviewBinarization="Vista Previa de Binario"
RescaleImage="Reescalar Imagen"
RescaleTargetSize="Tamaño de Destino de Reescalado"
DilationIterations="Iteraciones de Dilatación"
ImageOutputOption="Opción de Salida de Imagen"
DetectionBoxesMask="Máscara de Cuadros de Detección"
TextRendering="Superposición de Texto"
TextWithBackground="Texto con Fondo"
SelectPresets="Seleccionar Preajustes"
CharWhitelistPreset="Preajuste de Lista de Caracteres Permitidos"
NumericPunctuation="Numérico / Puntuación"
OutputFlatten="Aplanar Salida a una Línea Única"
OutputFileAppend="¿Agregar al Archivo?"
current_output="Salida Actual"
36 changes: 36 additions & 0 deletions data/locale/fr-FR.ini
@@ -0,0 +1,36 @@
OCRFilter="OCR"
PageSegmentationMode="Mode de segmentation de page"
Language="Langue"
EnableSmoothing="Activer le lissage"
ConfThreshold="Seuil de confiance"
WordLength="Longueur du mot"
WindowSize="Taille de la fenêtre"
CharWhitelist="Liste blanche de caractères"
UserPatterns="Motifs utilisateur"
OutputTextSource="Source de texte de sortie"
NoOutput="Pas de sortie"
UpdateTimer="Minuterie de mise à jour (ms)"
AdvancedSettings="Paramètres avancés"
UpdateOnChange="Mettre à jour uniquement en cas de changement d'image"
UpdateOnChangeThreshold="Seuil de changement %"
OutputFormatting="Formatage de sortie"
OutputTextDetectionMaskSource="Source de masque de détection de texte"
SaveToFile="Enregistrer dans un fichier"
OutputFilePath="Chemin du fichier de sortie"
BinarizationMode="Mode de binarisation"
BinarizationThreshold="Seuil de binarisation"
BinarizationBlockSize="Taille de bloc de binarisation"
PreviewBinarization="Aperçu de la binarisation"
RescaleImage="Redimensionner l'image"
RescaleTargetSize="Taille cible de redimensionnement"
DilationIterations="Itérations de dilatation"
ImageOutputOption="Option de sortie d'image"
DetectionBoxesMask="Masque de boîtes de détection"
TextRendering="Superposition de texte"
TextWithBackground="Texte avec arrière-plan"
SelectPresets="Sélectionner un préréglage"
CharWhitelistPreset="Préréglage de liste blanche de caractères"
NumericPunctuation="Numérique / Ponctuation"
OutputFlatten="Aplatir la sortie en une seule ligne"
OutputFileAppend="Ajouter au fichier ?"
current_output="Sortie actuelle"
36 changes: 36 additions & 0 deletions data/locale/ja-JP.ini
@@ -0,0 +1,36 @@
OCRFilter="OCR"
PageSegmentationMode="ページセグメンテーションモード"
Language="言語"
EnableSmoothing="スムージングを有効にする"
ConfThreshold="信頼度の閾値"
WordLength="単語の長さ"
WindowSize="ウィンドウサイズ"
CharWhitelist="文字のホワイトリスト"
UserPatterns="ユーザーパターン"
OutputTextSource="出力テキストのソース"
NoOutput="出力なし"
UpdateTimer="更新タイマー(ミリ秒)"
AdvancedSettings="高度な設定"
UpdateOnChange="画像の変更時のみ更新"
UpdateOnChangeThreshold="変更の閾値 %"
OutputFormatting="出力の書式設定"
OutputTextDetectionMaskSource="マスクソースの出力テキスト検出"
SaveToFile="ファイルに保存"
OutputFilePath="出力ファイルパス"
BinarizationMode="バイナリ化モード"
BinarizationThreshold="バイナリ化の閾値"
BinarizationBlockSize="バイナリ化のブロックサイズ"
PreviewBinarization="バイナリ化のプレビュー"
RescaleImage="画像のリスケール"
RescaleTargetSize="リスケールのターゲットサイズ"
DilationIterations="膨張の反復回数"
ImageOutputOption="画像の出力オプション"
DetectionBoxesMask="検出ボックスのマスク"
TextRendering="テキストのオーバーレイ"
TextWithBackground="背景付きテキスト"
SelectPresets="プリセットの選択"
CharWhitelistPreset="文字のホワイトリストプリセット"
NumericPunctuation="数字/句読点"
OutputFlatten="出力を単一行にフラット化"
OutputFileAppend="ファイルに追記する?"
current_output="現在の出力"
36 changes: 36 additions & 0 deletions data/locale/pt-BR.ini
@@ -0,0 +1,36 @@
OCRFilter="Filtro OCR"
PageSegmentationMode="Modo de Segmentação de Página"
Language="Idioma"
EnableSmoothing="Ativar Suavização"
ConfThreshold="Limiar de Confiança"
WordLength="Comprimento da Palavra"
WindowSize="Tamanho da Janela"
CharWhitelist="Lista de Caracteres Permitidos"
UserPatterns="Padrões do Usuário"
OutputTextSource="Fonte do Texto de Saída"
NoOutput="Sem Saída"
UpdateTimer="Tempo de Atualização (ms)"
AdvancedSettings="Configurações Avançadas"
UpdateOnChange="Atualizar Somente na Mudança da Imagem"
UpdateOnChangeThreshold="Limiar de Mudança %"
OutputFormatting="Formatação de Saída"
OutputTextDetectionMaskSource="Fonte da Máscara de Detecção de Texto"
SaveToFile="Salvar em Arquivo"
OutputFilePath="Caminho do Arquivo de Saída"
BinarizationMode="Modo de Binzarização"
BinarizationThreshold="Limiar de Binzarização"
BinarizationBlockSize="Tamanho do Bloco de Binzarização"
PreviewBinarization="Visualizar Binzarização"
RescaleImage="Redimensionar Imagem"
RescaleTargetSize="Tamanho de Destino para Redimensionamento"
DilationIterations="Iterações de Dilatação"
ImageOutputOption="Opção de Saída de Imagem"
DetectionBoxesMask="Máscara de Caixas de Detecção"
TextRendering="Sobreposição de Texto"
TextWithBackground="Texto com Fundo"
SelectPresets="Selecionar Preset"
CharWhitelistPreset="Preset de Lista de Caracteres Permitidos"
NumericPunctuation="Numérico / Pontuação"
OutputFlatten="Aplanar Saída para uma Única Linha"
OutputFileAppend="Anexar ao Arquivo?"
current_output="Saída Atual"
36 changes: 36 additions & 0 deletions data/locale/ru-RU.ini
@@ -0,0 +1,36 @@
OCRFilter="Фильтр OCR"
PageSegmentationMode="Режим сегментации страницы"
Language="Язык"
EnableSmoothing="Включить сглаживание"
ConfThreshold="Порог уверенности"
WordLength="Длина слова"
WindowSize="Размер окна"
CharWhitelist="Белый список символов"
UserPatterns="Пользовательские шаблоны"
OutputTextSource="Источник выходного текста"
NoOutput="Нет выхода"
UpdateTimer="Таймер обновления (мс)"
AdvancedSettings="Расширенные настройки"
UpdateOnChange="Обновление только при изменении изображения"
UpdateOnChangeThreshold="Порог изменения %"
OutputFormatting="Форматирование вывода"
OutputTextDetectionMaskSource="Источник маски обнаружения текста"
SaveToFile="Сохранить в файл"
OutputFilePath="Путь к выходному файлу"
BinarizationMode="Режим бинаризации"
BinarizationThreshold="Порог бинаризации"
BinarizationBlockSize="Размер блока бинаризации"
PreviewBinarization="Предварительный просмотр бинаризации"
RescaleImage="Масштабирование изображения"
RescaleTargetSize="Целевой размер масштабирования"
DilationIterations="Итерации расширения"
ImageOutputOption="Опция вывода изображения"
DetectionBoxesMask="Маска рамок обнаружения"
TextRendering="Отображение текста"
TextWithBackground="Текст на фоне"
SelectPresets="Выбор предустановок"
CharWhitelistPreset="Предустановленный белый список символов"
NumericPunctuation="Числа / Пунктуация"
OutputFlatten="Вывод в одну строку"
OutputFileAppend="Добавить к файлу?"
current_output="Текущий вывод"
36 changes: 36 additions & 0 deletions data/locale/zh-CN.ini
@@ -0,0 +1,36 @@
OCRFilter="OCR"
PageSegmentationMode="页面分割模式"
Language="语言"
EnableSmoothing="启用平滑"
ConfThreshold="置信度阈值"
WordLength="词长度"
WindowSize="窗口大小"
CharWhitelist="字符白名单"
UserPatterns="用户模式"
OutputTextSource="输出文本来源"
NoOutput="无输出"
UpdateTimer="更新定时器(毫秒)"
AdvancedSettings="高级设置"
UpdateOnChange="仅在图像更改时更新"
UpdateOnChangeThreshold="更改阈值 %"
OutputFormatting="输出格式"
OutputTextDetectionMaskSource="输出掩码来源"
SaveToFile="保存到文件"
OutputFilePath="输出文件路径"
BinarizationMode="二值化模式"
BinarizationThreshold="二值化阈值"
BinarizationBlockSize="二值化块大小"
PreviewBinarization="预览二值化"
RescaleImage="重新缩放图像"
RescaleTargetSize="重新缩放目标大小"
DilationIterations="膨胀迭代次数"
ImageOutputOption="图像输出选项"
DetectionBoxesMask="检测框掩码"
TextRendering="文本叠加"
TextWithBackground="带背景的文本"
SelectPresets="选择预设"
CharWhitelistPreset="字符白名单预设"
NumericPunctuation="数字/标点符号"
OutputFlatten="将输出平铺为单行"
OutputFileAppend="追加到文件?"
current_output="当前输出"
12 changes: 11 additions & 1 deletion src/consts.h
@@ -23,9 +23,19 @@ const char *const WHITELIST_CHARS_ITALIAN =
// add portuguese characters with accents
const char *const WHITELIST_CHARS_PORTUGUESE =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*()_+-=[]{}|;':\",./<>?`~\\áàãâéêíóôõúüç ";
-// add russian characters
+// add russian characters for cyrillic, both upper and lower case
const char *const WHITELIST_CHARS_RUSSIAN =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*()_+-=[]{}|;':\",./<>?`~\\абвгдеёжзийклмнопрстуфхцчшщъыьэюя ";
+const char *const WHITELIST_CHARS_JAPANESE =
+"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*()_+-=[]{}|;':\",./<>?`~\\あいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほまみむめもやゆよらりるれろわをん ";
+// hindi characters
+const char *const WHITELIST_CHARS_HINDI =
+"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*()_+-=[]{}|;':\",./<>?`~\\अआइईउऊऋएऐओऔकखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसह ";
+// arabic characters
+const char *const WHITELIST_CHARS_ARABIC =
+"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*()_+-=[]{}|;':\",./<>?`~\\ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ي ";
+// numeric characters with punctuation for time, date, currency, etc.
+const char *const WHITELIST_CHARS_NUMERIC = "0123456789!@#$%^&*()_+-=[]{}|;':\",./<>?`~\\ ";

const int OUTPUT_IMAGE_OPTION_DETECTION_MASK = 0;
const int OUTPUT_IMAGE_OPTION_TEXT_OVERLAY = 1;
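To make the effect of the new `WHITELIST_CHARS_NUMERIC` preset concrete, the sketch below filters a string down to the characters the preset allows. It is only an illustration: `keep_whitelisted` is a hypothetical helper, it works byte by byte and so is only meaningful for ASCII whitelists such as the numeric one, and it post-filters text rather than constraining recognition. Tesseract itself accepts such a string through its `tessedit_char_whitelist` variable; how this plugin passes the preset on is not visible in this hunk.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Copied from the consts.h hunk above.
const char *const WHITELIST_CHARS_NUMERIC = "0123456789!@#$%^&*()_+-=[]{}|;':\",./<>?`~\\ ";

// Hypothetical helper: returns `text` with every byte that is not in
// `whitelist` removed. Byte-wise, so only suitable for ASCII whitelists.
std::string keep_whitelisted(const std::string &text, const char *whitelist)
{
    assert(whitelist != nullptr);
    std::string out;
    for (char c : text) {
        // strchr also matches the terminating '\0', so exclude it explicitly.
        if (c != '\0' && std::strchr(whitelist, c) != nullptr)
            out.push_back(c);
    }
    return out;
}
```

For example, `keep_whitelisted("Score: 12-7", WHITELIST_CHARS_NUMERIC)` drops the letters and keeps the colon, space, digits and hyphen, since all of those appear in the preset.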
2 changes: 2 additions & 0 deletions src/filter-data.h
@@ -54,6 +54,8 @@ struct filter_data {
bool update_on_change;
int update_on_change_threshold;
int output_image_option;
+bool output_file_append;
+bool output_flatten;

bool isDisabled;
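The two new fields correspond to the "Append to File?" and "Flatten Output to Single Line" options in the locale files. The code that consumes them is not part of the visible diff, so the following is only a sketch of how such flags might be applied when writing a result. `output_options`, `flatten_to_single_line` and `write_output` are hypothetical names, not functions from the plugin.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical stand-in for the two new filter_data fields.
struct output_options {
    bool output_file_append = false;
    bool output_flatten = false;
};

// Replace every line-break character with a space.
std::string flatten_to_single_line(std::string text)
{
    for (char &c : text) {
        if (c == '\n' || c == '\r')
            c = ' ';
    }
    return text;
}

// Write one OCR result followed by a newline.
// Returns false if the file could not be opened or written.
bool write_output(const std::string &path, const std::string &text,
                  const output_options &opts)
{
    assert(!path.empty());
    // Append keeps earlier results; otherwise the file is overwritten.
    const std::ios_base::openmode mode =
        std::ios_base::out |
        (opts.output_file_append ? std::ios_base::app : std::ios_base::trunc);
    std::ofstream file(path, mode);
    if (!file.is_open())
        return false;
    file << (opts.output_flatten ? flatten_to_single_line(text) : text) << '\n';
    return file.good();
}
```

With both flags set, each call adds exactly one line to the file, which keeps an appended log easy to parse line by line.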
