@@ -581,52 +581,86 @@ By default, SWIRL loads **English stopwords**. To change this:

## Redact or Remove Personally Identifiable Information (PII)

- SWIRL supports **PII removal and redaction** using [Microsoft Presidio](https://microsoft.github.io/presidio/).
+ SWIRL supports redaction or removal of PII from queries and results, via [Microsoft Presidio](https://microsoft.github.io/presidio/).

- **`RemovePIIQueryProcessor` (Redacts Queries)**
+ **RedactPIIQueryProcessor**

- Removes PII **before querying**.
+ This processor redacts PII entities in queries. For example: `Captain James T. Kirk` → `Captain [PERSON]`

- *Enable for a Specific SearchProvider:*
+ To enable for a specific SearchProvider, add it before the `Adaptive` or `NoMod` Query Processor.

```json
"query_processors": [
-     "AdaptiveQueryProcessor",
-     "RemovePIIQueryProcessor"
+     "RedactPIIQueryProcessor",
+     "AdaptiveQueryProcessor"
]
```
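
As a hedged illustration of the ordering rule above, the following Python sketch places a PII processor ahead of the first `Adaptive` or `NoMod` entry in a `query_processors` list. The helper name `insert_pii_processor` is hypothetical and not part of SWIRL; it only manipulates a plain list shaped like the one in a SearchProvider configuration.

```python
from typing import List

# Prefixes taken from the text above: the "Adaptive" and "NoMod" query processors.
ANCHOR_PREFIXES = ("Adaptive", "NoMod")


def insert_pii_processor(processors: List[str], pii_processor: str) -> List[str]:
    """Return a copy of `processors` with `pii_processor` placed before the
    first Adaptive/NoMod entry (hypothetical helper, not a SWIRL API)."""
    # Drop any existing copy so the processor is not listed twice.
    result = [p for p in processors if p != pii_processor]
    for index, name in enumerate(result):
        if name.startswith(ANCHOR_PREFIXES):
            result.insert(index, pii_processor)
            return result
    # No Adaptive/NoMod entry found: append at the end.
    result.append(pii_processor)
    return result


query_processors = insert_pii_processor(
    ["AdaptiveQueryProcessor"], "RedactPIIQueryProcessor"
)
print(query_processors)
```

Passing the old ordering (`AdaptiveQueryProcessor` first, PII processor second) to the same helper yields the new ordering shown in the diff.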

- *Enable for ALL SearchProviders:*
+ {.warning}
+ If the API receiving the redacted PII can't handle brackets `[]`, use the `AdaptiveQueryProcessor` *after* PII redaction to remove them.

- Modify `swirl/models.py`:
+ **RemovePIIQueryProcessor**
+
+ This processor removes detected PII entities from queries entirely.
+
+ To enable for a specific SearchProvider, add it before the `Adaptive` or `NoMod` Query Processor.
+
+ ```json
+ "query_processors": [
+     "RemovePIIQueryProcessor",
+     "AdaptiveQueryProcessor"
+ ]
+ ```
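
To make the difference between redaction and removal concrete, here is a minimal Python sketch. It does not call Presidio or SWIRL: the entity span is supplied by hand, and the names `Entity`, `redact`, and `remove` are hypothetical. In SWIRL the detection step is delegated to Presidio, so treat this only as an illustration of the two outcomes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    """A detected PII span (hypothetical stand-in for a detector's output)."""
    start: int
    end: int
    label: str


def redact(text: str, entities: List[Entity]) -> str:
    """Replace each span with a bracketed label, e.g. [PERSON]."""
    # Work right to left so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e.start, reverse=True):
        text = text[:ent.start] + f"[{ent.label}]" + text[ent.end:]
    return text


def remove(text: str, entities: List[Entity]) -> str:
    """Delete each span, then collapse leftover whitespace."""
    for ent in sorted(entities, key=lambda e: e.start, reverse=True):
        text = text[:ent.start] + text[ent.end:]
    return " ".join(text.split())


query = "Captain James T. Kirk"
spans = [Entity(start=8, end=21, label="PERSON")]  # hand-labelled, not detected
redacted = redact(query, spans)
removed = remove(query, spans)
print(redacted)
print(removed)
```

The redacted form keeps a placeholder that tells the receiving API an entity was present, while the removed form leaves no trace of it.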
+
+ To add either of these to the pre-query processing pipeline, so it runs before any SearchProvider query processing:
+
+ 1. Add it to the `search.prequery_processing` list. This is only supported via the SWIRL API.
+
+ 2. Modify `swirl/models.py`:

```python
def getSearchPreQueryProcessorsDefault():
    return ["RemovePIIQueryProcessor"]
```

- More details: [ResultProcessors](./Developer-Reference#result-processors)
+ Then restart SWIRL. [Contact support](#support) for assistance.

- **`RemovePIIResultProcessor` (Redacts Results)**
+ For more information: [ResultProcessors](./Developer-Reference#result-processors)

- Redacts PII **in results** (e.g., `"James T. Kirk"` → `"<PERSON>"`).
+ **RedactPIIResultProcessor**

- *Enable for a Specific SearchProvider:*
+ Redacts PII in results. In a document, for example: `These are the logs of Captain James T. Kirk.` → `These are the logs of Captain [PERSON].`

```json
"result_processors": [
    "MappingResultProcessor",
    "DateFinderResultProcessor",
    "CosineRelevancyResultProcessor",
-     "RemovePIIResultProcessor"
+     "RedactPIIResultProcessor"
]
```

More details: [ResultProcessors](./Developer-Reference#post-result-processors)

- **`RemovePIIPostResultProcessor`**
+ {.note}
+ There is no `RemovePIIResultProcessor` at this time, as removing PII from results may impair use of AI.
+
+ **RedactPIIPostResultProcessor**
+
+ This processor applies PII redaction to the unified results from all responding sources.
+
+ To add it to the post-result processing pipeline, so it runs after the results from all SearchProviders are unified:
+
+ 1. Add it to the search's post-result processor list. This is only supported via the SWIRL API.
+
+ 2. Modify `swirl/models.py`:
+
+ ```python
+ def getSearchPostResultProcessorsDefault():
+     return ["CosineRelevancyPostResultProcessor", "RedactPIIPostResultProcessor"]
+ ```

- This processor applies **PII redaction after all results are processed**.
+ This configuration re-ranks using entities, but then redacts them in the results displayed to the user. This leaves the entities in the explain vector, which is available via the API. To prevent this, [disable the explain vector by setting `SWIRL_EXPLAIN` to `False`](TBD).
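
The ordering matters because each processor sees the output of the one before it. The toy Python sketch below assumes a pipeline applies processors in list order. The functions `toy_relevancy`, `toy_redact`, and `run_pipeline` are hypothetical stand-ins, not SWIRL's implementations, and the scoring is a simple term count rather than cosine relevancy.

```python
from typing import Callable, Dict, List

Result = Dict[str, object]


def toy_relevancy(results: List[Result], query: str) -> List[Result]:
    """Score by counting query terms in the body, then sort (toy stand-in)."""
    terms = query.lower().split()
    for r in results:
        body = str(r["body"]).lower()
        r["score"] = sum(body.count(t) for t in terms)
    return sorted(results, key=lambda r: r["score"], reverse=True)


def toy_redact(results: List[Result], query: str) -> List[Result]:
    """Replace a hard-coded name with [PERSON] (toy stand-in for detection)."""
    for r in results:
        r["body"] = str(r["body"]).replace("James T. Kirk", "[PERSON]")
    return results


def run_pipeline(results: List[Result], query: str,
                 processors: List[Callable]) -> List[Result]:
    """Apply each processor in list order, feeding its output to the next."""
    for processor in processors:
        results = processor(results, query)
    return results


docs = [
    {"body": "Starfleet roster for the current quarter."},
    {"body": "These are the logs of Captain James T. Kirk."},
]
ranked = run_pipeline(docs, "kirk", [toy_relevancy, toy_redact])
print(ranked[0]["body"], ranked[0]["score"])
```

In this sketch the score is computed before redaction, so it still reflects the original name even though the displayed body does not. That loosely mirrors the point above about entities remaining in the explain vector.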

## Understand the Explain Structure
