Added code for lexical slur validator #1

rkritika1508 · 2025-11-25T07:57:34Z

Summary

This PR adds functionality to detect lexical slurs by creating custom validators using guardrails-ai.

Added new code for lexical slur detection, using a configurable list of slur terms (a slur-list CSV was added) to filter or flag offensive content.
Added or updated the guardrails-ai dependencies and added unit tests (UTs) so that the new validators are tested.
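
For reviewers unfamiliar with list-based matching, here is a minimal sketch of the general idea. It is not the code in this PR: the CSV columns (term, language, severity), the function names, and the placeholder terms are all assumptions made for illustration, and it uses only the standard library rather than guardrails-ai.

```python
import csv
import io
import re

# Hypothetical CSV layout; the real slur-list CSV in this PR may use other columns.
SAMPLE_CSV = """term,language,severity
badword,en,high
meanword,en,low
buraword,hi,high
"""


def load_terms(csv_text, languages, severity="all"):
    """Return the set of lowercased terms matching the language/severity filters."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["term"].strip().lower()
        for row in reader
        if row["language"] in languages
        and (severity == "all" or row["severity"] == severity)
    }


def redact(text, terms, mask="[REDACTED]"):
    """Replace whole-word, case-insensitive matches of any term with a mask."""
    if not terms:
        return text, []
    # Longest terms first so longer entries win over their prefixes.
    alternatives = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)
    found = [m.group(0).lower() for m in pattern.finditer(text)]
    return pattern.sub(mask, text), found


terms = load_terms(SAMPLE_CSV, languages=["en", "hi"], severity="all")
clean, found = redact("That BadWord is a buraword, not badwordish.", terms)
print(clean)
print(found)
```

The lookarounds keep "badwordish" from being flagged as a match for "badword". Word-boundary handling for terms written in Devanagari script may need more care than this sketch gives it.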

How to Test / Review

Run the newly added validators on representative test data to verify removal/flagging works
Run existing experiment pipelines to ensure they still function correctly after refactoring/project structure changes
Review the slur-list CSV to confirm the list is accurate and suitable
Review the unit tests to ensure coverage and correctness

For testing, add something like the following to the Guardrails.py file:

import argparse

# Assumes the Guardrails class is defined earlier in this file.


def setup():
    """Parse the command-line arguments and return the --input value."""
    parser = argparse.ArgumentParser(description="Run the AI safety guardrails pipeline.")
    parser.add_argument(
        "--input",
        type=str,
        required=True,
        help="Path to the input file or some string input",
    )
    args = parser.parse_args()
    return args.input


if __name__ == "__main__":
    user_input = setup()
    guardrails = Guardrails(user_input)
    guardrails.make()

    # Run input validators; replace "your_text" with the text to check
    safe_input = guardrails.run_input_validators("your_text")

    # Run output validators
    # guardrails.run_output_validators("your_text")

Sample input

{
    "guardrails": {
        "input": [
            {
                "type": "pii_remover"
            },
            {
                "type": "lexical_slur",
                "languages": [
                    "en",
                    "hi"
                ],
                "severity": "all"
            }
        ],
        "output": []
    }
}
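
As a rough illustration of how a config like the one above could be sanity-checked before the pipeline runs, here is a small sketch. The KNOWN_VALIDATORS registry, the check_config helper, and the assumption that lexical_slur requires both languages and severity are hypothetical and not taken from this PR.

```python
import json

CONFIG_JSON = """
{
    "guardrails": {
        "input": [
            {"type": "pii_remover"},
            {"type": "lexical_slur", "languages": ["en", "hi"], "severity": "all"}
        ],
        "output": []
    }
}
"""

# Hypothetical registry: validator type -> option names assumed to be required.
KNOWN_VALIDATORS = {
    "pii_remover": set(),
    "lexical_slur": {"languages", "severity"},
}


def check_config(config):
    """Return a list of human-readable problems found in a guardrails config."""
    problems = []
    guardrails = config.get("guardrails", {})
    for stage in ("input", "output"):
        for index, entry in enumerate(guardrails.get(stage, [])):
            kind = entry.get("type")
            if kind not in KNOWN_VALIDATORS:
                problems.append(f"{stage}[{index}]: unknown type {kind!r}")
                continue
            missing = KNOWN_VALIDATORS[kind] - set(entry)
            for name in sorted(missing):
                problems.append(f"{stage}[{index}]: {kind} is missing {name!r}")
    return problems


config = json.loads(CONFIG_JSON)
print(check_config(config))
```

An empty list means every entry has a known type and the options this sketch assumes are required.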

About hub_loader.py
Guardrails AI's recommended way to install validators from its hub is through its CLI, using the install command,
e.g. guardrails hub install hub://guardrails/regex_match.
In the initial stages we are keeping a tight check on the validators we add, so we could simply put the install commands for the validators we support in our Dockerfile or a startup script.
Instead, we've created hub_loader.py as a possible way to control the install process from within the Python process itself.
If it feels over-engineered at the moment, happy to keep it out for now.
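
To make the idea concrete, here is a rough sketch of driving the install from Python. It is not the actual hub_loader.py: the allowlist, the function names, and the dry_run flag are assumptions for illustration, and it simply shells out to the guardrails hub install command mentioned above.

```python
import shutil
import subprocess

# Assumed allowlist; the only URI taken from the discussion above is regex_match.
ALLOWED_VALIDATORS = ("hub://guardrails/regex_match",)


def build_install_command(uri):
    """Return the CLI invocation for one hub validator, rejecting unlisted URIs."""
    if uri not in ALLOWED_VALIDATORS:
        raise ValueError(f"validator not in allowlist: {uri}")
    return ["guardrails", "hub", "install", uri]


def install_all(dry_run=True):
    """Install every allowlisted validator; with dry_run, only return the commands."""
    commands = [build_install_command(uri) for uri in ALLOWED_VALIDATORS]
    if not dry_run:
        if shutil.which("guardrails") is None:
            raise RuntimeError("guardrails CLI not found on PATH")
        for command in commands:
            # check=True raises CalledProcessError if an install fails.
            subprocess.run(command, check=True)
    return commands


print(install_all(dry_run=True))
```

Keeping the allowlist in code gives the same tight control as listing the commands in a Dockerfile, while letting the application decide when the installs run.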

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
If you've fixed a bug or added code, it is tested and has test cases.

Notes

Please add here if any other information is required for the reviewer.

dennyabrain · 2025-11-28T06:45:11Z

It's a nitpick, but the files are named inconsistently. Some have capital letters in their names (Curated_Slurlist_Hindi_English.csv), and some don't separate the words (pdfanonymizer.py). Let's be consistent with the rest of the code and make them all snake case? So curated_slurlist_hi_en.csv and pdf_anonymizer.

rkritika1508 · 2025-11-28T06:53:37Z

Makes sense. Btw, should we store the CSV in the safety folder or somewhere else? What do you suggest? Storing it somewhere like blob storage seems like overkill.

dennyabrain · 2025-11-28T07:10:09Z

It's OK to keep it in a CSV here for the 0.1. Makes for an easy demo too. We can discuss remote storage later, when they might need more customization of the slur list per request or client.

rkritika1508 and others added 3 commits November 25, 2025 13:25

Added code for lexical slur and PII remover validators

0af12f5

refactor: move directory

3401b19

Added PII remover and lexical slur detection validators

3efe7e8

dennyabrain reviewed Nov 28, 2025

backend/app/ai-safety/src/utils/util.py (outdated, resolved)

backend/app/safety/guardrails_engine.py (resolved)

dennyabrain marked this pull request as draft November 28, 2025 10:27

rkritika1508 and others added 12 commits November 28, 2025 18:29

Resolved comments

171d3e1

Updated code

80977e2

Removed redundant code

6efef3d

Renamed files acc to python convention

30faf31

fixed UTs

e5ba7d1

chore: code reorganization

017e2a4

Updated code and fixed test cases

0b05a99

updated unit test

34817d5

Removed few files

a36b823

reversed uv.lock

c490ef1

updated uv.lock

5bf8633

cleanup up code

b62f47c

rkritika1508 changed the title from "Added code for lexical slur and PII remover validators" to "Added code for lexical slur validator" on Dec 1, 2025

rkritika1508 and others added 6 commits December 2, 2025 03:49

Cleaned up code

bac48ce

chore: refactor

7caea29

chore: cleanup

58b0fbc

chore: cleanup

f6b2139

chore: cleanup

924b230

Added the banlist validator

2e2454d

dennyabrain reviewed Dec 2, 2025

backend/app/safety/validators/ban_list_safety_validator_config.py (resolved)

dennyabrain marked this pull request as ready for review December 2, 2025 12:01

Renamed file

180aa82

rkritika1508 added 2 commits December 2, 2025 17:41

Merge branch 'main' into feature/ai-safety-tattle

ea079f2

Fixed guardrail config

a3b013b

3 participants
