@rkritika1508

@rkritika1508 rkritika1508 commented Nov 25, 2025

Summary

This PR adds lexical slur detection, implemented as custom validators built on guardrails-ai.

  • Added code for lexical slur detection that uses a configurable list of slur terms (a slur-list CSV is included) to filter or flag offensive content.
  • Added or updated the guardrails-ai dependencies and added unit tests (UTs) covering the new validators.
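
For reviewers who want a feel for the approach before opening the diff, here is a minimal, stdlib-only sketch of list-based slur matching. It is not the validator code from this PR and does not use the guardrails-ai API. The CSV columns (`term`, `language`, `severity`), the placeholder terms, and the function names are all assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical CSV layout with harmless placeholder terms.
# The real slur-list CSV in this PR may use different columns.
SAMPLE_CSV = """term,language,severity
badword,en,high
meanword,en,low
burashabd,hi,high
"""


def load_terms(csv_text, languages, severity="all"):
    """Return the lowercased terms that match the language and severity filters."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["term"].strip().lower()
        for row in reader
        if row["language"] in languages
        and (severity == "all" or row["severity"] == severity)
    }


def build_pattern(terms):
    """Compile one case-insensitive pattern that matches any term as a whole word."""
    if not terms:
        return None
    # Longest terms first so a longer term wins over a shorter one it contains.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def redact(text, pattern, mask="[REDACTED]"):
    """Return the masked text and the list of terms that were found."""
    if pattern is None:
        return text, []
    found = [match.group(0).lower() for match in pattern.finditer(text)]
    return pattern.sub(mask, text), found


terms = load_terms(SAMPLE_CSV, languages={"en", "hi"})
pattern = build_pattern(terms)
clean, found = redact("That BadWord is burashabd, not badwordy.", pattern)
print(clean)  # That [REDACTED] is [REDACTED], not badwordy.
print(found)  # ['badword', 'burashabd']
```

The word-boundary lookarounds keep the match from firing inside longer words (`badwordy` above), which is the usual source of false positives in purely lexical filters.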

How to Test / Review

  • Run the newly added validators on representative test data to verify removal/flagging works
  • Run existing experiment pipelines to ensure they still function correctly after refactoring/project structure changes
  • Review the slur-list CSV to confirm the list is accurate and suitable
  • Review the unit tests to ensure coverage and correctness

For testing, add something like the following to the Guardrails.py file:

import argparse


def setup():
    """Parse the command-line arguments and return the --input value."""
    parser = argparse.ArgumentParser(description="Run the AI safety guardrails pipeline.")
    parser.add_argument(
        "--input",
        type=str,
        required=True,
        help="Path to the input file or some string input",
    )
    args = parser.parse_args()
    return args.input


if __name__ == "__main__":
    # Guardrails is the class defined in Guardrails.py.
    user_input = setup()
    guardrails = Guardrails(user_input)
    guardrails.make()

    # Run input validators ("your_text" is a placeholder for the text to check)
    safe_input = guardrails.run_input_validators("your_text")

    # Run output validators
    # guardrails.run_output_validators("your_text")

Sample input

{
    "guardrails": {
        "input": [
            {
                "type": "pii_remover"
            },
            {
                "type": "lexical_slur",
                "languages": [
                    "en",
                    "hi"
                ],
                "severity": "all"
            }
        ],
        "output": []
    }
}
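
To show how a config of this shape could drive the pipeline, here is a rough sketch that maps each `type` entry to a validator factory and applies the resulting chain in order. It is an illustration only, not the `Guardrails` class from this PR. `REGISTRY`, `build_chain`, `run_chain`, and the two stand-in validators are hypothetical names and placeholder behavior.

```python
import json

CONFIG_JSON = """
{
    "guardrails": {
        "input": [
            {"type": "pii_remover"},
            {"type": "lexical_slur", "languages": ["en", "hi"], "severity": "all"}
        ],
        "output": []
    }
}
"""


# Hypothetical stand-ins for the real validators; each factory returns a
# function that takes text and returns cleaned text.
def make_pii_remover():
    return lambda text: text.replace("user@example.com", "<EMAIL>")


def make_lexical_slur(languages, severity="all"):
    blocked = {"badword"}  # placeholder list, ignores languages/severity

    def validate(text):
        words = text.split()
        return " ".join("[REDACTED]" if w.lower() in blocked else w for w in words)

    return validate


REGISTRY = {
    "pii_remover": make_pii_remover,
    "lexical_slur": make_lexical_slur,
}


def build_chain(config, stage):
    """Instantiate the validators listed under the given stage ("input" or "output")."""
    chain = []
    for spec in config["guardrails"].get(stage, []):
        if spec["type"] not in REGISTRY:
            raise ValueError(f"Unknown validator type: {spec['type']!r}")
        # Every key other than "type" is passed to the factory as an option.
        options = {key: value for key, value in spec.items() if key != "type"}
        chain.append(REGISTRY[spec["type"]](**options))
    return chain


def run_chain(chain, text):
    """Apply each validator in order, feeding the output of one into the next."""
    for validator in chain:
        text = validator(text)
    return text


config = json.loads(CONFIG_JSON)
input_chain = build_chain(config, "input")
result = run_chain(input_chain, "mail user@example.com that badword now")
print(result)  # mail <EMAIL> that [REDACTED] now
```

Failing loudly on an unknown `type` catches config typos at startup rather than silently skipping a guardrail.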

About hub_loader.py
Guardrails AI's recommended way to install validators from its hub is through its CLI, using the install command,
e.g. guardrails hub install hub://guardrails/regex_match.
In the initial stages we are keeping a tight check on the validators we add, so we COULD put the install commands for the validators we support in our Dockerfile or a startup script.
Instead, we've created hub_loader.py as a possible way to control the install process from the Python process itself.
If it feels over-engineered at the moment, happy to keep it out for now.
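
For context, here is a rough sketch of what controlling the install from Python could look like: an allowlist of hub URIs, each installed by shelling out to the same CLI command quoted above. This is not the contents of hub_loader.py. `ALLOWED_VALIDATORS`, `build_install_command`, and `install_all` are hypothetical names, and the allowlist contains only the one validator mentioned in this description.

```python
import subprocess

# Hypothetical allowlist; regex_match is the only validator named in this PR description.
ALLOWED_VALIDATORS = ("hub://guardrails/regex_match",)


def build_install_command(uri):
    """Return the CLI command for one validator, rejecting anything off the allowlist."""
    if uri not in ALLOWED_VALIDATORS:
        raise ValueError(f"Validator not on the allowlist: {uri}")
    return ["guardrails", "hub", "install", uri]


def install_all(run=subprocess.run):
    """Install every allowlisted validator; `run` is injectable so tests need no CLI."""
    for uri in ALLOWED_VALIDATORS:
        # check=True makes a failed install raise instead of passing silently.
        run(build_install_command(uri), check=True)
```

Calling `install_all()` at startup would then behave like running the CLI commands from a Dockerfile, with the list of permitted validators living in code and under review.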

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, it is tested and has test cases.

Notes

Please add here if any other information is required for the reviewer.

@dennyabrain

It's a nitpick, but the filenames are inconsistent. Some have capital letters (Curated_Slurlist_Hindi_English.csv), and some don't separate the words (pdfanonymizer.py). Let's be consistent with the rest of the code and make them all snake case? So curated_slurlist_hi_en.csv and pdf_anonymizer.py.

@rkritika1508
Author

Makes sense. By the way, should we store the CSV in the safety folder, or somewhere else? What do you suggest? Putting it in something like blob storage seems overkill.

@dennyabrain

It's OK to keep it in a CSV here for 0.1. It makes for an easy demo too. We can discuss remote storage later, when they might need to customize the slur list per request or per client.

@dennyabrain dennyabrain marked this pull request as draft November 28, 2025 10:27
@rkritika1508 rkritika1508 changed the title Added code for lexical slur and PII remover validators Added code for lexical slur validator Dec 1, 2025
@dennyabrain dennyabrain marked this pull request as ready for review December 2, 2025 12:01