Fix: Make CI/CD pipeline robust and non-interactive #12

Closed
heilcheng wants to merge 2 commits into main from fix/cicd-pipeline

@heilcheng
Owner

Summary

This pull request resolves the issues that caused the integration-test job to fail in CI/CD and makes the pipeline more robust: linting, security-scan, and type-checking failures now fail the build.

🔧 Interactive Script Fix

Problem: The examples/basic_usage.py script was hanging during CI/CD execution due to input() calls waiting for user interaction.

Solution:

  • ✅ Added os.isatty() check to detect non-interactive environments (CI/CD)
  • ✅ Automatically uses dummy model by default in CI/CD environments
  • ✅ Skips all user prompts and runs minimal evaluation in CI mode
  • ✅ Preserves full interactive functionality when run manually
  • ✅ Proper error handling and variable scope management

Before:

choice = input("Enter choice (1 or 2, default=1): ").strip()
run_full = input("\nRun full benchmark evaluation? (y/N): ").strip().lower()

After:

if is_interactive:
    choice = input("Enter choice (1 or 2, default=1): ").strip()
    run_full = input("\nRun full benchmark evaluation? (y/N): ").strip().lower()
else:
    # Non-interactive mode (CI/CD) - use dummy model by default
    choice = "1"
    run_full = 'y'
    print("Non-interactive environment detected, using dummy model...")
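The snippet above relies on an is_interactive flag whose definition is not shown in this description. A minimal sketch of one way to compute it follows, assuming the check is made against standard input. The helper name detect_interactive and the extra CI environment-variable check are illustrative assumptions, not taken from the PR. Note that os.isatty() takes a file descriptor argument; sys.stdin.isatty() is the stream-level counterpart used here.

```python
import os
import sys


def detect_interactive(stream=None, env=None):
    """Return True only when prompting the user is likely to work.

    Hypothetical helper for illustration; the PR's actual check may differ.
    """
    stream = sys.stdin if stream is None else stream
    env = os.environ if env is None else env

    # GitHub Actions exports CI=true, so treat that as non-interactive.
    if env.get("CI", "").lower() in ("1", "true"):
        return False

    try:
        # For a real stdin this is roughly os.isatty(stream.fileno()).
        return bool(stream.isatty())
    except (AttributeError, ValueError):
        # stdin is missing (None) or already closed: do not prompt.
        return False


is_interactive = detect_interactive()
```

Checking the CI variable in addition to the TTY test is a design choice: some runners allocate a pseudo-terminal, in which case the TTY test alone would still report an interactive session.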

🛡️ CI/CD Workflow Improvements

Problem: The CI/CD pipeline did not fail builds when it should have, because permissive flags treated errors as warnings.

Changes Made:

1. Linting (flake8)

  • Removed: --exit-zero flag that treated all errors as warnings
  • Added: examples/ directory to linting scope
  • 🎯 Result: Linting errors now properly fail the build

2. Security Scanning (bandit)

  • Removed: || true that prevented security failures from failing the build
  • 🎯 Result: Security vulnerabilities now properly fail the build

3. Type Checking (mypy)

  • Removed: --ignore-missing-imports flag, so unresolved imports are now reported
  • Added: --strict-optional flag, which makes strict Optional checking explicit (mypy has enabled it by default since 0.600)
  • 🎯 Result: Type errors now properly fail the build with better coverage
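The flake8 and bandit changes above come down to exit codes: a CI step fails only when its command exits with a non-zero status, and both --exit-zero and || true force that status to zero. The following sketch illustrates the masking effect on a POSIX shell. It uses the standard false command as a stand-in for a check that found a problem, and the helper name exit_code is hypothetical.

```python
import subprocess


def exit_code(shell_command):
    """Run a POSIX shell command and return its exit status."""
    return subprocess.run(["sh", "-c", shell_command]).returncode


# `false` stands in for any check (a linter, a scanner) that found a problem.
strict = exit_code("false")          # status is propagated to the caller
masked = exit_code("false || true")  # `|| true` replaces the failure with 0

print(f"strict step exit code: {strict}")
print(f"masked step exit code: {masked}")
```

The strict command reports a non-zero status that a CI runner would treat as a failed step, while the masked command reports zero even though the underlying check failed.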

📊 Impact

Before (Problematic)

  • ❌ Integration tests hung indefinitely on input() calls
  • ⚠️ Linting errors ignored with --exit-zero
  • ⚠️ Security vulnerabilities ignored with || true
  • ⚠️ Missing imports ignored in type checking

After (Robust)

  • ✅ Integration tests run automatically without user interaction
  • ✅ Linting errors fail the build appropriately
  • ✅ Security vulnerabilities fail the build appropriately
  • ✅ Type checking is more comprehensive and strict

🧪 Testing

Non-Interactive Mode Verification:

# Runs without hanging, uses dummy model automatically
python examples/basic_usage.py < /dev/null

Interactive Mode Preserved:

# Still prompts for user input when run manually
python examples/basic_usage.py
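The non-interactive check can also be automated so that a regression surfaces as a test failure rather than a stalled job. This is a sketch only: the helper name runs_without_input and the 60-second timeout are assumptions, not part of this PR. It runs a command with standard input attached to the null device, which mirrors the < /dev/null redirection above, and treats a timeout or a non-zero exit as a failure.

```python
import subprocess
import sys


def runs_without_input(argv, timeout_seconds=60):
    """Run argv with no usable stdin; return True if it exits 0 in time."""
    try:
        result = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,  # same effect as `< /dev/null`
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after the timeout, e.g. a blocked prompt.
        return False
    # An unguarded input() on empty stdin raises EOFError, giving exit != 0.
    return result.returncode == 0


if __name__ == "__main__":
    ok = runs_without_input([sys.executable, "examples/basic_usage.py"])
    print("non-interactive run passed" if ok else "non-interactive run failed")
```

The timeout matters because a script waiting on a terminal that never sends input would otherwise block the job until the CI provider's own limit is reached.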

🎯 Files Changed

  1. examples/basic_usage.py

    • Added environment detection with os.isatty()
    • Implemented non-interactive defaults
    • Fixed variable scope issues
    • Enhanced error handling for CI/CD mode
  2. .github/workflows/ci.yml

    • Removed permissive flags from linting, security, and type checking
    • Enhanced coverage and strictness
    • Ensured proper build failures on issues

Verification Checklist

  • Integration tests no longer hang on user input
  • Script works in both interactive and non-interactive modes
  • Linting errors will fail the build
  • Security vulnerabilities will fail the build
  • Type checking is more comprehensive
  • All existing functionality preserved for manual use
  • No breaking changes to the public API

🚀 Benefits

  1. Reliable CI/CD: Integration tests now run automatically without manual intervention
  2. Better Code Quality: Stricter linting, security, and type checking standards
  3. Faster Feedback: Developers get immediate feedback on code quality issues
  4. Maintainability: Easier to catch and fix issues before they reach production
  5. User Experience: Interactive script still works perfectly for manual testing

This PR ensures the MEQ-Bench CI/CD pipeline is robust, reliable, and maintains high code quality standards while preserving the excellent user experience for manual testing and development.


🤖 Generated with Claude Code

heilcheng added 2 commits July 4, 2025 23:00
**Interactive Script Fix:**
- Add os.isatty() check to detect non-interactive environments
- Use dummy model by default in CI/CD environments
- Skip user prompts and run minimal evaluation in CI mode
- Preserve all interactive functionality for manual use
- Ensures integration tests can run without hanging on input()

**CI/CD Workflow Improvements:**
- Remove --exit-zero flag from flake8 to fail build on linting errors
- Remove || true from bandit to fail build on security vulnerabilities
- Remove --ignore-missing-imports from mypy for stricter type checking
- Add examples/ directory to flake8 linting scope
- Add --strict-optional flag to mypy for better type safety

These changes make the CI/CD pipeline more robust by ensuring that
linting errors, security issues, and type checking problems will
properly fail the build instead of being ignored.

- Set choice variable in non-interactive branch to prevent UnboundLocalError
- Script now properly runs without hanging on input() calls
- Maintains compatibility for both interactive and CI/CD environments
@heilcheng heilcheng closed this Jul 4, 2025
@heilcheng heilcheng deleted the fix/cicd-pipeline branch December 28, 2025 17:45