feat(genai): samples for batch predict #13120

Merged
msampathkumar merged 6 commits into main from vb/py-genaisdk-samples-batch-predict on Feb 4, 2025

Valeriy-Burlaka
Member

@Valeriy-Burlaka Valeriy-Burlaka commented Jan 31, 2025

Description

Add samples for batch prediction with genai SDK.

Checklist

@Valeriy-Burlaka Valeriy-Burlaka self-assigned this Jan 31, 2025
@Valeriy-Burlaka Valeriy-Burlaka requested review from a team as code owners January 31, 2025 13:34
@Valeriy-Burlaka Valeriy-Burlaka marked this pull request as draft January 31, 2025 13:34

snippet-bot bot commented Jan 31, 2025

Here is the summary of changes.

You are about to add 2 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.

@product-auto-label product-auto-label bot added the samples Issues that are directly related to samples. label Jan 31, 2025
Contributor

@code-review-assist code-review-assist bot left a comment

Hi team,

I'm currently reviewing this pull request and will provide my feedback shortly. In the meantime, here's a summary for you and other reviewers to quickly get up to speed:

This pull request, submitted by Valeriy-Burlaka, adds the ability to perform batch predictions using Google Cloud Storage (GCS). Let's break down the changes:

Intent:

The primary goal is to enhance the existing functionality by enabling users to leverage GCS for batch prediction tasks. This likely improves efficiency and scalability for large-scale prediction jobs.

Changes:

  • A new Python file, genai/batch_predict/batch_prediction_with_gcs.py, was added. This file contains the core logic for initiating and monitoring batch prediction jobs using the Google GenAI API. The code uses the google-genai library to interact with the GenAI service, specifically using the client.batches.create and client.batches.get methods to manage the job lifecycle. The job's state is polled until it reaches a terminal state (succeeded, failed, cancelled, or paused).
  • genai/batch_predict/requirements.txt and genai/batch_predict/requirements-test.txt were updated to include necessary dependencies, such as google-genai, google-api-core, google-cloud-storage, and pytest.
  • A new test file, genai/batch_predict/test_batch_predict.py, was added to ensure the functionality works as expected. This test uses a pytest fixture to manage a temporary GCS bucket for output, and asserts that the job completes successfully and the output is written to the specified GCS location.
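For reviewers skimming the thread, the poll-until-terminal behaviour described in the first bullet can be sketched roughly as below. This is not the code from the PR: `wait_for_terminal_state`, the `BatchJob` shape, the state-name strings, the 30-second interval, and the `max_polls` cap are all assumptions made for illustration. The job lookup is passed in as a callable so the sketch runs without the SDK installed.

```python
import time
from typing import Callable, Collection, Protocol


class BatchJob(Protocol):
    """Minimal shape assumed for a job object: a name and a state."""

    name: str
    state: str


# Assumed state names for the four terminal states listed in the summary
# (succeeded, failed, cancelled, paused). The real SDK values may differ.
TERMINAL_STATES = frozenset(
    {
        "JOB_STATE_SUCCEEDED",
        "JOB_STATE_FAILED",
        "JOB_STATE_CANCELLED",
        "JOB_STATE_PAUSED",
    }
)


def wait_for_terminal_state(
    job: BatchJob,
    get_job: Callable[[str], BatchJob],
    *,
    terminal_states: Collection[str] = TERMINAL_STATES,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> BatchJob:
    """Re-fetch the job until its state is terminal, then return it.

    Raises TimeoutError if the job is still not terminal after
    `max_polls` re-fetches, so a stuck job cannot block forever.
    """
    polls = 0
    while job.state not in terminal_states:
        if polls >= max_polls:
            raise TimeoutError(
                f"{job.name} still in state {job.state} after {polls} polls"
            )
        sleep(poll_seconds)
        job = get_job(job.name)  # hypothetical lookup supplied by the caller
        polls += 1
    return job
```

With the google-genai client mentioned above, `get_job` would presumably wrap `client.batches.get`, for example `lambda name: client.batches.get(name=name)`; the keyword argument here is an assumption, not something stated in this thread. Injecting `sleep` keeps the helper fast to exercise in tests.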

Activity:

The PR description mentions that it addresses issue # and includes a checklist for ensuring code quality and compliance. The author has also indicated that they've followed the sample guidelines and updated relevant files such as the README and CODEOWNERS.

I'll provide a more detailed review shortly. In the meantime, please let me know if you have any questions.

And now, a little haiku to lighten the mood:

Batch jobs now fly high,
GCS wings them to the clouds,
Data's swift descent.

Contributor

@code-review-assist code-review-assist bot left a comment

This pull request introduces batch prediction using Google Cloud Storage (GCS). The implementation looks good overall. However, there are a few minor points to address and some suggestions for improvement.

I've summarized the style guide elements referenced below:

  • Descriptive Function Names: Function names should clearly describe their purpose.
  • Docstrings: All functions and methods should have docstrings explaining their purpose, arguments, and return values.
  • Type Hinting: Use type hints to improve code readability and maintainability.
  • Error Handling: Handle potential errors gracefully and provide informative error messages.
  • Testing: Ensure sufficient test coverage, including edge cases and error scenarios.

genai/batch_predict/batch_prediction_with_gcs.py (outdated, resolved)
genai/batch_predict/test_batch_predict.py (outdated, resolved)
@Valeriy-Burlaka Valeriy-Burlaka force-pushed the vb/py-genaisdk-samples-batch-predict branch 2 times, most recently from 8b63eee to 93fd032 Compare January 31, 2025 17:29
@Valeriy-Burlaka Valeriy-Burlaka force-pushed the vb/py-genaisdk-samples-batch-predict branch from 93fd032 to 3060801 Compare January 31, 2025 17:37
@Valeriy-Burlaka Valeriy-Burlaka changed the title feat: batch predict with GCS feat(genai): samples for batch predict Jan 31, 2025
@Valeriy-Burlaka Valeriy-Burlaka force-pushed the vb/py-genaisdk-samples-batch-predict branch 2 times, most recently from 028a11f to 65ae82a Compare February 3, 2025 11:27
@Valeriy-Burlaka Valeriy-Burlaka force-pushed the vb/py-genaisdk-samples-batch-predict branch from 65ae82a to 427846f Compare February 3, 2025 11:29
@Valeriy-Burlaka Valeriy-Burlaka marked this pull request as ready for review February 3, 2025 11:50
Member

@msampathkumar msampathkumar left a comment

Need minor updates. PTAL.

Valeriy-Burlaka and others added 4 commits February 3, 2025 18:11
Co-authored-by: code-review-assist[bot] <182814678+code-review-assist[bot]@users.noreply.github.com>
@Valeriy-Burlaka
Member Author

Hi @msampathkumar! I updated the PR; please take another look.

Member

@msampathkumar msampathkumar left a comment

LGTM. Passing all the tests. Nice work.

@msampathkumar msampathkumar merged commit 27454d8 into main Feb 4, 2025
14 checks passed
@msampathkumar msampathkumar deleted the vb/py-genaisdk-samples-batch-predict branch February 4, 2025 11:34