80 changes: 63 additions & 17 deletions mssql_python/cursor.py
@@ -2452,7 +2452,18 @@ def nextset(self) -> Union[bool, None]:
return True

def _bulkcopy(
self, table_name: str, data: Iterable[Union[Tuple, List]], **kwargs
Contributor

Could you please provide more details in the PR summary for this change?
This API change may not scale well in future, since adding any new parameter would require a change to the API contract, which is not advisable.

Collaborator Author

The conversation around the new API spec is happening in GitHub Discussions topic #414.
I'll update the PR summary with the new spec details and a link to the above discussion shortly.
Will ping once done.

Collaborator Author

Done, updated. Please recheck and let me know if you have any comments.

Contributor

@subrata-ms subrata-ms Feb 3, 2026

I am still not convinced about moving away from **kwargs completely.
Can we group the parameters logically into a couple of parameters?

> I am still not convinced about moving away from **kwargs completely. Can we group the parameters logically into a couple of parameters?

Hi @subrata-ms, can you please be more specific with regard to why you're not convinced here? We discussed in the thread linked above and those of us on the end user side who decided to participate seemed to agree that this is a good design. As someone who has worked with SQL Server bulk copy libraries for over 20 years in a number of different languages I feel that this is the best possible option from a usability perspective: The parameters will be obvious and easily discovered by users. Introducing a required external object here won't be especially helpful and users will just have to jump through another hoop when making small changes. And while it does seem like there are a touch more parameters here than we usually see, the real concern should be forward maintainability -- but there is none. These basic parameters haven't changed in all of the time I've been working with SQL Server and I seriously doubt they ever will.

Contributor

@subrata-ms subrata-ms Feb 4, 2026

Hi @amachanic, I would suggest we discuss these API design options in the API Design Review forum (#414) so we can gather broader feedback.
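
For readers following this thread, the three designs under discussion can be sketched side by side. This is only an illustration: the function names, the `BulkCopyOptions` dataclass, and the reduced parameter set are hypothetical and are not part of mssql_python.

```python
import inspect
from dataclasses import asdict, dataclass
from typing import Any, Dict, Iterable, Optional


def _set_only(options: Dict[str, Any]) -> Dict[str, Any]:
    """Drop options that were left as None."""
    return {k: v for k, v in options.items() if v is not None}


# Design 1: opaque **kwargs (the shape of the signature before this PR).
def bulkcopy_kwargs(table_name: str, data: Iterable, **kwargs: Any) -> Dict[str, Any]:
    return {"table": table_name, **kwargs}


# Design 2: explicit keyword parameters (the direction taken by this PR).
def bulkcopy_explicit(
    table_name: str,
    data: Iterable,
    batch_size: Optional[int] = None,
    timeout: Optional[int] = 30,
    table_lock: Optional[bool] = None,
) -> Dict[str, Any]:
    options = {"batch_size": batch_size, "timeout": timeout, "table_lock": table_lock}
    return {"table": table_name, **_set_only(options)}


# Design 3: options grouped into one object (hypothetical class name).
@dataclass(frozen=True)
class BulkCopyOptions:
    batch_size: Optional[int] = None
    timeout: Optional[int] = 30
    table_lock: Optional[bool] = None


def bulkcopy_grouped(
    table_name: str, data: Iterable, options: Optional[BulkCopyOptions] = None
) -> Dict[str, Any]:
    return {"table": table_name, **_set_only(asdict(options or BulkCopyOptions()))}


# Discoverability: only the explicit form exposes the option names in its signature.
print(list(inspect.signature(bulkcopy_kwargs).parameters))
print(list(inspect.signature(bulkcopy_explicit).parameters))
```

One practical difference: a misspelled option such as `batch_sise=10` raises `TypeError` at the call site in designs 2 and 3, while design 1 accepts it and leaves detection to whatever consumes the dictionary.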

self,
table_name: str,
data: Iterable[Union[Tuple, List]],
batch_size: Optional[int] = None,
Copilot AI Jan 30, 2026

batch_size is typed as Optional[int], but the validation accepts floats (isinstance(batch_size, (int, float))) while the error message says “positive integer”. Please align the type hint, validation, and message (either require an int, or explicitly support non-integer values and document why).

Suggested change
batch_size: Optional[int] = None,
batch_size: Optional[Union[int, float]] = None,
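
The other option named in the comment above, keeping the `Optional[int]` hint and making the check match it, could look roughly like the sketch below. The helper name `validate_batch_size` and the choice of `TypeError`/`ValueError` are assumptions; the exception types mssql_python actually raises are not visible in this diff.

```python
from typing import Optional


def validate_batch_size(batch_size: Optional[int]) -> Optional[int]:
    """Return batch_size if it is None or a positive int; raise otherwise."""
    if batch_size is None:
        # None means "not specified": leave the choice to the backend.
        return None
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(batch_size, bool) or not isinstance(batch_size, int):
        raise TypeError(
            f"batch_size must be a positive integer, got {type(batch_size).__name__}"
        )
    if batch_size <= 0:
        raise ValueError(f"batch_size must be a positive integer, got {batch_size}")
    return batch_size
```

With this shape the type hint, the `isinstance` check, and the "positive integer" message all describe the same contract, and a float such as `1000.0` is rejected rather than silently accepted.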

timeout: Optional[int] = 30,
column_mappings: Optional[List[Tuple[Union[str, int], str]]] = None,
keep_identity: Optional[bool] = None,
check_constraints: Optional[bool] = None,
table_lock: Optional[bool] = None,
keep_nulls: Optional[bool] = None,
fire_triggers: Optional[bool] = None,
use_internal_transaction: Optional[bool] = None,
): # pragma: no cover
"""
Perform bulk copy operation for high-performance data loading.
@@ -2471,20 +2482,33 @@ def _bulkcopy(
- The number of values in each row must match the number of columns
in the target table

**kwargs: Additional bulk copy options.
batch_size: Number of rows to send per batch. Default uses server optimal.

timeout: Operation timeout in seconds. Default is 30.

column_mappings: Maps source data columns to target table column names.
Each tuple is (source, target_column_name) where:
- source: Column name (str) or 0-based index (int) in the source data
- target_column_name: Name of the target column in the database table

When omitted: Columns are mapped by ordinal position (first data
column → first table column, second → second, etc.)

When specified: Only the mapped columns are inserted; unmapped
source columns are ignored, and unmapped target columns must
have default values or allow NULL.

keep_identity: Preserve identity values from source data.

column_mappings (List[Tuple[int, str]], optional):
Maps source data column indices to target table column names.
Each tuple is (source_index, target_column_name) where:
- source_index: 0-based index of the column in the source data
- target_column_name: Name of the target column in the database table
check_constraints: Check constraints during bulk copy.

When omitted: Columns are mapped by ordinal position (first data
column → first table column, second → second, etc.)
table_lock: Use table-level lock instead of row-level locks.

When specified: Only the mapped columns are inserted; unmapped
source columns are ignored, and unmapped target columns must
have default values or allow NULL.
keep_nulls: Preserve null values instead of using default values.

fire_triggers: Fire insert triggers on the target table.

use_internal_transaction: Use an internal transaction for each batch.

Returns:
Dictionary with bulk copy results including:
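
The `column_mappings` rules in the docstring of this hunk can be illustrated with a small pure-Python model. This is a sketch of the documented semantics only, not the library's implementation (the real mapping happens in the native `mssql_py_core` layer); `map_row` and its `source_columns` argument, which is used here to resolve string sources, are hypothetical.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple, Union

ColumnMapping = Tuple[Union[str, int], str]


def map_row(
    row: Sequence[Any],
    target_columns: Sequence[str],
    column_mappings: Optional[List[ColumnMapping]] = None,
    source_columns: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Return {target_column: value} for one row under the documented rules."""
    if column_mappings is None:
        # Ordinal mapping: first value -> first table column, and so on.
        if len(row) != len(target_columns):
            raise ValueError("row length must match the number of target columns")
        return dict(zip(target_columns, row))

    mapped: Dict[str, Any] = {}
    for source, target in column_mappings:
        if isinstance(source, int):
            index = source  # 0-based index into the source row
        else:
            if source_columns is None:
                raise ValueError("string sources need source column names")
            index = list(source_columns).index(source)
        mapped[target] = row[index]
    # Unmapped source values are ignored; unmapped target columns are left
    # to the database (default value or NULL).
    return mapped


row = (1, "Ada", "unused")
by_index = map_row(row, ["id", "name", "notes"], [(0, "id"), (1, "name")])
by_name = map_row(
    row,
    ["id", "name", "notes"],
    [("src_name", "name")],
    source_columns=["src_id", "src_name", "src_extra"],
)
```

Here `by_index` contains only `id` and `name`, and `by_name` contains only `name`; in both cases the third source value is dropped.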
@@ -2523,10 +2547,6 @@ def _bulkcopy(
f"data must be an iterable of tuples or lists, got non-iterable {type(data).__name__}"
)

# Extract and validate kwargs with defaults
batch_size = kwargs.get("batch_size", None)
timeout = kwargs.get("timeout", 30)

# Validate batch_size type and value (only if explicitly provided)
if batch_size is not None:
if not isinstance(batch_size, (int, float)):
@@ -2599,7 +2619,33 @@ def _bulkcopy(
pycore_connection = mssql_py_core.PyCoreConnection(pycore_context)
pycore_cursor = pycore_connection.cursor()

result = pycore_cursor.bulkcopy(table_name, iter(data), **kwargs)
# Build kwargs dynamically - only pass non-None values
# This lets PyO3/Rust use its defined defaults for unspecified params
bulkcopy_kwargs = {}
if batch_size is not None:
bulkcopy_kwargs["batch_size"] = batch_size
if timeout is not None:
bulkcopy_kwargs["timeout"] = timeout
if column_mappings is not None:
bulkcopy_kwargs["column_mappings"] = column_mappings
if keep_identity is not None:
bulkcopy_kwargs["keep_identity"] = keep_identity
if check_constraints is not None:
bulkcopy_kwargs["check_constraints"] = check_constraints
if table_lock is not None:
bulkcopy_kwargs["table_lock"] = table_lock
if keep_nulls is not None:
bulkcopy_kwargs["keep_nulls"] = keep_nulls
if fire_triggers is not None:
bulkcopy_kwargs["fire_triggers"] = fire_triggers
if use_internal_transaction is not None:
bulkcopy_kwargs["use_internal_transaction"] = use_internal_transaction

result = pycore_cursor.bulkcopy(
table_name,
iter(data),
**bulkcopy_kwargs,
)

return result
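
The "only pass non-None values" pattern in the hunk above can also be expressed as a single filter. The sketch below uses a stand-in function in place of `pycore_cursor.bulkcopy`; the names `drop_none` and `fake_backend_bulkcopy`, and the stand-in's default values, are invented for illustration and say nothing about the real defaults on the Rust side.

```python
from typing import Any, Dict, Iterator


def drop_none(**options: Any) -> Dict[str, Any]:
    """Keep only the options the caller actually set (value is not None)."""
    return {name: value for name, value in options.items() if value is not None}


def fake_backend_bulkcopy(
    table_name: str,
    rows: Iterator,
    batch_size: int = 500,      # invented default
    timeout: int = 30,          # invented default
    table_lock: bool = False,   # invented default
) -> Dict[str, Any]:
    """Stand-in for the native call; it applies its own defaults."""
    return {"batch_size": batch_size, "timeout": timeout, "table_lock": table_lock}


# Only timeout was set by the caller, so only timeout is forwarded.
forwarded = drop_none(batch_size=None, timeout=60, table_lock=None)
result = fake_backend_bulkcopy("dbo.target", iter([(1, "a")]), **forwarded)
```

The filter tests `is not None` rather than truthiness, so an explicit `False` (for example `keep_identity=False`) is still forwarded, which matches the behavior of the chained `if` statements in the diff.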
