configure random seed dtype based on backend #19928

haohuanw · 2024-06-27T05:41:12Z

Seeing the error below when running TensorFlow distributed training with multiple workers. The seed state being broadcast has a dtype (uint32) that TensorFlow's collective broadcast op doesn't support:

Exception encountered: ''Value for attr 'T' of uint32 is not in the list of allowed values: bool, float, half, double, int32, int64
	; NodeDef: {{node CollectiveBcastSend}}; Op<name=CollectiveBcastSend; signature=input:T -> data:T; attr=T:type,allowed=[DT_BOOL, DT_FLOAT, DT_HALF, DT_DOUBLE, DT_INT32, DT_INT64]; attr=group_size:int; attr=group_key:int; attr=instance_key:int; attr=shape:shape; attr=communication_hint:string,default="auto"; attr=timeout_seconds:float,default=0; is_stateful=true; is_distributed_communication=true> [Op:CollectiveBcastSend]''

This PR introduces a seed dtype function so that each backend can choose a seed dtype it actually supports.
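A rough sketch of the idea, for illustration only and not the code in this PR: the helper below takes the backend name as an argument and returns a dtype string. The function signature, the `COLLECTIVE_BCAST_DTYPES` name, and the choice of `int64` for TensorFlow are assumptions; the allowed dtype list is copied from the error message above, with `float`, `half`, and `double` written as `float32`, `float16`, and `float64`.

```python
# Dtypes accepted by CollectiveBcastSend, taken from the error message above.
COLLECTIVE_BCAST_DTYPES = {
    "bool", "float32", "float16", "float64", "int32", "int64",
}


def random_seed_dtype(backend):
    """Hypothetical per-backend lookup of the dtype used for seed state."""
    if backend == "tensorflow":
        # Assumption: any integer dtype in the allowed list avoids the error;
        # int64 is picked here purely for illustration.
        return "int64"
    # jax random seeds use uint32; other backends are assumed to keep
    # the same default in this sketch.
    return "uint32"


def can_broadcast(dtype):
    """Return True if the dtype is accepted by the collective broadcast op."""
    return dtype in COLLECTIVE_BCAST_DTYPES


if __name__ == "__main__":
    for name in ("tensorflow", "jax"):
        dtype = random_seed_dtype(name)
        print(name, dtype, "broadcastable:", can_broadcast(dtype))
```

With a lookup like this, the seed state variable would be created with `random_seed_dtype(...)` instead of a hard-coded `uint32`, so the TensorFlow path never hands an unsupported dtype to the broadcast.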

codecov-commenter · 2024-06-27T05:47:09Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.97%. Comparing base (c8a7f28) to head (8033c45).
Report is 21 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #19928      +/-   ##
==========================================
- Coverage   79.01%   78.97%   -0.04%     
==========================================
  Files         499      499              
  Lines       46441    46506      +65     
  Branches     8550     8561      +11     
==========================================
+ Hits        36694    36730      +36     
- Misses       8020     8044      +24     
- Partials     1727     1732       +5

Flag	Coverage Δ
keras	`78.83% <100.00%> (-0.04%)`	⬇️
keras-jax	`62.41% <50.00%> (+<0.01%)`	⬆️
keras-numpy	`57.32% <58.33%> (+0.10%)`	⬆️
keras-tensorflow	`63.60% <79.16%> (-0.03%)`	⬇️
keras-torch	`62.38% <50.00%> (+<0.01%)`	⬆️

fchollet

Thanks for the PR. This is a reasonable change.

fchollet · 2024-06-27T17:48:27Z

keras/src/backend/tensorflow/random.py

@@ -14,15 +14,15 @@ def tf_draw_seed(seed):

 def normal(shape, mean=0.0, stddev=1.0, dtype=None, seed=None):
    dtype = dtype or floatx()
-    seed = tf_draw_seed(seed)


You can remove the tf_draw_seed function then

fchollet · 2024-06-27T17:49:03Z

keras/src/backend/jax/core.py

@@ -346,6 +346,11 @@ def unstack(x, num=None, axis=0):
    ]


+def random_seed_dtype():
+    # jax random seed uses uint32.
+    return standardize_dtype("uint32")


standardize_dtype will be a no-op here. You can skip it.

fchollet

LGTM, thank you

fchollet · 2024-06-28T18:16:35Z

It seems this breaks TF GPU CI due to a strange issue -- TF automatically places int32 constants on CPU (this is different compared to every other dtype). I think using int64 instead would work -- trying it now.
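One side note on the int64 choice, sketched in plain Python with no TensorFlow calls: every uint32 seed value is representable in int64, whereas a signed 32-bit dtype cannot hold the upper half of the uint32 range without wrapping. The helper names below are hypothetical, and the sketch assumes seed values may span the full uint32 range.

```python
UINT32_MAX = 2**32 - 1
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def fits(value, lo, hi):
    """Return True if value lies in the closed range [lo, hi]."""
    return lo <= value <= hi


def wrap_to_int32(value):
    """Reinterpret the low 32 bits of value as a two's-complement int32."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT32_MAX else value


if __name__ == "__main__":
    print("uint32 max fits in int64:", fits(UINT32_MAX, INT64_MIN, INT64_MAX))
    print("uint32 max fits in int32:", fits(UINT32_MAX, INT32_MIN, INT32_MAX))
    print("uint32 max wrapped to int32:", wrap_to_int32(UINT32_MAX))
```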

configure random seed dtype based on backend

2826a0a

google-ml-butler bot added the size:M label Jun 27, 2024

google-ml-butler bot assigned gbaned Jun 27, 2024

fchollet reviewed Jun 27, 2024

address comment

8033c45

haohuanw requested a review from fchollet June 28, 2024 04:46

google-ml-butler bot added the awaiting review label Jun 28, 2024

gbaned added this to Assigned Reviewer in PR Queue via automation Jun 28, 2024

fchollet approved these changes Jun 28, 2024

google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Jun 28, 2024

PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Jun 28, 2024

fchollet merged commit 272af9c into keras-team:master Jun 28, 2024
6 checks passed

PR Queue automation moved this from Approved by Reviewer to Merged Jun 28, 2024

google-ml-butler bot removed awaiting review ready to pull Ready to be merged into the codebase kokoro:force-run labels Jun 28, 2024
