Describe the bug
feat[field].fillna(value=feat[field].mean(), inplace=True)
12 Jun 10:59 INFO Saving filtered dataset into [saved/bert4recbole-SequentialDataset.pth]
12 Jun 10:59 INFO bert4recbole
The number of users: 93328
Average actions of users: 2185.097806636879
The number of items: 93329
Average actions of items: 2185.1212202387333
The number of inters: 203928623
The sparsity of the dataset: 97.65874016271815%
Remain Fields: ['user_id', 'item_id', 'timestamp', 'area_id']
12 Jun 11:45 INFO Saving split dataloaders into: [saved/bert4recbole-for-BERT4Rec-dataloader.pth]
Traceback (most recent call last):
File "/data1/bert4rec/bert4rec-main/venv/lib/python3.10/site-packages/torch/serialization.py", line 632, in save
_legacy_save(obj, opened_file, pickle_module, pickle_protocol)
File "/data1/bert4rec/bert4rec-main/venv/lib/python3.10/site-packages/torch/serialization.py", line 776, in _legacy_save
storage._write_file(f, _should_read_directly(f), True, torch._utils._element_size(dtype))
MemoryError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data1/bert4rec/bert4rec-main/scripts/bole/run.py", line 7, in
run_recbole(model='BERT4Rec', dataset=r'bert4recbole',
File "/data1/bert4rec/bert4rec-main/venv/lib/python3.10/site-packages/recbole/quick_start/quick_start.py", line 133, in run_recbole
train_data, valid_data, test_data = data_preparation(config, dataset)
File "/data1/bert4rec/bert4rec-main/venv/lib/python3.10/site-packages/recbole/data/utils.py", line 194, in data_preparation
save_split_dataloaders(
File "/data1/bert4rec/bert4rec-main/venv/lib/python3.10/site-packages/recbole/data/utils.py", line 99, in save_split_dataloaders
pickle.dump(Serialization_dataloaders, f)
File "/data1/bert4rec/bert4rec-main/venv/lib/python3.10/site-packages/torch/storage.py", line 951, in reduce
torch.save(self, b, _use_new_zipfile_serialization=False)
File "/data1/bert4rec/bert4rec-main/venv/lib/python3.10/site-packages/torch/serialization.py", line 631, in save
with _open_file_like(f, 'wb') as opened_file:
File "/data1/bert4rec/bert4rec-main/venv/lib/python3.10/site-packages/torch/serialization.py", line 439, in exit
self.file_like.flush()
ValueError: I/O operation on closed file.
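Reading the traceback: pickling the split dataloaders routes every torch storage through the legacy serializer, which first writes the whole storage into an in-memory buffer. With roughly 204 million interactions that buffer appears to exhaust RAM and raises the MemoryError; while that exception unwinds, the serializer's context manager flushes a buffer that has already been closed, producing the secondary ValueError. A tiny stand-alone illustration of the same serialization path (this is not RecBole code, and the tensor is a trivial stand-in):

import io
import pickle
import torch

t = torch.zeros(10)  # stand-in for the ~204M-interaction tensors in the dataloaders
buf = io.BytesIO()
pickle.dump(t, buf)  # per the traceback, this goes through Storage.__reduce__,
                     # which calls torch.save into an in-memory BytesIO buffer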
How to reproduce
Steps to reproduce the bug:
YAML file:
gpu_id: '0,1,2,3'
worker: 0
# model config
n_layers: 2 # (int) The number of transformer layers in transformer encoder.
n_heads: 2 # (int) The number of attention heads for multi-head attention layer.
hidden_size: 64 # (int) The number of features in the hidden state.
inner_size: 256 # (int) The inner hidden size in feed-forward layer.
hidden_dropout_prob: 0.2 # (float) The probability of an element to be zeroed.
attn_dropout_prob: 0.2 # (float) The probability of an attention score to be zeroed.
hidden_act: 'gelu' # (str) The activation function in feed-forward layer.
layer_norm_eps: 1e-12 # (float) A value added to the denominator for numerical stability.
initializer_range: 0.02 # (float) The standard deviation for normal initialization.
mask_ratio: 0.2 # (float) The probability of an item being replaced by the MASK token.
loss_type: 'CE' # (str) The type of loss function.
transform: mask_itemseq # (str) The transform operation for batch data process.
ft_ratio: 0.5 # (float) The probability of generating fine-tuning samples
# dataset config
field_separator: "," # column separator in the dataset files
seq_separator: " " # separator inside token_seq / float_seq fields
USER_ID_FIELD: user_id # name of the user-ID field
ITEM_ID_FIELD: item_id # name of the item-ID field
TIME_FIELD: timestamp # name of the timestamp field
MAX_ITEM_LIST_LENGTH: 50 # maximum item-sequence length
save_dataset: True # save the processed dataset to disk
save_dataloaders: True # save the split dataloaders to disk
# Which columns to load from which file: user_id, item_id and timestamp from the .inter file,
# and item_id and area_id from the .item file (sample files are sketched after this config)
load_col:
inter: [user_id, item_id, timestamp]
item: [item_id, area_id]
# training settings
epochs: 500 # maximum number of training epochs
train_batch_size: 128 # training batch size
learner: adam # built-in PyTorch optimizer to use
learning_rate: 0.001 # learning rate
training_neg_sample_num: 0 # number of negative samples
eval_step: 1 # run evaluation after every training epoch
stopping_step: 10 # early-stopping patience: stop if the validation metric has not improved within this many evaluations
# evaluation settings
eval_setting: TO_LS,full # sort by time, leave-one-out split, full ranking over all items (see the note after this config)
metrics: ["Recall", "MRR", "NDCG", "Hit", "Precision"] # evaluation metrics
valid_metric: MRR@10 # metric used as the early-stopping criterion
eval_batch_size: 8 # evaluation batch size
show_progress: True
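Note that eval_setting: TO_LS,full is the legacy pre-1.0 syntax. On newer RecBole versions the rough equivalent, sketched here as an assumption rather than a verified mapping, is:

eval_args:
  split: {'LS': 'valid_and_test'}
  order: TO
  mode: full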
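For reference, the load_col section above implies RecBole atomic files shaped like the following, using the "," field separator. The type annotations after each colon follow RecBole's atomic-file convention; the sample rows and the area_id type are assumptions:

bert4recbole.inter:
user_id:token,item_id:token,timestamp:float
1,1193,978300760

bert4recbole.item:
item_id:token,area_id:token
1193,12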
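A minimal launcher matching the scripts/bole/run.py call shown in the traceback, assuming the configuration above is saved as bert4rec.yaml (the filename is an assumption):

from recbole.quick_start import run_recbole

if __name__ == "__main__":
    run_recbole(
        model="BERT4Rec",
        dataset="bert4recbole",
        config_file_list=["bert4rec.yaml"],  # assumed filename for the config above
    )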
Expected behavior
The "I/O operation on closed file" error shown above occurs.
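Until the underlying MemoryError is addressed, one possible workaround, assuming the cached dataloaders are not needed between runs, is to turn off save_dataloaders so pickle never has to buffer the full dataset in memory:

from recbole.quick_start import run_recbole

run_recbole(
    model="BERT4Rec",
    dataset="bert4recbole",
    config_dict={"save_dataloaders": False},  # overrides the YAML setting above
)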
Environment (please complete the following information):