train error #87

cqray1990 · 2020-07-03T08:10:39Z

When I train HED using the data you provide, it raises:
Creating layer data
F0703 16:04:09.971015 2124 layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: ImageLabelmapData (known types: AbsVal, Accuracy, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Clip, Concat, ContrastiveLoss, Convolution, Crop, Data, Deconvolution, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Parameter, Pooling, Power, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax, SoftmaxWithLoss, Split, Swish, TanH, Threshold, Tile, WindowData)
*** Check failure stack trace: ***
@ 0x7f095cad75cd google::LogMessage::Fail()
@ 0x7f095cad9433 google::LogMessage::SendToLog()
@ 0x7f095cad715b google::LogMessage::Flush()
@ 0x7f095cad9e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f095cfb0db4 caffe::Net<>::Init()
@ 0x7f095cfb21ae caffe::Net<>::Net()
@ 0x7f095cfbd05a caffe::Solver<>::InitTrainNet()
@ 0x7f095cfbe625 caffe::Solver<>::Init()
@ 0x7f095cfbe93f caffe::Solver<>::Solver()
@ 0x7f095cfd9b91 caffe::Creator_SGDSolver<>()
@ 0x40c400 train()
@ 0x409260 main
@ 0x7f095b9cf830 __libc_start_main
@ 0x409a79 _start
@ (nil) (unknown)
Aborted (core dumped)
When I train the mnist example that ships with Caffe, there is no problem.
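The failed check says that this Caffe binary has no layer registered under the name `ImageLabelmapData`, and prints the types it does know. A minimal sketch for confirming which layer types a prototxt needs that a given build lacks, assuming the known types are read from the error message itself. The helper names `known_types_from_log`, `layer_types` and `missing_layer_types` are hypothetical and not part of Caffe:

```python
import re

def known_types_from_log(log_text):
    """Extract the '(known types: ...)' list from a layer_factory error line."""
    match = re.search(r'\(known types: ([^)]*)\)', log_text)
    if match is None:
        return set()
    return {name.strip() for name in match.group(1).split(',') if name.strip()}

def layer_types(prototxt_text):
    """Collect the values of layer-level `type:` fields in a prototxt string."""
    text = re.sub(r'#.*', '', prototxt_text)  # drop comments
    # Filler blocks also carry a `type:` field (e.g. "constant"); it names a
    # filler, not a layer, so remove those blocks before searching.
    text = re.sub(r'\w+_filler\s*\{[^{}]*\}', '', text)
    return set(re.findall(r'''\btype:\s*["']([^"']+)["']''', text))

def missing_layer_types(prototxt_text, log_text):
    """Layer types used by the net but absent from the build's registry."""
    return layer_types(prototxt_text) - known_types_from_log(log_text)

# Shortened copies of the log line and of the net from this issue.
log = ('Unknown layer type: ImageLabelmapData (known types: '
       'Concat, Convolution, Crop, Deconvolution, Pooling, ReLU, '
       'SigmoidCrossEntropyLoss)')
net = '''
layer { name: "data" type: "ImageLabelmapData" top: "data" top: "label" }
layer { bottom: 'data' top: 'conv1_1' name: 'conv1_1' type: "Convolution"
  convolution_param { num_output: 64 weight_filler {type: "constant" value: 0.2} } }
layer { bottom: 'conv1_1' top: 'conv1_1' name: 'relu1_1' type: "ReLU" }
'''
print(missing_layer_types(net, log))
```

An `Unknown layer type` failure generally means the layer's source was not compiled into the binary being run. Since the net path points into /home/lgx/caffe, it may be worth checking whether that tree is the Caffe copy bundled with HED or a stock one.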

here is my solver.prototxt:
net: "/home/lgx/caffe/examples/hed/train_val.prototxt"
test_iter: 0
test_interval: 1000000

# lr for fine-tuning should be lower than when starting from scratch
#debug_info: true
base_lr: 0.000001
lr_policy: "step"
gamma: 0.1
iter_size: 10

# stepsize should also be lower, as we're closer to being done
stepsize: 10000
display: 20
max_iter: 30001
momentum: 0.9
weight_decay: 0.0002
snapshot: 1000
snapshot_prefix: "hed"

# uncomment the following to default to CPU mode solving
# solver_mode: CPU
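For reference, a small sketch of what these solver settings imply, assuming Caffe's "step" policy (lr = base_lr * gamma ^ floor(iter / stepsize)) and that gradients are accumulated over `iter_size` passes before each weight update. The function name `step_lr` is made up for illustration:

```python
def step_lr(iteration, base_lr=0.000001, gamma=0.1, stepsize=10000):
    """Learning rate under the "step" policy with the values from this solver."""
    return base_lr * gamma ** (iteration // stepsize)

batch_size = 1   # from image_data_param in train_val.prototxt
iter_size = 10   # from solver.prototxt
effective_batch = batch_size * iter_size  # images contributing to one update

for it in (0, 10000, 20000, 30000):
    print(it, step_lr(it))
print('images per update:', effective_batch)
```

With max_iter: 30001 the rate is cut by a factor of ten at iterations 10000, 20000 and 30000.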

and train_val.prototxt
name: "HED"
layer {
name: "data"
type: "ImageLabelmapData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: false
mean_value: 104.00699
mean_value: 116.66877
mean_value: 122.67892
}
image_data_param {
root_folder: "/home/HED-BSDS/"
source: "/home/lgx/HED-BSDS/train_pair.lst"
batch_size: 1
shuffle: true
new_height: 0
new_width: 0
}
}
layer {
name: "data"
type: "ImageLabelmapData"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
mean_value: 104.00699
mean_value: 116.66877
mean_value: 122.67892
}
image_data_param {
root_folder: "/home/HED-BSDS/"
source: "/home/HED-BSDS/train_pair.lst"
#Just setup the network. No real online testing
batch_size: 1
shuffle: true
new_height: 0
new_width: 0
}
}

layer { bottom: 'data' top: 'conv1_1' name: 'conv1_1' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 64 pad: 35 kernel_size: 3 } }
layer { bottom: 'conv1_1' top: 'conv1_1' name: 'relu1_1' type: "ReLU" }
layer { bottom: 'conv1_1' top: 'conv1_2' name: 'conv1_2' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 64 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv1_2' top: 'conv1_2' name: 'relu1_2' type: "ReLU" }
layer { name: 'pool1' bottom: 'conv1_2' top: 'pool1' type: "Pooling"
pooling_param { pool: MAX kernel_size: 2 stride: 2 } }

layer { name: 'conv2_1' bottom: 'pool1' top: 'conv2_1' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 128 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv2_1' top: 'conv2_1' name: 'relu2_1' type: "ReLU" }
layer { bottom: 'conv2_1' top: 'conv2_2' name: 'conv2_2' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 128 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv2_2' top: 'conv2_2' name: 'relu2_2' type: "ReLU" }
layer { bottom: 'conv2_2' top: 'pool2' name: 'pool2' type: "Pooling"
pooling_param { pool: MAX kernel_size: 2 stride: 2 } }

layer { bottom: 'pool2' top: 'conv3_1' name: 'conv3_1' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv3_1' top: 'conv3_1' name: 'relu3_1' type: "ReLU" }
layer { bottom: 'conv3_1' top: 'conv3_2' name: 'conv3_2' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv3_2' top: 'conv3_2' name: 'relu3_2' type: "ReLU" }
layer { bottom: 'conv3_2' top: 'conv3_3' name: 'conv3_3' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv3_3' top: 'conv3_3' name: 'relu3_3' type: "ReLU" }
layer { bottom: 'conv3_3' top: 'pool3' name: 'pool3' type: "Pooling"
pooling_param { pool: MAX kernel_size: 2 stride: 2 } }

layer { bottom: 'pool3' top: 'conv4_1' name: 'conv4_1' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv4_1' top: 'conv4_1' name: 'relu4_1' type: "ReLU" }
layer { bottom: 'conv4_1' top: 'conv4_2' name: 'conv4_2' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv4_2' top: 'conv4_2' name: 'relu4_2' type: "ReLU" }
layer { bottom: 'conv4_2' top: 'conv4_3' name: 'conv4_3' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv4_3' top: 'conv4_3' name: 'relu4_3' type: "ReLU" }
layer { bottom: 'conv4_3' top: 'pool4' name: 'pool4' type: "Pooling"
pooling_param { pool: MAX kernel_size: 2 stride: 2 } }

layer { bottom: 'pool4' top: 'conv5_1' name: 'conv5_1' type: "Convolution"
param { lr_mult: 100 decay_mult: 1 } param { lr_mult: 200 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv5_1' top: 'conv5_1' name: 'relu5_1' type: "ReLU" }
layer { bottom: 'conv5_1' top: 'conv5_2' name: 'conv5_2' type: "Convolution"
param { lr_mult: 100 decay_mult: 1 } param { lr_mult: 200 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv5_2' top: 'conv5_2' name: 'relu5_2' type: "ReLU" }
layer { bottom: 'conv5_2' top: 'conv5_3' name: 'conv5_3' type: "Convolution"
param { lr_mult: 100 decay_mult: 1 } param { lr_mult: 200 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv5_3' top: 'conv5_3' name: 'relu5_3' type: "ReLU" }

### DSN conv 1 ###
layer { name: 'score-dsn1' type: "Convolution" bottom: 'conv1_2' top: 'score-dsn1-up'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn1-up' bottom: 'data' top: 'upscore-dsn1' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn1" bottom: "label" top:"dsn1_loss" loss_weight: 1}

### DSN conv 2 ###
layer { name: 'score-dsn2' type: "Convolution" bottom: 'conv2_2' top: 'score-dsn2'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Deconvolution" name: 'upsample_2' bottom: 'score-dsn2' top: 'score-dsn2-up'
param { lr_mult: 0 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0}
convolution_param { kernel_size: 4 stride: 2 num_output: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn2-up' bottom: 'data' top: 'upscore-dsn2' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn2" bottom: "label" top:"dsn2_loss" loss_weight: 1}

### DSN conv 3 ###
layer { name: 'score-dsn3' type: "Convolution" bottom: 'conv3_3' top: 'score-dsn3'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Deconvolution" name: 'upsample_4' bottom: 'score-dsn3' top: 'score-dsn3-up'
param { lr_mult: 0 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0}
convolution_param { kernel_size: 8 stride: 4 num_output: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn3-up' bottom: 'data' top: 'upscore-dsn3' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn3" bottom: "label" top:"dsn3_loss" loss_weight: 1}

###DSN conv 4###
layer { name: 'score-dsn4' type: "Convolution" bottom: 'conv4_3' top: 'score-dsn4'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Deconvolution" name: 'upsample_8' bottom: 'score-dsn4' top: 'score-dsn4-up'
param { lr_mult: 0 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0}
convolution_param { kernel_size: 16 stride: 8 num_output: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn4-up' bottom: 'data' top: 'upscore-dsn4' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn4" bottom: "label" top:"dsn4_loss" loss_weight: 1}

###DSN conv 5###
layer { name: 'score-dsn5' type: "Convolution" bottom: 'conv5_3' top: 'score-dsn5'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Deconvolution" name: 'upsample_16' bottom: 'score-dsn5' top: 'score-dsn5-up'
param { lr_mult: 0 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0}
convolution_param { kernel_size: 32 stride: 16 num_output: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn5-up' bottom: 'data' top: 'upscore-dsn5' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn5" bottom: "label" top:"dsn5_loss" loss_weight: 1}

### Concat and multiscale weight layer ###
layer { name: "concat" bottom: "upscore-dsn1" bottom: "upscore-dsn2" bottom: "upscore-dsn3"
bottom: "upscore-dsn4" bottom: "upscore-dsn5" top: "concat-upscore" type: "Concat"
concat_param { concat_dim: 1} }
layer { name: 'new-score-weighting' type: "Convolution" bottom: 'concat-upscore' top: 'upscore-fuse'
param { lr_mult: 0.001 decay_mult: 1 } param { lr_mult: 0.002 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 weight_filler {type: "constant" value: 0.2} } }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-fuse" bottom: "label" top:"fuse_loss" loss_weight: 1}
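The Deconvolution and Crop pairs in this net rely on each upsampled side output being at least as large as the input, so that Crop can cut it back to the size of `data`. A rough sketch of that arithmetic, assuming Caffe's usual output-size rules (convolution rounds down, pooling rounds up, deconvolution gives stride * (in - 1) + kernel - 2 * pad) and a hypothetical input side of 321 pixels. The helper names are illustrative only:

```python
import math

def conv_out(size, kernel, pad=0, stride=1):
    # Convolution output size, rounded down.
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    # Pooling output size, rounded up; this net uses no pooling pad.
    return int(math.ceil((size - kernel) / float(stride))) + 1

def deconv_out(size, kernel, stride, pad=0):
    # Deconvolution (transposed convolution) output size.
    return stride * (size - 1) + kernel - 2 * pad

def side_output_sizes(input_size):
    """Size of each DSN branch just before its Crop layer."""
    s = conv_out(input_size, kernel=3, pad=35)  # conv1_1 uses pad: 35
    s = conv_out(s, kernel=3, pad=1)            # conv1_2
    sizes = [s]                                 # score-dsn1 is not upsampled
    for kernel, stride in ((4, 2), (8, 4), (16, 8), (32, 16)):
        s = pool_out(s)  # pool1..pool4; the 3x3, pad-1 convs keep the size
        sizes.append(deconv_out(s, kernel, stride))
    return sizes

print(side_output_sizes(321))
```

Under these assumptions every branch comes out larger than the input side, which leaves the Crop layers something to trim.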

and solve.py as follows:
from __future__ import division
import numpy as np
import sys
caffe_root = '../../'
sys.path.insert(0, caffe_root + 'python')
import caffe

# make a bilinear interpolation kernel
# credit @longjon
def upsample_filt(size):
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * \
           (1 - abs(og[1] - center) / factor)
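As a quick check of what this kernel looks like, here is a dependency-free sketch that applies the same formula along one axis for the 4x4, stride-2 case used by `upsample_2`. The name `upsample_weights_1d` is hypothetical; the 2-D filter is the outer product of this vector with itself:

```python
def upsample_weights_1d(size):
    """1-D bilinear weights; the 2-D kernel is their outer product."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / float(factor) for i in range(size)]

w = upsample_weights_1d(4)
kernel = [[a * b for b in w] for a in w]  # outer product of w with itself

print(w)
for row in kernel:
    print(row)
```

The weights fall off linearly from the centre of the filter, which is what makes the deconvolution behave like bilinear upsampling.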

# set parameters s.t. deconvolutional layers compute bilinear interpolation
# N.B. this is for deconvolution without groups
def interp_surgery(net, layers):
    for l in layers:
        m, k, h, w = net.params[l][0].data.shape
        if m != k:
            raise ValueError('input + output channels need to be the same')
        if h != w:
            raise ValueError('filters need to be square')
        filt = upsample_filt(h)
        net.params[l][0].data[range(m), range(k), :, :] = filt

# base net -- follow the editing model parameters example to make
# a fully convolutional VGG16 net.
# http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/net_surgery.ipynb

base_weights = '/home/hedcanny/models/5stage-vgg.caffemodel'

# init
caffe.set_mode_gpu()
caffe.set_device(0)

solver = caffe.SGDSolver('solver.prototxt')

# do net surgery to set the deconvolution weights for bilinear interpolation
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
interp_surgery(solver.net, interp_layers)
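The `'up' in k` filter works purely on layer names. A small sketch using the names of the parameterised layers from the train_val.prototxt in this issue (VGG layers abbreviated) shows which ones it would pick. No Caffe is needed for this check:

```python
# Names of layers that own parameters in the net (VGG convs abbreviated).
param_layers = [
    'conv1_1', 'conv1_2', 'conv5_3',
    'score-dsn1', 'score-dsn2', 'score-dsn3', 'score-dsn4', 'score-dsn5',
    'upsample_2', 'upsample_4', 'upsample_8', 'upsample_16',
    'new-score-weighting',
]

# Same comprehension as in solve.py, applied to the plain list of names.
interp_layers = [k for k in param_layers if 'up' in k]
print(interp_layers)
```

Blob names such as `score-dsn2-up` also contain 'up', but `net.params` is keyed by layer name, so only the four Deconvolution layers match.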

# copy base weights for fine-tuning
#solver.restore('dsn-full-res-3-scales_iter_29000.solverstate')
solver.net.copy_from(base_weights)

# solve straight through -- a better approach is to define a solving loop to
# 1. take SGD steps
# 2. score the model by the test net `solver.test_nets[0]`
# 3. repeat until satisfied
solver.step(100000)
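The three-step loop described in the comments of solve.py could look roughly like the sketch below. It is written against two hypothetical callables, `take_steps` and `score`, so it runs without Caffe. With a real solver, `take_steps` could wrap `solver.step(n)`, and `score` would have to evaluate `solver.test_nets[0]` in whatever way suits the task:

```python
def solve_loop(take_steps, score, steps_per_round, max_rounds, target):
    """Alternate training and scoring until the score drops to `target`."""
    history = []
    for _ in range(max_rounds):
        take_steps(steps_per_round)  # 1. take SGD steps
        current = score()            # 2. score the model
        history.append(current)
        if current <= target:        # 3. repeat until satisfied
            break
    return history

# Stand-in "solver" whose loss halves every round, for demonstration only.
state = {'loss': 8.0}

def fake_steps(n):
    state['loss'] /= 2.0

def fake_score():
    return state['loss']

print(solve_loop(fake_steps, fake_score, steps_per_round=1000,
                 max_rounds=10, target=1.0))
```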
