train error #87

Open
cqray1990 opened this issue Jul 3, 2020 · 0 comments


When I train HED using the data you provide, it raises:
Creating layer data
F0703 16:04:09.971015 2124 layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: ImageLabelmapData (known types: AbsVal, Accuracy, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Clip, Concat, ContrastiveLoss, Convolution, Crop, Data, Deconvolution, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Parameter, Pooling, Power, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax, SoftmaxWithLoss, Split, Swish, TanH, Threshold, Tile, WindowData)
*** Check failure stack trace: ***
@ 0x7f095cad75cd google::LogMessage::Fail()
@ 0x7f095cad9433 google::LogMessage::SendToLog()
@ 0x7f095cad715b google::LogMessage::Flush()
@ 0x7f095cad9e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f095cfb0db4 caffe::Net<>::Init()
@ 0x7f095cfb21ae caffe::Net<>::Net()
@ 0x7f095cfbd05a caffe::Solver<>::InitTrainNet()
@ 0x7f095cfbe625 caffe::Solver<>::Init()
@ 0x7f095cfbe93f caffe::Solver<>::Solver()
@ 0x7f095cfd9b91 caffe::Creator_SGDSolver<>()
@ 0x40c400 train()
@ 0x409260 main
@ 0x7f095b9cf830 __libc_start_main
@ 0x409a79 _start
@ (nil) (unknown)
Aborted (core dumped)
Training the MNIST example in Caffe works without any problem.
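The fatal line says the Caffe binary that ran has no layer registered under the name `ImageLabelmapData`, and it prints every type it does know. A quick way to cross-check a prototxt against that list is sketched below. This is only an illustration: the helper names `parse_known_types` and `layer_types_in_prototxt` are hypothetical, the regexes are heuristics rather than a real prototxt parser, and the known-types list in the example is abbreviated from the log above.

```python
import re


def parse_known_types(log_line):
    """Return (requested_type, known_types) from an 'Unknown layer type' log line."""
    m = re.search(r"Unknown layer type: (\w+) \(known types: ([^)]*)\)", log_line)
    if m is None:
        raise ValueError("not an 'Unknown layer type' log line")
    known = {t.strip() for t in m.group(2).split(",") if t.strip()}
    return m.group(1), known


def layer_types_in_prototxt(text):
    """Collect values of `type: "..."` fields that start with an uppercase letter.

    Heuristic: layer types are CamelCase, while filler types such as
    "constant" are lowercase and are skipped by this pattern.
    """
    return set(re.findall(r"""\btype:\s*["']([A-Z]\w*)["']""", text))


# Abbreviated copy of the log line; the real one lists many more types.
log_line = ("Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: "
            "ImageLabelmapData (known types: Convolution, Crop, Data, ImageData, ReLU)")

# Two layers excerpted from train_val.prototxt.
prototxt = """
layer { name: "data" type: "ImageLabelmapData" top: "data" top: "label" }
layer { bottom: 'data' top: 'conv1_1' name: 'conv1_1' type: "Convolution" }
"""

requested, known = parse_known_types(log_line)
missing = sorted(layer_types_in_prototxt(prototxt) - known)
print("requested:", requested)
print("not registered in this build:", missing)
```

Any type reported as missing has to come from the Caffe build itself, so this points at which Caffe was compiled and run rather than at the solver or the data.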

Here is my solver.prototxt:
net: "/home/lgx/caffe/examples/hed/train_val.prototxt"
test_iter: 0
test_interval: 1000000

# lr for fine-tuning should be lower than when starting from scratch

#debug_info: true
base_lr: 0.000001
lr_policy: "step"
gamma: 0.1
iter_size: 10

# stepsize should also be lower, as we're closer to being done

stepsize: 10000
display: 20
max_iter: 30001
momentum: 0.9
weight_decay: 0.0002
snapshot: 1000
snapshot_prefix: "hed"

# uncomment the following to default to CPU mode solving

solver_mode: CPU
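For reference, the schedule these solver values produce can be worked out by hand. The sketch below assumes Caffe's "step" policy, which is base_lr * gamma ^ floor(iter / stepsize); the function name `step_lr` is made up for this illustration, and the batch_size of 1 is taken from train_val.prototxt.

```python
def step_lr(iteration, base_lr=1e-6, gamma=0.1, stepsize=10000):
    """Learning rate under the "step" policy with the values from solver.prototxt."""
    return base_lr * gamma ** (iteration // stepsize)


# iter_size accumulates gradients over several forward/backward passes,
# so the effective batch is batch_size * iter_size.
batch_size = 1   # from image_data_param in train_val.prototxt
iter_size = 10   # from solver.prototxt
effective_batch = batch_size * iter_size

for it in (0, 10000, 20000, 30000):
    print(it, step_lr(it))
print("effective batch size:", effective_batch)
```

With max_iter at 30001, the rate drops by a factor of ten three times over the run.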

And train_val.prototxt:
name: "HED"
layer {
name: "data"
type: "ImageLabelmapData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: false
mean_value: 104.00699
mean_value: 116.66877
mean_value: 122.67892
}
image_data_param {
root_folder: "/home/HED-BSDS/"
source: "/home/lgx/HED-BSDS/train_pair.lst"
batch_size: 1
shuffle: true
new_height: 0
new_width: 0
}
}
layer {
name: "data"
type: "ImageLabelmapData"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
mean_value: 104.00699
mean_value: 116.66877
mean_value: 122.67892
}
image_data_param {
root_folder: "/home/HED-BSDS/"
source: "/home/HED-BSDS/train_pair.lst"
#Just setup the network. No real online testing
batch_size: 1
shuffle: true
new_height: 0
new_width: 0
}
}

layer { bottom: 'data' top: 'conv1_1' name: 'conv1_1' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 64 pad: 35 kernel_size: 3 } }
layer { bottom: 'conv1_1' top: 'conv1_1' name: 'relu1_1' type: "ReLU" }
layer { bottom: 'conv1_1' top: 'conv1_2' name: 'conv1_2' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 64 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv1_2' top: 'conv1_2' name: 'relu1_2' type: "ReLU" }
layer { name: 'pool1' bottom: 'conv1_2' top: 'pool1' type: "Pooling"
pooling_param { pool: MAX kernel_size: 2 stride: 2 } }

layer { name: 'conv2_1' bottom: 'pool1' top: 'conv2_1' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 128 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv2_1' top: 'conv2_1' name: 'relu2_1' type: "ReLU" }
layer { bottom: 'conv2_1' top: 'conv2_2' name: 'conv2_2' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 128 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv2_2' top: 'conv2_2' name: 'relu2_2' type: "ReLU" }
layer { bottom: 'conv2_2' top: 'pool2' name: 'pool2' type: "Pooling"
pooling_param { pool: MAX kernel_size: 2 stride: 2 } }

layer { bottom: 'pool2' top: 'conv3_1' name: 'conv3_1' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv3_1' top: 'conv3_1' name: 'relu3_1' type: "ReLU" }
layer { bottom: 'conv3_1' top: 'conv3_2' name: 'conv3_2' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv3_2' top: 'conv3_2' name: 'relu3_2' type: "ReLU" }
layer { bottom: 'conv3_2' top: 'conv3_3' name: 'conv3_3' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv3_3' top: 'conv3_3' name: 'relu3_3' type: "ReLU" }
layer { bottom: 'conv3_3' top: 'pool3' name: 'pool3' type: "Pooling"
pooling_param { pool: MAX kernel_size: 2 stride: 2 } }

layer { bottom: 'pool3' top: 'conv4_1' name: 'conv4_1' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv4_1' top: 'conv4_1' name: 'relu4_1' type: "ReLU" }
layer { bottom: 'conv4_1' top: 'conv4_2' name: 'conv4_2' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv4_2' top: 'conv4_2' name: 'relu4_2' type: "ReLU" }
layer { bottom: 'conv4_2' top: 'conv4_3' name: 'conv4_3' type: "Convolution"
param { lr_mult: 1 decay_mult: 1 } param { lr_mult: 2 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv4_3' top: 'conv4_3' name: 'relu4_3' type: "ReLU" }
layer { bottom: 'conv4_3' top: 'pool4' name: 'pool4' type: "Pooling"
pooling_param { pool: MAX kernel_size: 2 stride: 2 } }

layer { bottom: 'pool4' top: 'conv5_1' name: 'conv5_1' type: "Convolution"
param { lr_mult: 100 decay_mult: 1 } param { lr_mult: 200 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv5_1' top: 'conv5_1' name: 'relu5_1' type: "ReLU" }
layer { bottom: 'conv5_1' top: 'conv5_2' name: 'conv5_2' type: "Convolution"
param { lr_mult: 100 decay_mult: 1 } param { lr_mult: 200 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv5_2' top: 'conv5_2' name: 'relu5_2' type: "ReLU" }
layer { bottom: 'conv5_2' top: 'conv5_3' name: 'conv5_3' type: "Convolution"
param { lr_mult: 100 decay_mult: 1 } param { lr_mult: 200 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 512 pad: 1 kernel_size: 3 } }
layer { bottom: 'conv5_3' top: 'conv5_3' name: 'relu5_3' type: "ReLU" }

### DSN conv 1 ###

layer { name: 'score-dsn1' type: "Convolution" bottom: 'conv1_2' top: 'score-dsn1-up'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn1-up' bottom: 'data' top: 'upscore-dsn1' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn1" bottom: "label" top:"dsn1_loss" loss_weight: 1}

### DSN conv 2 ###

layer { name: 'score-dsn2' type: "Convolution" bottom: 'conv2_2' top: 'score-dsn2'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Deconvolution" name: 'upsample_2' bottom: 'score-dsn2' top: 'score-dsn2-up'
param { lr_mult: 0 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0}
convolution_param { kernel_size: 4 stride: 2 num_output: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn2-up' bottom: 'data' top: 'upscore-dsn2' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn2" bottom: "label" top:"dsn2_loss" loss_weight: 1}

### DSN conv 3 ###

layer { name: 'score-dsn3' type: "Convolution" bottom: 'conv3_3' top: 'score-dsn3'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Deconvolution" name: 'upsample_4' bottom: 'score-dsn3' top: 'score-dsn3-up'
param { lr_mult: 0 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0}
convolution_param { kernel_size: 8 stride: 4 num_output: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn3-up' bottom: 'data' top: 'upscore-dsn3' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn3" bottom: "label" top:"dsn3_loss" loss_weight: 1}

###DSN conv 4###
layer { name: 'score-dsn4' type: "Convolution" bottom: 'conv4_3' top: 'score-dsn4'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Deconvolution" name: 'upsample_8' bottom: 'score-dsn4' top: 'score-dsn4-up'
param { lr_mult: 0 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0}
convolution_param { kernel_size: 16 stride: 8 num_output: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn4-up' bottom: 'data' top: 'upscore-dsn4' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn4" bottom: "label" top:"dsn4_loss" loss_weight: 1}

###DSN conv 5###
layer { name: 'score-dsn5' type: "Convolution" bottom: 'conv5_3' top: 'score-dsn5'
param { lr_mult: 0.01 decay_mult: 1 } param { lr_mult: 0.02 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 } }
layer { type: "Deconvolution" name: 'upsample_16' bottom: 'score-dsn5' top: 'score-dsn5-up'
param { lr_mult: 0 decay_mult: 1 } param { lr_mult: 0 decay_mult: 0}
convolution_param { kernel_size: 32 stride: 16 num_output: 1 } }
layer { type: "Crop" name: 'crop' bottom: 'score-dsn5-up' bottom: 'data' top: 'upscore-dsn5' }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-dsn5" bottom: "label" top:"dsn5_loss" loss_weight: 1}

### Concat and multiscale weight layer ###

layer { name: "concat" bottom: "upscore-dsn1" bottom: "upscore-dsn2" bottom: "upscore-dsn3"
bottom: "upscore-dsn4" bottom: "upscore-dsn5" top: "concat-upscore" type: "Concat"
concat_param { concat_dim: 1} }
layer { name: 'new-score-weighting' type: "Convolution" bottom: 'concat-upscore' top: 'upscore-fuse'
param { lr_mult: 0.001 decay_mult: 1 } param { lr_mult: 0.002 decay_mult: 0}
convolution_param { engine: CAFFE num_output: 1 kernel_size: 1 weight_filler {type: "constant" value: 0.2} } }
layer { type: "SigmoidCrossEntropyLoss" bottom: "upscore-fuse" bottom: "label" top:"fuse_loss" loss_weight: 1}

And solve.py is as follows:
from __future__ import division
import numpy as np
import sys
caffe_root = '../../'
sys.path.insert(0, caffe_root + 'python')
import caffe

# make a bilinear interpolation kernel
# credit @longjon
def upsample_filt(size):
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * \
           (1 - abs(og[1] - center) / factor)

# set parameters s.t. deconvolutional layers compute bilinear interpolation
# N.B. this is for deconvolution without groups
def interp_surgery(net, layers):
    for l in layers:
        m, k, h, w = net.params[l][0].data.shape
        if m != k:
            # a bare `raise` has no active exception here, so raise explicitly
            raise ValueError('input + output channels need to be the same')
        if h != w:
            raise ValueError('filters need to be square')
        filt = upsample_filt(h)
        net.params[l][0].data[range(m), range(k), :, :] = filt

# base net -- follow the editing model parameters example to make
# a fully convolutional VGG16 net.
# http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/net_surgery.ipynb

base_weights = '/home/hedcanny/models/5stage-vgg.caffemodel'

# init

caffe.set_mode_gpu()
caffe.set_device(0)

solver = caffe.SGDSolver('solver.prototxt')

# do net surgery to set the deconvolution weights for bilinear interpolation

interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
interp_surgery(solver.net, interp_layers)

# copy base weights for fine-tuning

#solver.restore('dsn-full-res-3-scales_iter_29000.solverstate')
solver.net.copy_from(base_weights)

# solve straight through -- a better approach is to define a solving loop to
# 1. take SGD steps
# 2. score the model by the test net `solver.test_nets[0]`
# 3. repeat until satisfied

solver.step(100000)
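As a side check on the surgery step in solve.py, the kernel that `upsample_filt` builds can be inspected without Caffe or NumPy. The sketch below is a dependency-free restatement of the 1-D profile (the 2-D filter is its outer product) and assumes Python 3 division. The names `upsample_filt_1d` and `phase_sums` are hypothetical helpers, not part of the script above.

```python
def upsample_filt_1d(size):
    """1-D profile of the bilinear kernel; same arithmetic as upsample_filt."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(size)]


def phase_sums(profile, stride):
    """Sum the kernel taps that contribute to each output phase of a strided deconvolution."""
    return [sum(profile[p::stride]) for p in range(stride)]


profile = upsample_filt_1d(4)   # kernel_size 4, stride 2, as in upsample_2
print(profile)
print(phase_sums(profile, 2))
```

If every phase sums to 1, a constant input stays constant after upsampling away from the borders, which is what bilinear interpolation should do.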
