                 final_layer_params=None,
                 init='glorot_normal', reg=0.0,
                 use_shortcuts=False):
    """
    Return a new Residual Network using full pre-activation based on the work in
    "Identity Mappings in Deep Residual Networks" by He et al.
    http://arxiv.org/abs/1603.05027

    The following network definition achieves 92.0% accuracy on the CIFAR-10
    test set using the `adam` optimizer, 100 epochs, and a learning rate
    schedule of 1e-3 / 1e-4 / 1e-5 with transitions at 50 and 75 epochs:
        ResNetPreAct(layer1_params=(3,128,2), res_layer_params=(3,32,25), reg=reg)

    Max pooling was removed in favor of a plain stride in the first
    convolutional layer, motivated by "Striving for Simplicity: The All
    Convolutional Net" by Springenberg et al. (https://arxiv.org/abs/1412.6806)
    and by my own experiments, where I observed about a 0.5% improvement from
    replacing the max pool operations in the VGG-like cifar10_cnn.py example
    in the Keras distribution.

    Parameters
    ----------
    input_shape : tuple of (C, H, W)
    nb_classes : number of scores to produce from final affine layer (input to softmax)
    layer1_params : tuple of (filter size, num filters, stride for conv)
    res_layer_params : tuple of (filter size, num res layer filters, num res stages)
    final_layer_params : None or tuple of (filter size, num filters, stride for conv)
    init : type of weight initialization to use
    reg : L2 weight regularization (or weight decay)
    use_shortcuts : whether to add identity shortcuts; set False to compare the
        residual network against its non-residual counterpart
    """
    sz_L1_filters, nb_L1_filters, stride_L1 = layer1_params
    sz_res_filters, nb_res_filters, nb_res_stages = res_layer_params

    use_final_conv = (final_layer_params is not None)
    if use_final_conv:
        sz_fin_filters, nb_fin_filters, stride_fin = final_layer_params
        # Floor division keeps the pool size an int under Python 3
        sz_pool_fin = input_shape[1] // (stride_L1 * stride_fin)
    else:
        sz_pool_fin = input_shape[1] // stride_L1

    from keras import backend as K
    # Permute dimension order if necessary
    if K.image_dim_ordering() == 'tf':
        input_shape = (input_shape[1], input_shape[2], input_shape[0])

    img_input = Input(shape=input_shape, name='cifar')

    x = Conv2D(
        filters=nb_L1_filters,
        kernel_size=(sz_L1_filters, sz_L1_filters),
        padding='same',
        strides=(stride_L1, stride_L1),
        kernel_initializer=init,
        kernel_regularizer=l2(reg),
        name='conv0')(img_input)

    x = BatchNormalization(axis=3, name='bn0')(x)
    x = Activation('relu', name='relu0')(x)

    for stage in range(1, nb_res_stages + 1):
        x = rnpa_bottleneck_layer(
            x,
            (nb_L1_filters, nb_res_filters),
            sz_res_filters,
            stage,
            init=init,
            reg=reg,
            use_shortcuts=use_shortcuts
        )

    x = BatchNormalization(axis=3, name='bnF')(x)
    x = Activation('relu', name='reluF')(x)

    if use_final_conv:
        # Final conv uses the final-layer filter parameters
        x = Conv2D(
            filters=nb_fin_filters,
            kernel_size=(sz_fin_filters, sz_fin_filters),
            padding='same',
            strides=(stride_fin, stride_fin),
            kernel_initializer=init,
            kernel_regularizer=l2(reg),
            name='convF')(x)

    x = AveragePooling2D((sz_pool_fin, sz_pool_fin), name='avg_pool')(x)
    x = Flatten(name='flat')(x)
    x = Dense(nb_classes, activation='softmax', name='fc10')(x)

    return Model(img_input, x, name='rnpa')
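# The sz_pool_fin arithmetic above shrinks the input height by the product of
# the convolution strides, so the final average pool covers the whole feature
# map. A standalone sketch of that computation (the helper name below is
# illustrative, not part of this module's API):
def _final_pool_size(input_shape, stride_L1, stride_fin=None):
    """Spatial extent remaining after the strided convs, as pooled above."""
    if stride_fin is not None:
        return input_shape[1] // (stride_L1 * stride_fin)
    return input_shape[1] // stride_L1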
def rnpa_bottleneck_layer(input_tensor, nb_filters, filter_sz, stage,
                          init='glorot_normal', reg=0.0, use_shortcuts=True):