This article analyzes the classify.py script under caffe/python, together with the related classifier.py and io.py.
I. classify.py
The last lines of the file, if __name__ == '__main__': main(sys.argv), mean that when the file is run from the command line, the main function is executed with the command-line parameters stored in sys.argv. Inside main, the parameters are defined and parsed one by one, as follows:
input_file: Input image. Required.
output_file: Output file. Required.
--model_def: Network definition file used for testing. Defaults to the ImageNet deploy.prototxt file.
--pretrained_model: Network weights file. Defaults to the ImageNet bvlc_reference_caffenet.caffemodel file.
--gpu: Whether to compute on the GPU. action='store_true' means the value is False (use the CPU) when the flag is omitted and True (use the GPU) when it is given. For a 128*128 grayscale image, a forward pass takes roughly 20 ms on the CPU and only about 5 ms on the GPU.
--center_only: Defaults to False, meaning that predictions are made on several crops of the input image and the results are averaged. When set to True, only the center crop of the input image is used for a single prediction. Of course, if the input image size equals the crop size, the center crop is the original image itself.
--images_dim: Input image size. Only height and width are considered. Defaults to 256*256.
--mean_file: Mean file. Note that it must be an npy file storing a numpy.array with dimensions (channel, height, width). A mean file produced only by compute_mean.bin has to be converted first. The default is the ImageNet ilsvrc_2012_mean.npy file.
--input_scale: Scaling factor applied at the end of preprocessing, after the mean is subtracted. Default is 1.
--raw_scale: Scaling factor applied early in preprocessing, before the mean is subtracted. Because pixel values are read in the range [0,1], the default of 255.0 maps them to the range [0,255].
--channel_swap: Channel reordering. Defaults to '2,1,0'. Images read through OpenCV in caffe have BGR channel order, so RGB input must be converted to BGR, that is, channel 0 and channel 2 are swapped.
--ext: Defaults to 'jpg'. When the input is a directory, only files with the jpg extension are read.
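As a rough illustration of how such parameters behave, here is a reduced argparse sketch. It is not the full classify.py parser: build_parser and parse_int_list are hypothetical helper names, the file names are made up, and only a few of the options above are reproduced.

```python
import argparse


def build_parser():
    # Reduced, illustrative parser: only a handful of the options listed above.
    parser = argparse.ArgumentParser()
    parser.add_argument("input_file")    # positional, therefore required
    parser.add_argument("output_file")   # positional, therefore required
    # store_true: False when the flag is absent, True when it is present.
    parser.add_argument("--gpu", action="store_true")
    parser.add_argument("--center_only", action="store_true")
    parser.add_argument("--images_dim", default="256,256")
    parser.add_argument("--raw_scale", type=float, default=255.0)
    parser.add_argument("--channel_swap", default="2,1,0")
    parser.add_argument("--ext", default="jpg")
    return parser


def parse_int_list(text):
    # "256,256" -> [256, 256]
    return [int(s) for s in text.split(",")]


# Hypothetical command line: two positional file names plus --gpu.
args = build_parser().parse_args(["cat.jpg", "out.npy", "--gpu"])
image_dims = parse_int_list(args.images_dim)
channel_swap = parse_int_list(args.channel_swap)
print(args.gpu, args.center_only, image_dims, channel_swap)
```

Passing --gpu flips args.gpu to True, while args.center_only stays False because that flag was not given.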
The following parameters were added in a modified version of classify.py.
--labels_file: Label file. Defaults to the ImageNet synset_words.txt file.
--print_results: Print the results to the screen. False if not specified, True if specified.
--force_grayscale: Whether to treat the input as a single-channel image. False if not specified, True if specified.
args = parser.parse_args() then parses the command line and fixes the final values of the input parameters. The classification test follows:
# List comprehension: split the dimension string on commas and cast each piece to int, giving a list.
image_dims = [int(s) for s in args.images_dim.split(',')]
# If a mean file is specified, load it.
if args.mean_file:
    mean = np.load(args.mean_file)
# A grayscale image needs no channel swap. For an RGB image, if a channel swap is given, split the string on commas, cast to int, and store the result as a list.
if args.force_grayscale:
    channel_swap = None
else:
    if args.channel_swap:
        channel_swap = [int(s) for s in args.channel_swap.split(',')]
# If gpu is specified, switch to GPU mode.
if args.gpu:
    caffe.set_mode_gpu()
    print("GPU mode")
else:
    caffe.set_mode_cpu()
    print("CPU mode")
# Initialize the classifier, see classifier.py
classifier = caffe.Classifier(..)
# Code that reads the input file. Loading a grayscale image reportedly raises an error, so both the grayscale and the RGB case are handled here.
if args.force_grayscale:
    # False here means a single-channel image is returned, see io.py
    inputs = [caffe.io.load_image(args.input_file, False)]
else:
    inputs = [caffe.io.load_image(args.input_file)]
# inputs is wrapped in [] and therefore stored as a list, so len(inputs) is the number of input images.
# Timing, in ms
start = time.time() * 1000
# Forward pass, see classifier.py. predictions is an np array of shape (number of input images, number of classes).
predictions = classifier.predict(inputs, not args.center_only)
print("Done in %.2f ms." % (time.time() * 1000 - start))
print("Predictions : %s" % predictions)
# Print the results: sort by score and show the five highest-scoring classes, with class names taken from labels_file.
# print result, add by caisenchuan
if args.print_results:
    ...
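The printing code is elided above. As a minimal sketch of the idea only (sort by score, keep the five best, look up names), and not the modified script's actual code, something like the following would do. The function top_k and the label and score values are hypothetical; in the script the labels would come from labels_file.

```python
import numpy as np


def top_k(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores, best first."""
    scores = np.asarray(scores)
    # argsort is ascending, so reverse it and keep the first k indices.
    order = np.argsort(scores)[::-1][:k]
    return [(labels[i], float(scores[i])) for i in order]


# Made-up labels and one row of made-up class probabilities.
labels = ["cat", "dog", "fox", "owl", "bee", "ant"]
scores = np.array([0.04, 0.40, 0.10, 0.25, 0.15, 0.06])

ranked = top_k(scores, labels)
for name, score in ranked:
    print("%.2f  %s" % (score, name))
```

With several input images, predictions has one row per image, so the same function would be applied to each row in turn.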
II. classifier.py
This file defines the Classifier class, which includes the initializer __init__ and the predict function.
1, __init__:
It first calls the initializer of the parent class caffe.Net and sets the test phase.
It then creates a Transformer object. Taking cifar-10 as an example, its input is the dictionary {'data': (1,3,32,32)}.
Next comes the set_transpose method:
# Convert dimensions from (32,32,3) to (3,32,32), the layout caffe works with
self.transformer.set_transpose(in_, (2, 0, 1))
The other set methods of the Transformer class are then called to configure the remaining parameters. See the io.py analysis below for details.
Finally, the image dimensions are defined:
# The crop size is taken from the prototxt definition
self.crop_dims = np.array(self.blobs[in_].data.shape[2:])
# If the image size parameter is not given, it equals the crop size; otherwise the given value is used.
# Generally, when cropping is used, image size > crop size
if not image_dims:
    image_dims = self.crop_dims
self.image_dims = image_dims
2, predict:
Runs the forward pass and predicts the class probabilities of the images. Its parameters are the inputs and a Boolean oversample flag.
# Define input_ with dimensions (m,h,w,channel)
input_ = np.zeros((len(inputs),
                   self.image_dims[0],
                   self.image_dims[1],
                   inputs[0].shape[2]),
                  dtype=np.float32)
# Resize every image to be classified to image_dims
for ix, in_ in enumerate(inputs):
    input_[ix] = caffe.io.resize_image(in_, self.image_dims)
# With oversampling, each image is cropped into 10 images
# and the dimensions become (10*m,h,w,channel)
if oversample:
    # Generate center, corner, and mirrored crops.
    input_ = caffe.io.oversample(input_, self.crop_dims)
# Otherwise, crop the center region: take the midpoint of the image, then extend half the crop length in each direction.
# Taking a 32*32 crop from a 64*64 image as an example: the midpoint of (64,64) is (32,32), tiled to four coordinates (32,32,32,32),
# and adding the crop offsets gives (32,32,32,32)+(-16,-16,16,16)-->(16,16,48,48)
else:
    # Take center crop.
    center = np.array(self.image_dims) / 2.0
    crop = np.tile(center, (1, 2))[0] + np.concatenate([
        -self.crop_dims / 2.0,
        self.crop_dims / 2.0
    ])
    crop = crop.astype(int)
    input_ = input_[:, crop[0]:crop[2], crop[1]:crop[3], :]
# Convert the input to the layout caffe requires; dimensions become (m,channel,h,w)
caffe_in = np.zeros(np.array(input_.shape)[[0, 3, 1, 2]],
                    dtype=np.float32)
# Preprocess each image, see the preprocess function in io.py
for ix, in_ in enumerate(input_):
    caffe_in[ix] = self.transformer.preprocess(self.inputs[0], in_)
# Forward pass. The output is a dictionary; out['prob'] holds the class probabilities
out = self.forward_all(**{self.inputs[0]: caffe_in})
predictions = out[self.outputs[0]]
# With oversampling, average every 10 predictions
if oversample:
    # Integer division, so that the shape stays an int under Python 3 as well
    predictions = predictions.reshape((len(predictions) // 10, 10, -1))
    predictions = predictions.mean(1)
# Return the results
return predictions
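The center-crop arithmetic and the 10-crop averaging in predict can be tried in isolation with plain NumPy. The sketch below copies only those two steps. The function names center_crop_box and average_crops are hypothetical, and the arrays are dummy data rather than real network output.

```python
import numpy as np


def center_crop_box(image_dims, crop_dims):
    """Return (top, left, bottom, right) of the centered crop window."""
    image_dims = np.asarray(image_dims, dtype=float)
    crop_dims = np.asarray(crop_dims, dtype=float)
    center = image_dims / 2.0
    # Tile the midpoint to four coordinates, then move half a crop up/left and down/right.
    box = np.tile(center, (1, 2))[0] + np.concatenate([-crop_dims / 2.0,
                                                       crop_dims / 2.0])
    return box.astype(int)


def average_crops(predictions, crops_per_image=10):
    """Average each consecutive group of crops_per_image prediction rows."""
    predictions = np.asarray(predictions)
    grouped = predictions.reshape((len(predictions) // crops_per_image,
                                   crops_per_image, -1))
    return grouped.mean(1)


# The 64*64 -> 32*32 example from the comments above.
box = center_crop_box((64, 64), (32, 32))
batch = np.zeros((2, 64, 64, 3), dtype=np.float32)   # (m, h, w, channel)
cropped = batch[:, box[0]:box[2], box[1]:box[3], :]
print(box, cropped.shape)

# 2 images * 10 crops, 2 classes: dummy predictions that are identical per crop.
fake_predictions = np.tile(np.array([[0.2, 0.8]]), (20, 1))
averaged = average_crops(fake_predictions)
print(averaged.shape)
```

The averaging relies on the ten crops of one image being stored in consecutive rows, which is why a plain reshape is enough to group them.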
III. io.py
This file mainly contains the member functions of the preprocessing class Transformer.
1, preprocess
Note that the comment at the top of the function lays out the whole preprocessing sequence:
Convert to single precision;
Resize to a uniform size;
Transpose dimensions to (channel,h,w);
Swap channels, converting to BGR;
Scale before subtracting the mean;
Subtract the mean;
Scale after subtracting the mean.
Key code:
...
# Gives (h,w)
in_dims = self.inputs[in_][2:]
# If the input image size differs from the specified size, resize it to match
if caffe_in.shape[:2] != in_dims:
    caffe_in = resize_image(caffe_in, in_dims)
# Dimension transposition
if transpose is not None:
    caffe_in = caffe_in.transpose(transpose)
# Channel swap: only the channel axis is reordered, h and w are unchanged
if channel_swap is not None:
    caffe_in = caffe_in[channel_swap, :, :]
# Multiply by raw_scale
if raw_scale is not None:
    caffe_in *= raw_scale
# Subtract the mean
if mean is not None:
    caffe_in -= mean
# Multiply by input_scale
if input_scale is not None:
    caffe_in *= input_scale
return caffe_in
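To see the order of these operations on actual numbers, the steps can be re-created as a standalone NumPy function. This is only a sketch under assumptions: the resize step is left out, the free function preprocess here is a hypothetical stand-in for the Transformer method, and the image and mean values are made up.

```python
import numpy as np


def preprocess(image, transpose=(2, 0, 1), channel_swap=(2, 1, 0),
               raw_scale=255.0, mean=None, input_scale=None):
    """Apply transpose, channel swap, raw scale, mean and input scale in that order."""
    out = image.astype(np.float32, copy=True)   # copy, so the caller's array is untouched
    if transpose is not None:
        out = out.transpose(transpose)           # (h, w, c) -> (c, h, w)
    if channel_swap is not None:
        out = out[list(channel_swap), :, :]      # RGB -> BGR for (2, 1, 0)
    if raw_scale is not None:
        out *= raw_scale                         # [0, 1] -> [0, 255]
    if mean is not None:
        out -= mean                              # mean broadcasts over (c, h, w)
    if input_scale is not None:
        out *= input_scale
    return out


# A 2*2 pure-red RGB image with values in [0, 1].
image = np.zeros((2, 2, 3), dtype=np.float32)
image[:, :, 0] = 1.0
# Made-up per-channel mean of 100, shaped (c, 1, 1) so that it broadcasts.
mean = np.full((3, 1, 1), 100.0, dtype=np.float32)

result = preprocess(image, mean=mean)
print(result.shape)      # channel-first layout
print(result[:, 0, 0])   # one pixel, in B, G, R order
```

After the swap the red values sit in the last channel, so that channel becomes 255 - 100 while the other two become 0 - 100.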
2, load_image; note that the color parameter defaults to True
# Read the image with skimage. A color image is read by default; if as_grey is 1, a grayscale image is read. The values read in are floating-point numbers in [0,1].
img = skimage.img_as_float(skimage.io.imread(filename,
    as_grey=not color)).astype(np.float32)
# Make sure a three-dimensional array is returned.
if img.ndim == 2:
    # If only two dimensions were read in, add a channel dimension
    img = img[:, :, np.newaxis]
    if color:
        # If a grayscale image is being read as a color image, expand it to three channels
        img = np.tile(img, (1, 1, 3))
elif img.shape[2] == 4:
    # If there are four channels, drop the fourth
    img = img[:, :, :3]
# Return an array of shape (h,w,3), or (h,w,1) for a grayscale read
return img
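The dimension handling at the end of load_image does not depend on skimage, so it can be checked with random arrays. In the sketch below, ensure_three_dims is a hypothetical name for that part of the logic only, and the arrays stand in for images that have already been read.

```python
import numpy as np


def ensure_three_dims(img, color=True):
    """Mimic the shape handling after the image has been read."""
    img = np.asarray(img, dtype=np.float32)
    if img.ndim == 2:
        # Grayscale data: add a channel axis.
        img = img[:, :, np.newaxis]
        if color:
            # Requested as color: repeat the single channel three times.
            img = np.tile(img, (1, 1, 3))
    elif img.shape[2] == 4:
        # Four channels (for example RGBA): drop the fourth.
        img = img[:, :, :3]
    return img


gray = np.random.rand(4, 5)       # stand-in for a grayscale read
rgba = np.random.rand(4, 5, 4)    # stand-in for a four-channel read

gray_as_color = ensure_three_dims(gray, color=True)
gray_as_gray = ensure_three_dims(gray, color=False)
rgba_as_color = ensure_three_dims(rgba, color=True)
print(gray_as_color.shape, gray_as_gray.shape, rgba_as_color.shape)
```

A grayscale image requested with color=False therefore comes back as (h, w, 1), which is the case the force_grayscale option in classify.py relies on.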
There are also resize_image, oversample, and the various set functions, which are not covered here. The so-called Python or MATLAB interface of caffe only handles input preprocessing and output post-processing for caffe; it does not touch the network computation in between.