On linear classifiers

   Suppose we are given a data set whose samples belong to two different categories. Advertising data is a typical binary classification problem: data that was clicked is usually called a positive sample, and data that was not clicked is called a negative sample. We want to find a linear classifier that divides the data into these two classes (in reality, of course, advertising data is far too complex to be separated by a single linear classifier). Let X denote the sample data and Y the sample category (for example 1 and -1, or 1 and 0). The goal of a linear classifier is to find a hyperplane that separates the two classes of samples. This hyperplane can be described by the following equation:

$$\omega^T x + b = 0$$
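
As a minimal sketch of how such a hyperplane is used, the snippet below assigns a sample to a class according to the sign of ω^T x + b. The function name `classify`, the labels 1 and -1, and the example weights are illustrative assumptions, not part of the original text.

```python
def classify(w, x, b):
    """Return 1 if x lies on the positive side of the hyperplane w.x + b = 0, else -1."""
    # Inner product of the weight vector and the sample, plus the bias term
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


# Hypothetical 2-D hyperplane x1 - x2 = 0, chosen only for illustration
w = [1.0, -1.0]
b = 0.0
print(classify(w, [2.0, 1.0], b))  # sample on the positive side
print(classify(w, [1.0, 2.0], b))  # sample on the negative side
```

Points that fall exactly on the hyperplane are assigned to the negative class here; that tie-breaking choice is arbitrary.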

   For logistic regression, we have:

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$

   where x is a sample, x = [x_1, x_2, ..., x_n] is an n-dimensional vector, and g is the function commonly called the logistic function. The more general form of g is:

$$g(z) = \frac{1}{1 + e^{-z}}$$
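
For readers who prefer code, here is a small sketch of g and of the hypothesis h_θ(x) = g(θ^T x). The names `logistic` and `h` are chosen for illustration, and the branch on the sign of z is an implementation detail added to avoid overflow; it does not change the value of the formula.

```python
import math


def logistic(z):
    """g(z) = 1 / (1 + e^(-z)), evaluated without overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, rewrite as e^z / (1 + e^z) so exp() never overflows
    ez = math.exp(z)
    return ez / (1.0 + ez)


def h(theta, x):
    """Logistic regression hypothesis: g applied to the inner product of theta and x."""
    return logistic(sum(t * xi for t, xi in zip(theta, x)))


print(logistic(0.0))                 # the midpoint of the curve
print(h([1.0, -1.0], [2.0, 1.0]))    # hypothetical theta and x
```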

   Anyone with a little machine learning background will probably find the formula for g very familiar. It appears not only in logistic regression but also in SVMs and ANNs; it is used everywhere. Most references simply state this formula without explanation. But have you ever wondered: since it is used so widely, why do we use it in the first place?

   Perhaps many of you are stumped. "That's what everybody uses." "That's what the books say." True, but when something keeps showing up in front of you, shouldn't you ask why? For me, at least, if something appears in front of me a third time and I still don't know the reason behind it, I try to figure it out.

Why use the logistic function

   Anyone who has studied pattern recognition has met all kinds of classifiers. The simplest classifiers are naturally the linear ones, and among linear classifiers the simplest is probably the perceptron. The perceptron was introduced in the 1950s and 1960s:

$$y = 0, \quad \sum_{i=1}^{n} \omega_i x_i \le b$$
$$y = 1, \quad \sum_{i=1}^{n} \omega_i x_i > b$$
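
The decision rule above can be sketched in a few lines. The function name `perceptron` and the example weights and threshold are assumptions made for illustration only.

```python
def perceptron(weights, x, b):
    """Output 1 if the weighted sum of the features exceeds the threshold b, else 0."""
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > b else 0


# Hypothetical weights and threshold
weights = [0.5, 0.5]
b = 1.0
print(perceptron(weights, [1.0, 1.0], b))  # weighted sum equals b, so output is 0
print(perceptron(weights, [2.0, 1.0], b))  # weighted sum exceeds b, so output is 1
```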

   The idea of the perceptron is to take the dot product (inner product) of the features and the weights, compare the result with a threshold, and divide the samples into two classes accordingly. Anyone who knows a little about neural networks will be familiar with this picture:

   That's right, this is a diagram of a perceptron.
   My postgraduate entrance exam subject was control theory. If you have studied control theory or signals and systems, you will know that the perceptron is equivalent to the step function from those two courses:

   The two are the same in essence: set a threshold, then classify each sample by comparing its value with that threshold.

   This model is simple, intuitive, and easy to implement (it may well be the simplest classifier there is). The problem is that it is not smooth enough. First, suppose the threshold is t0 = 10 and a new sample arrives whose computed value is 10.01. Should that sample be classified as 1 or 0? Either answer is not very reliable. Second, the function has a step at t0, an abrupt jump from 0 to 1, and this discontinuity makes it inconvenient to handle mathematically.

   After all that, it is finally time for the logistic function to make its entrance. Compared with the perceptron or the step function above, what are its strengths?

   From the graph of the logistic function, we can easily summarize the following advantages:
  1. Its input ranges over (−∞, +∞) and its output lies in (0,1), which is exactly the range a probability must fall in. Describing a classifier with a probability is naturally much more convenient than using a hard threshold;
  2. It is a monotonically increasing function with good continuity; it has no points of discontinuity.
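
To make the contrast concrete, the sketch below evaluates a step function and a logistic function around the threshold t0 = 10 from the earlier example. Centering the logistic at t0 by passing t - t0 is an assumption made here so that the two functions can be compared at the same points; the function names are likewise illustrative.

```python
import math

T0 = 10.0  # threshold from the example in the text


def step(t, t0=T0):
    """Hard threshold: jumps from 0 to 1 as t crosses t0."""
    return 1 if t > t0 else 0


def logistic(z):
    """g(z) = 1 / (1 + e^(-z)); fine for the moderate z used here."""
    return 1.0 / (1.0 + math.exp(-z))


for t in (9.99, 10.0, 10.01):
    # The step output flips abruptly; the logistic output moves only slightly
    print(t, step(t), round(logistic(t - T0), 4))
```

Just below and just above the threshold, the step function returns 0 and 1, while the logistic output stays close to 0.5 on both sides, which reads naturally as "this sample is about equally likely to belong to either class".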

   By this point, you should understand why the logistic function is used.
   More articles in this logistic series are coming soon.
