Another half a day and another night gone...
I studied TensorFlow for a short while, but it was no use at all in today's practice match... The first stage was finished a long time ago. Today is the second stage, and what on earth are these questions?

Skewness again, coefficients again, fitting KNN again. It's really frustrating. I'm basically learning these things as I use them. I searched for information for half a day and went through CSDN countless times before I found a clue.
Eight questions on structured data (the third stage can only be started after solving all eight)
Four of them I guessed... because those four questions basically output 0 or 1. The number of submissions is limited, but I stubbornly tried them anyway...
Actually solved four: 1, 2, 3 and 5
Skewness and the boxcox1p transformation both have ready-made functions, so just apply them directly
Skewness: skew()
Second question: calculate the skewness of body weight
import scipy.stats as st
import pandas as pd

pd.options.display.max_columns = None
pd.options.display.max_rows = None

path1 = "/home/kesci/input/liver_df9751/ Structured data training camp .csv"
df = pd.read_csv(path1)
df.head(30)

# Fill missing values with the median of the weight column
aveTime = df['Weight\n weight '].median()
df2 = df.fillna(aveTime)

# Column at index 3 is used as the weight column
col = df2.iloc[:, 3]
arrs = col.values
## print(arrs)

w = st.skew(arrs)  # Calculation of skewness
## 0.7565543738808015
print('%.4f' % w)
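As a sanity check on what `st.skew` computes, here is a minimal sketch on made-up numbers (the `weights` array and the `manual_skew` helper are hypothetical, not part of the competition data or code). By default `scipy.stats.skew` returns the biased sample skewness, the third central moment divided by the second central moment raised to the power 3/2, and the hand-written version below should agree with it:

```python
import numpy as np
import scipy.stats as st

# Hypothetical sample, NOT the competition data
weights = np.array([50.0, 55.0, 60.0, 62.0, 65.0, 70.0, 95.0])

def manual_skew(x):
    """Biased sample skewness: m3 / m2**1.5, using central moments m2 and m3."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    m3 = np.mean(d ** 3)
    return m3 / m2 ** 1.5

manual = manual_skew(weights)
library = st.skew(weights)  # default bias=True
print('%.4f %.4f' % (manual, library))
```

Note that pandas' `Series.skew()` applies a bias correction, so it generally gives a slightly different number than `st.skew` on the same data.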
boxcox1p transformation: boxcox1p()
Transform the weight with boxcox1p, lambda=0.1. What is the skewness of the transformed data?
import scipy.stats as st
import pandas as pd
from scipy.special import boxcox1p

pd.options.display.max_columns = None
pd.options.display.max_rows = None

path1 = "/home/kesci/input/liver_df9751/ Structured data training camp .csv"
df = pd.read_csv(path1)

# Fill missing weights with the median, then apply boxcox1p
aveTime = df['Weight\n weight '].median()
wt = df['Weight\n weight '].fillna(aveTime)
lam = 0.1
wt = boxcox1p(wt, lam)

w = st.skew(wt.values)  # Calculation of skewness
## 0.7565543738808015
print('%.4f' % w)
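For context, `scipy.special.boxcox1p(x, lmbda)` computes `((1 + x)**lmbda - 1) / lmbda` for nonzero `lmbda` (and `log(1 + x)` when `lmbda` is 0). The sketch below uses a made-up right-skewed array `x` (hypothetical, not the competition data) to check the formula by hand and to show how the skewness changes for that sample:

```python
import numpy as np
import scipy.stats as st
from scipy.special import boxcox1p

# Hypothetical right-skewed sample, NOT the competition data
x = np.array([1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0, 8.0, 15.0, 40.0])
lam = 0.1

transformed = boxcox1p(x, lam)
manual = ((1.0 + x) ** lam - 1.0) / lam  # the same formula written out by hand

skew_before = st.skew(x)
skew_after = st.skew(transformed)
print('%.4f -> %.4f' % (skew_before, skew_after))
```

With a small lambda such as 0.1 the transformation behaves much like a log transform, which is why it tends to pull in a long right tail.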
I really want to talk about the fifth question. It's something else. The question as posted:
Using the same data as above with KNN (K=5), how many of the classification results are inconsistent with the real results?
What "the same data" means: only age, weight and ALF are selected (missing values are filled with the median, with no other processing)
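The median filling described in the question can be sketched as follows. This is only an illustration on a made-up frame; the column names `age`, `weight` and `label` are placeholders, not the real column headers:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the column names are placeholders
toy = pd.DataFrame({
    'age': [34.0, np.nan, 51.0, 46.0],
    'weight': [60.0, 72.5, np.nan, 81.0],
    'label': [0.0, 1.0, np.nan, 1.0],
})

cols = ['age', 'weight', 'label']
filled = toy.copy()
# fillna with a Series of per-column medians fills each column with its own median
filled[cols] = toy[cols].fillna(toy[cols].median())
print(filled)
```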

I went over the data again and again and read countless KNN articles. Finally, in one of them, I found how KNN is actually used: there is a ready-made class, KNeighborsClassifier(n_neighbors=5), where n_neighbors is the K in the question

We get the classification results, but that's not enough. What is the real result? We don't have one, because we trained KNN on all the data as the training set. In other words, KNN needs data to fit on and then needs data to test on, but the question gives only one data set. Later I wondered whether the fitted KNN is meant to be tested on itself, that is, whether the model trained on the training set should predict that same training set and be compared against its labels
So my classmates and I printed the predicted classes next to the labels in the original data, found that some of them differ, and the number of differences is the answer
import scipy.stats as st
import pandas as pd
import regex as re

pd.options.display.max_columns = None
pd.options.display.max_rows = None

path1 = "/home/kesci/input/liver_df9751/ Structured data training camp .csv"
# path2="/home/kesci/inputver_df9751/ Structured data training camp test suite .csv"
data = pd.read_csv(path1)
# data_test=pd.read_csv(path1)

# Keep only the part of each column name after the line separator or newline
col_names = list(data.columns)
col = []
for i in range(len(col_names)):
    if re.findall(r"\u2028(.+)", col_names[i]) != []:
        col.append(re.findall(r"\u2028(.+)", col_names[i])[0])
    elif re.findall(r"\n(.+)", col_names[i]) != []:
        col.append(re.findall(r"\n(.+)", col_names[i])[0])
    else:
        col.append(col_names[i])

# Rename the DataFrame columns
data.columns = col

# Fill missing values with the median of each selected column
feature1 = [' weight ', ' Age ', 'ALF']
for i in feature1:
    ave = data[i].median()
    data[i] = data[i].fillna(ave)
    print(data[i].values)

# Build the feature rows: [weight, age]
a_zi = []
for i in range(len(data)):
    c = [data[' weight '][i], data[' Age '][i]]
    a_zi.append(c)

from sklearn.neighbors import KNeighborsClassifier

neigh = KNeighborsClassifier(n_neighbors=5)
neigh.fit(a_zi, data['ALF'])

# Predict every training row and count the matches with the true label
cnt = 0
for i in range(len(a_zi)):
    if (neigh.predict([a_zi[i]]) == data['ALF'][i]):
        cnt += 1

# total rows, matches, mismatches
print(len(a_zi), cnt, len(a_zi) - cnt)
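The same fit-then-predict-on-the-training-set idea can be written more compactly by predicting all rows in one call. The sketch below runs on randomly generated toy data (the `toy` frame, its column names and the label rule are all made up for illustration), so its numbers say nothing about the competition answer:

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data, NOT the competition data; column names are made up
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    'weight': rng.normal(65, 10, size=200),
    'age': rng.integers(20, 80, size=200).astype(float),
})
toy['label'] = (toy['weight'] + rng.normal(0, 10, size=200) > 65).astype(int)

X = toy[['weight', 'age']].to_numpy()
y = toy['label'].to_numpy()

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)

pred = knn.predict(X)                 # one call for all training rows
mismatches = int((pred != y).sum())   # predictions that disagree with the labels

# total rows, matches, mismatches
print(len(y), len(y) - mismatches, mismatches)
```

Keep in mind that when a model is scored on its own training data, each point counts itself among its nearest neighbors, so the mismatch count is likely an optimistic estimate compared with scoring on held-out data.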
It really is something else...
To be continued tomorrow
