In Python, missing values can be filled in several ways. Some commonly used methods are:
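1. Drop rows or columns that contain missing values
2. Fill missing values with a constant
3. Fill missing values with the mean
4. Fill missing values with the median
5. Fill missing values with the mode
6. Fill missing values by interpolation
7. Use forward fill and backward fill
8. Fill missing values with the K-nearest-neighbors algorithm
9. Fill missing values with multiple imputation
Implementations of these methods follow: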
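1. Drop rows or columns that contain missing values
    import pandas as pd

    # Read the data
    data = pd.read_csv('data.csv')

    # Option A: drop every row that contains a missing value
    data_rows_dropped = data.dropna(axis=0)

    # Option B: drop every column that contains a missing value
    # (applied to the original data; after Option A no gaps would remain)
    data_cols_dropped = data.dropna(axis=1)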
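Dropping every row or column that has any gap can discard a lot of usable data. The sketch below, on a made-up in-memory DataFrame (the column names and values are illustrative assumptions, not taken from the article), shows how the `subset`, `thresh`, and `how` parameters of `dropna` can narrow what is removed:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25, np.nan, 31, np.nan],
    "city": ["Paris", "Rome", None, None],
    "score": [88.0, 92.5, np.nan, np.nan],
})

# Drop rows only when "age" is missing; gaps in other columns are tolerated.
by_subset = df.dropna(subset=["age"])

# Keep only rows that have at least 2 non-missing values.
by_thresh = df.dropna(thresh=2)

# Drop rows only when every value in the row is missing.
not_all_missing = df.dropna(how="all")

print(by_subset.index.tolist())
print(by_thresh.index.tolist())
print(not_all_missing.index.tolist())
```

Which variant is appropriate depends on how much information the partially filled rows still carry for the analysis at hand.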
2. Fill missing values with a constant
    import pandas as pd

    # Read the data
    data = pd.read_csv('data.csv')

    # Fill missing values with the constant 0
    data.fillna(0, inplace=True)
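A single constant rarely suits every column, for example when numeric and text columns sit side by side. As a minimal sketch, assuming a made-up DataFrame whose column names are purely illustrative, `fillna` also accepts a dictionary that maps each column to its own fill value:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "quantity": [3, np.nan, 7],
    "category": ["a", None, "c"],
})

# Map each column name to the constant used for its gaps.
fill_values = {"quantity": 0, "category": "unknown"}
filled = df.fillna(value=fill_values)

print(filled)
```

Columns that are not listed in the dictionary are left untouched.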
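3. Fill missing values with the mean
    import pandas as pd

    # Read the data
    data = pd.read_csv('data.csv')

    # Fill the gaps in each numeric column with that column's mean.
    # numeric_only=True skips non-numeric columns, which would otherwise
    # raise a TypeError in pandas 2.0 and later.
    data.fillna(data.mean(numeric_only=True), inplace=True)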
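4. Fill missing values with the median
    import pandas as pd

    # Read the data
    data = pd.read_csv('data.csv')

    # Fill the gaps in each numeric column with that column's median.
    # numeric_only=True skips non-numeric columns.
    data.fillna(data.median(numeric_only=True), inplace=True)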
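The choice between mean and median matters most when a column contains outliers. The following sketch uses a made-up series (the values are an illustrative assumption) in which one extreme value pulls the mean far away from the typical observation, while the median stays close to it:

```python
import numpy as np
import pandas as pd

# Hypothetical values with one extreme outlier and one missing entry.
income = pd.Series([1.0, 2.0, 3.0, 100.0, np.nan])

# Both statistics ignore the missing entry by default.
mean_value = income.mean()      # (1 + 2 + 3 + 100) / 4
median_value = income.median()  # midpoint of the two middle values, 2 and 3

mean_filled = income.fillna(mean_value)
median_filled = income.fillna(median_value)

print(mean_value, median_value)
```

Here the mean-filled gap becomes 26.5 and the median-filled gap becomes 2.5, so for skewed data the median is often the safer default.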
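5. Fill missing values with the mode
    import pandas as pd

    # Read the data
    data = pd.read_csv('data.csv')

    # Compute the mode of each column and use it to fill that column's gaps.
    # Series.mode() ignores missing values and also works for text columns.
    for column in data.columns:
        modes = data[column].mode()
        if not modes.empty:  # empty when the whole column is missing
            data[column] = data[column].fillna(modes.iloc[0])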
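The mode is a natural fit for categorical columns, whereas numeric columns are often better served by the mean or median. One possible way to combine the two, sketched here on a made-up DataFrame (column names and values are illustrative assumptions), is to pick the statistic by column dtype:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "height": [160.0, np.nan, 175.0, 180.0],
    "color": ["red", "blue", None, "red"],
})

filled = df.copy()
for column in filled.columns:
    if pd.api.types.is_numeric_dtype(filled[column]):
        # Numeric column: use the median of the observed values.
        fill_value = filled[column].median()
    else:
        # Non-numeric column: use the most frequent observed value.
        modes = filled[column].mode()
        if modes.empty:
            continue  # the whole column is missing; nothing to fill with
        fill_value = modes.iloc[0]
    filled[column] = filled[column].fillna(fill_value)

print(filled)
```

When several values tie for the mode, `mode()` returns them sorted and this sketch simply takes the first one, which is an arbitrary choice.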
6. Fill missing values by interpolation (linear interpolation)
    import pandas as pd

    # Read the data
    data = pd.read_csv('data.csv')

    # Linearly interpolate the gaps in the numeric columns only;
    # interpolation is not defined for text columns.
    numeric_cols = data.select_dtypes(include='number').columns
    data[numeric_cols] = data[numeric_cols].interpolate(method='linear')
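The list at the top also names forward and backward fill, K-nearest-neighbors imputation, and multiple imputation, for which the text gives no implementation. The following is a minimal sketch of all three, assuming scikit-learn is installed and using a made-up DataFrame (column names, values, `n_neighbors=2`, and the three seeds are illustrative assumptions). Note that scikit-learn's `IterativeImputer` is still marked experimental and returns a single imputation by default; drawing several imputations with `sample_posterior=True` and different seeds is one way to approximate multiple imputation.

```python
import numpy as np
import pandas as pd
# IterativeImputer is experimental and must be enabled explicitly.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "x": [np.nan, 2.0, 3.0, np.nan, 5.0],
    "y": [2.0, 4.0, np.nan, 8.0, 10.0],
})

# 7. Forward fill copies the last observed value downward; a leading gap
#    has nothing before it, so a backward fill afterwards closes it.
forward = df.ffill()
forward_then_backward = df.ffill().bfill()

# 8. KNN imputation: each gap is replaced by the average of that feature
#    over the nearest rows, measured on the features both rows have.
knn = KNNImputer(n_neighbors=2)
knn_filled = pd.DataFrame(knn.fit_transform(df), columns=df.columns)

# 9. Chained-equations imputation: each feature with gaps is modelled
#    from the other features. Sampling from the posterior with different
#    seeds gives several plausible completed datasets.
imputations = []
for seed in range(3):
    imputer = IterativeImputer(
        max_iter=10, sample_posterior=True, random_state=seed
    )
    imputations.append(
        pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
    )

print(forward_then_backward)
print(knn_filled)
print(imputations[0])
```

In a real multiple-imputation workflow the downstream analysis would be run on each completed dataset and the results pooled, rather than averaging the imputed values themselves.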
Article title: How to fill missing values in Python
Article link: http://www.shufengxianlan.com/qtweb/news36/399886.html