Python: UCI Auto MPG Visualization Practice
astrocker
2021. 1. 9. 23:51
I ran some simple preprocessing and data visualization on Auto MPG, a practice dataset for machine learning.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
plt.rcParams['font.family']='DejaVu Sans'
plt.rcParams['axes.unicode_minus']=False
To download the data, see the source at the bottom of this post.
df=pd.read_csv('drive/My Drive/data/cars/uci_auto_mpg.csv')
df.info()
========== Result ==========
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 398 entries, 0 to 397
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 MPG 398 non-null float64
1 Cylinders 398 non-null int64
2 Displacement 398 non-null float64
3 Horsepower 392 non-null float64
4 Weight 398 non-null float64
5 Acceleration 398 non-null float64
6 Model Year 398 non-null int64
7 Origin 398 non-null int64
dtypes: float64(5), int64(3)
memory usage: 25.0 KB
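The CSV above appears to be a prepared copy with eight columns. If you start from the raw auto-mpg.data file in the UCI data folder instead, something like the sketch below may work. It assumes the raw layout is space-padded columns, a tab before the quoted car name, and `?` for a missing horsepower value. The two sample rows and the `raw` variable are illustrative stand-ins for the real file, not part of the original post.

```python
import io

import pandas as pd

# Column names chosen to match the df.info() output above.
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower',
                'Weight', 'Acceleration', 'Model Year', 'Origin']

# Illustrative rows written in the assumed raw layout (not the full dataset).
raw = (
    '18.0   8   307.0      130.0      3504.      12.0   70  1\t"chevrolet chevelle malibu"\n'
    '25.0   4   98.00      ?          2046.      19.0   71  1\t"ford pinto"\n'
)

raw_df = pd.read_csv(
    io.StringIO(raw),
    names=column_names,
    na_values='?',          # treat '?' as a missing value
    comment='\t',           # ignore everything after the tab (the car name)
    sep=' ',
    skipinitialspace=True,  # collapse the space padding between columns
)
print(raw_df.isna().sum())
```

To use this on the downloaded file, replace `io.StringIO(raw)` with the path to your local copy.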
Check which columns contain null values, then drop the affected rows
df.isna().sum() # count null values per column
========== Result ==========
MPG 0
Cylinders 0
Displacement 0
Horsepower 6
Weight 0
Acceleration 0
Model Year 0
Origin 0
dtype: int64
df.dropna(axis=0,inplace=True) # drop rows that contain NaN
df.isna().sum() # count null values per column again
========== Result ==========
MPG 0
Cylinders 0
Displacement 0
Horsepower 0
Weight 0
Acceleration 0
Model Year 0
Origin 0
dtype: int64
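Dropping rows is the simplest option, and with only 6 of 398 rows affected it costs little. For comparison, here is a minimal sketch of dropping versus filling with the column median. The `toy` frame and its values are made up for illustration, and median imputation is just one possible alternative, not something the original analysis used.

```python
import numpy as np
import pandas as pd

# Made-up toy data with one missing Horsepower value.
toy = pd.DataFrame({
    'MPG': [18.0, 25.0, 30.0],
    'Horsepower': [130.0, np.nan, 70.0],
})

# Option 1: drop rows with NaN and renumber the index.
dropped = toy.dropna(axis=0).reset_index(drop=True)

# Option 2: keep every row and fill the gap with the column median.
filled = toy.fillna({'Horsepower': toy['Horsepower'].median()})

print(len(dropped), filled['Horsepower'].tolist())
```

Note that `dropna` without `reset_index` leaves gaps in the index, which can matter if you later select rows by position.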
Drawing a heatmap
sns.heatmap(df.corr(), vmin=-1, vmax=1,
            linewidth=0.5, annot=True, cmap=plt.cm.gist_heat)
plt.show()
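A heatmap is easy to scan, but when the question is "which features move with MPG?", a sorted column of the correlation matrix can be quicker to read. A minimal sketch, using a made-up `toy` frame in place of the real data:

```python
import pandas as pd

# Made-up toy data; with the real data, use df from above instead.
toy = pd.DataFrame({
    'MPG': [10.0, 20.0, 30.0, 40.0],
    'Weight': [4000.0, 3000.0, 2000.0, 1000.0],
    'Acceleration': [12.0, 15.0, 14.0, 18.0],
})

# Pearson correlation of every other column with MPG, most negative first.
mpg_corr = toy.corr()['MPG'].drop('MPG').sort_values()
print(mpg_corr)
```

With the actual data the same line would be `df.corr()['MPG'].drop('MPG').sort_values()`.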
Drawing a pairplot
sns.pairplot(df[['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight']],diag_kind='kde')
plt.show()
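The pairplot above leaves out Origin. One possible extension is to map the integer codes to labels so they can serve as a hue. The mapping below follows the commonly used reading of the codes (1 = USA, 2 = Europe, 3 = Japan), which is an assumption here since the post itself does not state it. The `toy` frame and the `origin_labels` name are made up for illustration.

```python
import pandas as pd

# Assumed meaning of the Origin codes (not stated in the post).
origin_labels = {1: 'USA', 2: 'Europe', 3: 'Japan'}

# Made-up toy data; with the real data, use df from above instead.
toy = pd.DataFrame({
    'MPG': [18.0, 26.0, 31.0],
    'Weight': [3504.0, 2234.0, 1773.0],
    'Origin': [1, 2, 3],
})

# Return a copy with readable labels; the original frame is left unchanged.
labeled = toy.assign(Origin=toy['Origin'].map(origin_labels))
print(labeled['Origin'].tolist())
```

The labeled frame could then be passed to seaborn as `sns.pairplot(labeled, hue='Origin')` to color each point by region.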
Data source: archive.ics.uci.edu/ml/datasets/Auto+MPG