
Python: UCI Auto MPG Visualization Practice

astrocker 2021. 1. 9. 23:51

A quick round of preprocessing-stage data visualization on Auto MPG, a common practice dataset for machine learning.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
plt.rcParams['font.family']='DejaVu Sans'
plt.rcParams['axes.unicode_minus']=False

 

Download the data from the source listed at the bottom of this post.

df=pd.read_csv('drive/My Drive/data/cars/uci_auto_mpg.csv')
df.info()
========== Result ==========
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 398 entries, 0 to 397
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   MPG           398 non-null    float64
 1   Cylinders     398 non-null    int64  
 2   Displacement  398 non-null    float64
 3   Horsepower    392 non-null    float64
 4   Weight        398 non-null    float64
 5   Acceleration  398 non-null    float64
 6   Model Year    398 non-null    int64  
 7   Origin        398 non-null    int64  
dtypes: float64(5), int64(3)
memory usage: 25.0 KB
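The CSV above already has a header row and numeric columns. If you start from the raw `auto-mpg.data` file in the UCI Data Folder instead, it is space-separated, has no header, marks missing horsepower with `?`, and ends each row with a tab followed by the car name. A minimal sketch of reading that layout follows. The helper name `load_auto_mpg` and the two sample rows are assumptions made up for illustration, not part of the original post or the real data.

```python
import io
import pandas as pd

# Column names chosen to match the df.info() output in this post.
COLUMNS = ['MPG', 'Cylinders', 'Displacement', 'Horsepower',
           'Weight', 'Acceleration', 'Model Year', 'Origin']

def load_auto_mpg(source):
    """Read the raw, space-separated Auto MPG layout (hypothetical helper)."""
    return pd.read_csv(
        source,
        names=COLUMNS,          # the raw file has no header row
        na_values='?',          # missing horsepower is written as '?'
        comment='\t',           # drop the tab-separated car name at line end
        sep=' ',
        skipinitialspace=True,  # columns are padded with several spaces
    )

# Two made-up rows in the same layout, only to show the call.
sample = io.StringIO(
    '18.0   8   307.0      130.0      3504.      12.0   70  1\t"car a"\n'
    '25.0   4   98.00      ?          2046.      19.0   71  1\t"car b"\n'
)
raw = load_auto_mpg(sample)
print(raw.isna().sum())
```

Passing a file path instead of the `StringIO` object should work the same way, assuming the downloaded file keeps this layout.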

 

Check which columns contain null values, then drop the affected rows

df.isna().sum() # count null values per column
========== Result ==========
MPG             0
Cylinders       0
Displacement    0
Horsepower      6
Weight          0
Acceleration    0
Model Year      0
Origin          0
dtype: int64

df.dropna(axis=0,inplace=True) # drop rows containing NaN

df.isna().sum() # confirm no null values remain
========== Result ==========
MPG             0
Cylinders       0
Displacement    0
Horsepower      0
Weight          0
Acceleration    0
Model Year      0
Origin          0
dtype: int64
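Dropping rows is the simplest option here because only 6 of the 398 rows are affected. As a rough sketch of two alternatives, the snippet below first inspects the rows with a missing value and then compares dropping with median imputation. The `toy` frame is invented for illustration. On the real data you would use `df` in its place.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df, with made-up values.
toy = pd.DataFrame({
    'MPG': [18.0, 25.0, 30.0, 22.0],
    'Horsepower': [130.0, np.nan, 70.0, np.nan],
})

# Look at the affected rows before deciding what to do with them.
missing_rows = toy[toy['Horsepower'].isna()]

# Option 1: drop only rows where Horsepower is missing, then renumber the index.
dropped = toy.dropna(subset=['Horsepower']).reset_index(drop=True)

# Option 2: keep every row and fill the gaps with the column median.
imputed = toy.fillna({'Horsepower': toy['Horsepower'].median()})

print(len(missing_rows), len(dropped), len(imputed))
```

Both options return new frames rather than modifying `toy` in place, so the original stays available for comparison.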

 

Drawing a heatmap

sns.heatmap(df.corr(),vmin=-1,vmax=1,
            linewidths=0.5,annot=True,cmap=plt.cm.gist_heat)
plt.show()
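The heatmap shows every pairwise correlation at once. If the main interest is how each feature relates to MPG, the same correlation matrix can be reduced to a single ranked column. The sketch below is one way to do that. The helper `rank_correlations` and the `toy` frame are hypothetical and only illustrate the idea.

```python
import pandas as pd

def rank_correlations(frame, target):
    """Return correlations with `target`, strongest (by absolute value) first."""
    corr = frame.select_dtypes('number').corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)

# Made-up numbers: 'neg' falls as 'target' rises, 'weak' follows it loosely.
toy = pd.DataFrame({
    'target': [1, 2, 3, 4],
    'neg':    [4, 3, 2, 1],
    'weak':   [1, 3, 2, 4],
})
ranked = rank_correlations(toy, 'target')
print(ranked)
```

On the real data the call would be `rank_correlations(df, 'MPG')`.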

 

Drawing a pairplot

sns.pairplot(df[['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight']],diag_kind='kde')
plt.show()

 

 

Data source: archive.ics.uci.edu/ml/datasets/Auto+MPG

 


 
