Pandas dataframe and Sklearn linear regression example


import pandas as pd
from sklearn import linear_model, model_selection

data = pd.read_csv('/var/www/examples/housing.csv')

X = data.loc[:, ['housing_median_age', 'total_rooms']]
y = data.median_house_value

X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y)

model = linear_model.LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
import pandas as pd

load Pandas module

from sklearn import

import the linear_model and model_selection modules from scikit-learn

pd.read_csv

read data from a CSV file into a dataframe

['housing_median_age', 'total_rooms']

select these two columns to be used as features

data.median_house_value

target variable

model_selection.train_test_split

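splits the given X and y datasets into train (75% of rows by default) and test (25% of rows by default) subsets; rows are shuffled before splitting, so the result differs between runs unless random_state is set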

linear_model.LinearRegression

initialize linear regression model

.fit(

train the model on the given features and target variable

.predict(

predict the target variable for the given features dataset
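
The split and fit steps described above can be tried without the CSV file. The following is a minimal sketch, not the page's original code: it assumes a small synthetic dataframe that reuses the column names from the example, with made-up values and an exact linear relationship chosen only so the result is easy to check. The explicit test_size=0.25 spells out the default, and random_state=0 is an arbitrary seed that makes the split repeatable.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model, model_selection

# Synthetic stand-in for housing.csv: made-up values, same column names as above.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    'housing_median_age': rng.integers(1, 52, size=200),
    'total_rooms': rng.integers(100, 5000, size=200),
})
# Assumed exact linear relationship, used only to make the fit verifiable.
data['median_house_value'] = (
    1000.0 * data['housing_median_age']
    + 20.0 * data['total_rooms']
    + 50000.0
)

X = data[['housing_median_age', 'total_rooms']]
y = data['median_house_value']

# test_size=0.25 is the default; random_state fixes the shuffle for repeatability.
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = linear_model.LinearRegression()
model.fit(X_train, y_train)

print(len(X_train), len(X_test))      # 150 50
print(model.coef_, model.intercept_)  # close to [1000, 20] and 50000
```

After fitting, coef_ holds one weight per feature column (in column order) and intercept_ holds the constant term, so on this noise-free data the model recovers the numbers used to build the target.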


Usage example

import pandas as pd
from sklearn import linear_model, model_selection

data = pd.read_csv('/var/www/examples/housing.csv')

X = data.loc[:, ['housing_median_age', 'total_rooms']]
y = data.median_house_value

X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y)

model = linear_model.LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(len(y_pred))
output
5160
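
The length of y_pred only confirms that one prediction was made per test row. A possible next step is to compare the predictions with y_test. The sketch below is an illustration under stated assumptions rather than part of the original example: it uses synthetic data with the same column names (made-up values plus random noise) and scores the model with r2_score and mean_absolute_error from sklearn.metrics.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model, metrics, model_selection

# Synthetic stand-in for housing.csv (hypothetical values, not the real dataset).
rng = np.random.default_rng(42)
n = 400
data = pd.DataFrame({
    'housing_median_age': rng.uniform(1, 52, size=n),
    'total_rooms': rng.uniform(100, 5000, size=n),
})
# Assumed linear relationship plus Gaussian noise.
data['median_house_value'] = (
    1000.0 * data['housing_median_age']
    + 20.0 * data['total_rooms']
    + 50000.0
    + rng.normal(0, 5000, size=n)
)

X = data[['housing_median_age', 'total_rooms']]
y = data['median_house_value']

X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, random_state=0
)

model = linear_model.LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# R^2 near 1 means predictions track the target closely; MAE is in target units.
r2 = metrics.r2_score(y_test, y_pred)
mae = metrics.mean_absolute_error(y_test, y_pred)

print(len(y_pred))  # 100 (25% of 400 rows)
print(f"R^2: {r2:.3f}, MAE: {mae:.0f}")
```

The scores here reflect the synthetic data only; on the real housing data, two features are unlikely to explain the target this well.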