

How to split a dataset into train and test samples


from sklearn import datasets, model_selection

X, y = datasets.load_diabetes(return_X_y=True)

X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, train_size=0.66)
from sklearn import

imports the listed modules from scikit-learn

datasets.load_diabetes

loads the sample diabetes dataset

model_selection.train_test_split

splits the given X and y arrays into train (75% of samples by default) and test (25% of samples by default) subsets

train_size

proportion of samples to use for the train subset (66% in our case)

X_train, X_test

train and test samples for features

y_train, y_test

train and test samples for target value
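
As a minimal sketch of the defaults described above, the snippet below calls train_test_split once without any size argument and once with test_size instead of train_size. The value 0.34 and the variable names ending in 2 are chosen only for this illustration.

```python
from sklearn import datasets, model_selection

X, y = datasets.load_diabetes(return_X_y=True)

# No size given: scikit-learn falls back to roughly 75% train / 25% test.
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y)

# test_size may be passed instead of train_size; the train subset is the remainder.
# 0.34 is an illustrative value, picked to complement train_size=0.66.
X_train2, X_test2, y_train2, y_test2 = model_selection.train_test_split(
    X, y, test_size=0.34
)

print(X_train.shape, X_test.shape)
print(X_train2.shape, X_test2.shape)
```

Passing only one of train_size or test_size is usually enough, since the other subset receives the remaining samples.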


Usage example

from sklearn import datasets, model_selection

X, y = datasets.load_diabetes(return_X_y=True)

X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, train_size=0.66)

print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)
output
(291, 10)
(291,)
(151, 10)
(151,)
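
The shapes above are the same on every run, but train_test_split shuffles the samples by default, so the rows that land in each subset can differ between runs. A hedged sketch of one way to make the split repeatable is to pass random_state. The seed 0 and the names split_a, split_b and same are arbitrary choices for this example.

```python
import numpy as np
from sklearn import datasets, model_selection

X, y = datasets.load_diabetes(return_X_y=True)

# random_state=0 is an arbitrary seed; any fixed integer gives a repeatable split.
split_a = model_selection.train_test_split(X, y, train_size=0.66, random_state=0)
split_b = model_selection.train_test_split(X, y, train_size=0.66, random_state=0)

# Each call returns [X_train, X_test, y_train, y_test]; compare them pairwise.
same = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same)
```

With the same seed, both calls select identical rows, which helps when comparing models on the same train and test subsets.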