Using sklearn cross_val_score and kfolds to fit and help predict model

I'm trying to understand how to use KFold cross-validation from the sklearn Python module.

I understand the basic flow:

  • instantiate a model, e.g. model = LogisticRegression()
  • fit the model, e.g. model.fit(xtrain, ytrain)
  • predict, e.g. model.predict(xtest)
  • use e.g. cross_val_score to test the fitted model's accuracy.
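
The first three steps of that flow can be sketched as follows. This is only a minimal illustration: the synthetic data from make_classification, the 75/25 split, and the random_state values are assumptions made for the example, not part of the original question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: any labelled classification set works here).
X, y = make_classification(n_samples=200, n_features=5, random_state=8)
xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.25, random_state=8
)

model = LogisticRegression()        # instantiate
model.fit(xtrain, ytrain)           # fit on the training features and labels
predictions = model.predict(xtest)  # predict from test features, not test labels

# Accuracy of the fitted model on the held-out test split.
accuracy = model.score(xtest, ytest)
```

Note that predict takes the test features (xtest); the test labels (ytest) are only used afterwards to score the predictions.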

Where I'm confused is using sklearn's KFold with cross_val_score. As I understand it, the cross_val_score function fits the model and predicts on each fold, giving you an accuracy score per fold.

e.g. using code like this:

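from sklearn import linear_model
from sklearn.model_selection import KFold, cross_val_score
# sklearn.model_selection API: KFold takes n_splits; the sample count is inferred from the data
kf = KFold(n_splits=5, shuffle=True, random_state=8)
lr = linear_model.LogisticRegression()
accuracies = cross_val_score(lr, X_train, y_train, scoring='accuracy', cv=kf)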

So if I have a dataset with training and testing data, and I use the cross_val_score function with KFold to determine the accuracy of the algorithm on my training data for each fold, is the model now fitted and ready for prediction on the testing data? In the case above, could I call lr.predict directly?

about 3 years ago · Santiago Trujillo
1 Answer

No, the model is not fitted. Looking at the source code for cross_val_score:

scores = parallel(delayed(_fit_and_score)(clone(estimator), X, y, scorer,
                                          train, test, verbose, None, fit_params)

As you can see, cross_val_score clones the estimator before fitting it on each fold's training data, so the estimator you passed in is never fitted itself. cross_val_score returns an array of scores, one per fold, which you can analyse to see how the estimator performs on different folds of the data and to check whether it overfits. You can read more about it here
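
A small sketch can make the cloning behaviour visible. It assumes a recent scikit-learn where KFold lives in sklearn.model_selection and takes n_splits; the synthetic data is also an assumption made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic training data (assumption, stands in for the asker's X_train / y_train).
X_train, y_train = make_classification(n_samples=200, n_features=5, random_state=8)

kf = KFold(n_splits=5, shuffle=True, random_state=8)
lr = LogisticRegression()

# Each fold is fitted on a clone of lr, so lr itself is left untouched.
accuracies = cross_val_score(lr, X_train, y_train, scoring="accuracy", cv=kf)

# A fitted LogisticRegression exposes coef_; the original estimator does not.
is_fitted = hasattr(lr, "coef_")

# Calling predict on the unfitted estimator raises NotFittedError.
try:
    lr.predict(X_train[:1])
    predict_failed = False
except NotFittedError:
    predict_failed = True
```

Here accuracies holds one score per fold, while is_fitted stays False and predict_failed becomes True, which is the behaviour described above.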

Once you are satisfied with the results of cross_val_score, you need to fit the estimator on the whole training set before you can use it to predict on the test data.
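
One way this workflow might look end to end is sketched below. The synthetic data, the split sizes, and the variable names mean_cv_accuracy and test_accuracy are assumptions for the example; only the cross_val_score, fit, and predict steps come from the discussion above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Synthetic data and split (assumptions made for illustration).
X, y = make_classification(n_samples=300, n_features=5, random_state=8)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=8
)

kf = KFold(n_splits=5, shuffle=True, random_state=8)
lr = LogisticRegression()

# Step 1: estimate performance with cross-validation on the training data only.
accuracies = cross_val_score(lr, X_train, y_train, scoring="accuracy", cv=kf)
mean_cv_accuracy = accuracies.mean()

# Step 2: fit the estimator itself on the whole training set.
lr.fit(X_train, y_train)

# Step 3: only now can it predict on the held-out test data.
test_predictions = lr.predict(X_test)
test_accuracy = lr.score(X_test, y_test)
```

The cross-validation scores are used only to judge the model; the final fit on the full training set is what produces the estimator used for prediction.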

about 3 years ago · Santiago Trujillo