<bound method NDFrame.describe of       gender race/ethnicity parental level of education         lunch  \
0     female        group B           bachelor's degree      standard   
1     female        group C                some college      standard   
2     female        group B             master's degree      standard   
3       male        group A          associate's degree  free/reduced   
4       male        group C                some college      standard   
...      ...            ...                         ...           ...   
1295    male        group B          associate's degree  free/reduced   
1296    male        group C            some high school  free/reduced   
1297  female        group C                 high school      standard   
1298    male        group B           bachelor's degree      standard   
1299    male        group A                 high school      standard   

     test preparation course  math score  reading score  writing score  
0                       none          72             72             74  
1                  completed          69             90             88  
2                       none          90             95             93  
3                       none          47             57             44  
4                       none          76             78             75  
...                      ...         ...            ...            ...  
1295                    none          67             60             59  
1296                    none          39             38             32  
1297                    none          62             79             77  
1298                    none          65             62             56  
1299               completed          54             50             47  

[1300 rows x 8 columns]>

[Figure: histograms of math score, reading score, and writing score (y-axis: Count). Annotations: math score, skewness -0.24, kurtosis 0.1; reading score, skewness -0.24, kurtosis -0.13; writing score, skewness -0.27, kurtosis -0.04.]

[Figure: 2x2 grid of histograms titled math, reading, and writing; the fourth panel is empty.]

gender         0
race           0
parents        0
lunch          0
preparation    0
math           0
reading        0
writing        0
dtype: int64

[Figure: two scatter plots of Math score (x-axis) against Writing score (y-axis), each titled "Dependence of the math score on the writing score".]

[Figure: scatter plot of Math score (x-axis) against Reading score (y-axis), titled "Dependence of the math score on the reading score".]

[Figure: seaborn pairplot (PairGrid).]

gender         0
race           0
parents        0
lunch          0
preparation    0
math           0
reading        0
writing        0
math_grade     0
dtype: int64

gender                   female
race                    group C
parents        some high school
lunch              free/reduced
preparation                none
math                          0
reading                      17
writing                      10
math_grade                    0
Name: 59, dtype: object

gender           object
race             object
parents          object
lunch            object
preparation      object
math              int64
reading           int64
writing           int64
math_grade     category
dtype: object

borderline    490
good          312
repeat        240
failed        180
excellent      78
Name: math_grade, dtype: int64

array(["bachelor's degree", 'some college', "master's degree",
       "associate's degree", 'high school', 'some high school'],
      dtype=object)

array(['group B', 'group C', 'group A', 'group D', 'group E'],
      dtype=object)

isMale                     int32
race                      object
parents                   object
lunchStandard              int32
preparationCompleted       int32
reading                    int64
writing                    int64
math_grade              category
dtype: object

isMale                           int32
lunchStandard                    int32
preparationCompleted             int32
reading                        float64
writing                        float64
math_grade                    category
race_group A                     uint8
race_group B                     uint8
race_group C                     uint8
race_group D                     uint8
race_group E                     uint8
parents_associate's degree       uint8
parents_bachelor's degree        uint8
parents_high school              uint8
parents_master's degree          uint8
parents_some college             uint8
parents_some high school         uint8
dtype: object

isMale                         float64
lunchStandard                  float64
preparationCompleted           float64
reading                        float64
writing                        float64
math_grade                    category
race_group A                   float64
race_group B                   float64
race_group C                   float64
race_group D                   float64
race_group E                   float64
parents_associate's degree     float64
parents_bachelor's degree      float64
parents_high school            float64
parents_master's degree        float64
parents_some college           float64
parents_some high school       float64
dtype: object

DummyClassifier(strategy='stratified')

array(['borderline', 'borderline', 'failed', ..., 'good', 'good',
       'borderline'], dtype=object)

0.26692307692307693

DummyClassifier(strategy='uniform')

array(['repeat', 'failed', 'borderline', ..., 'borderline', 'repeat',
       'borderline'], dtype=object)

0.21153846153846154

C:\Users\Martin\AppData\Roaming\Python\Python38\site-packages\sklearn\linear_model\_logistic.py:763: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

LogisticRegression(random_state=42)

Pipeline(steps=[('standardscaler', StandardScaler()),
                ('logisticregression', LogisticRegression())])

0.7076923076923077

0.678021978021978

0.643956043956044

C:\Users\Martin\AppData\Roaming\Python\Python38\site-packages\sklearn\linear_model\_logistic.py:763: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

0.6098901098901099

['failed', 'good', 'borderline', 'failed', 'repeat', ..., 'repeat', 'borderline', 'borderline', 'repeat', 'good']
Length: 910
Categories (5, object): ['failed' < 'repeat' < 'borderline' < 'good' < 'excellent']

array(['borderline', 'borderline', 'borderline', 'borderline',
       'borderline', 'good', 'borderline', 'borderline', 'borderline',
       'good', 'good', 'borderline', 'borderline', 'borderline', 'good',
       'borderline', 'borderline', 'good', 'good', 'failed', 'borderline',
       'borderline', 'borderline', 'good', 'repeat', 'failed', 'good',
       'borderline', 'failed', 'borderline', 'borderline', 'borderline',
       'failed', 'repeat', 'borderline', 'borderline', 'borderline',
       'borderline', 'good', 'borderline', 'borderline', 'failed',
       'repeat', 'good', 'borderline', 'good', 'borderline', 'failed',
       'failed', 'good', 'good', 'borderline', 'good', 'good',
       'borderline', 'failed', 'borderline', 'borderline', 'borderline',
       'borderline', 'borderline', 'borderline', 'borderline',
       'borderline', 'failed', 'borderline', 'borderline', 'good', 'good',
       'borderline', 'borderline', 'borderline', 'borderline',
       'borderline', 'borderline', 'good', 'borderline', 'borderline',
       'good', 'borderline', 'good', 'failed', 'repeat', 'borderline',
       'failed', 'failed', 'borderline', 'failed', 'borderline', 'failed',
       'failed', 'borderline', 'good', 'borderline', 'good', 'borderline',
       'borderline', 'good', 'borderline', 'good', 'good', 'borderline',
       'good', 'borderline', 'borderline', 'borderline', 'good', 'good',
       'good', 'borderline', 'good', 'failed', 'borderline', 'repeat',
       'repeat', 'borderline', 'failed', 'borderline', 'borderline',
       'borderline', 'borderline', 'borderline', 'borderline',
       'borderline', 'good', 'good', 'failed', 'borderline', 'good',
       'borderline', 'borderline', 'borderline', 'borderline',
       'borderline', 'borderline', 'good', 'borderline', 'borderline',
       'good', 'borderline', 'good', 'borderline', 'borderline',
       'borderline', 'borderline', 'good', 'borderline', 'good', 'repeat',
       'borderline', 'borderline', 'repeat', 'borderline', 'repeat',
       'borderline', 'good', 'borderline', 'borderline', 'borderline',
       'borderline', 'good', 'borderline', 'failed', 'good', 'good',
       'failed', 'failed', 'borderline', 'repeat', 'borderline',
       'borderline', 'failed', 'borderline', 'good', 'borderline',
       'borderline', 'borderline', 'good', 'good', 'good', 'repeat',
       'good', 'borderline', 'borderline', 'borderline', 'borderline',
       'failed', 'good', 'borderline', 'good', 'borderline', 'borderline',
       'borderline', 'borderline', 'borderline', 'borderline', 'good',
       'borderline', 'good', 'failed', 'borderline', 'borderline',
       'borderline', 'borderline', 'borderline', 'failed', 'borderline',
       'good', 'good', 'borderline', 'borderline', 'borderline',
       'borderline', 'borderline', 'borderline', 'good', 'good',
       'borderline', 'borderline', 'borderline', 'good', 'good',
       'borderline', 'good', 'failed', 'good', 'borderline', 'borderline',
       'borderline', 'borderline', 'good', 'borderline', 'failed', 'good',
       'repeat', 'good', 'borderline', 'borderline', 'borderline',
       'borderline', 'borderline', 'borderline', 'borderline',
       'borderline', 'repeat', 'failed', 'borderline', 'borderline',
       'borderline', 'failed', 'borderline', 'borderline', 'borderline',
       'borderline', 'borderline', 'good', 'borderline', 'repeat', 'good',
       'borderline', 'borderline', 'borderline', 'borderline',
       'borderline', 'failed', 'good', 'borderline', 'borderline',
       'borderline', 'borderline', 'borderline', 'borderline', 'failed',
       'borderline', 'good', 'failed', 'borderline', 'borderline', 'good',
       'borderline', 'borderline', 'borderline', 'repeat', 'borderline',
       'repeat', 'borderline', 'borderline', 'good', 'repeat', 'good',
       'borderline', 'borderline', 'good', 'borderline', 'failed', 'good',
       'borderline', 'borderline', 'good', 'borderline', 'failed', 'good',
       'borderline', 'borderline', 'good', 'borderline', 'borderline',
       'good', 'failed', 'borderline', 'repeat', 'failed', 'borderline',
       'repeat', 'borderline', 'borderline', 'borderline', 'borderline',
       'good', 'borderline', 'failed', 'borderline', 'borderline',
       'failed', 'borderline', 'good', 'borderline', 'good', 'borderline',
       'borderline', 'failed', 'failed', 'failed', 'borderline', 'failed',
       'borderline', 'borderline', 'borderline', 'good', 'good', 'repeat',
       'borderline', 'good', 'good', 'borderline', 'borderline',
       'borderline', 'borderline', 'failed', 'good', 'failed', 'good',
       'borderline', 'good', 'repeat', 'borderline', 'failed', 'good',
       'borderline', 'borderline', 'repeat', 'borderline', 'good',
       'borderline', 'borderline', 'borderline', 'borderline',
       'borderline', 'borderline', 'borderline', 'borderline', 'failed',
       'borderline', 'good', 'borderline', 'borderline', 'borderline',
       'borderline', 'borderline', 'good', 'failed', 'good', 'good',
       'good', 'borderline', 'borderline', 'good', 'borderline', 'good',
       'borderline'], dtype=object)

RandomForestClassifier()

0.6186520947176685

SVC(kernel='linear')

0.650455373406193

C:\Users\Martin\AppData\Roaming\Python\Python38\site-packages\xgboost\sklearn.py:1224: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].
  warnings.warn(label_encoder_deprecation_msg, UserWarning)

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
              gamma=5, gpu_id=-1, importance_type=None,
              interaction_constraints='', learning_rate=0.300000012,
              max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
              monotone_constraints='()', n_estimators=100, n_jobs=-1,
              num_parallel_tree=1, objective='multi:softprob', predictor='auto',
              random_state=0, reg_alpha=0, reg_lambda=1, scale_pos_weight=None,
              subsample=1, tree_method='exact', validate_parameters=1,
              verbosity=0)

C:\Users\Martin\AppData\Roaming\Python\Python38\site-packages\xgboost\sklearn.py:1224: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].
  warnings.warn(label_encoder_deprecation_msg, UserWarning)

0.6505494505494506

dict_keys(['bootstrap', 'ccp_alpha', 'class_weight', 'criterion', 'max_depth', 'max_features', 'max_leaf_nodes', 'max_samples', 'min_impurity_decrease', 'min_impurity_split', 'min_samples_leaf', 'min_samples_split', 'min_weight_fraction_leaf', 'n_estimators', 'n_jobs', 'oob_score', 'random_state', 'verbose', 'warm_start'])

Fitting 3 folds for each of 16 candidates, totalling 48 fits

RandomForestClassifier(max_depth=5, random_state=42)

{'max_depth': 5, 'n_estimators': 100}

0.6263715187308204

Best: 0.626372 using {'max_depth': 5, 'n_estimators': 100}
0.374729 (0.002134) with: {'max_depth': 1, 'n_estimators': 10}
0.374729 (0.002134) with: {'max_depth': 1, 'n_estimators': 100}
0.376929 (0.004569) with: {'max_depth': 1, 'n_estimators': 1000}
0.376929 (0.004569) with: {'max_depth': 1, 'n_estimators': 2000}
0.613159 (0.018480) with: {'max_depth': 5, 'n_estimators': 10}
0.626372 (0.009528) with: {'max_depth': 5, 'n_estimators': 100}
0.624182 (0.004884) with: {'max_depth': 5, 'n_estimators': 1000}
0.623082 (0.009991) with: {'max_depth': 5, 'n_estimators': 2000}
0.590129 (0.021553) with: {'max_depth': 10, 'n_estimators': 10}
0.615403 (0.022261) with: {'max_depth': 10, 'n_estimators': 100}
0.623064 (0.018196) with: {'max_depth': 10, 'n_estimators': 1000}
0.623075 (0.017568) with: {'max_depth': 10, 'n_estimators': 2000}
0.596737 (0.028030) with: {'max_depth': 20, 'n_estimators': 10}
0.604406 (0.009273) with: {'max_depth': 20, 'n_estimators': 100}
0.618692 (0.027885) with: {'max_depth': 20, 'n_estimators': 1000}
0.614296 (0.023779) with: {'max_depth': 20, 'n_estimators': 2000}

dict_keys(['C', 'class_weight', 'dual', 'fit_intercept', 'intercept_scaling', 'l1_ratio', 'max_iter', 'multi_class', 'n_jobs', 'penalty', 'random_state', 'solver', 'tol', 'verbose', 'warm_start'])

Best: 0.686813 using {'C': 1.0, 'penalty': 'l2', 'solver': 'lbfgs'}
0.684249 (0.033684) with: {'C': 100, 'penalty': 'l2', 'solver': 'newton-cg'}
0.683883 (0.034011) with: {'C': 100, 'penalty': 'l2', 'solver': 'lbfgs'}
0.620879 (0.041483) with: {'C': 100, 'penalty': 'l2', 'solver': 'liblinear'}
0.683516 (0.034331) with: {'C': 100, 'penalty': 'l2', 'solver': 'sag'}
0.683883 (0.035289) with: {'C': 100, 'penalty': 'l2', 'solver': 'saga'}
0.000000 (0.000000) with: {'C': 100, 'penalty': 'l1', 'solver': 'newton-cg'}
0.000000 (0.000000) with: {'C': 100, 'penalty': 'l1', 'solver': 'lbfgs'}
0.621245 (0.041091) with: {'C': 100, 'penalty': 'l1', 'solver': 'liblinear'}
0.000000 (0.000000) with: {'C': 100, 'penalty': 'l1', 'solver': 'sag'}
0.684249 (0.035544) with: {'C': 100, 'penalty': 'l1', 'solver': 'saga'}
0.683516 (0.034448) with: {'C': 100, 'penalty': 'none', 'solver': 'newton-cg'}
0.683516 (0.034448) with: {'C': 100, 'penalty': 'none', 'solver': 'lbfgs'}
0.000000 (0.000000) with: {'C': 100, 'penalty': 'none', 'solver': 'liblinear'}
0.683516 (0.034448) with: {'C': 100, 'penalty': 'none', 'solver': 'sag'}
0.684615 (0.035113) with: {'C': 100, 'penalty': 'none', 'solver': 'saga'}
0.681319 (0.037961) with: {'C': 10, 'penalty': 'l2', 'solver': 'newton-cg'}
0.681685 (0.037264) with: {'C': 10, 'penalty': 'l2', 'solver': 'lbfgs'}
0.618315 (0.040618) with: {'C': 10, 'penalty': 'l2', 'solver': 'liblinear'}
0.681319 (0.037961) with: {'C': 10, 'penalty': 'l2', 'solver': 'sag'}
0.680586 (0.038376) with: {'C': 10, 'penalty': 'l2', 'solver': 'saga'}
0.000000 (0.000000) with: {'C': 10, 'penalty': 'l1', 'solver': 'newton-cg'}
0.000000 (0.000000) with: {'C': 10, 'penalty': 'l1', 'solver': 'lbfgs'}
0.620879 (0.041483) with: {'C': 10, 'penalty': 'l1', 'solver': 'liblinear'}
0.000000 (0.000000) with: {'C': 10, 'penalty': 'l1', 'solver': 'sag'}
0.684982 (0.035020) with: {'C': 10, 'penalty': 'l1', 'solver': 'saga'}
0.683516 (0.034448) with: {'C': 10, 'penalty': 'none', 'solver': 'newton-cg'}
0.683516 (0.034448) with: {'C': 10, 'penalty': 'none', 'solver': 'lbfgs'}
0.000000 (0.000000) with: {'C': 10, 'penalty': 'none', 'solver': 'liblinear'}
0.683516 (0.034448) with: {'C': 10, 'penalty': 'none', 'solver': 'sag'}
0.684982 (0.034673) with: {'C': 10, 'penalty': 'none', 'solver': 'saga'}
0.686447 (0.039083) with: {'C': 1.0, 'penalty': 'l2', 'solver': 'newton-cg'}
0.686813 (0.038566) with: {'C': 1.0, 'penalty': 'l2', 'solver': 'lbfgs'}
0.608425 (0.038434) with: {'C': 1.0, 'penalty': 'l2', 'solver': 'liblinear'}
0.686447 (0.038669) with: {'C': 1.0, 'penalty': 'l2', 'solver': 'sag'}
0.686447 (0.039083) with: {'C': 1.0, 'penalty': 'l2', 'solver': 'saga'}
0.000000 (0.000000) with: {'C': 1.0, 'penalty': 'l1', 'solver': 'newton-cg'}
0.000000 (0.000000) with: {'C': 1.0, 'penalty': 'l1', 'solver': 'lbfgs'}
0.621612 (0.039283) with: {'C': 1.0, 'penalty': 'l1', 'solver': 'liblinear'}
0.000000 (0.000000) with: {'C': 1.0, 'penalty': 'l1', 'solver': 'sag'}
0.682784 (0.037074) with: {'C': 1.0, 'penalty': 'l1', 'solver': 'saga'}
0.683516 (0.034448) with: {'C': 1.0, 'penalty': 'none', 'solver': 'newton-cg'}
0.683516 (0.034448) with: {'C': 1.0, 'penalty': 'none', 'solver': 'lbfgs'}
0.000000 (0.000000) with: {'C': 1.0, 'penalty': 'none', 'solver': 'liblinear'}
0.683516 (0.034448) with: {'C': 1.0, 'penalty': 'none', 'solver': 'sag'}
0.684615 (0.035113) with: {'C': 1.0, 'penalty': 'none', 'solver': 'saga'}
0.652747 (0.039885) with: {'C': 0.1, 'penalty': 'l2', 'solver': 'newton-cg'}
0.652747 (0.039885) with: {'C': 0.1, 'penalty': 'l2', 'solver': 'lbfgs'}
0.597070 (0.034205) with: {'C': 0.1, 'penalty': 'l2', 'solver': 'liblinear'}
0.652747 (0.039885) with: {'C': 0.1, 'penalty': 'l2', 'solver': 'sag'}
0.652747 (0.039885) with: {'C': 0.1, 'penalty': 'l2', 'solver': 'saga'}
0.000000 (0.000000) with: {'C': 0.1, 'penalty': 'l1', 'solver': 'newton-cg'}
0.000000 (0.000000) with: {'C': 0.1, 'penalty': 'l1', 'solver': 'lbfgs'}
0.606960 (0.039792) with: {'C': 0.1, 'penalty': 'l1', 'solver': 'liblinear'}
0.000000 (0.000000) with: {'C': 0.1, 'penalty': 'l1', 'solver': 'sag'}
0.660806 (0.039696) with: {'C': 0.1, 'penalty': 'l1', 'solver': 'saga'}
0.683516 (0.034448) with: {'C': 0.1, 'penalty': 'none', 'solver': 'newton-cg'}
0.683516 (0.034448) with: {'C': 0.1, 'penalty': 'none', 'solver': 'lbfgs'}
0.000000 (0.000000) with: {'C': 0.1, 'penalty': 'none', 'solver': 'liblinear'}
0.683516 (0.034448) with: {'C': 0.1, 'penalty': 'none', 'solver': 'sag'}
0.684615 (0.034419) with: {'C': 0.1, 'penalty': 'none', 'solver': 'saga'}
0.567766 (0.031125) with: {'C': 0.01, 'penalty': 'l2', 'solver': 'newton-cg'}
0.567766 (0.031125) with: {'C': 0.01, 'penalty': 'l2', 'solver': 'lbfgs'}
0.573993 (0.045635) with: {'C': 0.01, 'penalty': 'l2', 'solver': 'liblinear'}
0.567766 (0.031125) with: {'C': 0.01, 'penalty': 'l2', 'solver': 'sag'}
0.567766 (0.031125) with: {'C': 0.01, 'penalty': 'l2', 'solver': 'saga'}
0.000000 (0.000000) with: {'C': 0.01, 'penalty': 'l1', 'solver': 'newton-cg'}
0.000000 (0.000000) with: {'C': 0.01, 'penalty': 'l1', 'solver': 'lbfgs'}
0.383516 (0.011116) with: {'C': 0.01, 'penalty': 'l1', 'solver': 'liblinear'}
0.000000 (0.000000) with: {'C': 0.01, 'penalty': 'l1', 'solver': 'sag'}
0.412088 (0.017433) with: {'C': 0.01, 'penalty': 'l1', 'solver': 'saga'}
0.683516 (0.034448) with: {'C': 0.01, 'penalty': 'none', 'solver': 'newton-cg'}
0.683516 (0.034448) with: {'C': 0.01, 'penalty': 'none', 'solver': 'lbfgs'}
0.000000 (0.000000) with: {'C': 0.01, 'penalty': 'none', 'solver': 'liblinear'}
0.683516 (0.034448) with: {'C': 0.01, 'penalty': 'none', 'solver': 'sag'}
0.684982 (0.034673) with: {'C': 0.01, 'penalty': 'none', 'solver': 'saga'}

STUDENT PERFORMANCE PREDICTION - Classification task

Kaggle dataset

Importing the basic libraries

Here we configure InteractiveShell so that the output of every line is displayed

Loading the CSV, data insights and exploration

To decide how to work with the data further, we display the dataset and the properties of its columns using the following few lines of code
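
The code cells for this step were not preserved in this export. A minimal sketch of such an inspection, assuming pandas and a hypothetical five-row sample in place of the Kaggle CSV, could look like the following. Note that `describe` has to be called with parentheses; referencing it without them yields the `<bound method NDFrame.describe of ...>` text that appears in this notebook's output instead of summary statistics.

```python
import pandas as pd

# Hypothetical sample mirroring the first five rows of the dataset (not the full CSV).
df = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male"],
    "lunch": ["standard", "standard", "standard", "free/reduced", "standard"],
    "math score": [72, 69, 90, 47, 76],
    "reading score": [72, 90, 95, 57, 78],
    "writing score": [74, 88, 93, 44, 75],
})

print(df.head())    # first rows of the table
print(df.dtypes)    # column types
print(df.shape)     # (rows, columns)

# Called WITH parentheses: count, mean, std, min, quartiles, max per numeric column.
summary = df.describe()
print(summary)

# Referenced WITHOUT parentheses: only the bound method object, no statistics.
not_called = df.describe
print(callable(not_called))
```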

We create the variable scores

Visualizing the distribution of the test scores
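
The histogram annotations in this notebook report skewness and kurtosis for each score. How the author computed them is not preserved; one possible sketch uses the pandas methods `Series.skew()` and `Series.kurt()` (an assumption) on the first ten math scores from the table in this notebook.

```python
import pandas as pd

# First ten math scores from the dataset preview; the notebook itself uses all 1300 rows.
math_scores = pd.Series([72, 69, 90, 47, 76, 71, 88, 40, 64, 38])

skewness = math_scores.skew()   # bias-corrected sample skewness
kurtosis = math_scores.kurt()   # excess kurtosis (about 0 for a normal distribution)

# Text in the same two-line form as the plot annotations; it could be drawn
# on a histogram with matplotlib's ax.text(x, y, label).
label = f"Skewness: {skewness:.2f}\nKurtosis: {kurtosis:.2f}"
print(label)

# Sanity checks on toy data: a symmetric sample has zero skewness,
# a sample with a long left tail has negative skewness.
symmetric = pd.Series([1, 2, 3, 4, 5])
left_tailed = pd.Series([1, 8, 9, 9, 10, 10])
print(symmetric.skew(), left_tailed.skew())
```

A negative skewness, as reported for all three scores, indicates a longer tail toward low scores.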

Transformation - renaming the columns for easier handling
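
The short column names that appear in the later outputs (`race`, `parents`, `preparation`, `math`, `reading`, `writing`) suggest a rename along these lines. This is a sketch with `DataFrame.rename`; the author's exact call is not preserved.

```python
import pandas as pd

# Empty frame with the original Kaggle column names, enough to demonstrate the rename.
df = pd.DataFrame(columns=[
    "gender", "race/ethnicity", "parental level of education", "lunch",
    "test preparation course", "math score", "reading score", "writing score",
])

# Map the long names to the short ones seen in the notebook's later outputs.
renamed = df.rename(columns={
    "race/ethnicity": "race",
    "parental level of education": "parents",
    "test preparation course": "preparation",
    "math score": "math",
    "reading score": "reading",
    "writing score": "writing",
})
print(list(renamed.columns))
```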

We also plot the distribution of the numeric columns this way, to try it out

Checking for null values

In case of missing values

Example of filling in missing values
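
The dataset itself contains no nulls (every count in the null check is 0), so the filling step is only a demonstration. A minimal sketch on a hypothetical frame with two injected gaps follows; filling with the column median is an assumption, not necessarily the strategy the author used.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing values injected for the demonstration.
df = pd.DataFrame({
    "math": [72, 69, np.nan, 47],
    "reading": [72, 90, 95, np.nan],
})

# Per-column count of missing values, the same check as in the notebook.
missing = df.isnull().sum()
print(missing)

# Fill each gap with the median of its own column.
filled = df.fillna(df.median(numeric_only=True))
print(filled)
```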

Finding potential dependencies in the data

Displaying the relationship matrix using Seaborn "pairplot"

Transforming the dataset

Creating a new column

Preparing and filling the "target" column

Overwriting the column with categorical values
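
The target `math_grade` is an ordered categorical with the levels failed < repeat < borderline < good < excellent. The bin edges are not preserved in this export; the edges below are an assumption, chosen only because they reproduce the grades of the rows shown in this notebook (47 is failed, 54 is repeat, 69 and 72 are borderline, 76 is good, 90 is excellent). A sketch with `pd.cut`:

```python
import pandas as pd

math = pd.Series([72, 69, 90, 47, 76, 54, 0, 100])

labels = ["failed", "repeat", "borderline", "good", "excellent"]
# Hypothetical edges: [0, 50), [50, 60), [60, 75), [75, 90), [90, 101).
# The last edge is 101 so that a perfect score of 100 still falls in a bin.
edges = [0, 50, 60, 75, 90, 101]

# right=False makes each bin include its left edge and exclude its right edge.
math_grade = pd.cut(math, bins=edges, labels=labels, right=False)

print(math_grade.tolist())
print(math_grade.value_counts())
```

Because `pd.cut` returns an ordered categorical by default, comparisons such as `math_grade >= "good"` respect the grade order.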

We print the counts of the individual grades by assigned category

Dropping the original math column from the dataset

Converting the values of selected suitable columns to binary values
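
The columns `isMale`, `lunchStandard`, and `preparationCompleted` in the later outputs are 0/1 flags derived from two-valued text columns. One way to produce them, sketched here on the first four rows of the dataset, is a comparison followed by an integer cast; the author's exact code is not preserved.

```python
import pandas as pd

# First four rows of the three two-valued columns.
df = pd.DataFrame({
    "gender": ["female", "female", "female", "male"],
    "lunch": ["standard", "standard", "standard", "free/reduced"],
    "preparation": ["none", "completed", "none", "none"],
})

# Each comparison yields a boolean Series; astype(int) turns True/False into 1/0.
binary = pd.DataFrame({
    "isMale": (df["gender"] == "male").astype(int),
    "lunchStandard": (df["lunch"] == "standard").astype(int),
    "preparationCompleted": (df["preparation"] == "completed").astype(int),
})
print(binary)
```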

Listing the unique values of the Parent and Race columns so that they can be transformed into numeric values that can be manipulated further

To encode the remaining categorical columns, we use the "get_dummies" function here
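
`pd.get_dummies` performs one-hot encoding: each distinct value becomes its own 0/1 column named `<column>_<value>`, which matches names such as `race_group A` in this notebook's output. A small sketch on the first four rows; `dtype=int` is passed explicitly because the default dtype of the indicator columns differs between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "race": ["group B", "group C", "group B", "group A"],
    "parents": ["bachelor's degree", "some college", "master's degree", "associate's degree"],
})

# One indicator column per distinct value, prefixed with the source column name.
dummies = pd.get_dummies(df, columns=["race", "parents"], dtype=int)
print(dummies.columns.tolist())
print(dummies)
```

Every row has exactly one 1 among the `race_*` columns and one among the `parents_*` columns.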

Normalizing the score values
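
In the tables of this notebook a reading score of 72 becomes 0.72 and 95 becomes 0.95, which is consistent with dividing by the maximum possible score of 100. The original code is not preserved, so the following is only a sketch of that reading.

```python
import pandas as pd

scores = pd.DataFrame({"reading": [72, 90, 95], "writing": [74, 88, 93]})

# Scale from the 0-100 range to 0-1 by dividing by the maximum possible score.
normalized = scores / 100
print(normalized)
```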

Creating the "data_prepared" dataframe, assembled for our needs

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

CREATING THE TRAINING AND TEST VARIABLES AND THE MODEL
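
The training labels printed in this notebook have length 910 out of 1300 rows, which is consistent with a 70/30 split. A sketch with scikit-learn's `train_test_split` on placeholder data of the same shape (16 feature columns, 5 classes) follows; `random_state=42` and `stratify=y` are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Placeholder features and labels with the same shape as the prepared data.
X = rng.random((1300, 16))
y = rng.choice(["failed", "repeat", "borderline", "good", "excellent"], size=1300)

# 30 % of the rows are held out for testing; stratify keeps class shares similar
# in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```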

BASELINE model to gauge the potential effectiveness of our model
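
The two `DummyClassifier` scores in this notebook (about 0.267 for `stratified` and 0.212 for `uniform`) can be checked against what random guessing is expected to achieve, using only the class counts printed by the notebook. The short calculation below is a sanity check, not part of the original notebook.

```python
# Class counts of math_grade as printed in this notebook.
counts = {"borderline": 490, "good": 312, "repeat": 240, "failed": 180, "excellent": 78}

total = sum(counts.values())
shares = {grade: n / total for grade, n in counts.items()}

# "stratified" guesses class i with probability p_i, so it is right with
# probability p_i * p_i for that class; summed over classes this is sum(p_i^2).
expected_stratified = sum(p * p for p in shares.values())

# "uniform" guesses each of the k classes with probability 1/k, so the expected
# accuracy is 1/k regardless of the class shares.
expected_uniform = 1 / len(counts)

# Always predicting the most frequent class is a stronger reference point.
majority_share = max(shares.values())

print(f"stratified: {expected_stratified:.4f}")
print(f"uniform:    {expected_uniform:.4f}")
print(f"majority:   {majority_share:.4f}")
```

The observed dummy scores are close to these expectations, and any trained model should clearly exceed the majority share of roughly 0.377 to be considered useful.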

Logistic regression
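
The `ConvergenceWarning` in this notebook's output suggests either scaling the data or raising `max_iter`, and the printed `Pipeline` shows a `StandardScaler` followed by `LogisticRegression`. A sketch of that combination on synthetic stand-in data follows; the data, `max_iter=1000`, and `cv=5` are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared data: 16 features, 5 classes.
X, y = make_classification(
    n_samples=600, n_features=16, n_informative=8, n_classes=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# make_pipeline names the steps after their classes in lower case, which gives
# 'standardscaler' and 'logisticregression' as in the notebook's output.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(max_iter=1000, random_state=42),
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)                   # held-out accuracy
cv_scores = cross_val_score(model, X_train, y_train, cv=5)    # 5-fold accuracy
print(test_accuracy, cv_scores.mean())
```

Putting the scaler inside the pipeline means it is refitted on the training part of every fold, so no information from the validation part leaks into the scaling.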

Random Forest

SVC

XGBoost
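
The `UserWarning` in this notebook's output asks for labels encoded as integers starting at 0. Since `math_grade` is an ordered categorical, one way to obtain such labels while keeping the grade order is to take the category codes with pandas. This sketch covers only the encoding and does not call xgboost.

```python
import pandas as pd

order = ["failed", "repeat", "borderline", "good", "excellent"]
y = pd.Series(["borderline", "borderline", "excellent", "failed", "good"])

# Ordered categorical: the integer code of each label is its position in `order`.
y_cat = pd.Categorical(y, categories=order, ordered=True)
y_codes = y_cat.codes            # 0 = failed ... 4 = excellent
print(list(y_codes))

# Predictions returned as integers can be mapped back to grade names the same way.
decoded = pd.Categorical.from_codes(y_codes, categories=order, ordered=True)
print(list(decoded))
```

Unlike an alphabetical label encoder, this keeps the numeric codes in the same order as the grades.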

GridSearch and hyperparameter tuning for RandomForestClassifier
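
The search log in this notebook ("Fitting 3 folds for each of 16 candidates") corresponds to a 4x4 grid over `max_depth` and `n_estimators` with 3-fold cross-validation. The sketch below shows the same pattern with `GridSearchCV` on synthetic stand-in data and a reduced grid so that it runs quickly; the data and the reduced values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the prepared data: 16 features, 5 classes.
X, y = make_classification(
    n_samples=300, n_features=16, n_informative=8, n_classes=5, random_state=42
)

# Reduced grid for speed. The notebook's output lists max_depth in
# {1, 5, 10, 20} and n_estimators in {10, 100, 1000, 2000}.
param_grid = {"max_depth": [1, 5], "n_estimators": [10, 50]}

grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=3,
    scoring="accuracy",
)
grid.fit(X, y)

# Report in the same "mean (std) with: params" form as the notebook's output.
print(f"Best: {grid.best_score_:.6f} using {grid.best_params_}")
results = grid.cv_results_
for mean, std, params in zip(
    results["mean_test_score"], results["std_test_score"], results["params"]
):
    print(f"{mean:.6f} ({std:.6f}) with: {params}")
```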

Using GridSearch to find the key hyperparameters for the logistic regression model
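
The rows scoring 0.000000 in this search are most likely not real accuracies: they fall exactly on solver and penalty pairs that scikit-learn does not support (`l1` with `newton-cg`, `lbfgs`, or `sag`; no penalty with `liblinear`), where the fit fails and a fallback score of 0 is recorded. The sketch below filters the full 5x3x5 grid down to the supported pairs in plain Python. The `supported` table reflects the penalties relevant to this grid, and the string `'none'` is the spelling used by the scikit-learn version in this notebook.

```python
from itertools import product

C_values = [100, 10, 1.0, 0.1, 0.01]
penalties = ["l2", "l1", "none"]
solvers = ["newton-cg", "lbfgs", "liblinear", "sag", "saga"]

# Penalties each solver supports, restricted to the ones used in this grid.
supported = {
    "newton-cg": {"l2", "none"},
    "lbfgs": {"l2", "none"},
    "liblinear": {"l1", "l2"},
    "sag": {"l2", "none"},
    "saga": {"l1", "l2", "none"},
}

all_combos = list(product(C_values, penalties, solvers))
valid = [(C, p, s) for C, p, s in all_combos if p in supported[s]]

# GridSearchCV also accepts a list of dicts as param_grid, so each valid
# combination can be passed as its own single-point grid.
param_grid = [{"C": [C], "penalty": [p], "solver": [s]} for C, p, s in valid]

print(len(all_combos), len(valid), len(all_combos) - len(valid))
```

Of the 75 combinations, 55 are valid and 20 are not, matching the 20 zero-score rows in the output. With no penalty the value of `C` has no effect, which is why those rows repeat the same score for every `C`.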

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Conclusion

Response to the opponent's review

Lessons learned

Notes

	gender	race/ethnicity	parental level of education	lunch	test preparation course	math score	reading score	writing score
0	female	group B	bachelor's degree	standard	none	72	72	74
1	female	group C	some college	standard	completed	69	90	88
2	female	group B	master's degree	standard	none	90	95	93
3	male	group A	associate's degree	free/reduced	none	47	57	44
4	male	group C	some college	standard	none	76	78	75
5	female	group B	associate's degree	standard	none	71	83	78
6	female	group B	some college	standard	completed	88	95	92
7	male	group B	some college	free/reduced	none	40	43	39
8	male	group D	high school	free/reduced	completed	64	64	67
9	female	group B	high school	free/reduced	none	38	60	50

	isMale	race	parents	lunchStandard	preparationCompleted	reading	writing	math_grade
0	0	group B	bachelor's degree	1	0	72	74	borderline
1	0	group C	some college	1	1	90	88	borderline
2	0	group B	master's degree	1	0	95	93	excellent
3	1	group A	associate's degree	0	0	57	44	failed

	race_group A	race_group B	race_group C	race_group D	race_group E	parents_associate's degree	parents_bachelor's degree	parents_high school	parents_master's degree	parents_some college	parents_some high school
0	0	1	0	0	0	0	1	0	0	0	0
1	0	0	1	0	0	0	0	0	0	1	0
2	0	1	0	0	0	0	0	0	1	0	0
3	1	0	0	0	0	1	0	0	0	0	0
4	0	0	1	0	0	0	0	0	0	1	0
...	...	...	...	...	...	...	...	...	...	...	...
1295	0	1	0	0	0	1	0	0	0	0	0
1296	0	0	1	0	0	0	0	0	0	0	1
1297	0	0	1	0	0	0	0	1	0	0	0
1298	0	1	0	0	0	0	1	0	0	0	0
1299	1	0	0	0	0	0	0	1	0	0	0

	race	parents	lunchStandard	preparationCompleted	reading	writing	math_grade
0	group B	bachelor's degree	1	0	0.72	0.74	borderline
1	group C	some college	1	1	0.90	0.88	borderline
2	group B	master's degree	1	0	0.95	0.93	excellent

	isMale	lunchStandard	preparationCompleted	reading	writing	math_grade	race_group A	race_group B	race_group C	race_group D	race_group E	parents_associate's degree	parents_bachelor's degree	parents_high school	parents_master's degree	parents_some college	parents_some high school
0	0.0	1.0	0.0	0.72	0.74	borderline	0.0	1.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0	0.0
1	0.0	1.0	1.0	0.90	0.88	borderline	0.0	0.0	1.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0
2	0.0	1.0	0.0	0.95	0.93	excellent	0.0	1.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0
3	1.0	0.0	0.0	0.57	0.44	failed	1.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0	0.0	0.0
4	1.0	1.0	0.0	0.78	0.75	good	0.0	0.0	1.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
1295	1.0	0.0	0.0	0.60	0.59	borderline	0.0	1.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0	0.0	0.0
1296	1.0	0.0	0.0	0.38	0.32	failed	0.0	0.0	1.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0
1297	0.0	1.0	0.0	0.79	0.77	borderline	0.0	0.0	1.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0
1298	1.0	1.0	0.0	0.62	0.56	borderline	0.0	1.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0	0.0
1299	1.0	1.0	1.0	0.50	0.47	repeat	1.0	0.0	0.0	0.0	0.0	0.0	0.0	1.0	0.0	0.0	0.0