improving accuracy of logistic model

Question

I am trying to reproduce results from one paper, where authors minimized the following loss function \begin{align} \min_{w \in R^d} \frac{1}{n} \sum_{i \in [n]} log(1 + exp(-y_ix_i^Tw))+\frac{\lambda}{2}\|w\|^2, \end{align} where $w$ are weights are $\lambda$ is regularization parameter for ijcnn1 dataset.

This dataset is specific by unbalanced data (90%-0, 10%-1). As a preprocessing step, I applied MinMaxScaler and StandardScaler.

I use model written in keras, which looks simply as:

model = keras.Sequential([
    keras.layers.Dense(1, activation="sigmoid", kernel_initializer='uniform', kernel_regularizer=regularizers.l2(1e-4), use_bias=True)
])
sgd = optimizers.SGD(lr=0.05, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=1000)

However, I manage to obtain only 91% accuracy at the best. Looking at the prediction, I observe that my model learns to predicts almost everything to be zero. I also tried using class_weight, but it did not seem to help. Does anybody have any suggestions, how to obtain better results?

Are your classes balanced? Or do you have a disproportionate amount of zeros? — Leevo
– Leevo, Commented Mar 30, 2020 at 10:00
no, classes are not balanced, there is disproportionate amount of zeros — user93607
– user93607, Commented Mar 30, 2020 at 10:15

Leevo · Accepted Answer · 2020-03-30 10:19:12Z

1

If you have a disproportionate amount of zeros, it means you models doesn't have enough data in order to learn how to correctly classify observations. Because it sees zeros almost all the time, it probably has learned to output zeros.

The main thing you can do to solve this problem is to train your model using mini-batches, and build them by sampling 0- and 1-observations with equal probability - in other words, attributing greater weight to 1-observations compared to zeros. In that way, your model will be fed with balanced data and will learn to classify them correctly.

answered Mar 30, 2020 at 10:19

Leevo

6,4753 gold badges18 silver badges52 bronze badges

$\begingroup$ Hi, thanks for your recommendation. It helped but I still can not reach higher accuracy than 93%. Are there any other tricks, which I could use? $\endgroup$

user93607
– user93607

2020-03-30 14:26:30 +00:00
Commented Mar 30, 2020 at 14:26
$\begingroup$ Ensembles can usually boost performance. Train more than one model, possibly with different features/architectures, and combine their prediction (simple average should do the job). $\endgroup$

Leevo
– Leevo

2020-03-30 14:58:41 +00:00
Commented Mar 30, 2020 at 14:58

Add a comment |

Stack Exchange Network

improving accuracy of logistic model

1 Answer 1

Your Answer

Hot Network Questions

improving accuracy of logistic model

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Related

Hot Network Questions