I am trying to reproduce results from a paper in which the authors minimize the following loss function
\begin{align}
\min_{w \in \mathbb{R}^d} \frac{1}{n} \sum_{i \in [n]} \log\left(1 + \exp(-y_i x_i^T w)\right) + \frac{\lambda}{2}\|w\|^2,
\end{align}
where $w$ is the weight vector and $\lambda$ is the regularization parameter, on the ijcnn1 dataset.
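Written out as code, this is just L2-regularized logistic loss on $\pm 1$ labels; a minimal NumPy sketch (my own restatement of the formula, not code from the paper) is:

import numpy as np

def paper_objective(w, X, y, lam):
    # X: (n, d) feature matrix, y: labels in {-1, +1}, w: (d,) weight vector
    margins = y * (X @ w)
    return np.mean(np.log1p(np.exp(-margins))) + 0.5 * lam * np.dot(w, w)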
This dataset is notably unbalanced (roughly 90% of the labels are 0 and 10% are 1). As a preprocessing step, I applied MinMaxScaler and StandardScaler.
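In case the data handling matters, this is roughly how I load and scale the data. ijcnn1 is distributed in LIBSVM format, so I read it with scikit-learn's load_svmlight_file; the file names are local paths on my machine, and the remapping of the raw ±1 labels to {0, 1} is my own step:

from sklearn.datasets import load_svmlight_file
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X_train, y_train = load_svmlight_file("ijcnn1.tr")   # local paths, adjust as needed
X_test, y_test = load_svmlight_file("ijcnn1.t", n_features=X_train.shape[1])

# map labels from {-1, +1} to {0, 1} for binary_crossentropy
y_train = (y_train + 1) / 2
y_test = (y_test + 1) / 2

scaler = StandardScaler()                             # I experimented with MinMaxScaler as well
X_train = scaler.fit_transform(X_train.toarray())
X_test = scaler.transform(X_test.toarray())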
My model is written in Keras and looks simply like this:
import keras
from keras import optimizers, regularizers

model = keras.Sequential([
    # single logistic unit: sigmoid output with an L2 penalty on the weights
    keras.layers.Dense(1, activation="sigmoid", kernel_initializer='uniform',
                       kernel_regularizer=regularizers.l2(1e-4), use_bias=True)
])
sgd = optimizers.SGD(lr=0.05, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=1000)
However, the best accuracy I manage to obtain is about 91%. Looking at the predictions, I observe that the model learns to predict almost everything as zero. I also tried using class_weight (roughly as sketched below), but it did not seem to help. Does anybody have suggestions on how to obtain better results?
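For reference, this is approximately how I passed class_weight; the "balanced" weighting via scikit-learn shown here is my assumption about a reasonable choice, not something prescribed by the paper:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# "balanced" weights are roughly inversely proportional to class frequency
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = {int(c): w for c, w in zip(classes, weights)}

model.fit(train_dataset, epochs=1000, class_weight=class_weight)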