
Questions tagged [perceptron]

An early example of a neural network with no hidden layers and a single (possibly nonlinear) output unit.

0 votes
0 answers
38 views

Let $\mathbf{x}_k\in \mathbb{R}^{n\times 1}$ be an $n$-dimensional input to a multi-layer perceptron (MLP) at time $t = k$. The output is $\mathbf{x}_{k+1}\in \mathbb{R}^{n\times 1}$ at time $t = k+1$. ...
user146290
1 vote
0 answers
47 views

In the context of linear classifiers, such as the perceptron or logistic regression, I understand that the decision boundary is defined by a linear combination of input features and weights, plus a ...
Narges Ghanbari
1 vote
2 answers
136 views

I am reading Machine Learning with PyTorch and Scikit-Learn by Sebastian Raschka. While plotting the decision boundary (a line in this case, since the number of features considered = 2) I can't ...
tripma
5 votes
2 answers
247 views

I am confused about the criteria that determine whether a model is linear or not. As far as I understand, the following statements are equivalent: a model is linear; the output class label is a linear ...
TarS
0 votes
1 answer
105 views

Imagine predicting a BMI index like 1, 2, 3, 4, 5 with weight and height as input. I know it can easily be done with other methods. Also, I have to use the sigmoid function, and I am really new to this. I ...
Lu Phone Maw
4 votes
1 answer
135 views

In the book "Understanding Machine Learning" by Shalev-Shwartz and Ben-David, the authors describe the Batch Perceptron algorithm as follows: However, in the book "Python Machine Learning, ...
Tran Khanh
2 votes
0 answers
142 views

How would I prove the Perceptron mistake bound is tight? Avrim Blum’s lecture notes claim that the upper bound on the number of mistakes is $\left(\frac{R}{\gamma}\right)^2$, but I don’t understand how to prove this is the mistake ...
Vum
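For context, a sketch of the classical upper-bound argument behind the question above (standard material, not from the question itself), assuming every example satisfies $\|\mathbf{x}\| \le R$ and some unit vector $\mathbf{w}^*$ separates the data with margin $\gamma$, i.e. $y(\mathbf{w}^*\cdot\mathbf{x}) \ge \gamma$:

$$\mathbf{w} \leftarrow \mathbf{w} + y\,\mathbf{x} \ \text{ on each mistake} \quad\Longrightarrow\quad \mathbf{w}\cdot\mathbf{w}^* \ge M\gamma \quad\text{and}\quad \|\mathbf{w}\|^2 \le M R^2$$

after $M$ mistakes, since a mistake means $y(\mathbf{w}\cdot\mathbf{x}) \le 0$, so each update adds at least $\gamma$ to $\mathbf{w}\cdot\mathbf{w}^*$ and at most $R^2$ to $\|\mathbf{w}\|^2$. Cauchy–Schwarz then gives $M\gamma \le \mathbf{w}\cdot\mathbf{w}^* \le \|\mathbf{w}\| \le \sqrt{M}\,R$, hence $M \le (R/\gamma)^2$. Tightness additionally requires exhibiting a data sequence that forces this many mistakes.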
1 vote
1 answer
90 views

I am using a perceptron to solve the binary classification problem A vs B. For this I have to map the actual values of A and B to either 1 or -1 to be able to use the perceptron. Does it ...
user100000
2 votes
1 answer
665 views

I was revisiting neural network basics from this post. The perceptron follows the equation: $$y = \begin{cases} 1 & \text{if } \sum_{i=1}^n w_i x_i \geq \theta \\ 0 & \text{otherwise} \end{cases}$$ ...
RajS
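The thresholded unit in the excerpt above can be sketched in a few lines of Python (variable names are illustrative):

```python
def perceptron_output(w, x, theta):
    """Classic threshold unit: fire (1) iff the weighted sum reaches theta."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= theta else 0

# Example: an AND gate with weights (1, 1) and threshold 2
print(perceptron_output([1, 1], [1, 1], 2))  # -> 1
print(perceptron_output([1, 1], [1, 0], 2))  # -> 0
```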
3 votes
0 answers
92 views

I met "Tikhonov regularization" in two textbooks. The first is "Pattern Recognition and Machine Learning" by Christopher M. Bishop. On page 267 of his book, the regularized ...
zzzhhh
1 vote
1 answer
1k views

I am trying to understand the differences between the MP neuron and the perceptron. Is my understanding right that the MP neuron differs mathematically only in the activation function? I.e. the MP ...
yemy
5 votes
2 answers
284 views

I am learning about perceptrons and how they work. I read that each weight $w_j$ is updated according to the equation $w_j := w_j + \Delta w_j$, where $\...
unno
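The truncated update in the excerpt above is usually completed as $\Delta w_j = \eta\,(y - \hat{y})\,x_j$, the standard perceptron learning rule (the symbols $\eta$, $y$, $\hat{y}$ are assumptions about the cut-off text). A one-step sketch:

```python
def perceptron_update(w, x, target, predicted, eta=0.1):
    """Apply w_j := w_j + eta * (target - predicted) * x_j to every weight."""
    return [wj + eta * (target - predicted) * xj for wj, xj in zip(w, x)]

# One misclassified sample (target 1, predicted 0) nudges the weights toward x
w = perceptron_update([0.0, 0.0], x=[1.0, 2.0], target=1, predicted=0)
print(w)  # -> [0.1, 0.2]
```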
1 vote
0 answers
197 views

I got these 3 different plots of decision boundaries using 3 different parameters for hidden_layer_sizes of the MLPClassifier from sklearn on XOR gate. ...
wyc
3 votes
1 answer
185 views

How is equation 5.80 in _Pattern Recognition and Machine Learning_ by Bishop derived?
ironhide012
3 votes
1 answer
320 views

It's well known that when dealing with models (without regularization) the main assumption is $n \gg p$, where $p$ is the number of features in the dataset. Let's suppose that we have 1,000,000 ...
Alberto
4 votes
1 answer
200 views

I am new to Stack Overflow and deep learning, so I hope I am doing this the right way. I tried to find the solution myself but have not been successful, so I am seeking some help. This is the ...
Bubo
1 vote
0 answers
1k views

I have an imbalanced data (n = 600, about 97% majority and 3% minority) with 20 features and a binary outcome. The data has been split into a training set and a test set (80%/20%). I used H2o autoML ...
user145331
2 votes
0 answers
84 views

While studying machine learning I have come across 2 learning models: linear regression and the perceptron. I know the difference between the learning algorithms they use, but the hypothesis sets look the same to ...
Dazckel
1 vote
0 answers
99 views

Say we have a relationship $z = Wx$ for a multi-layer perceptron, where $z$ and $x$ are $n$-dimensional vectors. When we find $\frac{dz}{dx}$, I would assume this would just be $W$, not $W^T$. I was ...
bebop
1 vote
1 answer
687 views

I'm stumped as to why this example doesn't do a better job fitting the data; I suspect it has to do with my interpretation of the perceptron object's coefficients. Note that I'm interested in the ...
eretmochelys
1 vote
1 answer
327 views

To help me with some understanding, I'm trying to learn the Logical AND and Logical OR using Linear Regression trained over the following data: ...
Christian
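One way to try what the question above describes (a minimal sketch with numpy; the exact dataset layout is an assumption, since the question's table is truncated): fit ordinary least squares on the four gate rows and threshold the linear output at 0.5.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Xb = np.hstack([np.ones((4, 1)), X])  # prepend a bias column

results = {}
for name, y in [("AND", [0, 0, 0, 1]), ("OR", [0, 1, 1, 1])]:
    # least-squares fit of a linear model, then threshold at 0.5
    w, *_ = np.linalg.lstsq(Xb, np.array(y, dtype=float), rcond=None)
    results[name] = (Xb @ w >= 0.5).astype(int).tolist()

print(results)  # -> {'AND': [0, 0, 0, 1], 'OR': [0, 1, 1, 1]}
```

Both gates are linearly separable, so the thresholded regression recovers the truth tables; XOR would fail under the same scheme.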
0 votes
0 answers
78 views

I thought my MLP (multi-layer perceptron)'s accuracy would increase after tuning. However, the accuracy dropped. Then someone told me that I should add Dropout layers with a 50% dropping rate. I did that. ...
user366312
1 vote
1 answer
833 views

The paper https://arxiv.org/abs/2110.11309 makes the following claim at the end of page 3: The gradient of loss $L$ with respect to weights $W_l$ of an MLP is a rank-1 matrix for each of the $B$ batch ...
Andrew
2 votes
1 answer
370 views

I'm trying to better understand the formalism behind the following compact formulation of a single-layer perceptron. If we consider $V=\mathbb{R}^d$, then $$\hat{f}(x_1, \dots, x_d) = \sum_{i=1}^N c_i\...
James Arten
0 votes
1 answer
247 views

This is about the contents of sections 1.2.1 and 1.2.1.1 of the book "Neural Networks and Deep Learning: A Textbook". The link to the sections is here. The question arises from the following ...
zzzhhh
1 vote
1 answer
241 views

From Goodfellow et al.'s Deep Learning book: Several key concepts arose during (...) the 1980s that remain central to today’s deep learning. One of these concepts is that of distributed ...
Saucy Goat
1 vote
1 answer
299 views

I am reading code in the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow" by Aurélien Géron ...
Tan Phan
22 votes
1 answer
7k views

The ReLU function is commonly used as an activation function in machine learning, as are its modifications (ELU, leaky ReLU). The overall idea of these functions is the same: before ...
MefAldemisov
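The family mentioned in the excerpt above can be written down directly (the $\alpha$ values below are common illustrative defaults, not prescribed by the question):

```python
import math

def relu(x):
    # Identity for positive inputs, zero otherwise
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Small nonzero slope for negative inputs avoids "dead" units
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    # Smooth saturation toward -alpha for negative inputs
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

print(relu(-2.0), leaky_relu(-2.0), elu(-2.0))
```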
2 votes
1 answer
285 views

I have a following thought problem involving perceptron and binary classification that I wonder if anyone has thought about before. This is not from any textbook or reference, although I doubt I'm the ...
Olórin
0 votes
1 answer
43 views

I understand the math but I want to make sure I understand the mapping back to real world scenarios. Thinking about it logically, I cannot think of a real world scenario where you would have a ...
Grant Curell
1 vote
1 answer
1k views

I'm reading Hands-On Machine Learning and the author states that: You may have noticed the fact that the Perceptron learning algorithm strongly resembles Stochastic Gradient Descent. In fact, Scikit-...
Ng Lok Chun
1 vote
0 answers
161 views

I am trying to build a binary classifier using an MLP with the Keras package in R. My question is, why does the package require the labels to be one-hot vectors? For example, the value 1 will be the ...
383930283423
3 votes
1 answer
206 views

I am trying to predict the energy demand (Wh) for the next two weeks, per hour. The dataset I have contains the energy demand for each hour of each day since 2019, something like this: ...
ivan
1 vote
1 answer
970 views

I've been following an algorithm described on a book called Knowledge Discovery with Support Vector Machines by Lutz H. Hamel. In the book, there is this learning algorithm for a single perceptron ...
Burak Kaymakci
3 votes
1 answer
159 views

I've always been a bit confused when it comes to deep learning terminology. Is the definition of the perceptron, whether single-layer or multi-layer, associated with a specific type of activation ...
Kamal Raydan
1 vote
0 answers
120 views

I'm designing an MLP classifier and I've been noticing that: using a very shallow network, or one in which at least one layer has a small number of neurons, yields bad performance; using a deep network ...
Mefitico
1 vote
1 answer
601 views

If a single perceptron is made to work like logistic regression in the following way, how correct is it to say that I made the perceptron work as logistic regression? The question came to mind as ...
Ajey
1 vote
0 answers
116 views

Matrices are good objects to store connections between dimensions/entities. However, matrix computation is often time-consuming, and sometimes wasteful if the matrix is too sparse. Also, thinking about the ...
metron
0 votes
0 answers
87 views

Why don't I see a GRU anywhere with more than one layer of perceptrons inside? It seems obvious to try to put more layers in there, but I don't see anyone doing that.
xvel
2 votes
1 answer
913 views

In a binary classification problem, if both logistic regression and a single perceptron use the sigmoid function, what's the difference in classification results, since they will have the same decision ...
denali
1 vote
1 answer
433 views

I have been wondering whether a convolution can be represented in terms of an MLP. We can say that in convolution we have shared parameters between different neurons. But how to express this ...
Nomaan Qureshi
8 votes
1 answer
2k views

In most work I've seen, MLPs (multilayer perceptron, the most typical feedforward neural network) and RBF (radial basis function) networks are compared as distinct models, where MLP neuron outputs $\...
Christabella Irwanto
1 vote
1 answer
2k views

I am training a neural network for a regression task, where the dependent variable varies in the range from $0$ to $10$. Unsurprisingly, with the test data set, I obtain predictions that fall ...
Roger V.
2 votes
1 answer
911 views

I am wondering if it is at all possible to plot a 4D perceptron line in 2D. Obviously, it would be impossible to observe it with all of its original information, but is there a way for me to observe ...
Max
2 votes
1 answer
430 views

The perceptron training algorithm is summarized as: apply the inputs and calculate the output $y$; compare with the desired output $y_d$ and calculate the error $e = y - y_d$; update the weights based on the ...
Osama El-Ghonimy
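The three steps listed in the excerpt above can be sketched as one training loop (learning rate, epoch count, and dataset are illustrative; the weight update follows the excerpt's error convention $e = y - y_d$, so weights move along $-\eta\,e\,x$):

```python
def train_perceptron(data, eta=0.5, epochs=10):
    """data: list of (inputs, desired_output) pairs; returns (weights, bias)."""
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y_d in data:
            # Step 1: apply the inputs and calculate the output y
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
            # Step 2: compare with the desired output, error e = y - y_d
            e = y - y_d
            # Step 3: update the weights (and bias) based on the error
            w = [wi - eta * e * xi for wi, xi in zip(w, x)]
            b -= eta * e
    return w, b

# Learning the AND function, which is linearly separable
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(and_data)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
         for x, _ in and_data]
print(preds)  # -> [0, 0, 0, 1]
```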
5 votes
2 answers
2k views

In the SciKit documentation of the MLP classifier, there is the early_stopping flag, which allows stopping the learning if there is no improvement over several ...
volperossa
0 votes
1 answer
853 views

I'm wondering if anybody can explain how Rosenblatt reached his formula for updating the weights of his Perceptron: $\textbf{w}_{t+1} = \textbf{w}_{t} +\eta ( y_j - \hat{y}_j ) \textbf{x}_j$ It seems ...
Berthrand Eros
1 vote
1 answer
253 views

I want to use a multilevel logistic regression for a double purpose: estimating the value of coefficients to explain a phenomenon. At the same time, I want to split the data through cross-validation ...
Andres Martinez