The code implements the Perceptron, one of the earliest machine learning algorithms for binary classification. It is a linear classifier that:
- Learns a decision boundary by adjusting weights iteratively;
- Uses a threshold-based activation function (step function);
- Updates weights only when misclassifications occur.
- Weights: Learned coefficients for the input features (`self.weights`). They determine each feature's importance in decision-making;
- Bias: Offset term (`self.bias`) allowing the model to shift the decision boundary away from the origin;
- Learning Rate (`learning_rate`): Controls the update magnitude during training. A smaller value (default = 0.0001) makes convergence slower but more stable;
- Epochs: Number of full passes over the training data. Capping the epoch count prevents infinite loops on non-separable data.
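Taken together, these hyperparameters suggest a constructor along the following lines. This is a sketch: the attribute names follow the identifiers quoted above, while the signature order and the `epochs` default are assumptions.

```python
class Perceptron:
    def __init__(self, num_inputs: int, learning_rate: float = 0.0001,
                 epochs: int = 100, bias: float = 0.0) -> None:
        # One learned coefficient per input feature, starting at zero
        self.weights = [0.0] * num_inputs
        # Trainable offset that shifts the boundary away from the origin
        self.bias = bias
        # Small default learning rate: slower but more stable convergence
        self.learning_rate = learning_rate
        # Cap on full passes over the data (assumed default of 100);
        # guards against looping forever on non-separable data
        self.epochs = epochs
```

For example, `Perceptron(num_inputs=2)` starts with `weights == [0.0, 0.0]` and `bias == 0.0`.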
- Converts the weighted sum to a binary output (0/1):

```python
def activation_function(self, weighted_sum: float) -> int:
    return 1 if weighted_sum >= 0 else 0
```

- Creates a non-linear decision threshold while maintaining a linear decision boundary.
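As a quick standalone check of the threshold behavior, the same step function rewritten as a free function:

```python
# The step function above, minus `self`, for a quick sanity check
def activation_function(weighted_sum: float) -> int:
    return 1 if weighted_sum >= 0 else 0

print(activation_function(0.7))   # 1: non-negative sums map to class 1
print(activation_function(-0.3))  # 0: negative sums map to class 0
print(activation_function(0.0))   # 1: the boundary itself belongs to class 1
```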
- Epoch: Full iteration over the entire dataset (the outer loop in `train`);
- Early Stopping: Checks the `converged` flag to terminate early once an epoch produces no errors.
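The loop structure described above can be sketched as a standalone function. The `converged` early stop and the epoch cap come from the description; the procedural form and the `learning_rate=0.1` default here are illustrative, not the article's code.

```python
def train(weights, bias, dataset, desired_outputs,
          learning_rate=0.1, epochs=100):
    """One possible shape for the perceptron training loop."""
    for epoch in range(epochs):
        converged = True  # assume a clean pass until an error occurs
        for inputs, target in zip(dataset, desired_outputs):
            weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
            prediction = 1 if weighted_sum >= 0 else 0
            error = target - prediction
            if error != 0:  # update only on a misclassification
                for i in range(len(weights)):
                    weights[i] += learning_rate * error * inputs[i]
                bias += learning_rate * error
                converged = False
        if converged:  # early stopping: no errors in a full epoch
            return weights, bias, epoch
    return weights, bias, epochs
```

On a linearly separable toy set such as an AND gate, this loop stops after a handful of epochs rather than running all 100.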
- Compute the weighted sum:

```python
weighted_sum = sum(x * w for x, w in zip(inputs, self.weights)) + self.bias
```

- Apply the activation function to get a prediction (0/1).
- Error Calculation: `error = target - prediction`;
- Weight Update Rule (Perceptron Learning Rule):

```python
self.weights[i] += self.learning_rate * error * inputs[i]
self.bias += self.learning_rate * error
```

- Updates weights immediately after each sample rather than using batch updates.
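A worked numerical example of a single update may help; all values below are made up for illustration.

```python
# Illustrative values (not from the article): one 2-feature sample
learning_rate = 0.1
weights = [0.2, -0.4]
bias = 0.0
inputs = [1.0, 2.0]
target = 1

weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
# 1.0*0.2 + 2.0*(-0.4) + 0.0 = -0.6  ->  prediction 0
prediction = 1 if weighted_sum >= 0 else 0
error = target - prediction  # 1 - 0 = 1: the sample was misclassified

# Learning rule: nudge each weight in proportion to its own input
for i in range(len(weights)):
    weights[i] += learning_rate * error * inputs[i]
bias += learning_rate * error

# weights is now roughly [0.3, -0.2] and bias 0.1; the weighted sum for
# this sample moves from -0.6 to 0.0, so it would now be predicted as 1.
```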
- Perceptron Convergence Theorem: Guarantees convergence to a solution if the data is linearly separable;
- Implementation: The `converged` flag checks whether all samples are correctly classified in an epoch.
- The perceptron can only learn linearly separable patterns;
- Decision Boundary: Defined by `w·x + b = 0`, a hyperplane in the input space.
- The bias acts as a trainable offset, equivalent to a weight attached to a constant input of 1.
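In two dimensions the hyperplane `w·x + b = 0` is just a line, which is easy to check numerically. The weight and bias values below are illustrative, not learned:

```python
# Illustrative 2-D parameters
w = [1.0, -2.0]
b = 0.5

def boundary_x2(x1: float) -> float:
    # Solve w[0]*x1 + w[1]*x2 + b = 0 for x2 (valid when w[1] != 0)
    return -(w[0] * x1 + b) / w[1]

# Any point on that line yields a weighted sum of exactly zero,
# i.e. it sits on the decision boundary:
x1 = 3.0
x2 = boundary_x2(x1)  # 1.75
print(w[0] * x1 + w[1] * x2 + b)  # 0.0
```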
- Weights and bias start at zero (`[0.0] * num_inputs`, `bias: float = 0.0`).
- Requires labeled data (`dataset` and `desired_outputs`);
- Learns by comparing predictions to ground truth.
- Prediction: Uses the learned weights/bias on new data via the `predict()` method.
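An end-to-end sketch tying the pieces together, assuming a minimal version of the class. The identifiers `dataset`, `desired_outputs`, and `predict()` follow the article; a larger learning rate than the 0.0001 default is used here so the toy example converges in a few epochs.

```python
class Perceptron:
    def __init__(self, num_inputs: int, learning_rate: float = 0.0001,
                 epochs: int = 100) -> None:
        self.weights = [0.0] * num_inputs
        self.bias = 0.0
        self.learning_rate = learning_rate
        self.epochs = epochs

    def activation_function(self, weighted_sum: float) -> int:
        return 1 if weighted_sum >= 0 else 0

    def predict(self, inputs) -> int:
        weighted_sum = sum(x * w for x, w in zip(inputs, self.weights)) + self.bias
        return self.activation_function(weighted_sum)

    def train(self, dataset, desired_outputs) -> None:
        for _ in range(self.epochs):
            converged = True
            for inputs, target in zip(dataset, desired_outputs):
                error = target - self.predict(inputs)
                if error != 0:
                    for i in range(len(self.weights)):
                        self.weights[i] += self.learning_rate * error * inputs[i]
                    self.bias += self.learning_rate * error
                    converged = False
            if converged:
                break

# Labeled data for an AND gate -- linearly separable, so the
# convergence theorem applies
dataset = [(0, 0), (0, 1), (1, 0), (1, 1)]
desired_outputs = [0, 0, 0, 1]

p = Perceptron(num_inputs=2, learning_rate=0.1)
p.train(dataset, desired_outputs)
print([p.predict(x) for x in dataset])  # [0, 0, 0, 1]
```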
- Binary Classification Only: Outputs 0/1 via step function;
- Linear Boundaries: Cannot handle complex/non-linear patterns;
- Sensitivity to Learning Rate: Poor choice can cause slow convergence or oscillations.
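The linear-boundary limitation can be made concrete with XOR, the classic non-separable case. The brute-force check below is not the article's code; it simply shows that no choice of `(w1, w2, b)` on a coarse grid reproduces XOR, for the algebraic reason given in the comment.

```python
from itertools import product

# XOR truth table: class 1 exactly when the inputs differ
xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# No linear threshold unit can fit XOR. Algebraically, the constraints
#   b < 0, w2 + b >= 0, w1 + b >= 0, w1 + w2 + b < 0
# are contradictory: adding the middle two gives w1 + w2 + b >= -b > 0.
# A grid search over candidate parameters illustrates the same thing:
grid = [i / 4 for i in range(-8, 9)]  # -2.0 .. 2.0 in steps of 0.25
found = any(
    all((1 if w1 * x1 + w2 * x2 + b >= 0 else 0) == t
        for (x1, x2), t in xor.items())
    for w1, w2, b in product(grid, repeat=3)
)
print(found)  # False: no linear boundary fits XOR
```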
