README.md: 14 additions & 2 deletions
@@ -141,7 +141,13 @@ This repository contains machine learning programs in the Python programming lan
141
141
<h3>ii) Clustering</h3>
142
142
--> Clustering is a data mining technique which groups unlabeled data based on their similarities or differences.<br><br>
143
143
--> Clustering algorithms are used to process raw, unclassified data objects into groups represented by structures or patterns in the information.<br><br>
144
-
--> Clustering algorithms can be categorized into a few types, specifically exclusive, overlapping, hierarchical, and probabilistic.<br><br>
144
+
--> Clustering algorithms can be categorized into a few types, specifically exclusive, overlapping, hierarchical, and probabilistic.<br>
145
+
<h3>Types of Clustering</h3>
146
+
<h4>1. K-Means Clustering</h4>
147
+
--> K-Means Clustering is an unsupervised machine learning algorithm.<br><br>
148
+
--> Its objective is to group data points into K clusters so that the variance within each cluster is minimized.<br><br>
149
+
--> The process involves iteratively assigning data points to the nearest cluster centroid and updating the centroids until convergence.<br><br>
150
+
--> K-Means is commonly applied in various domains such as customer segmentation, image compression, and anomaly detection.<br><br>
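
--> The assign-and-update loop described above can be sketched in plain Python. This is a minimal illustration rather than the code used in this repository: the function name `kmeans`, its parameters, the random initialisation, and the toy points are all assumptions made for the example.<br><br>

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Illustrative K-Means on a list of equal-length numeric tuples.

    Hypothetical helper for this README; requires k <= len(points).
    """
    rng = random.Random(seed)
    # Assumption: initialise centroids by picking k distinct data points.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Convergence: stop once no centroid has moved.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Toy 2-D points forming two visibly separate groups (made up for the demo).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

--> In practice a library implementation such as scikit-learn's `KMeans` is usually preferable, since it adds better initialisation and multiple restarts.<br><br>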
145
151
146
152
<h3>iii) Dimensionality Reduction</h3>
147
153
--> Dimensionality reduction is a technique used when the number of features, or dimensions, in a given dataset is too high.<br><br>
@@ -262,7 +268,13 @@ This repository contains machine learning programs in the Python programming lan
262
268
--> The dataset is typically split into 50,000 training images and 10,000 test images.<br><br>
263
269
--> Common classes in CIFAR-10 include airplanes, automobiles, birds, cats, dogs, and more.<br><br>
264
270
--> The primary purpose of CIFAR-10 is for image classification and object recognition.<br><br>
265
-
--> Researchers and developers often use it to benchmark and evaluate machine learning and deep learning algorithms.
271
+
--> Researchers and developers often use it to benchmark and evaluate machine learning and deep learning algorithms.<br>
272
+
273
+
<h2>Mall Customers Dataset</h2>
274
+
--> Dataset is taken from: <a href="https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial-in-python/data"><img src="https://cdn4.iconfinder.com/data/icons/logos-and-brands/512/189_Kaggle_logo_logos-1024.png" height=40 width=40 title="Mall Customers Dataset" alt="Mall Customers Dataset"> </a><br><br>
275
+
--> Contains Mall Customers data for Clustering.<br><br>
276
+
--> Gender, Age, Annual Income (k$) and Spending Score (1-100) columns are used to cluster data points.<br><br>
277
+
--> The dataset is already cleaned, so no preprocessing is required.
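
--> Even with clean data, a distance-based clustering algorithm needs numeric vectors, so the text-valued Gender column would need a numeric code before it can be used alongside the other three columns. The sketch below shows one way to build such vectors. The helper name `rows_to_features`, the `Male`/`Female` values, the 0/1 coding, and the two sample rows are assumptions for illustration and are not taken from the dataset.<br><br>

```python
import csv
import io


def rows_to_features(rows):
    """Turn rows (dicts keyed by column name) into numeric tuples.

    Hypothetical helper; column names follow the list given above.
    """
    # Assumption: Gender holds the strings "Male" / "Female".
    gender_code = {"Male": 0.0, "Female": 1.0}
    features = []
    for row in rows:
        features.append((
            gender_code[row["Gender"]],
            float(row["Age"]),
            float(row["Annual Income (k$)"]),
            float(row["Spending Score (1-100)"]),
        ))
    return features


# Made-up rows in the expected column layout, standing in for the real file.
SAMPLE_CSV = """Gender,Age,Annual Income (k$),Spending Score (1-100)
Male,30,40,50
Female,45,80,20
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
features = rows_to_features(rows)
print(features)
```

--> Because the columns sit on different scales, scaling them before clustering may also be worth considering.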