Jan 27, 2022 09:59
English term
k-means++ seeding
English
Tech/Engineering
Computers (general)
Detecting Meaningful Clusters from High-dimensional Data: A Strongly Consistent Sparse Center-based Clustering Approach
Algorithm 1 gives a formal description of the LW-kmeans algorithm. The algorithm is initiated by randomly
choosing the k initial cluster centroids from the n
datapoints. A k-means++ seeding [61] is also possible and
leads to slight improvement in the results as shown in
Section 6 of the supplement.
References
(k-means++) (seeding) | Helena Chavarria |
Reference comments
55 mins
Reference:
(k-means++) (seeding)
Theorem 1.1. For any set of data points, E[φ] ≤ 8(ln k + 2)φ_OPT.
This sampling is both fast and simple, and it already achieves approximation guarantees that k-means cannot. We propose using it to seed the initial centers for k-means, leading to a combined algorithm we call k-means++
https://theory.stanford.edu/~sergei/papers/kMeansPP-soda.pdf
I've no idea what it means, but it's two terms: 'k-means++' plus 'seeding'
--------------------------------------------------
Note added at 58 mins (2022-01-27 10:57:41 GMT)
--------------------------------------------------
In data mining, k-means++ is an algorithm for choosing the initial values (or "seeds") for the k-means clustering algorithm. It was proposed in 2007 by David Arthur and Sergei Vassilvitskii, as an approximation algorithm for the NP-hard k-means problem, a way of avoiding the sometimes poor clusterings found by the standard k-means algorithm. It is similar to the first of three seeding methods proposed, in independent work, in 2006 by Rafail Ostrovsky, Yuval Rabani, Leonard Schulman and Chaitanya Swamy. (The distribution of the first seed is different.)
https://en.wikipedia.org/wiki/K-means++
--------------------------------------------------
Note added at 3 hrs (2022-01-27 13:44:28 GMT)
--------------------------------------------------
No, I'm afraid I can't refer you to the 'best dictionary for understanding words'. That's one of the reasons why translation is so difficult. You might be lucky and find an online glossary but usually translators need to use places like ProZ, dictionaries, websites, personal experience and common sense. I'm sorry I can't help you.
Note from asker:
Thanks. Can you refer me to the best dictionary for understanding words?
Discussion
"Intelligent selection of initial centers in the K-means clustering algorithm to improve topic detection" (article in Persian)
https://jcsit.ir/article/49
There is also partial clustering seeding, and this is where k-means++ seeding comes in. The k-means++ algorithm generates the initial seeds (starting centroid points), which are then fed into the k-means algorithm instead of randomly chosen seeds. This generally results in more accurately defined clusters than starting from randomly chosen seeds.
https://www.csc.kth.se/utbildning/kth/kurser/DD143X/dkand13/... (pages 1-8)
and this link has some good graphics https://devopedia.org/k-means-clustering
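To make "fed into the k-means algorithm" concrete, here is a minimal sketch of the standard k-means (Lloyd) iterations that takes the seeds as an input, so it does not care whether they were chosen randomly or by k-means++. The function names `squared_distance` and `lloyd_kmeans` and the `max_iter` parameter are my own, picked for illustration; they do not come from any of the linked sources.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, seeds, max_iter=100):
    """Run k-means iterations starting from the supplied seeds.

    Hypothetical helper: returns (centers, assignment), where
    assignment[i] is the index of the center nearest to points[i].
    """
    centers = [tuple(s) for s in seeds]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        new_assignment = [
            min(range(len(centers)),
                key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # nothing moved, so the centers will not change either
        assignment = new_assignment
        # Update step: move each center to the mean of its points.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
    return centers, assignment
```

The only thing that changes between "plain" k-means and k-means++ is what you pass as `seeds`; the iterations themselves are the same.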
"Pick the first center randomly from the given points. After picking (i-1) centers, pick the ith center to be a point p with probability proportional to the square of the Euclidean distance of p to the closest previously (i − 1) chosen centers."
from https://www.google.co.uk/books/edition/Theory_and_Applicatio... (page 7)
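For anyone who finds code easier to read than the description, the seeding rule quoted above can be sketched roughly as follows. This is only an illustration using Python's standard library; the names `kmeans_pp_seeds` and `squared_distance`, the `rng` parameter, and the uniform fallback for the case where every point coincides with an already chosen center are my own assumptions, not part of the quoted text.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seeds(points, k, rng=None):
    """Choose k seeds from points with the k-means++ rule (a sketch)."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # First center: picked uniformly at random from the data points.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its closest center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Assumption: if all points sit on chosen centers,
            # fall back to a uniform pick.
            nxt = rng.choice(points)
        else:
            # Probability proportional to the squared distance to the
            # closest center chosen so far.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Only the new center can shrink a point's closest distance.
        d2 = [min(old, squared_distance(p, nxt))
              for p, old in zip(points, d2)]
    return centers
```

A call such as `kmeans_pp_seeds(data, 3, random.Random(42))` returns three of the original data points; points far from the centers already chosen are more likely to be picked, which is why the seeds tend to be spread out.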
Some more information at the bottom of this page
https://www.mathworks.com/help/stats/kmeans.html
and in this piece
https://medium.com/@srv96/kmeans-a-careful-seeding-technique...