Questions tagged [scalability]
The scalability tag has no summary.
33 questions
2 votes
0 answers
519 views
Is there a need to scale the Age attribute?
What is a good way to work with the 'Age' attribute? Should I leave it untouched, or should it be scaled? The image below shows my results before and after standardization.
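For readers unsure what "standardization" refers to in this question, it usually means z-scoring: subtracting the mean and dividing by the standard deviation. The following is a minimal sketch under that assumption, using the population standard deviation. The `standardize` helper and the `ages` values are hypothetical and are not the asker's data.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical ages, for illustration only.
ages = [22, 35, 47, 58, 63]
scaled = standardize(ages)

for age, z in zip(ages, scaled):
    print(f"{age:>3} -> {z:+.3f}")
```

Whether this helps likely depends on the model: distance-based and gradient-based methods tend to be sensitive to feature scale, while tree-based methods are generally unaffected by a monotonic rescaling of a single feature.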
2 votes
1 answer
152 views
Where and how to do large scale supervised machine learning?
I'm a beginner in ML and I have a large dataset with 15 features and 6M rows, so it is challenging to work on it locally. I can train one model locally, but to perform hyperparameter tuning ...
3 votes
0 answers
183 views
Clustering a large set of images
I've got some big datasets of images (a few million each), and I would like to cluster them according to their visual similarity. I've extracted a feature vector for each image; the space of ...
2 votes
0 answers
29 views
Low-resource (low complexity) methods for large-scale unsupervised clustering?
I'm working on an issue where I need to cluster user types at scale in an unsupervised manner. I've been looking at the basics like KNN and K-means etc., but I found them hard to scale, as these ...
5 votes
3 answers
403 views
How can one quickly look up people from a large database?
Vocabulary. Face detection: finding all faces in an image. Face representation: the simplest way to represent a face is as an image (pixels / color values). This is not very space efficient and likely ...
2 votes
5 answers
395 views
Face Recognition (Scalability Issue)
Background: I would like to build a face recognition model for registration and login for some kind of service. For example, using this approach (CNN + SVM). When a new user wants to register a ...
1 vote
1 answer
1k views
Scaling DBSCAN clustering - minHash?
Applying density based clustering (DBSCAN) on $50k$ data points and about $2k$-$4k$ features, I achieve the desired results. However, scaling this to $10$ million data points requires a creatively ...
0 votes
1 answer
118 views
Distance measure calculation for addresses in record linking
At the moment we use different methods for linking location records across different datasets. In theory, given two locations, we can predict how well they match (that is, whether they are the same). This is ...