Data mining small datasets

vabm

I am new to data mining. From what I understand, most techniques are intended to be used with large data sets, but I am curious to know if this is a must or just a general rule. In other words, is it OK to use data mining techniques on small data sets? Most examples work on small tables, but are there any limitations? Why?

Has QUIT--Anony-Mousse

Most data mining techniques are statistical approaches.

To get significant patterns, you need enough data. Otherwise, anything you measure may just as well be a random deviation due to chance. The more data you have, the more reliable your patterns can be.

But most data isn't "big" in the sense of "big data", and a lot of methods would not scale to really big data sets anyway. In most cases you only have a few thousand records (not a few exabytes of data), in particular after preprocessing the data into the desired format.
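To make the point about chance deviations concrete, here is a minimal sketch (my own illustration, not part of the original answer). It generates pairs of columns that are independent by construction and reports the strongest correlation found among them. The helper names `pearson` and `max_spurious_correlation`, the sample sizes, and the number of trials are all assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def max_spurious_correlation(n_rows, n_trials, rng):
    """Largest |r| seen across n_trials pairs of independent random columns.

    The columns are unrelated by construction, so any correlation
    found here is purely a chance deviation.
    """
    best = 0.0
    for _ in range(n_trials):
        xs = [rng.gauss(0, 1) for _ in range(n_rows)]
        ys = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(xs, ys)))
    return best


rng = random.Random(0)  # fixed seed so the run is repeatable
small = max_spurious_correlation(n_rows=10, n_trials=200, rng=rng)
large = max_spurious_correlation(n_rows=2000, n_trials=200, rng=rng)

print(f"strongest chance correlation with   10 rows: {small:.2f}")
print(f"strongest chance correlation with 2000 rows: {large:.2f}")
```

With only 10 rows, searching a couple of hundred unrelated column pairs will usually turn up at least one that looks strongly correlated. With 2000 rows, the strongest chance correlation stays close to zero. So small data sets are usable, but any pattern found in them needs to be checked against what chance alone would produce.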
