Self-Taught Learning
From Ufldl
== Overview ==
In machine learning, one of the most reliable ways to get better performance is
to give your algorithms more data. This has led to the aphorism that in machine
learning, "sometimes it's not who has the best algorithm that wins; it's who
has the most data."
One can always try to get more labeled data, but this can be expensive. In
particular, researchers have already gone to extraordinary lengths to use tools
such as AMT (Amazon Mechanical Turk) to get large training sets. While having
humans hand-label large training sets is a step forward, it would be better
still if our algorithms could learn from ''unlabeled'' data, since then we
could easily obtain and learn from massive amounts of it. Even though a single
unlabeled example is less informative than a single labeled example, if we can
get tons of the former---for example, by downloading random unlabeled
images/audio clips/text documents off the internet---and if our algorithms can
exploit this unlabeled data effectively, then we might be able to achieve
better performance than the massive hand-engineering and massive hand-labeling
approaches.
In Self-taught learning and Unsupervised feature learning, we will give our
algorithms a large amount of unlabeled data with which to learn a good feature
representation of the input. If we are trying to solve a specific
classification task, then we take this learned feature representation and
whatever (perhaps small amount of) labeled data we have for that classification
task, and apply supervised learning on that labeled data to solve the
classification task.
These ideas are probably most powerful in settings where we have a lot of
unlabeled data, and a relatively smaller amount of labeled data. However, these
models often give good results even if we have only labeled data (in which case
we usually perform the feature learning step using the labeled data, but
ignoring the labels).
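The two-stage pipeline just described can be sketched as follows. This is a minimal illustration on synthetic data, using PCA as a stand-in for the feature-learning step and a nearest-centroid classifier for the supervised step; both are illustrative choices for this sketch, not methods prescribed by the tutorial itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1: learn a feature representation from plentiful unlabeled data.
# Here "feature learning" is plain PCA (a stand-in for a richer learner):
# find the directions of maximal variance in the unlabeled set.
unlabeled = rng.normal(size=(1000, 20))      # 1000 unlabeled examples
unlabeled -= unlabeled.mean(axis=0)
_, _, Vt = np.linalg.svd(unlabeled, full_matrices=False)
W = Vt[:5]                                   # top-5 principal directions

def features(x):
    """Map raw inputs to the learned feature representation."""
    return x @ W.T

# Stage 2: supervised learning on a (much smaller) labeled set,
# using the learned features instead of the raw inputs.
X_labeled = rng.normal(size=(40, 20))
y = (X_labeled[:, 0] > 0).astype(int)        # synthetic binary labels
F = features(X_labeled)

# Simple classifier: assign each input to the nearest class centroid
# in the learned feature space.
centroids = np.stack([F[y == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    d = np.linalg.norm(features(x)[:, None, :] - centroids, axis=2)
    return d.argmin(axis=1)

train_acc = (predict(X_labeled) == y).mean()
```

Note that Stage 1 never looks at the labels, so the unlabeled set could be far larger than, and even drawn from a different source than, the labeled set; only Stage 2 requires labels.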
<!--
In terms of terminology, there are two common unsupervised feature learning
settings, depending on what type of unlabeled data you have. Let's explain this
with an example. Suppose your goal is to distinguish between images of cars and
images of motorcycles, so that each labeled example in your training set is an
image of a car or a motorcycle. If you also have a collection of similar images
where each image is just missing its label (so you don't know which ones are cars, and which
ones are motorcycles), then you could use that data to learn the features.
This setting---where each unlabeled example is drawn from the same
distribution as your labeled examples (and thus can be labeled either "car"
or "motorcycle")---is usually called the '''semi-supervised''' setting. The
setting where the unlabeled data does not need to come from the same
distribution as the labeled data---for example, random unlabeled images
downloaded off the internet---is usually called the
'''self-taught learning''' setting. In the self-taught learning setting, it is far easier to
obtain large amounts of unlabeled data, and thus leverage the potential of
learning from massive amounts of data.
-->
== Learning features ==