次の図は、特徴 x とラベル y の間に直線関係があるノイズのあるデータセットを示しています。この図は、正則化を適用せずにこのデータセットでトレーニングされたディシジョン ツリーも示しています。このモデルは、すべてのトレーニング サンプルを正しく予測します(モデルの予測はトレーニング サンプルと一致します)。ただし、同じ線形パターンと異なるノイズ インスタンスを含む新しいデータセットでは、モデルのパフォーマンスは低下します。
[[["わかりやすい","easyToUnderstand","thumb-up"],["問題の解決に役立った","solvedMyProblem","thumb-up"],["その他","otherUp","thumb-up"]],[["必要な情報がない","missingTheInformationINeed","thumb-down"],["複雑すぎる / 手順が多すぎる","tooComplicatedTooManySteps","thumb-down"],["最新ではない","outOfDate","thumb-down"],["翻訳に関する問題","translationIssue","thumb-down"],["サンプル / コードに問題がある","samplesCodeIssue","thumb-down"],["その他","otherDown","thumb-down"]],["最終更新日 2025-02-25 UTC。"],[[["\u003cp\u003eDecision trees are prone to overfitting, especially with noisy data, leading to poor generalization on unseen data.\u003c/p\u003e\n"],["\u003cp\u003eRegularization techniques like setting a maximum depth, minimum examples per leaf, and pruning can mitigate overfitting.\u003c/p\u003e\n"],["\u003cp\u003eHyperparameter tuning, such as using cross-validation to optimize maximum depth and minimum examples, further enhances model performance.\u003c/p\u003e\n"],["\u003cp\u003eDecision trees offer direct interpretability, but their structure can be sensitive to changes in the training data.\u003c/p\u003e\n"],["\u003cp\u003eWhile interpretable, decision trees can still exhibit overfitting despite regularization efforts, a challenge addressed by more advanced models like decision forests.\u003c/p\u003e\n"]]],[],null,["\u003cbr /\u003e\n\nOverfitting and pruning\n\nUsing the algorithm described above, we can train a decision tree that will\nperfectly classify training examples, assuming the examples are separable.\nHowever, if the dataset contains noise, this tree will overfit to the data\nand show poor test accuracy.\n\nThe following figure shows a noisy dataset with a linear relation between a\nfeature x and the label y. The figure also shows a decision tree trained on this\ndataset without any type of regularization. This model correctly predicts all\nthe training examples (the model's prediction match the training\nexamples). However, on a new dataset containing the same linear pattern and a\ndifferent noise instance, the model would perform poorly.\n\n**Figure 12. A noisy dataset.**\n\nTo limit overfitting a decision tree, apply one or both of the following\n[regularization](https://developers.google.com/machine-learning/glossary#regularization)\ncriteria while training the decision tree:\n\n- **Set a maximum depth:** Prevent decision trees from growing past a maximum depth, such as 10.\n- **Set a minimum number of examples in leaf:** A leaf with less than a certain number of examples will not be considered for splitting.\n\nThe following figure illustrates the effect of differing minimum number of\nexamples per leaf. The model captures less of the noise.\n\n**Figure 13. Differing minimum number of examples per leaf.**\n\nYou can also regularize after training by selectively removing (pruning) certain\nbranches, that is, by converting certain non-leaf nodes to leaves. A common\nsolution to select the branches to remove is to use a validation dataset. That\nis, if removing a branch improves the quality of the model on the validation\ndataset, then the branch is removed.\n\nThe following drawing illustrates this idea. Here, we test if the validation\naccuracy of the decision tree would be improved if the non-leaf green node\nwas turned into a leaf; that is, pruning the orange nodes.\n\n**Figure 14. Pruning a condition and its children into a leaf.**\n\nThe following figure illustrates the effect of using 20% of the dataset as\nvalidation to prune the decision tree:\n\n**Figure 15. Using 20% of the dataset to prune the decision tree.**\n\nNote that using a validation dataset reduces the number of examples available\nfor the initial training of the decision tree.\n\nMany model creators apply multiple criteria. For example, you could do all of\nthe following:\n\n- Apply a minimum number of examples per leaf.\n- Apply a maximum depth to limit the growth of the decision tree.\n- Prune the decision tree.\n\nYDF Code\nIn YDF, the learning algorithms are pre-configured with default values for all the pruning hyperparameters. For example, here are the default values for two pruning hyperparameters:\n\n- The minimum number of examples is 5 (`min_examples = 5`)\n- 10% of the training dataset is retained for validation (`validation_ratio = 0.1`).\n\nYou can disable pruning with the validation dataset by setting `validation_ratio=0.0`.\n\nThose criteria introduce new hyperparameters that need to be tuned (e.g. maximum\ntree depth), often with automated hyperparameter tuning. Decision trees are\ngenerally fast enough to train to use hyperparameter tuning with\ncross-validation. For example, on a dataset with \"n\" examples:\n\n- Divide the training examples into p non-overlapping groups. For example: `p=10`.\n- For all the possible hyperparameter values; for example, max depth in {3,5,6,7,8,9}, min examples in {5,8,10,20}.\n - Evaluate, on each group, the quality of a decision tree trained on the other p-1 groups.\n - Average the evaluation across the groups.\n- Select the hyperparameter value with the best averaged evaluation.\n- Train a final decision tree using all the \"n\" examples with the selected hyperparameters.\n\nIn this section we discussed the ways decision trees limit overfitting. Despite\nthese methods, underfitting and overfitting are major weaknesses of decision\ntrees. Decision forests introduce new methods to limit overfitting, which we\nwill see later.\n\nDirect decision tree interpretation\n\nDecision trees are easily interpretable. That said, changing even a few\nexamples can completely change the structure---and therefore the\ninterpretation---of the decision tree.\n| **Note:** Especially when the dataset contains many somehow similar features, the learned decision tree is only *one* of multiple more-or-less equivalent decision trees that fit the data.\n\nBecause of the way decision trees are built, partitioning the training\nexamples, one can use a decision tree to interpret the dataset itself\n(as opposed to the model). Each leaf represents a particular corner of\nthe dataset. \nYDF Code\nIn YDF, you can look at trees with the `model.describe()` function. You can also access and plot individual tree with `model.get_tree()`. See [YDF's model inspection tutorial](https://ydf.readthedocs.io/en/latest/tutorial/inspecting_trees/) for more details.\n\nHowever, indirect interpretation is also informative."]]