Final answer:
In regression-tree pruning it can be shown that, after pruning the subtree at a node t down to a single leaf, the cost-complexity measure g of every upstream node does not decrease, so no further pruning is immediately required under the same complexity parameter α.
Step-by-step explanation:
The question concerns weakest-link pruning of regression trees, a technique in machine learning and statistics used to simplify a fitted tree to prevent overfitting while maintaining predictive accuracy. Given a complexity parameter α, branches are removed whenever the reduction in training error they provide does not outweigh the penalty α imposes per additional leaf. In this context, you are asked to show that the upstream nodes of a pruned node satisfy a specific inequality, which implies that no further pruning is immediately necessary.
To solve this, we need the definition of the cost-complexity measure g(t) used to decide which node to prune. For an internal node t, g(t) = (R(t) − R(T_t)) / (|T_t| − 1), where R(t) is the training error of t collapsed to a single leaf, R(T_t) is the error of the subtree T_t rooted at t, and |T_t| is its number of leaves; weakest-link pruning collapses the internal node with the smallest g(t). The claim is that after pruning the subtree S_t at node t to create tree T1 from the original tree T0, we have g1(t′) ≥ g0(t′) for all nodes t′ upstream of node t.
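The definition can be checked numerically on made-up error values. The numbers below are hypothetical, chosen only so that t is the weakest link under t′; the helper g mirrors the formula above:

```python
# Toy numeric check of the pruning inequality, using hypothetical error values.
# For an internal node s: g(s) = (R(s) - R(T_s)) / (|T_s| - 1), where R(s) is
# the training error of s collapsed to a leaf, R(T_s) the error of the subtree
# rooted at s, and |T_s| its number of leaves.

def g(r_node, r_subtree, n_leaves):
    """Cost-complexity measure g(s) for an internal node s."""
    return (r_node - r_subtree) / (n_leaves - 1)

# Hypothetical values: t is the weakest link below t', so g0(t) <= g0(t').
r_t, r_Tt, leaves_t = 10.0, 6.0, 3       # pruned node t:   d = 4, m = 2
r_tp, r_Ttp, leaves_tp = 30.0, 14.0, 7   # upstream node t': A = 16, B = 6

g0_t = g(r_t, r_Tt, leaves_t)        # (10 - 6) / (3 - 1) = 2.0
g0_tp = g(r_tp, r_Ttp, leaves_tp)    # (30 - 14) / (7 - 1) ~ 2.67

# Collapsing T_t to the single leaf t raises the subtree error at t' by
# exactly R(t) - R(T_t) and shrinks its leaf count by exactly |T_t| - 1.
r_Ttp_new = r_Ttp + (r_t - r_Tt)             # 14 + 4 = 18
leaves_tp_new = leaves_tp - (leaves_t - 1)   # 7 - 2 = 5

g1_tp = g(r_tp, r_Ttp_new, leaves_tp_new)    # (30 - 18) / (5 - 1) = 3.0
assert g0_t <= g0_tp <= g1_tp                # g did not decrease upstream
```

The final assertion is exactly the claimed inequality g1(t′) ≥ g0(t′) for this example.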
In practice, this ensures that after pruning, the resulting tree will not be immediately pruned further under the same α. The key observation is that pruning at t changes the quantities defining g(t′) for an upstream node t′ in a controlled way: the subtree error R(T_{t′}) increases by exactly R(t) − R(T_t), while the leaf count |T_{t′}| decreases by exactly |T_t| − 1. Because t was chosen as the weakest link, g0(t) ≤ g0(t′); removing a subtree whose error-to-leaves ratio is at most that of t′ can only raise the ratio at t′ or leave it unchanged, hence g1(t′) ≥ g0(t′) for every upstream node t′.
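The argument can be written out explicitly. As a sketch, abbreviate A = R(t′) − R(T_{t′}), d = R(t) − R(T_t), B = |T_{t′}| − 1, and m = |T_t| − 1 (so A, d ≥ 0 and B > m ≥ 1):

```latex
\[
  g_0(t) = \frac{d}{m}, \qquad
  g_0(t') = \frac{A}{B}, \qquad
  g_1(t') = \frac{A - d}{B - m}.
\]
% Since t is the weakest link, g_0(t) <= g_0(t'), i.e. dB <= Am, hence
\[
  g_1(t') - g_0(t')
    = \frac{A - d}{B - m} - \frac{A}{B}
    = \frac{Am - dB}{B\,(B - m)}
    \ge 0.
\]
```

The last fraction is nonnegative because its numerator Am − dB ≥ 0 by the weakest-link property and its denominator B(B − m) > 0, which is precisely g1(t′) ≥ g0(t′).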