Data analysts at Universal Bank are building a classification tree to classify its customers into two classes: nonacceptors (class 0) and acceptors (class 1) of a personal loan offer. Each customer can be described by a set of attributes, such as age, experience, income, family size, education, average spending on credit cards per month, etc. The data analysts are in the process of identifying the most powerful predictor to split a set of training records (denoted by A). After splitting A on the customer's family size, two subsets of records are generated, denoted by A1 and A2 respectively. The numbers of nonacceptors and acceptors in A, A1, and A2 are given below.

                          A     A1    A2
Number of Nonacceptors    352   340   12
Number of Acceptors       223   36    187

Round your answers to 3 digits after the decimal point.

The Gini index of A is __________.
The Gini index of A1 is __________.
The Gini index of A2 is ___________.

1 Answer


Answer:

Using the counts given in the question: Gini (A) = 0.475, Gini (A1) = 0.173, Gini (A2) = 0.113

Step-by-step explanation:

Let p0 and p1 be the proportions of nonacceptors and acceptors within each node (class count divided by the node total):

        Nonacceptors   Acceptors   Total   p0      p1
A       352            223         575     0.612   0.388
A1      340            36          376     0.904   0.096
A2      12             187         199     0.060   0.940

The Gini index of a node is 1-(p0)^2-(p1)^2.

Gini index of A (before the split):

        Number of Non Acceptors   Number of Acceptors   Total
A       352                       223                   575
A1      340                       36                    376
A2      12                        187                   199

Gini (A) = 1-(352/575)^2-(223/575)^2

= 0.4748

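Gini index if training records are split by customer's family size:

A1 holds 340 + 36 = 376 records and A2 holds 12 + 187 = 199 records, out of 575 records in A.

Gini (A) = 1-(352/575)^2-(223/575)^2

= 0.4748

Gini (A1) = 1-(340/376)^2-(36/376)^2

= 0.1732

Gini (A2) = 1-(12/199)^2-(187/199)^2

= 0.1133

The Gini index of the split is the average of the two child Gini indexes, weighted by the share of records in each child:

Gini (A1,A2) = (376/575)*0.1732+(199/575)*0.1133

= 0.1525

Splitting on family size therefore lowers the Gini index from 0.4748 to 0.1525, a reduction of about 0.322.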
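If you want to check the arithmetic yourself, here is a minimal Python sketch. It assumes the class counts exactly as given in the question (A = 352/223, A1 = 340/36, A2 = 12/187); the function names gini and weighted_gini are just illustrative helpers, not part of any library.

```python
def gini(counts):
    """Gini index of one node, given its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_gini(children):
    """Gini index of a split: child Gini indexes weighted by child size."""
    n = sum(sum(counts) for counts in children)
    return sum(sum(counts) / n * gini(counts) for counts in children)


# Class counts from the question, as (nonacceptors, acceptors).
A = (352, 223)
A1 = (340, 36)
A2 = (12, 187)

gini_A = gini(A)
gini_A1 = gini(A1)
gini_A2 = gini(A2)
gini_split = weighted_gini([A1, A2])
reduction = gini_A - gini_split

print(f"Gini(A)      = {gini_A:.3f}")      # 0.475
print(f"Gini(A1)     = {gini_A1:.3f}")     # 0.173
print(f"Gini(A2)     = {gini_A2:.3f}")     # 0.113
print(f"Gini(A1, A2) = {gini_split:.3f}")  # 0.152
print(f"Reduction    = {reduction:.3f}")   # 0.322
```

The weighted value is what you would compare across candidate predictors: the split with the lowest weighted Gini index (largest reduction) is the most powerful one.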

by User Itzel (9.4k points)