Answer:
See the proof below
Explanation:
Let's assume that our random variable of interest is $Y$ and that we have a set of parameters $x_1, x_2, \ldots, x_k$ in the original network.
Now let's assume that we add an additional parameter $x_{k+1}$, and we want to see whether the likelihood of the data is at least as large with the extended set of parameters $x_1, x_2, \ldots, x_k, x_{k+1}$.
We don't know the distribution of each parameter,
but we can say that the likelihood function for the original set of parameters is given by:
$$F = L(y \mid x_1, x_2, \ldots, x_k)$$
To maximize this function, we take the partial derivative with respect to each parameter:
$$\frac{\partial F}{\partial x_i} = \frac{\partial L}{\partial x_i}, \quad i = 1, 2, \ldots, k$$
We then set each of these derivatives equal to zero and solve for the parameter values that satisfy the condition.
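As a small illustration of this step, here is a hypothetical single-parameter case that is not part of the original problem. Assume the observations $y_1, \ldots, y_n$ are independent and normally distributed with unknown mean $x_1$ and variance 1. Since the logarithm is increasing, maximizing $\log L$ gives the same solution as maximizing $L$:

$$
\begin{aligned}
\log L(y \mid x_1) &= -\frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{j=1}^{n} (y_j - x_1)^2 \\
\frac{\partial \log L}{\partial x_1} &= \sum_{j=1}^{n} (y_j - x_1) = 0 \\
\hat{x}_1 &= \frac{1}{n}\sum_{j=1}^{n} y_j
\end{aligned}
$$

So in this assumed case the condition is satisfied by the sample mean.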
If we add a new parameter, the new likelihood function is given by:
$$F = L(y \mid x_1, x_2, \ldots, x_k, x_{k+1})$$
To maximize this function, we again take the partial derivative with respect to each parameter:
$$\frac{\partial F}{\partial x_i} = \frac{\partial L}{\partial x_i}, \quad i = 1, 2, \ldots, k, k+1$$
We are assuming that the parameters 1 to k are the same in the new likelihood function. If the new parameter is held at a value that leaves the model as it was, the likelihood of the data is unchanged, so the original maximization is a restricted version of the new one. Maximizing over the larger set of parameters can therefore never give a smaller value, and the extra parameter can only help us fit the data:
$$\max L(y \mid x_1, x_2, \ldots, x_k, x_{k+1}) \geq \max L(y \mid x_1, x_2, \ldots, x_k)$$
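The inequality can be checked numerically. The following is a minimal sketch under assumptions that are not in the original problem: it swaps the network for a Gaussian linear model, where the regression coefficients play the role of the parameters $x_1, \ldots, x_{k+1}$ and setting the extra coefficient to zero recovers the original model. The function name `max_gaussian_loglik` and the simulated data are hypothetical.

```python
import numpy as np


def max_gaussian_loglik(features, y):
    """Maximized log-likelihood of y = features @ params + Gaussian noise.

    Both the coefficients and the noise variance are set to their
    maximum likelihood estimates (assumed model, for illustration only).
    """
    n = len(y)
    # Least squares gives the maximum likelihood coefficients under Gaussian noise.
    params, *_ = np.linalg.lstsq(features, y, rcond=None)
    rss = float(np.sum((y - features @ params) ** 2))
    sigma2 = rss / n  # maximum likelihood estimate of the noise variance
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)


rng = np.random.default_rng(0)
n, k = 200, 3

# Simulated data: y depends only on the first k columns; column k+1 is pure noise.
features_full = rng.normal(size=(n, k + 1))
y = features_full[:, :k] @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

ll_original = max_gaussian_loglik(features_full[:, :k], y)  # k parameters
ll_extended = max_gaussian_loglik(features_full, y)         # k + 1 parameters

print("max log-likelihood with k parameters:    ", ll_original)
print("max log-likelihood with k + 1 parameters:", ll_extended)
```

Even though the extra column is unrelated to `y`, the maximized log-likelihood with `k + 1` parameters should not fall below the one with `k` parameters, apart from floating-point rounding.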