A byte is 8 bits. A memory of 2^2 bytes holds 4 bytes. The lowest address is 0 and the highest is 3, that is, 2^2 - 1. If the bytes are contiguous (adjacent to one another rather than spread out), then all 4 bytes fit exactly into a 4-byte memory.
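As a quick check of that arithmetic, here is a minimal sketch in Python. The function name `byte_address_range` is made up for illustration, and it assumes a byte-addressable memory whose size is an exact power of two.

```python
def byte_address_range(address_bits):
    """Return (lowest, highest) byte address for a memory of 2**address_bits bytes.

    Assumes byte-addressable memory with addresses starting at 0.
    """
    size_in_bytes = 2 ** address_bits
    return 0, size_in_bytes - 1


# A 2^2 byte memory has addresses 0, 1, 2, 3.
print(byte_address_range(2))  # (0, 3)
```

The highest address is always one less than the number of bytes, because counting starts at 0.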
A word is just a fixed-size group of bytes that is treated as one meaningful piece of data, whereas a single byte is not necessarily a meaningful piece of data on its own.
With 4-byte words and 2^22 bytes available to store them, the lowest byte address is 0 and the highest is 2^22 - 1. That much memory holds 2^22 / 4 = 2^20 words.
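To make the word arithmetic concrete, here is a rough sketch in Python. It reads the figures above as a byte-addressable memory of 2^22 bytes with 4-byte words stored back to back from address 0; that reading is an assumption, and the names `word_count` and `word_start_address` are invented for this example.

```python
MEMORY_BYTES = 2 ** 22  # assumed total memory size in bytes
WORD_BYTES = 4          # assumed word size in bytes


def word_count(memory_bytes, word_bytes):
    """Number of whole words that fit in the memory."""
    return memory_bytes // word_bytes


def word_start_address(word_index, word_bytes):
    """Byte address where the given word begins (words packed contiguously from 0)."""
    return word_index * word_bytes


words = word_count(MEMORY_BYTES, WORD_BYTES)
last_word_start = word_start_address(words - 1, WORD_BYTES)

print(words == 2 ** 20)                  # True: 2^22 / 4 = 2^20 words
print(last_word_start + WORD_BYTES - 1)  # 4194303, i.e. 2^22 - 1
```

The last byte of the last word lands on address 2^22 - 1, which matches the highest byte address of the memory.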