Answer:
Dialects- because native languages have many different dialects it would be hard to account for all of the varrences. Chinese is (one of) the densest languages in the world since there are literally hundreds of thousands of sub-varieties of the language. Trying to develop a software that can capture all of these varrences will be difficult because dialects change and mutate as people move in and out of regions.
Loud environments- Since voice recognition systems are used in all sorts of environments it would be hard to account for the different background noise that will be heard and have to be filtered through. Creating a VR that can identify all noises will be virtually impossible.
Step-by-step explanation: