Semantic parsing is the process of extracting meaning and intent from natural language text, which often comes from voice interactions. Done well, semantic parsing can also enable machines and software to learn by reading.
Traditional approaches to semantic parsing have been supervised: humans provide a set of known input-output examples, and the machine "learns by example". But this has several problems, most obviously that the machine will only learn as well as the examples teach it--give the same software that performs great on the example set some natural language about a totally different topic, and it will often fail to perform adequately.
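The supervised "learn by example" idea, and its failure on unseen topics, can be sketched with a deliberately tiny word-counting classifier. The utterances, intent labels, and scoring scheme below are all invented for illustration; real supervised parsers are far more sophisticated, but they share the same limitation.

```python
from collections import Counter, defaultdict

# Toy supervised training set: humans supply the intent label for each
# utterance. All utterances and intent names here are made up.
TRAINING = [
    ("play some jazz music", "play_music"),
    ("play the new album", "play_music"),
    ("what is the weather today", "get_weather"),
    ("will it rain today", "get_weather"),
]

def train(pairs):
    """Count how often each word appears under each intent."""
    counts = defaultdict(Counter)
    for text, intent in pairs:
        for word in text.split():
            counts[intent][word] += 1
    return counts

def classify(model, text):
    """Score each known intent by overlapping word counts."""
    scores = {
        intent: sum(c[w] for w in text.split())
        for intent, c in model.items()
    }
    return max(scores, key=scores.get)

model = train(TRAINING)
print(classify(model, "play a song"))       # in-domain vocabulary: works
print(classify(model, "book me a flight"))  # out-of-domain: no intent fits,
                                            # but the model still guesses one
```

The second query shows the problem: every intent scores zero, so the classifier falls back on an arbitrary guess rather than admitting the topic is outside what its examples taught it.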
Unsupervised semantic parsing is relatively new, but has been shown in some evaluations to outperform supervised approaches by as much as a factor of three. It relies on the key idea that the names of actions and objects can themselves be learned. This involves a degree of self-learning--given a wide variety of examples about the same topic, the system finds similarities between them, infers that different phrasings refer to the same subject, and learns from that.
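One way to picture this self-learning is a sketch where verbs that appear with the same subject and object are inferred to name the same action--no human labels involved. The sentences, the pre-split triples, and the overlap test below are simplifying assumptions, a caricature of what real unsupervised systems do with statistical clustering.

```python
# Toy unsupervised inference: verbs seen in the same (subject, object)
# contexts are assumed to name the same action. Triples are invented.
TRIPLES = [
    ("Google", "acquired", "YouTube"),
    ("Google", "bought", "YouTube"),
    ("Microsoft", "acquired", "Skype"),
    ("Microsoft", "bought", "Skype"),
    ("Alice", "wrote", "a novel"),
]

def shared_contexts(triples):
    """Map each verb to the set of (subject, object) pairs it occurs with."""
    contexts = {}
    for subj, verb, obj in triples:
        contexts.setdefault(verb, set()).add((subj, obj))
    return contexts

def same_action(contexts, verb_a, verb_b):
    """Infer two verbs name the same action if their contexts overlap."""
    return bool(contexts[verb_a] & contexts[verb_b])

ctx = shared_contexts(TRIPLES)
print(same_action(ctx, "acquired", "bought"))  # overlapping contexts
print(same_action(ctx, "acquired", "wrote"))   # no shared context
```

Here "acquired" and "bought" are grouped because they occur in identical contexts, while "wrote" stays separate--the system has, in a trivial sense, learned a name for an action by example rather than by instruction.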
Newer semantic parsing systems are inherently probability-based: they understand what you're saying within a range of confidence. This isn't the same as mishearing you, as when you tell your phone to find the nearest steak house and it finds the nearest take out. That is bad voice recognition. Probabilistic parsing failure is when you tell Siri to find the nearest steak house and it thinks you mean the nearest steak that can be bought near your house.
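A probability-based parser can be pictured as returning several candidate meanings with confidence scores, rather than one answer. The candidate readings, their scores, and the clarification threshold below are all invented for illustration--this is not how Siri or any particular assistant actually works.

```python
# Toy probabilistic interpretation: the parser proposes ranked candidate
# meanings with confidence scores. Candidates and scores are made up.
def parse(utterance):
    if utterance == "find the nearest steak house":
        return [
            ("search_restaurant(cuisine=steak)", 0.72),   # intended reading
            ("search_product(item=steak, near=home)", 0.21),  # the bad parse
            ("search_restaurant(cuisine=takeout)", 0.07),
        ]
    return [("unknown", 1.0)]

def interpret(utterance, threshold=0.6):
    """Pick the top reading, or ask for clarification when confidence is low."""
    best, score = max(parse(utterance), key=lambda cand: cand[1])
    if score < threshold:
        return "clarify: did you mean " + best + "?"
    return best

print(interpret("find the nearest steak house"))
```

The point of the confidence range is the `threshold` branch: when no reading is clearly best, a well-designed system can ask a follow-up question instead of silently committing to steak-near-your-house.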