Answer:
This refers to a machine learning approach in which agents learn to use feedback from their environment to choose actions that maximize rewards.
Step-by-step explanation:
hope this helps
1.6m questions
2.0m answers