Answer:
This refers to a machine learning approach in which agents learn to use feedback from their environment to choose actions that maximize rewards.
Step-by-step explanation:
hope this helps
8.8m questions
11.4m answers