Answer:
This refers to a machine learning approach, commonly called reinforcement learning, in which an agent uses feedback from its environment (rewards or penalties) to learn which actions maximize its total reward over time.
Step-by-step explanation:
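One way to see the idea in action is a tiny simulation. The sketch below is only an illustration under stated assumptions, not a standard library API: the agent has three possible actions, each action pays a reward of 1 with a hidden probability (the values 0.2, 0.5, 0.8 are made up for this example), and the agent uses a simple "epsilon-greedy" rule, meaning it usually picks the action it currently believes is best but sometimes tries a random one. The function name `run_bandit` and its parameters are hypothetical.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a simple multi-armed bandit (illustrative sketch).

    reward_probs: assumed hidden chance that each action pays a reward of 1.
    Returns the agent's learned value estimates, how often it chose each
    action, and the total reward it collected.
    """
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    n_actions = len(reward_probs)
    estimates = [0.0] * n_actions      # agent's current guess of each action's value
    counts = [0] * n_actions           # how many times each action was chosen
    total_reward = 0.0

    for _ in range(steps):
        # 1. Choose an action: explore at random with probability epsilon,
        #    otherwise exploit the action with the highest estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])

        # 2. The environment gives feedback: reward 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # 3. Learn from the feedback: move the estimate toward the observed
        #    reward (this keeps a running average per action).
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward


estimates, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated values:", [round(v, 2) for v in estimates])
print("times each action was chosen:", counts)
print("preferred action:", best_action)
```

The loop follows the three steps in the definition: choose an action, receive feedback from the environment, and update what the agent has learned. After enough steps the agent should favor the action with the highest hidden reward probability, even though it was never told those probabilities.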
Hope this helps.