Reinforcement Learning 101
Any new subject can be learned the same way we learn a new language: learning the vocabulary → understanding the syntax → conversing with locals.
The objective of reinforcement learning is to enable automation, optimization, and discovery with minimal human intervention.
In this blog I will give an introduction to reinforcement learning vocabulary:
- Agent
  - Single
  - Multi
    - Independent Q-learning
    - CTDE
    - MADDPG
- Environment
  - Stationary
  - Non-Stationary
- State
- Action
- Episode, Visit
- Reward
  - Long Term
  - Short Term
- Markov Decision Process
  - Bellman Equation
  - Transition Probability
  - Discount Factor
- Policy
- Value Function
  - State
    - Monte Carlo Method to learn
  - Action
- Model
  - Model-Free
    - Q-Learning
    - SARSA
    - DQN
  - Model-Based
    - Dyna-Q
    - Monte Carlo Tree Search
    - PILCO
- Policy Gradient Method
  - REINFORCE
- Iteration
  - Policy
  - Value
- Evaluation
  - Policy
- Improvement
  - Policy
- Exploration vs Exploitation
  - Exploration Strategies
- Initialization
- Termination
- Actor Critic Methods
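To make some of the vocabulary above concrete, here is a minimal sketch of tabular Q-learning (a model-free method) on a toy problem. Everything in it is an assumption made for illustration: the five-state corridor environment, the `step`, `choose_action`, and `train` helpers, and the hyperparameter values are all hypothetical and not taken from any library. It shows an agent, an environment, states, actions, rewards, episodes, a discount factor, initialization, termination, and an epsilon-greedy exploration strategy in one place.

```python
import random

# Hypothetical toy environment: a corridor of 5 states (0..4).
# The agent starts in state 0; reaching state 4 ends the episode with reward +1.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GOAL = N_STATES - 1


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(GOAL, state + 1)
    done = next_state == GOAL          # termination condition
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy policy: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties between equally valued actions at random.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0, max_steps=100):
    """Learn an action-value table Q[state][action] with Q-learning."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]  # initialization
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Bellman-style target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy derived from the learned action values (non-terminal states only).
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(greedy_policy)
```

With these settings the greedy policy should pick action 1 (move right) in every non-terminal state, since that is the shortest path to the only reward. Swapping the `max(q[next_state])` term for the value of the action actually taken next would turn this sketch into SARSA.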
I'll keep updating this blog whenever I have time