Q-learning is a model-free reinforcement learning algorithm that finds an optimal action-selection policy for any finite Markov decision process. It learns the quality of each action, represented as a Q-value: the expected cumulative discounted reward of taking a given action in a specific state and following the optimal policy thereafter.
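To make the idea of a Q-value concrete, the following is a minimal sketch of a single tabular update step, assuming the Q-values are stored as a list of per-state lists. The function name `q_update`, the learning rate `alpha`, the discount factor `gamma`, and the three-state example are illustrative assumptions, not part of the text above.

```python
def q_update(Q, state, action, reward, next_state, done, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q-value.

    Q is a list of lists indexed as Q[state][action]. alpha (learning rate)
    and gamma (discount factor) are illustrative defaults, not prescribed values.
    """
    # Value of the best action in the next state; zero if the episode ended.
    best_next = 0.0 if done else max(Q[next_state])
    # Target: immediate reward plus discounted value of acting optimally afterwards.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    Q[state][action] += alpha * (td_target - Q[state][action])
    return Q[state][action]


# Hypothetical 3-state, 2-action table, initialised to zero.
Q = [[0.0, 0.0] for _ in range(3)]

# Transition that ends the episode with reward 1.
first = q_update(Q, state=1, action=1, reward=1.0, next_state=2, done=True)

# Earlier transition with no reward: it picks up value from state 1.
second = q_update(Q, state=0, action=1, reward=0.0, next_state=1, done=False)
```

Because the target uses the maximum over next-state actions rather than the action the agent actually takes next, the estimate tracks the value of the optimal policy, and the update needs only observed transitions, not a model of the environment.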