Learning objectives
- Understand exploration and exploitation
- Understand the epsilon-greedy strategy

For a project on anatomical landmark detection with single-agent approaches, I consulted a paper that describes its training method as follows: "During training, the agent follows an epsilon-greedy policy. The terminal state is reached when the distance to the target landmark is less or equal than 1mm. During testing, the agent starts in the 80% inner region of the image..
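
To make the two ideas in the quoted passage concrete, here is a minimal sketch of epsilon-greedy action selection and the 1 mm terminal check. It is only an illustration under my own assumptions, not the paper's implementation: the function names `epsilon_greedy_action` and `is_terminal`, the plain list of Q-values, and positions given as coordinates in millimeters are all hypothetical.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Return an action index (hypothetical helper, not from the paper).

    With probability epsilon a uniformly random action is chosen (exploration);
    otherwise the action with the highest Q-value is chosen (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def is_terminal(agent_pos_mm, target_pos_mm, threshold_mm=1.0):
    """Return True when the Euclidean distance to the target is <= threshold.

    Assumes both positions are coordinate tuples already expressed in mm.
    """
    squared = sum((a - t) ** 2 for a, t in zip(agent_pos_mm, target_pos_mm))
    return squared ** 0.5 <= threshold_mm
```

With `epsilon=0.0` the agent always exploits, and with `epsilon=1.0` it always explores. A common choice, which I am assuming here rather than taking from the paper, is to start with a large epsilon and decrease it as training progresses.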