Abstract: A model-free method based on deep reinforcement learning is implemented to control channel power for GSNR optimization. A flat GSNR covering the entire C-band is achieved across 1440 km ...