Reinforcement Learning: Training Environment Simulator (paper)

A previous post describes a reinforcement learning model trained to find the optimal control settings for a reflow oven that solders electronic components to a circuit board. The oven’s moving belt transports the product (i.e., the circuit board) through multiple heating zones. This process heats the product according to a temperature-time target profile required to produce reliable solder connections.

Changing the heater settings of a physical oven, waiting for its temperature to stabilize, and then passing the product through it takes considerable time, so an oven simulator is used to speed up training. The simulator emulates a single pass of the product through the oven in a few seconds, compared with the minutes required by a physical oven.

The oven simulator has eight heating zones, each with a control for setting the temperature of the zone’s heater. After each pass, the simulator provides the temperature readings of the product recorded as it traveled through the oven.
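The interface described above (eight zone setpoints in, a series of product temperature readings out) can be illustrated with a minimal sketch. This is not the paper's simulator: the class name `OvenSimulator`, the method `run_pass`, the first-order heating model, and every numeric parameter (zone length, belt speed, time constant, entry temperature, sampling interval) are assumptions chosen only to make the example runnable.

```python
from dataclasses import dataclass
from typing import List, Sequence

NUM_ZONES = 8  # from the text: the simulator has eight heating zones


@dataclass
class OvenSimulator:
    """Toy reflow-oven simulator (hypothetical, for illustration only)."""

    zone_length_m: float = 0.3          # assumed length of each heating zone
    belt_speed_m_per_s: float = 0.01    # assumed belt speed
    time_constant_s: float = 40.0       # assumed thermal time constant of the product
    ambient_c: float = 25.0             # assumed product temperature on entry
    dt_s: float = 1.0                   # assumed sampling interval of the readings

    def run_pass(self, setpoints_c: Sequence[float]) -> List[float]:
        """Simulate one pass and return the product temperature at each sample."""
        if len(setpoints_c) != NUM_ZONES:
            raise ValueError(f"expected {NUM_ZONES} setpoints, got {len(setpoints_c)}")

        # Time the product spends in each zone, converted to whole samples.
        seconds_per_zone = self.zone_length_m / self.belt_speed_m_per_s
        steps_per_zone = int(round(seconds_per_zone / self.dt_s))

        temp = self.ambient_c
        readings: List[float] = []
        for zone_temp in setpoints_c:
            for _ in range(steps_per_zone):
                # Assumed first-order lag: the product temperature moves toward
                # the zone temperature at a rate set by the time constant.
                temp += (zone_temp - temp) * self.dt_s / self.time_constant_s
                readings.append(temp)
        return readings
```

A training loop would call something like `OvenSimulator().run_pass([150, 170, 180, 190, 200, 230, 250, 120])` once per episode step and compare the returned readings against the target temperature-time profile.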
