[Robot Hardware 05] - Physical AI at the Motor Control Level

Robot hardware from a Physical AI perspective - FOC, current control, and actuator non-idealities


Robot Hardware from a Physical AI Perspective

AI-based approaches such as reinforcement learning and policy learning are expanding quickly in robot control. But these discussions often start too high in the stack. We hear terms like contact-rich manipulation, sim-to-real, and domain randomization, while the first layer of physics that the policy meets on real hardware sits much lower: the motor drive and current control.

A learned policy usually outputs a command $u_t$ in some form. It may be a joint torque command $\tau_{des}$, a target current $i_{q,ref}$, a target velocity $\dot{q}_{ref}$, or a more abstract latent action. But on a real robot, that command does not turn directly into mechanical torque. At minimum, it passes through a chain like this:

\[u_t \rightarrow i_{q,ref} \rightarrow \text{current controller} \rightarrow v_{abc} \rightarrow \text{inverter switching} \rightarrow i_{abc} \rightarrow \tau_e \rightarrow \tau_{joint}\]

Between the “action” seen by the policy and the “mechanical output” produced by the robot, there are already layers of electromechanical approximation, sampling, saturation, delay, and nonlinearity. If we explain sim-to-real only as contact uncertainty, we miss half of the problem. FOC is a useful way to handle a three-phase PMSM in the d-q frame, but it does not turn an actuator into an ideal torque source. FOC does not remove nonlinearity. It pushes some of that nonlinearity into a coordinate system that we can control.

FOC and Non-Ideal Torque Generation

Electrical angle and rotor position in a three-phase PMSM

The core idea of FOC is to transform three-phase current into the rotor-flux-aligned d-q frame, separating flux-producing current from torque-producing current. For an ideal surface PMSM, torque is often written as:

\[\tau_e \approx \frac{3}{2} p \lambda_m i_q\]

Here, $p$ is the number of pole pairs, $\lambda_m$ is the magnet flux linkage, and $i_q$ is q-axis current. Looking only at this equation, it is tempting to think that if we control $i_q$, torque will be clean and linear. Many motor-drive documents describe q-axis current as the torque-producing current.

But that equation depends on several assumptions. The rotor electrical angle must be accurate. Current measurement must be accurate enough. Inverter nonlinearity and magnetic saturation must be small. Back-EMF and air-gap flux should be close to sinusoidal. If any one of these assumptions drifts, the relationship between $\tau_e$ and $i_q$ becomes less clean.
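The dependence on rotor angle can be made concrete with a minimal sketch of the Clarke and Park transforms. The function names, the amplitude-invariant scaling, and the test values are assumptions chosen for illustration and are not taken from any particular drive. The sketch generates ideal phase currents for a pure q-axis current in the true rotor frame, then observes them through a frame rotated by an angle error.

```python
import math

SQRT3 = math.sqrt(3.0)

def clarke(i_a, i_b, i_c):
    """Amplitude-invariant Clarke transform: three-phase -> alpha-beta."""
    i_alpha = (2.0 / 3.0) * (i_a - 0.5 * i_b - 0.5 * i_c)
    i_beta = (i_b - i_c) / SQRT3
    return i_alpha, i_beta

def park(i_alpha, i_beta, theta_e):
    """Park transform: alpha-beta -> d-q for electrical angle theta_e [rad]."""
    c, s = math.cos(theta_e), math.sin(theta_e)
    return c * i_alpha + s * i_beta, -s * i_alpha + c * i_beta

def inverse_park_clarke(i_d, i_q, theta_e):
    """Ideal balanced phase currents that realize (i_d, i_q) at theta_e."""
    c, s = math.cos(theta_e), math.sin(theta_e)
    i_alpha = c * i_d - s * i_q
    i_beta = s * i_d + c * i_q
    i_a = i_alpha
    i_b = -0.5 * i_alpha + 0.5 * SQRT3 * i_beta
    i_c = -0.5 * i_alpha - 0.5 * SQRT3 * i_beta
    return i_a, i_b, i_c

def observed_dq(i_q_true, theta_true, angle_error):
    """d-q current seen by a drive whose angle estimate is off by angle_error."""
    phases = inverse_park_clarke(0.0, i_q_true, theta_true)
    i_alpha, i_beta = clarke(*phases)
    return park(i_alpha, i_beta, theta_true + angle_error)

# Hypothetical numbers: 10 A of true q-axis current, 0.1 rad electrical error.
i_d_obs, i_q_obs = observed_dq(10.0, theta_true=0.7, angle_error=0.1)
print(f"observed i_d = {i_d_obs:.3f} A, observed i_q = {i_q_obs:.3f} A")
```

With zero angle error the observed current is purely q-axis. With a nonzero error, part of it appears on the d-axis, scaled by the sine of the error. A closed current loop sees the mirror image of this: it regulates the observed values, so the true current vector is the one that ends up rotated.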

The important point is that real robot PMSMs are not perfect sinusoidal machines from a textbook. PMSMs are usually designed for more sinusoidal back-EMF than BLDC motors, but stator slotting, magnet shape, winding distribution, and manufacturing tolerance still distort the waveform. Position-dependent torque variation remains.

Two terms are often mixed together but should be separated: cogging torque and torque ripple. Cogging torque is present even at zero current. It comes from the rotor magnets preferring certain positions relative to the stator slots. Torque ripple, as used here, appears only while current is flowing and comes from torque production that is not uniform over electrical angle. Both make the delivered torque less smooth than the commanded torque, but their causes are different.

High-Torque-Density Motors and Control Sensitivity

Robot motors are usually asked to produce high torque density in a small package. That often leads to a large rotor radius, high copper fill, a high pole count, tight packaging, and aggressive thermal design.

High-torque-density design often pairs well with high-pole-count PMSMs, but that choice has a control cost. As the pole-pair count $p$ increases, electrical angle $\theta_e$ changes faster with mechanical angle $\theta_m$:

\[\theta_e = p \, \theta_m + \theta_{offset}\]

So the same encoder error $\Delta\theta_m$ is multiplied by $p$ in the electrical frame:

\[\Delta\theta_e = p \, \Delta\theta_m\]

In high-pole-count motors, small mechanical angle errors, encoder offset, shaft runout, and interpolation noise therefore produce proportionally larger errors in the d-q decomposition. The compact torque density comes at the cost of higher commutation sensitivity.
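As a rough worked example, assume a hypothetical motor with $p = 21$ pole pairs and a mechanical angle error of $0.5^\circ$ (both values are chosen only for illustration). If the current loop regulates $i_d = 0$ and $i_q = i_{q,ref}$ in the misaligned frame of an ideal surface PMSM, the true torque-producing current shrinks by $\cos\Delta\theta_e$ and an unintended d-axis component of relative size $\sin\Delta\theta_e$ appears:

\[\Delta\theta_e = 21 \times 0.5^\circ = 10.5^\circ, \qquad \frac{\tau_e}{\tau_{e,ideal}} \approx \cos(10.5^\circ) \approx 0.983, \qquad \frac{|i_d|}{i_{q,ref}} \approx \sin(10.5^\circ) \approx 0.182\]

The same $0.5^\circ$ error on a motor with 4 pole pairs would give only $2^\circ$ of electrical error. In this simplified picture the average torque loss is small, but roughly 18% of the commanded current magnitude lands on the d-axis, where an ideal surface PMSM turns it into heat rather than torque.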

This connects directly to Physical AI. A policy usually does not know how sensitive the actuator’s internal electrical coordinates are. It simply asks for “a little more torque” at the next step. The drive must convert rotor angle into electrical angle, align the current vector, and absorb any error as torque ripple, heat, or vibration. The policy’s action space may look smooth and continuous, but the torque space realized by the drive sits on a much more fragile coordinate system.

Limits of Current Sensing

Current sensing is often treated too casually in robot control. It is easy to imagine that the drive knows each phase current exactly and in real time. In practice, many drives use shunt-resistor-based current sensing, and that measurement is tightly coupled to PWM switching.

Low-side shunt, dual-shunt, three-shunt, and single DC-link shunt configurations differ in detail, but they share one fact: current is not continuously “seen.” It is reconstructed during specific sampling windows.

Single-shunt (DC-link) systems are especially sensitive, because phase currents can only be reconstructed while an active vector is applied, and reconstruction becomes unreliable when the active-vector duration is too short. Even in three-shunt systems, the usable sampling window is limited by ADC trigger timing, dead time, switching transients, blanking time, amplifier slew rate, and low-pass filtering.

That means the feedback $i_q$ used by the current PI controller is already an estimate containing noise, delay, offset, and quantization. Consider the q-axis current loop:

\[v_q = K_p (i_{q,ref} - i_q) + K_i \int (i_{q,ref} - i_q) \, dt\]

This looks like a simple PI controller. But if $i_q$ is a sampled-and-reconstructed quantity, the loop is chasing an observed current with timing constraints, not the true continuous current. At low duty cycles or during fast current transitions, small timing errors can disturb the estimated $i_d$ and $i_q$. That can create unwanted d-axis current or increase torque ripple.
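A discrete-time version of the PI law makes the sampling explicit. The sketch below is a minimal illustration, not a description of any specific drive. The class name, the gains, the control period, and the clamping anti-windup scheme are all assumptions added here. The anti-windup in particular is not part of the equation above. The key point is that `update` only ever sees `i_meas`, the reconstructed sample taken once per control period.

```python
class DiscretePI:
    """Discrete-time PI current controller with output clamping (illustrative)."""

    def __init__(self, kp, ki, dt, v_limit):
        self.kp = kp            # proportional gain [V/A]
        self.ki = ki            # integral gain [V/(A*s)]
        self.dt = dt            # control period [s], e.g. one PWM period
        self.v_limit = v_limit  # available voltage magnitude [V]
        self.integral = 0.0     # integrator state, already scaled by ki [V]

    def update(self, i_ref, i_meas):
        """Return the q-axis voltage command for one control period."""
        error = i_ref - i_meas
        candidate = self.integral + self.ki * error * self.dt
        v_unsat = self.kp * error + candidate
        v = max(-self.v_limit, min(self.v_limit, v_unsat))
        # Assumed anti-windup: only accept the new integrator value
        # when the output is not clamped by the voltage limit.
        if v == v_unsat:
            self.integral = candidate
        return v

# Hypothetical gains and a 10 kHz control rate.
pi_q = DiscretePI(kp=1.0, ki=100.0, dt=1e-4, v_limit=12.0)
v_q = pi_q.update(i_ref=1.0, i_meas=0.0)
print(f"v_q = {v_q:.4f} V")
```

Any offset, delay, or reconstruction error in `i_meas` enters `error` directly, so the loop converges on the observed current rather than on the true one.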

This is not just “sensor noise.” It means the actuator bandwidth assumed by the policy and the torque bandwidth delivered by the drive are not the same.

Torque Constant Models and Nonlinearity

Robot control papers often write the relationship between torque and current as:

\[\tau_e = K_t i_q\]

As a first approximation, this is useful. But in real robots, we should not trust it too literally. The torque constant $K_t$ is not perfectly constant across the full operating range. Magnetic saturation, temperature rise, saliency, flux harmonics, inverter nonlinearity, rotor angle misalignment, and current bias can all change the torque produced by the same $i_q$.

In interior permanent magnet (IPM) motors, the difference between $L_d$ and $L_q$ (saliency) adds a reluctance torque component that depends on both $i_d$ and $i_q$, so the simple model is even less reliable. Inductance can also vary with electrical angle and operating point. This is why research on PMSM inductance identification and inverter nonlinearity compensation continues: the real machine is not as honest as the ideal equation.
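For reference, the standard d-q torque expression for a salient machine, written with the same symbols used earlier, adds a reluctance term to the magnet term:

\[\tau_e = \frac{3}{2} p \left[\lambda_m i_q + (L_d - L_q)\, i_d\, i_q\right]\]

When $L_d = L_q$, this reduces to the surface-PMSM form. When they differ, any unintended $i_d$ (from angle error, for example) changes torque directly, and because $L_d$ and $L_q$ shift with saturation, the torque produced per ampere depends on the operating point.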

The low-torque region is especially interesting. Manipulation, balancing, and fine contact control often depend on very small forces and torques. In this range, small nonlinearities can dominate the average model. Static friction, cogging, current quantization, sensor offset, and dead zones can make a small policy action produce no output at all until the command crosses a threshold, at which point the output jumps.
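A deliberately crude toy model shows the shape of this behavior. Everything in it is an assumption made for illustration: the function name, the torque constant, the current quantization step, and the static-friction threshold are hypothetical values, and real friction and cogging are far richer than a single threshold applied at rest.

```python
def realized_torque(tau_cmd, kt=0.1, lsb_amps=0.05, tau_static=0.02):
    """Toy static model of output torque near zero command (illustrative).

    kt         : assumed torque constant [N*m/A]
    lsb_amps   : assumed current command/measurement resolution [A]
    tau_static : assumed breakaway (static friction) torque [N*m]
    """
    i_ref = tau_cmd / kt
    # Quantize the current to the nearest resolvable step.
    i_q = round(i_ref / lsb_amps) * lsb_amps
    tau_motor = kt * i_q
    # Below the breakaway threshold, nothing reaches the output.
    if abs(tau_motor) <= tau_static:
        return 0.0
    return tau_motor

for tau_cmd in (0.001, 0.010, 0.015, 0.030, 0.050):
    print(f"cmd {tau_cmd:.3f} N*m -> out {realized_torque(tau_cmd):.3f} N*m")
```

In this toy model, commands spread smoothly over a small range collapse to zero output and then jump to a finite value, which is the kind of discontinuity a simulator with an ideal torque source does not show.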

Simulators often make this region too smooth. A policy that looks delicate in simulation may fail on hardware during edge contact, insertion, or stick-slip interaction.

Heat and Time-Varying Sim-to-Real Gap

Many discussions of sim-to-real stop at “simulation cannot perfectly match reality.” The more difficult issue is that reality itself changes over time.

After only a few minutes of motor operation, winding temperature rises, copper resistance changes, current-sensing offsets drift, and magnet flux can weaken slightly. In some systems, amplifier drift, board temperature, and bus-voltage sag add even more variation. The same $\tau_{des}$ command may produce a different response at a cold start than it does ten minutes into an experiment.
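The size of the thermal effect can be estimated with first-order linear models. The sketch below is an approximation under stated assumptions: the copper coefficient is the usual textbook value near room temperature, the magnet coefficient is only an order-of-magnitude figure for NdFeB magnets and depends on the grade, and the function names, reference values, and temperatures are hypothetical.

```python
ALPHA_COPPER = 0.0039   # copper resistance temperature coefficient [1/degC], approx.
ALPHA_MAGNET = -0.001   # assumed NdFeB flux coefficient [1/degC], grade dependent

def winding_resistance(r_ref, temp_c, temp_ref_c=20.0):
    """Linear estimate of winding resistance at temp_c [ohm]."""
    return r_ref * (1.0 + ALPHA_COPPER * (temp_c - temp_ref_c))

def torque_constant(kt_ref, temp_c, temp_ref_c=20.0, alpha_magnet=ALPHA_MAGNET):
    """Linear estimate of torque constant as magnet flux weakens [N*m/A]."""
    return kt_ref * (1.0 + alpha_magnet * (temp_c - temp_ref_c))

# Hypothetical motor: 1.0 ohm and 0.1 N*m/A at 20 degC, warmed to 80 degC.
r_hot = winding_resistance(1.0, 80.0)
kt_hot = torque_constant(0.1, 80.0)
print(f"R: 1.000 -> {r_hot:.3f} ohm, Kt: 0.1000 -> {kt_hot:.4f} N*m/A")
```

Under these assumptions, a 60 degC rise raises winding resistance by roughly a quarter and lowers torque per ampere by a few percent, without any change in the policy or the command.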

So sim-to-real is not only a gap between virtual and real systems. It also includes drift inside the real system.

This matters for learning-based control. A policy usually assumes stationary dynamics, at least implicitly. Domain randomization and online adaptation can help, but if drive-level dynamics keep shifting with temperature and duty cycle, the policy is not just facing contact uncertainty. It is facing actuator truthfulness that changes over time.

In other words, learning is hard not only because the environment is complex, but because the actuator itself lies in slightly different ways at each step.

Actuator Truthfulness from a Physical AI Perspective

At this point, we need to shift perspective. If we explain failures of Physical AI only through data scarcity, network architecture, reward design, or simulation fidelity, the explanation is too software-centered. Lower in the stack, the actuator is already non-ideal.

A torque command from a policy passes through current-loop bandwidth, sampling time, ADC timing, dead-time compensation, rotor-angle alignment, saturation, and thermal drift before it becomes something close to joint torque. Then reducer friction, backlash, transmission compliance, seal drag, and reflected inertia move joint-side behavior even further away from motor-side behavior.

The policy often expects something like:

\[\tau_{joint} \approx f_{ideal}(u_t)\]

Real hardware is closer to:

\[\tau_{joint} = f(u_t,\ \theta_e,\ i_{meas},\ T,\ V_{bus},\ \text{PWM timing},\ \text{inverter nonlinearity},\ \text{transmission state})\]

Here, $T$ is temperature, $V_{bus}$ is DC bus voltage, and transmission state includes friction history and compliance.

The message is simple: learning-based control is not interacting with a motor that produces force proportional to command. It is interacting with a state-dependent and time-dependent electromechanical system.

What QDD and Low-Reduction Actuation Mean

This discussion also explains why low-reduction actuation and quasi-direct drive (QDD) appear so often in modern robot hardware.

QDD does not remove the motor-drive non-idealities described above. Current sensing errors, rotor-angle misalignment, and torque ripple still exist. But QDD reduces the extra distortion introduced by high-ratio transmissions: backlash, reflected-inertia amplification, friction hysteresis, and force-transmission distortion.
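The standard reflected-inertia relation illustrates part of this. For an ideal reducer with ratio $N$, a rotor inertia $J_m$ and a motor-side friction torque $\tau_{f,m}$ appear at the joint as approximately:

\[J_{reflected} = N^2 J_m, \qquad \tau_{f,joint} \approx N \, \tau_{f,m}\]

With illustrative ratios (assumptions, not taken from any specific actuator), $N = 6$ multiplies rotor inertia by 36, while $N = 100$ multiplies it by 10,000. The higher the ratio, the more the joint-side dynamics are dominated by the motor and transmission rather than by the load the policy is trying to move.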

In other words, the imperfect torque produced at the motor side is less severely warped before it reaches the joint. The benefit of QDD is not just that it is “backdrivable.” It helps preserve the torque-motion relationship that a learned policy implicitly assumes.

From this perspective, hardware that works well with learning-based control is not simply hardware with more sensors or faster computers. More fundamentally, it is hardware with higher actuation truthfulness: a torque command is transferred to mechanical output in a relatively consistent way.

The success of Physical AI depends not only on algorithmic intelligence, but also on the honesty of the actuator physics underneath it.