ADAM: Auto-Animation of Digital Humans Using AI Methods

Research output: Ph.D. thesis

Abstract

IMU-based motion capture data notoriously lacks global positional information due to inherent limitations of the sensing hardware. In addition, it often suffers from physically implausible artefacts, such as interpenetration of body parts, penetration of the floor, and foot skating.

The research in this thesis presents methods for reconstructing global position trajectories from local pose information and improving IMU motion data quality in an automated fashion.

The first phase of this work introduces a novel method for reconstructing global positions using neural networks. A U-Net convolutional neural network was trained to process pose information for real-time position estimation. The work leveraged a diverse dataset, encompassing a wide range of activities and subjects, to train this network. Superior error properties were observed with the U-Net compared to a more standard convolutional neural network architecture, leading to more accurate global position predictions.
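The core idea of recovering global position from local pose can be illustrated with a minimal sketch. The snippet below is hypothetical and not the thesis's actual pipeline: it assumes some network (such as the U-Net described above) has already predicted per-frame root displacements, and shows only the final integration step that turns those displacements into a global trajectory.

```python
import numpy as np

def integrate_displacements(displacements, start=None):
    """Cumulatively sum per-frame root displacement vectors of shape (T, 3)
    into global positions of shape (T, 3), anchored at `start`."""
    if start is None:
        start = np.zeros(3)
    # Each global position is the start point plus the running sum of
    # all displacements predicted up to that frame.
    return start + np.cumsum(displacements, axis=0)

# Toy example: constant forward motion of 0.1 m per frame for 5 frames.
disp = np.tile([0.1, 0.0, 0.0], (5, 1))
traj = integrate_displacements(disp)
```

Any drift in the per-frame predictions accumulates under this integration, which is one reason the accuracy of the displacement estimates matters so much.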

Building upon this foundation, subsequent research refined the global trajectory reconstruction process and added trajectory reconstruction in the vertical direction. A lean U-Net model was developed, designed to integrate local pose information with acceleration signals from the IMU sensors. The model estimated short, character-centred trajectories over a sequence of frames, employing a weighted average approach to minimise estimation bias and noise. Tested on a novel dataset comprising actors not included in the training set, this enhanced method showed good accuracy in reconstructing ground truth trajectories. Acceleration signals were shown to play a critical role in maintaining trajectory reconstruction quality when pose data quality declined.
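The weighted-average idea can be sketched in a few lines. This is an illustrative reconstruction, not the thesis's exact formulation: it assumes each window produces a short trajectory prediction over its frames, and that overlapping predictions are blended with per-position weights (here triangular, favouring window centres) to suppress bias and noise at window edges.

```python
import numpy as np

def blend_windows(window_preds, starts, total_len, window_weights):
    """Blend overlapping per-window trajectory predictions into one
    sequence of length `total_len` via a weighted average.

    window_preds  : list of (W,) arrays, one short prediction per window
    starts        : start frame index of each window
    window_weights: (W,) weights applied within every window
    """
    acc = np.zeros(total_len)   # weighted sum of predictions per frame
    wsum = np.zeros(total_len)  # total weight accumulated per frame
    for pred, s in zip(window_preds, starts):
        acc[s:s + len(pred)] += pred * window_weights
        wsum[s:s + len(pred)] += window_weights
    # Normalise; the epsilon guards frames covered by no window.
    return acc / np.maximum(wsum, 1e-8)

# Two overlapping windows of length 4 over a 6-frame sequence.
weights = np.array([1.0, 2.0, 2.0, 1.0])  # emphasise window centres
blended = blend_windows([np.ones(4), 3 * np.ones(4)], [0, 2], 6, weights)
```

In the overlap region each frame's value is pulled towards the window whose centre is nearer, which is the intended smoothing effect of the weighting.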

The final aspect of this thesis tackled inherent limitations in IMU-based motion capture, such as self-penetrating body parts, foot skating, and floating. These issues significantly hamper the realism achievable with cost-effective IMU systems. To overcome this, reinforcement learning was utilised to train an AI agent that could mimic error-prone sample motions within a simulated environment. This approach could prevent these common distortions while preserving the unique characteristics of the sample motions. The agent was trained on a blend of faulty IMU data and high-quality optical motion capture data. By examining different configurations of observation and action spaces, optimal settings were identified for use on unseen data. The efficacy of this approach was validated using a set of quantitative metrics. These tests, conducted on a benchmark dataset of IMU-based motion data from actors outside the training set, demonstrated the method's capability to enhance the realism and usability of IMU-based motion capture systems, narrowing the gap with marker-based alternatives.
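A common way to make a physics-simulated agent imitate reference motion while suppressing artefacts is a shaped reward that tracks the reference pose but penalises physically implausible behaviour. The sketch below is a hypothetical illustration of that general pattern, not the reward actually used in the thesis; the function names, weights, and scale factors are all assumptions.

```python
import numpy as np

def imitation_reward(sim_pose, ref_pose, foot_vel, foot_contact,
                     penetration_depth,
                     w_pose=0.7, w_skate=0.2, w_pen=0.1):
    """Hypothetical shaped reward for motion imitation in simulation.

    sim_pose, ref_pose : joint-angle vectors of the simulated and
                         (possibly faulty) reference character
    foot_vel           : (F, 3) linear velocities of the feet
    foot_contact       : (F,) contact indicators, 1.0 when a foot touches ground
    penetration_depth  : how far the character sinks below the floor (m)
    """
    # Track the reference pose: exponentiated negative squared error.
    pose_err = np.linalg.norm(sim_pose - ref_pose)
    r_pose = np.exp(-2.0 * pose_err ** 2)
    # Penalise foot skating: foot velocity while in ground contact.
    skate = np.sum(foot_contact * np.linalg.norm(foot_vel, axis=-1))
    r_skate = np.exp(-5.0 * skate)
    # Penalise floor penetration.
    r_pen = np.exp(-10.0 * max(penetration_depth, 0.0))
    return w_pose * r_pose + w_skate * r_skate + w_pen * r_pen

# Perfect tracking with planted feet and no penetration yields reward 1.0.
r_perfect = imitation_reward(np.zeros(3), np.zeros(3),
                             np.zeros((2, 3)), np.ones(2), 0.0)
```

Because the simulation enforces rigid-body contact, the agent cannot reproduce interpenetration or floating even when the reference motion contains them, which is the mechanism the abstract describes.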
Original language: English
Publisher: Department of Computer Science, Faculty of Science, University of Copenhagen
Number of pages: 120
Publication status: Published - 2024
