I'm a research scientist at DJI in Shenzhen, China, where I work on a small team focused on 4D world models.
I have experience in large-scale scene reconstruction (combining UAV and ground-based capture), camera pose estimation, motion capture, VR, multi-modal recommendation and search, defect detection, and machine vision. I earned my Master's degree from the National University of Defense Technology (NUDT).
A method for 3D hand pose and shape estimation that fuses implicit image features and explicit 2D joint features with pixel direction information, and reduces motion jitter via confidence-based prediction.
A method that enhances 3D Gaussian Splatting (3DGS) with ray tracing to model specular inter-reflections. It separates specular and rough surface types and optimizes geometry, which improves rendering accuracy and enables scene editing.
A hybrid representation combining 3DGS with textured meshes: meshes handle texture-rich flat regions, while Gaussians model complex geometry. The two are trained jointly with a warm-up strategy and transmittance-aware supervision, achieving comparable rendering quality at higher FPS with fewer Gaussians.
Generative 3D reconstruction from partial observations is hindered by limited view range and inconsistent generation. The training-free Zero-P-to-3 addresses this by integrating local dense observations and multi-source priors via fusion-based DDIM sampling and iterative refinement, outperforming state-of-the-art methods, especially in invisible regions.
Estimating 3D rotations depends on the choice of rotation representation. This work shows that common orthogonalization procedures reduce training efficiency, and advocates learning unorthogonalized Pseudo Rotation Matrices (PRoM), which converge faster to better solutions and achieve state-of-the-art results in human pose estimation on large-scale benchmarks.
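
To make the distinction concrete, here is a minimal sketch contrasting an orthogonalized rotation output with an unorthogonalized one. It is illustrative only and not the paper's implementation: Gram-Schmidt is assumed as one example of a "common orthogonalization procedure", and the function names (`gram_schmidt_rotation`, `pseudo_rotation`, `orthogonality_error`) and the sample values are hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    return [x / n for x in v]


def _cross(u, v):
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def gram_schmidt_rotation(raw9):
    """Map 9 raw network outputs to a rotation matrix (list of 3 rows).

    Assumed orthogonalization: Gram-Schmidt on the first two rows,
    with the third row taken as their cross product.
    """
    a1, a2 = raw9[0:3], raw9[3:6]
    b1 = _normalize(a1)
    proj = _dot(b1, a2)
    b2 = _normalize([x - proj * y for x, y in zip(a2, b1)])
    b3 = _cross(b1, b2)
    return [b1, b2, b3]


def pseudo_rotation(raw9):
    """Reshape 9 raw outputs into a 3x3 matrix with no orthogonalization."""
    return [raw9[0:3], raw9[3:6], raw9[6:9]]


def orthogonality_error(R):
    """Largest absolute entry of R R^T - I (zero for a true rotation)."""
    err = 0.0
    for i in range(3):
        for j in range(3):
            target = 1.0 if i == j else 0.0
            err = max(err, abs(_dot(R[i], R[j]) - target))
    return err


# Hypothetical raw outputs, close to but not exactly a rotation.
raw = [0.9, 0.1, 0.0, -0.2, 1.1, 0.1, 0.0, 0.0, 0.8]
print(orthogonality_error(gram_schmidt_rotation(raw)))
print(orthogonality_error(pseudo_rotation(raw)))
```

In the first variant, gradients during training must pass through the normalization and projection steps. In the second, a loss can be applied to the raw matrix directly, which is the kind of unorthogonalized target the summary above refers to.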