Software packages
Stochastic inertial-accelerated methods with delayed derivatives
For solving weakly-convex stochastic optimization in a distributed setting, we propose a stochastic inertial-accelerated method, which has guaranteed convergence even when outdated derivative information is used during the updates. The information delay can slow down convergence, but its effect diminishes with the number of updates for composite smooth problems. Empirically, the inertial acceleration significantly speeds up convergence compared to a non-accelerated counterpart. Moreover, the asynchronous update can yield significantly higher parallelization speedup than the synchronous counterpart.
APAM: asynchronous parallel adaptive stochastic gradient method
We propose APAM, an asynchronous parallel adaptive stochastic gradient method, for solving stochastic optimization problems. Adaptive stochastic gradient methods (e.g., Adam, AMSGrad) have been extensively used in training deep learning models and often converge significantly faster than non-adaptive methods. Through asynchronous parallel computing, APAM achieves near-linear speedup on top of the already fast convergence of the adaptive method.
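The serial core that such a method parallelizes is an adaptive stochastic gradient step. Below is a minimal sequential AMSGrad-style sketch on a least-squares toy problem (not APAM itself, which adds the asynchronous parallel layer); the hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 8))
x_star = rng.standard_normal(8)
b = A @ x_star                       # consistent data, so stochastic gradients vanish at x_star

alpha, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8   # assumed hyperparameters
x = np.zeros(8)
m = np.zeros(8)                      # first-moment (momentum) estimate
v = np.zeros(8)                      # second-moment estimate
v_hat = np.zeros(8)                  # running max of v (the AMSGrad correction)

for k in range(2000):
    i = rng.integers(0, 40)                  # sample one data row
    g = (A[i] @ x - b[i]) * A[i]             # stochastic gradient
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    v_hat = np.maximum(v_hat, v)             # keep the max: guarantees non-increasing steps
    x = x - alpha * m / (np.sqrt(v_hat) + eps)

err = np.linalg.norm(x - x_star)
```

An asynchronous version lets multiple workers compute gradients `g` concurrently and apply this update with possibly stale `x`, which is where the near-linear speedup comes from.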
ADMM for quadratic assignment problem (QAP)
By a lifting technique, we obtain the tightest convex relaxation of QAP among all relaxations that use quadratic constraints. To solve the lifted SDP relaxation, we first use a facial reduction technique to remove redundant constraints and then apply the alternating direction method of multipliers (ADMM). On two classes of QAP benchmark data sets, our method consistently gives tighter lower bounds than state-of-the-art methods and can sometimes reach the optimal value.
ARock: asynchronous parallel coordinate updates
We propose ARock, an asynchronous parallel coordinate update method, for finding a fixed point of an operator. Compared to its synchronous parallel counterpart, ARock does not synchronize all nodes after each update, and thus it eliminates the idle waiting time incurred by synchronization and load imbalance. ARock enjoys the same convergence guarantee as its synchronous counterpart but yields much better parallelization speedup.
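The scheme can be simulated sequentially: each step reads a possibly outdated copy of the iterate, then updates a single random coordinate toward the fixed point. The sketch below (illustrative, not the released code) uses the operator T(x) = x - eta*(Mx - d) with M positive definite, whose fixed point solves Mx = d; the delay and relaxation parameter are assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
B = rng.standard_normal((n, n))
M = B @ B.T + n * np.eye(n)          # positive definite
d = rng.standard_normal(n)
x_star = np.linalg.solve(M, d)       # the fixed point of T
eta = 1.0 / np.linalg.norm(M, 2)     # small enough that T is nonexpansive

def T(x):
    return x - eta * (M @ x - d)

x = np.zeros(n)
history = [x.copy()]
delay = 4                             # simulated staleness of the read
step = 0.5                            # relaxation parameter (assumed)

for k in range(4000):
    x_hat = history[max(0, len(history) - 1 - delay)]  # stale read of the iterate
    i = rng.integers(n)                                # pick one coordinate at random
    x[i] -= step * (x_hat - T(x_hat))[i]               # update only that coordinate
    history.append(x.copy())

err = np.linalg.norm(x - x_star)
```

In the real asynchronous setting, many nodes run this loop concurrently on shared memory; the stale read models reading `x` while other nodes are writing to it.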
Tensor completion by parallel matrix factorization
Dictionary learning and patch-dictionary method for whole-image recovery
We propose a new dictionary learning method based on the accelerated alternating proximal gradient method. Using a given (analytic or learned) dictionary, we sparse-code each image patch and represent the whole image by paving all non-overlapping patches. In this way, we can recover a whole image from its global measurements using only a patch-size dictionary. By averaging a few recoveries from different partitions, we eliminate the “stitching” effect. Our method obtains more faithful recoveries and is more efficient than state-of-the-art methods.
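The paving-and-averaging idea can be sketched as follows: process each non-overlapping patch with a patch-size dictionary, reassemble the image, and average the results from a few shifted partitions to suppress stitching artifacts. This toy stands in least-squares projection onto the dictionary span for the actual sparse coding, and recovers from the image itself rather than global measurements; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
p, H, W = 4, 16, 16                       # patch size and image size (assumed)
D = rng.standard_normal((p * p, 8))       # a patch-size dictionary (8 atoms)
img = rng.random((H, W))

def recover_partition(img, shift):
    """Pave the (shifted) image with non-overlapping patches and project each
    onto the span of the dictionary (a stand-in for sparse coding)."""
    out = np.zeros_like(img)
    shifted = np.roll(img, (shift, shift), axis=(0, 1))
    for i in range(0, H, p):
        for j in range(0, W, p):
            patch = shifted[i:i+p, j:j+p].reshape(-1)
            code = np.linalg.lstsq(D, patch, rcond=None)[0]
            out[i:i+p, j:j+p] = (D @ code).reshape(p, p)
    return np.roll(out, (-shift, -shift), axis=(0, 1))

# Average recoveries from a few shifted partitions to remove the stitching effect.
rec = np.mean([recover_partition(img, s) for s in range(0, p, 2)], axis=0)
```

Each partition places the patch boundaries at different pixels, so averaging them smooths out the block-boundary discontinuities a single partition would leave.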
Matrix/tensor factorization by block coordinate update methods
We propose a block coordinate update method with three different update schemes: block minimization, block proximal point, and block proximal gradient. A unified convergence analysis is provided for the three schemes. A subsequence convergence result is shown first, and a whole-sequence convergence result is then established for problems that satisfy the Kurdyka-Łojasiewicz property. Our method has been applied to matrix factorization, CP and Tucker tensor factorization with or without nonnegativity constraints, and dictionary learning problems.
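As an illustration of one of the three schemes, here is a block proximal-gradient sketch for nonnegative matrix factorization: minimize 0.5*||M - XY||_F^2 over X >= 0, Y >= 0, updating the blocks X and Y alternately with a gradient step followed by projection onto the nonnegative orthant. The sizes and iteration count are assumed; this is a toy, not the released package.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 30, 20, 4
M = rng.random((m, r)) @ rng.random((r, n))   # nonnegative matrix of rank r

X = rng.random((m, r))
Y = rng.random((r, n))

for k in range(500):
    # Block X: gradient step with a 1/Lipschitz step size, then project to X >= 0.
    Lx = np.linalg.norm(Y @ Y.T, 2)           # Lipschitz constant of the X-gradient
    X = np.maximum(0.0, X - ((X @ Y - M) @ Y.T) / Lx)
    # Block Y: the same proximal-gradient scheme.
    Ly = np.linalg.norm(X.T @ X, 2)
    Y = np.maximum(0.0, Y - (X.T @ (X @ Y - M)) / Ly)

rel_err = np.linalg.norm(M - X @ Y) / np.linalg.norm(M)
```

The block-minimization scheme would instead solve each nonnegative least-squares subproblem exactly, and the block proximal-point scheme would add a quadratic proximal term to it.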
Learning circulant kernels
We propose to learn a circulant sensing matrix by minimizing its mutual coherence with a given (analytic or learned) dictionary. With the learned sensing matrix, the recovery quality of a signal or image can be significantly improved, as compared to a random sensing matrix. Also, we propose to learn a circulant sensing matrix together with a dictionary to further improve the recovery.
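The quantity being minimized can be made concrete: the mutual coherence of the effective dictionary formed by a circulant sensing matrix applied to a dictionary D. The sketch below only evaluates the objective for a random kernel (the proposed method would optimize the kernel instead); all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, K = 16, 8, 32
kernel = rng.standard_normal(n)
C = np.stack([np.roll(kernel, i) for i in range(n)])  # full circulant matrix from the kernel
Phi = C[:m]                                           # keep m rows: the circulant sensing matrix
D = rng.standard_normal((n, K))                       # a given dictionary

A = Phi @ D                                           # effective dictionary
A = A / np.linalg.norm(A, axis=0)                     # normalize columns
G = A.T @ A                                           # Gram matrix
mu = np.max(np.abs(G - np.eye(K)))                    # mutual coherence: max off-diagonal |G_ij|
```

Minimizing `mu` over the kernel entries makes the columns of `Phi @ D` as close to orthogonal as possible, which is what improves recovery over a random sensing matrix.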
Iteratively reweighted least squares
Citation: M. Lai, Y. Xu and W. Yin. Improved iteratively reweighted least squares for unconstrained smoothed Lq minimization. SIAM Journal on Numerical Analysis, 51(2), pp. 927–957, 2013.
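A minimal IRLS sketch in the spirit of the cited paper: minimize the smoothed Lq objective sum((x_i^2 + eps^2)^(q/2)) subject to Ax = b by solving a reweighted least-squares problem at each iteration. The eps-update rule and parameters here are illustrative, not the paper's exact choices.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, s = 20, 50, 3
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = rng.standard_normal(s)  # s-sparse signal
b = A @ x_true

q, eps = 0.5, 1.0
x = np.linalg.lstsq(A, b, rcond=None)[0]   # least-squares initialization

for k in range(50):
    w = (x**2 + eps**2) ** (q / 2 - 1)     # IRLS weights from the smoothed Lq term
    # Weighted least squares with the equality constraint A x = b has the
    # closed form x = W^{-1} A^T (A W^{-1} A^T)^{-1} b, where W = diag(w).
    Winv_At = A.T / w[:, None]             # W^{-1} A^T
    x = Winv_At @ np.linalg.solve(A @ Winv_At, b)
    eps = max(eps * 0.7, 1e-6)             # gradually sharpen the smoothing (assumed rule)

rec_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

As `eps` shrinks, small entries of `x` get very large weights and are driven toward zero, which is how the reweighted quadratic surrogate mimics the nonconvex Lq penalty.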
Matrix completion with nonnegative factors
We propose a novel model for nonnegative matrix completion that performs nonnegative matrix factorization (NMF) from partial observations of a nonnegative matrix. The model builds on NMF and low-rank matrix completion. We apply the alternating direction method of multipliers (ADMM) to solve the model. Despite nonconvexity, ADMM enjoys nice convergence behavior and performs well numerically. By exploiting the nonnegativity of the underlying matrix, our method consistently outperforms low-rank matrix completion methods that do not enforce the nonnegativity constraint.
Citation: Y. Xu, W. Yin, Z. Wen, and Y. Zhang. An alternating direction algorithm for matrix completion with nonnegative factors. Frontiers of Mathematics in China, 7(2), 365–384, 2012.
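The underlying model can be sketched compactly. The paper solves it with ADMM; for brevity this toy uses plain alternating projected-gradient steps on min 0.5*||P_Omega(XY - M)||_F^2 with X >= 0, Y >= 0, where P_Omega keeps only the observed entries. Sizes, sampling rate, and iteration count are assumed.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n, r = 30, 30, 3
M = rng.random((m, r)) @ rng.random((r, n))       # nonnegative low-rank ground truth
Omega = rng.random((m, n)) < 0.5                  # observe about 50% of the entries

X = rng.random((m, r))
Y = rng.random((r, n))
for k in range(500):
    R = np.where(Omega, X @ Y - M, 0.0)           # residual on the observed set only
    X = np.maximum(0.0, X - 0.5 * (R @ Y.T) / np.linalg.norm(Y @ Y.T, 2))
    R = np.where(Omega, X @ Y - M, 0.0)
    Y = np.maximum(0.0, Y - 0.5 * (X.T @ R) / np.linalg.norm(X.T @ X, 2))

# Relative error over ALL entries, including the unobserved ones.
rel_err = np.linalg.norm(X @ Y - M) / np.linalg.norm(M)
```

Because the factors X and Y are themselves nonnegative, the completed matrix XY inherits the nonnegativity of the ground truth, which is the structural advantage over an unconstrained low-rank completion.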
