Adaptive projection-free methods for constrained variational inequalities in machine learning 
 

Thanh Hieu Pham¹
¹ Posts and Telecommunications Institute of Technology


Abstract

We propose the Penalty-Regularized Adaptive Constrained Gradient Method (PR-A-CGM), a projection-free algorithm for solving variational inequality problems with pseudo-monotone operators and convex functional constraints. Unlike projection-based methods, PR-A-CGM enforces feasibility by folding a smooth penalty term into the update direction, avoiding costly or intractable projections while retaining convergence guarantees under pseudo-monotonicity, even with noisy oracles. We prove weak convergence under standard assumptions, establish strong convergence with explicit rates under stronger conditions, and validate the method on machine learning applications such as fairness-constrained classification. Experiments show that PR-A-CGM improves feasibility and robustness over projection-free baselines while narrowing the gap to projection-based methods. These results position penalty-regularized primal methods as practical tools for constrained optimization in modern large-scale learning.
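
The abstract states the core mechanism only at a high level: replace the projection step with a smooth penalty term folded into the update direction. The Python sketch below illustrates that idea under stated assumptions; the function name pr_a_cgm_sketch, the quadratic exterior penalty, and the step-size and penalty schedules (eta0/sqrt(k) and rho0*sqrt(k)) are illustrative placeholders, not the schedules analyzed in the paper.

import numpy as np

def pr_a_cgm_sketch(F, constraints, x0, steps=2000, eta0=0.1, rho0=1.0):
    """Minimal sketch of a penalty-regularized, projection-free VI update.

    Approximately solves VI(F, C) with C = {x : g_i(x) <= 0} by following
    -(F(x) + rho_k * grad P(x)), where P(x) = 0.5 * sum_i max(0, g_i(x))^2
    is a smooth exterior penalty. All schedules here are assumptions.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(1, steps + 1):
        eta = eta0 / np.sqrt(k)   # diminishing step size (assumed)
        rho = rho0 * np.sqrt(k)   # growing penalty weight (assumed)
        # grad P(x) = sum_i max(0, g_i(x)) * grad g_i(x); zero on the feasible set.
        pen = sum(max(g(x), 0.0) * dg(x) for g, dg in constraints)
        x = x - eta * (F(x) + rho * pen)  # no projection onto C
    return x

# Toy check (illustrative): a strongly monotone operator whose VI solution
# is the origin, constrained to the unit ball.
F = lambda x: np.array([x[1], -x[0]]) + 0.1 * x      # rotation + contraction
g = lambda x: float(x @ x) - 1.0                     # g(x) <= 0: unit ball
dg = lambda x: 2.0 * x
print(pr_a_cgm_sketch(F, [(g, dg)], x0=[2.0, 2.0]))  # approaches [0, 0]

Because the penalty gradient vanishes on the feasible set, near-feasible iterates behave like unconstrained VI steps; the growing weight rho_k is what drives constraint violations toward zero, which mirrors the feasibility improvements the experiments report.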
 

