Abstract:
The general theory of greedy approximation with respect to arbitrary dictionaries is well developed in the case of real Banach spaces. Recently some results proved for the Weak Chebyshev Greedy Algorithm (WCGA) in the case of real Banach spaces were extended to the case of complex Banach spaces. In this paper we extend some of the results known in the real case for greedy algorithms other than the WCGA to the case of complex Banach spaces.
Bibliography: 25 titles.
Keywords:
complex Banach space, greedy approximation, weak greedy algorithm with free relaxation, greedy algorithm with weakness and relaxation, incremental algorithm.
Sparse and greedy approximation theory is important in applications. This theory has been actively developing for about 40 years. In the beginning researchers were interested in approximation in Hilbert spaces (see, for instance, [12], [15], [17], [18], [1], [7], [4], and [2]; for a detailed history, see [22]). Later, this theory was extended to the case of real Banach spaces (see, for instance, [10], [20], [13], and [5]; for a detailed history see [22] and [23]) and to the case of convex optimization (see, for instance, [3], [11], [16], [19], [24], [25], [8], [14], and [6]). The reader can find some open problems in the book [22].
We study greedy approximation with respect to arbitrary dictionaries in Banach spaces. This general theory is well developed in the case of real Banach spaces (see the book [22]). Recently (see [9]) some results proved in the case of real Banach spaces were extended to the case of complex Banach spaces. The Weak Chebyshev Greedy Algorithm (WCGA) was studied in [9]. In this paper we discuss greedy algorithms other than the WCGA. For them we extend some results known in the real case to the case of complex Banach spaces. We use the same main ideas as in the real case: first, we prove a recurrent inequality between the norms \|f_m\| and \|f_{m-1}\| of the residuals of the algorithm; second, we analyze this recurrent inequality. The step from the real case to the complex case requires some technical modifications in the proofs of the corresponding recurrent inequalities. For the reader's convenience we present a detailed analysis here.
Let X be a (real or complex) Banach space with norm \|\cdot\|. We say that a set {\mathcal D} of elements (functions) from X is a dictionary if each g\in{\mathcal D} has norm bounded by one (\|g\|\leqslant 1) and the closure of \operatorname{span}{\mathcal D} is X. It is convenient for us to consider, along with the dictionary {\mathcal D}, its symmetrization. In the case of real Banach spaces we set
{\mathcal D}^{\pm}:=\{\pm g: g\in{\mathcal D}\}.
In the case of complex Banach spaces we set
{\mathcal D}^{\circ}:=\{e^{i\theta}g: g\in{\mathcal D},\ \theta\in[0,2\pi)\}.
In the notation {\mathcal D}^{\circ} the symbol \circ refers to the unit circle.
For a dictionary {\mathcal D} define
A_1^{o}({\mathcal D}):=\Bigl\{f\in X: f=\sum_{i=1}^{N}a_ig_i,\ g_i\in{\mathcal D},\ \sum_{i=1}^{N}|a_i|\leqslant 1,\ N\in\mathbb N\Bigr\}
(with real coefficients a_i in the real case and complex coefficients in the complex case) and denote the closure (in X) of A_1^{o}({\mathcal D}) by A_1({\mathcal D}). Then it is clear that A_1({\mathcal D}) is the closure (in X) of the convex hull of {\mathcal D}^{\pm} in the real case and of {\mathcal D}^{\circ} in the complex case. Also, denote by \operatorname{conv}({\mathcal D}) the closure (in X) of the convex hull of {\mathcal D}.
We introduce a new norm, associated with the dictionary {\mathcal D}, in the dual space X^* by the formula
\|F\|_{\mathcal D}:=\sup_{g\in{\mathcal D}}|F(g)|, \qquad F\in X^*.
Note that, clearly, in the real case \|F\|_{\mathcal D}=\sup_{\phi\in{\mathcal D}^\pm}F(\phi) and in the complex case \|F\|_{\mathcal D}= \sup_{\phi\in{\mathcal D}^{\circ}}\operatorname{Re}(F(\phi)).
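Indeed, in the complex case the phase can be optimized separately for each dictionary element: for g\in{\mathcal D} and F\in X^* we have
\sup_{\theta\in[0,2\pi)}\operatorname{Re}\bigl(e^{i\theta}F(g)\bigr)=|F(g)|,
with the supremum attained at e^{i\theta}=\overline{\operatorname{sign}F(g)} when F(g)\ne 0; taking the supremum over g\in{\mathcal D} then gives the displayed identity.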
In this paper we study greedy algorithms with respect to {\mathcal D}. For a non-zero element h\in X we let F_h denote a norming (peak) functional for h:
\|F_h\|=1, \qquad F_h(h)=\|h\|.
The existence of such a functional is guaranteed by the Hahn–Banach theorem.
We will use the following three simple and well-known lemmas. In the case of real Banach spaces these lemmas were proved, for instance, in [22], Chap. 6, pp. 342–343.
Lemma 2.2. Let X be a uniformly smooth Banach space and L be a finite-dimensional subspace of X. For any f\in X\setminus L let f_L denote the best approximant of f in L. Then
F_{f-f_L}(\phi)=0 \quad\text{for all}\quad \phi\in L.
Denote by \nu the complex conjugate of \operatorname{sign}(F_{f-f_L}(\phi)), where \operatorname{sign} z:=z/|z| for z\ne 0. Then \nu F_{f-f_L}(\phi)=|F_{f-f_L}(\phi)|. For any \lambda\geqslant 0, from the definition of \rho(u) we obtain
The inequality \sup_{f\in A_1({\mathcal D})}|F(f)|\geqslant\|F\|_{\mathcal D} is obvious because {\mathcal D} \subset A_1({\mathcal D}). We prove the reverse inequality. Take any f\in A_1({\mathcal D}). Then for any \varepsilon >0 there exist g_1^\varepsilon,\dots,g_N^\varepsilon \in {\mathcal D} and numbers a_1^\varepsilon,\dots,a_N^\varepsilon such that |a_1^\varepsilon|+\cdots+|a_N^\varepsilon| \leqslant 1 and
\Bigl\|f-\sum_{i=1}^{N}a_i^\varepsilon g_i^\varepsilon\Bigr\|\leqslant\varepsilon.
The inequality \sup_{f\in\operatorname{conv}({\mathcal D})}\operatorname{Re}F(f)\geqslant\sup_{g\in{\mathcal D}}\operatorname{Re}F(g) is obvious because {\mathcal D} \subset \operatorname{conv}({\mathcal D}). We prove the reverse inequality. Take any f\in \operatorname{conv}({\mathcal D}). Then for any \varepsilon >0 there exist g_1^\varepsilon,\dots,g_N^\varepsilon \in {\mathcal D} and non-negative numbers a_1^\varepsilon,\dots,a_N^\varepsilon such that a_1^\varepsilon+\cdots+a_N^\varepsilon=1 and
\Bigl\|f-\sum_{i=1}^{N}a_i^\varepsilon g_i^\varepsilon\Bigr\|\leqslant\varepsilon.
In this section we discuss three greedy-type algorithms and prove convergence and rate of convergence results for those algorithms. In the case of real Banach spaces these algorithms and the corresponding results on their convergence and rate of convergence are known. We show here how to modify these algorithms to the case of complex Banach spaces.
3.1. WGAFR
The following algorithm — WGAFR — can be defined in the same way in both the real and complex cases. We present the definition in the complex case.
The Weak Greedy Algorithm with Free Relaxation (WGAFR). Let \tau:=\{t_m\}_{m=1}^\infty, t_m\in[0,1], be a weakness sequence. We define f_0:=f and G_0:=0. Then for each m\geqslant 1 we have the following inductive definition.
(1) \varphi_m \in {\mathcal D} is any element satisfying
|F_{f_{m-1}}(\varphi_m)| \geqslant t_m\|F_{f_{m-1}}\|_{\mathcal D}.
(2) Find w_m and \lambda_m such that
\bigl\|f-((1-w_m)G_{m-1}+\lambda_m\varphi_m)\bigr\| = \inf_{\lambda,w}\bigl\|f-((1-w)G_{m-1}+\lambda\varphi_m)\bigr\|
and define G_m:=(1-w_m)G_{m-1}+\lambda_m\varphi_m.
(3) Set f_m:=f-G_m.
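To make the scheme concrete, here is a minimal numerical sketch (ours, not from the paper; the function name wgafr and all parameter choices are illustrative) in the Hilbert-space model X=\mathbb C^n, where the norming functional is explicit, F_h(g)=\langle g,h\rangle/\|h\|, the dictionary is a finite list of unit vectors, and the weakness parameter is a constant t.

```python
import numpy as np

def wgafr(f, dictionary, t=1.0, iters=50):
    """Sketch of the WGAFR in C^n with the Euclidean norm.

    dictionary: list of unit-norm vectors in C^n.  Since X is a Hilbert
    space, F_{f_{m-1}}(g) = <g, f_{m-1}> / ||f_{m-1}||, so step (1)
    reduces to (weakly) maximizing |<g, f_{m-1}>| over the dictionary.
    """
    G = np.zeros_like(f)
    fm = f.copy()
    for _ in range(iters):
        if np.linalg.norm(fm) < 1e-14:
            break
        # (1) weak greedy step: |F_{f_{m-1}}(phi_m)| >= t * ||F_{f_{m-1}}||_D
        scores = [abs(np.vdot(g, fm)) for g in dictionary]
        best = max(scores)
        phi = dictionary[next(i for i, s in enumerate(scores) if s >= t * best)]
        # (2) free relaxation: (1-w)G + lambda*phi ranges over span{G, phi},
        # so the minimizer is the orthogonal projection of f onto that span
        A = np.stack([G, phi], axis=1)
        coef, *_ = np.linalg.lstsq(A, f, rcond=None)
        G = A @ coef                      # G_m
        fm = f - G                        # (3) residual f_m
    return G, fm
```

In this model step (2) is exactly the orthogonal projection of f onto \operatorname{span}\{G_{m-1},\varphi_m\}, since (1-w)G_{m-1}+\lambda\varphi_m with free w and \lambda ranges over that whole subspace; in a general uniformly smooth space both the norming functional and the two-parameter minimization would require a numerical optimization routine.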
We begin with the main lemma. In the case of real Banach spaces this lemma is known (see, for instance, [22], p. 350, Lemma 6.20).
Lemma 3.1. Let X be a uniformly smooth complex Banach space with modulus of smoothness \rho(u). Take a number \varepsilon\geqslant 0 and two elements f and f^\varepsilon of X such that
\|f-f^\varepsilon\|\leqslant\varepsilon, \qquad \frac{f^\varepsilon}{A(\varepsilon)}\in A_1({\mathcal D})
with some number A(\varepsilon)>0. Then for the WGAFR we have
\|f_m\|\leqslant\|f_{m-1}\|\inf_{\lambda\geqslant 0}\biggl(1-\frac{\lambda t_m}{A(\varepsilon)}\Bigl(1-\frac{\varepsilon}{\|f_{m-1}\|}\Bigr)+2\rho\Bigl(\frac{\lambda}{\|f_{m-1}\|}\Bigr)\biggr), \qquad m=1,2,\dots\,.
Proof. Denote by \nu the complex conjugate of \operatorname{sign} F_{f_{m-1}}(\varphi_m), where \operatorname{sign} z:=z/|z| for z\ne 0. Then \nu F_{f_{m-1}}(\varphi_m)=|F_{f_{m-1}}(\varphi_m)|. By the definition of f_m
Remark 3.1. It follows from the definition of the \operatorname{WGAFR} that the sequence \{\|f_m\|\} is non-increasing.
We now prove a convergence theorem for an arbitrary uniformly smooth Banach space. The modulus of smoothness \rho(u) of a uniformly smooth Banach space is an even convex function such that \rho(0)=0 and \lim_{u\to0}\rho(u)/u=0. The function s(u):=\rho(u)/u, s(0):=0, associated with \rho(u) is a continuous increasing function on [0,\infty). Therefore, the inverse function s^{-1}(\,\cdot\,) is well defined. The following theorem is known in the case of real Banach spaces (see, for instance, [22], p. 352, Theorem 6.22).
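For instance, if \rho(u)=\gamma u^{q} with 1<q\leqslant 2 (the model case of the rate results below), then
s(u)=\gamma u^{q-1} \quad\text{and}\quad s^{-1}(v)=(v/\gamma)^{1/(q-1)},
so s^{-1} is an explicit power function.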
Theorem 3.1. Let X be a uniformly smooth complex Banach space with modulus of smoothness \rho(u). Assume that a sequence \tau:=\{t_k\}_{k=1}^\infty satisfies the following condition: for each \theta >0 we have
\sum_{m=1}^{\infty}t_m s^{-1}(\theta t_m)=\infty.
Then the WGAFR with weakness sequence \tau converges, that is, \lim_{m\to\infty}\|f_m\|=0, for each f\in X and every dictionary {\mathcal D}.
Proof. This proof only uses Lemma 3.1, which provides a relation between the residuals \|f_m\| and \|f_{m-1}\|. It repeats the corresponding proof in the real case. For completeness we present this simple proof here. By Remark 3.1, \{\|f_m\|\} is a non-increasing sequence. Therefore, the limit \lim_{m\to\infty}\|f_m\|=:\alpha\geqslant 0 exists. Assume to the contrary that \alpha>0.
We obtain a contradiction, which proves the theorem. \Box
Remark 3.2. The reader can find some necessary conditions on the weakness sequence \tau for the convergence of the WCGA, for instance, in [22], p. 346, Proposition 6.13.
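We also note a simple sufficient condition in the model case: if \rho(u)\leqslant\gamma u^{q}, then s(u)\leqslant\gamma u^{q-1}, hence s^{-1}(v)\geqslant(v/\gamma)^{1/(q-1)} and
t_m s^{-1}(\theta t_m)\geqslant \gamma^{-1/(q-1)}\theta^{1/(q-1)}t_m^{p}, \qquad p:=\frac{q}{q-1},
so that the condition of Theorem 3.1 is satisfied whenever \sum_{m=1}^{\infty}t_m^{p}=\infty.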
We now proceed to results on the rate of convergence. The following theorem is known in the case of real Banach spaces (see, for instance, [22], p. 353, Theorem 6.23).
Theorem 3.2. Let X be a uniformly smooth Banach space with modulus of smoothness \rho(u)\leqslant \gamma u^q, 1<q\leqslant 2. Take a number \varepsilon\geqslant 0 and two elements f and f^\varepsilon of X such that
\|f-f^\varepsilon\|\leqslant\varepsilon, \qquad \frac{f^\varepsilon}{A(\varepsilon)}\in A_1({\mathcal D})
with some number A(\varepsilon)>0. Then, for the WGAFR we have
\|f_m\|\leqslant\max\biggl\{2\varepsilon,\ C(q,\gamma)\bigl(A(\varepsilon)+\varepsilon\bigr)\Bigl(1+\sum_{k=1}^{m}t_k^{p}\Bigr)^{-1/p}\biggr\}, \qquad p:=\frac{q}{q-1}.
Proof. This proof only uses Lemma 3.1, which provides a relation between the residuals \|f_m\| and \|f_{m-1}\|. It repeats the corresponding proof in the real case. For completeness we present this proof here. It is clear that it suffices to consider the case A(\varepsilon)\geqslant\varepsilon. Otherwise, \|f_m\|\leqslant\|f\|\leqslant\|f^\varepsilon\|+ \varepsilon\leqslant2\varepsilon. Also assume that \|f_m\|>2\varepsilon (otherwise Theorem 3.2 holds trivially). Then, by Remark 3.1 we have \|f_k\|>2\varepsilon for all k=0,1,\dots,m. By Lemma 3.1 we obtain
Raising both sides of this inequality to power p and taking into account the inequality x^r\leqslant x for r\geqslant 1, 0\leqslant x\leqslant 1, we obtain
We will need the following simple lemma, various versions of which are well known (see, for example, [22], p. 91).
Lemma 3.2. Given a number C_1>0 and a sequence \{a_k\}_{k=1}^\infty, a_k >0, k=1,2,\dots, assume that \{x_m\}_{m=0}^\infty is a sequence of non-negative numbers satisfying the inequalities
x_0\leqslant C_1, \qquad x_m\leqslant x_{m-1}(1-a_m x_{m-1}), \qquad m=1,2,\dots\,.
Then
x_m\leqslant\Bigl(\frac{1}{C_1}+\sum_{k=1}^{m}a_k\Bigr)^{-1}, \qquad m=1,2,\dots\,.
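In this form the bound follows in one line: by (1-t)^{-1}\geqslant 1+t for t\in[0,1), as long as x_m>0,
\frac{1}{x_m}\geqslant\frac{1}{x_{m-1}(1-a_m x_{m-1})}\geqslant\frac{1}{x_{m-1}}+a_m,
and summing these inequalities gives x_m^{-1}\geqslant x_0^{-1}+\sum_{k=1}^{m}a_k\geqslant C_1^{-1}+\sum_{k=1}^{m}a_k (if x_m=0 the bound is trivial).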
We use Lemma 3.2 for x_k:=\|f_k\|^p. Using the estimate \|f\| \leqslant A(\varepsilon)+\varepsilon, we set C_1:= A(\varepsilon)+\varepsilon. We specify a_k:=t^p_k/(A_qA(\varepsilon)^p). Then (3.5) guarantees that we can apply Lemma 3.2. Note that A_q>1. Then Lemma 3.2 gives
3.2. GAWR

In this subsection we study the Greedy Algorithm with Weakness parameter t and Relaxation {\mathbf r} (\operatorname{GAWR}(t,{\mathbf r})), where {\mathbf r}:=\{r_j\}_{j=1}^\infty, r_j\in [0,1), j=1,2,\dots\,. In addition to the acronym \operatorname{GAWR}(t,{\mathbf r}), we will use the abbreviated acronym \operatorname{GAWR} for the name of this algorithm. We give a general definition of the algorithm in the case of a weakness sequence \tau.
The Greedy Algorithm with Weakness and Relaxation \operatorname{GAWR}(\tau,\mathbf r). Let \tau:=\{t_m\}_{m=1}^\infty, t_m\in[0,1], be a weakness sequence. We define f_0:=f and G_0:=0. Then for each m\geqslant 1 we give the following inductive definition.
(1) \varphi_m \in {\mathcal D} is any element satisfying
|F_{f_{m-1}}(\varphi_m)| \geqslant t_m\|F_{f_{m-1}}\|_{\mathcal D}.
(2) Find \lambda_m such that
\bigl\|f-((1-r_m)G_{m-1}+\lambda_m\varphi_m)\bigr\| = \inf_{\lambda}\bigl\|f-((1-r_m)G_{m-1}+\lambda\varphi_m)\bigr\|
and define G_m:=(1-r_m)G_{m-1}+\lambda_m\varphi_m.
(3) Set f_m:=f-G_m.
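As with the WGAFR above, a minimal sketch (ours; the name gawr and the parameters are illustrative) in the Hilbert-space model X=\mathbb C^n. The only change is that the weight (1-r_m) of G_{m-1} is now prescribed, so only the coefficient \lambda_m is optimized, and in a Hilbert space it has a closed form.

```python
import numpy as np

def gawr(f, dictionary, r, t=1.0):
    """Sketch of the GAWR(t, r) in C^n with the Euclidean norm.

    r: sequence of relaxation parameters r_m in [0, 1); its length
    determines the number of iterations in this sketch.
    """
    G = np.zeros_like(f)
    fm = f.copy()
    for rm in r:
        if np.linalg.norm(fm) < 1e-14:
            break
        # (1) weak greedy step, as in the WGAFR
        scores = [abs(np.vdot(g, fm)) for g in dictionary]
        best = max(scores)
        phi = dictionary[next(i for i, s in enumerate(scores) if s >= t * best)]
        # (2) fixed relaxation: minimize ||f - (1 - r_m) G - lambda * phi||
        # over scalar lambda; in a Hilbert space the minimizer is the
        # projection coefficient of the shrunk residual onto phi
        y = f - (1.0 - rm) * G
        lam = np.vdot(phi, y) / np.vdot(phi, phi)
        G = (1.0 - rm) * G + lam * phi    # G_m
        fm = f - G                        # (3) residual f_m
    return G, fm
```

For instance, passing r = [2/(k+2) for k in range(1, 101)] reproduces the relaxation schedule of Theorem 3.4 below.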
We now prove results on convergence and rate of convergence for the \operatorname{GAWR}(t,{\mathbf r}), that is, for the \operatorname{GAWR}(\tau,{\mathbf r}) with \tau =\{t_j\}_{j=1}^\infty, t_j=t \in (0,1], j=1,2,\dots . We begin with an analogue of Lemma 3.1. In the case of real Banach spaces Lemma 3.3 is known (see, for instance, [22], p. 355, Lemma 6.24).
Lemma 3.3. Let X be a uniformly smooth complex Banach space with modulus of smoothness \rho(u). Take a number \varepsilon\geqslant 0 and two elements f and f^\varepsilon of X such that
\|f-f^\varepsilon\|\leqslant\varepsilon, \qquad \frac{f^\varepsilon}{A(\varepsilon)}\in A_1({\mathcal D})
with some number A(\varepsilon)>0.
Proof. As in the above proof of Lemma 3.1, we denote by \nu the complex conjugate of \operatorname{sign} F_{f_{m-1}}(\varphi_m), where \operatorname{sign}z:=z/|z| for z\ne 0. Then \nu F_{f_{m-1}}(\varphi_m)=|F_{f_{m-1}}(\varphi_m)|. The definition of f_m implies that
We begin with a convergence result. In the real case it is known that the real-case version of Lemma 3.3 implies the corresponding convergence result, Theorem 3.3 below (see, for instance, [22], pp. 355–357). The same proof derives Theorem 3.3 from Lemma 3.3 in the complex case, so we do not present it here.
Theorem 3.3. Let the sequence {\mathbf r}:=\{r_k\}_{k=1}^\infty, r_k\in[0,1), satisfy the conditions
\lim_{k\to\infty}r_k=0, \qquad \sum_{k=1}^{\infty}r_k=\infty.
Then the \operatorname{GAWR}(t,{\mathbf r}) converges in any uniformly smooth Banach space for each f\in X and for all dictionaries {\mathcal D}.
The situation with results on the rate of convergence is similar: the real-case proof (see, for instance, [22], pp. 357–358) derives Theorem 3.4 from Lemma 3.3 and carries over to the complex case, so we do not present it here.
Theorem 3.4. Let X be a uniformly smooth Banach space with modulus of smoothness \rho(u)\leqslant \gamma u^q, 1<q\leqslant 2. Let {\mathbf r}:=\{2/(k+2)\}_{k=1}^\infty. Consider the \operatorname{GAWR}(t,{\mathbf r}). Then for a pair of functions f, f^\varepsilon satisfying
\|f-f^\varepsilon\|\leqslant\varepsilon, \qquad \frac{f^\varepsilon}{A(\varepsilon)}\in A_1({\mathcal D}), \qquad A(\varepsilon)>0,
we have
\|f_m\|\leqslant\varepsilon+C(q,\gamma)\bigl(A(\varepsilon)+\varepsilon\bigr)t^{-1}m^{1/q-1}, \qquad m=1,2,\dots\,.
3.3. IA(\varepsilon)

We proceed to one more greedy-type algorithm (see [21]). Let \varepsilon=\{\varepsilon_n\}_{n=1}^\infty, \varepsilon_n> 0, n=1,2,\dots\,. The following greedy algorithm was introduced and studied in [21] (see also [22], pp. 361–363) in the case of real Banach spaces.
The Incremental Algorithm with schedule \varepsilon (\operatorname{IA}(\varepsilon)). Let f\in\operatorname{conv}({\mathcal D}). Denote f_0^{i,\varepsilon}:=f and G_0^{i,\varepsilon}:=0. Then for each m\geqslant 1 we have the following inductive definition.
(1) \varphi_m^{i,\varepsilon} \in {\mathcal D} is any element satisfying
F_{f_{m-1}^{i,\varepsilon}}(\varphi_m^{i,\varepsilon}-f)\geqslant -\varepsilon_m.
(2) Define G_m^{i,\varepsilon}:=\Bigl(1-\frac{1}{m}\Bigr)G_{m-1}^{i,\varepsilon}+\frac{1}{m}\varphi_m^{i,\varepsilon}.
(3) Set f_m^{i,\varepsilon}:=f-G_m^{i,\varepsilon}.
We now modify the definition of \operatorname{IA}(\varepsilon) to make it suitable for complex Banach spaces.
Incremental Algorithm (complex) with schedule \varepsilon (\operatorname{IAc}(\varepsilon)). Let f\in A_1({\mathcal D}). Denote f_0^{{\rm c},\varepsilon}:= f and G_0^{{\rm c},\varepsilon}:=0. Then for each m\geqslant 1 we have the following inductive definition.
(1) \varphi_m^{{\rm c},\varepsilon} \in {\mathcal D}^\circ is any element satisfying
\operatorname{Re}\bigl(F_{f_{m-1}^{{\rm c},\varepsilon}}(\varphi_m^{{\rm c},\varepsilon}-f)\bigr)\geqslant -\varepsilon_m.
(2) Define G_m^{{\rm c},\varepsilon}:=\Bigl(1-\frac{1}{m}\Bigr)G_{m-1}^{{\rm c},\varepsilon}+\frac{1}{m}\varphi_m^{{\rm c},\varepsilon}.
(3) Set f_m^{{\rm c},\varepsilon}:=f-G_m^{{\rm c},\varepsilon}.
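A minimal sketch (ours; the name ia_complex and the iteration budget are illustrative) of the \operatorname{IAc}(\varepsilon) in the Hilbert-space model X=\mathbb C^n, for f\in A_1({\mathcal D}). Over \varphi=e^{i\theta}g the quantity \operatorname{Re}F_{f_{m-1}}(\varphi) is maximized by the phase e^{i\theta}=\overline{\operatorname{sign}F_{f_{m-1}}(g)}, which is exactly the factor \nu_m introduced below; the exact maximizer then satisfies the \varepsilon_m-condition of step (1) for any schedule, so the schedule does not appear in the sketch.

```python
import numpy as np

def ia_complex(f, dictionary, iters=100):
    """Sketch of the IAc(eps) in C^n with the Euclidean norm.

    f is assumed to lie in A_1(D).  We select the dictionary element
    maximizing |<g, f_{m-1}>| and rotate it by the conjugate phase
    (the factor nu_m from the text), which makes F_{f_{m-1}}(phi_m)
    real and as large as possible; the eps_m slack is then automatic.
    """
    G = np.zeros_like(f)
    fm = f.copy()
    for m in range(1, iters + 1):
        inner = np.array([np.vdot(fm, g) for g in dictionary])  # <g, f_{m-1}>
        k = int(np.argmax(np.abs(inner)))
        nu = np.conj(inner[k]) / abs(inner[k]) if abs(inner[k]) > 0 else 1.0
        phi = nu * dictionary[k]                  # phi_m in the D-circle
        G = (1.0 - 1.0 / m) * G + phi / m         # (2) incremental update
        fm = f - G                                # (3) residual f_m
    return G, fm
```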
Denote by \nu_m the complex conjugate of \operatorname{sign} F_{f_{m-1}^{{\rm c},\varepsilon}}(\varphi_m^{{\rm c},\varepsilon}), where \operatorname{sign} z:=z/|z| for z\ne 0.
Proof. We use the abbreviated notation f_m:=f_m^{{\rm c},\varepsilon}, \varphi_m:=\nu_m\varphi_m^{{\rm c},\varepsilon}, and G_m:=G_m^{{\rm c},\varepsilon}. Writing
Using the definition of \varepsilon_m and the assumption \rho(u) \leqslant \gamma u^q, we make the following observation. There exists a constant C(K_1) such that if
Taking (3.16) into account we apply Lemma 3.4 to the sequence a_n=\|f_n\|, n=1,2,\dots, for \alpha=1/p, \beta =3/4 and complete the proof of Theorem 3.5.
We now consider one more modification of \operatorname{IA}(\varepsilon), which is suitable for complex Banach spaces and provides a convex combination of dictionary elements. In the case of real Banach spaces it coincides with the \operatorname{IA}(\varepsilon).
The Incremental Algorithm (complex and convex) with schedule \varepsilon (\operatorname{IAcc}(\varepsilon)). Let f\in \operatorname{conv}({\mathcal D}). Set f_0^{{\rm cc},\varepsilon}:=f and G_0^{{\rm cc},\varepsilon}:=0. Then, for each m\geqslant 1 we have the following inductive definition.
(1) \varphi_m^{{\rm cc},\varepsilon} \in {\mathcal D} is any element satisfying
\operatorname{Re}\bigl(F_{f_{m-1}^{{\rm cc},\varepsilon}}(\varphi_m^{{\rm cc},\varepsilon}-f)\bigr)\geqslant -\varepsilon_m.
(2) Define G_m^{{\rm cc},\varepsilon}:=\Bigl(1-\frac{1}{m}\Bigr)G_{m-1}^{{\rm cc},\varepsilon}+\frac{1}{m}\varphi_m^{{\rm cc},\varepsilon}.
(3) Set f_m^{{\rm cc},\varepsilon}:=f-G_m^{{\rm cc},\varepsilon}.
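For comparison, a sketch (ours; under the selection rule displayed in step (1), with the same Hilbert-space model and illustrative names) of the convex variant: the selection runs over {\mathcal D} itself, without phase rotation, so G_m stays a convex combination of dictionary elements.

```python
import numpy as np

def ia_complex_convex(f, dictionary, iters=100):
    """Sketch of the IAcc(eps) in C^n: phases are not allowed, so we
    maximize Re <g, f_{m-1}> over g in D and keep convex weights."""
    G = np.zeros_like(f)
    fm = f.copy()
    for m in range(1, iters + 1):
        k = int(np.argmax([np.vdot(fm, g).real for g in dictionary]))
        G = (1.0 - 1.0 / m) * G + dictionary[k] / m   # convex update
        fm = f - G
    return G, fm
```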
Bibliography

1. A. R. Barron, “Universal approximation bounds for superpositions of a sigmoidal function”, IEEE Trans. Inform. Theory, 39:3 (1993), 930–945
2. A. R. Barron, A. Cohen, W. Dahmen, and R. DeVore, “Approximation and learning by greedy algorithms”, Ann. Statist., 36:1 (2008), 64–94
3. K. L. Clarkson, “Coresets, sparse greedy approximation, and the Frank–Wolfe algorithm”, ACM Trans. Algorithms, 6:4 (2010), 63, 30 pp.
4. G. Davis, S. Mallat, and M. Avellaneda, “Adaptive greedy approximations”, Constr. Approx., 13:1 (1997), 57–98
5. A. V. Dereventsov and V. N. Temlyakov, “A unified way of analyzing some greedy algorithms”, J. Funct. Anal., 277:12 (2019), 108286, 30 pp.
6. A. Dereventsov and V. Temlyakov, “Biorthogonal greedy algorithms in convex optimization”, Appl. Comput. Harmon. Anal., 60 (2022), 489–511
7. R. A. DeVore and V. N. Temlyakov, “Some remarks on greedy algorithms”, Adv. Comput. Math., 5:2-3 (1996), 173–187
8. R. A. DeVore and V. N. Temlyakov, “Convex optimization on Banach spaces”, Found. Comput. Math., 16:2 (2016), 369–394
9. S. Dilworth, G. Garrigós, E. Hernández, D. Kutzarova, and V. Temlyakov, “Lebesgue-type inequalities in greedy approximation”, J. Funct. Anal., 280:5 (2021), 108885, 37 pp.
10. M. J. Donahue, L. Gurvits, C. Darken, and E. Sontag, “Rates of convex approximation in non-Hilbert spaces”, Constr. Approx., 13:2 (1997), 187–220
11. M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright, “Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems”, IEEE J. Sel. Top. Signal Process., 1:4 (2007), 586–597
12. J. H. Friedman and W. Stuetzle, “Projection pursuit regression”, J. Amer. Statist. Assoc., 76:376 (1981), 817–823
13. M. Ganichev and N. J. Kalton, “Convergence of the weak dual greedy algorithm in L_p-spaces”, J. Approx. Theory, 124:1 (2003), 89–95
14. Zheming Gao and G. Petrova, “Rescaled pure greedy algorithm for convex optimization”, Calcolo, 56:2 (2019), 15, 14 pp.
15. P. J. Huber, “Projection pursuit”, Ann. Statist., 13:2 (1985), 435–475
16. M. Jaggi, “Revisiting Frank–Wolfe: projection-free sparse convex optimization”, Proceedings of the 30th international conference on machine learning, Proceedings of Machine Learning Research (PMLR), 28, no. 1, 2013, 427–435, https://proceedings.mlr.press/v28/jaggi13.html
17. L. K. Jones, “On a conjecture of Huber concerning the convergence of projection pursuit regression”, Ann. Statist., 15:2 (1987), 880–882
18. L. K. Jones, “A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training”, Ann. Statist., 20:1 (1992), 608–613
19. S. Shalev-Shwartz, N. Srebro, and Tong Zhang, “Trading accuracy for sparsity in optimization problems with sparsity constraints”, SIAM J. Optim., 20:6 (2010), 2807–2832
20. V. N. Temlyakov, “Greedy algorithms in Banach spaces”, Adv. Comput. Math., 14:3 (2001), 277–292
21. V. N. Temlyakov, “Greedy-type approximation in Banach spaces and applications”, Constr. Approx., 21:2 (2005), 257–292
22. V. Temlyakov, Greedy approximation, Cambridge Monogr. Appl. Comput. Math., 20, Cambridge Univ. Press, Cambridge, 2011, xiv+418 pp.
23. V. Temlyakov, Multivariate approximation, Cambridge Monogr. Appl. Comput. Math., 32, Cambridge Univ. Press, Cambridge, 2018, xvi+534 pp.
24. A. Tewari, P. K. Ravikumar, and I. S. Dhillon, “Greedy algorithms for structurally constrained high dimensional problems”, Adv. Neural Inf. Process. Syst., 24, NIPS 2011, MIT Press, Cambridge, MA, 2011, 882–890
25. Tong Zhang, “Sequential greedy approximation for certain convex optimization problems”, IEEE Trans. Inform. Theory, 49:3 (2003), 682–691
Citation:
A. V. Gasnikov, V. N. Temlyakov, “On greedy approximation in complex Banach spaces”, Russian Math. Surveys, 79:6 (2024), 975–990