# Primes represented by positive definite binary quadratic forms

## Abstract

Let $f$ be a primitive positive definite integral binary quadratic form of discriminant $-D$ and let $\pi_f(x)$ be the number of primes up to $x$ which are represented by $f$. We prove several types of upper bounds for $\pi_f(x)$ within a constant factor of its asymptotic size: unconditional, conditional on the Generalized Riemann Hypothesis (GRH), and for almost all discriminants. The key feature of these estimates is that they hold whenever $x$ exceeds a small power of $D$ and, in some cases, this range of $x$ is essentially best possible. In particular, if $f$ is reduced then this optimal range of $x$ is achieved for almost all discriminants or by assuming GRH. We also exhibit an upper bound for the number of primes represented by $f$ in a short interval and a lower bound for the number of small integers represented by $f$ which have few prime factors.

## 1. Introduction

### 1.1. Historical review

The distribution of primes represented by positive definite integral binary quadratic forms is a classical topic within number theory and has been intensely studied over centuries by many renowned mathematicians, including Fermat, Euler, Lagrange, Legendre, Gauss, Dirichlet and Weber. A beautiful exposition on the subject and its history can be found in [3].

We shall refer to positive definite integral binary quadratic forms as simply 'forms'. Let $f(u,v) = au^2 + buv + cv^2$ be a form of discriminant $-D = b^2 - 4ac$. An integer $n$ is said to be represented by $f$ if there exists $(u,v) \in \mathbb{Z}^2$ such that $n = f(u,v)$. A form is primitive if $a$, $b$, and $c$ are relatively prime. The group $\mathrm{SL}_2(\mathbb{Z})$ naturally acts on the set of primitive forms with discriminant $-D$, and two forms are said to be properly equivalent if they belong to the same $\mathrm{SL}_2(\mathbb{Z})$ orbit. A primitive form is reduced if $|b| \le a \le c$, and $b \ge 0$ if either $|b| = a$ or $a = c$. Every primitive form is properly equivalent to a unique reduced form.
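The reduction statement above can be made concrete. The following is a minimal sketch of the classical reduction procedure (translate to bring $b$ into $(-a, a]$, then swap the outer coefficients when $a > c$); it is a standard textbook algorithm rather than anything taken from this paper, and the names `reduce_form` and `is_reduced` are illustrative.

```python
def is_reduced(a, b, c):
    """Check |b| <= a <= c, with b >= 0 whenever |b| == a or a == c."""
    if not (abs(b) <= a <= c):
        return False
    if abs(b) == a or a == c:
        return b >= 0
    return True


def reduce_form(a, b, c):
    """Return the reduced form properly equivalent to a*u^2 + b*u*v + c*v^2.

    Assumes the input is positive definite (a > 0 and b*b - 4*a*c < 0).
    Each step is a substitution from SL_2(Z), so the discriminant is preserved.
    """
    while True:
        # Substitution (u, v) -> (u + r*v, v) moves b into the interval (-a, a].
        r = (a - b) // (2 * a)
        b, c = b + 2 * a * r, a * r * r + b * r + c
        if a > c:
            # Substitution (u, v) -> (-v, u) swaps a and c and negates b.
            a, b, c = c, -b, a
            continue
        if a == c and b < 0:
            # When a == c the same substitution (u, v) -> (-v, u) flips the sign of b.
            b = -b
        return a, b, c
```

For example, `reduce_form(3, 2, 2)` returns `(2, 2, 3)`, one of the two reduced forms of discriminant $-20$.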
Most amazingly, the set of primitive forms with discriminant $-D$ modulo proper equivalence can be given a composition law that makes it a finite abelian group $\mathrm{Cl}(-D)$. This is the class group of $-D$ and its size $h(-D)$ is the class number. We refer to each equivalence class of the class group as a form class.

Now, assume $f$ is primitive. The central object of study is the number of primes represented by $f$ up to $x$, denoted
$$\pi_f(x) = |\{p \le x : p = f(u,v) \text{ for some } (u,v) \in \mathbb{Z}^2\}|$$
for $x \ge 2$. From deep connections established by class field theory, the Chebotarev density theorem [16, 21] implies that primes are equidistributed amongst all form classes. Namely,
$$\pi_f(x) \sim \frac{\delta_f\, x}{h(-D)\log x} \tag{1.1}$$
as $x \to \infty$, where
$$\delta_f = \begin{cases} \tfrac{1}{2} & \text{if } f(u,v) \text{ is properly equivalent to its opposite } f(u,-v),\\ 1 & \text{otherwise.} \end{cases} \tag{1.2}$$
Unfortunately, the asymptotic (1.1) derived from [16] requires $x$ to be exponentially larger than $D$ and, if a Siegel zero exists, then the situation worsens. (A Siegel zero is a putative real zero of a real Hecke $L$-function that is extremely close to the edge of the critical strip. It would be a major breakthrough to eliminate the possibility of its existence.) In either case, this is unsuitable for many applications.

Assuming the Generalized Riemann Hypothesis (GRH), one can do much better. It follows from the same work of Lagarias and Odlyzko [16] that, assuming GRH,
$$\pi_f(x) = \frac{\delta_f\, \mathrm{Li}(x)}{h(-D)} + O\big(x^{1/2}\log(Dx)\big) \tag{1.3}$$
for $x \ge 2$. Here $\mathrm{Li}(x) = \int_2^x \frac{1}{\log t}\,dt \sim \frac{x}{\log x}$. The main term in (1.3) dominates when $x \ge D^{1+\varepsilon}$ for any $\varepsilon > 0$. This GRH range for $x$ is essentially optimal for certain but not necessarily all forms. Among other results, we will partially address this defect in this paper. For further discussion of range optimality and improvements (both conditional and unconditional), see Sections 1.2 and 1.3.

Now, to remedy our lack of knowledge about the asymptotics of $\pi_f(x)$ in the unconditional case, one can settle for upper and lower bounds of the shape
$$\frac{x}{h(-D)\log x} \ll \pi_f(x) \ll \frac{x}{h(-D)\log x} \tag{1.4}$$
in the hopes of improving the valid range of $x$.
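To get a feel for (1.1) and (1.4), one can count represented primes by brute force for a small discriminant. The sketch below is only illustrative: the helper names `reduced_forms`, `primes_represented` and `predicted_count` are hypothetical, the class number is computed by listing reduced primitive forms, and $\delta_f$ must be supplied by hand.

```python
from math import gcd, isqrt, log


def reduced_forms(D):
    """List reduced primitive forms (a, b, c) with b*b - 4*a*c == -D."""
    forms = []
    a = 1
    while 3 * a * a <= D:                  # reduced forms satisfy 3*a^2 <= D
        for b in range(-a + 1, a + 1):     # b ranges over (-a, a]
            if (b * b + D) % (4 * a):
                continue
            c = (b * b + D) // (4 * a)
            if c < a or gcd(gcd(a, b), c) != 1:
                continue
            if a == c and b < 0:           # sign convention when a == c
                continue
            forms.append((a, b, c))
        a += 1
    return forms


def primes_represented(form, x):
    """Sorted list of primes p <= x of the shape p = f(u, v), for x >= 2."""
    a, b, c = form
    D = 4 * a * c - b * b
    is_prime = [False, False] + [True] * (x - 1)
    for p in range(2, isqrt(x) + 1):
        if is_prime[p]:
            for q in range(p * p, x + 1, p):
                is_prime[q] = False
    # f(u, v) <= x forces D*u^2 <= 4*c*x and D*v^2 <= 4*a*x.
    U, V = isqrt(4 * c * x // D), isqrt(4 * a * x // D)
    found = set()
    for u in range(-U, U + 1):
        for v in range(-V, V + 1):
            n = a * u * u + b * u * v + c * v * v
            if n <= x and is_prime[n]:
                found.add(n)
    return sorted(found)


def predicted_count(x, h, delta):
    """Main term delta * x / (h * log x) appearing in (1.1)."""
    return delta * x / (h * log(x))
```

For $D = 20$ the two reduced forms are $u^2 + 5v^2$ and $2u^2 + 2uv + 3v^2$, so $h(-20) = 2$. Both classes have order at most two, so each form is equivalent to its opposite and $\delta_f = \tfrac12$; comparing `len(primes_represented((1, 0, 5), x))` with `predicted_count(x, 2, 0.5)` for growing `x` gives a rough numerical impression of (1.1).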
For uniform lower bounds, we refer the reader to [8, 14, 24] and, most recently, [23] wherein it was shown that there exists a prime $p$ represented by $f$ of size at most $O(D^{700})$. The focus of this paper, however, is on upper bounds for $\pi_f(x)$. Assuming GRH, there has been no improvement beyond (1.3) itself. Unconditionally, a general result of Lagarias–Montgomery–Odlyzko [15, Theorem 1.5] on the Chebotarev density theorem yields some progress, but the range of $x$ is still worse than exponential in $D$. Recently, a theorem of Thorner and Zaman [22] improves upon the aforementioned Chebotarev result and consequently implies that
$$\pi_f(x) < \frac{2\delta_f\, \mathrm{Li}(x)}{h(-D)} \quad \text{for } x \ge D^{700} \tag{1.5}$$
and $D$ sufficiently large. Short of excluding a Siegel zero, the constant 2 is best possible and, moreover, the range of $x$ is a polynomial in $D$ like (1.3). On the other hand, the quality of exponent 700 in (1.5) leaves much to be desired when compared to the GRH exponent of $1+\varepsilon$. Both [15, 22] carefully study the zeros of Hecke $L$-functions to prove their respective results, whereas a more recent paper of Debaene [4] uses a lattice point counting argument and Selberg's sieve to subsequently establish another such Chebotarev-type theorem. His result implies a weaker variant of inequality (1.5), but for a greatly improved range of $x \ge D^{9/2+\varepsilon}$. Broadly speaking, we will specialize Debaene's strategy to positive definite binary quadratic forms.

An alternate formulation that expands the valid range of upper (and lower) bounds for $\pi_f(x)$ can be obtained by averaging over discriminants or forms, in analogy with the famous theorems of Bombieri–Vinogradov and Barban–Davenport–Halberstam on primes in arithmetic progressions. As part of his Ph.D. thesis, Ditchen [5, 6] achieved such elegant statements which we present in rough terms for simplicity's sake. Namely, for 100% of fundamental discriminants $-D \not\equiv 0 \pmod{8}$ and all of their forms $f$, he proved
$$\pi_f(x) = \frac{\delta_f\, x}{h(-D)\log x} + O_\varepsilon\Big(\frac{x}{h(-D)(\log x)^2}\Big) \tag{1.6}$$
provided $x \ge D^{20/3+\varepsilon}$.
He also showed for 100% of fundamental discriminants $-D \not\equiv 0 \pmod{8}$ and 100% of their forms, Equation (1.6) holds for $x \ge D^{3+\varepsilon}$.

Finally, before we discuss an optimal range for upper bounds of $\pi_f(x)$ and the details of our results, we give a somewhat imprecise flavor of what we have shown. One should compare with (1.3), (1.5), (1.6), and their associated papers [4, 5, 16, 22].

Theorem 1.1. Let $f(u,v) = au^2 + buv + cv^2$ be a reduced positive definite integral binary quadratic form of discriminant $-D$. If GRH holds and $\varepsilon > 0$ then
$$\pi_f(x) \ll_\varepsilon \frac{x}{h(-D)\log x} \quad \text{for } x \ge (D/a)^{1+\varepsilon}.$$
This estimate holds unconditionally for 100% of discriminants $-D$. Unconditionally and uniformly over all discriminants $-D$, the same upper bound for $\pi_f(x)$ holds for $x \ge (D^2/a)^{1+\varepsilon}$.

### 1.2. Optimal range

What is the minimal size of $x$ relative to $D$ for which one can reasonably expect (1.4) to hold for all forms $f$? We have seen GRH implies $x \ge D^{1+\varepsilon}$ is valid and, in one sense, this range is best possible. For example, the form $f(u,v) = u^2 + \frac{D}{4}v^2$ with $D \equiv 0 \pmod{4}$ does not represent any prime $< D/4$. The reason is simple: the coefficients are simply too large. On the other hand, what about forms $f(u,v) = au^2 + buv + cv^2$ for which all of the coefficients are small compared to $D$, say $a, b, c \ll \sqrt{D}$? The GRH range $x \ge D^{1+\varepsilon}$ is insensitive to the size of the form's coefficients.

Since primitive forms are properly equivalent to a reduced form and all forms in the same form class represent the same primes, it is enough to consider a reduced form $f$ so the coefficients necessarily satisfy $|b| \le a \le c$. Consider the sum
$$\sum_{n \le x} |\{(u,v) \in \mathbb{Z}^2 : n = f(u,v)\}|.$$
If $x < c$ then, as $f$ is reduced, the only terms $n$ contributing to the above sum are of the form $n = f(u,0) = au^2$. The sum is, therefore, equal to $2\sqrt{x/a} + O(1)$ and at most one prime value $n$ contributes to its size. On the other hand, if $x \ge c$ then the above sum is of size $O(x/\sqrt{D})$. Thus, one might reasonably suspect that primes will appear with natural frequency for $x \ge c^{1+\varepsilon}$ and any $\varepsilon > 0$.
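The dichotomy between $x < c$ and $x \ge c$ is easy to observe numerically. The following sketch counts lattice points $(u,v)$ with $f(u,v) \le x$ by brute force; the function name `representation_count` and the sample form $(2, 1, 1000)$ are illustrative choices, and the comparison value $2\pi x/\sqrt{D}$ is simply the area of the ellipse $f(u,v) \le x$.

```python
from math import isqrt, pi, sqrt


def representation_count(form, x):
    """Number of (u, v) in Z^2 with f(u, v) <= x (the origin is included)."""
    a, b, c = form
    D = 4 * a * c - b * b
    # f(u, v) <= x forces D*u^2 <= 4*c*x and D*v^2 <= 4*a*x.
    U, V = isqrt(4 * c * x // D), isqrt(4 * a * x // D)
    return sum(
        1
        for u in range(-U, U + 1)
        for v in range(-V, V + 1)
        if a * u * u + b * u * v + c * v * v <= x
    )


def ellipse_area(form, x):
    """Area 2*pi*x/sqrt(D) of the region f(u, v) <= x."""
    a, b, c = form
    return 2 * pi * x / sqrt(4 * a * c - b * b)
```

With the reduced form $(a,b,c) = (2, 1, 1000)$ and $x = 999 < c$, only the points $(u, 0)$ with $2u^2 \le 999$ are counted, so the total is $2\lfloor\sqrt{x/a}\rfloor + 1$. Once $x$ is well beyond $c$, the count tracks `ellipse_area(form, x)` instead.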
As $c \asymp D/a$, it is conceivable for reduced forms $f$ to satisfy (1.3) in (what we will refer to as) the optimal range
$$x \ge (D/a)^{1+\varepsilon}$$
for any $\varepsilon > 0$. (Of course, one could potentially refine the factor of $\varepsilon$ but that is not our goal.) Can we expect to obtain any kind of unconditional or GRH-conditional bounds for $\pi_f(x)$ in the optimal range? For the sake of comparison, we turn to primes in arithmetic progressions since estimates like (1.5) are relatives of the classical Brun–Titchmarsh inequality. A version due to Montgomery and Vaughan [19] states for $(a,q) = 1$ and $x > q$ that
$$\pi(x; q, a) < \frac{2x}{\varphi(q)\log(x/q)}. \tag{1.7}$$
Here $\pi(x; q, a)$ represents the number of primes up to $x$ congruent to $a \pmod{q}$. The range $x > q$ is clearly best possible, which inspires the possibility of success for a similar approach to primes represented by binary quadratic forms. Our goal is to give upper bounds for $\pi_f(x)$ as close to the optimal range as possible.

### 1.3. Results

Our first result is a uniform upper bound for $\pi_f(x)$, both unconditional and conditional on GRH. For a discriminant $-D$, let $\chi_{-D}(\cdot) = \left(\frac{-D}{\cdot}\right)$ denote the corresponding Kronecker symbol, which is a quadratic Dirichlet character.

Theorem 1.2. Let $f(u,v) = au^2 + buv + cv^2$ be a reduced positive definite integral binary quadratic form with discriminant $-D$ and let $\varepsilon > 0$ be arbitrary. Set
$$\phi = \phi(\chi_{-D}) := \begin{cases} 0 & \text{assuming } L(s,\chi_{-D}) \text{ satisfies GRH},\\ \tfrac{1}{4} & \text{unconditionally.} \end{cases} \tag{1.8}$$
If $x \ge (D^{1+4\phi}/a)^{1+\varepsilon}$ then
$$\pi_f(x) < \frac{4}{1-\theta} \cdot \frac{\delta_f\, x}{h(-D)\log x}\Big\{1 + O_\varepsilon\Big(\frac{\log\log x}{\log x}\Big)\Big\}, \tag{1.9}$$
where
$$\theta = \theta_\phi = \Big(1 + 2\phi + \frac{\varepsilon}{2}\Big)\frac{\log D}{\log x} - \frac{\log a}{\log x}.$$

Remarks. If $\phi = 0$ then $0 < \theta < 1$. Similarly, if $\phi = \tfrac14$ then $0 < \theta < \tfrac34$. The constant $\phi$ is associated with bounds for $L(s,\chi_{-D})$ in the critical strip. In particular, $\phi = \tfrac12$ corresponds to the usual convexity estimate.

Unconditionally, one should compare this estimate with (1.5) and the related works [4, 22]. Theorem 1.2 achieves the upper bound in (1.4) with the range $x \ge (D^2/a)^{1+\varepsilon}$ which improves over the prior range $x \ge D^{9/2+\varepsilon}$ implied by [4].
In fact, when $a \gg D^{1/2}$, Theorem 1.2's range becomes $x \gg D^{3/2+\varepsilon}$, which is fairly close to the classical GRH range $x \ge D^{1+\varepsilon}$. Of course, inequality (1.9) has a weaker implied constant $\frac{4}{1-\theta}$ instead of 2 as in (1.5).

Assuming GRH, we obtain the desired range $x \ge (D/a)^{1+\varepsilon}$ discussed in Section 1.2. Note we only assume GRH for the quadratic Dirichlet $L$-function $L(s,\chi_{-D})$, whereas (1.3) assumes GRH for the collection of Hecke $L$-functions associated to the corresponding ring class field.

Since every primitive form is properly equivalent to a reduced form, we may ignore the dependence on the coefficient $a$ in Theorem 1.2 to obtain the following simplified result.

Corollary 1.3. Let $f$ be a primitive positive definite integral binary quadratic form with discriminant $-D$ and let $\varepsilon > 0$ be arbitrary. If $x \ge D^{2+\varepsilon}$ then
$$\pi_f(x) < 8 \cdot \frac{\delta_f\, x}{h(-D)\log x}\Big\{1 + O_\varepsilon\Big(\frac{\log\log x}{\log x}\Big)\Big\}.$$

We will actually prove a more general version of Theorem 1.2 that allows us to estimate the number of primes represented by $f$ in a short interval.

Theorem 1.4. Let $f(u,v) = au^2 + buv + cv^2$ be a reduced positive definite integral binary quadratic form with discriminant $-D$ and let $\varepsilon > 0$ be arbitrary. Let $\phi$ be defined by (1.8). If
$$\Big(\frac{D^{1+4\phi}}{a}\Big)^{1/2+\varepsilon} x^{1/2+\varepsilon} \le y \le x$$
then
$$\pi_f(x) - \pi_f(x-y) < \frac{2}{1-\theta'} \cdot \frac{\delta_f\, y}{h(-D)\log y}\Big\{1 + O_\varepsilon\Big(\frac{\log\log y}{\log y}\Big)\Big\}, \tag{1.10}$$
where
$$\theta' = \theta'_\phi = \frac{\log x}{2\log y} + \Big(\frac12 + \phi + \frac{\varepsilon}{4}\Big)\frac{\log D}{\log y} - \frac{\log a}{2\log y}.$$

Remarks. Assuming GRH, (1.3) implies that $\pi_f(x) - \pi_f(x-y) \ll \frac{\delta_f\, y}{h(-D)\log y}$ for $(Dx)^{1/2+\varepsilon} \le y \le x$. Theorem 1.4 yields an unconditional upper bound of comparable strength and, depending on the size of the coefficients of $f$, implies a GRH upper bound for slightly shorter intervals and smaller values of $x$.

If $f$ is only assumed to be primitive, then the same statement holds by setting $a = 1$ in the condition on $y$ and the value of $\theta'$.

Theorem 1.2 follows by setting $y = x$.

For any given discriminant $-D$, the unconditional versions ($\phi = \tfrac14$) of Theorems 1.2 and 1.4 fall slightly shy of their GRH counterparts ($\phi = 0$). Averaging over all discriminants, we show that the GRH quality estimates hold almost always.
For Q≥3 ⁠, define D(Q)≔{discriminants−Dwith3≤D≤Q}. (1.11) Here and throughout, a discriminant −D is that of a positive definite integral binary quadratic form. Thus, −D is a negative integer ≡0 or 1(mod4) ⁠. Theorem 1.5 Let Q≥3and 0<ε<18 ⁠. For all except at most Oε(Q1−ε10)discriminants −D∈D(Q) ⁠, the statements in Theorems1.2 and 1.4hold unconditionally with ϕ=0 ⁠. Remark. When considering upper bounds for πf(x) ⁠, this improves over (1.6) in several aspects. First, the desired range discussed in Section 1.2 is achieved on average. Second, we did not utilize any averaging over forms, only their discriminants. Finally, there are no restrictions on the family of discriminants; they need not be fundamental or satisfy any special congruence condition. The underlying strategy to establish Theorems 1.2, 1.4 and 1.5 rests on a natural two-step plan. First, estimate the congruence sum ∑n≤xℓ∣nrf(n), (1.12) where ℓ is squarefree and rf(n)=∣{(u,v)∈Z2:n=f(u,v)}∣ is the number of representations of the integer n by the form f ⁠. Secondly, apply Selberg’s upper bound sieve. Our application of the sieve is fairly routine, but calculating the congruence sums with sufficient precision poses some difficulties. Inspired by a beautiful paper of Blomer and Granville [1] wherein they carefully study the moments of rf(n) ⁠, we determine the congruence sums via geometry of numbers methods. However, for their purposes, only a simple well-known estimate [1, Lemma 3.1] for the first moment ∑n≤xrf(n) was necessary. We execute a more refined analysis of the first moment and related quantities ∣Bℓ(m)∣ (see (6.4) for a definition) using standard arguments with the sawtooth function. Afterwards, the main technical hurdle is to carefully decompose the congruence sum (1.12) into a relatively small number of disjoint quantities ∣Bℓ(m)∣ ⁠. This allows us to apply our existing estimates for ∣Bℓ(m)∣ (see Lemma 6.2) while simultaneously controlling the compounding error terms. 
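The first step of this plan can be sanity-checked numerically. The sketch below compares a brute-force evaluation of the congruence sum (1.12) with the main term $g(\ell)\cdot 2\pi x/\sqrt{D}$ suggested by Proposition 7.1, reading $V$ there as $\sqrt{4ax/D}$ so that $\frac{\pi\sqrt{D}}{2a}V^2 = 2\pi x/\sqrt{D}$. The function names are hypothetical, $\ell$ is passed as a list of distinct primes, and the Kronecker symbol is evaluated only at primes (via Euler's criterion for odd $p$ and the residue of $-D$ modulo 8 for $p = 2$).

```python
from math import isqrt, pi, sqrt


def kronecker_at_prime(D, p):
    """chi_{-D}(p) for a prime p, where -D is congruent to 0 or 1 mod 4."""
    n = -D
    if p == 2:
        if n % 2 == 0:
            return 0
        return 1 if n % 8 in (1, 7) else -1
    r = pow(n % p, (p - 1) // 2, p)      # Euler's criterion
    return 0 if r == 0 else (1 if r == 1 else -1)


def local_density(D, primes):
    """g(l) for squarefree l = product of `primes`, with g(p) as in (7.2)."""
    g = 1.0
    for p in primes:
        chi = kronecker_at_prime(D, p)
        g *= (1 + chi - chi / p) / p
    return g


def congruence_sum(form, x, ell):
    """Brute-force count of (u, v) with f(u, v) <= x and ell | f(u, v)."""
    a, b, c = form
    D = 4 * a * c - b * b
    U, V = isqrt(4 * c * x // D), isqrt(4 * a * x // D)
    total = 0
    for u in range(-U, U + 1):
        for v in range(-V, V + 1):
            n = a * u * u + b * u * v + c * v * v
            if n <= x and n % ell == 0:
                total += 1
    return total


def main_term(form, x, primes):
    """Predicted main term g(l) * 2*pi*x / sqrt(D)."""
    a, b, c = form
    D = 4 * a * c - b * b
    return local_density(D, primes) * 2 * pi * x / sqrt(D)
```

For $f = u^2 + v^2$ and $\ell = 5$ one has $\chi_{-4}(5) = 1$ and $g(5) = 9/25$, which matches the 9 residue pairs $(u, v)$ modulo 5 with $u^2 + v^2 \equiv 0$. Comparing `congruence_sum((1, 0, 1), x, 5)` against `main_term((1, 0, 1), x, [5])` shows how quickly the main term takes over as `x` grows.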
We achieve this in Proposition 7.1; see Sections 6 and 7 for details on this argument. When finalizing the proofs of Theorems 1.2, 1.4 and 1.5, the various valid ranges for x are determined by relying on character sum estimates like Burgess’s bound, Heath-Brown’s mean value theorem for quadratic characters [10], and Jutila’s zero density estimate [13]. Since we have calculated the congruence sums (1.12) in Proposition 7.1, we thought it may be of independent interest to examine its performance in conjunction with a lower bound sieve. By a direct application of the beta sieve, we show: Theorem 1.6 Let f(u,v)=au2+buv+cv2be a reduced positive definite integral binary quadratic form with discriminant −D ⁠. For every integer k≥10 ⁠, the number of integers represented by fwith at most kprime factors is ≫xD(logx)2forx≥(Da)1+495k−49. Remarks One can formulate this statement in a slightly weaker alternative fashion: for every ε>0 and x≥(D/a)1+ε ⁠, the number of integers represented by f with at most Oε(1) prime factors is ≫xD(logx)2 ⁠. It is unsurprising that f represents many integers with few prime factors as this follows from standard techniques in sieve theory. The form f even represents primes of size O(D700) by [23]. However, the key feature of Theorem 1.6 is that the size of these integers with few prime factors is very small (in a best possible sense per Section 1.2). 2. Notation and conventions For each of the asymptotic inequalities F≪G ⁠, F=O(G) ⁠, or G≫F ⁠, we mean there exists a constant C>0 such that ∣F∣≤CG ⁠. We henceforth adhere to the convention that all implied constants in all asymptotic inequalities are absolute with respect to all parameters and are effectively computable. If an implied constant depends on a parameter, such as ε ⁠, then we use ≪ε and Oε to denote that the implied constant depends at most on ε ⁠. Throughout the paper, A form refers to a positive definite binary integral quadratic form. 
- $-D$ is the discriminant of a positive definite integral binary quadratic form, so $-D$ is any negative integer $\equiv 0$ or $1 \pmod{4}$. It is not necessarily fundamental.
- $f(u,v) = au^2 + buv + cv^2 \in \mathbb{Z}[u,v]$ is a positive definite integral binary quadratic form of discriminant $-D$. It is not necessarily primitive or reduced.
- $\chi_{-D}(\cdot) = \left(\frac{-D}{\cdot}\right)$ is the Kronecker symbol attached to $-D$.
- $\mathcal{D}(Q)$ is the set of discriminants $-D$ with $3 \le D \le Q$.
- $\Delta$ is a (negative) fundamental discriminant of a form.
- $\tau_k$ is the $k$-divisor function and $\tau = \tau_2$ is the divisor function.
- $\varphi$ is the Euler totient function.
- $[s,t]$ is the least common multiple of integers $s$ and $t$.
- $(s,t)$ is the greatest common divisor of integers $s$ and $t$. However, we may abuse notation and sometimes refer to a lattice point $(u,v) \in \mathbb{Z}^2$, but this will be made clear from context (for example, with the set membership symbol).

## 3. Elementary estimates

First, we establish a standard result employing the Euler–Maclaurin summation formula [11, Lemma 4.1], which will later allow us to count lattice points inside an ellipse.

Lemma 3.1. For $W \ge 1$,
$$\sum_{1 \le w \le W} \sqrt{W^2 - w^2} = \frac{\pi W^2}{4} - \frac{W}{2} + O(\sqrt{W}).$$

Proof. Set $G(w) = \sqrt{W^2 - w^2}$. By partial summation, observe that
$$\sum_{1 \le w \le W} G(w) = -\int_0^W t\,G'(t)\,dt + \int_0^W \psi(t)\,G'(t)\,dt - \frac12 G(0), \tag{3.1}$$
where $\psi(t) = t - \lfloor t \rfloor - 1/2$ is the sawtooth function. For the first integral, notice
$$-\int_0^W t\,G'(t)\,dt = \int_0^W \frac{t^2}{\sqrt{W^2 - t^2}}\,dt = \frac{\pi W^2}{4}. \tag{3.2}$$
For the second integral, we use the Fourier expansion (see for example, [11, Equation (4.18)])
$$\psi(x) = -\sum_{1 \le n \le N} (\pi n)^{-1}\sin(2\pi n x) + O\big((1 + \|x\| N)^{-1}\big),$$
where $\|x\|$ is the distance of $x$ to the nearest integer. It follows that
$$\int_0^W \psi(t)\,G'(t)\,dt = 2\sum_{1 \le n \le N}\int_0^W G(t)\cos(2\pi n t)\,dt + O\Big(\int_0^W \frac{|G'(t)|}{1 + \|t\| N}\,dt\Big) \tag{3.3}$$
after integrating by parts. Using a computer algebra package or table of integrals,
$$\int_0^W G(t)\cos(2\pi n t)\,dt = \int_0^W \sqrt{W^2 - t^2}\,\cos(2\pi n t)\,dt = \frac{W \cdot J_1(2\pi n W)}{4n}, \tag{3.4}$$
where $J_\nu(z)$ is the Bessel function of the first kind. Recall that, for $z > 1 + |\nu|^2$,
$$J_\nu(z) = \Big(\frac12 z\Big)^{\nu}\sum_{m=0}^{\infty} \frac{(-1)^m \big(\frac14 z^2\big)^m}{m!\,\Gamma(m+\nu+1)} = \Big(\frac{2}{\pi z}\Big)^{1/2}\Big(\cos\Big(z - \frac12\nu\pi - \frac14\pi\Big) + O\Big(\frac{1 + |\nu|^2}{z}\Big)\Big).$$
When summing (3.4) over 1≤n≤N ⁠, it follows that 2∑1≤n≤N∫0WG(t)cos(2πnt)dt=W4·2∑1≤n≤NJ1(2πnW)n=W2π∑1≤n≤N(cos(2πnW−3π/4)n3/2+O(1n5/2W))≪W (3.5) uniformly over N ⁠. Taking N→∞ ⁠, we conclude from (3.3) and (3.5) that ∫0Wψ(t)G′(t)dt≪W. (3.6) Combining (3.1), (3.2), and the above yields the result.□ Next, in Lemma 3.2, we calculate some weighted average values of the Dirichlet convolution (1∗χ)(n)=∑d∣nχ(d) for any quadratic Dirichlet character χ ⁠. Better estimates and certainly simpler proofs are available via Mellin inversion, but it is convenient for us to explicitly express the error terms using the character sum quantity Sχ(t)≔∑n≤tχ(n). This feature will give us flexibility and streamline the proofs of Theorems 1.4 and 1.5 each of which use different bounds for character sums. One utilizes uniform bounds (conditional and unconditional) whereas the other applies average bounds. Lemma 3.2 Let χ(modD)be a quadratic Dirichlet character. For x≥3 ⁠, ∑n≤x(1∗χ)(n)(1−nx)=x2L(1,χ)+O(E0(x;χ)), (3.7)where E0(x;χ)≔inf1≤y≤x(y2x+∣Sχ(y)∣+x∫y∞∣Sχ(t)∣t2dt)and Sχ(t)=∑n≤tχ(n) ⁠. Moreover, ∑n≤x(1∗χ)(n)n(1−nx)=L(1,χ)(logx+γ−1)+L′(1,χ)+O(E1(x;χ)), (3.8)where E1(x;χ)≔inf1≤y≤x(yx+logx∫y∞∣Sχ(t)∣logtt2dt). Remark Bounds for E0(x;χ) and E1(x;χ) can be found in Sections 4 and 5. Proof The proofs of these facts are standard, but we include the details for the sake of completeness. Equation (3.7) is a more precise version of [20, Section 4.3.1, Exercise 3] which originated from work of Mertens. Before we proceed, recall by partial summation that ∑d≤yχ(d)d=L(1,χ)−∫y∞Sχ(t)t2dt,∑d≤yχ(d)logdd=−L′(1,χ)−∫y∞Sχ(t)(logt−1)t2dt. (3.9) We will use these estimates in what follows. Let A(u)=∑n≤u(1−nu) ⁠. One can verify that A(u)=u2−12∫0u{t}dt=u2−12+O(u−1). (3.10) From the above integral formula for A(u) ⁠, it is straightforward to check that A(u) is continuous and if u>1 is not an integer then A′(u)=[u]([u]+1)2u2≪1. (3.11) In particular, A(u) is increasing and absolutely continuous. 
Now, to calculate the sum in (3.7), we use Dirichlet’s hyperbola method with a parameter 1≤y≤x ⁠. Namely, Σ≔∑n≤x(1∗χ)(n)(1−nx)=∑d≤yχ(d)A(x/d)+∑y<d≤xχ(d)A(x/d)=Σ1+Σ2, (3.12) say. For Σ1 ⁠, we see by (3.10) that Σ1=x2∑d≤yχ(d)d−12∑d≤yχ(d)+O(y2/x). (3.13) For Σ2 ⁠, since A is an absolutely continuous, decreasing, non-negative function and A(1)=0 ⁠, it follows by partial summation that Σ2=∫yxA(x/t)dSχ(t)=A(1)Sχ(x)+∫yxSχ(t)dA(x/t)=−∫yxSχ(t)A′(x/t)xt−2dt. Thus, by (3.11), ∣Σ2∣≪x(∫yx∣Sχ(t)∣t2dt). Combining the above estimate, (3.13), and (3.9) into (3.12) yields (3.7). We prove (3.8) similarly. For u≥1 ⁠, define B(u)≔∑n≤u1n(1−nu)=logu+γ−1+O(u−1). (3.14) One can verify that B(u) is a non-negative, increasing, and absolutely continuous function of u ⁠. Also, if u>1 is not an integer, then B′(u)≪1u. (3.15) For some parameter 1≤y≤x ⁠, Σ′≔∑n≤x(1∗χ)(n)n(1−nx)=∑d≤yχ(d)dB(x/d)+∑y<d≤xχ(d)dB(x/d)=Σ1′+Σ2′, (3.16) say. To calculate Σ1′ ⁠, we apply (3.14) and deduce that Σ1′=(logx+γ−1)∑d≤yχ(d)d−∑d≤yχ(d)logdd+O(y/x). (3.17) For Σ2′ ⁠, set S˜χ(u)≔∑y<d≤uχ(d)d ⁠. Since B is an absolutely continuous, non-negative, increasing function and B(1)=0 ⁠, we similarly conclude that Σ2′=∫yxB(x/t)dS˜χ(t)=−x∫yxS˜χ(t)B′(x/t)t−2dt. From (3.15), it follows that ∣Σ2′∣≪∫yx∣S˜χ(t)∣tdt. Substituting the identity S˜χ(t)=∫yt1udSχ(u)=Sχ(t)t+∫ytSχ(u)u2du into the previous estimate, we have that Σ2′≪∫yx∣Sχ(t)∣t2dt+∫yx1t∫yt∣Sχ(u)∣u2dudt≪∫yx∣Sχ(t)∣t2dt+(∫yx1tdt)(∫yx∣Sχ(u)∣u2du)≪logx∫yx∣Sχ(t)∣t2dt. Incorporating (3.17), (3.9), and the above into (3.16) establishes (3.8).□ 4. Uniform bounds for quadratic characters Here we collect known uniform bounds for character sums and values of the logarithmic derivatives of Dirichlet L-functions for quadratic characters. We apply the former to obtain estimates for the error terms arising in Lemma 3.2. 4.1. 
Character sums We state the celebrated result of Burgess [2] specialized to quadratic characters which was extended from cube-free moduli to all moduli by Heath-Brown [9, Lemma 2.4]. Lemma 4.1 Let χ(modD)be a quadratic Dirichlet character. For any η>0,N≥1 ⁠, and integer k≥3 ⁠, ∑n≤Nχ(n)≪η,kD(k+1)/4k2+ηN1−1/k. We also record the well-known GRH-conditional estimate for character sums. Lemma 4.2 Let χ(modD)be a non-principal Dirichlet character and suppose L(s,χ)satisfies GRH. For any η>0and N≥1 ⁠, ∑n≤Nχ(n)≪ηDηN1/2. 4.2. Errors from Lemma 3.2 The results of Section 4.1 allow us to obtain power-saving estimates for the error terms in Lemma 3.2 for small values of x relative to the conductor D ⁠. Lemma 4.3 Let χ(modD)be a quadratic Dirichlet character and let 0<ε<120be arbitrary. Let E0(x;χ)and E1(x;χ)be as in Lemma3.2. If x≫εD1/4+εthen E0(x;χ)≪εx1−ε2/2E1(x;χ)≪εx−ε2/2. (4.1)If L(s,χ)satisfies GRH and x≫εDεthen E0(x;χ)≪εx2/3+εandE1(x;χ)≪εx−1/3+ε. (4.2) Proof First, we consider E0=E0(x;χ) ⁠. Let 1≤y≤x be a parameter yet to be chosen. From Lemma 4.1 and the definition of E0 ⁠, we have that E0≪η,kx−1y2+Dk+14k2+ηx2−1/ky−1. Selecting y=Dk+112k2+η3x1−1/3k,η=18k2,k=⌈1/ε⌉≥20 yields the estimate for E0 in (4.1) since x≫εD1/4+ε ⁠. Now, assume GRH holds for L(s,χ) and x≫εDε ⁠. Utilize Lemma 4.2 with η=3ε2/2 and select y=Dε2/2x5/6 ⁠. This implies that E0≪εy2x+D3ε2/2x3/2y≪εDε2x2/3≪εx2/3+ε as desired. The arguments for E1=E1(x;χ) are similar. Again, by Lemma 4.1, E1≪k,ηyx+Dk+14k2+ηx1−1/k(logx)2y. Selecting y=Dk+18k2+η2x1−1/2k,η=18k2,k=⌈1/ε⌉≥20 yields the estimate for E1 in (4.1) since x≫εD1/4+ε ⁠. If L(s,χ) satisfies GRH, then we use Lemma 4.2 with η=ε2 and select y=Dε2x2/3 ⁠. This implies that E1≪εyx+D3ε2/2y−1/2(logx)2≪εDε2/2x−1/3(logx)2≪εx−1/3+ε as desired.□ 4.3. Logarithmic derivatives We record two results related to the value of the logarithmic derivative of L(s,χ) at s=1 for a quadratic Dirichlet character χ ⁠. Lemma 4.4 (Heath-Brown). 
If χ(modD)is a quadratic character then, for η>0 ⁠, −L′L(1,χ)≤(18+η)logD+Oη(1). Proof This follows from the proof of [9, Lemma 3.1] with some minor modifications to allow for any modulus D (not just sufficiently large) and a slightly wider range of the quantity σ therein, say 1<σ<1+ε ⁠. See [23, Proposition 2.6] for details.□ Lemma 4.5 If χ(modD)is a quadratic character and L(s,χ)satisfies GRH then for η>0 −L′L(1,χ)≪loglogD≤ηlogD+Oη(1). Proof This is well known and can be deduced from the arguments in Lemma 5.5.□ 5. Average bounds for quadratic characters The purpose of this section is analogous to Section 4, except we focus on estimates averaging over a certain class of quadratic characters attached to discriminants. To be more specific, for Q≥3 ⁠, recall D(Q)={discriminants−Dwith3≤D≤Q}. Here a discriminant is that of a primitive positive definite binary quadratic form. We emphasize that a discriminant −D∈D(Q) is not necessarily fundamental. The associated Kronecker symbol χ−D(·)= (−D· ) is itself a quadratic Dirichlet character. Note the character is primitive if and only if −D is a fundamental discriminant. Our goal is to average certain quantities involving χ−D over −D∈D(Q) ⁠. 5.1. Character sums We record a special case of Heath-Brown’s mean value theorem for primitive quadratic characters [10, Corollary 3]. Lemma 5.1 (Heath-Brown) Let N,Q≥1and let a1,…,anbe arbitrary complex numbers. Let S(Q)denote the set of all primitive quadratic characters of conductor at most Q ⁠. Then ∑χ∈S(Q)∣∑n≤Nanχ(n)∣2≪η((QN)1+η+QηN2+η)max1≤n≤N∣an∣for any η>0 ⁠. Using Lemma 5.1, we deduce an analogous mean value result for the quadratic characters attached to such discriminants. Lemma 5.2 Let N,Q≥1and let D(Q)be defined by (1.11). Then ∑−D∈D(Q)∣∑n≤Nχ−D(n)∣2≪η(QN)1+η+Q1/2+ηN2+ηfor any η>0 ⁠. Proof If −D∈D(Q) then −D=Δk2, where Δ∈Z is the discriminant of some imaginary quadratic field and k≥1 is some integer. 
Consequently, the Kronecker symbol χ−D(·)= (−D· ) is induced by the primitive quadratic character χΔ(·)= (Δ· ) ⁠. Moreover, χ−D(n)=χk2(n)χΔ(n). For the details on these facts, see [3, Section 7] for example. Therefore, by Lemma 5.1, ∑−D∈D(Q)∣∑n≤Nχ−D(n)∣2≤∑1≤k≤Q∑1≤∣Δ∣≤Q/k2χΔprimitive∣∑n≤Nχk2(n)χΔ(n)∣2≪η∑1≤k≤Q((QN)1+ηk−2−2η+Qηk−2ηN2+η)≪η(QN)1+η+Q1/2+ηN2+η as desired.□ 5.2. Error terms from Lemma 3.2 Again, the results in Section 5.1 leads to estimates for the error terms in Lemma 3.2. Lemma 5.3 Let X≥1,Q≥3and let D(Q)be defined by (1.11). For x≥1and any quadratic character χ ⁠, define E0(x;χ)and E1(x;χ)as in Lemma3.2. Then ∑−D∈D(Q)supX≤x≤2XE0(x;χ−D)≪ηQ1+ηX3/5+η+Q3/4+ηX1+η∑−D∈D(Q)supX≤x≤2XE1(x;χ−D)≪ηQ1+ηX−1/3+η+Q3/4+ηXη (5.1)for η>0 ⁠. Proof For X≤x≤2X ⁠, let 1≤y≤X be an unspecified parameter, depending only on X ⁠. Set y0≔yQ1/2+η ⁠. By Polya–Vinogradov, ∣Sχ(t)∣≪Q1/2logQ for any character χ(modq) with q≤Q ⁠. Therefore, for such χ ⁠, E0(x;χ)≪y2x+∣Sχ(y)∣+x∫y∞∣Sχ(t)∣t2dt≪ηy2x+∣Sχ(y)∣+x∫yy0∣Sχ(t)∣t2dt+xy. As y and y0 depend only on X and Q ⁠, it follows that supX≤x≤2XE0(x;χ)≪ηy2X+∣Sχ(y)∣+X∫yy0∣Sχ(t)∣t2dt+Xy. Summing the above expression over χ=χ−D with −D∈D(Q) ⁠, applying Cauchy–Schwarz, and invoking Lemma 5.2, we see that ∑−D∈D(Q)supX≤x≤2XE0(x;χ−D)≪ηQy2X+Q1+ηy1/2+η+Q3/4+ηy1+η+XQy+XQ1/2∫yy0(Qt)1/2+η+Q1/4+η/2t1+η/2t2dt≪ηQy2X+Q1+ηy1/2+η+Q3/4+ηy1+η+XQy+XQ1+ηy−1/2+η+XQ3/4+η/2y0η/2≪ηQy2X+XQy+XQ1+ηy−1/2+η+X1+ηQ3/4+η. In the last step, we used the definition of y0 and the fact that y≤X ⁠. Selecting y=X4/5 implies the desired result after rescaling η if necessary. We follow the same procedure for the average of E1 ⁠. First, we deduce that supX≤x≤2XE1(x;χ−D)≪ηyX+logX∫yy0∣Sχ(t)∣logtt2dt+log(Qy)y. Again, summing over χ=χ−D with −D∈D(Q) ⁠, we similarly conclude that ∑−D∈D(Q)supX≤x≤2XE1(x;χ−D)≪ηQyX+Qlog(Qy)y+Q1/2logX∫yy0(Qt)1/2+η+Q1/4+η/2t1+η/2t2dt≪ηQyX+Qlog(Qy)y+(Q1+ηy−1/2+η+Q3/4+η/2y0η/2)logX≪ηQyX+Q1+ηy1−η+XηQ1+ηy−1/2+η+XηQ3/4+η. 
Selecting y=X2/3 yields the desired result.□ Lemma 5.4 Let 0<ε<1/8,X≥1and Q≥3 ⁠. Let c=c(ε)>0and C=C(ε)≥1be arbitrary constants. For all except at most Oε(Q1−ε/10)discriminants −D∈D(Q) ⁠, E0(x;χ−D)≤x7/8+ε,E1(x;χ−D)≤x−1/8+ε, (5.2)uniformly for cDε≤x≤CD2+ε ⁠. Here E0and E1are defined as in Lemma3.2. Proof Without loss, we need only consider discriminants −D∈D(Q) satisfying D≥Q1−ε since the remainder are a collection of negligible size O(Q1−ε) ⁠. For X≥1 ⁠, define E(X,Q,ε)≔{−D∈D(Q):thereexistsX≤x≤2Xviolating(5.2)forχ−D}. By Lemma 5.3 with η=ε/8 ⁠, it follows that E(X,Q,ε)≪εQ1+ε/8X−11/40−7ε/8+Q3/4+ε/8X1/8−7ε/8. Dyadically summing this estimate over X between cQε(1−ε) and CQ2+ε ⁠, we see that the total number of discriminants −D∈D(Q) satisfying D≥Q1−ε and violating (5.2) anywhere in the range cDε≤x≤CD2+ε is bounded by ≪ε(Q1+18ε−1140ε(1−ε)+Q1−32ε)logQ≪εQ1−37320εlogQ≪εQ1−110ε as ε<1/8 ⁠.□ 5.3. Logarithmic derivatives For Q≥3 ⁠, define D*(Q)={fundamentaldiscriminantsΔwith3≤∣Δ∣≤Q}. We show that, aside from a sparse set of fundamental discriminants in D*(Q) ⁠, the logarithmic derivative of L(s,χΔ) at s=1 satisfies a GRH-quality bound. The key inputs are the explicit formula and Jutila’s zero density estimate for primitive quadratic characters. Lemma 5.5 Let Q≥3and ε>0be arbitrary. For all except at most Oε(Q5/6+ε)fundamental discriminants Δ∈D*(Q) ⁠, −L′L(1,χΔ)≪εloglog∣Δ∣. (5.3) Proof We modify the arguments leading to [18, Theorem 3]. Define Dε*(Q) to be the set of fundamental discriminants Δ∈D*(Q) such that ∣Δ∣≤Q and whose L-function L(s,χΔ) is zero-free in the rectangle 34<R{s}<1∣I{s}∣≤∣Δ∣ε. (5.4) First, we estimate −L′L(s,χΔ) for Δ∈Dε*(Q) ⁠. For simplicity, write χ=χΔ ⁠. From the explicit formula in the form given by [12, p. 261], one can verify that −L′L(1,χ)=1y−1∑m<y(ym−1)Λ(m)χ(m)−1y−1∑ρyρ−1ρ(1−ρ)+O(logyy) (5.5) for y≥2 ⁠, where the sum is taken over all non-trivial zeros ρ=β+iγ of L(s,χ) ⁠. 
From [12, Equation (5.4.6)] and the prime number theorem, it follows for T≥1 that −L′L(1,χ)=−1y−1∑ρ∣γ∣≤Tyρ−1ρ(1−ρ)+O(logy+log(∣Δ∣T)T+log2∣Δ∣y). Set T=∣Δ∣ε ⁠. By the symmetry of the functional equation for real characters χ and the fact that L(s,χ) has no zeros in (5.4), it follows that every zero appearing in the sum over ρ satisfies 14≤R{ρ}≤34 ⁠. Thus, trivially bounding the remaining zeros, we deduce that −L′L(1,χ)≪logy+y−1/4∑ρ∣γ∣≤∣Δ∣ε11+∣γ∣2+log∣Δ∣∣Δ∣ε+log2∣Δ∣y≪logy+log∣Δ∣y1/4+log∣Δ∣∣Δ∣ε+log2∣Δ∣y for y≥2 ⁠. Setting y=(log∣Δ∣)4+2 implies (5.3) holds for all Δ∈Dε*(Q) ⁠. It remains to show that the number of discriminants Δ∈D*(Q)⧹Dε*(Q) is small. Jutila’s zero density estimate [13, Theorem 2] implies that ∑Δ∈D*(Q)N(σ,T,χΔ)≪ε(QT)7−6σ6−4σ+ε10, where N(σ,T,χ) is the number of zeros ρ=β+iγ of L(s,χ) with σ<β<1 and ∣γ∣≤T ⁠. Setting σ=3/4 and T=Qε ⁠, we see that the number of fundamental discriminants Δ∈D*(Q) whose L-function L(s,χΔ) has a zero in the rectangle (5.4) is at most Oε(Q5/6+ε) ⁠. Hence, ∣D*(Q)⧹Dε*(Q)∣≪εQ5/6+ε as required.□ Lemma 5.5 implies the same type of result for the set of all discriminants D(Q) ⁠. Lemma 5.6 Let Q≥3and ε>0be arbitrary. For all except at most Oε(Q5/6+ε)discriminants −D∈D(Q) ⁠, −L′L(1,χ−D)≪εloglogD≤εlogD+Oε(1). Proof Let −D∈D(Q) so, as in the proof of Lemma 5.2, we may write −D=Δk2 for some negative fundamental discriminant Δ and an integer k≥1 ⁠. It follows that χ−D is induced by the primitive character χΔ and, in particular, χ−D=χΔχk2 ⁠. This implies that ∣L′L(1,χ−D)−L′L(1,χΔ)∣≤∑(n,k)≠1Λ(n)n≪∑p∣klogpp≪loglogk. Thus, if Δ is a fundamental discriminant satisfying (5.3) then −L′L(1,χ−D)≪εloglog∣Δ∣+loglogk≪εloglogD. Lemma 5.5 implies that the total number of discriminants −D failing the above bound is ≪ε∑k≤Q(Qk2)5/6+ε≪εQ5/6+ε. This completes the proof.□ 6. Congruence sum decomposition Let f(u,v)=au2+buv+cv2 be a form with discriminant −D ⁠. For this section, we will not require f to be primitive. 
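For any integer $n \ge 0$, define

$$r_f(n) := |\{(u,v) \in \mathbb{Z}^2 : n = f(u,v)\}|. \tag{6.1}$$

Moreover, for $x \ge 1$ and positive integers $\ell$ and $d$, define

$$\begin{aligned}
\mathcal{A} = \mathcal{A}(x,f) &:= \{(u,v) \in \mathbb{Z}^2 : f(u,v) \le x\},\\
\mathcal{A}_{\ell} = \mathcal{A}_{\ell}(x,f) &:= \{(u,v) \in \mathcal{A} : f(u,v) \equiv 0 \pmod{\ell}\},\\
\mathcal{A}_{\ell}(d) = \mathcal{A}_{\ell}(x,f;d) &:= \{(u,v) \in \mathcal{A}_{\ell} : (v,\ell) = d\}.
\end{aligned} \tag{6.2}$$

We will suppress the dependence on $x$ and $f$ whenever it is clear from context. This will be the case for almost the entirety of the paper. Observe that

$$|\mathcal{A}| = \sum_{n \le x} r_f(n) \quad\text{and}\quad |\mathcal{A}_{\ell}| = \sum_{\substack{n \le x \\ \ell \mid n}} r_f(n) = \sum_{d \mid \ell} |\mathcal{A}_{\ell}(d)|. \tag{6.3}$$

Note the last identity holds since $\mathcal{A}_{\ell}$ is a disjoint union of the sets $\mathcal{A}_{\ell}(d)$ over $d \mid \ell$. To calculate $|\mathcal{A}_{\ell}(d)|$, and subsequently $|\mathcal{A}_{\ell}|$, we will need to decompose it into sums similar to $|\mathcal{A}_{\ell}(1)|$ and estimate them with uniformity over all parameters. To this end, we introduce some additional notation. For any integer $\ell \ge 1$ and $m \in \mathbb{Z}/\ell\mathbb{Z}$, define

$$\begin{aligned}
\mathcal{B}_{\ell} = \mathcal{B}_{\ell}(x,f) &:= \{(u,v) \in \mathcal{A} : (v,\ell) = 1,\ f(u,v) \equiv 0 \pmod{\ell}\},\\
\mathcal{B}_{\ell}(m) = \mathcal{B}_{\ell}(x,f;m) &:= \{(u,v) \in \mathcal{A} : (v,\ell) = 1,\ u \equiv mv \pmod{\ell}\}.
\end{aligned} \tag{6.4}$$

Note that $\mathcal{B}_{\ell}$ is exactly the same as $\mathcal{A}_{\ell}(1)$, but we distinguish it for the sake of clarity. The crucial property of the sets $\mathcal{B}_{\ell}$ and $\mathcal{B}_{\ell}(m)$ is summarized in the following lemma.

**Lemma 6.1.** Let $f(u,v) = au^2 + buv + cv^2$ be a positive definite binary integral quadratic form of discriminant $-D$ and let $\ell \ge 1$ be a squarefree integer. Define

$$\mathcal{M}(\ell) = \mathcal{M}_f(\ell) := \{m \in \mathbb{Z}/\ell\mathbb{Z} : am^2 + bm + c \equiv 0 \pmod{\ell}\}.$$

Then

$$|\mathcal{B}_{\ell}| = \sum_{m \in \mathcal{M}(\ell)} |\mathcal{B}_{\ell}(m)|. \tag{6.5}$$

Furthermore, $M(\ell) = M_f(\ell) := |\mathcal{M}(\ell)|$ is a non-negative multiplicative function satisfying

$$M(p) = \begin{cases} 1 + \chi(p) & \text{if } p \nmid a,\\ \chi(p) & \text{if } p \mid a \text{ and } p \nmid (a,b,c),\\ p & \text{if } p \mid (a,b,c), \end{cases} \tag{6.6}$$

for all primes $p$. Here $\chi = \chi_{-D}$ is the corresponding Kronecker symbol.

*Proof.* Let $(u,v) \in \mathcal{B}_{\ell}$. As $(v,\ell) = 1$, select $m \in \mathbb{Z}/\ell\mathbb{Z}$ such that $u \equiv mv \pmod{\ell}$. Thus,

$$f(u,v) \equiv 0 \pmod{\ell} \iff (am^2 + bm + c)v^2 \equiv 0 \pmod{\ell} \iff m \in \mathcal{M}(\ell).$$

This implies $\mathcal{B}_{\ell}$ is a union of $\mathcal{B}_{\ell}(m)$ over $m \in \mathcal{M}(\ell)$. One can verify from (6.4) that $m_1 \not\equiv m_2 \pmod{\ell}$ implies $\mathcal{B}_{\ell}(m_1) \cap \mathcal{B}_{\ell}(m_2) = \emptyset$. Thus, the union is in fact disjoint, yielding (6.5).

Next, we count $M(\ell) = |\mathcal{M}(\ell)|$. The function $M(\ell)$ is multiplicative by the Chinese Remainder Theorem. Let $p$ be an odd prime. If $p \nmid a$ then $M(p) = 1 + \chi(p)$ by the definition of $\chi$. If $p \mid a$ then for $m \in \mathcal{M}(p)$

$$0 \equiv am^2 + bm + c \equiv bm + c \pmod{p}. \tag{6.7}$$

Note in this scenario $\chi(p) = 0$ or $1$ only. We consider cases. If $p \nmid b$ then $m \equiv -b^{-1}c \pmod{p}$ is the only solution to (6.7). Thus, $M(p) = 1 = \big(\tfrac{b^2}{p}\big) = \big(\tfrac{b^2 - 4ac}{p}\big) = \chi(p)$. If $p \mid b$ then condition (6.7) becomes $c \equiv 0 \pmod{p}$. We further subdivide the cases. If $p \nmid c$ then no value of $m$ satisfies (6.7), implying $M(p) = 0 = \big(\tfrac{b^2 - 4ac}{p}\big) = \chi(p)$. If $p \mid c$ then $p \mid (a,b,c)$ in this subcase. Hence, all $m \in \mathbb{Z}/p\mathbb{Z}$ vacuously satisfy (6.7) so $M(p) = p$. Comparing these cases, we see $M(p)$ indeed satisfies (6.6) for all odd primes $p$. For $p = 2$, one can verify by a tedious case analysis that $M(2)$ also satisfies (6.6). □

In light of Lemma 6.1, the main goal of this section is to determine the size of $|\mathcal{B}_{\ell}(m)|$ for any $m \in \mathbb{Z}/\ell\mathbb{Z}$. For convenience, set

$$V = V(x,f) := \sqrt{\frac{4ax}{D}}. \tag{6.8}$$

This notation will be used throughout the paper. While we are more interested in the case $V \ge 1$, we only assume $x \ge 1$ in all of our arguments so it is possible that $0 < V < 1$. Recall $\varphi$ denotes the Euler totient function and $\tau$ is the divisor function.

**Lemma 6.2.** Let $f(u,v) = au^2 + buv + cv^2$ be a positive definite binary integral quadratic form of discriminant $-D$. Let $\ell \ge 1$ be a squarefree integer and $m \in \mathbb{Z}/\ell\mathbb{Z}$. For $x \ge 1$,

$$|\mathcal{B}_{\ell}(m)| = \frac{\varphi(\ell)}{\ell^2} \cdot \frac{\pi\sqrt{D}}{2a} V^2 + O\Big(V + \ell^{1/2}\tau(\ell)\frac{\sqrt{D}}{a}V^{1/2} + \delta(\ell)\Big), \tag{6.9}$$

where $\mathcal{B}_{\ell}(m) = \mathcal{B}_{\ell}(x,f;m)$ is defined by (6.4), $V = V(x,f)$ is defined by (6.8), and $\delta(\ell)$ is the indicator function for $\ell = 1$.

*Remark.* The expressions and implied constant on the right-hand side of (6.9) are independent of $m \in \mathbb{Z}/\ell\mathbb{Z}$.

*Proof.* Counting the number of pairs $(u,v) \in \mathbb{Z}^2$ satisfying $f(u,v) \le x$ amounts to verifying the inequality

$$(2au + bv)^2 + Dv^2 \le 4ax.$$

Fixing $v$, any $u$ satisfying the above inequality lies in the range

$$\frac{-bv - \sqrt{4ax - Dv^2}}{2a} \le u \le \frac{-bv + \sqrt{4ax - Dv^2}}{2a}.$$

Without loss of generality, we may assume $m$ is an integer lying in $\{0, 1, \dots, \ell - 1\}$. Restricting to $u = mv + j\ell$, we see that the integer $j$ must lie in the range

$$\frac{-(b + 2am)v - \sqrt{4ax - Dv^2}}{2a\ell} \le j \le \frac{-(b + 2am)v + \sqrt{4ax - Dv^2}}{2a\ell}.$$

For each fixed $v$ and solution $u \equiv mv \pmod{\ell}$, the total number of such integers $j$ is, therefore,

$$\frac{1}{\ell}F(v) + O(1), \quad\text{where}\quad F(v) = \frac{1}{a}\sqrt{4ax - Dv^2} = \frac{\sqrt{D}}{a}\sqrt{V^2 - v^2}. \tag{6.10}$$

Now, summing over all integers $v$ satisfying $|v| \le V$ and $(v,\ell) = 1$, we deduce that

$$|\mathcal{B}_{\ell}(m)| = \frac{1}{\ell}\sum_{\substack{|v| \le V \\ (v,\ell) = 1}} F(v) + O\Big(\sum_{\substack{|v| \le V \\ (v,\ell) = 1}} 1\Big).$$

The term $v = 0$ contributes to the above sums if and only if $\ell = 1$. Let $\delta(\ell)$ be the indicator function for $\ell = 1$. We separate the term $v = 0$, if necessary, in the sums above and note $F(v)$ is even to see that

$$|\mathcal{B}_{\ell}(m)| = \frac{2}{\ell}\sum_{\substack{1 \le v \le V \\ (v,\ell) = 1}} F(v) + \frac{\delta(\ell)}{\ell}\cdot\frac{\sqrt{D}}{a}V + O(V + \delta(\ell)). \tag{6.11}$$

We remove the condition $(v,\ell) = 1$ via Möbius inversion and deduce that

$$\sum_{\substack{1 \le v \le V \\ (v,\ell) = 1}} F(v) = \sum_{d \mid \ell} \mu(d) \sum_{1 \le w \le V/d} F(dw). \tag{6.12}$$

By Lemma 3.1, we see that

$$\sum_{1 \le w \le V/d} F(dw) = d\,\frac{\sqrt{D}}{a}\Big(\frac{\pi V^2}{4d^2} - \frac{V}{2d} + O\big(\sqrt{V/d}\big)\Big).$$

Since $\sum_{d \mid \ell} \frac{\mu(d)}{d} = \frac{\varphi(\ell)}{\ell}$, $\sum_{d \mid \ell} \mu(d) = \delta(\ell)$, and $\sum_{d \mid \ell} d^{1/2} \ll \ell^{1/2}\tau(\ell)$, it follows by (6.12) that

$$\sum_{\substack{1 \le v \le V \\ (v,\ell) = 1}} F(v) = \frac{\varphi(\ell)}{\ell}\cdot\frac{\pi\sqrt{D}}{4a}V^2 - \delta(\ell)\frac{\sqrt{D}}{2a}V + O\Big(\ell^{1/2}\tau(\ell)\frac{\sqrt{D}}{a}V^{1/2}\Big). \tag{6.13}$$

Combining (6.11) and (6.13) yields (6.9). Note that the terms involving $\delta(\ell)\frac{\sqrt{D}}{a}V$ cancel. □

We conclude this section by calculating $|\mathcal{B}_{\ell}|$.

**Lemma 6.3.** Let $f(u,v) = au^2 + buv + cv^2$ be a positive definite binary integral quadratic form of discriminant $-D$. If $\ell \ge 1$ is a squarefree integer then

$$|\mathcal{B}_{\ell}| = M(\ell)\Big(\frac{\varphi(\ell)}{\ell^2}\cdot\frac{\pi\sqrt{D}}{2a}V^2 + O\Big(V + \ell^{1/2}\tau(\ell)\frac{\sqrt{D}}{a}V^{1/2} + \delta(\ell)\Big)\Big),$$

where $\mathcal{B}_{\ell} = \mathcal{B}_{\ell}(x,f)$ is defined by (6.4), $V = V(x,f)$ is defined by (6.8), $\delta(\ell)$ is the indicator function for $\ell = 1$, and $M(\ell) = M_f(\ell)$ is a multiplicative function defined by (6.6).

*Proof.* This is an immediate consequence of Lemmas 6.1 and 6.2 since the latter lemma's estimates are uniform over all $m \in \mathbb{Z}/\ell\mathbb{Z}$. □

## 7. Local densities

We may now assemble our tools to establish the key technical proposition. Namely, we estimate the congruence sums given by (6.3) and calculate the local densities.

**Proposition 7.1.** Let $f$ be a primitive positive definite binary quadratic form with discriminant $-D$. If $\ell \ge 1$ is a squarefree integer then for $x \ge 1$,

$$|\mathcal{A}_{\ell}| = \sum_{\substack{n \le x \\ \ell \mid n}} r_f(n) = g(\ell)\frac{\pi\sqrt{D}}{2a}V^2 + O\Big(\tau_3(\ell)V + \ell^{1/2}\tau(\ell)\tau_3(\ell)\frac{\sqrt{D}}{a}V^{1/2} + 1\Big), \tag{7.1}$$

where $V = \sqrt{4ax/D}$ and $g$ is a multiplicative function satisfying

$$g(p) = \frac{1}{p}\Big(1 + \chi(p) - \frac{\chi(p)}{p}\Big) \quad\text{for all primes } p. \tag{7.2}$$

Here $\chi = \chi_{-D}$ is the corresponding Kronecker symbol.

*Proof.* Let $d \mid \ell$ and let $\mathcal{A}_{\ell}(d) = \mathcal{A}_{\ell}(x,f;d)$ be defined by (6.2).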
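From observation (6.3), it suffices to calculate $|\mathcal{A}_{\ell}(d)|$. First, we introduce some notation. For any integer $r \ge 1$, set

$$f_r(u,w) := f(u,rw) = au^2 + bruw + cr^2w^2. \tag{7.3}$$

Notice that its discriminant is $-r^2D$. Therefore, it follows for any $\alpha > 0$ that

$$V(\alpha^2 x, f_r) = \frac{\alpha V}{r} \quad\text{and}\quad \chi_{-r^2D}(n) = \begin{cases} \chi(n) & \text{if } (n,r) = 1,\\ 0 & \text{otherwise,} \end{cases}$$

where $V = V(x,f)$ and $\chi = \chi_{-D}$ as usual. Now, write $\ell = dk$ so $(d,k) = 1$ as $\ell$ is squarefree. We wish to characterize each point $(u,v) \in \mathcal{A}_{\ell}(x,f;d)$. Since $(v,\ell) = d$, it follows by the Chinese Remainder Theorem that

$$\begin{aligned}
f(u,v) \equiv 0 \pmod{\ell} &\iff au^2 \equiv 0 \pmod{d} \ \text{ and } \ f(u,v) \equiv 0 \pmod{k}\\
&\iff u \equiv 0 \pmod{\tfrac{d}{(a,d)}} \ \text{ and } \ f(u,v) \equiv 0 \pmod{k}.
\end{aligned} \tag{7.4}$$

Write $u = \frac{d}{(a,d)}s$ and $v = dt$ for integers $s$ and $t$. Note $(t,k) = 1$ as $(v,\ell) = d$ and $\ell$ is squarefree. Then one can verify that

$$f(u,v) = \frac{d^2}{(a,d)^2}\cdot\big(as^2 + b(a,d)st + c(a,d)^2t^2\big) = \frac{d^2}{(a,d)^2}\cdot f_{(a,d)}(s,t). \tag{7.5}$$

From this change of variables, (7.4) and (7.5), we see that

$$\begin{aligned}
f(u,v) \equiv 0 \pmod{\ell} &\iff f_{(a,d)}(s,t) \equiv 0 \pmod{k},\\
f(u,v) \le x &\iff f_{(a,d)}(s,t) \le \frac{(a,d)^2}{d^2}x.
\end{aligned} \tag{7.6}$$

Note by (7.5) that the congruence conditions are equivalent as $(d,k) = 1$ and $\ell = dk$. Since $(t,k) = 1$ necessarily, we have therefore established that

$$|\mathcal{A}_{\ell}(x,f;d)| = \Big|\mathcal{B}_k\Big(\frac{(a,d)^2x}{d^2}, f_{(a,d)}\Big)\Big|. \tag{7.7}$$

Summing this identity over $d \mid \ell$, we apply observation (6.3) and Lemma 6.3 to deduce that

$$\begin{aligned}
|\mathcal{A}_{\ell}(x,f)| &= \sum_{\ell = dk}\Big|\mathcal{B}_k\Big(\frac{(a,d)^2x}{d^2}, f_{(a,d)}\Big)\Big|\\
&= \frac{\pi\sqrt{D}}{2a}V^2\cdot\sum_{\ell = dk} M_{f_{(a,d)}}(k)\frac{\varphi(k)}{k^2}\frac{(a,d)}{d^2}\\
&\quad + O\Big(V\cdot\sum_{\ell = dk}\frac{M_{f_{(a,d)}}(k)}{d} + \frac{\sqrt{D}}{a}V^{1/2}\sum_{\ell = dk} M_{f_{(a,d)}}(k)k^{1/2}\tau(k)d^{-1/2} + \sum_{\ell = dk} M_{f_{(a,d)}}(k)\delta(k)\Big).
\end{aligned} \tag{7.8}$$

We wish to simplify the remaining sums and error term. Let $r \mid \ell$. As $f$ is primitive,

$$M_f(p) = \begin{cases} 1 + \chi(p) & \text{if } p \nmid a,\\ \chi(p) & \text{if } p \mid a, \end{cases}$$

by (6.6). To compute $M_{f_r}$, observe by the primitivity of $f$ that a prime $p$ divides $(a, br, cr^2)$ if and only if $p$ divides $(a,r)$. Moreover, if $p \mid r$ then $\chi_{-r^2D}(p) = \big(\tfrac{-r^2D}{p}\big) = \big(\tfrac{r^2}{p}\big)\big(\tfrac{-D}{p}\big) = 0$ and, similarly, if $p \nmid r$ then $\chi_{-r^2D}(p) = \chi(p)$. Combining these observations with (6.6) and (7.3), we see that

$$M_{f_{(a,d)}}(p) = \begin{cases} 1 + \chi(p) & \text{if } p \nmid a,\\ \chi(p) & \text{if } p \mid a \text{ and } p \nmid (a,d),\\ p & \text{if } p \mid (a,d). \end{cases}$$

In particular, as $(d,k) = 1$, it follows that $M_{f_{(a,d)}}(k) = M_f(k)$. Hence,

$$\begin{aligned}
\sum_{\ell = dk} M_{f_{(a,d)}}(k)\frac{\varphi(k)}{k^2}\frac{(a,d)}{d^2} &= \sum_{\ell = dk} M_f(k)\cdot\frac{\varphi(k)}{k^2}\cdot\frac{(a,d)}{d^2}\\
&= \prod_{\substack{p \mid \ell \\ p \nmid a}}\Big((1 + \chi(p))\Big(\frac{1}{p} - \frac{1}{p^2}\Big) + \frac{1}{p^2}\Big) \times \prod_{p \mid (\ell,a)}\Big(\chi(p)\Big(\frac{1}{p} - \frac{1}{p^2}\Big) + \frac{1}{p}\Big)\\
&= \prod_{p \mid \ell}\Big(\frac{1 + \chi(p)}{p} - \frac{\chi(p)}{p^2}\Big) = g(\ell).
\end{aligned} \tag{7.9}$$

Similarly,

$$\sum_{\ell = dk}\frac{M_{f_{(a,d)}}(k)}{d} = \prod_{\substack{p \mid \ell \\ p \nmid a}}\Big(1 + \chi(p) + \frac{1}{p}\Big) \times \prod_{p \mid (\ell,a)}\Big(\chi(p) + \frac{1}{p}\Big) \ll \tau_3(\ell), \tag{7.10}$$

which implies that

$$\sum_{\ell = dk} M_{f_{(a,d)}}(k)k^{1/2}\tau(k)d^{-1/2} \ll \ell^{1/2}\tau(\ell)\sum_{\ell = dk}\frac{M_{f_{(a,d)}}(k)}{d} \ll \ell^{1/2}\tau(\ell)\tau_3(\ell). \tag{7.11}$$

Combining the observation that $\sum_{\ell = dk} M_{f_{(a,d)}}(k)\delta(k) = M_{f_{(a,\ell)}}(1) = 1$ with (7.8), (7.9), (7.10) and (7.11) yields the desired result. □

To obtain a better intuition for the quality of Proposition 7.1, we present the special case when $\ell = 1$ as a corollary below. We do not claim that this corollary is new, but we have not seen it stated in the literature and thought it may be of independent interest.

**Corollary 7.2.** Let $f(u,v) = au^2 + buv + cv^2$ be a primitive positive definite binary quadratic form with discriminant $-D$. For $x \ge 1$,

$$\sum_{n \le x} r_f(n) = \frac{2\pi x}{\sqrt{D}} + O\Big(\frac{(ax)^{1/2}}{D^{1/2}} + \frac{(Dx)^{1/4}}{a^{3/4}} + 1\Big).$$

*Remarks.* Suppose $f$ is reduced so $|b| \le a \le c$. It is well known (see for example [1, Lemma 3.1]) that

$$\sum_{n \le x} r_f(n) = \frac{2\pi x}{\sqrt{D}} + O\Big(\frac{x^{1/2}}{a^{1/2}} + 1\Big).$$

This estimate, like Corollary 7.2, gives the asymptotic $\sim \frac{2\pi x}{\sqrt{D}}$ as long as $x/c \to \infty$, but the error term in Corollary 7.2 is stronger than the above whenever $x \ge c$. The source of this improvement is a standard analysis of the sawtooth function in Lemma 3.1 and its subsequent application in Lemma 6.2. As discussed in Section 1.2, the condition $x \ge c$ is the 'non-trivial' range for counting the lattice points inside the ellipse $f(u,v) \le x$ whenever $f$ is reduced.

## 8. Application of Selberg's sieve

We now apply Selberg's sieve to give an upper bound for the number of primes in a short interval represented by a reduced positive definite primitive integral binary quadratic form. We leave the calculation of the main term's implied constant unfinished as the final arguments vary slightly for Theorems 1.4 and 1.5.

**Proposition 8.1.** Let $f(u,v) = au^2 + buv + cv^2$ be a reduced positive definite integral binary quadratic form with discriminant $-D$. Let $(ax)^{1/2} \le y \le x$. Set

$$z = \Big(\frac{a}{Dx}\Big)^{1/4} y^{1/2} (\log y)^{-7} + 1. \tag{8.1}$$

If $x \ge D/a$ then

$$\pi_f(x) - \pi_f(x - y) < \Big\{\frac{\log y}{J} + O\big((\log y)^{-1}\big)\Big\}\frac{\delta_f\, y}{h(-D)\log y}, \tag{8.2}$$

where

$$J = \begin{cases} \dfrac{1}{L(1,\chi)}\displaystyle\sum_{\ell < z} g(\ell) & \text{if } L(1,\chi) \ge (\log y)^{-2},\\[2mm] (\log y)^2 & \text{otherwise.} \end{cases} \tag{8.3}$$

Here $\chi = \chi_{-D}$ is the corresponding Kronecker symbol and $g$ is the completely multiplicative function defined by (7.2).

*Proof.* As $f$ is reduced, we have that $|b| \le a \le c$ and moreover $c \asymp D/a \ge \sqrt{D} \ge a$. We will frequently apply these properties while only mentioning that $f$ is reduced. Our argument is divided according to the size of $L(1,\chi)$.

First, assume $L(1,\chi) < (\log y)^{-2}$. Let $w = \#\{A \in SL_2(\mathbb{Z}) : A \cdot f = f\}$ so by [25, p. 63, Satz 2], $w = 6$ or $4$ if $-D = -3$ or $-4$, respectively, and $w = 2$ otherwise. If a prime $p$ is represented by $f$ then it is represented with multiplicity equal to $\delta_f^{-1}w$. Thus, by Corollary 7.2,

$$\frac{w}{\delta_f}\big(\pi_f(x) - \pi_f(x - y)\big) \le \sum_{x - y < n \le x} r_f(n) = \frac{2\pi y}{\sqrt{D}} + O\Big(\frac{(ax)^{1/2}}{D^{1/2}}\Big)$$

because $x \ge D/a$ and $f$ is reduced. The class number formula (see [25, p. 72, Satz 5] for example) states that

$$h(-D) = \frac{w\sqrt{D}}{2\pi}L(1,\chi). \tag{8.4}$$

Hence, by our assumption on $L(1,\chi)$,

$$\pi_f(x) - \pi_f(x - y) \le L(1,\chi)\Big\{1 + O\Big(\frac{(ax)^{1/2}}{y}\Big)\Big\}\frac{\delta_f\, y}{h(-D)} \ll \frac{y}{h(-D)(\log y)^2}.$$

In the last step, we used that $y \ge (ax)^{1/2}$. This establishes (8.2) when $L(1,\chi) < (\log y)^{-2}$. Therefore, we may henceforth assume

$$L(1,\chi) \ge (\log y)^{-2}. \tag{8.5}$$

Defining $P = P(z) = \prod_{p \le z} p$, it follows that

$$\frac{w}{\delta_f}\big(\pi_f(x) - \pi_f(x - y)\big) \le \sum_{\substack{x - y < n \le x \\ (n,P) = 1}} r_f(n) + \frac{w}{\delta_f}\pi(z), \tag{8.6}$$

where $\pi(z)$ is the number of primes up to $z$. We proceed to estimate the sieved sum. Using Proposition 7.1 and Selberg's upper bound sieve [7, Theorem 7.1] with level of distribution $z^2$, we see that

$$\sum_{\substack{x - y < n \le x \\ (n,P) = 1}} r_f(n) < \frac{2\pi y}{\sqrt{D}\,\mathcal{J}} + \sum_{\substack{\ell \mid P \\ \ell < z^2}} r_{\ell}\lambda_{\ell}, \tag{8.7}$$

where

$$\mathcal{J} = \sum_{\substack{\ell \mid P \\ \ell < z}} h(\ell), \qquad h(\ell) = \prod_{p \mid \ell}\frac{g(p)}{1 - g(p)}, \qquad |\lambda_{\ell}| \le \tau_3(\ell),$$

and

$$r_{\ell} \ll \tau_3(\ell)V + \ell^{1/2}\tau(\ell)\tau_3(\ell)\frac{\sqrt{D}}{a}V^{1/2}.$$

Here, as usual, $V = \sqrt{4ax/D}$. Note $1 \le V \le x^{1/2}$ as $x \ge D/a$ and $a \le \sqrt{D}$. For the quantity $\mathcal{J}$ in the main term, we treat $g$ as a completely multiplicative function and note by (8.3) that

$$\mathcal{J} \ge \sum_{\ell < z} g(\ell) = L(1,\chi)J. \tag{8.8}$$

The remainder term in (8.7) is bounded in a straightforward manner. Using standard estimates for the $k$-divisor function $\tau_k(\ell)$ (see, for example, [17]) and the prime number theorem, one can verify that

$$\begin{aligned}
\frac{w}{\delta_f}\pi(z) + \sum_{\substack{\ell \mid P \\ \ell < z^2}} r_{\ell}\lambda_{\ell} &\ll \frac{z}{\log z} + V\sum_{\ell < z^2}\tau_3(\ell)^2 + \frac{\sqrt{D}}{a}V^{1/2}\sum_{\ell < z^2}\ell^{1/2}\tau(\ell)\tau_3(\ell)^2\\
&\ll z^2(\log z)^8\cdot V + z^3(\log z)^{17}\cdot\frac{\sqrt{D}}{a}V^{1/2}\\
&\ll \frac{ay}{D}(\log y)^{-6} + \frac{y^{3/2}x^{-1/2}}{\sqrt{D}}(\log y)^{-4}.
\end{aligned}$$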
In the last step, we used that $V = \sqrt{4ax/D}$ and, by (8.1), $z = \big(\frac{a}{Dx}\big)^{1/4}y^{1/2}(\log y)^{-7} + 1 \le y$. Since $y \le x$ and $a \le \sqrt{D}$, we see that the above is $\ll \frac{y}{\sqrt{D}(\log y)^4}$. Thus, applying the class number formula (8.4) and the well-known estimate $L(1,\chi) \ll \log D \ll \log y$, we conclude that

$$\frac{w}{\delta_f}\pi(z) + \sum_{\substack{\ell \mid P \\ \ell < z^2}} r_{\ell}\lambda_{\ell} \ll \frac{y}{h(-D)(\log y)^3}. \tag{8.9}$$

Combining (8.6), (8.7), (8.8) and (8.9) completes the proof of the proposition with a final application of the class number formula (8.4). □

Evidently, from Proposition 8.1, we will require a lower bound for the sum of local densities. We execute the first steps here.

**Lemma 8.2.** Let $g$ be the completely multiplicative function defined by (7.2). For $z \ge 1$,

$$\sum_{\ell < z} g(\ell) \ge L(1,\chi)\log z + L'(1,\chi) + O\big(L(1,\chi) + E_1(z;\chi) + z^{-1}E_0(z;\chi)\big).$$

Here $E_1(z;\chi)$ and $E_0(z;\chi)$ are defined as in Lemma 3.2.

*Proof.* Define

$$G(s) := \sum_{n=1}^{\infty} g(n)n^{-s} = \prod_p \big(1 - g(p)p^{-s}\big)^{-1},$$

which absolutely converges for $\Re\{s\} > 0$ since $|g(p)| \le 2/p$. One can verify that for $\Re\{s\} > 0$

$$G(s) = \zeta(s+1)L(s+1,\chi)\tilde{G}(s), \tag{8.10}$$

where $\zeta(s)$ is the Riemann zeta function, $L(s,\chi)$ is the Dirichlet $L$-function attached to the quadratic character $\chi = \chi_{-D}$, and

$$\tilde{G}(s) := \prod_p\Big(1 - \frac{\chi(p)p^{-s-2} - \chi(p)p^{-2s-2}}{1 - (1 + \chi(p))p^{-s-1} + \chi(p)p^{-s-2}}\Big).$$

It is straightforward to check that $\tilde{G}(s)$ is absolutely convergent for $\Re\{s\} > -1$ and, in particular, $\tilde{G}(0) = 1$. Expanding the Euler product for $\tilde{G}(s)$ and writing $\tilde{G}(s) = \sum_n \tilde{g}(n)n^{-s}$ for some multiplicative function $\tilde{g}$, one can see that $\tilde{g}(n) \ll n^{-2}$. As $\tilde{G}(0) = 1$, it follows that $\sum_{n \le N}\tilde{g}(n) = 1 + O(N^{-1})$. Therefore, from (8.10),

$$\sum_{\ell < z} g(\ell) = \sum_{\ell < z}\sum_{\ell = mn}\frac{(1 * \chi)(m)}{m}\tilde{g}(n) = \sum_{m < z}\frac{(1 * \chi)(m)}{m} + O\Big(z^{-1}\sum_{m < z}(1 * \chi)(m)\Big).$$

The desired result now follows from Lemma 3.2. □

## 9. Representation of primes

We may finally prove Theorems 1.4 and 1.5. In both cases, we will need to apply Proposition 8.1 from which one can see that it suffices to provide an appropriate lower bound for $J$ when $L(1,\chi) \ge (\log y)^{-2}$. By Lemma 8.2, it follows that

$$J \ge \log z + \frac{L'}{L}(1,\chi) + O\Big(1 + E_1(z;\chi)(\log y)^2 + \frac{E_0(z;\chi)(\log y)^2}{z}\Big) \tag{9.1}$$

provided $L(1,\chi) \ge (\log y)^{-2}$, $(ax)^{1/2} \le y \le x$, and $z$ is given by (8.1).
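Recall that $E_0$ and $E_1$ are given by Lemma 3.2 with estimates exhibited in Sections 4 and 5. The proofs for both theorems will employ (9.1). Before we proceed, we wish to emphasize that $f(u,v) = au^2 + buv + cv^2$ is assumed to be a reduced positive definite binary integral quadratic form of discriminant $-D$. Thus, $|b| \le a \le c$ and $a \le \sqrt{D}$.

### 9.1. Proof of Theorem 1.4

Recall we are assuming that

$$\Big(\frac{D^{1+4\phi}}{a}\Big)^{1/2+\varepsilon}x^{1/2+\varepsilon} \le y \le x,$$

where $\phi$ is given by (1.8). As $a \le \sqrt{D}$, this implies that $y \ge (ax)^{1/2}$. Furthermore,

$$z = \Big(\frac{a}{Dx}\Big)^{1/4}y^{1/2}(\log y)^{-7} + 1 \gg_{\varepsilon} D^{\phi + \varepsilon/2}, \quad\text{and}\quad \log z \asymp \log y \asymp \log x.$$

Therefore, applying Lemma 4.4 (or Lemma 4.5 when assuming GRH) and Lemma 4.3 to (9.1), it follows that

$$\begin{aligned}
J &\ge \log z - \Big(\frac{\phi}{2} + \frac{\varepsilon}{4}\Big)\log D + O_{\varepsilon}\big(1 + z^{-\varepsilon^2}\log^2 y\big)\\
&\ge \frac12\log y - \frac14\log x - \Big(\frac14 + \frac{\phi}{2} + \frac{\varepsilon}{4}\Big)\log D + \frac14\log a + O_{\varepsilon}(\log\log y)\\
&= \frac{1 - \theta'}{2}\log y + O_{\varepsilon}(\log\log y),
\end{aligned}$$

where $\theta'$ is defined as in Theorem 1.4. Substituting this estimate in Proposition 8.1 establishes Theorem 1.4 when $L(1,\chi) \ge (\log y)^{-2}$. If $L(1,\chi) < (\log y)^{-2}$ then the desired result follows immediately from Proposition 8.1 and hence completes the proof. □

### 9.2. Proof of Theorem 1.5

Recall $\mathcal{D}(Q)$ is given by (1.11). Let $c_1(\varepsilon) > 0$ be a sufficiently small constant and $C_1(\varepsilon), C_2(\varepsilon) \ge 1$ be sufficiently large constants, all of which depend only on $\varepsilon$. For $Q \ge 3$, let $\mathcal{D}_{\varepsilon}(Q)$ be the subset of discriminants $-D \in \mathcal{D}(Q)$ such that

$$-\frac{L'}{L}(1,\chi_{-D}) \le \varepsilon\log D + C_2(\varepsilon) \tag{9.2}$$

and, for $c_1(\varepsilon)D^{\varepsilon} \le u \le C_1(\varepsilon)D^{2+\varepsilon}$,

$$E_0(u;\chi_{-D}) \le u^{7/8+\varepsilon}, \qquad E_1(u;\chi_{-D}) \le u^{-1/8+\varepsilon}. \tag{9.3}$$

By Lemmas 5.4 and 5.6, the number of discriminants not satisfying these two properties is

$$|\mathcal{D}(Q) \setminus \mathcal{D}_{\varepsilon}(Q)| \ll_{\varepsilon} Q^{1-\varepsilon/10}.$$

Thus, it suffices to show for every discriminant $-D \in \mathcal{D}_{\varepsilon}(Q)$ and reduced positive definite binary quadratic form $f$ of discriminant $-D$ that

$$\pi_f(x) - \pi_f(x - y) < \frac{2}{1 - \theta'}\cdot\frac{\delta_f\, y}{h(-D)\log y}\Big\{1 + O_{\varepsilon}\Big(\frac{\log\log y}{\log y}\Big)\Big\} \tag{9.4}$$

provided $\big(\frac{Dx}{a}\big)^{1/2+\varepsilon} \le y \le x$. Here $\theta'$ is defined as in Theorem 1.4 with $\phi = 0$. First, assume

$$\Big(\frac{D^2x}{a}\Big)^{1/2+\varepsilon} \le y \le x. \tag{9.5}$$

Arguing as in Section 9.1, it follows that $y \ge (ax)^{1/2}$,

$$z = \Big(\frac{a}{Dx}\Big)^{1/4}y^{1/2}(\log y)^{-7} + 1 \gg_{\varepsilon} D^{1/4 + \varepsilon/2}, \quad\text{and}\quad \log z \asymp \log y \asymp \log x.$$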
Thus, incorporating (9.2) and (4.1) from Lemma 4.3 into (9.1), it similarly follows that

$$J \ge \frac{1 - \theta'}{2}\log y + O_{\varepsilon}(\log\log y) \tag{9.6}$$

whenever $L(1,\chi) \ge (\log y)^{-2}$. Therefore, by Proposition 8.1, this establishes (9.4) provided (9.5) holds and $-D \in \mathcal{D}_{\varepsilon}(Q)$. It remains to consider the case when

$$\Big(\frac{Dx}{a}\Big)^{1/2+\varepsilon} \le y \le \Big(\frac{D^2x}{a}\Big)^{1/2+\varepsilon} \le x. \tag{9.7}$$

Note we continue to assume $-D \in \mathcal{D}_{\varepsilon}(Q)$. As before, it follows that $y \ge (ax)^{1/2}$,

$$z = \Big(\frac{a}{Dx}\Big)^{1/4}y^{1/2}(\log y)^{-7} + 1 \gg_{\varepsilon} D^{\varepsilon/2}, \quad\text{and}\quad \log z \asymp \log y \asymp \log x.$$

Thus, incorporating (9.2) and (9.3) into (9.1), we again obtain (9.6) whenever $L(1,\chi) \ge (\log y)^{-2}$. By Proposition 8.1, this establishes (9.4) provided (9.7) holds and $-D \in \mathcal{D}_{\varepsilon}(Q)$. This completes the proof in all cases. □

## 10. Representation of small integers with few prime factors

*Proof of Theorem 1.6.* We apply the beta sieve to the sequence $\mathcal{A} = \mathcal{A}(x,f)$ given by (6.2). Let $g$ be the local density function defined in Proposition 7.1, so $g(p) \le 2/p$. Thus, the sequence $\mathcal{A}$ is of dimension at most $\kappa = 2$ and has sifting limit $\beta = \beta(\kappa) < 4.85$ according to [7, Section 11.19]. Let $x \ge D/a$ and select

$$z := V^{10/49},$$

where $V = \sqrt{4ax/D} \ge 2$. Select the level of distribution to be $R = z^{485/100} > z^{\beta}$. Thus, by [7, Theorem 11.13] and Proposition 7.1, it follows that

$$\sum_{\substack{n \le x \\ (n,P(z)) = 1}} r_f(n) \gg \frac{x}{\sqrt{D}(\log x)^2} + O\Big(\sum_{\substack{\ell \mid P(z) \\ \ell < R}} |r_{\ell}|\Big), \tag{10.1}$$

where

$$|r_{\ell}| \ll \ell^{\eta}V + \ell^{1/2+\eta}\frac{\sqrt{D}}{a}V^{1/2} + 1$$

for fixed $\eta > 0$ sufficiently small. Since $f$ is reduced and $R = z^{485/100} = V^{97/98}$, we see that

$$\sum_{\substack{\ell \mid P(z) \\ \ell < R}} |r_{\ell}| \ll R^{1+\eta}V + \frac{\sqrt{D}}{a}R^{3/2+\eta}V^{1/2} + R^{1-\eta} \ll \frac{\sqrt{D}}{a}V^{2 - \frac{3}{196} + \eta} \ll \frac{x^{1 - \frac{3}{392} + \eta}}{\sqrt{D}}.$$

Thus, as $\eta > 0$ is sufficiently small,

$$\sum_{\substack{n \le x \\ (n,P(z)) = 1}} r_f(n) \gg \frac{x}{\sqrt{D}(\log x)^2}$$

for $x \ge D/a$. For an integer $k \ge 10$, observe that $z \ge x^{1/k}$ if and only if

$$\Big(\frac{ax}{D}\Big)^{5k/49} \ge x \iff x \ge \Big(\frac{D}{a}\Big)^{1 + \frac{49}{5k - 49}}.$$

This completes the proof. □

## Acknowledgements

I would like to thank John Friedlander and Jesse Thorner for their encouragement and many helpful comments on an earlier version of this paper. I am also grateful to Kannan Soundararajan for several insightful conversations and for initially motivating me to pursue this approach.

## Funding

Part of this work was completed with the support of an NSERC Postdoctoral Fellowship.
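As a numerical footnote to the proof of Theorem 1.6 in Section 10, the exponent bookkeeping there can be double-checked with exact rational arithmetic. The sketch below is not part of the paper; the helper name `threshold_exponent` is an illustrative assumption, and the constant factor $4$ in $V^2 = 4ax/D$ is ignored exactly as in the final display of the proof.

```python
from fractions import Fraction

# z = V^(10/49) and R = z^(485/100), so R = V^(R_EXP).
Z_EXP = Fraction(10, 49)
R_EXP = Z_EXP * Fraction(485, 100)

# Exponent of V in R^(3/2) * V^(1/2), the dominant remainder term (with eta = 0).
ERR_EXP = Fraction(3, 2) * R_EXP + Fraction(1, 2)


def threshold_exponent(k: int) -> Fraction:
    """Exponent e such that z >= x^(1/k) is equivalent to x >= (D/a)^e.

    With z^k = (ax/D)^(5k/49) (dropping the constant 4), the condition
    (ax/D)^(5k/49) >= x rearranges to x^(5k/49 - 1) >= (D/a)^(5k/49).
    Requires 5k > 49, that is, k >= 10.
    """
    t = Fraction(5 * k, 49)
    if t <= 1:
        raise ValueError("need k >= 10")
    return t / (t - 1)


if __name__ == "__main__":
    print("R = V^", R_EXP)
    print("R^(3/2) V^(1/2) = V^", ERR_EXP)
    for k in (10, 20, 50):
        print(k, threshold_exponent(k))
```

The exponent $1 + \frac{49}{5k-49}$ decreases towards $1$ as $k$ grows, which is the sense in which the range $x \ge (D/a)^{1+\varepsilon}$ is reached with $O_{\varepsilon}(1)$ prime factors.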
## References

1. V. Blomer and A. Granville, Estimates for representation numbers of quadratic forms, Duke Math. J. 135 (2006), 261–302.
2. D. A. Burgess, On character sums and L-series. II, Proc. London Math. Soc. (3) 13 (1963), 524–536.
3. D. A. Cox, Primes of the form x² + ny². Fermat, class field theory, and complex multiplication, Pure and Applied Mathematics (Hoboken), 2nd edn, John Wiley & Sons, Inc., Hoboken, NJ, 2013.
4. K. Debaene, Explicit counting of ideals and a Brun-Titchmarsh inequality for the Chebotarev density theorem, arXiv preprint arXiv:1611.10103, 2016.
5. J. Ditchen, On the average distribution of primes represented by binary quadratic forms, arXiv preprint arXiv:1312.1502, 2013.
6. J. J. Ditchen, Primes of the shape x² + ny². The distribution on average and prime number races, PhD thesis, ETH Zürich, 2013.
7. J. Friedlander and H. Iwaniec, Opera de cribro, volume 57 of American Mathematical Society Colloquium Publications, American Mathematical Society, Providence, RI, 2010.
8. E. Fogels, On the zeros of Hecke's L-functions. I, II, Acta Arith. 7 (1961/1962), 131–147.
9. D. R. Heath-Brown, Zero-free regions for Dirichlet L-functions, and the least prime in an arithmetic progression, Proc. London Math. Soc. (3) 64 (1992), 265–338.
10. D. R. Heath-Brown, A mean value estimate for real character sums, Acta Arith. 72 (1995), 235–275.
11. H. Iwaniec and E. Kowalski, Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications, American Mathematical Society, Providence, RI, 2004.
12. Y. Ihara, V. Kumar Murty and M. Shimura, On the logarithmic derivatives of Dirichlet L-functions at s = 1, Acta Arith. 137 (2009), 253–276.
13. M. Jutila, On mean values of Dirichlet polynomials with real characters, Acta Arith. 27 (1975), 191–198. Collection of articles in memory of Juriĭ Vladimirovič Linnik.
14. E. Kowalski and P. Michel, Zeros of families of automorphic L-functions close to 1, Pacific J. Math. 207 (2002), 411–431.
15. J. C. Lagarias, H. L. Montgomery and A. M. Odlyzko, A bound for the least prime ideal in the Chebotarev density theorem, Invent. Math. 54 (1979), 271–296.
16. J. C. Lagarias and A. M. Odlyzko, Effective versions of the Chebotarev density theorem, (1977), 409–464.
17. F. Luca and L. Tóth, The rth moment of the divisor function: an elementary approach, J. Integer Seq. 20 (2017), 17.7.4.
18. M. Mourtada and V. Kumar Murty, Omega theorems for L′/L(1,χ_D), Int. J. Number Theory 9 (2013), 561–581.
19. H. L. Montgomery and R. C. Vaughan, The large sieve, Mathematika 20 (1973), 119–134.
20. H. L. Montgomery and R. C. Vaughan, Multiplicative number theory. I. Classical theory, volume 97 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge, 2007.
21. N. Tschebotareff, Die Bestimmung der Dichtigkeit einer Menge von Primzahlen, welche zu einer gegebenen Substitutionsklasse gehören, Math. Ann. 95 (1926), 191–228.
22. J. Thorner and A. Zaman, A Chebotarev variant of the Brun-Titchmarsh theorem and bounds for the Lang-Trotter conjectures, Int. Math. Res. Not. (2017). https://doi.org/10.1093/imrn/rnx031
23. J. Thorner and A. Zaman, An explicit bound for the least prime ideal in the Chebotarev density theorem, Algebra Number Theory 11 (2017), 1135–1197.
24. A. Weiss, The least prime ideal, J. Reine Angew. Math. 338 (1983), 56–94.
25. D. B. Zagier, Zetafunktionen und quadratische Körper. Eine Einführung in die höhere Zahlentheorie [An introduction to higher number theory], Hochschultext [University Text], Springer-Verlag, Berlin-New York, 1981.

© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model).

The Quarterly Journal of Mathematics, Oxford University Press

The Quarterly Journal of Mathematics, Volume 69 (4) – Dec 1, 2018
34 pages

Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
ISSN
0033-5606
eISSN
1464-3847
D.O.I.
10.1093/qmath/hay028
### Abstract

Abstract Let f be a primitive positive definite integral binary quadratic form of discriminant −D and let πf(x) be the number of primes up to x which are represented by f. We prove several types of upper bounds for πf(x) within a constant factor of its asymptotic size: unconditional, conditional on the Generalized Riemann Hypothesis (GRH) and for almost all discriminants. The key feature of these estimates is that they hold whenever x exceeds a small power of D and, in some cases, this range of x is essentially best possible. In particular, if f is reduced then this optimal range of x is achieved for almost all discriminants or by assuming GRH. We also exhibit an upper bound for the number of primes represented by f in a short interval and a lower bound for the number of small integers represented by f which have few prime factors. 1. Introduction 1.1. Historical review The distribution of primes represented by positive definite integral binary quadratic forms is a classical topic within number theory and has been intensely studied over centuries by many renowned mathematicians, including Fermat, Euler, Lagrange, Legendre, Gauss, Dirichlet and Weber. A beautiful exposition on the subject and its history can be found in [3]. We shall refer to positive definite integral binary quadratic forms as simply ‘forms’. Let f(u,v)=au2+buv+cv2 be a form of discriminant −D=b2−4ac ⁠. An integer n is said to be represented by f if there exists (u,v)∈Z2 such that n=f(u,v) ⁠. A form is primitive if a,b, and c are relatively prime. The group SL2(Z) naturally acts on the set of primitive forms with discriminant −D, and two forms are said to be properly equivalent if they belong to the same SL2(Z) orbit. A primitive form is reduced if ∣b∣≤a≤c and b≥0 if either ∣b∣=a or a=c ⁠. Every primitive form is properly equivalent to a unique reduced form. 
Most amazingly, the set of primitive forms with discriminant −D modulo proper equivalence can be given a composition law that makes it a finite abelian group Cl(−D) ⁠. This is the class group of −D and its size h(−D) is the class number. We refer to each equivalence class of the class group as a form class. Now, assume f is primitive. The central object of study is the number of primes represented by f up to x ⁠, denoted πf(x)=∣{p≤x:p=f(u,v)forsome(u,v)∈Z2}∣ for x≥2 ⁠. From deep connections established by class field theory, the Chebotarev density theorem [16, 21] implies that primes are equidistributed amongst all form classes. Namely, πf(x)∼δfxh(−D)logx (1.1) as x→∞ ⁠, where δf={12iff(u,v)isproperlyequivalenttoitsoppositef(u,−v),1otherwise. (1.2) Unfortunately, the asymptotic (1.1) derived from [16] requires x to be exponentially larger than D and, if a Siegel zero exists, then the situation worsens. (A Siegel zero is a putative real zero of a real Hecke L-function that is extremely close to the edge of the critical strip. It would be a major breakthrough to eliminate the possibility of its existence.) In either case, this is unsuitable for many applications. Assuming the Generalized Riemann Hypothesis (GRH), one can do much better. It follows from the same work of Lagarias and Odlyzko [16] that, assuming GRH, πf(x)=δfLi(x)h(−D)+O(x1/2log(Dx)) (1.3) for x≥2 ⁠. Here Li(x)=∫2x1logtdt∼xlogx ⁠. The main term in (1.3) dominates when x≥D1+ε and any ε>0 ⁠. This GRH range for x is essentially optimal for certain but not necessarily all forms. Among other results, we will partially address this defect in this paper. For further discussion of range optimality and improvements (both conditional and unconditional), see Sections 1.2 and 1.3. Now, to remedy our lack of knowledge about the asymptotics of πf(x) in the unconditional case, one can settle for upper and lower bounds of the shape xh(−D)logx≪πf(x)≪xh(−D)logx (1.4) in the hopes of improving the valid range of x ⁠. 
For uniform lower bounds, we refer the reader to [8, 14, 24] and, most recently, [23] wherein it was shown that there exists a prime p represented by f of size at most O(D700) ⁠. The focus of this paper, however, is on upper bounds for πf(x) ⁠. Assuming GRH, there has been no improvement beyond (1.3) itself. Unconditionally, a general result of Lagarias–Montgomery–Odlyzko [15, Theorem 1.5] on the Chebotarev density theorem yields some progress, but the range of x is still worse than exponential in D ⁠. Recently, a theorem of Thorner and Zaman [22] improves upon the aforementioned Chebotarev result and consequently implies that πf(x)<2δfLi(x)h(−D)forx≥D700 (1.5) and D sufficiently large. Short of excluding a Siegel zero, the constant 2 is best possible and, moreover, the range of x is a polynomial in D like (1.3). On the other hand, the quality of exponent 700 in (1.5) leaves much to be desired when compared to the GRH exponent of 1+ε ⁠. Both [15, 22] carefully study the zeros of Hecke L-functions to prove their respective results, whereas a more recent paper of Debaene [4] uses a lattice point counting argument and Selberg’s sieve to subsequently establish another such Chebotarev-type theorem. His result implies a weaker variant of inequality (1.5), but for a greatly improved range of x≥D9/2+ε ⁠. Broadly speaking, we will specialize Debaene’s strategy to positive definite binary quadratic forms. An alternate formulation that expands the valid range of upper (and lower) bounds for πf(x) can be obtained by averaging over discriminants or forms, in analogy with the famous theorems of Bombieri–Vinogradov and Barban–Davenport–Halberstam on primes in arithmetic progressions. As part of his Ph.D. thesis, Ditchen [5, 6] achieved such elegant statements which we present in rough terms for simplicityʼs sake. Namely, for 100% of fundamental discriminants −D≢0(mod8) and all of their forms f ⁠, he proved πf(x)=δfxh(−D)logx+Oε(xh(−D)(logx)2) (1.6) provided x≥D20/3+ε ⁠. 
He also showed for 100% of fundamental discriminants −D≢0(mod8) and 100% of their forms, Equation (1.6) holds for x≥D3+ε ⁠. Finally, before we discuss an optimal range for upper bounds of πf(x) and the details of our results, we give a somewhat imprecise flavor of what we have shown. One should compare with (1.3), (1.5), (1.6), and their associated papers [4, 5, 16, 22]. Theorem 1.1 Let f(u,v)=au2+buv+cv2be a reduced positive definite integral binary quadratic form of discriminant −D ⁠. If GRH holds and ε>0then πf(x)≪εxh(−D)logxforx≥(D/a)1+ε.This estimate holds unconditionally for 100% of discriminants −D ⁠. Unconditionally and uniformly over all discriminants −D ⁠, the same upper bound for πf(x)holds for x≥(D2/a)1+ε ⁠. 1.2. Optimal range What is the minimal size of x relative to D for which one can reasonably expect (1.4) to hold for all forms f? We have seen GRH implies x≥D1+ε is valid and, in one sense, this range is best possible. For example, the form f(u,v)=u2+D4v2 with D≡0(mod4) does not represent any prime <D/4 ⁠. The reason is simple: the coefficients are simply too large. On the other hand, what about forms f(u,v)=au2+buv+cv2 for which all of the coefficients are small compared to D ⁠, say a,b,c≪D? The GRH range x≥D1+ε is insensitive to the size of the form’s coefficients. Since primitive forms are properly equivalent to a reduced form and all forms in the same form class represent the same primes, it is enough to consider a reduced form f so the coefficients necessarily satisfy ∣b∣≤a≤c ⁠. Consider the sum ∑n≤x∣{(u,v)∈Z2:n=f(u,v)}∣. If x<c then, as f is reduced, the only terms n contributing to the above sum are of the form n=f(u,0)=au2 ⁠. The sum is, therefore, equal to 2x/a+O(1) and at most one prime value n contributes to its size. On the other hand, if x≥c then the above sum is of size O(x/D) ⁠. Thus, one might reasonably suspect that primes will appear with natural frequency for x≥c1+ε and any ε>0 ⁠. 
As c≍D/a ⁠, it is conceivable for reduced forms f to satisfy (1.3) in (what we will refer to as) the optimal range x≥(D/a)1+ε for any ε>0 ⁠. (Of course, one could potentially refine the factor of ε but that is not our goal.) Can we expect to obtain any kind of unconditional or GRH-conditional bounds for πf(x) in the optimal range? For the sake of comparison, we turn to primes in arithmetic progressions since estimates like (1.5) are relatives of the classical Brun–Titchmarsh inequality. A version due to Montgomery and Vaughan [19] states for (a,q)=1 and x>q that π(x;q,a)<2xφ(q)log(x/q). (1.7) Here π(x;q,a) represents the number of primes up to x congruent to a(modq) ⁠. The range x>q is clearly best possible, which inspires the possibility of success for a similar approach to primes represented by binary quadratic forms. Our goal is to give upper bounds for πf(x) as close to the optimal range as possible. 1.3. Results Our first result is a uniform upper bound for πf(x) ⁠, both unconditional and conditional on GRH. For a discriminant −D ⁠, let χ−D(·)=(−D·) denote the corresponding Kronecker symbol which is a quadratic Dirichlet character. Theorem 1.2 Let f(u,v)=au2+buv+cv2be a reduced positive definite integral binary quadratic form with discriminant −Dand let ε>0be arbitrary. Set ϕ=ϕ(χ−D)≔{0assumingL(s,χ−D)satisfiesGRH,14unconditionally. (1.8)If x≥(D1+4ϕ/a)1+εthen πf(x)<41−θ·δfxh(−D)logx{1+Oε(loglogxlogx)}, (1.9)where θ=θϕ=(1+2ϕ+ε2)logDlogx−logalogx. Remarks If ϕ=0 then 0<θ<1 ⁠. Similarly, if ϕ=14 then 0<θ<34 ⁠. The constant ϕ is associated with bounds for L(s,χ−D) in the critical strip. In particular, ϕ=12 corresponds to the usual convexity estimate. Unconditionally, one should compare this estimate with (1.5) and the related works [4, 22]. Theorem 1.2 achieves the upper bound in (1.4) with the range x≥(D2/a)1+ε which improves over the prior range x≥D9/2+ε implied by [4]. 
In fact, when a≫D1/2 ⁠, Theorem 1.2’s range becomes x≫D3/2+ε, which is fairly close to the classical GRH range x≥D1+ε ⁠. Of course, inequality (1.9) has a weaker implied constant 41−θ instead of 2 as in (1.5). Assuming GRH, we obtain the desired range x≥(D/a)1+ε discussed in Section 1.2. Note we only assume GRH for the quadratic Dirichlet L-function L(s,χ−D), whereas (1.3) assumes GRH for the collection of Hecke L-functions associated to the corresponding ring class field. Since every primitive form is properly equivalent to a reduced form, we may ignore the dependence on the coefficient a in Theorem 1.2 to obtain the following simplified result. Corollary 1.3 Let fbe a primitive positive definite integral binary quadratic form with discriminant −Dand let ε>0be arbitrary. If x≥D2+εthen πf(x)<8·δfxh(−D)logx{1+Oε(loglogxlogx)}. We will actually prove a more general version of Theorem 1.2 that allows us to estimate the number of primes represented by f in a short interval. Theorem 1.4 Let f(u,v)=au2+buv+cv2be a reduced positive definite integral binary quadratic form with discriminant −Dand let ε>0be arbitrary. Let ϕbe defined by (1.8). If (D1+4ϕa)1/2+εx1/2+ε≤y≤xthen πf(x)−πf(x−y)<21−θ′·δfyh(−D)logy{1+Oε(loglogylogy)}, (1.10)where θ′=θϕ′=logx2logy+(12+ϕ+ε4)logDlogy−loga2logy. Remarks Assuming GRH, (1.3) implies that πf(x)−πf(x−y)≪δfyh(−D)logy for (Dx)1/2+ε≤y≤x ⁠. Theorem 1.4 yields an unconditional upper bound of comparable strength and, depending on the size of the coefficients of f ⁠, implies a GRH upper bound for slightly shorter intervals and smaller values of x ⁠. If f is only assumed to be primitive, then the same statement holds by setting a=1 in the condition on y and the value of θ′ ⁠. Theorem 1.2 follows by setting y=x ⁠. For any given discriminant −D ⁠, the unconditional versions (ϕ=14) of Theorems 1.2 and 1.4 fall slightly shy of their GRH counterparts (⁠ ϕ=0 ⁠). Averaging over all discriminants, we show that the GRH quality estimates hold almost always. 
For $Q \ge 3$, define
$$\mathcal{D}(Q) := \{\text{discriminants } -D \text{ with } 3 \le D \le Q\}. \tag{1.11}$$
Here and throughout, a discriminant $-D$ is that of a positive definite integral binary quadratic form. Thus, $-D$ is a negative integer $\equiv 0$ or $1 \pmod 4$.

Theorem 1.5. Let $Q \ge 3$ and $0 < \varepsilon < \tfrac18$. For all except at most $O_\varepsilon(Q^{1-\frac{\varepsilon}{10}})$ discriminants $-D \in \mathcal{D}(Q)$, the statements in Theorems 1.2 and 1.4 hold unconditionally with $\phi = 0$.

Remark. When considering upper bounds for $\pi_f(x)$, this improves over (1.6) in several aspects. First, the desired range discussed in Section 1.2 is achieved on average. Second, we did not utilize any averaging over forms, only their discriminants. Finally, there are no restrictions on the family of discriminants; they need not be fundamental or satisfy any special congruence condition.

The underlying strategy to establish Theorems 1.2, 1.4 and 1.5 rests on a natural two-step plan. First, estimate the congruence sum
$$\sum_{\substack{n \le x \\ \ell \mid n}} r_f(n), \tag{1.12}$$
where $\ell$ is squarefree and $r_f(n) = |\{(u,v) \in \mathbb{Z}^2 : n = f(u,v)\}|$ is the number of representations of the integer $n$ by the form $f$. Secondly, apply Selberg's upper bound sieve. Our application of the sieve is fairly routine, but calculating the congruence sums with sufficient precision poses some difficulties. Inspired by a beautiful paper of Blomer and Granville [1] wherein they carefully study the moments of $r_f(n)$, we determine the congruence sums via geometry of numbers methods. However, for their purposes, only a simple well-known estimate [1, Lemma 3.1] for the first moment $\sum_{n \le x} r_f(n)$ was necessary. We execute a more refined analysis of the first moment and related quantities $|\mathcal{B}_\ell(m)|$ (see (6.4) for a definition) using standard arguments with the sawtooth function. Afterwards, the main technical hurdle is to carefully decompose the congruence sum (1.12) into a relatively small number of disjoint quantities $|\mathcal{B}_\ell(m)|$. This allows us to apply our existing estimates for $|\mathcal{B}_\ell(m)|$ (see Lemma 6.2) while simultaneously controlling the compounding error terms.
We achieve this in Proposition 7.1; see Sections 6 and 7 for details on this argument. When finalizing the proofs of Theorems 1.2, 1.4 and 1.5, the various valid ranges for $x$ are determined by relying on character sum estimates like Burgess's bound, Heath-Brown's mean value theorem for quadratic characters [10], and Jutila's zero density estimate [13].

Since we have calculated the congruence sums (1.12) in Proposition 7.1, we thought it may be of independent interest to examine its performance in conjunction with a lower bound sieve. By a direct application of the beta sieve, we show:

Theorem 1.6. Let $f(u,v) = au^2+buv+cv^2$ be a reduced positive definite integral binary quadratic form with discriminant $-D$. For every integer $k \ge 10$, the number of integers represented by $f$ with at most $k$ prime factors is
$$\gg \frac{x}{\sqrt{D}(\log x)^2} \quad \text{for } x \ge \Big(\frac{D}{a}\Big)^{1+\frac{49}{5k-49}}.$$

Remarks. One can formulate this statement in a slightly weaker alternative fashion: for every $\varepsilon > 0$ and $x \ge (D/a)^{1+\varepsilon}$, the number of integers represented by $f$ with at most $O_\varepsilon(1)$ prime factors is $\gg \frac{x}{\sqrt{D}(\log x)^2}$.

It is unsurprising that $f$ represents many integers with few prime factors as this follows from standard techniques in sieve theory. The form $f$ even represents primes of size $O(D^{700})$ by [23]. However, the key feature of Theorem 1.6 is that the size of these integers with few prime factors is very small (in a best possible sense per Section 1.2).

2. Notation and conventions

For each of the asymptotic inequalities $F \ll G$, $F = O(G)$, or $G \gg F$, we mean there exists a constant $C > 0$ such that $|F| \le CG$. We henceforth adhere to the convention that all implied constants in all asymptotic inequalities are absolute with respect to all parameters and are effectively computable. If an implied constant depends on a parameter, such as $\varepsilon$, then we use $\ll_\varepsilon$ and $O_\varepsilon$ to denote that the implied constant depends at most on $\varepsilon$.

Throughout the paper,

- A form refers to a positive definite binary integral quadratic form.
- $-D$ is the discriminant of a positive definite integral binary quadratic form, so $-D$ is any negative integer $\equiv 0$ or $1 \pmod 4$. It is not necessarily fundamental.
- $f(u,v) = au^2+buv+cv^2 \in \mathbb{Z}[u,v]$ is a positive definite integral binary quadratic form of discriminant $-D$. It is not necessarily primitive or reduced.
- $\chi_{-D}(\cdot) = \left(\frac{-D}{\cdot}\right)$ is the Kronecker symbol attached to $-D$.
- $\mathcal{D}(Q)$ is the set of discriminants $-D$ with $3 \le D \le Q$.
- $\Delta$ is a (negative) fundamental discriminant of a form.
- $\tau_k$ is the $k$-divisor function and $\tau = \tau_2$ is the divisor function.
- $\varphi$ is the Euler totient function.
- $[s,t]$ is the least common multiple of integers $s$ and $t$.
- $(s,t)$ is the greatest common divisor of integers $s$ and $t$. However, we may abuse notation and sometimes refer to a lattice point $(u,v) \in \mathbb{Z}^2$, but this will be made clear from context (for example, with the set membership symbol).

3. Elementary estimates

First, we establish a standard result employing the Euler–Maclaurin summation formula [11, Lemma 4.1], which will later allow us to count lattice points inside an ellipse.

Lemma 3.1. For $W \ge 1$,
$$\sum_{1 \le w \le W} \sqrt{W^2-w^2} = \frac{\pi W^2}{4} - \frac{W}{2} + O(\sqrt{W}).$$

Proof. Set $G(w) = \sqrt{W^2-w^2}$. By partial summation, observe that
$$\sum_{1 \le w \le W} G(w) = -\int_0^W tG'(t)\,dt + \int_0^W \psi(t)G'(t)\,dt - \frac12 G(0), \tag{3.1}$$
where $\psi(t) = t - \lfloor t \rfloor - 1/2$ is the sawtooth function. For the first integral, notice
$$-\int_0^W tG'(t)\,dt = \int_0^W \frac{t^2}{\sqrt{W^2-t^2}}\,dt = \frac{\pi W^2}{4}. \tag{3.2}$$
For the second integral, we use the Fourier expansion (see, for example, [11, Equation (4.18)])
$$\psi(x) = -\sum_{1 \le n \le N} (\pi n)^{-1}\sin(2\pi n x) + O\big((1+\|x\|N)^{-1}\big),$$
where $\|x\|$ is the distance of $x$ to the nearest integer. It follows that
$$\int_0^W \psi(t)G'(t)\,dt = 2\sum_{1 \le n \le N}\int_0^W G(t)\cos(2\pi n t)\,dt + O\Big(\int_0^W \frac{|G'(t)|}{1+\|t\|N}\,dt\Big) \tag{3.3}$$
after integrating by parts. Using a computer algebra package or table of integrals,
$$\int_0^W G(t)\cos(2\pi n t)\,dt = \int_0^W \sqrt{W^2-t^2}\cos(2\pi n t)\,dt = \frac{W \cdot J_1(2\pi n W)}{4n}, \tag{3.4}$$
where $J_\nu(z)$ is the Bessel function of the first kind. Recall that, for $z > 1+|\nu|^2$,
$$J_\nu(z) = \big(\tfrac12 z\big)^\nu \sum_{m=0}^\infty \frac{(-1)^m \big(\tfrac14 z^2\big)^m}{m!\,\Gamma(m+\nu+1)} = \Big(\frac{2}{\pi z}\Big)^{1/2}\Big(\cos\big(z - \tfrac12\nu\pi - \tfrac14\pi\big) + O\Big(\frac{1+|\nu|^2}{z}\Big)\Big).$$
When summing (3.4) over $1 \le n \le N$, it follows that
$$2\sum_{1 \le n \le N}\int_0^W G(t)\cos(2\pi n t)\,dt = \frac{W}{4}\cdot 2\sum_{1 \le n \le N}\frac{J_1(2\pi n W)}{n} = \frac{\sqrt{W}}{2\pi}\sum_{1 \le n \le N}\Big(\frac{\cos(2\pi n W - 3\pi/4)}{n^{3/2}} + O\Big(\frac{1}{n^{5/2}W}\Big)\Big) \ll \sqrt{W} \tag{3.5}$$
uniformly over $N$. Taking $N \to \infty$, we conclude from (3.3) and (3.5) that
$$\int_0^W \psi(t)G'(t)\,dt \ll \sqrt{W}. \tag{3.6}$$
Combining (3.1), (3.2), and the above yields the result. □

Next, in Lemma 3.2, we calculate some weighted average values of the Dirichlet convolution $(1*\chi)(n) = \sum_{d \mid n}\chi(d)$ for any quadratic Dirichlet character $\chi$. Better estimates and certainly simpler proofs are available via Mellin inversion, but it is convenient for us to explicitly express the error terms using the character sum quantity
$$S_\chi(t) := \sum_{n \le t}\chi(n).$$
This feature will give us flexibility and streamline the proofs of Theorems 1.4 and 1.5, each of which uses different bounds for character sums. One utilizes uniform bounds (conditional and unconditional) whereas the other applies average bounds.

Lemma 3.2. Let $\chi \pmod D$ be a quadratic Dirichlet character. For $x \ge 3$,
$$\sum_{n \le x}(1*\chi)(n)\Big(1-\frac{n}{x}\Big) = \frac{x}{2}L(1,\chi) + O(E_0(x;\chi)), \tag{3.7}$$
where
$$E_0(x;\chi) := \inf_{1 \le y \le x}\Big(\frac{y^2}{x} + |S_\chi(y)| + x\int_y^\infty \frac{|S_\chi(t)|}{t^2}\,dt\Big)$$
and $S_\chi(t) = \sum_{n \le t}\chi(n)$. Moreover,
$$\sum_{n \le x}\frac{(1*\chi)(n)}{n}\Big(1-\frac{n}{x}\Big) = L(1,\chi)(\log x + \gamma - 1) + L'(1,\chi) + O(E_1(x;\chi)), \tag{3.8}$$
where
$$E_1(x;\chi) := \inf_{1 \le y \le x}\Big(\frac{y}{x} + \log x\int_y^\infty \frac{|S_\chi(t)|\log t}{t^2}\,dt\Big).$$

Remark. Bounds for $E_0(x;\chi)$ and $E_1(x;\chi)$ can be found in Sections 4 and 5.

Proof. The proofs of these facts are standard, but we include the details for the sake of completeness. Equation (3.7) is a more precise version of [20, Section 4.3.1, Exercise 3], which originated from work of Mertens. Before we proceed, recall by partial summation that
$$\sum_{d \le y}\frac{\chi(d)}{d} = L(1,\chi) - \int_y^\infty \frac{S_\chi(t)}{t^2}\,dt, \qquad \sum_{d \le y}\frac{\chi(d)\log d}{d} = -L'(1,\chi) - \int_y^\infty \frac{S_\chi(t)(\log t - 1)}{t^2}\,dt. \tag{3.9}$$
We will use these estimates in what follows. Let $A(u) = \sum_{n \le u}\big(1-\frac{n}{u}\big)$. One can verify that
$$A(u) = \frac{u}{2} - \frac{1}{u}\int_0^u \{t\}\,dt = \frac{u}{2} - \frac12 + O(u^{-1}). \tag{3.10}$$
From the above integral formula for $A(u)$, it is straightforward to check that $A(u)$ is continuous and, if $u > 1$ is not an integer, then
$$A'(u) = \frac{[u]([u]+1)}{2u^2} \ll 1. \tag{3.11}$$
In particular, $A(u)$ is increasing and absolutely continuous.
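Both elementary facts used so far, the lattice-sum estimate of Lemma 3.1 and the behaviour of $A(u)$ in (3.10), are easy to sanity-check numerically. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the closed form $A(u) = m - m(m+1)/(2u)$ with $m = \lfloor u \rfloor$ is obtained by summing the definition directly rather than quoted from the paper.

```python
from math import floor, pi, sqrt

def ellipse_sum(W):
    """Left-hand side of Lemma 3.1: sum of sqrt(W^2 - w^2) over 1 <= w <= W."""
    return sum(sqrt(W * W - w * w) for w in range(1, floor(W) + 1))

def lemma_3_1_main(W):
    """Main terms of Lemma 3.1: pi W^2 / 4 - W / 2."""
    return pi * W * W / 4 - W / 2

def A(u):
    """A(u) = sum over n <= u of (1 - n/u), straight from the definition."""
    return sum(1 - n / u for n in range(1, floor(u) + 1))

def A_closed(u):
    """Closed form m - m(m+1)/(2u) with m = floor(u) (direct summation)."""
    m = floor(u)
    return m - m * (m + 1) / (2 * u)

for W in (10.5, 100.0, 1000.75):
    err = ellipse_sum(W) - lemma_3_1_main(W)
    # The ratio err / sqrt(W) should stay bounded, as Lemma 3.1 predicts.
    print(W, err, err / sqrt(W))

for u in (7.3, 50.9):
    # A(u) should be close to u/2 - 1/2, with a discrepancy of size O(1/u).
    print(u, A(u), A_closed(u), u / 2 - 0.5)
```

In these runs the normalised error `err / sqrt(W)` remains bounded, which is consistent with the $O(\sqrt{W})$ term in Lemma 3.1.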
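Now, to calculate the sum in (3.7), we use Dirichlet's hyperbola method with a parameter $1 \le y \le x$. Namely,
$$\Sigma := \sum_{n \le x}(1*\chi)(n)\Big(1-\frac{n}{x}\Big) = \sum_{d \le y}\chi(d)A(x/d) + \sum_{y < d \le x}\chi(d)A(x/d) = \Sigma_1 + \Sigma_2, \tag{3.12}$$
say. For $\Sigma_1$, we see by (3.10) that
$$\Sigma_1 = \frac{x}{2}\sum_{d \le y}\frac{\chi(d)}{d} - \frac12\sum_{d \le y}\chi(d) + O(y^2/x). \tag{3.13}$$
For $\Sigma_2$, since $A$ is an absolutely continuous, increasing, non-negative function and $A(1) = 0$, it follows by partial summation that
$$\Sigma_2 = \int_y^x A(x/t)\,dS_\chi(t) = A(1)S_\chi(x) + \int_y^x S_\chi(t)\,dA(x/t) = -\int_y^x S_\chi(t)A'(x/t)\,x t^{-2}\,dt.$$
Thus, by (3.11),
$$|\Sigma_2| \ll x\Big(\int_y^x \frac{|S_\chi(t)|}{t^2}\,dt\Big).$$
Combining the above estimate, (3.13), and (3.9) into (3.12) yields (3.7).

We prove (3.8) similarly. For $u \ge 1$, define
$$B(u) := \sum_{n \le u}\frac{1}{n}\Big(1-\frac{n}{u}\Big) = \log u + \gamma - 1 + O(u^{-1}). \tag{3.14}$$
One can verify that $B(u)$ is a non-negative, increasing, and absolutely continuous function of $u$. Also, if $u > 1$ is not an integer, then
$$B'(u) \ll \frac{1}{u}. \tag{3.15}$$
For some parameter $1 \le y \le x$,
$$\Sigma' := \sum_{n \le x}\frac{(1*\chi)(n)}{n}\Big(1-\frac{n}{x}\Big) = \sum_{d \le y}\frac{\chi(d)}{d}B(x/d) + \sum_{y < d \le x}\frac{\chi(d)}{d}B(x/d) = \Sigma_1' + \Sigma_2', \tag{3.16}$$
say. To calculate $\Sigma_1'$, we apply (3.14) and deduce that
$$\Sigma_1' = (\log x + \gamma - 1)\sum_{d \le y}\frac{\chi(d)}{d} - \sum_{d \le y}\frac{\chi(d)\log d}{d} + O(y/x). \tag{3.17}$$
For $\Sigma_2'$, set $\tilde{S}_\chi(u) := \sum_{y < d \le u}\frac{\chi(d)}{d}$. Since $B$ is an absolutely continuous, non-negative, increasing function and $B(1) = 0$, we similarly conclude that
$$\Sigma_2' = \int_y^x B(x/t)\,d\tilde{S}_\chi(t) = -x\int_y^x \tilde{S}_\chi(t)B'(x/t)\,t^{-2}\,dt.$$
From (3.15), it follows that
$$|\Sigma_2'| \ll \int_y^x \frac{|\tilde{S}_\chi(t)|}{t}\,dt.$$
Substituting the identity
$$\tilde{S}_\chi(t) = \int_y^t \frac{1}{u}\,dS_\chi(u) = \frac{S_\chi(t)}{t} + \int_y^t \frac{S_\chi(u)}{u^2}\,du$$
into the previous estimate, we have that
$$\Sigma_2' \ll \int_y^x \frac{|S_\chi(t)|}{t^2}\,dt + \int_y^x \frac{1}{t}\int_y^t \frac{|S_\chi(u)|}{u^2}\,du\,dt \ll \int_y^x \frac{|S_\chi(t)|}{t^2}\,dt + \Big(\int_y^x \frac{1}{t}\,dt\Big)\Big(\int_y^x \frac{|S_\chi(u)|}{u^2}\,du\Big) \ll \log x\int_y^x \frac{|S_\chi(t)|}{t^2}\,dt.$$
Incorporating (3.17), (3.9), and the above into (3.16) establishes (3.8). □

4. Uniform bounds for quadratic characters

Here we collect known uniform bounds for character sums and values of the logarithmic derivatives of Dirichlet $L$-functions for quadratic characters. We apply the former to obtain estimates for the error terms arising in Lemma 3.2.

4.1.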
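Character sums

We state the celebrated result of Burgess [2] specialized to quadratic characters, which was extended from cube-free moduli to all moduli by Heath-Brown [9, Lemma 2.4].

Lemma 4.1. Let $\chi \pmod D$ be a quadratic Dirichlet character. For any $\eta > 0$, $N \ge 1$, and integer $k \ge 3$,
$$\sum_{n \le N}\chi(n) \ll_{\eta,k} D^{\frac{k+1}{4k^2}+\eta}N^{1-\frac{1}{k}}.$$

We also record the well-known GRH-conditional estimate for character sums.

Lemma 4.2. Let $\chi \pmod D$ be a non-principal Dirichlet character and suppose $L(s,\chi)$ satisfies GRH. For any $\eta > 0$ and $N \ge 1$,
$$\sum_{n \le N}\chi(n) \ll_\eta D^\eta N^{1/2}.$$

4.2. Errors from Lemma 3.2

The results of Section 4.1 allow us to obtain power-saving estimates for the error terms in Lemma 3.2 for small values of $x$ relative to the conductor $D$.

Lemma 4.3. Let $\chi \pmod D$ be a quadratic Dirichlet character and let $0 < \varepsilon < \tfrac{1}{20}$ be arbitrary. Let $E_0(x;\chi)$ and $E_1(x;\chi)$ be as in Lemma 3.2. If $x \gg_\varepsilon D^{1/4+\varepsilon}$ then
$$E_0(x;\chi) \ll_\varepsilon x^{1-\varepsilon^2/2} \quad\text{and}\quad E_1(x;\chi) \ll_\varepsilon x^{-\varepsilon^2/2}. \tag{4.1}$$
If $L(s,\chi)$ satisfies GRH and $x \gg_\varepsilon D^\varepsilon$ then
$$E_0(x;\chi) \ll_\varepsilon x^{2/3+\varepsilon} \quad\text{and}\quad E_1(x;\chi) \ll_\varepsilon x^{-1/3+\varepsilon}. \tag{4.2}$$

Proof. First, we consider $E_0 = E_0(x;\chi)$. Let $1 \le y \le x$ be a parameter yet to be chosen. From Lemma 4.1 and the definition of $E_0$, we have that
$$E_0 \ll_{\eta,k} x^{-1}y^2 + D^{\frac{k+1}{4k^2}+\eta}x^{2-1/k}y^{-1}.$$
Selecting
$$y = D^{\frac{k+1}{12k^2}+\frac{\eta}{3}}x^{1-\frac{1}{3k}}, \qquad \eta = \frac{1}{8k^2}, \qquad k = \lceil 1/\varepsilon \rceil \ge 20$$
yields the estimate for $E_0$ in (4.1) since $x \gg_\varepsilon D^{1/4+\varepsilon}$. Now, assume GRH holds for $L(s,\chi)$ and $x \gg_\varepsilon D^\varepsilon$. Utilize Lemma 4.2 with $\eta = 3\varepsilon^2/2$ and select $y = D^{\varepsilon^2/2}x^{5/6}$. This implies that
$$E_0 \ll_\varepsilon \frac{y^2}{x} + D^{3\varepsilon^2/2}\frac{x^{3/2}}{y} \ll_\varepsilon D^{\varepsilon^2}x^{2/3} \ll_\varepsilon x^{2/3+\varepsilon}$$
as desired.

The arguments for $E_1 = E_1(x;\chi)$ are similar. Again, by Lemma 4.1,
$$E_1 \ll_{k,\eta} \frac{y}{x} + D^{\frac{k+1}{4k^2}+\eta}\frac{x^{1-1/k}(\log x)^2}{y}.$$
Selecting
$$y = D^{\frac{k+1}{8k^2}+\frac{\eta}{2}}x^{1-\frac{1}{2k}}, \qquad \eta = \frac{1}{8k^2}, \qquad k = \lceil 1/\varepsilon \rceil \ge 20$$
yields the estimate for $E_1$ in (4.1) since $x \gg_\varepsilon D^{1/4+\varepsilon}$. If $L(s,\chi)$ satisfies GRH, then we use Lemma 4.2 with $\eta = \varepsilon^2$ and select $y = D^{\varepsilon^2}x^{2/3}$. This implies that
$$E_1 \ll_\varepsilon \frac{y}{x} + D^{3\varepsilon^2/2}y^{-1/2}(\log x)^2 \ll_\varepsilon D^{\varepsilon^2/2}x^{-1/3}(\log x)^2 \ll_\varepsilon x^{-1/3+\varepsilon}$$
as desired. □

4.3. Logarithmic derivatives

We record two results related to the value of the logarithmic derivative of $L(s,\chi)$ at $s = 1$ for a quadratic Dirichlet character $\chi$.

Lemma 4.4 (Heath-Brown).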
If $\chi \pmod D$ is a quadratic character then, for $\eta > 0$,
$$-\frac{L'}{L}(1,\chi) \le \Big(\frac18+\eta\Big)\log D + O_\eta(1).$$

Proof. This follows from the proof of [9, Lemma 3.1] with some minor modifications to allow for any modulus $D$ (not just sufficiently large) and a slightly wider range of the quantity $\sigma$ therein, say $1 < \sigma < 1+\varepsilon$. See [23, Proposition 2.6] for details. □

Lemma 4.5. If $\chi \pmod D$ is a quadratic character and $L(s,\chi)$ satisfies GRH then, for $\eta > 0$,
$$-\frac{L'}{L}(1,\chi) \ll \log\log D \le \eta\log D + O_\eta(1).$$

Proof. This is well known and can be deduced from the arguments in Lemma 5.5. □

5. Average bounds for quadratic characters

The purpose of this section is analogous to Section 4, except we focus on estimates averaging over a certain class of quadratic characters attached to discriminants. To be more specific, for $Q \ge 3$, recall
$$\mathcal{D}(Q) = \{\text{discriminants } -D \text{ with } 3 \le D \le Q\}.$$
Here a discriminant is that of a primitive positive definite binary quadratic form. We emphasize that a discriminant $-D \in \mathcal{D}(Q)$ is not necessarily fundamental. The associated Kronecker symbol $\chi_{-D}(\cdot) = \left(\frac{-D}{\cdot}\right)$ is itself a quadratic Dirichlet character. Note the character is primitive if and only if $-D$ is a fundamental discriminant. Our goal is to average certain quantities involving $\chi_{-D}$ over $-D \in \mathcal{D}(Q)$.

5.1. Character sums

We record a special case of Heath-Brown's mean value theorem for primitive quadratic characters [10, Corollary 3].

Lemma 5.1 (Heath-Brown). Let $N, Q \ge 1$ and let $a_1, \dots, a_N$ be arbitrary complex numbers. Let $S(Q)$ denote the set of all primitive quadratic characters of conductor at most $Q$. Then
$$\sum_{\chi \in S(Q)}\Big|\sum_{n \le N}a_n\chi(n)\Big|^2 \ll_\eta \big((QN)^{1+\eta} + Q^\eta N^{2+\eta}\big)\max_{1 \le n \le N}|a_n|^2$$
for any $\eta > 0$.

Using Lemma 5.1, we deduce an analogous mean value result for the quadratic characters attached to such discriminants.

Lemma 5.2. Let $N, Q \ge 1$ and let $\mathcal{D}(Q)$ be defined by (1.11). Then
$$\sum_{-D \in \mathcal{D}(Q)}\Big|\sum_{n \le N}\chi_{-D}(n)\Big|^2 \ll_\eta (QN)^{1+\eta} + Q^{1/2+\eta}N^{2+\eta}$$
for any $\eta > 0$.

Proof. If $-D \in \mathcal{D}(Q)$ then $-D = \Delta k^2$, where $\Delta \in \mathbb{Z}$ is the discriminant of some imaginary quadratic field and $k \ge 1$ is some integer.
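The decomposition $-D = \Delta k^2$ can be made concrete. The Python sketch below is an illustration only: `kronecker` is a textbook implementation of the Kronecker symbol for positive lower argument, and `fundamental_part` is a hypothetical helper that finds $\Delta$ and $k$ by brute force, taking the largest $k$ with $k^2 \mid D$ and $-D/k^2 \equiv 0$ or $1 \pmod 4$. It then compares $\chi_{-D}(n)$ with $\chi_\Delta(n)$ restricted to $(n,k) = 1$.

```python
from math import gcd, isqrt

def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and an integer n >= 1."""
    if n % 2 == 0 and a % 2 == 0:
        return 0
    result = 1
    while n % 2 == 0:          # strip factors of 2 from n using (a/2)
        n //= 2
        if a % 8 in (3, 5):
            result = -result
    a %= n                     # n is now odd: Jacobi symbol via reciprocity
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def fundamental_part(D):
    """Return (Delta, k) with -D = Delta * k^2 and k as large as possible
    subject to Delta being congruent to 0 or 1 mod 4 (brute force)."""
    for k in range(isqrt(D), 0, -1):
        if D % (k * k) == 0 and (-(D // (k * k))) % 4 in (0, 1):
            return -(D // (k * k)), k
    raise ValueError("-D is not a discriminant")

for D in (12, 16, 23, 75):
    Delta, k = fundamental_part(D)
    ok = all(
        kronecker(-D, n) == (kronecker(Delta, n) if gcd(n, k) == 1 else 0)
        for n in range(1, 200)
    )
    print(D, Delta, k, ok)
```

For each sample discriminant the comparison holds for all $n < 200$, which is what one expects from the multiplicativity of the Kronecker symbol in its upper argument.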
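Consequently, the Kronecker symbol $\chi_{-D}(\cdot) = \left(\frac{-D}{\cdot}\right)$ is induced by the primitive quadratic character $\chi_\Delta(\cdot) = \left(\frac{\Delta}{\cdot}\right)$. Moreover,
$$\chi_{-D}(n) = \chi_{k^2}(n)\chi_\Delta(n).$$
For the details on these facts, see [3, Section 7] for example. Therefore, by Lemma 5.1,
$$\sum_{-D \in \mathcal{D}(Q)}\Big|\sum_{n \le N}\chi_{-D}(n)\Big|^2 \le \sum_{1 \le k \le \sqrt{Q}}\ \sum_{\substack{1 \le |\Delta| \le Q/k^2 \\ \chi_\Delta \text{ primitive}}}\Big|\sum_{n \le N}\chi_{k^2}(n)\chi_\Delta(n)\Big|^2 \ll_\eta \sum_{1 \le k \le \sqrt{Q}}\big((QN)^{1+\eta}k^{-2-2\eta} + Q^\eta k^{-2\eta}N^{2+\eta}\big) \ll_\eta (QN)^{1+\eta} + Q^{1/2+\eta}N^{2+\eta}$$
as desired. □

5.2. Error terms from Lemma 3.2

Again, the results in Section 5.1 lead to estimates for the error terms in Lemma 3.2.

Lemma 5.3. Let $X \ge 1$, $Q \ge 3$ and let $\mathcal{D}(Q)$ be defined by (1.11). For $x \ge 1$ and any quadratic character $\chi$, define $E_0(x;\chi)$ and $E_1(x;\chi)$ as in Lemma 3.2. Then
$$\begin{aligned}\sum_{-D \in \mathcal{D}(Q)}\sup_{X \le x \le 2X}E_0(x;\chi_{-D}) &\ll_\eta Q^{1+\eta}X^{3/5+\eta} + Q^{3/4+\eta}X^{1+\eta},\\ \sum_{-D \in \mathcal{D}(Q)}\sup_{X \le x \le 2X}E_1(x;\chi_{-D}) &\ll_\eta Q^{1+\eta}X^{-1/3+\eta} + Q^{3/4+\eta}X^{\eta}\end{aligned} \tag{5.1}$$
for $\eta > 0$.

Proof. For $X \le x \le 2X$, let $1 \le y \le X$ be an unspecified parameter, depending only on $X$. Set $y_0 := yQ^{1/2+\eta}$. By Pólya–Vinogradov, $|S_\chi(t)| \ll Q^{1/2}\log Q$ for any character $\chi \pmod q$ with $q \le Q$. Therefore, for such $\chi$,
$$E_0(x;\chi) \ll \frac{y^2}{x} + |S_\chi(y)| + x\int_y^\infty \frac{|S_\chi(t)|}{t^2}\,dt \ll_\eta \frac{y^2}{x} + |S_\chi(y)| + x\int_y^{y_0}\frac{|S_\chi(t)|}{t^2}\,dt + \frac{x}{y}.$$
As $y$ and $y_0$ depend only on $X$ and $Q$, it follows that
$$\sup_{X \le x \le 2X}E_0(x;\chi) \ll_\eta \frac{y^2}{X} + |S_\chi(y)| + X\int_y^{y_0}\frac{|S_\chi(t)|}{t^2}\,dt + \frac{X}{y}.$$
Summing the above expression over $\chi = \chi_{-D}$ with $-D \in \mathcal{D}(Q)$, applying Cauchy–Schwarz, and invoking Lemma 5.2, we see that
$$\begin{aligned}\sum_{-D \in \mathcal{D}(Q)}\sup_{X \le x \le 2X}E_0(x;\chi_{-D}) &\ll_\eta \frac{Qy^2}{X} + Q^{1+\eta}y^{1/2+\eta} + Q^{3/4+\eta}y^{1+\eta} + \frac{XQ}{y} + XQ^{1/2}\int_y^{y_0}\frac{(Qt)^{1/2+\eta} + Q^{1/4+\eta/2}t^{1+\eta/2}}{t^2}\,dt\\ &\ll_\eta \frac{Qy^2}{X} + Q^{1+\eta}y^{1/2+\eta} + Q^{3/4+\eta}y^{1+\eta} + \frac{XQ}{y} + XQ^{1+\eta}y^{-1/2+\eta} + XQ^{3/4+\eta/2}y_0^{\eta/2}\\ &\ll_\eta \frac{Qy^2}{X} + \frac{XQ}{y} + XQ^{1+\eta}y^{-1/2+\eta} + X^{1+\eta}Q^{3/4+\eta}.\end{aligned}$$
In the last step, we used the definition of $y_0$ and the fact that $y \le X$. Selecting $y = X^{4/5}$ implies the desired result after rescaling $\eta$ if necessary.

We follow the same procedure for the average of $E_1$. First, we deduce that
$$\sup_{X \le x \le 2X}E_1(x;\chi_{-D}) \ll_\eta \frac{y}{X} + \log X\int_y^{y_0}\frac{|S_\chi(t)|\log t}{t^2}\,dt + \frac{\log(Qy)}{y}.$$
Again, summing over $\chi = \chi_{-D}$ with $-D \in \mathcal{D}(Q)$, we similarly conclude that
$$\begin{aligned}\sum_{-D \in \mathcal{D}(Q)}\sup_{X \le x \le 2X}E_1(x;\chi_{-D}) &\ll_\eta \frac{Qy}{X} + \frac{Q\log(Qy)}{y} + Q^{1/2}\log X\int_y^{y_0}\frac{(Qt)^{1/2+\eta} + Q^{1/4+\eta/2}t^{1+\eta/2}}{t^2}\,dt\\ &\ll_\eta \frac{Qy}{X} + \frac{Q\log(Qy)}{y} + \big(Q^{1+\eta}y^{-1/2+\eta} + Q^{3/4+\eta/2}y_0^{\eta/2}\big)\log X\\ &\ll_\eta \frac{Qy}{X} + \frac{Q^{1+\eta}}{y^{1-\eta}} + X^\eta Q^{1+\eta}y^{-1/2+\eta} + X^\eta Q^{3/4+\eta}.\end{aligned}$$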
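Selecting $y = X^{2/3}$ yields the desired result. □

Lemma 5.4. Let $0 < \varepsilon < 1/8$, $X \ge 1$ and $Q \ge 3$. Let $c = c(\varepsilon) > 0$ and $C = C(\varepsilon) \ge 1$ be arbitrary constants. For all except at most $O_\varepsilon(Q^{1-\varepsilon/10})$ discriminants $-D \in \mathcal{D}(Q)$,
$$E_0(x;\chi_{-D}) \le x^{7/8+\varepsilon}, \qquad E_1(x;\chi_{-D}) \le x^{-1/8+\varepsilon}, \tag{5.2}$$
uniformly for $cD^\varepsilon \le x \le CD^{2+\varepsilon}$. Here $E_0$ and $E_1$ are defined as in Lemma 3.2.

Proof. Without loss, we need only consider discriminants $-D \in \mathcal{D}(Q)$ satisfying $D \ge Q^{1-\varepsilon}$ since the remainder are a collection of negligible size $O(Q^{1-\varepsilon})$. For $X \ge 1$, define
$$\mathcal{E}(X,Q,\varepsilon) := \{-D \in \mathcal{D}(Q) : \text{there exists } X \le x \le 2X \text{ violating (5.2) for } \chi_{-D}\}.$$
By Lemma 5.3 with $\eta = \varepsilon/8$, it follows that
$$|\mathcal{E}(X,Q,\varepsilon)| \ll_\varepsilon Q^{1+\varepsilon/8}X^{-11/40-7\varepsilon/8} + Q^{3/4+\varepsilon/8}X^{1/8-7\varepsilon/8}.$$
Dyadically summing this estimate over $X$ between $cQ^{\varepsilon(1-\varepsilon)}$ and $CQ^{2+\varepsilon}$, we see that the total number of discriminants $-D \in \mathcal{D}(Q)$ satisfying $D \ge Q^{1-\varepsilon}$ and violating (5.2) anywhere in the range $cD^\varepsilon \le x \le CD^{2+\varepsilon}$ is bounded by
$$\ll_\varepsilon \big(Q^{1+\frac18\varepsilon-\frac{11}{40}\varepsilon(1-\varepsilon)} + Q^{1-\frac32\varepsilon}\big)\log Q \ll_\varepsilon Q^{1-\frac{37}{320}\varepsilon}\log Q \ll_\varepsilon Q^{1-\frac{1}{10}\varepsilon}$$
as $\varepsilon < 1/8$. □

5.3. Logarithmic derivatives

For $Q \ge 3$, define
$$\mathcal{D}^*(Q) = \{\text{fundamental discriminants } \Delta \text{ with } 3 \le |\Delta| \le Q\}.$$
We show that, aside from a sparse set of fundamental discriminants in $\mathcal{D}^*(Q)$, the logarithmic derivative of $L(s,\chi_\Delta)$ at $s = 1$ satisfies a GRH-quality bound. The key inputs are the explicit formula and Jutila's zero density estimate for primitive quadratic characters.

Lemma 5.5. Let $Q \ge 3$ and $\varepsilon > 0$ be arbitrary. For all except at most $O_\varepsilon(Q^{5/6+\varepsilon})$ fundamental discriminants $\Delta \in \mathcal{D}^*(Q)$,
$$-\frac{L'}{L}(1,\chi_\Delta) \ll_\varepsilon \log\log|\Delta|. \tag{5.3}$$

Proof. We modify the arguments leading to [18, Theorem 3]. Define $\mathcal{D}^*_\varepsilon(Q)$ to be the set of fundamental discriminants $\Delta \in \mathcal{D}^*(Q)$ such that $|\Delta| \le Q$ and whose $L$-function $L(s,\chi_\Delta)$ is zero-free in the rectangle
$$\tfrac34 < \Re\{s\} < 1, \qquad |\Im\{s\}| \le |\Delta|^\varepsilon. \tag{5.4}$$
First, we estimate $-\frac{L'}{L}(s,\chi_\Delta)$ for $\Delta \in \mathcal{D}^*_\varepsilon(Q)$. For simplicity, write $\chi = \chi_\Delta$. From the explicit formula in the form given by [12, p. 261], one can verify that
$$-\frac{L'}{L}(1,\chi) = \frac{1}{y-1}\sum_{m < y}\Big(\frac{y}{m}-1\Big)\Lambda(m)\chi(m) - \frac{1}{y-1}\sum_\rho \frac{y^\rho-1}{\rho(1-\rho)} + O\Big(\frac{\log y}{y}\Big) \tag{5.5}$$
for $y \ge 2$, where the sum is taken over all non-trivial zeros $\rho = \beta+i\gamma$ of $L(s,\chi)$.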
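From [12, Equation (5.4.6)] and the prime number theorem, it follows for $T \ge 1$ that
$$-\frac{L'}{L}(1,\chi) = -\frac{1}{y-1}\sum_{\substack{\rho \\ |\gamma| \le T}}\frac{y^\rho-1}{\rho(1-\rho)} + O\Big(\log y + \frac{\log(|\Delta|T)}{T} + \frac{\log^2|\Delta|}{y}\Big).$$
Set $T = |\Delta|^\varepsilon$. By the symmetry of the functional equation for real characters $\chi$ and the fact that $L(s,\chi)$ has no zeros in (5.4), it follows that every zero appearing in the sum over $\rho$ satisfies $\tfrac14 \le \Re\{\rho\} \le \tfrac34$. Thus, trivially bounding the remaining zeros, we deduce that
$$-\frac{L'}{L}(1,\chi) \ll \log y + y^{-1/4}\sum_{\substack{\rho \\ |\gamma| \le |\Delta|^\varepsilon}}\frac{1}{1+|\gamma|^2} + \frac{\log|\Delta|}{|\Delta|^\varepsilon} + \frac{\log^2|\Delta|}{y} \ll \log y + \frac{\log|\Delta|}{y^{1/4}} + \frac{\log|\Delta|}{|\Delta|^\varepsilon} + \frac{\log^2|\Delta|}{y}$$
for $y \ge 2$. Setting $y = (\log|\Delta|)^4 + 2$ implies (5.3) holds for all $\Delta \in \mathcal{D}^*_\varepsilon(Q)$.

It remains to show that the number of discriminants $\Delta \in \mathcal{D}^*(Q) \setminus \mathcal{D}^*_\varepsilon(Q)$ is small. Jutila's zero density estimate [13, Theorem 2] implies that
$$\sum_{\Delta \in \mathcal{D}^*(Q)}N(\sigma,T,\chi_\Delta) \ll_\varepsilon (QT)^{\frac{7-6\sigma}{6-4\sigma}+\frac{\varepsilon}{10}},$$
where $N(\sigma,T,\chi)$ is the number of zeros $\rho = \beta+i\gamma$ of $L(s,\chi)$ with $\sigma < \beta < 1$ and $|\gamma| \le T$. Setting $\sigma = 3/4$ and $T = Q^\varepsilon$, we see that the number of fundamental discriminants $\Delta \in \mathcal{D}^*(Q)$ whose $L$-function $L(s,\chi_\Delta)$ has a zero in the rectangle (5.4) is at most $O_\varepsilon(Q^{5/6+\varepsilon})$. Hence, $|\mathcal{D}^*(Q) \setminus \mathcal{D}^*_\varepsilon(Q)| \ll_\varepsilon Q^{5/6+\varepsilon}$ as required. □

Lemma 5.5 implies the same type of result for the set of all discriminants $\mathcal{D}(Q)$.

Lemma 5.6. Let $Q \ge 3$ and $\varepsilon > 0$ be arbitrary. For all except at most $O_\varepsilon(Q^{5/6+\varepsilon})$ discriminants $-D \in \mathcal{D}(Q)$,
$$-\frac{L'}{L}(1,\chi_{-D}) \ll_\varepsilon \log\log D \le \varepsilon\log D + O_\varepsilon(1).$$

Proof. Let $-D \in \mathcal{D}(Q)$ so, as in the proof of Lemma 5.2, we may write $-D = \Delta k^2$ for some negative fundamental discriminant $\Delta$ and an integer $k \ge 1$. It follows that $\chi_{-D}$ is induced by the primitive character $\chi_\Delta$ and, in particular, $\chi_{-D} = \chi_\Delta\chi_{k^2}$. This implies that
$$\Big|\frac{L'}{L}(1,\chi_{-D}) - \frac{L'}{L}(1,\chi_\Delta)\Big| \le \sum_{(n,k) \ne 1}\frac{\Lambda(n)}{n} \ll \sum_{p \mid k}\frac{\log p}{p} \ll \log\log k.$$
Thus, if $\Delta$ is a fundamental discriminant satisfying (5.3) then
$$-\frac{L'}{L}(1,\chi_{-D}) \ll_\varepsilon \log\log|\Delta| + \log\log k \ll_\varepsilon \log\log D.$$
Lemma 5.5 implies that the total number of discriminants $-D$ failing the above bound is
$$\ll_\varepsilon \sum_{k \le \sqrt{Q}}\Big(\frac{Q}{k^2}\Big)^{5/6+\varepsilon} \ll_\varepsilon Q^{5/6+\varepsilon}.$$
This completes the proof. □

6. Congruence sum decomposition

Let $f(u,v) = au^2+buv+cv^2$ be a form with discriminant $-D$. For this section, we will not require $f$ to be primitive.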
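As a concrete reference point for the counting problem addressed in this section, the congruence sum (1.12) can be computed by brute force for small $x$. The Python sketch below is only an illustration: the function names are hypothetical, the origin $(0,0)$ is counted (it contributes $n = 0$), and the helper `g_prime` evaluates the local density $g(p) = \frac1p\big(1+\chi(p)-\frac{\chi(p)}{p}\big)$ from (7.2) for odd primes $p$ only, using Euler's criterion for $\chi_{-D}(p)$.

```python
from math import isqrt, pi, sqrt

def congruence_sum(a, b, c, x, ell=1):
    """Count (u, v) in Z^2 with f(u, v) <= x and ell | f(u, v),
    where f(u, v) = a u^2 + b u v + c v^2 is positive definite."""
    D = 4 * a * c - b * b
    assert a > 0 and D > 0, "form must be positive definite"
    count = 0
    vmax = isqrt(4 * a * x // D)           # from (2au + bv)^2 + D v^2 <= 4ax
    for v in range(-vmax, vmax + 1):
        r = isqrt(4 * a * x - D * v * v)   # need |2au + bv| <= r
        u_lo = -((r + b * v) // (2 * a))   # ceil((-r - bv) / (2a))
        u_hi = (r - b * v) // (2 * a)      # floor((r - bv) / (2a))
        for u in range(u_lo, u_hi + 1):
            if (a * u * u + b * u * v + c * v * v) % ell == 0:
                count += 1
    return count

def g_prime(D, p):
    """Local density g(p) from (7.2) for an odd prime p (Euler's criterion)."""
    t = pow(-D % p, (p - 1) // 2, p)
    chi = 0 if t == 0 else (1 if t == 1 else -1)
    return (1 + chi - chi / p) / p

# Hypothetical example: f = u^2 + v^2 (D = 4), x = 10^4, ell = 5.
a, b, c, x, ell = 1, 0, 1, 10**4, 5
D = 4 * a * c - b * b
print("brute force:", congruence_sum(a, b, c, x, ell))
print("main term  :", g_prime(D, ell) * 2 * pi * x / sqrt(D))
```

The brute-force count and the main term $g(\ell)\cdot 2\pi x/\sqrt{D}$ should be of comparable size once $x$ is large relative to $D/a$; the sketch makes no claim about the size of the error term.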
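For any integer $n \ge 0$, define
$$r_f(n) := |\{(u,v) \in \mathbb{Z}^2 : n = f(u,v)\}|. \tag{6.1}$$
Moreover, for $x \ge 1$ and positive integers $\ell$ and $d$, define
$$\begin{aligned}\mathcal{A} = \mathcal{A}(x,f) &:= \{(u,v) \in \mathbb{Z}^2 : f(u,v) \le x\},\\ \mathcal{A}_\ell = \mathcal{A}_\ell(x,f) &:= \{(u,v) \in \mathcal{A} : f(u,v) \equiv 0 \pmod \ell\},\\ \mathcal{A}_\ell(d) = \mathcal{A}_\ell(x,f;d) &:= \{(u,v) \in \mathcal{A}_\ell : (v,\ell) = d\}.\end{aligned} \tag{6.2}$$
We will suppress the dependence on $x$ and $f$ whenever it is clear from context. This will be the case for almost the entirety of the paper. Observe that
$$|\mathcal{A}| = \sum_{n \le x}r_f(n) \quad\text{and}\quad |\mathcal{A}_\ell| = \sum_{\substack{n \le x \\ \ell \mid n}}r_f(n) = \sum_{d \mid \ell}|\mathcal{A}_\ell(d)|. \tag{6.3}$$
Note the last identity holds since $\mathcal{A}_\ell$ is a disjoint union of the sets $\mathcal{A}_\ell(d)$ over $d \mid \ell$. To calculate $|\mathcal{A}_\ell(d)|$, and subsequently $|\mathcal{A}_\ell|$, we will need to decompose it into sums similar to $|\mathcal{A}_\ell(1)|$ and estimate them with uniformity over all parameters. To this end, we introduce some additional notation. For any integer $\ell \ge 1$ and $m \in \mathbb{Z}/\ell\mathbb{Z}$, define
$$\begin{aligned}\mathcal{B}_\ell = \mathcal{B}_\ell(x,f) &:= \{(u,v) \in \mathcal{A} : (v,\ell) = 1,\ f(u,v) \equiv 0 \pmod \ell\},\\ \mathcal{B}_\ell(m) = \mathcal{B}_\ell(x,f;m) &:= \{(u,v) \in \mathcal{A} : (v,\ell) = 1,\ u \equiv mv \pmod \ell\}.\end{aligned} \tag{6.4}$$
Note that $\mathcal{B}_\ell$ is exactly the same as $\mathcal{A}_\ell(1)$, but we distinguish it for the sake of clarity. The crucial property of the sets $\mathcal{B}_\ell$ and $\mathcal{B}_\ell(m)$ is summarized in the following lemma.

Lemma 6.1. Let $f(u,v) = au^2+buv+cv^2$ be a positive definite binary integral quadratic form of discriminant $-D$ and let $\ell \ge 1$ be a squarefree integer. Define
$$\mathcal{M}(\ell) = \mathcal{M}_f(\ell) := \{m \in \mathbb{Z}/\ell\mathbb{Z} : am^2+bm+c \equiv 0 \pmod \ell\}.$$
Then
$$|\mathcal{B}_\ell| = \sum_{m \in \mathcal{M}(\ell)}|\mathcal{B}_\ell(m)|. \tag{6.5}$$
Furthermore, $M(\ell) = M_f(\ell) := |\mathcal{M}(\ell)|$ is a non-negative multiplicative function satisfying
$$M(p) = \begin{cases}1+\chi(p) & \text{if } p \nmid a,\\ \chi(p) & \text{if } p \mid a \text{ and } p \nmid (a,b,c),\\ p & \text{if } p \mid (a,b,c),\end{cases} \tag{6.6}$$
for all primes $p$. Here $\chi = \chi_{-D}$ is the corresponding Kronecker symbol.

Proof. Let $(u,v) \in \mathcal{B}_\ell$. As $(v,\ell) = 1$, select $m \in \mathbb{Z}/\ell\mathbb{Z}$ such that $u \equiv mv \pmod \ell$. Thus,
$$f(u,v) \equiv 0 \pmod \ell \iff (am^2+bm+c)v^2 \equiv 0 \pmod \ell \iff m \in \mathcal{M}(\ell).$$
This implies $\mathcal{B}_\ell$ is a union of $\mathcal{B}_\ell(m)$ over $m \in \mathcal{M}(\ell)$. One can verify from (6.4) that $m_1 \not\equiv m_2 \pmod \ell$ implies $\mathcal{B}_\ell(m_1) \cap \mathcal{B}_\ell(m_2) = \emptyset$. Thus, the union is in fact disjoint, yielding (6.5).

Next, we count $M(\ell) = |\mathcal{M}(\ell)|$. The function $M(\ell)$ is multiplicative by the Chinese Remainder Theorem. Let $p$ be an odd prime. If $p \nmid a$ then $M(p) = 1+\chi(p)$ by the definition of $\chi$. If $p \mid a$ then for $m \in \mathcal{M}(p)$
$$0 \equiv am^2+bm+c \equiv bm+c \pmod p.$$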
(6.7)

Note in this scenario $\chi(p) = 0$ or $1$ only. We consider cases.

If $p \nmid b$ then $m \equiv -b^{-1}c \pmod p$ is the only solution to (6.7). Thus, $M(p) = 1 = \left(\frac{b^2}{p}\right) = \left(\frac{b^2-4ac}{p}\right) = \chi(p)$.

If $p \mid b$ then condition (6.7) becomes $c \equiv 0 \pmod p$. We further subdivide the cases. If $p \nmid c$ then no value of $m$ satisfies (6.7), implying $M(p) = 0 = \left(\frac{b^2-4ac}{p}\right) = \chi(p)$. If $p \mid c$ then $p \mid (a,b,c)$ in this subcase. Hence, all $m \in \mathbb{Z}/p\mathbb{Z}$ vacuously satisfy (6.7), so $M(p) = p$.

Comparing these cases, we see $M(p)$ indeed satisfies (6.6) for all odd primes $p$. For $p = 2$, one can verify by a tedious case analysis that $M(2)$ also satisfies (6.6). □

In light of Lemma 6.1, the main goal of this section is to determine the size of $|\mathcal{B}_\ell(m)|$ for any $m \in \mathbb{Z}/\ell\mathbb{Z}$. For convenience, set
$$V = V(x,f) := \sqrt{\frac{4ax}{D}}. \tag{6.8}$$
This notation will be used throughout the paper. While we are more interested when $V \ge 1$, we only assume $x \ge 1$ in all of our arguments, so it is possible that $0 < V < 1$. Recall $\varphi$ denotes the Euler totient function and $\tau$ is the divisor function.

Lemma 6.2. Let $f(u,v) = au^2+buv+cv^2$ be a positive definite binary integral quadratic form of discriminant $-D$. Let $\ell \ge 1$ be a squarefree integer and $m \in \mathbb{Z}/\ell\mathbb{Z}$. For $x \ge 1$,
$$|\mathcal{B}_\ell(m)| = \frac{\varphi(\ell)}{\ell^2}\cdot\frac{\pi\sqrt{D}}{2a}V^2 + O\Big(V + \ell^{1/2}\tau(\ell)\frac{\sqrt{D}}{a}V^{1/2} + \delta(\ell)\Big), \tag{6.9}$$
where $\mathcal{B}_\ell(m) = \mathcal{B}_\ell(x,f;m)$ is defined by (6.4), $V = V(x,f)$ is defined by (6.8), and $\delta(\ell)$ is the indicator function for $\ell = 1$.

Remark. The expressions and implied constant on the right-hand side of (6.9) are independent of $m \in \mathbb{Z}/\ell\mathbb{Z}$.

Proof. Counting the number of pairs $(u,v) \in \mathbb{Z}^2$ satisfying $f(u,v) \le x$ amounts to verifying the inequality
$$(2au+bv)^2 + Dv^2 \le 4ax.$$
Fixing $v$, any $u$ satisfying the above inequality lies in the range
$$\frac{-bv-\sqrt{4ax-Dv^2}}{2a} \le u \le \frac{-bv+\sqrt{4ax-Dv^2}}{2a}.$$
Without loss, we may assume $m$ is an integer lying in $\{0,1,\dots,\ell-1\}$. Restricting to $u = mv+j\ell$, we see that the integer $j$ must lie in the range
$$\frac{-(b+2am)v-\sqrt{4ax-Dv^2}}{2a\ell} \le j \le \frac{-(b+2am)v+\sqrt{4ax-Dv^2}}{2a\ell}.$$
For each fixed $v$ and solution $u \equiv mv \pmod \ell$, the total number of such integers $j$ is, therefore,
$$\frac{1}{\ell}F(v) + O(1), \quad\text{where}\quad F(v) = \frac{1}{a}\sqrt{4ax-Dv^2} = \frac{\sqrt{D}}{a}\sqrt{V^2-v^2}.$$
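(6.10)

Now, summing over all integers $v$ satisfying $|v| \le V$ and $(v,\ell) = 1$, we deduce that
$$|\mathcal{B}_\ell(m)| = \frac{1}{\ell}\sum_{\substack{|v| \le V \\ (v,\ell)=1}}F(v) + O\Big(\sum_{\substack{|v| \le V \\ (v,\ell)=1}}1\Big).$$
The term $v = 0$ contributes to the above sums if and only if $\ell = 1$. Let $\delta(\ell)$ be the indicator function for $\ell = 1$. We separate the term $v = 0$, if necessary, in the sums above and note $F(v)$ is even to see that
$$|\mathcal{B}_\ell(m)| = \frac{2}{\ell}\sum_{\substack{1 \le v \le V \\ (v,\ell)=1}}F(v) + \frac{\delta(\ell)}{\ell}\frac{\sqrt{D}}{a}V + O(V+\delta(\ell)). \tag{6.11}$$
We remove the condition $(v,\ell) = 1$ via Möbius inversion and deduce that
$$\sum_{\substack{1 \le v \le V \\ (v,\ell)=1}}F(v) = \sum_{d \mid \ell}\mu(d)\sum_{1 \le w \le V/d}F(dw). \tag{6.12}$$
By Lemma 3.1, we see that
$$\sum_{1 \le w \le V/d}F(dw) = \frac{d\sqrt{D}}{a}\Big(\frac{\pi V^2}{4d^2} - \frac{V}{2d} + O(\sqrt{V/d})\Big).$$
Since $\sum_{d \mid \ell}\frac{\mu(d)}{d} = \frac{\varphi(\ell)}{\ell}$, $\sum_{d \mid \ell}\mu(d) = \delta(\ell)$, and $\sum_{d \mid \ell}d^{1/2} \ll \ell^{1/2}\tau(\ell)$, it follows by (6.12) that
$$\sum_{\substack{1 \le v \le V \\ (v,\ell)=1}}F(v) = \frac{\varphi(\ell)}{\ell}\frac{\pi\sqrt{D}}{4a}V^2 - \delta(\ell)\frac{\sqrt{D}}{2a}V + O\Big(\ell^{1/2}\tau(\ell)\frac{\sqrt{D}}{a}V^{1/2}\Big). \tag{6.13}$$
Combining (6.11) and (6.13) yields (6.9). Note that the terms involving $\delta(\ell)$ cancel. □

We conclude this section by calculating $|\mathcal{B}_\ell|$.

Lemma 6.3. Let $f(u,v) = au^2+buv+cv^2$ be a positive definite binary integral quadratic form of discriminant $-D$. If $\ell \ge 1$ is a squarefree integer then
$$|\mathcal{B}_\ell| = M(\ell)\Big(\frac{\varphi(\ell)}{\ell^2}\cdot\frac{\pi\sqrt{D}}{2a}V^2 + O\Big(V + \ell^{1/2}\tau(\ell)\frac{\sqrt{D}}{a}V^{1/2} + \delta(\ell)\Big)\Big),$$
where $\mathcal{B}_\ell = \mathcal{B}_\ell(x,f)$ is defined by (6.4), $V = V(x,f)$ is defined by (6.8), $\delta(\ell)$ is the indicator function for $\ell = 1$, and $M(\ell) = M_f(\ell)$ is a multiplicative function defined by (6.6).

Proof. This is an immediate consequence of Lemmas 6.1 and 6.2 since the latter lemma's estimates are uniform over all $m \in \mathbb{Z}/\ell\mathbb{Z}$. □

7. Local densities

We may now assemble our tools to establish the key technical proposition. Namely, we estimate the congruence sums given by (6.3) and calculate the local densities.

Proposition 7.1. Let $f$ be a primitive positive definite binary quadratic form with discriminant $-D$. If $\ell \ge 1$ is a squarefree integer then for $x \ge 1$,
$$|\mathcal{A}_\ell| = \sum_{\substack{n \le x \\ \ell \mid n}}r_f(n) = g(\ell)\frac{\pi\sqrt{D}}{2a}V^2 + O\Big(\tau_3(\ell)V + \ell^{1/2}\tau(\ell)\tau_3(\ell)\frac{\sqrt{D}}{a}V^{1/2} + 1\Big), \tag{7.1}$$
where $V = \sqrt{4ax/D}$ and $g$ is a multiplicative function satisfying
$$g(p) = \frac{1}{p}\Big(1+\chi(p)-\frac{\chi(p)}{p}\Big) \quad\text{for all primes } p. \tag{7.2}$$
Here $\chi = \chi_{-D}$ is the corresponding Kronecker symbol.

Proof. Let $d \mid \ell$ and let $\mathcal{A}_\ell(d) = \mathcal{A}_\ell(x,f;d)$ be defined by (6.2).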
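From observation (6.3), it suffices to calculate $|\mathcal{A}_\ell(d)|$. First, we introduce some notation. For any integer $r \ge 1$, set
$$f_r(u,w) := f(u,rw) = au^2 + bruw + cr^2w^2. \tag{7.3}$$
Notice that its discriminant is $-r^2D$. Therefore, it follows for any $\alpha > 0$ that
$$V(\alpha^2x, f_r) = \frac{\alpha V}{r} \quad\text{and}\quad \chi_{-r^2D}(n) = \begin{cases}\chi(n) & \text{if } (n,r) = 1,\\ 0 & \text{otherwise},\end{cases}$$
where $V = V(x,f)$ and $\chi = \chi_{-D}$ as usual.

Now, write $\ell = dk$ so $(d,k) = 1$ as $\ell$ is squarefree. We wish to characterize each point $(u,v) \in \mathcal{A}_\ell(x,f;d)$. Since $(v,\ell) = d$, it follows by the Chinese Remainder Theorem that
$$\begin{aligned}f(u,v) \equiv 0 \pmod \ell &\iff au^2 \equiv 0 \pmod d \ \text{ and } \ f(u,v) \equiv 0 \pmod k\\ &\iff u \equiv 0 \pmod{\tfrac{d}{(a,d)}} \ \text{ and } \ f(u,v) \equiv 0 \pmod k.\end{aligned} \tag{7.4}$$
Write $u = \frac{d}{(a,d)}s$ and $v = dt$ for integers $s$ and $t$. Note $(t,k) = 1$ as $(v,\ell) = d$ and $\ell$ is squarefree. Then one can verify that
$$f(u,v) = \frac{d^2}{(a,d)^2}\cdot\big(as^2 + b(a,d)st + c(a,d)^2t^2\big) = \frac{d^2}{(a,d)^2}\cdot f_{(a,d)}(s,t). \tag{7.5}$$
From this change of variables, (7.4) and (7.5), we see that
$$\begin{aligned}f(u,v) \equiv 0 \pmod \ell &\iff f_{(a,d)}(s,t) \equiv 0 \pmod k,\\ f(u,v) \le x &\iff f_{(a,d)}(s,t) \le \frac{(a,d)^2}{d^2}x.\end{aligned} \tag{7.6}$$
Note by (7.5) that the congruence conditions are equivalent as $(d,k) = 1$ and $\ell = dk$. Since $(t,k) = 1$ necessarily, we have therefore established that
$$|\mathcal{A}_\ell(x,f;d)| = \Big|\mathcal{B}_k\Big(\frac{(a,d)^2x}{d^2}, f_{(a,d)}\Big)\Big|. \tag{7.7}$$
Summing this identity over $d \mid \ell$, we apply observation (6.3) and Lemma 6.3 to deduce that
$$\begin{aligned}|\mathcal{A}_\ell(x,f)| &= \sum_{\ell=dk}\Big|\mathcal{B}_k\Big(\frac{(a,d)^2x}{d^2}, f_{(a,d)}\Big)\Big|\\ &= \frac{\pi\sqrt{D}}{2a}V^2\cdot\sum_{\ell=dk}M_{f_{(a,d)}}(k)\frac{\varphi(k)}{k^2}\frac{(a,d)}{d^2}\\ &\quad + O\Big(V\cdot\sum_{\ell=dk}\frac{M_{f_{(a,d)}}(k)}{d} + \frac{\sqrt{D}}{a}V^{1/2}\sum_{\ell=dk}M_{f_{(a,d)}}(k)k^{1/2}\tau(k)d^{-1/2} + \sum_{\ell=dk}M_{f_{(a,d)}}(k)\delta(k)\Big).\end{aligned} \tag{7.8}$$
We wish to simplify the remaining sums and error term. Let $r \mid \ell$. As $f$ is primitive,
$$M_f(p) = \begin{cases}1+\chi(p) & \text{if } p \nmid a,\\ \chi(p) & \text{if } p \mid a,\end{cases}$$
by (6.6). To compute $M_{f_r}$, observe by the primitivity of $f$ that a prime $p$ divides $(a,br,cr^2)$ if and only if $p$ divides $(a,r)$. Moreover, if $p \mid r$ then $\chi_{-r^2D}(p) = \left(\frac{-r^2D}{p}\right) = \left(\frac{r^2}{p}\right)\left(\frac{-D}{p}\right) = 0$ and, similarly, if $p \nmid r$ then $\chi_{-r^2D}(p) = \chi(p)$. Combining these observations with (6.6) and (7.3), we see that
$$M_{f_{(a,d)}}(p) = \begin{cases}1+\chi(p) & \text{if } p \nmid a,\\ \chi(p) & \text{if } p \mid a \text{ and } p \nmid (a,d),\\ p & \text{if } p \mid (a,d).\end{cases}$$
In particular, as $(d,k) = 1$, it follows that $M_{f_{(a,d)}}(k) = M_f(k)$. Hence,
$$\begin{aligned}\sum_{\ell=dk}M_{f_{(a,d)}}(k)\frac{\varphi(k)}{k^2}\frac{(a,d)}{d^2} &= \sum_{\ell=dk}M_f(k)\cdot\frac{\varphi(k)}{k^2}\cdot\frac{(a,d)}{d^2}\\ &= \prod_{\substack{p \mid \ell \\ p \nmid a}}\Big((1+\chi(p))\Big(\frac{1}{p}-\frac{1}{p^2}\Big)+\frac{1}{p^2}\Big)\times\prod_{p \mid (\ell,a)}\Big(\chi(p)\Big(\frac{1}{p}-\frac{1}{p^2}\Big)+\frac{1}{p}\Big)\\ &= \prod_{p \mid \ell}\Big(\frac{1+\chi(p)}{p}-\frac{\chi(p)}{p^2}\Big) = g(\ell).\end{aligned}$$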
(7.9)

Similarly,
$$\sum_{\ell=dk}\frac{M_{f_{(a,d)}}(k)}{d} = \prod_{\substack{p \mid \ell \\ p \nmid a}}\Big(1+\chi(p)+\frac{1}{p}\Big)\times\prod_{p \mid (\ell,a)}\Big(\chi(p)+\frac{1}{p}\Big) \ll \tau_3(\ell), \tag{7.10}$$
which implies that
$$\sum_{\ell=dk}M_{f_{(a,d)}}(k)k^{1/2}\tau(k)d^{-1/2} \ll \ell^{1/2}\tau(\ell)\sum_{\ell=dk}\frac{M_{f_{(a,d)}}(k)}{d} \ll \ell^{1/2}\tau(\ell)\tau_3(\ell). \tag{7.11}$$
Combining the observation that $\sum_{\ell=dk}M_{f_{(a,d)}}(k)\delta(k) = M_{f_{(a,d)}}(1) = 1$ with (7.8), (7.9), (7.10) and (7.11) yields the desired result. □

To obtain a better intuition for the quality of Proposition 7.1, we present the special case when $\ell = 1$ as a corollary below. We do not claim that this corollary is new, but we have not seen it stated in the literature and thought it may be of independent interest.

Corollary 7.2. Let $f(u,v) = au^2+buv+cv^2$ be a primitive positive definite binary quadratic form with discriminant $-D$. For $x \ge 1$,
$$\sum_{n \le x}r_f(n) = \frac{2\pi x}{\sqrt{D}} + O\Big(\frac{(ax)^{1/2}}{D^{1/2}} + \frac{(Dx)^{1/4}}{a^{3/4}} + 1\Big).$$

Remarks. Suppose $f$ is reduced so $|b| \le a \le c$. It is well known (see, for example, [1, Lemma 3.1]) that
$$\sum_{n \le x}r_f(n) = \frac{2\pi x}{\sqrt{D}} + O\Big(\frac{x^{1/2}}{a^{1/2}} + 1\Big).$$
This estimate, like Corollary 7.2, gives the asymptotic $\sim \frac{2\pi x}{\sqrt{D}}$ as long as $x/c \to \infty$, but the error term in Corollary 7.2 is stronger than the above whenever $x \ge c$. The source of this improvement is a standard analysis of the sawtooth function in Lemma 3.1 and its subsequent application in Lemma 6.2.

As discussed in Section 1.2, the condition $x \ge c$ is the 'non-trivial' range for counting the lattice points inside the ellipse $f(u,v) \le x$ whenever $f$ is reduced.

8. Application of Selberg's sieve

We now apply Selberg's sieve to give an upper bound for the number of primes in a short interval represented by a reduced positive definite primitive integral binary quadratic form. We leave the calculation of the main term's implied constant unfinished as the final arguments vary slightly for Theorems 1.4 and 1.5.

Proposition 8.1. Let $f(u,v) = au^2+buv+cv^2$ be a reduced positive definite integral binary quadratic form with discriminant $-D$. Let $(ax)^{1/2} \le y \le x$. Set
$$z = \Big(\frac{a}{Dx}\Big)^{1/4}y^{1/2}(\log y)^{-7} + 1. \tag{8.1}$$
If $x \ge D/a$ then
$$\pi_f(x)-\pi_f(x-y) < \Big\{\frac{\log y}{J} + O\big((\log y)^{-1}\big)\Big\}\frac{\delta_f\, y}{h(-D)\log y}, \tag{8.2}$$
where
$$J = \begin{cases}\dfrac{1}{L(1,\chi)}\displaystyle\sum_{\ell < z}g(\ell) & \text{if } L(1,\chi) \ge (\log y)^{-2},\\[2ex] (\log y)^2 & \text{otherwise.}\end{cases}$$
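(8.3)

Here $\chi = \chi_{-D}$ is the corresponding Kronecker symbol and $g$ is the completely multiplicative function defined by (7.2).

Proof. As $f$ is reduced, we have that $|b| \le a \le c$ and moreover $c \asymp D/a \ge \sqrt{D} \ge a$. We will frequently apply these properties while only mentioning that $f$ is reduced. Our argument is divided according to the size of $L(1,\chi)$.

First, assume $L(1,\chi) < (\log y)^{-2}$. Let $w = \#\{A \in SL_2(\mathbb{Z}) : A \cdot f = f\}$ so by [25, p. 63, Satz 2], $w = 6$ or $4$ if $-D = -3$ or $-4$, respectively, and $w = 2$ otherwise. If a prime $p$ is represented by $f$ then it is represented with multiplicity equal to $\delta_f^{-1}w$. Thus, by Corollary 7.2,
$$\frac{w}{\delta_f}\big(\pi_f(x)-\pi_f(x-y)\big) \le \sum_{x-y < n \le x}r_f(n) = \frac{2\pi y}{\sqrt{D}} + O\Big(\frac{(ax)^{1/2}}{D^{1/2}}\Big)$$
because $x \ge D/a$ and $f$ is reduced. The class number formula (see [25, p. 72, Satz 5] for example) states that
$$h(-D) = \frac{w\sqrt{D}}{2\pi}L(1,\chi). \tag{8.4}$$
Hence, by our assumption on $L(1,\chi)$,
$$\pi_f(x)-\pi_f(x-y) \le L(1,\chi)\Big\{1+O\Big(\frac{(ax)^{1/2}}{y}\Big)\Big\}\frac{\delta_f\, y}{h(-D)} \ll \frac{y}{h(-D)(\log y)^2}.$$
In the last step, we used that $y \ge (ax)^{1/2}$. This establishes (8.2) when $L(1,\chi) < (\log y)^{-2}$.

Therefore, we may henceforth assume
$$L(1,\chi) \ge (\log y)^{-2}. \tag{8.5}$$
Defining $P = P(z) = \prod_{p \le z}p$, it follows that
$$\frac{w}{\delta_f}\big(\pi_f(x)-\pi_f(x-y)\big) \le \sum_{\substack{x-y < n \le x \\ (n,P)=1}}r_f(n) + \frac{w}{\delta_f}\pi(z), \tag{8.6}$$
where $\pi(z)$ is the number of primes up to $z$. We proceed to estimate the sieved sum. Using Proposition 7.1 and Selberg's upper bound sieve [7, Theorem 7.1] with level of distribution $z^2$, we see that
$$\sum_{\substack{x-y < n \le x \\ (n,P)=1}}r_f(n) < \frac{2\pi y}{\sqrt{D}\,\mathcal{J}} + \sum_{\substack{\ell \mid P \\ \ell < z^2}}r_\ell\lambda_\ell, \tag{8.7}$$
where
$$\mathcal{J} = \sum_{\substack{\ell \mid P \\ \ell < z}}h(\ell), \qquad h(\ell) = \prod_{p \mid \ell}\frac{g(p)}{1-g(p)}, \qquad |\lambda_\ell| \le \tau_3(\ell),$$
and
$$r_\ell \ll \tau_3(\ell)V + \ell^{1/2}\tau(\ell)\tau_3(\ell)\frac{\sqrt{D}}{a}V^{1/2}.$$
Here, as usual, $V = \sqrt{4ax/D}$. Note $1 \le V \le x^{1/2}$ as $x \ge D/a$ and $a \le \sqrt{D}$. For the sieve quantity $\mathcal{J}$ in the main term, we treat $g$ as a completely multiplicative function and note by (8.3) that
$$\mathcal{J} \ge \sum_{\ell < z}g(\ell) = L(1,\chi)J. \tag{8.8}$$
The remainder term in (8.7) is bounded in a straightforward manner. Using standard estimates for the $k$-divisor function $\tau_k(\ell)$ (see, for example, [17]) and the prime number theorem, one can verify that
$$\begin{aligned}\frac{w}{\delta_f}\pi(z) + \sum_{\substack{\ell \mid P \\ \ell < z^2}}r_\ell\lambda_\ell &\ll \frac{z}{\log z} + V\sum_{\ell < z^2}\tau_3(\ell)^2 + \frac{\sqrt{D}}{a}V^{1/2}\sum_{\ell < z^2}\ell^{1/2}\tau(\ell)\tau_3(\ell)^2\\ &\ll z^2(\log z)^8\cdot V + z^3(\log z)^{17}\cdot\frac{\sqrt{D}}{a}V^{1/2}\\ &\ll \frac{ay}{D}(\log y)^{-6} + \frac{y^{3/2}x^{-1/2}}{\sqrt{D}}(\log y)^{-4}.\end{aligned}$$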
In the last step, we used that $V = \sqrt{4ax/D}$ and, by (8.1), $z = \big(\frac{a}{Dx}\big)^{1/4}y^{1/2}(\log y)^{-7} + 1 \le y$. Since $y \le x$ and $a \le \sqrt{D}$, we see that the above is
$$\ll \frac{y}{\sqrt{D}(\log y)^4}.$$
Thus, applying the class number formula (8.4) and the well-known estimate $L(1,\chi) \ll \log D \ll \log y$, we conclude that
$$\frac{w}{\delta_f}\pi(z) + \sum_{\substack{\ell \mid P \\ \ell < z^2}}r_\ell\lambda_\ell \ll \frac{y}{h(-D)(\log y)^3}. \tag{8.9}$$
Combining (8.6), (8.7), (8.8) and (8.9) completes the proof of the proposition with a final application of the class number formula (8.4). □

Evidently, from Proposition 8.1, we will require a lower bound for the sum of local densities. We execute the first steps here.

Lemma 8.2. Let $g$ be the completely multiplicative function defined by (7.2). For $z \ge 1$,
$$\sum_{\ell < z}g(\ell) \ge L(1,\chi)\log z + L'(1,\chi) + O\big(L(1,\chi) + E_1(z;\chi) + z^{-1}E_0(z;\chi)\big).$$
Here $E_1(z;\chi)$ and $E_0(z;\chi)$ are defined as in Lemma 3.2.

Proof. Define
$$G(s) := \sum_{n=1}^\infty g(n)n^{-s} = \prod_p\big(1-g(p)p^{-s}\big)^{-1},$$
which absolutely converges for $\Re\{s\} > 0$ since $|g(p)| \le 2/p$. One can verify that for $\Re\{s\} > 0$
$$G(s) = \zeta(s+1)L(s+1,\chi)\tilde{G}(s), \tag{8.10}$$
where $\zeta(s)$ is the Riemann zeta function, $L(s,\chi)$ is the Dirichlet $L$-function attached to the quadratic character $\chi = \chi_{-D}$, and
$$\tilde{G}(s) := \prod_p\Big(1-\frac{\chi(p)p^{-s-2}-\chi(p)p^{-2s-2}}{1-(1+\chi(p))p^{-s-1}+\chi(p)p^{-s-2}}\Big).$$
It is straightforward to check that $\tilde{G}(s)$ is absolutely convergent for $\Re\{s\} > -1$ and, in particular, $\tilde{G}(0) = 1$. Expanding the Euler product for $\tilde{G}(s)$ and writing $\tilde{G}(s) = \sum_n \tilde{g}(n)n^{-s}$ for some multiplicative function $\tilde{g}$, one can see that $\tilde{g}(n) \ll n^{-2}$. As $\tilde{G}(0) = 1$, it follows that $\sum_{n \le N}\tilde{g}(n) = 1 + O(N^{-1})$. Therefore, from (8.10),
$$\sum_{\ell < z}g(\ell) = \sum_{\ell < z}\sum_{\ell = mn}\frac{(1*\chi)(m)}{m}\tilde{g}(n) = \sum_{m < z}\frac{(1*\chi)(m)}{m} + O\Big(z^{-1}\sum_{m < z}(1*\chi)(m)\Big).$$
The desired result now follows from Lemma 3.2. □

9. Representation of primes

We may finally prove Theorems 1.4 and 1.5. In both cases, we will need to apply Proposition 8.1, from which one can see that it suffices to provide an appropriate lower bound for $J$ when $L(1,\chi) \ge (\log y)^{-2}$. By Lemma 8.2, it follows that
$$J \ge \log z + \frac{L'}{L}(1,\chi) + O\Big(1 + E_1(z;\chi)(\log y)^2 + \frac{E_0(z;\chi)(\log y)^2}{z}\Big) \tag{9.1}$$
provided $L(1,\chi) \ge (\log y)^{-2}$, $(ax)^{1/2} \le y \le x$, and $z$ is given by (8.1).
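Recall that $E_0$ and $E_1$ are given by Lemma 3.2 with estimates exhibited in Sections 4 and 5. The proofs for both theorems will employ (9.1). Before we proceed, we wish to emphasize that $f(u,v) = au^2 + buv + cv^2$ is assumed to be a reduced positive definite binary integral quadratic form of discriminant $-D$. Thus, $|b| \le a \le c$ and $a \le \sqrt{D}$.

### 9.1. Proof of Theorem 1.4

Recall we are assuming that

$$\Big(\frac{D^{1+4\phi}}{a}\Big)^{1/2+\varepsilon} x^{1/2+\varepsilon} \le y \le x,$$

where $\phi$ is given by (1.8). As $a \le \sqrt{D}$, this implies that $y \ge (ax)^{1/2}$. Furthermore,

$$z = \Big(\frac{a}{Dx}\Big)^{1/4} y^{1/2} (\log y)^{-7} + 1 \gg_\varepsilon D^{\phi + \varepsilon/2}, \quad \text{and} \quad \log z \asymp \log y \asymp \log x.$$

Therefore, applying Lemma 4.4 (or Lemma 4.5 when assuming GRH) and Lemma 4.3 to (9.1), it follows that

$$\begin{aligned}
J &\ge \log z - \Big(\frac{\phi}{2} + \frac{\varepsilon}{4}\Big)\log D + O_\varepsilon\big(1 + z^{-\varepsilon^2}\log^2 y\big) \\
&\ge \frac{1}{2}\log y - \frac{1}{4}\log x - \Big(\frac{1}{4} + \frac{\phi}{2} + \frac{\varepsilon}{4}\Big)\log D + \frac{1}{4}\log a + O_\varepsilon(\log\log y) \\
&= \frac{1-\theta}{2}\log y + O_\varepsilon(\log\log y),
\end{aligned}$$

where $\theta$ is defined as in Theorem 1.4. Substituting this estimate in Proposition 8.1 establishes Theorem 1.4 when $L(1,\chi) \ge (\log y)^{-2}$. If $L(1,\chi) < (\log y)^{-2}$ then the desired result follows immediately from Proposition 8.1 and hence completes the proof. □

### 9.2. Proof of Theorem 1.5

Recall $\mathfrak{D}(Q)$ is given by (1.11). Let $c_1(\varepsilon) > 0$ be a sufficiently small constant and $C_1(\varepsilon), C_2(\varepsilon) \ge 1$ be sufficiently large constants, all of which depend only on $\varepsilon$. For $Q \ge 3$, let $\mathfrak{D}_\varepsilon(Q)$ be the subset of discriminants $-D \in \mathfrak{D}(Q)$ such that

$$-\frac{L'}{L}(1,\chi_{-D}) \le \varepsilon \log D + C_2(\varepsilon) \tag{9.2}$$

and, for $c_1(\varepsilon) D^{\varepsilon} \le u \le C_1(\varepsilon) D^{2+\varepsilon}$,

$$E_0(u;\chi_{-D}) \le u^{7/8+\varepsilon}, \qquad E_1(u;\chi_{-D}) \le u^{-1/8+\varepsilon}. \tag{9.3}$$

By Lemmas 5.4 and 5.6, the number of discriminants not satisfying these two properties is

$$|\mathfrak{D}(Q) \setminus \mathfrak{D}_\varepsilon(Q)| \ll_\varepsilon Q^{1-\varepsilon/10}.$$

Thus, it suffices to show for every discriminant $-D \in \mathfrak{D}_\varepsilon(Q)$ and reduced positive definite binary quadratic form $f$ of discriminant $-D$ that

$$\pi_f(x) - \pi_f(x-y) < \frac{2}{1-\theta'} \cdot \frac{\delta_f\, y}{h(-D)\log y}\Big\{1 + O_\varepsilon\Big(\frac{\log\log y}{\log y}\Big)\Big\} \tag{9.4}$$

provided $\big(\frac{Dx}{a}\big)^{1/2+\varepsilon} \le y \le x$. Here $\theta'$ is defined as in Theorem 1.4 with $\phi = 0$. First, assume

$$\Big(\frac{D^2 x}{a}\Big)^{1/2+\varepsilon} \le y \le x. \tag{9.5}$$

Arguing as in Section 9.1, it follows that $y \ge (ax)^{1/2}$,

$$z = \Big(\frac{a}{Dx}\Big)^{1/4} y^{1/2} (\log y)^{-7} + 1 \gg_\varepsilon D^{1/4 + \varepsilon/2}, \quad \text{and} \quad \log z \asymp \log y \asymp \log x.$$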
Thus, incorporating (9.2) and (4.1) from Lemma 4.3 into (9.1), it similarly follows that

$$J \ge \frac{1-\theta'}{2}\log y + O_\varepsilon(\log\log y) \tag{9.6}$$

whenever $L(1,\chi) \ge (\log y)^{-2}$. Therefore, by Proposition 8.1, this establishes (9.4) provided (9.5) holds and $-D \in \mathfrak{D}_\varepsilon(Q)$. It remains to consider the case when

$$\Big(\frac{Dx}{a}\Big)^{1/2+\varepsilon} \le y \le \Big(\frac{D^2 x}{a}\Big)^{1/2+\varepsilon} \le x. \tag{9.7}$$

Note we continue to assume $-D \in \mathfrak{D}_\varepsilon(Q)$. As before, it follows that $y \ge (ax)^{1/2}$,

$$z = \Big(\frac{a}{Dx}\Big)^{1/4} y^{1/2} (\log y)^{-7} + 1 \gg_\varepsilon D^{\varepsilon/2}, \quad \text{and} \quad \log z \asymp \log y \asymp \log x.$$

Thus, incorporating (9.2) and (9.3) into (9.1), we again obtain (9.6) whenever $L(1,\chi) \ge (\log y)^{-2}$. By Proposition 8.1, this establishes (9.4) provided (9.7) holds and $-D \in \mathfrak{D}_\varepsilon(Q)$. This completes the proof in all cases. □

## 10. Representation of small integers with few prime factors

Proof of Theorem 1.6. We apply the beta sieve to the sequence $\mathcal{A} = \mathcal{A}(x,f)$ given by (6.2). Let $g$ be the local density function defined in Proposition 7.1, so $g(p) \le 2/p$. Thus, the sequence $\mathcal{A}$ is of dimension at most $\kappa = 2$ and has sifting limit $\beta = \beta(\kappa) < 4.85$ according to [7, Section 11.19]. Let $x \ge D/a$ and select

$$z := V^{10/49},$$

where $V = \sqrt{4ax/D} \ge 2$. Select the level of distribution to be $R = z^{485/100} > z^{\beta}$. Thus, by [7, Theorem 11.13] and Proposition 7.1, it follows that

$$\sum_{\substack{n \le x \\ (n,P(z))=1}} r_f(n) \gg \frac{x}{\sqrt{D}(\log x)^2} + O\Big(\sum_{\substack{\ell \mid P(z) \\ \ell < R}} |r_\ell|\Big), \tag{10.1}$$

where

$$|r_\ell| \ll \ell^{\eta} V + \ell^{1/2+\eta}\frac{\sqrt{D}}{a} V^{1/2} + 1$$

for fixed $\eta > 0$ sufficiently small. Since $f$ is reduced and $R = z^{485/100} = V^{97/98}$, we see that

$$\sum_{\substack{\ell \mid P(z) \\ \ell < R}} |r_\ell| \ll R^{1+\eta} V + \frac{\sqrt{D}}{a} R^{3/2+\eta} V^{1/2} + R^{1-\eta} \ll \frac{\sqrt{D}}{a} V^{2 - \frac{3}{196} + \eta} \ll \frac{x^{1 - \frac{3}{392} + \eta}}{\sqrt{D}}.$$

Thus, as $\eta > 0$ is sufficiently small,

$$\sum_{\substack{n \le x \\ (n,P(z))=1}} r_f(n) \gg \frac{x}{\sqrt{D}(\log x)^2}$$

for $x \ge D/a$. For an integer $k \ge 10$, observe that $z \ge x^{1/k}$ if and only if

$$\Big(\frac{ax}{D}\Big)^{5k/49} \ge x \iff x \ge \Big(\frac{D}{a}\Big)^{1 + \frac{49}{5k-49}}.$$

This completes the proof. □

## Acknowledgements

I would like to thank John Friedlander and Jesse Thorner for their encouragement and many helpful comments on an earlier version of this paper. I am also grateful to Kannan Soundararajan for several insightful conversations and for initially motivating me to pursue this approach.

## Funding

Part of this work was completed with the support of an NSERC Postdoctoral Fellowship.
## References

1. V. Blomer and A. Granville, Estimates for representation numbers of quadratic forms, Duke Math. J. 135 (2006), 261–302.
2. D. A. Burgess, On character sums and L-series. II, Proc. London Math. Soc. (3) 13 (1963), 524–536.
3. D. A. Cox, Primes of the form $x^2 + ny^2$: Fermat, class field theory, and complex multiplication, Pure and Applied Mathematics (Hoboken), 2nd edn, John Wiley & Sons, Inc., Hoboken, NJ, 2013.
4. K. Debaene, Explicit counting of ideals and a Brun-Titchmarsh inequality for the Chebotarev density theorem, arXiv preprint arXiv:1611.10103, 2016.
5. J. Ditchen, On the average distribution of primes represented by binary quadratic forms, arXiv preprint arXiv:1312.1502, 2013.
6. J. J. Ditchen, Primes of the shape $x^2 + ny^2$. The distribution on average and prime number races, PhD thesis, ETH Zürich, 2013.
7. J. Friedlander and H. Iwaniec, Opera de cribro, volume 57 of American Mathematical Society Colloquium Publications, American Mathematical Society, Providence, RI, 2010.
8. E. Fogels, On the zeros of Hecke's L-functions. I, II, Acta Arith. 7 (1961/1962), 131–147.
9. D. R. Heath-Brown, Zero-free regions for Dirichlet L-functions, and the least prime in an arithmetic progression, Proc. London Math. Soc. (3) 64 (1992), 265–338.
10. D. R. Heath-Brown, A mean value estimate for real character sums, Acta Arith. 72 (1995), 235–275.
11. H. Iwaniec and E. Kowalski, Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications, American Mathematical Society, Providence, RI, 2004.
12. Y. Ihara, V. Kumar Murty and M. Shimura, On the logarithmic derivatives of Dirichlet L-functions at s = 1, Acta Arith. 137 (2009), 253–276.
13. M. Jutila, On mean values of Dirichlet polynomials with real characters, Acta Arith. 27 (1975), 191–198. Collection of articles in memory of Juriĭ Vladimirovič Linnik.
14. E. Kowalski and P. Michel, Zeros of families of automorphic L-functions close to 1, Pacific J. Math. 207 (2002), 411–431.
15. J. C. Lagarias, H. L. Montgomery and A. M. Odlyzko, A bound for the least prime ideal in the Chebotarev density theorem, Invent. Math. 54 (1979), 271–296.
16. J. C. Lagarias and A. M. Odlyzko, Effective versions of the Chebotarev density theorem, 1977, pp. 409–464.
17. F. Luca and L. Tóth, The rth moment of the divisor function: an elementary approach, J. Integer Seq. 20 (2017), 17.7.4.
18. M. Mourtada and V. Kumar Murty, Omega theorems for $L'/L(1,\chi_D)$, Int. J. Number Theory 9 (2013), 561–581.
19. H. L. Montgomery and R. C. Vaughan, The large sieve, Mathematika 20 (1973), 119–134.
20. H. L. Montgomery and R. C. Vaughan, Multiplicative number theory. I. Classical theory, volume 97 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge, 2007.
21. N. Tschebotareff, Die Bestimmung der Dichtigkeit einer Menge von Primzahlen, welche zu einer gegebenen Substitutionsklasse gehören, Math. Ann. 95 (1926), 191–228.
22. J. Thorner and A. Zaman, A Chebotarev Variant of the Brun-Titchmarsh Theorem and Bounds for the Lang-Trotter conjectures, Int. Math. Res. Not. (2017). https://doi.org/10.1093/imrn/rnx031
23. J. Thorner and A. Zaman, An explicit bound for the least prime ideal in the Chebotarev density theorem, Algebra Number Theory 11 (2017), 1135–1197.
24. A. Weiss, The least prime ideal, J. Reine Angew. Math. 338 (1983), 56–94.
25. D. B. Zagier, Zetafunktionen und quadratische Körper: Eine Einführung in die höhere Zahlentheorie [An introduction to higher number theory], Hochschultext [University Text], Springer-Verlag, Berlin-New York, 1981.

© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

### Journal

The Quarterly Journal of Mathematics, Oxford University Press

Published: Dec 1, 2018
