Rate-splitting multiple access for downlink communication systems: bridging, generalizing, and outperforming SDMA and NOMA

Space-division multiple access (SDMA) utilizes linear precoding to separate users in the spatial domain and relies on fully treating any residual multi-user interference as noise. Non-orthogonal multiple access (NOMA) uses linearly precoded superposition coding with successive interference cancellation (SIC) to superpose users in the power domain and relies on user grouping and ordering to force some users to fully decode and cancel interference created by other users. In this paper, we argue that to efficiently cope with the high throughput, heterogeneity of quality of service (QoS), and massive connectivity requirements of future multi-antenna wireless networks, multiple access design needs to depart from those two extreme interference management strategies, namely fully treating interference as noise (as in SDMA) and fully decoding interference (as in NOMA). Considering a multiple-input single-output broadcast channel, we develop a novel multiple access framework, called rate-splitting multiple access (RSMA). RSMA is a more general and more powerful multiple access for downlink multi-antenna systems that contains SDMA and NOMA as special cases. RSMA relies on linearly precoded rate-splitting with SIC to decode part of the interference and treat the remaining part of the interference as noise. This capability of RSMA to partially decode interference and partially treat interference as noise makes it possible to softly bridge the two extremes of fully decoding interference and treating interference as noise and provides room for rate and QoS enhancements and complexity reduction.
The three multiple access schemes are compared, and extensive numerical results show that RSMA provides a smooth transition between SDMA and NOMA and outperforms them both in a wide range of network loads (underloaded and overloaded regimes) and user deployments (with a diversity of channel directions, channel strengths, and qualities of channel state information at the transmitter). Moreover, RSMA provides rate and QoS enhancements over NOMA at a lower computational complexity for the transmit scheduler and the receivers (number of SIC layers).

Keywords: RSMA, NOMA, SDMA, MISO BC, Linear precoding, Rate region, Weighted sum rate, Rate splitting

Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133

*Correspondence: maoyijie@hku.hk
Department of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam Road, Hong Kong, China
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

1 Introduction

With the dramatic upsurge in the number of devices expected in 5G and beyond, wireless networks will be operated in a variety of regimes ranging from underloaded to overloaded (where the number of scheduled devices is smaller and larger than the number of transmit antennas at each access point, respectively). Moreover, due to the heterogeneity of devices (high-end such as smartphones and low-end such as Internet of Things and Machine-Type Communications devices), deployments, and applications in 5G and beyond, the transmitter will need to simultaneously serve users with different capabilities, deployments, and qualities of channel state information at the transmitter (CSIT). This massive connectivity problem, together with the demands for high throughput and heterogeneity of quality of service (QoS), has recently spurred interest in re-thinking multiple access for the downlink of communication systems.

In this paper, we propose a new multiple access called rate-splitting multiple access (RSMA). In order to fully assess the novelty of the proposed multiple access paradigm and the design philosophy, we first review the state of the art of two major multiple accesses, namely non-orthogonal multiple access (NOMA) [1], also called Multi-User Superposition Transmission (MUST) in 3GPP LTE Rel-13 [2], and space-division multiple access (SDMA). We identify their benefits and limitations and make critical observations, before motivating the introduction of the novel and more powerful RSMA.

1.1 SDMA and NOMA: the extremes

Contrary to orthogonal multiple access (OMA) that schedules users or groups of users in orthogonal dimensions, e.g., time (TDMA) and frequency (FDMA), NOMA superposes users in the same time-frequency resource via the power domain or the code domain, leading to the power-domain NOMA (e.g., [1]) and code-domain NOMA (e.g., sparse code multiple access (SCMA) [3]). Power-domain NOMA relies on superposition coding (SC) at the transmitter and successive interference cancellation (SIC) at the receivers (denoted in short as SC–SIC) [1, 4–6]. Such a strategy is motivated by the well-known result that SC–SIC achieves the capacity region of the single-input single-output (SISO) (Gaussian) broadcast channel (BC) [7, 8]. It is also well known that the capacity region of the SISO BC is larger than the rate region achieved by OMA (e.g., TDMA) when users experience a disparity of channel strengths [8]. On the other hand, when users exhibit the same channel strengths, OMA based on TDMA is sufficient to achieve the capacity region [8].

The benefit of a single-antenna NOMA using SC–SIC is therefore to be able, despite the presence of a single transmit antenna in a SISO BC, to cope with an overloaded regime in a spectrally efficient manner where multiple users experience potentially very different channel strengths/path losses (e.g., cell-center users and cell-edge users) on the same time/frequency resource.

The limitation of a single-antenna NOMA lies in its complexity as the number of users grows. Indeed, for a K-user SISO BC, the strongest user needs to decode using SIC the K − 1 messages of all co-scheduled users and therefore peel off K − 1 layers before accessing its intended stream. Though SIC of a small number of layers should be feasible in practice, the complexity and likelihood of error propagation quickly become significant for a large number of users. This calls for ways to decrease the number of SIC layers at each user. One could divide users into small groups of users with disparate channels, apply SC–SIC in each group, and schedule groups on orthogonal resources (using OMA), but that may lead to some performance loss and latency increase.

In today's wireless networks, access points are often equipped with more than one antenna. This spatial dimension opens the door to another well-known type of multiple access, namely SDMA. SDMA superposes users in the same time-frequency resource and separates users via a proper use of the spatial dimensions. Contrary to the SISO BC, the multi-antenna BC is non-degraded, i.e., users cannot be ordered based on their channel strengths in general settings. This is the reason why SC–SIC is not capacity-achieving, and the complex dirty paper coding (DPC) is the only strategy that achieves the capacity region of the multiple-input single-output (MISO) (Gaussian) BC with perfect CSIT [9]. DPC, rather than performing interference cancellation at the receivers as in SC–SIC, can be viewed as a form of enhanced interference cancellation at the transmitter and relies on perfect CSIT to do so. Due to the high computational burden of DPC, linear precoding is often considered the most attractive alternative to simplify the transmitter design [10]. Interestingly, in a MISO BC, multi-user linear precoding (MU–LP), e.g., either in closed form or optimized using optimization methods, though suboptimal, is often very useful when users experience relatively similar channel strengths or long-term signal-to-noise ratio (SNR) and have semi-orthogonal to orthogonal channels [11]. SDMA is therefore commonly implemented using MU–LP. The linear precoders create different beams, with each beam being allocated a fraction of the total transmit power. Hence, similarly to NOMA, SDMA can also be viewed as a superposition of users in the power domain, though users are separated at the transmitter side by spatial beamformers rather than by the use of SIC at the receivers.

SDMA based on MU–LP is a well-established multiple access that is nowadays the basic principle behind numerous techniques in 4G and 5G such as multi-user multiple-input multiple-output (MU–MIMO), coordinated multipoint (CoMP) coordinated beamforming, network MIMO, millimeter-wave MIMO, and massive MIMO.

The benefit of SDMA using MU–LP is therefore to reap all spatial multiplexing benefits of a MISO BC with perfect CSIT with a low precoder and receiver complexity.

The limitations of SDMA are threefold.

First, it is suited to the underloaded regime, and the performance of MU–LP in the overloaded regime quickly drops as it requires more transmit antennas than users to be able to efficiently manage the multi-user interference. When the MISO BC becomes overloaded, the current and popular approach for the transmitter is to schedule groups of users over orthogonal dimensions (e.g., time/frequency) and perform linear precoding in each group, which may increase latency and decrease QoS depending on the application.

Second, its performance is sensitive to the user channel orthogonality and strengths and requires the scheduler to pair semi-orthogonal users with similar channel strengths together. The complexity of the scheduler can quickly increase when an exhaustive search is performed, though low-complexity (suboptimal) scheduling and user-pairing algorithms exist [10].

Third, it is optimal from a degrees of freedom (DoF), also known as spatial multiplexing gain, perspective in the perfect CSIT setting but not in the presence of imperfect CSIT [12]. The problem of SDMA design in the presence of imperfect CSIT has been to strive to apply a framework motivated by perfect CSIT to scenarios with imperfect CSIT, not to design a framework motivated by imperfect CSIT from the beginning [12]. This leads to the well-known severe performance loss of MU–LP in the presence of imperfect CSIT [13].

In view of SC–SIC benefits in a SISO BC, attempts have been made to study multi-antenna NOMA. Two lines of research have emerged that both rely on linearly precoded SC–SIC.

The first strategy, which we simply denote as “SC–SIC,” is a direct application of SC–SIC to the MISO BC by degrading the multi-antenna broadcast channel. It consists in ordering users based on their effective scalar channel (after precoding) strengths and forcing receivers to decode messages (and cancel interference) in a successive manner. This is advocated and exemplified for instance in [14–17]. This NOMA strategy converts the multi-antenna non-degraded channel into an effective single-antenna degraded channel, as at least one receiver ends up decoding all messages. While such a strategy can cope with the deployment of users experiencing aligned channels and different path loss conditions, it comes at the expense of sacrificing and annihilating all spatial multiplexing gains in general settings. By forcing one receiver to decode all streams, the sum DoF is reduced to unity. This is the same DoF as that achieved by TDMA/single-user beamforming (or OMA). This is significantly smaller than the sum DoF achieved by DPC and MU–LP in a MISO BC with perfect CSIT, which is the minimum of the number of transmit antennas and the number of users. Moreover, this loss in multiplexing gain comes with a significant increase in receiver complexity due to the multi-layer SIC compared to the treat-interference-as-noise strategy of MU–LP. As a remedy to recover the DoF loss, we could envision a dynamic switching between NOMA and SDMA, reminiscent of the dynamic switching between SU–MIMO and MU–MIMO in 4G [18]. One would dynamically choose the best option between NOMA and SDMA as a function of the channel states. A particular instance of this approach is taken in [19], where a dynamic switching between SC–SIC and zero-forcing beamforming (ZFBF) was investigated.

The second strategy, which we denote as “SC–SIC per group,” consists in grouping K users into G groups. Users within each group are served using SC–SIC, and users across groups are served using SDMA so as to mitigate the inter-group interference. Examples of such a strategy can be found in [1, 20–24]. This strategy can therefore be seen as a combination of SDMA and NOMA where the multi-antenna system is effectively decomposed into G hopefully non-interfering single-antenna NOMA channels. For this “SC–SIC per group” approach to perform at its best, users within each group need to have their channels aligned and users across groups need to be orthogonal.

Similarly to SDMA, multi-antenna NOMA designs also rely on accurate CSIT. In the practical scenario of imperfect CSIT, NOMA design relies on the same two strategies above but optimizes the precoder so as to cope with CSIT imperfection and the resulting extra multi-user interference. As an example, the MISO BC channel is again degraded in [17] and precoder optimization with imperfect CSIT is studied.

The benefit of multi-antenna NOMA, similarly to the single-antenna NOMA, is the potential to cope with an overloaded regime where multiple users experience different channel strengths/path losses and/or are closely aligned with each other.

The limitations of multi-antenna NOMA are fourfold.

First, the use of SC–SIC in NOMA is fundamentally motivated by a degraded BC in which users can be ordered based on their channel strengths. This is the key property of the SISO BC that enables SC–SIC to achieve its capacity region. Unfortunately, motivated by the promising gains of SC–SIC in a SISO BC, the multi-antenna NOMA literature strives to apply SC–SIC to a non-degraded MISO BC. This forces a non-degraded BC to be degraded and therefore leads to an inefficient use of the spatial dimensions in general settings, leading to a DoF loss.

Second, NOMA is not suited for general user deployments since degrading a MISO BC is efficient when users are sufficiently aligned with each other and exhibit a disparity of channel strengths, not in general settings.

Third, multi-antenna NOMA comes with an increase in complexity at both the transmitter and the receivers. Indeed, a multi-layer SIC is needed at the receivers, similarly to the single-antenna NOMA. However, in addition, since there exists no natural order for the users’ channels in multi-antenna NOMA (because we deal with vectors rather than scalars), the precoders, the groups, and the decoding orders have to be jointly optimized by the scheduler at the transmitter. Taking as an example the application of NOMA based on “SC–SIC” to a three-user MISO BC, we need to optimize three precoders, one for each user, along with the six possible decoding orders. Increasing the number of users leads to an exponential increase in the number of possible decoding orders. “SC–SIC per group” divides users into multiple groups, but that approach leads to a joint design of user ordering and user grouping. To decrease the complexity in user ordering and user grouping, multi-antenna NOMA (SC–SIC and SC–SIC per group) forces users belonging to the same group to share the same precoder (beamforming vector) [1]. Unfortunately, such a restriction can only further hurt the overall performance since it shrinks the overall optimization space.

Fourth, multi-antenna NOMA is subject to the same drawback as SDMA in the presence of imperfect CSIT, namely its design is not motivated by any fundamental limits of a MISO BC with imperfect CSIT.

The key is to recognize that the limitations and drawbacks of SDMA and NOMA originate from the fact that those two multiple accesses fundamentally rely on two extreme interference management strategies, namely fully treating interference as noise and fully decoding interference. Indeed, while NOMA relies on some users to fully decode and cancel interference created by other users, SDMA relies on fully treating any residual multi-user interference as noise. In the presence of imperfect CSIT, CSIT inaccuracy results in an additional multi-user interference that is treated as noise by both NOMA (SC–SIC per group) and SDMA.

1.2 RSMA: bridging the extremes

In contrast, with RSMA, we take a different route and depart from the SDMA and NOMA literature and those two extremes of fully decoding interference and treating interference as noise. We introduce a more general and powerful multiple access framework based on linearly precoded rate splitting (RS) at the transmitter and SIC at the receivers. This makes it possible to decode part of the interference and treat the remaining part of the interference as noise [12]. This capability of RSMA to partially decode interference and partially treat interference as noise makes it possible to softly bridge the two extreme strategies of fully treating interference as noise and fully decoding interference. This contrasts sharply with SDMA and NOMA that exclusively rely on the two extremes or a combination thereof.

In order to partially decode interference and partially treat interference as noise, RS splits messages into common and private messages and relies on a superimposed transmission of common messages decoded by multiple users and private messages decoded by their corresponding users (and treated as noise by co-scheduled users). Users rely on SIC to first decode the common messages before accessing the private messages. By adjusting the message split and the power allocation to the common and private messages, RS has the ability to softly bridge the two extremes of fully treating interference as noise and fully decoding interference.

The idea of RS dates back to Carleial’s work and the Han and Kobayashi (HK) scheme for the two-user single-antenna interference channel (IC) [25]. However, the use of RS as the building block of RSMA is motivated by recent works that have shown the benefit of RS in multi-antenna BC and the recent progress on characterizing the fundamental limits of a multi-antenna BC (and IC) with imperfect CSIT. Hence, importantly, in contrast with the conventional RS (HK scheme) used for the two-user SISO IC, we here use RS in a different setup, namely (1) in a BC and (2) with multiple antennas. The use and benefits of RS in a multi-antenna BC only appeared in the last few years.

The capacity region of the K-user MISO BC with imperfect CSIT remains an open problem. As an alternative, recent progress has been made to characterize the DoF region of the underloaded and overloaded MISO BC with imperfect CSIT. In [26], a novel information theoretic upper bound on the sum DoF of the K-user underloaded MISO BC with imperfect CSIT was derived. Interestingly, this sum DoF coincides with the sum DoF achieved by a linearly precoded RS strategy at the transmitter with SIC at the receivers [27, 28]. RS (with SIC) is therefore optimum to achieve the sum DoF of the K-user underloaded MISO BC with imperfect CSIT, in contrast with MU–LP that is clearly suboptimum (and so is SC–SIC since it achieves a sum DoF of unity) [28]. It turns out that RS with a flexible power allocation is not only optimum for the sum DoF but for the entire DoF region of an underloaded MISO BC with imperfect CSIT [29]. The DoF benefits of RS in imperfect CSIT settings were also shown in more complicated underloaded networks with multiple transmitters in [30] and multi-antenna receivers [31]. Considering user fairness, the optimum symmetric DoF (or max-min DoF), i.e., the DoF that can be achieved by all users simultaneously, of the underloaded MISO BC with imperfect CSIT with MU–LP and RS was studied in [32]. RS symmetric DoF was shown to outperform that of MU–LP. Finally, moving to the overloaded MISO BC with heterogeneous CSIT qualities, a multi-layer power partitioning strategy that superimposes degraded symbols on top of linearly precoded rate-split symbols was shown in [33] to achieve the optimal DoF region.

The benefits of RS have also appeared in multi-antenna settings with perfect CSIT. In an overloaded multigroup multicast setting with perfect CSIT, considering again fairness, the symmetric DoF achieved by RS, MU–LP, and degraded NOMA transmissions (where receivers decode messages and cancel interference in a successive manner as in SC–SIC) was studied in [34]. It was shown that RS here again outperforms both MU–LP and SC–SIC.

The DoF metric is insightful to identify the multiplexing gains of the MISO BC at high SNR but fails to capture the diversity of channel strengths among users. This limitation is countered by the generalized DoF (GDoF) framework, which inherits the tractability of the DoF framework while capturing the diversity in channel strengths [35]. In [36, 37], the GDoF of an underloaded MISO BC with imperfect CSIT is studied, and here again, RS is used as part of the achievability scheme.

The DoF (GDoF) superiority of RS over MU–LP and SC–SIC in all those multi-antenna settings (with perfect and imperfect CSIT) comes from the ability of RS to better handle the multi-user interference by evolving in a regime in between the extremes of fully treating it as noise and fully decoding it.

Importantly, the rate enhancements of RS over MU–LP, as predicted by the DoF analysis, are reflected in the finite SNR regime as shown in a number of recent works. In [38], the finite SNR rate of RS in a MISO BC in the presence of quantized feedback was analyzed, and it was shown that RS benefits from a CSI feedback overhead reduction compared to MU–LP. Using optimization methods, the precoder design of RS at finite SNR was investigated in [28] for the sum rate and rate region maximization with imperfect CSIT, in [32] for max-min fair transmission with imperfect CSIT, and in [34] for multi-group multi-cast with perfect CSIT. Moreover, the benefit of RS over MU–LP in the finite SNR regime was shown in massive MIMO [39], millimeter-wave systems [40], and multi-antenna deployments subject to hardware impairments [41]. Finally, the performance benefits of the power-partitioning strategy relying on RS in the overloaded MISO BC with heterogeneous CSIT were confirmed using simulations at finite SNR in the presence of a diversity of channel strengths [33]. In particular, in contrast to the RS used in [12, 28, 29, 32–34, 38, 40, 41] that relies on a single common message, [39] (as well as [30]) showed the benefits in the finite SNR regime of a multi-layer (hierarchical) RS relying on multiple common messages decoded by various groups of users.

In this paper, in view of the limitations of SDMA and NOMA and the above literature on RS in multi-antenna BC, we design a novel multiple access, called rate-splitting multiple access (RSMA), for downlink communication systems. RSMA is a much more attractive solution (performance and complexity-wise) that retains the benefits of SDMA and NOMA but tackles all the aforementioned limitations of SDMA and NOMA. Considering a MISO BC, we make the following contributions.

First, we show that RSMA is a more general class/framework of multi-user transmission that encompasses SDMA and NOMA as special cases. RSMA is shown to reduce to SDMA if channels are of similar strengths and sufficiently orthogonal with each other and to NOMA if channels exhibit sufficiently diverse strengths and are sufficiently aligned with each other. This is the first paper to explicitly recognize that SDMA and NOMA are both subsets of a more general transmission framework based on RS.

Second, we provide a general framework of multi-layer RS design that encompasses existing RS schemes as special cases. In particular, the single-layer RS of [28, 29, 32–34, 38, 40, 41] and the multi-layer (hierarchical and topological) RS of [30, 39] are special instances of the generalized RS strategy developed here. Moreover, the use of RS was primarily motivated by multi-antenna deployments subject to multi-user interference due to imperfect CSIT in those works. The benefit of RS in the presence of perfect CSIT and/or a diversity of channel strengths in a multi-antenna setup, as considered in this paper, is less investigated. RS was shown in [34] to boost the performance of overloaded multi-group multi-cast. However, no attempt has been made so far to identify the benefit of RS in multi-antenna BC with perfect CSIT and/or a diversity of channel strengths.

Third, we show that the rate performance (rate region, weighted sum-rate with and without QoS constraints) of RSMA is always equal to or larger than that of SDMA and NOMA. Considering a MISO BC with perfect CSIT and no QoS constraints, RSMA performance comes closer to the optimal DPC region than SDMA and NOMA. In scenarios with QoS constraints or imperfect CSIT, RSMA always outperforms SDMA and NOMA. Since it is motivated by fundamental DoF analysis, RSMA is also optimal from a DoF perspective in both perfect and imperfect CSIT and therefore optimally exploits the spatial dimensions and the availability of CSIT, in contrast with SDMA and NOMA that are suboptimal.

Fourth, we show that RSMA is much more robust than SDMA and NOMA to user deployments, CSIT inaccuracy, and network load. It can operate in a wide range of practical deployments involving scenarios where the user channels are neither orthogonal nor aligned and exhibit similar strengths or a diversity of strengths, where the CSI is perfectly or imperfectly known to the transmitter, and where the network load can vary between the underloaded and the overloaded regimes. In particular, in the overloaded regime, the RSMA framework is shown to be particularly suited to cope with a variety of device capabilities, e.g., high-end devices along with cheap Internet-of-Things (IoT)/Machine-Type Communications (MTC) devices. Indeed, the RS framework can be used to pack the IoT/MTC traffic in the common message, while still delivering high-quality service to high-end devices.

Fifth, we show that the performance gain can come with a lower computational complexity than NOMA for both the transmit scheduler and the receivers. In contrast to NOMA that requires complicated user grouping and ordering and potential dynamic switching (between SDMA, SC–SIC and SC–SIC per group) at the transmit scheduler and multiple layers of SIC at the receivers, a simple one-layer RS that does not require any user ordering, grouping, or dynamic switching at the transmit scheduler and a single layer of SIC at the receivers still significantly outperforms NOMA. In contrast to SDMA, RSMA is less sensitive to user pairing and therefore does not require complex user scheduling and pairing. However, RSMA comes with a slightly higher encoding complexity than SDMA and NOMA due to the encoding of the common streams on top of the private streams.

Sixth, though SC–SIC is optimal to achieve the capacity region of SISO BC, we show that a single-layer RS is a low-complexity alternative that only requires a single layer of SIC at each receiver and achieves close to SC–SIC (with multi-layer SIC) performance in a SISO BC deployment.

As a takeaway message, we note that the ability of a wireless network architecture to partially decode interference and partially treat interference as noise can lead to enhanced throughput and QoS, increased robustness, and lowered complexity compared to alternatives that are forced to operate in the extreme regimes of fully treating interference as noise and fully decoding interference.

It is also worth making the analogy with other types of channels where the ability to bridge the extremes of treating interference as noise and fully decoding interference has appeared. Considering a two-user SISO IC, interference is fully decoded in the strong interference regime and is treated as noise in the weak interference regime. Between those two extremes, interference is neither strong enough to be fully decoded nor weak enough to be treated as noise. The best known strategy for the two-user SISO IC is obtained using RS (so-called HK scheme). RS in this context is well known to be superior to strategies relying on fully treating interference as noise, fully decoding interference, or orthogonalization (TDMA, FDMA) [25, 35]. Limiting ourselves to those extreme strategies is suboptimal [25, 35].

The rest of the paper is organized as follows. The system model is described in Section 2. The existing multiple accesses are specified in Section 3. In Section 4, the proposed RSMA and its low-complexity structures are described and compared with existing multiple accesses. The corresponding weighted sum rate (WSR) problems are formulated, and the weighted MMSE (WMMSE) approach to solve the problem is discussed. Numerical results are illustrated in Section 5, followed by conclusions and future works in Section 6.

Notations: The boldface uppercase and lowercase letters are used to represent matrices and vectors. The superscripts $(\cdot)^T$ and $(\cdot)^H$ denote transpose and conjugate-transpose operators, respectively. $\mathrm{tr}(\cdot)$ and $\mathrm{diag}(\cdot)$ are the trace and diagonal entries, respectively. $|\cdot|$ is the absolute value, and $\|\cdot\|$ is the Euclidean norm. $\mathbb{E}\{\cdot\}$ refers to the statistical expectation. $\mathbb{C}$ denotes the complex space. $\mathbf{I}$ and $\mathbf{0}$ stand for an identity matrix and an all-zero vector, respectively, with appropriate dimensions. $\mathcal{CN}(\delta, \sigma^2)$ represents a complex Gaussian distribution with mean $\delta$ and variance $\sigma^2$. $|\mathcal{A}|$ is the cardinality of the set $\mathcal{A}$.

2 System model

Consider a system where a base station (BS) equipped with $N_t$ antennas serves $K$ single-antenna users. The users are indexed by the set $\mathcal{K} = \{1, \ldots, K\}$. Let $\mathbf{x} \in \mathbb{C}^{N_t \times 1}$ denote the signal vector transmitted in a given channel use. It is subject to the power constraint $\mathbb{E}\{\|\mathbf{x}\|^2\} \le P_t$. The signal received at user-$k$ is

$$y_k = \mathbf{h}_k^H \mathbf{x} + n_k, \quad \forall k \in \mathcal{K} \tag{1}$$

where $\mathbf{h}_k \in \mathbb{C}^{N_t \times 1}$ is the channel between the BS and user-$k$. $n_k \sim \mathcal{CN}(0, \sigma_{n,k}^2)$ is the additive white Gaussian noise (AWGN) at the receiver. Without loss of generality, we assume the noise variances are equal to one for all users. The transmit SNR is equal to the total power consumption $P_t$. We assume CSI of users is perfectly known at the BS in the following model. The imperfect CSIT scenario will be discussed in the proposed algorithm and the numerical results. Channel state information at the receivers (CSIR) is assumed to be perfect.

In this work, we are interested in beamforming designs for signal $\mathbf{x}$ at the BS. Specifically, the objective of beamforming designs is to maximize the WSR of users subject to a power constraint of the BS and QoS constraints of each user. We first state and compare two baseline multi-antenna multiple accesses, namely SDMA and NOMA. Then, RSMA is explained. The WSR problem of each strategy will be formulated, and the algorithm adopted to solve the corresponding problem will be stated in the following sections.

3 SDMA and NOMA

In this section, we describe two baseline multiple accesses. The messages $W_1, \ldots, W_K$ intended for users 1 to $K$, respectively, are encoded independently into $K$ independent data streams $\mathbf{s} = [s_1, \ldots, s_K]^T$. Symbols are mapped to the transmit antennas through a precoding matrix denoted by $\mathbf{P} = [\mathbf{p}_1, \ldots, \mathbf{p}_K]$, where $\mathbf{p}_k \in \mathbb{C}^{N_t \times 1}$ is the precoder for user-$k$. The superposed signal is $\mathbf{x} = \mathbf{P}\mathbf{s} = \sum_{k \in \mathcal{K}} \mathbf{p}_k s_k$. Assuming that $\mathbb{E}\{\mathbf{s}\mathbf{s}^H\} = \mathbf{I}$, the transmit power is constrained by $\mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t$.

3.1 SDMA

SDMA based on MU–LP is a well-established multiple access. Each user only decodes its desired message by treating interference as noise. The signal-to-interference-plus-noise ratio (SINR) at user-$k$ is given by

$$\gamma_k = \frac{|\mathbf{h}_k^H \mathbf{p}_k|^2}{\sum_{j \ne k,\, j \in \mathcal{K}} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{2}$$

For a given weight vector $\mathbf{u} = [u_1, \ldots, u_K]$, the WSR achieved by MU–LP is

$$R_{\mathrm{MU-LP}}(\mathbf{u}) = \max_{\mathbf{P}} \sum_{k \in \mathcal{K}} u_k R_k \quad \text{s.t.} \quad \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t, \quad R_k \ge R_k^{th},\ \forall k \in \mathcal{K} \tag{3}$$

where $R_k = \log_2(1 + \gamma_k)$ is the achievable rate of user-$k$. $u_k$ is a non-negative constant which allows resource allocation to prioritize different users. $R_k^{th}$ accounts for any potential individual rate constraint for user-$k$. It ensures the QoS of each user. The WMMSE algorithm proposed in [42] is adopted to solve problem (3). The main idea of the WMMSE algorithm is to reformulate the WSR problem into its equivalent WMMSE problem and solve it using the alternating optimization (AO) approach. The rate region of the MU–LP strategy is approximated by $R_{\mathrm{MU-LP}}(\mathbf{u})$ for different rate weight vectors $\mathbf{u}$. The resulting rate region $R_{\mathrm{MU-LP}}$ is the convex hull enclosing the resulting points.

In general, the solution to problem (3) would provide the optimal MU–LP beamforming strategy for any channel deployment (in between aligned and orthogonal channels and with similar or diverse channel strengths).

3.2 NOMA

NOMA relies on superposition coding at the transmitter and successive interference cancellation at the receiver. As discussed in the introduction, the two main strategies in multi-antenna NOMA are the SC–SIC and SC–SIC per group. SC–SIC can be treated as a special case of SC–SIC per group where there is only one group of users.

3.2.1 SC–SIC

In SC–SIC, the precoders and decoding orders have to be optimized jointly. The decoding order is vital to the rate obtained at each user. To maximize the WSR, all possible decoding orders of users are required to be considered. Denote by $\pi$ one of the decoding orders; the message of user-$\pi(k)$ is decoded before the message of user-$\pi(j)$, $\forall k \le j$. The messages of user-$\pi(k)$, $\forall k \le i$ are decoded at user-$\pi(i)$ using SIC. The SINR experienced at user-$\pi(i)$ to decode the message of user-$\pi(k)$, $k \le i$ is given by

$$\gamma_{\pi(i) \to \pi(k)} = \frac{|\mathbf{h}_{\pi(i)}^H \mathbf{p}_{\pi(k)}|^2}{\sum_{j > k,\, j \in \mathcal{K}} |\mathbf{h}_{\pi(i)}^H \mathbf{p}_{\pi(j)}|^2 + 1}. \tag{4}$$

For a given weight vector $\mathbf{u} = [u_1, \ldots, u_K]$ and a fixed decoding order $\pi$, the WSR achieved by SC–SIC is

$$R_{\mathrm{SC-SIC}}(\mathbf{u}, \pi) = \max_{\mathbf{P}} \sum_{k \in \mathcal{K}} u_{\pi(k)} R_{\pi(k)} \quad \text{s.t.} \quad \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t, \quad R_k \ge R_k^{th},\ \forall k \in \mathcal{K} \tag{5}$$

where $R_{\pi(k)} = \min_{i \ge k,\, i \in \mathcal{K}} \{\log_2(1 + \gamma_{\pi(i) \to \pi(k)})\}$. In [14], the problem (5) with equal weights is solved by the approximation technique minorization-maximization algorithm (MMA). To keep a single and unified approach to solve the WSR problem of different beamforming strategies, we still use the WMMSE algorithm to solve it. By approximating the rate region with a set of rate weights, the rate region $R_{\mathrm{SC-SIC}}(\pi)$ with a certain decoding order $\pi$ is attained. To achieve the rate region of SC–SIC, all decoding orders should be considered. The largest achievable rate region of SC–SIC is defined as the convex hull of the union over all decoding orders as $R_{\mathrm{SC-SIC}} = \mathrm{conv}\left(\bigcup_{\pi} R_{\mathrm{SC-SIC}}(\pi)\right)$.

3.2.2 SC–SIC per group

Assume the $K$ users are divided into $G$ groups, denoted as $\mathcal{G} = \{1, \ldots, G\}$. In each group, there is a subset of users $\mathcal{K}_g$, $g \in \mathcal{G}$. The user groups satisfy the following conditions: $\mathcal{K}_g \cap \mathcal{K}_{g'} = \emptyset$ if $g \ne g'$, and $\sum_{g \in \mathcal{G}} |\mathcal{K}_g| = K$. Denote by $\pi_g$ one of the decoding orders of the users in $\mathcal{K}_g$; the message of user-$\pi_g(k)$ is decoded before the message of user-$\pi_g(j)$, $\forall k \le j$. The messages of user-$\pi_g(k)$, $\forall k \le i$ are decoded at user-$\pi_g(i)$ using SIC. The SINR experienced at user-$\pi_g(i)$ to decode the message of user-$\pi_g(k)$, $k \le i$ is given by

$$\gamma_{\pi_g(i) \to \pi_g(k)} = \frac{|\mathbf{h}_{\pi_g(i)}^H \mathbf{p}_{\pi_g(k)}|^2}{\sum_{j > k,\, j \in \mathcal{K}_g} |\mathbf{h}_{\pi_g(i)}^H \mathbf{p}_{\pi_g(j)}|^2 + I_{\pi_g(i)} + 1}, \tag{6}$$

where $I_{\pi_g(i)} = \sum_{g' \in \mathcal{G},\, g' \ne g} \sum_{j \in \mathcal{K}_{g'}} |\mathbf{h}_{\pi_g(i)}^H \mathbf{p}_j|^2$ is the inter-group interference suffered at user-$\pi_g(i)$. For a given weight vector $\mathbf{u} = [u_1, \ldots, u_K]$, a fixed grouping method $\mathcal{G}$, and a fixed decoding order $\pi = \{\pi_1, \ldots, \pi_G\}$, the WSR achieved by SC–SIC per group is

$$R_{\mathrm{SC-SIC}}^{\mathrm{group}}(\mathbf{u}, \mathcal{G}, \pi) = \max_{\mathbf{P}} \sum_{g \in \mathcal{G}} \sum_{k \in \mathcal{K}_g} u_{\pi_g(k)} R_{\pi_g(k)} \quad \text{s.t.} \quad \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t, \quad R_k \ge R_k^{th},\ \forall k \in \mathcal{K} \tag{7}$$

where $R_{\pi_g(k)} = \min_{i \ge k,\, i \in \mathcal{K}_g} \{\log_2(1 + \gamma_{\pi_g(i) \to \pi_g(k)})\}$. Similarly to the SC–SIC strategy, the problem can be solved by using the WMMSE algorithm. To maximize the WSR, all possible grouping methods and decoding orders should be considered.

Remark 1: As described in the introduction, it is common in the multi-antenna NOMA literature (SC–SIC and SC–SIC per group) to force users belonging to the same group to share the same precoder, so as to decrease the complexity in user ordering and user grouping. Note that, in the system model described for both SC–SIC and SC–SIC per group, we consider the most general framework where each message is precoded by its own precoder. Hence, we here do not constrain symbols to be superimposed on the same

as noise. Therefore, each user decodes part of the message of the other interfering user encoded in s .
The interfer- precoder as this would further reduce the performance of ence is partially decoded at each user. The SINR of the NOMA strategies and therefore leading to even lower per- common stream at user-k is formance. Hence, the performance obtained with NOMA h p 12 k in this work can be seen as the best possible performance γ = .(8) 2 2 H H h p + h p + 1 achieved by NOMA. 1 2 k k Once s is successfully decoded, its contribution to the 4 Methods—proposed rate-splitting multiple original received signal y is subtracted.After that,user-k access decodes its private stream s by treating the private stream In this section, we firstly introduce the idea of RS by of user-j (j = k) as noise. The two-user transmission introducing a two-user example (K = 2) and a three- model using RS is shown in Fig. 1. The SINR of decoding user example (K = 3). Then, we propose the general- the private stream s at user-k is ized framework of RS and specify two low-complexity RS h p strategies. We further compare RSMA with SDMA and γ = .(9) NOMA from the fundamental structure and complex- h p + 1 ity aspects. Finally, we discuss the general optimization The corresponding achievable rates of user-k for the framework to solve the WSR problem. 12 12 streams s and s are R = log 1 + γ and k k R = log (1 + γ ).Toensurethat s is successfully k k 4.1 Two-user example decoded by both users, the achievable common rate shall 12 12 We first consider a two-user example. There are two not exceed R = min R , R . All boundary points for 1 2 messages W and W intended foruser-1and user-2, 1 2 thetwo-userRSrateregioncan be obtained by assuming respectively. The message of each user is split into two that R is shared between users such that C is the kth 12 1 12 2 parts, W , W for user-1 and W , W for user-2. 12 12 1 1 2 2 user’s portion of the common rate with C + C = R . 
1 2 12 12 The messages W , W are encoded together into a com- 1 2 Following the two-user RS structure described above, the mon stream s using a codebook shared by both users. 12 total achievable rate of user-k is R = C + R .For a k,tot k Hence, s is a common stream required to be decoded by given pair of weights u = [u , u ], the WSR achieved by 1 2 1 2 both users. The messages W and W are encoded into 1 2 the two-user RS approach is the private stream s for user-1 and s for user-2, respec- 1 2 R (u) = max u R + u R (10a) RS 1 1,tot 2 2,tot tively. The overall data streams to be transmitted based on P,c RS is s =[ s , s , s ] . The data streams are linearly pre- 12 1 2 12 12 s.t. C + C ≤ R (10b) N ×1 12 t 1 2 coded via precoder P =[ p , p , p ],where p ∈ C 12 1 2 12 is the precoder for the common stream s . The resulting tr PP ≤ P (10c) transmit signal is x = Ps = p s + p s + p s . th 12 12 1 1 2 2 R ≥ R , k ∈{1, 2} (10d) k,tot We assume that tr ss = I, and the total transmit c ≥ 0 (10e) power is constrained by tr PP ≤ P . 12 12 At user sides, both user-1 and user-2 firstly decode the where c = C , C is the common rate vector required 1 2 data stream s by treating the interference from s and s 12 1 2 to be optimized in order to maximize the WSR. For a Fig. 1 Two-user transmission model using RS Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 9 of 54 fixed pair of weights, problem (10)can be solved using W are correspondingly encoded with the split messages 12 13 the WMMSE approach in [28], except we have perfect of user-2 W and user-3 W into data streams s and 2 3 CSIT here. By calculating R (u) for a set of different rate s . s is the partial common stream intended for user-1 RS 13 12 weights u, we obtain the rate region. and user-2. 
Hence, user-1 and user-2 will decode s while In contrast to MU–LP and SC–SIC, the RS scheme user-3 will decode its intended streams by treating s as described above offers a more flexible formulation. In par- noise. Similarly, we obtain s partially encoded for user- 1 2 3 ticular, instead of hard switching between MU–LP and 2and user-3. W , W ,andW are encoded into private 1 2 3 SC–SIC, it allows both to operate simultaneously if neces- streams s , s ,and s ,respectively. 1 2 3 sary, and hence smoothly bridges the two. In the extreme The vector of data streams to be transmitted is of treating multi-user interference as noise, RS boils down s = s , s , s , s , s , s , s . After linear precoding [ ] 123 12 13 23 1 2 3 to MU–LP by simply allocating no power to the com- using precoder P = [p , p , p , p , p , p , p ],the 123 12 13 23 1 2 3 mon stream s . In the other extreme of fully decoding signals are superposed and broadcast. The decoding interference, RS boils down to SC–SIC by forcing one user, procedure when K = 3 is more complex comparing with say user-1, to fully decode the message of the other user, that in the two-user example. The main difference lies say user-2. This is achieved by allocating no power to s , in decoding partial common streams for two-users. encoding W into s and encoding W into s ,suchthat Define the streams to be decoded by l users as l-order 1 1 2 12 x = p s + p s . User-1 and user-2 decode s by treat- streams. The 2-order streams to be decoded at user-1 are 12 12 1 1 12 ing s as noise and user-1 decodes s after canceling s .A s ands . The 2-order streams to be decoded at user-2 1 1 12 12 13 physical-layer multicasting strategy is obtained by encod- and user-3 are s ands and s ands ,respectively.As 12 23 13 23 ing both W and W into s and allocating no power to s the 1-order and 2-order streams to be decoded at differ- 1 2 12 1 and s . ent users are not the same, we take user-1 as an example. 
The decoding procedure is the same for other users. Remark 2 : It should be noted that while the RS transmit User-1 decodes four streams s , s , s ,ands based on 123 12 13 1 signal model resembles a broadcasting system with uni- SIC while treating other streams as noise. The decoding cast (private) streams and a multi-cast stream, the role of procedure starts from the 3-order stream (common the common message is fundamentally different. The com- stream) and progresses downwards to the 1-order stream mon message in a unicast-multi-cast system carries public (private stream). Specifically, user-1 first decodes s and information intended as a whole to all users in the sys- subtracts its contribution from the received signal. The tem, while the common message s in RS encapsulates SINR of the stream s at user-1 is 12 123 parts of private messages, and is not entirely required by all users, although decoded by the two users for interference h p mitigation purposes [12]. 123 1 γ = . 2 2 H H h p + h p + 1 i∈{12,13,23} 1 k=1 1 Remark 3 : A general framework is adopted where poten- (11) tially each user can split its message into common and private parts. Note however that depending on the objec- tive function, it is sometimes not needed for all users to split After that, user-1 decodes two streams s , s and 12 13 treats interference of s as noise. Both decoding orders their messages. For instance, for sum-rate maximization of decoding s followed by s and s followed by s subject to no individual rate constraint, it is sufficient to 12 13 13 12 should be considered in order to maximize the WSR. have only one user to split its message [28]. However, when Denote π as one of the decoding order to decode it comes to satisfying some fairness (WSR, QoS constraint, l-order streams. There is only one 1-order stream and one max-min fairness), splitting the message of multiple users 3-order stream to be decoded at each user. Therefore, only appears necessary [28, 32, 34]. 
one decoding order exists for both π and π .Incon- 1 3 4.2 Three-user example trast, each user is required to decode two 2-order streams. We further consider a three-user example. Different from Denote s as the ith data stream to be decoded at π (i) 2,k the two-user case, the message of user-1 is split into user-k based on the decoding order π . One instance of 123 12 13 1 W , W , W , W . Similarly, the message of user- π is 12 → 13 → 23, where s is decoded before 2 12 1 1 1 1 123 12 23 2 2 and user-3 is split into W , W , W , W and s and s is decoded before s at all users. Since 13 13 23 2 2 2 2 123 13 23 3 W , W , W , W , respectively. The superscript rep- only data streams s and s are decoded at user-1, 12 13 3 3 3 3 resents a specific group of users whose messages with the the decoding order at user-1 based on π is π = 2 2,1 same superscript are going to be encoded together. For 12 → 13. Hence, s = s and s = s . π (1) 12 π (2) 13 2,1 2,1 123 123 123 example, W , W ,andW are encoded into the com- The data stream s is decoded before s .The π (1) π (2) 2,1 2,1 1 2 3 mon stream s intended for all the three users. W and SINRs of decoding streams s and s at user-1 are 123 π (1) π (2) 2,1 2,1 1 Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 10 of 54 123 12 23 123 13 h p R = C + C + C + R ,and R = C + C + 2,tot 2 3,tot π (1) π (1) 2,1 1 2,1 2 2 2 3 3 γ = . 23 2 2 2 C + R . For a given weight vector u = [u , u , u ] and a H H 3 H 3 1 2 3 h p + h p + h p + 1 π (2) 23 k 1 2,1 1 k=1 1 fixeddecodingorder π = [π , π , π ], the WSR achieved 1 2 3 (12) by the three-user RS approach is H 3 h p π (2) π (2) 1 2,1 2,1 γ = . (13) R (u, π) = max u R (15a) 1 RS 2 2 3 k k,tot H H h p + h p + 1 P,c 23 k 1 k=1 1 k=1 123 123 123 User-1 finally decodes s by treating other data streams s.t. C + C + C ≤ R (15b) 1 2 3 as noise. The three-user RS transmission model with the 12 12 C + C ≤ R (15c) 1 2 decoding order π = 12 → 13 → 23 is shown in Fig. 2. 
13 13 C + C ≤ R (15d) TheSINRofdecoding s at user-1 is 13 1 3 23 23 C + C ≤ R (15e) 2 3 h p γ = . (14) 2 2 3 tr PP ≤ P (15f) H H t h p + h p + 1 23 k k=2 1 1 th R ≥ R , k ∈{1, 2, 3} (15g) k,tot The corresponding rate of each data stream is cal- c ≥ 0 (15h) culated in the same way as in the two-user exam- ple. To ensure that s is successfully decoded by all 123 123 123 12 12 13 13 23 23 users, the achievable common rate shall not exceed where c = C , C , C , C , C , C , C , C , C 1 2 3 1 2 1 3 2 3 123 123 123 R = min R , R , R .Toensurethat s is suc- 123 12 is thecommonratevectorrequiredtobeoptimized in 1 2 3 cessfully decoded by user-1 and user-2, the achievable order to maximize the WSR. By calculating R (u, π) RS 12 12 common rate shall not exceed R = min R , R . for a set of different rate weights u,weobtainthe 1 2 13 13 Similarly, we have R = min R , R and R = 13 23 rate region R (π ) of a certain decoding order π.The 1 3 RS 23 23 min R , R . All boundary points for the three-user RS rate region of the three-user RS is achieved as the 2 3 rate region can be obtained by assuming that R , R , 123 12 convex hull of the union over all decoding orders as R ,and R are shared by the corresponding group of 13 23 R = conv R (π ) . RS RS users. Denote the portion of the common rate allocated Similar to the two-user case, SC–SIC and MU–LP 123 123 to user-k for the message s as C ,wehave C + are again easily identified as special sub-strategies of RS k 1 123 123 12 12 C + C = R . Similarly, we have C + C = R , 123 12 by switching off some of the streams. Problem (15)is 2 3 1 2 13 13 23 23 C + C = R ,and C + C = R . Following the 13 23 non-convex and non-trivial. We propose an optimization 1 3 2 3 three-user RS structure described above, the total achiev- algorithm in Section 4.7 to solve it based on the WMMSE 123 12 13 able rate of each user is R = C + C + C + R , 1,tot 1 approach. 1 1 1 Fig. 2 Three-user transmission model using RS Mao et al. 
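To make the SIC decoding chain of the three-user example more concrete, the following Python sketch evaluates the SINRs of Eqs. (11) to (14) at user-1 for a fixed set of precoders. It is only an illustration under stated assumptions: the helper name `sic_sinrs`, the random channel, the random (non-optimized) precoders, and the antenna count are hypothetical and not taken from the paper, and the noise power is normalized to 1 as in the SINR expressions above. No precoder optimization (WMMSE) is performed.

```python
import numpy as np

def sic_sinrs(h, precoders, decode_order, noise_power=1.0):
    """SINR of each stream decoded at one user with SIC.

    h            : (Nt,) complex channel vector of the user
    precoders    : dict mapping a stream label to its (Nt,) precoder
    decode_order : labels of the streams this user decodes, in order
    Streams not yet decoded, and streams never decoded at this user,
    are treated as noise.
    """
    # Received power |h^H p|^2 of every transmitted stream at this user.
    gain = {label: abs(np.vdot(h, p)) ** 2 for label, p in precoders.items()}
    decoded = set()
    sinrs = {}
    for label in decode_order:
        interference = sum(g for other, g in gain.items()
                           if other != label and other not in decoded)
        sinrs[label] = gain[label] / (interference + noise_power)
        decoded.add(label)  # SIC removes the decoded stream
    return sinrs

# Hypothetical setup: 4 transmit antennas, random channel and precoders.
rng = np.random.default_rng(0)
n_t = 4

def random_complex_vector():
    return (rng.standard_normal(n_t) + 1j * rng.standard_normal(n_t)) / np.sqrt(2)

h1 = random_complex_vector()
labels = ["123", "12", "13", "23", "1", "2", "3"]
P = {label: 0.5 * random_complex_vector() for label in labels}

# User-1 decodes s_123, then s_12 and s_13 (order pi_{2,1} = 12 -> 13), then s_1.
sinr_user1 = sic_sinrs(h1, P, decode_order=["123", "12", "13", "1"])
rate_user1 = {label: np.log2(1.0 + g) for label, g in sinr_user1.items()}
```

Swapping `"12"` and `"13"` in `decode_order` gives the other 2-order decoding order at user-1; comparing the resulting rates shows why both orders have to be considered when maximizing the WSR.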
4.3 Generalized rate-splitting

We further propose a generalized RS framework for $K$ users. The users are indexed by the set $\mathcal{K} = \{1, \ldots, K\}$. For any subset $\mathcal{A}$ of the users, $\mathcal{A} \subseteq \mathcal{K}$, the BS transmits a data stream $s_{\mathcal{A}}$ to be decoded by the users in the subset $\mathcal{A}$ while treated as noise by other users. $s_{\mathcal{A}}$ loads messages of all the users in the subset $\mathcal{A}$. The message intended for user-$k$ ($k \in \mathcal{K}$) is split as $\{W_k^{\mathcal{A}'} \mid \mathcal{A}' \subseteq \mathcal{K}, k \in \mathcal{A}'\}$. The messages $\{W_{k'}^{\mathcal{A}} \mid k' \in \mathcal{A}\}$ of users with the same superscript $\mathcal{A}$ are encoded together into the stream $s_{\mathcal{A}}$.

The stream order defined in Section 4.2 is applied to the generalized RS. The stream order of data stream $s_{\mathcal{A}}$ is $|\mathcal{A}|$. For a given $l \in \mathcal{K}$, there are $\binom{K}{l}$ distinct $l$-order streams. For example, we have only one $K$-order stream (traditional common stream) while we have $K$ 1-order streams (private streams). Define $\mathbf{s}_l \in \mathbb{C}^{\binom{K}{l} \times 1}$ as the $l$-order data stream vector formed by all $l$-order streams in $\{s_{\mathcal{A}'} \mid \mathcal{A}' \subseteq \mathcal{K}, |\mathcal{A}'| = l\}$. Note that when $l = K$, there is a single $K$-order stream and $\mathbf{s}_K$ reduces to $s_{\mathcal{K}}$. For example, when $K = 3$, the 3-order stream vector is $\mathbf{s}_3 = s_{123}$. The 1-order and the 2-order stream vectors are $\mathbf{s}_1 = [s_1, s_2, s_3]^T$ and $\mathbf{s}_2 = [s_{12}, s_{13}, s_{23}]^T$, respectively. The data streams are linearly precoded via the precoding matrix $\mathbf{P}_l$ formed by $\{\mathbf{p}_{\mathcal{A}'} \mid \mathcal{A}' \subseteq \mathcal{K}, |\mathcal{A}'| = l\}$. The precoded streams are superposed and the resulting transmit signal is

$$\mathbf{x} = \sum_{l=1}^{K} \mathbf{P}_l \mathbf{s}_l = \sum_{l=1}^{K} \sum_{\mathcal{A}' \subseteq \mathcal{K}, |\mathcal{A}'| = l} \mathbf{p}_{\mathcal{A}'} s_{\mathcal{A}'}. \tag{16}$$

At user sides, each user is required to decode the intended streams based on SIC. The decoding procedure starts from the $K$-order stream and then goes down to the 1-order stream. A given user is involved in multiple $l$-order streams with an exception of the $K$-order and 1-order streams. Denote $\pi_l$ as one of the decoding orders to decode the $l$-order data streams $\mathbf{s}_l$ for all users. The $l$-order stream vector to be decoded at user-$k$ based on a certain decoding order $\pi_l$ is $\mathbf{s}_{\pi_{l,k}} = [s_{\pi_{l,k}(1)}, \ldots, s_{\pi_{l,k}(|\mathcal{S}_{l,k}|)}]^T$, where $\mathcal{S}_{l,k} = \{s_{\mathcal{A}'} \mid \mathcal{A}' \subseteq \mathcal{K}, |\mathcal{A}'| = l, k \in \mathcal{A}'\}$ is the set of $l$-order streams to be decoded at user-$k$. We assume $s_{\pi_{l,k}(i)}$ is decoded before $s_{\pi_{l,k}(j)}$ if $i < j$. The SINR of user-$k$ to decode the $l$-order stream $s_{\pi_{l,k}(i)}$ with a certain decoding order $\pi_l$ is

$$\gamma_k^{\pi_{l,k}(i)} = \frac{|\mathbf{h}_k^H \mathbf{p}_{\pi_{l,k}(i)}|^2}{I_{\pi_{l,k}(i)} + 1}, \tag{17}$$

where

$$I_{\pi_{l,k}(i)} = \sum_{j > i} |\mathbf{h}_k^H \mathbf{p}_{\pi_{l,k}(j)}|^2 + \sum_{l'=1}^{l-1} \sum_{j=1}^{|\mathcal{S}_{l',k}|} |\mathbf{h}_k^H \mathbf{p}_{\pi_{l',k}(j)}|^2 + \sum_{\mathcal{A}' \subseteq \mathcal{K}, k \notin \mathcal{A}'} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{A}'}|^2$$

is the interference at user-$k$ to decode $s_{\pi_{l,k}(i)}$. $\sum_{j > i} |\mathbf{h}_k^H \mathbf{p}_{\pi_{l,k}(j)}|^2$ is the interference from the remaining non-decoded $l$-order streams in $\mathbf{s}_{\pi_{l,k}}$. $\sum_{l'=1}^{l-1} \sum_{j=1}^{|\mathcal{S}_{l',k}|} |\mathbf{h}_k^H \mathbf{p}_{\pi_{l',k}(j)}|^2$ is the interference from lower order streams $\mathbf{s}_{\pi_{l',k}}$, $\forall l' < l$ to be decoded at user-$k$. $\sum_{\mathcal{A}' \subseteq \mathcal{K}, k \notin \mathcal{A}'} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{A}'}|^2$ is the interference from the streams that are not intended for user-$k$. The corresponding achievable rate of user-$k$ for the data stream $s_{\pi_{l,k}(i)}$ is $R_k^{\pi_{l,k}(i)} = \log_2\left(1 + \gamma_k^{\pi_{l,k}(i)}\right)$. To ensure that the streams shared by two or more users are successfully decoded by all users, the achievable rate of each user in the subset $\mathcal{A}$ ($\mathcal{A} \subseteq \mathcal{K}$, $2 \leq |\mathcal{A}| \leq K$) to decode the $|\mathcal{A}|$-order stream $s_{\mathcal{A}}$ shall not exceed

$$R_{\mathcal{A}} = \min_{k'} \left\{ R_{k'}^{\mathcal{A}} \mid k' \in \mathcal{A} \right\}. \tag{18}$$

For a given $l \in \mathcal{K}$, the $l$-order streams to be decoded at different users are different. $s_{\mathcal{A}}$ is decoded at user-$k$ ($k \in \mathcal{A}$) based on the decoding order $\pi_{|\mathcal{A}|,k}$. $R_{\mathcal{A}}$ becomes the rate of receiving stream $s_{\mathcal{A}}$ at all users in the user group $\mathcal{A}$ with a certain decoding order $\pi_{|\mathcal{A}|}$. All boundary points for the $K$-user RS rate region can be obtained by assuming that $R_{\mathcal{A}}$ is shared by all users in the user group $\mathcal{A}$. Denote the portion of the common rate allocated to user-$k$ ($k \in \mathcal{A}$) as $C_k^{\mathcal{A}}$; we have $\sum_{k' \in \mathcal{A}} C_{k'}^{\mathcal{A}} = R_{\mathcal{A}}$. Following the RS structure described above, the total achievable rate of user-$k$ is

$$R_{k,tot} = \sum_{\mathcal{A}' \subseteq \mathcal{K}, k \in \mathcal{A}'} C_k^{\mathcal{A}'} + R_k, \tag{19}$$

where $R_k$ is the rate of the 1-order stream $s_k$. It is intended for user-$k$ only. No common rate sharing is required for $R_k$. For a given weight vector $\mathbf{u} = [u_1, \cdots, u_K]$ and a certain decoding order $\pi = \{\pi_1, \ldots, \pi_K\}$, the WSR achieved by RS is

$$\begin{aligned} R_{\mathrm{RS}}(\mathbf{u}, \pi) = \max_{\mathbf{P}, \mathbf{c}} \; & \sum_{k \in \mathcal{K}} u_k R_{k,tot} \\ \text{s.t.} \; & \sum_{k' \in \mathcal{A}} C_{k'}^{\mathcal{A}} \leq R_{\mathcal{A}}, \; \forall \mathcal{A} \subseteq \mathcal{K} \\ & \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \leq P_t \\ & R_{k,tot} \geq R_k^{th}, \; k \in \mathcal{K} \\ & \mathbf{c} \geq \mathbf{0} \end{aligned} \tag{20}$$

$\mathbf{P} = [\mathbf{P}_1, \ldots, \mathbf{P}_K]$ is the precoding matrix of all order streams. $\mathbf{c}$ is the common rate vector formed by $\{C_k^{\mathcal{A}} \mid \mathcal{A} \subseteq \mathcal{K}, k \in \mathcal{A}\}$. For a fixed weight vector, problem (20) can be solved using the WMMSE approach discussed in Section 4.7 by establishing rate-WMMSE relationships for all data streams. By calculating $R_{\mathrm{RS}}(\mathbf{u}, \pi)$ for a set of different rate weights $\mathbf{u}$, we obtain the rate region $\mathcal{R}_{\mathrm{RS}}(\pi)$ of a certain decoding order $\pi$. To achieve the rate region, all decoding orders should be considered. The rate region of RS is defined as the convex hull of the union over all decoding orders as

$$\mathcal{R}_{\mathrm{RS}} = \mathrm{conv}\left( \bigcup_{\pi} \mathcal{R}_{\mathrm{RS}}(\pi) \right). \tag{21}$$

4.4 Structured and low-complexity rate-splitting

The generalized RS described in Section 4.3 is able to provide more room for rate and QoS enhancements at the expense of more layers of SIC at receivers. Hence, though the generalized RS framework is very general and can be used to identify the best possible performance, its implementation can be complex due to the large number of SIC layers and common messages involved. To overcome the problem, we introduce two low-complexity RS strategies for $K$ users, 1-layer RS and 2-layer hierarchical RS (HRS). Those two RS strategies require the implementation of one and two layers of SIC at each receiver, respectively.

4.4.1 1-layer RS

Instead of transmitting all order streams, 1-layer RS transmits the $K$-order common stream and 1-order private streams. Only one SIC is required at each receiver. The message of each user is split into two parts $\{W_k^{\mathcal{K}}, W_k^{k}\}$, $\forall k \in \mathcal{K}$. The messages $W_1^{\mathcal{K}}, \ldots, W_K^{\mathcal{K}}$ are jointly encoded into the $K$-order stream $s_{\mathcal{K}}$ intended to be decoded by all users. $W_k^{k}$ is encoded into $s_k$ to be decoded by user-$k$ only. The overall data stream vector to be transmitted based on 1-layer RS is $\mathbf{s} = [s_{\mathcal{K}}, s_1, \ldots, s_K]^T$. The data streams are linearly precoded via precoder $\mathbf{P} = [\mathbf{p}_{\mathcal{K}}, \mathbf{p}_1, \ldots, \mathbf{p}_K]$. The resulting transmit signal is $\mathbf{x} = \mathbf{P}\mathbf{s} = \mathbf{p}_{\mathcal{K}} s_{\mathcal{K}} + \sum_{k \in \mathcal{K}} \mathbf{p}_k s_k$. Figure 3 shows a 1-layer RS model. Readers are referred to Fig. 1 in [12] for a detailed illustration of the 1-layer RS architecture.

Fig. 3 One-layer RS model of K users. The common stream $s_{\mathcal{K}}$ is shared by all the users

At user sides, all users first decode the data stream $s_{\mathcal{K}}$ by treating the interference from $s_1, \ldots, s_K$ as noise. The SINR of the $K$-order stream at user-$k$ is

$$\gamma_k^{\mathcal{K}} = \frac{|\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}}|^2}{\sum_{j \in \mathcal{K}} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{22}$$

Once $s_{\mathcal{K}}$ is successfully decoded, its contribution to the original received signal $y_k$ is subtracted. After that, user-$k$ decodes its private stream $s_k$ by treating the 1-order private streams of other users as noise. The SINR of decoding the private stream $s_k$ at user-$k$ is

$$\gamma_k = \frac{|\mathbf{h}_k^H \mathbf{p}_k|^2}{\sum_{j \in \mathcal{K}, j \neq k} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{23}$$

The corresponding achievable rates of user-$k$ for the streams $s_{\mathcal{K}}$ and $s_k$ are $R_k^{\mathcal{K}} = \log_2(1 + \gamma_k^{\mathcal{K}})$ and $R_k = \log_2(1 + \gamma_k)$. To ensure that $s_{\mathcal{K}}$ is successfully decoded by all users, the achievable common rate shall not exceed $R_{\mathcal{K}} = \min\{R_1^{\mathcal{K}}, \ldots, R_K^{\mathcal{K}}\}$. $R_{\mathcal{K}}$ is shared among users such that $C_k^{\mathcal{K}}$ is the $k$th user's portion of the common rate with $\sum_{k \in \mathcal{K}} C_k^{\mathcal{K}} = R_{\mathcal{K}}$. Following the two-user RS structure described above, the total achievable rate of user-$k$ is $R_{k,tot} = C_k^{\mathcal{K}} + R_k$. For a given weight vector $\mathbf{u} = [u_1, \ldots, u_K]$, the WSR achieved by the $K$-user 1-layer RS approach is

$$R_{\mathrm{1-layer\,RS}}(\mathbf{u}) = \max_{\mathbf{P}, \mathbf{c}} \; \sum_{k \in \mathcal{K}} u_k R_{k,tot} \tag{24a}$$
$$\text{s.t.} \; \sum_{k \in \mathcal{K}} C_k^{\mathcal{K}} \leq R_{\mathcal{K}} \tag{24b}$$
$$\mathrm{tr}(\mathbf{P}\mathbf{P}^H) \leq P_t \tag{24c}$$
$$R_{k,tot} \geq R_k^{th}, \; k \in \mathcal{K} \tag{24d}$$
$$\mathbf{c} \geq \mathbf{0} \tag{24e}$$

where $\mathbf{c} = [C_1^{\mathcal{K}}, \ldots, C_K^{\mathcal{K}}]$. For a given weight vector, problem (24) can be solved using the WMMSE approach in [28].

In contrast to NOMA, this 1-layer RS does not require any user ordering or grouping at the transmitter side since all users decode the common message (using a single layer of SIC) before accessing their respective private messages. We also note that the 1-layer RS is a sub-scheme of the generalized RS and is a super-scheme of MU–LP (since by not allocating any power to the common message, the 1-layer RS boils down to MU–LP). However, for $K > 2$, SC–SIC and SC–SIC per group are not sub-schemes of 1-layer RS (even though they are sub-schemes of the generalized RS). This explains why, in [12], the authors already contrasted 1-layer RS and NOMA and expressed that the two strategies cannot be treated as extensions or subsets of each other. This 1-layer RS appeared in many scenarios subject to imperfect CSIT in [28, 29, 32–34, 38, 40, 41].

4.4.2 2-layer HRS

The $K$ users are divided into $G$ groups $\mathcal{G} = \{1, \ldots, G\}$ with the user subset $\mathcal{K}_g$, $g \in \mathcal{G}$, in each group. The user groups satisfy the same conditions as in Section 3.2.2. Besides the $K$-order stream and 1-order streams, 2-layer HRS also allows the transmission of a $|\mathcal{K}_g|$-order stream intended for users in $\mathcal{K}_g$. The overall data stream vector to be transmitted based on 2-layer HRS is $\mathbf{s} = [s_{\mathcal{K}}, s_{\mathcal{K}_1}, \ldots, s_{\mathcal{K}_G}, s_1, \ldots, s_K]^T$. The data streams are linearly precoded via precoder $\mathbf{P} = [\mathbf{p}_{\mathcal{K}}, \mathbf{p}_{\mathcal{K}_1}, \ldots, \mathbf{p}_{\mathcal{K}_G}, \mathbf{p}_1, \ldots, \mathbf{p}_K]$. The resulting transmit signal is $\mathbf{x} = \mathbf{P}\mathbf{s} = \mathbf{p}_{\mathcal{K}} s_{\mathcal{K}} + \sum_{g \in \mathcal{G}} \mathbf{p}_{\mathcal{K}_g} s_{\mathcal{K}_g} + \sum_{k \in \mathcal{K}} \mathbf{p}_k s_k$. Figure 4 shows an example of 2-layer HRS. The users are divided into two groups, $\mathcal{K}_1 = \{1, 2\}$, $\mathcal{K}_2 = \{3, 4\}$. $s_{1234}$ is a 4-order stream intended for all the users while $s_{12}$ and $s_{34}$ are 2-order streams for users in each group only.

Fig. 4 Two-layer HRS example, $K = 4$, $G = 2$, $\mathcal{K}_1 = \{1, 2\}$, $\mathcal{K}_2 = \{3, 4\}$

Each user is required to decode three streams $s_{\mathcal{K}}$, $s_{\mathcal{K}_g}$, and $s_k$. We assume $k \in \mathcal{K}_g$. The data stream $s_{\mathcal{K}}$ is decoded first by treating the interference from all other streams as noise. The SINR of the $K$-order stream at user-$k$ is

$$\gamma_k^{\mathcal{K}} = \frac{|\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}}|^2}{\sum_{g \in \mathcal{G}} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}_g}|^2 + \sum_{j \in \mathcal{K}} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{25}$$

Once $s_{\mathcal{K}}$ is successfully decoded, its contribution to the original received signal $y_k$ is subtracted. After that, user-$k$ decodes its group common stream $s_{\mathcal{K}_g}$ by treating other group common streams and 1-order private streams as noise. The SINR of decoding the $|\mathcal{K}_g|$-order stream $s_{\mathcal{K}_g}$ at user-$k$ is

$$\gamma_k^{\mathcal{K}_g} = \frac{|\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}_g}|^2}{\sum_{g' \in \mathcal{G}, g' \neq g} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}_{g'}}|^2 + \sum_{j \in \mathcal{K}} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{26}$$

After removing its contribution to the received signal, user-$k$ decodes its private stream $s_k$. The SINR of decoding the private stream $s_k$ at user-$k$ is

$$\gamma_k = \frac{|\mathbf{h}_k^H \mathbf{p}_k|^2}{\sum_{g' \in \mathcal{G}, g' \neq g} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}_{g'}}|^2 + \sum_{j \in \mathcal{K}, j \neq k} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{27}$$

The corresponding achievable rates of user-$k$ for the streams $s_{\mathcal{K}}$, $s_{\mathcal{K}_g}$, and $s_k$ are $R_k^{\mathcal{K}} = \log_2(1 + \gamma_k^{\mathcal{K}})$, $R_k^{\mathcal{K}_g} = \log_2(1 + \gamma_k^{\mathcal{K}_g})$ and $R_k = \log_2(1 + \gamma_k)$. The achievable common rates of $s_{\mathcal{K}}$ and $s_{\mathcal{K}_g}$ shall not exceed $R_{\mathcal{K}} = \min\{R_1^{\mathcal{K}}, \ldots, R_K^{\mathcal{K}}\}$ and $R_{\mathcal{K}_g} = \min_k \{R_k^{\mathcal{K}_g} \mid k \in \mathcal{K}_g\}$, respectively. $R_{\mathcal{K}}$ is shared among users such that $C_k^{\mathcal{K}}$ is the $k$th user's portion of the common rate with $\sum_{k \in \mathcal{K}} C_k^{\mathcal{K}} = R_{\mathcal{K}}$. $R_{\mathcal{K}_g}$ is shared among users in the group $\mathcal{K}_g$ such that $C_k^{\mathcal{K}_g}$ is the $k$th user's portion of the common rate with $\sum_{k \in \mathcal{K}_g} C_k^{\mathcal{K}_g} = R_{\mathcal{K}_g}$. Following the two-user RS structure described above, the total achievable rate of user-$k$ is $R_{k,tot} = C_k^{\mathcal{K}} + C_k^{\mathcal{K}_g} + R_k$, where $k \in \mathcal{K}_g$. For a given weight vector $\mathbf{u} = [u_1, \ldots, u_K]$, the WSR achieved by the $K$-user 2-layer HRS approach is

$$R_{\mathrm{2-layer\,HRS}}(\mathbf{u}) = \max_{\mathbf{P}, \mathbf{c}} \; \sum_{k \in \mathcal{K}} u_k R_{k,tot} \tag{28a}$$
$$\text{s.t.} \; \sum_{k \in \mathcal{K}} C_k^{\mathcal{K}} \leq R_{\mathcal{K}} \tag{28b}$$
$$\sum_{k \in \mathcal{K}_g} C_k^{\mathcal{K}_g} \leq R_{\mathcal{K}_g}, \; \forall g \in \mathcal{G} \tag{28c}$$
$$\mathrm{tr}(\mathbf{P}\mathbf{P}^H) \leq P_t \tag{28d}$$
$$R_{k,tot} \geq R_k^{th}, \; k \in \mathcal{K} \tag{28e}$$
$$\mathbf{c} \geq \mathbf{0} \tag{28f}$$

where $\mathbf{c}$ is the common rate vector formed by $\{C_k^{\mathcal{K}}, C_k^{\mathcal{K}_g} \mid k \in \mathcal{K}, k \in \mathcal{K}_g, g \in \mathcal{G}\}$. For a given weight vector, problem (28) can be solved by simply modifying the WMMSE approach discussed in Section 4.7.

Compared with SC–SIC per group, where $|\mathcal{K}_g| - 1$ layers of SIC are required at user sides, 2-layer HRS only requires two layers of SIC at each user. Moreover, the user ordering issue in SC–SIC per group does not exist in 2-layer HRS. The streams of a higher stream order will always be decoded before the streams of a lower stream order. One-layer RS is the simplest architecture since only one SIC is needed at each user and it is a sub-scheme of the 2-layer HRS. We also note that we can obtain a 1-layer RS per group from the 2-layer HRS by not allocating any power to $s_{\mathcal{K}}$. Note that SC–SIC and SC–SIC per group are not necessarily sub-schemes of the 2-layer HRS. The 2-layer HRS strategy was first introduced in [39] in the massive MIMO context.

4.5 Encompassing existing NOMA and SDMA

A comparison of NOMA, SDMA, and RSMA is shown in Table 1. Compared with NOMA and SDMA, the most important characteristic of RSMA is that it partially decodes interference and partially treats interference as noise through the split into common and private messages. This capability enables RSMA to maintain a good performance for all user deployment scenarios and all network loads, as will become clearer in the numerical results of Section 5.

Table 1 Comparison of different strategies

Multiple access | NOMA | NOMA | SDMA | RSMA
Strategy | SC–SIC | SC–SIC per group | MU–LP | All forms of RS
Design principle | Fully decode interference | Fully decode interference in each group and treat interference between groups as noise | Fully treat interference as noise | Partially decode interference and partially treat interference as noise
Decoder architecture | SIC at receivers | SIC at receivers | Treat interference as noise | SIC at receivers
User deployment scenario | Users experience aligned channel directions and a large disparity in channel strengths | Users in each group experience aligned channel directions and a large disparity in channel strengths. Users in different groups experience orthogonal channels | User channels are (semi-)orthogonal with similar channel strengths | Any angle between channels and any disparity in channel strengths
Network load | More suited to overloaded network | More suited to overloaded network | More suited to underloaded network | Suited to any network load

Let us further discuss how the proposed framework of generalized RS in Section 4.3 contrasts and encompasses NOMA, SDMA, and RS strategies. We first compare the four-user MIMO–NOMA scheme illustrated in Fig. 5 of [1] with the four-user 2-layer HRS strategy illustrated in Fig. 4. In Fig. 5 of [1], user-1 and user-2 are superposed in the same beam. User-3 and user-4 share another beam. The users are decoded based on SC–SIC within each beam. As for the four-user 2-layer HRS strategy in Fig. 4, the encoded streams are precoded and transmitted jointly to users. If we set the common message $s_{12}$ to be encoded by the message of user-2 only and decoded by both user-1 and user-2, set the common message $s_{34}$ to be encoded by the message of user-4 only and decoded by both user-3 and user-4, set the precoders $\mathbf{p}_{12}$ and $\mathbf{p}_1$ to be equal, the precoders $\mathbf{p}_{34}$ and $\mathbf{p}_3$ to be equal, and the precoders of other streams to be $\mathbf{0}$, then the proposed RS scheme reduces to the scheme illustrated in Fig. 5 of [1]. Similarly, the $K$-user RS model can be reduced to the $K$-user MIMO–NOMA scheme. Therefore, the MIMO–NOMA scheme proposed in [1] is a particular case of our RS framework.

In view of the above discussions, it should now be clear that SDMA and the multi-antenna NOMA strategies discussed in the introduction (relying on SC–SIC and SC–SIC per group) are all special instances of the generalized RS framework.

In the proposed generalized $K$-user RS model, if we set $\mathbf{P}_l = \mathbf{0}$, $\forall l \in \{2, \cdots, K\}$, only 1-order streams (private streams) are transmitted. Each user only decodes its intended private stream by treating others as noise. Problem (20) is then reduced to the SDMA problem (3). If the message of each user is encoded into one stream of distinct stream order, problem (20) is equivalent to the SC–SIC problem (5). By keeping 1-order and $K$-order streams, we have the 1-layer RS strategy whose performance benefit in the presence of imperfect CSIT was highlighted in various scenarios in [28, 29, 32–34, 38, 40, 41]. There is only one common data stream to be transmitted and decoded by all users before each user decodes its private stream. By keeping 1-order, $K$-order, and $l$-order streams, where $l$ is selected from $\{2, \cdots, K - 1\}$, the problem becomes the 2-layer HRS originally proposed in [39] with two layers of common messages to be transmitted. Another example of such a multi-layer RS has also appeared in the topological RS for MISO networks of [30]. Therefore, the formulated $K$-user RS problem is a more general problem. It encompasses SDMA, NOMA, and existing RS methods as special cases.

Though the current work focuses on MISO BC, the RS framework can be extended to multi-antenna users and the general MIMO BC [31] as well as to a general network scenario with multiple transmitters [30]. Nevertheless, the optimization of the precoders in those scenarios remains an interesting topic for future research. Applications of this RS framework to relay networks are also worth exploring. Preliminary ideas have appeared in [43], though joint encoding of the split common messages is not taken into account.

4.6 Complexity of RSMA

We further discuss the complexity of RSMA by comparing it with NOMA and SDMA. A qualitative comparison of NOMA, SDMA, and RSMA is shown in Table 2. In Table 2, RS refers to the generalized RS of Section 4.3.

Table 2 Qualitative comparison of the complexity of different strategies

Multiple access | NOMA | NOMA | SDMA | RSMA | RSMA
Strategy | SC–SIC | SC–SIC per group | MU–LP | RS | 1-layer RS
Encoder complexity | Encode K streams | Encode K streams | Encode K streams | Encode K private streams plus additional common streams | Encode K + 1 streams
Scheduler complexity | Very complex as it requires finding aligned users and deciding upon a suitable user ordering | Very complex as it requires dividing users into orthogonal groups, with aligned users in each group, and deciding upon a suitable user ordering in each group | Complex as MU–LP requires pairing together semi-orthogonal users with similar channel gains | Complex as it requires deciding upon a suitable decoding order of the streams with the same stream order | Simpler user scheduling as RS copes with any user deployment scenario and does not rely on user grouping and user ordering
Receiver complexity | Requires multiple layers of SIC. Subject to error propagation | Requires multiple layers of SIC in each group and a single layer of SIC if groups are made of 2 users. Subject to error propagation | Does not require any SIC | Requires multiple layers of SIC. Subject to error propagation | Requires a single layer of SIC for all users. Less subject to error propagation

As mentioned in the introduction, the complexity of NOMA in the multi-antenna setup increases significantly at both the transmitter and the receivers. The optimal decoding order of NOMA is no longer fixed based on the channel gain as in the SISO BC. To maximize the WSR, the decoding order should be optimized together with precoders at the transmitter. Moreover, SC–SIC is suitable for aligned users with large channel gain difference. A proper user scheduling algorithm increases the scheduler complexity. At user sides, $K - 1$ layers of SIC are required at each user for a $K$-user SC–SIC system. Increasing the number of users leads to a dramatic increase of the scheduler and receiver complexity and is subject to more error propagation in the SICs.

SC–SIC per group reduces the complexity at user sides. Only $K/G - 1$ layers of SIC are required at each user if we uniformly group the $K$ users into $G$ groups. However, the complexity at the transmitter increases with the number of user groups. A joint design of user ordering and user grouping for all groups is necessary in order to maximize the WSR. For example, for a 4-user system, if we divide the users into two groups with two users in each group, we should consider three different user grouping methods and four different decoding orders for each grouping method.

The complexity of MU–LP is much reduced as it does not require any SIC at user sides. However, as MU–LP is more suitable for users with (semi-)orthogonal channels and similar channel strengths, the transmitter requires accurate CSIT and user scheduling should be carefully designed for interference coordination. The scheduler complexity at the transmitter is still high.

Compared with NOMA and SDMA, RSMA is able to balance the performance and complexity better. All forms of RS are suitable for users with any channel gain difference and any channel angle in between, though a multi-layer RS would have more flexibility. Considering the generalized RS, the decoding order of multiple streams

SICs. The 3-order stream $s_{123}$ is decoded first. It is estimated as $\hat{s}_{123} = g_1^{123} y_1$, where $g_1^{123}$ is the equalizer.
After successfully decoding and removing s with the same stream order should be optimized together from y , the estimate of the 2-order stream s 1 π (1) 2,1 π (1) 2,1 H with the precoders when there are multiple streams of is s ˆ = g y − h p s . Similarly, we π (1) 1 123 123 2,1 1 1 the same stream order intended for each user (e.g., each calculate the estimates of s ˆ and s ˆ as s ˆ = π (2) 1 π (2) 2,1 2,1 user decodes two 2-order streams in the 3-user example π (2) 2,1 H H g y − h p s − h p s and s ˆ = 1 123 123 π (1) π (1) 1 2,1 2,1 1 1 1 of Section 4.2.). But its special case, 1-layer RS, simplifies 1 H H H g y −h p s −h p s − h p s , 1 123 123 π (1) π (1) π (2) π (2) 2,1 2,1 2,1 2,1 1 1 1 1 both the scheduler and receiver design, and it is still able π (1) π (2) 2,1 2,1 respectively. g , g , g are the corresponding to achieve a good performance in all user deployment sce- 1 1 1 equalizers at user-1. The mean square error (MSE) of narios. One-layer RS requires only one SIC at each user. It each stream is defined as ε  E |s −ˆ s | .Theyare k k k does not rely on user grouping and user ordering for user calculated as scheduling. Therefore, the complexity of the scheduler is much simplified. 123 123 2 123 123 H ε =|g | T − 2 g h p + 1, 1 1 1 1 1 The cost of RSMA comes with a slightly higher encoding π (1) π (1) π (1) π (1) 2,1 2,1 2 2,1 2,1 H complexity since private and common streams need to be ]ε =|g | T − 2 g h p + 1, π (1) 1 1 1 1 1 2,1 encoded. For the 1-layer RS in a K-user MISO BC, K + 1 π (2) π (2) π (2) π (2) 2,1 2,1 2 2,1 2,1 H streams need to be encoded in contrast to K streams for ε =|g | T − 2 g h p + 1, π (2) 1 2,1 1 1 1 1 NOMA and SDMA. 1 1 2 1 1 H ε =|g | T − 2 g h p + 1, 1 1 1 1 1 4.7 Optimization of RS (29) The WMMSE approach proposed in [42] is extended to 123 H 2 H 2 H 2 where T  |h p | +|h p | +|h p | + solve the problem. 
The WMMSE algorithm to solve the 123 12 13 1 1 1 1 H 2 H 2 H 2 H 2 |h p | +|h p | +|h p | +|h p | + 1 is the receive sum rate maximization problem with 1-layer RS (dis- 23 1 2 3 1 1 1 1 π (1) π (2) 2,1 2,1 123 H 2 cussed in Section 4.4.1)isproposedin[28]. We further power at user-1. T  T −|h p | , T 1 1 1 1 π (1) π (2) extend it to solve the generalized RS problem (20). To sim- 2,1 H 2 1 2,1 H 2 T −|h p | , T  T −|h p | .The π (1) π (2) 2,1 2,1 1 1 1 1 1 plify the explanation, we focus on the 3-user problem (15). optimum MMSE equalizers are It can be easily extended to solve the K-user generalized MMSE −1 RS problem. 123 H 123 g = (p ) h T , 123 1 1 1 As the 1-order and 2-order streams to be decoded MMSE −1 π (1) π (1) 2,1 H 2,1 at different users are not the same, we take user- g = (p ) h T , π (1) 1 1 2,1 1 1 as an example. The procedure of the WMMSE (30) MMSE −1 π (2) π (2) 2,1 H 2,1 algorithm is the same for other users. The signal g = (p ) h T , π (2) 1 1 2,1 1 received at user-1 is y = h Ps + n .Itdecodes 1 1 MMSE −1 1 H 1 four streams s , s , s , s sequentially using g = (p ) h T . 123 π (1) π (2) 1 1 1 2,1 2,1 1 1 Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 17 of 54 π (1) 2,1 ∗ MMSE π (2) π (2) ∗ MMSE ∂ε ∂ε 2,1 2,1 1 1 1 1 g = g ,and g = g . They are calculated by solving = 0, = 0, 1 1 1 1 123 π (1) 2,1 ∂g 1 ∂g Substituting the optimum equalizers into (32), we obtain π (2) 2,1 ∂ε ∂ε 1 1 = 0, = 0. Substituting (30)into(29), the π (2) 1 2,1 ∂g ∂g 1 MMSE MMSE MMSEs become 123 123 123 123 123 ξ g = u ε − log u , 1 1 1 1 2 1 MMSE MMSE π2,1(1) π2,1(1) π2,1(1) π2,1(1) π2,1(1) ξ g = u ε − log u , 1 1 1 1 1 MMSE −1 123 123 123 123 MMSE MMSE ε  min ε = T I , π (2) π (2) π (2) π (2) π (2) 2,1 2,1 2,1 2,1 2,1 1 1 1 1 123 ξ g = u ε − log u , g 1 1 1 1 2 1 MMSE MMSE 1 1 1 1 1 MMSE −1 ξ g = u ε − log u . 
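As a numerical sanity check of the MSE expression in (29) and the equalizer in (30), the following minimal sketch considers a single stream decoded at one user in the presence of two interfering streams. The random channel and precoders, the unit noise variance, and all variable names are assumptions made for illustration only; they are not taken from the paper's simulation setup. The check confirms that the equalizer $\mathbf{p}^H\mathbf{h}\,T^{-1}$ attains the MSE $I/T$ and that $-\log_2$ of this MMSE equals $\log_2(1+\mathrm{SINR})$.

```python
import numpy as np

rng = np.random.default_rng(0)
Nt, n_streams = 4, 3  # illustrative sizes (assumption)


def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


h = crandn(Nt)             # channel of the receiving user
P = crandn(Nt, n_streams)  # column 0: stream being decoded, other columns: interference

gains = np.abs(h.conj() @ P) ** 2  # |h^H p_i|^2 for every stream
T = gains.sum() + 1.0              # receive power, unit noise variance assumed
I = T - gains[0]                   # interference-plus-noise power


def mse(g):
    """MSE of the decoded stream for an arbitrary scalar equalizer g, as in (29)."""
    return np.abs(g) ** 2 * T - 2 * np.real(g * (h.conj() @ P[:, 0])) + 1.0


g_mmse = (P[:, 0].conj() @ h) / T  # MMSE equalizer p^H h T^{-1}, as in (30)
eps_mmse = mse(g_mmse)             # resulting minimum MSE
sinr = gains[0] / I
rate = -np.log2(eps_mmse)

print("MMSE:", eps_mmse, " I/T:", I / T)
print("rate from MMSE:", rate, " log2(1 + SINR):", np.log2(1 + sinr))
```

Perturbing `g_mmse` in any direction increases `mse`, which is the property the alternating optimization relies on when it updates the equalizers in closed form.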
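Substituting (30) into (29), the MMSEs become

$$\begin{aligned}
\left(\varepsilon_1^{123}\right)^{\mathrm{MMSE}} &\triangleq \min_{g_1^{123}} \varepsilon_1^{123} = \left(T_1^{123}\right)^{-1} I_1^{123},\\
\left(\varepsilon_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}} &\triangleq \min_{g_1^{\pi_{2,1}(1)}} \varepsilon_1^{\pi_{2,1}(1)} = \left(T_1^{\pi_{2,1}(1)}\right)^{-1} I_1^{\pi_{2,1}(1)},\\
\left(\varepsilon_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}} &\triangleq \min_{g_1^{\pi_{2,1}(2)}} \varepsilon_1^{\pi_{2,1}(2)} = \left(T_1^{\pi_{2,1}(2)}\right)^{-1} I_1^{\pi_{2,1}(2)},\\
\left(\varepsilon_1^{1}\right)^{\mathrm{MMSE}} &\triangleq \min_{g_1^{1}} \varepsilon_1^{1} = \left(T_1^{1}\right)^{-1} I_1^{1},
\end{aligned} \tag{31}$$

where $I_1^{123} = T_1^{\pi_{2,1}(1)}$, $I_1^{\pi_{2,1}(1)} = T_1^{\pi_{2,1}(2)}$, $I_1^{\pi_{2,1}(2)} = T_1^{1}$, and $I_1^{1} = T_1^{1} - |\mathbf{h}_1^H\mathbf{p}_1|^2$. Based on (31), the SINRs of decoding the intended streams at user-1 can be expressed as $\gamma_1^{123} = 1/\left(\varepsilon_1^{123}\right)^{\mathrm{MMSE}} - 1$, $\gamma_1^{\pi_{2,1}(1)} = 1/\left(\varepsilon_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}} - 1$, $\gamma_1^{\pi_{2,1}(2)} = 1/\left(\varepsilon_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}} - 1$, and $\gamma_1^{1} = 1/\left(\varepsilon_1^{1}\right)^{\mathrm{MMSE}} - 1$. The corresponding rates are rewritten as $R_1^{123} = -\log_2\left(\varepsilon_1^{123}\right)^{\mathrm{MMSE}}$, $R_1^{\pi_{2,1}(1)} = -\log_2\left(\varepsilon_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}}$, $R_1^{\pi_{2,1}(2)} = -\log_2\left(\varepsilon_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}}$, and $R_1^{1} = -\log_2\left(\varepsilon_1^{1}\right)^{\mathrm{MMSE}}$. The augmented WMSEs are

$$\begin{aligned}
\xi_1^{123} &= u_1^{123}\varepsilon_1^{123} - \log_2\left(u_1^{123}\right),\\
\xi_1^{\pi_{2,1}(1)} &= u_1^{\pi_{2,1}(1)}\varepsilon_1^{\pi_{2,1}(1)} - \log_2\left(u_1^{\pi_{2,1}(1)}\right),\\
\xi_1^{\pi_{2,1}(2)} &= u_1^{\pi_{2,1}(2)}\varepsilon_1^{\pi_{2,1}(2)} - \log_2\left(u_1^{\pi_{2,1}(2)}\right),\\
\xi_1^{1} &= u_1^{1}\varepsilon_1^{1} - \log_2\left(u_1^{1}\right),
\end{aligned} \tag{32}$$

where $u_1^{123}, u_1^{\pi_{2,1}(1)}, u_1^{\pi_{2,1}(2)}$, and $u_1^{1}$ are weights associated with each stream at user-1. By solving $\frac{\partial \xi_1^{123}}{\partial g_1^{123}} = 0$, $\frac{\partial \xi_1^{\pi_{2,1}(1)}}{\partial g_1^{\pi_{2,1}(1)}} = 0$, $\frac{\partial \xi_1^{\pi_{2,1}(2)}}{\partial g_1^{\pi_{2,1}(2)}} = 0$, and $\frac{\partial \xi_1^{1}}{\partial g_1^{1}} = 0$, we derive the optimum equalizers as $\left(g_1^{123}\right)^{*} = \left(g_1^{123}\right)^{\mathrm{MMSE}}$, $\left(g_1^{\pi_{2,1}(1)}\right)^{*} = \left(g_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}}$, $\left(g_1^{\pi_{2,1}(2)}\right)^{*} = \left(g_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}}$, and $\left(g_1^{1}\right)^{*} = \left(g_1^{1}\right)^{\mathrm{MMSE}}$. Substituting the optimum equalizers into (32), we obtain

$$\begin{aligned}
\xi_1^{123}\left(\left(g_1^{123}\right)^{\mathrm{MMSE}}\right) &= u_1^{123}\left(\varepsilon_1^{123}\right)^{\mathrm{MMSE}} - \log_2\left(u_1^{123}\right),\\
\xi_1^{\pi_{2,1}(1)}\left(\left(g_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}}\right) &= u_1^{\pi_{2,1}(1)}\left(\varepsilon_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}} - \log_2\left(u_1^{\pi_{2,1}(1)}\right),\\
\xi_1^{\pi_{2,1}(2)}\left(\left(g_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}}\right) &= u_1^{\pi_{2,1}(2)}\left(\varepsilon_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}} - \log_2\left(u_1^{\pi_{2,1}(2)}\right),\\
\xi_1^{1}\left(\left(g_1^{1}\right)^{\mathrm{MMSE}}\right) &= u_1^{1}\left(\varepsilon_1^{1}\right)^{\mathrm{MMSE}} - \log_2\left(u_1^{1}\right).
\end{aligned} \tag{33}$$

By further solving the equations $\frac{\partial \xi_1^{123}\left(\left(g_1^{123}\right)^{\mathrm{MMSE}}\right)}{\partial u_1^{123}} = 0$, $\frac{\partial \xi_1^{\pi_{2,1}(1)}\left(\left(g_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}}\right)}{\partial u_1^{\pi_{2,1}(1)}} = 0$, $\frac{\partial \xi_1^{\pi_{2,1}(2)}\left(\left(g_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}}\right)}{\partial u_1^{\pi_{2,1}(2)}} = 0$, and $\frac{\partial \xi_1^{1}\left(\left(g_1^{1}\right)^{\mathrm{MMSE}}\right)}{\partial u_1^{1}} = 0$, we obtain the optimum MMSE weights as

$$\begin{aligned}
\left(u_1^{123}\right)^{*} &= \left(u_1^{123}\right)^{\mathrm{MMSE}} \triangleq \left(\left(\varepsilon_1^{123}\right)^{\mathrm{MMSE}}\right)^{-1},\\
\left(u_1^{\pi_{2,1}(1)}\right)^{*} &= \left(u_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}} \triangleq \left(\left(\varepsilon_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}}\right)^{-1},\\
\left(u_1^{\pi_{2,1}(2)}\right)^{*} &= \left(u_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}} \triangleq \left(\left(\varepsilon_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}}\right)^{-1},\\
\left(u_1^{1}\right)^{*} &= \left(u_1^{1}\right)^{\mathrm{MMSE}} \triangleq \left(\left(\varepsilon_1^{1}\right)^{\mathrm{MMSE}}\right)^{-1}.
\end{aligned} \tag{34}$$

Substituting (34) into (33), we establish the rate-WMMSE relationship as

$$\begin{aligned}
\left(\xi_1^{123}\right)^{\mathrm{MMSE}} &\triangleq \min_{u_1^{123},\, g_1^{123}} \xi_1^{123} = 1 - R_1^{123},\\
\left(\xi_1^{\pi_{2,1}(1)}\right)^{\mathrm{MMSE}} &\triangleq \min_{u_1^{\pi_{2,1}(1)},\, g_1^{\pi_{2,1}(1)}} \xi_1^{\pi_{2,1}(1)} = 1 - R_1^{\pi_{2,1}(1)},\\
\left(\xi_1^{\pi_{2,1}(2)}\right)^{\mathrm{MMSE}} &\triangleq \min_{u_1^{\pi_{2,1}(2)},\, g_1^{\pi_{2,1}(2)}} \xi_1^{\pi_{2,1}(2)} = 1 - R_1^{\pi_{2,1}(2)},\\
\left(\xi_1^{1}\right)^{\mathrm{MMSE}} &\triangleq \min_{u_1^{1},\, g_1^{1}} \xi_1^{1} = 1 - R_1^{1}.
\end{aligned} \tag{35}$$

Similarly, we can establish the rate-WMMSE relationships for user-2 and user-3. Motivated by the rate-WMMSE relationship in (35), we reformulate the optimization problem (15) as

$$\begin{aligned}
\min_{\mathbf{P},\mathbf{x},\mathbf{u},\mathbf{g}}\ & \sum_{k=1}^{3} u_k \xi_{k,tot} && \text{(36a)}\\
\text{s.t.}\ & X_1^{123} + X_2^{123} + X_3^{123} + 1 \ge \xi_{123} && \text{(36b)}\\
& X_1^{12} + X_2^{12} + 1 \ge \xi_{12} && \text{(36c)}\\
& X_1^{13} + X_3^{13} + 1 \ge \xi_{13} && \text{(36d)}\\
& X_2^{23} + X_3^{23} + 1 \ge \xi_{23} && \text{(36e)}\\
& \mathrm{tr}\left(\mathbf{P}\mathbf{P}^H\right) \le P_t && \text{(36f)}\\
& \xi_{k,tot} \le 1 - R_k^{th},\ k \in \{1, 2, 3\} && \text{(36g)}\\
& \mathbf{x} \le \mathbf{0} && \text{(36h)}
\end{aligned}$$

where $\mathbf{x} = \left[X_1^{123}, X_2^{123}, X_3^{123}, X_1^{12}, X_2^{12}, X_1^{13}, X_3^{13}, X_2^{23}, X_3^{23}\right]$ is the transformation of the common rate $\mathbf{c}$. $\mathbf{u} = \left[u_1^{123}, u_2^{123}, u_3^{123}, u_1^{12}, u_2^{12}, u_1^{13}, u_3^{13}, u_2^{23}, u_3^{23}, u_1^{1}, u_2^{2}, u_3^{3}\right]$. $\mathbf{g} = \left[g_1^{123}, g_2^{123}, g_3^{123}, g_1^{12}, g_2^{12}, g_1^{13}, g_3^{13}, g_2^{23}, g_3^{23}, g_1^{1}, g_2^{2}, g_3^{3}\right]$. $\xi_{1,tot} = X_1^{123} + X_1^{12} + X_1^{13} + \xi_1^{1}$, $\xi_{2,tot} = X_2^{123} + X_2^{12} + X_2^{23} + \xi_2^{2}$, and $\xi_{3,tot} = X_3^{123} + X_3^{13} + X_3^{23} + \xi_3^{3}$ are individual WMSEs. $\xi_{123} = \max\left\{\xi_1^{123}, \xi_2^{123}, \xi_3^{123}\right\}$, $\xi_{12} = \max\left\{\xi_1^{12}, \xi_2^{12}\right\}$, $\xi_{13} = \max\left\{\xi_1^{13}, \xi_3^{13}\right\}$, and $\xi_{23} = \max\left\{\xi_2^{23}, \xi_3^{23}\right\}$ are the achievable WMSEs of the corresponding streams.

It can be easily shown that by minimizing (36a) with respect to $\mathbf{u}$ and $\mathbf{g}$, respectively, we obtain the MMSE solutions $\left(\mathbf{u}^{\mathrm{MMSE}}, \mathbf{g}^{\mathrm{MMSE}}\right)$ formed by the corresponding MMSE equalizers and weights. They satisfy the KKT optimality conditions of (36) for $\mathbf{P}$. Therefore, according to the rate-WMMSE relationship (35) and the common rate transformation $\mathbf{c} = -\mathbf{x}$, problem (36) can be transformed to problem (15). For any point $\left(\mathbf{x}^{*}, \mathbf{P}^{*}, \mathbf{u}^{*}, \mathbf{g}^{*}\right)$ satisfying the KKT optimality conditions of (36), the solution given by $\left(\mathbf{c}^{*} = -\mathbf{x}^{*}, \mathbf{P}^{*}\right)$ satisfies the KKT optimality conditions of (15). The WSR problem (15) is then transformed into the WMMSE problem (36). Problem (36) is still non-convex for the joint optimization of $(\mathbf{x}, \mathbf{P}, \mathbf{u}, \mathbf{g})$. We have derived that when $(\mathbf{x}, \mathbf{P}, \mathbf{u})$ are fixed, the optimal equalizer is the MMSE equalizer $\mathbf{g}^{\mathrm{MMSE}}$. When $(\mathbf{x}, \mathbf{P}, \mathbf{g})$ are fixed, the optimal weight is the MMSE weight $\mathbf{u}^{\mathrm{MMSE}}$. When $(\mathbf{u}, \mathbf{g})$ are fixed, $(\mathbf{x}, \mathbf{P})$ is coupled in the optimization problem (36), so a closed-form solution cannot be derived. But it is a convex quadratically constrained quadratic program (QCQP), which can be solved using interior-point methods. These properties motivate us to use AO to solve the problem. In the nth iteration of the AO algorithm, the equalizers and weights are first updated using the precoders obtained in the (n−1)th iteration, $(\mathbf{u}, \mathbf{g}) = \left(\mathbf{u}^{\mathrm{MMSE}}\left(\mathbf{P}^{[n-1]}\right), \mathbf{g}^{\mathrm{MMSE}}\left(\mathbf{P}^{[n-1]}\right)\right)$. With the updated $(\mathbf{u}, \mathbf{g})$, $(\mathbf{x}, \mathbf{P})$ can then be updated by solving problem (36). $(\mathbf{u}, \mathbf{g})$ and $(\mathbf{x}, \mathbf{P})$ are iteratively updated until the WSR converges. The details of the AO algorithm are shown in Algorithm 1, where $\mathrm{WSR}^{[n]}$ is the WSR calculated based on the updated $(\mathbf{x}, \mathbf{P})$ in the nth iteration and $\epsilon$ is the tolerance of the algorithm. The AO algorithm is guaranteed to converge as the WSR is increasing in each iteration and it is bounded above for a given power constraint.

Algorithm 1: Alternating Optimization Algorithm
1 Initialize: $n \leftarrow 0$, $\mathbf{P}^{[n]}$, $\mathrm{WSR}^{[n]}$;
2 repeat
3   $n \leftarrow n + 1$;
4   $\mathbf{P}^{[n-1]} \leftarrow \mathbf{P}$;
5   $\mathbf{u} \leftarrow \mathbf{u}^{\mathrm{MMSE}}\left(\mathbf{P}^{[n-1]}\right)$;
6   $\mathbf{g} \leftarrow \mathbf{g}^{\mathrm{MMSE}}\left(\mathbf{P}^{[n-1]}\right)$;
7   update $(\mathbf{x}, \mathbf{P})$ by solving (36) using the updated $\mathbf{u}$ and $\mathbf{g}$;
8 until $\left|\mathrm{WSR}^{[n]} - \mathrm{WSR}^{[n-1]}\right| \le \epsilon$;

When considering imperfect CSIT, we follow the robust approach proposed in [28] for 1-layer RS with imperfect CSIT. The precoders are optimized based on the available channel estimate to maximize a conditional averaged weighted sum rate (AWSR) metric, computed using partial CSIT knowledge. The stochastic AWSR problem is transformed into a deterministic counterpart using the sample average approximation (SAA) method. Then, the rate-WMMSE relationship is applied to transform the AWSR problem into a convex form, which is solved using an AO algorithm. The robust approach for 1-layer RS in [28] can be easily extended to solve the K-user generalized RS problem based on our proposed Algorithm 1, which will not be explained here.

5 Results and discussion

In this section, we evaluate the performance of SDMA, NOMA, and RSMA in a wide range of network loads (underloaded and overloaded regimes) and user deployments (with a diversity of channel directions, channel strengths, and qualities of channel state information at the transmitter). We first illustrate the rate region of different strategies in the two-user case, followed by the WSR comparisons of the three-user, four-user, and ten-user cases.

5.1 Underloaded two-user deployment with perfect CSIT

When $K = 2$, the rate region of all strategies can be explicitly compared in a two-dimensional figure. As mentioned earlier, the rate region is the set of all achievable points. Its boundary is calculated by varying the weights assigned to users. In this work, the weight of user-1 is fixed to $u_1 = 1$. The weight of user-2 is varied as $u_2 = 10^{[-3, -1, -0.95, \cdots, 0.95, 1, 3]}$, which is the same as in [42]. To investigate the largest achievable rate region, the individual rate constraints are set to 0 in all strategies, $R_k^{th} = 0, \forall k \in \{1, 2\}$.

In the perfect CSIT scenario, the capacity region is achieved by DPC. Therefore, we compare the rate regions of different beamforming strategies with the DPC region.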
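The weight sweep used to trace the rate-region boundary can be written out explicitly. The snippet below is a small sketch that only builds the grid of weight pairs $(u_1, u_2)$ stated in the text; reading the ellipsis as a uniform exponent step of 0.05 between −1 and 1 is an assumption based on the listed values, and the variable names are illustrative. Each weight pair would then be passed to a WSR solver, which is not shown here.

```python
import numpy as np

# Exponents of u2: -3, then -1 to 1 in steps of 0.05 (assumed uniform), then 3.
exponents = np.concatenate(([-3.0], np.linspace(-1.0, 1.0, 41), [3.0]))

u1 = 1.0                      # weight of user-1 is fixed
u2_grid = 10.0 ** exponents   # weight of user-2 is swept over several decades

# One row per boundary point: [u1, u2]
weights = np.column_stack((np.full_like(u2_grid, u1), u2_grid))

print("number of weight pairs:", weights.shape[0])
print("smallest and largest u2:", u2_grid[0], u2_grid[-1])
```

The extreme exponents of −3 and 3 put almost all of the weight on one user and therefore probe the two ends of the boundary, while the dense grid around $u_2 = 1$ resolves its curved middle part.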
performance in such scenario. As the SC–SIC strategy is The DPC region is generated using the algorithm in [44]. motivated by leveraging the channel strength difference Since the WSR problems for all beamforming strategies among users, it achieves a good performance when the described earlier are non-convex, the initialization of P channels are degraded. Specifically, when the channels of is vitaltothe finalresult.Ithas been observed in[28] users are close to alignment, SC–SIC works better than that maximum ratio transmission (MRT) combined with MU–LP if the users have asymmetric channel strengths. singular value decomposition (SVD) provides good over- However, for the general non-degraded MISO-BC, all performance over various channel realizations. It is SC–SIC often yields a performance loss [19]. The simu- 2 2 used in this work for precoder initialization of RS. The lation results when σ = 1, σ = 0.09, and N = 2is 1 2 precoders for the private message p is initialized as illustrated in Fig. 6. The average channel gain difference h αP k t between the users increases to 5 dB, and the number of p = p ,where p = and 0 ≤ α ≤ 1. The k k k h  2 the transmit antenna reduces to two. In such scenario, the precoder for the common message is initialized as p = rate region gap between RS and MU–LP increases while p u ,where p = (1 − α)P and u is the largest left 12 12 12 t 12 the rate region gap between RS and SC–SIC decreases. It singular vector of the channel matrix H =[ h , h ]. It is 1 2 shows that SC–SIC is more suited to the scenarios where calculated as u = U(:, 1). U is derived based on the SVD the users experience a large disparity in channel strengths. of H, i.e., H = USV . To ensure a fair comparison, the In both Figs. 5 and 6, the rate region gaps among differ- precoders of MU–LP are initialized based on MRT. For ent strategies increase with SNR. 
RS achieves a larger rate SC–SIC, the precoder of the user decoded first is initial- region than SC–SIC and MU–LP, and it is closer to the ized based on SVD and that of the user decoded last is capacity region achieved by DPC. initialized based on MRT. 5.1.2 Specific channel realizations 5.1.1 Random channel realizations In order to have a better insight into the benefits of RS We firstly consider the scenarios when the channel of each over MU–LP and SC–SIC, we investigate the influence user h has independent and identically distributed (i.i.d.) of user angle and channel strength on the performance. complex Gaussian entries with a certain variance, i.e., When N = 4, the channels of users are realized as CN 0, σ . The BS is equipped with two or four antennas (N = 2, 4) and serves two single-antenna users. Figure 5 shows the average rate regions of different strategies over h = [1, 1, 1, 1] , 2 2 (37) 100 random channel realizations when σ = 1, σ = 1, 1 2 jθ j2θ j3θ h = γ × 1, e , e , e . and N = 4. SNRs are 10 and 20 dB, respectively. When the number of transmit antenna is larger than the number of users, MU–LP achieves a good performance. The gen- In above channel realizations, γ and θ are control vari- erated precoders of the users tend to be more orthogonal ables. γ controls the channel strength of user-2. If γ = 1, SNR=10 dB SNR=20 dB a b 6 10 DPC RS SC-SIC MU-LP 0 0 0246 02468 10 R (bit/s/Hz) R (bit/s/Hz) 1,tot 1,tot Fig. 5 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, averaged over 100 random 2 2 channel realizations, σ = 1, σ = 1, and N = 4. a SNR = 10 dB. b SNR = 20 dB 1 2 R (bit/s/Hz) 2,tot R (bit/s/Hz) 2,tot Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 20 of 54 SNR=10 dB SNR=20 dB a b 1.5 4 DPC 0.5 RS 1 SC-SIC MU-LP 0 0 012345 02468 R (bit/s/Hz) R (bit/s/Hz) 1,tot 1,tot Fig. 
6 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, averaged over 100 random 2 2 channel realizations, σ = 1, σ = 0.09, and N = 2. a SNR = 10 dB. b SNR = 20 dB 1 2 π 2π π 4π π the channel strength of user-1 is equal to that of user-2. θ = , , , . Intuitively, when θ is less than ,the 9 9 3 9 9 If γ = 0.3, user-2 suffers from an additional 5 dB path channels of users are sufficiently aligned and SC–SIC per- 4π loss compared to user-1. θ controls the angle between the forms well. When θ is larger than , the channels of users channels of user-1 and user-2. It varies from 0 to .If are sufficiently orthogonal to each other and MU–LP is θ = 0, the channel of user-1 is aligned with that of user-2. more suitable. Therefore, we consider angles within the π π 4π If θ = , the channels of user-1 and user-2 are orthog- range of , . SNR is fixed to 20 dB. When N = 2, the 2 9 9 onal to each other. In the following results, γ = 1, 0.3, channels of user-1 and user-2 are realized as h = [1, 1] jθ which corresponds to 0 dB, 5 dB channel strength dif- and h = γ × 1, e , respectively. The same values of γ ference, respectively. For each γ , θ adopts value from and θ are adopted in N = 2asusedin N = 4 . t t ab cd Fig. 7 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, γ = 1and N = 4, SNR = 20 dB. a θ = π/9. b θ = 2π/9. c θ = π/3. d θ = 4π/9 R (bit/s/Hz) 2,tot R (bit/s/Hz) 2,tot Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 21 of 54 Figure 7 shows the results when γ = 1and N = 4. In for users. MU–LP is more suited to underloaded scenarios all subfigures, the rate region achieved by RS is equal to or (N > K). In both Figs. 7 and 8, the rate region of SC–SIC larger than that of SC–SIC and MU–LP. When γ = 1and is the worst due to the equal channel gain. In contrast, RS θ = , the channels of user-1 and user-2 almost coincide. 
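The channel model in (37) is easy to reproduce. The sketch below builds the two channel vectors for a given $\gamma$ and $\theta$ and reports a normalized inner product as a simple measure of how aligned the two users are. The function names and the alignment metric are illustrative choices, not quantities defined in the paper.

```python
import numpy as np


def two_user_channels(gamma, theta, nt=4):
    """Channels of (37): h1 = [1,...,1]^H, h2 = gamma * [1, e^{j theta}, ...]^H."""
    n = np.arange(nt)
    h1 = np.ones(nt, dtype=complex)
    # The Hermitian transpose conjugates the entries of the row vector.
    h2 = gamma * np.conj(np.exp(1j * n * theta))
    return h1, h2


def alignment(h1, h2):
    """|h1^H h2| / (||h1|| ||h2||): 1 for aligned channels, 0 for orthogonal ones."""
    return np.abs(np.vdot(h1, h2)) / (np.linalg.norm(h1) * np.linalg.norm(h2))


thetas = (np.pi / 9, 2 * np.pi / 9, np.pi / 3, 4 * np.pi / 9)
for theta in thetas:
    h1, h2 = two_user_channels(gamma=0.3, theta=theta)
    print(f"theta = {theta:.3f} rad, alignment = {alignment(h1, h2):.3f}")
```

The alignment decreases monotonically over the four simulated angles, consistent with the text's description of moving from nearly aligned channels at $\theta = \pi/9$ toward nearly orthogonal channels at $\theta = 4\pi/9$.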
Figure 7 shows the results when $\gamma = 1$ and $N_t = 4$. In all subfigures, the rate region achieved by RS is equal to or larger than that of SC–SIC and MU–LP. When $\gamma = 1$ and $\theta = \frac{\pi}{9}$, the channels of user-1 and user-2 almost coincide. RS exhibits a clear rate region improvement over SC–SIC and MU–LP. SC–SIC cannot achieve good performance due to the equal channel gain, while the performance of MU–LP is poor when the user channels are closely aligned with each other. As $\theta$ increases, the gap between the rate regions of RS and MU–LP reduces, as the performance of MU–LP is better when the channels of users are more orthogonal to each other, while the gap between the rate regions of MU–LP and SC–SIC increases. The rate regions of RS and MU–LP tend to the capacity region achieved by DPC as $\theta$ increases. As shown in Fig. 7d, when the channels of users are sufficiently orthogonal to each other, the rate regions of DPC, RS, and MU–LP are almost identical. In such an orthogonal scenario, RS reduces to MU–LP.

Figure 8 shows the results when $\gamma = 1$ and $N_t = 2$. In all subfigures, RS outperforms MU–LP and SC–SIC. Compared with the results of $N_t = 4$, the rate region gap between RS and MU–LP is enlarged when $N_t = 2$. When the number of transmit antennas decreases, it becomes more difficult for MU–LP to design orthogonal precoders for users. MU–LP is more suited to underloaded scenarios ($N_t > K$). In both Figs. 7 and 8, the rate region of SC–SIC is the worst due to the equal channel gain. In contrast, RS performs well for any angle between user channels.

Fig. 8 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, $\gamma = 1$ and $N_t = 2$, SNR = 20 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Figure 9 shows the rate region comparison of DPC, RS, SC–SIC, and MU–LP transmission schemes with 5 dB channel strength difference between the two users, i.e., $\gamma = 0.3$ and $N_t = 4$. RS and SC–SIC are much closer to the DPC region in the setting of Fig. 9 compared to Fig. 7 because of the 5 dB channel strength difference. Figure 9b, c are interesting as SC–SIC and MU–LP each outperform the other over one part of the rate region. There is a crossing point between the two schemes in each of these figures. The rate region of RS is equal to or larger than the convex hull of the rate regions of SC–SIC and MU–LP.

Fig. 9 Achievable rate region comparison of different strategies in perfect CSIT, $\gamma = 0.3$, $N_t = 4$, SNR = 20 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Figure 10 shows the rate region comparison when $\gamma = 0.3$ and $N_t = 2$. Comparing Fig. 10 with Fig. 9, SC–SIC achieves relatively better performance when the number of transmit antennas reduces. The WSRs of RS and SC–SIC overlap, and they almost achieve the capacity region when $\theta = \frac{\pi}{9}$. However, as $\theta$ increases, the rate region gap between RS and SC–SIC increases despite the 5 dB channel gain difference. Both SC–SIC and RS rely on one SIC when there are two users in the system. Though the receiver complexity of SC–SIC and RS is the same, RS achieves an explicit performance gain over SC–SIC in most investigated scenarios. Compared with MU–LP and SC–SIC, RS is suited to any channel angle and channel gain difference.

Fig. 10 Achievable rate region comparison of different strategies in perfect CSIT, $\gamma = 0.3$, $N_t = 2$, SNR = 20 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

More results of underloaded two-user deployments with perfect CSIT are given in Appendix 1. We further illustrate the rate regions of different strategies when SNR is 10 dB. Comparing the corresponding figures of 10 dB and 20 dB, we conclude that as SNR increases, the gaps among the rate regions of different schemes increase, with RS exhibiting further performance benefits. In all investigated scenarios, RS always outperforms MU–LP and SC–SIC.

5.2 Underloaded two-user deployment with imperfect CSIT

Next, we investigate the rate region of different transmission schemes in the presence of imperfect CSIT. We assume the users are able to estimate the channel perfectly while the instantaneous channel estimated at the BS is imperfect. We assume the estimated channels of user-1 and user-2 are $\hat{\mathbf{h}}_1 = [1, 1, 1, 1]^H$ and $\hat{\mathbf{h}}_2 = \gamma \times \left[1, e^{j\theta}, e^{j2\theta}, e^{j3\theta}\right]^H$ when $N_t = 4$. For the given channel estimate at the BS, the channel realization is $\mathbf{h}_k = \hat{\mathbf{h}}_k + \tilde{\mathbf{h}}_k, \forall k \in \{1, 2\}$, where $\tilde{\mathbf{h}}_k$ is the estimation error of user-k. $\tilde{\mathbf{h}}_k$ has i.i.d. complex Gaussian entries drawn from $\mathcal{CN}\left(0, \sigma_{e,k}^2\right)$. The error covariances of user-1 and user-2 are $\sigma_{e,1}^2 = P_t^{-0.6}$ and $\sigma_{e,2}^2 = \gamma P_t^{-0.6}$, respectively. The precoders are initialized and designed using the estimated channels $\hat{\mathbf{h}}_1$ and $\hat{\mathbf{h}}_2$ and the same methods as stated in the perfect CSIT scenarios. One thousand different channel error samples are generated for each user. Each point in the rate region is the average rate over the generated 1000 channels. SNR is fixed to 20 dB.

Figures 11 and 12 show the results when $\gamma = 1$ and $\gamma = 0.3$, respectively. Similarly to the results in perfect CSIT, the gaps between the rate regions of RS and MU–LP reduce as $\theta$ increases in both figures. When $\theta = \frac{4\pi}{9}$, the channels of the two users are sufficiently orthogonal. The rate regions of RS and MU–LP are almost identical. SC–SIC achieves good performance when the channels of users are sufficiently aligned with enough channel gain difference, as shown in Fig. 12a.

Fig. 11 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 1$, $N_t = 4$, SNR = 20 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 12 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 0.3$, $N_t = 4$, SNR = 20 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Comparing Figs. 11 and 7, the rate region gap between RS and MU–LP increases in imperfect CSIT due to the residual interference introduced. The interference nulling in MU–LP is distorted and yields residual interference at the receiver, which jeopardizes the achievable rate. In contrast, the rate region gap between RS and SC–SIC slightly reduces in imperfect CSIT, as observed by comparing Fig. 12 with Fig. 9. SC–SIC is less sensitive to CSIT inaccuracy than MU–LP. However, the rate region gap between RS and SC–SIC is still obvious. In comparison, RS is more flexible and robust to multi-user interference originating from the imperfect CSIT, as evidenced by the recent literature on RS with imperfect CSIT [27–33, 38–41]. With RS, the amount of interference decoded by both users (through the presence of the common stream) is adjusted dynamically to the channel conditions (channel directions and strengths) and CSIT inaccuracy.

More results of underloaded two-user deployments with imperfect CSIT are given in Appendix 2. The rate regions of different strategies for varied SNR, $N_t$, and $\gamma$ are illustrated. We further show that the performance of RS is stable in a wide range of parameters, namely number of transmit antennas, user deployments, and CSIT inaccuracy. RS achieves equal or better performance than MU–LP and SC–SIC in all simulated channels.
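The channel error model of Section 5.2 can be sketched as follows. The code draws 1000 error samples per user around the channel estimates and checks their empirical variance. It assumes unit noise variance, so that an SNR of 20 dB corresponds to $P_t = 100$; the random seed, the chosen $(\gamma, \theta)$ pair, and the helper name `sample_channels` are likewise illustrative assumptions. Averaging the rates over these samples, as done for each point of the rate region, is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)


def sample_channels(h_hat, sigma_e2, n_samples=1000):
    """Draw h = h_hat + h_tilde with i.i.d. CN(0, sigma_e2) error entries."""
    nt = h_hat.shape[0]
    noise = rng.standard_normal((n_samples, nt)) + 1j * rng.standard_normal((n_samples, nt))
    return h_hat[None, :] + np.sqrt(sigma_e2 / 2) * noise


snr_db = 20.0
Pt = 10 ** (snr_db / 10)  # assumes unit noise variance, so SNR equals Pt

gamma, theta = 0.3, np.pi / 9
n = np.arange(4)
h1_hat = np.ones(4, dtype=complex)                 # estimate of user-1's channel
h2_hat = gamma * np.conj(np.exp(1j * n * theta))   # estimate of user-2's channel

sigma_e1 = Pt ** -0.6           # error variance of user-1
sigma_e2 = gamma * Pt ** -0.6   # error variance of user-2

H1 = sample_channels(h1_hat, sigma_e1)  # shape (1000, 4), one realization per row
H2 = sample_channels(h2_hat, sigma_e2)

print("target / empirical error variance, user-1:", sigma_e1, np.mean(np.abs(H1 - h1_hat) ** 2))
print("target / empirical error variance, user-2:", sigma_e2, np.mean(np.abs(H2 - h2_hat) ** 2))
```

Because the error variance scales as $P_t^{-0.6}$, the CSIT quality improves with SNR but more slowly than the transmit power grows, which is why residual interference still matters at high SNR in this model.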
The 1 2 3 1 2 3 RS is stable in a wide range of parameters, namely num- precoder p , ∀k ∈{1, 2, 3} of the 1-order stream (private h α P k 1 t ber of transmit antennas, user deployments, and CSIT stream) s is initialized as p = p ,where p = k k k k h  3 inaccuracy. RS achieves equal or better performance than is the allocated power. The precoders p , p ,andp of 12 13 23 MU–LP and SC–SIC in all simulated channels. the 2-order streams are initialized as p = p u , p = 12 12 12 13 p u ,and p = p u ,respectively,where p = p = 13 13 23 23 23 12 13 α P 2 t p = and u is the largest left singular vector of 5.3 Underloaded three-user deployment with perfect 23 12 the channel matrix H =[ h , h ]. Similarly, u and u CSIT 12 1 2 13 23 are the largest left singular vectors of the channel matri- When K = 3, the rate region of each strategy is a three- ces H =[ h , h ]and H =[ h , h ], respectively. The dimensional surface. The gaps among rate regions of dif- 13 1 3 23 2 3 precoder p of the 3-order stream (conventional com- ferent strategies are difficult to display. As each point of the rate region is derived by solving the WSR problem mon stream) s is initialized as p = p u ,where 123 123 123 123 Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 25 of 54 p = α P and u is the largest left singular vector γ andγ and θ andθ are control variables as discussed in 123 3 t 123 1 2 1 2 of the channel matrix H =[ h , h , h ]. The beamform- thetwo-usercase. Foragivenset of γ andγ , θ adopts 123 1 2 3 1 2 1 π 2π π 4π ing initialization of 1-layer RS is similar as RS except we value from θ = , , , and θ = 2θ .When 1 2 1 9 9 3 9 π 2π have p and p , ∀k ∈{1, 2, 3} only. By setting α = 0, 123 k 2 θ = andθ = , the channels of user-1 and user- 1 2 9 9 the initialization of RS is applied to 1-layer RS. To ensure 2, and user-2 and user-3 are sufficiently aligned. 
When 4π 8π a fair comparison, the precoders of MU–LP are initial- θ = andθ = , the channels of user-1 and user-2 1 2 9 9 ized based on MRT. For SC–SIC, the precoder of the user and user-2 and user-3 are sufficiently orthogonal. We con- decoded first p is initialized as p = p u , π(1) π(1) π(1) π(1) sider SNRs within the range 0 to 30 dB. We assume the where p = α P and u is the largest left singu- π(1) 3 t π(1) sum of the weights allocated to users is equal to one, i.e., larvectorofthe channelmatrix H =[ h , h , h ]. The 123 1 2 3 u + u + u = 1. 1 2 3 precoder of the user decoded secondly p is initialized π(2) Figures 13 and 14 show the results when the weight as p = p u ,where p = α P and u π(2) π(2) π(2) π(2) 2 t π(2) vectors are u =[ 0.2, 0.3, 0.5] and u =[ 0.4, 0.3, 0.3], respec- is the largest left singular vector of the channel matrix tively. In both figures, γ = 1andγ = 0.3. There is a 5 dB 1 2 H =[ h , h ]. The user decoded last is initialized π(23) π(2) π(3) channel gain difference between user-1 and user-3 as well based on MRT. as between user-2 and user-3. In all scenarios and SNRs, We firstly consider an underloaded scenario. The BS is RS always outperforms MU–LP and SC–SIC. Comparing equipped with four transmit antennas (N = 4) and serves with Fig. 14, the WSR improvement of RS is more explicit three single-antenna users in all simulations. The individ- in Fig. 13. It implies that RS provides better enhancement th ual rate constraint is set to 0, R = 0, ∀k ∈{1, 2, 3}.The of system throughput and user fairness. The performance channel of users are realized as of SC–SIC is the worst in most subfigures. This is due to the underloaded user deployments where N > K.One h = 1, 1, 1, 1 , [ ] 1 of the three users are required to decode all the messages, and all the spatial multiplexing gains are sacrificed. There- jθ j2θ j3θ 1 1 1 h = γ × 1, e , e , e , (38) 2 1 fore, the sum DoF of SC–SIC is reduced to 1, resulting in jθ j2θ j3θ 2 2 2 h = γ × 1, e , e , e . 
the deteriorated performance of SC–SIC in underloaded 3 2 ab cd Fig. 13 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, γ = 1, γ = 0.3, 1 2 th u = 0.2, u = 0.3, u = 0.5, N = 4, R = 0, k ∈{1, 2, 3}. a θ = π/9, θ = 2π/9. b θ = 2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. d θ = π/9, 1 2 3 t 1 2 1 2 1 2 1 θ = 8π/9 2 Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 26 of 54 ab cd Fig. 14 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, γ = 1, γ = 0.3, 1 2 th u =0.4, u =0.3, u =0.3, N =4, R =0, k ∈{1, 2, 3}. a θ = π/9, θ =2π/9. b θ =4π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. d θ = 4π/9, θ = 8π/9 1 2 3 t 1 2 1 2 1 2 1 2 SC-SIC order 1:s → s → s 1 2 3 scenarios. In comparison, the performance of MU–LP is better than SC–SIC except in Fig. 14a. MU–LP is more SC-SIC order 2:s → s → s 2 1 3 likely to serve the users with higher weights and chan- SC-SIC order 3:s → s → s 1 3 2 nel gains by turning off the users with poor weights SC-SIC order 4:s → s → s 3 1 2 and channel gains when there is no individual rate con- SC-SIC order 5:s → s → s straints. It cannot deal efficiently with user fairness when a 2 3 1 higher weight is allocated to the user with weaker channel SC-SIC order 6:s → s → s 3 2 1 strength. In contrast, SC–SIC works better when user fair- ness is considered. The WSR achieved by low-complexity 1-layer RS is equal to or larger than that of MU–LP In Fig. 15, the WSR of six different decoding orders are andSC–SICinmostsubfigures. ComparingwithSC–SIC illustrated in the circumstance where there is a 5dB chan- and MU–LP, 1-layer RS is more robust to different user nel gain difference between user-1/2 and user-3. When deployments and only a single SIC is required at each γ = 1andγ = 0.3, it is typical to decode the message of 1 2 user. 
Moreover, the WSR of 1-layer RS is approaching that user-3 first as the channel gain of user-3 is the worst. How- of RS in all user deployments. Considering the trade-off ever, we notice that the optimal decoding order in Fig. 15 between performance and complexity, 1-layer RS is a good is order 3, user-1 is decoded first. This is due to the small- alternativetoRS. est weight allocated to user-1, u = 0.2. It implies that the In all three-user deployments of SC–SIC, the decod- weights assigned to users will affect the optimal decod- ing order is required to be optimized together with the ing order. The scheduler complexity of SC–SIC becomes precoder. To investigate the influence of different decod- extremely high in order to find the optimal decoding ing orders, we compare the WSRs of SC–SIC using order. In contrast, 1-layer RS has a much lower scheduling different decoding orders when u = 0.2, u = 0.3, complexity and does not rely on any user ordering at the 1 2 and u = 0.5. There are in total six different decoding transmitter. Moreover, it only requires a single SIC at each receiver. orders: Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 27 of 54 ab cd Fig. 15 Weighted sum rate versus SNR comparison of different decoding order of SC–SIC for underloaded three-user deployment with perfect CSIT, th γ = 1, γ = 0.3, u = 0.2, u = 0.3, u = 0.5, N = 4, R = 0, k ∈{1, 2, 3}. a θ = π/9, θ = 2π/9. b θ = 2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. 1 2 1 2 3 t 1 2 1 2 1 2 d θ = 4π/9, θ = 8π/9 1 2 More results of underloaded three-user deployments antenna deployment, we assume the rate threshold th th th with perfect CSIT and imperfect CSIT are given in of each user is equal R = R = R .SincetheBS 1 2 3 Appendices 3 and 5, respectively. The WSRs of different is able to serve users with higher QoS requirements strategies for varied SNR, N , γ , γ ,and u are illus- as SNR increases, the rate threshold is assumed to t 1 2 trated. 
In all figures, RS outperforms SC–SIC and MU–LP. increase with SNR. The rate threshold increases as Though the scheduler and receiver complexity of 1-layer r = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz for th RS is low, it achieves equal or better performance than SNR = [0, 5, 10, 15, 20, 25, 30] dBs. SC–SIC and MU–LP in most figures of perfect CSIT and We compare the performance of RS, 1-layer RS, SC–SIC, all figures of imperfect CSIT. All forms of RS are robust to MU–LP, and SC–SIC per group in the overloaded three- a wide range of CSIT inaccuracy, channel gain difference, user deployment. In SC–SIC per group, we consider a and channel angles among users. fixed grouping method. We assume user-1 is in group 1 while user-2 and user-3 are in group 2. The decoding order 5.4 Overloaded three-user deployment with perfect CSIT will be optimized together with the precoder. The beam- 5.4.1 Two transmit antenna deployment forming initialization of SC–SIC per group is different We first consider an overloaded scenario where the BS from SC–SIC. In group 1, the precoder of user-1 is ini- is equipped with two antennas (N = 2) and serves tialized basedonMRT.Ingroup 2, theprecoderofthe three single-antenna users. The channel realizations and user decoded first p is initialized as p = p u π(1) π(1) π(1) π(1) beamforming initialization follows the methods used in and u is the largest left singular vector of the channel π(1) the underloaded three-user deployment. The channel of matrix H =[ h , h ]. The precoder of the user decoded 23 2 3 jθ users are realized as h = [1, 1] , h = γ × 1, e , 1 2 1 secondly is initialized based on MRT. jθ and h = γ × 1, e . In overloaded scenarios, to RS exhibits a clear WSR gain over SC–SIC, SC–SIC per 3 2 guarantee some QoS, we add individual rate constraints group, and MU–LP in Fig. 16,where γ = 1, γ = 0.3, 1 2 to users as the system has otherwise a tendency to and u =[ 0.4, 0.3, 0.3]. The WSR of MU–LP deteriorates turn off some users. 
In all simulations of two transmit in such overloaded scenario. When the individual rate Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 28 of 54 ab cd Fig. 16 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, γ = 1, γ = 0.3, 1 2 u =0.4, u =0.3, u =0.3, N =2, r =[ 0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz. a θ = π/9, θ =2π/9. b θ = 2π/9, θ = 4π/9. c θ = π/3, θ =2π/3. 1 2 3 t 1 2 1 2 1 2 th d θ = 4π/9, θ = 8π/9 1 2 constraints are not zero and N < K, MU–LP cannot achieved by SC–SIC. Nevertheless, in view of the benefit coordinate the multi-user interference coming from all the of 1-layer RS in the MISO BC, we may wonder whether users served simultaneously. When the angles of chan- RS can be of any help in a SISO BC, especially when it nels are large enough (subfigure c and subfigure d of comes to reducing the complexity of the receivers and the Fig. 16), the WSR of SC–SIC per group is better than number of SIC needed. SC–SIC. This is due to its ability to combine treating inter- We therefore compare the performance of 1-layer ference as noise (to tackle inter-group interference) with RS with SC–SIC in a 3-user SISO BC. We note that decoding interference (to tackle intra-group interference). SC–SIC requires two layers of SIC while 1-layer RS However, as the angles of channels decrease, the perfor- requires a single SIC for all users. The channel of each user mance of SC–SIC becomes better while that of SC–SIC h has an i.i.d. complex Gaussian entry with a certain vari- per group is worse. Whether SC–SIC outperforms SC–SIC ance, i.e., CN 0, σ .Figure 17 shows the average WSRs per group depends on SNR and user deployments. To of different strategies over ten random channel realiza- 2 2 2 ensure the WSR of the NOMA system is maximized, a tions when σ = 1, σ = 0.3, andσ = 0.1. 
1-layer RS is 1 2 3 joint optimization of NOMA strategies based on switch- able to achieve very close performance to SC–SIC. Com- ing between SC–SIC and SC–SIC per group on top of paring with SC–SIC, the complexity of 1-layer RS is much deciding, the user grouping and user ordering is required. reduced. There is no ordering issue at the BS, and only one Such switching method has high scheduler and receiver SIC is required at each user. Jointly considering the per- complexity while its achieved performance is still lower formance and complexity of the system, 1-layer RS is an than the simple 1-layer RS in most user deployments. attractive alternative to SC–SIC. More results of overloaded three-user deployments 5.4.2 Single transmit antenna deployment with perfect CSIT and imperfect CSIT are given in In a SISO BC, there is no need to split the messages into Appendices 4 and 6, respectively. The WSRs of different common and private parts since the capacity region is strategies for varied SNR, N , γ , γ ,and u are illustrated. t 1 2 Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 29 of 54 h = [1, 1] , jθ h = γ × 1, e , 2 1 (39) jθ h = γ × 1, e , 3 2 jθ h = γ × 1, e . 4 3 γ , γ , γ and θ , θ , θ are control variables. θ is the chan- 1 2 3 1 2 3 1 nel angle between user-1 and user-2. It is denoted as intra-group angle of group 1. θ is the channel angle between user-1 and user-2. θ − θ is the channel angle 2 1 between user-2 and user-3, denoted as inter-group angle. θ is the channel angle between user-1 and user-3. θ − θ 3 3 2 is the channel angle between user-3 and user-4. It is the intra-group angle of group 2. In the following, we assume the intra-group angle of group 1 is the same as that of group 2. We have θ = θ + θ . In each figure, the intra- 3 1 2 π π π group angle is varied as θ = 0, , , . 
The individual 18 9 6 rate constraint is set to r =[ 0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4] th bit/s/Hz for SNR =[ 0, 5, 10, 15, 20, 25, 30] dBs. The Fig. 17 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, weightsofusers areassumedtobeequal,i.e., u = u = 1 2 2 2 2 σ = 1, σ = 0.3, σ = 0.1, N = 1, r =[ 0, 0, 0.01, 0.03, 0.1, 0.2, 0.3] t th 1 2 3 u = u = 0.25. We also assume the channel gain dif- 3 4 bit/s/Hz ference within each group is equal. The channel gain of user-3 is equal to that of user-1 (γ = 1), and the channel We further show that RS exhibits a clear WSR gain over gain of user-4 is equal to that of user-2 (γ = γ ). 3 1 SC–SIC, SC–SIC per group, and MU–LP in all simulated Figures 18 and 19 show the results when γ = 0.3. π π channels and weights. One-layer RS outperforms SC– The inter-group angles are and ,respectively. The 9 3 SIC, SC–SIC per group and MU–LP in most simulated WSR achieved by 2-layer HRS is equal to 1-layer RS in scenarios. It is more robust and achieves a nearly equiva- both figures, which means that 2-layer HRS reduces to lent WSR to that of RS in all user deployments. We also 1-layer RS in these user deployments. Two-layer HRS and show that 1-layer RS achieves near optimal performance 1-layer RS outperform all other schemes. The inter-group in various channel conditions of SISO BC. and intra-group interference can be jointly mitigated by one layer common message. As the inter-group angle 5.5 Overloaded four-user deployment with perfect CSIT increases, the WSR gaps between 2-layer HRS and 1-layer We further investigate the four-user system model shown RS per group reduces. The inter-group interference can in Fig. 4, where user-1 and user-2 are in group 1 while be coordinated by SDMA when the inter-group angle is user-3 and user-4 are in group 2. We compare the 2-layer sufficiently large. 
One-layer RS per group has the same HRS, 1-layer RS per group, 1-layer RS, SC–SIC per group, WSR as SC–SIC per group in both figures. It reduces and MU–LP. In 2-layer HRS, the intra-group interfer- to SC–SIC per group because SC–SIC is more suitable ence is mitigated using the intra-group common streams when the intra-group angle is sufficiently small and the s and s , and the inter-group interference is mitigated channel gain difference between users within each group 12 34 is sufficiently large. using the inter-group common stream s .One-layer RS More results of overloaded four-user deployments with and 1-layer RS per group are two special strategies of 2- layer HRS. All users in 1-layer RS are treated as single perfectCSITare giveninAppendix 7. The WSRs of dif- group. Only the 4-order common stream s and 1-order ferent strategies when there is no channel gain difference private streams are active. No power is allocated to s and (γ = 1) are illustrated. We further show that 2-layer HRS, 12 1 s . In contrast, 1-layer RS per group only allocate power 1-layer RS, and 1-layer RS per group achieve equal or bet- to the intra-group common stream s and s and 1-order ter performance than SC–SIC per group and MU–LP in 12 34 private streams. No power is allocated to the inter-group all simulated channel conditions. common stream s . Users within each group are served using RS and users across groups are served using SDMA 5.6 Overloaded ten-user deployment with perfect CSIT so as to mitigate the inter-group interference. We further consider an extremely overloaded scenario We consider an overloaded scenario. The BS is equipped subject to QoS constraints. The BS is equipped with two with two antennas and serves four single-antenna users. antennas (N = 2) and serves ten users. The channel of Thechannelofusers arerealizedas each user h has i.i.d. complex Gaussian entries with a k Mao et al. 
certain variance, i.e., $\mathcal{CN}(0, \sigma_k^2)$. The rate of each user is averaged over the 10 randomly generated channels. We compare 1-layer RS, MU–LP, multicast, and SC–SIC with a certain decoding order. There are $10!$ different decoding orders of SC–SIC in the ten-user case. Finding the optimal decoding order of SC–SIC is intractable. In the following simulations, only the decoding order based on the ascending channel gain is considered for WSR calculation in SC–SIC. It is the optimal decoding order in the SISO BC. Multicast can be regarded as a special scheme of 1-layer RS with only the 10-order stream to be transmitted to all users. The weight of each user is assumed to be equal to 1.

Fig. 18 Weighted sum rate versus SNR comparison of different strategies for overloaded four-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\theta_2 = \theta_1 + \frac{\pi}{9}$, $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = 0$. b $\theta_1 = \pi/18$. c $\theta_1 = \pi/9$. d $\theta_1 = \pi/6$

Fig. 19 Weighted sum rate versus SNR comparison of different strategies for overloaded four-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\theta_2 = \theta_1 + \frac{\pi}{3}$, $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = 0$. b $\theta_1 = \pi/9$. c $\theta_1 = 2\pi/9$. d $\theta_1 = \pi/3$

Figure 20 shows the WSRs of different strategies when $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_{10}^2 = 1$ and $\mathbf{r}_{th} = [0.01, 0.03, 0.05, 0.1, 0.1, 0.1, 0.1]$ bit/s/Hz. The WSR achieved by the multicast scheme is the worst. In such an overloaded user deployment, the spectral efficiency of multicast is low as it is difficult for a single beamformer to satisfy all users. Under the rate constraint $\mathbf{r}_{th}$, the WSR of SC–SIC is better than that of MU–LP while the slopes of the WSRs are the same for large SNRs. It implies that SC–SIC and MU–LP achieve the same DoF of 1. In contrast, 1-layer RS shows an obvious WSR improvement over all other strategies and exhibits a DoF of two. This highlights that RS exploits the maximum DoF of the considered deployments (that is limited by two, given the two transmit antennas).

Fig. 20 Weighted sum rate versus SNR comparison of different strategies for overloaded ten-user deployment with perfect CSIT, $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_{10}^2 = 1$, $N_t = 2$, SNR = 30 dB, $\mathbf{r}_{th} = [0.01, 0.03, 0.05, 0.1, 0.1, 0.1, 0.1]$ bit/s/Hz

To further investigate the reason behind the results, we focus on one random channel realization. The WSRs achieved by all strategies when SNR = 30 dB are compared as shown in Fig. 21. The optimized common rate vector of 1-layer RS is $\mathbf{c} = [0, 0.1, 0.1, 0.1, 0, 0.1, 0.1, 0.1, 0.1, 0.1]$ bit/s/Hz. No common rate is allocated to user-1 and user-5. But in Fig. 21, we can observe that the rates allocated to user-1 and user-5 are the highest. It implies that RS uses the common message to pack messages from eight users and uses two transmit antennas to deliver private messages to user-1 and user-5. RS achieves a sum-DoF of 2 in the overloaded regime. In contrast, MU–LP and SC–SIC allocate most of the power to a single user. The rate achieved by user-5 when using MU–LP and the rate achieved by user-10 when using SC–SIC are much higher than those of other users in Fig. 21. The DoFs achieved by MU–LP and SC–SIC are limited to 1 in such circumstances.

Fig. 21 Individual rate comparison of different strategies for overloaded ten-user deployment with perfect CSIT for 1 randomly generated channel estimate, SNR = 30 dB, $N_t = 2$, $\mathbf{r}_{th} = [0.01, 0.03, 0.05, 0.1, 0.1, 0.1, 0.1]$ bit/s/Hz (horizontal axis: users 1 to 10; vertical axis: individual rate in bit/s/Hz; curves: 1-layer RS, SC–SIC, MU–LP, multicast)

Note that results here show the usefulness of the RS framework for massive IoT or MTC services. Those devices are typically cheap. In the example above, user-1 and user-5 could be high-end devices, for which RS would be implemented. Those devices would therefore perform SIC. All other devices could be IoT or MTC devices, which would not need to implement RS, nor SIC, but simply decode the common message. Hence, the RS framework can be used to pack the IoT/MTC traffic in the common message.

More results of overloaded ten-user deployments with perfect CSIT are given in Appendix 8. We further illustrate the WSRs of different strategies when the rate threshold $\mathbf{r}_{th}$ and the channel gain difference are changed. We show that when the rate threshold of each user is 0, MU–LP is able to achieve a DoF of 2. However, as the rate threshold increases, MU–LP cannot coordinate the inter-user interference and its achieved DoF drops to 1. In the extremely overloaded scenario, the WSR gap between RS and SC–SIC is still large. SC–SIC makes an inefficient use of the transmit antennas and achieves a DoF of 1.

6 Conclusions

To conclude, we propose a new multiple access called rate-splitting multiple access (RSMA). We compare the proposed RSMA with SDMA and NOMA by solving the problem of maximizing WSR in MISO BC systems with QoS constraints. Both perfect and imperfect CSIT are investigated. WMMSE and its modified algorithms are adopted to solve the respective optimization problems. We show that SDMA and NOMA are subject to many limitations, including high system complexity and a lack of robustness to user deployments, network load, and CSIT inaccuracy. We propose a general multiple access framework based on rate splitting (RS), where the common symbols decoded by different groups of users are transmitted on top of private symbols decoded by the corresponding users only. Thanks to its ability of partially decoding interference and partially treating interference as noise, RSMA softly bridges and outperforms SDMA and NOMA in any user deployments, CSIT inaccuracy, and network load. The simplified RS forms, such as 1-layer RS and 2-layer HRS, show great potential to reduce the scheduler and receiver complexity but maintain good and robust performance in any user deployments, CSIT inaccuracy, and network load. Particularly, we show that 1-layer RS is an attractive alternative to SC–SIC in a SISO BC deployment due to its near optimal performance and very low complexity. Therefore, RSMA is a more general and powerful multiple access for downlink multi-antenna systems that encompasses SDMA and NOMA as special cases.

RSMA has the potential to change the design of the physical layer and MAC layer of next-generation communication systems by unifying existing approaches and relying on a superposed transmission of common and private messages. Many interesting problems are left for future research, including among others the role played by RSMA to achieve the fundamental limits of broadcast, interference and relay channels in the presence of imperfect CSIT and disparity of channel strengths, optimization (robust design, sum-rate maximization, max-min fairness, QoS constraints) of RSMA, performance analysis of RSMA, RSMA design for multi-user/massive/millimeter-wave/multi-cell/network MIMO, modulation and coding for RSMA, RSMA with multi-carrier transmissions, RSMA with linear versus nonlinear precoding, resource allocation and cross-layer design of RSMA, security provisioning in RSMA, RSMA design for cellular and satellite communication networks, prototyping and experimentation of RSMA, and standardization issues (link/system-level evaluations, receiver implementation, transmission schemes/modes, CSI feedback mechanisms, and downlink and uplink signaling) of RSMA.

Endnotes

1 In the sequel, power-domain NOMA will be referred to simply as NOMA.
2 Recall that SU–MIMO in LTE Rel. 8 was designed with minimum mean square error–SIC (MMSE–SIC) in mind [45].
3 The DoF characterizes the number of interference-free streams that can be transmitted or equivalently the pre-log factor of the rate at high SNR.
4 This can be easily seen since, for the receiver forced to decode all streams, the model reduces to a multiple access channel (MAC) with a single-antenna receiver, which has a sum-DoF of 1. This was discussed at length in [34].
5 Recall that this spatial multiplexing gain is the main driver for using multiple antennas in a multi-user setup and the introduction of MU–MIMO in 4G [18].
6 "Common" is sometimes referred to as "public."
7 This also contrasts with NOMA, for which the usefulness of SC–SIC in a BC has been known for several decades [7, 8].
8 Note that in the specific case where we have finite precision CSIT, the sum DoF collapses to 1 [26], and RS, SC–SIC, and TDMA all achieve the same optimal DoF.
9 It is worth noting that Rate-Splitting Multiple Access (RSMA) also exists in the uplink for the SISO Multiple Access Channel [46]. Though they share the same name and the splitting of the messages, they have different motivations and structures.
10 As already explained in [12], RS can also be seen as a form of non-orthogonal multi-user transmission. Indeed, in its simplest form, the common message in RS can be seen as a non-orthogonal layer added onto the private layers.
11 This benefit of RS was briefly pointed out in [39].
12 Note that OMA (single-user beamforming) is a subset of MU–LP and is obtained by allocating power exclusively to $s_1$ or $s_2$.
13 Note that for a given $\theta$, the users' directions of arrival (DoA) are the same for the $N_t = 2$ and $N_t = 4$ scenarios while the channel angle is more orthogonal when $N_t = 4$ compared with that when $N_t = 2$.
14 The readers are referred to [28] for a rigorous discussion about the notion of average rate.

Appendix 1
Underloaded two-user deployment with perfect CSIT

To further investigate the influence of SNR, we illustrate the rate region of different strategies when SNR is 10 dB in Figs. 22, 23, 24, and 25 and compare with the results when SNR is 20 dB in Figs. 7, 8, 9, and 10. Comparing the corresponding figures of 10 and 20 dB, we observe that the rate region gaps among different schemes grow with SNR. As SNR increases, the performance improvement of RS becomes more obvious. Specifically, SC–SIC and MU–LP outperform each other at one part of the rate region in Figs. 9b and 10d and the rate region of RS encompasses the convex hull of the rate regions of SC–SIC and MU–LP. However, as SNR decreases to 10 dB, the crosspoints disappear in Figs. 24b and 25d. The rate regions of SC–SIC overlap with that of RS. RS reduces to SC–SIC, and they outperform MU–LP in the whole rate region.

Fig. 22 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, $\gamma = 1$, $N_t = 4$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 23 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, $\gamma = 1$, $N_t = 2$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 24 Achievable rate region comparison of different strategies in perfect CSIT, $\gamma = 0.3$, $N_t = 4$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 25 Achievable rate region comparison of different strategies in perfect CSIT, $\gamma = 0.3$, $N_t = 2$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Appendix 2
Underloaded two-user deployment with imperfect CSIT

To further study the influence of CSIT inaccuracy, SNR, number of transmit antennas, and user deployments, we illustrate the rate region of different strategies when SNR, $N_t$, and $\gamma$ are varied in Figs. 26, 27, 28, 29, 30, and 31. Figures 26 and 27 show the corresponding results of Figs. 11 and 12 when SNR decreases to 10 dB. The rate region gaps among users decrease when SNR decreases.

Figures 28 and 29 show the results when $\gamma = 1$ and $N_t = 2$. When SNR is 10 dB, the rate regions of the three schemes are very close to each other. When SNR is 20 dB, the rate region of RS shows explicit improvement over the rate regions of MU–LP and SC–SIC. Comparing Fig. 29 with Fig. 8, the performance of MU–LP is worse when CSIT is imperfect. It shows that MU–LP requires accurate CSIT to design precoders. There is no crosspoint between SC–SIC and MU–LP in Figs. 27c and 12b compared, respectively, with Figs. 24c and 9b.

Fig. 26 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 1$, $N_t = 4$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 27 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 0.3$, $N_t = 4$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 28 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 1$, $N_t = 2$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 29 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 1$, $N_t = 2$, SNR = 20 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 30 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 0.3$, $N_t = 2$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$
EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 38 of 54 ab cd Fig. 31 Average rate region comparison of different strategies in imperfect CSIT, γ = 0.3, N = 2, SNR = 20 dB. a θ = π/9. b θ 2π/9. c θ = π/3. t = d θ = 4π/9 Figures 30 and 31 show the results when γ = 0.3. SNR vector, the WSR of SC–SIC becomes closer to that of is 10 and 20 dB, respectively. The rate region gap between RS as the channel gain differences among users increase. RS and SC–SIC reduces in imperfect CSIT, as observed by For example, we compare Figs. 13, 32, and 36 for a fixed comparing Fig. 31 with Fig. 10. Comparing with MU–LP, u =[ 0.2, 0.3, 0.5]. When u =[ 0.4, 0.3, 0.3], the WSR of SC–SIC is less sensitive to CSIT inaccuracy. RS and MU–LP are almost identical. In such scenario, RS reduces to MU–LP. In subfigure d of each figure, 4π 8π θ = and θ = , the channels of user-1 and Appendix 3 1 2 9 9 user-2, and the channels of user-2 and user-3 are suffi- Underloaded three-user deployment with perfect CSIT ciently orthogonal while the channels of user-1 and user-3 We consider three different sets of γ , γ .When γ = γ = 1, 1 2 1 2 are almost in opposite directions. In such circumstance, the three users have no channel strength difference. When the WSRs of RS and MU–LP strategies overlap with the γ = 1, γ = 0.3, there is a 5-dB channel strength differ- 1 2 optimal WSR achieved by DPC. ence between user-1 and user-3 as well as between user-2 and user-3. When γ = 0.3, γ = 0.1, there is a 5-dB 1 2 channel strength difference between user-1 and user-2 as Appendix 4 well as user-2 and user-3. The channel strength differ- Overloaded three-user deployment with perfect CSIT ence between user-1 and user-3 is 10 dB. We consider (1) Two transmit antenna deployment three different weight vectors for each set of γ , γ , i.e., Figures 39, 40, 41, 42, and 43 show the results when γ , γ , 1 2 1 2 u = [0.2, 0.3, 0.5], u = [0.4, 0.3, 0.3],and u = [0.6, 0.3, 0.1]. 
and $\mathbf{u}$ are varied as discussed in Appendix C. In all figures (Figs. 32, 33, 34, 35, 36, 37, and 38), the WSR of RS is equal to or better than that of MU–LP and SC–SIC. Considering a specific scenario where $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$, and $\mathbf{u} = [0.6, 0.3, 0.1]$, the WSR of RS is better than that of MU–LP and SC–SIC as shown in Figs. 34b, 35b, and 38b. As SNR increases, the WSR improvement of RS is generally more obvious. For a fixed weight

Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 39 of 54

Fig. 32 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 33 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 34 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 35 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 36 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\gamma_2 = 0.1$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 37 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\gamma_2 = 0.1$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 38 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\gamma_2 = 0.1$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 39 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 40 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 41 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 42 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 43 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

RS exhibits a clear WSR gain over SC–SIC, SC–SIC per group, and MU–LP in all figures (Figs. 39, 40, 41, 42, and 43). One-layer RS outperforms SC–SIC, SC–SIC per group, and MU–LP in most figures. It further shows that 1-layer RS outperforms the joint switching between SC–SIC and SC–SIC per group in most user deployments while the complexity of 1-layer RS is much reduced. In Figs. 39a–c and 40a–c, 1-layer RS achieves the same WSR as RS. This implies that RS reduces to 1-layer RS in these user deployments. Both RS and 1-layer RS achieve higher WSRs than all other strategies.
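To make the WSR comparisons more concrete, the following is a minimal sketch of how the rates of 1-layer RS could be evaluated for given channels and precoders. It uses the commonly stated 1-layer RS rate expressions (the common stream is decoded first with all private streams treated as noise, then removed by SIC). The function names, the unit noise power, the equal split of the common rate, and the toy channels are assumptions made for illustration; no precoder optimization is performed and no power constraint is enforced, so this is not the authors' simulation setup.

```python
import math


def inner(h, p):
    """Return h^H p for two equal-length complex vectors (lists)."""
    return sum(hi.conjugate() * pi for hi, pi in zip(h, p))


def one_layer_rs_rates(channels, p_common, p_private, noise=1.0):
    """Return (common rate, list of private rates) in bit/s/Hz."""
    common_rates, private_rates = [], []
    for k, h in enumerate(channels):
        private_powers = [abs(inner(h, p)) ** 2 for p in p_private]
        # Common stream: decoded first, every private stream acts as noise.
        sinr_c = abs(inner(h, p_common)) ** 2 / (sum(private_powers) + noise)
        # Private stream: the common stream has already been removed by SIC.
        interference = sum(private_powers) - private_powers[k]
        sinr_k = private_powers[k] / (interference + noise)
        common_rates.append(math.log2(1.0 + sinr_c))
        private_rates.append(math.log2(1.0 + sinr_k))
    # All users decode the common stream, so the weakest user sets its rate.
    return min(common_rates), private_rates


def weighted_sum_rate(weights, common_share, r_common, private_rates):
    """WSR = sum_k u_k * (C_k + R_k), where C_k = common_share[k] * r_common."""
    if abs(sum(common_share) - 1.0) > 1e-9:
        raise ValueError("common_share must sum to 1")
    return sum(u * (s * r_common + r)
               for u, s, r in zip(weights, common_share, private_rates))


# Toy two-user, two-antenna example with orthogonal channels (made-up values).
channels = [[1 + 0j, 0j], [0j, 1 + 0j]]
p_private = [[1 + 0j, 0j], [0j, 1 + 0j]]
p_common = [1 + 0j, 1 + 0j]

r_c, r_p = one_layer_rs_rates(channels, p_common, p_private)
wsr_rs = weighted_sum_rate([0.6, 0.4], [0.5, 0.5], r_c, r_p)

# Switching the common stream off leaves MU-LP with the same private precoders.
r_c0, r_p0 = one_layer_rs_rates(channels, [0j, 0j], p_private)
wsr_mulp = weighted_sum_rate([0.6, 0.4], [0.5, 0.5], r_c0, r_p0)
```

Setting the common precoder to zero recovers the MU–LP rates, which mirrors the observation that RS contains MU–LP as a special case. In a real comparison the precoders and the common-rate split would be optimized jointly under a transmit power constraint.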
(2) Single transmit antenna deployment
Figures 44 and 45 show the average rate regions of different strategies over 10 random channel realizations when $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 1$ and $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_3^2 = 0.3$, respectively. We further show that 1-layer RS is an attractive alternative to SC–SIC.

Fig. 45 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_3^2 = 0.3$, $N_t = 1$, $\mathbf{r}_{th} = [0, 0, 0.01, 0.03, 0.1, 0.2, 0.3]$ bit/s/Hz

Appendix 5
Underloaded three-user deployment with imperfect CSIT
We consider the imperfect CSIT scenarios. The channel model in the two-user deployment with imperfect CSIT is extended here. The estimated channels of user-1, user-2, and user-3 are initialized using Eq. (38). For the given channel estimate at the BS, the channel realization is $\mathbf{h}_k = \hat{\mathbf{h}}_k + \tilde{\mathbf{h}}_k$, $\forall k \in \{1, 2, 3\}$, where $\tilde{\mathbf{h}}_k$ is the estimation error of user-k. $\tilde{\mathbf{h}}_k$ has i.i.d. complex Gaussian entries drawn from $\mathcal{CN}(0, \sigma_{e,k}^2)$. The error covariances of user-1, user-2, and user-3 are $\sigma_{e,1}^2 = P_t^{-0.6}$, $\sigma_{e,2}^2 = \gamma_1 P_t^{-0.6}$, and $\sigma_{e,3}^2 = \gamma_2 P_t^{-0.6}$, respectively. The precoders are initialized and designed using the estimated channels $\hat{\mathbf{h}}_1$, $\hat{\mathbf{h}}_2$, and $\hat{\mathbf{h}}_3$ and the same methods as stated in perfect CSIT scenarios. One thousand different channel error samples are generated for each user. Each point in the rate region is the average rate over the generated 1000 channels.

Comparing with the simulation results in perfect CSIT, the WSR gap between RS and MU–LP increases in imperfect CSIT. In contrast, the WSR gap between RS and 1-layer RS decreases in imperfect CSIT. One-layer RS achieves equal or better WSRs than SC–SIC, SC–SIC per group, and MU–LP in all figures (Figs. 46, 47, 48, 49, 50, and 51). As mentioned earlier, all forms of RS are suited to any network load and channel circumstances of users. Moreover, all forms of RS are robust to imperfect CSIT.
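The imperfect CSIT model of this appendix (true channel = estimate + error, with i.i.d. complex Gaussian error entries whose variance scales as $P_t^{-0.6}$, averaged over 1000 error samples) can be sketched as follows. This is only an illustration using the Python standard library: the function names, the `metric` callback, the example channel estimate, and the choice of $P_t = 100$ (linear scale, unit noise power assumed) are hypothetical and not taken from the paper.

```python
import math
import random


def error_variance(p_t, scale=1.0, alpha=0.6):
    """Error variance scale * P_t^(-alpha); scale plays the role of 1 or gamma."""
    return scale * p_t ** (-alpha)


def sample_channel(h_hat, sigma_e2, rng):
    """Return h = h_hat + h_tilde with i.i.d. CN(0, sigma_e2) error entries."""
    # A CN(0, s) variable has independent real and imaginary parts of variance s/2.
    std = math.sqrt(sigma_e2 / 2.0)
    return [h + complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) for h in h_hat]


def average_over_errors(metric, h_hat, sigma_e2, n_samples=1000, seed=0):
    """Average metric(h) over n_samples channel realizations around h_hat."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += metric(sample_channel(h_hat, sigma_e2, rng))
    return total / n_samples


# Hypothetical estimate for one user with four transmit antennas.
h_hat = [1 + 1j, 0.5 - 0.2j, -0.3j, 2 + 0j]
sigma_e2 = error_variance(100.0)  # P_t = 100 in linear scale (assumed)


def squared_error_norm(h):
    """Example metric: squared norm of the realized estimation error."""
    return sum(abs(a - b) ** 2 for a, b in zip(h, h_hat))


avg_error = average_over_errors(squared_error_norm, h_hat, sigma_e2)
```

In a rate evaluation, `metric` would instead return the rate achieved on the realized channel by precoders designed from `h_hat`; the squared-error metric is used here only because its expected value, four times `sigma_e2`, is easy to check.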
Fig. 44 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 1$, $N_t = 1$, $\mathbf{r}_{th} = [0, 0, 0.01, 0.03, 0.1, 0.2, 0.3]$ bit/s/Hz

Fig. 46 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 47 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 48 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 49 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 50 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 51 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Appendix 6
Overloaded three-user deployment with imperfect CSIT
We further investigate the overloaded three-user deployment with imperfect CSIT. The BS is equipped with two antennas ($N_t = 2$). Figures 52, 53, 54, 55, 56, and 57 show the simulation results when the rate threshold is $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. Comparing Fig. 52 with Fig. 39, the WSR gaps between RS and SC–SIC per group and between RS and MU–LP increase dramatically, while the WSR gap between RS and SC–SIC decreases. The inter-group interference of SC–SIC per group becomes difficult to coordinate due to the limited number of transmit antennas and imperfect CSIT. RS is able to overcome the limitations of SC–SIC per group and MU–LP by dynamically determining the level of multi-user interference to decode and treat as noise.

Fig. 52 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 53 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 54 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 55 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 56 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 57 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Appendix 7
Overloaded four-user deployment with perfect CSIT
Figures 58 and 59 show the results when $\gamma_1 = 1$. Comparing with SC–SIC per group, 1-layer RS per group always achieves equal or better WSR. One-layer RS per group is more general than SC–SIC per group. It enables the capability of partially decoding interference and partially treating interference as noise in each user group. When there is a sufficient channel gain difference between users within each group and a sufficient inter-group angle, the WSR of SC–SIC per group becomes closer to the WSR of RS, as seen by comparing Figs. 59 and 19.

Fig. 58 Weighted sum rate versus SNR comparison of different strategies for overloaded four-user deployment with perfect CSIT, $\gamma_1 = 1$, $\theta_2 = \theta_1 + [\ldots]$, $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = 0$. b $\theta_1 = \pi/18$. c $\theta_1 = \pi/9$. d $\theta_1 = \pi/6$

Fig. 59 Weighted sum rate versus SNR comparison of different strategies for overloaded four-user deployment with perfect CSIT, $\gamma_1 = 1$, $\theta_2 = \theta_1 + [\ldots]$, $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = 0$. b $\theta_1 = \pi/18$. c $\theta_1 = \pi/9$. d $\theta_1 = \pi/6$

Appendix 8
Overloaded ten-user deployment with perfect CSIT
Figure 60 shows the simulation results when $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_{10}^2 = 1$, $\mathbf{r}_{th} = [0, 0.001, 0.004, 0.01, 0.03, 0.06, 0.1]$ bit/s/Hz. Comparing with Fig. 21, the rate threshold of each SNR is reduced in Fig. 60. The WSR achieved by MU–LP approaches that of RS when SNR is 0 or 5 dB in Fig. 60. This is because the rate threshold is set to 0 when SNR is 0 dB or 5 dB. When the rate threshold is 0, MU–LP could deliver two interference-free streams since there are two transmit antennas. It achieves a DoF of 2 while SC–SIC is always limited by a DoF of 1. Figure 61 shows the simulation results when $\sigma_1^2 = 1$, $\sigma_2^2 = 0.9$, ..., $\sigma_{10}^2 = 0.1$. The rate threshold is the same as in Fig. 60. In the extremely overloaded scenario, the WSR gap between RS and SC–SIC is still large despite the diversity in channel strengths. Here again, SC–SIC makes an inefficient use of the transmit antennas and achieves a DoF of 1. In contrast, 1-layer RS, with a low scheduler and receiver complexity, achieves a good performance in all network loads.

Fig. 60 Weighted sum rate versus SNR comparison of different strategies for overloaded ten-user deployment with perfect CSIT, $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_{10}^2 = 1$, $N_t = 2$, SNR = 30 dB, $\mathbf{r}_{th} = [0, 0.001, 0.004, 0.01, 0.03, 0.06, 0.1]$ bit/s/Hz

Abbreviations
AO: Alternating optimization; AWGN: Additive white Gaussian noise; AWSR: Averaged weighted sum rate; CoMP: Coordinated multipoint; CSIR: Channel state information at the receivers; CSIT: Channel state information at the transmitter; DoA: Direction of arrival; DoF: Degrees of freedom; DPC: Dirty paper coding; FDMA: Frequency-division multiple access; GDoF: Generalized degrees of freedom; HK: Han and Kobayashi; HRS: Hierarchical rate splitting; IC: Interference channel; IoT: Internet of Things; MISO: Multiple-input single-output; MRT: Maximum ratio transmission; MSE: Mean square error; MTC: Machine-type communications; MU–LP: Multi-user linear precoding; MU–MIMO: Multi-user multiple-input multiple-output; MUST: Multi-user superposition transmission; NOMA: Non-orthogonal multiple access;
OMA: Orthogonal multiple access; QCQP: Quadratically constrained quadratic program; QoS: Quality of service; RS: Rate splitting; RSMA: Rate-splitting multiple access; SAA: Sample average approximated; SC: Superposition coding; SC–SIC: Superposition coding with successive interference cancellation; SCMA: Sparse code multiple access; SDMA: Space-division multiple access; SIC: Successive interference cancellation; SISO: Single-input single-output; SISO BC: Single-input single-output broadcast channel; SNR: Signal-to-noise ratio; SVD: Singular value decomposition; TDMA: Time-division multiple access; WMMSE: Weighted minimum mean square error; WSR: Weighted sum rate; ZFBF: Zero-forcing beamforming

Fig. 61 Weighted sum rate versus SNR comparison of different strategies for overloaded ten-user deployment with perfect CSIT, $\sigma_1^2 = 1$, $\sigma_2^2 = 0.9$, ..., $\sigma_{10}^2 = 0.1$, $N_t = 2$, SNR = 30 dB, $\mathbf{r}_{th} = [0, 0.001, 0.004, 0.01, 0.03, 0.06, 0.1]$ bit/s/Hz

Acknowledgements
The authors are deeply indebted to Dr. Hamdi Joudeh for his helpful comments and suggestions.

Funding
This work is partially supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/N015312/1.

Authors’ contributions
BC proposed the research idea. The co-authors discussed the model design and experiments together. YM performed the experiments. YM and BC co-wrote the first draft of the manuscript. BC and VOKL gave advice on writing and revised the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details
Department of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam Road, Hong Kong, China. Department of Electrical and Electronic Engineering, Imperial College London, Exhibition Road, SW7 2AZ London, UK.

Received: 3 November 2017 Accepted: 18 April 2018

References
1. Y Saito, Y Kishiyama, A Benjebbour, T Nakamura, A Li, K Higuchi, in 2013 IEEE 77th Vehicular Technology Conference (VTC Spring). Non-orthogonal multiple access (NOMA) for cellular future radio access (IEEE, 2013), pp. 1–5
2. 3GPP TR 36.859, Study on downlink multiuser superposition transmission (MUST) for LTE (Release 13). (3rd Generation Partnership Project (3GPP), 2015). http://www.3gpp.org/dynareport/36859.htm
3. H Nikopour, H Baligh, in 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC). Sparse code multiple access (IEEE, 2013), pp. 332–336
4. L Dai, B Wang, Y Yuan, S Han, C-l I, Z Wang, Non-orthogonal multiple access for 5G: solutions, challenges, opportunities, and future research trends. IEEE Commun. Mag. 53(9), 74–81 (2015)
5. Z Ding, Y Liu, J Choi, Q Sun, M Elkashlan, C-l I, HV Poor, Application of non-orthogonal multiple access in LTE and 5G networks. IEEE Commun. Mag. 55(2), 185–191 (2017)
6. W Shin, M Vaezi, B Lee, DJ Love, J Lee, HV Poor, Non-orthogonal multiple access in multi-cell networks: theory, performance, and practical challenges. IEEE Commun. Mag. 55(10), 176–183 (2017)
7. T Cover, Broadcast channels. IEEE Trans. Inf. Theory. 18(1), 2–14 (1972)
8. D Tse, P Viswanath, Fundamentals of wireless communication. (Cambridge University Press, Cambridge, 2005)
9. H Weingarten, Y Steinberg, SS Shamai, The capacity region of the Gaussian multiple-input multiple-output broadcast channel. IEEE Trans. Inf. Theory. 52(9), 3936–3964 (2006)
10. B Clerckx, C Oestges, MIMO wireless networks: channels, techniques and standards for multi-antenna, multi-user and multi-cell systems. (Academic Press, Cambridge, 2013)
11. T Yoo, A Goldsmith, On the optimality of multiantenna broadcast scheduling using zero-forcing beamforming. IEEE J. Sel. Areas Commun. 24(3), 528–541 (2006)
12. B Clerckx, H Joudeh, C Hao, M Dai, B Rassouli, Rate splitting for MIMO wireless networks: a promising PHY-layer strategy for LTE evolution. IEEE Commun. Mag. 54(5), 98–105 (2016)
13. N Jindal, MIMO broadcast channels with finite-rate feedback. IEEE Trans. Inf. Theory. 52(11), 5045–5060 (2006)
14. MF Hanif, Z Ding, T Ratnarajah, GK Karagiannidis, A minorization-maximization method for optimizing sum rate in the downlink of non-orthogonal multiple access systems. IEEE Trans. Signal Process. 64(1), 76–88 (2016)
15. J Choi, Minimum power multicast beamforming with superposition coding for multiresolution broadcast and application to NOMA systems. IEEE Trans. Commun. 63(3), 791–800 (2015)
16. Q Sun, S Han, C-l I, Z Pan, On the ergodic capacity of MIMO NOMA systems. IEEE Wirel. Commun. Lett. 4(4), 405–408 (2015)
17. Q Zhang, Q Li, J Qin, Robust beamforming for nonorthogonal multiple-access systems in MISO channels. IEEE Trans. Veh. Technol. 65(12), 10231–10236 (2016)
18. C Lim, T Yoo, B Clerckx, B Lee, B Shim, Recent trend of multiuser MIMO in LTE-advanced. IEEE Commun. Mag. 51(3), 127–135 (2013)
19. Z Chen, Z Ding, X Dai, GK Karagiannidis, On the application of quasi-degradation to MISO-NOMA downlink. IEEE Trans. Signal Process. 64(23), 6174–6189 (2016)
20. Z Ding, F Adachi, HV Poor, The application of MIMO to non-orthogonal multiple access. IEEE Trans. Wirel. Commun. 15(1), 537–552 (2016)
21. J Choi, On generalized downlink beamforming with NOMA. J. Commun. Netw. 19(4), 319–328 (2017)
22. W Shin, M Vaezi, B Lee, DJ Love, J Lee, HV Poor, Coordinated beamforming for multi-cell MIMO-NOMA. IEEE Commun. Lett. 21(1), 84–87 (2017)
23. VD Nguyen, HD Tuan, TQ Duong, HV Poor, OS Shin, Precoder design for signal superposition in MIMO-NOMA multicell networks. IEEE J. Sel. Areas Commun. 35(12), 2681–2695 (2017)
24. M Zeng, A Yadav, OA Dobre, GI Tsiropoulos, HV Poor, Capacity comparison between MIMO-NOMA and MIMO-OMA with multiple users in a cluster. IEEE J. Sel. Areas Commun. 35(10), 2413–2424 (2017)
25. T Han, K Kobayashi, A new achievable rate region for the interference channel. IEEE Trans. Inf. Theory. 27(1), 49–60 (1981)
26. AG Davoodi, SA Jafar, Aligned image sets under channel uncertainty: settling conjectures on the collapse of degrees of freedom under finite precision CSIT. IEEE Trans. Inf. Theory. 62(10), 5603–5618 (2016)
27. S Yang, M Kobayashi, D Gesbert, X Yi, Degrees of freedom of time correlated MISO broadcast channel with delayed CSIT. IEEE Trans. Inf. Theory. 59(1), 315–328 (2013)
28. H Joudeh, B Clerckx, Sum-rate maximization for linearly precoded downlink multiuser MISO systems with partial CSIT: a rate-splitting approach. IEEE Trans. Commun. 64(11), 4847–4861 (2016)
29. E Piovano, B Clerckx, Optimal DoF region of the K-user MISO BC with partial CSIT. IEEE Commun. Lett. 21(11), 2368–2371 (2017)
30. C Hao, B Clerckx, MISO networks with imperfect CSIT: a topological rate-splitting approach. IEEE Trans. Commun. 65(5), 2164–2179 (2017)
31. C Hao, B Rassouli, B Clerckx, Achievable DoF regions of MIMO networks with imperfect CSIT. IEEE Trans. Inf. Theory. 63(10), 6587–6606 (2017)
32. H Joudeh, B Clerckx, Robust transmission in downlink multiuser MISO systems: a rate-splitting approach. IEEE Trans. Signal Process. 64(23), 6227–6242 (2016)
33. E Piovano, H Joudeh, B Clerckx, in 2016 50th Asilomar Conference on Signals, Systems and Computers. Overloaded multiuser MISO transmission with imperfect CSIT (IEEE, 2016), pp. 34–38
34. H Joudeh, B Clerckx, Rate-splitting for max-min fair multigroup multicast beamforming in overloaded systems. IEEE Trans. Wirel. Commun. 16(11), 7276–7289 (2017)
35. RH Etkin, DNC Tse, H Wang, Gaussian interference channel capacity to within one bit. IEEE Trans. Inf. Theory. 54(12), 5534–5562 (2008)
36. AG Davoodi, SA Jafar, in 2016 IEEE International Symposium on Information Theory (ISIT). GDoF of the MISO BC: bridging the gap between finite precision CSIT and perfect CSIT (IEEE, 2016), pp. 1297–1301
37. AG Davoodi, SA Jafar, Transmitter cooperation under finite precision CSIT: a GDoF perspective. IEEE Trans. Inf. Theory. 63(9), 6020–6030 (2017)
38. C Hao, Y Wu, B Clerckx, Rate analysis of two-receiver MISO broadcast channel with finite rate feedback: a rate-splitting approach. IEEE Trans. Commun. 63(9), 3232–3246 (2015)
39. M Dai, B Clerckx, D Gesbert, G Caire, A rate splitting strategy for massive MIMO with imperfect CSIT. IEEE Trans. Wirel. Commun. 15(7), 4611–4624 (2016)
40. M Dai, B Clerckx, Multiuser millimeter wave beamforming strategies with quantized and statistical CSIT. IEEE Trans. Wirel. Commun. 16(11), 7025–7038 (2017)
41. A Papazafeiropoulos, B Clerckx, T Ratnarajah, Rate-splitting to mitigate residual transceiver hardware impairments in massive MIMO systems. IEEE Trans. Veh. Technol. 66(9), 8196–8211 (2017)
42. SS Christensen, R Agarwal, ED Carvalho, JM Cioffi, Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design. IEEE Trans. Wirel. Commun. 7(12), 4792–4799 (2008)
43. B Zheng, X Wang, M Wen, F Chen, NOMA-based multi-pair two-way relay networks with rate splitting and group decoding. IEEE J. Sel. Areas Commun. 35(10), 2328–2341 (2017)
44. H Viswanathan, S Venkatesan, H Huang, Downlink capacity evaluation of cellular networks with known-interference cancellation. IEEE J. Sel. Areas Commun. 21(5), 802–811 (2003)
45. Q Li, G Li, W Lee, M-i Lee, D Mazzarese, B Clerckx, Z Li, MIMO techniques in WiMAX and LTE: a feature overview. IEEE Commun. Mag. 48(5), 86–92 (2010)
46. B Rimoldi, R Urbanke, A rate-splitting approach to the Gaussian multiple-access channel. IEEE Trans. Inf. Theory. 42(2), 364–375 (1996)

EURASIP Journal on Wireless Communications and Networking
Springer Journals

Rate-splitting multiple access for downlink communication systems: bridging, generalizing, and outperforming SDMA and NOMA

54 pages
Publisher
Springer International Publishing
Copyright
Copyright © 2018 by The Author(s)
Subject
Engineering; Signal, Image and Speech Processing; Communications Engineering, Networks; Information Systems Applications (incl. Internet)
eISSN
1687-1499
D.O.I.
10.1186/s13638-018-1104-7

Abstract

Space-division multiple access (SDMA) utilizes linear precoding to separate users in the spatial domain and relies on fully treating any residual multi-user interference as noise. Non-orthogonal multiple access (NOMA) uses linearly precoded superposition coding with successive interference cancellation (SIC) to superpose users in the power domain and relies on user grouping and ordering to force some users to fully decode and cancel interference created by other users. In this paper, we argue that to efficiently cope with the high throughput, heterogeneity of quality of service (QoS), and massive connectivity requirements of future multi-antenna wireless networks, multiple access design needs to depart from those two extreme interference management strategies, namely fully treating interference as noise (as in SDMA) and fully decoding interference (as in NOMA). Considering a multiple-input single-output broadcast channel, we develop a novel multiple access framework, called rate-splitting multiple access (RSMA). RSMA is a more general and more powerful multiple access for downlink multi-antenna systems that contains SDMA and NOMA as special cases. RSMA relies on linearly precoded rate-splitting with SIC to decode part of the interference and treat the remaining part of the interference as noise. This capability of RSMA to partially decode interference and partially treat interference as noise makes it possible to softly bridge the two extremes of fully decoding interference and treating interference as noise and provides room for rate and QoS enhancements and complexity reduction. The three multiple access schemes are compared, and extensive numerical results show that RSMA provides a smooth transition between SDMA and NOMA and outperforms them both in a wide range of network loads (underloaded and overloaded regimes) and user deployments (with a diversity of channel directions, channel strengths, and qualities of channel state information at the transmitter).
Moreover, RSMA provides rate and QoS enhancements over NOMA at a lower computational complexity for the transmit scheduler and the receivers (number of SIC layers).

Keywords: RSMA, NOMA, SDMA, MISO BC, Linear precoding, Rate region, Weighted sum rate, Rate splitting

*Correspondence: maoyijie@hku.hk
Department of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam Road, Hong Kong, China
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

1 Introduction
With the dramatic upsurge in the number of devices expected in 5G and beyond, wireless networks will be operated in a variety of regimes ranging from underloaded to overloaded (where the number of scheduled devices is smaller and larger than the number of transmit antennas at each access point, respectively). Moreover, due to the heterogeneity of devices (high-end such as smartphones and low-end such as Internet of Things and Machine-Type Communications devices), deployments, and applications in 5G and beyond, the transmitter will need to serve simultaneously users with different capabilities, deployments, and qualities of channel state information at the transmitter (CSIT). This massive connectivity problem together with the demands for high throughput and heterogeneity of quality of service (QoS) has recently spurred interest in re-thinking multiple access for the downlink of communication systems.

In this paper, we propose a new multiple access called rate-splitting multiple access (RSMA). In order to fully assess the novelty of the proposed multiple access paradigm and the design philosophy, we first review the state of the art of two major multiple accesses, namely non-orthogonal multiple access (NOMA) [1], also called Multi-User Superposition Transmission (MUST) in 3GPP LTE Rel-13 [2], and space-division multiple access (SDMA). We identify their benefits and limitations and make critical observations, before motivating the introduction of the novel and more powerful RSMA.

1.1 SDMA and NOMA: the extremes
Contrary to orthogonal multiple access (OMA) that schedules users or groups of users in orthogonal dimensions, e.g., time (TDMA) and frequency (FDMA), NOMA superposes users in the same time-frequency resource via the power domain or the code domain, leading to the power-domain NOMA (e.g., [1]) and code-domain NOMA (e.g., sparse code multiple access (SCMA) [3]). Power-domain NOMA relies on superposition coding (SC) at the transmitter and successive interference cancellation (SIC) at the receivers (denoted in short as SC–SIC) [1, 4–6]. Such a strategy is motivated by the well-known result that SC–SIC achieves the capacity region of the single-input single-output (SISO) (Gaussian) broadcast channel (BC) [7, 8]. It is also well known that the capacity region of the SISO BC is larger than the rate region achieved by OMA (e.g., TDMA) when users experience a disparity of channel strengths [8]. On the other hand, when users exhibit the same channel strengths, OMA based on TDMA is sufficient to achieve the capacity region [8].

The benefit of a single-antenna NOMA using SC–SIC is therefore to be able, despite the presence of a single transmit antenna in a SISO BC, to cope with an overloaded regime in a spectrally efficient manner where multiple users experience potentially very different channel strengths/path losses (e.g., cell-center users and cell-edge users) on the same time/frequency resource.

The limitation of a single-antenna NOMA lies in its complexity as the number of users grows. Indeed, for a K-user SISO BC, the strongest user needs to decode using SIC the K − 1 messages of all co-scheduled users and therefore peel off K − 1 layers before accessing its intended stream. Though SIC of a small number of layers should be feasible in practice, the complexity and likelihood of error propagation quickly become significant for a large number of users. This calls for ways to decrease the number of SIC layers at each user. One could divide users into small groups of users with disparate channels and apply SC–SIC in each group and schedule groups on orthogonal resources (using OMA), but that may lead to some performance loss and latency increase.

In today's wireless networks, access points are often equipped with more than one antenna. This spatial dimension opens the door to another well-known type of multiple access, namely SDMA. SDMA superposes users in the same time-frequency resource and separates users via a proper use of the spatial dimensions. Contrary to the SISO BC, the multi-antenna BC is non-degraded, i.e., users cannot be ordered based on their channel strengths in general settings. This is the reason why SC–SIC is not capacity-achieving, and the complex dirty paper coding (DPC) is the only strategy that achieves the capacity region of the multiple-input single-output (MISO) (Gaussian) BC with perfect CSIT [9]. DPC, rather than performing interference cancellation at the receivers as in SC–SIC, can be viewed as a form of enhanced interference cancellation at the transmitter and relies on perfect CSIT to do so. Due to the high computational burden of DPC, linear precoding is often considered the most attractive alternative to simplify the transmitter design [10]. Interestingly, in a MISO BC, multi-user linear precoding (MU–LP), e.g., either in closed form or optimized using optimization methods, though suboptimal, is often very useful when users experience relatively similar channel strengths or long-term signal-to-noise ratio (SNR) and have semi-orthogonal to orthogonal channels [11]. SDMA is therefore commonly implemented using MU–LP. The linear precoders create different beams with each beam being allocated a fraction of the total transmit power. Hence, similarly to NOMA, SDMA can also be viewed as a superposition of users in the power domain, though users are separated at the transmitter side by spatial beamformers rather than by the use of SIC at the receivers.

SDMA based on MU–LP is a well-established multiple access that is nowadays the basic principle behind numerous techniques in 4G and 5G such as multi-user multiple-input multiple-output (MU–MIMO), coordinated multipoint (CoMP) coordinated beamforming, network MIMO, millimeter-wave MIMO, and massive MIMO.

The benefit of SDMA using MU–LP is therefore to reap all spatial multiplexing benefits of a MISO BC with perfect CSIT with a low precoder and receiver complexity.

The limitations of SDMA are threefold.

First, it is suited to the underloaded regime, and the performance of MU–LP in the overloaded regime quickly drops as it requires more transmit antennas than users to be able to efficiently manage the multi-user interference. When the MISO BC becomes overloaded, the current and popular approach for the transmitter is to schedule groups of users over orthogonal dimensions (e.g., time/frequency) and perform linear precoding in each group, which may increase latency and decrease QoS depending on the application.

Second, its performance is sensitive to the user channel orthogonality and strengths and requires the scheduler to pair semi-orthogonal users with similar channel strengths together. The complexity of the scheduler can quickly increase when an exhaustive search is performed, though low-complexity (suboptimal) scheduling and user-pairing algorithms exist [10].

Third, it is optimal from a degrees of freedom (DoF), also known as spatial multiplexing gain, perspective in the perfect CSIT setting but not in the presence of imperfect CSIT [12]. The problem of SDMA design in the presence of imperfect CSIT has been to strive to apply a framework motivated by perfect CSIT to scenarios with imperfect CSIT, not to design a framework motivated by imperfect CSIT from the beginning [12]. This leads to the well-known severe performance loss of MU–LP in the presence of imperfect CSIT [13].

In view of SC–SIC benefits in a SISO BC, attempts have been made to study multi-antenna NOMA. Two lines of research have emerged that both rely on linearly precoded SC–SIC.

The first strategy, which we simply denote as “SC–SIC,” is a direct application of SC–SIC to the MISO BC by degrading the multi-antenna broadcast channel. It consists in ordering users based on their effective scalar channel (after precoding) strengths and enforcing receivers to decode messages (and cancel interference) in a successive manner. This is advocated and exemplified for instance in [14–17]. This NOMA strategy converts the multi-antenna non-degraded channel into an effective

a dynamic switching between SC–SIC and zero-forcing beamforming (ZFBF) was investigated.

The second strategy, which we denote as “SC–SIC per group,” consists in grouping K users into G groups. Users within each group are served using SC–SIC, and users across groups are served using SDMA so as to mitigate the inter-group interference. Examples of such a strategy can be found in [1, 20–24]. This strategy can therefore be seen as a combination of SDMA and NOMA where the multi-antenna system is effectively decomposed into G hopefully non-interfering single-antenna NOMA channels. For this “SC–SIC per group” approach to perform at its best, users within each group need to have their channels aligned and users across groups need to be orthogonal.

Similarly to SDMA, multi-antenna NOMA designs also rely on accurate CSIT. In the practical scenario of imperfect CSIT, NOMA design relies on the same above two strategies but optimizes the precoder so as to cope with CSIT imperfection and resulting extra multi-user interference. As an example, the MISO BC channel is again degraded in [17] and precoder optimization with imperfect CSIT is studied.

The benefit of multi-antenna NOMA, similarly to the single-antenna NOMA, is the potential to cope with an overloaded regime where multiple users experience different channel strengths/path losses and/or are closely aligned with each other.

The limitations of multi-antenna NOMA are fourfold.

First, the use of SC–SIC in NOMA is fundamentally motivated by a degraded BC in which users can be ordered based on their channel strengths.
This is the key property single-antenna degraded channel, as at least one receiver of the SISO BC that enables SC–SIC to achieve its capacity ends up decoding all messages. While such a strategy can region. Unfortunately, motivated by the promising gains cope with the deployment of users experiencing aligned of SC–SIC in a SISO BC, the multi-antenna NOMA lit- channels and different path loss conditions, it comes at erature strives to apply SC–SIC to a non-degraded MISO the expense of sacrificing and annihilating all spatial mul- BC. This forces to degrade a non-degraded BC and there- tiplexing gains in general settings. By forcing one receiver fore leads to an inefficient use of the spatial dimensions in to decode all streams, the sum DoF is reduced to unity . general settings, leading to a DoF loss. This is the same DoF as that achieved by TDMA/single- Second, NOMA is not suited for general user deploy- user beamforming (or OMA). This is significantly smaller ments since degrading a MISO BC is efficient when users than the sum DoF achieved by DPC and MU–LP in a are sufficiently aligned with each other and exhibit a MISO BC with perfect CSIT, which is the minimum of disparity of channel strengths, not in general settings. the number of transmit antennas and the number of Third, multi-antenna NOMA comes with an increase users . Moreover, this loss in multiplexing gain comes in complexity at both the transmitter and the receivers. with a significant increase in receiver complexity due to Indeed, a multi-layer SIC is needed at the receivers, sim- the multi-layer SIC compared to the treat interference ilarly to the single-antenna NOMA. However, in addition, as noise strategy of MU–LP. 
As a remedy to recover the since there exists no natural order for the users’ chan- DoF loss, we could envision a dynamic switching between nels in multi-antenna NOMA (because we deal with vec- NOMA and SDMA, reminiscent of the dynamic switch- tors rather than scalars), the precoders, the groups, and ing between SU–MIMO and MU–MIMO in 4G [18]. the decoding orders have to be jointly optimized by the One would dynamically choose the best option between scheduler at the transmitter. Taking as an example, the NOMA and SDMA as a function of the channel states. A application of NOMA based on “SC–SIC” to a three-user particular instance of this approach is taken in [19]where MISO BC, we need to optimize three precoders, one for Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 4 of 54 each user, along with the six possible decoding orders. two extreme of fully treat interference as noise and fully Increasing the number of users leads to an exponential decode interference. increase in the number of possible decoding orders. The idea of RS dates back to Carleial’s work and the “SC–SIC per group” divides users into multiple groups Han and Kobayashi (HK) scheme for the two-user single- but that approach leads to a joint design of user order- antenna interference channel (IC) [25]. However, the use ing and user grouping. To decrease the complexity in of RS as the building block of RSMA is motivated by user ordering and user grouping, multi-antenna NOMA recent works that have shown the benefit of RS in multi- (SC–SIC and SC–SIC per group) forces users belonging antenna BC and the recent progress on characterizing the to the same group to share the same precoder (beam- fundamental limits of a multi-antenna BC (and IC) with forming vector) [1]. Unfortunately, such a restriction can imperfect CSIT. 
Hence, importantly, in contrast with the only further hurt the overall performance since it shrinks conventional RS (HK scheme) used for the two-user SISO the overall optimization space. IC, we here use RS in a different setup, namely (1) in a BC Fourth, multi-antenna NOMA is subject to the same and (2) with multiple antennas. The use and benefits of RS drawback as SDMA in the presence of imperfect CSIT, in a multi-antenna BC only appeared in the last few years . namely its design is not motivated by any fundamental The capacity region of the K-user MISO BC with imper- limits of aMISOBCwithimperfect CSIT. fect CSIT remains an open problem. As an alternative, The key is to recognize that the limitations and draw- recent progress has been made to characterize the DoF backs of SDMA and NOMA originate from the fact that region of the underloaded and overloaded MISO BC with those two multiple accesses fundamentally rely on two imperfect CSIT. In [26], a novel information theoretic extreme interference management strategies, namely fully upperbound on the sum DoF of the K-user underloaded treat interference as noise and fully decode interference. MISO BC with imperfect CSIT was derived. Interestingly, Indeed, while NOMA relies on some users to fully decode this sum DoF coincides with the sum DoF achieved by and cancel interference created by other users, SDMA a linearly precoded RS strategy at the transmitter with relies on fully treating any residual multi-user interference SIC at the receivers [27, 28]. RS (with SIC) is therefore as noise. In the presence of imperfect CSIT, CSIT inaccu- optimum to achieve the sum DoF of the K-user under- racy results in an additional multi-user interference that is loaded MISO BC with imperfect CSIT, in contrast with treated as noise by both NOMA (SC–SIC per group) and MU–LP that is clearly suboptimum (and so is SC–SIC SDMA. since it achieves a sum DoF of unity )[28]. 
It turns out that RS with a flexible power allocation is not only opti- mum for the sum DoF but for the entire DoF region of 1.2 RSMA: bridging the extremes In contrast, with RSMA, we take a different route and an underloaded MISO BC with imperfect CSIT [29]. The depart from the SDMA and NOMA literature and those DoF benefit of RS in imperfect CSIT settings were also two extremes of fully decode interference and treat inter- shown in more complicated underloaded networks with ference as noise. We introduce a more general and power- multiple transmitters in [30] and multi-antenna receivers ful multiple access framework based on linearly precoded [31]. Considering user fairness, the optimum symmetric rate splitting (RS) at the transmitter and SIC at the DoF (or max-min DoF), i.e., the DoF that can be achieved receivers. This enables to decode part of the interference by all users simultaneously, of the underloaded MISO BC and treat the remaining part of the interference as noise with imperfect CSIT with MU–LP and RS was studied in [12]. This capability of RSMA to partially decode inter- [32]. RS symmetric DoF was shown to outperform that of ference and partially treat interference as noise enables to MU–LP. Finally, moving to the overloaded MISO BC with softly bridge the two extreme strategies of fully treating heterogeneous CSIT qualities, a multi-layer power parti- interference as noise and fully decoding interference. This tioning strategy that superimposes degraded symbols on contrasts sharply with SDMA and NOMA that exclusively top of linearly precoded rate-splitted symbols was shown rely on the two extremes or a combination thereof. in [33] to achieve the optimal DoF region. In order to partially decode interference and partially ThebenefitsofRShavealsoappearedinmulti-antenna treat interference as noise, RS splits messages into com- settings with perfect CSIT. 
In an overloaded multigroup mon and private messages and relies on a superimposed multicast setting with perfect CSIT, considering again transmission of common messages decoded by multiple fairness, the symmetric DoF achieved by RS, MU–LP, and users and private messages decoded by their correspond- degraded NOMA transmissions (where receivers decode ing users (and treated as noise by co-scheduled users). messages and cancel interference in a successive manner Users rely on SIC to first decode the common messages as in SC–SIC) was studied in [34]. It was shown that RS here again outperforms both MU–LP and SC–SIC. before accessing the private messages. By adjusting the message split and the power allocation to the common and The DoF metric is insightful to identify the multiplex- private messages, RS has the ability to softly bridge the ing gains of the MISO BC at high SNR but fails to capture Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 5 of 54 the diversity of channel strengths among users. This limi- is the first paper to explicitly recognize that SDMA and tation is countered by the generalized DoF (GDoF) frame- NOMA are both subsets of a more general transmission work, which inherits the tractability of the DoF framework framework based on RS . while capturing the diversity in channel strengths [35]. Second, we provide a general framework of multi- In [36, 37], the GDoF of an underloaded MISO BC with layer RS design that encompasses existing RS schemes imperfect CSIT is studied, and here again, RS is used as as special cases. In particular, the single-layer RS of part of the achievability scheme. [28, 29, 32–34, 38, 40, 41] and the multi-layer (hierarchical The DoF (GDoF) superiority of RS over MU–LP and and topological) RS of [30, 39] are special instances of the SC–SIC in all those multi-antenna settings (with perfect generalized RS strategy developed here. 
Moreover, the use and imperfect CSIT) comes from the ability of RS to better of RS was primarily motivated by multi-antenna deploy- handle the multi-user interference by evolving in a regime ments subject to multi-user interference due to imperfect in between the extremes of fully treating it as noise and CSIT in those works. The benefit of RS in the presence fully decoding it. of perfect CSIT and/or a diversity of channel strengths in Importantly, therateenhancementsofRSoverMU– a multi-antenna setup, as considered in this paper, is less LP, as predicted by the DoF analysis, are reflected in investigated. RS was shown in [34]toboost theperfor- the finite SNR regime as shown in a number of recent mance of overloaded multi-group multi-cast. However, no works. In [38], finite SNR rate analysis of RS in MISO attempt has been made so far to identify the benefit of RS BC in the presence of quantized feedback was analyzed in multi-antenna BC with perfect CSIT and/or a diversity and it was shown that RS benefits from a CSI feedback of channel strengths. overhead reduction compared to MU–LP. Using opti- Third, we show that the rate performance (rate region, mization methods, the precoder design of RS at finite weighted sum-rate with and without QoS constraints) of SNR was investigated in [28] for the sum rate and rate RSMA is always equal to or larger than that of SDMA and region maximization with imperfect CSIT, in [32]for NOMA. Considering a MISO BC with perfect CSIT and max-min fair transmission with imperfect CSIT, and in no QoS constraints, RSMA performance comes closer to [34] for multi-group multi-cast with perfect CSIT. More- the optimal DPC region than SDMA and NOMA. In sce- over, the benefit of RS over MU–LP in the finite SNR narios with QoS constraints or imperfect CSIT, RSMA regime was shown in massive MIMO [39], millimeter- always outperforms SDMA and NOMA. 
Since it is moti- wave systems [40] and multi-antenna deployments subject vated by fundamental DoF analysis, RSMA is also optimal to hardware impairments [41]. Finally, the performance from a DoF perspective in both perfect and imperfect benefits of the power-partitioning strategy relying on RS CSIT and therefore optimally exploit the spatial dimen- in the overloaded MISO BC with heterogeneous CSIT was sions and the availability of CSIT, in contrast with SDMA confirmed using simulations at finite SNR in the presence and NOMA that are suboptimal. of a diversity of channel strengths [33]. In particular, in Fourth, we show that RSMA is much more robust than contrast to the RS used in [12, 28, 29, 32–34, 38, 40, 41] SDMA and NOMA to user deployments, CSIT inaccu- that relies on a single common message, [39] (as well as racy, and network load. It can operate in a wide range of [30]) showed the benefits in the finite SNR regime of a practical deployments involving scenarios where the user multi-layer (hierarchical) RS relying on multiple common channels are neither orthogonal nor aligned and exhibit messages decoded by various groups of users. similar strengths or a diversity of strengths, where the In this paper, in view of the limitations of SDMA and CSI is perfectly or imperfectly known to the transmitter, NOMA and the above literature on RS in multi-antenna and where the network load can vary between the under- BC, we design a novel multiple access, called rate-splitting loaded and the overloaded regimes. In particular, in the multiple access (RSMA) for downlink communication overloaded regime, the RSMA framework is shown to be system . 
RSMA is a much more attractive solution (per- particularly suited to cope with a variety of device capa- formance and complexity-wise) that retains the benefits bilities, e.g., high-end devices along with cheap Internet- of SDMA and NOMA but tackles all the aforementioned of-Things (IoT)/Machine-Type Communications (MTC) limitations of SDMA and NOMA. Considering a MISO devices. Indeed, the RS framework can be used to pack BC, we make the following contributions. the IoT/MTC traffic in the common message, while still First, we show that RSMA is a more general delivering high-quality service to high-end devices. class/framework of multi-user transmission that encom- Fifth, we show that the performance gain can come passes SDMA and NOMA as special cases. RSMA is with a lower computational complexity than NOMA for showntoreducetoSDMAifchannelsareofsimilar both the transmit scheduler and the receivers. In con- strengths and sufficiently orthogonal with each other trast to NOMA that requires complicated user grouping and to NOMA if channels exhibit sufficiently diverse and ordering and potential dynamic switching (between strengths and are sufficiently aligned with each other. This SDMA, SC–SIC and SC–SIC per group) at the transmit Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 6 of 54 scheduler and multiple layers of SIC at the receivers, stand for an identity matrix and an all-zero vector, respec- a simple one-layer RS that does not require any user tively, with appropriate dimensions. CN (δ, σ ) represents ordering, grouping, or dynamic switching at the transmit a complex Gaussian distribution with mean δ and variance scheduler and a single layer of SIC at the receivers still σ . |A| is the cardinality of the set A. significantly outperforms NOMA. 
In contrast to SDMA, 2Systemmodel RSMA is less sensitive to user pairing and therefore does Consider a system where a base station (BS) equipped not require complex user scheduling and pairing .How- with N antennas serves K single-antenna users. The users ever, RSMA comes with a slightly higher encoding com- N ×1 areindexed by theset K ={1, ... , K }.Let x ∈ C plexity than SDMA and NOMA due to the encoding of denotes the signal vector transmitted in a given channel the common streams on top of the private streams. use. It is subject to the power constraint E{x }≤ P . Sixth, though SC–SIC is optimal to achieve the capacity The signal received at user-k is region of SISO BC, we show that a single-layer RS is a low- complexity alternative that only requires a single layer of y = h x + n , ∀k ∈ K (1) k k SIC at each receiver and achieves close to SC–SIC (with N ×1 where h ∈ C is the channel between the BS and user-k. multi-layer SIC) performance in a SISO BC deployment. n ∼ CN 0, σ is the additive white Gaussian noise As a takeaway message, we note that the ability of a k n,k wireless network architecture to partially decode inter- (AWGN) at the receiver. Without loss of generality, we ference and partially treat interference as noise can lead assume the noise variances are equal to one for all users. to enhanced throughput and QoS, increased robustness, The transmit SNR is equal to the total power consumption P . and lowered complexity compared to alternatives that are We assume CSI of users is perfectly known at the BS in forced to operate in the extreme regimes of fully treating the following model. The imperfect CSIT scenario will be interference as noise and fully decoding interference. discussed in the proposed algorithm and the numerical It is also worth making the analogy with other types results. Channel state information at the receivers (CSIR) of channels where the ability to bridge the extremes of isassumedtobeperfect. 
treating interference as noise and fully decoding inter- In this work, we are interested in beamforming designs ference has appeared. Considering a two-user SISO IC, for signal x at the BS. Specifically, the objective of beam- interference is fully decoded in the strong interference forming designs is to maximize the WSR of users subject regime and is treated as noise in the weak interference to a power constraint of the BS and QoS constraints of regime. Between those two extremes, interference is nei- each user. We firstly state and compare two baseline multi- ther strong enough to be fully decoded nor weak enough antenna multiple accesses, namely SDMA and NOMA. to be treated as noise. The best known strategy for the Then, RSMA is explained. The WSR problem of each two-user SISO IC is obtained using RS (so-called HK strategy will be formulated, and the algorithm adopted scheme). RS in this context is well known to be superior to solve the corresponding problem will be stated in the to strategies relying on fully treating interference as noise, following sections. fully decoding interference, or orthogonalization (TDMA, FDMA) [25, 35]. Limiting ourselves to those extremes 3 SDMA and NOMA strategies is suboptimal [25, 35]. In this section, we describe two baseline multiple accesses. The rest of the paper is organized as follows. The sys- The messages W , ... , W intended for users 1 to K, 1 K tem model is described in Section 2. The existing multiple respectively, are encoded into K independent data streams accesses are specified in Section 3.InSection 4,the s =[ s , ... , s ] independently. Symbols are mapped 1 K proposed RSMA and its low-complexity structures are to the transmit antennas through a precoding matrix N ×1 described and compared with existing multiple accesses. denoted by P =[ p , ... , p ], where p ∈ C is the 1 K The corresponding weighted sum rate (WSR) problems precoder for user-k. 
The superposed signal is x = Ps = are formulated, and the weighted MMSE (WMMSE) p s . Assuming that E{ss }= I, the transmit k k k∈K approach to solve the problem is discussed. Numerical power is constrained by tr(PP ) ≤ P . results are illustrated in Section 5, followed by conclusions and future works in Section 6. 3.1 SDMA Notations: The boldface uppercase and lowercase letters SDMA based on MU–LP is a well-established multiple are used to represent matrices and vectors. The super- access. Each user only decodes its desired message by T H scripts (·) and (·) denote transpose and conjugate- treating interference as noise. The signal-to-interference- transpose operators, respectively. tr(·) and diag(·) are the plus-noise ratio (SINR) at user-k is given by trace and diagonal entries, respectively. |·| is the absolute H 2 |h p | value, and · is the Euclidean norm. E{·} refers to the sta- γ =  .(2) |h p | + 1 tistical expectation. C denotes the complex space. I and 0 j j=k,j∈K k Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 7 of 54 For a given weight vector u = [u , ... , u ],the WSR where R = min {log (1 + γ )}.In 1 K π(k) i≥k,i∈K π(i)→π(k) achieved by MU–LP is [14], the problem (5) with equal weights is solved by the approximation technique minorization-maximization R (u) = max u R MU−LP k k algorithm (MMA). To keep a single and unified approach k∈K to solve the WSR problem of different beamforming (3) s.t. tr PP ≤ P strategies, we still use the WMMSE algorithm to solve th it. By approximating the rate region with a set of rate R ≥ R , ∀k ∈ K weights, therateregion R (π ) with a certain decod- SC−SIC where R = log (1 + γ ) is theachievablerateofuser-k. k k ing order π is attained. To achieve the rate region of u is a non-negative constant which allows resource allo- SC–SIC, all decoding orders should be considered. The th cation to prioritize different users. 
R accounts for any k largest achievable rate region of SC–SIC is defined as potential individual rate constraint for user-k.Itensures the convex hull of the union over all decoding orders as the QoS of each user. The WMMSE algorithm proposed in R = conv(∪ R (π )). SC−SIC π SC−SIC [42] is adopted to solve problem (3). The main idea of the WMMSE algorithm is to reformulate the WSR problem 3.2.2 SC–SIC per group into its equivalent WMMSE problem and solve it using the Assuming the K users are divided into G groups, denoted alternating optimization (AO) approach. The rate region as G ={1, ... , G}. In each group, there is a subset of users of the MU–LP strategy is approximated by R (u) for MU−LP K , g ∈ G. The user groups satisfy the following condi- different rate weight vectors u. The resulting rate region tions: K ∩ K =∅,if g = g ,and |G |= K.Denote g g g∈G R is the convex hull enclosing the resulting points. MU−LP π as one of the decoding orders of the users in K ,the g g In general, solution to problem (3) would provide the message of user-π (k) is decoded before the message of optimal MU–LP beamforming strategy for any channel user-π (j), ∀k ≤ j. The messages of user-π (k), ∀k ≤ i are g g deployment (in between aligned and orthogonal channels decoded at user-π (i) using SIC. The SINR experienced at and with similar or diverse channel strengths). user-π (i) to decode the message of user-π (k), k ≤ i is g g given by 3.2 NOMA H 2 |h p | NOMA relies on superposition coding at the transmitter π (k) π (i) g γ =  , π (i)→π (k) g g and successive interference cancellation at the receiver. As H 2 |h p | + I + 1 π (j) π (i) j>k,j∈K g g g π (i) discussed in the introduction, the two main strategies in (6) multi-antenna NOMA are the SC–SIC and SC–SIC per group. SC–SIC can be treated as a special case of SC–SIC H 2 where I = |h p | is the inter- π (i) j g g ∈G,g =g j∈K π (i) g g per group where there is only one group of users. group interference suffered at user-π (i). 
For a given weight vector u =[ u , ... , u ], a fixed grouping method 1 K 3.2.1 SC–SIC G andafixeddecodingorder π ={π , ... , π },the WSR 1 G In SC–SIC, the precoders and decoding orders have to achieved by SC–SIC per group is be optimized jointly. The decoding order is vital to the group rate obtained at each user. To maximize the WSR, all R (u, G, π) = max u R π (k) π (k) SC−SIC g g possible decoding orders of users are required to be con- g∈G k∈K sidered. Denote π as one of the decoding orders, the (7) s.t. tr PP ≤ P message of user-π(k) is decoded before the message of th user-π(j), ∀k ≤ j. The messages of user-π(k), ∀k ≤ i are R ≥ R , ∀k ∈ K decoded at user-π(i) using SIC. The SINR experienced at where R = min {log (1 + γ )}. Simi- π (k) i≥k,i∈K π (i)→π (k) g g 2 g g user-π(i) to decode the message of user-π(k), k ≤ i is larly to the SC–SIC strategy, the problem can be solved by given by using the WMMSE algorithm. To maximize the WSR, all |h p | π(k) possible grouping methods and decoding orders should be π(i) γ =  .(4) π(i)→π(k) H 2 considered. |h p | + 1 π(j) j>k,j∈K π(i) For a given weight vector u = [u , ... , u ] and a fixed 1 K Remark 1: As described in the introduction, it is com- decoding order π, the WSR achieved by SC–SIC is mon in the multi-antenna NOMA literature (SC–SIC and SC–SIC per group) to force users belonging to the same R (u, π) = max u R SC−SIC π(k) π(k) group to share the same precoder, so as to decrease the com- k∈K (5) plexity in user ordering and user grouping. Note that, in the s.t. tr PP ≤ P system model described for both SC–SIC and SC–SIC per th R ≥ R , ∀k ∈ K k group, we consider the most general framework where each k Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 8 of 54 message is precoded by its own precoder. Hence, we here as noise. Therefore, each user decodes part of the message do not constrain symbols to be superimposed on the same of the other interfering user encoded in s . 
The interfer- precoder as this would further reduce the performance of ence is partially decoded at each user. The SINR of the NOMA strategies and therefore leading to even lower per- common stream at user-k is formance. Hence, the performance obtained with NOMA h p 12 k in this work can be seen as the best possible performance γ = .(8) 2 2 H H h p + h p + 1 achieved by NOMA. 1 2 k k Once s is successfully decoded, its contribution to the 4 Methods—proposed rate-splitting multiple original received signal y is subtracted.After that,user-k access decodes its private stream s by treating the private stream In this section, we firstly introduce the idea of RS by of user-j (j = k) as noise. The two-user transmission introducing a two-user example (K = 2) and a three- model using RS is shown in Fig. 1. The SINR of decoding user example (K = 3). Then, we propose the general- the private stream s at user-k is ized framework of RS and specify two low-complexity RS h p strategies. We further compare RSMA with SDMA and γ = .(9) NOMA from the fundamental structure and complex- h p + 1 ity aspects. Finally, we discuss the general optimization The corresponding achievable rates of user-k for the framework to solve the WSR problem. 12 12 streams s and s are R = log 1 + γ and k k R = log (1 + γ ).Toensurethat s is successfully k k 4.1 Two-user example decoded by both users, the achievable common rate shall 12 12 We first consider a two-user example. There are two not exceed R = min R , R . All boundary points for 1 2 messages W and W intended foruser-1and user-2, 1 2 thetwo-userRSrateregioncan be obtained by assuming respectively. The message of each user is split into two that R is shared between users such that C is the kth 12 1 12 2 parts, W , W for user-1 and W , W for user-2. 12 12 1 1 2 2 user’s portion of the common rate with C + C = R . 
The messages $W_1^{12}, W_2^{12}$ are encoded together into a common stream $s_{12}$ using a codebook shared by both users. Hence, $s_{12}$ is a common stream required to be decoded by both users. The messages $W_1^{1}$ and $W_2^{2}$ are encoded into the private streams $s_1$ for user-1 and $s_2$ for user-2, respectively. The overall data stream vector to be transmitted based on RS is $\mathbf{s} = [s_{12}, s_1, s_2]^T$. The data streams are linearly precoded via the precoder $\mathbf{P} = [\mathbf{p}_{12}, \mathbf{p}_1, \mathbf{p}_2]$, where $\mathbf{p}_{12} \in \mathbb{C}^{N_t \times 1}$ is the precoder for the common stream $s_{12}$. The resulting transmit signal is $\mathbf{x} = \mathbf{P}\mathbf{s} = \mathbf{p}_{12}s_{12} + \mathbf{p}_1 s_1 + \mathbf{p}_2 s_2$. We assume that $\mathbb{E}\{\mathbf{s}\mathbf{s}^H\} = \mathbf{I}$, and the total transmit power is constrained by $\mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t$.

At user sides, both user-1 and user-2 first decode the data stream $s_{12}$ by treating the interference from $s_1$ and $s_2$ as noise.

Following the two-user RS structure described above, the total achievable rate of user-$k$ is $R_{k,tot} = C_k^{12} + R_k$. For a given pair of weights $\mathbf{u} = [u_1, u_2]$, the WSR achieved by the two-user RS approach is

$$
\begin{aligned}
R_{\mathrm{RS}}(\mathbf{u}) = \max_{\mathbf{P},\,\mathbf{c}}\ & u_1 R_{1,tot} + u_2 R_{2,tot} && \text{(10a)}\\
\text{s.t.}\ & C_1^{12} + C_2^{12} \le R_{12} && \text{(10b)}\\
& \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t && \text{(10c)}\\
& R_{k,tot} \ge R_k^{th},\ k \in \{1, 2\} && \text{(10d)}\\
& \mathbf{c} \ge \mathbf{0} && \text{(10e)}
\end{aligned}
$$

where $\mathbf{c} = [C_1^{12}, C_2^{12}]$ is the common rate vector required to be optimized in order to maximize the WSR. For a fixed pair of weights, problem (10) can be solved using the WMMSE approach in [28], except we have perfect CSIT here. By calculating $R_{\mathrm{RS}}(\mathbf{u})$ for a set of different rate weights $\mathbf{u}$, we obtain the rate region.

Fig. 1 Two-user transmission model using RS

In contrast to MU–LP and SC–SIC, the RS scheme described above offers a more flexible formulation. In particular, instead of hard switching between MU–LP and SC–SIC, it allows both to operate simultaneously if necessary, and hence smoothly bridges the two. In the extreme of treating multi-user interference as noise, RS boils down to MU–LP by simply allocating no power to the common stream $s_{12}$. In the other extreme of fully decoding interference, RS boils down to SC–SIC by forcing one user, say user-1, to fully decode the message of the other user, say user-2. This is achieved by allocating no power to $s_2$, encoding $W_1$ into $s_1$ and encoding $W_2$ into $s_{12}$, such that $\mathbf{x} = \mathbf{p}_{12}s_{12} + \mathbf{p}_1 s_1$. User-1 and user-2 decode $s_{12}$ by treating $s_1$ as noise, and user-1 decodes $s_1$ after canceling $s_{12}$. A physical-layer multicasting strategy is obtained by encoding both $W_1$ and $W_2$ into $s_{12}$ and allocating no power to $s_1$ and $s_2$.

Remark 2: It should be noted that while the RS transmit signal model resembles a broadcasting system with unicast (private) streams and a multicast stream, the role of the common message is fundamentally different. The common message in a unicast-multicast system carries public information intended as a whole to all users in the system, while the common message $s_{12}$ in RS encapsulates parts of private messages, and is not entirely required by all users, although decoded by the two users for interference mitigation purposes [12].

Remark 3: A general framework is adopted where potentially each user can split its message into common and private parts. Note however that, depending on the objective function, it is sometimes not needed for all users to split their messages. For instance, for sum-rate maximization subject to no individual rate constraint, it is sufficient to have only one user split its message [28]. However, when it comes to satisfying some fairness (WSR, QoS constraint, max-min fairness), splitting the message of multiple users appears necessary [28, 32, 34].

4.2 Three-user example

We further consider a three-user example. Different from the two-user case, the message of user-1 is split into $\{W_1^{123}, W_1^{12}, W_1^{13}, W_1^{1}\}$. Similarly, the messages of user-2 and user-3 are split into $\{W_2^{123}, W_2^{12}, W_2^{23}, W_2^{2}\}$ and $\{W_3^{123}, W_3^{13}, W_3^{23}, W_3^{3}\}$, respectively. The superscript represents a specific group of users whose messages with the same superscript are going to be encoded together. For example, $W_1^{123}$, $W_2^{123}$, and $W_3^{123}$ are encoded into the common stream $s_{123}$ intended for all three users. $W_1^{12}$ and $W_1^{13}$ are correspondingly encoded with the split messages of user-2, $W_2^{12}$, and user-3, $W_3^{13}$, into data streams $s_{12}$ and $s_{13}$. $s_{12}$ is the partial common stream intended for user-1 and user-2. Hence, user-1 and user-2 will decode $s_{12}$ while user-3 will decode its intended streams by treating $s_{12}$ as noise. Similarly, we obtain $s_{23}$ partially encoded for user-2 and user-3. $W_1^{1}$, $W_2^{2}$, and $W_3^{3}$ are encoded into private streams $s_1$, $s_2$, and $s_3$, respectively.

The vector of data streams to be transmitted is $\mathbf{s} = [s_{123}, s_{12}, s_{13}, s_{23}, s_1, s_2, s_3]^T$. After linear precoding using the precoder $\mathbf{P} = [\mathbf{p}_{123}, \mathbf{p}_{12}, \mathbf{p}_{13}, \mathbf{p}_{23}, \mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3]$, the signals are superposed and broadcast. The decoding procedure when $K = 3$ is more complex than that in the two-user example. The main difference lies in decoding partial common streams for two users. Define the streams to be decoded by $l$ users as $l$-order streams. The 2-order streams to be decoded at user-1 are $s_{12}$ and $s_{13}$. The 2-order streams to be decoded at user-2 and user-3 are $s_{12}$ and $s_{23}$, and $s_{13}$ and $s_{23}$, respectively. As the 1-order and 2-order streams to be decoded at different users are not the same, we take user-1 as an example. The decoding procedure is the same for other users.

User-1 decodes four streams $s_{123}$, $s_{12}$, $s_{13}$, and $s_1$ based on SIC while treating other streams as noise. The decoding procedure starts from the 3-order stream (common stream) and progresses downwards to the 1-order stream (private stream). Specifically, user-1 first decodes $s_{123}$ and subtracts its contribution from the received signal. The SINR of the stream $s_{123}$ at user-1 is

$$
\gamma_1^{123} = \frac{|\mathbf{h}_1^H \mathbf{p}_{123}|^2}{\sum_{i \in \{12,13,23\}} |\mathbf{h}_1^H \mathbf{p}_i|^2 + \sum_{k=1}^{3} |\mathbf{h}_1^H \mathbf{p}_k|^2 + 1}. \tag{11}
$$

After that, user-1 decodes the two streams $s_{12}$, $s_{13}$ and treats the interference of $s_{23}$ as noise. Both decoding orders, $s_{12}$ followed by $s_{13}$ and $s_{13}$ followed by $s_{12}$, should be considered in order to maximize the WSR. Denote $\pi_l$ as one of the decoding orders to decode $l$-order streams. There is only one 1-order stream and one 3-order stream to be decoded at each user. Therefore, only one decoding order exists for both $\pi_1$ and $\pi_3$. In contrast, each user is required to decode two 2-order streams. Denote $s_{\pi_{2,k}(i)}$ as the $i$th data stream to be decoded at user-$k$ based on the decoding order $\pi_2$. One instance of $\pi_2$ is $12 \to 13 \to 23$, where $s_{12}$ is decoded before $s_{13}$ and $s_{13}$ is decoded before $s_{23}$ at all users. Since only data streams $s_{12}$ and $s_{13}$ are decoded at user-1, the decoding order at user-1 based on $\pi_2$ is $\pi_{2,1} = 12 \to 13$. Hence, $s_{\pi_{2,1}(1)} = s_{12}$ and $s_{\pi_{2,1}(2)} = s_{13}$. The data stream $s_{\pi_{2,1}(1)}$ is decoded before $s_{\pi_{2,1}(2)}$. The SINRs of decoding streams $s_{\pi_{2,1}(1)}$ and $s_{\pi_{2,1}(2)}$ at user-1 are

$$
\gamma_1^{\pi_{2,1}(1)} = \frac{|\mathbf{h}_1^H \mathbf{p}_{\pi_{2,1}(1)}|^2}{|\mathbf{h}_1^H \mathbf{p}_{\pi_{2,1}(2)}|^2 + |\mathbf{h}_1^H \mathbf{p}_{23}|^2 + \sum_{k=1}^{3} |\mathbf{h}_1^H \mathbf{p}_k|^2 + 1}, \tag{12}
$$

$$
\gamma_1^{\pi_{2,1}(2)} = \frac{|\mathbf{h}_1^H \mathbf{p}_{\pi_{2,1}(2)}|^2}{|\mathbf{h}_1^H \mathbf{p}_{23}|^2 + \sum_{k=1}^{3} |\mathbf{h}_1^H \mathbf{p}_k|^2 + 1}. \tag{13}
$$

User-1 finally decodes $s_1$ by treating other data streams as noise. The three-user RS transmission model with the decoding order $\pi_2 = 12 \to 13 \to 23$ is shown in Fig. 2. The SINR of decoding $s_1$ at user-1 is

$$
\gamma_1^{1} = \frac{|\mathbf{h}_1^H \mathbf{p}_1|^2}{|\mathbf{h}_1^H \mathbf{p}_{23}|^2 + \sum_{k=2}^{3} |\mathbf{h}_1^H \mathbf{p}_k|^2 + 1}. \tag{14}
$$

The corresponding rate of each data stream is calculated in the same way as in the two-user example. To ensure that $s_{123}$ is successfully decoded by all users, the achievable common rate shall not exceed $R_{123} = \min\{R_1^{123}, R_2^{123}, R_3^{123}\}$. To ensure that $s_{12}$ is successfully decoded by user-1 and user-2, the achievable common rate shall not exceed $R_{12} = \min\{R_1^{12}, R_2^{12}\}$. Similarly, we have $R_{13} = \min\{R_1^{13}, R_3^{13}\}$ and $R_{23} = \min\{R_2^{23}, R_3^{23}\}$. All boundary points for the three-user RS rate region can be obtained by assuming that $R_{123}$, $R_{12}$, $R_{13}$, and $R_{23}$ are shared by the corresponding group of users. Denoting the portion of the common rate allocated to user-$k$ for the message $s_{123}$ as $C_k^{123}$, we have $C_1^{123} + C_2^{123} + C_3^{123} = R_{123}$. Similarly, we have $C_1^{12} + C_2^{12} = R_{12}$, $C_1^{13} + C_3^{13} = R_{13}$, and $C_2^{23} + C_3^{23} = R_{23}$. Following the three-user RS structure described above, the total achievable rate of each user is $R_{1,tot} = C_1^{123} + C_1^{12} + C_1^{13} + R_1$, $R_{2,tot} = C_2^{123} + C_2^{12} + C_2^{23} + R_2$, and $R_{3,tot} = C_3^{123} + C_3^{13} + C_3^{23} + R_3$. For a given weight vector $\mathbf{u} = [u_1, u_2, u_3]$ and a fixed decoding order $\pi = [\pi_1, \pi_2, \pi_3]$, the WSR achieved by the three-user RS approach is

$$
\begin{aligned}
R_{\mathrm{RS}}(\mathbf{u}, \pi) = \max_{\mathbf{P},\,\mathbf{c}}\ & \sum_{k=1}^{3} u_k R_{k,tot} && \text{(15a)}\\
\text{s.t.}\ & C_1^{123} + C_2^{123} + C_3^{123} \le R_{123} && \text{(15b)}\\
& C_1^{12} + C_2^{12} \le R_{12} && \text{(15c)}\\
& C_1^{13} + C_3^{13} \le R_{13} && \text{(15d)}\\
& C_2^{23} + C_3^{23} \le R_{23} && \text{(15e)}\\
& \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t && \text{(15f)}\\
& R_{k,tot} \ge R_k^{th},\ k \in \{1, 2, 3\} && \text{(15g)}\\
& \mathbf{c} \ge \mathbf{0} && \text{(15h)}
\end{aligned}
$$

where $\mathbf{c} = [C_1^{123}, C_2^{123}, C_3^{123}, C_1^{12}, C_2^{12}, C_1^{13}, C_3^{13}, C_2^{23}, C_3^{23}]$ is the common rate vector required to be optimized in order to maximize the WSR. By calculating $R_{\mathrm{RS}}(\mathbf{u}, \pi)$ for a set of different rate weights $\mathbf{u}$, we obtain the rate region $\mathcal{R}_{\mathrm{RS}}(\pi)$ of a certain decoding order $\pi$. The rate region of the three-user RS is achieved as the convex hull of the union over all decoding orders as $\mathcal{R}_{\mathrm{RS}} = \mathrm{conv}\left(\bigcup_{\pi} \mathcal{R}_{\mathrm{RS}}(\pi)\right)$.

Similar to the two-user case, SC–SIC and MU–LP are again easily identified as special sub-strategies of RS by switching off some of the streams. Problem (15) is non-convex and non-trivial. We propose an optimization algorithm in Section 4.7 to solve it based on the WMMSE approach.

Fig. 2 Three-user transmission model using RS
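The SIC chain at user-1 in the three-user example, Eqs. (11) to (14), can be sketched numerically as follows. This is only an illustrative sketch: the function names `gain`, `user1_sinrs`, and `rate`, the dictionary layout of the precoders, and the toy channel values are assumptions made for the example and do not come from the paper. The noise power is normalized to 1, as in the SINR expressions above.

```python
import math


def gain(h, p):
    """Return |h^H p|^2 for a channel h and a precoder p (sequences of complex)."""
    return abs(sum(complex(hi).conjugate() * pi for hi, pi in zip(h, p))) ** 2


def user1_sinrs(h1, precoders, order2=("12", "13")):
    """SINRs of the four streams decoded by user-1, following (11)-(14).

    `precoders` maps the stream labels "123", "12", "13", "23", "1", "2", "3"
    to precoding vectors (hypothetical layout chosen for this sketch).
    `order2` is the decoding order of user-1's two 2-order streams.
    """
    g = {label: gain(h1, p) for label, p in precoders.items()}
    private = g["1"] + g["2"] + g["3"]
    first, second = order2
    return {
        # (11): every other stream is still present in the received signal.
        "123": g["123"] / (g["12"] + g["13"] + g["23"] + private + 1.0),
        # (12): s_123 removed; the second 2-order stream still interferes.
        first: g[first] / (g[second] + g["23"] + private + 1.0),
        # (13): s_123 and the first 2-order stream removed.
        second: g[second] / (g["23"] + private + 1.0),
        # (14): only s_23 and the other users' private streams remain.
        "1": g["1"] / (g["23"] + g["2"] + g["3"] + 1.0),
    }


def rate(sinr):
    """Achievable rate in bit/s/Hz for a given SINR."""
    return math.log2(1.0 + sinr)


# Toy two-antenna example (made-up values, for illustration only).
h1 = [1, 0]
toy = {"123": [1, 0], "12": [1, 0], "13": [1, 0], "23": [0, 1],
       "1": [1, 0], "2": [0, 1], "3": [0, 1]}
sinrs = user1_sinrs(h1, toy)
rates = {label: rate(value) for label, value in sinrs.items()}
```

Calling `user1_sinrs` with `order2=("13", "12")` gives the SINRs for the other 2-order decoding order, which is the comparison the text asks for when maximizing the WSR.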
4.3 Generalized rate-splitting

We further propose a generalized RS framework for $K$ users. The users are indexed by the set $\mathcal{K} = \{1, \ldots, K\}$. For any subset $\mathcal{A}$ of the users, $\mathcal{A} \subseteq \mathcal{K}$, the BS transmits a data stream $s_{\mathcal{A}}$ to be decoded by the users in the subset $\mathcal{A}$ while treated as noise by other users. $s_{\mathcal{A}}$ loads messages of all the users in the subset $\mathcal{A}$. The message intended for user-$k$ ($k \in \mathcal{K}$) is split as $\{W_k^{\mathcal{A}'} \mid \mathcal{A}' \subseteq \mathcal{K}, k \in \mathcal{A}'\}$. The messages $\{W_{k'}^{\mathcal{A}} \mid k' \in \mathcal{A}\}$ of users with the same superscript $\mathcal{A}$ are encoded together into the stream $s_{\mathcal{A}}$.

The stream order defined in Section 4.2 is applied to the generalized RS. The stream order of data stream $s_{\mathcal{A}}$ is $|\mathcal{A}|$. For a given $l \in \mathcal{K}$, there are $\binom{K}{l}$ distinct $l$-order streams. For example, we have only one $K$-order stream (traditional common stream) while we have $K$ 1-order streams (private streams). Define $\mathbf{s}_l \in \mathbb{C}^{\binom{K}{l} \times 1}$ as the $l$-order data stream vector formed by all $l$-order streams in $\{s_{\mathcal{A}'} \mid \mathcal{A}' \subseteq \mathcal{K}, |\mathcal{A}'| = l\}$. Note that when $l = K$, there is a single $K$-order stream, and $\mathbf{s}_K$ reduces to $s_{\mathcal{K}}$. For example, when $K = 3$, the 3-order stream vector is $\mathbf{s}_3 = s_{123}$. The 1-order and the 2-order stream vectors are $\mathbf{s}_1 = [s_1, s_2, s_3]^T$ and $\mathbf{s}_2 = [s_{12}, s_{13}, s_{23}]^T$, respectively. The data streams are linearly precoded via the precoding matrix $\mathbf{P}_l$ formed by $\{\mathbf{p}_{\mathcal{A}'} \mid \mathcal{A}' \subseteq \mathcal{K}, |\mathcal{A}'| = l\}$. The precoded streams are superposed and the resulting transmit signal is

$$
\mathbf{x} = \sum_{l=1}^{K} \mathbf{P}_l \mathbf{s}_l = \sum_{l=1}^{K} \sum_{\mathcal{A}' \subseteq \mathcal{K},\, |\mathcal{A}'| = l} \mathbf{p}_{\mathcal{A}'} s_{\mathcal{A}'}. \tag{16}
$$

At user sides, each user is required to decode the intended streams based on SIC. The decoding procedure starts from the $K$-order stream and then goes down to the 1-order stream. A given user is involved in multiple $l$-order streams, with the exception of the $K$-order and 1-order streams. Denote $\pi_l$ as one of the decoding orders to decode the $l$-order data streams $\mathbf{s}_l$ for all users. The $l$-order stream vector to be decoded at user-$k$ based on a certain decoding order $\pi_l$ is $\mathbf{s}_{\pi_{l,k}} = [s_{\pi_{l,k}(1)}, \cdots, s_{\pi_{l,k}(|\mathcal{S}_{l,k}|)}]^T$, where $\mathcal{S}_{l,k} = \{s_{\mathcal{A}'} \mid \mathcal{A}' \subseteq \mathcal{K}, |\mathcal{A}'| = l, k \in \mathcal{A}'\}$ is the set of $l$-order streams to be decoded at user-$k$. We assume $s_{\pi_{l,k}(i)}$ is decoded before $s_{\pi_{l,k}(j)}$ if $i < j$. The SINR of user-$k$ to decode the $l$-order stream $s_{\pi_{l,k}(i)}$ with a certain decoding order $\pi_l$ is

$$
\gamma_k^{\pi_{l,k}(i)} = \frac{|\mathbf{h}_k^H \mathbf{p}_{\pi_{l,k}(i)}|^2}{I_{\pi_{l,k}(i)} + 1}, \tag{17}
$$

where

$$
I_{\pi_{l,k}(i)} = \sum_{j > i} |\mathbf{h}_k^H \mathbf{p}_{\pi_{l,k}(j)}|^2 + \sum_{l'=1}^{l-1} \sum_{j=1}^{|\mathcal{S}_{l',k}|} |\mathbf{h}_k^H \mathbf{p}_{\pi_{l',k}(j)}|^2 + \sum_{\mathcal{A}' \subseteq \mathcal{K},\, k \notin \mathcal{A}'} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{A}'}|^2
$$

is the interference at user-$k$ to decode $s_{\pi_{l,k}(i)}$. The first term, $\sum_{j>i} |\mathbf{h}_k^H \mathbf{p}_{\pi_{l,k}(j)}|^2$, is the interference from the remaining non-decoded $l$-order streams in $\mathbf{s}_{\pi_{l,k}}$. The second term, $\sum_{l'=1}^{l-1} \sum_{j=1}^{|\mathcal{S}_{l',k}|} |\mathbf{h}_k^H \mathbf{p}_{\pi_{l',k}(j)}|^2$, is the interference from lower order streams $\mathbf{s}_{\pi_{l',k}}$, $\forall l' < l$, to be decoded at user-$k$. The third term, $\sum_{\mathcal{A}' \subseteq \mathcal{K}, k \notin \mathcal{A}'} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{A}'}|^2$, is the interference from the streams that are not intended for user-$k$. The corresponding achievable rate of user-$k$ for the data stream $s_{\pi_{l,k}(i)}$ is $R_k^{\pi_{l,k}(i)} = \log_2\left(1 + \gamma_k^{\pi_{l,k}(i)}\right)$. To ensure that the streams shared by two or more users are successfully decoded by all of those users, the achievable rate of each user in the subset $\mathcal{A}$ ($\mathcal{A} \subseteq \mathcal{K}$, $2 \le |\mathcal{A}| \le K$) to decode the $|\mathcal{A}|$-order stream $s_{\mathcal{A}}$ shall not exceed

$$
R_{\mathcal{A}} = \min_{k'} \left\{ R_{k'}^{\mathcal{A}} \mid k' \in \mathcal{A} \right\}. \tag{18}
$$

For a given $l \in \mathcal{K}$, the $l$-order streams to be decoded at different users are different. $s_{\mathcal{A}}$ is decoded at user-$k$ ($k \in \mathcal{A}$) based on the decoding order $\pi_{|\mathcal{A}|,k}$. $R_{\mathcal{A}}$ becomes the rate of receiving stream $s_{\mathcal{A}}$ at all users in the user group $\mathcal{A}$ with a certain decoding order $\pi_{|\mathcal{A}|}$. All boundary points for the $K$-user RS rate region can be obtained by assuming that $R_{\mathcal{A}}$ is shared by all users in the user group $\mathcal{A}$. Denoting the portion of the common rate allocated to user-$k$ ($k \in \mathcal{A}$) as $C_k^{\mathcal{A}}$, we have $\sum_{k' \in \mathcal{A}} C_{k'}^{\mathcal{A}} = R_{\mathcal{A}}$. Following the RS structure described above, the total achievable rate of user-$k$ is

$$
R_{k,tot} = \sum_{\mathcal{A}' \subseteq \mathcal{K},\, k \in \mathcal{A}'} C_k^{\mathcal{A}'} + R_k, \tag{19}
$$

where $R_k$ is the rate of the 1-order stream $s_k$. It is intended for user-$k$ only. No common rate sharing is required for $R_k$. For a given weight vector $\mathbf{u} = [u_1, \cdots, u_K]$ and a certain decoding order $\pi = \{\pi_1, \ldots, \pi_K\}$, the WSR achieved by RS is

$$
\begin{aligned}
R_{\mathrm{RS}}(\mathbf{u}, \pi) = \max_{\mathbf{P},\,\mathbf{c}}\ & \sum_{k \in \mathcal{K}} u_k R_{k,tot}\\
\text{s.t.}\ & \sum_{k' \in \mathcal{A}} C_{k'}^{\mathcal{A}} \le R_{\mathcal{A}},\ \forall \mathcal{A} \subseteq \mathcal{K}\\
& \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t\\
& R_{k,tot} \ge R_k^{th},\ k \in \mathcal{K}\\
& \mathbf{c} \ge \mathbf{0}
\end{aligned} \tag{20}
$$

$\mathbf{P} = [\mathbf{P}_1, \ldots, \mathbf{P}_K]$ is the precoding matrix of all order streams. $\mathbf{c}$ is the common rate vector formed by $\{C_k^{\mathcal{A}} \mid \mathcal{A} \subseteq \mathcal{K}, k \in \mathcal{A}\}$. For a fixed weight vector, problem (20) can be solved using the WMMSE approach discussed in Section 4.7 by establishing rate-WMMSE relationships for all data streams. By calculating $R_{\mathrm{RS}}(\mathbf{u}, \pi)$ for a set of different rate weights $\mathbf{u}$, we obtain the rate region $\mathcal{R}_{\mathrm{RS}}(\pi)$ of a certain decoding order $\pi$. To achieve the rate region, all decoding orders should be considered. The rate region of RS is defined as the convex hull of the union over all decoding orders as

$$
\mathcal{R}_{\mathrm{RS}} = \mathrm{conv}\left( \bigcup_{\pi} \mathcal{R}_{\mathrm{RS}}(\pi) \right). \tag{21}
$$

4.4 Structured and low-complexity rate-splitting

The generalized RS described in Section 4.3 is able to provide more room for rate and QoS enhancements at the expense of more layers of SIC at receivers. Hence, though the generalized RS framework is very general and can be used to identify the best possible performance, its implementation can be complex due to the large number of SIC layers and common messages involved. To overcome the problem, we introduce two low-complexity RS strategies for $K$ users, 1-layer RS and 2-layer hierarchical RS (HRS). Those two RS strategies require the implementation of one and two layers of SIC at each receiver, respectively.

4.4.1 1-layer RS

Instead of transmitting all order streams, 1-layer RS transmits the $K$-order common stream and the 1-order private streams. Only one SIC is required at each receiver. The message of each user is split into two parts $\{W_k^{\mathcal{K}}, W_k^{k}\}$, $\forall k \in \mathcal{K}$. The messages $W_1^{\mathcal{K}}, \ldots, W_K^{\mathcal{K}}$ are jointly encoded into the $K$-order stream $s_{\mathcal{K}}$ intended to be decoded by all users. $W_k^{k}$ is encoded into $s_k$ to be decoded by user-$k$ only. The overall data stream vector to be transmitted based on 1-layer RS is $\mathbf{s} = [s_{\mathcal{K}}, s_1, \ldots, s_K]^T$. The data streams are linearly precoded via the precoder $\mathbf{P} = [\mathbf{p}_{\mathcal{K}}, \mathbf{p}_1, \ldots, \mathbf{p}_K]$. The resulting transmit signal is $\mathbf{x} = \mathbf{P}\mathbf{s} = \mathbf{p}_{\mathcal{K}} s_{\mathcal{K}} + \sum_{k \in \mathcal{K}} \mathbf{p}_k s_k$. Figure 3 shows a 1-layer RS model. Readers are referred to Fig. 1 in [12] for a detailed illustration of the 1-layer RS architecture.

Fig. 3 One-layer RS model of K users. The common stream $s_{\mathcal{K}}$ is shared by all the users

At user sides, all users first decode the data stream $s_{\mathcal{K}}$ by treating the interference from $s_1, \ldots, s_K$ as noise. The SINR of the $K$-order stream at user-$k$ is

$$
\gamma_k^{\mathcal{K}} = \frac{|\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}}|^2}{\sum_{j \in \mathcal{K}} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{22}
$$

Once $s_{\mathcal{K}}$ is successfully decoded, its contribution to the original received signal $y_k$ is subtracted. After that, user-$k$ decodes its private stream $s_k$ by treating the 1-order private streams of other users as noise. The SINR of decoding the private stream $s_k$ at user-$k$ is

$$
\gamma_k = \frac{|\mathbf{h}_k^H \mathbf{p}_k|^2}{\sum_{j \in \mathcal{K},\, j \ne k} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{23}
$$

The corresponding achievable rates of user-$k$ for the streams $s_{\mathcal{K}}$ and $s_k$ are $R_k^{\mathcal{K}} = \log_2\left(1 + \gamma_k^{\mathcal{K}}\right)$ and $R_k = \log_2(1 + \gamma_k)$. To ensure that $s_{\mathcal{K}}$ is successfully decoded by all users, the achievable common rate shall not exceed $R_{\mathcal{K}} = \min\{R_1^{\mathcal{K}}, \ldots, R_K^{\mathcal{K}}\}$. $R_{\mathcal{K}}$ is shared among users such that $C_k^{\mathcal{K}}$ is the $k$th user's portion of the common rate with $\sum_{k \in \mathcal{K}} C_k^{\mathcal{K}} = R_{\mathcal{K}}$. Following the two-user RS structure described above, the total achievable rate of user-$k$ is $R_{k,tot} = C_k^{\mathcal{K}} + R_k$. For a given weight vector $\mathbf{u} = [u_1, \ldots, u_K]$, the WSR achieved by the $K$-user 1-layer RS approach is

$$
\begin{aligned}
R_{\text{1-layer RS}}(\mathbf{u}) = \max_{\mathbf{P},\,\mathbf{c}}\ & \sum_{k \in \mathcal{K}} u_k R_{k,tot} && \text{(24a)}\\
\text{s.t.}\ & \sum_{k \in \mathcal{K}} C_k^{\mathcal{K}} \le R_{\mathcal{K}} && \text{(24b)}\\
& \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t && \text{(24c)}\\
& R_{k,tot} \ge R_k^{th},\ k \in \mathcal{K} && \text{(24d)}\\
& \mathbf{c} \ge \mathbf{0} && \text{(24e)}
\end{aligned}
$$

where $\mathbf{c} = [C_1^{\mathcal{K}}, \ldots, C_K^{\mathcal{K}}]$. For a given weight vector, problem (24) can be solved using the WMMSE approach in [28].

In contrast to NOMA, this 1-layer RS does not require any user ordering or grouping at the transmitter side, since all users decode the common message (using a single layer of SIC) before accessing their respective private messages. We also note that the 1-layer RS is a sub-scheme of the generalized RS and is a super-scheme of MU–LP (since by not allocating any power to the common message, the 1-layer RS boils down to MU–LP). However, for $K > 2$, SC–SIC and SC–SIC per group are not sub-schemes of 1-layer RS (even though they are sub-schemes of the generalized RS). This explains why, in [12], the authors already contrasted 1-layer RS and NOMA and expressed that the two strategies cannot be treated as extensions or subsets of each other. This 1-layer RS appeared in many scenarios subject to imperfect CSIT in [28, 29, 32–34, 38, 40, 41].

4.4.2 2-layer HRS

The $K$ users are divided into $G$ groups $\mathcal{G} = \{1, \ldots, G\}$, with the users $\mathcal{K}_g$, $g \in \mathcal{G}$, in each group. The user groups satisfy the same conditions as in Section 3.2.2. Besides the $K$-order stream and 1-order streams, 2-layer HRS also allows the transmission of a $|\mathcal{K}_g|$-order stream intended for users in $\mathcal{K}_g$. The overall data stream vector to be transmitted based on 2-layer HRS is $\mathbf{s} = [s_{\mathcal{K}}, s_{\mathcal{K}_1}, \ldots, s_{\mathcal{K}_G}, s_1, \ldots, s_K]^T$. The data streams are linearly precoded via the precoder $\mathbf{P} = [\mathbf{p}_{\mathcal{K}}, \mathbf{p}_{\mathcal{K}_1}, \ldots, \mathbf{p}_{\mathcal{K}_G}, \mathbf{p}_1, \ldots, \mathbf{p}_K]$. The resulting transmit signal is $\mathbf{x} = \mathbf{P}\mathbf{s} = \mathbf{p}_{\mathcal{K}} s_{\mathcal{K}} + \sum_{g \in \mathcal{G}} \mathbf{p}_{\mathcal{K}_g} s_{\mathcal{K}_g} + \sum_{k \in \mathcal{K}} \mathbf{p}_k s_k$. Figure 4 shows an example of 2-layer HRS. The users are divided into two groups, $\mathcal{K}_1 = \{1, 2\}$, $\mathcal{K}_2 = \{3, 4\}$. $s_{1234}$ is a 4-order stream intended for all the users, while $s_{12}$ and $s_{34}$ are 2-order streams for users in each group only.

Each user is required to decode three streams $s_{\mathcal{K}}$, $s_{\mathcal{K}_g}$, and $s_k$. We assume $k \in \mathcal{K}_g$. The data stream $s_{\mathcal{K}}$ is decoded first by treating the interference from all other streams as noise. The SINR of the $K$-order stream at user-$k$ is

$$
\gamma_k^{\mathcal{K}} = \frac{|\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}}|^2}{\sum_{g \in \mathcal{G}} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}_g}|^2 + \sum_{j \in \mathcal{K}} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{25}
$$

Once $s_{\mathcal{K}}$ is successfully decoded, its contribution to the original received signal $y_k$ is subtracted. After that, user-$k$ decodes its group common stream $s_{\mathcal{K}_g}$ by treating other group common streams and 1-order private streams as noise. The SINR of decoding the $|\mathcal{K}_g|$-order stream $s_{\mathcal{K}_g}$ at user-$k$ is

$$
\gamma_k^{\mathcal{K}_g} = \frac{|\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}_g}|^2}{\sum_{g' \in \mathcal{G},\, g' \ne g} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}_{g'}}|^2 + \sum_{j \in \mathcal{K}} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{26}
$$

After removing its contribution to the received signal, user-$k$ decodes its private stream $s_k$. The SINR of decoding the private stream $s_k$ at user-$k$ is

$$
\gamma_k = \frac{|\mathbf{h}_k^H \mathbf{p}_k|^2}{\sum_{g' \in \mathcal{G},\, g' \ne g} |\mathbf{h}_k^H \mathbf{p}_{\mathcal{K}_{g'}}|^2 + \sum_{j \in \mathcal{K},\, j \ne k} |\mathbf{h}_k^H \mathbf{p}_j|^2 + 1}. \tag{27}
$$

The corresponding achievable rates of user-$k$ for the streams $s_{\mathcal{K}}$, $s_{\mathcal{K}_g}$, and $s_k$ are $R_k^{\mathcal{K}} = \log_2\left(1 + \gamma_k^{\mathcal{K}}\right)$, $R_k^{\mathcal{K}_g} = \log_2\left(1 + \gamma_k^{\mathcal{K}_g}\right)$, and $R_k = \log_2(1 + \gamma_k)$. The achievable common rates of $s_{\mathcal{K}}$ and $s_{\mathcal{K}_g}$ shall not exceed $R_{\mathcal{K}} = \min\{R_1^{\mathcal{K}}, \ldots, R_K^{\mathcal{K}}\}$ and $R_{\mathcal{K}_g} = \min_k \{R_k^{\mathcal{K}_g} \mid k \in \mathcal{K}_g\}$, respectively. $R_{\mathcal{K}}$ is shared among users such that $C_k^{\mathcal{K}}$ is the $k$th user's portion of the common rate with $\sum_{k \in \mathcal{K}} C_k^{\mathcal{K}} = R_{\mathcal{K}}$. $R_{\mathcal{K}_g}$ is shared among users in the group $\mathcal{K}_g$ such that $C_k^{\mathcal{K}_g}$ is the $k$th user's portion of the common rate with $\sum_{k \in \mathcal{K}_g} C_k^{\mathcal{K}_g} = R_{\mathcal{K}_g}$. Following the two-user RS structure described above, the total achievable rate of user-$k$ is $R_{k,tot} = C_k^{\mathcal{K}} + C_k^{\mathcal{K}_g} + R_k$, where $k \in \mathcal{K}_g$. For a given weight vector $\mathbf{u} = [u_1, \ldots, u_K]$, the WSR achieved by the $K$-user 2-layer HRS approach is

$$
\begin{aligned}
R_{\text{2-layer HRS}}(\mathbf{u}) = \max_{\mathbf{P},\,\mathbf{c}}\ & \sum_{k \in \mathcal{K}} u_k R_{k,tot} && \text{(28a)}\\
\text{s.t.}\ & \sum_{k \in \mathcal{K}} C_k^{\mathcal{K}} \le R_{\mathcal{K}} && \text{(28b)}\\
& \sum_{k \in \mathcal{K}_g} C_k^{\mathcal{K}_g} \le R_{\mathcal{K}_g},\ \forall g \in \mathcal{G} && \text{(28c)}\\
& \mathrm{tr}(\mathbf{P}\mathbf{P}^H) \le P_t && \text{(28d)}\\
& R_{k,tot} \ge R_k^{th},\ k \in \mathcal{K} && \text{(28e)}\\
& \mathbf{c} \ge \mathbf{0} && \text{(28f)}
\end{aligned}
$$

where $\mathbf{c}$ is the common rate vector formed by $\{C_k^{\mathcal{K}}, C_k^{\mathcal{K}_g} \mid k \in \mathcal{K}, k \in \mathcal{K}_g, g \in \mathcal{G}\}$.
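The three decoding stages of 2-layer HRS, Eqs. (25) to (27), can be sketched as below for a single user. This is a hedged illustration rather than the authors' implementation: the helper names `gain`, `hrs_sinrs`, and `common_rate`, the dictionary-based layout of the precoders, and the toy numbers are all assumptions introduced here. Noise power is normalized to 1 as in the equations.

```python
import math


def gain(h, p):
    """Return |h^H p|^2 for a channel h and a precoder p (sequences of complex)."""
    return abs(sum(complex(hi).conjugate() * pi for hi, pi in zip(h, p))) ** 2


def hrs_sinrs(h_k, k, g, p_all, p_group, p_private):
    """SINRs (25)-(27) at user-k, a member of group g.

    p_all:     precoder of the K-order stream decoded by all users
    p_group:   dict {group index: precoder of that group's common stream}
    p_private: dict {user index: precoder of that user's private stream}
    (hypothetical data layout chosen for this sketch)
    """
    a_all = gain(h_k, p_all)
    a_grp = {i: gain(h_k, p) for i, p in p_group.items()}
    a_prv = {j: gain(h_k, p) for j, p in p_private.items()}
    other_groups = sum(v for i, v in a_grp.items() if i != g)
    all_private = sum(a_prv.values())
    # (25): all group common streams and all private streams interfere.
    sinr_all = a_all / (sum(a_grp.values()) + all_private + 1.0)
    # (26): the K-order stream has been removed by SIC.
    sinr_grp = a_grp[g] / (other_groups + all_private + 1.0)
    # (27): the K-order stream and the own-group stream have been removed.
    sinr_prv = a_prv[k] / (other_groups + all_private - a_prv[k] + 1.0)
    return sinr_all, sinr_grp, sinr_prv


def common_rate(sinrs):
    """Rate decodable by every listed user: the minimum of log2(1 + SINR)."""
    return min(math.log2(1.0 + s) for s in sinrs)


# Toy example: K = 4 users, G = 2 groups, user-1 in group 1 (made-up values).
example = hrs_sinrs(
    h_k=[1, 0], k=1, g=1,
    p_all=[2, 0],
    p_group={1: [1, 0], 2: [0, 1]},
    p_private={1: [1, 0], 2: [0, 1], 3: [1, 0], 4: [0, 1]},
)
```

Evaluating `hrs_sinrs` for every user and passing the per-user SINRs of a shared stream to `common_rate` gives the quantities $R_{\mathcal{K}}$ and $R_{\mathcal{K}_g}$ that bound the common rate constraints (28b) and (28c).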
Fig. 4 Two-layer HRS example, $K = 4$, $G = 2$, $\mathcal{K}_1 = \{1, 2\}$, $\mathcal{K}_2 = \{3, 4\}$

For a given weight vector, problem (28) can be solved by simply modifying the WMMSE approach discussed in Section 4.7.

Compared with SC–SIC per group, where $|\mathcal{K}_g| - 1$ layers of SIC are required at user sides, 2-layer HRS only requires two layers of SIC at each user. Moreover, the user ordering issue in SC–SIC per group does not exist in 2-layer HRS. The streams of a higher stream order will always be decoded before the streams of a lower stream order. One-layer RS is the simplest architecture since only one SIC is needed at each user, and it is a sub-scheme of the 2-layer HRS. We also note that we can obtain a 1-layer RS per group from the 2-layer HRS by not allocating any power to $s_{\mathcal{K}}$. Note that SC–SIC and SC–SIC per group are not necessarily sub-schemes of the 2-layer HRS. The 2-layer HRS strategy was first introduced in [39] in the massive MIMO context.

4.5 Encompassing existing NOMA and SDMA

A comparison of NOMA, SDMA, and RSMA is shown in Table 1. Compared with NOMA and SDMA, the most important characteristic of RSMA is that it partially decodes interference and partially treats interference as noise through the split into common and private messages. This capability enables RSMA to maintain a good performance for all user deployment scenarios and all network loads, as will appear clearer in the numerical results of Section 5.

Let us further discuss how the proposed framework of generalized RS in Section 4.3 contrasts with and encompasses NOMA, SDMA, and RS strategies. We first compare the four-user MIMO–NOMA scheme illustrated in Fig. 5 of [1] with the four-user 2-layer HRS strategy illustrated in Fig. 4. In Fig. 5 of [1], user-1 and user-2 are superposed in the same beam. User-3 and user-4 share another beam. The users are decoded based on SC–SIC within each beam. As for the four-user 2-layer HRS strategy in Fig. 4, the encoded streams are precoded and transmitted jointly to users. If we set the common message $s_{12}$ to be encoded by the message of user-2 only and decoded by both user-1 and user-2, set the common message $s_{34}$ to be encoded by the message of user-4 and decoded by user-3 and user-4, set the precoders $\mathbf{p}_{12}$ and $\mathbf{p}_1$ to be equal, set the precoders $\mathbf{p}_{34}$ and $\mathbf{p}_3$ to be equal, and set the precoders of other streams to $\mathbf{0}$, then the proposed RS scheme reduces to the scheme illustrated in Fig. 5 of [1]. Similarly, the $K$-user RS model can be reduced to the $K$-user MIMO–NOMA scheme. Therefore, the MIMO–NOMA scheme proposed in [1] is a particular case of our RS framework.

In view of the above discussions, it should now be clear that SDMA and the multi-antenna NOMA strategies discussed in the introduction (relying on SC–SIC and SC–SIC per group) are all special instances of the generalized RS framework.

Table 1 Comparison of different strategies

| Multiple access | NOMA | NOMA | SDMA | RSMA |
|---|---|---|---|---|
| Strategy | SC–SIC | SC–SIC per group | MU–LP | All forms of RS |
| Design principle | Fully decode interference | Fully decode interference in each group and treat interference between groups as noise | Fully treat interference as noise | Partially decode interference and partially treat interference as noise |
| Decoder architecture | SIC at receivers | SIC at receivers | Treat interference as noise | SIC at receivers |
| User deployment scenario | Users experience aligned channel directions and a large disparity in channel strengths | Users in each group experience aligned channel directions and a large disparity in channel strengths. Users in different groups experience orthogonal channels | User channels are (semi-)orthogonal with similar channel strengths | Any angle between channels and any disparity in channel strengths |
| Network load | More suited to overloaded network | More suited to overloaded network | More suited to underloaded network | Suited to any network load |

In the proposed generalized $K$-user RS model, if we set $\mathbf{P}_l = \mathbf{0}$, $\forall l \in \{2, \cdots, K\}$, only 1-order streams (private streams) are transmitted. Each user only decodes its intended private stream by treating others as noise. Problem (20) is then reduced to the SDMA problem (3). If the message of each user is encoded into one stream of distinct stream order, problem (20) is equivalent to the SC–SIC problem (5). By keeping 1-order and $K$-order streams, we have the 1-layer RS strategy, whose performance benefit in the presence of imperfect CSIT was highlighted in various scenarios in [28, 29, 32–34, 38, 40, 41]. There is only one common data stream to be transmitted and decoded by all users before each user decodes its private stream. By keeping 1-order, $K$-order, and $l$-order streams, where $l$ is selected from $\{2, \cdots, K - 1\}$, the problem becomes the 2-layer HRS originally proposed in [39], with two layers of common messages to be transmitted. Another example of such a multi-layer RS has also appeared in the topological RS for MISO networks of [30]. Therefore, the formulated $K$-user RS problem is a more general problem. It encompasses SDMA, NOMA, and existing RS methods as special cases.

Though the current work focuses on MISO BC, the RS framework can be extended to multi-antenna users and the general MIMO BC [31], as well as to a general network scenario with multiple transmitters [30]. Nevertheless, the optimization of the precoders in those scenarios remains an interesting topic for future research. Applications of this RS framework to relay networks are also worth exploring. Preliminary ideas have appeared in [43], though joint encoding of the split common messages is not taken into account.

4.6 Complexity of RSMA

We further discuss the complexity of RSMA by comparing it with NOMA and SDMA. A qualitative comparison of NOMA, SDMA, and RSMA is shown in Table 2. In Table 2, RS refers to the generalized RS of Section 4.3.

As mentioned in the introduction, the complexity of NOMA in the multi-antenna setup increases significantly at both the transmitter and the receivers. The optimal decoding order of NOMA is no longer fixed based on the channel gain as in the SISO BC. To maximize the WSR, the decoding order should be optimized together with the precoders at the transmitter. Moreover, SC–SIC is suitable for aligned users with a large channel gain difference. A proper user scheduling algorithm increases the scheduler complexity. At user sides, $K - 1$ layers of SIC are required at each user for a $K$-user SC–SIC system. Increasing the number of users leads to a dramatic increase of the scheduler and receiver complexity and is subject to more error propagation in the SICs.

SC–SIC per group reduces the complexity at user sides. Only $K/G - 1$ layers of SIC are required at each user if we uniformly group the $K$ users into $G$ groups. However, the complexity at the transmitter increases with the number of user groups. A joint design of user ordering and user grouping for all groups is necessary in order to maximize the WSR. For example, for a 4-user system, if we divide the users into two groups with two users in each group, we should consider three different user grouping methods and four different decoding orders for each grouping method.

The complexity of MU–LP is much reduced as it does not require any SIC at user sides. However, as MU–LP is more suitable for users with (semi-)orthogonal channels and similar channel strengths, the transmitter requires accurate CSIT, and user scheduling should be carefully designed for interference coordination. The scheduler complexity at the transmitter is still high.

Compared with NOMA and SDMA, RSMA is able to balance the performance and complexity better.
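As a side check on the scheduler complexity of SC–SIC per group discussed above, the count for the four-user case (three groupings, four decoding orders per grouping) can be reproduced by brute-force enumeration. The sketch below is illustrative only; the function names `equal_groupings` and `decoding_orders` are made up for this example, and it assumes exactly two groups of equal size.

```python
from itertools import combinations, permutations, product


def equal_groupings(users):
    """All ways to split `users` into two unordered groups of equal size."""
    users = sorted(users)
    half = len(users) // 2
    first, rest = users[0], users[1:]
    result = []
    # Pinning the first user to group 1 avoids counting each split twice.
    for mates in combinations(rest, half - 1):
        group1 = (first,) + mates
        group2 = tuple(u for u in users if u not in group1)
        result.append((group1, group2))
    return result


def decoding_orders(grouping):
    """All combinations of per-group user orderings for one grouping."""
    return list(product(*(permutations(group) for group in grouping)))


groupings = equal_groupings([1, 2, 3, 4])
orders_per_grouping = [len(decoding_orders(g)) for g in groupings]
total_candidates = sum(orders_per_grouping)
```

Each candidate produced this way would require its own precoder optimization in a joint design, which is the source of the scheduler complexity noted for SC–SIC per group; 1-layer RS avoids this enumeration because it needs neither grouping nor ordering.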
A qualitative com- forms of RS are suitable for users with any channel gain parison of NOMA, SDMA, and RSMA is shown in Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 16 of 54 Table 2 Qualitative comparison of the complexity of different strategies Multiple NOMA SDMA RSMA access Strategy SC–SIC SC–SIC per group MU–LP RS 1-layer RS Encoder Encode K streams Encode K streams Encode K streams Encode K private streams Encode K + 1streams complexity plus additional common streams Scheduler Very complex as it Very complex as it requires Complex as MU–LP Complex as it requires to Simpler user scheduling as complexity requires to find to divide users into requires to pair together decide upon suitable RS copes with any user aligned users and orthogonal groups, with semi-orthogonal users decoding order of the deployment scenario, decide upon suitable aligned users in each with similar channel gains streams with the same does not rely on user user ordering group and decide upon stream order grouping and user suitable user ordering in ordering each group Receiver Requires multiple Requires multiple layers of Does not require any SIC Requires multiple layers of Requires a single layer of complexity layers of SIC. Subject SICineachgroup anda SIC. Subject to error SIC for all users. Less to error propagation single layer of SIC if propagatio subject to error groups are made of 2 propagation users. Subject to error propagation difference and any channel angle in between, though a SICs. The 3-order stream s is decoded first. It is 123 123 multi-layer RS would have more flexibility. Considering estimated as s ˆ = g y ,where g is the equal- 123 1 1 1 the generalized RS, the decoding order of multiple streams izer. 
After successfully decoding and removing s with the same stream order should be optimized together from y , the estimate of the 2-order stream s 1 π (1) 2,1 π (1) 2,1 H with the precoders when there are multiple streams of is s ˆ = g y − h p s . Similarly, we π (1) 1 123 123 2,1 1 1 the same stream order intended for each user (e.g., each calculate the estimates of s ˆ and s ˆ as s ˆ = π (2) 1 π (2) 2,1 2,1 user decodes two 2-order streams in the 3-user example π (2) 2,1 H H g y − h p s − h p s and s ˆ = 1 123 123 π (1) π (1) 1 2,1 2,1 1 1 1 of Section 4.2.). But its special case, 1-layer RS, simplifies 1 H H H g y −h p s −h p s − h p s , 1 123 123 π (1) π (1) π (2) π (2) 2,1 2,1 2,1 2,1 1 1 1 1 both the scheduler and receiver design, and it is still able π (1) π (2) 2,1 2,1 respectively. g , g , g are the corresponding to achieve a good performance in all user deployment sce- 1 1 1 equalizers at user-1. The mean square error (MSE) of narios. One-layer RS requires only one SIC at each user. It each stream is defined as ε  E |s −ˆ s | .Theyare k k k does not rely on user grouping and user ordering for user calculated as scheduling. Therefore, the complexity of the scheduler is much simplified. 123 123 2 123 123 H ε =|g | T − 2 g h p + 1, 1 1 1 1 1 The cost of RSMA comes with a slightly higher encoding π (1) π (1) π (1) π (1) 2,1 2,1 2 2,1 2,1 H complexity since private and common streams need to be ]ε =|g | T − 2 g h p + 1, π (1) 1 1 1 1 1 2,1 encoded. For the 1-layer RS in a K-user MISO BC, K + 1 π (2) π (2) π (2) π (2) 2,1 2,1 2 2,1 2,1 H streams need to be encoded in contrast to K streams for ε =|g | T − 2 g h p + 1, π (2) 1 2,1 1 1 1 1 NOMA and SDMA. 1 1 2 1 1 H ε =|g | T − 2 g h p + 1, 1 1 1 1 1 4.7 Optimization of RS (29) The WMMSE approach proposed in [42] is extended to 123 H 2 H 2 H 2 where T  |h p | +|h p | +|h p | + solve the problem. 
The WMMSE algorithm to solve the 123 12 13 1 1 1 1 H 2 H 2 H 2 H 2 |h p | +|h p | +|h p | +|h p | + 1 is the receive sum rate maximization problem with 1-layer RS (dis- 23 1 2 3 1 1 1 1 π (1) π (2) 2,1 2,1 123 H 2 cussed in Section 4.4.1)isproposedin[28]. We further power at user-1. T  T −|h p | , T 1 1 1 1 π (1) π (2) extend it to solve the generalized RS problem (20). To sim- 2,1 H 2 1 2,1 H 2 T −|h p | , T  T −|h p | .The π (1) π (2) 2,1 2,1 1 1 1 1 1 plify the explanation, we focus on the 3-user problem (15). optimum MMSE equalizers are It can be easily extended to solve the K-user generalized MMSE −1 RS problem. 123 H 123 g = (p ) h T , 123 1 1 1 As the 1-order and 2-order streams to be decoded MMSE −1 π (1) π (1) 2,1 H 2,1 at different users are not the same, we take user- g = (p ) h T , π (1) 1 1 2,1 1 1 as an example. The procedure of the WMMSE (30) MMSE −1 π (2) π (2) 2,1 H 2,1 algorithm is the same for other users. The signal g = (p ) h T , π (2) 1 1 2,1 1 received at user-1 is y = h Ps + n .Itdecodes 1 1 MMSE −1 1 H 1 four streams s , s , s , s sequentially using g = (p ) h T . 123 π (1) π (2) 1 1 1 2,1 2,1 1 1 Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 17 of 54 π (1) 2,1 ∗ MMSE π (2) π (2) ∗ MMSE ∂ε ∂ε 2,1 2,1 1 1 1 1 g = g ,and g = g . They are calculated by solving = 0, = 0, 1 1 1 1 123 π (1) 2,1 ∂g 1 ∂g Substituting the optimum equalizers into (32), we obtain π (2) 2,1 ∂ε ∂ε 1 1 = 0, = 0. Substituting (30)into(29), the π (2) 1 2,1 ∂g ∂g 1 MMSE MMSE MMSEs become 123 123 123 123 123 ξ g = u ε − log u , 1 1 1 1 2 1 MMSE MMSE π2,1(1) π2,1(1) π2,1(1) π2,1(1) π2,1(1) ξ g = u ε − log u , 1 1 1 1 1 MMSE −1 123 123 123 123 MMSE MMSE ε  min ε = T I , π (2) π (2) π (2) π (2) π (2) 2,1 2,1 2,1 2,1 2,1 1 1 1 1 123 ξ g = u ε − log u , g 1 1 1 1 2 1 MMSE MMSE 1 1 1 1 1 MMSE −1 ξ g = u ε − log u . 
π (1) π (1) π (1) π (1) 1 1 1 1 2 1 2,1 2,1 2,1 2,1 ε  min ε = T I , 1 1 1 1 π (1) 2,1 (33) MMSE −1 π (2) π (2) π (2) π (2) 2,1 2,1 2,1 2,1 MMSE 123 123 ε  min ε = T I , ∂ξ g 1 1 1 1 1 1 π (2) 2,1 By furthersolving theequations = 0, g 123 1 ∂u MMSE MMSE π (1) π (1) π (2) π (2) 2,1 2,1 2,1 2,1 ∂ξ g ∂ξ g MMSE −1 1 1 1 1 1 1 1 1 ε  min ε = T I , 1 1 1 1 = 0, = 0, 1 π (1) π (2) 2,1 2,1 1 ∂u ∂u 1 1 MMSE 1 1 (31) ∂ξ g 1 1 and = 0, we obtain the optimum MMSE ∂u weights as π (1) π (1) π (2) π (2) 2,1 2,1 2,1 2,1 123 1 where I = T , I = T , I = T ,and 1 1 1 1 1 1 1 1 H 2 −1 = T −|h p | .Based on (31), the SINRs of decod- 1 ∗ MMSE MMSE 1 1 1 123 123 123 u = u  ε , 1 1 1 ing the intended streams at user-1 can be expressed as MMSE −1 MMSE π (1) π (1) 2,1 2,1 123 123 ∗ MMSE MMSE γ = 1/ ε −1, γ = 1/ ε −1, π (1) π (1) π (1) 2,1 2,1 2,1 1 1 1 1 u = u  ε , 1 1 1 MMSE π (2) π (2) MMSE 2,1 2,1 1 1 γ = 1/ ε −1, and γ = 1/ ε −1. −1 1 1 1 1 ∗ MMSE MMSE π (2) π (2) π (2) 2,1 2,1 2,1 u = u  ε , The corresponding rates are rewritten as R =− log 1 1 1 1 2 MMSE MMSE π (1) π (1) 2,1 2,1 −1 ε , R =− log ε , ∗ MMSE MMSE 1 1 1 1 1 2 1 u = u  ε . 1 1 1 MMSE π (2) π (2) 2,1 2,1 (34) R =− log ε ,and R =− log 2 2 1 1 1 MMSE ε . The augmented WMSEs are Substituting (34)into(33), we establish the rate WMMSE relationship as 123 123 123 123 ξ = u ε − log u , 1 1 1 2 1 MMSE 123 123 123 ξ  min ξ = 1 − R , 1 1 1 123 123 π (1) π (1) π (1) π (1) u ,g 2,1 2,1 2,1 2,1 1 1 ξ = u ε − log u , 1 1 1 2 1 MMSE π (1) π (1) π (1) 2,1 2,1 2,1 (32) ξ  min ξ = 1 − R , 1 1 1 π (2) π (2) π (2) π (2) 2,1 2,1 2,1 2,1 π (1) π (1) 2,1 2,1 ξ = u ε − log u , u ,g 2 1 1 1 1 1 1 MMSE π (2) π (2) π (2) 2,1 2,1 2,1 1 1 1 1 ξ  min ξ = 1 − R , ξ = u ε − log u , 1 1 1 1 1 1 2 1 π (2) π (2) 2,1 2,1 u ,g 1 1 MMSE 1 1 1 ξ  min ξ = 1 − R . 1 1 1 1 1 u ,g π (1) π (2) 1 1 123 2,1 2,1 1 where u , u , u ,andu are weights associated 1 1 1 1 π (1) (35) 2,1 ∂ξ ∂ξ 1 1 with each stream at user-1. 
Similarly, we can establish the rate-WMMSE relationships for user-2 and user-3. Motivated by the rate-WMMSE relationship in (35), we reformulate the optimization problem (15) as

$$
\begin{aligned}
\min_{\mathbf{P},\mathbf{x},\mathbf{u},\mathbf{g}}\quad & \sum_{k=1}^{3} u_k \xi_{k,tot} && \text{(36a)}\\
\text{s.t.}\quad & X_1^{123} + X_2^{123} + X_3^{123} + 1 \geq \xi_{123} && \text{(36b)}\\
& X_1^{12} + X_2^{12} + 1 \geq \xi_{12} && \text{(36c)}\\
& X_1^{13} + X_3^{13} + 1 \geq \xi_{13} && \text{(36d)}\\
& X_2^{23} + X_3^{23} + 1 \geq \xi_{23} && \text{(36e)}\\
& \mathrm{tr}\left(\mathbf{P}\mathbf{P}^H\right) \leq P_t && \text{(36f)}\\
& \xi_{k,tot} \leq 1 - R_k^{th},\ k \in \{1, 2, 3\} && \text{(36g)}\\
& \mathbf{x} \leq \mathbf{0} && \text{(36h)}
\end{aligned}
$$

where $\mathbf{x} = \left[X_1^{123}, X_2^{123}, X_3^{123}, X_1^{12}, X_2^{12}, X_1^{13}, X_3^{13}, X_2^{23}, X_3^{23}\right]$ is the transformation of the common rate $\mathbf{c}$. $\mathbf{u} = \left[u_1^{123}, u_2^{123}, u_3^{123}, u_1^{12}, u_2^{12}, u_1^{13}, u_3^{13}, u_2^{23}, u_3^{23}, u_1^{1}, u_2^{2}, u_3^{3}\right]$. $\mathbf{g} = \left[g_1^{123}, g_2^{123}, g_3^{123}, g_1^{12}, g_2^{12}, g_1^{13}, g_3^{13}, g_2^{23}, g_3^{23}, g_1^{1}, g_2^{2}, g_3^{3}\right]$. $\xi_{1,tot} = X_1^{123} + X_1^{12} + X_1^{13} + \xi_1^{1}$, $\xi_{2,tot} = X_2^{123} + X_2^{12} + X_2^{23} + \xi_2^{2}$, and $\xi_{3,tot} = X_3^{123} + X_3^{13} + X_3^{23} + \xi_3^{3}$ are individual WMSEs. $\xi_{123} = \max\left\{\xi_1^{123}, \xi_2^{123}, \xi_3^{123}\right\}$, $\xi_{12} = \max\left\{\xi_1^{12}, \xi_2^{12}\right\}$, $\xi_{13} = \max\left\{\xi_1^{13}, \xi_3^{13}\right\}$, and $\xi_{23} = \max\left\{\xi_2^{23}, \xi_3^{23}\right\}$ are the achievable WMSEs of the corresponding streams.

It can be easily shown that by minimizing (36a) with respect to $\mathbf{u}$ and $\mathbf{g}$, respectively, we obtain the MMSE solutions $\left(\mathbf{u}^{\mathrm{MMSE}}, \mathbf{g}^{\mathrm{MMSE}}\right)$ formed by the corresponding MMSE equalizers and weights. They satisfy the KKT optimality conditions of (36) for $\mathbf{P}$. Therefore, according to the rate-WMMSE relationship (35) and the common rate transformation $\mathbf{c} = -\mathbf{x}$, problem (36) can be transformed to problem (15). For any point $\left(\mathbf{x}^*, \mathbf{P}^*, \mathbf{u}^*, \mathbf{g}^*\right)$ satisfying the KKT optimality conditions of (36), the solution given by $\left(\mathbf{c}^* = -\mathbf{x}^*, \mathbf{P}^*\right)$ satisfies the KKT optimality conditions of (15). The WSR problem (15) is then transformed into the WMMSE problem (36).

The problem (36) is still non-convex for the joint optimization of $(\mathbf{x}, \mathbf{P}, \mathbf{u}, \mathbf{g})$. We have derived that when $(\mathbf{x}, \mathbf{P}, \mathbf{u})$ are fixed, the optimal equalizer is the MMSE equalizer $\mathbf{g}^{\mathrm{MMSE}}$. When $(\mathbf{x}, \mathbf{P}, \mathbf{g})$ are fixed, the optimal weight is the MMSE weight $\mathbf{u}^{\mathrm{MMSE}}$. When $(\mathbf{u}, \mathbf{g})$ are fixed, $(\mathbf{x}, \mathbf{P})$ is coupled in the optimization problem (36) and a closed-form solution cannot be derived. However, the problem is then a convex quadratically constrained quadratic program (QCQP), which can be solved using interior-point methods. These properties motivate us to use alternating optimization (AO) to solve the problem. In the $n$th iteration of the AO algorithm, the equalizers and weights are first updated using the precoders obtained in the $(n-1)$th iteration, $(\mathbf{u}, \mathbf{g}) = \left(\mathbf{u}^{\mathrm{MMSE}}\left(\mathbf{P}^{[n-1]}\right), \mathbf{g}^{\mathrm{MMSE}}\left(\mathbf{P}^{[n-1]}\right)\right)$. With the updated $(\mathbf{u}, \mathbf{g})$, $(\mathbf{x}, \mathbf{P})$ can then be updated by solving the problem (36). $(\mathbf{u}, \mathbf{g})$ and $(\mathbf{x}, \mathbf{P})$ are iteratively updated until the WSR converges. The details of the AO algorithm are shown in Algorithm 1, where $\mathrm{WSR}^{[n]}$ is the WSR calculated based on the updated $(\mathbf{x}, \mathbf{P})$ in the $n$th iteration and $\epsilon$ is the tolerance of the algorithm. The AO algorithm is guaranteed to converge, as the WSR does not decrease from one iteration to the next and is bounded above for a given power constraint.

Algorithm 1: Alternating Optimization Algorithm
  Initialize: $n \leftarrow 0$, $\mathbf{P}^{[n]}$, $\mathrm{WSR}^{[n]}$;
  repeat
    $n \leftarrow n + 1$;
    $\mathbf{P}^{[n-1]} \leftarrow \mathbf{P}$;
    $\mathbf{u} \leftarrow \mathbf{u}^{\mathrm{MMSE}}\left(\mathbf{P}^{[n-1]}\right)$;
    $\mathbf{g} \leftarrow \mathbf{g}^{\mathrm{MMSE}}\left(\mathbf{P}^{[n-1]}\right)$;
    update $(\mathbf{x}, \mathbf{P})$ by solving (36) using the updated $\mathbf{u}$ and $\mathbf{g}$;
  until $\left|\mathrm{WSR}^{[n]} - \mathrm{WSR}^{[n-1]}\right| \leq \epsilon$;

When considering imperfect CSIT, we follow the robust approach proposed in [28] for 1-layer RS with imperfect CSIT. The precoders are optimized based on the available channel estimate to maximize a conditional averaged weighted sum rate (AWSR) metric, computed using partial CSIT knowledge. The stochastic AWSR problem is transformed into a deterministic counterpart using the sample average approximation (SAA) method. Then, the rate-WMMSE relationship is applied to transform the AWSR problem into a convex form, which is solved using an AO algorithm. The robust approach for 1-layer RS in [28] can be easily extended to solve the K-user generalized RS problem based on our proposed Algorithm 1, which will not be explained here.
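The alternating structure of Algorithm 1 can be illustrated on a simpler relative of problem (36). The sketch below is not the generalized RS solver: it assumes no common streams (plain MU–LP) and no rate constraints, so that the precoder step has a closed form up to one Lagrange multiplier, found here by bisection, instead of requiring a QCQP solver. Function names, the tolerance, the iteration cap, and the example channel are all illustrative assumptions; unit noise power is assumed.

```python
import numpy as np

def mmse_step(H, P, u):
    """MMSE equalizers g, MMSE weights w, and the WSR for precoders P (columns)."""
    G = H.conj().T @ P                        # G[k, j] = h_k^H p_j
    T = np.sum(np.abs(G) ** 2, axis=1) + 1    # receive power at each user
    s = np.diag(G)                            # h_k^H p_k
    g = s.conj() / T                          # p_k^H h_k / T_k
    eps = 1 - np.abs(s) ** 2 / T              # MMSE of each user's stream
    return g, 1 / eps, -float(u @ np.log2(eps))

def precoder_step(H, g, w, u, Pt, iters=60):
    """Minimize sum_k u_k w_k eps_k over P subject to tr(P P^H) <= Pt."""
    Nt = H.shape[0]
    A = (H * (u * w * np.abs(g) ** 2)) @ H.conj().T   # sum_k u_k w_k |g_k|^2 h_k h_k^H
    B = H * (u * w * g.conj())                        # column k: u_k w_k g_k^* h_k
    solve = lambda mu: np.linalg.solve(A + mu * np.eye(Nt), B)
    lo, hi = 0.0, np.sqrt(np.sum(np.abs(B) ** 2) / Pt)  # the power at hi is at most Pt
    for _ in range(iters):                            # bisection on the multiplier
        mid = 0.5 * (lo + hi)
        if np.sum(np.abs(solve(mid)) ** 2) > Pt:
            lo = mid
        else:
            hi = mid
    return solve(hi)

def wmmse_mulp(H, u, Pt, tol=1e-6, max_iter=500):
    """Alternate between (g, w) updates and precoder updates until the WSR settles."""
    K = H.shape[1]
    P = np.sqrt(Pt / K) * H / np.linalg.norm(H, axis=0)   # MRT initialization
    g, w, r = mmse_step(H, P, u)
    wsr = [r]
    for _ in range(max_iter):
        P = precoder_step(H, g, w, u, Pt)
        g, w, r = mmse_step(H, P, u)
        wsr.append(r)
        if abs(wsr[-1] - wsr[-2]) <= tol:
            break
    return P, wsr

# Two users, two antennas: h1 = [1, 1]^H and h2 = [1, e^{j*theta}]^H as columns of H.
theta = np.pi / 3
H = np.array([[1, 1], [1, np.exp(1j * theta)]]).conj()
u = np.array([1.0, 1.0])
Pt = 100.0                                    # 20 dB SNR under unit noise power
P, wsr = wmmse_mulp(H, u, Pt)
```

The list `wsr` records the weighted sum rate after each pass; it should be non-decreasing, which mirrors the convergence argument given for Algorithm 1. Extending the sketch to (36) would replace `precoder_step` with a convex solver over $(\mathbf{x}, \mathbf{P})$.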
5 Results and discussion

In this section, we evaluate the performance of SDMA, NOMA, and RSMA in a wide range of network loads (underloaded and overloaded regimes) and user deployments (with a diversity of channel directions, channel strengths, and qualities of channel state information at the transmitter). We first illustrate the rate region of different strategies in the two-user case, followed by the WSR comparisons of the three-user, four-user, and ten-user cases.

5.1 Underloaded two-user deployment with perfect CSIT

When $K = 2$, the rate region of all strategies can be explicitly compared in a two-dimensional figure. As mentioned earlier, the rate region is the set of all achievable points. Its boundary is calculated by varying the weights assigned to users. In this work, the weight of user-1 is fixed to $u_1 = 1$. The weight of user-2 is varied as $u_2 = 10^{[-3, -1, -0.95, \cdots, 0.95, 1, 3]}$, which is the same as in [42]. To investigate the largest achievable rate region, the individual rate constraints are set to 0 in all strategies, $R_k^{th} = 0, \forall k \in \{1, 2\}$.

In the perfect CSIT scenario, the capacity region is achieved by DPC. Therefore, we compare the rate regions of different beamforming strategies with the DPC region.
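The weight sweep used to trace the rate region boundary can be written down directly. This is a small sketch; the step of 0.05 between the exponents −1 and 1 is inferred from the listed values, and the variable names are illustrative.

```python
import numpy as np

# Exponents -3, then -1 to 1 in steps of 0.05, then 3.
exponents = np.concatenate(([-3.0], np.linspace(-1.0, 1.0, 41), [3.0]))

u1 = 1.0                         # weight of user-1 is fixed
u2_grid = 10.0 ** exponents      # weight of user-2 is swept
weight_pairs = [(u1, u2) for u2 in u2_grid]
```

Each pair in `weight_pairs` corresponds to one WSR problem; solving it yields one point on the boundary of the rate region.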
The DPC region is generated using the algorithm in [44]. Since the WSR problems for all beamforming strategies described earlier are non-convex, the initialization of $\mathbf{P}$ is vital to the final result. It has been observed in [28] that maximum ratio transmission (MRT) combined with singular value decomposition (SVD) provides good overall performance over various channel realizations. It is used in this work for precoder initialization of RS. The precoder for the private message of user-$k$ is initialized as $\mathbf{p}_k = \sqrt{p_k}\,\frac{\mathbf{h}_k}{\|\mathbf{h}_k\|}$, where $p_k = \frac{\alpha P_t}{2}$ and $0 \leq \alpha \leq 1$. The precoder for the common message is initialized as $\mathbf{p}_{12} = \sqrt{p_{12}}\,\mathbf{u}_{12}$, where $p_{12} = (1 - \alpha)P_t$ and $\mathbf{u}_{12}$ is the largest left singular vector of the channel matrix $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2]$. It is calculated as $\mathbf{u}_{12} = \mathbf{U}(:, 1)$, where $\mathbf{U}$ is derived from the SVD of $\mathbf{H}$, i.e., $\mathbf{H} = \mathbf{U}\mathbf{S}\mathbf{V}^H$. To ensure a fair comparison, the precoders of MU–LP are initialized based on MRT. For SC–SIC, the precoder of the user decoded first is initialized based on SVD and that of the user decoded last is initialized based on MRT.

5.1.1 Random channel realizations

We first consider the scenarios where the channel of each user $\mathbf{h}_k$ has independent and identically distributed (i.i.d.) complex Gaussian entries with a certain variance, i.e., $\mathcal{CN}\left(0, \sigma_k^2\right)$. The BS is equipped with two or four antennas ($N_t = 2, 4$) and serves two single-antenna users. Figure 5 shows the average rate regions of different strategies over 100 random channel realizations when $\sigma_1^2 = 1$, $\sigma_2^2 = 1$, and $N_t = 4$. SNRs are 10 and 20 dB, respectively. When the number of transmit antennas is larger than the number of users, MU–LP achieves a good performance. The generated precoders of the users tend to be more orthogonal as the number of transmit antennas increases. In contrast, the average rate region achieved by SC–SIC is small. When $\sigma_1^2 = 1$ and $\sigma_2^2 = 1$, there is no disparity of average channel strengths. SC–SIC is not able to achieve a good performance in such a scenario. As the SC–SIC strategy is motivated by leveraging the channel strength difference among users, it achieves a good performance when the channels are degraded. Specifically, when the channels of users are close to alignment, SC–SIC works better than MU–LP if the users have asymmetric channel strengths. However, for the general non-degraded MISO BC, SC–SIC often yields a performance loss [19].

Fig. 5 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, averaged over 100 random channel realizations, $\sigma_1^2 = 1$, $\sigma_2^2 = 1$, and $N_t = 4$. (a) SNR = 10 dB. (b) SNR = 20 dB. Axes: $R_{1,tot}$ versus $R_{2,tot}$ (bit/s/Hz); curves: DPC, RS, SC–SIC, MU–LP.

The simulation results when $\sigma_1^2 = 1$, $\sigma_2^2 = 0.09$, and $N_t = 2$ are illustrated in Fig. 6. The average channel gain difference between the users increases to 5 dB, and the number of transmit antennas reduces to two. In such a scenario, the rate region gap between RS and MU–LP increases while the rate region gap between RS and SC–SIC decreases. It shows that SC–SIC is more suited to the scenarios where the users experience a large disparity in channel strengths. In both Figs. 5 and 6, the rate region gaps among different strategies increase with SNR. RS achieves a larger rate region than SC–SIC and MU–LP, and it is closer to the capacity region achieved by DPC.

Fig. 6 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, averaged over 100 random channel realizations, $\sigma_1^2 = 1$, $\sigma_2^2 = 0.09$, and $N_t = 2$. (a) SNR = 10 dB. (b) SNR = 20 dB. Axes: $R_{1,tot}$ versus $R_{2,tot}$ (bit/s/Hz); curves: DPC, RS, SC–SIC, MU–LP.

5.1.2 Specific channel realizations

In order to have a better insight into the benefits of RS over MU–LP and SC–SIC, we investigate the influence of user angle and channel strength on the performance. When $N_t = 4$, the channels of users are realized as

$$
\begin{aligned}
\mathbf{h}_1 &= [1, 1, 1, 1]^H,\\
\mathbf{h}_2 &= \gamma \times \left[1, e^{j\theta}, e^{j2\theta}, e^{j3\theta}\right]^H.
\end{aligned}
\tag{37}
$$

In the above channel realizations, $\gamma$ and $\theta$ are control variables. $\gamma$ controls the channel strength of user-2. If $\gamma = 1$, the channel strength of user-1 is equal to that of user-2. If $\gamma = 0.3$, user-2 suffers from an additional 5 dB path loss compared to user-1. $\theta$ controls the angle between the channels of user-1 and user-2. It varies from 0 to $\frac{\pi}{2}$. If $\theta = 0$, the channel of user-1 is aligned with that of user-2. If $\theta = \frac{\pi}{2}$, the channels of user-1 and user-2 are orthogonal to each other. In the following results, $\gamma = 1, 0.3$, which corresponds to 0 dB and 5 dB channel strength difference, respectively. For each $\gamma$, $\theta$ takes values from $\theta = \left[\frac{\pi}{9}, \frac{2\pi}{9}, \frac{\pi}{3}, \frac{4\pi}{9}\right]$. Intuitively, when $\theta$ is less than $\frac{\pi}{9}$, the channels of users are sufficiently aligned and SC–SIC performs well. When $\theta$ is larger than $\frac{4\pi}{9}$, the channels of users are sufficiently orthogonal to each other and MU–LP is more suitable. Therefore, we consider angles within the range of $\left[\frac{\pi}{9}, \frac{4\pi}{9}\right]$. SNR is fixed to 20 dB. When $N_t = 2$, the channels of user-1 and user-2 are realized as $\mathbf{h}_1 = [1, 1]^H$ and $\mathbf{h}_2 = \gamma \times \left[1, e^{j\theta}\right]^H$, respectively. The same values of $\gamma$ and $\theta$ are adopted for $N_t = 2$ as for $N_t = 4$.
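The specific channel realizations of (37) and the MRT/SVD initialization described at the start of Section 5.1 can be sketched as follows. This is an illustrative NumPy sketch, not the authors' code: the function names and the choice $\alpha = 0.5$ are assumptions, and $\alpha P_t/2$ and $(1-\alpha)P_t$ are treated as stream powers, so the amplitudes are their square roots.

```python
import numpy as np

def two_user_channels(gamma, theta, Nt=4):
    """Columns h1, h2 of (37); the superscript H makes each h a conjugated column."""
    n = np.arange(Nt)
    h1 = np.ones(Nt, dtype=complex)
    h2 = gamma * np.exp(1j * n * theta).conj()
    return np.column_stack([h1, h2])

def init_rs_precoders(H, Pt, alpha):
    """MRT for the private streams, dominant left singular vector for the common stream."""
    privates = np.sqrt(alpha * Pt / 2) * H / np.linalg.norm(H, axis=0)
    U, _, _ = np.linalg.svd(H)                 # H = U S V^H
    common = np.sqrt((1 - alpha) * Pt) * U[:, 0]
    return common, privates

H = two_user_channels(gamma=0.3, theta=np.pi / 9)
p12, P_private = init_rs_precoders(H, Pt=100.0, alpha=0.5)
total_power = np.sum(np.abs(p12) ** 2) + np.sum(np.abs(P_private) ** 2)

# Normalized correlation between the two channels for two extreme angles.
def correlation(H):
    return abs(np.vdot(H[:, 0], H[:, 1])) / (np.linalg.norm(H[:, 0]) * np.linalg.norm(H[:, 1]))

rho_aligned = correlation(two_user_channels(1.0, 0.0))
rho_orthogonal = correlation(two_user_channels(1.0, np.pi / 2))
```

Here `total_power` equals the power budget, `rho_aligned` is 1 for $\theta = 0$, and `rho_orthogonal` vanishes for $\theta = \pi/2$ with $N_t = 4$, in line with the description of the control variable $\theta$.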
Figure 7 shows the results when $\gamma = 1$ and $N_t = 4$. In all subfigures, the rate region achieved by RS is equal to or larger than that of SC–SIC and MU–LP. When $\gamma = 1$ and $\theta = \frac{\pi}{9}$, the channels of user-1 and user-2 almost coincide. RS exhibits a clear rate region improvement over SC–SIC and MU–LP. SC–SIC cannot achieve a good performance due to the equal channel gain, while the performance of MU–LP is poor when the user channels are closely aligned to each other. As $\theta$ increases, the gap between the rate regions of RS and MU–LP reduces, as the performance of MU–LP is better when the channels of users are more orthogonal to each other, while the gap between the rate regions of MU–LP and SC–SIC increases. The rate regions of RS and MU–LP tend to the capacity region achieved by DPC as $\theta$ increases. As shown in Fig. 7d, when the channels of users are sufficiently orthogonal to each other, the rate regions of DPC, RS, and MU–LP are almost identical. In such an orthogonal scenario, RS reduces to MU–LP.

Fig. 7 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, $\gamma = 1$ and $N_t = 4$, SNR = 20 dB. (a) $\theta = \pi/9$. (b) $\theta = 2\pi/9$. (c) $\theta = \pi/3$. (d) $\theta = 4\pi/9$.

Figure 8 shows the results when $\gamma = 1$ and $N_t = 2$. In all subfigures, RS outperforms MU–LP and SC–SIC. Compared with the results for $N_t = 4$, the rate region gap between RS and MU–LP is enlarged when $N_t = 2$. When the number of transmit antennas decreases, it becomes more difficult for MU–LP to design orthogonal precoders for users. MU–LP is more suited to underloaded scenarios ($N_t > K$). In both Figs. 7 and 8, the rate region of SC–SIC is the worst due to the equal channel gain. In contrast, RS performs well for any angle between user channels.

Fig. 8 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, $\gamma = 1$ and $N_t = 2$, SNR = 20 dB. (a) $\theta = \pi/9$. (b) $\theta = 2\pi/9$. (c) $\theta = \pi/3$. (d) $\theta = 4\pi/9$.

Figure 9 shows the rate region comparison of DPC, RS, SC–SIC, and MU–LP transmission schemes with 5 dB channel strength difference between the two users, i.e., $\gamma = 0.3$ and $N_t = 4$. RS and SC–SIC are much closer to the DPC region in the setting of Fig. 9 compared to Fig. 7 because of the 5 dB channel strength difference. Figure 9b, c are interesting, as SC–SIC and MU–LP each outperform the other in one part of the rate region. There is a crossing point between the two schemes in each of these subfigures. The rate region of RS is equal to or larger than the convex hull of the rate regions of SC–SIC and MU–LP.

Fig. 9 Achievable rate region comparison of different strategies in perfect CSIT, $\gamma = 0.3$, $N_t = 4$, SNR = 20 dB. (a) $\theta = \pi/9$. (b) $\theta = 2\pi/9$. (c) $\theta = \pi/3$. (d) $\theta = 4\pi/9$.

Figure 10 shows the rate region comparison when $\gamma = 0.3$ and $N_t = 2$. Comparing Fig. 10 with Fig. 9, SC–SIC achieves a relatively better performance when the number of transmit antennas reduces. The rate regions of RS and SC–SIC overlap, and they almost achieve the capacity region when $\theta = \frac{\pi}{9}$. However, as $\theta$ increases, the rate region gap between RS and SC–SIC increases despite the 5 dB channel gain difference. Both SC–SIC and RS rely on one SIC when there are two users in the system. Though the receiver complexity of SC–SIC and RS is the same, RS achieves an explicit performance gain over SC–SIC in most investigated scenarios. Compared with MU–LP and SC–SIC, RS is suited to any channel angle and channel gain difference.

Fig. 10 Achievable rate region comparison of different strategies in perfect CSIT, $\gamma = 0.3$, $N_t = 2$, SNR = 20 dB. (a) $\theta = \pi/9$. (b) $\theta = 2\pi/9$. (c) $\theta = \pi/3$. (d) $\theta = 4\pi/9$.

More results of underloaded two-user deployments with perfect CSIT are given in Appendix 1. We further illustrate the rate regions of different strategies when SNR is 10 dB. Comparing the corresponding figures of 10 dB and 20 dB, we conclude that as SNR increases, the gaps among the rate regions of different schemes increase, with RS exhibiting further performance benefits. In all investigated scenarios, RS always outperforms MU–LP and SC–SIC.

5.2 Underloaded two-user deployment with imperfect CSIT

Next, we investigate the rate region of different transmission schemes in the presence of imperfect CSIT. We assume the users are able to estimate the channel perfectly while the instantaneous channel estimated at the BS is imperfect. We assume the estimated channels of user-1 and user-2 are $\hat{\mathbf{h}}_1 = [1, 1, 1, 1]^H$ and $\hat{\mathbf{h}}_2 = \gamma \times \left[1, e^{j\theta}, e^{j2\theta}, e^{j3\theta}\right]^H$ when $N_t = 4$. For the given channel estimate at the BS, the channel realization is $\mathbf{h}_k = \hat{\mathbf{h}}_k + \tilde{\mathbf{h}}_k, \forall k \in \{1, 2\}$, where $\tilde{\mathbf{h}}_k$ is the estimation error of user-$k$. $\tilde{\mathbf{h}}_k$ has i.i.d. complex Gaussian entries drawn from $\mathcal{CN}\left(0, \sigma_{e,k}^2\right)$. The error variances of user-1 and user-2 are $\sigma_{e,1}^2 = P_t^{-0.6}$ and $\sigma_{e,2}^2 = \gamma P_t^{-0.6}$, respectively. The precoders are initialized and designed using the estimated channels $\hat{\mathbf{h}}_1$ and $\hat{\mathbf{h}}_2$ and the same methods as stated in the perfect CSIT scenarios. One thousand different channel error samples are generated for each user. Each point in the rate region is the average rate over the generated 1000 channels. SNR is fixed to 20 dB.
Figures 11 and 12 show the results when $\gamma = 1$ and $\gamma = 0.3$, respectively. Similarly to the results in perfect CSIT, the gaps between the rate regions of RS and MU–LP reduce as $\theta$ increases in both figures. When $\theta = \frac{4\pi}{9}$, the channels of the two users are sufficiently orthogonal. The rate regions of RS and MU–LP are almost identical. SC–SIC achieves a good performance when the channels of users are sufficiently aligned with enough channel gain difference, as shown in Fig. 12a.

Fig. 11 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 1$, $N_t = 4$, SNR = 20 dB. (a) $\theta = \pi/9$. (b) $\theta = 2\pi/9$. (c) $\theta = \pi/3$. (d) $\theta = 4\pi/9$.

Fig. 12 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 0.3$, $N_t = 4$, SNR = 20 dB. (a) $\theta = \pi/9$. (b) $\theta = 2\pi/9$. (c) $\theta = \pi/3$. (d) $\theta = 4\pi/9$.

Comparing Figs. 11 and 7, the rate region gap between RS and MU–LP increases in imperfect CSIT due to the residual interference introduced. The interference nulling in MU–LP is distorted and yields residual interference at the receiver, which jeopardizes the achievable rate. In contrast, the rate region gap between RS and SC–SIC slightly reduces in imperfect CSIT, as observed by comparing Fig. 12 with Fig. 9. SC–SIC is less sensitive to CSIT inaccuracy than MU–LP. However, the rate region gap between RS and SC–SIC is still obvious. In comparison, RS is more flexible and robust to multi-user interference originating from the imperfect CSIT, as evidenced by the recent literature on RS with imperfect CSIT [27–33, 38–41]. With RS, the amount of interference decoded by both users (through the presence of the common stream) is adjusted dynamically to the channel conditions (channel directions and strengths) and CSIT inaccuracy.

More results of underloaded two-user deployments with imperfect CSIT are given in Appendix 2. The rate regions of different strategies for varied SNR, $N_t$, and $\gamma$ are illustrated. We further show that the performance of RS is stable in a wide range of parameters, namely number of transmit antennas, user deployments, and CSIT inaccuracy. RS achieves equal or better performance than MU–LP and SC–SIC in all simulated channels.

5.3 Underloaded three-user deployment with perfect CSIT

When $K = 3$, the rate region of each strategy is a three-dimensional surface. The gaps among rate regions of different strategies are difficult to display. As each point of the rate region is derived by solving the WSR problem with a fixed weight vector $\mathbf{u}$, the WSRs instead of the rate regions of different transmission strategies are compared in the three-user case.

Two RS schemes are investigated in three-user deployments. RS refers to the generalized RS strategy of Section 4.2 and 1-layer RS refers to the low-complexity RS strategy of Section 4.4.1. We compare the WSR of RS, 1-layer RS, DPC, SC–SIC, and MU–LP. The beamforming initialization of different strategies is extended based on the methods adopted in the two-user case. There are three streams of distinct stream orders in RS (1/2/3-order streams). The precoders of the streams are initialized differently. The transmit power $P_t$ is divided into three parts $\alpha_1 P_t$, $\alpha_2 P_t$, and $\alpha_3 P_t$ for streams of three distinct stream orders, where $\alpha_1, \alpha_2, \alpha_3 \in [0, 1]$ and $\alpha_1 + \alpha_2 + \alpha_3 = 1$.
The precoder $\mathbf{p}_k, \forall k \in \{1, 2, 3\}$ of the 1-order stream (private stream) $s_k$ is initialized as $\mathbf{p}_k = \sqrt{p_k}\,\frac{\mathbf{h}_k}{\|\mathbf{h}_k\|}$, where $p_k = \frac{\alpha_1 P_t}{3}$ is the allocated power. The precoders $\mathbf{p}_{12}$, $\mathbf{p}_{13}$, and $\mathbf{p}_{23}$ of the 2-order streams are initialized as $\mathbf{p}_{12} = \sqrt{p_{12}}\,\mathbf{u}_{12}$, $\mathbf{p}_{13} = \sqrt{p_{13}}\,\mathbf{u}_{13}$, and $\mathbf{p}_{23} = \sqrt{p_{23}}\,\mathbf{u}_{23}$, respectively, where $p_{12} = p_{13} = p_{23} = \frac{\alpha_2 P_t}{3}$ and $\mathbf{u}_{12}$ is the largest left singular vector of the channel matrix $\mathbf{H}_{12} = [\mathbf{h}_1, \mathbf{h}_2]$. Similarly, $\mathbf{u}_{13}$ and $\mathbf{u}_{23}$ are the largest left singular vectors of the channel matrices $\mathbf{H}_{13} = [\mathbf{h}_1, \mathbf{h}_3]$ and $\mathbf{H}_{23} = [\mathbf{h}_2, \mathbf{h}_3]$, respectively. The precoder $\mathbf{p}_{123}$ of the 3-order stream (conventional common stream) $s_{123}$ is initialized as $\mathbf{p}_{123} = \sqrt{p_{123}}\,\mathbf{u}_{123}$, where $p_{123} = \alpha_3 P_t$ and $\mathbf{u}_{123}$ is the largest left singular vector of the channel matrix $\mathbf{H}_{123} = [\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3]$. The beamforming initialization of 1-layer RS is similar to that of RS, except that only $\mathbf{p}_{123}$ and $\mathbf{p}_k, \forall k \in \{1, 2, 3\}$ are present. By setting $\alpha_2 = 0$, the initialization of RS is applied to 1-layer RS. To ensure a fair comparison, the precoders of MU–LP are initialized based on MRT. For SC–SIC, the precoder of the user decoded first $\mathbf{p}_{\pi(1)}$ is initialized as $\mathbf{p}_{\pi(1)} = \sqrt{p_{\pi(1)}}\,\mathbf{u}_{\pi(1)}$, where $p_{\pi(1)} = \alpha_3 P_t$ and $\mathbf{u}_{\pi(1)}$ is the largest left singular vector of the channel matrix $\mathbf{H}_{123} = [\mathbf{h}_1, \mathbf{h}_2, \mathbf{h}_3]$. The precoder of the user decoded second $\mathbf{p}_{\pi(2)}$ is initialized as $\mathbf{p}_{\pi(2)} = \sqrt{p_{\pi(2)}}\,\mathbf{u}_{\pi(2)}$, where $p_{\pi(2)} = \alpha_2 P_t$ and $\mathbf{u}_{\pi(2)}$ is the largest left singular vector of the channel matrix $\mathbf{H}_{\pi(23)} = [\mathbf{h}_{\pi(2)}, \mathbf{h}_{\pi(3)}]$. The precoder of the user decoded last is initialized based on MRT.

We first consider an underloaded scenario. The BS is equipped with four transmit antennas ($N_t = 4$) and serves three single-antenna users in all simulations. The individual rate constraint is set to 0, $R_k^{th} = 0, \forall k \in \{1, 2, 3\}$. The channels of users are realized as

$$
\begin{aligned}
\mathbf{h}_1 &= [1, 1, 1, 1]^H,\\
\mathbf{h}_2 &= \gamma_1 \times \left[1, e^{j\theta_1}, e^{j2\theta_1}, e^{j3\theta_1}\right]^H,\\
\mathbf{h}_3 &= \gamma_2 \times \left[1, e^{j\theta_2}, e^{j2\theta_2}, e^{j3\theta_2}\right]^H.
\end{aligned}
\tag{38}
$$

$\gamma_1$, $\gamma_2$ and $\theta_1$, $\theta_2$ are control variables as discussed in the two-user case. For a given set of $\gamma_1$ and $\gamma_2$, $\theta_1$ takes values from $\theta_1 = \left[\frac{\pi}{9}, \frac{2\pi}{9}, \frac{\pi}{3}, \frac{4\pi}{9}\right]$ and $\theta_2 = 2\theta_1$. When $\theta_1 = \frac{\pi}{9}$ and $\theta_2 = \frac{2\pi}{9}$, the channels of user-1 and user-2, and of user-2 and user-3, are sufficiently aligned. When $\theta_1 = \frac{4\pi}{9}$ and $\theta_2 = \frac{8\pi}{9}$, the channels of user-1 and user-2, and of user-2 and user-3, are sufficiently orthogonal. We consider SNRs within the range 0 to 30 dB. We assume the sum of the weights allocated to users is equal to one, i.e., $u_1 + u_2 + u_3 = 1$.

Figures 13 and 14 show the results when the weight vectors are $\mathbf{u} = [0.2, 0.3, 0.5]$ and $\mathbf{u} = [0.4, 0.3, 0.3]$, respectively. In both figures, $\gamma_1 = 1$ and $\gamma_2 = 0.3$. There is a 5 dB channel gain difference between user-1 and user-3 as well as between user-2 and user-3. In all scenarios and SNRs, RS always outperforms MU–LP and SC–SIC. Compared with Fig. 14, the WSR improvement of RS is more explicit in Fig. 13. It implies that RS provides better enhancement of system throughput and user fairness. The performance of SC–SIC is the worst in most subfigures. This is due to the underloaded user deployments where $N_t > K$. One of the three users is required to decode all the messages, and all the spatial multiplexing gains are sacrificed. Therefore, the sum DoF of SC–SIC is reduced to 1, resulting in the deteriorated performance of SC–SIC in underloaded scenarios. In comparison, the performance of MU–LP is better than SC–SIC except in Fig. 14a. MU–LP is more likely to serve the users with higher weights and channel gains by turning off the users with poor weights and channel gains when there are no individual rate constraints. It cannot deal efficiently with user fairness when a higher weight is allocated to the user with weaker channel strength. In contrast, SC–SIC works better when user fairness is considered. The WSR achieved by low-complexity 1-layer RS is equal to or larger than that of MU–LP and SC–SIC in most subfigures. Compared with SC–SIC and MU–LP, 1-layer RS is more robust to different user deployments and only a single SIC is required at each user. Moreover, the WSR of 1-layer RS approaches that of RS in all user deployments. Considering the trade-off between performance and complexity, 1-layer RS is a good alternative to RS.

Fig. 13 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. (a) $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. (b) $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. (c) $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. (d) $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$.

Fig. 14 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. (a) $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. (b) $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. (c) $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. (d) $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$.

In all three-user deployments of SC–SIC, the decoding order is required to be optimized together with the precoder. To investigate the influence of different decoding orders, we compare the WSRs of SC–SIC using different decoding orders when $u_1 = 0.2$, $u_2 = 0.3$, and $u_3 = 0.5$. There are in total six different decoding orders:

SC-SIC order 1: $s_1 \rightarrow s_2 \rightarrow s_3$
SC-SIC order 2: $s_2 \rightarrow s_1 \rightarrow s_3$
SC-SIC order 3: $s_1 \rightarrow s_3 \rightarrow s_2$
SC-SIC order 4: $s_3 \rightarrow s_1 \rightarrow s_2$
SC-SIC order 5: $s_2 \rightarrow s_3 \rightarrow s_1$
SC-SIC order 6: $s_3 \rightarrow s_2 \rightarrow s_1$

In Fig. 15, the WSRs of the six different decoding orders are illustrated in the circumstance where there is a 5 dB channel gain difference between user-1/2 and user-3. When $\gamma_1 = 1$ and $\gamma_2 = 0.3$, it is typical to decode the message of user-3 first, as the channel gain of user-3 is the worst. However, we notice that the optimal decoding order in Fig. 15 is order 3, in which user-1 is decoded first. This is due to the smallest weight being allocated to user-1, $u_1 = 0.2$. It implies that the weights assigned to users affect the optimal decoding order. The scheduler complexity of SC–SIC becomes extremely high in order to find the optimal decoding order. In contrast, 1-layer RS has a much lower scheduling complexity and does not rely on any user ordering at the transmitter. Moreover, it only requires a single SIC at each receiver.

Fig. 15 Weighted sum rate versus SNR comparison of different decoding orders of SC–SIC for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. (a) $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. (b) $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. (c) $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. (d) $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$.

More results of underloaded three-user deployments with perfect CSIT and imperfect CSIT are given in Appendices 3 and 5, respectively. The WSRs of different strategies for varied SNR, $N_t$, $\gamma_1$, $\gamma_2$, and $\mathbf{u}$ are illustrated. In all figures, RS outperforms SC–SIC and MU–LP. Though the scheduler and receiver complexity of 1-layer RS is low, it achieves equal or better performance than SC–SIC and MU–LP in most figures of perfect CSIT and all figures of imperfect CSIT. All forms of RS are robust to a wide range of CSIT inaccuracy, channel gain difference, and channel angles among users.

5.4 Overloaded three-user deployment with perfect CSIT

5.4.1 Two transmit antenna deployment

We first consider an overloaded scenario where the BS is equipped with two antennas ($N_t = 2$) and serves three single-antenna users. The channel realizations and beamforming initialization follow the methods used in the underloaded three-user deployment. The channels of users are realized as $\mathbf{h}_1 = [1, 1]^H$, $\mathbf{h}_2 = \gamma_1 \times \left[1, e^{j\theta_1}\right]^H$, and $\mathbf{h}_3 = \gamma_2 \times \left[1, e^{j\theta_2}\right]^H$. In overloaded scenarios, to guarantee some QoS, we add individual rate constraints to users, as the system otherwise has a tendency to turn off some users. In all simulations of the two transmit antenna deployment, we assume the rate threshold of each user is equal, $R_1^{th} = R_2^{th} = R_3^{th}$. Since the BS is able to serve users with higher QoS requirements as SNR increases, the rate threshold is assumed to increase with SNR. The rate threshold increases as $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz for SNR $= [0, 5, 10, 15, 20, 25, 30]$ dB.

We compare the performance of RS, 1-layer RS, SC–SIC, MU–LP, and SC–SIC per group in the overloaded three-user deployment. In SC–SIC per group, we consider a fixed grouping method. We assume user-1 is in group 1 while user-2 and user-3 are in group 2. The decoding order is optimized together with the precoder. The beamforming initialization of SC–SIC per group is different from SC–SIC. In group 1, the precoder of user-1 is initialized based on MRT. In group 2, the precoder of the user decoded first $\mathbf{p}_{\pi(1)}$ is initialized as $\mathbf{p}_{\pi(1)} = \sqrt{p_{\pi(1)}}\,\mathbf{u}_{\pi(1)}$, where $\mathbf{u}_{\pi(1)}$ is the largest left singular vector of the channel matrix $\mathbf{H}_{23} = [\mathbf{h}_2, \mathbf{h}_3]$. The precoder of the user decoded second is initialized based on MRT.
In all simulations of two transmit
in such overloaded scenario. When the individual rate constraints are not zero and $N_t < K$, MU–LP cannot coordinate the multi-user interference coming from all the users served simultaneously. When the angles between the channels are large enough (subfigures c and d of Fig. 16), the WSR of SC–SIC per group is better than that of SC–SIC. This is due to its ability to combine treating interference as noise (to tackle inter-group interference) with decoding interference (to tackle intra-group interference). However, as the angles between the channels decrease, the performance of SC–SIC improves while that of SC–SIC per group degrades. Whether SC–SIC outperforms SC–SIC per group depends on the SNR and the user deployment. To ensure that the WSR of the NOMA system is maximized, a joint optimization of NOMA strategies, based on switching between SC–SIC and SC–SIC per group on top of deciding the user grouping and user ordering, is required. Such a switching method has high scheduler and receiver complexity, while its achieved performance is still lower than that of the simple 1-layer RS in most user deployments.

Fig. 16 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

5.4.2 Single transmit antenna deployment
In a SISO BC, there is no need to split the messages into common and private parts since the capacity region is achieved by SC–SIC. Nevertheless, in view of the benefit of 1-layer RS in the MISO BC, we may wonder whether RS can be of any help in a SISO BC, especially when it comes to reducing the complexity of the receivers and the number of SIC layers needed.

We therefore compare the performance of 1-layer RS with SC–SIC in a 3-user SISO BC. We note that SC–SIC requires two layers of SIC while 1-layer RS requires a single SIC layer for all users. The channel of each user $h_k$ has an i.i.d. complex Gaussian entry with a certain variance, i.e., $\mathcal{CN}(0, \sigma_k^2)$. Figure 17 shows the average WSRs of different strategies over ten random channel realizations when $\sigma_1^2 = 1$, $\sigma_2^2 = 0.3$, and $\sigma_3^2 = 0.1$. One-layer RS achieves a performance very close to that of SC–SIC. Compared with SC–SIC, the complexity of 1-layer RS is much reduced. There is no ordering issue at the BS, and only one SIC layer is required at each user. Jointly considering the performance and complexity of the system, 1-layer RS is an attractive alternative to SC–SIC.

Fig. 17 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\sigma_1^2 = 1$, $\sigma_2^2 = 0.3$, $\sigma_3^2 = 0.1$, $N_t = 1$, $\mathbf{r}_{th} = [0, 0, 0.01, 0.03, 0.1, 0.2, 0.3]$ bit/s/Hz

More results of overloaded three-user deployments with perfect CSIT and imperfect CSIT are given in Appendices 4 and 6, respectively. The WSRs of different strategies for varied SNR, $N_t$, $\gamma_1$, $\gamma_2$, and $\mathbf{u}$ are illustrated. We further show that RS exhibits a clear WSR gain over SC–SIC, SC–SIC per group, and MU–LP in all simulated channels and weights. One-layer RS outperforms SC–SIC, SC–SIC per group, and MU–LP in most simulated scenarios. It is more robust and achieves a nearly equivalent WSR to that of RS in all user deployments. We also show that 1-layer RS achieves near-optimal performance in various channel conditions of the SISO BC.

5.5 Overloaded four-user deployment with perfect CSIT
We further investigate the four-user system model shown in Fig. 4, where user-1 and user-2 are in group 1 while user-3 and user-4 are in group 2. We compare 2-layer HRS, 1-layer RS per group, 1-layer RS, SC–SIC per group, and MU–LP. In 2-layer HRS, the intra-group interference is mitigated using the intra-group common streams $s_{12}$ and $s_{34}$, and the inter-group interference is mitigated using the inter-group common stream $s_{1234}$. One-layer RS and 1-layer RS per group are two special cases of 2-layer HRS. All users in 1-layer RS are treated as a single group. Only the 4-order common stream $s_{1234}$ and the 1-order private streams are active. No power is allocated to $s_{12}$ and $s_{34}$. In contrast, 1-layer RS per group only allocates power to the intra-group common streams $s_{12}$ and $s_{34}$ and the 1-order private streams. No power is allocated to the inter-group common stream $s_{1234}$. Users within each group are served using RS, and users across groups are served using SDMA so as to mitigate the inter-group interference.

We consider an overloaded scenario. The BS is equipped with two antennas and serves four single-antenna users. The channels of the users are realized as

$$
\begin{aligned}
\mathbf{h}_1 &= [1, 1]^H, \\
\mathbf{h}_2 &= \gamma_1 \times [1, e^{j\theta_1}]^H, \\
\mathbf{h}_3 &= \gamma_2 \times [1, e^{j\theta_2}]^H, \\
\mathbf{h}_4 &= \gamma_3 \times [1, e^{j\theta_3}]^H.
\end{aligned}
\tag{39}
$$

$\gamma_1$, $\gamma_2$, $\gamma_3$ and $\theta_1$, $\theta_2$, $\theta_3$ are control variables. $\theta_1$ is the channel angle between user-1 and user-2. It is denoted as the intra-group angle of group 1. $\theta_2$ is the channel angle between user-1 and user-3. $\theta_2 - \theta_1$ is the channel angle between user-2 and user-3, denoted as the inter-group angle. $\theta_3$ is the channel angle between user-1 and user-4. $\theta_3 - \theta_2$ is the channel angle between user-3 and user-4. It is the intra-group angle of group 2. In the following, we assume that the intra-group angle of group 1 is the same as that of group 2, so that $\theta_3 = \theta_1 + \theta_2$. In each figure, the intra-group angle is varied as $\theta_1 = 0, \frac{\pi}{18}, \frac{\pi}{9}, \frac{\pi}{6}$. The individual rate constraint is set to $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz for SNR $= [0, 5, 10, 15, 20, 25, 30]$ dB. The weights of the users are assumed to be equal, i.e., $u_1 = u_2 = u_3 = u_4 = 0.25$. We also assume that the channel gain difference within each group is equal. The channel gain of user-3 is equal to that of user-1 ($\gamma_2 = 1$), and the channel gain of user-4 is equal to that of user-2 ($\gamma_3 = \gamma_1$).

Figures 18 and 19 show the results when $\gamma_1 = 0.3$. The inter-group angles are $\frac{\pi}{9}$ and $\frac{\pi}{3}$, respectively. The WSR achieved by 2-layer HRS is equal to that of 1-layer RS in both figures, which means that 2-layer HRS reduces to 1-layer RS in these user deployments. Two-layer HRS and 1-layer RS outperform all other schemes. The inter-group and intra-group interference can be jointly mitigated by a single layer of common message. As the inter-group angle increases, the WSR gap between 2-layer HRS and 1-layer RS per group reduces. The inter-group interference can be coordinated by SDMA when the inter-group angle is sufficiently large. One-layer RS per group has the same WSR as SC–SIC per group in both figures. It reduces to SC–SIC per group because SC–SIC is more suitable when the intra-group angle is sufficiently small and the channel gain difference between users within each group is sufficiently large.

More results of overloaded four-user deployments with perfect CSIT are given in Appendix 7. The WSRs of different strategies when there is no channel gain difference ($\gamma_1 = 1$) are illustrated. We further show that 2-layer HRS, 1-layer RS, and 1-layer RS per group achieve equal or better performance than SC–SIC per group and MU–LP in all simulated channel conditions.

5.6 Overloaded ten-user deployment with perfect CSIT
We further consider an extremely overloaded scenario subject to QoS constraints. The BS is equipped with two antennas ($N_t = 2$) and serves ten users. The channel of each user $\mathbf{h}_k$ has i.i.d. complex Gaussian entries with a certain variance, i.e., $\mathcal{CN}(0, \sigma_k^2)$.
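As a minimal sketch of the two channel models used in Sections 5.5 and 5.6, the snippet below builds the four-user channels of Eq. (39) under the assumptions stated in Section 5.5 ($\gamma_2 = 1$, $\gamma_3 = \gamma_1$, $\theta_3 = \theta_1 + \theta_2$) and draws i.i.d. $\mathcal{CN}(0, \sigma_k^2)$ entries for the ten-user case. The function names are illustrative assumptions and are not taken from the paper's simulation code. The entries inside the brackets of Eq. (39) are stored directly, without applying the Hermitian transpose, which would only flip the sign of the phases.

```python
import cmath
import math
import random


def four_user_channels(gamma1, theta1, inter_group_angle):
    """Entries of the four 2-antenna channels of Eq. (39).

    Assumes gamma2 = 1, gamma3 = gamma1, theta2 = theta1 + inter_group_angle
    and theta3 = theta1 + theta2, as stated in Section 5.5.
    """
    theta2 = theta1 + inter_group_angle
    theta3 = theta1 + theta2
    gammas = [1.0, gamma1, 1.0, gamma1]
    thetas = [0.0, theta1, theta2, theta3]
    return [[complex(g), g * cmath.exp(1j * t)] for g, t in zip(gammas, thetas)]


def rayleigh_channel(n_t, variance, rng):
    """Draw n_t i.i.d. CN(0, variance) entries.

    The real and imaginary parts each carry half of the variance.
    """
    s = math.sqrt(variance / 2.0)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n_t)]


def phase_gap(h_a, h_b):
    """Phase difference between the second entries of two channels."""
    return cmath.phase(h_b[1] / h_a[1])


# Deployment of Fig. 18b: gamma1 = 0.3, theta1 = pi/18, inter-group angle pi/9.
channels_4 = four_user_channels(0.3, math.pi / 18, math.pi / 9)

# Ten users, two transmit antennas, unit variance for every user.
rng = random.Random(0)  # fixed seed, chosen only for reproducibility
channels_10 = [rayleigh_channel(2, 1.0, rng) for _ in range(10)]
```

With this parameterization, the phase gap between user-1 and user-2 and between user-3 and user-4 equals the intra-group angle, and the gap between user-2 and user-3 equals the inter-group angle.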
The rate of each user is averaged over the 10 randomly generated channels, each with entries drawn from $\mathcal{CN}(0, \sigma_k^2)$. We compare 1-layer RS, MU–LP, multicast, and SC–SIC with a certain decoding order. There are $10!$ different decoding orders of SC–SIC in the ten-user case. Finding the optimal decoding order of SC–SIC is intractable. In the following simulations, only the decoding order based on the ascending channel gain is considered for the WSR calculation in SC–SIC. It is the optimal decoding order in the SISO BC. Multicast can be regarded as a special scheme of 1-layer RS with only the 10-order stream to be transmitted to all users. The weight of each user is assumed to be equal to 1.

Figure 20 shows the WSRs of different strategies when $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_{10}^2 = 1$ and $\mathbf{r}_{th} = [0.01, 0.03, 0.05, 0.1, 0.1, 0.1, 0.1]$ bit/s/Hz. The WSR achieved by the multicast scheme is the worst. In such an overloaded user deployment, the spectral efficiency of multicast is low as it is difficult for a single beamformer to satisfy all users. Under the rate constraint $\mathbf{r}_{th}$, the WSR of SC–SIC is better than that of MU–LP while the slopes of the WSRs are the same for large SNRs. It implies that SC–SIC and MU–LP achieve the same DoF of 1. In contrast, 1-layer RS shows an obvious WSR improvement over all other strategies and exhibits a DoF of 2. This highlights that RS exploits the maximum DoF of the considered deployment (which is limited to two, given the two transmit antennas). To further investigate the reason behind the results, we focus on one random channel realization. The individual rates achieved by all strategies when SNR = 30 dB are compared in Fig. 21. The optimized common rate vector of 1-layer RS is $\mathbf{c} = [0, 0.1, 0.1, 0.1, 0, 0.1, 0.1, 0.1, 0.1, 0.1]$ bit/s/Hz. No common rate is allocated to user-1 and user-5. But in Fig. 21, we can observe that the rates allocated to user-1 and user-5 are the highest. It implies that RS uses the common message to pack the messages of eight users and uses the two transmit antennas to deliver private messages to user-1 and user-5. RS achieves a sum-DoF of 2 in the overloaded regime. In contrast, MU–LP and SC–SIC allocate most of the power to a single user. The rate achieved by user-5 when using MU–LP and the rate achieved by user-10 when using SC–SIC are much higher than those of the other users in Fig. 21. The DoFs achieved by MU–LP and SC–SIC are limited to 1 in such circumstances.

Note that the results here show the usefulness of the RS framework for massive IoT or MTC services. Those devices are typically cheap. In the example above, user-1 and user-5 could be high-end devices, for which RS would be implemented. Those devices would therefore perform SIC. All other devices could be IoT or MTC devices, which would not need to implement RS or SIC, but simply decode the common message. Hence, the RS framework can be used to pack the IoT/MTC traffic in the common message.

More results of overloaded ten-user deployments with perfect CSIT are given in Appendix 8. We further illustrate the WSRs of different strategies when the rate threshold $\mathbf{r}_{th}$ and the channel gain difference are changed. We show that when the rate threshold of each user is 0, MU–LP is able to achieve a DoF of 2. However, as the rate threshold increases, MU–LP cannot coordinate the inter-user interference and its achieved DoF drops to 1. In the extremely overloaded scenario, the WSR gap between RS and SC–SIC is still large. SC–SIC makes an inefficient use of the transmit antennas and achieves a DoF of 1.

Fig. 18 Weighted sum rate versus SNR comparison of different strategies for overloaded four-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\theta_2 = \theta_1 + \frac{\pi}{9}$, $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = 0$. b $\theta_1 = \pi/18$. c $\theta_1 = \pi/9$. d $\theta_1 = \pi/6$

Fig. 19 Weighted sum rate versus SNR comparison of different strategies for overloaded four-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\theta_2 = \theta_1 + \frac{\pi}{3}$, $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = 0$. b $\theta_1 = \pi/9$. c $\theta_1 = 2\pi/9$. d $\theta_1 = \pi/3$

Fig. 20 Weighted sum rate versus SNR comparison of different strategies for overloaded ten-user deployment with perfect CSIT, $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_{10}^2 = 1$, $N_t = 2$, SNR = 30 dB, $\mathbf{r}_{th} = [0.01, 0.03, 0.05, 0.1, 0.1, 0.1, 0.1]$ bit/s/Hz

Fig. 21 Individual rate comparison of different strategies for overloaded ten-user deployment with perfect CSIT for 1 randomly generated channel estimate, SNR = 30 dB, $N_t = 2$, $\mathbf{r}_{th} = [0.01, 0.03, 0.05, 0.1, 0.1, 0.1, 0.1]$ bit/s/Hz (legend: 1-layer RS, SC–SIC, MU–LP, multicast; horizontal axis: users 1 to 10; vertical axis: individual rate in bit/s/Hz)

6 Conclusions
To conclude, we propose a new multiple access scheme called rate-splitting multiple access (RSMA). We compare the proposed RSMA with SDMA and NOMA by solving the problem of maximizing the WSR in MISO BC systems with QoS constraints. Both perfect and imperfect CSIT are investigated. WMMSE and its modified algorithms are adopted to solve the respective optimization problems. We show that SDMA and NOMA are subject to many limitations, including high system complexity and a lack of robustness to user deployments, network load, and CSIT inaccuracy. We propose a general multiple access framework based on rate splitting (RS), where the common symbols decoded by different groups of users are transmitted on top of private symbols decoded by the corresponding users only. Thanks to its ability to partially decode interference and partially treat interference as noise, RSMA softly bridges and outperforms SDMA and NOMA in any user deployment, CSIT inaccuracy, and network load. The simplified RS forms, such as 1-layer RS and 2-layer HRS, show great potential to reduce the scheduler and receiver complexity while maintaining good and robust performance in any user deployment, CSIT inaccuracy, and network load. In particular, we show that 1-layer RS is an attractive alternative to SC–SIC in a SISO BC deployment due to its near-optimal performance and very low complexity. Therefore, RSMA is a more general and powerful multiple access for downlink multi-antenna systems that encompasses SDMA and NOMA as special cases.

RSMA has the potential to change the design of the physical layer and MAC layer of next-generation communication systems by unifying existing approaches and relying on a superposed transmission of common and private messages. Many interesting problems are left for future research, including among others the role played by RSMA to achieve the fundamental limits of broadcast, interference, and relay channels in the presence of imperfect CSIT and disparity of channel strengths, optimization (robust design, sum-rate maximization, max-min fairness, QoS constraints) of RSMA, performance analysis of RSMA, RSMA design for multi-user/massive/millimeter-wave/multi-cell/network MIMO, modulation and coding for RSMA, RSMA with multi-carrier transmissions, RSMA with linear versus nonlinear precoding, resource allocation and cross-layer design of RSMA, security provisioning in RSMA, RSMA design for cellular and satellite communication networks, prototyping and experimentation of RSMA, and standardization issues (link/system-level evaluations, receiver implementation, transmission schemes/modes, CSI feedback mechanisms, and downlink and uplink signaling) of RSMA.

Endnotes
1 In the sequel, power-domain NOMA will be referred to simply as NOMA.
2 Recall that SU–MIMO in LTE Rel. 8 was designed with minimum mean square error–SIC (MMSE–SIC) in mind [45].
3 The DoF characterizes the number of interference-free streams that can be transmitted or, equivalently, the pre-log factor of the rate at high SNR.
4 This can be easily seen since, for the receiver forced to decode all streams, the model reduces to a multiple access channel (MAC) with a single-antenna receiver, which has a sum-DoF of 1. This was discussed at length in [34].
5 Recall that this spatial multiplexing gain is the main driver for using multiple antennas in a multi-user setup and for the introduction of MU–MIMO in 4G [18].
6 "Common" is sometimes referred to as "public."
7 This also contrasts with NOMA, for which the usefulness of SC–SIC in a BC has been known for several decades [7, 8].
8 Note that in the specific case where we have finite precision CSIT, the sum DoF collapses to 1 [26], and RS, SC–SIC, and TDMA all achieve the same optimal DoF.
9 It is worth noting that Rate-Splitting Multiple Access (RSMA) also exists in the uplink for the SISO Multiple Access Channel [46]. Though they share the same name and the splitting of the messages, they have different motivations and structures.
10 As already explained in [12], RS can also be seen as a form of non-orthogonal multi-user transmission. Indeed, in its simplest form, the common message in RS can be seen as a non-orthogonal layer added onto the private layers.
11 This benefit of RS was briefly pointed out in [39].
12 Note that OMA (single-user beamforming) is a subset of MU–LP and is obtained by allocating power exclusively to $s_1$ or $s_2$.
13 Note that for a given $\theta$, the users' directions of arrival (DoA) are the same for the $N_t = 2$ and $N_t = 4$ scenarios, while the channels are closer to orthogonal when $N_t = 4$ than when $N_t = 2$.
14 The readers are referred to [28] for a rigorous discussion about the notion of average rate.

Appendix 1
Underloaded two-user deployment with perfect CSIT
To further investigate the influence of SNR, we illustrate the rate region of different strategies when SNR is 10 dB in Figs. 22, 23, 24, and 25 and compare with the results when SNR is 20 dB in Figs. 7, 8, 9, and 10. Comparing the corresponding figures at 10 and 20 dB, we observe that the rate region gaps among different schemes grow with SNR. As SNR increases, the performance improvement of RS becomes more obvious. Specifically, SC–SIC and MU–LP each outperform the other in one part of the rate region in Figs. 9b and 10d, and the rate region of RS encompasses the convex hull of the rate regions of SC–SIC and MU–LP. However, as SNR decreases to 10 dB, the crosspoints disappear in Figs. 24b and 25d. The rate regions of SC–SIC overlap with that of RS. RS reduces to SC–SIC, and they outperform MU–LP in the whole rate region.

Fig. 22 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, $\gamma = 1$, $N_t = 4$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 23 Achievable rate region comparison of different strategies in underloaded two-user deployment with perfect CSIT, $\gamma = 1$, $N_t = 2$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 24 Achievable rate region comparison of different strategies in perfect CSIT, $\gamma = 0.3$, $N_t = 4$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 25 Achievable rate region comparison of different strategies in perfect CSIT, $\gamma = 0.3$, $N_t = 2$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Appendix 2
Underloaded two-user deployment with imperfect CSIT
To further study the influence of CSIT inaccuracy, SNR, number of transmit antennas, and user deployments, we illustrate the rate region of different strategies when SNR, $N_t$, and $\gamma$ are varied in Figs. 26, 27, 28, 29, 30, and 31.

Figures 26 and 27 show the results corresponding to Figs. 11 and 12 when SNR decreases to 10 dB. The rate region gaps among the schemes decrease as SNR decreases. Figures 28 and 29 show the results when $\gamma = 1$ and $N_t = 2$. When SNR is 10 dB, the rate regions of the three schemes are very close to each other. When SNR is 20 dB, the rate region of RS shows a clear improvement over the rate regions of MU–LP and SC–SIC. Comparing Fig. 29 with Fig. 8, the performance of MU–LP is worse when CSIT is imperfect. It shows that MU–LP requires accurate CSIT to design the precoders. There is no crosspoint between SC–SIC and MU–LP in Figs. 27c and 12b compared, respectively, with Figs. 24c and 9b.

Fig. 26 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 1$, $N_t = 4$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 27 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 0.3$, $N_t = 4$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 28 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 1$, $N_t = 2$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 29 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 1$, $N_t = 2$, SNR = 20 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Fig. 30 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 0.3$, $N_t = 2$, SNR = 10 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$
Fig. 31 Average rate region comparison of different strategies in imperfect CSIT, $\gamma = 0.3$, $N_t = 2$, SNR = 20 dB. a $\theta = \pi/9$. b $\theta = 2\pi/9$. c $\theta = \pi/3$. d $\theta = 4\pi/9$

Figures 30 and 31 show the results when $\gamma = 0.3$. SNR is 10 and 20 dB, respectively. The rate region gap between RS and SC–SIC reduces in imperfect CSIT, as observed by comparing Fig. 31 with Fig. 10. Compared with MU–LP, SC–SIC is less sensitive to CSIT inaccuracy.

Appendix 3
Underloaded three-user deployment with perfect CSIT
We consider three different sets of $\gamma_1$, $\gamma_2$. When $\gamma_1 = \gamma_2 = 1$, the three users have no channel strength difference. When $\gamma_1 = 1$, $\gamma_2 = 0.3$, there is a 5-dB channel strength difference between user-1 and user-3 as well as between user-2 and user-3. When $\gamma_1 = 0.3$, $\gamma_2 = 0.1$, there is a 5-dB channel strength difference between user-1 and user-2 as well as between user-2 and user-3. The channel strength difference between user-1 and user-3 is 10 dB. We consider three different weight vectors for each set of $\gamma_1$, $\gamma_2$, i.e., $\mathbf{u} = [0.2, 0.3, 0.5]$, $\mathbf{u} = [0.4, 0.3, 0.3]$, and $\mathbf{u} = [0.6, 0.3, 0.1]$.

In all figures (Figs. 32, 33, 34, 35, 36, 37, and 38), the WSR of RS is equal to or better than that of MU–LP and SC–SIC. Considering a specific scenario where $\theta_1 = \frac{2\pi}{9}$, $\theta_2 = \frac{4\pi}{9}$, and $\mathbf{u} = [0.6, 0.3, 0.1]$, the WSR of RS is better than that of MU–LP and SC–SIC, as shown in Figs. 34b, 35b, and 38b. As SNR increases, the WSR improvement of RS is generally more obvious. For a fixed weight vector, the WSR of SC–SIC becomes closer to that of RS as the channel gain differences among users increase. For example, we compare Figs. 13, 32, and 36 for a fixed $\mathbf{u} = [0.2, 0.3, 0.5]$. When $\mathbf{u} = [0.4, 0.3, 0.3]$, the WSRs of RS and MU–LP are almost identical. In such a scenario, RS reduces to MU–LP. In subfigure d of each figure, where $\theta_1 = \frac{4\pi}{9}$ and $\theta_2 = \frac{8\pi}{9}$, the channels of user-1 and user-2 and the channels of user-2 and user-3 are sufficiently orthogonal, while the channels of user-1 and user-3 are almost in opposite directions. In such circumstances, the WSRs of the RS and MU–LP strategies overlap with the optimal WSR achieved by DPC.

Fig. 32 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 33 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 34 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 35 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 36 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\gamma_2 = 0.1$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 37 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\gamma_2 = 0.1$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 38 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with perfect CSIT, $\gamma_1 = 0.3$, $\gamma_2 = 0.1$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 4$, $R_k^{th} = 0$, $k \in \{1, 2, 3\}$. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Appendix 4
Overloaded three-user deployment with perfect CSIT
(1) Two transmit antenna deployment
Figures 39, 40, 41, 42, and 43 show the results when $\gamma_1$, $\gamma_2$, and $\mathbf{u}$ are varied as discussed in Appendix 3. RS exhibits a clear WSR gain over SC–SIC, SC–SIC per group, and MU–LP in all figures (Figs. 39, 40, 41, 42, and 43). One-layer RS outperforms SC–SIC, SC–SIC per group, and MU–LP in most figures. It further shows that 1-layer RS outperforms the joint switching between SC–SIC and SC–SIC per group in most user deployments, while the complexity of 1-layer RS is much reduced. In Figs. 39a–c and 40a–c, 1-layer RS achieves the same WSR as RS. It implies that RS reduces to 1-layer RS in these user deployments. Both RS and 1-layer RS achieve higher WSRs than all other strategies.

Fig. 39 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 40 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.4$, $u_2 = 0.3$, $u_3 = 0.3$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 41 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = \gamma_2 = 1$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 42 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.2$, $u_2 = 0.3$, $u_3 = 0.5$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$

Fig. 43 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\gamma_1 = 1$, $\gamma_2 = 0.3$, $u_1 = 0.6$, $u_2 = 0.3$, $u_3 = 0.1$, $N_t = 2$, $\mathbf{r}_{th} = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. a $\theta_1 = \pi/9$, $\theta_2 = 2\pi/9$. b $\theta_1 = 2\pi/9$, $\theta_2 = 4\pi/9$. c $\theta_1 = \pi/3$, $\theta_2 = 2\pi/3$. d $\theta_1 = 4\pi/9$, $\theta_2 = 8\pi/9$
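To make the quantities compared in the overloaded three-user figures concrete, the following sketch evaluates a weighted sum rate and checks the individual rate constraint for one SNR point. It assumes that the seven entries of $\mathbf{r}_{th}$ in the figure captions correspond to SNR = 0, 5, ..., 30 dB, as stated for the four-user case in Section 5.5. The helper names and the example rate vector are hypothetical and are not simulation results from the paper.

```python
# SNR grid in dB; assumed to match the seven rate thresholds of the captions.
SNR_GRID_DB = [0, 5, 10, 15, 20, 25, 30]
# Individual rate constraints in bit/s/Hz for N_t = 2 (captions of the
# overloaded three-user figures with perfect CSIT).
R_TH = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4]


def rate_threshold(snr_db):
    """Individual rate constraint applied at one simulated SNR point."""
    return R_TH[SNR_GRID_DB.index(snr_db)]


def weighted_sum_rate(weights, rates):
    """WSR = sum_k u_k * R_k for user weights u_k and rates R_k."""
    return sum(u * r for u, r in zip(weights, rates))


def meets_qos(rates, snr_db):
    """True if every user rate reaches the threshold of that SNR point."""
    threshold = rate_threshold(snr_db)
    return all(r >= threshold for r in rates)


# Hypothetical rate vector at 20 dB, used only to exercise the helpers.
u = [0.4, 0.3, 0.3]
example_rates = [1.0, 2.0, 3.0]
wsr = weighted_sum_rate(u, example_rates)
feasible = meets_qos(example_rates, 20)
```

A strategy is only comparable in these figures if its rate vector is feasible, which is why MU–LP loses WSR once the thresholds become nonzero in the overloaded regime.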
(2) Single transmit antenna deployment
Figures 44 and 45 show the average WSRs of different strategies over 10 random channel realizations when $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 1$ and when $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_3^2 = 0.3$, respectively. We further show that 1-layer RS is an attractive alternative to SC–SIC.

Fig. 45 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with perfect CSIT, $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_3^2 = 0.3$, $N_t = 1$, $\mathbf{r}_{th} = [0, 0, 0.01, 0.03, 0.1, 0.2, 0.3]$ bit/s/Hz

Appendix 5
Underloaded three-user deployment with imperfect CSIT
We consider the imperfect CSIT scenarios. The channel model in the two-user deployment with imperfect CSIT is extended here. The estimated channels of user-1, user-2, and user-3 are initialized using Eq. (38). For the given channel estimate at the BS, the channel realization is $\mathbf{h}_k = \hat{\mathbf{h}}_k + \tilde{\mathbf{h}}_k$, $\forall k \in \{1, 2, 3\}$, where $\tilde{\mathbf{h}}_k$ is the estimation error of user-$k$. $\tilde{\mathbf{h}}_k$ has i.i.d. complex Gaussian entries drawn from $\mathcal{CN}(0, \sigma_{e,k}^2)$. The error variances of user-1, user-2, and user-3 are $\sigma_{e,1}^2 = P_t^{-0.6}$, $\sigma_{e,2}^2 = \gamma_1 P_t^{-0.6}$, and $\sigma_{e,3}^2 = \gamma_2 P_t^{-0.6}$, respectively. The precoders are initialized and designed using the estimated channels $\hat{\mathbf{h}}_1$, $\hat{\mathbf{h}}_2$, and $\hat{\mathbf{h}}_3$ and the same methods as stated in the perfect CSIT scenarios. One thousand different channel error samples are generated for each user. Each point in the rate region is the average rate over the generated 1000 channels.

Compared with the simulation results in perfect CSIT, the WSR gap between RS and MU–LP increases in imperfect CSIT. In contrast, the WSR gap between RS and 1-layer RS decreases in imperfect CSIT. One-layer RS achieves equal or better WSRs than SC–SIC, SC–SIC per group, and MU–LP in all figures (Figs. 46, 47, 48, 49, 50, and 51). As mentioned earlier, all forms of RS are suited to any network load and channel circumstances of users. Moreover, all forms of RS are robust to imperfect CSIT.
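The error model of Appendix 5 can be sketched as follows. This is a minimal illustration that assumes unit noise variance, so that $P_t$ equals the linear SNR, and uses the scaling exponent 0.6 quoted above. The function names, the example channel estimate, and the seed are assumptions made for illustration and do not come from the paper's simulation code.

```python
import math
import random


def error_variance(p_t, gamma=1.0, alpha=0.6):
    """Error variance sigma_e^2 = gamma * P_t^(-alpha), alpha = 0.6 here."""
    return gamma * p_t ** (-alpha)


def sample_true_channels(h_hat, sigma_e2, n_samples, rng):
    """Draw realizations h = h_hat + h_tilde.

    Each entry of h_tilde is CN(0, sigma_e2): the real and imaginary parts
    are independent Gaussians with variance sigma_e2 / 2.
    """
    s = math.sqrt(sigma_e2 / 2.0)
    return [
        [h + complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for h in h_hat]
        for _ in range(n_samples)
    ]


def average_rate(rate_fn, channels):
    """Average a per-realization rate function over the channel samples."""
    return sum(rate_fn(h) for h in channels) / len(channels)


# SNR = 20 dB, i.e. P_t = 100 under the unit-noise assumption.
p_t = 10 ** (20 / 10)
sigma_e2_user2 = error_variance(p_t, gamma=0.3)  # user-2 with gamma_1 = 0.3

rng = random.Random(1)  # fixed seed, chosen only for reproducibility
h_hat_example = [1 + 0j, 1j, -1 + 0j, -1j]  # hypothetical 4-antenna estimate
samples = sample_true_channels(h_hat_example, sigma_e2_user2, 1000, rng)
```

The precoders would be designed from `h_hat_example` alone, and `average_rate` would then be applied to `samples` with the rate expression of the chosen strategy, which mirrors the averaging over 1000 error samples described in the text.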
Appendix 6 Overloaded three-user deployment with imperfect CSIT We further investigate the overloaded three-user deploy- ment with imperfect CSIT. The BS is equipped with two antennas (N = 2). Figures 52, 53, 54, 55, 56, and 57 show the simulation results when the rate threshold is r = [0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz. Compar- th ing Fig. 52 with Fig. 39, the WSR gaps between RS and SC–SIC per group, RS and MU–LP are increasing dra- matically while the WSR gap between RS and SC–SIC is decreasing. The inter-group interference of SC–SIC per group becomes difficult to coordinate due to the limited number of transmit antenna and imperfect CSIT. RS is able to overcome the limitations of SC–SIC per group and MU–LP by dynamically determining the level of multi- user interference to decode and treat as noise. Appendix 7 Fig. 44 Weighted sum rate versus SNR comparison of different Overloaded four-user deployment with perfect CSIT strategies for overloaded three-user deployment with perfect CSIT, 2 2 2 Figures 58 and 59 show the results when γ = 1. Compar- σ = σ = σ = 1, N = 1, r =[ 0, 0, 0.01, 0.03, 0.1, 0.2, 0.3] bit/s/Hz t th 1 2 3 ing with SC–SIC per group, 1-layer RS per group always Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 46 of 54 ab cd Fig. 46 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, γ = γ = 1, 1 2 th u =0.2, u =0.3, u =0.5, N =4, R =0, k ∈{1, 2, 3}. a θ = π/9, θ =2π/9. b θ =2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. d θ = 4π/9, θ = 8π/9 1 2 3 t 1 2 1 2 1 2 1 2 ab cd Fig. 47 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, γ = γ = 1, 1 2 th u =0.4, u =0.3, u =0.3, N =4, R =0, k ∈{1, 2, 3}. a θ = π/9, θ =2π/9. b θ =2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. d θ = 4π/9, θ = 8π/9 1 2 3 t 1 2 1 2 1 2 1 2 k Mao et al. 
EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 47 of 54 ab cd Fig. 48 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, γ = γ = 1, 1 2 th u =0.6, u =0.3, u =0.1, N =4, R =0, k ∈{1, 2, 3}. a θ = π/9, θ =2π/9. b θ =2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. d θ = 4π/9, θ = 8π/9 1 2 3 t 1 2 1 2 1 2 1 2 ab cd Fig. 49 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, γ = 1, γ = 0.3, 1 2 th u =0.2, u =0.3, u =0.5, N =4, R =0, k ∈{1, 2, 3}. a θ = π/9, θ =2π/9. b θ =2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. d θ = 4π/9, θ = 8π/9 1 2 3 t 1 2 1 2 1 2 1 2 k Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 48 of 54 ab cd Fig. 50 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, γ = 1, γ = 0.3, 1 2 th u =0.4, u =0.3, u =0.3, N =4, R =0, k ∈{1, 2, 3}. a θ = π/9, θ =2π/9. b θ =2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. d θ = 4π/9, θ = 8π/9 1 2 3 t 1 2 1 2 1 2 1 2 ab cd Fig. 51 Weighted sum rate versus SNR comparison of different strategies for underloaded three-user deployment with imperfect CSIT, γ = 1, γ = 0.3, 1 2 th u =0.6, u =0.3, u =0.1, N =4, R =0, k ∈{1, 2, 3}. a θ = π/9, θ =2π/9. b θ =2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. d θ = 4π/9, θ = 8π/9 1 2 3 t 1 2 1 2 1 2 1 2 k Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 49 of 54 ab cd Fig. 52 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, γ = γ = 1, 1 2 u =0.2, u =0.3, u =0.5, N =2, r =[ 0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz. a θ = π/9, θ =2π/9. b θ =2π/9, θ = 4π/9. c θ = π/3, θ = 2π/3. 1 2 3 t 1 2 1 2 1 2 th d θ = 4π/9, θ = 8π/9 1 2 ab cd Fig. 
53 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, γ = γ = 1, 1 2 u =0.4, u =0.3, u =0.3, N =2, r =[ 0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz. a θ = π/9, θ =2π/9. b θ =2π/9, θ =4π/9. c θ = π/3, θ = 2π/3. 1 2 3 t th 1 2 1 2 1 2 d θ = 4π/9, θ = 8π/9 1 2 Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 50 of 54 ab cd Fig. 54 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, γ = γ = 1, 1 2 u =0.6, u =0.3, u =0.1, N =2, r =[ 0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz. a θ = π/9, θ =2π/9. b θ =2π/9, θ =4π/9. c θ = π/3, θ = 2π/3. 1 2 3 t 1 2 1 2 1 2 th d θ = 4π/9, θ = 8π/9 1 2 ab cd Fig. 55 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, γ = 1, γ = 0.3, 1 2 u =0.2, u =0.3, u =0.5, N =2, r =[ 0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz. a θ = π/9, θ =2π/9. b θ =2π/9, θ =4π/9. c θ = π/3, θ = 2π/3. 1 2 3 t th 1 2 1 2 1 2 d θ = 4π/9, θ = 8π/9 1 2 Mao et al. EURASIP Journal on Wireless Communications and Networking (2018) 2018:133 Page 51 of 54 ab cd Fig. 56 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, γ = 1, γ = 0.3, 1 2 u =0.4, u =0.3, u =0.3, N =2, r =[ 0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz. a θ = π/9, θ =2π/9. b θ =2π/9, θ =4π/9. c θ = π/3, θ = 2π/3. 1 2 3 t 1 2 1 2 1 2 th d θ = 4π/9, θ = 8π/9 1 2 ab cd Fig. 57 Weighted sum rate versus SNR comparison of different strategies for overloaded three-user deployment with imperfect CSIT, γ = 1, γ = 0.3, 1 2 u =0.6, u =0.3, u =0.1, N =2, r =[ 0.02, 0.08, 0.19, 0.3, 0.4, 0.4, 0.4] bit/s/Hz. a θ = π/9, θ =2π/9. b θ =2π/9, θ =4π/9. c θ = π/3, θ = 2π/3. 1 2 3 t th 1 2 1 2 1 2 d θ = 4π/9, θ = 8π/9 1 2 Mao et al. 
Fig. 58 Weighted sum rate versus SNR comparison of different strategies for overloaded four-user deployment with perfect CSIT, $\gamma_1 = 1$, $\theta_2 = \theta_1 + \ldots$, $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. (a) $\theta_1 = 0$. (b) $\theta_1 = \pi/18$. (c) $\theta_1 = \pi/9$. (d) $\theta_1 = \pi/6$

Fig. 59 Weighted sum rate versus SNR comparison of different strategies for overloaded four-user deployment with perfect CSIT, $\gamma_1 = 1$, $\theta_2 = \theta_1 + \ldots$, $\mathbf{r}_{th} = [0.03, 0.1, 0.2, 0.3, 0.4, 0.4, 0.4]$ bit/s/Hz. (a) $\theta_1 = 0$. (b) $\theta_1 = \pi/18$. (c) $\theta_1 = \pi/9$. (d) $\theta_1 = \pi/6$

achieves equal or better WSR. One-layer RS per group is more general than SC–SIC per group. It makes it possible to partially decode interference and partially treat interference as noise within each user group. When there is a sufficient channel gain difference between users within each group and a sufficient inter-group angle, the WSR of SC–SIC per group moves closer to the WSR of RS, as a comparison of Figs. 59 and 19 shows.

Appendix 8
Overloaded ten-user deployment with perfect CSIT

Figure 60 shows the simulation results when $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_{10}^2 = 1$ and $\mathbf{r}_{th} = [0, 0.001, 0.004, 0.01, 0.03, 0.06, 0.1]$ bit/s/Hz. Compared with Fig. 21, the rate threshold at each SNR is reduced in Fig. 60. The WSR achieved by MU–LP approaches that of RS when the SNR is 0 or 5 dB in Fig. 60. This is because the rate threshold is set to 0 when the SNR is 0 dB or 5 dB. When the rate threshold is 0, MU–LP can deliver two interference-free streams since there are two transmit antennas. It achieves a DoF of 2, while SC–SIC is always limited to a DoF of 1.

Figure 61 shows the simulation results when $\sigma_1^2 = 1, \sigma_2^2 = 0.9, \ldots, \sigma_{10}^2 = 0.1$. The rate threshold is the same as in Fig. 60. In the extremely overloaded scenario, the WSR gap between RS and SC–SIC is still large despite the diversity in channel strengths. Here again, SC–SIC makes inefficient use of the transmit antennas and achieves a DoF of 1. In contrast, 1-layer RS, with a low scheduler and receiver complexity, achieves good performance in all network loads.

Fig. 60 Weighted sum rate versus SNR comparison of different strategies for overloaded ten-user deployment with perfect CSIT, $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_{10}^2 = 1$, $N_t = 2$, SNR = 30 dB, $\mathbf{r}_{th} = [0, 0.001, 0.004, 0.01, 0.03, 0.06, 0.1]$ bit/s/Hz

Fig. 61 Weighted sum rate versus SNR comparison of different strategies for overloaded ten-user deployment with perfect CSIT, $\sigma_1^2 = 1, \sigma_2^2 = 0.9, \ldots, \sigma_{10}^2 = 0.1$, $N_t = 2$, SNR = 30 dB, $\mathbf{r}_{th} = [0, 0.001, 0.004, 0.01, 0.03, 0.06, 0.1]$ bit/s/Hz

Abbreviations
AO: Alternating optimization; AWGN: Additive white Gaussian noise; AWSR: Averaged weighted sum rate; CoMP: Coordinated multipoint; CSIR: Channel state information at the receivers; CSIT: Channel state information at the transmitter; DoA: Direction of arrival; DoF: Degrees of freedom; DPC: Dirty paper coding; FDMA: Frequency-division multiple access; GDoF: Generalized degrees of freedom; HK: Han and Kobayashi; HRS: Hierarchical rate splitting; IC: Interference channel; IoT: Internet of Things; MISO: Multiple-input single-output; MRT: Maximum ratio transmission; MSE: Mean square error; MTC: Machine-type communications; MU–LP: Multi-user linear precoding; MU–MIMO: Multi-user multiple-input multiple-output; MUST: Multi-user superposition transmission; NOMA: Non-orthogonal multiple access; OMA: Orthogonal multiple access; QCQP: Quadratically constrained quadratic program; QoS: Quality of service; RS: Rate splitting; RSMA: Rate-splitting multiple access; SAA: Sample average approximated; SC: Superposition coding; SC–SIC: Superposition coding with successive interference cancellation; SCMA: Sparse code multiple access; SDMA: Space-division multiple access; SIC: Successive interference cancellation; SISO: Single-input single-output; SISO BC: Single-input single-output broadcast channel; SNR: Signal-to-noise ratio; SVD: Singular value decomposition; TDMA: Time-division multiple access; WMMSE: Weighted minimum mean square error; WSR: Weighted sum rate; ZFBF: Zero-forcing beamforming

Acknowledgements
The authors are deeply indebted to Dr. Hamdi Joudeh for his helpful comments and suggestions.

Funding
This work is partially supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/N015312/1.

Authors’ contributions
BC proposed the research idea. The co-authors discussed the model design and experiments together. YM performed the experiments. YM and BC co-wrote the first draft of the manuscript. BC and VOKL gave advice on writing and revised the manuscript. All authors read and approved the final manuscript.

Competing interests
The authors declare that they have no competing interests.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details
Department of Electrical and Electronic Engineering, The University of Hong Kong, Pok Fu Lam Road, Hong Kong, China. Department of Electrical and Electronic Engineering, Imperial College London, Exhibition Road, SW7 2AZ London, UK.

Received: 3 November 2017 Accepted: 18 April 2018

Journal

EURASIP Journal on Wireless Communications and Networking (Springer Journals)

Published: May 29, 2018
