Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

Improving Orbit Prediction Accuracy through Supervised Machine Learning

Improving Orbit Prediction Accuracy through Supervised Machine Learning Due to the lack of information such as the space environment condition and resident space objects' (RSOs') body characteristics, current orbit predictions that are solely grounded on physics-based models may fail to achieve required accuracy for collision avoidance and have led to satellite collisions already. This paper presents a methodology to predict RSOs' trajectories with higher accuracy than that of the current methods. Inspired by the machine learning (ML) theory through which the models are learned based on large amounts of observed data and the prediction is conducted without explicitly modeling space objects and space environment, the proposed ML approach integrates physics-based orbit prediction algorithms with a learning-based process that focuses on reducing the prediction errors. Using a simulation-based space catalog environment as the test bed, the paper demonstrates three types of generalization capability for the proposed ML approach: 1) the ML model can be used to improve the same RSO's orbit information that is not available during the learning process but shares the same time interval as the training data; 2) the ML model can be used to improve predictions of the same RSO at future epochs; and 3) the ML model based on a RSO can be applied to other RSOs that share some common features. Keywords: Orbit Prediction; Resident Space Object; Supervised Machine Learning; Support Vector Machine. 1. Introduction The amount of resident space objects (RSOs) and the quantity of con icts between RSOs are rapidly escalating. The number of space objects larger than 10 cm is presently approach- ing 21,000, the estimated population of objects between 1 and 10cm is about 500, 000, and for objects smaller than 1cm the number exceeds 100 million (NASA). At the heart of the challenges for Space Situational Awareness (SSA) is to predict each object's orbit eciently Corresponding author Email addresses: AstroH.Peng@gmail.com (Hao Peng), xiaoli.bai@rutgers.edu (Xiaoli Bai) Postdoctoral Associate. Assistant Professor. Preprint submitted to Advances in Space Research January 16, 2018 arXiv:1801.04856v1 [astro-ph.EP] 15 Jan 2018 and accurately. Several painful incidents have occurred in recent decades, such as the Febru- ary 2009 collision of a U.S. Iridium communications satellite and a Russian Cosmos 2251 communication satellite (Kelso, 2012), and the RED threshold late notice conjunction threat to the International Space Station (ISS) from the \25090 PAM-D" debris (Bergin, 2009). The limitation of the current orbit prediction capability is among the main causes for these collision events. Current approaches for orbit prediction are physics-based, the success of which requires a good knowledge of the state of the space object at the start of the trajectory computa- tion, and the environment information such as earth gravity, atmospheric drag, and solar radiation pressure, as well as the intent information for the maneuvering objects. However, our understanding of the space/time variation of atmospheric density is limited, and our information about the space object is not accurately updated, for example the maneuvering of a spacecraft that is owned by other nations could be unavailable when the orbit predic- tion is conducted. Moreover, our current surveillance resources are limited and expensive and space-tracking measurements are sparse and noisy. All of these issues lead to the fact that current errors in physics-based predictions can be too large to be used for meaningful actions. The machine learning (ML) method presents a di erent modeling and prediction capabil- ity compared to the physics-based approach. The prediction can be made without explicitly modeling space objects, spacecraft maneuvers, and space environment. Instead, the models are learned based on large amounts of observed data, similar to human cognition in learning from past experience to predict future events. There are di erent types of machine learning, with supervised learning, reinforcement learning, and unsupervised learning as the three most common types (Abu-Mostafa et al., 2012). Reinforcement learning is usually used to make decisions, while unsupervised learning is used to nd patterns and structures within the data such as to cluster data into di erent groups without providing outputs to describe the groups. Supervised learning is such a method that learns a function or mapping from labeled data. The training data are pairs of input and its corresponding output. The super- vised ML method is the appropriate approach for improving orbit prediction error based on historical measurements and error information. In the remainder of this paper, for brevity, we will refer to supervised machine learning methods as ML methods. The ML methods have shown great capability for a wide range of applications (Abu- Mostafa et al., 2012; Mitchell, 1997), including many in the aerospace eld (Ampatzis and Izzo, 2009; Hartikainen et al., 2012; Sharma and Cutler, 2015). Hartikainen et al. combine physical models with non-parametric data-driven techniques to build a model for orbit pre- diction (Hartikainen et al., 2012). Their method focuses on the data mining aspect and can extract information of unknown forces from historical data. Sharma and Cutler have pre- sented a learning approach to do orbit determination based on distribution regression and transfer learning methods (Sharma and Cutler, 2015). Their tests show that the proposed machine learning approach is superior to the conventional methods such as the extended Kalman lter. Moreover, the method is able to estimate signi cantly varying orbital param- eters. This paper presents a computational framework that improves orbit prediction accuracy 2 through the ML method. A crucial feature of the proposed approach is that the overall framework will introduce and embed a learning-centered strategy into the physics-based prediction. To take advantage of the knowledge on the physics-based models that are im- portant and represent the state of the practice, we design the learning process to focus on modeling the prediction errors instead of the full dynamics. In this way, the prediction errors introduced in di erent steps including measurement, estimation, and prediction, are captured as a surrogate model. An extra bene t from this approach is that the learning is only burdened with nding the incremental corrections to the physics-based prediction, which reduces the dimensionality for the learning task. The support vector machine (SVM) regression method (Smola and Sch olkopf, 2004), which is an advanced method proven to be robust to outliers in the dataset, is chosen in this paper, although the proposed methodology is expected to be general and applicable to other ML methods. Furthermore, in contrast with other researches that focus on TLE data (Legendre et al., 2006; Sang et al., 2014, 2013; Wang et al., 2009), we use a simulation-based space catalog environment to validate the proposed orbit prediction method. This approach is important, because a known \truth" will underlie all simulations, and the solutions from the machine learning can be compared to the \truth" to make de nitive statements on the performance. Our simulation results, interestingly and innovatively, demonstrate three types of gen- eralization capability for the proposed ML approach. First, the ML model can be used to improve the RSO's orbit information that is not available during the learning process but share the same time interval as the training data. Second, the ML model can be used to improve predictions of the same RSO at future epochs. And third, the ML model based on one RSO can be applied to other RSOs that share some common features. Our results also provide insight upon several questions that are important for the SSA applications including: more observation stations could generate more historical data and thus better performance can be expected from the ML model; larger size of the training data can lead to better performance but there appears to be a limit; and the generalization capability to nearby RSOs implies that it is possible to build an ML system for a large number of RSOs while only representative RSOs are used for training. The paper outline is as follows. The high- delity simulation environment is rst described in Section 2. Next, the framework of the machine learning approach to improve orbit prediction accuracy is presented in Section 3. In Section 4, the numerical simulation results are analyzed, where a performance metric is proposed to evaluate the ML approach. In Section 5, the generalization problem to future epochs is proposed and evaluated for SSA application. Then in Section 6, the generalization problem to di erent RSOs is examined by varying the inclination and the semi-major axis of the simulated RSO. Finally, conclusions and future studies are presented in the last section. 2. Simulation Environment The framework of the orbit predictions through supervised machine learning is shown in Fig. 1. The top four blocks show the conventional orbit prediction process, while the bottom three blocks represent the proposed ML approach to enhance the conventional orbit 3 prediction. In the following subsections, the top four blocks will be explained, and the bottom three blocks will be presented in Section 3. Truth Dynamic Models • Earth gravity model Measurement Models • Third-body perturbation • Azimuth, Elevation, Range True • Atmosphere drag • Measurement noises orbit • Solar radiation pressure • Radar station models • Characteristics of RSO Tracks Assumed Dynamic Model Estimation Models Orbit Prediction Estimated • Orbital elements • Orbital state states • Drag coefficients • Characteristics of RSO • Reflection coefficients Conventional Orbit Prediction Predicted states Machine Learning Enhancement Analysis Data Collection Machine Learning • Validation results Error Dataset • Track information • Supervised or not • Performance metrics model • Prediction errors • Model tuning • Generalization capability • Interpret results Figure 1: The framework of improving orbit predictions through the ML method. 2.1. Truth Dynamic Models The truth dynamic models include Basic Newtonian two-body gravitational force, with an accurate gravitational constant. High- delity non-spherical gravity model of the Earth. Third-body perturbations. Air drag force model with high- delity atmosphere model. Solar radiation pressure model with high- delity solar activity model. Other parameters depending on the speci c RSO, including the mass, inertia, geome- try, material properties, etc. The choice of truth dynamic model for the simulation depends on the time scale. In this paper, the truth dynamic model is expected to include the major factors that could con- tribute to the orbit prediction error within at least 60 days. Therefore, forces that have True orbit long-period e ects do not need to be modeled with high precision. The setup of the truth model is summarized in Table 1. The Newtonian gravitational force is added with an Earth 14 3 2 gravitational constant of 3:986  10 m =s . The non-spherical e ect of the Earth gravity is modeled using spherical harmonic functions, with coecients provided by the EIGEN- 6S model (F orste et al., 2012), truncated with degree/order 40  40 as the truth gravity. Third-body perturbations of all major solar system bodies are considered, including the Sun, all the planets, the Pluto, and the Moon. The position of these bodies are provided by DE430 data le from the JPL (Folkner et al., 2014). The DTM2000 model is used to approximate the atmosphere. The Marshall Solar Activity Future Estimate Solar (MSAFE) data from NASA is used to provide solar activity information, which has signi cant e ect on the density and the speed of the atmosphere. The solar radiation pressure is calculated 6 2 with the reference value as 4:56  10 N=m at 1 AU (149,597,870.0 km) from the Sun. And the e ect of the penumbra and eclipse are considered. Additionally, a spherical RSO with a constant section area is assumed, and the drag coecient C and single-parameter re ection coecient C are assumed to be constant. These models are implemented using the Orekit, which is a low level space dynamics library written in Java (Maisonobe et al., May 3{6, 2010). Table 1: Parameters of the \truth" model used to generate orbits and measurements, and the assumed model used in the estimation and pre- diction. Parameters Truth model Assumed model Earth Shape WGS84 WGS84 Harmonics Gravity Field 40 40 20 20 Third-Body Purterbation Sun + Solar Planets Sun + Jupiter + Pluto + the Moon + the Moon Atomosphere Model DTM2000 DTM2000 Solar Activity Model MSAFE MSAFE 2.2. Measurement Models Measurements of a RSO can be obtained using ground-based radar or optical stations (Hill et al., 2012), or space-based systems using specially designed satellite (Lyon, 2004). In this study, only ground radar stations are considered because the paper focuses on the general machine learning framework and the target RSO is assumed to be an LEO object. In the simulations, the radar station is located at a topocentric frame centered at a given geodetic point location de ned on the WGS84 Earth ellipsoid. At each step of orbit propagation, the visibility of the RSO with respect to the ground stations is checked. If the RSO is visible to a station, the station will provide measurements including the azimuth , the elevation , and the range . These measurements are generated by converting the state vector of the RSO in the Earth Centered Inertial (ECI) frame (Vallado, 1997) to the topocentric frame and then calculating the angle information. A series of consecutive measurements are organized as a track. If more than one station detect the RSO at a particular time, all the stations 5 will generate measurements which are combined into one track. Therefore, one track will start when it is visible to any station, and end when no stations can detect it. Table 2: Ground-based radar stations modeled in the paper (Hill et al., 2012). Station Eglin, FL Clear, AK Kaena Point, HI Latitude [deg] 30.57 64.29 21.57 Longitude [deg] 86.21 149.19 158.27 Altitude [m] 34.7 213.3 300.2 Maximum Range  [m] 13210 4910 6380 Feasible Elevation [deg] 1{90 1{90 0{90 [m] 32.1 62.5 92.5 [deg] 0.0154 0.0791 0.0224 [deg] 0.0147 0.024 0.0139 In the simulations, three radar stations are simulated, with their parameters listed in Table 2. The RSO is visible to a station only if the range is less than the maximum range and the elevation is within the feasible elevation range. Measurements are given at a 20-second interval. The measurement errors are simulated as normal distributions with zero biases and standard deviations of  ,  , and  for the azimuth, elevation and range respectively, as listed in Table 2. We remark that the constraint of the azimuth range of the station (Hill et al., 2012) has been omitted in the current study, so the error model does not depends on the azimuth. 2.3. Estimation Models Estimation window ˆ ˆ (, X C ) id 1,i1 ˆ ˆ (, X C ) id 1,i1 track i track i+1 track i-1 ˆ ˆ (, X C ) id ,i est. time t t t t i-1 i i+1 Figure 2: Illustration of the tracks included for an estimation at t , providing the estimated state X and i i drag coecient C of the RSO. Figure 2 illustrates the estimation strategy. The black curve represents the true orbit of a RSO. The colored (light orange and blue) boxes along the curve represent observation 6 tracks. The estimation is carried out at the beginning epoch of each track, as represented by the red dots in the gure. For the i-th track in the gure, whose beginning epoch is t , measurements on current track and all previous tracks t with t t  t are consid- b b i est. ered, where t = 12 hours is the chosen estimation window. This estimation window is est. shown by the gray rectangle underlying the true orbit in Fig. 2. Orange rectangles show ^ ^ all tracks in the estimation window that will be used to produce the estimation (X ; C ) i d;i at t , while blue rectangles show tracks outside the estimation window. We remark that this strategy is adopted to simulate the realistic process of the orbital determination, where single observation track may not have enough measurements. The batch Least Square (LS) estimator (Crassidis and Junkins, 2011) is chosen to estimate the state of the RSO. In practice, the LS estimator is only sub-optimal due to the nonlinearity of the model. However, the focus of this paper is the framework of the proposed ML approach and our hope is that it can work with most of the speci c estimation methods. Additionally, we note that although the batch least square estimator can also generate the covariance matrix, which is important for collision risk assessment, only the orbital state is used in our simulation. This is due to the fact that the covariance information is not available for the two-line element (TLE) catalog (Vallado et al., 2006). Starting from the block of estimation models in Fig. 1, simulations are carried out using an assumed dynamic model. In practice, the assumed dynamic model used during estimation and propagation is often only an approximation to the real physics. In the simulations, the assumed model is set up with the spherical harmonic gravity model of degree/order 20 20, and with the third-body perturbations including only three major resources, the Sun, the Moon and the Jupiter, as summarized in Table 1. The orbit state X and the drag coecient C of the RSO are estimated by the LS estimator, which generates the ^ ^ pairs of estimations denoted as (X; C ). Other parameters of the RSO are xed, including the cross section area, the mass, and the re ection coecient C of the surface. Finally ^ ^ all pairs of estimations (X; C ) are recorded for the orbit predictions. We remark that if the di erences between the true model and the assumed model are smaller, the error of the estimation will be smaller, and also the deviations of propagating two estimates to the same epoch will be smaller. As a consequence, compared with other assumed models with larger di erences, less information is embedded in the simulated catalog using the current assumed model. Using such conservative assumptions, the results in Sections 4 to 6 show that the ML approach improves the orbit prediction accuracy. Therefore, we demonstrate the capability of the ML approach in a more signi cant way. 2.4. Orbit Prediction ^ ^ After obtaining estimations (X ; C ) for all the tracks, the prediction process is straight- i d;i forward. Using the assumed dynamic model, which is the same as the one that has been used for estimation, as show in Fig. 1, each estimated state X is propagated to a future epoch t with t > t , when another observation track begins. At the future epoch t , the j j i j ^ ^ predicted state X based on X is compared with the true state X , which provides the j;i i j prediction errors. Since the assumed dynamic model, the measurement process, and the es- timation process will all introduce errors, the resulting prediction error can grow quickly to 7 a meaningless magnitude as the propagation time increases. Therefore, we use a maximum prediction duration t = 7 days to restrict the orbit prediction process in this paper. max This duration is also long enough for the surveillance and collision avoidance for the LEO objects. 3. Prediction Accuracy Improvement through Machine Learning In this section, the proposed ML approach to improve orbit prediction accuracy is pre- sented. We rst introduce the SVM model, which is the adopted machine learning method in this paper. We then present the machine learning model of the orbit prediction error. And we discuss the dataset chosen for the SVM model to capture the orbit prediction error in the last subsection. 3.1. Support Vector Machine The support vector machine (SVM) method is a machine learning algorithm that can be used for both classi cation and regression problems. One strength of the SVM method is that they are nonparametric techniques, so we do not need to specify the basis functions in prior. The SVM regression can handle nonlinear problems since it relies on kernel functions. Moreover, the SVM method has universal approximation capability with various kernels including the Gaussian kernel (Hammer and Gersmann, 2003; Micchelli et al., 2006). We note that although it cannot be veri ed that the orbit prediction problem under this study satis es the fundamental assumptions of SVM's universal approximation property, the result in this paper demonstrate that the SVM method is practically e ective to reduce orbit prediction errors. The concept of the SVM is brie y reviewed below, and the details are referred to the references. The "-SVM regression method aims to nd a function f (x) that has at most " deviation from the actual obtained targets for all the training data, and at the same time f (x) is required to be as at as possible (Smola and Sch olkopf, 2004). Suppose we have a set of training data X = f(x ; y ); (x ; y ); : : : ; (x ; y )g 2 L  R, where L denotes the space of 1 1 2 2 n n input learning variables. In the linear case, the desired function takes the form f (x) = hw;xi + b; (1) where ! 2 L, b 2 R, and h;i represents the inner product operation. Then the training problem is to nd the attest function in the space L, where the atness of the function is represented by k!k . The training problem is cast as a convex optimization problem to minimize the cost function k!k , subject to the constraint jy h!;x i bj  " for all i i the training data. By introducing slack variables and then solving the dual problem of the above quadratic problem, the support vector expansion of the regression function f (x) can be expresses as f (x) = ( )hx ;xi + b; (2) i i where and are the dual variables solved from the dual problem (Smola and Sch olkopf, 2004). Comparing Eqs. (1) and (2), we can see that ! can be completely described as a 8 linear combination of the training data x . This property of the SVM makes it possible to deal with nonlinear regressions via kernels. Substituting the inner product h;i in Eq: (2) by the kernel k(;), the optimization problem is reformulated to nd a attest function in the feature space indicated by the kernel, expressed as f (x) = ( )k(x ;x) + b: (3) i i As shown in Eq. (3), when using the kernel technique, the coecient ! in Eq. (1) will not be provided explicitly. However, when given a new testing data x , according to Eq. (3), test we only need the kernel k(;) and the corresponding dual variables and to nd the prediction f (x ) by the trained SVM regression model. test 3.2. Machine Learning Model of the Prediction Error The concept of the ML approach to modify the prediction is illustrated in Fig. 3. After the ground station observes a RSO, the estimation of the RSO is conducted. The estimated state will have errors, and the following orbit prediction will further deviate from the true orbit because of the assumed dynamic models. We introduce the ML approach to directly modify this prediction, such that the ML-modi ed prediction will be closer to the true state. Importantly, as shown in Fig. 3, this modi cation does not address the state around the estimation at the current epoch, while in contrast, it directly improves the prediction at the future epoch. prediction based on estimation ML-modified prediction true state estimated state true state ECI (J2000) Figure 3: Modify the prediction through ML-based correction. 3.3. Dataset Capturing the Orbit Prediction Error Model The structure of the training data should be carefully designed to capture the prediction errors. We note that although the dataset designed in this paper is not the unique solution, 9 results in the paper show that the ML approach can indeed improve the orbit prediction accuracy based on the proposed design of the dataset. True Estimated Predicted ML-modified Deviated orbit True orbit ML-Modification e(t ;t ) i+1 i e(t ;t ) i+2 i Modifying process Learning process time t t t t t t i-1 i i+1 i+2 i+3 Figure 4: Illustration of the dataset for the ML approach to improve orbit prediction accuracy. Figure 4 illustrates the construction of the dataset for the ML approach in this study. The horizontal axis represents the time t. The dark solid curve represents the true orbit, and the dashed curves represent the predicted orbit arcs propagated based on the assumed dynamic models. The true state of the RSO at any speci ed epoch t is denoted as X (t), which can be expressed in di erent coordinate frames and forms. In the classical orbital COE T 6 elements (COE) form, the state X (t) is expressed as (t) = [a; e; i; !; ; f ] 2 R . In the Cartesian coordinates of the Earth-centered inertial (ECI) frame, the state X (t) is expressed ECI T 6 as x(t) = [X; Y; Z; V ; V ; V ] 2 R . At a speci c time t , we have the true state X (t ), X Y Z i i ^ ^ ^ estimated state X (t ), predicted state X (t ; t ), and ML-modi ed state X (t ; t ), i i b ML-modi ed i b where the second time parameter t in the parenthesis indicates the prediction is based on the previous estimation at an earlier epoch t  t . b i In Fig. 4, the lighter gray box on the left hand side shows the learning process of the ML method, during which the training data are collected to train the SVM model. As annotated in the gure, at the epoch t , there are three states about the RSO, the true state X (t ), i i ^ ^ the estimated state X (t ), and the predicted state X (t ; t ) which is based on the previous i i i1 estimation X . We have three types of error (as shown in Fig. 4): i1 True prediction error, e(t ; t ) = X (t ; t ) X (t ). i i1 i i1 i True estimation error, e ^(t ) = X (t ) X (t ). i i i ^ ^ Relative error of prediction, "(t ; t ) = X (t ; t ) X (t ). i i1 i i1 i These three errors are related by the equation e(t ; t ) = e ^(t ) + "(t ; t ): (4) i i1 i i i1 10 Notice that the relative prediction error "(t ; t ) is identical to the de nition of the con- i i1 sistency error between two estimations in the work of Rivera and Bai (2016). These orbit prediction errors often represented in the RSW coordinate frame, where x-axis (radial) is the radial direction, the y-axis (along-track) is perpendicular to the x-axis in the orbital plane and points to the inertial velocity direction, and the z axis (cross-track) is along the angular momentum direction (Vallado, 1997). Next, we construct the dataset for the ML model. Apart from the three types of errors, there are other variables available at t , such as the measurement of the current track, which could also contain information about the orbit prediction error. Take the epoch t as an instance, and assume the previous estimation is obtained at the epoch t , with t  t . b b i Then, the learning variables of a particular data point of the dataset consist of the following: The relative prediction error "(t ; t ) based on the previous estimation X (t ), expressed i b b COE RSW in both the form of COE as  , and the form of RSW frame as x . So we " " COE T RSW T T will have  = [  ; x ] as a representation of "(t ; t ), with redundancies. i b " " It is expected that  carries the information of the model error between the assumed model and the truth model. The prediction duration t = t t . This information is an important factor, as the i b longer the prediction, the larger the prediction error. COE Current estimation of the orbit state, expressed in the COE form as (t ). This information can re ect physical information of the orbit, such as the altitude and the shape. The atmosphere drag force and the solar radiation force also depend on these information. ^ ^ The estimated drag coecient C (t ) at the previous estimation, and its error C (t ; t ) d b d i b ^ ^ ^ = C (t ) C (t ) with respect to the current estimation C (t ). This information is d b d i d i expected to be important for RSOs in LEO. Maximum elevation  at the current track starting with t , and the corresponding azimuth and range . This information is usually located around the middle of the track. They can re ect the diculty and accuracy of the measurements along the track, because when the RSO has a larger range and smaller elevation, the sensitivity of the estimated state to the measurement will be smaller. We note that it is not necessary to distinguish the estimation error and the propagation error for the proposed ML approach. The target variable for the above data point is chosen as: True prediction error e at a future epoch t (t < t ), expressed in the RSW frame as j i j RSW T 6 x = [e ; e ; e ; e ; e ; e ] 2 R . e x y z vx vy vz The illustrated e(t ; t ) and e(t ; t ) in Fig. 4 are examples of the target variable, based i+1 i i+2 i on the same estimation X but with di erent prediction durations of t = t t and i i+1 i 11 RSW t t respectively. With x being a 6-dimensional vector, six SVM models in total i+2 i e will be trained for all the components. With these pairs of learning and target variables, the ML model is designed to di- rectly modify a future orbit prediction. As shown by the prediction process in Fig. 4 (the darker gray block at the right side), the predicted state is adjusted by the additional ML- modi cation, which is the output of the ML model. If the learning and target variables are related, and if their relationship is at least partially contained in the designed dataset, the ML model should capture the underlying relationship and the ML-modi cation is expected to bring the prediction closer to the true orbit. Collect all learning variables as  = [t; ; ^; ; ; ]. One data point of the dataset is expressed as ((t ); e (t )), where  2 fx; y; z; vx; vy; vzg corresponds to di erent state i  j components of the target true error e, and t , t (t < t ) indicate the di erent epochs of the i j i j learning variables  and the target variable e respectively. Then nally, the whole dataset X is constructed as, X = f : : : : : : : : : : : : : : : ((t ); e (t )) ; ((t ); e (t )) ; ((t ); e (t )) ; : : : i2  i1 i2  i i2  i+1 ((t ); e (t )) ; ((t ); e (t )) ; : : : i1  i i1  i+1 ((t ); e (t )) ; : : : i  i+1 : : :g; where the i-th row shows prediction data from t to all the following epochs t represented i j by the j-th columns. We choose these learning variables and target variables based on the rationale that the true state and the true models are usually not available, so any variables that explicitly depend on the true orbit information shall be not used, such as the true error e. However, the information of the true error e can be implicitly contained in the relative error " to some degrees. In this way, the proposed ML approach actually also connects the predicted error with the true prediction error at a future epoch, with the assistance of other learning variables. 4. Numerical Results of Improving Orbit Prediction In this section, the framework established in the previous section is applied to a simulated RSO. The ML approach to improve orbit prediction is demonstrated by training an SVM model rst and then using it to improve the orbit prediction accuracy. We will also discuss the e ect of the size of the training data at the last subsection. 4.1. Building the SVM model for the Simulated RSO Parameters for the simulated RSO are summarized in Table 3. The initial orbital param- eters are given in the ECI frame, which is directly converted from a TLE of the International 12 Table 3: Parameters of the simulated RSO. Parameter Value UTC epoch 2016/09/28 12:09:45 Semi-major axis a [km] 6783.34 Eccentricity e 0.006793 Inclination i [deg] 51.6393 Argument of perigee ! [deg] 14.5438 RAAN [deg] 262.6471 Mean anomaly [deg] 345.5909 Area-to-mass ratio 0.05 Drag coecient C 2.2 Re ection coecient C 1.25 Space Station (ISS). However, we note here that we study a generic RSO with area-to-mass ratio and re ection coecient di erent from the ISS. The RSO is propagated for 30 days using the truth dynamic model. The numerical integrator is chosen to be the Dormand-Prince 8(5,3) method for all the propagation in this paper. The absolute tolerance on each position component in the ECI frame is set as 0.1 m, and the propagation step size is limited between 0.001 and 90 seconds. Figure 5: True orbit (colorful arcs) and observations (green dots) of the RSO during the simulation in the ECI frame. The true orbit is colored by red at the beginning, and gradually transits to the blue color at the end of the propagation. Figure 5 demonstrates the orbit of the simulated RSO for 30 days. The orbit is colored by red at the beginning, and is seen to gradually change to blue at the end. This is expected as that the orbit processes within the simulated time interval, due to the perturbation forces. In Fig. 5, the green dots along the orbit show the locations where measurements are generated by the ground stations. The revolution of the instantaneous COE is shown in Fig. 6, alongside with zoom-in plots in t 2 [25; 25:4] days to clearly show their variations. 13 Figure 6: Instantaneous classical orbital elements of the simulated RSO. Apart from the clear precession motion, the orbit also shows short-term and secular changes on the semi-major axis, eccentricity and inclination. After obtaining the dataset following the description in the previous section, we perform data cleaning before starting to train the SVM model. The guidelines for the data cleaning usually depend on the speci c dataset, the target problem, and also the experience of the modeler. Here, two types of data cleanings are performed. First, we remove the cases where the estimations of the drag coecient C lie outside of the interval [1:2; 3:2]. Because the ^ ^ atmosphere drag is sensitive to C , a largely biased C can cause signi cant errors. In d d practice, C can be compared with the historical data, and those with large biases can be detected. Second, we use the Tukey's test (Tukey, 1977) to remove outliers in the dataset, which means data points with too large errors are eliminated. Speci cally, denoting the quartiles of the along-track error e as q , q , and q , the errors out of the interval [q y 1 2 3 1 3(q q ); q + 3(q q )] are discarded as outliers. 3 1 1 3 1 In this paper, the MATLAB (R2017b) implementation of SVM regression function fitrsvm is used (MathWorks, 2017). The kernel function is set to the Gaussian kernel to deal with the nonlinearity of the prediction error. Some default settings of the MATLAB implemen- tation are used: the scaling parameter for the Gaussian kernel is selected by a heuristic procedure with a subsampling process; the "-margin is selected as iqr(y)=13:49 where iqr(y) indicates the interquartile; the box constraint parameter C is selected as iqr(Y )=1:349; the optimization solver for the SVM training is the Sequential Minimal Optimization (SMO) method. The learning variables will be standardized using the mean and standard deviation of each variable. The Karush-Kuhn-Tucker (KKT) violation is checked at every iteration to determine the convergence when solving the optimization problem, where the convergence tolerance is 10 . Based on the collected dataset, six SVM models in total can be trained to capture the underlining orbit prediction errors for the six components of the true error 14 RSW T x = [e ; e ; e ; e ; e ; e ] . e x y z vx vy vz Additionally, in order to reproduce the training results, the seed of the random stream is set to 42 before the training. We remark that our experiments reveal that the results in the following discussions do not depend on the particular random seed. A di erent seed will slightly vary the speci c metric but will not change the general pattern. 4.2. Validation of the Trained SVM Model First, we consider a simpler case, where only the rst ground station in Table 2 is con- sidered. The measurements and estimation errors of a single station are more likely to be coherent, thus the underlining prediction errors are more likely to follow a deterministic pattern, which can be learned by the SVM. This case re ects the situation where a station only has access to its own historical data. (a) Position components (logarithmic plot of abso- (b) Velocity components (real value) lute value) Figure 7: Distribution of the true prediction error e with respect to the prediction duration t. Figure 7 shows the true prediction error e of the whole dataset, where the RSO has been propagated for 30 days and the maximum prediction duration t for each estimated state max is set is 7 days. The horizontal axis represents the prediction duration t. The vertical axis represents the position components of e in Fig. 7(a) and the velocity components in Fig. 7(b). In Fig. 7(a), the position errors are shown in the logarithm coordinate, since the along-track position error e is signi cantly larger than the other two components. In Fig. 7(b), although e is slightly more spread out than the other two components, all the errors are small and vy there is no dominant component. The most important task is to reduce the orbit prediction error of the along-track position error e . Therefore, in the following study, we will only use e as the study object and demonstrate the performance of reducing e through the ML y y approach. 80% of the dataset is randomly chosen to train the SVM model while the remaining 20% is used to evaluate the performance of the learned model. For evaluation, the learning variable of the testing data is the input to the SVM model, and the output is the ML-predicted error, denoted as e ^ . The ML-modi ed prediction X is achieved by subtracting e ^ ML ML ML 15 ^ from the prediction X . So the residual orbit prediction error e after the ML-modi cation res is e = e e = X X; (5) res ML ML-modi ed which shall be zero if the SVM model completely captures the underlying errors. We use the concept of the volume-weighted symmetric mean absolute percentage error (volume-weighted MAPE) from statistics as the metric to evaluate the prediction accuracy between the reference value and the output value of a prediction model. The performance metric P used to quantify the trained SVM model is de ned as the ratio between the ML summation of the absolute errors between e and e on each data point in the testing data, ML and the summation of the original absolute true error of all data in the testing data, which can be mathematically expressed as, je e ^ j ML i=1 P = 100% ; (6) ML n jej i=1 where n is the size of the testing data. The metric reaches its lower boundary zero when the ML-predicted error is identical to the true error, but has no upper boundary. Notice that the numerator in Eq. (6) is actually the residual error after ML-modi cation, as de ned in Eq. (5). Therefore, we have je j res i=1 P = 100% ; (7) ML jej i=1 which indicates that the metric actually measures the percentage of the residual error e res with respect to the true error of the testing data. The lower this percentage P is, the ML more errors are corrected, thus the better the performance of the trained SVM model is. Notice that this chosen metric is consistent with the objective of the SVM model, which is to include all data point with a surface as smooth as possible through minimizing the total error within the training data. In the following analysis, this metric will be used to quantify the performance of the trained SVM model. Figure 8 demonstrates the results of the SVM model on the training data. The horizontal axis represents the prediction duration t. The vertical axis shows true, ML-predicted and residual along-track errors e respectively, as annotated by the legend in the gure. In both Figs 8(a) and 8(b), the left gures directly show the scatter plot, where the black circle represents the true error e , the green dot represents the ML-predicted error e ^ , and the y ML;y red dot represents the residual error e after ML-modi cations. The right gures show the res errorbar plot of the left scatter plot, where the center point represents the mean value of each clustered group of the error, and the length from the middle of the bar to the top (or the bottom) represents the standard deviation of the corresponding clustered group. For clarity, the three curves are slightly displaced along the horizontal axis to avoid overlapping. (Similar gures will appear in the remaining part of the paper, and we will only explain the di erence there.) In Fig. 8(a), the absolute values of the errors are demonstrated. The ML-predicted error e ^ is scattering within the same area of the true error e , and the errorbar curve is very ML;y y 16 (a) Absolute value. (b) Real value. Figure 8: Results of the trained SVM model on training data with 1 station. close to each other. The performance metric P is 18.3%, indicating that the trained SVM ML model has captured most features of the underlying error model. The residual error is also very small as revealed by both the scatter plot and the errorbar plot. Therefore, we can conclude that the trained SVM model captures the orbit prediction error in the training dataset very well. In Fig. 8(b), learning results are demonstrated with the real value of the errors. As shown in both plots, the error is distributed around zero almost evenly. In this case, the orbit prediction error cannot be compensated for if we only perform a modi cation based on tting the mean error with respect to the prediction duration t, which was reported in the reference (Rivera and Bai, 2016). In contrast, the trained SVM model successfully compensates for the errors and reduces the standard deviation. A common concern in the machine learning applications is the performance of the trained model on a di erent set of data that is not included in the training data. We test this generalization capability here, where the remaining testing data is used as the input to the trained SVM model, and the results are shown in Fig. 9. The metric P is 22.6% now, ML which is larger than that of the training data. This is reasonable because the testing data can contain information beyond the training data. The errorbar plots in both sub gures show that the mean value and the standard deviation have been greatly reduced, even though the 17 (a) Absolute value. (b) Real value. Figure 9: Results of the trained SVM model on testing data with 1 station. testing data is unknown to the trained SVM model. In the above discussions, only one station is considered. A clear drawback of this situation is that a RSO is possible to be below the horizon of the station for a long time, as revealed by the gaps in Figs 8 and 9. Next, all three ground stations in Table 2 are included in the simulations. The obtained dataset is also randomly (with xed seed to allow reproduction) divided into training (80%) and testing (20%) data. The result of the new SVM model on the testing data is shown in Fig. 10. We observe two clear di erences between the new results and the results using only one station: The data in the scatter plot is denser along the horizontal axis, due to the fact that more tracks are available when three stations are used. The results have been clustered into more groups in the error bar plots to show more details in the gure. The maximum of true prediction errors (black circles) are smaller, due to the fact that the estimations become more accurate when more measurements are available. The performance has been greatly improved when the three stations are used. The metric P is 9.6% now, improved by more than a half compared with that of the single station ML 18 (a) Absolute value (b) Real value Figure 10: Results of the trained SVM model on testing data with 3 stations. situation, where P is 22.6%. The mean errors have been reduced to almost zero, and the ML standard deviations have also been greatly reduced. As revealed in the above analysis, with more available ground stations, the trained SVM model can capture the underlying orbit prediction error better, and the generalization per- formance is also signi cantly improved. In the following studies, we will only present results when three ground stations are used. 4.3. Di erent Partition of Dataset An often accepted concept for the machine learning method is that the more training data used, the better performance the machine learning model has. In this subsection, we study this e ect by investigating di erent partition of the dataset on the performance of the trained SVM model. Figure 11 shows the result of the trained SVM model, using 40% or 90% of the dataset as training data respectively. (Note in the follow analysis, for clarity, only the absolute value of the result are demonstrated.) The performance is clearly improved as shown by the thiner and steadier errorbar curves of the residual error e . Also, the metric res P drops from 14.8% to 9.7%. ML The variation of the metric with respect to the size of the training data is further in- 19 (a) training data is 40% of dataset (b) training data is 90% of dataset Figure 11: Results of the SVM model trained by di erent size of training data. Figure 12: Performance metric P of the SVM model trained by di erent size of training data. ML vestigated by varying the percentage of the training data from 10% to 90%. The results are shown in Fig. 12. Interestingly, we notice that the metric P drops quickly at the ML beginning, but tends to a constant once the size of the training data exceeds 80%. This implies that the performance of the SVM model will not be further improved once sucient amount of training data has been used. Additionally, the results validate that our choice of 80% dataset in the previous subsections is appropriate to achieve quality performance. 5. Generalize the SVM Model to Future Epochs In the previous section, the generalization capability of the trained SVM model is evalu- ated similarly as the classical machine learning studies where the ML model performance is tested using testing data di erent from the training data, without distinguishing in which way the testing data is di erent from the training data. In improving the orbit prediction of RSOs, we are more interested in testing the SVM model for future epochs. Therefore, in this section, we introduce a constraint on the testing data, requiring them to be predictions at future epochs with respect to the training data. 20 5.1. Results on Simulated RSO The simulation of the RSO is extended to 50 days, in order to provide testing data beyond the training data within the rst 0{30 days. The data in the rst 30 days is used to train the SVM model, the data in the 30{40 days (future 10 days) is collected as the testing data of the trained SVM model, other data are used to analyze the dependency of the performance on the amount of the testing data. Again, as the along-track error e is the largest component in the RSW frame, we will only demonstrate the performance of the trained SVM model on e . (a) Absolute value (b) Real value Figure 13: Results of using the SVM model trained by data in 0{30 days, to modify the error in 30{40 days (future 10 days). Figure 13 shows the results using training data in the rst 0{30 days to modify predictions in the following 10 days (30{40 days). In the plot of the absolute value, the mean error is reduced from around 50 km to less than 20 km at t = 7 days. The standard deviation is reduced by more than a half in most cases. The plot of the real value implies that the error is distributed almost evenly along both positive and negative directions, and the mean value is reduced to almost zero with less uctuations. The metric P , the percentage of the residual ML error, is reduced to 35.5% of the true error. Compared with the results in the last section, in Fig. 10, the performance is worse as the metric becomes larger, which is expected since 21 generalizing the trained SVM model to the future time should be harder than within the same time interval. Fundamentally, the orbit of the studied RSO has changed, as illustrated in Figs 5 and 6. We note that the maximum prediction duration t in Fig. 13 is 7 days, and the max predictions in the testing data are made at epochs in the following 10 days. For example, a prediction made at t = 40 days can base on the estimation given at t = 33 days at the earliest (with t = t = 7 days), and in this situation both the prediction and its based max estimation are not included in the training data. So there is no contradiction between t max and time length of the testing data. (a) Training data: 0{30 days, testing data: 30{35 days (future 5 days) (b) Training data: 0{30 days, testing data: 30{45 days (future 15 days) Figure 14: Results of the trained SVM model on di erent length of future epochs. We show other two cases, with the testing data in future 5 days and 15 days respectively, in Fig. 14. The top one is for the testing data in 30{35 days, and the bottom one is for 30{45 days. Comparing Figs 14(a) and 14(b), we observe that the percentage P of residual error ML after ML-modi cation increases as further-into-future testing data is used. So far, we have demonstrated that the average performance evaluated by P can be ML improved by the proposed ML approach. In the practice, it is also possible that the individual performance of the trained SVM model on one data point at a future epoch is concerned. The ideal case is that all the residual errors after the ML-modi cation are smaller than the 22 true prediction errors. To evaluate the individual performance of the trained SVM model, we introduce an individual metric p , de ned with a given  as ML i je e ^ j je j ML res p j = 100% = 100% : (8) ML jej jej i i The lower bound of p is 0 when the orbit prediction error is perfectly compensated, while ML the upper bound of p is positive in nity. ML The performance of the trained SVM model on the simulated RSO, using the same data as in Fig. 13, is demonstrated in Fig. 15, with only the true error (black circles) and the residual error (colorful circles) are shown. The individual metric p are calculated at each ML orbit prediction result, and then used to color the residual errors in the gures. Figure 15(a) shows all the data points whose errors are reduced by the trained SVM model, and Fig. 15(b) shows all the data points on which the ML-modi cation fails, i.e., the residual errors are increased compared with the true errors, with all the p larger than 500% is colored by ML yellow. Comparing the two gures reveals that the trained SVM model works well on the majority of the testing data point. In Fig. 15(b), both the true error and the residual error are relatively small compared with those in Fig. 15(a), and most failures of the ML approach occur when t  1 days. We expect that is due to the fact the orbit prediction is accurate enough and below the modi cation capability of the trained SVM model. However, further studies are required to draw concrete conclusions. (a) Data point with reduced errors. (b) Data point with increased errors. Figure 15: Individual performance of the trained SVM model on future epochs, demonstrated by true error (black circles) and residual errors after ML-modi cation (colored by p ). ML Figure 16 demonstrates the distribution of the individual metric p . The vertical axis ML is the percentage of the data point in the whole training data, whose p is less than the ML corresponding p represented by the horizontal axis. For example, the intersection of ML:max the curve and the dashed line indicates that for about 80% percent of the testing data, the percentage of the residual error is less than 100%, meaning that their prediction errors have been successfully reduced by the trained SVM model. So, generally, the ML approach can 23 improve the orbit prediction accuracy at the majority of the future epochs. Since the results of the individual metric p on di erent RSOs are similar, for brevity, the discussions on ML other RSOs will be omitted in the following paper. Figure 16: Distribution of the individual metric p . ML 5.2. Results on Other RSOs To further demonstrate the capability of the trained ML model, six other RSOs with di erent orbits are studied. These RSOs are chosen from the International Laser Ranging Service (ILRS) (Pearlman et al., 2002), and the parameters are extracted from the TLE catalog, as summarized in Table 4. The orbit altitudes increase from left to right in Table 4, and the LAGEOS-1 has the highest altitude of about 5860 km. The simulated RSOs are based on the two-line element (TLE) data of the real satellites, and are assumed with the same area-to-mass ratio of 0.05 m /kg, drag coecient of 2.2, and re ection coecient of 1.25, which are identical to those in Table 3. For simplicity, we will refer to these RSOs by the name of their base RSOs. However, we note that these simulated RSOs do not re ect true orbit information of the base RSOs, because the parameters in Table 4 are only used as the initial conditions for simulations in this paper. Table 4: Parameters of simulated RSOs, based on orbits extracted from TLE data retrieved on April 14, Base RSO SWARM-A LARETS STELLA HAIYANG 2A LARES LAGEOS-1 NORAD catalog number 39452 27944 22824 37781 38077 8820 Orbit altitude [km] 460 691 804{812 971 1450 5860 Semi-major axis a [km] 6820.0 7060.2 7167.5 7340.6 7813.9 12268.9 Eccentricity e 0.00144 0.0002 0.0010 0.0012 0.0001 0.0045 Inclination i [deg] 87.36 98.204 98.6 99.35 69.5 109.84 Argument of perigee ! [deg] 91.51 81.13 284.18 65.83 133.44 278.75 RAAN [deg] 151.01 355.07 177.33 230.41 58.02 151.20 Mean anomaly  [deg] 297.75 59.94 176.41 169.42 151.57 313.31 24 (a) SWARM-A. (b) LARETS. (c) STELLA. (d) HAIYANG 2A. (e) LARES. (f ) LAGEOS-1. Figure 17: Results of the ML approach applied on RSOs with di erent orbit altitude, when generalizing to future epochs. The results of generalization capability to future epochs of these six new RSOs are shown in Fig. 17. For all the RSOs, both the mean values and the standard deviations have been greatly reduced by the trained SVM models. The performance of the ML approach varies 25 with the RSO's orbit, and further studies are necessary to reveal the detailed relationship. 6. Generalization to Nearby RSOs In this section, we investigate another generalization capability that is of potential interest for the SSA applications. In practice, we may only have accurate historical data about a limited number of RSOs. And the question is whether an error model learned from such RSOs can be applied to improve orbit prediction accuracy of other RSOs, for which not enough data is available to build the ML model. 6.1. Varying Orbit Inclination Here we vary the orbit of the training RSO (based on the ISS in Table 3) to evaluate the generalization capability of the trained SVM model to nearby RSOs. The inclination of the learned RSO is 51:6393 . We have changed the inclination to 48 , 50 , 52 and 54 respectively, but have kept all the other orbital elements unchanged. All the new RSOs are propagated from the same starting epoch, and then are measured, estimated, and predicted within 0{50 days. The training data is the error data collected in the rst 30 days of the training RSO. And then the trained model is used to modify the orbit prediction of the other RSOs in the following 10 and 20 days. In other words, the testing data of the trained SVM model is the error collected in the 30{40 days and 30{50 days of the other RSOs. Figure 18: Results of the trained SVM model on a new RSO with i = 50 . The results of modifying the prediction error of the new RSO with i = 50 in following 10 days are shown in Fig. 18, where the testing data is the error data in 30{40 days. The mean error and the standard deviation have been greatly reduced. Compared with the performance on the training RSO, shown in Fig. 13, the performance metric P has increased from 35.5% ML to 40.2%, which indicates the improvement still exists but the performance becomes worse. The results for all the ve RSOs are summarized in Fig. 19. The performance metric P on the training RSO (with i = 51:6393 ) is the smallest. Compared with the training ML RSO, P on other new RSOs are larger. But even in the worse case, P is still less than ML ML 50%, which means more than half of the prediction errors can be corrected by the trained SVM model. Furthermore, the results show that the larger the deviation of the inclination Figure 19: ML-modi ed results on nearby new RSOs with i ranging from 48 to 54 . from the RSO that has been learned, the worst the performance. Again, such results are intuitive, but nice to be seen from the machine learning results. 6.2. Varying Orbit Altitude Another important parameter for RSOs in LEO is the orbit altitude. Here, the semi- major axis of the simulated RSO based on ISS is increased. The orbit prediction results in day 30{40 are used as the testing data, and the SVM model is still trained using the training data of the original RSO in day 0{30. Figure 20: Results of the trained SVM model applied to a new RSO with a = 6833:34 km (with an 50 km increment on the original semi-major axis in Table 3). The results of the trained SVM on a new RSO with a = 6833:34 km are shown in Fig. 20. The semi-major axis of the new RSO is increased by 50 km, and the metric P is 86.0%, ML which means the orbit prediction errors are only reduced by about 14.0%. The performance of the trained SVM model is not as good as that in Fig. 13, but it is expected because the simulated RSO is in LEO and an increment of 50 km on the semi-major axis can introduce a lot of di erences. 27 Figure 21: ML-modi ed results on nearby new RSOs with an increment of a ranging from 0 to 100 km. The performance metric P of RSOs with other increments are demonstrated in Fig. 21. ML We remark that a decrease of the semi-major axis of the original RSO will result in a reentry in less than 40 days, so only the increment of semi-major axis is considered. It is observed that P is decreasing at the beginning. This could be caused by many factors, such as ML the randomness of the dataset, and the possible variation of the atmosphere density in this region. Meanwhile, the P increases quickly as the increment of semi-major axis becomes ML larger than 30 km. There are also some uctuation on the performance when the increment is larger than 60 km. However, for these cases the metric are all larger than 100%, meaning that the ML approach does not improve the orbit prediction accuracy. For the moment, we can only conclude that the trained SVM model can be generalized to other RSOs with di erent semi-major axes, but it will fail when the variation is too large. The results in this section are interesting as there is no information directly related to the new RSOs in the training dataset, but the ML approach could still be used to improve the orbit prediction accuracy. Some knowledge learned for the data depends on RSO's properties such as the orbit altitude and inclination. But some other knowledge can be universal to other RSOs, such as the errors of the assumed model, measurement, and estimator, which depend on the environment and the catalog system. Our conjecture is that the proposed ML approach does not distinguish these di erences, so the trained SVM model captures not only the information about the speci c RSO that has been used for training, but also some information of the system. 7. Conclusions In this paper, a new approach of using the supervised machine learning (ML) method to improve the orbit prediction accuracy of the resident space objects (RSOs) is proposed. The framework of incorporating an ML model in the orbit prediction process is formulated and the structure of the dataset is designed based on analyzing the prediction errors and also other possibly available parameters. A simulation-based space catalog environment is developed to evaluate the performance of the proposed approach, and the support vector machine (SVM) model is chosen as the 28 machine learning algorithm in the paper. RSOs in low Earth orbit are simulated to examine the performance of the SVM model. The results demonstrate that the trained SVM model can: 1) improve the information about the same RSO that shares the same time duration as the training data but are not available during training; 2) improve the prediction accuracy of the same RSO at future epochs; and 3) improve the prediction accuracy of other nearby RSOs unknown to the trained SVM model. It is shown that the trained SVM can reduce more than a half of the prediction error for all the three types of improvement. We note that this is an exploration study and further research is required before practical applications of the ML approach. Further research is suggested to apply the established framework to real operational data, where the RSOs with large amount of measurement data can be used as the training objects and the trained SVM model can be used to improve orbit accuracy of the other RSOs that share some similar orbit characteristics. Acknowledgment The authors would acknowledge the research support from the Air Force Oce of Scienti c Research (AFOSR) FA9550-16-1-0184 and the Oce of Naval Research (ONR) N00014-16- 1-2729. Large-scale simulations have been carried out on the School of Engineering (SOE) HPC cluster at Rutgers University. We also appreciate anonymous reviewers for insightful discussions and comments on the paper. References References Abu-Mostafa, Y.S., Magdon-Ismail, M., Lin, H.T.. Learning from Data. AMLBook, 2012. Ampatzis, C., Izzo, D.. Machine learning techniques for approximation of objective functions in trajectory optimisation. Proceedings of the International Joint Conference on Arti cial Intelligence 2009 Workshop on Arti cial Intelligence in Space 2009;2009(September). Bergin, C.. RED threshold late notice conjunction threat misses ISS. https://www.nasaspaceflight. com/2009/03/threat-to-iss-crew-soyuz; 2009. Accessed on 2017-02-05. Crassidis, J.L., Junkins, J.L.. Optimal Estimation of Dynamic Systems. Second edi ed. Chapman and Hall/CRC Applied Mathematics & Nonlinear Science. CRC Press, 2011. Folkner, W.M., Williams, J.G., Boggs, D.H., Park, R.S., Kuchynka, P.. The Planetary and Lunar Ephemerides DE430 and DE431. JPL Interplanetary Network Progress Report 42-196; JPL; 2014. F orste, C., Bruinsma, S.L., Shako, R., Abrikosov, O., Flechtner, F., Marty, J.C., Lemoine, J.M., Dahle, C., Neumeyer, H., Barthelmes, F., Biancale, R., Balmino, G., K onig, R.. A new release of EIGEN-6: The latest combined global gravity eld model including LAGEOS, GRACE and GOCE data from the collaboration of GFZ Potsdam and GRGS Toulouse. In: American Geophysical Union, General Assembly 2012. Vienna, Austria; volume 14; 2012. p. 2821. Hammer, B., Gersmann, K.. A Note on the Universal Approximation Capability of Support Vector Machines. Neural Processing Letters 2003;17(1):43{53. doi:10.1023/A:1022936519097. Hartikainen, J., Sepp anen, M., S arkk a, S.. State-Space Inference for Non-Linear Latent Force Models with Application to Satellite Orbit Prediction. Proceedings of the 29th International Conference on Machine Learning (ICML-12) 2012;:903{910. 29 Hill, K., Sabol, C., Alfriend, K.T.. Comparison of Covariance Based Track Association Approaches Using Simulated Radar Data. The Journal of the Astronautical Sciences 2012;59(1-2):281{300. doi:10.1007/ s40295-013-0018-1. Kelso, T.. Iridium 33/Cosmos 2251 Collision. http://celestrak.com/events/collision/; 2012. Accessed on 2017-02-15. Legendre, P., Deguine, B., Garmier, R., Revelin, B.. Two Line Element Accuracy Assessment Based On A Mixture of Gaussian Laws. In: AIAA/AAS Astrodynamics Specialist Conference and Exhibit. Reston, Virigina: American Institute of Aeronautics and Astronautics; 2006. p. 1{13. doi:10.2514/6.2006-6518. Lyon, R.H.. Geosynchronous Orbit Determination Using Space Surveillance Network Observations and Improved Radiative Force Modeling. Ph.D. thesis; Massachusetts Institute of Technology; 2004. Maisonobe, L., Pommier, V., Parraud, P.. Orekit: An Open Source Library for Operational Flight Dynamics Applications. In: 4th International Conference on Astrodynamics Tools and Techniques. May 3{6, 2010. . Micchelli, C.A., Xu, Y., Zhang, H.. Universal Kernels. Journal of Machine Learning Research 2006;7(Dec):2651{2667. 00204. Mitchell, T.M.. Machine Learning. 1st ed. McGraw-Hill Education, 1997. NASA, . Orbital Debris Program Oce Frequently Asked Questions. http://celestrak.com/events/ collision/. Accessed on 2017-02-15. Pearlman, M.R., Degnan, J.J., Bosworth, J.M.. The International Laser Ranging Service. Advances in Space Research 2002;30(2):135{143. doi:10.1016/S0273-1177(02)00277-6. Rivera, J., Bai, X.. Improving the Orbit Propagation Accuracy of Two-Line-Element Satellite. In: 67th International Astronautical Congress. Guadalajara, Mexico; 2016. p. 26{30. Sang, J., Bennett, J.C., Smith, C.. Experimental results of debris orbit predictions using sparse tracking data from Mt. Stromlo. Acta Astronautica 2014;102:258{268. doi:10.1016/j.actaastro.2014.06.012. Sang, J., Bennett, J.C., Smith, C.H.. Estimation of ballistic coecients of low altitude debris objects from historical two line elements. Advances in Space Research 2013;52(1):117{124. doi:10.1016/j.asr. 2013.03.010. Sharma, S., Cutler, J.W.. Robust Orbit Determination and Classi cation: A Learning Theoretic Approach. IPN Progress Report 42-203; Jet Propulsion Laboratory; 2015. Smola, A.J., Sch olkopf, B.. A tutorial on support vector regression. Statistics and Computing 2004;14(3):199{222. doi:10.1023/B:STCO.0000035301.49549.88. The MathWorks, . Statistics and Machine Learning Toolbox User's Guide R2017b. The MathWorks, Inc., Natick, Massachusetts, United States, 2017. Tukey, J.W.. Exploratory Data Analysis. Addison-Wesley series in behavioral science. Reading, Mass: Addison-Wesley Pub. Co, 1977. Vallado, D., Crawford, P., Hujsak, R., Kelso, T.. Revisiting Spacetrack Report #3. In: AIAA/AAS Astrodynamics Specialist Conference and Exhibit. American Institute of Aeronautics and Astronautics; 2006. p. 1{88. doi:10.2514/6.2006-6753. Vallado, D.A.. Fundamentals of Astrodynamics and Applications. The McGraw-Hill Companies, Inc., 1997. Wang, R., Liu, J., Zhang, Q.M.. Propagation errors analysis of TLE data. Advances in Space Research 2009;43(7):1065{1069. doi:10.1016/j.asr.2008.11.017. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Statistics arXiv (Cornell University)

Improving Orbit Prediction Accuracy through Supervised Machine Learning

Statistics , Volume 2018 (1801) – Jan 15, 2018

Loading next page...
 
/lp/arxiv-cornell-university/improving-orbit-prediction-accuracy-through-supervised-machine-j17le73fGu
ISSN
0273-1177
eISSN
ARCH-3347
DOI
10.1016/j.asr.2018.03.001
Publisher site
See Article on Publisher Site

Abstract

Due to the lack of information such as the space environment condition and resident space objects' (RSOs') body characteristics, current orbit predictions that are solely grounded on physics-based models may fail to achieve required accuracy for collision avoidance and have led to satellite collisions already. This paper presents a methodology to predict RSOs' trajectories with higher accuracy than that of the current methods. Inspired by the machine learning (ML) theory through which the models are learned based on large amounts of observed data and the prediction is conducted without explicitly modeling space objects and space environment, the proposed ML approach integrates physics-based orbit prediction algorithms with a learning-based process that focuses on reducing the prediction errors. Using a simulation-based space catalog environment as the test bed, the paper demonstrates three types of generalization capability for the proposed ML approach: 1) the ML model can be used to improve the same RSO's orbit information that is not available during the learning process but shares the same time interval as the training data; 2) the ML model can be used to improve predictions of the same RSO at future epochs; and 3) the ML model based on a RSO can be applied to other RSOs that share some common features. Keywords: Orbit Prediction; Resident Space Object; Supervised Machine Learning; Support Vector Machine. 1. Introduction The amount of resident space objects (RSOs) and the quantity of con icts between RSOs are rapidly escalating. The number of space objects larger than 10 cm is presently approach- ing 21,000, the estimated population of objects between 1 and 10cm is about 500, 000, and for objects smaller than 1cm the number exceeds 100 million (NASA). At the heart of the challenges for Space Situational Awareness (SSA) is to predict each object's orbit eciently Corresponding author Email addresses: AstroH.Peng@gmail.com (Hao Peng), xiaoli.bai@rutgers.edu (Xiaoli Bai) Postdoctoral Associate. Assistant Professor. Preprint submitted to Advances in Space Research January 16, 2018 arXiv:1801.04856v1 [astro-ph.EP] 15 Jan 2018 and accurately. Several painful incidents have occurred in recent decades, such as the Febru- ary 2009 collision of a U.S. Iridium communications satellite and a Russian Cosmos 2251 communication satellite (Kelso, 2012), and the RED threshold late notice conjunction threat to the International Space Station (ISS) from the \25090 PAM-D" debris (Bergin, 2009). The limitation of the current orbit prediction capability is among the main causes for these collision events. Current approaches for orbit prediction are physics-based, the success of which requires a good knowledge of the state of the space object at the start of the trajectory computa- tion, and the environment information such as earth gravity, atmospheric drag, and solar radiation pressure, as well as the intent information for the maneuvering objects. However, our understanding of the space/time variation of atmospheric density is limited, and our information about the space object is not accurately updated, for example the maneuvering of a spacecraft that is owned by other nations could be unavailable when the orbit predic- tion is conducted. Moreover, our current surveillance resources are limited and expensive and space-tracking measurements are sparse and noisy. All of these issues lead to the fact that current errors in physics-based predictions can be too large to be used for meaningful actions. The machine learning (ML) method presents a di erent modeling and prediction capabil- ity compared to the physics-based approach. The prediction can be made without explicitly modeling space objects, spacecraft maneuvers, and space environment. Instead, the models are learned based on large amounts of observed data, similar to human cognition in learning from past experience to predict future events. There are di erent types of machine learning, with supervised learning, reinforcement learning, and unsupervised learning as the three most common types (Abu-Mostafa et al., 2012). Reinforcement learning is usually used to make decisions, while unsupervised learning is used to nd patterns and structures within the data such as to cluster data into di erent groups without providing outputs to describe the groups. Supervised learning is such a method that learns a function or mapping from labeled data. The training data are pairs of input and its corresponding output. The super- vised ML method is the appropriate approach for improving orbit prediction error based on historical measurements and error information. In the remainder of this paper, for brevity, we will refer to supervised machine learning methods as ML methods. The ML methods have shown great capability for a wide range of applications (Abu- Mostafa et al., 2012; Mitchell, 1997), including many in the aerospace eld (Ampatzis and Izzo, 2009; Hartikainen et al., 2012; Sharma and Cutler, 2015). Hartikainen et al. combine physical models with non-parametric data-driven techniques to build a model for orbit pre- diction (Hartikainen et al., 2012). Their method focuses on the data mining aspect and can extract information of unknown forces from historical data. Sharma and Cutler have pre- sented a learning approach to do orbit determination based on distribution regression and transfer learning methods (Sharma and Cutler, 2015). Their tests show that the proposed machine learning approach is superior to the conventional methods such as the extended Kalman lter. Moreover, the method is able to estimate signi cantly varying orbital param- eters. This paper presents a computational framework that improves orbit prediction accuracy 2 through the ML method. A crucial feature of the proposed approach is that the overall framework will introduce and embed a learning-centered strategy into the physics-based prediction. To take advantage of the knowledge on the physics-based models that are im- portant and represent the state of the practice, we design the learning process to focus on modeling the prediction errors instead of the full dynamics. In this way, the prediction errors introduced in di erent steps including measurement, estimation, and prediction, are captured as a surrogate model. An extra bene t from this approach is that the learning is only burdened with nding the incremental corrections to the physics-based prediction, which reduces the dimensionality for the learning task. The support vector machine (SVM) regression method (Smola and Sch olkopf, 2004), which is an advanced method proven to be robust to outliers in the dataset, is chosen in this paper, although the proposed methodology is expected to be general and applicable to other ML methods. Furthermore, in contrast with other researches that focus on TLE data (Legendre et al., 2006; Sang et al., 2014, 2013; Wang et al., 2009), we use a simulation-based space catalog environment to validate the proposed orbit prediction method. This approach is important, because a known \truth" will underlie all simulations, and the solutions from the machine learning can be compared to the \truth" to make de nitive statements on the performance. Our simulation results, interestingly and innovatively, demonstrate three types of gen- eralization capability for the proposed ML approach. First, the ML model can be used to improve the RSO's orbit information that is not available during the learning process but share the same time interval as the training data. Second, the ML model can be used to improve predictions of the same RSO at future epochs. And third, the ML model based on one RSO can be applied to other RSOs that share some common features. Our results also provide insight upon several questions that are important for the SSA applications including: more observation stations could generate more historical data and thus better performance can be expected from the ML model; larger size of the training data can lead to better performance but there appears to be a limit; and the generalization capability to nearby RSOs implies that it is possible to build an ML system for a large number of RSOs while only representative RSOs are used for training. The paper outline is as follows. The high- delity simulation environment is rst described in Section 2. Next, the framework of the machine learning approach to improve orbit prediction accuracy is presented in Section 3. In Section 4, the numerical simulation results are analyzed, where a performance metric is proposed to evaluate the ML approach. In Section 5, the generalization problem to future epochs is proposed and evaluated for SSA application. Then in Section 6, the generalization problem to di erent RSOs is examined by varying the inclination and the semi-major axis of the simulated RSO. Finally, conclusions and future studies are presented in the last section. 2. Simulation Environment The framework of the orbit predictions through supervised machine learning is shown in Fig. 1. The top four blocks show the conventional orbit prediction process, while the bottom three blocks represent the proposed ML approach to enhance the conventional orbit 3 prediction. In the following subsections, the top four blocks will be explained, and the bottom three blocks will be presented in Section 3. Truth Dynamic Models • Earth gravity model Measurement Models • Third-body perturbation • Azimuth, Elevation, Range True • Atmosphere drag • Measurement noises orbit • Solar radiation pressure • Radar station models • Characteristics of RSO Tracks Assumed Dynamic Model Estimation Models Orbit Prediction Estimated • Orbital elements • Orbital state states • Drag coefficients • Characteristics of RSO • Reflection coefficients Conventional Orbit Prediction Predicted states Machine Learning Enhancement Analysis Data Collection Machine Learning • Validation results Error Dataset • Track information • Supervised or not • Performance metrics model • Prediction errors • Model tuning • Generalization capability • Interpret results Figure 1: The framework of improving orbit predictions through the ML method. 2.1. Truth Dynamic Models The truth dynamic models include Basic Newtonian two-body gravitational force, with an accurate gravitational constant. High- delity non-spherical gravity model of the Earth. Third-body perturbations. Air drag force model with high- delity atmosphere model. Solar radiation pressure model with high- delity solar activity model. Other parameters depending on the speci c RSO, including the mass, inertia, geome- try, material properties, etc. The choice of truth dynamic model for the simulation depends on the time scale. In this paper, the truth dynamic model is expected to include the major factors that could con- tribute to the orbit prediction error within at least 60 days. Therefore, forces that have True orbit long-period e ects do not need to be modeled with high precision. The setup of the truth model is summarized in Table 1. The Newtonian gravitational force is added with an Earth 14 3 2 gravitational constant of 3:986  10 m =s . The non-spherical e ect of the Earth gravity is modeled using spherical harmonic functions, with coecients provided by the EIGEN- 6S model (F orste et al., 2012), truncated with degree/order 40  40 as the truth gravity. Third-body perturbations of all major solar system bodies are considered, including the Sun, all the planets, the Pluto, and the Moon. The position of these bodies are provided by DE430 data le from the JPL (Folkner et al., 2014). The DTM2000 model is used to approximate the atmosphere. The Marshall Solar Activity Future Estimate Solar (MSAFE) data from NASA is used to provide solar activity information, which has signi cant e ect on the density and the speed of the atmosphere. The solar radiation pressure is calculated 6 2 with the reference value as 4:56  10 N=m at 1 AU (149,597,870.0 km) from the Sun. And the e ect of the penumbra and eclipse are considered. Additionally, a spherical RSO with a constant section area is assumed, and the drag coecient C and single-parameter re ection coecient C are assumed to be constant. These models are implemented using the Orekit, which is a low level space dynamics library written in Java (Maisonobe et al., May 3{6, 2010). Table 1: Parameters of the \truth" model used to generate orbits and measurements, and the assumed model used in the estimation and pre- diction. Parameters Truth model Assumed model Earth Shape WGS84 WGS84 Harmonics Gravity Field 40 40 20 20 Third-Body Purterbation Sun + Solar Planets Sun + Jupiter + Pluto + the Moon + the Moon Atomosphere Model DTM2000 DTM2000 Solar Activity Model MSAFE MSAFE 2.2. Measurement Models Measurements of a RSO can be obtained using ground-based radar or optical stations (Hill et al., 2012), or space-based systems using specially designed satellite (Lyon, 2004). In this study, only ground radar stations are considered because the paper focuses on the general machine learning framework and the target RSO is assumed to be an LEO object. In the simulations, the radar station is located at a topocentric frame centered at a given geodetic point location de ned on the WGS84 Earth ellipsoid. At each step of orbit propagation, the visibility of the RSO with respect to the ground stations is checked. If the RSO is visible to a station, the station will provide measurements including the azimuth , the elevation , and the range . These measurements are generated by converting the state vector of the RSO in the Earth Centered Inertial (ECI) frame (Vallado, 1997) to the topocentric frame and then calculating the angle information. A series of consecutive measurements are organized as a track. If more than one station detect the RSO at a particular time, all the stations 5 will generate measurements which are combined into one track. Therefore, one track will start when it is visible to any station, and end when no stations can detect it. Table 2: Ground-based radar stations modeled in the paper (Hill et al., 2012). Station Eglin, FL Clear, AK Kaena Point, HI Latitude [deg] 30.57 64.29 21.57 Longitude [deg] 86.21 149.19 158.27 Altitude [m] 34.7 213.3 300.2 Maximum Range  [m] 13210 4910 6380 Feasible Elevation [deg] 1{90 1{90 0{90 [m] 32.1 62.5 92.5 [deg] 0.0154 0.0791 0.0224 [deg] 0.0147 0.024 0.0139 In the simulations, three radar stations are simulated, with their parameters listed in Table 2. The RSO is visible to a station only if the range is less than the maximum range and the elevation is within the feasible elevation range. Measurements are given at a 20-second interval. The measurement errors are simulated as normal distributions with zero biases and standard deviations of  ,  , and  for the azimuth, elevation and range respectively, as listed in Table 2. We remark that the constraint of the azimuth range of the station (Hill et al., 2012) has been omitted in the current study, so the error model does not depends on the azimuth. 2.3. Estimation Models Estimation window ˆ ˆ (, X C ) id 1,i1 ˆ ˆ (, X C ) id 1,i1 track i track i+1 track i-1 ˆ ˆ (, X C ) id ,i est. time t t t t i-1 i i+1 Figure 2: Illustration of the tracks included for an estimation at t , providing the estimated state X and i i drag coecient C of the RSO. Figure 2 illustrates the estimation strategy. The black curve represents the true orbit of a RSO. The colored (light orange and blue) boxes along the curve represent observation 6 tracks. The estimation is carried out at the beginning epoch of each track, as represented by the red dots in the gure. For the i-th track in the gure, whose beginning epoch is t , measurements on current track and all previous tracks t with t t  t are consid- b b i est. ered, where t = 12 hours is the chosen estimation window. This estimation window is est. shown by the gray rectangle underlying the true orbit in Fig. 2. Orange rectangles show ^ ^ all tracks in the estimation window that will be used to produce the estimation (X ; C ) i d;i at t , while blue rectangles show tracks outside the estimation window. We remark that this strategy is adopted to simulate the realistic process of the orbital determination, where single observation track may not have enough measurements. The batch Least Square (LS) estimator (Crassidis and Junkins, 2011) is chosen to estimate the state of the RSO. In practice, the LS estimator is only sub-optimal due to the nonlinearity of the model. However, the focus of this paper is the framework of the proposed ML approach and our hope is that it can work with most of the speci c estimation methods. Additionally, we note that although the batch least square estimator can also generate the covariance matrix, which is important for collision risk assessment, only the orbital state is used in our simulation. This is due to the fact that the covariance information is not available for the two-line element (TLE) catalog (Vallado et al., 2006). Starting from the block of estimation models in Fig. 1, simulations are carried out using an assumed dynamic model. In practice, the assumed dynamic model used during estimation and propagation is often only an approximation to the real physics. In the simulations, the assumed model is set up with the spherical harmonic gravity model of degree/order 20 20, and with the third-body perturbations including only three major resources, the Sun, the Moon and the Jupiter, as summarized in Table 1. The orbit state X and the drag coecient C of the RSO are estimated by the LS estimator, which generates the ^ ^ pairs of estimations denoted as (X; C ). Other parameters of the RSO are xed, including the cross section area, the mass, and the re ection coecient C of the surface. Finally ^ ^ all pairs of estimations (X; C ) are recorded for the orbit predictions. We remark that if the di erences between the true model and the assumed model are smaller, the error of the estimation will be smaller, and also the deviations of propagating two estimates to the same epoch will be smaller. As a consequence, compared with other assumed models with larger di erences, less information is embedded in the simulated catalog using the current assumed model. Using such conservative assumptions, the results in Sections 4 to 6 show that the ML approach improves the orbit prediction accuracy. Therefore, we demonstrate the capability of the ML approach in a more signi cant way. 2.4. Orbit Prediction ^ ^ After obtaining estimations (X ; C ) for all the tracks, the prediction process is straight- i d;i forward. Using the assumed dynamic model, which is the same as the one that has been used for estimation, as show in Fig. 1, each estimated state X is propagated to a future epoch t with t > t , when another observation track begins. At the future epoch t , the j j i j ^ ^ predicted state X based on X is compared with the true state X , which provides the j;i i j prediction errors. Since the assumed dynamic model, the measurement process, and the es- timation process will all introduce errors, the resulting prediction error can grow quickly to 7 a meaningless magnitude as the propagation time increases. Therefore, we use a maximum prediction duration t = 7 days to restrict the orbit prediction process in this paper. max This duration is also long enough for the surveillance and collision avoidance for the LEO objects. 3. Prediction Accuracy Improvement through Machine Learning In this section, the proposed ML approach to improve orbit prediction accuracy is pre- sented. We rst introduce the SVM model, which is the adopted machine learning method in this paper. We then present the machine learning model of the orbit prediction error. And we discuss the dataset chosen for the SVM model to capture the orbit prediction error in the last subsection. 3.1. Support Vector Machine The support vector machine (SVM) method is a machine learning algorithm that can be used for both classi cation and regression problems. One strength of the SVM method is that they are nonparametric techniques, so we do not need to specify the basis functions in prior. The SVM regression can handle nonlinear problems since it relies on kernel functions. Moreover, the SVM method has universal approximation capability with various kernels including the Gaussian kernel (Hammer and Gersmann, 2003; Micchelli et al., 2006). We note that although it cannot be veri ed that the orbit prediction problem under this study satis es the fundamental assumptions of SVM's universal approximation property, the result in this paper demonstrate that the SVM method is practically e ective to reduce orbit prediction errors. The concept of the SVM is brie y reviewed below, and the details are referred to the references. The "-SVM regression method aims to nd a function f (x) that has at most " deviation from the actual obtained targets for all the training data, and at the same time f (x) is required to be as at as possible (Smola and Sch olkopf, 2004). Suppose we have a set of training data X = f(x ; y ); (x ; y ); : : : ; (x ; y )g 2 L  R, where L denotes the space of 1 1 2 2 n n input learning variables. In the linear case, the desired function takes the form f (x) = hw;xi + b; (1) where ! 2 L, b 2 R, and h;i represents the inner product operation. Then the training problem is to nd the attest function in the space L, where the atness of the function is represented by k!k . The training problem is cast as a convex optimization problem to minimize the cost function k!k , subject to the constraint jy h!;x i bj  " for all i i the training data. By introducing slack variables and then solving the dual problem of the above quadratic problem, the support vector expansion of the regression function f (x) can be expresses as f (x) = ( )hx ;xi + b; (2) i i where and are the dual variables solved from the dual problem (Smola and Sch olkopf, 2004). Comparing Eqs. (1) and (2), we can see that ! can be completely described as a 8 linear combination of the training data x . This property of the SVM makes it possible to deal with nonlinear regressions via kernels. Substituting the inner product h;i in Eq: (2) by the kernel k(;), the optimization problem is reformulated to nd a attest function in the feature space indicated by the kernel, expressed as f (x) = ( )k(x ;x) + b: (3) i i As shown in Eq. (3), when using the kernel technique, the coecient ! in Eq. (1) will not be provided explicitly. However, when given a new testing data x , according to Eq. (3), test we only need the kernel k(;) and the corresponding dual variables and to nd the prediction f (x ) by the trained SVM regression model. test 3.2. Machine Learning Model of the Prediction Error The concept of the ML approach to modify the prediction is illustrated in Fig. 3. After the ground station observes a RSO, the estimation of the RSO is conducted. The estimated state will have errors, and the following orbit prediction will further deviate from the true orbit because of the assumed dynamic models. We introduce the ML approach to directly modify this prediction, such that the ML-modi ed prediction will be closer to the true state. Importantly, as shown in Fig. 3, this modi cation does not address the state around the estimation at the current epoch, while in contrast, it directly improves the prediction at the future epoch. prediction based on estimation ML-modified prediction true state estimated state true state ECI (J2000) Figure 3: Modify the prediction through ML-based correction. 3.3. Dataset Capturing the Orbit Prediction Error Model The structure of the training data should be carefully designed to capture the prediction errors. We note that although the dataset designed in this paper is not the unique solution, 9 results in the paper show that the ML approach can indeed improve the orbit prediction accuracy based on the proposed design of the dataset. True Estimated Predicted ML-modified Deviated orbit True orbit ML-Modification e(t ;t ) i+1 i e(t ;t ) i+2 i Modifying process Learning process time t t t t t t i-1 i i+1 i+2 i+3 Figure 4: Illustration of the dataset for the ML approach to improve orbit prediction accuracy. Figure 4 illustrates the construction of the dataset for the ML approach in this study. The horizontal axis represents the time t. The dark solid curve represents the true orbit, and the dashed curves represent the predicted orbit arcs propagated based on the assumed dynamic models. The true state of the RSO at any speci ed epoch t is denoted as X (t), which can be expressed in di erent coordinate frames and forms. In the classical orbital COE T 6 elements (COE) form, the state X (t) is expressed as (t) = [a; e; i; !; ; f ] 2 R . In the Cartesian coordinates of the Earth-centered inertial (ECI) frame, the state X (t) is expressed ECI T 6 as x(t) = [X; Y; Z; V ; V ; V ] 2 R . At a speci c time t , we have the true state X (t ), X Y Z i i ^ ^ ^ estimated state X (t ), predicted state X (t ; t ), and ML-modi ed state X (t ; t ), i i b ML-modi ed i b where the second time parameter t in the parenthesis indicates the prediction is based on the previous estimation at an earlier epoch t  t . b i In Fig. 4, the lighter gray box on the left hand side shows the learning process of the ML method, during which the training data are collected to train the SVM model. As annotated in the gure, at the epoch t , there are three states about the RSO, the true state X (t ), i i ^ ^ the estimated state X (t ), and the predicted state X (t ; t ) which is based on the previous i i i1 estimation X . We have three types of error (as shown in Fig. 4): i1 True prediction error, e(t ; t ) = X (t ; t ) X (t ). i i1 i i1 i True estimation error, e ^(t ) = X (t ) X (t ). i i i ^ ^ Relative error of prediction, "(t ; t ) = X (t ; t ) X (t ). i i1 i i1 i These three errors are related by the equation e(t ; t ) = e ^(t ) + "(t ; t ): (4) i i1 i i i1 10 Notice that the relative prediction error "(t ; t ) is identical to the de nition of the con- i i1 sistency error between two estimations in the work of Rivera and Bai (2016). These orbit prediction errors often represented in the RSW coordinate frame, where x-axis (radial) is the radial direction, the y-axis (along-track) is perpendicular to the x-axis in the orbital plane and points to the inertial velocity direction, and the z axis (cross-track) is along the angular momentum direction (Vallado, 1997). Next, we construct the dataset for the ML model. Apart from the three types of errors, there are other variables available at t , such as the measurement of the current track, which could also contain information about the orbit prediction error. Take the epoch t as an instance, and assume the previous estimation is obtained at the epoch t , with t  t . b b i Then, the learning variables of a particular data point of the dataset consist of the following: The relative prediction error "(t ; t ) based on the previous estimation X (t ), expressed i b b COE RSW in both the form of COE as  , and the form of RSW frame as x . So we " " COE T RSW T T will have  = [  ; x ] as a representation of "(t ; t ), with redundancies. i b " " It is expected that  carries the information of the model error between the assumed model and the truth model. The prediction duration t = t t . This information is an important factor, as the i b longer the prediction, the larger the prediction error. COE Current estimation of the orbit state, expressed in the COE form as (t ). This information can re ect physical information of the orbit, such as the altitude and the shape. The atmosphere drag force and the solar radiation force also depend on these information. ^ ^ The estimated drag coecient C (t ) at the previous estimation, and its error C (t ; t ) d b d i b ^ ^ ^ = C (t ) C (t ) with respect to the current estimation C (t ). This information is d b d i d i expected to be important for RSOs in LEO. Maximum elevation  at the current track starting with t , and the corresponding azimuth and range . This information is usually located around the middle of the track. They can re ect the diculty and accuracy of the measurements along the track, because when the RSO has a larger range and smaller elevation, the sensitivity of the estimated state to the measurement will be smaller. We note that it is not necessary to distinguish the estimation error and the propagation error for the proposed ML approach. The target variable for the above data point is chosen as: True prediction error e at a future epoch t (t < t ), expressed in the RSW frame as j i j RSW T 6 x = [e ; e ; e ; e ; e ; e ] 2 R . e x y z vx vy vz The illustrated e(t ; t ) and e(t ; t ) in Fig. 4 are examples of the target variable, based i+1 i i+2 i on the same estimation X but with di erent prediction durations of t = t t and i i+1 i 11 RSW t t respectively. With x being a 6-dimensional vector, six SVM models in total i+2 i e will be trained for all the components. With these pairs of learning and target variables, the ML model is designed to di- rectly modify a future orbit prediction. As shown by the prediction process in Fig. 4 (the darker gray block at the right side), the predicted state is adjusted by the additional ML- modi cation, which is the output of the ML model. If the learning and target variables are related, and if their relationship is at least partially contained in the designed dataset, the ML model should capture the underlying relationship and the ML-modi cation is expected to bring the prediction closer to the true orbit. Collect all learning variables as  = [t; ; ^; ; ; ]. One data point of the dataset is expressed as ((t ); e (t )), where  2 fx; y; z; vx; vy; vzg corresponds to di erent state i  j components of the target true error e, and t , t (t < t ) indicate the di erent epochs of the i j i j learning variables  and the target variable e respectively. Then nally, the whole dataset X is constructed as, X = f : : : : : : : : : : : : : : : ((t ); e (t )) ; ((t ); e (t )) ; ((t ); e (t )) ; : : : i2  i1 i2  i i2  i+1 ((t ); e (t )) ; ((t ); e (t )) ; : : : i1  i i1  i+1 ((t ); e (t )) ; : : : i  i+1 : : :g; where the i-th row shows prediction data from t to all the following epochs t represented i j by the j-th columns. We choose these learning variables and target variables based on the rationale that the true state and the true models are usually not available, so any variables that explicitly depend on the true orbit information shall be not used, such as the true error e. However, the information of the true error e can be implicitly contained in the relative error " to some degrees. In this way, the proposed ML approach actually also connects the predicted error with the true prediction error at a future epoch, with the assistance of other learning variables. 4. Numerical Results of Improving Orbit Prediction In this section, the framework established in the previous section is applied to a simulated RSO. The ML approach to improve orbit prediction is demonstrated by training an SVM model rst and then using it to improve the orbit prediction accuracy. We will also discuss the e ect of the size of the training data at the last subsection. 4.1. Building the SVM model for the Simulated RSO Parameters for the simulated RSO are summarized in Table 3. The initial orbital param- eters are given in the ECI frame, which is directly converted from a TLE of the International 12 Table 3: Parameters of the simulated RSO. Parameter Value UTC epoch 2016/09/28 12:09:45 Semi-major axis a [km] 6783.34 Eccentricity e 0.006793 Inclination i [deg] 51.6393 Argument of perigee ! [deg] 14.5438 RAAN [deg] 262.6471 Mean anomaly [deg] 345.5909 Area-to-mass ratio 0.05 Drag coecient C 2.2 Re ection coecient C 1.25 Space Station (ISS). However, we note here that we study a generic RSO with area-to-mass ratio and re ection coecient di erent from the ISS. The RSO is propagated for 30 days using the truth dynamic model. The numerical integrator is chosen to be the Dormand-Prince 8(5,3) method for all the propagation in this paper. The absolute tolerance on each position component in the ECI frame is set as 0.1 m, and the propagation step size is limited between 0.001 and 90 seconds. Figure 5: True orbit (colorful arcs) and observations (green dots) of the RSO during the simulation in the ECI frame. The true orbit is colored by red at the beginning, and gradually transits to the blue color at the end of the propagation. Figure 5 demonstrates the orbit of the simulated RSO for 30 days. The orbit is colored by red at the beginning, and is seen to gradually change to blue at the end. This is expected as that the orbit processes within the simulated time interval, due to the perturbation forces. In Fig. 5, the green dots along the orbit show the locations where measurements are generated by the ground stations. The revolution of the instantaneous COE is shown in Fig. 6, alongside with zoom-in plots in t 2 [25; 25:4] days to clearly show their variations. 13 Figure 6: Instantaneous classical orbital elements of the simulated RSO. Apart from the clear precession motion, the orbit also shows short-term and secular changes on the semi-major axis, eccentricity and inclination. After obtaining the dataset following the description in the previous section, we perform data cleaning before starting to train the SVM model. The guidelines for the data cleaning usually depend on the speci c dataset, the target problem, and also the experience of the modeler. Here, two types of data cleanings are performed. First, we remove the cases where the estimations of the drag coecient C lie outside of the interval [1:2; 3:2]. Because the ^ ^ atmosphere drag is sensitive to C , a largely biased C can cause signi cant errors. In d d practice, C can be compared with the historical data, and those with large biases can be detected. Second, we use the Tukey's test (Tukey, 1977) to remove outliers in the dataset, which means data points with too large errors are eliminated. Speci cally, denoting the quartiles of the along-track error e as q , q , and q , the errors out of the interval [q y 1 2 3 1 3(q q ); q + 3(q q )] are discarded as outliers. 3 1 1 3 1 In this paper, the MATLAB (R2017b) implementation of SVM regression function fitrsvm is used (MathWorks, 2017). The kernel function is set to the Gaussian kernel to deal with the nonlinearity of the prediction error. Some default settings of the MATLAB implemen- tation are used: the scaling parameter for the Gaussian kernel is selected by a heuristic procedure with a subsampling process; the "-margin is selected as iqr(y)=13:49 where iqr(y) indicates the interquartile; the box constraint parameter C is selected as iqr(Y )=1:349; the optimization solver for the SVM training is the Sequential Minimal Optimization (SMO) method. The learning variables will be standardized using the mean and standard deviation of each variable. The Karush-Kuhn-Tucker (KKT) violation is checked at every iteration to determine the convergence when solving the optimization problem, where the convergence tolerance is 10 . Based on the collected dataset, six SVM models in total can be trained to capture the underlining orbit prediction errors for the six components of the true error 14 RSW T x = [e ; e ; e ; e ; e ; e ] . e x y z vx vy vz Additionally, in order to reproduce the training results, the seed of the random stream is set to 42 before the training. We remark that our experiments reveal that the results in the following discussions do not depend on the particular random seed. A di erent seed will slightly vary the speci c metric but will not change the general pattern. 4.2. Validation of the Trained SVM Model First, we consider a simpler case, where only the rst ground station in Table 2 is con- sidered. The measurements and estimation errors of a single station are more likely to be coherent, thus the underlining prediction errors are more likely to follow a deterministic pattern, which can be learned by the SVM. This case re ects the situation where a station only has access to its own historical data. (a) Position components (logarithmic plot of abso- (b) Velocity components (real value) lute value) Figure 7: Distribution of the true prediction error e with respect to the prediction duration t. Figure 7 shows the true prediction error e of the whole dataset, where the RSO has been propagated for 30 days and the maximum prediction duration t for each estimated state max is set is 7 days. The horizontal axis represents the prediction duration t. The vertical axis represents the position components of e in Fig. 7(a) and the velocity components in Fig. 7(b). In Fig. 7(a), the position errors are shown in the logarithm coordinate, since the along-track position error e is signi cantly larger than the other two components. In Fig. 7(b), although e is slightly more spread out than the other two components, all the errors are small and vy there is no dominant component. The most important task is to reduce the orbit prediction error of the along-track position error e . Therefore, in the following study, we will only use e as the study object and demonstrate the performance of reducing e through the ML y y approach. 80% of the dataset is randomly chosen to train the SVM model while the remaining 20% is used to evaluate the performance of the learned model. For evaluation, the learning variable of the testing data is the input to the SVM model, and the output is the ML-predicted error, denoted as e ^ . The ML-modi ed prediction X is achieved by subtracting e ^ ML ML ML 15 ^ from the prediction X . So the residual orbit prediction error e after the ML-modi cation res is e = e e = X X; (5) res ML ML-modi ed which shall be zero if the SVM model completely captures the underlying errors. We use the concept of the volume-weighted symmetric mean absolute percentage error (volume-weighted MAPE) from statistics as the metric to evaluate the prediction accuracy between the reference value and the output value of a prediction model. The performance metric P used to quantify the trained SVM model is de ned as the ratio between the ML summation of the absolute errors between e and e on each data point in the testing data, ML and the summation of the original absolute true error of all data in the testing data, which can be mathematically expressed as, je e ^ j ML i=1 P = 100% ; (6) ML n jej i=1 where n is the size of the testing data. The metric reaches its lower boundary zero when the ML-predicted error is identical to the true error, but has no upper boundary. Notice that the numerator in Eq. (6) is actually the residual error after ML-modi cation, as de ned in Eq. (5). Therefore, we have je j res i=1 P = 100% ; (7) ML jej i=1 which indicates that the metric actually measures the percentage of the residual error e res with respect to the true error of the testing data. The lower this percentage P is, the ML more errors are corrected, thus the better the performance of the trained SVM model is. Notice that this chosen metric is consistent with the objective of the SVM model, which is to include all data point with a surface as smooth as possible through minimizing the total error within the training data. In the following analysis, this metric will be used to quantify the performance of the trained SVM model. Figure 8 demonstrates the results of the SVM model on the training data. The horizontal axis represents the prediction duration t. The vertical axis shows true, ML-predicted and residual along-track errors e respectively, as annotated by the legend in the gure. In both Figs 8(a) and 8(b), the left gures directly show the scatter plot, where the black circle represents the true error e , the green dot represents the ML-predicted error e ^ , and the y ML;y red dot represents the residual error e after ML-modi cations. The right gures show the res errorbar plot of the left scatter plot, where the center point represents the mean value of each clustered group of the error, and the length from the middle of the bar to the top (or the bottom) represents the standard deviation of the corresponding clustered group. For clarity, the three curves are slightly displaced along the horizontal axis to avoid overlapping. (Similar gures will appear in the remaining part of the paper, and we will only explain the di erence there.) In Fig. 8(a), the absolute values of the errors are demonstrated. The ML-predicted error e ^ is scattering within the same area of the true error e , and the errorbar curve is very ML;y y 16 (a) Absolute value. (b) Real value. Figure 8: Results of the trained SVM model on training data with 1 station. close to each other. The performance metric P is 18.3%, indicating that the trained SVM ML model has captured most features of the underlying error model. The residual error is also very small as revealed by both the scatter plot and the errorbar plot. Therefore, we can conclude that the trained SVM model captures the orbit prediction error in the training dataset very well. In Fig. 8(b), learning results are demonstrated with the real value of the errors. As shown in both plots, the error is distributed around zero almost evenly. In this case, the orbit prediction error cannot be compensated for if we only perform a modi cation based on tting the mean error with respect to the prediction duration t, which was reported in the reference (Rivera and Bai, 2016). In contrast, the trained SVM model successfully compensates for the errors and reduces the standard deviation. A common concern in the machine learning applications is the performance of the trained model on a di erent set of data that is not included in the training data. We test this generalization capability here, where the remaining testing data is used as the input to the trained SVM model, and the results are shown in Fig. 9. The metric P is 22.6% now, ML which is larger than that of the training data. This is reasonable because the testing data can contain information beyond the training data. The errorbar plots in both sub gures show that the mean value and the standard deviation have been greatly reduced, even though the 17 (a) Absolute value. (b) Real value. Figure 9: Results of the trained SVM model on testing data with 1 station. testing data is unknown to the trained SVM model. In the above discussions, only one station is considered. A clear drawback of this situation is that a RSO is possible to be below the horizon of the station for a long time, as revealed by the gaps in Figs 8 and 9. Next, all three ground stations in Table 2 are included in the simulations. The obtained dataset is also randomly (with xed seed to allow reproduction) divided into training (80%) and testing (20%) data. The result of the new SVM model on the testing data is shown in Fig. 10. We observe two clear di erences between the new results and the results using only one station: The data in the scatter plot is denser along the horizontal axis, due to the fact that more tracks are available when three stations are used. The results have been clustered into more groups in the error bar plots to show more details in the gure. The maximum of true prediction errors (black circles) are smaller, due to the fact that the estimations become more accurate when more measurements are available. The performance has been greatly improved when the three stations are used. The metric P is 9.6% now, improved by more than a half compared with that of the single station ML 18 (a) Absolute value (b) Real value Figure 10: Results of the trained SVM model on testing data with 3 stations. situation, where P is 22.6%. The mean errors have been reduced to almost zero, and the ML standard deviations have also been greatly reduced. As revealed in the above analysis, with more available ground stations, the trained SVM model can capture the underlying orbit prediction error better, and the generalization per- formance is also signi cantly improved. In the following studies, we will only present results when three ground stations are used. 4.3. Di erent Partition of Dataset An often accepted concept for the machine learning method is that the more training data used, the better performance the machine learning model has. In this subsection, we study this e ect by investigating di erent partition of the dataset on the performance of the trained SVM model. Figure 11 shows the result of the trained SVM model, using 40% or 90% of the dataset as training data respectively. (Note in the follow analysis, for clarity, only the absolute value of the result are demonstrated.) The performance is clearly improved as shown by the thiner and steadier errorbar curves of the residual error e . Also, the metric res P drops from 14.8% to 9.7%. ML The variation of the metric with respect to the size of the training data is further in- 19 (a) training data is 40% of dataset (b) training data is 90% of dataset Figure 11: Results of the SVM model trained by di erent size of training data. Figure 12: Performance metric P of the SVM model trained by di erent size of training data. ML vestigated by varying the percentage of the training data from 10% to 90%. The results are shown in Fig. 12. Interestingly, we notice that the metric P drops quickly at the ML beginning, but tends to a constant once the size of the training data exceeds 80%. This implies that the performance of the SVM model will not be further improved once sucient amount of training data has been used. Additionally, the results validate that our choice of 80% dataset in the previous subsections is appropriate to achieve quality performance. 5. Generalize the SVM Model to Future Epochs In the previous section, the generalization capability of the trained SVM model is evalu- ated similarly as the classical machine learning studies where the ML model performance is tested using testing data di erent from the training data, without distinguishing in which way the testing data is di erent from the training data. In improving the orbit prediction of RSOs, we are more interested in testing the SVM model for future epochs. Therefore, in this section, we introduce a constraint on the testing data, requiring them to be predictions at future epochs with respect to the training data. 20 5.1. Results on Simulated RSO The simulation of the RSO is extended to 50 days, in order to provide testing data beyond the training data within the rst 0{30 days. The data in the rst 30 days is used to train the SVM model, the data in the 30{40 days (future 10 days) is collected as the testing data of the trained SVM model, other data are used to analyze the dependency of the performance on the amount of the testing data. Again, as the along-track error e is the largest component in the RSW frame, we will only demonstrate the performance of the trained SVM model on e . (a) Absolute value (b) Real value Figure 13: Results of using the SVM model trained by data in 0{30 days, to modify the error in 30{40 days (future 10 days). Figure 13 shows the results using training data in the rst 0{30 days to modify predictions in the following 10 days (30{40 days). In the plot of the absolute value, the mean error is reduced from around 50 km to less than 20 km at t = 7 days. The standard deviation is reduced by more than a half in most cases. The plot of the real value implies that the error is distributed almost evenly along both positive and negative directions, and the mean value is reduced to almost zero with less uctuations. The metric P , the percentage of the residual ML error, is reduced to 35.5% of the true error. Compared with the results in the last section, in Fig. 10, the performance is worse as the metric becomes larger, which is expected since 21 generalizing the trained SVM model to the future time should be harder than within the same time interval. Fundamentally, the orbit of the studied RSO has changed, as illustrated in Figs 5 and 6. We note that the maximum prediction duration t in Fig. 13 is 7 days, and the max predictions in the testing data are made at epochs in the following 10 days. For example, a prediction made at t = 40 days can base on the estimation given at t = 33 days at the earliest (with t = t = 7 days), and in this situation both the prediction and its based max estimation are not included in the training data. So there is no contradiction between t max and time length of the testing data. (a) Training data: 0{30 days, testing data: 30{35 days (future 5 days) (b) Training data: 0{30 days, testing data: 30{45 days (future 15 days) Figure 14: Results of the trained SVM model on di erent length of future epochs. We show other two cases, with the testing data in future 5 days and 15 days respectively, in Fig. 14. The top one is for the testing data in 30{35 days, and the bottom one is for 30{45 days. Comparing Figs 14(a) and 14(b), we observe that the percentage P of residual error ML after ML-modi cation increases as further-into-future testing data is used. So far, we have demonstrated that the average performance evaluated by P can be ML improved by the proposed ML approach. In the practice, it is also possible that the individual performance of the trained SVM model on one data point at a future epoch is concerned. The ideal case is that all the residual errors after the ML-modi cation are smaller than the 22 true prediction errors. To evaluate the individual performance of the trained SVM model, we introduce an individual metric p , de ned with a given  as ML i je e ^ j je j ML res p j = 100% = 100% : (8) ML jej jej i i The lower bound of p is 0 when the orbit prediction error is perfectly compensated, while ML the upper bound of p is positive in nity. ML The performance of the trained SVM model on the simulated RSO, using the same data as in Fig. 13, is demonstrated in Fig. 15, with only the true error (black circles) and the residual error (colorful circles) are shown. The individual metric p are calculated at each ML orbit prediction result, and then used to color the residual errors in the gures. Figure 15(a) shows all the data points whose errors are reduced by the trained SVM model, and Fig. 15(b) shows all the data points on which the ML-modi cation fails, i.e., the residual errors are increased compared with the true errors, with all the p larger than 500% is colored by ML yellow. Comparing the two gures reveals that the trained SVM model works well on the majority of the testing data point. In Fig. 15(b), both the true error and the residual error are relatively small compared with those in Fig. 15(a), and most failures of the ML approach occur when t  1 days. We expect that is due to the fact the orbit prediction is accurate enough and below the modi cation capability of the trained SVM model. However, further studies are required to draw concrete conclusions. (a) Data point with reduced errors. (b) Data point with increased errors. Figure 15: Individual performance of the trained SVM model on future epochs, demonstrated by true error (black circles) and residual errors after ML-modi cation (colored by p ). ML Figure 16 demonstrates the distribution of the individual metric p . The vertical axis ML is the percentage of the data point in the whole training data, whose p is less than the ML corresponding p represented by the horizontal axis. For example, the intersection of ML:max the curve and the dashed line indicates that for about 80% percent of the testing data, the percentage of the residual error is less than 100%, meaning that their prediction errors have been successfully reduced by the trained SVM model. So, generally, the ML approach can 23 improve the orbit prediction accuracy at the majority of the future epochs. Since the results of the individual metric p on di erent RSOs are similar, for brevity, the discussions on ML other RSOs will be omitted in the following paper. Figure 16: Distribution of the individual metric p . ML 5.2. Results on Other RSOs To further demonstrate the capability of the trained ML model, six other RSOs with di erent orbits are studied. These RSOs are chosen from the International Laser Ranging Service (ILRS) (Pearlman et al., 2002), and the parameters are extracted from the TLE catalog, as summarized in Table 4. The orbit altitudes increase from left to right in Table 4, and the LAGEOS-1 has the highest altitude of about 5860 km. The simulated RSOs are based on the two-line element (TLE) data of the real satellites, and are assumed with the same area-to-mass ratio of 0.05 m /kg, drag coecient of 2.2, and re ection coecient of 1.25, which are identical to those in Table 3. For simplicity, we will refer to these RSOs by the name of their base RSOs. However, we note that these simulated RSOs do not re ect true orbit information of the base RSOs, because the parameters in Table 4 are only used as the initial conditions for simulations in this paper. Table 4: Parameters of simulated RSOs, based on orbits extracted from TLE data retrieved on April 14, Base RSO SWARM-A LARETS STELLA HAIYANG 2A LARES LAGEOS-1 NORAD catalog number 39452 27944 22824 37781 38077 8820 Orbit altitude [km] 460 691 804{812 971 1450 5860 Semi-major axis a [km] 6820.0 7060.2 7167.5 7340.6 7813.9 12268.9 Eccentricity e 0.00144 0.0002 0.0010 0.0012 0.0001 0.0045 Inclination i [deg] 87.36 98.204 98.6 99.35 69.5 109.84 Argument of perigee ! [deg] 91.51 81.13 284.18 65.83 133.44 278.75 RAAN [deg] 151.01 355.07 177.33 230.41 58.02 151.20 Mean anomaly  [deg] 297.75 59.94 176.41 169.42 151.57 313.31 24 (a) SWARM-A. (b) LARETS. (c) STELLA. (d) HAIYANG 2A. (e) LARES. (f ) LAGEOS-1. Figure 17: Results of the ML approach applied on RSOs with di erent orbit altitude, when generalizing to future epochs. The results of generalization capability to future epochs of these six new RSOs are shown in Fig. 17. For all the RSOs, both the mean values and the standard deviations have been greatly reduced by the trained SVM models. The performance of the ML approach varies 25 with the RSO's orbit, and further studies are necessary to reveal the detailed relationship. 6. Generalization to Nearby RSOs In this section, we investigate another generalization capability that is of potential interest for the SSA applications. In practice, we may only have accurate historical data about a limited number of RSOs. And the question is whether an error model learned from such RSOs can be applied to improve orbit prediction accuracy of other RSOs, for which not enough data is available to build the ML model. 6.1. Varying Orbit Inclination Here we vary the orbit of the training RSO (based on the ISS in Table 3) to evaluate the generalization capability of the trained SVM model to nearby RSOs. The inclination of the learned RSO is 51:6393 . We have changed the inclination to 48 , 50 , 52 and 54 respectively, but have kept all the other orbital elements unchanged. All the new RSOs are propagated from the same starting epoch, and then are measured, estimated, and predicted within 0{50 days. The training data is the error data collected in the rst 30 days of the training RSO. And then the trained model is used to modify the orbit prediction of the other RSOs in the following 10 and 20 days. In other words, the testing data of the trained SVM model is the error collected in the 30{40 days and 30{50 days of the other RSOs. Figure 18: Results of the trained SVM model on a new RSO with i = 50 . The results of modifying the prediction error of the new RSO with i = 50 in following 10 days are shown in Fig. 18, where the testing data is the error data in 30{40 days. The mean error and the standard deviation have been greatly reduced. Compared with the performance on the training RSO, shown in Fig. 13, the performance metric P has increased from 35.5% ML to 40.2%, which indicates the improvement still exists but the performance becomes worse. The results for all the ve RSOs are summarized in Fig. 19. The performance metric P on the training RSO (with i = 51:6393 ) is the smallest. Compared with the training ML RSO, P on other new RSOs are larger. But even in the worse case, P is still less than ML ML 50%, which means more than half of the prediction errors can be corrected by the trained SVM model. Furthermore, the results show that the larger the deviation of the inclination Figure 19: ML-modi ed results on nearby new RSOs with i ranging from 48 to 54 . from the RSO that has been learned, the worst the performance. Again, such results are intuitive, but nice to be seen from the machine learning results. 6.2. Varying Orbit Altitude Another important parameter for RSOs in LEO is the orbit altitude. Here, the semi- major axis of the simulated RSO based on ISS is increased. The orbit prediction results in day 30{40 are used as the testing data, and the SVM model is still trained using the training data of the original RSO in day 0{30. Figure 20: Results of the trained SVM model applied to a new RSO with a = 6833:34 km (with an 50 km increment on the original semi-major axis in Table 3). The results of the trained SVM on a new RSO with a = 6833:34 km are shown in Fig. 20. The semi-major axis of the new RSO is increased by 50 km, and the metric P is 86.0%, ML which means the orbit prediction errors are only reduced by about 14.0%. The performance of the trained SVM model is not as good as that in Fig. 13, but it is expected because the simulated RSO is in LEO and an increment of 50 km on the semi-major axis can introduce a lot of di erences. 27 Figure 21: ML-modi ed results on nearby new RSOs with an increment of a ranging from 0 to 100 km. The performance metric P of RSOs with other increments are demonstrated in Fig. 21. ML We remark that a decrease of the semi-major axis of the original RSO will result in a reentry in less than 40 days, so only the increment of semi-major axis is considered. It is observed that P is decreasing at the beginning. This could be caused by many factors, such as ML the randomness of the dataset, and the possible variation of the atmosphere density in this region. Meanwhile, the P increases quickly as the increment of semi-major axis becomes ML larger than 30 km. There are also some uctuation on the performance when the increment is larger than 60 km. However, for these cases the metric are all larger than 100%, meaning that the ML approach does not improve the orbit prediction accuracy. For the moment, we can only conclude that the trained SVM model can be generalized to other RSOs with di erent semi-major axes, but it will fail when the variation is too large. The results in this section are interesting as there is no information directly related to the new RSOs in the training dataset, but the ML approach could still be used to improve the orbit prediction accuracy. Some knowledge learned for the data depends on RSO's properties such as the orbit altitude and inclination. But some other knowledge can be universal to other RSOs, such as the errors of the assumed model, measurement, and estimator, which depend on the environment and the catalog system. Our conjecture is that the proposed ML approach does not distinguish these di erences, so the trained SVM model captures not only the information about the speci c RSO that has been used for training, but also some information of the system. 7. Conclusions In this paper, a new approach of using the supervised machine learning (ML) method to improve the orbit prediction accuracy of the resident space objects (RSOs) is proposed. The framework of incorporating an ML model in the orbit prediction process is formulated and the structure of the dataset is designed based on analyzing the prediction errors and also other possibly available parameters. A simulation-based space catalog environment is developed to evaluate the performance of the proposed approach, and the support vector machine (SVM) model is chosen as the 28 machine learning algorithm in the paper. RSOs in low Earth orbit are simulated to examine the performance of the SVM model. The results demonstrate that the trained SVM model can: 1) improve the information about the same RSO that shares the same time duration as the training data but are not available during training; 2) improve the prediction accuracy of the same RSO at future epochs; and 3) improve the prediction accuracy of other nearby RSOs unknown to the trained SVM model. It is shown that the trained SVM can reduce more than a half of the prediction error for all the three types of improvement. We note that this is an exploration study and further research is required before practical applications of the ML approach. Further research is suggested to apply the established framework to real operational data, where the RSOs with large amount of measurement data can be used as the training objects and the trained SVM model can be used to improve orbit accuracy of the other RSOs that share some similar orbit characteristics. Acknowledgment The authors would acknowledge the research support from the Air Force Oce of Scienti c Research (AFOSR) FA9550-16-1-0184 and the Oce of Naval Research (ONR) N00014-16- 1-2729. Large-scale simulations have been carried out on the School of Engineering (SOE) HPC cluster at Rutgers University. We also appreciate anonymous reviewers for insightful discussions and comments on the paper. References References Abu-Mostafa, Y.S., Magdon-Ismail, M., Lin, H.T.. Learning from Data. AMLBook, 2012. Ampatzis, C., Izzo, D.. Machine learning techniques for approximation of objective functions in trajectory optimisation. Proceedings of the International Joint Conference on Arti cial Intelligence 2009 Workshop on Arti cial Intelligence in Space 2009;2009(September). Bergin, C.. RED threshold late notice conjunction threat misses ISS. https://www.nasaspaceflight. com/2009/03/threat-to-iss-crew-soyuz; 2009. Accessed on 2017-02-05. Crassidis, J.L., Junkins, J.L.. Optimal Estimation of Dynamic Systems. Second edi ed. Chapman and Hall/CRC Applied Mathematics & Nonlinear Science. CRC Press, 2011. Folkner, W.M., Williams, J.G., Boggs, D.H., Park, R.S., Kuchynka, P.. The Planetary and Lunar Ephemerides DE430 and DE431. JPL Interplanetary Network Progress Report 42-196; JPL; 2014. F orste, C., Bruinsma, S.L., Shako, R., Abrikosov, O., Flechtner, F., Marty, J.C., Lemoine, J.M., Dahle, C., Neumeyer, H., Barthelmes, F., Biancale, R., Balmino, G., K onig, R.. A new release of EIGEN-6: The latest combined global gravity eld model including LAGEOS, GRACE and GOCE data from the collaboration of GFZ Potsdam and GRGS Toulouse. In: American Geophysical Union, General Assembly 2012. Vienna, Austria; volume 14; 2012. p. 2821. Hammer, B., Gersmann, K.. A Note on the Universal Approximation Capability of Support Vector Machines. Neural Processing Letters 2003;17(1):43{53. doi:10.1023/A:1022936519097. Hartikainen, J., Sepp anen, M., S arkk a, S.. State-Space Inference for Non-Linear Latent Force Models with Application to Satellite Orbit Prediction. Proceedings of the 29th International Conference on Machine Learning (ICML-12) 2012;:903{910. 29 Hill, K., Sabol, C., Alfriend, K.T.. Comparison of Covariance Based Track Association Approaches Using Simulated Radar Data. The Journal of the Astronautical Sciences 2012;59(1-2):281{300. doi:10.1007/ s40295-013-0018-1. Kelso, T.. Iridium 33/Cosmos 2251 Collision. http://celestrak.com/events/collision/; 2012. Accessed on 2017-02-15. Legendre, P., Deguine, B., Garmier, R., Revelin, B.. Two Line Element Accuracy Assessment Based On A Mixture of Gaussian Laws. In: AIAA/AAS Astrodynamics Specialist Conference and Exhibit. Reston, Virigina: American Institute of Aeronautics and Astronautics; 2006. p. 1{13. doi:10.2514/6.2006-6518. Lyon, R.H.. Geosynchronous Orbit Determination Using Space Surveillance Network Observations and Improved Radiative Force Modeling. Ph.D. thesis; Massachusetts Institute of Technology; 2004. Maisonobe, L., Pommier, V., Parraud, P.. Orekit: An Open Source Library for Operational Flight Dynamics Applications. In: 4th International Conference on Astrodynamics Tools and Techniques. May 3{6, 2010. . Micchelli, C.A., Xu, Y., Zhang, H.. Universal Kernels. Journal of Machine Learning Research 2006;7(Dec):2651{2667. 00204. Mitchell, T.M.. Machine Learning. 1st ed. McGraw-Hill Education, 1997. NASA, . Orbital Debris Program Oce Frequently Asked Questions. http://celestrak.com/events/ collision/. Accessed on 2017-02-15. Pearlman, M.R., Degnan, J.J., Bosworth, J.M.. The International Laser Ranging Service. Advances in Space Research 2002;30(2):135{143. doi:10.1016/S0273-1177(02)00277-6. Rivera, J., Bai, X.. Improving the Orbit Propagation Accuracy of Two-Line-Element Satellite. In: 67th International Astronautical Congress. Guadalajara, Mexico; 2016. p. 26{30. Sang, J., Bennett, J.C., Smith, C.. Experimental results of debris orbit predictions using sparse tracking data from Mt. Stromlo. Acta Astronautica 2014;102:258{268. doi:10.1016/j.actaastro.2014.06.012. Sang, J., Bennett, J.C., Smith, C.H.. Estimation of ballistic coecients of low altitude debris objects from historical two line elements. Advances in Space Research 2013;52(1):117{124. doi:10.1016/j.asr. 2013.03.010. Sharma, S., Cutler, J.W.. Robust Orbit Determination and Classi cation: A Learning Theoretic Approach. IPN Progress Report 42-203; Jet Propulsion Laboratory; 2015. Smola, A.J., Sch olkopf, B.. A tutorial on support vector regression. Statistics and Computing 2004;14(3):199{222. doi:10.1023/B:STCO.0000035301.49549.88. The MathWorks, . Statistics and Machine Learning Toolbox User's Guide R2017b. The MathWorks, Inc., Natick, Massachusetts, United States, 2017. Tukey, J.W.. Exploratory Data Analysis. Addison-Wesley series in behavioral science. Reading, Mass: Addison-Wesley Pub. Co, 1977. Vallado, D., Crawford, P., Hujsak, R., Kelso, T.. Revisiting Spacetrack Report #3. In: AIAA/AAS Astrodynamics Specialist Conference and Exhibit. American Institute of Aeronautics and Astronautics; 2006. p. 1{88. doi:10.2514/6.2006-6753. Vallado, D.A.. Fundamentals of Astrodynamics and Applications. The McGraw-Hill Companies, Inc., 1997. Wang, R., Liu, J., Zhang, Q.M.. Propagation errors analysis of TLE data. Advances in Space Research 2009;43(7):1065{1069. doi:10.1016/j.asr.2008.11.017.

Journal

StatisticsarXiv (Cornell University)

Published: Jan 15, 2018

There are no references for this article.