TY - JOUR AU - McHale, Ian, G AB - Abstract Match fixing is a growing threat to the integrity of sport, facilitated by new online in-play betting markets sufficiently liquid to allow substantial profits to be made from manipulating an event. Screens to detect a fix employ in-play forecasting models whose predictions are compared in real-time with observed betting odds on websites around the world. Suspicions arise where model odds and market odds diverge. We provide real examples of monitoring for football and tennis matches and describe how suspicious matches are investigated by analysts before a final assessment of how likely it was that a fix took place is made. Results from monitoring driven by this application of forensic statistics have been accepted as primary evidence at cases in the Court of Arbitration for Sport, leading more sports outside football and tennis to adopt this approach to detecting and preventing manipulation. 1. Introduction The growth of online commerce since the millennium has changed many aspects of society. One factor behind this revolution has been the massive increase in the speed with which information flows. The effects are at least as evident in the world of sports betting as anywhere else. Indeed, instead of only pre-match betting being available, the speed with which information on the progress of a sports event can transmit to bettors and bookmakers, and the speed with which wagers can be placed in response, makes it practical now for bookmakers routinely to offer in-play betting; bets are placed while the match is in progress. The internet has been the catalyst for the rapid rise of in-play betting, which now makes up the majority of online betting turnover, for example 67% in Spain in 2017 (Gómez & Lalanda, 2018), and this has given birth to numerous new bookmakers around the world and betting exchanges such as Betfair and Betdaq. 
With the increased availability of betting, and the advent of in-play betting, global sports betting activity has increased very rapidly. Between 2000 and 2010, annual global gross gaming revenue—the amount bookmakers win from their clients—from sports betting, defined to exclude bets on horses and dogs, was estimated to have grown from €6b to €19b (Sport Accord, 2011). By 2016, it was estimated as €30b (IRIS, 2017). Since intensifying competition was reducing bookmaker margins throughout this period, sports betting turnover will have increased even faster than these figures indicate. Some of the increase in activity will have been accounted for by the extension of the range of sports events on which wagering is offered. For example, bets can now be placed on youth games in football around the world, and there are now active betting markets, particularly popular in Asia, on e-sports (Abarbanel & Johnson, 2018). Further increase in activity will have been brought about by increased volume on events already in the bookmaker offer. The net result is that there are now many highly liquid betting markets not only on elite sports competitions but also at relatively low levels of competition. For example, Boniface et al. (2012) reported that the consensus among a group of Asian betting agents was that bets of €300,000 could be placed on a Belgian Second Division football match without attracting undue attention. Where there is such high liquidity, substantial sums can be won if events on the field can be manipulated. The danger is greatest when there is high liquidity in the betting market but low player wages since then prospective wins on the betting market are high and the cost of ‘buying’ a match from corrupt players is relatively low (Forrest et al., 2008, Forrest, 2012). Risk is elevated further in betting markets that are illegal or unregulated (Preston & Szymanski, 2003). 
Although the highest levels of sport have not been immune, the combination of high liquidity and low pay has applied in many fixing cases that have come to light, for example in matches in lower-division football in major football countries such as England and Spain, in top-tier football in lesser footballing powers such as Ireland and Sweden and on the ATP Futures and Challenger Tours in tennis (where few players even cover their costs over the season). The threat to sport from match fixing has increasingly been regarded as serious. Indeed the then President of the International Olympic Committee, Jacques Rogge, believed that match fixing had a bigger impact than doping (http://www.bbc.co.uk/blogs/davidbond/2011/03/match_fixing_is_now_a.html). Competitions where the play is not believed to be authentic cannot expect to continue to command public support. In response to the threat to integrity, sports governing bodies have begun to take action. In many sports, player education programmes have been put in place in the hope of increasing player resistance to corrupt practices; and sports rules have been amended to make failure to report suspicious approaches a serious disciplinary offence. Some well-resourced sports, such as cricket and tennis, have established international intelligence units, staffed by former senior police officers, to investigate and prosecute corruption. In a few countries, in anticipation of signing the Macolin Convention, ‘National Platforms’ have already been established to provide formal procedures for betting houses to notify sports governing bodies if they observe suspicious betting, such as unexpectedly high volume of betting on a particular outcome in a match. Against this background, researchers and practitioners have begun to ask whether the application of forensic statistics may be able to play a role in combatting the threat to sports integrity. 2. 
Statistics and corruption in sport: a review Statistical modelling has been proposed as a means of meeting two objectives for a sport. First, the sport should know the size of the problem it faces, and therefore, there is value in seeking to estimate prevalence rates. Second, it will wish to detect cases of corruption with a view to excluding corrupt athletes from competition. Regarding estimation of prevalence rates, the number of cases to become known (from whistle blowing, intelligence, police inquiries, etc.) is likely to be a poor guide because successful corruption is never directly observed. This is a common problem in assessing the scale of corruption in any sphere, not just sport. However, in the general literature on corruption, it has been shown that it might be possible to identify patterns in aggregate data, which are suggestive of the presence and magnitude of corruption. This approach has a long history of being applied in a variety of spheres where corruption takes place. For example, at one time, income tax due in the USA was a fixed amount depending on into which band of income (each band $50 wide) the declared amount fell. Slemrod (1985) saw no reason why the true distribution of incomes in the population should not be continuous but found instead that the distribution of declared incomes featured a clustering of declared incomes just below each tax band boundary. The finding suggested that taxpayers were willing to under-declare, to avoid a jump in tax, and offered some clue as to how many taxpayers were willing to offend. Taking a somewhat similar approach, Wolfers (2006) examined results from more than 44,000 college basketball matches in the USA. College basketball is a popular subject for betting, and the most common type of wager is the handicap bet. The bookmaker announces that college X, the favourite in the match, is ‘expected’ to beat college Y by nine points (‘the spread’). 
Their clients then bet on whether X will ‘beat the spread’, i.e. whether college X will win by more than nine points or fail to win by more than nine points. Against this background, proven corruption cases have usually featured ‘point shaving’. Bettors bribe players on the stronger team to hold back such that they fail to beat the spread (even though they still gain the sporting rewards for winning the match itself). Wolfers treated the spreads announced by the bookmaker as unbiased forecasts of the actual results relative to the spread, which might be expected to be normally distributed with a mean of 0. However, he identified a departure from the normal distribution in that there was an unexpectedly high number of observations where the favourite had just failed to beat the spread. He interpreted the excess number of such observations as indicative of the prevalence of point shaving and then estimated that around 6% of matches with a strong favourite (one favoured to win by at least 12 points) were ‘fixed’ (in this way). This would imply that 1% of all matches were ‘fixed’ (by point shaving). The conclusion that so many matches are fixed has been treated subsequently with some scepticism. Alternative explanations for the pattern observed in the data have been sought either in the betting market or by focusing on the strategic choices facing the teams in different on-court situations. In the betting market, the relatively high frequency of strong favourites failing to beat the spread may represent biases among bookmakers and bettors such that strong teams are, on average, overrated (Borghesi et al., 2010), a failure of betting market efficiency. Bernhardt & Heston (2010) noted that a similar pattern of match results (relative to spread) had been found in professional NBA basketball as in the college game; favourites were more likely to win by a few less than by a few more points than the spread. 
But widespread fixing of NBA matches is implausible on account of very high player salaries. They suggested that the explanation might lie in strategy on the court. For example, basketball teams ahead by a moderate margin may maximize their chance of winning by engaging in time-wasting tactics, which reduce the rate of points production, rather than seek to win by the margin they could. Using more formal sports analytics including how incentives to foul evolve with the score, Gregory (2018) finds that models predict the same pattern of match results as Wolfers observed and interpreted as indicative of corruption. Statistical analysis has also been employed in seeking to uncover the scale of match fixing motivated by sporting rather than betting goals. Famously, Duggan & Levitt (2002) offered support for anecdotal talk of match fixing in Japanese sumo wrestling. The sport was organized such that each athlete took part in 15 bouts in each tournament. Those who won eight times rose in the rankings whereas those who failed to win eight were demoted (and their salaries thereby reduced). Some individual contests at the end of a tournament therefore became crucial for one of the participants. Duggan & Levitt found that competitors for whom the bout was crucial won disproportionately often. This in itself could have had an innocent explanation because asymmetric rewards to winning would be expected to induce greater effort from the competitor with more to play for. However, this would not explain their finding that next time the two competitors met, the loser of the first bout was disproportionately likely to win. This suggested collusion where at least part of the payment for giving the win to the fighter who needed it was in terms of future favours. Duggan & Levitt regarded the number of ‘excess wins’ in the follow-up matches as indicative of the scale of corruption in the sport. 
Subsequent papers have also focused on the distribution of outcomes in matches where a win was very important to one of the opposing participants. Jetter & Walker (2017) found excess wins in men’s professional tennis for players on the cusp of qualifying by world ranking for the immediately following Grand Slam tournament (which would be highly lucrative) and offered supplementary evidence that this was unlikely to be explained by incentives to effort. Employing match data from 75 countries, Elaad et al. (2018) focused on football matches where one of the teams was a relegation candidate but the other team had no strong stake in the outcome. The probability that the relegation candidate would win was significantly greater (both statistically and in terms of the effect size) in countries that had higher scores on the Corruption Perceptions Index published by Transparency International. All these papers identify a particular motive for match fixing a priori and then look for an elevated number of observations in a particular match result category (the favourite failing to beat the spread, the sumo wrestler facing demotion winning the bout, the tennis player on the cusp of qualifying for a Grand Slam winning the match and the football team struggling to retain its place in the division finishing victorious). The number of observations in the category is then found to be significantly elevated relative to what would be expected from a statistical model. However, in all the cases, the expected number of observations in the result category studied is still greater (many times as great in the college basketball example) than the number of ‘excess’ cases argued to be linked to manipulation. Thus, while the approach we have described may contribute to understanding the scale of match fixing of a specific type, it does not offer a viable means of identifying particular matches as fixed. A screen based on this approach would have low specificity. 
For example, only a small proportion of college basketball matches where the favourite fails to beat the spread will in fact be fixed. Alternative methods for employing forensic statistics to identify particular matches as fixed have therefore had to be developed. Detection, if sufficiently reliable to support judicial or disciplinary proceedings, may lead directly to the exclusion of corrupt actors from the sport and deter others from participating in a fix. It might be thought that analysis of sports performance data would be a fruitful avenue to explore when seeking to identify individual instances of manipulation. After all, match fixing almost always involves underperformance by some, or all, participants in the sports event, and underperformance should be reflected in the data. However, this approach has limitations because data from sport are inherently very noisy. Indeed the very essence of sport is that the unexpected may happen. It follows that the presence of corruption will be very imperfectly correlated with deviations from expected performance. Underperformance by athletes is very common and is at the heart of the uncertainty of outcome that drives the popularity of sport. Jewell & Reade (2014) examined data from 18 one-day cricket internationals that were already known to have been fixed and compared them with data from all 3,510 one-day internationals played between 1971 and 2014. They found that statistics on individual performances reflected that manipulation was taking place. For example, in matches which had been fixed by one particular team, its batsmen, on average, recorded worse statistics than the average across all matches. They suggest that this points to potential benefits from using match data in the detection of match fixing. However, they fail to discuss whether this approach could lead to screens for detecting match fixing that would not yield a high proportion of false positives. 
Similar to other sports, batsmen in cricket frequently ‘fail’. Arguably, Sir Don Bradman was the most successful player in any team sport. Prior to his last innings in Test cricket, he had scored 6,996 Test runs with an average of 101.39. In his much anticipated final innings before retirement, he was dismissed for 0, bringing his career average down to 99.94. Clearly, by his standards, the score of 0 represented significant underperformance. Should he therefore have been investigated for violation of integrity standards? Screening for manipulation cannot therefore rely on sports data alone. Specificity of the screen would be too low. At least for betting-related fixing, screening through monitoring of the betting markets associated with a sports event has proven much more fruitful. For example, in well-regulated jurisdictions in Europe, bookmakers are obliged to report ‘suspicious’ activity. Their algorithms will, for example, generate an alert if there is exceptional volume of betting on an event, and it is spatially concentrated in a town where the bets placed have been predominantly for the local team to lose. Monitoring of national betting markets through such mechanisms has brought to light several match fixes, often instigated by players to allow them and their families to make extra money. But it is insufficient to protect sport against the largest fixing operations. Organized crime carries out such operations on an industrial scale (their modus operandi vividly illustrated in Hill, 2010). In the first such case to come to public attention, criminals based in Bochum, Germany were convicted of fixing 320 football matches in 13 countries. In this and subsequent cases, prosecutors showed that the betting activity associated with fixing was largely conducted in Asian markets regardless of where the sports event took place. 
The liquidity in Asian markets on cricket, football and many other sports dominates that in the rest of the world, allowing larger bets to be placed and consequently greater profits to be made from a given fix. Further, betting is barely regulated in Asia (even where legal at all) and most of the volume is generated through agents of licensed offshore operators such that it would not be possible to trace stakes back to source if suspicions arose concerning a particular match. Asian markets are therefore where ‘professional’ fixers will almost invariably conduct their betting operations. This suggests that systematic monitoring of global betting markets is highly desirable in order to identify instances where there is abnormal betting activity consistent with a fix having been arranged. In contrast to monitoring of national markets in well-regulated countries, screening in international markets must depend on observing only odds. Volumes of bets on particular outcomes cannot be observed in the unregulated and illegal sectors, but operators have to reveal odds through their websites in order to conduct their business. Inferences can then be made about betting volumes. Football has the longest experience of funding systematic monitoring of global betting markets by specialist contractors in order to alert governing bodies to activity suggestive of a fix. For example, UEFA commissions monitoring of international markets (represented by several hundred betting websites) for all matches in its own competitions and all matches in the top two divisions of each of the 55 national football associations that make up its membership. Latterly, following cases at the Court of Arbitration for Sport (CAS) where sanctions based on evidence from monitoring systems have been confirmed, most other leading sports, such as badminton, rugby and, in North America, Major League Baseball and the National Hockey League, have followed suit. 
In this paper we describe the theory behind such monitoring systems and provide illustrative examples of the process in action. The remainder of the paper is structured as follows. The next section provides an overview of the logic behind monitoring betting markets to detect corruption. Section 4 follows with the description of mathematical models for forecasting the results of football and tennis matches while a match is in progress. Mathematical modelling is at the heart of monitoring systems because identifying where odds are behaving abnormally depends on algorithms being able to generate benchmark odds so that actual odds movements can be compared with what would be expected given the state of the game. In Section 5 we provide some illustrative examples of both regular matches (those believed to be corruption-free) and matches believed to have been manipulated. Section 6 describes the use of the tools presented here in practice and Section 7 offers some closing thoughts. 3. Monitoring betting markets for corruption Essentially, the effect of a fix on the betting market is a special case of insider trading in financial markets. The fixers have information unknown to the rest of the market: that some participants in the match will attempt to manipulate events to bring about a particular outcome. They (and other parties that know of the fix and wish to make a profit from it) will place bets on the outcome they have sought to arrange. This additional betting will raise volume above what would normally be expected. In regulated markets, abnormal volume (where the additional money is biased in favour of one particular outcome) is the key metric for triggering investigation. Figure 1 shows the volumes of bets placed with a UK bookmaker on the outcome of the first frame in the first round of matches of snooker’s UK Championship in 2008. Fixing a minor component of a match (here the first frame) may be more appealing to the player than planning to lose the whole thing. 
However, liquidity is liable to be low in the market on a single frame, such that ‘unexpected betting’ would be very obvious. Indeed in some matches in this tournament, the bookmaker took no bets at all on the first frame. In one match, however, the bookmaker took £3,754 that Stephen Hendry would beat Stephen Lee on the first frame. The unexpected volume and its being placed overwhelmingly on one side of the bet triggered suspicion. Subsequently Lee was suspended from competition on match-fixing charges, including one related to this match, and received a 12-year ban from the sport (https://www.bbc.co.uk/sport/snooker/24114861). Figure 1. Volume bet on winner of first frame in first round matches at snooker’s UK Championship in 2008. Monitoring of volume on national, regulated markets has thrown up a number of such cases across sports such as football and handball as well as snooker. However, as noted above, large-scale fixers will typically bet on Asian markets, to take advantage of the high liquidity and lack of official scrutiny. Monitoring of these markets has necessarily to be of odds rather than volume because only odds can be observed. Fortunately, Asian bookmakers tend towards a book-balancing business model such that any net inflow of money that a particular outcome will occur will automatically be met with a shortening of the odds on that outcome and lengthening of odds on other outcomes. This continuous realignment of the odds to minimize operator risk from over-exposure to one outcome allows monitoring systems to interpret odds movements as reflective of relative amounts being wagered on the different outcomes. 
For example, if observed odds for a particular event outcome fall sharply, this may be taken as indicative that bets arriving in the market are predominantly being placed in support of that outcome coming to pass. The basic concept behind the monitoring of betting market odds is rooted in the notion that market odds may be interpreted as probabilistic forecasts of match outcomes. ‘Forecast efficiency’ as a general concept refers to the extent to which information is incorporated into forecasts (Nordhaus, 1987). Studies of betting markets have generally concluded that odds (corrected for the presence of implicit bookmaker commission built into these prices) broadly satisfy this assumption of forecast efficiency, albeit with the exception of favourite-longshot bias in markets on some sports (Vaughan Williams, 1999). Moreover, this conclusion is beginning to be validated for contemporary in-play markets as well. For example, Croxson & Reade (2014) demonstrate that betting exchange odds adjust fully and rapidly in response to the arrival of a goal in a football game. A well-constructed sports forecasting model should also produce efficient forecasts. In most situations, it should yield probabilities that correspond closely to those implied by odds in the betting market. Typically, they do. However, in the case of a fixed match, the fixers will place bets on the Asian betting markets such that the relative volumes of bets on the different event outcomes are not now congruent with the probabilities implied by the current odds. Asian bookmakers will respond by modifying the odds in an attempt to avoid over-exposure to the outcome favoured by the additional money in the market. The change in odds will reflect the knowledge held by some traders that there is to be an attempt to manipulate the match. But this information is not known to the mathematical forecasting model. 
Hence probabilities implied by odds observed in the market will begin to diverge from model probabilities. This divergence will grow as the fixers continue to pump money into the betting market and the bookmakers respond by adjusting the odds further. By looking for examples of this divergence between model and market probabilities, potentially fixed matches can be identified. 4. In-play models Detecting match fixing relies on obtaining up-to-date betting market information and on having a reliable mathematical model to estimate the probabilities of the outcome of the sporting event as events unfold. Forecasting models exist for many sports. Cricket in particular has attracted much attention in the academic literature. Indeed, the Duckworth–Lewis method (Duckworth & Lewis, 1998) is essentially a forecasting model and was adopted by cricket’s governing bodies to determine the official results of matches that were not completed because of inclement weather. Since the publication of the original Duckworth–Lewis model, it has been refined further by McHale & Asif (2013) and Stern (2016) for example. Other authors have employed concepts from the Duckworth–Lewis model to develop explicit forecasting models, noted by them as relevant in betting markets (Bailey & Clarke, 2006, Akhtar & Scarf, 2012). Further, Asif & McHale (2016) plot, for sample games from One Day International cricket, in-play betting odds and the probabilities implied by a statistical model and demonstrate that the betting market acts ‘logically’ in the sense that odds typically track model probabilities. In the remainder of this section we describe models for forecasting the results of a football match in-play and the result of a tennis match in-play. 4.1. In-play models of football Forecasting models for football are either pre-match or in-play. 
Pre-match forecasting models estimate either (a) the probabilities of the match finishing in each possible scoreline (0-0, 1-0, 0-1, 1-1 and so on) or (b) the match result probabilities (win, draw or loss). Of course, the estimated scoreline probabilities from the first type of model can be summed to obtain the match result probabilities. In-play models estimate the probabilities of each final scoreline once the match has begun, using estimated team scoring rates. In fact, the core task of forecasting the results of football matches, be it pre-match or in-play, is that of estimating appropriate scoring rates for the two teams in a particular match, based on their records of scoring and conceding goals in previous matches. In a seminal paper on forecasting in football by Maher (1982) the number of goals scored by each team in a match was assumed to follow a Poisson distribution in which the scoring rates depended on the attacking strength of the team and the defensive strength of the opposition. Since then, research has focussed on making improvements to the Maher model. For example, Dixon & Coles (1997) allowed more recent matches to influence the estimated attack and defence strengths more than matches further in the past, while Karlis & Ntzoufras (2000) proposed a diagonally inflated bivariate Poisson model to allow for a higher frequency of observed draws than would be expected under the independent Poisson model. McHale & Scarf (2007) used a copula to induce a bivariate distribution allowing for dependence between the numbers of goals scored by the two teams in a match. More recently, Koopman & Lit (2015) used a state-space model to allow team strengths to vary stochastically. Boshnakov et al. (2017) propose a Weibull count model as an alternative discrete distribution to the Poisson model used previously. But in all of these models, the central task is to estimate the scoring rates of the two teams in a particular match. 
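To make the step from scoring rates to match result probabilities concrete, the following is a minimal sketch in Python of the simplest case discussed above: goals for the two teams treated as independent Poisson counts, with scoreline probabilities summed into win, draw and loss probabilities. The scoring rates (1.5 and 1.1), the function names and the truncation at 10 goals per team are illustrative assumptions, not values or code from any of the cited papers, and none of the refinements (time weighting, draw inflation, copulas, Weibull counts) is included.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k goals when goals are Poisson with the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def scoreline_probs(home_rate: float, away_rate: float, max_goals: int = 10) -> dict:
    """Probability of each scoreline (home, away) under independent Poisson counts.

    max_goals truncates the grid; it is an illustrative choice, not a modelling claim.
    """
    return {
        (h, a): poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }


def result_probs(home_rate: float, away_rate: float, max_goals: int = 10) -> tuple:
    """Sum scoreline probabilities into (home win, draw, away win) probabilities."""
    grid = scoreline_probs(home_rate, away_rate, max_goals)
    home = sum(p for (h, a), p in grid.items() if h > a)
    draw = sum(p for (h, a), p in grid.items() if h == a)
    away = sum(p for (h, a), p in grid.items() if h < a)
    total = home + draw + away  # renormalise for the truncated tail
    return home / total, draw / total, away / total


# Hypothetical scoring rates for one match (assumed for illustration only).
p_home, p_draw, p_away = result_probs(1.5, 1.1)
```

The same grid can serve any market defined on the final score, which is why the choice of scoring rates, rather than the summation, is the central modelling task.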
The papers discussed above deal with pre-match forecasting. Somewhat surprisingly given the popularity of in-play betting in football, models for forecasting the results of matches once the match has begun are rare in the academic literature. Dixon & Robinson (1998) present a paper dealing with in-play forecasting in football and use a bivariate birth process to estimate the hazard (instantaneous scoring rate) of the two teams scoring throughout a match. Titman et al. (2015) present a similar model but allow for both interdependence between the goals scored by the two teams and the yellow and red cards received by the two teams. Volf (2009) presents a semi-parametric Cox model for the scoring hazards. These models typically allow three variables to affect the hazards of scoring for the two teams in a football match: (i) the passing of time, (ii) the awarding of red cards and (iii) the current score line. Once the hazards have been estimated, forecasts are produced using Monte Carlo simulation whereby each minute of the match is simulated for the occurrence of any goals and propagated forward until the match is complete (typically this is set to around 93 minutes to allow for any injury time added at the end of the game). In the bookmaking industry, it is typical that scoring rates of the teams are first estimated using a pre-match model, before subjective adjustments are made by specialist expert traders. The traders adjust the scoring rates in light of information that the model used for estimating scoring rates would not know. For example, an injury to a key player might be expected to reduce a team’s scoring rate (or increase the opposition’s scoring rate). Given a pair of scoring rates for the two teams in a match, minute-by-minute scoring rates are estimated by distributing them across the minutes of a match. These minute-by-minute scoring rates effectively approximate the hazards of scoring that would normally be estimated using a survival model. Fig. 
2 shows the relative frequency of goals per minute in all matches from the English Premier League between the 2007–2008 and 2017–2018 seasons. Ignoring the spikes at minutes 45 and 90 where extra minutes of play have been added together, the scoring rate can be seen to increase throughout the match. Traders in the bookmaking industry typically distribute the estimated scoring rates in proportion to the distribution depicted in Fig. 2. The resulting minute-by-minute scoring rates are then used to simulate the goals scored in the remaining minutes of the match, and the outcomes of the simulations reveal the probabilities of the match ending in each and every possible scoreline. Figure 2. Relative frequency of goals scored in each minute of football matches played in the English Premier League between the 2007–2008 and 2017–2018 seasons. 4.2. In-play models of tennis Just as forecasting in football centres around the task of estimating team scoring rates, in-play forecasting for tennis focusses on the key task of estimating the probability that a player will win a point on serve given he/she is playing against a given other player. Further, analogous to estimating attack and defence strengths of football teams, the probability of winning a point on serve depends on both that player’s serving ability and the opposing player’s returning ability. Once the probability of winning a point on serve is known for both players, and assuming this stays constant throughout a match (an assumption shown to be approximately true in Klaassen & Magnus, 2001, and considered further by Viney & Bedford, 2018), the match win probabilities can be derived analytically. 
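As an illustration of what such an analytical derivation looks like, the sketch below computes the probability that the server wins a single game, given a constant probability p of winning each point on serve and independence between points. This is the standard textbook result for one game only; the function name is an assumption made here for illustration, and set-win and match-win probabilities, which require further steps including the tiebreak, are not covered.

```python
def game_win_prob(p: float) -> float:
    """Probability that the server wins a game, with constant point-win probability p.

    Assumes points are independent and identically distributed.
    """
    q = 1.0 - p
    # Server wins to love, to 15 or to 30: four points won with 0, 1 or 2 lost.
    before_deuce = p ** 4 * (1 + 4 * q + 10 * q ** 2)
    # Probability of reaching deuce (3 points each): C(6, 3) = 20 orderings.
    reach_deuce = 20 * p ** 3 * q ** 3
    # From deuce, the server must win two points in a row before the returner does.
    win_from_deuce = p ** 2 / (1 - 2 * p * q)
    return before_deuce + reach_deuce * win_from_deuce


# A server winning 60% of points on serve wins roughly 74% of service games.
example = game_win_prob(0.6)
```

The calculation shows how a modest advantage at the point level is amplified at the game level, and the same amplification recurs when games are aggregated into sets and sets into matches.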
Klaassen & Magnus (2003) present such a model and use it to forecast the outcome of matches at the Wimbledon tournament, while Barnett (2006) presents derivations of closed-form expressions for calculating game-win, set-win and match-win probabilities from point-win probabilities. Barnett & Clarke (2005) present a methodology for combining player statistics on serve and return to estimate the probability of a player winning a point on serve. Spanias & Knottenbelt (2013) use tennis player statistics in a Markov Chain model for in-play forecasting and explicitly recognize that their model could be used as an input into fraud detection.

Figure 3 Market odds and model odds for an example of a football match where odds evolve in the expected way. The odds are shown for the Asian Totals market. The black lines represent the odds implied by the mathematical model, while the coloured lines show the actual odds offered by a betting operator (for various totals goals markets). Vertical green lines represent goals and vertical red lines represent red cards. The shaded grey area represents the half-time period.

Figure 4 Market odds and model odds for a real example of a football match reported as likely to have been manipulated. The odds are shown for the Asian Totals market. The black lines represent the odds implied by the mathematical model, while the coloured lines show the actual odds offered by the market (for various totals goals markets). Vertical green lines represent goals and vertical red lines represent red cards. The shaded grey area represents the half-time period.

5. Examples

In this section we provide examples of using in-play models and betting markets to detect potentially fixed matches in football and tennis. The data were provided by Sportradar, the organization charged with providing alerts of potential fixing to the governing bodies of European and world football, UEFA and FIFA, and to one of the governing bodies of men’s professional tennis, the ATP. Figures 3 and 4 show the betting odds and the odds implied by an in-play football model for the ‘Asian Totals’ market on two particular matches. Asian Totals is typically a very liquid market. Bookmakers quote a benchmark for the total number of goals over the whole game and bettors may wager that the final number will be either over or under this benchmark. A bookmaker might offer odds of, say, 1.8 that the total goals in the match will be over 2.5. The punter will win the bet (collecting $1.80 for each $1 staked, a profit of $0.80) if there are three or more goals in the match, but lose the bet if there are only zero, one or two goals in the match. The baseline of 2.5 goals switches depending on the odds being offered and on the current scoreline.
For example, once three goals have been scored in a match, offering a market on ‘under 2.5 goals’ would be meaningless. Or, as a match nears its end, bets on ‘under 3 goals’ would have to be offered at extremely short odds if the score were currently 1-0; this would be unappealing to customers, so the bookmaker might switch to offering odds on over-or-under-two goals instead. Note that the benchmark number of goals in the Asian Totals market may be quoted to a quarter of a goal. For example, the bettor is invited to wager over or under 2.25 goals. This has a particular meaning in the context of Asian Totals rules. Suppose, for instance, the bettor chooses ‘over 2.25’. If three goals are scored, the bet is won. If only one goal is scored, the bet is lost. But, if exactly two goals are scored, the bettor forfeits only half the stake. The odds implied by the mathematical model are depicted as black lines in Figs 3 to 7 and are derived from proprietary models used by Sportradar. We have shown the odds calculated using this mathematical model since these are the ones actually used in application of monitoring for UEFA, FIFA and the ATP. Forrest & McHale (2015) provide a detailed analysis of the models used by Sportradar to generate probabilities for football and find that there is a high level of agreement between Sportradar’s proprietary model and the model presented above and that the estimated probabilities do indeed represent the empirical frequencies of outcomes of football matches. In Fig. 3, the market odds (various colours) and the odds calculated from the mathematical model (black), follow each other closely. As the match progresses, and the time remaining for further goals runs out, the odds increase (probability decreases) for there being more than the baseline number of goals. A goal is scored in the 43rd minute, and the market switches from ‘over-under 1.25 goals’ to ‘over-under 2 goals’. 
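The behaviour of the model line in Fig. 3 can be reproduced in miniature. The sketch below combines the Asian Totals settlement rule described above with the minute-by-minute simulation outlined in Section 4.1 to produce fair odds for an ‘over’ bet at a given point in a match. It is only an illustration under stated assumptions: the linear rise in scoring rate across the match, the at-most-one-goal-per-team-per-minute approximation, the scoring rates of 1.5 and 1.2, and all function names are hypothetical, and nothing here reflects Sportradar’s proprietary model.

```python
import random

MATCH_MINUTES = 93  # regulation time plus a nominal allowance for injury time


def settle_over(line, goals):
    """Net result of an 'over' bet per unit stake, before odds are applied.

    Returns 1.0 (win), 0.5 (half win), 0.0 (stake returned),
    -0.5 (half loss) or -1.0 (loss). Quarter lines are treated as two
    half-stake bets on the adjacent half and whole lines.
    """
    if round(line * 4) % 2 == 0:          # whole or half line
        sub_lines = [line]
    else:                                 # quarter line, e.g. 2.25 -> 2.0 and 2.5
        sub_lines = [line - 0.25, line + 0.25]

    def one(sub):
        return 1.0 if goals > sub else (0.0 if goals == sub else -1.0)

    return sum(one(s) for s in sub_lines) / len(sub_lines)


def minute_weights(n=MATCH_MINUTES, slope=0.5):
    """Hypothetical share of a team's goals falling in each minute (sums to 1)."""
    raw = [1.0 + slope * m / (n - 1) for m in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def simulate_final_total(goals_so_far, minute, rate_home, rate_away, weights, rng):
    """Simulate the remaining minutes once and return the final goal total."""
    total = goals_so_far
    for m in range(minute, len(weights)):
        for rate in (rate_home, rate_away):
            # Approximation: at most one goal per team per minute.
            if rng.random() < rate * weights[m]:
                total += 1
    return total


def fair_over_odds(line, goals_so_far, minute, rate_home, rate_away,
                   n_sims=5000, seed=1):
    """Decimal odds at which the 'over' bet has zero expected profit."""
    rng = random.Random(seed)
    weights = minute_weights()
    win = loss = 0.0
    for _ in range(n_sims):
        final = simulate_final_total(goals_so_far, minute, rate_home,
                                     rate_away, weights, rng)
        result = settle_over(line, final)
        if result > 0:
            win += result
        else:
            loss -= result
    return float("inf") if win == 0 else 1.0 + loss / win


if __name__ == "__main__":
    # With the score still 0-0, fair odds on over 2.25 lengthen as time runs out.
    for minute in (0, 30, 60):
        print(minute, round(fair_over_odds(2.25, 0, minute, 1.5, 1.2), 2))
```

For a half-goal line the fair odds reduce to the reciprocal of the simulated probability of the bet winning, which is the usual relationship between decimal odds and probability.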
Figure 4 shows the evolution of market and model odds for a match reported by the monitoring agency as likely to have been manipulated. From the very first minute, the model odds and the market begin to diverge. From minute 18 to 31 for example, the model odds show how the odds increase on the over 2.25 goals bet. This is intuitively what one would expect: as the match progresses, there is less time for goals to be scored and so the probability of more than 2.25 goals will decrease, such that the odds offered should increase. The market, shown by the pink line, does the exact opposite and the odds actually decrease on there being more than 2.25 goals. This pattern continues throughout the match. Evidently, some bettors were more confident than the model would entitle them to be that, eventually, the goals would be scored (and indeed the late goals anticipated in the market duly appeared). Unlike Fig. 4, Fig. 5 depicts the evolution of odds in a match where at first, the two sets of odds are in agreement. For the first 20 minutes of the match the black line representing the model odds and the lines representing the market odds are in almost exact agreement. From minute 20 the odds begin to diverge. This is likely due to the fixers delaying placing wagers until the match is well underway. But once the bets begin to be recognized by the market, the odds quickly begin to behave perversely. The orange line representing the odds for over 3.5 goals decreases sharply, despite there being only two goals in the match to that point, and time running out for the goals to be scored that would make that bet a winning one. Figures 6 and 7 show market and model odds for the outcomes of two tennis matches. The match depicted in Fig. 6 is a match free of suspicion of fixing. 
After each game, the match odds for player 1 winning adjust to the latest information, and it is clear that the model (black line) and market (blue line) are very much in agreement as to the likely outcome of the match. Figure 7 shows the model and market odds for a tennis match suspected to have been manipulated. In this case, player 1 was very much the pre-match underdog. Odds implied by the model shortened as events unfolded and a win for the underdog became more likely. But, from very early on in the match, the market odds were lower than the model odds for player 1 winning, i.e. market expectations of a surprise result were stronger than sporting logic would have suggested.

Figure 5 Market odds and model odds for a second real example of a football match believed to have been manipulated. The odds are shown for the Asian Totals market. The black lines represent the odds implied by the mathematical model, while the coloured lines show the actual odds offered by the market (for various totals goals markets). Vertical green lines represent goals and vertical red lines represent red cards. The shaded grey area represents the half-time period.

Figure 6 Market odds and model odds for a tennis match where odds evolve in the expected way. The odds are shown for player 1 winning the match. The black line represents the odds implied by the mathematical model, while the blue line shows the actual odds offered by the market. The x-axis gives the game score in each of the three sets played in the match (player 1’s games are shown first).

Figure 7 Market odds and model odds for a real example of a tennis match reported as likely to have been manipulated. The odds are shown for player 1 winning the match. The black line represents the odds implied by the mathematical model, while the blue line shows the actual odds offered by the market. The x-axis gives the game score in each of the two sets played in the match (player 1’s games are shown first).

6. A match-fixing detection system in practice

Fixers may place their wagers in either or (more typically) both the pre-match and in-play markets. Spreading their money across both markets (and indeed, in the case of football, across the main types of bet: total goals, result in terms of home/draw/away and margin of victory) reduces the risk of detection.
Nevertheless, the bulk of their ‘investment’ will usually be in-play because here greater profit may be obtained. For example, their strategy may be to take much of their profit by betting that a team will lose by more than X goals (a popular market). They arrange for players on that team to make sufficient defensive errors in the second half to bring about the desired result. In the average case, since the match is initially played in a normal way, the odds for bets on that result will lengthen because less and less time is available for the required goals to be scored. Profit will therefore be greater if the nefarious money is fed into the market as odds lengthen than if the same bet in favour of a large defeat for the particular team had been placed pre-match. Van Rompuy (2015) had access to data relating to 1,468 (worldwide) football matches that the monitoring service Sportradar had reported as suspicious over a period of 5 years. In 67% of these matches, betting market anomalies had been detected in both pre-match and in-play markets. This and other statistics presented by Van Rompuy confirm that the presence of a fixing operation is likely to impact odds both before and during a match. Sportradar’s algorithms, applied to a continuous stream of odds data from hundreds of websites, work differently pre-match and in-play. Before the match, most alerts are triggered by sharp shifts in odds, whereas in-play monitoring focuses on deviation of observed odds from the odds predicted by the mathematical model. Here we concentrate on the in-play market, which is where most betting (legitimate and nefarious alike) takes place. Our description of procedures and related statistics were obtained while conducting an audit of the Sportradar system on behalf of UEFA, one of its principal clients. As such, our observations relate to monitoring in football. 
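The in-play comparison of observed odds with model odds can be sketched in a few lines. The example below converts decimal odds to implied probabilities and flags the minutes at which the market departs from the model by more than a chosen threshold. It is a deliberately simplified illustration: the function names, the odds series and the threshold values are hypothetical, the bookmaker margin is ignored, and the actual rules used by Sportradar are not described in this paper. How large the threshold should be is the judgement taken up in the paragraphs that follow.

```python
def implied_probability(decimal_odds):
    """Probability implied by decimal odds, ignoring the bookmaker margin."""
    return 1.0 / decimal_odds


def flag_deviations(model_odds, market_odds, threshold):
    """Return (minute, gap) pairs where market and model disagree by more
    than `threshold` in implied probability.

    A positive gap means the market rates the outcome as more likely
    than the model does.
    """
    flagged = []
    for minute, (model, market) in enumerate(zip(model_odds, market_odds)):
        gap = implied_probability(market) - implied_probability(model)
        if abs(gap) > threshold:
            flagged.append((minute, round(gap, 3)))
    return flagged


# Hypothetical thresholds: a larger deviation is tolerated in thin markets.
THRESHOLDS = {"major": 0.05, "minor": 0.10}

if __name__ == "__main__":
    # Made-up odds on an 'over' bet: the model odds lengthen as time passes,
    # while the market odds shorten.
    model = [2.0, 2.2, 2.5]
    market = [2.0, 1.9, 1.7]
    print(flag_deviations(model, market, THRESHOLDS["major"]))
    print(flag_deviations(model, market, THRESHOLDS["minor"]))
```

In this made-up series the lower threshold flags the divergence one observation earlier than the higher one, which is the sensitivity and specificity trade-off in its simplest form.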
The first practical issue in setting up monitoring is the fixing of thresholds that have to be exceeded before the algorithms trigger an alert. Naturally, the mathematical model, which is only a forecasting model, will never track the evolution of observed odds exactly as there will be ‘noise’ in the latter series. So a judgement must be made as to how much deviation should be required before the system triggers an alert. The answer to this question will be different for different levels of competition (for example a greater deviation is tolerated for minor matches where the betting market is thin and so even a single moderate bet might induce a non-trivial movement in odds) but in all cases will depend on the relative importance of sensitivity and specificity in the task at hand. Low thresholds will generate a relatively large number of false positives but high thresholds may lead to the system missing true fixing cases. The task in hand at this stage is not in fact to generate a final verdict on whether a match fix has been attempted but rather to determine whether the betting market activity observed should be subject to scrutiny by expert analysts. Such scrutiny would be necessary in any monitoring system screening for manipulation because deviation in observed odds from the odds predicted by the mathematical model may not in fact be illogical. The in-play forecasting model captures the influence of team strengths, the current score, the number of minutes remaining and whether either team has been reduced in size by a red card (disciplinary dismissal of a player from the field). But the betting market might have access to additional information relevant to match outcomes. For example, odds in an efficient market would reflect an increase in the expected number of goals if one of the goalkeepers had been injured and replaced by a much less able substitute. 
In this case, deviation of the odds from those implied by the mathematical model should not be a cause for concern. Again, those viewing a match or following a commentary will be able to form a judgement on the momentum in the game that will give them information relevant to what direction the game will take. Indeed, McIvor et al. (2018) carried out textual analysis of ball-by-ball cricket commentaries and found that the incidence of positive or negative remarks about batsmen and bowlers anticipated future wickets. In similar vein, Brown et al. (2018) examined tweets during English Premier League football matches and reported that an increase in positive-tone tweets about a team coincided with a significantly raised probability that the team would go on to win. Each of these papers illustrates that there is match-specific information that is not accounted for in the generic statistical models used by bookmakers to set odds and by monitoring systems in detecting corruption. Hence deviations between market odds and model odds will always need to be assessed in case events in the match provide a plausible reason for the deviation. Given that the application of the algorithms can only be the first stage in a process, it is reasonable at this stage that sensitivity should be prioritized over specificity. In fact, thresholds appear to be set low in Sportradar’s system: of the 45,569 matches covered during a 1-year period, 33.2% triggered at least one alert at the level where the test result was determined as ‘positive’ and passed on to analysts. Analysts are experts recruited for their knowledge of both sport and betting markets. When an alert is issued, it goes straight to the desk of one of the duty analysts, whose role is to carry out an immediate preliminary assessment of whether there is a legitimate explanation for the betting market anomaly. The analyst has access to multiple information sources and to data feeds from the match.
He might conclude that the irregular behaviour in the odds was adequately explained by circumstances in either the betting market or in the sporting event. For example, a minor league might be facing more competition for betting volume on a particular day because of unusually heavy fixture lists in other competitions; this could mean that the standard thresholds for identifying deviations of observed from expected odds are unrealistically low and the deviations reported by the algorithms reasonable given the lower than usual liquidity in the market. On the sporting side, social media reports may tell of intense pressure on the goal of one of the teams such that scoring seems much more likely than the mathematical model predicts. In fact, most matches considered by Sportradar analysts at this stage are discarded from further review. In the year of data we examined, 15,129 matches generated alerts but only 7.95% of these were ‘hotlisted’ by the analyst (i.e. referred to the third and final stage of the screening process). That final stage usually takes place on the following day. Overnight, information on events in the game and videos of incidents where available are collected from country correspondents, typically journalists, engaged permanently by Sportradar to provide reports on local football (their regular reports on player injuries, etc., inform assessment of unexpected odds in pre-match markets). The full team of analysts on duty considers and discusses the ‘hotlisting’ report and the additional information assembled overnight and determines whether or not the match should be ‘escalated’, i.e. classified as ‘likely’ or ‘very likely’ to have been manipulated. In this case a comprehensive report giving the reasons for suspicion is prepared for the relevant governing body. In the year of data we examined, only 24% of matches initially hotlisted were subsequently escalated. 
These 291 matches represented 0.64% of all matches monitored during the 12 months we examined. The great shrinkage in the number of matches classified as suspicious between the alert and escalation stages reflects an evident shift towards emphasizing specificity over sensitivity in successive stages of the screening process. The description here shows that a considerable infrastructure is required to support the initial screening by algorithms. Investment is needed in systematic access to multiple sources of information as well as employment of valuable human expertise if the final classification of a match as suspicious is to be credible. The important role of judgement in the procedures should not, however, detract from an appreciation of the key role of forensic statistics in the output from monitoring. Essentially, modelling is the key to effective, automated identification of potentially fixed matches. Experts then search for legitimate explanations for the betting market anomalies that have triggered alerts. Where no plausible explanation can be found, the match will be reported as suspicious. Naturally the report will be more convincing if unusual odds movements are followed by on-field events that they appear to foretell (for example, the market seems illogically to favour two very late goals and they are then duly scored after two dubious penalties are awarded). Nevertheless, some matches are reported even without such corroboration by events because the odds movements are highly suggestive of a fix being attempted but not succeeding. Reports of likely fixed matches are not always fully investigated by the sports governing body. However, in several cases, they have led to detailed inquiry, supported by police and leading to convictions. One well-known example was in Australian soccer.
Three English players, each of whom had previously played for clubs in English tier-6 (Conference South) that had taken part in matches suspected of match fixing, transferred together to a minor club, Southern Stars, in the State of Victoria. Subsequently, a number of its matches were identified as likely to have been manipulated. Victoria police then obtained evidence of links between the players and criminals and built a case sufficient to secure conviction of the three players and their criminal handler. This and other successful outcomes in the criminal justice system have been achieved by police gathering physical evidence (for example, telephone records or bank transfers between players/ referees and criminals) to support a prima facie case made from the application of forensic statistics to the betting market. But law enforcement is not always willing to allocate resources to such cases nor can it be certain that adequate physical evidence can be assembled to satisfy the high standard of proof required in a criminal court. This raises the interesting question of whether betting market analysis can ever be an adequate basis for the governing body acting to protect its own sport by using its own disciplinary procedures (where standard of proof tends to be balance of probability rather than beyond reasonable doubt). Two important cases in the CAS have tested the ability of football authorities to impose sanctions where the statistical analysis indicates likely manipulation but detailed supporting evidence is unavailable (indeed, to protect the credibility of their events, sports governing bodies may wish to impose sanctions even if a criminal investigation is still in progress). The first case (CAS 2016/A/4650, Klubi Sportiv Skenderbeu v. UEFA) followed the exclusion of the Albanian club KS Skenderbeu from the European Champions League. Several of its matches, domestic and European, had been reported as suspicious by Sportradar. 
UEFA had used powers to exclude a club from its competitions to protect the reputation of those competitions. Having exhausted internal UEFA appeals procedures, Skenderbeu took the case to the CAS. The Court confirmed the exclusion. Although it cited some supporting evidence, such as the public statement of an opposition player about unusual play by Skenderbeu, the Court’s Judgement rested heavily on the Sportradar reports and on testimony as to the robustness of the Sportradar monitoring system by Forrest & McHale (2015). In a commentary on the published Judgement, Kerr (2017) notes that the Forrest & McHale study was cited at five separate points as grounds for Sportradar’s Fraud Detection System (FDS) being correct and that the Judgement also states that Forrest and McHale’s ‘expert evidence’ was ‘not rebutted by differing expert evidence’. The Skenderbeu case was widely taken as a precedent. Blackshaw (2018, p. 244), a CAS Arbitrator himself, wrote that ‘it is generally considered that, as a result of the CAS Award in the Skenderbeu match fixing case, it should be easier for Sports Governing Bodies to sanction match fixers.’ Indeed, following the case, new sports appeared more ready to subscribe to services offered by Sportradar and other monitoring companies. The Skenderbeu case was, however, relatively easy in the sense that UEFA depended on its rule that clubs could be held vicariously responsible for wrongdoing. It was sufficient to demonstrate (from the statistical analysis of betting odds) that a fix had likely occurred. No individuals were sanctioned, so it was not necessary to produce evidence of who had executed the fix. The next case of note was, however, different in that it involved sanctioning of an individual. FIFA had banned referee Joseph Lamptey for life after its disciplinary panels concluded that he had manipulated a World Cup qualifying match in 2016. In 2017, the CAS heard Mr. Lamptey’s appeal (CAS 2017/A/5173, Lamptey v. FIFA).
Five betting services—Sportradar, EWS, GLMS, Starlizard and Genius Sports—had all reported very unusual odds movements in markets on total goals, with the unusual activity immediately preceding the scoring of two goals that had resulted from what had seemed to have been blatantly incorrect refereeing decisions. The Court Judgement again validated the logic behind monitoring systems. Its Judgement (para. 83) emphasized that ‘the deviation from the expected, ordinary movement in odds on ‘overs’ in the Match, contradicting the mathematical model, is a decisive sign that bettors had some information that the mathematical model did not have and expected that at least two goals be scored irrespective of the lapse of time.’ It found ‘remarkable’ that odds, which had behaved normally for most of the match, had deviated from expected odds immediately before two goals were scored as a result of referee decisions and had then returned to what the mathematical model predicted. Following the CAS award, FIFA ordered the replay of the match. The result was reversed, with Senegal and not South Africa now the winner, and Senegal duly won its place at the World Cup in Russia. From its very explicit validation of the conceptual framework behind monitoring—the comparison between observed odds and odds predicted by a mathematical model—the Court’s Judgement in this case provides further encouragement for the continued use of mathematical analysis in detecting manipulation in sport. Of course there is also potential to apply this forensic approach to the analysis of sports as well as betting data since this could reveal where players have underperformed to a significant degree. However, given that performance levels in sport are subject to wild variation, attempts to detect fixing via this route alone would yield too many false positives. 
It seems likely that future monitoring systems will have to include analysis of betting data even if it comes in time to be supplemented by analysis of the rich data now generated by professional sports matches. 7. Conclusions Online technology has helped generate betting markets with enormous levels of liquidity, which attract those seeking to make profits from match fixing. This paper describes how forensic statistics can be used to detect match fixing and combat the threat match fixing poses to sport integrity. At the heart of the detection algorithms lie in-play forecasting models whose predictions are compared to the odds being offered on betting markets. Suspicions are raised in instances where the market odds and the model odds diverge. We provide real examples of monitoring for football and tennis matches and describe how suspicious matches are investigated by expert analysts before finally being brought to the attention of the relevant governing body. We further cite two court cases in which forensic statistics played a central role in the fight against corruption in sport. Following the success of monitoring systems in, for example, football and tennis, major North American leagues such as Major League Baseball and the National Hockey League have adopted systems to monitor betting markets in search of fixes. The adoption of forensic statistics by sports governing bodies should be seen as success story for mathematics and it is clear that mathematical modelling will continue to be at the heart of processes to protect sport from the risk of corruption. Acknowledgements We would like to thank Sportradar for making data on example matches available to us. References Abarbanel , B.R. & Johnson , M.R. ( 2018 ) Esports consumer perspectives on match fixing , International Gambling Studies , doi: https://doi.org/10.1080/14459795.2018.1558451 . WorldCat Asif , M. & McHale , I. G. 
( 2016 ) In-play forecasting of win probability in One-Day International cricket: a dynamic logistic regression model . Int. J. Forecast. , 32 , 34 – 41 . Google Scholar Crossref Search ADS WorldCat Akhtar , S. & Scarf , P. A. ( 2012 ) Forecasting test cricket match outcomes in play . Int. J. Forecast. , 28 , 632 – 643 . Google Scholar Crossref Search ADS WorldCat Bailey , M. & Clarke , S. R. ( 2006 ) Predicting the match outcome in one day international cricket matches, while the game is in progress . J. Sports Sci. Med. , 5 , 480 – 487 . Google Scholar PubMed WorldCat Barnett , T. ( 2006 ) Mathematical modelling in hierarchical games with specific reference to tennis. Ph.D. Thesis, Swinburne University of Technology, Melbourne, Australia . Barnett , T. & Clarke , S. R. ( 2005 ) Combining player statistics to predict outcomes of tennis matches . IMA J. Manag. Math. , 16 , 113 – 120 . Google Scholar Crossref Search ADS WorldCat Bernhardt , D. & Heston , S. ( 2010 ) Point shaving in college basketball: a cautionary tale for forensic economics . Econ. Inq. , 48 , 14 – 25 . Google Scholar Crossref Search ADS WorldCat Blackshaw , I. ( 2018 ) The role of the Court of Arbitration for Sport (CAS) in countering the manipulation of Sport. The Palgrave Handbook on the Economics of Sport ( M. Breuer & D. Forrest eds). Cham, Switzerland : Palgrave Macmillan , pp. 223 – 246 . Google Preview WorldCat COPAC Boniface , P. , Lacarrière , S. & Verschuuren , P. ( 2012 ) Paris sportifs et corruption. Iris Éditions Paris : in French . Google Preview WorldCat COPAC Borghesi , R. , Paul , R. J. & Weinbach , A. P. ( 2010 ) Totals market as evidence against the widespread point shaving . Journal of Prediction Markets , 4 , 15 – 22 . WorldCat Boshnakov , G. , Kharrat , T. & McHale , I. ( 2017 ) A bivariate Weibull count model for forecasting association football scores . Int. J. Forecast. , 33 , 458 – 466 . Google Scholar Crossref Search ADS WorldCat Brown , A. , Rambacussing , D. 
, Reade , J. J. & Rossi , G. ( 2018 ) Forecasting with social media: evidence from tweets on soccer matches . Econ. Inq. , 56 , 1748 – 1763 . Google Scholar Crossref Search ADS WorldCat Croxson , K. & Reade , J. J. ( 2014 ) Information and efficiency: goal arrival in soccer betting . Econ. J. , 124 , 62 – 91 . Google Scholar Crossref Search ADS WorldCat Dixon , M. J. & Coles , S. G. ( 1997 ) Modelling association football scores and inefficiencies in the football betting market . J. Roy. Statist. Soc. Ser. C , 46 , 265 – 280 . Google Scholar Crossref Search ADS WorldCat Dixon , M. & Robinson , M. ( 1998 ) A birth process model for association football matches . J. Roy. Statist. Soc. Ser. D , 47 , 523 – 538 . Google Scholar Crossref Search ADS WorldCat Duckworth , F. C. & Lewis , A. J. ( 1998 ) A fair method of resetting the target in interrupted one-day cricket matches . J. Oper. Res. Soc. , 49 , 220 – 227 . Google Scholar Crossref Search ADS WorldCat Duggan , M. & Levitt , S. D. ( 2002 ) Winning isn’t everything: corruption in sumo wrestling . Am. Econ. Rev. , 92 , 1594 – 1605 . Google Scholar Crossref Search ADS WorldCat Elaad , G. , Grumer , A. & Kantor , J. ( 2018 ) Corruption and sensitive soccer games: cross-country evidence . J. Law Econ. Organ. , 34 , 364 – 394 . Google Scholar Crossref Search ADS WorldCat Forrest , D. ( 2012 ) The threat to football from betting-related corruption . Int. J. Sport Finance , 7 , 99 – 116 . WorldCat Forrest , D. and McHale , I.G. ( 2015 ). An evaluation of Sportradar’s Fraud Detection System. https://fds.integrity.sportradar.com/wp-content/uploads/sites/18/2016/03/Sportradar-Security-Services_University-of-Liverpool_An-Evaluation-of-the-FDS.pdf . Forrest , D. , McHale , I. & McAuley , K. ( 2008 ) ‘Say it ain’t so’: betting related malpractice in sport . Int. J. Sport Finance , 3 , 156 – 166 . WorldCat Gómez , J. A. & Lalanda , C. ( 2018 ) Anuario del Juego en España 2018 . 
Madrid: Instituto de Política y Gobernanza de la Universidad Carlos III and Grupo Codere. Gregory, J. (2018) Do basketball scoring patterns reflect illegal point shaving or optimal in-game adjustments? Quant. Econom., 9, 1053–1085. Hill, D. (2010) The Fix: Soccer and Organized Crime. Toronto: McClelland and Stewart. IRIS (2017) Preventing Criminal Risks Linked to the Sports Betting Market. Paris: The French Institute for International and Strategic Affairs (IRIS). Jetter, M. J. & Walker, J. K. (2017) Good girl, bad boy? Evidence consistent with collusion in professional tennis. South. Econ. J., 84, 155–180. Jewell, S. & Reade, J. (2014) On fixing international cricket matches. Economics & Management Discussion Papers em-dp2014-08, Henley Business School, Reading University. Karlis, D. & Ntzoufras, I. (2000) On modelling soccer data. Student, 3, 229–244. Kerr, J. (2017) How to build an ‘open’ match-fixing alert system. Int. Sports Law J., 17, 49–67, doi: https://doi.org/10.1007/s40318-017-0115-6. Klaassen, F. J. G. M. & Magnus, J. R. (2003) Forecasting the winner of a tennis match. Eur. J. Oper. Res., 148, 257–267. Klaassen, F. J. G. M. & Magnus, J. R. (2001) Are points in tennis independent and identically distributed? Evidence from a dynamic binary panel data model. J. Amer. Statist. Assoc., 96, 500–509. Koopman, S. J. & Lit, R. (2015) A dynamic bivariate Poisson model for analysing and forecasting match results in the English Premier League. J. R. Stat. Soc. Ser. A Stat. Soc., 178, 167–186. Maher, M. J. (1982) Modelling association football scores.
Statist. Neerlandica, 36, 109–118. McHale, I. G. & Asif, M. (2013) A modified Duckworth–Lewis method for adjusting targets in interrupted limited overs cricket. Eur. J. Oper. Res., 225, 353–362. McHale, I. G. & Scarf, P. A. (2007) Modelling soccer matches using bivariate discrete distributions with general dependence structure. Statist. Neerlandica, 4, 432–445. McIvor, J. T., Ankit, K. P., Hilder, T. & Bracewell, P. J. (2018) Commentary sentiment as a predictor of in-game events in T20 cricket. Proceedings of 14th Meeting, Australasian Conference on Mathematics and Computers in Sport. Queensland, Australia: ANZIAM Mathsport, 44–49, http://www.anziam.org.au/tiki-download_file.php?fileId=113. Nordhaus, W. D. (1987) Forecasting efficiency: concepts and applications. Rev. Econ. Stat., 69, 667–674. Preston, I. & Szymanski, S. (2003) Cheating in contests. Oxf. Rev. Econ. Policy, 19, 612–624. Slemrod, J. (1985) An empirical test for tax evasion. Rev. Econ. Stat., 67, 232–238. Spanias, D. & Knottenbelt, W. J. (2013) Predicting the outcomes of tennis matches using a low-level point model. IMA J. Manag. Math., 24, 311–320. Sport Accord (2011) Integrity in Sport: Understanding and Predicting Match Fixing. Moudon, Switzerland: Sport Accord. Stern, S. E. (2016) The Duckworth–Lewis–Stern method: extending the Duckworth–Lewis methodology to deal with modern scoring rates. J. Oper. Res. Soc., 67, 1469–1480. Titman, A. C., Costain, D. A., Ridall, P. G. & Gregory, K.
(2015) Joint modelling of goals and bookings in association football. J. R. Stat. Soc. Ser. A Stat. Soc., 178, 659–683. Van Rompuy, B. (2015) The Odds of Match Fixing: Facts and Figures on the Integrity Risk of Certain Sports Bets. The Hague: T.M.C. Asser Instituut. Vaughan Williams, L. (1999) Information efficiency in betting markets: a survey. Bull. Econ. Res., 51, 1–30. Viney, M. & Bedford, A. (2018) Altering the probability of winning a point on serve for the most and least important point in tennis. Proceedings of 14th Meeting, Australasian Conference on Mathematics and Computers in Sport. Queensland, Australia: ANZIAM Mathsport, 95–101, http://www.anziam.org.au/tiki-download_file.php?fileId=113. Volf, P. (2009) A random point process model for the score in sport matches. IMA J. Manag. Math., 205, 121–131. Wolfers, J. (2006) Point shaving: corruption in NCAA basketball. Am. Econ. Rev., 96, 279–283. © The Author(s) 2019. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) TI - Using statistics to detect match fixing in sport JF - IMA Journal of Management Mathematics DO - 10.1093/imaman/dpz008 DA - 2019-09-02 UR - https://www.deepdyve.com/lp/oxford-university-press/using-statistics-to-detect-match-fixing-in-sport-9ZvA7V1hZR SP - 431 VL - 30 IS - 4 DP - DeepDyve ER -