On the distribution of rally length in professional tennis matches

Abstract

In the literature, information on the rally length distribution is incomplete, fragmented and non-homogeneous. In this paper we fill this gap by analyzing in depth the distribution of rally length in professional tennis matches along the following directions: i) we provide the empirical distribution of the rally length, not only for some categories, but for each single length; ii) we consider different distributions for men and women and for different surfaces; iii) we find the statistical distribution that best fits the data for each surface; iv) we show how the rally distribution depends on some variables, such as the probabilities of winning a point at serve and the players’ heights; v) all these analyses are based on a much larger sample than in other works, which leads to very reliable results. Our analyses indicate that the best distribution for rally length is a zero-one-modified Geometric distribution, whose parameters are functions of the probabilities of winning a point at serve and of the players’ heights. Results suggest that the players’ height is the variable with the largest impact on the rally length distribution.

1 Introduction

A rally in tennis is the sequence of back and forth shots between players, within a single point. A rally starts with the serve, can involve any kind of shot and ends when a point is scored.

Rally statistics, particularly rally lengths, are useful to measure different styles of play, to define strategies of play and to analyze different aspects of the game (Makino et al., 2020). Usually players dominant on serve tend to play shorter rallies while baseliner players are often engaged in significantly longer rallies. As the majority of points are 4 shots or fewer, some analysts have stressed the importance of a game strategy designed to close the point as fast as possible.

Besides the style of play, the rally length is affected by several other factors: obviously by the game context, but also by the court surface, by the features of the balls, by weather conditions and by the physical characteristics and gender of the players. Slower surfaces, such as clay courts, tend to produce longer rallies than hard and, even more so, grass courts. Hotter weather makes balls faster, helping the servers and, potentially, increasing the 0–4 rally count. Likewise, taller players tend to be associated with shorter rallies due to their strong service.

For all these reasons, the number of shots in a point, i.e. the rally length, can and should be treated as a random variable. It is then natural to ask what the distribution of this random variable is. Although the issue is of clear interest, it has received relatively little attention in the literature and, to date, only partial and incomplete results are available. In the present work, we fill this gap by analyzing the rally length distribution in depth and improving on the existing literature in several directions: i) we provide the distribution of rally length for men and women, and for each surface, not limited to the first 10–15 shots, as is often done, but for any observed rally length. This makes it possible to appreciate the frequency of quite long rallies; ii) our analyses are based on a very large sample, around 500,000 points for men and around 250,000 points for women. This is by far the largest number of points considered in the literature. As a consequence, results should be very stable also for rally lengths that are not too short; iii) separately for men and women and for each surface, we look for the best statistical distribution for the rally length. In particular, we consider the quasi-Poisson model, the Geometric distribution and two variants of the Poisson and Geometric distributions, namely the zero-modified and the zero-one-modified versions, specifically built to produce more accurate estimates of the frequencies of rallies of length zero and one; iv) for the same distributions we consider versions with varying parameters, where parameters depend on exogenous variables. This, in turn, allows us to study which variables significantly affect the rally length. An interesting result is that players’ height is particularly relevant for the rally length distribution. To the best of our knowledge, this kind of study has never been done before.
While many studies assessed the importance of players’ heights to explain the serve strength (Vaverka and Cernosek, 2013; Pascual, 2023), to predict match outcomes (Bieniek and Kwater, 2015; Gao and Kowalczyk, 2021) or within the betting context (Candila and Palazzo, 2020), none has connected the height with rally lengths.

In our analysis, we focus on parametric distributions, mainly because it is much more complex to generate data from a nonparametric distribution, while anyone can easily generate data from a parametric distribution as soon as (estimated) parameters are known. A parametric distribution is particularly useful when the rally length is used in a simulation context, as in Kovalchik and Ingram (2018) or in Lisi and Grigoletto (2021), who used it to simulate the duration of professional tennis matches. In addition, the parametric approach allows a comparison in terms of parameters’ values and is less sensitive to the presence of several zero frequencies, as observed in the empirical distribution.

The rest of the paper is organized as follows. In Section 2 the literature on rally length distributions is reviewed. In Section 3 we introduce the dataset and provide descriptive analyses. Section 4 describes some probabilistic models for the rally length. Estimation results are discussed in Section 5, while the comparison among competing models is performed in Section 6. Section 7 concludes.

2 Literature review

In the current literature, information on rally length distribution is quite incomplete, fragmented and non-homogeneous.

Fernandez-Fernandez et al. (2008) analyzed eight well-trained female tennis players, six of whom were ranked between 300 and 800 in the Women’s Tennis Association (WTA) singles ranking (one player was the current European Junior Champion) and, for outdoor clay courts, reported a mean rally length of 2.5 ± 1.6 shots per rally.

In a four-set Davis Cup match, used as a case study, Gomes et al. (2011) found that the number of strokes per rally decreases during the match.

Carboch et al. (2019) analyzed the rally pace characteristics and the frequency of rally shots in 7 male (1738 points) and 23 female (2926 points) matches at the Australian Open 2017 and provided a graphical representation of the distribution of rally length for men and for women up to 20+ shots1. They found that the frequency of rally shots was similar for the two genders. In the whole match, the rally finished within the first four shots in 59% (men) and in 62% (women) of cases; within 5–8 shots in 27% (men) and 27% (women) of cases; 9 and more shots were required in 14% (men) and 11% (women) of cases.

In a paper focusing on how the use of new balls affects the match characteristics and the frequency of rally shots Carboch et al. (2020) provided observed frequencies of rally length up to 13 shots. However, their results are based on a limited number of matches: 23 female matches played at the Australian Open (1141 points) and 24 male matches played at the Australian Open (699 points), French Open (838 points) and Wimbledon (537 points) in 2017.

Mlakara and Kovalchik (2020) provided a graphical representation of the rally length based on 66 male matches (8026 points) and 64 female matches (4834 points) played during the 2017 Australian Open tournament. However, since they were interested in analyzing time pressure rallies, they included only points longer than 2 shots.

In a study aimed at establishing the prevalence and importance of individual rally lengths within points of 0, 1, 2, 3 and 4 shots in terms of winning elite grass court tennis matches, Fitzpatrick et al. (2021) considered data from 211 male and 209 female Wimbledon singles matches between 2015 and 2017. Their results revealed an underlying prevalence of short points (compared to medium-length and long points) on grass courts for both genders, with 66% (for women) and 72% (for men) of all points played at Wimbledon between 2015 and 2017 ending in fewer than 5 shots. Based on these data, they also provided the mean percentage of points played per match with rally lengths of 0, 1, 2, 3 and 4 shots, both for men and for women.

In his blog, Ingram (2021) studied how the average rally length by surface changed over time in male tennis and showed that, from 1970 to 2020, the average length tended to become more homogeneous across surfaces.

On the website tennisabstract.com, the Match Charting Project provides the average rally length for a number of players, as well as statistics for rally length classified within the categories 1–3 shots, 4–6 shots, 7–9 shots and 10+ shots. Other websites show classifications based on slightly different categories. Further information about rally length frequencies can be found on specialized websites, such as Stats on the T (on-the-t.com), which gives the frequency of rallies longer than four shots for several professional players and the server win percentage by rally length.

To determine a reasonable distribution of the shots per point, Kovalchik and Ingram (2018) examined the relationship between the number of shots per rally, the service bonus and malus2 and the surface of the match, using data from 1582 male matches and 966 female matches. They suggested that the expected shot count and variance could be accurately approximated with a quasi-Poisson distribution conditional on the service bonus. This is the only work that attempted to pinpoint a statistical distribution for the rally length, even if the authors did not give any detail on how they found it.

3 The dataset

The analyses are based on data from the Match Charting Project (MCP), a crowdsourced effort to track shot-by-shot data in professional tennis, created by Jeff Sackmann and available on GitHub3. The rally length of each point is not directly available; it was extracted from the information included in the dataset using ad hoc code written in R. In this way we were able to obtain the rally lengths for 5,751 male and 3,431 female professional matches played since 2000. This allowed us to analyze the rally lengths of 503,946 points played in the male circuit and 247,392 points played in the female circuit. A detailed description of the sample sizes by surface and gender is given in Table 1. These numbers are considerably higher than those considered in the works cited above. Such a large sample is important for obtaining good estimates of the frequencies of rare rally lengths and should ensure the reliability of our analyses for each single surface.

Table 1

Matches and points sample sizes for men and women and for different surfaces

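| Surface | Men: N. of matches | Men: N. of points | Women: N. of matches | Women: N. of points |
|---|---|---|---|---|
| Grass | 710 | 70,333 | 912 | 65,705 |
| Hard | 3,332 | 290,340 | 1,833 | 130,235 |
| Clay | 1,709 | 142,253 | 686 | 51,449 |
| Total | 5,751 | 503,946 | 3,431 | 247,389 |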

Note that, since the MCP is a crowdsourced project, it does not contain all the matches played in a given period.

3.1 Descriptive analyses

The definition of rally length is not uniform across literature and blogs, depending on whether serve counts as a shot or not. In this work we use the definition given in the MCP: the serve counts as a shot, but errors do not. Thus, a double fault is 0 shots, and an ace or unreturned serve is 1. A rally with a serve, three additional shots and an error on an attempted fifth shot counts as 4.
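The counting rule above can be illustrated with a minimal sketch. The function name and its two arguments are hypothetical and are not part of the MCP data or code; the sketch only assumes that we know how many strokes were attempted (the serve counted once) and whether the last attempted stroke was an error.

```python
def rally_length(shots_attempted: int, ended_in_error: bool) -> int:
    """Rally length under the convention described above.

    shots_attempted counts every attempted stroke, with the serve counted
    once (first and second serve together). ended_in_error is True when
    the last attempted stroke was an error, double faults included.
    Both arguments are illustrative assumptions, not MCP fields.
    """
    if shots_attempted < 1:
        raise ValueError("a point always starts with a serve")
    # The serve counts as a shot, but an error does not.
    return shots_attempted - 1 if ended_in_error else shots_attempted


examples = {
    "double fault": rally_length(1, True),
    "ace": rally_length(1, False),
    "missed return of serve": rally_length(2, True),
    "serve, three shots, error on the fifth": rally_length(5, True),
}
print(examples)
```

The four cases reproduce the values given in the text: 0 for a double fault, 1 for an ace or a serve whose return is missed, and 4 for the last example.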

Figure 1 shows the empirical rally distribution for men on each surface up to 25 shots. The absolute frequencies for the whole distribution are listed in Table 14 of the Appendix4, while Table 2 provides some descriptive statistics.

Fig. 1

Men: Rally distribution for the first 20 shots for clay, hard and grass surfaces.

Table 2

Men: Descriptive statistics for rally distribution for each surface. SD=standard deviation; S=Skewness coefficient; K=Kurtosis coefficient; qα=α-th quantile

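| Surface | Mean | SD | Mode | S | K | q2.5 | q5 | q10 | q25 | q50 | q75 | q90 | q95 | q97.5 | Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Grass | 3.2 | 3.1 | 1 | 2.1 | 8.6 | 0 | 1 | 1 | 1 | 2 | 4 | 7 | 10 | 12 | 48 |
| Hard | 3.8 | 3.8 | 1 | 1.9 | 7.4 | 0 | 1 | 1 | 1 | 2 | 5 | 9 | 12 | 15 | 59 |
| Clay | 4.3 | 4.1 | 1 | 1.8 | 7.0 | 0 | 1 | 1 | 1 | 3 | 6 | 10 | 13 | 15 | 83 |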

Double faults occur around 3.5% of the time on clay and around 3.9% of the time on hard and grass courts, suggesting that, on the whole, there is no surface-specific strategy involving double faults apart from, perhaps, taking slightly greater risks on grass and hard surfaces. The largest differences among surfaces arise for rallies of just one shot, which account for 24.0% of points on clay, 30.9% on hard surfaces and 35.6% on grass. For more than one shot, differences are less pronounced. For rally lengths greater than four, the observed frequency on grass is always smaller than on clay and hard surfaces. This confirms that, on grass, players try to close the point faster. Conversely, from four shots onwards, the highest frequencies are those for clay. On all surfaces the mode of the rally length is 1, while the median is 3 on clay and 2 on grass and hard courts. It is also interesting to note that rallies lasting more than 15 shots occur 2.6% of the time on clay, 2.3% on hard courts and only 1.1% on grass. Although low, these frequencies are not completely negligible, contrary to what several categorizations assume. In our dataset, for men, the longest rallies are 83 shots on clay, 59 on hard5 and 48 on grass.

Analogously, Fig. 2 shows the rally distribution for women on each surface up to 25 shots. Table 3 gives some descriptive statistics for the whole distribution, whose absolute frequencies are listed in Table 15 of the Appendix.

Fig. 2

Women: Rally distribution for the first 20 shots for clay, hard and grass surfaces.


In female matches, double faults occur around 5.0% of the time regardless of the surface, a little more often than for men. Although the summary statistics are quite similar to those for men, the histogram in Fig. 2 shows overall less pronounced differences among surfaces than for men. Also, very long rallies are less frequent than in male matches: for instance, rallies of at least 18 shots occur 0.5% of the time on hard, 0.27% on grass and 0.65% on clay for women, against the corresponding 1.1%, 0.5% and 1.2% for men. In our female dataset, the longest rallies are 48 shots on hard courts and clay and 34 on grass.

It is however curious that the longest rally in professional tennis was played by two women. During the 1984 Virginia Slims tournament, the tennis players Vicki Nelson and Jean Hepner played a point hitting 643 shots over 29 minutes.

Table 3

Women: Descriptive statistics for rally distribution for each surface. SD=standard deviation; S=Skewness coefficient; K=Kurtosis coefficient; qα=α-th quantile

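| Surface | Mean | SD | Mode | S | K | q2.5 | q5 | q10 | q25 | q50 | q75 | q90 | q95 | q97.5 | Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Grass | 3.5 | 3.1 | 1 | 1.5 | 5.5 | 0 | 1 | 1 | 1 | 3 | 5 | 8 | 10 | 12 | 34 |
| Hard | 3.9 | 3.5 | 1 | 1.6 | 5.8 | 0 | 0 | 1 | 1 | 3 | 5 | 9 | 11 | 13 | 48 |
| Clay | 4.2 | 3.6 | 1 | 1.6 | 6.4 | 0 | 1 | 1 | 1 | 3 | 6 | 9 | 12 | 14 | 48 |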

4 Probabilistic models

Using the previously described dataset, this section aims at finding probabilistic models able to suitably represent the rally length distribution on different surfaces, both for male and female professional players. Note that, while some authors consider only strictly positive rally lengths (Kovalchik and Ingram, 2018), in this work we try to model the whole distribution, including the case of 0 length, i.e. double faults.

This is a challenging task, since empirical rally length distributions exhibit over-dispersion, as well as fewer zeros and more ones than would be allowed, for example, by the Poisson model. The same issues arise when fitting a Geometric distribution.

Specific attention must therefore be devoted to the frequencies of zeros and ones. The need to modify a discrete distribution in order to better model the count of zeros is often encountered in the literature. Zero-inflated (Lambert, 1992) and hurdle models (Mullahy, 1986; Heilbron, 1994) were proposed to improve the fit of, e.g., Poisson, Geometric or negative binomial count models which, in their regular versions, were unable to yield realistic zero counts. Likewise, the literature contains analyses in which discrete distributions are modified for both zero and one counts (Qi et al., 2019; Mohammadi et al., 2021). Below, when using these kinds of distributions, we will refer to them as zero-modified or zero-one-modified. Most properties of a zero-(one-)modified distribution follow easily from its unmodified counterpart.

In the next two subsections, we first consider unconditional models, which fit some known distribution to the data. In particular, we consider the quasi-Poisson model and the zero-modified versions of the Poisson and Geometric distributions, to account for deflated zeros. Moreover, to improve the fit, for both distributions we propose further variants that we call zero-one-modified. The zero-one-modified Poisson distribution and the zero-one-modified Geometric distribution are built to jointly account for deflated zeros and inflated ones. In all these cases, the goal is to estimate the models’ parameters, assumed to be constant, and to find the distribution which best fits the data.

Secondly, in order to further improve the fit, for the quasi-Poisson model and the zero-one-modified Poisson and Geometric distributions, parameters are allowed to depend on some exogenous variables. This also makes it possible to analyze which variables significantly affect the rally length.

4.1 Unconditional models

Since rally lengths are count data taking discrete, non negative values occurring independently, we may think of modeling them by means of a Poisson distribution or a Geometric distribution.

However, in a Poisson distribution mean and variance coincide. In our case, instead, this assumption is clearly violated: for instance, for the male matches on hard surfaces, the rally’s mean is 3.8 while, due to a very long right tail, the rally’s variance is 14.56, and similar results also hold on grass and clay.

A possible solution to handle the over-dispersion is to refer to the quasi-Poisson model. This model, for a variable Y, assumes that E(Y) = λ and Var(Y) = φ · λ, where the dispersion parameter φ is unrestricted and is estimated from the data. The quasi-Poisson model is not a distribution but, rather, a model belonging to the family of generalized linear models (see Nelder and Wedderburn, 1972 and McCullagh and Nelder, 1989) with link function defined as

$$\log(\lambda) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_q X_q, \tag{1}$$
where λ is the mean of the response variable, and X1, …, Xq are suitable regressors. The unconditional version of model (1) has no explanatory variables and, thus, $\hat{\lambda} = \exp(\hat{\beta}_0)$.
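As a rough check of the size of the over-dispersion, note that in a model with only an intercept the usual Pearson-based estimate of φ reduces to the ratio between the sample variance and the sample mean. This is a standard property of the estimator, given here as an illustration and not necessarily the computation performed by the authors. With the mean and variance quoted above for men on hard courts:

$$\hat{\varphi} = \frac{1}{N-1}\sum_{i=1}^{N}\frac{(y_i - \bar{y})^2}{\bar{y}} = \frac{s^2}{\bar{y}} \approx \frac{14.56}{3.8} \approx 3.8 .$$

A value close to 1 would be compatible with a Poisson distribution; a value close to 4 indicates strong over-dispersion, in line with the dispersion estimates reported for hard courts in Table 5.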

To face zero-deflation and one-inflation, we estimate zero-modified and zero-one-modified Poisson and Geometric distributions.

In detail, the zero-modified Poisson (zmPois) distribution is a discrete mixture between a degenerate distribution at zero and a standard Poisson. If Y ∼ zmPois (p0, λ) its probability mass function is given by:

$$P(Y = y) = \begin{cases} p_0, & y = 0 \\ \dfrac{1 - p_0}{1 - e^{-\lambda}}\, f(y), & y = 1, 2, \dots \end{cases} \tag{2}$$
with λ > 0, 0 ≤ p0 ≤ 1 and f(y) being the probability mass function of a Poisson distribution with parameter λ.

The probability mass function of a r.v. Y having zero-modified Geometric (zmGeom) distribution, Y ∼ zmGeom (p0, p), is given by

$$P(Y = y) = \begin{cases} p_0, & y = 0 \\ \dfrac{1 - p_0}{1 - p}\, f(y), & y = 1, 2, \dots \end{cases} \tag{3}$$
with 0 ≤ p0 ≤ 1, 0 < p < 1 and f(y) being the probability mass function of a Geometric distribution with parameter p.

Generalizing the zero-modified distributions we obtain the zero-one-modified distributions, which are discrete mixtures between two degenerate distributions at zero and one and a standard distribution f (y). For example, the probability mass function of a r.v. Y having zero-one-modified Poisson (zomPois) distribution, Y ∼ zomPois (p0, p1, λ) is given by

$$P(Y = y) = \begin{cases} p_0, & y = 0 \\ p_1, & y = 1 \\ \dfrac{1 - p_0 - p_1}{1 - e^{-\lambda}(1 + \lambda)}\, f(y), & y = 2, 3, \dots \end{cases} \tag{4}$$
with 0 ≤ p0 ≤ 1, 0 ≤ p1 ≤ 1, λ > 0 and f(y) being the probability mass function of a Poisson distribution with parameter λ.

Likewise, if Y is a zero-one-modified Geometric (zomGeom) distribution, Y ∼ zomGeom (p0, p1, p) its distribution is

$$P(Y = y) = \begin{cases} p_0, & y = 0 \\ p_1, & y = 1 \\ \dfrac{1 - p_0 - p_1}{(1 - p)^2}\, f(y), & y = 2, 3, \dots \end{cases} \tag{5}$$
with 0 ≤ p0 ≤ 1, 0 ≤ p1 ≤ 1, 0 < p < 1 and f(y) being the probability mass function of the Geometric distribution with parameter p.
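The zero-one-modified probability mass functions in equations (4) and (5) can be sketched in a few lines of Python. The sketch assumes the Geometric distribution is parameterized on y = 0, 1, 2, … with f(y) = p(1 - p)^y, which is the parameterization implied by the normalizing constants (1 - p) and (1 - p)^2 above. Function names and parameter values are illustrative only.

```python
import math


def geom_pmf(y: int, p: float) -> float:
    """Geometric pmf on y = 0, 1, 2, ...: f(y) = p * (1 - p)**y (assumed form)."""
    return p * (1.0 - p) ** y


def pois_pmf(y: int, lam: float) -> float:
    """Poisson pmf, computed on the log scale to avoid overflow for large y."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def zom_geom_pmf(y: int, p0: float, p1: float, p: float) -> float:
    """Zero-one-modified Geometric pmf, equation (5)."""
    if y == 0:
        return p0
    if y == 1:
        return p1
    return (1.0 - p0 - p1) / (1.0 - p) ** 2 * geom_pmf(y, p)


def zom_pois_pmf(y: int, p0: float, p1: float, lam: float) -> float:
    """Zero-one-modified Poisson pmf, equation (4)."""
    if y == 0:
        return p0
    if y == 1:
        return p1
    tail_mass = 1.0 - math.exp(-lam) * (1.0 + lam)  # P(Poisson >= 2)
    return (1.0 - p0 - p1) / tail_mass * pois_pmf(y, lam)


# Hypothetical parameter values, chosen only for illustration.
total_geom = sum(zom_geom_pmf(y, p0=0.04, p1=0.35, p=0.25) for y in range(400))
total_pois = sum(zom_pois_pmf(y, p0=0.04, p1=0.35, lam=3.0) for y in range(400))
print(total_geom, total_pois)  # both sums should be numerically equal to 1
```

The final check confirms that the rescaling of f(y) on y ≥ 2 makes each modified distribution sum to one.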

Parameters p0, p1, p and λ (depending on which distribution is used) can be estimated by maximum likelihood.
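For the unconditional zomGeom model, maximum likelihood does not require numerical optimization. Under equation (5), and again assuming f(y) = p(1 - p)^y on y = 0, 1, 2, …, the likelihood factorizes into a part involving only (p0, p1) and a part in which Y - 2, given Y ≥ 2, is Geometric with parameter p. The estimates of p0 and p1 are therefore the observed fractions of zeros and ones, and the estimate of p is 1 / (1 + mean of Y - 2 over the points with Y ≥ 2). The sketch below is an illustration of this derivation on simulated data, not the authors' R code. All names and parameter values are hypothetical.

```python
import math
import random


def sample_zom_geom(n, p0, p1, p, rng):
    """Draw n values from the zero-one-modified Geometric distribution."""
    draws = []
    for _ in range(n):
        u = rng.random()
        if u < p0:
            draws.append(0)
        elif u < p0 + p1:
            draws.append(1)
        else:
            # Given Y >= 2, Y - 2 is Geometric(p) on 0, 1, 2, ...;
            # sample it by inverse transform with v in (0, 1].
            v = 1.0 - rng.random()
            draws.append(2 + int(math.log(v) / math.log(1.0 - p)))
    return draws


def fit_zom_geom(y):
    """Closed-form maximum likelihood estimates (p0, p1, p)."""
    tail = [v - 2 for v in y if v >= 2]
    if not tail:
        raise ValueError("need at least one observation with y >= 2")
    n = len(y)
    p0_hat = sum(1 for v in y if v == 0) / n
    p1_hat = sum(1 for v in y if v == 1) / n
    p_hat = len(tail) / (len(tail) + sum(tail))  # 1 / (1 + mean of tail)
    return p0_hat, p1_hat, p_hat


rng = random.Random(0)  # fixed seed so the sketch is reproducible
simulated = sample_zom_geom(20000, p0=0.04, p1=0.35, p=0.25, rng=rng)
p0_hat, p1_hat, p_hat = fit_zom_geom(simulated)
print(round(p0_hat, 3), round(p1_hat, 3), round(p_hat, 3))
```

With a sample of this size the three estimates should fall close to the values used in the simulation. The same sampler shows how easily rally lengths can be generated once the parameters are known.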

4.2 Conditional models

To improve the distribution’s fitting, and in agreement with the approach followed by Kovalchik and Ingram (2018), in this section we allow the distribution’s parameters to be non-constant across the matches. To achieve this goal, we write the distribution’s parameters as a function of some variables describing the matches’ features which, possibly, affect the rally length.

As the quasi-Poisson model belongs to the GLM family, making the parameter λ depend on exogenous variables is straightforward: it is enough to include the regressors in equation (1). For instance, Kovalchik and Ingram (2018) considered a quasi-Poisson model where λ was written as a function of just one variable X, the sum of the probabilities that each of the two players wins the point at serve. In particular, for their data, they found $\hat{\lambda} = 2.89 - 1 \cdot X$ and $\hat{\varphi} = 3.3$ for men, and $\hat{\lambda} = 2.33 - 0.7 \cdot X$ and $\hat{\varphi} = 2.7$ for women.

For the zero-one-modified Geometric and Poisson distributions we allow p(0), the probability of zero-length rallies, i.e. of double faults, to depend on the surface but not on other variables. The reason is our assumption that, on average, players try to minimize the number of double faults in any situation, but they may accept some risk on surfaces that reward good serves.

For the zomPois and zomGeom models, we make parameters match-dependent by writing them as functions of exogenous variables. In the case of the zomGeom model, and for the i-th point, since the parameters p1,i and pi represent probabilities, we write their logit transformation as a linear function of q regressors:

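$$\log\left(\frac{p_{1,i}}{1 - p_{1,i}}\right) = \sum_{j=0}^{q} \gamma_j X_{i,j} \quad \text{and} \quad \log\left(\frac{p_i}{1 - p_i}\right) = \sum_{j=0}^{q} \beta_j X_{i,j} \tag{6}$$
with Xi,0 = 1, so that for the i-th point we can write

$$p_{1,i} = \frac{\exp\left(\sum_{j=0}^{q} \gamma_j X_{i,j}\right)}{1 + \exp\left(\sum_{j=0}^{q} \gamma_j X_{i,j}\right)} \quad \text{and} \quad p_i = \frac{\exp\left(\sum_{j=0}^{q} \beta_j X_{i,j}\right)}{1 + \exp\left(\sum_{j=0}^{q} \beta_j X_{i,j}\right)} \tag{7}$$
where γj and βj, j = 0, 1, . . . , q are unknown parameters to be estimated and X1, …, Xq are known explanatory variables. This representation assures that p1,i and pi belong to the (0, 1) interval.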

When considering a zomPois model, parameter p1,i has the same representation while, since λi must be positive, we write

$$\lambda_i = \exp\left(\sum_{j=0}^{q} \delta_j X_{i,j}\right) \tag{8}$$
where δj, j = 0, 1, . . . , q, are parameters to be estimated.

In this paper, all parameters are estimated by maximum likelihood.

5 Estimation results

In this section the previously described models are applied to our dataset. For each surface and for both genders, the data consist of the sequence yi, i = 1, . . . , N, of rally lengths, where i indexes the points.

The set of explanatory variables Xi,j considered in this work is:

- X1 = Pa + Pb, where Px is the probability that player x wins the point at serve;

- X2 = |Pa - Pb|;

- X3 = log(Ha + Hb), where Hx is the height of player x in cm;

- X4 = |Ha - Hb|.
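The four regressors listed above are simple transformations of two serve probabilities and two heights. A minimal sketch follows. The function name and the input values are hypothetical and serve only to show the computation.

```python
import math


def match_regressors(p_a: float, p_b: float, h_a: float, h_b: float) -> dict:
    """Compute X1..X4 from the serve-point probabilities and heights (cm)."""
    return {
        "X1": p_a + p_b,            # overall serve strength of the two players
        "X2": abs(p_a - p_b),       # difference in serve strength
        "X3": math.log(h_a + h_b),  # log of the sum of the heights
        "X4": abs(h_a - h_b),       # absolute difference of the heights
    }


# Hypothetical match: two players winning 68% and 60% of points on serve,
# with heights of 188 cm and 185 cm.
x = match_regressors(p_a=0.68, p_b=0.60, h_a=188, h_b=185)
print(x)
```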

In addition, as we consider different models for each surface, estimated parameters also implicitly depend on the surface. The first two variables X1 and X2 were also used by Kovalchik and Ingram (2018), while the (log) sum and the absolute difference of heights have never been considered before. The absolute difference and the sum of the probabilities of winning the point at serve are often used to describe the difference in quality and the overall quality of two players. Intuitively, one can expect that the higher the overall quality, the longer the rallies, because neither of the two players dominates the other. Conversely, the higher the difference in quality, the shorter the rallies, because the stronger player should manage to win the point quickly. Of course, all these considerations hold on average. The rationale for considering the players’ height as a driver of rally lengths is that a strong serve can give a player the upper hand after the serve and, thus, the opportunity to close the point. In turn, the service strength is favoured by the player’s height, as witnessed by the fact that great servers are usually quite tall players (Bieniek and Kwater, 2015).

Thus, the (absolute) difference in the players’ heights may affect rally lengths, especially on a fast surface. However, the difference in heights says nothing about the actual heights of the players, and this motivates the inclusion of the sum of the players’ heights as well. The logarithm of the sum of heights is used simply because it gives a better fit of the models to the data than the simple sum6. Note that even if X3 and X4 are both related to the players’ height, there are no collinearity issues because the two transformations make them only weakly correlated: for men their correlation is 0.295 while for women it is only 0.102.

We also considered the sum and the absolute difference of the ATP/WTA rankings, but they were never significant. In this work Pa and Pb are the fractions of points won at serve by each player in the match within which the i-th point was played. This explains the large variability of X1 and X2 shown in Table 4, which displays some descriptive statistics for these four variables in our dataset.

Table 4

Statistical features of regressor variables. They were computed over the whole dataset, without distinction on surface

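| Variables | Min. | Q(0.05) | Q(0.25) | Median | Q(0.75) | Q(0.95) | Max |
|---|---|---|---|---|---|---|---|
| X1 | 0.727 | 0.971 | 1.121 | 1.206 | 1.288 | 1.407 | 1.670 |
| X2 | 0.001 | 0.028 | 0.112 | 0.194 | 0.280 | 0.399 | 0.693 |
| exp(X3) | 348 | 365 | 372 | 374 | 378 | 390 | 411 |
| X4 | 0 | 0 | 0 | 2 | 7 | 15 | 41 |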

The estimated parameters and related p-values for quasi-Poisson conditional models, for men, are listed in Table 5, for two different specifications: the one suggested by Kovalchik and Ingram (2018), including only X1 = Pa + Pb and denoted by QPKI, and the specification including all the X variables, denoted by QP. The estimates of the same parameters for women are listed in Table 6.

Table 5

Men: Estimated parameters and, between parentheses, the corresponding p-value for conditional quasi-Poisson models. QPKI denotes the Kovalchik and Ingram (2018) specification

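| Variables | Grass: QP | Grass: QPKI | Hard: QP | Hard: QPKI | Clay: QP | Clay: QPKI |
|---|---|---|---|---|---|---|
| β0 | 16.806 | 1.257 | 13.402 | 1.531 | 18.336 | 1.627 |
| | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| β1 | 0.0032 | -0.084 | -0.053 | -0.157 | -0.070 | -0.136 |
| | (0.364) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| β2 | -0.126 | | -0.156 | | -0.123 | |
| | (0.002) | | (<0.001) | | (<0.001) | |
| β3 | -2.645 | | -2.017 | | -2.829 | |
| | (<0.001) | | (<0.001) | | (<0.001) | |
| β4 | 0.003 | | -0.001 | | -0.002 | |
| | (<0.001) | | (0.006) | | (<0.001) | |
| φ | 3.0 | 3.0 | 3.8 | 3.8 | 3.8 | 3.8 |
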
Table 6

Women: Estimated parameters and, between parentheses, the corresponding p-value for conditional quasi-Poisson models. QPKI denotes the Kovalchik and Ingram (2018) specification

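| Variables | Grass: QP | Grass: QPKI | Hard: QP | Hard: QPKI | Clay: QP | Clay: QPKI |
|---|---|---|---|---|---|---|
| β0 | 29.61 | 1.400 | 24.13 | 1.601 | 27.19 | 1.507 |
| | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| β1 | -0.084 | -0.123 | -0.130 | -0.213 | 0.156 | -0.063 |
| | (0.001) | (<0.001) | (<0.001) | (<0.001) | (0.011) | (0.011) |
| β2 | 0.034 | | -0.070 | | -0.463 | |
| | (0.263) | | (0.001) | | (<0.001) | |
| β3 | -4.821 | | -3.86 | | -4.415 | |
| | (<0.001) | | (<0.001) | | (<0.001) | |
| β4 | -0.0032 | | 0.002 | | -0.003 | |
| | (0.001) | | (<0.001) | | (<0.001) | |
| φ | 2.7 | 2.7 | 3.1 | 3.2 | 3.2 | 3.2 |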

In the QP specification, all four variables are significant, with the exception of X1 = Pa + Pb for men on grass and of X2 = |Pa - Pb| for women on grass. This gives a first indication of the relevance of height for the rally length. For QPKI models, parameters have been re-estimated on our dataset. Both for men and women, while the dispersion parameters φ are quite similar to those found by Kovalchik and Ingram (2018), the estimated β1 parameters, which define the dependence of λ on Pa + Pb, are considerably different, although they agree in sign.
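The size of the height effect in Tables 5 and 6 can be read directly from model (1). Since X3 = log(Ha + Hb) enters a model with logarithmic link, β3 acts as an elasticity of the mean rally length with respect to the sum of the heights, with the other regressors held fixed. The following worked example is only an illustration: it uses the QP estimate for men on grass from Table 5 and, as an assumption, the median and the third quartile of Ha + Hb from Table 4, which refer to the whole dataset.

$$\lambda \propto (H_a + H_b)^{\beta_3}, \qquad \frac{\lambda(H_a + H_b = 378)}{\lambda(H_a + H_b = 374)} = \left(\frac{378}{374}\right)^{-2.645} \approx 0.972 .$$

Under these assumptions, moving the sum of the heights from 374 cm to 378 cm lowers the expected rally length by roughly 2.8%.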

Estimation results for the zero-one-modified Geometric (zomGeom) and Poisson (zomPois) models are listed in Table 7, for men, and in Table 8, for women. As shown in equations (4) and (5), both models have three parameters: p0, p1 and λ for zomPois, and p0, p1 and p for zomGeom. Parameter p0 is assumed constant, while the other ones depend on regressors.

Table 7

Men: Estimated parameters and, between parentheses, corresponding p-value for conditional zero-one-modified Geometric and Poisson models

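| pi | Param | Geometric: Grass | Geometric: Hard | Geometric: Clay | Poisson: Grass | Poisson: Hard | Poisson: Clay |
|---|---|---|---|---|---|---|---|
| p1 | γ0 | -32.47 | -33.50 | -39.72 | -33.01 | -33.51 | -39.72 |
| | | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| p1 | γ1 | 0.298 | 0.246 | 0.140 | 0.302 | 0.245 | 0.140 |
| | | (0.005) | (<0.001) | (0.055) | (0.004) | (<0.001) | (0.057) |
| p1 | γ2 | 0.115 | 0.210 | 0.372 | 0.110 | 0.210 | 0.372 |
| | | (0.339) | (<0.001) | (<0.001) | (0.359) | (<0.001) | (<0.001) |
| p1 | γ3 | 5.31 | 5.458 | 6.470 | 5.405 | 5.459 | 6.469 |
| | | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| p1 | γ4 | -0.004 | -0.0001 | 0.005 | -0.004 | -0.0001 | 0.005 |
| | | (0.021) | (0.986) | (<0.001) | (<0.017) | (0.982) | (<0.001) |
| p/λ | β0 | -13.51 | -8.156 | -15.58 | 11.425 | 7.43 | 12.80 |
| | | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| p/λ | β1 | -0.123 | 0.003 | 0.114 | 0.091 | 0.001 | -0.080 |
| | | (0.108) | (0.923) | (0.006) | (0.008) | (0.912) | (<0.001) |
| p/λ | β2 | 0.138 | 0.194 | 0.093 | -0.099 | -0.139 | -0.066 |
| | | (0.113) | (<0.001) | (0.055) | (0.010) | (<0.001) | (<0.001) |
| p/λ | β3 | 2.136 | 1.161 | 2.385 | -1.694 | -0.969 | -1.853 |
| | | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| p/λ | β4 | -0.002 | 0.001 | 0.003 | 0.002 | 0.001 | -0.002 |
| | | (0.057) | (0.006) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
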
Table 8

Women: Estimated parameters and, between parentheses, corresponding p-value for conditional zero-one-modified Geometric and Poisson models

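| pi | Param | Geometric: Grass | Geometric: Hard | Geometric: Clay | Poisson: Grass | Poisson: Hard | Poisson: Clay |
|---|---|---|---|---|---|---|---|
| p1 | γ0 | -47.86 | -35.55 | -50.39 | -44.06 | -37.57 | -50.27 |
| | | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| p1 | γ1 | 0.212 | 0.171 | -0.162 | 0.214 | 0.180 | -0.164 |
| | | (0.027) | (0.007) | (0.172) | (0.002) | (0.005) | (0.169) |
| p1 | γ2 | 0.039 | 0.284 | 0.921 | 0.045 | 0.270 | 0.926 |
| | | (0.723) | (<0.001) | (<0.001) | (0.679) | (<0.001) | (<0.00) |
| p1 | γ3 | 7.950 | 5.840 | 8.382 | 7.302 | 6.183 | 8.363 |
| | | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| p1 | γ4 | 0.004 | -0.001 | 0.001 | 0.005 | -0.001 | -0.001 |
| | | (0.066) | (0.439) | (0.785) | (0.046) | (0.434) | (0.778) |
| p/λ | β0 | -32.00 | -25.54 | -28.25 | 25.43 | 20.23 | 23.14 |
| | | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| p/λ | β1 | 0.091 | 0.158 | -0.214 | -0.052 | -0.112 | 0.151 |
| | | (0.138) | (<0.001) | (0.001) | (0.052) | (<0.001) | (<0.001) |
| p/λ | β2 | -0.072 | 0.037 | 0.470 | 0.044 | -0.024 | -0.334 |
| | | (0.300) | (0.425) | (<0.001) | (0.144) | (0.201) | (<0.001) |
| p/λ | β3 | 5.27 | 4.129 | 4.642 | 4.072 | -3.154 | -3.687 |
| | | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) | (<0.001) |
| p/λ | β4 | 0.004 | -0.001 | 0.005 | -0.003 | (0.001) | 0.004 |
| | | (0.014) | (0.125) | (0.001) | (<0.001) | (0.010) | (<0.001) |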

Apart from the constant, the only variable that is always significant, for all parameters, models, surfaces and genders, is X3 = log (Ha + Hb). This is evidence that the (log) sum of the players’ heights is the most important variable for explaining the rally length.

Tables 7 and 8 show the estimation results for the full models, but the subsequent analyses have been performed after re-estimating the models with only the significant variables. To better appreciate the impact of each (significant) variable on the models’ parameters and, hence, on the probability distribution of the rally length, we can use equations (7) and (8) and observe how p1, p and λ change as a function of the estimated parameters and of the X variables.

To isolate the effect of a single variable Xj, we fix the values of all the other variables Xk (k ≠ j) at their sample average, while letting Xj vary between its sample minimum and maximum. For example, Fig. 3 shows the effect of the regressors on the parameters p1 and p of a zomGeom for men on grass. For p1, we can see that the variable with the largest impact is log (Ha + Hb) (panel in position (1,2)), which causes p1 to vary from 0.29 to 0.47, a range of 0.18. Pa + Pb also has a significant, although smaller, impact, leading p1 to vary within a range of 0.08 (panel in position (1,1)). Much less important is the role of the difference between the players’ heights (panel in position (2,1)). Moreover, while increasing log (Ha + Hb) leads to a higher probability of rallies of length 1, the opposite occurs for |Ha - Hb|. Indeed, a large difference in heights implies that one of the players is quite short, and this often reduces the serve power and, thus, the probability that the point ends in just one shot. For p the only significant variable is log (Ha + Hb), which stresses the importance of the players’ physical characteristics for the rally length. Figure 3 also makes clear that the sum of the heights has a smaller impact on p, which varies within a range of 0.06.
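The procedure just described can be sketched with equation (7) and the coefficients of p1 for the zomGeom model, men on grass, taken from Table 7. The sketch rests on two assumptions that differ from the analysis behind Fig. 3: it uses the full-model coefficients rather than those re-estimated with only the significant variables, and it fixes the other regressors at their whole-dataset medians from Table 4, since the sample averages are not reported. The resulting values are therefore indicative only. Function and variable names are illustrative.

```python
import math

# gamma_0 .. gamma_4 for p1, zomGeom, men on grass (Table 7, full model).
GAMMA = [-32.47, 0.298, 0.115, 5.31, -0.004]


def inv_logit(eta: float) -> float:
    """Inverse of the logit link used in equations (6) and (7)."""
    return 1.0 / (1.0 + math.exp(-eta))


def p1_given(x1: float, x2: float, height_sum_cm: float, x4: float,
             gamma=GAMMA) -> float:
    """Probability of a one-shot rally from equation (7)."""
    x = [1.0, x1, x2, math.log(height_sum_cm), x4]
    return inv_logit(sum(g * v for g, v in zip(gamma, x)))


# Assumption: other regressors held at their medians from Table 4.
FIXED = dict(x1=1.206, x2=0.194, x4=2)

# Let the sum of the heights vary between the minimum and maximum in Table 4.
curve = [(h, p1_given(height_sum_cm=h, **FIXED)) for h in range(348, 412, 9)]
for h, prob in curve:
    print(h, round(prob, 3))
```

Because the coefficient of log (Ha + Hb) is positive, the computed probability of a one-shot rally increases with the sum of the heights, which is the pattern described for panel (1,2) of Fig. 3.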

Fig. 3

Impact of the significant Xi variables on p1 and p on grass for men. Panel position (1,1): impact of Pa + Pb on p1; position (1,2): impact of log (Ha + Hb) on p1; position (2,1): impact of |Ha - Hb| on p1; position (2,2): impact of log (Ha + Hb) on p. For a better understanding, the tick labels refer to (Ha + Hb) instead of to log (Ha + Hb).


Figure 4 focuses on the impact of log (Ha + Hb) on p1 and p across different surfaces. We can see that, for a given sum of heights and average values of the other variables, the estimated distributions lead to the highest value of p1 on grass, followed by hard surfaces and by clay. This is not surprising, as rallies of length 1 are known to be more likely on fast surfaces than on slower ones. At the same time, we notice that the range of variation of p1 due to log (Ha + Hb) is nearly constant across surfaces: 0.18 on grass and hard and 0.17 on clay. This indicates that height is an important factor in determining the probability of a one-shot rally on all surfaces, even if this probability differs according to the surface.

Fig. 4

Impact of log (Ha + Hb) on p1 and p across different surfaces. First column: grass; second column: hard; third column: clay. For a better understanding, the tick labels refer to (Ha + Hb) instead of to log (Ha + Hb).


The impact of log (Ha + Hb) on p is less important but, on grass, this is the only significant variable. Again, the range of variation of p due to log (Ha + Hb) is not very different among surfaces: 0.06, 0.045 and 0.055 on grass, hard and clay, respectively.

Similar considerations also hold for the conditional zom-Poisson, but we have focused on the zom-Geometric because, as shown in the following section, it turns out to be the best-performing model.

6Comparisons

In this section we evaluate the ability of the estimated distributions to reproduce the observed ones. We compare the performances of the proposed distributions:

i) by summing the absolute differences between the observed ($P_i^{obs}$) and estimated ($P_i^{est}$) probability masses of a rally of length i:

(9) $\Delta_M = \sum_{i=0}^{M} \left| P_i^{obs} - P_i^{est} \right|$

where M is the maximum rally length considered.

ii) by applying the Kolmogorov-Smirnov test to assess the equality between the best distributions produced by our models and the empirical distributions.

For a better insight, when computing ΔM, we consider M = 10, 20 and the maximum observed length on each surface.
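As a minimal sketch of how the indicator in equation (9) can be computed for several values of M, consider the following. The function name `delta_m` and the two probability vectors are hypothetical; the numbers are invented for illustration and are not taken from the paper.

```python
from typing import Optional, Sequence


def delta_m(p_obs: Sequence[float], p_est: Sequence[float],
            m: Optional[int] = None) -> float:
    """Sum of |P_i^obs - P_i^est| for i = 0..m (equation 9).

    Index i of each sequence holds the probability of a rally of length i.
    If m is None, the sum runs up to the longest length present in either input.
    """
    longest = max(len(p_obs), len(p_est)) - 1
    m = longest if m is None else min(m, longest)
    total = 0.0
    for i in range(m + 1):
        obs = p_obs[i] if i < len(p_obs) else 0.0   # lengths not listed have mass 0
        est = p_est[i] if i < len(p_est) else 0.0
        total += abs(obs - est)
    return total


# Toy probabilities, invented for illustration only.
observed = [0.05, 0.40, 0.25, 0.15, 0.10, 0.05]
estimated = [0.05, 0.38, 0.26, 0.16, 0.09, 0.06]

for m in (2, 4, None):
    print(m, round(delta_m(observed, estimated, m), 3))
```

Because every term is non-negative, the indicator can only grow with M, which is why Δ10 ≤ Δ20 ≤ Δall is expected in each row of the comparison tables.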

Tables 9 (for men) and 10 (for women) list the ΔM indicators for the unconditional models. For both men and women, it is clear that Geometric models produce considerably better results than Poisson models, and that zero-one-modified models produce appreciably better results than zero-modified models. The only exception is clay for women, for which the indicators of the zmGeom and zomGeom models are quite similar. Within the class of unconditional models, we can thus conclude that the zero-one-modified Geometric distribution is the one leading to the best fit.

Table 9

Unconditional models for men: Values of indicators ΔM for each surface and for different fittings. QP=Quasi-Poisson, zm=zero modified, zom=zero-one-modified

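| Distribution (Men) | Grass Δ10 | Grass Δ20 | Grass Δall | Hard Δ10 | Hard Δ20 | Hard Δall | Clay Δ10 | Clay Δ20 | Clay Δall |
|---|---|---|---|---|---|---|---|---|---|
| QP | 0.543 | 0.556 | 0.558 | 0.500 | 0.518 | 0.521 | 0.487 | 0.524 | 0.527 |
| zmPois | 0.554 | 0.601 | 0.606 | 0.590 | 0.663 | 0.672 | 0.582 | 0.666 | 0.676 |
| zmGeom | 0.152 | 0.162 | 0.165 | 0.157 | 0.168 | 0.172 | 0.061 | 0.067 | 0.069 |
| zomPois | 0.260 | 0.297 | 0.301 | 0.335 | 0.384 | 0.393 | 0.366 | 0.423 | 0.433 |
| zomGeom | 0.032 | 0.037 | 0.039 | 0.027 | 0.032 | 0.034 | 0.022 | 0.025 | 0.026 |

Table 10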

Unconditional models for women: Values of indicators ΔM on each surface and for different fittings. QP=Quasi-Poisson, zm=zero modified, zom=zero-one-modified

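| Distribution (Women) | Grass Δ10 | Grass Δ20 | Grass Δall | Hard Δ10 | Hard Δ20 | Hard Δall | Clay Δ10 | Clay Δ20 | Clay Δall |
|---|---|---|---|---|---|---|---|---|---|
| QP | 0.488 | 0.542 | 0.544 | 0.547 | 0.616 | 0.621 | 0.525 | 0.599 | 0.604 |
| zmPois | 0.469 | 0.522 | 0.524 | 0.524 | 0.592 | 0.597 | 0.494 | 0.565 | 0.570 |
| zmGeom | 0.049 | 0.054 | 0.055 | 0.058 | 0.062 | 0.063 | 0.051 | 0.056 | 0.059 |
| zomPois | 0.262 | 0.306 | 0.308 | 0.300 | 0.351 | 0.355 | 0.306 | 0.358 | 0.363 |
| zomGeom | 0.022 | 0.026 | 0.027 | 0.023 | 0.026 | 0.027 | 0.049 | 0.054 | 0.056 |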

Tables 11 (for men) and 12 (for women) list the values of ΔM for conditional models. In this case we consider two versions of the quasi Poisson models: one using all regressors and one adopting the Kovalchik and Ingram’s specification, which only considers X1. For the modified Poisson and Geometric models we list results only for the zero-one-modified versions, as results in Tables 9 and 10 suggest that zero-modified versions produce worse fittings.

Table 11

Conditional models for men: Values of indicators ΔM on each surface and for different fittings. The letter V in the models’ name denotes varying parameters

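| Distribution (Men) | Grass Δ10 | Grass Δ20 | Grass Δall | Hard Δ10 | Hard Δ20 | Hard Δall | Clay Δ10 | Clay Δ20 | Clay Δall |
|---|---|---|---|---|---|---|---|---|---|
| QP-V | 0.360 | 0.369 | 0.372 | 0.319 | 0.329 | 0.333 | 0.246 | 0.252 | 0.255 |
| QPKI-V | 0.360 | 0.369 | 0.362 | 0.321 | 0.330 | 0.334 | 0.246 | 0.252 | 0.255 |
| zomPois-V | 0.248 | 0.287 | 0.291 | 0.249 | 0.289 | 0.296 | 0.361 | 0.419 | 0.427 |
| zomGeom-V | 0.025 | 0.031 | 0.033 | 0.023 | 0.028 | 0.031 | 0.020 | 0.024 | 0.025 |

Table 12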

Conditional models for women: Values of indicators ΔM on each surface and for different fittings. The letter V in the models’ name denotes varying parameters

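| Distribution (Women) | Grass Δ10 | Grass Δ20 | Grass Δall | Hard Δ10 | Hard Δ20 | Hard Δall | Clay Δ10 | Clay Δ20 | Clay Δall |
|---|---|---|---|---|---|---|---|---|---|
| QP-V | 0.243 | 0.252 | 0.253 | 0.232 | 0.238 | 0.240 | 0.168 | 0.173 | 0.174 |
| QPKI-V | 0.245 | 0.253 | 0.255 | 0.230 | 0.237 | 0.239 | 0.173 | 0.177 | 0.179 |
| zomPois-V | 0.250 | 0.294 | 0.296 | 0.289 | 0.342 | 0.346 | 0.368 | 0.418 | 0.423 |
| zomGeom-V | 0.033 | 0.038 | 0.039 | 0.026 | 0.031 | 0.032 | 0.051 | 0.057 | 0.059 |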

In general, and in terms of ΔM, conditional models provide better results than unconditional ones, except in the case of zomGeom for women, for which the unconditional model provides slightly better results. The two versions of the quasi-Poisson model show a strong reduction of ΔM and provide extremely similar results. For men, however, their performance is worse than that of both the zomPois and zomGeom models. For women, in contrast, they show values of ΔM lower than those of the zomPois model. As for the unconditional models, the zomGeom model is clearly the best one. For men, the conditional zomGeom models lead to an improvement ranging from 10% to 28% in terms of Δ10 with respect to the unconditional zomGeom models.
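The quoted range of improvement can be checked directly from the Δ10 entries for men in Tables 9 and 11. The short sketch below does so; the helper name `relative_gain` is hypothetical, and the choice of the conditional value as the denominator is an assumption, since the text does not state which baseline is used.

```python
# Δ10 values for men, copied from Table 9 (unconditional zomGeom)
# and Table 11 (conditional zomGeom-V).
delta10_unconditional = {"grass": 0.032, "hard": 0.027, "clay": 0.022}
delta10_conditional = {"grass": 0.025, "hard": 0.023, "clay": 0.020}


def relative_gain(unconditional: float, conditional: float) -> float:
    """Reduction in Δ10 as a fraction of the conditional value.

    Assumption: the conditional value is the denominator; the paper does not
    say which of the two values is taken as the baseline.
    """
    return (unconditional - conditional) / conditional


gains = {
    surface: relative_gain(delta10_unconditional[surface], delta10_conditional[surface])
    for surface in delta10_unconditional
}
for surface, gain in gains.items():
    print(f"{surface}: {gain:.0%}")
```

Under this assumption, and with the three-decimal table entries, the gains are about 28% on grass, 17% on hard courts and 10% on clay, which matches the stated 10% to 28% range. Dividing by the unconditional value instead would give somewhat smaller percentages.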

To assess the statistical equality between the model-implied distributions and the empirical ones, we now apply the well-known Kolmogorov-Smirnov test for goodness-of-fit. The test is applied to the distributions produced by the conditional zero-one-modified models, which are those leading to the best fit in terms of ΔM.

To be independent of the specific sample drawn, we apply the test as follows:

i) we generate 1000 iid samples of size n = 10000 from both the observed and the estimated distributions;

ii) for each pair of samples the two-sided Kolmogorov-Smirnov test is applied and the p-value is recorded;

iii) as a final measure of goodness-of-fit we consider the mean p-value over the 1000 simulations.
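The three steps above can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions rather than the authors' implementation: the function names `ks_two_sample` and `mean_ks_p_value` are hypothetical, the p-value comes from the asymptotic Kolmogorov series (a library routine may use a different approximation), and the two probability vectors are invented.

```python
import math
import random
from itertools import accumulate
from typing import Sequence


def ks_two_sample(x: Sequence[int], y: Sequence[int], support: int) -> float:
    """Two-sided two-sample KS test on integer data in 0..support-1.
    Returns the asymptotic p-value."""
    n, m = len(x), len(y)
    cx, cy = [0] * support, [0] * support
    for v in x:
        cx[v] += 1
    for v in y:
        cy[v] += 1
    # KS statistic: largest gap between the two empirical CDFs.
    d = max(abs(a / n - b / m) for a, b in zip(accumulate(cx), accumulate(cy)))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The limiting survival function is numerically 1 here,
        # and the alternating series below converges slowly near 0.
        return 1.0
    # Kolmogorov limit: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    q = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return min(1.0, max(0.0, q))


def mean_ks_p_value(p_obs: Sequence[float], p_est: Sequence[float],
                    n_sims: int = 1000, n: int = 10_000, seed: int = 0) -> float:
    """Steps i)-iii): simulate n_sims pairs of samples of size n and
    average the KS p-values."""
    rng = random.Random(seed)
    support = max(len(p_obs), len(p_est))
    values = range(support)
    obs_w = list(p_obs) + [0.0] * (support - len(p_obs))
    est_w = list(p_est) + [0.0] * (support - len(p_est))
    total = 0.0
    for _ in range(n_sims):
        x = rng.choices(values, weights=obs_w, k=n)   # sample from "observed"
        y = rng.choices(values, weights=est_w, k=n)   # sample from "estimated"
        total += ks_two_sample(x, y, support)
    return total / n_sims


# Toy distributions, invented for illustration only.
pmf_a = [0.05, 0.35, 0.20, 0.15, 0.10, 0.08, 0.07]
pmf_b = [0.05, 0.34, 0.21, 0.15, 0.10, 0.08, 0.07]
# Smaller n_sims and n than in the text, to keep the demo fast.
print(round(mean_ks_p_value(pmf_a, pmf_b, n_sims=50, n=2000), 3))
```

Rally lengths are discrete, and the Kolmogorov-Smirnov p-value is known to be conservative for discrete data, so the mean p-value from such a procedure is best read as a descriptive summary rather than an exact significance level.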

The results of this procedure are listed, for men and women and for the different surfaces, in Table 13. Apart from the case of women on clay, for which the mean p-value is borderline with respect to the usual 5% level, in all other cases the mean p-value is well above 5%, so the hypothesis that the observed and estimated distributions are equal cannot be rejected.

Table 13

Kolmogorov-Smirnov test: mean p-value over 1000 simulated samples from the observed and estimated distributions

| Mean p-value | Grass | Hard | Clay |
|---|---|---|---|
| Men | 0.297 | 0.302 | 0.416 |
| Women | 0.125 | 0.176 | 0.046 |

Finally, Fig. 5 shows the observed and estimated distributions of the first 25 rally lengths for men and women, and for each surface, when zomGeom models are used. We can see that the models describe quite well the very different probability levels of the first rally lengths, including the frequency of rallies of length zero.

Fig. 5

Conditional zomGeom model: observed and estimated distributions of the first 25 rally lengths on grass, hard and clay and for men (left column) and women (right column).


7Conclusions

In this work we have analyzed the rally length distributions for male and female professional tennis matches. Their characteristics have been studied separately on grass, hard and clay surfaces.

Our study differs from the (few) others available in the literature in the size of its sample, which gives quite reliable results. In addition, the rally length has not been categorized: each single value, up to the maximum observed, has been considered individually. The observed frequencies for each rally length are provided in the Appendix for possible future research.

We have focused on finding the statistical distribution most suitable to describe the observed frequencies. To this end, we have considered both unconditional and conditional models. For the latter, parameters were written as a function of other variables.

Our results point out that the statistical distribution which best fits the data is a conditional zero-one-modified Geometric distribution, whose parameters depend on the probabilities that players win a point at serve and on the players’ heights. The estimated distributions can be considered not significantly different from the observed ones. The results have also shown that the (log) sum of the players’ heights is the variable with the greatest impact on the rally length distribution.

In future research it will be interesting to analyze and compare the rally length distributions of individual players. This, in turn, could make it possible to cluster players according to the features of their rally length distributions and to characterize two opponents, possibly for each surface.

In addition, analysing player-specific rally length distributions using the proposed methodology may be useful for defining betting strategies. Following the approaches of Candila and Palazzo (2020) or Gao and Kowalczyk (2021), the information contained in the rally length distribution could be included among the wide variety of features that enter statistical or machine-learning models. For example, Candila and Palazzo (2020) consider some variables related to the fatigue accumulated by the players in their last matches in order to define a betting strategy. As the tendency to play long rallies is correlated with match duration and physical stress, the features of the distribution can provide other variables to include in the model. Likewise, Gao and Kowalczyk (2021) consider composite variables obtained by combining simple variables, e.g. the ratio between aces and double faults. In this case too, one can extract information from the rally length distribution by building suitable indicators. The skewness coefficient, or the ratio between the probability that a rally length is shorter than or equal to two and the probability that it is longer than two, are just a couple of possible indicators.
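As an illustration of the two indicators just mentioned, the sketch below computes the skewness coefficient and the short-to-long ratio from a vector of absolute frequencies. The function name `rally_indicators` is hypothetical, and the frequencies are the women/clay column of Table 15 as read from the Appendix, so they should be checked against the original table before any use.

```python
from typing import Sequence, Tuple


def rally_indicators(counts: Sequence[int], threshold: int = 2) -> Tuple[float, float]:
    """Return (skewness, P(L <= threshold) / P(L > threshold)),
    where counts[i] is the absolute frequency of rallies of length i."""
    total = sum(counts)
    probs = [c / total for c in counts]
    mean = sum(i * p for i, p in enumerate(probs))
    var = sum((i - mean) ** 2 * p for i, p in enumerate(probs))
    # Skewness coefficient: third central moment over variance^(3/2).
    skew = sum((i - mean) ** 3 * p for i, p in enumerate(probs)) / var ** 1.5
    p_short = sum(probs[: threshold + 1])
    return skew, p_short / (1.0 - p_short)


# Women on clay, rally lengths 0..48, as read from Table 15 (Appendix).
women_clay = [
    2524, 10806, 7933, 6126, 5516, 4092, 3416, 2544, 1971, 1580,
    1156, 829, 642, 508, 354, 290, 200, 134, 108, 94,
    80, 44, 33, 24, 14, 18, 10, 5, 5, 4,
    6, 0, 4, 0, 1, 2, 0, 1, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 1,
]

skewness, short_long_ratio = rally_indicators(women_clay)
print(f"skewness = {skewness:.2f}, P(L<=2)/P(L>2) = {short_long_ratio:.2f}")
```

Computed per player rather than per surface, such indicators could serve as candidate features of the kind discussed above.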

References

1. Bieniek, P., & Kwater, K. (2015). Body height and career win percentage in relation to serve and return games effectiveness in elite tennis players. Scientific Review of Physical Culture, 4(3), 75–80.

2. Candila, V., & Palazzo, L. (2020). Neural networks and betting strategies for tennis. Risks, 8(6).

3. Carboch, J., Blau, M., Sklenarik, M., Siman, J., & Placha, K. (2020). Ball change in tennis: How does it affect match characteristics and rally pace in grand slam tournaments? Journal of Human Sport and Exercise, 15, 153–162.

4. Carboch, J., Siman, J., Sklenarik, M., & Blau, M. (2019). Match characteristics and rally pace of male tennis matches in three grand slam tournaments. Physical Activity Review, 7, 49–56.

5. Fernandez-Fernandez, J., Sanz-Rivas, D., Fernandez-Garcia, B., & Mendez-Villanueva, A. (2008). Match activity and physiological load during a clay-court tennis tournament in elite female players. Journal of Sports Sciences, 26, 1589–1595.

6. Fitzpatrick, A., Stone, J. A., Choppin, S., & Kelley, J. (2021). Investigating the most important aspect of elite grass court tennis: Short points. Sports Science & Coaching, 16, 1178–1186.

7. Gao, Z., & Kowalczyk, A. (2021). Random forest model identifies serve strength as a key predictor of tennis match outcome. Journal of Sports Analytics, 7, 255–262.

8. Gomes, R., Coutts, A., Viveiros, L., & Aoki, M. (2011). Physiological demands of match-play in elite tennis: A case study. European Journal of Sport Science, 11, 105–109.

9. Heilbron, D. C. (1994). Zero-altered and other regression models for count data with added zeros. Biometrical Journal, 36, 531–547.

10. Ingram, M. (2021). Rally lengths on the log scale. Martin Ingram’s Blog. https://martiningram.github.io/gp-random-effects-log-scale/

11. Kovalchik, S. A., & Ingram, M. (2018). Estimating the duration of professional tennis matches for varying formats. Journal of Quantitative Analysis in Sports, 14(1), 13–23.

12. Lambert, D. (1992). Zero-inflated Poisson regression with an application to defects in manufacturing. Technometrics, 34, 1–14.

13. Lisi, F., & Grigoletto, M. (2021). Modeling and simulating durations of professional tennis matches by resampling match features. Journal of Sports Analytics, 7(2), 57–75.

14. Makino, M., Odaka, T., Kuroiwa, J., Suwa, I., & H, S. (2020). Feature selection to win the point of ATP tennis players using rally information. International Journal of Computer Science in Sport, 19(1), 37–50.

15. McCullagh, P., & Nelder, J. A. (1989). Generalized Linear Models (2nd ed.). Chapman and Hall.

16. Mlakar, M., & Kovalchik, S. (2020). Analysing time pressure in professional tennis. Journal of Sports Analytics, 6, 147–154.

17. Mohammadi, Z., Sajjadnia, Z., Bakouch, H. S., & Sharafi, M. (2021). Zero-and-one inflated Poisson-Lindley INAR(1) process for modelling count time series with extra zeros and ones. Journal of Statistical Computation and Simulation, 92, 2018–2040.

18. Mullahy, J. (1986). Specification and testing of some modified count data models. Journal of Econometrics, 33(3), 341–365.

19. Nelder, J. A., & Wedderburn, R. W. (1972). Generalized linear models. Journal of the Royal Statistical Society. Series A, 135(2), 370–384.

20. Pascual, J. V. (2023). Types of serve stance and height of players. A study of the best servers in history. Coaching & Sport Science Review, 89, 16–20.

21. Qi, X., Li, Q., & Zhu, F. (2019). Modeling time series of count with excess zeros and ones based on INAR(1) model with zero-one inflated Poisson innovations. Journal of Computational and Applied Mathematics, 346, 572–590.

22. Vaverka, F., & Cernosek, M. (2013). Association between body height and serve speed in elite tennis players. Sports Biomechanics, 12(1), 30–37.

Appendices

Appendix

Table 14

Men: Absolute observed frequencies for the rally length

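| Rally | Grass | Hard | Clay |
|---|---|---|---|
| 0 | 2,814 | 11,399 | 5,021 |
| 1 | 25,080 | 89,874 | 34,250 |
| 2 | 11,617 | 44,456 | 22,623 |
| 3 | 8,708 | 32,887 | 16,604 |
| 4 | 6,543 | 26,975 | 14,371 |
| 5 | 4,156 | 18,627 | 10,352 |
| 6 | 2,997 | 14,562 | 8,221 |
| 7 | 2,209 | 11,078 | 6,323 |
| 8 | 1,497 | 8,554 | 5,032 |
| 9 | 1,140 | 6,788 | 4,093 |
| 10 | 877 | 5,244 | 3,202 |
| 11 | 629 | 4,161 | 2,604 |
| 12 | 519 | 3,282 | 2,124 |
| 13 | 350 | 2,664 | 1,654 |
| 14 | 309 | 2,106 | 1,252 |
| 15 | 278 | 1,767 | 978 |
| 16 | 176 | 1,376 | 790 |
| 17 | 128 | 1,108 | 666 |
| 18 | 97 | 876 | 457 |
| 19 | 80 | 730 | 376 |
| 20 | 42 | 557 | 318 |
| 21 | 67 | 438 | 243 |
| 22 | 48 | 331 | 191 |
| 23 | 29 | 280 | 159 |
| 24 | 20 | 197 | 121 |
| 25 | 17 | 160 | 69 |
| 26 | 16 | 130 | 67 |
| 27 | 12 | 92 | 41 |
| 28 | 4 | 86 | 41 |
| 29 | 8 | 59 | 25 |
| 30 | 5 | 41 | 23 |
| 31 | 2 | 24 | 28 |
| 32 | 4 | 32 | 19 |
| 33 | 2 | 25 | 15 |
| 34 | 3 | 14 | 10 |
| 35 | 0 | 16 | 7 |
| 36 | 1 | 8 | 6 |
| 37 | 0 | 15 | 3 |
| 38 | 0 | 9 | 1 |
| 39 | 1 | 6 | 3 |
| 40 | 0 | 4 | 2 |
| 41 | 1 | 4 | 4 |
| 42 | 0 | 2 | 0 |
| 43 | 0 | 2 | 0 |
| 44 | 1 | 4 | 1 |
| 45 | 0 | 1 | 2 |
| 46 | 0 | 1 | 0 |
| 47 | 0 | 0 | 2 |
| 48 | 1 | 2 | 1 |
| 49 | 0 | 0 | 2 |
| 50 | 0 | 1 | 0 |
| 51 | 0 | 1 | 0 |
| 52 | 0 | 1 | 0 |
| 53 | 0 | 0 | 0 |
| 54 | 0 | 2 | 0 |
| 55 | 0 | 0 | 0 |
| 56 | 0 | 0 | 0 |
| 57 | 0 | 1 | 0 |
| 58 | 0 | 0 | 1 |
| 59 | 0 | 1 | 0 |
| 60 | 0 | 0 | 1 |
| 61 | 0 | 0 | 0 |
| 62 | 0 | 0 | 0 |
| 63 | 0 | 0 | 0 |
| 64 | 0 | 0 | 0 |
| 65 | 0 | 0 | 0 |
| 66 | 0 | 0 | 0 |
| 67 | 0 | 0 | 0 |
| 68 | 0 | 0 | 0 |
| 69 | 0 | 0 | 0 |
| 70 | 0 | 0 | 0 |
| 71 | 0 | 0 | 1 |
| 72 | 0 | 0 | 0 |
| 73 | 0 | 0 | 0 |
| 74 | 0 | 0 | 0 |
| 75 | 0 | 0 | 0 |
| 76 | 0 | 0 | 0 |
| 77 | 0 | 0 | 0 |
| 78 | 0 | 0 | 0 |
| 79 | 0 | 0 | 0 |
| 80 | 0 | 0 | 0 |
| 81 | 0 | 0 | 0 |
| 82 | 0 | 0 | 0 |
| 83 | 0 | 0 | 1 |

Table 15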

Women: Absolute observed frequencies for the rally length

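| Rally | Grass | Hard | Clay |
|---|---|---|---|
| 0 | 3,279 | 6,600 | 2,524 |
| 1 | 17,761 | 33,069 | 10,806 |
| 2 | 11,507 | 21,219 | 7,933 |
| 3 | 8,186 | 14,941 | 6,126 |
| 4 | 6,704 | 12,935 | 5,516 |
| 5 | 4,697 | 9,914 | 4,092 |
| 6 | 3,699 | 7,650 | 3,416 |
| 7 | 2,722 | 5,830 | 2,544 |
| 8 | 1,998 | 4,412 | 1,971 |
| 9 | 1,426 | 3,366 | 1,580 |
| 10 | 1,064 | 2,598 | 1,156 |
| 11 | 839 | 1,949 | 829 |
| 12 | 541 | 1,490 | 642 |
| 13 | 456 | 1,166 | 508 |
| 14 | 307 | 850 | 354 |
| 15 | 193 | 656 | 290 |
| 16 | 149 | 494 | 200 |
| 17 | 118 | 367 | 134 |
| 18 | 63 | 265 | 108 |
| 19 | 50 | 186 | 94 |
| 20 | 36 | 147 | 80 |
| 21 | 32 | 107 | 44 |
| 22 | 15 | 79 | 33 |
| 23 | 17 | 48 | 24 |
| 24 | 13 | 52 | 14 |
| 25 | 6 | 24 | 18 |
| 26 | 4 | 25 | 10 |
| 27 | 3 | 20 | 5 |
| 28 | 0 | 11 | 5 |
| 29 | 3 | 7 | 4 |
| 30 | 2 | 7 | 6 |
| 31 | 1 | 7 | 0 |
| 32 | 1 | 5 | 4 |
| 33 | 0 | 3 | 0 |
| 34 | 1 | 0 | 1 |
| 35 | 0 | 2 | 2 |
| 36 | 0 | 1 | 0 |
| 37 | 0 | 0 | 1 |
| 38 | 0 | 0 | 0 |
| 39 | 0 | 4 | 0 |
| 40 | 0 | 0 | 0 |
| 41 | 0 | 0 | 0 |
| 42 | 0 | 0 | 0 |
| 43 | 0 | 0 | 0 |
| 44 | 0 | 1 | 0 |
| 45 | 0 | 0 | 0 |
| 46 | 0 | 0 | 0 |
| 47 | 0 | 0 | 0 |
| 48 | 0 | 1 | 1 |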

Notes

1 Note that in the notation of Carboch et al. (2019) double faults are represented by 1 shot.

2 In their terminology, the service bonus is the sum of the probabilities that two players have to win the point at serve, while the malus is the absolute difference of the same probabilities.

4 Digitized versions of Tables 14 and 15 are available from the authors upon request.

5 However, at the Australian Open 2013 Gilles Simon and Gael Monfils played a point of 71 strokes. Clearly, this match was not included in our crowdsourced dataset.

6 The improvement in fitting has been verified ex-post.