
Identifying group contributions in NBA lineups with spectral analysis

Abstract

We address the question of how to quantify the contributions of groups of players to team success. Our approach is based on spectral analysis, a technique from algebraic signal processing, which has several appealing features. First, our analysis decomposes the team success signal into components that are naturally understood as the contributions of player groups of a given size: individuals, pairs, triples, fours, and full five-player lineups. Secondly, the decomposition is orthogonal so that contributions of a player group can be thought of as pure: Contributions attributed to a group of three, for example, have been separated from the lower-order contributions of constituent pairs and individuals. We present a detailed spectral analysis using NBA play-by-play data and show how this can be a practical tool in understanding lineup composition and utilization.

1 Introduction

A fundamental challenge in basketball performance evaluation is the team nature of the game. Contributions to team success occur in the context of a five-player lineup, and isolating the specific contribution of an individual is a difficult problem with a considerable history. Among the many approaches to the player evaluation problem are well-known metrics like player efficiency rating (PER), wins produced (WP), adjusted plus-minus (APM), box plus-minus (BPM), win shares (WS), value over replacement player (VORP), and offensive and defensive ratings (OR and DR) to name only a few (Basketball-Reference). While these individual player metrics help create a more complete understanding of player value, some contributions remain elusive. Setting good screens, ability to draw defenders, individual defense, and off-ball movement are all examples of important contributions that are difficult to measure and quantify. In part, these contributions are elusive because they often facilitate the success of a teammate who ultimately reaps the statistical benefit.

Even beyond contributions that are difficult to quantify, the broader question of chemistry between players is a critical aspect of team success or failure. It is widely accepted that some groups of players work better together than others, creating synergistic lineups that transcend the sum of their individual parts. Indeed, finding (or fostering) these synergistic groups of players is fundamental to the role of a general manager or coach. There are, however, far fewer analytic approaches to identifying and quantifying these synergies between players. Such positive or negative effects among teammates represent an important, but much less well understood, aspect of team basketball.

In this paper we propose spectral analysis (Diaconis, 1988) as a novel approach to identifying and quantifying group effects in NBA play-by-play data. Spectral analysis is based on algebraic signal processing, a methodology that has garnered increasing attention from the machine learning community (Kakarala, 2011; Kondor et al., 2007; Kondor and Dempsey, 2012), and is particularly well suited to take advantage of the underlying structure of basketball data. The methodology can be understood as a generalization of traditional Fourier analysis, an approach whose centrality in a host of scientific and applied data analysis problems is well-known, and speaks to the promise of its application in new contexts from social choice to genetic epistasis and more (Paudel et al., 2013; Jurman et al., 2008; Lawson et al., 2006; Uminsky et al., 2018; Uminsky et al., 2019). The premise of spectral analysis in a basketball context is simple: team success (appropriately measured) can be understood as a function on lineups. Such functions have rich structure which can be analyzed and exploited for data analytic insights.

Previous work in basketball analytics has addressed similar questions from a different perspective. Both Kuehn (2016) and Maymin et al. (2013) studied lineup synergies on the level of player skills. In Maymin et al. (2013) the authors used a probabilistic framework for game events, along with simulated games to evaluate full-lineup synergies and find trades that could benefit both teams by creating a better fit on both sides. In Kuehn (2016), on the other hand, the author used a probabilistic model to determine complementary skill categories that suggest the effect of a player in the context of a specific lineup. Work in Grassetti et al. (2019a) and Grassetti et al. (2019b) modeled lineup and player effects in the Italian Basketball League (Serie A1) based on an adjusted plus-minus framework.

Our approach is different in several respects. First, we study synergies on the level of specific player groups independent of particular skill sets. We also ignore individual production statistics and infer synergies directly from observed team success, as defined below. As a consequence of this approach, our analysis is roster constrained–we don’t suggest trades based on prospective synergies across teams. We can, however, suggest groupings of players that allow for more optimal lineups within the context of available players, a central problem in the course of an NBA game or season. Further, our approach uses orthogonality to distinguish between the contributions of a group and nested subgroups. So, for example, a group of three players that appears to exhibit positive synergies may, in fact, be benefiting from strong individual and pair contributions while the triple of players adds no particular value as a pure triple. We tease apart these higher-order correlations.

Furthermore, spectral analysis is not a model-based approach. As such, our methodology is notably free of modeling assumptions–rather than fitting the data, spectral analysis reports the observed data, albeit projected into a new basis with new information. Thus, it is a direct translation of what actually happened on the court (as we make precise below). As such, our methodology is at least complementary to existing work, and is also promising in presenting a new approach to understanding and appreciating the nuances of team basketball.

Finally, we note that while the methodology that underlies the spectral analysis approach is challenging, the resulting intuitions and insights are readily approachable. In what follows, we have stripped the mathematical details to a minimum and relegated them to references for the interested reader. The analysis, on the other hand, shows promise as a new and practical approach to a difficult problem in basketball analytics.

2 Data

We start with lineup-level play-by-play data from the 2015-2016 NBA season. Such play-by-play data is publicly available on ESPN.com or NBA.com, or can be purchased from websites like bigdataball.com, already processed into csv format. For a given team, we restrict attention to the 15 players on the roster having the most possessions played on the season, and filter the play-by-play data to periods of games involving only those players. Next, we compute the aggregated raw plus-minus (PM) for each lineup. Suppose lineup L plays against opposing lineup M during a period of gameplay with no substitutions. We compute the points scored by each lineup, as well as the number of possessions for both lineups during that stretch of play. For example, if lineup L scored 6 points in 3 possessions and lineup M scored 3 points in 2 possessions, then their plus-minus is computed as the difference in points-per-possession times possessions. Thus, for L the plus-minus is $\left(\frac{6}{3}-\frac{3}{2}\right)\cdot 3 = 1.5$, while for M the plus-minus is $\left(\frac{3}{2}-\frac{6}{3}\right)\cdot 2 = -1$. Summing over all of lineup L’s possessions gives the total aggregate plus-minus for lineup L, which we denote by $pm_L$.
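The stint-level computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names and the tuple layout of a stint record are hypothetical, and every stint is assumed to have at least one possession for each side.

```python
from collections import defaultdict

def stint_plus_minus(pts_for, poss_for, pts_against, poss_against):
    """Plus-minus for one stint: the difference in points per possession,
    scaled by the lineup's own number of possessions."""
    return (pts_for / poss_for - pts_against / poss_against) * poss_for

def aggregate_plus_minus(stints):
    """Sum stint plus-minus per lineup.

    `stints` is an iterable of (lineup, pts_for, poss_for, pts_against,
    poss_against) tuples; this record layout is a hypothetical choice."""
    totals = defaultdict(float)
    for lineup, pts_for, poss_for, pts_against, poss_against in stints:
        totals[frozenset(lineup)] += stint_plus_minus(
            pts_for, poss_for, pts_against, poss_against)
    return dict(totals)

# The worked example from the text: L scores 6 in 3 possessions, M scores 3 in 2.
pm_L = stint_plus_minus(6, 3, 3, 2)
pm_M = stint_plus_minus(3, 2, 6, 3)
print(pm_L, pm_M)
```

Keying the totals by `frozenset` reflects the fact that a lineup is an unordered set of players.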

Since a lineup consists of 5 players on the floor, there are $\binom{15}{5} = 3003$ possible lineups, though most see little or no playing time. We thus naturally arrive at a function on lineups by associating with L the value of that lineup’s aggregate plus-minus, and write $f(L) = pm_L$. We call f the team success function. This particular success metric has the advantage of being simple and intuitive. Moreover, by summing over all lineups we recover the value of the team’s cumulative plus-minus, which is highly correlated with winning percentage. The function f will serve as the foundation for our analysis, but we note that for what follows, any quantitative measure of a lineup’s success could be substituted in its place.

3 Methodology

Our goal is now to decompose the function f in a way that sheds light on the various group contributions to team success. The groups of interest are generalized lineups, meaning groups of all sizes, from individual players to pairs, triples, groups of four, and full five-player lineups. Our primary tool is spectral analysis, which uses the language of representation theory (Serre, 2012) to understand functions on lineups.

Observe that a full lineup is an unordered set of five players. Any reshuffling of the five players on the floor, or the ten on the bench, does not change the lineup under consideration. Moreover, given a particular lineup, a permutation (or reshuffling) of the fifteen players on the team will result in a new lineup. The set of such permutations has a rich structure as a mathematical group. In this case, all possible permutations of fifteen players are described by S15: the symmetric group on 15 items (Dummit and Foote, 2004). Furthermore, the set X of five-player lineups naturally reflects this group structure (as a homogeneous space). Most importantly for our purposes, the set of functions on lineups has robust structure with respect to the natural action of permutations on functions. This structure is well understood and can be exploited for data analytic insights as we show below. By way of analogy, just as traditional Fourier analysis looks to decompose a time series into periodicities that can reveal a hidden structure (weekly or seasonal trends, say), our decomposition of f will reveal group effects in lineup-level data.

Let L (X) denote the collection of all real valued functions on five-player lineups. This set is a vector space with the usual notions of sum of functions, multiplication by scalars, and an inner product given by

(1)
$$\langle g, h \rangle = \frac{1}{|X|} \sum_{x \in X} g(x)\,h(x).$$
The dimension of L(X) is equal to the number of lineups, $\binom{15}{5} = 3003$. In light of the permutation group’s action on L(X) as mentioned above, L(X) admits a natural (invariant and irreducible) decomposition as follows:
(2)
$$L(X) = V_0 \oplus V_1 \oplus V_2 \oplus V_3 \oplus V_4 \oplus V_5.$$
Each Vi, with 0 ≤ i ≤ 5, is a vector subspace with data analytic significance. Rather than give a self-contained treatment of this decomposition, we refer to Diaconis (1988) and Dummit and Foote (2004), and here simply note that each space is spanned by the matrix coefficients of the irreducible representations of the group S15 associated with Young tableaux of shape (10, 5). We can gain some intuition for the decomposition by considering the lower-order spaces as follows. An explicit computation of the decomposition is given in section 4 below for a toy example.

Take δL to be the indicator function of a fixed lineup L, so that δL (L) =1, while δL (L′) =0 for any other lineup L′. As above, X is the set of all possible lineups, and

(3)
$$\delta = \sum_{L \in X} \delta_L.$$
If we act on the function δ by reshuffling lineups (this is the action of the permutation group S15), we see that while the terms in the summation in (3) get reordered, the function itself remains unchanged. (See section 4 below for details.) Thus, the one-dimensional space spanned by δ is invariant under lineup reshuffling and represents the mean value of the function f, since we can write f = cδ + (f − cδ). Here, c is just the average value of f, and cδ is the best possible constant approximation to f. The function f − cδ represents the original data, but now centered with mean zero, and orthogonal to the space of constant functions with respect to the inner product in (1). The space spanned by δ is V0 in (2).

To understand V1, we start with indicator functions for individual players. Given a player i, define $\delta_i = \sum_{L \ni i} \delta_L - m\delta$, where the sum is over all lineups that include player i and four other players, and m is a constant chosen so that δi is orthogonal to δ. One can show that the space spanned by {δ1, δ2, …, δ15} is again stable under lineup reshuffling. (The set of individual indicator functions is linearly dependent, however, and only spans a 14-dimensional space, as we’ll see below.)

The decomposition continues in an analogous way, though the computations become more involved. Several computational approaches are described in Diaconis (1988) and Maslen et al. (2003). In our case of the symmetric group S15 acting on lineups, we employ the method in Maslen et al. (2003), which involves first computing the adjacency matrix of an associated Johnson graph J (15, 5). It turns out that J (15, 5) has 6 eigenvalues, each of which is associated with one of the effect spaces: zero (mean), and first through fifth-order spaces. Specifically, the largest eigenvalue is simple and is associated with the one-dimensional mean space; the second largest eigenvalue is associated with the first-order space, etc. It is now a matter of computing an eigenbasis for each space, and using it to project the data vector onto each eigenspace to give the orthogonal decomposition used in (2). It is also worth noting that spectral analysis includes the traditional analysis of variance as a special case, a connection suggested by the discussion above and further explained in Diaconis (1988).
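The eigenspace projection described above can be illustrated with NumPy. The sketch below is not the authors' code: the function names are hypothetical, the success values are random made-up data, and a small case (6 players, 3 on the floor) is used so that it runs instantly. It assumes the standard definition of the Johnson graph J(n, k), in which two k-subsets are adjacent when they share exactly k − 1 elements.

```python
from itertools import combinations
import numpy as np

def johnson_adjacency(n, k):
    """Adjacency matrix of J(n, k): vertices are k-subsets of {0, ..., n-1};
    two vertices are adjacent when they share exactly k-1 elements."""
    lineups = [frozenset(c) for c in combinations(range(n), k)]
    m = len(lineups)
    A = np.zeros((m, m))
    for a in range(m):
        for b in range(a + 1, m):
            if len(lineups[a] & lineups[b]) == k - 1:
                A[a, b] = A[b, a] = 1.0
    return lineups, A

def spectral_components(f, A, decimals=6):
    """Project f onto each eigenspace of A, ordered from the largest
    eigenvalue (mean space) down to the smallest (highest-order space)."""
    eigvals, eigvecs = np.linalg.eigh(A)      # A is symmetric
    rounded = np.round(eigvals, decimals)     # group numerically equal eigenvalues
    components, dims = [], []
    for lam in sorted(set(rounded), reverse=True):
        U = eigvecs[:, rounded == lam]        # orthonormal basis of one eigenspace
        components.append(U @ (U.T @ f))      # orthogonal projection of f
        dims.append(U.shape[1])
    return components, dims

# Small illustration with arbitrary data (not real lineup results).
lineups, A = johnson_adjacency(6, 3)
rng = np.random.default_rng(0)
f = rng.normal(size=len(lineups))
components, dims = spectral_components(f, A)
print(dims)
```

The same two functions apply to (n, k) = (15, 5), at the cost of building and diagonalizing a 3003 × 3003 matrix.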

The decomposition in (2) is particularly useful for two reasons. First, each Vi can be interpreted as the space of functions encoding i-th order effects. For instance, one can see that V1 is naturally understood as encoding first-order individual effects beyond the mean. Thus, the projection of f onto V1 can be understood as that part of team success f attributable to the contributions of individual players. Similarly V2 includes effects attributable to pure player pairs (individual contributions have been removed), and the corresponding projection of f in V2 gives the contributions of those pairs to team success. V3 encodes contributions of groups of three, and so on. These interpretations follow from the fact that each subspace in the decomposition of L (X) is invariant under the natural reshuffling action of S15 on lineups. It is also worth noticing that the lineup success function is completely recovered via its projections onto the order subspaces in (2). If we write fi for the projection of f onto Vi, then f = f0 + f1 + f2 + f3 + f4 + f5. As such, the spectral decomposition gives a complete description of the original data set with respect to a new basis grounded in group contributions.

Secondly, the decomposition in (2) is orthogonal (signified by the ⊕ notation). From a data analytic perspective, this means that there is no overlap among the spaces, and group effects are independent. Thus, for instance, a contribution attributed to a group of three players can be understood as a pure third-order contribution. All constituent pair and individual contributions have been removed and quantified separately in the appropriate lower-order spaces. We thus avoid erroneous attribution of success due to multicollinearity among groups. For example, is a big three really adding value as a triple, or is its success better understood as a strong pair plus an individual? The spectral decomposition in (2) provides a quantitative basis for answering such questions.

The advantage of the orthogonality of the spaces in (2), however, presents a challenge with respect to direct interpretation of contributions for particular groups. This is evident when considering the dimension of each of the respective effect spaces in Table 1, which is strictly smaller than the number of groups of that size we might wish to analyze.

Table 1

Dimension of each effect space, along with the number of natural groups of each size

| Space | Dimension | Number of Groups |
|---|---|---|
| V0 | 1 | |
| V1 | 14 | 15 |
| V2 | 90 | 105 |
| V3 | 350 | 455 |
| V4 | 910 | 1365 |
| V5 | 1638 | 3003 |

Since we have rosters of fifteen players, there are fifteen individual contributions to consider. The space V1, however, is 14-dimensional. Similarly, while V2 includes all of the contributions to f attributable to pairs of players, it does so in a 90-dimensional space despite the fact that there are $\binom{15}{2} = 105$ natural pairs of players to consider. The third-order space V3 has dimension 350 while there are 455 player triples, and so on.
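The entries of Table 1 are consistent with the standard dimension formula for this decomposition, dim Vi = C(15, i) − C(15, i − 1) for i ≥ 1, with dim V0 = 1. A quick check in Python (the variable names are illustrative only):

```python
from math import comb

n, k = 15, 5
groups = [comb(n, i) for i in range(k + 1)]   # number of groups of size i
dims = [comb(n, i) - (comb(n, i - 1) if i > 0 else 0) for i in range(k + 1)]

for i in range(k + 1):
    print(f"V{i}: dimension {dims[i]}, groups of size {i}: {groups[i]}")

# The dimensions telescope to the total number of lineups.
print(sum(dims))
```

The gap between the two columns is exactly the number of groups of the next smaller size, which is why each effect space is smaller than the number of groups it describes.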

We deal with this issue using Mallows’ method of following easily interpretable vectors as in Diaconis (1988). Let g be a group of players. For example, if players are labeled 1 through 15, then a particular triple might be g = {1, 2, 7}. Let φg be the indicator function associated with g, i.e., the function that takes the value 1 when all three players 1, 2, and 7 are in a lineup, and outputs 0 otherwise. The function φg is intuitively associated with the success of the group g (though it is not invariant under reshuffling and is not orthogonal to nested lower-order groups).

To quantify the contribution of g (as a pure triple) to the success of the team as measured by f, project both φg and f onto V3 and take the inner product of the projections: 〈prV3 (φg) , prV3 (f) 〉 = 〈prV3 (φg) , f3〉. After projecting onto V3 we are left with only the third-order components of φg and f. The resulting inner product is a weighted cosine similarity that indicates the extent to which the pure triple g is correlated with the team’s success f. Larger values of this inner product reflect a stronger synergy between the triple of players {1, 2, 7}, while a negative value indicates that, after removing the contributions of the constituent individuals and pairs, spectral analysis finds this particular group of three ineffective. In the results below we show how this information might be useful in evaluating lineups.

4 Two-On-Two Basketball

To ground the ideas of the previous section we present a small-scale example in detail. Consider a version of basketball where a team consists of 5 players, two of whom play at any given moment. The set of possible lineups consists of the ten unordered pairs {i, j} with i, j ∈ {1, 2, 3, 4, 5} and i ≠ j. The symmetric group S5 acts on lineups by relabeling, and we extend this action to functions on lineups as follows. Given a permutation π, a function h, and a lineup L, define

(4)
$$(\pi \cdot h)(L) = h(\pi^{-1} L).$$
Therefore, if π is the permutation (123), taking player 1 to player 2, player 2 to player 3, player 3 to player 1, and leaving everyone else fixed, and if L is the lineup {1, 3}, then
(5)
$$(\pi \cdot h)(L) = h(\pi^{-1}\{1,3\}) = h(\{3,2\}).$$
The use of the inverse is necessary to ensure that the action on functions respects the operation in the group, that is, so that (τπ) · h = τ · (π · h) (Dummit and Foote, 2004).
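The action in (4) and the compatibility condition can be checked numerically. The sketch below represents a permutation as a Python dict of moved points; this representation and the helper names are illustrative assumptions, not part of the paper.

```python
from itertools import combinations

def apply_perm(perm, lineup):
    """Relabel the players of a lineup; players missing from perm are fixed."""
    return frozenset(perm.get(p, p) for p in lineup)

def inverse(perm):
    """Inverse permutation."""
    return {image: p for p, image in perm.items()}

def compose(tau, pi):
    """The product tau pi, meaning: apply pi first, then tau."""
    moved = set(tau) | set(pi)
    return {p: tau.get(pi.get(p, p), pi.get(p, p)) for p in moved}

def act(perm, h):
    """Return the function perm . h, where (perm . h)(L) = h(perm^{-1} L)."""
    inv = inverse(perm)
    return lambda lineup: h(apply_perm(inv, lineup))

pi = {1: 2, 2: 3, 3: 1}              # the 3-cycle (123) from the text
tau = {1: 2, 2: 1}                   # a transposition, used to test compatibility
h = lambda lineup: tuple(sorted(lineup))   # a function that simply reports its input

print(act(pi, h)(frozenset({1, 3})))       # h evaluated at pi^{-1}{1, 3}

# Compatibility: (tau pi) . h agrees with tau . (pi . h) on every lineup.
all_lineups = [frozenset(c) for c in combinations(range(1, 6), 2)]
compatible = all(
    act(compose(tau, pi), h)(L) == act(tau, act(pi, h))(L) for L in all_lineups
)
print(compatible)
```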

Following a season of play, we obtain a success function that gives the plus-minus (or other success metric) of each lineup. We might observe a function like that in Table 2.

Table 2

Success function for two-player lineups

| L | f(L) | L | f(L) |
|---|---|---|---|
| {1, 2} | 22 | {2, 4} | 35 |
| {1, 3} | 18 | {2, 5} | 26 |
| {1, 4} | 3 | {3, 4} | 84 |
| {1, 5} | 58 | {3, 5} | 25 |
| {2, 3} | 93 | {4, 5} | 2 |

Summing f (L) over all lineups that include a particular player gives individual raw plus-minus as in Table 3.

Table 3

Preliminary analysis of sample team using individual plus-minus (PM), which is the sum of the lineup PM over lineups that include a given individual

| Player | PM | Rank |
|---|---|---|
| 1 | 101 | 5 |
| 2 | 176 | 2 |
| 3 | 220 | 1 |
| 4 | 124 | 3 |
| 5 | 111 | 4 |
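The individual plus-minus values in Table 3 follow directly from Table 2. A short sketch, with the success function stored as a dictionary keyed by lineup (a representation chosen here for illustration):

```python
# Success function from Table 2, keyed by two-player lineup.
f = {
    (1, 2): 22, (1, 3): 18, (1, 4): 3, (1, 5): 58, (2, 3): 93,
    (2, 4): 35, (2, 5): 26, (3, 4): 84, (3, 5): 25, (4, 5): 2,
}

# Individual raw plus-minus: sum f over the lineups containing each player.
pm = {p: sum(v for lineup, v in f.items() if p in lineup) for p in range(1, 6)}
ranking = sorted(pm, key=pm.get, reverse=True)

print(pm)
print(ranking)
```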

Player 3 is the top rated individual, followed by 2, 4, 5, and 1. Lineup rankings are given by f (L) itself, which shows {2, 3} , {3, 4}, and {1, 5} as the top three.

Now compare the analysis above with spectral analysis. In this context the vector space of functions on lineups is 10-dimensional and has a basis consisting of vectors δ{i,j} that assign the value 1 to lineup {i, j} and 0 to all other lineups. The decomposition in (2) becomes

(6)
$$V = V_0 \oplus V_1 \oplus V_2.$$
Define δ = ∑{i,j} δ{i,j}. The span of δ is the one-dimensional subspace V0 of constant functions. Moreover, V0 is S5 invariant since for any relabeling of players given by π, we have π · δ = δ. Given a function f in V, its projection f0 on V0 assigns to each lineup the average value of f, in this case 36.6.

First order (or individual) effects beyond the mean are encoded in V1. Explicitly, define $\delta_1 = \sum_{i \neq 1} \delta_{\{1,i\}} - \frac{2}{5}\delta$, with δ2, δ3, and δ4 defined analogously. One can check that the 4-dimensional vector space spanned by {δ1, δ2, δ3, δ4} is S5 invariant, and is orthogonal to V0. Since the mean has been subtracted out and accounted for in V0, a vector in V1 represents a pure first order effect. Note that $\delta_5 = \sum_{i \neq 5} \delta_{\{5,i\}} - \frac{2}{5}\delta$ can be written δ5 = −δ1 − δ2 − δ3 − δ4. Consequently, V1 is 4-dimensional even though there are five natural first order effects to consider: one for each player.

Finally, the orthogonal complement of V0 ⊕ V1 is the 5-dimensional S5 invariant subspace V2. V2 gives the contribution to f from pure pairs, or pure second order effects after the mean and individual contributions are removed. The three subspaces V0, V1, and V2 are all irreducible since none contains a nontrivial S5 invariant subspace.

We can now project f onto V0, V1, and V2. All together we have f = f0 + f1 + f2:

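(7)
$$
f\begin{pmatrix} \{1,2\} \\ \{1,3\} \\ \{1,4\} \\ \{1,5\} \\ \{2,3\} \\ \{2,4\} \\ \{2,5\} \\ \{3,4\} \\ \{3,5\} \\ \{4,5\} \end{pmatrix}
= \begin{bmatrix} 22 \\ 18 \\ 3 \\ 58 \\ 93 \\ 35 \\ 26 \\ 84 \\ 25 \\ 2 \end{bmatrix}
= \begin{bmatrix} 36.6 \\ 36.6 \\ 36.6 \\ 36.6 \\ 36.6 \\ 36.6 \\ 36.6 \\ 36.6 \\ 36.6 \\ 36.6 \end{bmatrix}
+ \begin{bmatrix} -5.27 \\ 9.40 \\ -22.60 \\ -26.93 \\ 34.40 \\ 2.40 \\ -1.93 \\ 17.07 \\ 12.73 \\ -19.27 \end{bmatrix}
+ \begin{bmatrix} -9.33 \\ -28.00 \\ -11.00 \\ 48.33 \\ 22.00 \\ -4.00 \\ -8.67 \\ 30.33 \\ -24.33 \\ -15.33 \end{bmatrix}
$$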
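The three components in (7) can be reproduced numerically. The sketch below does not use the eigenvector computation of section 3; instead it relies on the fact that the five player-indicator functions span V0 ⊕ V1, so a least-squares fit of f by player effects gives f0 + f1. The matrix name `B` and this route to the projection are choices made here for illustration.

```python
from itertools import combinations
import numpy as np

lineups = list(combinations(range(1, 6), 2))     # same lineup order as equation (7)
f = np.array([22, 18, 3, 58, 93, 35, 26, 84, 25, 2], dtype=float)

# B[L, i] = 1 when player i+1 is in lineup L; the columns of B span V0 + V1.
B = np.array([[1.0 if p in L else 0.0 for p in range(1, 6)] for L in lineups])

f0 = np.full_like(f, f.mean())                   # mean component
coef, *_ = np.linalg.lstsq(B, f, rcond=None)     # least-squares fit by player effects
f1 = B @ coef - f0                               # pure individual component
f2 = f - f0 - f1                                 # pure pair component

print(np.round(f0, 2))
print(np.round(f1, 2))
print(np.round(f2, 2))
```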

Turning to the question of interpretability, section 3 proposes Mallows’ method of using readily interpretable vectors projected into the appropriate effect space. To that end, the individual indicator function φ{2} = δ{1,2} + δ{2,3} + δ{2,4} + δ{2,5} is naturally associated with player 2: φ{2}(L) = 1 when player 2 is in L and is 0 otherwise. We quantify the effect of player 2 by projecting φ{2} and f into V1, and then taking the dot product of the projections. For a lineup like {2, 3}, we take the dot product of the projections of the lineup indicator function δ{2,3}, and f, in V2. Note that player 2’s raw plus-minus is the inner product of 10 · f with the interpretable function φ{2}. Similarly f({i, j}) is 10 · 〈f, φ{i,j}〉. The key difference is that spectral analysis uses Mallows’ method after projecting onto the orthogonal subspaces in (6).

Contributions from spectral analysis as measured by Mallows’ method are given in Table 4 for both individuals and (two-player) lineups.

Table 4

Spectral value (Spec) for each individual player and two-player lineup, and rank of each lineup, along with the preliminary rank given by f

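| Individual | Spec | Pair | Spec | Rank | f Rank | Pair | Spec | Rank | f Rank |
|---|---|---|---|---|---|---|---|---|---|
| {1} | -45.4 | {1,2} | -9.3 | 6 | 7 | {2,4} | -4 | 4 | 4 |
| {2} | 29.6 | {1,3} | -28 | 10 | 8 | {2,5} | -8.7 | 5 | 5 |
| {3} | 73.6 | {1,4} | -11 | 7 | 9 | {3,4} | 30.3 | 2 | 2 |
| {4} | -22.4 | {1,5} | 48.3 | 1 | 3 | {3,5} | -24.3 | 9 | 6 |
| {5} | -35.4 | {2,3} | 22 | 3 | 1 | {4,5} | -24 | 8 | 10 |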
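Mallows’ method for this toy example can be sketched with explicit projection matrices. This is an illustration under stated assumptions rather than the authors' implementation: the projectors are built from the player-indicator matrix `B` via a pseudoinverse, the helper name `mallows` is hypothetical, and the unnormalized dot product is used.

```python
from itertools import combinations
import numpy as np

lineups = list(combinations(range(1, 6), 2))
f = np.array([22, 18, 3, 58, 93, 35, 26, 84, 25, 2], dtype=float)
m = len(lineups)

# Player-indicator matrix: column p-1 is the interpretable vector for player p.
B = np.array([[1.0 if p in L else 0.0 for p in range(1, 6)] for L in lineups])

P0 = np.full((m, m), 1.0 / m)        # projector onto V0 (constant functions)
P01 = B @ np.linalg.pinv(B)          # projector onto V0 + V1
P1 = P01 - P0                        # projector onto V1
P2 = np.eye(m) - P01                 # projector onto V2

def mallows(phi, P):
    """Dot product of the projections of phi and f onto the space with projector P."""
    return float((P @ phi) @ (P @ f))

individual_spec = {p: mallows(B[:, p - 1], P1) for p in range(1, 6)}
pair_spec = {L: mallows(np.eye(m)[idx], P2) for idx, L in enumerate(lineups)}
top_pairs = sorted(pair_spec, key=pair_spec.get, reverse=True)[:3]

print({p: round(v, 1) for p, v in individual_spec.items()})
print(top_pairs)
```

Because a projector is symmetric and idempotent, the pair value for a lineup L reduces to the entry of f2 at L in equation (7), and the individual value for a player is the sum of f1 over the lineups containing that player.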

The table also includes both the spectral and preliminary (based on f) rankings of each lineup. Note that lineup {2, 3} drops from the best pair to the third best pure pair. Once we account for the contributions of players two and three as individuals, the lineup is not nearly as strong as it appears in the preliminary analysis. We find stronger pair effects from lineups {1, 5} and {3, 4}. All remaining lineups are essentially ineffective in that their success can be attributed to the success of the constituent individuals rather than the pairing. Interesting questions immediately arise. What aspects of player four’s game result in a more effective pairing with player three, the team’s star individual player, than the pairing of three with two, the team’s second best individual? What is behind the success of the {1, 5} lineup? These considerations are relevant to team construction, personnel considerations, and substitution patterns. We pursue this type of analysis further in the context of an actual NBA team below.

5 Results and discussion

A challenge inherent in working with real lineup-level data is the wide disparity in the number of possessions that lineups play. Most teams have a dominant starting lineup that plays far more possessions than any other. For example, the starting lineup of the ’16 Golden State Warriors played approximately 1140 possessions while the next most used lineup played 535 possessions. Only 12 lineups played more than 100 possessions for the Warriors on the season. For the Boston Celtics, the starters played 1413 possessions compared to 257 for the next most utilized, with 13 lineups playing more than 100 possessions. By contrast, the Celtics had 255 lineups that played fewer than 10 possessions (but at least one), and the Warriors had 236. Numbers are similar across the league. This is another reason for using raw plus-minus in defining the team success function f on lineups. A metric like per-possession lineup plus-minus breaks down in the face of large numbers of very low possession lineups and a few high possession lineups. Still, we want to identify potentially undervalued and underutilized groups of players–especially for smaller groups like pairs and triples where there are many more groups that do play significant numbers of possessions. Another consideration is that over time, lineups with large numbers of possessions will settle closer to their true mean value while lineups with few possessions will be inherently noisier. As a result, we perform the spectral analysis on f as described in section 3 above, and then normalize the spectral contribution by the log of possessions played by each group. We call the result spectral contribution per log possession (SCLP). This balances the considerations above and allows strong lower possession groups to emerge while not over-penalizing groups that do play many possessions.
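The normalization can be written as a one-line function. The text does not state the base of the logarithm, so the natural log used below is an assumption, as are the function name and the example numbers (which are hypothetical and not taken from the data).

```python
import math

def sclp(spectral_contribution, possessions):
    """Spectral contribution per log possession.

    Assumes a natural logarithm; the base is not specified in the text."""
    if possessions <= 1:
        raise ValueError("possessions must exceed 1 for the log to be positive")
    return spectral_contribution / math.log(possessions)

# Hypothetical groups with equal spectral contribution but different usage.
low_usage = sclp(50.0, 200)
high_usage = sclp(50.0, 2000)
print(round(low_usage, 2), round(high_usage, 2))
```

With a tenfold difference in possessions the two values differ by well under a factor of two, whereas dividing by raw possessions would separate them by a factor of ten. This illustrates the balance described above between letting lower-possession groups emerge and not over-penalizing heavily used ones.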

Despite these challenges, however, we’ll see below that there are significant insights to be gained in working with lineup level data. Moreover, since spectral analysis is a non-model-based description of complete lineup-level game data, it has the advantage of maintaining close proximity to the actual gameplay observed by coaches, players, and fans. There are always five players on the floor, so all data begins at the level of full lineups.

Consider the first order effects for the 15-16 Golden State Warriors in Table 5. Draymond Green, Stephen Curry, and Klay Thompson are the top three players. The ordering, specifically Green ranked above Curry, is perhaps interesting, though it’s worth noting that this ordering agrees with ESPN’s real plus-minus (RPM). (Green led the entire league in RPM in 15-16.) Other metrics like box plus-minus (BPM) and wins-above-replacement (WAR) rank Curry higher. Because SCLP is based on ability of lineups to outscore opponents when the player is on the floor (like RPM), however, as opposed to metrics like BPM and WAR which are more focused on points produced, the ordering is defensible.

Table 5

Top and bottom five first-order effects for GSW. SCLP is the spectral contribution per log possession, PM is the player’s raw plus-minus, and Poss is the number of possessions for that player

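| Player | SCLP | PM | Poss |
|---|---|---|---|
| Draymond Green | 17.2 | 1038.4 | 5800 |
| Stephen Curry | 15.9 | 978.7 | 5610 |
| Klay Thompson | 12.0 | 808.6 | 5453 |
| Andre Iguodala | 3.5 | 436.1 | 3516 |
| Andrew Bogut | 2.8 | 403.6 | 2951 |
| Marreese Speights | -7.4 | 20.0 | 1630 |
| Ian Clark | -9.8 | -51.9 | 1108 |
| Anderson Varejao | -11.1 | -34.4 | 368 |
| Jason Thompson | -11.2 | -33.8 | 339 |
| James Michael McAdoo | -12.1 | -85.0 | 526 |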

In fact, a closer look at the interpretable vector φi associated with individual player i (as described in sections 3 and 4) reveals that φi = δi + c · δ, so is just a non-mean-centered version of the first order invariant functions that span V1. Consequently, the spectral contribution (non-possession normalized) is a linear function of individual plus-minus, so reflects precisely that ordering. This is not the case for higher-order groups, however, which is where we focus the bulk of our analysis.
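The linear relationship can be made explicit with a short derivation. It is offered as a sketch under stated assumptions: it uses the unnormalized dot product (as in the values of Table 4), and the symbols $PM_i$ for the raw plus-minus of player i and $\bar{f}$ for the mean of f are notation introduced here for illustration. Since φi lies in V0 ⊕ V1, it is orthogonal to f2, …, f5, and each player appears in $\binom{14}{4} = 1001$ lineups, so

$$
\langle \mathrm{pr}_{V_1}\varphi_i,\ f_1 \rangle
= \varphi_i \cdot f_1
= \varphi_i \cdot (f - f_0)
= \sum_{L \ni i} f(L) - \binom{14}{4}\,\bar{f}
= PM_i - 1001\,\bar{f}.
$$

In the two-on-two example of section 4 the same argument gives $PM_i - 4 \cdot 36.6$; for player 3 this is 220 − 146.4 = 73.6, the value in Table 4.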

The second-order effects are given in Table 6, and quantify the contributions of player pairs, having removed the mean, individual, and higher-order group effects. The top and bottom five pairs (in terms of SCLP) are presented here, with more complete data in Table 16 in the appendix.

Table 6

Top and bottom five SCLP pairs with at least 200 possessions, along with raw plus-minus and possessions

| P1 | P2 | SCLP | PM | Poss |
|---|---|---|---|---|
| Draymond Green | Stephen Curry | 13.3 | 979.9 | 5102 |
| Stephen Curry | Klay Thompson | 11.2 | 827.8 | 4311 |
| Draymond Green | Klay Thompson | 11.1 | 847.8 | 4678 |
| Leandro Barbosa | Marreese Speights | 5.3 | 76.2 | 983 |
| Draymond Green | Andre Iguodala | 4.3 | 490.0 | 2165 |
| Draymond Green | Ian Clark | -7.2 | 33.3 | 424 |
| Klay Thompson | Leandro Barbosa | -7.2 | 4.8 | 349 |
| Stephen Curry | Ian Clark | -8.1 | 14.0 | 220 |
| Draymond Green | Anderson Varejao | -9.5 | 7.2 | 217 |
| Stephen Curry | Anderson Varejao | -10.1 | -26.9 | 237 |

Even after accounting for and removing their strong individual contributions, however, it is notable that Green–Curry, Curry–Thompson, and Green–Thompson are the dominant pair contributors by a considerable margin, with SCLP values that are all more than twice as large as for the next largest pair (Barbosa–Speights). These large positive SCLP values represent true synergies: These pairs contribute to team success as pure pairs. The fact that the individual contributions of the constituent players are also positive results in a stacking of value within a lineup that provides a quantifiable way of assessing whether the whole does indeed add to more than the sum of its parts.

Reserves Leandro Barbosa, Marreese Speights, and Ian Clark, on the other hand, were poor individual contributors, but manage to combine effectively in several pairs. In particular, the Barbosa–Speights pairing is notable as the fourth best pure pair on the team (in 983 possessions). After accounting for individual contributions, lineups that include the Barbosa–Speights pairing benefited from a real synergy that positively contributed to team success. This suggests favoring, when feasible, lineup combinations with those two players together to leverage this synergy and mitigate their individual weaknesses.

Tables 7 and 8 show pair values for players Andrew Bogut and Shaun Livingston (again in pairs with at least 150 possessions, and with more detailed tables in the appendix). Both players are interesting with respect to second order effects. While Bogut was a positive individual contributor, and was a member of the Warriors’ dominant starting lineup that season, he largely fails to find strong pairings. His best pairings are with Klay Thompson and Harrison Barnes, while he pairs particularly poorly with Andre Iguodala (in a considerable 785 possessions). This raises interesting questions as to why Bogut’s style of play is better suited to players like Thompson or Barnes rather than players like Curry or Iguodala. Also noteworthy is the fact that the Bogut–Iguodala pairing has a positive plus-minus value of 107. The spectral interpretation is that this pairing’s success should be attributed to the individual contributions of the players, and once those contributions are removed, the group lacks value as a pure pair.

Table 7

Select pairs involving Andrew Bogut (with at least 150 possessions)

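| P1 | P2 | SCLP | PM | Poss |
|---|---|---|---|---|
| Andrew Bogut | Klay Thompson | 3.7 | 394.3 | 2637 |
| Andrew Bogut | Harrison Barnes | 2.1 | 206.2 | 1527 |
| Andrew Bogut | Stephen Curry | 1.6 | 378.5 | 2530 |
| Andrew Bogut | Andre Iguodala | -2.1 | 107.0 | 785 |

Table 8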

Select pairs involving Shaun Livingston (with at least 150 possessions)

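| P1 | P2 | SCLP | PM | Poss |
|---|---|---|---|---|
| Shaun Livingston | Anderson Varejao | 2.0 | -1.5 | 174 |
| Shaun Livingston | Marreese Speights | 1.6 | 17.8 | 1014 |
| Shaun Livingston | Draymond Green | 1.2 | 323.6 | 1486 |
| Shaun Livingston | Andre Iguodala | -1.3 | 65.2 | 1605 |
| Shaun Livingston | Klay Thompson | -3.6 | 111.8 | 1412 |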

Shaun Livingston, on the other hand, played an important role as a reserve point guard for the Warriors. Interestingly, Livingston’s worst pairing by far was with Klay Thompson. Again, considering the particular styles of these players compels interesting questions from the perspective of analyzing team and lineup compositions and playing style. It’s also noteworthy that this particular pairing saw 1412 possessions, and it seems entirely plausible that its underlying weakness was overlooked due to the healthy 111.8 plus-minus with that pair on the floor. The success of those lineups should be attributed to other, better synergies. For example, one rotation added Livingston as a sub for Barnes (112 possessions). Another put Livingston and Speights with Thompson, Barnes, and Iguodala (70 possessions). Finally, it’s also interesting to note that Livingston appears to pair better with other reserves than with starters (save Draymond Green, further highlighting Green’s overall value), an observation that raises important questions about how players understand and occupy particular roles on the team.

Table 9 shows the best and worst triples with at least 200 possessions.

Table 9

Best and worst third-order effects for GSW with at least 200 possessions

P1 | P2 | P3 | SCLP | PM | Poss
Draymond Green | Stephen Curry | Klay Thompson | 12.6 | 812.7 | 4085
Draymond Green | Klay Thompson | Harrison Barnes | 5.9 | 427.3 | 2473
Draymond Green | Stephen Curry | Andre Iguodala | 5.8 | 464.8 | 1830
Stephen Curry | Klay Thompson | Harrison Barnes | 5.7 | 416.5 | 2431
Stephen Curry | Klay Thompson | Andrew Bogut | 4.9 | 382.2 | 2296
Stephen Curry | Andre Iguodala | Brandon Rush | -3.8 | -13.5 | 207
Draymond Green | Stephen Curry | Marreese Speights | -4.1 | 97.9 | 299
Draymond Green | Klay Thompson | Marreese Speights | -4.5 | 52.2 | 250
Draymond Green | Klay Thompson | Ian Clark | -5.8 | 9.8 | 316
Draymond Green | Stephen Curry | Ian Clark | -7.4 | 14.5 | 205

The grouping of Green–Curry–Thompson is far and away the most dominant triple, and safely (and unsurprisingly) earns designation as the Warriors’ big three. Other notable triples include starters like Green and Curry or Green and Thompson together with Andre Iguodala, who came off the bench, and more lightly used triples like Curry–Barbosa–Speights, which had an SCLP of 4.6 in 245 possessions. Analyzing subpairs of these groups shows a better stacking of synergies in the triples that include Iguodala: he pairs well with Green, Curry, and Thompson in the second-order space as well, while both Barbosa and Speights paired poorly with Curry. Still, Barbosa with Speights was quite strong as a pair, and we see that the addition of Curry does provide added value as a pure triple. Interesting ineffective triples include Iguodala and Bogut with either Curry or Green, especially in light of the fact that Bogut–Iguodala was also a weak pairing (see detailed tables in the appendix).

Figure 1 shows that the most effective player-triples as identified by spectral analysis are positively correlated with higher values of plus-minus.

Fig. 1

Third-order effects for triples with more than 100 possessions for the 2015-2016 Golden State Warriors. The x-axis gives the group’s plus-minus per log possession (PMperLP) while the y-axis shows the spectral contribution per log possession (SCLP). Observations are shaded by number of possessions.

As raw group plus-minus decreases, however, we see considerable variation in the spectral contributions of the groups (and in number of possessions played). This suggests the following narrative: while it may be relatively easy to identify the team’s top groups, it is considerably more difficult to identify positive and negative synergies among the remaining groups, especially when controlling for lower-order contributions. Spectral analysis suggests several opportunities for constructing more optimal lineups with potential for untapped competitive advantage, especially when more obvious dominant groupings are unavailable.

Table 10 shows top and bottom three third-order effects for the 15-16 Boston Celtics. (The appendix includes more complete tables for Boston including effects of all orders.) Figure 2 gives contrasting bar plots of the third-order effects for both Boston and Golden State.

Table 10

Top and bottom three third-order effects for BOS with at least 150 possessions

P1 | P2 | P3 | SCLP | PM | Poss
Evan Turner | Kelly Olynyk | Jonas Jerebko | 2.9 | 110.1 | 879
Isaiah Thomas | Avery Bradley | Jared Sullinger | 2.7 | 177.7 | 2642
Avery Bradley | Jae Crowder | Jared Sullinger | 2.3 | 139.3 | 2216
Isaiah Thomas | Evan Turner | Kelly Olynyk | -1.8 | -30.9 | 870
Avery Bradley | Jared Sullinger | Jonas Jerebko | -2.3 | -11.7 | 194
Isaiah Thomas | Avery Bradley | Jonas Jerebko | -2.4 | -1.6 | 290

Fig. 2

Bar graph of third order spectral contributions per log possession (SCLP) for BOS and GSW for groups with more than 150 possessions.

The Celtics have fewer highly dominant groups. In particular, we note that the spectral signature of the Celtics is distinctly different from that of the Warriors in that Boston lacks anything resembling the big three of Golden State. While SCLP values are not directly comparable across teams (they depend, for instance, on the norm of the overall team success function when projected into each effect space), the relative values within an effect space are comparable. Similarly, the SCLP values also depend on the norm of the interpretable vector used in Mallows’ method. As a result, the values are not directly comparable across effect spaces, a problem we return to below.

In the fourth- and fifth-order spaces the number of high-possession groups begins to decline, as alluded to above. (See appendix for complete tables.) Still, it is interesting to note that spectral analysis flags the Warriors’ small lineup of Green–Curry–Thompson–Barnes–Iguodala as the team’s best, even over the starting lineup with Bogut replacing Iguodala. It also prefers two lesser-used lineups to the Warriors’ second most-used lineup of Green–Curry–Thompson–Bogut–Rush. Also of note is the fact that Golden State’s best group of three and best group of four are both subsets of the starting lineup, another instance of stacking of positive effects, while neither Boston’s best group of three nor its best group of four is part of their starting lineup.

6 Connection with linear models

Before moving on, we consider the connection between spectral analysis and a related approach via linear regression which will likely be more familiar to the sports analytics community.

Recalling our assumption of a 15-man roster, consider the problem of modeling a lineup’s plus-minus, given by f(L) for lineup L, using indicator variables that correspond to all possible groups of players. Label the predictor variables X1, X2, …, Xp, where each variable corresponds to a group of players (with some fixed ordering of the groups). Thus, the variable Xi is 1 when the players from group i are on the floor, and 0 otherwise. If the first fifteen variables X1, X2, …, X15 are the indicator functions of the individual players, then the group variables, the Xi for i > 15, are interaction terms. For instance, the variable corresponding to the group {1, 2, 3} is X1X2X3. This approach is therefore similar to an adjusted plus-minus approach with interactions. Including all possible group effects, however, means that the number of predictors is quite large and, depending on the number of observations, we may be in a situation where p >> N. Moreover, the nature of player usage in lineups means that there is a significant multicollinearity issue. Consequently, an attempt to quantify group effects in a regression model of this sort will rely on a shrinkage technique like ridge regression.
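As a rough illustration of the indicator coding described above, the following Python sketch builds the group-indicator columns for a toy roster. The function name `group_design_matrix`, the 6-player roster, and the three lineups are hypothetical and chosen only for illustration; the intercept column is omitted.

```python
from itertools import combinations
from math import comb

import numpy as np


def group_design_matrix(lineups, roster_size, max_order):
    """Build indicator columns for every player group up to max_order.

    Returns (X, groups) where X[i, j] = 1 if all players in groups[j]
    are on the floor in lineups[i], and 0 otherwise.
    """
    groups = [g for order in range(1, max_order + 1)
              for g in combinations(range(roster_size), order)]
    lineup_sets = [set(lineup) for lineup in lineups]
    X = np.array([[1 if set(g) <= s else 0 for g in groups]
                  for s in lineup_sets])
    return X, groups


# Hypothetical toy data: 6 players (labeled 0-5), three observed lineups.
lineups = [(0, 1, 2, 3, 4), (0, 1, 2, 3, 5), (1, 2, 3, 4, 5)]
X, groups = group_design_matrix(lineups, roster_size=6, max_order=3)

# For a 15-man roster, count the groups of every size from 1 to 5.
p_full = sum(comb(15, order) for order in range(1, 6))
```

In this toy case X has 3 rows and 41 columns (6 individuals, 15 pairs, 20 triples), while `p_full` is 4943 for a 15-man roster, which shows how quickly p can outgrow the number of observed lineups.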

Let N be the number of lineups, and y = f(L), an N × 1 column vector. Let X be the N × (p + 1) matrix whose first column is the vector of all ones and whose i-th row consists of the binary value of each predictor variable for the i-th lineup. The vector of ridge coefficients $\hat{\beta}^{\mathrm{ridge}}$ minimizes the penalized residual sum of squares:

$$\hat{\beta}^{\mathrm{ridge}} = \operatorname*{argmin}_{\beta} \Big\{ \lVert y - X\beta \rVert^{2} + \lambda \sum_{i=1}^{p} \beta_i^{2} \Big\}.$$

The non-negative parameter λ serves as a penalty on the L2-norm of the solution vector. (The intercept is not included in the ridge penalty.) The ridge approach reduces the variability exhibited by the least squares coefficients in the presence of multicollinearity by shrinking the coefficient estimates in the model towards zero (and toward each other). One can show that ridge regression uses the singular values of the covariance matrix associated with the centered version of X to disproportionately shrink coefficients associated with inputs where the data exhibits lower degrees of variance. See Friedman et al. (2001) for details.
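The penalized criterion above has a closed-form minimizer, which the following NumPy sketch computes. It assumes the standard device of centering the predictors and the response so that the intercept stays out of the penalty; the function name `ridge_fit` and the synthetic data are illustrative only and are not the fitting code behind the results reported here.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge fit with an unpenalized intercept.

    X is the (N, p) matrix of predictors WITHOUT the column of ones.
    Centering X and y removes the intercept from the penalized problem;
    the intercept is recovered afterwards.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    p = X.shape[1]
    # Normal equations of the penalized problem: (Xc'Xc + lam*I) beta = Xc'yc
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    intercept = y_mean - x_mean @ beta
    return intercept, beta


# Synthetic data (not NBA data): 40 observations of 6 binary predictors.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(40, 6)).astype(float)
true_beta = np.array([1.0, 0.5, 0.0, -0.5, 2.0, 0.0])
y = X @ true_beta + rng.normal(scale=0.1, size=40)

b0_small, beta_small = ridge_fit(X, y, lam=0.1)
b0_large, beta_large = ridge_fit(X, y, lam=50.0)
```

Comparing `beta_small` with `beta_large` shows the shrinkage toward zero as λ grows. With p >> N the matrix Xc'Xc is singular, so a strictly positive λ is what makes the linear system solvable.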

The fitted coefficients $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_p$ in the ridge regression model attempt to measure group contributions: for i ≥ 1, $\hat{\beta}_i$ estimates the contribution of group i while controlling for the contributions of all other groups and individuals. We note that this modeling approach resembles work in Sill (2010), Grassetti et al. (2019a), and Grassetti et al. (2019b), though there are key differences which we explore below. In particular, note that we model group contributions aggregated over all opponents, and without controlling for the quality of the opponents faced. This simplified approach allows for a more direct comparison with the results of spectral analysis above.

Tables 11 and 12 give the ridge regression coefficients associated with the top 5 individuals, pairs, and triples for the Warriors.

Table 11

Best individuals and pairs using the linear model

Individual | Estimate | P1 | P2 | Pair Estimate
Draymond Green | 0.28 | Draymond Green | Stephen Curry | 0.65
Stephen Curry | 0.25 | Stephen Curry | Andrew Bogut | 0.53
Klay Thompson | 0.15 | Stephen Curry | Klay Thompson | 0.47
Andrew Bogut | 0.14 | Draymond Green | Klay Thompson | 0.47
Festus Ezeli | 0.02 | Draymond Green | Andrew Bogut | 0.46

Table 12

Top triples according to the linear model

P1 | P2 | P3 | Estimate
Draymond Green | Stephen Curry | Andrew Bogut | 1.61
Stephen Curry | Klay Thompson | Andrew Bogut | 1.49
Draymond Green | Stephen Curry | Klay Thompson | 1.39
Draymond Green | Klay Thompson | Andrew Bogut | 1.24
Draymond Green | Klay Thompson | Harrison Barnes | 1.03

Comparing with Tables 5, 6, and 9 shows some overlap in the top-rated groups, but also significant differences with respect to both ordering and magnitude of contribution. In particular, the linear model appears to value the contributions of Andrew Bogut considerably more than spectral analysis does. It is also notable that spectral analysis identifies a clearly dominant big three of Green–Curry–Thompson, in contrast to the considerably different result arising from the modeling approach, which ranks that group third.

We can interpret the linear model determined by $\hat{\beta}^{\mathrm{ridge}}$ as giving a decomposition similar to the spectral decomposition in (2). For each lineup L we have predicted success given by

(8) $\hat{y} = X_L \hat{\beta}^{\mathrm{ridge}}$,

where $X_L$ is now the $\binom{15}{5} \times (p + 1)$ matrix whose first column is all 1s, and whose (i, j + 1) entry is 1 if the j-th player group is part of the i-th lineup and 0 otherwise. (We have fixed a particular ordering of lineups.) The columns of $X_L$ (the Xi) that correspond to individual players can be understood as spanning a subspace W1 analogous to V1 in (2). Similarly, W2 is spanned by the columns of $X_L$ corresponding to pair interactions, and so on for all groups through full five-player lineups. The particular linear combinations in each Wi determined by the respective coordinates of $\hat{\beta}^{\mathrm{ridge}}$ are analogous to the projections $\mathrm{pr}_{V_i} f$. In fact, the space of all lineup functions can be written

(9) $V = W_0 + W_1 + W_2 + W_3 + W_4 + W_5$,

where Wi is the space of interaction effects for groups of size i.

Still, there are important differences between (2) and (9). While V0 and W0 are both one-dimensional, for i ≥ 1 the dimensions of the Wi are strictly larger than those of their Vi counterparts. For instance, W5 includes a vector for each possible set of five players from the original fifteen. Similarly, W4 includes a vector for each group of four, and so on. Thus, the dimension of W5 is 3003 (the number of lineups), which is the same as the dimension of V itself. By contrast, the dimension of V5 in (2) is only 1638. Similarly, the dimension of W4 is 1365 while that of V4 is 910. Clearly, the decomposition in (9) is highly non-orthogonal (explaining the + rather than ⊕ notation). It is easy to find vectors in Wi that overlap with Wj in the sense that their inner product is non-zero. In the context of basketball, the contribution of a group of, for example, 5 players is not necessarily separate from that of a constituent group of four (or any other number of) players, despite the use of shrinkage methods.
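The dimension counts can be checked with a few lines of Python. The sketch assumes the standard formula for this kind of decomposition of functions on 5-player subsets of a 15-player roster, namely dim Wi = C(15, i) and dim Vi = C(15, i) − C(15, i − 1); the variable names are illustrative.

```python
from math import comb

n, k = 15, 5  # roster size and lineup size

# W_i has one spanning vector per group of i players.
dim_W = [comb(n, i) for i in range(k + 1)]

# V_i is what remains of the order-i span after removing lower orders.
dim_V = [comb(n, i) - (comb(n, i - 1) if i > 0 else 0) for i in range(k + 1)]

# Share of the full 3003-dimensional space taken by each V_i.
null_share = [round(d / comb(n, k), 3) for d in dim_V]
```

The Vi dimensions sum to 3003, matching the dimension of V, and `null_share` agrees with the Null row of Table 13.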

The decomposition in (2) is special in that it gives minimal subspaces that are invariant under relabeling and mutually orthogonal, as described in section 3. As we’ve seen, spectral analysis achieves this at the expense of easy interpretation of group contributions. This is a drawback of spectral analysis that (9) does not share, and ease of interpretation is an appealing feature of regression models: the interaction term associated with a group of i players in a regression model is easy to understand. Still, as we see above, one must trade off ease of interpretation against orthogonality of effects.

7 Stability

In this section we take a first step toward addressing questions of the stability of spectral analysis. We seek evidence that spectral analysis is indicative of a true signal, and that had the data turned out slightly differently, the analysis would not change dramatically. Since spectral analysis works on the lineup function f(L), which is aggregated over all of a team’s plays involving L, we need to introduce variability into the values of f(L). A fully aggregated NBA season is, in a sense, a complete record of all events and lineup outcomes in that season. Still, it seems reasonable to leverage the variability inherent in the many observed results of a lineup’s plays, as well as the substitution patterns of coaches, and suggest a bootstrapping approach.

To that end, we start with the actual 15-16 season for the Boston Celtics. We can then build a bootstrapped season by sampling plays, with replacement, from the set of all plays in the actual season. (We sample the same number of plays as in the actual season.) A play is defined as a connected sequence of events surrounding a possession in the team’s play-by-play data. For example, a play might involve a sequence like a missed shot, offensive rebound, and a made jump shot; or, a defensive rebound followed by a bad pass turnover. When sampling from a team’s plays, a particular lineup will be selected with a probability proportional to the number of plays in which that lineup participated. We generate 500 bootstrapped seasons, process each using the methodology of sections 2 and 3 to produce success functions fboot, and then apply spectral analysis to each. We thus have a bootstrapped distribution of lineup plus-minus and possession values over each lineup L, which in turn gives plus-minus and possession distributions of all player-groups. While the number of possessions played is highly stable for both full lineups and smaller player-groups, there is considerable variability in plus-minus values over the bootstrapped seasons. Lineups with a significant number of possessions exhibit both positive and negative performance, and the balance between the positive and negative plays is delicate.
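The resampling step can be sketched as follows. This is a minimal illustration only: it assumes each play has already been reduced to a lineup label and a point differential, both hypothetical inputs, and it stops at per-lineup aggregation rather than reproducing the processing of sections 2 and 3.

```python
import numpy as np


def bootstrap_season(play_lineup, play_pm, rng):
    """Resample plays with replacement and re-aggregate by lineup.

    play_lineup[i] is a hashable lineup label for play i and play_pm[i]
    is that play's point differential (both hypothetical inputs).
    Returns dicts mapping lineup -> total plus-minus and lineup -> plays.
    """
    n_plays = len(play_lineup)
    # Draw as many plays as the actual season contains, with replacement.
    sampled = rng.integers(0, n_plays, size=n_plays)
    plus_minus, plays = {}, {}
    for i in sampled:
        lineup = play_lineup[i]
        plus_minus[lineup] = plus_minus.get(lineup, 0) + play_pm[i]
        plays[lineup] = plays.get(lineup, 0) + 1
    return plus_minus, plays


# Hypothetical toy season: two lineups, six plays.
play_lineup = ["A", "A", "A", "A", "B", "B"]
play_pm = [2, 0, -3, 2, 3, -2]
rng = np.random.default_rng(1)
seasons = [bootstrap_season(play_lineup, play_pm, rng) for _ in range(500)]

# Bootstrapped distribution of lineup A's plus-minus.
pm_A = np.array([pm.get("A", 0) for pm, _ in seasons])
```

Because plays are drawn uniformly, a lineup appears in a resampled season with probability proportional to its share of plays, as described above.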

The variability in group PM presents a challenge in gauging the stability of the spectral analysis associated with a player group. Take, for example, the Thomas–Bradley–Crowder triple for the Celtics. The actual season’s plus-minus for this group was 154.8 in 2572 possessions. Over the bootstrapped seasons the group has means of 145.9 and 2574.1 for plus-minus and possessions, respectively. On the other hand, the standard deviation of the plus-minus values is 82.8 versus only 47.7 for possessions. Thus, some of the variability in the spectral contribution of the group over the bootstrapped seasons should be expected since, in fact, the group was less effective in some of those seasons. Figure 3 shows SCLP plotted against PMperLP for the Thomas–Bradley–Crowder triple in 500 bootstrapped seasons. Of course, spectral analysis purports to do more than raw plus-minus by removing otherwise confounding collinearities and overlapping effects. Not surprisingly, therefore, we still see variability in SCLP within a band of plus-minus values, but the overall positive correlation, whereby SCLP increases in seasons where the group tended to outscore its opponents, is reasonable.

Fig. 3

Spectral contribution per log possession (SCLP) versus plus-minus per log possession (PMperLP) for Thomas–Bradley–Crowder triple in 500 bootstrapped seasons. Each bootstrapped season consists of sampling plays (connected sequences of game events) with replacement from the set of all season plays. Resampled season data is then processed as in section 2 and group contributions are computed via spectral analysis as in section 3.

Also intuitively, the strength of the correlation between group plus-minus and spectral contribution depends on the number of possessions played. Fewer possessions means that a group’s contribution is more dependent on other groups and hence exhibits more variability. The Thomas–Bradley–Crowder triple in Fig. 3 has a mean of 2574 possessions and a Pearson correlation of r = 0.953. The group Thomas–Turner–Zeller, on the other hand, has r = 0.688 with a mean of 305 possessions. A group like Jared Sullinger–Marcus Smart is particularly interesting. This pair has a season plus-minus of 25.0 in 1116 possessions. In 500 bootstrap seasons, they have a mean plus-minus of 23.6 and mean possessions of 1118.3. The value of the group’s plus-minus is negative in only 32.4% of those seasons. Should this group, therefore, be considered effective overall? Spectral analysis answers with a fairly emphatic no. After removing other group contributions, their SCLP as a pure pair is negative in 90.6% of bootstrapped seasons, while still exhibiting strong correlation with overall plus-minus (r = 0.73). Similarly, the Bradley–Smart pair has a season plus-minus of 45.3 in 1679 possessions. In 500 bootstrap seasons, they have a mean plus-minus of 40.4 and mean possessions of 1679. Their plus-minus is negative in 27% of those seasons while their spectral contribution is negative in 81% of bootstrapped seasons.

8 Importance of effect spaces

Another natural question is how to value the relative importance of the group-effect spaces. One way to gauge importance uses the squared L2 norm of the success function in each space. Since the spaces are mutually orthogonal, we have $\lVert f \rVert^2 = \lVert f_0 \rVert^2 + \lVert f_1 \rVert^2 + \lVert f_2 \rVert^2 + \lVert f_3 \rVert^2 + \lVert f_4 \rVert^2 + \lVert f_5 \rVert^2$. (Recall that $f_i$ is the projection of f onto the i-th order effect space Vi, with $f_0$ the projection onto the mean space V0.) One can then measure the fraction of the total mass of f that is concentrated in each effect space. For example, if we found that the mass of the success function was concentrated in the mean space, so that a constant function gave a good approximation to f, we could conclude that the particular lineup used by this team was largely irrelevant: the success of the team never strayed far from the mean and was not strongly affected by any groups. This would be an easy team to coach. Of course, this is not the case in basketball, as evidenced by the distribution of the squared L2 norm for the sample of teams in Table 13.

Table 13

Distribution of the squared L2-norm of the team success function over the effect spaces

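Team | V0 | V1 | V2 | V3 | V4 | V5
BOS | 0.001 | 0.012 | 0.048 | 0.138 | 0.297 | 0.504
CLE | 0.003 | 0.021 | 0.058 | 0.150 | 0.301 | 0.467
GSW | 0.003 | 0.031 | 0.092 | 0.203 | 0.312 | 0.360
HOU | 0.000 | 0.007 | 0.037 | 0.123 | 0.285 | 0.548
OKC | 0.001 | 0.011 | 0.038 | 0.137 | 0.304 | 0.510
POR | 0.000 | 0.004 | 0.027 | 0.112 | 0.289 | 0.568
SAS | 0.007 | 0.027 | 0.072 | 0.173 | 0.294 | 0.427
Null | 0.000 | 0.005 | 0.030 | 0.117 | 0.303 | 0.545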

By this measure, the higher-order spaces are dominant as they hold most of the mass of the success function. An issue with this metric, however, is the disparity in the dimensions of the spaces. Because V5 is 1638-dimensional, we might expect the mass of f to be disproportionately concentrated in that space. In fact, a random unit vector projected into each of the effect spaces would be, on average, distributed according to the null distribution in Table 13, with mass proportional to the dimension of each of the spaces in question.
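The mass fractions can be illustrated on a toy roster small enough to handle with dense linear algebra. The sketch below is not the computation used for Table 13. It assumes that the order-j effect space can be obtained as the part of the span of j-player group indicators that is orthogonal to the span of (j − 1)-player group indicators, and it uses made-up success values for 3-player "lineups" drawn from 6 players.

```python
from itertools import combinations

import numpy as np


def effect_components(f, n, k):
    """Split f (indexed by k-subsets of n players) into orders 0..k.

    M_j is the span of the indicators "group g is contained in the lineup"
    over all j-player groups g. Since M_0 ⊂ M_1 ⊂ ... ⊂ M_k, the order-j
    component is (projection onto M_j) minus (projection onto M_{j-1}).
    """
    lineups = list(combinations(range(n), k))
    components, previous = [], np.zeros(len(lineups))
    for j in range(k + 1):
        A = np.array([[1.0 if set(g) <= set(L) else 0.0
                       for g in combinations(range(n), j)]
                      for L in lineups])
        # Least squares gives the orthogonal projection of f onto M_j.
        coef, *_ = np.linalg.lstsq(A, f, rcond=None)
        projection = A @ coef
        components.append(projection - previous)
        previous = projection
    return components


# Toy example: 6 players, 3-player lineups, made-up success values.
rng = np.random.default_rng(0)
f = rng.normal(size=20)  # 20 = C(6, 3) lineups
parts = effect_components(f, n=6, k=3)

# Fraction of the squared L2 norm of f held by each order.
fractions = np.array([p @ p for p in parts]) / (f @ f)
```

The entries of `fractions` sum to one because the components are mutually orthogonal and add up to f. For the full 15-player, 5-man setting the same idea applies, though the dense matrices become large.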

Moreover, we can take the true success function of a team and break the dependence on the actual player groups as follows. Recall that the raw data f records the plus-minus for each of the 3003 possible lineups. We take f and randomly permute its values so that there is no connection between a lineup and the value associated with that lineup. The overall plus-minus and mean of f, however, are preserved. We can then run spectral analysis on the permuted f and record the distribution of the squared L2 norm in each space. Repeating this experiment 500 times for both GSW and BOS gives the means in Table 14, which closely conform to the null distribution in Table 13.

Table 14

Average fraction of squared L2 mass by order effect space using randomly permuted success function

Space | BOS | GSW
First | 0.005 | 0.005
Second | 0.030 | 0.030
Third | 0.117 | 0.116
Fourth | 0.302 | 0.302
Fifth | 0.543 | 0.544
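The permutation experiment can be mimicked on a toy example. The sketch below uses made-up values on 3-player lineups drawn from 6 players rather than the NBA data, and it builds the effect-space projectors from nested spans of group indicators, which is an assumption about how the spaces can be computed and not necessarily the implementation behind Table 14.

```python
from itertools import combinations

import numpy as np


def effect_projectors(n, k):
    """Orthogonal projectors onto effect spaces of order 0..k (toy sizes only)."""
    lineups = list(combinations(range(n), k))
    projectors, previous = [], np.zeros((len(lineups), len(lineups)))
    for j in range(k + 1):
        A = np.array([[1.0 if set(g) <= set(L) else 0.0
                       for g in combinations(range(n), j)]
                      for L in lineups])
        P = A @ np.linalg.pinv(A)  # projector onto the column space of A
        projectors.append(P - previous)
        previous = P
    return projectors


def mass_fractions(f, projectors):
    """Fraction of the squared L2 norm of f in each effect space."""
    total = f @ f
    return np.array([f @ (Q @ f) / total for Q in projectors])


rng = np.random.default_rng(0)
Q = effect_projectors(6, 3)
f = rng.normal(size=20)  # made-up success values for 20 lineups

# Permute the values of f 500 times and average the mass fractions.
perm_mean = np.mean(
    [mass_fractions(rng.permutation(f), Q) for _ in range(500)], axis=0)

# Dimension of each effect space is the trace of its projector.
dims = np.array([int(round(float(np.trace(q)))) for q in Q])
```

Permuting f leaves its mean unchanged, so the order-0 share is the same in every replicate. The remaining mass in `perm_mean` should be spread over the higher-order spaces roughly in proportion to `dims`, which mirrors the agreement between Table 14 and the null row of Table 13.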

An alternative measure of the importance of each effect space is given by measuring the extent to which projections onto Vi deviate from the null distribution. By this measure of importance, there is some preliminary evidence that strong teams shift the mass of f from V5 into lower-order spaces, particularly V1, V2, and V3. This is interesting as it agrees with the idea that building an elite team requires a group of three stars. Using all 30 NBA teams, we compute correlations of r = 0.51, r = 0.58, and r = 0.55, respectively, between win percentage and the projected mass of f in the first-, second-, and third-order spaces. Win percentage and fifth-order projection have correlation coefficient r = -0.54. As pointed out in Diaconis (1989), however, care must be taken when looking at deviation from the null distribution if the projections are highly structured and lie close to a few of the interpretable vectors. This is a direction for further inquiry.

9 Conclusion

Spectral analysis offers a new approach to understanding and quantifying group effects in basketball. By thinking of the success of a team as a function on lineups, we can exploit the structure of functions on permutations to decompose the team success function. The resulting Fourier expansion is naturally interpreted as quantifying group contributions to overall team success. The resulting analysis brings insight into important and difficult questions like which groups of players work effectively together, and which do not. Furthermore, the spectral analysis approach is unique in addressing questions of lineup synergies by presenting an EDA summary of the actual team data without making the kind of modeling or skill-based assumptions of other methods.

There are several directions for future work. First, the analysis presented used raw lineup-level plus-minus to measure success. This approach has the advantage of keeping the analysis tethered to data that is intuitive, and helps avoid pitfalls arising from low-possession lineups. Still, adjusting the lineup-level plus-minus to account for quality of opponent, for example, seems like a valuable next step. Another straightforward adjustment to raw plus-minus data would involve devaluing so-called garbage time possessions when the outcome of the game is not in question.

As presented here, spectral analysis provides an in-depth exploratory analysis of a team’s lineups. Still, the results of spectral analysis could also add valuable inputs to more traditional predictive models or machine learning approaches to projecting group effects. Similarly, it would be interesting to use spectral analysis as a practical tool for lineup suggestions. While the orthogonality of the spectral decomposition facilitates valuation of pure player-groups, the question of lineup construction realistically begins at the level of individuals and works up, hopefully stacking the contributions of individuals with strong pairs, triples, and so on. A strong group of three, for instance, without any strong individual players may be interesting from an internal development perspective, or at the edges of personnel utility, but may also be of limited practical value from the perspective of constructing a strong lineup. Development of a practical tool would likely require further analysis of the ideas in sections 7 and 8 based on ideas in Diaconis et al. (1998). For example, given data (a function on lineups), we might fix the projection of that data onto certain spaces (like the first or second order), and then generate new sample data conditional on that fixed projection. The resulting projections in the higher-order spaces would give some evidence for how the fixed lower-order projections affect the mass of f in the higher-order effect spaces. This would help give a more detailed sense of the variability of projections, and a more definitive answer to the question of which spaces are most important, and how the spectral signature of a team correlates with team success. With that information in place, one could build tools to suggest lineup replacements that maximize the stacking of a team’s most important groups.

Appendices

10 Appendix

We include more detailed tables reporting group effects of all orders for both Golden State and Boston.

See tables 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, and 27.

Table 15

All first order effects for GSW

Player | SCLP | PM | Poss
Draymond Green | 17.2 | 1038.4 | 5800
Stephen Curry | 15.9 | 978.7 | 5610
Klay Thompson | 12.0 | 808.6 | 5453
Andre Iguodala | 3.5 | 436.1 | 3516
Andrew Bogut | 2.8 | 403.6 | 2951
Harrison Barnes | 2.2 | 384.7 | 4138
Festus Ezeli | -1.9 | 225.0 | 1550
Shaun Livingston | -2.0 | 211.1 | 2980
Leandro Barbosa | -5.8 | 70.6 | 2144
Brandon Rush | -7.1 | 23.3 | 2087
Marreese Speights | -7.4 | 20.0 | 1630
Ian Clark | -9.8 | -51.9 | 1108
Anderson Varejao | -11.1 | -34.4 | 368
Jason Thompson | -11.2 | -33.8 | 339
James Michael McAdoo | -12.1 | -85.0 | 526

Table 16

Second order effects for GSW with at least 200 possessions

P1 | P2 | SCLP | PM | Poss
Draymond Green | Stephen Curry | 13.3 | 979.9 | 5102
Stephen Curry | Klay Thompson | 11.2 | 827.8 | 4311
Draymond Green | Klay Thompson | 11.1 | 847.8 | 4678
Leandro Barbosa | Marreese Speights | 5.3 | 76.2 | 983
Draymond Green | Andre Iguodala | 4.3 | 490.0 | 2165
Klay Thompson | Andre Iguodala | 4.2 | 411.4 | 1764
Stephen Curry | Andre Iguodala | 3.9 | 460.0 | 2185
Klay Thompson | Harrison Barnes | 3.9 | 396.0 | 3058
Klay Thompson | Andrew Bogut | 3.7 | 394.3 | 2637
Leandro Barbosa | Ian Clark | 3.5 | -9.6 | 325
Draymond Green | Harrison Barnes | 3.3 | 445.1 | 2634
Stephen Curry | Harrison Barnes | 3.2 | 423.3 | 2809
Marreese Speights | Ian Clark | 2.5 | -44.2 | 493
Harrison Barnes | Andrew Bogut | 2.1 | 206.2 | 1527
Leandro Barbosa | Brandon Rush | 1.8 | -22.4 | 638
Brandon Rush | Ian Clark | 1.7 | -64.6 | 463
Andre Iguodala | Festus Ezeli | 1.7 | 152.6 | 999
Stephen Curry | Andrew Bogut | 1.6 | 378.5 | 2530
Shaun Livingston | Marreese Speights | 1.6 | 17.8 | 1014
Leandro Barbosa | Festus Ezeli | 1.3 | 26.2 | 468
Stephen Curry | Brandon Rush | -2.5 | 140.7 | 1260
Harrison Barnes | Marreese Speights | -2.9 | -48.8 | 794
Draymond Green | Brandon Rush | -3.0 | 144.4 | 1266
Klay Thompson | Festus Ezeli | -3.1 | 138.9 | 824
Harrison Barnes | Brandon Rush | -3.1 | -50.1 | 546
Harrison Barnes | Festus Ezeli | -3.2 | 9.5 | 598
Draymond Green | Leandro Barbosa | -3.3 | 154.7 | 860
Klay Thompson | Shaun Livingston | -3.6 | 111.8 | 1412
Stephen Curry | Leandro Barbosa | -3.7 | 126.3 | 883
Draymond Green | Marreese Speights | -4.0 | 129.7 | 492
Andre Iguodala | Brandon Rush | -4.4 | -59.9 | 399
Klay Thompson | Ian Clark | -5.1 | 10.3 | 498
Stephen Curry | Marreese Speights | -5.7 | 73.5 | 423
Klay Thompson | Marreese Speights | -6.4 | -5.1 | 581
Klay Thompson | James Michael McAdoo | -7.1 | -28.9 | 241
Draymond Green | Ian Clark | -7.2 | 33.3 | 424
Klay Thompson | Leandro Barbosa | -7.2 | 4.8 | 349
Stephen Curry | Ian Clark | -8.1 | 14.0 | 220
Draymond Green | Anderson Varejao | -9.5 | 7.2 | 217
Stephen Curry | Anderson Varejao | -10.1 | -26.9 | 237

Table 17

Third order effects for GSW with at least 200 possessions

P1 | P2 | P3 | SCLP | PM | Poss
Draymond Green | Stephen Curry | Klay Thompson | 12.6 | 812.7 | 4085
Draymond Green | Klay Thompson | Harrison Barnes | 5.9 | 427.3 | 2473
Draymond Green | Stephen Curry | Andre Iguodala | 5.8 | 464.8 | 1830
Stephen Curry | Klay Thompson | Harrison Barnes | 5.7 | 416.5 | 2431
Stephen Curry | Klay Thompson | Andrew Bogut | 4.9 | 382.2 | 2296
Stephen Curry | Leandro Barbosa | Marreese Speights | 4.6 | 84.6 | 245
Draymond Green | Stephen Curry | Andrew Bogut | 4.1 | 377.4 | 2346
Draymond Green | Stephen Curry | Harrison Barnes | 4.1 | 411.3 | 2421
Stephen Curry | Andre Iguodala | Festus Ezeli | 4.1 | 197.2 | 633
Draymond Green | Klay Thompson | Andre Iguodala | 4.0 | 388.4 | 1418
Draymond Green | Klay Thompson | Andrew Bogut | 4.0 | 359.8 | 2409
Stephen Curry | Klay Thompson | Andre Iguodala | 4.0 | 377.0 | 1270
Draymond Green | Leandro Barbosa | Marreese Speights | 3.8 | 88.9 | 248
Draymond Green | Andre Iguodala | Festus Ezeli | 3.2 | 180.8 | 569
Klay Thompson | Harrison Barnes | Andre Iguodala | 2.3 | 199.5 | 671
Leandro Barbosa | Marreese Speights | Ian Clark | -2.1 | -0.2 | 203
Draymond Green | Andre Iguodala | Andrew Bogut | -2.4 | 79.4 | 535
Draymond Green | Shaun Livingston | Andrew Bogut | -2.5 | 47.4 | 370
Klay Thompson | Harrison Barnes | Marreese Speights | -2.6 | -31.4 | 323
Stephen Curry | Andre Iguodala | Andrew Bogut | -2.8 | 70.2 | 541
Klay Thompson | Harrison Barnes | Festus Ezeli | -2.9 | -1.2 | 353
Draymond Green | Stephen Curry | Leandro Barbosa | -3.0 | 126.9 | 687
Stephen Curry | Klay Thompson | Shaun Livingston | -3.0 | 121.3 | 530
Klay Thompson | Harrison Barnes | Brandon Rush | -3.1 | -1.0 | 265
Draymond Green | Harrison Barnes | Festus Ezeli | -3.3 | 16.0 | 326
Stephen Curry | Andre Iguodala | Brandon Rush | -3.8 | -13.5 | 207
Draymond Green | Stephen Curry | Marreese Speights | -4.1 | 97.9 | 299
Draymond Green | Klay Thompson | Marreese Speights | -4.5 | 52.2 | 250
Draymond Green | Klay Thompson | Ian Clark | -5.8 | 9.8 | 316
Draymond Green | Stephen Curry | Ian Clark | -7.4 | 14.5 | 205

Table 18

Fourth order effects for GSW with at least 150 possessions

P1 | P2 | P3 | P4 | SCLP | PM | Poss
Draymond Green | Stephen Curry | Klay Thompson | Harrison Barnes | 8.7 | 401.6 | 2271
Draymond Green | Stephen Curry | Klay Thompson | Andrew Bogut | 7.8 | 365.7 | 2159
Draymond Green | Stephen Curry | Klay Thompson | Andre Iguodala | 7.7 | 364.9 | 1157
Draymond Green | Andre Iguodala | Shaun Livingston | Festus Ezeli | 3.9 | 76.8 | 201
Draymond Green | Stephen Curry | Leandro Barbosa | Marreese Speights | 3.9 | 67.8 | 173
Draymond Green | Stephen Curry | Andre Iguodala | Festus Ezeli | 3.8 | 170.3 | 526
Stephen Curry | Klay Thompson | Harrison Barnes | Andre Iguodala | 3.3 | 171.3 | 451
Draymond Green | Stephen Curry | Harrison Barnes | Andrew Bogut | 2.8 | 162.3 | 1165
Stephen Curry | Andre Iguodala | Shaun Livingston | Festus Ezeli | 2.7 | 64.9 | 201
Draymond Green | Stephen Curry | Klay Thompson | Brandon Rush | 2.4 | 177.9 | 870
Draymond Green | Stephen Curry | Harrison Barnes | Andre Iguodala | 2.3 | 157.7 | 419
Draymond Green | Klay Thompson | Harrison Barnes | Andre Iguodala | 2.0 | 158.0 | 417
Draymond Green | Klay Thompson | Harrison Barnes | Andrew Bogut | 1.8 | 158.4 | 1221
Draymond Green | Stephen Curry | Shaun Livingston | Festus Ezeli | 1.7 | 75.7 | 198
Stephen Curry | Klay Thompson | Harrison Barnes | Andrew Bogut | 1.5 | 154.0 | 1235
Draymond Green | Stephen Curry | Andre Iguodala | Andrew Bogut | -1.3 | 79.3 | 433
Draymond Green | Stephen Curry | Harrison Barnes | Shaun Livingston | -1.7 | 57.4 | 214
Harrison Barnes | Andre Iguodala | Shaun Livingston | Festus Ezeli | -2.0 | -20.7 | 160
Draymond Green | Stephen Curry | Klay Thompson | Shaun Livingston | -2.1 | 116.5 | 485
Draymond Green | Stephen Curry | Harrison Barnes | Brandon Rush | -2.1 | 23.0 | 160
Draymond Green | Klay Thompson | Harrison Barnes | Festus Ezeli | -2.3 | 17.1 | 299
Stephen Curry | Klay Thompson | Harrison Barnes | Brandon Rush | -2.4 | 30.3 | 152
Stephen Curry | Klay Thompson | Harrison Barnes | Festus Ezeli | -2.5 | 19.0 | 309
Draymond Green | Stephen Curry | Harrison Barnes | Festus Ezeli | -3.0 | 17.7 | 309
Draymond Green | Klay Thompson | Andre Iguodala | Shaun Livingston | -3.0 | 18.2 | 261

Table 19

Fifth order effects for GSW with at least 80 possessions

P1 | P2 | P3 | P4 | P5 | SCLP | PM | Poss
Draymond Green | Stephen Curry | Klay Thompson | Harrison Barnes | Andre Iguodala | 10.3 | 152.9 | 372
Draymond Green | Stephen Curry | Klay Thompson | Harrison Barnes | Andrew Bogut | 7.0 | 142.0 | 1140
Draymond Green | Stephen Curry | Andre Iguodala | Shaun Livingston | Festus Ezeli | 6.9 | 67.3 | 160
Draymond Green | Stephen Curry | Klay Thompson | Andrew Bogut | Brandon Rush | 5.2 | 88.2 | 535
Draymond Green | Stephen Curry | Klay Thompson | Andre Iguodala | Andrew Bogut | 5.1 | 98.2 | 310
Draymond Green | Stephen Curry | Klay Thompson | Andre Iguodala | Festus Ezeli | 4.9 | 85.7 | 266
Draymond Green | Stephen Curry | Klay Thompson | Shaun Livingston | Andrew Bogut | 1.9 | 42.7 | 112
Harrison Barnes | Shaun Livingston | Leandro Barbosa | Brandon Rush | Marreese Speights | 1.8 | 6.4 | 87
Draymond Green | Stephen Curry | Klay Thompson | Harrison Barnes | Shaun Livingston | 0.7 | 42.7 | 175
Harrison Barnes | Andre Iguodala | Shaun Livingston | Leandro Barbosa | Marreese Speights | -0.8 | -3.1 | 172
Harrison Barnes | Andre Iguodala | Shaun Livingston | Leandro Barbosa | Festus Ezeli | -1.3 | -9.9 | 102
Draymond Green | Stephen Curry | Klay Thompson | Harrison Barnes | Festus Ezeli | -1.9 | 20.1 | 283
Draymond Green | Stephen Curry | Klay Thompson | Harrison Barnes | Brandon Rush | -2.3 | 28.0 | 123
Draymond Green | Stephen Curry | Klay Thompson | Andre Iguodala | Shaun Livingston | -5.1 | 2.3 | 98
Draymond Green | Stephen Curry | Klay Thompson | Harrison Barnes | James Michael McAdoo | -5.9 | -14.2 | 91

Table 20

Pairs involving Andrew Bogut (with at least 150 possessions)

P1 | P2 | SCLP | PM | Poss
Andrew Bogut | Klay Thompson | 3.7 | 394.3 | 2637
Andrew Bogut | Harrison Barnes | 2.1 | 206.2 | 1527
Andrew Bogut | Stephen Curry | 1.6 | 378.5 | 2530
Andrew Bogut | Draymond Green | 0.8 | 371.5 | 2596
Andrew Bogut | Brandon Rush | 0.7 | 54.5 | 733
Andrew Bogut | Ian Clark | -0.3 | 6.1 | 198
Andrew Bogut | Shaun Livingston | -0.6 | 77.4 | 573
Andrew Bogut | Leandro Barbosa | -1.6 | 16.3 | 166
Andrew Bogut | Andre Iguodala | -2.1 | 107.0 | 785
Table 21

Pairs involving Shaun Livingston (with at least 150 possessions)

P1 | P2 | SCLP | PM | Poss
Shaun Livingston | Anderson Varejao | 2.0 | -1.5 | 174
Shaun Livingston | Marreese Speights | 1.6 | 17.8 | 1014
Shaun Livingston | Draymond Green | 1.2 | 323.6 | 1486
Shaun Livingston | Ian Clark | 0.9 | -25.7 | 378
Shaun Livingston | Leandro Barbosa | 0.9 | 15.2 | 1210
Shaun Livingston | James Michael McAdoo | 0.8 | -41.6 | 180
Shaun Livingston | Festus Ezeli | 0.4 | 49.0 | 654
Shaun Livingston | Stephen Curry | -0.1 | 265.5 | 1120
Shaun Livingston | Andrew Bogut | -0.6 | 77.4 | 573
Shaun Livingston | Harrison Barnes | -1.1 | 55.2 | 1475
Shaun Livingston | Andre Iguodala | -1.3 | 65.2 | 1605
Shaun Livingston | Brandon Rush | -1.5 | -63.2 | 536
Shaun Livingston | Klay Thompson | -3.6 | 111.8 | 1412
Table 22

Worst triples for GSW with at least 500 possessions

Player 1 | Player 2 | Player 3 | SCLP | PM | Poss
Klay Thompson | Harrison Barnes | Shaun Livingston | -1.1 | 35.3 | 733
Draymond Green | Andre Iguodala | Shaun Livingston | -1.3 | 92.5 | 630
Draymond Green | Klay Thompson | Festus Ezeli | -1.4 | 151.4 | 721
Stephen Curry | Klay Thompson | Festus Ezeli | -1.5 | 152.4 | 694
Draymond Green | Klay Thompson | Shaun Livingston | -1.6 | 160.5 | 929
Draymond Green | Stephen Curry | Brandon Rush | -1.7 | 153.9 | 1116
Draymond Green | Andre Iguodala | Andrew Bogut | -2.4 | 79.4 | 535
Stephen Curry | Andre Iguodala | Andrew Bogut | -2.8 | 70.2 | 541
Draymond Green | Stephen Curry | Leandro Barbosa | -3.0 | 126.9 | 687
Stephen Curry | Klay Thompson | Shaun Livingston | -3.0 | 121.3 | 530
Table 23

First order effects for BOS

Player | SCLP | PM | Poss
Isaiah Thomas | 3.4 | 236.5 | 5388
Avery Bradley | 3.3 | 228.5 | 5099
Jae Crowder | 3.1 | 219.5 | 4685
Jared Sullinger | 3 | 210.8 | 3828
Amir Johnson | 2 | 172.8 | 3580
Kelly Olynyk | 1.7 | 154.8 | 2835
Marcus Smart | 0.9 | 125.3 | 3407
Evan Turner | -0.2 | 81.1 | 4577
Jonas Jerebko | -0.4 | 71.5 | 2346
RJ Hunter | -3.3 | -18.3 | 624
Jordan Mickey | -3.6 | 6 | 106
David Lee | -3.6 | -35 | 945
Tyler Zeller | -3.7 | -44.3 | 1442
James Young | -4 | -31.3 | 392
Terry Rozier | -4.1 | -43 | 616
Table 24

Top ten and bottom five second order effects for BOS with at least 150 possessions

P1 | P2 | SCLP | PM | Poss
Isaiah Thomas | Avery Bradley | 3.5 | 229.8 | 3564
Evan Turner | Jonas Jerebko | 3.0 | 109.3 | 1945
Marcus Smart | Kelly Olynyk | 3.0 | 141.8 | 1298
Avery Bradley | Jared Sullinger | 2.7 | 191.2 | 2969
Tyler Zeller | RJ Hunter | 2.6 | 8.5 | 261
Isaiah Thomas | Jared Sullinger | 2.5 | 188.6 | 3315
Isaiah Thomas | Jae Crowder | 2.3 | 187.0 | 3668
Jae Crowder | Amir Johnson | 2.3 | 162.3 | 2594
Isaiah Thomas | Amir Johnson | 2.1 | 165.8 | 3175
Kelly Olynyk | Jonas Jerebko | 2.0 | 95.6 | 1030
Isaiah Thomas | Evan Turner | -2.2 | 1.0 | 2462
Avery Bradley | Tyler Zeller | -2.5 | -38.0 | 674
Jared Sullinger | Jonas Jerebko | -2.5 | -2.4 | 386
Isaiah Thomas | Tyler Zeller | -2.9 | -41.5 | 455
Avery Bradley | Terry Rozier | -3.4 | -41.9 | 160
Table 25

Top ten and bottom five third order effects for BOS with at least 150 possessions

P1 | P2 | P3 | SCLP | PM | Poss
Evan Turner | Kelly Olynyk | Jonas Jerebko | 2.9 | 110.1 | 879
Isaiah Thomas | Avery Bradley | Jared Sullinger | 2.7 | 177.7 | 2642
Avery Bradley | Jae Crowder | Jared Sullinger | 2.3 | 139.3 | 2216
Isaiah Thomas | Avery Bradley | Jae Crowder | 2.2 | 154.8 | 2572
Isaiah Thomas | Avery Bradley | Amir Johnson | 2.0 | 137.5 | 2351
Evan Turner | Marcus Smart | Jonas Jerebko | 2.0 | 93.7 | 1159
Jae Crowder | Evan Turner | Jonas Jerebko | 2.0 | 61.2 | 460
Isaiah Thomas | Jae Crowder | Jared Sullinger | 1.9 | 140.7 | 2533
Isaiah Thomas | Marcus Smart | Kelly Olynyk | 1.8 | 85.5 | 464
Avery Bradley | Jae Crowder | Amir Johnson | 1.7 | 107.3 | 1894
Avery Bradley | Jae Crowder | Evan Turner | -1.8 | -7.9 | 708
Isaiah Thomas | Evan Turner | Tyler Zeller | -1.8 | -68.4 | 305
Isaiah Thomas | Evan Turner | Kelly Olynyk | -1.8 | -30.9 | 870
Avery Bradley | Jared Sullinger | Jonas Jerebko | -2.3 | -11.7 | 194
Isaiah Thomas | Avery Bradley | Jonas Jerebko | -2.4 | -1.6 | 290
Table 26

Top ten and bottom five fourth order effects for BOS with at least 150 possessions

P1 | P2 | P3 | P4 | SCLP | PM | Poss
Avery Bradley | Evan Turner | Kelly Olynyk | Jonas Jerebko | 3.1 | 71.8 | 375
Evan Turner | Marcus Smart | Kelly Olynyk | Jonas Jerebko | 2.7 | 88.0 | 526
Isaiah Thomas | Avery Bradley | Jae Crowder | Jared Sullinger | 2.6 | 120.0 | 2014
Isaiah Thomas | Avery Bradley | Evan Turner | Jared Sullinger | 2.4 | 76.8 | 584
Avery Bradley | Evan Turner | Marcus Smart | Jonas Jerebko | 2.1 | 62.8 | 526
Avery Bradley | Jae Crowder | Jared Sullinger | Kelly Olynyk | 1.9 | 59.7 | 247
Avery Bradley | Marcus Smart | Kelly Olynyk | Jonas Jerebko | 1.7 | 55.6 | 304
Isaiah Thomas | Avery Bradley | Jae Crowder | Kelly Olynyk | 1.7 | 62.8 | 432
Avery Bradley | Evan Turner | Jared Sullinger | Amir Johnson | 1.6 | 42.2 | 343
Avery Bradley | Evan Turner | Marcus Smart | Kelly Olynyk | 1.6 | 44.3 | 423
Jae Crowder | Evan Turner | Jared Sullinger | Marcus Smart | -1.0 | -21.2 | 180
Isaiah Thomas | Avery Bradley | Jae Crowder | Marcus Smart | -1.1 | 2.1 | 281
Evan Turner | Marcus Smart | Jonas Jerebko | Tyler Zeller | -1.1 | 1.5 | 408
Isaiah Thomas | Avery Bradley | Amir Johnson | Marcus Smart | -1.3 | 4.6 | 322
Isaiah Thomas | Avery Bradley | Evan Turner | Kelly Olynyk | -2.6 | -24.2 | 225
Table 27

Top five and bottom three fifth order effects for BOS with at least 100 possessions

P1 | P2 | P3 | P4 | P5 | SCLP | PM | Poss
Avery Bradley | Evan Turner | Marcus Smart | Kelly Olynyk | Jonas Jerebko | 6.1 | 63.0 | 257
Isaiah Thomas | Avery Bradley | Jae Crowder | Jared Sullinger | Kelly Olynyk | 3.4 | 41.9 | 202
Isaiah Thomas | Avery Bradley | Jae Crowder | Jared Sullinger | Amir Johnson | 2.9 | 48.8 | 1413
Isaiah Thomas | Avery Bradley | Evan Turner | Jared Sullinger | Amir Johnson | 2.4 | 33.8 | 256
Isaiah Thomas | Avery Bradley | Jae Crowder | Evan Turner | Jared Sullinger | 1.1 | 23.7 | 148
Isaiah Thomas | Avery Bradley | Jae Crowder | Amir Johnson | Kelly Olynyk | -1.2 | 7.3 | 107
Isaiah Thomas | Avery Bradley | Jared Sullinger | Amir Johnson | Marcus Smart | -1.6 | -3.8 | 128
Isaiah Thomas | Jae Crowder | Evan Turner | Jared Sullinger | Amir Johnson | -1.8 | -7.0 | 105

References

1 

Basketball-Reference. Glossary. https://www.basketball-reference.com/about/glossary.html. [Online; accessed 17-May-2019].

2 

Diaconis, P., 1988, Group representations in probability and statistics, Lecture Notes-Monograph Series, 11, pp. i-vi+1-192. ISSN 07492170. URL http://www.jstor.org/stable/4355560.

3 

Diaconis, P., 1989, A generalization of spectral analysis with application to ranked data, The Annals of Statistics 17(3), 949–979. ISSN 00905364. URL http://www.jstor.org/stable/2241705.

4 

Diaconis, P. and Sturmfels, B., 1998, Algebraic algorithms for sampling from conditional distributions, The Annals of Statistics 26(1), 363–397.

5 

Dummit, D. S. and Foote, R. M., 2004, Abstract algebra. John Wiley & Sons, Hoboken, NJ. ISBN 0-471-43334-9. URL http://opac.inria.fr/record=b1133479.

6 

Friedman, J., Hastie, T. and Tibshirani, R., 2001, The elements of statistical learning, volume 1. Springer Series in Statistics. Springer, Berlin.

7 

Grassetti, L., Bellio, R., Fonseca, G. and Vidoni, P., 2019a, Estimation of lineup efficiency effects in basketball using play-by-play data. In G. Arbia, S. Peluso, A. Pini, and G. Rivellini, editors, Book of Short Papers SIS2019. Pearson.

8 

Grassetti, L., Bellio, R., Fonseca, G. and Vidoni, P., 2019b, Play-by-play data analysis for team managing in basketball. In Dimitris Karlis, Ioannis Ntzoufras, and Sotiris Drikos, editors, Proceedings of Math Sport International 2019 Conference (e-book). Propobos Publications.

9 

Jurman, G., Merler, S., Barla, A., Paoli, S., Galea, A. and Furlanello, C., 2008, Algebraic stability indicators for ranked lists in molecular profiling, Bioinformatics, 24(2), 258–264. doi: 10.1093/bioinformatics/btm550. URL http://bioinformatics.oxfordjournals.org/content/24/2/258.abstract.

10 

Kakarala, R., 2011, A signal processing approach to Fourier analysis of ranking data: The importance of phase, Signal Processing, IEEE Transactions on, 59(4), 1518–1527. ISSN 1053-587X. doi: 10.1109/TSP.2010.2104145.

11 

Kondor, R. and Dempsey, W., 2012, Multiresolution analysis on the symmetric group. In F. Pereira, C.J.C. Burges, L. Bottou, and K.Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pp. 1637-1645. Curran Associates, Inc. URL http://papers.nips.cc/paper/4720-multiresolution-analysis-on-the-symmetric-group.pdf.

12 

Kondor, R., Howard, A. and Jebara, T., 2007, Multi-object tracking with representations of the symmetric group. In AISTATS 2007 Proceedings, 2, pp. 211-218.

13 

Kuehn, J., 2016, Accounting for complementary skill sets when evaluating NBA players' values to a specific team. In 2016 MIT Sloan Sports Analytics Conference, 6, 2016.

14 

Lawson, B. L., Orrison, M. E. and Uminsky, D. T., 2006, Spectral analysis of the Supreme Court, Mathematics Magazine, 79(5), 340–346. ISSN 0025570X. URL http://www.jstor.org/stable/27642969.

15 

Maslen, D. K., Orrison, M. E. and Rockmore, D. N., 2003, Computing isotypic projections with the Lanczos iteration, SIAM Journal on Matrix Analysis and Applications 25(3), 784–803.

16 

Maymin, A. Z., Maymin, P. Z. and Shen, E., 2013, NBA chemistry: Positive and negative synergies in basketball, International Journal of Computer Science in Sport 12(2), 4–23.

17 

Paudel, K. P., Pandit, M. and Dunn, M. A., 2013, Using spectral analysis and multinomial logit regression to explain households’ choice patterns, Empirical Economics, 44(2), 739-760. ISSN 0377-7332. doi: 10.1007/s00181-012-0558-4. URL http://dx.doi.org/10.1007/s00181-012-0558-4.

18 

Serre, J.-P., 2012, Linear representations of finite groups, volume 42. Springer Science & Business Media.

19 

Sill, J., 2010, Improved NBA adjusted +/- using regularization and out-of-sample testing. In MIT Sloan Sports Analytics Conference.

20 

Uminsky, D., Banuelos, M., González-Albino, L., Garza, R. and Nwakanma, S. A., 2019, Detecting higher order genomic variant interactions with spectral analysis. In 2019 27th European Signal Processing Conference (EUSIPCO), pp. 1-5. doi: 10.23919/EUSIPCO.2019.8902725.

21 

Uminsky, D., Garza, R., González, L., Nwakanma, S., Devlin, S. and Banuelos, M., 2018, Detecting higher order variable interactions: A spectral analysis approach. In LatinX in AI Workshop at NeurIPS.