Affiliations: [a] Dipartimento di Ingegneria dell’Informazione, Elettronica e Telecomunicazioni (DIET), Sapienza University of Rome, Rome, Italy
| [b] Department of Computer, Control, and Management Engineering Antonio Ruberti (DIAG), Rome, Italy
| [c] Geomatics Research Group, KU Leuven, Gent, Belgium
| [d] School of Informatics, University of Edinburgh, Edinburgh, UK
Correspondence:
[*] Corresponding author: Simone Scardapane, DIET Department, Sapienza University of Rome, Via Eudossiana 18, 00184, Rome, Italy. E-mails: [email protected], [email protected].
Abstract: This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks. In particular, we focus on neural networks that can dynamically activate or deactivate parts of their computational graph conditionally on their input. Examples include the dynamic selection of, e.g., input tokens, layers (or sets of layers), and sub-modules inside each layer (e.g., channels in a convolutional filter). We first provide a general formalism to describe these techniques in a uniform way. Then, we introduce three notable implementations of these principles: mixture-of-experts (MoE) networks, token selection mechanisms, and early-exit neural networks. The paper aims to provide a tutorial-like introduction to this growing field. To this end, we analyze the benefits of these modular designs in terms of efficiency, explainability, and transfer learning, with a focus on emerging application areas ranging from automated scientific discovery to semantic communication.