Training Neural Networks in Single vs. Double Precision

Item Type Journal paper
Abstract The commitment to single-precision floating-point arithmetic is widespread in the deep learning community. To evaluate whether this commitment is justified, the influence of computing precision (single and double precision) on the optimization performance of the Conjugate Gradient (CG) method (a second-order optimization algorithm) and Root Mean Square Propagation (RMSprop) (a first-order algorithm) has been investigated. Neural networks with one to five fully connected hidden layers, moderate or strong nonlinearity, and up to 4 million network parameters have been trained to minimize the Mean Square Error (MSE). The training tasks were set up so that their MSE minimum was known to be zero. The computing experiments disclosed that single precision can keep up with double precision (with superlinear convergence) as long as the line search finds an improvement. First-order methods such as RMSprop do not benefit from double precision. However, for moderately nonlinear tasks, CG is clearly superior. For strongly nonlinear tasks, both algorithm classes find only solutions that are fairly poor in terms of MSE relative to the output variance. CG with double floating-point precision is superior whenever the solutions have the potential to be useful for the application goal. (A minimal code sketch illustrating the precision comparison follows the bibliographic details below.)
Authors Hrycej, Tomas; Bermeitinger, Bernhard & Handschuh, Siegfried
Research Team Data Science and Natural Language Processing
Journal or Publication Title Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR
Language English
Subjects computer science
HSG Classification contribution to scientific community
HSG Profile Area None
Refereed Yes
Date October 2022
Publisher SciTePress
Page Range 307-314
ISSN 2184-3228
ISSN-Digital 2184-3228
Publisher DOI https://doi.org/10.5220/0011577900003335
Official URL https://www.scitepress.org/PublicationsDetail.aspx...
Contact Email Address bernhard.bermeitinger@unisg.ch
Depositing User Bernhard Bermeitinger
Date Deposited 28 Oct 2022 19:15
Last Modified 28 Oct 2022 19:15
URI: https://www.alexandria.unisg.ch/publications/267727
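To make the precision comparison concrete, here is a minimal sketch of the kind of experiment the abstract describes. It is not taken from the paper: the layer sizes, sample count, learning rate, and the NumPy teacher-student construction are illustrative assumptions. The same one-hidden-layer network is trained with RMSprop once in single and once in double precision, on a task whose MSE minimum is zero by construction (the targets are generated by a fixed teacher network of the same architecture).

import numpy as np

def train_rmsprop(dtype, steps=2000, lr=1e-3, beta=0.9, eps=1e-8, seed=0):
    """Train a one-hidden-layer tanh network with RMSprop at the given
    floating-point precision. Targets come from a fixed 'teacher' network
    of the same architecture, so the attainable MSE minimum is zero
    (mirroring the task construction described in the abstract).
    All sizes and hyperparameters are illustrative, not the paper's."""
    rng = np.random.default_rng(seed)
    n_in, n_hid, n_out, n_samples = 8, 16, 1, 256

    # Teacher weights define the data; a zero-MSE solution exists by construction.
    X = rng.standard_normal((n_samples, n_in)).astype(dtype)
    Wt1 = rng.standard_normal((n_in, n_hid)).astype(dtype)
    Wt2 = rng.standard_normal((n_hid, n_out)).astype(dtype)
    Y = np.tanh(X @ Wt1) @ Wt2

    # Student parameters and RMSprop accumulators, all held in `dtype`.
    W1 = (0.1 * rng.standard_normal((n_in, n_hid))).astype(dtype)
    W2 = (0.1 * rng.standard_normal((n_hid, n_out))).astype(dtype)
    s1 = np.zeros_like(W1)
    s2 = np.zeros_like(W2)

    for _ in range(steps):
        H = np.tanh(X @ W1)           # forward pass
        E = H @ W2 - Y                # residual
        g2 = H.T @ E / n_samples      # gradient w.r.t. output weights
        gH = E @ W2.T * (1 - H**2)    # backprop through tanh
        g1 = X.T @ gH / n_samples     # gradient w.r.t. hidden weights

        # RMSprop update: scale each gradient by the root of a running
        # mean of its squared values.
        s1 = beta * s1 + (1 - beta) * g1**2
        s2 = beta * s2 + (1 - beta) * g2**2
        W1 -= lr * g1 / (np.sqrt(s1) + eps)
        W2 -= lr * g2 / (np.sqrt(s2) + eps)

    return float(np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2))

for dt in (np.float32, np.float64):
    print(dt.__name__, train_rmsprop(dt))

The paper's experiments are far larger (up to 4 million parameters) and additionally compare against the Conjugate Gradient method with line search; this sketch only shows how the floating-point dtype can serve as the single controlled variable in such a comparison.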

Download

115779.pdf - Published Version (464 kB; restricted to repository staff only)
TrainingNeuralNetworksinSinglevs.DoublePrecisionKDIR2022Valetta.pdf - Presentation (966 kB; available under a Creative Commons Attribution license)
arxiv.pdf - Accepted Version (293 kB; available under a Creative Commons Attribution license)

Citation

Hrycej, Tomas; Bermeitinger, Bernhard & Handschuh, Siegfried (2022) Training Neural Networks in Single vs. Double Precision. Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR, 307-314. ISSN 2184-3228
