June 26th, 2023, 12:03 pm
Learnability can be undecidable
Shai Ben-David1, Pavel Hrubeš2, Shay Moran3, Amir Shpilka4 and Amir Yehudayoff5*
The mathematical foundations of machine learning play a key role in the development of the field. They improve our understanding and provide tools for designing new learning paradigms. The advantages of mathematics, however, sometimes come with a cost. Gödel and Cohen showed, in a nutshell, that not everything is provable. Here we show that machine learning shares this fate. We describe simple scenarios where learnability can neither be proved nor refuted using the standard axioms of mathematics. Our proof is based on the fact that the continuum hypothesis can neither be proved nor refuted. We show that, in some cases, a solution to the 'estimating the maximum' problem is equivalent to the continuum hypothesis. The main idea is to prove an equivalence between learnability and compression.
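The 'estimating the maximum' (EMX) problem mentioned in the abstract can be stated precisely. The following is a sketch using notation chosen here for illustration (the symbols G, S, m, ε and δ are not from this excerpt): a learner G receives m i.i.d. samples S from an unknown distribution P belonging to a class 𝒫 over a domain X, and must output a set from a family ℱ whose probability mass is close to the best achievable within ℱ:

```latex
\Pr_{S \sim P^{m}}\!\left[\, P\bigl(G(S)\bigr) \;\ge\; \sup_{F \in \mathcal{F}} P(F) - \epsilon \,\right] \;\ge\; 1 - \delta .
```

In the paper's undecidable instance, ℱ = F* is the family of finite subsets of the unit interval and 𝒫 = P* is the class of distributions with finite support over it.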
The main result of this work is that the learnability of the family of sets F* over the class of probability distributions P* is undecidable. While learning F* over P* may not be directly related to practical machine-learning applications, the result demonstrates that the notion of learnability is vulnerable: in some general yet simple learning frameworks, there is no effective characterization of learnability. In other words, when trying to understand learnability, it is important to pay close attention to the mathematical formalism we choose to use.

Why can learnability be neither proved nor refuted? A closer look reveals that the source of the problem lies in defining learnability as the existence of a learning function rather than the existence of a learning algorithm. In contrast with the existence of algorithms, the existence of functions over infinite domains is a (logically) subtle issue. The advantage of the current standard definitions (which use the language of functions) is that they separate the statistical or information-theoretic issues from any computational considerations. This choice plays a role in the fundamental characterization of PAC learnability by the VC dimension. Our work shows that this set-theoretic view of learnability comes at a high cost when it comes to more general types of learning.

Data availability
The data that support the findings of this study are available
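By contrast with EMX, the VC-dimension characterization of PAC learnability mentioned above is effective: for a tiny finite class it can even be computed by brute force. The sketch below is purely illustrative (the function names and the threshold-classifier example are ours, not from the paper); it checks shattering directly from the definition:

```python
from itertools import combinations

def shatters(hypotheses, points):
    """True if the hypothesis class (a collection of sets) realizes
    every possible labeling of `points`, i.e. shatters them."""
    labelings = {tuple(p in h for p in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

def vc_dimension(hypotheses, domain):
    """Largest size of a subset of `domain` shattered by the class.
    Brute force over all subsets; feasible only for tiny examples."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(hypotheses, s) for s in combinations(domain, k)):
            d = k
    return d

# Example: threshold classifiers {x : x <= t} on the domain {1, 2, 3}.
domain = [1, 2, 3]
thresholds = [frozenset(x for x in domain if x <= t) for t in range(0, 4)]
print(vc_dimension(thresholds, domain))  # thresholds shatter single points but no pair -> 1
```

The brute-force search makes the contrast concrete: PAC learnability of such a class reduces to a finite, checkable combinatorial quantity, whereas the paper shows EMX learnability of F* over P* admits no such effective characterization.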
Attachments
- 1687370290312.pdf