Multimodal Neurons in Artificial Neural Networks
Our findings show that CLIP (Contrastive Language-Image Pre-training) neurons respond to the same concept whether it is presented literally, symbolically, or conceptually. This may explain CLIP's accuracy in classifying surprising visual renditions of concepts, and it is also an important step toward understanding the associations and biases that CLIP and similar models learn.

Multimodal neurons like these were first discovered in the human brain. Rather than responding to any specific visual feature, they respond to clusters of abstract concepts centered around a common high-level theme. The best-known example is the "Halle Berry" neuron, covered in both Scientific American and The New York Times, which responds to photographs, sketches, and the text "Halle Berry" (but not other names).

CLIP is a general-purpose vision system developed by OpenAI that matches the performance of a ResNet-50 but outperforms existing vision systems on some of the most challenging datasets. Each of these challenge datasets, Object...
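To make the zero-shot behavior described above concrete, here is a minimal sketch of how a CLIP-style model classifies an image: it embeds the image and each candidate caption into a shared vector space, then picks the caption whose embedding is most similar to the image's. The toy embeddings and function names below are illustrative assumptions, not OpenAI's actual model or API.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def zero_shot_classify(image_emb, text_embs, labels):
    # CLIP-style zero-shot classification: score the image embedding
    # against each caption embedding and return the best-matching label.
    sims = [cosine(image_emb, t) for t in text_embs]
    return labels[int(np.argmax(sims))]

# Toy 3-d embeddings standing in for real CLIP encoder outputs.
image_emb = np.array([0.9, 0.1, 0.0])
text_embs = [
    np.array([1.0, 0.0, 0.0]),  # embedding of "a photo of a dog"
    np.array([0.0, 1.0, 0.0]),  # embedding of "a photo of a cat"
]
print(zero_shot_classify(image_emb, text_embs, ["dog", "cat"]))  # prints dog
```

In the real system the embeddings come from jointly trained image and text encoders, which is what lets the same machinery handle photographs, sketches, and rendered text of a concept.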