News

Important: the ConvMixer-768/32 model here uses ReLU instead of GELU, so to use it you must change the activations in convmixer.py accordingly (we will fix this later).

You can evaluate ConvMixer-1536/20 as follows: python ...
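
As a rough illustration of the ReLU change mentioned above, one option is to swap the activation modules after the model has been built rather than editing the file by hand. The sketch below is not part of this repository: `swap_gelu_for_relu` is a hypothetical helper, and `block` is a small stand-in network, not the actual ConvMixer definition. It assumes the model is built from standard `torch.nn` modules and uses `nn.GELU` for its activations.

```python
import torch.nn as nn


def swap_gelu_for_relu(module: nn.Module) -> nn.Module:
    """Hypothetical helper: recursively replace every nn.GELU child with nn.ReLU, in place."""
    # Copy the child list first so we never mutate the mapping while iterating it.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.GELU):
            setattr(module, name, nn.ReLU())
        else:
            swap_gelu_for_relu(child)
    return module


# Small stand-in network (an assumption for illustration, NOT the real ConvMixer).
block = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=7, stride=7),
    nn.GELU(),
    nn.BatchNorm2d(8),
    nn.Sequential(nn.Conv2d(8, 8, kernel_size=1), nn.GELU(), nn.BatchNorm2d(8)),
)
swap_gelu_for_relu(block)
print(block)
```

Neither GELU nor ReLU has learnable parameters, and the swap keeps each module at the same position, so the `state_dict` keys are unchanged and a checkpoint trained with ReLU should load into the modified model as before.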