This isn't an image model. It's a text model, but text models can output SVG so ...

cedws · 2025-06-27T12:41:45 1751028105

>Multimodal by design: Gemma 3n natively supports image, audio, video, and text inputs and text outputs.

But I understand your point: Simon asked it to output SVG (text) instead of a raster image, so it's more difficult.

simonw · 2025-06-27T13:12:58 1751029978

It can handle image and audio inputs, but it cannot produce those as outputs - it's purely a text output model.

cedws · 2025-06-27T14:17:44 1751033864

Yeah, you're right. Also, you're Simon :)