Apple has released a new AI tool that will blow your mind! Try it out for yourself – Letem svetem Applem


Apple has shown off another technology from its AI lab. This time it's FastVLM, a vision-language model designed for lightning-fast image processing and caption generation. You can now try the latest version of the model yourself, right in your browser. All you need is a Mac with Apple Silicon. You can try the model here. Paradoxically, we managed to get the model up and running in Chrome but not in Safari.

Apple brings AI without waiting and without the cloud

FastVLM builds on MLX, the open-source machine-learning framework that Apple designed specifically for Apple Silicon chips such as the M1, M2, and M3. According to Apple, the model achieved up to 85x faster video processing than comparable alternatives while taking up only about a third of the size.

You can now run the FastVLM-0.5B model directly on the web via the Hugging Face platform. Just wait a while for it to load (on a 16GB MacBook with an M2, for example, it took about two minutes), and it immediately starts describing what is happening in front of the camera: your face, expressions, background, or objects you point out.

Interactive real-time captioning

The user can choose exactly what the model should recognize. There are preset prompts available, such as:

  • Describe what you see in one sentence.
  • What color is my shirt?
  • What object am I holding in my hand?
  • What emotions do you see?

More advanced users can also connect a virtual camera and test the model on different scenes. The output is surprisingly accurate and detailed, to the point where it is hard to keep up with everything the model describes.

Privacy and use in practice

The key advantage of this approach is that all data remains on the device. Once loaded, the model runs directly in the browser, even without an internet connection. This makes it a great candidate for use in wearables or assistive technologies, where low latency and privacy are essential.

The currently available model has "only" 0.5 billion parameters, but Apple is also preparing versions with 1.5 and 7 billion, which could offer even better results, although not directly in the browser.
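To get a feel for why the larger versions are a harder fit for a browser tab, here is a rough back-of-the-envelope sketch of how much memory the weights alone would take at each size. The precisions listed (fp32, fp16, int4) are illustrative assumptions, not figures from Apple, and the estimate ignores activations, caches, and runtime overhead; the helper name `weight_memory_gb` is made up for this example.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: weights only; activations, caches, and runtime overhead are ignored.
# The precisions below are illustrative, not what Apple ships.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int4": 0.5}


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Return the approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Model sizes mentioned in the article: 0.5, 1.5 and 7 billion parameters.
    for size in (0.5, 1.5, 7.0):
        row = ", ".join(
            f"{precision}: {weight_memory_gb(size, precision):.2f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{size}B parameters -> {row}")
```

Under these assumptions, the 0.5-billion-parameter model needs about 1 GB for its weights at 16-bit precision, while the 7-billion version would need about 14 GB, which is most of the memory on a 16GB MacBook before the browser itself is counted.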


