UI-TARS Desktop is a GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.

Search code, repositories, users, issues, pull requests...

submited by
Style Pass
2025-01-23 18:30:04

UI-TARS Desktop is a GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.

   📑 Paper    | 🤗 Hugging Face Models   | 🖥️ Desktop Application    |    👓 Midscene (use in browser)

The GGUF model has undergone quantization, but unfortunately, its performance cannot be guaranteed. As a result, we have decided to downgrade it.

💡 Alternative Solution: You can use Cloud Deployment or Local Deployment [vLLM](If you have enough GPU resources) instead.

We provide three model sizes on Hugging Face: 2B, 7B, and 72B. To achieve the best performance, we recommend using the 7B-DPO or 72B-DPO model (based on your hardware configuration):

Leave a Comment