Microsoft Fara1.5: 4B/9B/27B browser agents, 27B scores 72% on Online-Mind2Web vs. Operator's 58%

Microsoft Research's AI Frontiers lab has released Fara1.5: a family of browser computer-use agents in 4B, 9B, and 27B parameter sizes, built on Qwen3.5 base checkpoints. The models read screenshots and emit mouse and keyboard actions through an observe-think-act loop: each step takes the prior conversation history plus the three latest screenshots, then outputs thoughts and a single action. The action space includes standard inputs plus web-specific operations (searches) and meta-actions for context management and user clarification. On Online-Mind2Web (300 tasks, 136 sites), Fara1.5-27B scores 72% and Fara1.5-9B scores 63.4%. The comparison set: OpenAI Operator at 58.3%, Gemini 2.5 Computer Use at 57.3%, Yutori Navigator n1 at 64.7%. On WebVoyager: 27B at 88.6%, 9B at 86.6%, 4B at 80.8%. Training used roughly 2 million supervised samples: 60% web trajectories, 12.8% synthetic environments, 12.5% form filling and interactions, 8.8% grounding, 4.9% VQA, plus safety data. The models pause for safety when personal information is missing, when the task description is ambiguous, or before irreversible actions that lack approval. Open-source availability, weights, license, and HuggingFace/Azure deployment details are not yet specified in the announcement.
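To make the observe-think-act loop concrete, here is a minimal Python sketch of the step structure described above: full history plus a rolling window of the three latest screenshots goes in, one thought and one action come out. The names `AgentLoop`, `Step`, and `echo_policy` are hypothetical and are not Fara1.5's actual API, which the announcement does not describe; the policy is a stand-in for the model call.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Tuple


@dataclass
class Step:
    thought: str
    action: str  # exactly one action per step


# Hypothetical policy signature: (history, recent_screenshots) -> (thought, action)
Policy = Callable[[List[Step], Tuple[str, ...]], Tuple[str, str]]


@dataclass
class AgentLoop:
    policy: Policy
    max_screenshots: int = 3  # the article states three latest screenshots
    history: List[Step] = field(default_factory=list)
    screenshots: Deque[str] = field(init=False)

    def __post_init__(self) -> None:
        # A bounded deque drops the oldest screenshot automatically.
        self.screenshots = deque(maxlen=self.max_screenshots)

    def step(self, screenshot: str) -> Step:
        # Observe: record the new screenshot in the rolling window.
        self.screenshots.append(screenshot)
        # Think + act: the policy sees all prior steps but only recent screenshots.
        thought, action = self.policy(list(self.history), tuple(self.screenshots))
        result = Step(thought, action)
        self.history.append(result)
        return result


def echo_policy(history: List[Step], shots: Tuple[str, ...]) -> Tuple[str, str]:
    # Placeholder for a model call; screenshots are plain strings here.
    return (f"seen {len(shots)} screenshots", f"click:{shots[-1]}")
```

In this sketch the conversation history grows without bound while the screenshot window stays fixed at three, which mirrors the input description in the announcement; how Fara1.5 actually manages context beyond that is not specified.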

Two things to note. First, Microsoft Research is building on Qwen3.5 base checkpoints, which means Microsoft is using Chinese open-weight foundations to build a Western agentic product. This is the same cross-vendor weight-initialization pattern we covered last week with NVIDIA's Nemotron-Labs-Diffusion (built on Ministral3). Microsoft has its own Phi family but chose Qwen3.5 as the starting point for its browser agent. Second, the OpenAI Operator comparison is a strategic move. Microsoft is OpenAI's largest investor and partner, yet Microsoft Research is shipping a research-grade browser agent that outperforms Operator on Online-Mind2Web by 13.7 percentage points. By building in-house at Microsoft Research, Microsoft is hedging its dependence on OpenAI. The three sizes (4B/9B/27B) give deployment flexibility: edge tasks on a local 4B, server-grade tasks on a 27B in the datacenter. The meta-action space that supports context management and user clarification (pause for personal info, pause for ambiguous tasks, pause before irreversible actions) is the differentiator that makes browser agents shippable. Agents that do not ask before destructive actions are agents you cannot put into production.
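The three pause conditions named above can be sketched as a small gate that runs before an action executes. This is an illustration under stated assumptions, not Fara1.5's mechanism: in the released models the pause behavior is trained in via safety data, and the `ProposedAction` fields and `pause_reason` function here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProposedAction:
    name: str
    irreversible: bool = False         # e.g. submitting a payment
    needs_personal_info: bool = False  # required user data is missing
    task_is_ambiguous: bool = False    # the instruction has several readings


def pause_reason(action: ProposedAction, user_approved: bool = False) -> Optional[str]:
    """Return why the agent should pause, or None if it may proceed."""
    if action.needs_personal_info:
        return "missing_personal_info"
    if action.task_is_ambiguous:
        return "ambiguous_task"
    if action.irreversible and not user_approved:
        return "irreversible_without_approval"
    return None
```

A caller would ask the user for clarification or approval whenever `pause_reason` returns a string, and only execute the action when it returns `None`.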

Ecosystem context. The browser-agent space is heating up beyond the closed-API incumbents: OpenAI Operator (closed, GPT-class), Google Gemini 2.5 Computer Use (closed, Gemini-based), Anthropic Computer Use (closed, Claude-based), and now Microsoft Fara1.5 (Qwen3.5-based, three sizes, availability TBD). The benchmark numbers say Microsoft's research-grade family already beats the closed-API frontier on Online-Mind2Web. If Microsoft releases the Fara1.5 weights publicly, the open-weights browser-agent category gets a real frontier-class entry overnight. If it keeps them closed and routes them through Azure/Bing/Edge integration, this becomes Microsoft's defense against OpenAI capturing the agent layer. Either way, the benchmark pressure is now on Operator and Gemini Computer Use to ship their next iteration with comparable numbers. For builders shipping browser-automation products today, the 4B model at 80.8% on WebVoyager is the interesting size class: accessible enough for local deployment, capable enough to handle most browser tasks.

Monday: if you ship browser-automation or computer-use products (RPA replacements, web scraping, QA testing, customer-support workflow automation), evaluate Fara1.5 as soon as availability lands. Specific tests to run on your own task distribution: (1) login flows with MFA, (2) form filling with conditional logic, (3) multi-page navigation that preserves state, (4) error recovery from unexpected page states. The 4B variant is the size to start with: if 80.8% on WebVoyager translates to 70-80% on your tasks, you have a deployable agent without datacenter inference. For the closed-source competitors (Operator, Gemini Computer Use, Anthropic Computer Use), pricing and competitive position just came under real pressure. Operator at $200/month per user versus deploying Fara1.5-4B locally is a fundamentally different cost curve if Microsoft releases the weights. Over the next 48 hours, watch HuggingFace and the Microsoft Research blog for a weights and license announcement. The benchmark gap (72% vs 58%) is real, and the downstream competitive consequence depends on whether Microsoft ships the weights or keeps Fara1.5 as an Azure-internal capability.
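For the four test categories above, a per-category success rate is more useful than one aggregate number, since an agent can do well on navigation and still fail on MFA. A minimal Python sketch of that tally follows; the category labels and the `success_rates` helper are illustrative, and the pass/fail results would come from your own harness.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def success_rates(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the fraction of passed runs per task category."""
    totals: Dict[str, int] = defaultdict(int)
    passes: Dict[str, int] = defaultdict(int)
    for category, ok in results:
        totals[category] += 1
        passes[category] += int(ok)
    return {category: passes[category] / totals[category] for category in totals}


# Illustrative runs only; these are not measured Fara1.5 results.
example_runs = [
    ("mfa_login", True),
    ("mfa_login", False),
    ("conditional_form", True),
    ("conditional_form", True),
    ("conditional_form", True),
    ("conditional_form", False),
]
```

Calling `success_rates(example_runs)` on the made-up data above gives 0.5 for `mfa_login` and 0.75 for `conditional_form`; comparing such per-category rates against the 70-80% threshold mentioned above shows where a given model size falls short.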
