AWS WorkSpaces now lets MCP agents drive legacy desktop apps through screenshots, Zubnet AI News

This week AWS opened Amazon WorkSpaces to AI agents in preview. Any MCP-compatible agent framework, whether LangChain, CrewAI, or AWS's own Strands Agents, gets a managed virtual desktop on which it can drive legacy applications through computer vision and input simulation. The agent authenticates with IAM, connects to a WorkSpaces instance over a pre-signed URL, and interacts the way a human employee would: taking screenshots, clicking, typing, scrolling. The target application cannot tell that an agent is driving it, and nothing in the software has to change. AWS demoed the pattern with a Strands agent on Bedrock running a prescription refill workflow in a sample pharmacy system: patient lookup, medication search, order placement, and refill confirmation, all without an API.
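The screenshot, decide, act cycle described above can be sketched in a few lines. This is a minimal illustration only: `Desktop`, `Action`, `run_agent`, `FakeDesktop`, and `scripted_policy` are hypothetical names invented for this sketch, and none of them come from the WorkSpaces MCP endpoint, whose tool names the article does not specify. In a real agent the `decide` callable would be a vision model call; here it is a scripted stand-in so the loop runs locally.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Protocol


class Desktop(Protocol):
    """Hypothetical remote-desktop interface; NOT the WorkSpaces API."""

    def screenshot(self) -> bytes: ...
    def click(self, x: int, y: int) -> None: ...
    def type_text(self, text: str) -> None: ...


@dataclass
class Action:
    kind: str  # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(desktop: Desktop,
              decide: Callable[[bytes], Action],
              max_steps: int = 50) -> int:
    """Observe-act loop: take a screenshot, ask for an action, apply it.

    Returns the number of screenshots taken before the task was declared done.
    """
    for step in range(1, max_steps + 1):
        frame = desktop.screenshot()      # observe
        action = decide(frame)            # a vision model call in a real agent
        if action.kind == "done":
            return step
        if action.kind == "click":
            desktop.click(action.x, action.y)
        elif action.kind == "type":
            desktop.type_text(action.text)
        else:
            raise ValueError(f"unknown action: {action.kind}")
    raise TimeoutError("task not finished within max_steps")


class FakeDesktop:
    """In-memory stand-in so the sketch runs without any cloud resources."""

    def __init__(self) -> None:
        self.log = []

    def screenshot(self) -> bytes:
        self.log.append("screenshot")
        return b"fake-frame"

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click {x},{y}")

    def type_text(self, text: str) -> None:
        self.log.append(f"type {text}")


def scripted_policy(actions: Iterable[Action]) -> Callable[[bytes], Action]:
    """Replays a fixed list of actions, ignoring the screenshot contents."""
    it = iter(actions)
    return lambda frame: next(it)
```

Every pass through the loop costs one screenshot, which is why the number of steps, not just the per-step accuracy, drives the token bill discussed later in the article.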

The architecture is more interesting than the demo. WorkSpaces exposes a managed MCP endpoint as the agent's control plane, which leaves the choice of framework with the builder instead of tying it to AWS-native runtimes. Security is inherited from the human WorkSpaces model: isolated instances, a separate IAM identity for each agent (so CloudTrail can tell agentic actions apart from human ones), CloudWatch observability, and capabilities that are configurable per stack, namely resolution, image format, screenshot storage, and whether input is enabled. The honest cost picture is the part most takes will miss. A recent benchmark from Reflex found that a vision agent consumed about 500,000 input tokens to complete a task that an API agent finished in 12,000, a gap of roughly 45×, and the vision agent took 17 minutes where the API path took 20 seconds. Palash Awasthi of Reflex put it plainly: "Better vision models reduce the per-screenshot error rate, but they do not reduce the number of screenshots needed to reach the relevant data."

The ecosystem signal here runs on two tracks. AWS is betting that the 75% of organizations that, according to Gartner, still run legacy apps without a modern API, and the 71% of Fortune 500 companies whose critical processes sit on mainframes with no programmatic access, will choose an agent that costs 45× more over a multi-year modernization project, because at enterprise pricing the math actually works. The MCP plumbing matters more than the WorkSpaces brand: this is the first managed cloud-desktop-as-MCP-endpoint, which makes it the cloud-side counterpart to Anthropic's Claude computer use and OpenAI's Operator. Microsoft is building the same category with Windows 365 for AI agents. The bottleneck is no longer whether agents can drive a GUI (Claude 3.5 Sonnet's computer use showed that in late 2024); the question now is who hosts the desktop the agent runs on. AWS has just bid for that layer, with an MCP front door.

For builders deploying agents in regulated industries: the per-agent IAM pattern, the CloudTrail audit trail, and the isolated-instance model are the parts worth copying if you are building elsewhere. Regulators want exactly that trace, not a "trust the agent" story. For builders weighing computer use against API integration: run the token math at your own scale and workflow length. The API path finished in 20 seconds and the vision agent in 17 minutes, so where an API exists it is the cheaper option. On legacy stacks where modernization means a year of work and a seven-figure bill, though, an agent that is 45× more expensive but ships next week is the rational choice. The preview is available in US East (N. Virginia, Ohio), US West (Oregon), Canada Central, four European regions, and five Asia-Pacific regions, and sample code is on GitHub.
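The token math suggested above might look something like the following sketch. The two per-task token counts are the rounded Reflex figures quoted in the article; everything else is an assumption made up for illustration: the price of 3 USD per million input tokens, the volume of 10,000 tasks per month, and the 1,000,000 USD modernization cost (the low end of "seven figures"). The model counts input tokens only and ignores output tokens, desktop instance charges, and latency, so treat it as a starting point, not a pricing tool.

```python
import math

# Rounded figures quoted in the article (Reflex benchmark).
VISION_TOKENS_PER_TASK = 500_000
API_TOKENS_PER_TASK = 12_000


def monthly_input_cost(tokens_per_task: int,
                       tasks_per_month: int,
                       usd_per_million_tokens: float) -> float:
    """Input-token spend per month in USD (output tokens are ignored)."""
    return tokens_per_task * tasks_per_month * usd_per_million_tokens / 1_000_000


def breakeven_months(modernization_usd: float,
                     vision_monthly_usd: float,
                     api_monthly_usd: float) -> float:
    """Months until the extra token spend of the vision path adds up to
    the one-off cost of building an API instead."""
    extra = vision_monthly_usd - api_monthly_usd
    return math.inf if extra <= 0 else modernization_usd / extra


if __name__ == "__main__":
    # ASSUMPTIONS, not from the article: price, volume, modernization cost.
    price = 3.0
    volume = 10_000
    modernization = 1_000_000

    vision = monthly_input_cost(VISION_TOKENS_PER_TASK, volume, price)
    api = monthly_input_cost(API_TOKENS_PER_TASK, volume, price)
    print(f"token ratio:      {VISION_TOKENS_PER_TASK / API_TOKENS_PER_TASK:.1f}x")
    print(f"vision per month: ${vision:,.0f}")
    print(f"API per month:    ${api:,.0f}")
    print(f"break-even:       {breakeven_months(modernization, vision, api):.1f} months")
```

Under these assumed inputs the vision path costs about 15,000 USD a month against 360 USD for the API path, and the extra spend takes more than five years to reach the modernization cost. With the rounded token counts the ratio comes out near 42×, in line with the "about 500,000" wording and the reported 45×. Higher volumes or longer workflows move the break-even point quickly, which is why the calculation is worth running per workflow.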
