Musk says that Grok will be available as an exclusive feature on X, formerly Twitter, and also as a standalone app. xAI claims Grok outperformed Llama 2 and OpenAI’s GPT-3.5 on benchmarks such as GSM8K (middle-school mathematics problems), MMLU (multidisciplinary multiple-choice questions), and HumanEval (Python coding tasks).

However, Grok-1 is still a healthy margin behind seasoned players like OpenAI’s GPT-4 (which powers the latest iteration of ChatGPT), Google’s PaLM 2, and Anthropic’s Claude 2. The biggest differences are the size of the training dataset and the amount of time poured into refining the models, through both automated and human-assisted means.

xAI is working to close that gap by hiring human experts from different domains to improve the model. Grok has one key aspect in its favor, though: the chatbot is connected to the internet and pulls data in real time from X. In contrast, OpenAI waited months before it connected ChatGPT to the internet through web plug-ins. But once again, it’s a risky endeavor, as false information that goes viral could very well shape Grok’s responses until it’s flagged and corrected. “We will work towards developing reliable safeguards against catastrophic forms of malicious use,” assures xAI. Right now, Grok only supports text-based interactions, but the company says it plans to add multimodal capabilities down the road so that the AI chatbot can also process image and audio inputs.

