‘Man in Finance’: ChatGPT vs Gemini vs Llama vs Meta AI vs Claude

Mustafa Najoom
8 min read · Jun 28, 2024


To test each chatbot’s language capabilities, they were asked to translate the ‘man in finance’ meme using the prompt:

“can you write “i’m looking for a man in finance, trust fund, 6'5, blue eyes” in korean, arabic and spanish?”

To test each chatbot’s image-generation capabilities, they were provided with the following prompt:

“Generate an image of an ice cream with strawberries.”

The information in this article is correct as of June 2024.

Integrate artificial intelligence applications with your existing information systems, build custom LLMs for your business and stay five steps ahead of your competitors: Hire AI engineers from Gaper.io’s marketplace.

How does ChatGPT work?

ChatGPT uses a transformer-based neural network architecture, which is highly effective for natural language processing tasks. Its underlying model was first pretrained with self-supervised learning: it learns to predict the next word (more precisely, the next token) in a sentence from the context provided by the previous words. This step is repeated across billions of examples, after which the model is fine-tuned, including with human feedback, to improve its ability to understand and generate coherent text.
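To make the idea of next-word prediction concrete, here is a deliberately tiny sketch. It is not how ChatGPT is implemented: instead of a transformer it simply counts which word follows which in a small text (a bigram model), and the names `train_bigram` and `predict_next` are made up for this illustration. The principle is the same, though: look at the context and pick a likely continuation.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy training text (an assumption for this example, not real training data).
corpus = "i am looking for a man in finance . i am looking for a trust fund ."
model = train_bigram(corpus)

print(predict_next(model, "looking"))  # for
print(predict_next(model, "zebra"))    # None
```

A real LLM replaces the counting table with a neural network that looks at the whole preceding context rather than one word, but it is still trained to answer the same question: what comes next?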

ChatGPT now also has an option to manage its ‘memory’ across chats.

Here’s how it translated the ‘man in finance’ meme.

  • GPT-4: GPT-4 offers better performance in understanding and generating text compared to its predecessors. This version includes improvements in areas like factual accuracy, contextual understanding, and response coherence. GPT-4 can be fine-tuned for specific applications, making it adaptable to various needs and industries.
  • GPT-4o: GPT-4o is faster and better at understanding visual and audio input than other existing models.

Google’s Gemini

Google’s Gemini, particularly the latest 1.5 version, represents a significant advancement in AI technology. It employs a Mixture-of-Experts (MoE) architecture. A traditional Transformer operates as one large neural network, whereas an MoE model is divided into smaller “expert” neural networks. Depending on the input, the model activates only the most relevant experts, which improves efficiency. This design builds upon Google’s research in transformer models and MoE, such as GShard and Switch-Transformer.
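The routing idea behind Mixture-of-Experts can be sketched in a few lines. This is a toy illustration under stated assumptions, not Gemini’s actual implementation: the function `moe_forward`, the four “experts” (which just scale their input), and the hand-picked gate weights are all hypothetical. In real MoE layers both the experts and the gate are learned neural networks.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, experts, gate_weights, top_k=2):
    """Send input x to the top_k highest-scoring experts and mix their outputs."""
    # One gate score per expert: dot product of that expert's gate weights with x.
    scores = [sum(w * xi for w, xi in zip(weights, x)) for weights in gate_weights]
    # Keep only the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    probs = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for p, i in zip(probs, chosen):
        y = experts[i](x)
        output = [o + p * yi for o, yi in zip(output, y)]
    return output, chosen


# Four toy experts: each one just multiplies the input by a different factor.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
# Hand-picked gate weights (an assumption; real gates are learned).
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, chosen = moe_forward([0.5, 2.0], experts, gate_weights, top_k=2)
print(chosen)  # [1, 0]
```

Only two of the four experts run for this input, which is where the efficiency gain comes from: the model can hold many experts in total while spending compute on just a few of them per input.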

A key feature of Gemini 1.5 is its extended context window, which can handle up to 1 million tokens. This allows the model to process and reason across large volumes of data, such as entire books or long videos, making it highly versatile in applications.
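To get a feel for what 1 million tokens means, here is a rough back-of-the-envelope sketch. It assumes about 4 characters per token, which is only a common rule of thumb for English text and not how any real tokenizer works; the function names are made up for this example.

```python
import math


def estimated_tokens(text, chars_per_token=4.0):
    """Rough token estimate; chars_per_token is an assumed heuristic, not a tokenizer."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text, context_window=1_000_000, chars_per_token=4.0):
    """Check whether the estimated token count fits in the given context window."""
    return estimated_tokens(text, chars_per_token) <= context_window


# A stand-in for a very long book: 600,000 five-character "words".
book = "word " * 600_000
print(estimated_tokens(book))  # 750000
print(fits_in_context(book))   # True
```

For an accurate count you would need the tokenizer of the specific model, since token boundaries differ between models and languages.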

While Gemini can handle various languages, it took it upon itself not to translate the ‘man in finance’ meme as written, because it considered it ‘materialistic’, and provided me with more ‘balanced’ translations instead. It also provided transliterations for each language without being asked, something the other chatbots did not do.

Meta Llama / Meta AI

The term Meta AI is often confusing…

Meta AI, owned by Meta (formerly Facebook), is a company (or a research lab) that works on building AI, and augmented and artificial reality technologies.

Meta AI is also the name of the AI assistant developed by Meta, which understands and responds to human input in a conversational manner.

Meta AI was trained on a massive dataset of diverse text from various sources, including but not limited to web pages, books, articles, and conversations (the conversations part was mentioned by the Meta AI assistant itself…🤔).

Meta AI is also proficient in multiple languages. However, when asked to translate the ‘looking for a man in finance’ meme, it seemed to translate it too literally. Perhaps native Arabic/Korean/Spanish speakers can weigh in?

Meta Llama 2 and 3

Meta Llama is a family of LLMs developed by Meta AI. Also using a transformer-based architecture, Meta Llama models are trained on massive datasets and designed to perform various tasks like text generation, question answering, and code analysis.

Meta Code Llama

Meta Code Llama is an LLM for coding. It aims to create better and more productive workflows for developers and to make learning to code easier for aspiring developers. Built on Llama 2 using code-specific datasets, Meta Code Llama can take both natural-language and code prompts and respond with either natural language or code (including code completion and debugging). It comes in three variants: the base Code Llama, Code Llama - Python, and Code Llama - Instruct.

Microsoft’s Copilot

It is sometimes frustrating to talk to Microsoft’s Copilot. In addition to being slow, it often refuses to answer questions, tells you that “It might be time to move onto a new topic. Let’s start over”, and forces you to start a new chat while discarding the old one. In this case, I was merely asking it what its exact name was (to understand how Bing, Bing AI, Microsoft Copilot, and Microsoft 365 Copilot are different or interconnected, or whether some of these terms are used synonymously).

Copilot provided a simple and literal translation of the man in finance meme.

Anthropic’s Claude

Meet Claude, the AI assistant from Anthropic, that’s making waves in the AI world with its impressive long conversation memory and informative responses. Claude comes in three flavors: Haiku, Sonnet, and Opus, each offering different capabilities.

Claude’s standout feature is its massive 200,000 token context window, allowing it to handle lengthy documents with ease. It’s not just about quantity; Claude excels in quality too, boasting low hallucination rates and high accuracy. It sets new industry benchmarks in several areas:

- Graduate-level reasoning (GPQA)
- Undergraduate-level knowledge (MMLU)
- Coding proficiency (HumanEval)

Claude isn’t just book-smart; it’s got a sense of humor and can grasp complex instructions. It even transcribes text from unclear or incomplete images, making it a versatile tool for tasks in retail, logistics, and financial services.

Let’s see how well Claude did in translating the man in finance meme:

Interestingly, while Gemini also changed the height measurement unit in Korean (and omitted the height entirely in Arabic and Spanish), it did not explain why. Claude provided the cultural context when changing the unit to centimeters and meters in the Korean and Spanish translations respectively.

Anthropic prioritizes ethical considerations, incorporating measures to reduce biases and ensure fair and responsible AI usage. This includes extensive testing and refinement to mitigate any unintended biases in the model’s outputs.

With a “best-in-class jailbreak resistance”, Claude also addresses enterprise security needs: it is accessible through AWS and GCP, and Anthropic lists SOC 2 Type II certification and HIPAA compliance options.

Cost Comparison of the Chatbots

The following table shows a general cost comparison of the chatbots and their various versions. Every chatbot has a free tier; the more advanced versions are paid.

Image Generation: Comparison of the Chatbots

The following command was input into a new chat of every chatbot:

“Generate an image of an ice cream with strawberries.”

ChatGPT

It was possible to generate this only on the phone app — it kept having ~issues~ on the website.

Gemini

Not bad. Strawberries are missing in the third picture but the ice cream color is correct. Gemini also allowed a high quality download of the pictures.

Meta AI

Meta AI’s ice creams look better than Gemini’s, with interesting backgrounds and crockery.

Copilot

Copilot’s Designer is powered by DALL-E 3. Its images have that typical AI-generated look, but they still look good and were presented in high quality. The detailed backgrounds are quite impressive, but all of its ice creams were in cones for some reason.

Claude

Claude could not generate images.

All chatbots refused to generate images of humans.

Which Chatbot Generated the Best Images?

While every chatbot generated impressive images (ChatGPT’s ice cream was interesting in its own right), I was particularly impressed by Meta AI’s pictures.

Can the ChatGPT Chatbot Generate Images or Not?

ChatGPT can generate basic line drawings from the prompts it is given but, like other chatbots, it does not generate images of humans. When asked to generate a picture of Harry Styles, it responded with, “As a text-based AI, I don’t have the capability to directly generate realistic images or photographs of specific individuals, including Harry Styles. However, I can suggest a few methods you can use to obtain or generate images”.

After this, it kept reiterating its inability to generate any kind of image even though it had just produced the ice cream before.

When asked to generate an image via the Android app, ChatGPT happily generated the previously shown illustration of an ice cream with strawberries.

For the images that it did generate, I can only describe them as being comparable to what little kids used to make on Microsoft Paint.

Conclusion

Choosing the ideal chatbot depends heavily on your specific needs. Each competitor in this arena brings unique strengths to the table.

For creative text generation and pushing boundaries in storytelling, ChatGPT and Claude are great choices. If informative responses with a focus on factual accuracy are your priority, then Gemini, with its access to vast search data, might be the perfect fit.

For complex problem-solving tasks requiring analysis of various data formats, the advanced capabilities of Llama stand out. Meta AI, with its consumer-facing assistant functionality, offers a user-friendly way to interact with powerful AI technology.

Finally, Claude excels in long conversation memory and nuanced understanding, fostering a more natural and engaging dialogue experience.

The future of chatbots is undeniably collaborative. As these models continue to evolve, we can expect them to learn from each other’s strengths. Imagine a world where ChatGPT’s narrative prowess merges with Gemini’s factual grounding, or where Llama’s problem-solving abilities are integrated with Meta AI’s accessibility.

Don’t be a laggard in your industry: become an early adopter by integrating artificial intelligence applications with your business processes, using custom LLMs, and having a stellar team of AI engineers work for you.

Originally published at https://gaper.io on June 28, 2024.
