Table of Contents
Gemini 1.5 Flash, Gemini 1.5 Pro and Gemini 1.0 Pro are all powerful tools, but they cater to different needs and offer varying capabilities. Here’s a breakdown of their key differences:
Modality
Gemini 1.5 Flash | Gemini 1.5 Pro | Gemini 1.0 Pro |
The fastest and most cost-effective Gemini multimodal model, ideal for high-volume, latency-sensitive tasks. | Truly multimodal, handling text, images, videos, and audio seamlessly. | Primarily text-based, with limited image and video support (1.0-pro-vision). |
Input/Output
All models support a wide range of inputs and outputs, but the Gemini 1.0 Pro is limited to 32,760 tokens and primarily text-based outputs.
Gemini 1.5 Flash | Gemini 1.5 Pro | Gemini 1.0 Pro |
2,000,000 token limit | 2,000,000 token limit | Text input limited to 32,760 tokens |
Video, Audio, images, and text | Video, Audio, images, and text | Vision version allows up to 16 images or 1 video clip (2 minutes max) alongside text. |
JSON mode supported | JSON mode supported | Output formats are primarily text-based |
Functionalities
Gemini 1.5 Flash and Gemini 1.5 Pro offer advanced functionalities including function calling, system instructions, and enhanced safety controls. Gemini 1.0 Pro focuses on text generation, translation, and basic image/video understanding.
Gemini 1.5 Flash | Gemini 1.5 Pro | Gemini 1.0 Pro |
Function calling: Integrate external systems for actions beyond the model’s knowledge. | Same as Flash | Focuses on text generation, translation, and basic image/video understanding. |
System instructions: Provide guidance for better performance and desired response styles. | Same as Flash | Offers temperature and topK parameters for controlling response creativity. |
Grounding with Google Search: Access real-time information for more accurate and relevant results (text only). | Same as Flash | |
Enhanced safety controls: Fine-tune response filtering based on specific categories and probability thresholds. | Same as Flash |
Cost
While Gemini 1.5 Pro is 20 times more expensive than Gemini 1.0 Pro for both input and output, this is due to its enhanced capabilities in handling larger and more complex data sets and its ability to process multimodal inputs seamlessly. On the other hand, Gemini 1.5 Flash has the same cost per input and output as Gemini 1.0 Pro, but it offers the fastest processing speeds, making it ideal for high-volume, latency-sensitive tasks.
23 Aug 2024 Update: Google reduced prices on Gemini 1.5 Flash and Gemini 1.5 Pro. Below is the new pricing table.
New Pricing after 12 Aug 2024:
Gemini 1.5 Flash | Gemini 1.5 Pro | Gemini 1.0 Pro |
Text Input: $0.00001875 / 1k characters Text Output: $0.000075 / 1k characters | Text Input: $0.00125 / 1k characters Text Output: $0.00375 / 1k characters | Text Input: $0.000125 / 1k characters Text Output: $0.000375 / 1k characters |
Image Input: $0.00002 / image | Image Input: $0.001315 / image | Image Input: $0.0025 / image |
Video Input: $0.00002 / second | Video Input: $0.001315 / second | Video Input: $0.002 / second |
Audio Input: $0.000002 / second | Audio Input: $0.000125 / second | N/A |
Old Pricing before 12 Aug 2024:
Gemini 1.5 Flash | Gemini 1.5 Pro | Gemini 1.0 Pro |
Text Input/Output: $0.000125 / 1k characters (input) & $0.000375 / 1k characters (output) | Text Input/Output: $0.0025 / 1k characters (input) & $0.0075 / 1k characters (output) | Text Input/Output: $0.000125 / 1k characters (input) & $0.000375 / 1k characters (output) |
Image Input: $0.0001315 / image | Image Input: $0.00265 / image | Image Input: $0.0025 / image |
Video Input: $0.0001315 / second | Video Input: $0.00265 / second | Video Input: $0.002 / second |
Audio Input: $0.0000125 / second | Audio Input: $0.00025 / second | N/A |
When to choose which
Choosing the right model depends on your specific needs: Gemini 1.5 Flash, with its high-speed processing, is ideal for time-sensitive applications requiring quick turnarounds. Gemini 1.5 Pro, offering extensive multimodal interactions, is suited for complex, data-intensive projects that require nuanced understanding and output capabilities.
Gemini 1.5 Flash | Gemini 1.5 Pro | Gemini 1.0 Pro |
The latest all-rounder. | Ideal for complex, multimodal projects requiring advanced functionality, large-scale input, and flexible output formats. | Suitable for text-centric tasks, basic image analysis, and cost-sensitive applications. |
Conclusion
- Gemini 1.5 Flash is tailored for users needing rapid response times without the complexities of deep multimodal functionalities.
- Gemini 1.5 Pro is optimal for users whose requirements extend to advanced data processing across various media types, willing to invest more for broader capabilities.
- Gemini 1.0 Pro is no longer a viable option after the price deduction of Gemini 1.5 Flash. Flash is now better and cheaper than 1.0 Pro.