Today we are rolling out an early version of Gemini 2.5 Flash in preview through the Gemini API via Google AI Studio and Vertex AI. Building on the popular foundation of 2.0 Flash, this new version delivers a major upgrade in reasoning capabilities while still prioritizing speed and cost. Gemini 2.5 Flash is our first fully hybrid reasoning model, giving developers the ability to turn thinking on or off. The model also lets developers set a thinking budget to find the right tradeoff between quality, cost, and latency. Even with thinking off, developers can maintain the fast speeds of 2.0 Flash and improve performance.
Our Gemini 2.5 models are thinking models, capable of reasoning through their thoughts before responding. Instead of generating an output immediately, the model can perform a "thinking" process to better understand the prompt, break down complex tasks, and plan a response. On complex tasks that require multiple steps of reasoning (such as solving math problems or analyzing research questions), the thinking process allows the model to arrive at more accurate and comprehensive answers. In fact, Gemini 2.5 Flash performs strongly on Hard Prompts in LMArena, second only to 2.5 Pro.
2.5 Flash has comparable metrics to other leading models for a fraction of the cost and size.
Our most cost-efficient thinking model
2.5 Flash continues to lead as the model with the best price-to-performance ratio.
Gemini 2.5 Flash adds another model to Google’s Pareto frontier of cost to quality.*
Fine-grained controls to manage thinking
We know that different use cases have different tradeoffs in quality, cost, and latency. To give developers flexibility, we have enabled setting a thinking budget, which offers fine-grained control over the maximum number of tokens a model can generate while thinking. A higher budget allows the model to reason further to improve quality. Importantly, the budget sets a cap on how much 2.5 Flash can think, but the model does not use the full budget if the prompt does not require it.
Improvements in reasoning quality as the thinking budget increases.
The model is trained to know how long to think for a given prompt, and therefore automatically decides how much to think based on the perceived complexity of the task.
If you want to keep the lowest cost and latency while still improving performance over 2.0 Flash, set the thinking budget to 0. You can also choose to set a specific token budget for the thinking phase using a parameter in the API or the slider in Google AI Studio and in Vertex AI. The budget can range from 0 to 24576 tokens for 2.5 Flash.
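As a minimal sketch of working with that range, the helper below keeps a requested budget inside the 0 to 24576 token window before it is passed to a request. The function and constant names are hypothetical and not part of any SDK, and clamping (rather than rejecting) out-of-range values is an assumption made for illustration; this post does not say how the API itself treats such values.

```python
# Hypothetical helper, not part of any SDK: keeps a thinking budget inside
# the 0 to 24576 token range stated for 2.5 Flash.
MIN_THINKING_BUDGET = 0
MAX_THINKING_BUDGET = 24576


def clamp_thinking_budget(requested_tokens: int) -> int:
    """Return the requested budget, clamped into the stated range."""
    return max(MIN_THINKING_BUDGET, min(MAX_THINKING_BUDGET, requested_tokens))


def is_thinking_off(budget: int) -> bool:
    """A budget of 0 corresponds to the lowest-cost, lowest-latency setting."""
    return clamp_thinking_budget(budget) == 0
```

The clamped value could then be supplied as the thinking_budget in a request configuration.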
The following prompts demonstrate how much reasoning may be used in 2.5 Flash’s default mode.
Prompts requiring low reasoning:
Example 1: “Thank you” in Spanish
Example 2: How many provinces does Canada have?
Prompts requiring medium reasoning:
Example 1: You roll two dice. What’s the probability they add up to 7?
Example 2: My gym has pickup hours for basketball between 9-3pm on MWF and between 2-8pm on Tuesday and Saturday. If I work 9-6pm 5 days a week and want to play 5 hours of basketball a week, create a schedule for me to make it all work.
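To give a sense of the reasoning the dice prompt calls for, here is a short sketch that checks the answer by brute force. It assumes two fair six-sided dice, which the prompt implies but does not state, and it illustrates the arithmetic only; it is not the model's output.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Keep the outcomes whose faces add up to 7: (1,6), (2,5), ..., (6,1).
favorable = [pair for pair in outcomes if sum(pair) == 7]

probability = Fraction(len(favorable), len(outcomes))
print(probability)  # 1/6
```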
Prompts requiring high reasoning:
Example 1: A cantilever beam of length L = 3 m has a rectangular cross-section (width b = 0.1 m, height h = 0.2 m) and is made of steel (E = 200 GPa). It is subjected to a uniformly distributed load w = 5 kN/m along its entire length and a point load P = 10 kN at its free end. Calculate the maximum bending stress (σ_max).
Example 2: Write a function evaluate_cells(cells: Dict[str, str]) -> Dict[str, float] that computes the values of spreadsheet cells.
Each cell contains:
- A number
- Or a formula like "=A1 + B1 * 2" using +, -, *, / and other cells.
Requirements:
- Resolve dependencies between cells.
- Handle operator precedence (*/ before +-).
- Detect cycles and raise ValueError("Cycle detected at <cell>").
- No eval(). Use only built-in libraries.
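For readers curious what the spreadsheet prompt involves, the sketch below is one possible answer, written by hand for illustration and not taken from the model. It uses a small recursive-descent parser with memoized, depth-first cell evaluation. Several choices are assumptions the prompt does not specify: unknown cell references raise KeyError, unary minus is accepted, and parentheses are not supported.

```python
import re
from typing import Dict, List

# A token is a number, a cell reference such as A1, or one of + - * /.
TOKEN_RE = re.compile(r"\s*(\d+\.?\d*|\.\d+|[A-Za-z]+\d+|[-+*/])")


def tokenize(expr: str) -> List[str]:
    tokens, pos = [], 0
    expr = expr.rstrip()
    while pos < len(expr):
        match = TOKEN_RE.match(expr, pos)
        if match is None:
            raise ValueError(f"Unexpected character at position {pos} in {expr!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate_cells(cells: Dict[str, str]) -> Dict[str, float]:
    values: Dict[str, float] = {}  # finished results (memo)
    in_progress = set()            # cells on the current dependency path

    def value_of(name: str) -> float:
        if name in values:
            return values[name]
        if name not in cells:
            raise KeyError(f"Unknown cell {name}")  # assumption: not in the prompt
        if name in in_progress:
            raise ValueError(f"Cycle detected at {name}")
        in_progress.add(name)
        text = cells[name].strip()
        if text.startswith("="):
            result = parse(tokenize(text[1:]))
        else:
            result = float(text)
        in_progress.discard(name)
        values[name] = result
        return result

    def parse(tokens: List[str]) -> float:
        pos = 0

        def factor() -> float:
            nonlocal pos
            if pos >= len(tokens):
                raise ValueError("Unexpected end of formula")
            token = tokens[pos]
            pos += 1
            if token == "-":          # unary minus
                return -factor()
            if token[0].isalpha():    # cell reference: resolve the dependency
                return value_of(token)
            return float(token)

        def term() -> float:          # * and / bind tighter than + and -
            nonlocal pos
            result = factor()
            while pos < len(tokens) and tokens[pos] in ("*", "/"):
                op = tokens[pos]
                pos += 1
                right = factor()
                result = result * right if op == "*" else result / right
            return result

        def expression() -> float:
            nonlocal pos
            result = term()
            while pos < len(tokens) and tokens[pos] in ("+", "-"):
                op = tokens[pos]
                pos += 1
                right = term()
                result = result + right if op == "+" else result - right
            return result

        total = expression()
        if pos != len(tokens):
            raise ValueError(f"Unexpected token {tokens[pos]!r}")
        return total

    for name in cells:
        value_of(name)
    return values


print(evaluate_cells({"A1": "3", "B1": "4", "C1": "=A1 + B1 * 2"}))
# {'A1': 3.0, 'B1': 4.0, 'C1': 11.0}
```

Splitting the grammar into expression, term, and factor is what gives multiplication and division priority over addition and subtraction, and the in_progress set is what turns a circular reference into the required ValueError.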
Start building with Gemini 2.5 Flash today
Gemini 2.5 Flash with thinking capabilities is now available in preview via the Gemini API in Google AI Studio and in Vertex AI, and in a dedicated dropdown in the Gemini app. We encourage you to experiment with the thinking_budget parameter and explore how controllable reasoning can help you solve more complex problems.
from google import genai

# Replace the placeholder with your own API key.
client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",
    contents="You roll two dice. What’s the probability they add up to 7?",
    config=genai.types.GenerateContentConfig(
        # Cap the thinking phase at 1024 tokens.
        thinking_config=genai.types.ThinkingConfig(
            thinking_budget=1024
        )
    )
)

print(response.text)
Find detailed API references and thinking guides in our developer docs.
We will continue to improve Gemini 2.5 Flash before we make it generally available for full production use.
*Model pricing is sourced from Artificial Analysis and company documentation.