Gemini 2.0 Flash-Lite does support audio files. In Google AI Studio you can simply upload one and ask something like "Transcribe this file".
I know I can upload the file somewhere and provide a link to it, but is it also possible to send the file directly in an OpenRouter API request to Gemini?
I tried something like this, but it didn't work:
self.update_status("Transcribing file with GEMINI API (OpenRouter)...")
try:
    with open(file_path, "rb") as audio_file:
        audio_data = audio_file.read()
    encoded_audio = base64.b64encode(audio_data).decode("utf-8")
    # Message now instructs Gemini to transcribe the file and include timestamps.
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe the file and include timestamps."},
                {
                    "type": "audio_file",
                    "audio_file": {
                        "data": encoded_audio,
                        "filename": os.path.basename(file_path)
                    }
                }
            ]
        }
    ]
    completion = self.gemini_client.chat.completions.create(
        extra_headers={
            "Authorization": f"Bearer {OPENROUTER_API_KEY}",
            "HTTP-Referer": HTTP_REFERER,
            "X-Title": X_TITLE,
        },
        extra_body={},
        model="google/gemini-2.0-flash-lite-001",
        messages=messages,
        timeout=120
    )
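
One variant that may be worth trying: the OpenAI chat-completions schema has no "audio_file" content type, but it does define an "input_audio" part with "data" (base64) and "format" fields. Below is a minimal sketch of building the message that way. Whether OpenRouter forwards this part to Gemini is an assumption here, not something confirmed, and build_audio_message is a hypothetical helper name used only for illustration.

```python
import base64
from pathlib import Path


def build_audio_message(file_path: str, prompt: str) -> list[dict]:
    """Build a chat message carrying base64 audio as an `input_audio` part.

    Hypothetical helper: the payload shape follows the OpenAI-style schema;
    support for it on the OpenRouter/Gemini side is assumed, not verified.
    """
    path = Path(file_path)
    # Read the raw bytes and base64-encode them as an ASCII string.
    encoded_audio = base64.b64encode(path.read_bytes()).decode("utf-8")
    # Derive the format from the file extension, e.g. "wav" or "mp3".
    audio_format = path.suffix.lstrip(".").lower()
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "input_audio",
                    "input_audio": {"data": encoded_audio, "format": audio_format},
                },
            ],
        }
    ]
```

The returned list would be passed as messages= to the same chat.completions.create call as above, for example build_audio_message(file_path, "Transcribe the file and include timestamps.").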