CoderFunda

03 September, 2024

How do I pass data directly from a Google Cloud TTS SynthesizeSpeechResponse object to sounddevice without creating a file on disk?

 Programing Coderfunda     September 03, 2024     No comments   

Question




How do I pass data from a google.cloud.texttospeech_v1.types.cloud_tts.SynthesizeSpeechResponse object to the sounddevice play function without creating an audio file on disk?


Problem




The data supplied by SynthesizeSpeechResponse has RIFF WAV header information in its byte stream, but I do not know how to handle it as such. I can easily save it to a file on disk and then read that file to play the audio, but I want to keep the data in memory without writing to disk.


err=LibsndfileError(2, 'Error opening b\'RIFF\\xcbt\\x00\


The response docstring (response.__doc__) reminds us that "Note: as with all bytes fields, protobuffers use a pure binary representation, whereas JSON representations use base64.", but I have not found a way to access this JSON representation readily.


Environment





* torch base platform (CUDA 11.8)

* google-3.0.0

* google-cloud-texttospeech-2.17.2

* transformers-4.44.1

* soundfile-0.11.1

* sounddevice-0.5.0

* numpy-1.26.4






Code


import soundfile as sf
import sounddevice as sd

from transformers import pipeline
from google.cloud import texttospeech

text_to_speak = "The quick brown fox"

google_client = texttospeech.TextToSpeechClient()
input_text = texttospeech.SynthesisInput(text=text_to_speak)
selected_voice_name = "en-US-Standard-B"
voice = texttospeech.VoiceSelectionParams(
    language_code=selected_voice_name[:5],  # first five characters: "en-US"
    name=selected_voice_name,
    ssml_gender=texttospeech.SsmlVoiceGender.FEMALE,
)
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.LINEAR16  # MULAW # ALAW
)
response = google_client.synthesize_speech(
    input=input_text, voice=voice, audio_config=audio_config
)

# The below works by creating file on disk, then reading and playing that file

output_file_name = "google_cloud_tts_output.wav"

with open(output_file_name, "wb") as out:
    out.write(response.audio_content)

# Read the file back only after it has been closed (and flushed)
r_data, r_samplerate = sf.read(output_file_name)
sd.play(r_data, r_samplerate)

# The below does not work
# AttributeError: Unknown field for SynthesizeSpeechResponse: decode

sd.play(response.decode(response.audio_content), 16000)

# The below does not work because the data type is unsupported
# In this case the '674416' is subject to change, it could be a five or six digit number
# TypeError: Unsupported data type: 'bytes674416'

sd.play(response.audio_content, 16000)

# The below does not work, apparently because the data is malformed
# File "c:\Projects\ventriloquist\.venv\lib\site-packages\soundfile.py", line 1216, in _open
# raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
# soundfile.LibsndfileError: Error opening b'RIFFFI\x01\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00\xc0]\x00\x00\x80\xbb\x0
# ...
# xff\xf0\xff\xf2\xff\xee\xff': System error.

sd.play(sf.read(response.audio_content), 16000)

# The below does not work, apparently because the data is malformed
# ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

import io

io_ir = io.BytesIO(response.audio_content)

sd.play(sf.read(io_ir), 16000)
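One way to keep everything in memory is to wrap the bytes in io.BytesIO and let a WAV parser split the header from the samples. The following is a minimal sketch, not a definitive implementation: it assumes the LINEAR16 response is a 16-bit PCM WAV stream that the standard-library wave module can parse, and decode_wav_bytes and play_wav_bytes are hypothetical helper names that appear nowhere in the Google or sounddevice APIs.

```python
import io
import wave


def decode_wav_bytes(audio_content):
    """Split in-memory WAV bytes into raw PCM frames and their format fields."""
    with wave.open(io.BytesIO(audio_content), "rb") as wav:
        samplerate = wav.getframerate()
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        frames = wav.readframes(wav.getnframes())
    return frames, samplerate, channels, sample_width


def play_wav_bytes(audio_content):
    """Play in-memory WAV bytes through sounddevice without touching the disk."""
    # Imported lazily so that decoding also works on machines without audio
    import numpy as np
    import sounddevice as sd

    frames, samplerate, channels, sample_width = decode_wav_bytes(audio_content)
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    # Interpret the frames as little-endian int16, one column per channel
    samples = np.frombuffer(frames, dtype="<i2").reshape(-1, channels)
    # Use the sample rate from the header rather than a hard-coded value
    sd.play(samples, samplerate)
    sd.wait()  # block until playback has finished
```

With the response object from the question, play_wav_bytes(response.audio_content) would be the call to try. If you prefer to stay with soundfile, note that sf.read returns a (data, samplerate) tuple, so the last attempt above likely only needs unpacking: data, samplerate = sf.read(io.BytesIO(response.audio_content)), followed by sd.play(data, samplerate).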



When working with speech synthesis data returned by Hugging Face transformers, it is easy to convert this data object to what sounddevice is expecting:
speech = tts_model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sd.play(speech.numpy(), 16000)



When working with the Google Cloud TTS SynthesizeSpeechResponse object, the expectation is that the contents are written to a WAV or MP3 file. Details on the format of the data in the SynthesizeSpeechResponse object are not provided, but empirically it is formatted to be saved to disk as-is.
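The LibsndfileError message quoted in the question already exposes the first bytes of the stream, which is enough to read the format fields of the WAV header. Here is a minimal sketch, assuming the canonical layout in which the fmt chunk immediately follows the RIFF and WAVE identifiers; parse_wav_header_prefix is a hypothetical helper name, and the header bytes are copied from that error message.

```python
import struct

# First 28 bytes of the response, as shown in the error message above
header = b"RIFFFI\x01\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00\xc0]\x00\x00"


def parse_wav_header_prefix(data):
    """Decode the RIFF identifiers and the first fields of the fmt chunk."""
    # Little-endian: 4-byte ids, 32-bit sizes, 16-bit format and channel count
    (riff_id, riff_size, wave_id, fmt_id, fmt_size,
     audio_format, channels, sample_rate) = struct.unpack("<4sI4s4sIHHI", data[:28])
    if riff_id != b"RIFF" or wave_id != b"WAVE" or fmt_id != b"fmt ":
        raise ValueError("not a canonical RIFF/WAVE header")
    return {
        "riff_size": riff_size,
        "fmt_size": fmt_size,
        "audio_format": audio_format,  # 1 means uncompressed PCM
        "channels": channels,
        "sample_rate": sample_rate,
    }


info = parse_wav_header_prefix(header)
print(info)  # sample_rate is 24000 for these bytes
```

For these bytes the header describes mono PCM at 24000 Hz, so passing a hard-coded 16000 to sd.play would play the audio at the wrong speed; reading the rate from the header avoids that.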


Copyright © 2025 CoderFunda | Powered by Blogger