AssemblyAI Launches Ruby SDK for Enhanced Audio Processing

On Aug 13, 2024

Ted Hisokawa
Aug 13, 2024 04:37

AssemblyAI has introduced a Ruby SDK, enabling users to transcribe audio, use audio intelligence models, and apply LLMs to audio data.

AssemblyAI has released a Ruby SDK intended to simplify working with its speech AI models. According to AssemblyAI, the SDK lets developers transcribe audio, use audio intelligence models, and apply Large Language Models (LLMs) to their audio data through LeMUR.

Transcribing Audio Files

The Ruby SDK can transcribe both remote and local audio files. To transcribe a remote file, pass its URL to the transcribe call:

require 'assemblyai'

client = AssemblyAI::Client.new(api_key: 'YOUR_API_KEY')

transcript = client.transcripts.transcribe(
  audio_url: 'https://storage.googleapis.com/aai-docs-samples/nbc.mp3'
)

abort transcript.error if transcript.status == AssemblyAI::Transcripts::TranscriptStatus::ERROR

puts transcript.text

Local files are transcribed in two steps, reusing the client created above: upload the file, then pass the resulting upload URL to the transcribe call:

uploaded_file = client.files.upload(file: '/path/to/your/file')
transcript = client.transcripts.transcribe(
  audio_url: uploaded_file.upload_url
)

Detailed instructions for transcribing audio files are available in the AssemblyAI documentation.
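
As a rough illustration, the remote and local cases can be folded into a single helper. The sketch below is not part of the SDK: remote_audio? and transcribe_source are hypothetical names, and it assumes a client constructed as in the first snippet. It relies only on the files.upload and transcripts.transcribe calls already shown.

```ruby
require 'uri'

# Hypothetical helper: true when +source+ parses as an http(s) URL.
def remote_audio?(source)
  uri = URI.parse(source)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  # Strings that are not valid URIs are treated as local paths.
  false
end

# Hypothetical wrapper: uploads +source+ first when it is a local path,
# then requests a transcript. +client+ is assumed to expose the same
# files.upload and transcripts.transcribe calls used in the article.
def transcribe_source(client, source)
  audio_url = if remote_audio?(source)
                source
              else
                client.files.upload(file: source).upload_url
              end
  client.transcripts.transcribe(audio_url: audio_url)
end
```

Under these assumptions, transcribe_source(client, '/path/to/your/file') would upload the file before transcribing, while an https URL would be passed through unchanged.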

Applying LLMs to Audio Data with LeMUR

The Ruby SDK also supports applying LLMs to audio data through LeMUR. For example, an application can ask for a summary of a finished transcript:

response = client.lemur.task(
  transcript_ids: [transcript.id],
  prompt: 'Summarize this transcript.'
)

puts response.response

More information on using LLMs with audio data can be found in the AssemblyAI documentation.

Utilizing Audio Intelligence Models

Another key feature of the Ruby SDK is its support for audio intelligence models, which analyze audio for attributes such as sentiment. Setting sentiment_analysis: true in the transcription request enables sentiment analysis:

transcript = client.transcripts.transcribe(
  audio_url: 'https://storage.googleapis.com/aai-docs-samples/nbc.mp3',
  sentiment_analysis: true
)

abort transcript.error if transcript.status == AssemblyAI::Transcripts::TranscriptStatus::ERROR

transcript.sentiment_analysis_results.each do |result|
  puts result.text
  puts result.sentiment
  puts result.confidence
  printf("%<start>d - %<end>d\n", start: result.start, end: result.end_)
end

Additional details on audio intelligence models are available in the AssemblyAI documentation.
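
Per-sentence results like those printed above are often easier to read in aggregate. The following is a minimal sketch, not an SDK feature: summarize_sentiment is a hypothetical helper, and it assumes only that each result responds to sentiment and confidence, as in the loop above.

```ruby
# Hypothetical helper: counts results per sentiment label and averages
# their confidence. Accepts any objects that respond to #sentiment and
# #confidence; labels are converted to strings to serve as hash keys.
def summarize_sentiment(results)
  results.group_by { |r| r.sentiment.to_s }.transform_values do |group|
    {
      count: group.size,
      mean_confidence: (group.sum(&:confidence) / group.size.to_f).round(3)
    }
  end
end
```

It could be called as summarize_sentiment(transcript.sentiment_analysis_results), returning a hash keyed by whatever sentiment labels the results carry.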

Getting Started with the Ruby SDK

To begin using the Ruby SDK, developers can refer to the installation instructions and the README of the Ruby SDK GitHub repository. For any issues or feedback, users are encouraged to file an issue on the GitHub repository.
