Getting Claude Code to work with AWS Bedrock requires some configuration that isn't obvious from the documentation. Here's what I learned after working through a few issues.
The Solution
The key was combining multiple environment variables and properly exporting AWS credentials. Here's the shell script I created to make it all work:
#!/bin/zsh
# Pull the profile's credentials into this shell's environment
eval "$(aws configure export-credentials --profile ravenna-staging --format env)"
# Bedrock settings for Claude Code
export AWS_REGION=us-west-2
export CLAUDE_CODE_USE_BEDROCK=1
export ANTHROPIC_MODEL='us.anthropic.claude-3-7-sonnet-20250219-v1:0'
export DISABLE_PROMPT_CACHING=1
# Pass any arguments through to Claude Code
claude "$@"
Critical Components
There were two critical parts to making this work:
- Exporting AWS credentials from an existing profile into the current terminal environment
- Setting AWS_REGION explicitly (even if a default region is already set in your AWS config)
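To make the eval line less magical: export-credentials just prints a series of export statements, and eval runs them in the current shell. Here's a minimal simulation of the mechanics, with fake placeholder credentials standing in for what the AWS CLI would actually emit:

```shell
# Simulated output of `aws configure export-credentials --format env`
# (the real command prints your profile's actual keys in this same shape)
creds='export AWS_ACCESS_KEY_ID=AKIAEXAMPLE
export AWS_SECRET_ACCESS_KEY=examplesecret'

# eval runs those export statements in the current shell,
# so child processes like `claude` inherit them
eval "$creds"
echo "$AWS_ACCESS_KEY_ID"
```

This is also why the eval has to happen in the same shell (or script) that launches claude — exporting in a subshell wouldn't reach it.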
I had a couple of conversations with the new Multimodal Live API through Google AI Studio.
I wanted to have a voice-to-voice conversation about an idea I was working through, and wanted Flash to walk me through the potential pitfalls.
Flash didn't give me the sophisticated kind of response I expect from Claude or ChatGPT. I specifically asked it to compare and contrast different AWS instances, and it wouldn't list them out (despite it being pretty obvious that it has that knowledge).
Based on its style, the voice sounds like it's generated with text-to-speech rather than being "natively multimodal" right now. Some of the responses were a little awkward.
I'm sure it's just early, and they'll make improvements. It's also unfair to compare the "Flash" line of models with Sonnet and 4o.
I really just want Advanced Voice Mode with GPTs! I want to be able to orchestrate some basic workflows with my voice. Coming soon, I'm sure.
I've been working with the DeepSeek API while building my tool carrier.nvim.
It has an excellent new beta feature, also offered by Anthropic, called chat prefix completion: you set what the assistant's response should start with.
This is really useful in contexts like code completion, where you want the model to respond with just the code. Start the response with three backticks and a newline, and you've essentially told the model that its reply opens a code block.
https://platform.deepseek.com/api-docs/chat_prefix_completion
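The pattern is easy to sketch as a request body. The beta endpoint and the "prefix" flag on the final assistant message follow DeepSeek's docs linked above; the model name, prompt, and file path here are just illustrative:

```shell
# Sketch of a chat-prefix-completion request body for DeepSeek's beta API.
# The final assistant message with "prefix": true is continued by the model,
# and stopping on ``` means only the code block comes back.
cat <<'EOF' > /tmp/prefix_request.json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "Write a Lua function that reverses a string."},
    {"role": "assistant", "content": "```lua\n", "prefix": true}
  ],
  "stop": ["```"]
}
EOF

# To actually send it (DEEPSEEK_API_KEY assumed to be set):
#   curl https://api.deepseek.com/beta/chat/completions \
#     -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d @/tmp/prefix_request.json
cat /tmp/prefix_request.json
```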
This makes getting Just the Code back consistently really easy for my tool! I highly recommend you try this out in your applications, especially if you're asking for code completions.