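Basic Information about Amazon Bedrock with Runtime API Examples - References, Model Features, Pricing, How to Use, and Explanations of Tokens and Inference Parameters

This is Hidekazu Konishi.
In this article, I summarize the basic information about Amazon Bedrock, which became Generally Available (GA) on 2023-09-28, along with execution examples of its Runtime API. I have also included brief explanations of key terms where needed, so that readers can get a feel for tokens and parameters.
Last updated: 2024/06/21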

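Note: The list of Amazon Bedrock models as of the end of 2024, after AWS re:Invent 2024, is introduced in the following article.
  Amazon Bedrock Models as of 2024 - An Analysis of the Comprehensive Model Catalog
Note: The source code published in this article and in my other articles was created as part of independent research and is not guaranteed to work. Please use it at your own risk. It may also be modified without notice.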

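This article is organized as follows.

Basic Information about Amazon Bedrock
Basic Usage of Amazon Bedrock
Summary

Basic Information about Amazon Bedrock

Reference Materials and Learning Resources for Amazon Bedrock

The main reference materials and learning resources that help in understanding Amazon Bedrock are listed below.
The content of this article is based on the information in these materials.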

What's New： Amazon Bedrock is now generally available
AWS Blog: Amazon Bedrock Is Now Generally Available – Build and Scale Generative AI Applications with Foundation Models
Pricing by model： Amazon Bedrock Pricing
Workshop： GitHub - aws-samples/amazon-bedrock-workshop: This is a workshop designed for Amazon Bedrock a foundational model service.
AWS Documentation(User Guide)： What is Amazon Bedrock? - Amazon Bedrock
AWS Documentation(API Reference)： Bedrock API Reference - Amazon Bedrock
AWS SDK for Python(Boto3) Documentation(Bedrock)： Bedrock - Boto3 documentation
AWS SDK for Python(Boto3) Documentation(BedrockRuntime)： BedrockRuntime - Boto3 documentation
AWS CLI Command Reference(bedrock)： bedrock — AWS CLI Command Reference
AWS CLI Command Reference(bedrock-runtime)： bedrock-runtime — AWS CLI Command Reference
AWS Management Console(Amazon Bedrock Model Providers): Amazon Bedrock Model Providers - AWS Management Console
AI and Machine Learning Glossary： AI and Machine Learning Glossary for AWS - Knowledge Gained While Studying for AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate

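What Is Amazon Bedrock?

Amazon Bedrock is a service that provides API access to foundation models (FMs) such as AI21 Labs' Jurassic-2, Amazon's Titan, Anthropic's Claude, Cohere's Command, Meta's Llama 2, and Stability AI's Stable Diffusion, along with the ability to privately customize FMs using your own data.
You can choose a foundation model to suit use cases such as text generation, chatbots, search, text summarization, image generation, and personalized recommendations, and then build and scale Generative AI applications.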

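What Are Tokens in Text-Based Generative AI?

Before looking at the model list and price tables for Amazon Bedrock, I will briefly explain tokens, which are the unit used for limits and billing.
Note that this explanation prioritizes ease of understanding, so it may differ from the strict definition.

In Generative AI that handles text, tokens are the units produced by splitting text into meaningful parts.
A token can correspond to a word, but it is not necessarily the same as a word; text may also be split into characters, subwords, and so on.

For example, word-based tokenization of the string Amazon Bedrock is amazing! gives the following.
["Amazon", "Bedrock", "is", "amazing", "!"]

However, a tokenization method that is not word-based may split the text so that spaces are also included, as follows.
["Amazon", " ", "Bedrock", " ", "is", " ", "amazing", "!"]

Besides word-based tokenization, there are more advanced methods such as Unigram Tokenization, WordPiece, SentencePiece, and Byte Pair Encoding (BPE). The method adopted varies from model to model, so this is something to keep in mind.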
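
As a rough illustration of the two splits above, here is a minimal sketch using only Python's standard `re` module. The function names `word_tokenize` and `tokenize_keep_spaces` are hypothetical helpers made up for this example; they are not the tokenizer of any Bedrock model, and real subword tokenizers such as BPE behave differently.

```python
import re

text = "Amazon Bedrock is amazing!"

def word_tokenize(s):
    # Runs of word characters and single punctuation marks; whitespace is discarded.
    return re.findall(r"\w+|[^\w\s]", s)

def tokenize_keep_spaces(s):
    # Same split, but every whitespace character is kept as its own token.
    return re.findall(r"\w+|\s|[^\w\s]", s)

print(word_tokenize(text))
print(tokenize_keep_spaces(text))
print(len(word_tokenize(text)), len(tokenize_keep_spaces(text)))
```

The same string yields a different number of tokens under each rule, which is why token counts (and therefore limits and charges) depend on the tokenization method a model uses.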

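In particular, when calculating token-based charges, I think the best approach is to count tokens according to the target model's tokenization method, in a scenario close to the actual conditions of use.
Personally, however, when I do not want to spend time and effort on detailed token predictions, for example when considering a monthly budget for a Generative AI service I use, I either have the Generative AI itself do the calculation or estimate charges on the high side by treating 1 character as 1 token to keep the arithmetic simple.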

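List of Available Models

I compiled the data as of the time of writing, based on the product page Amazon Bedrock – AWS and Amazon Bedrock Model Providers in the AWS Management Console.

Note: Models that support Embeddings (Embed) can convert text input (words, phrases, larger units of text, and so on) into a numerical representation (embedding) that captures the semantic content of the text.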

Model Provider	Model	Model ID	Max tokens	Modality (Data Type)	Languages	Supported use cases
AI21 Labs	Jurassic-2 Ultra (v1)	ai21.j2-ultra-v1	8191	Text	English Spanish French German Portuguese Italian Dutch	Open book question answering summarization draft generation information extraction ideation
AI21 Labs	Jurassic-2 Mid (v1)	ai21.j2-mid-v1	8191	Text	English Spanish French German Portuguese Italian Dutch	Open book question answering summarization draft generation information extraction ideation
Amazon	Titan Embeddings G1 - Text (v1.2)	amazon.titan-embed-text-v1	8k	Embedding	English, Arabic, Chinese (Sim.), French, German, Hindi, Japanese, Spanish, Czech, Filipino, Hebrew, Italian, Korean, Portuguese, Russian, Swedish, Turkish, Chinese (trad), Dutch, Kannada, Malayalam, Marathi, Polish, Tamil, Telugu and others.	Translate text inputs (words, phrases or possibly large units of text) into numerical representations (known as embeddings) that contain the semantic meaning of the text.
Amazon	Titan Text G1 - Lite	amazon.titan-text-lite-v1	4k	Text	English	Summarization and copywriting.
Amazon	Titan Text G1 - Express	amazon.titan-text-express-v1	8k	Text	English (GA), Multilingual in 100+ languages (Preview)	Open ended text generation brainstorming summarization code generation table creation data formatting paraphrasing chain of thought rewrite extraction Q&A chat
Amazon	Titan Image Generator G1	amazon.titan-image-generator-v1	77	Image	English	Text to image generation image editing image variations
Amazon	Titan Multimodal Embeddings G1	amazon.titan-embed-image-v1	128	Embedding	English	Search recommendation personalization
Anthropic	Claude 3.5 Sonnet	anthropic.claude-3-5-sonnet-20240620-v1:0	200k	Text	English and multiple other languages	Complex tasks like customer support Coding Data Analysis and Visual Processing. Streamlining of Workflows Generation of Insights and Production of High-Quality Natural-Sounding Content.
Anthropic	Claude 3 Opus	anthropic.claude-3-opus-20240229-v1:0	200k	Text	English and multiple other languages	Task automation: plan and execute complex actions across APIs and databases, interactive coding R&D: research review, brainstorming and hypothesis generation, drug discovery Strategy: advanced analysis of charts & graphs, financials and market trends, forecasting
Anthropic	Claude 3 Sonnet	anthropic.claude-3-sonnet-20240229-v1:0	200k	Text	English and multiple other languages	Data processing: RAG or search & retrieval over vast amounts of knowledge Sales: product recommendations, forecasting, targeted marketing Time-saving tasks: code generation, quality control, parse text from images
Anthropic	Claude 3 Haiku	anthropic.claude-3-haiku-20240307-v1:0	200k	Text	English and multiple other languages	Customer interactions: quick and accurate support in live interactions, translations Content moderation: catch risky behavior or customer requests Cost-saving tasks: optimized logistics, inventory management, extract knowledge from unstructured data
Anthropic	Claude v2.1	anthropic.claude-v2:1	200k	Text	English and multiple other languages	Question answering information extraction removing PII content generation multiple choice classification Roleplay comparing text summarization document Q&A with citation
Anthropic	Claude v2	anthropic.claude-v2	100k	Text	English and multiple other languages	Question answering information extraction removing PII content generation multiple choice classification Roleplay comparing text summarization document Q&A with citation
Anthropic	[Legacy version] Claude v1.3	anthropic.claude-v1	100k	Text	English and multiple other languages	Question answering information extraction removing PII content generation multiple choice classification Roleplay comparing text summarization document Q&A with citation
Anthropic	Claude Instant v1.2	anthropic.claude-instant-v1	100k	Text	English and multiple other languages	Question answering information extraction removing PII content generation multiple choice classification Roleplay comparing text summarization document Q&A with citation
Cohere	Command R+ (v1)	cohere.command-r-plus-v1:0	128k	Text	English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese	Complex RAG on large amounts of data Q&A Multi-step tool use chat text generation text summarization
Cohere	Command R (v1)	cohere.command-r-v1:0	128k	Text	English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese	Chat text generation text summarization RAG on large amounts of data Q&A function calling
Cohere	Command (v14.7)	cohere.command-text-v14	4000	Text	English	Summarization copywriting dialogue extraction question answering
Cohere	Command Light (v14.7)	cohere.command-light-text-v14	4000	Text	English	Summarization copywriting dialogue extraction question answering
Cohere	Embed English (v3)	cohere.embed-english-v3	512	Embedding	English	Semantic search retrieval-augmented generation (RAG) classification clustering
Cohere	Embed Multilingual (v3)	cohere.embed-multilingual-v3	512	Embedding	108 Languages	Semantic search retrieval-augmented generation (RAG) classification clustering
Meta	Llama 3 70B Instruct	meta.llama3-70b-instruct-v1:0	8k	Text	English	Language modeling Dialog systems Code generation Following instructions Sentiment analysis with nuances in reasoning Text classification with improved accuracy and nuance Text summarization with accuracy and nuance
Meta	Llama 3 8B Instruct	meta.llama3-8b-instruct-v1:0	8k	Text	English	Text summarization Text classification Sentiment analysis
Meta	Llama 2 Chat 13B	meta.llama2-13b-chat-v1	4096	Text	English	Text generation Conversation Chat based applications
Meta	Llama 2 Chat 70B	meta.llama2-70b-chat-v1	4096	Text	English	Text generation Conversation Chat based applications
Mistral AI	Mistral 7B Instruct	mistral.mistral-7b-instruct-v0:2	32K	Text	English	Classification Text generation Code generation
Mistral AI	Mixtral 8x7B Instruct	mistral.mixtral-8x7b-instruct-v0:1	32K	Text	English, French, Italian, German and Spanish	Complex reasoning & analysis Text generation Code generation
Mistral AI	Mistral Large	mistral.mistral-large-2402-v1:0	32K	Text	English, French, Italian, German and Spanish	Complex reasoning & analysis Text generation Code generation RAG Agents
Mistral AI	Mistral Small	mistral.mistral-small-2402-v1:0	32K	Text	English, French, Italian, German and Spanish	Text generation Code generation Classification RAG Conversation
Stability AI	[Legacy version] Stable Diffusion XL (v0.8)	stability.stable-diffusion-xl-v0	77	Image	English	image generation image editing
Stability AI	Stable Diffusion XL (v1.0)	stability.stable-diffusion-xl-v1	77	Image	English	image generation image editing

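Pricing of Available Models

I compiled the data as of the time of writing, based on Amazon Bedrock Pricing.

A model entry with no price listed indicates that the pricing option is not offered for that model, or that model customization itself is not supported.

Pricing for Text Models

Pricing for text models is set on the following items.

On-Demand
On-Demand is priced per 1,000 input tokens and per 1,000 output tokens (it is not a time-based payment).
Provisioned Throughput
Provisioned Throughput provisions enough throughput to meet requirements such as large-scale usage, in exchange for committing to time-based payment over a specified term.
The commitment term can be none, 1 month, or 6 months, and longer terms receive larger discounts.
Model customization(Fine-tuning)
When creating a custom model with Fine-tuning, you are charged a training fee per 1,000 tokens and a monthly storage fee per custom model.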
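
To make the On-Demand arithmetic concrete, here is a minimal sketch of a cost estimate. The helper names `on_demand_cost_usd` and `rough_token_estimate` are hypothetical, the prices are the Claude 3 Haiku On-Demand figures from the price table in this article, and the token count uses the "1 character = 1 token" shortcut mentioned earlier, so the result is only a rough estimate and not an actual bill.

```python
def on_demand_cost_usd(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    # On-Demand: input and output tokens are each billed per 1,000 tokens.
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k

def rough_token_estimate(text):
    # Shortcut from this article: treat 1 character as 1 token.
    # This is an assumption for budgeting, not a real tokenizer.
    return len(text)

# Claude 3 Haiku On-Demand prices (USD per 1,000 tokens) as listed in this article.
HAIKU_INPUT_PER_1K = 0.00025
HAIKU_OUTPUT_PER_1K = 0.00125

prompt = "Please tell us all the states in the U.S."
estimated_input_tokens = rough_token_estimate(prompt)
max_output_tokens = 800  # assume the response uses the full output limit

cost = on_demand_cost_usd(
    estimated_input_tokens, max_output_tokens, HAIKU_INPUT_PER_1K, HAIKU_OUTPUT_PER_1K
)
print(f"estimated cost per request: {cost:.6f} USD")
```

Multiplying the per-request figure by the expected number of requests per month gives a simple budget ceiling; Provisioned Throughput would instead be estimated as the hourly price multiplied by the number of hours in the commitment term.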

Model Provider	Model	On-Demand (per 1000 input tokens)	On-Demand (per 1000 output tokens)	Provisioned Throughput (per hour per model)	Model customization through Fine-tuning
AI21 Labs	Jurassic-2 Ultra	0.0188 USD	0.0188 USD	-	-
AI21 Labs	Jurassic-2 Mid	0.0125 USD	0.0125 USD	-	-
Amazon	Titan Text Lite(Titan Text G1 - Lite)	0.0003 USD	0.0004 USD	no commitment: 7.10 USD 1-month commitment: 6.40 USD 6-month commitment: 5.10 USD	Train(per 1000 tokens): 0.0004 USD Store each custom model(per month): 1.95 USD
Amazon	Titan Text Express(Titan Text G1 - Express)	0.0008 USD	0.0016 USD	no commitment: 20.50 USD 1-month commitment: 18.40 USD 6-month commitment: 14.80 USD	Train(per 1000 tokens): 0.008 USD Store each custom model(per month): 1.95 USD
Amazon	Titan Embeddings(Titan Embeddings G1 - Text)	0.0001 USD	N/A	no commitment: N/A 1-month commitment: 6.40 USD 6-month commitment: 5.10 USD	-
Anthropic	Claude 3.5 Sonnet	0.00300 USD	0.01500 USD	no commitment: N/A 1-month commitment: N/A 6-month commitment: N/A	-
Anthropic	Claude 3 Opus	0.01500 USD	0.07500 USD	no commitment: N/A 1-month commitment: N/A 6-month commitment: N/A	-
Anthropic	Claude 3 Sonnet	0.00300 USD	0.01500 USD	no commitment: N/A 1-month commitment: N/A 6-month commitment: N/A	-
Anthropic	Claude 3 Haiku	0.00025 USD	0.00125 USD	no commitment: N/A 1-month commitment: N/A 6-month commitment: N/A	-
Anthropic	Claude(v2.0, v2.1)	0.00800 USD	0.02400 USD	no commitment: N/A 1-month commitment: 63.00 USD 6-month commitment: 35.00 USD	-
Anthropic	Claude Instant(v1.2)	0.00080 USD	0.00240 USD	no commitment: N/A 1-month commitment: 39.60 USD 6-month commitment: 22.00 USD	-
Cohere	Command R+	0.0030 USD	0.0150 USD	-	-
Cohere	Command R	0.0005 USD	0.0015 USD	-	-
Cohere	Command	0.0015 USD	0.0020 USD	no commitment: 49.50 USD 1-month commitment: 39.60 USD 6-month commitment: 23.77 USD	Train(per 1000 tokens): 0.004 USD Store each custom model(per month): 1.95 USD
Cohere	Command-Light	0.0003 USD	0.0006 USD	no commitment: 8.56 USD 1-month commitment: 6.85 USD 6-month commitment: 4.11 USD	Train(per 1000 tokens): 0.001 USD Store each custom model(per month): 1.95 USD
Cohere	Embed – English	0.0001 USD	N/A	no commitment: 7.12 USD 1-month commitment: 6.76 USD 6-month commitment: 6.41 USD	-
Cohere	Embed – Multilingual	0.0001 USD	N/A	no commitment: 7.12 USD 1-month commitment: 6.76 USD 6-month commitment: 6.41 USD	-
Meta	Llama 3 Instruct 8B	0.0003 USD	0.0006 USD	-	-
Meta	Llama 3 Instruct 70B	0.00265 USD	0.0035 USD	-	-
Meta	Llama 2 Chat 13B	0.00075 USD	0.00100 USD	no commitment: N/A 1-month commitment: 21.18 USD 6-month commitment: 13.08 USD	Train(per 1000 tokens): 0.00149 USD Store each custom model(per month): 1.95 USD
Meta	Llama 2 Chat 70B	0.00195 USD	0.00256 USD	no commitment: N/A 1-month commitment: 21.18 USD 6-month commitment: 13.08 USD	Train(per 1000 tokens): 0.00799 USD Store each custom model(per month): 1.95 USD
Mistral AI	Mistral 7B Instruct	0.00015 USD	0.0002 USD	-	-
Mistral AI	Mixtral 8x7B Instruct	0.00045 USD	0.0007 USD	-	-
Mistral AI	Mistral Small	0.001 USD	0.003 USD	-	-
Mistral AI	Mistral Large	0.004 USD	0.012 USD	-	-

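Pricing for Multimodal Models

Pricing for multimodal models that process images and other media is based on various criteria such as the number of images and resolution, so I have summarized it per model.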

Model Provider	Model	Standard quality(<51 steps) (per image)	Premium quality(>51 steps) (per image)	Provisioned Throughput (per hour per model)	Model customization through Fine-tuning
Stability AI	Stable Diffusion XL (v0.8)	512x512 or smaller: 0.018 USD Larger than 512x512: 0.036 USD	512x512 or smaller: 0.036 USD Larger than 512x512: 0.072 USD	-	-
Stability AI	Stable Diffusion XL (v1.0)	Up to 1024 x 1024: 0.04 USD	Up to 1024 x 1024: 0.08 USD	no commitment: N/A 1-month commitment: 49.86 USD 6-month commitment: 46.18 USD	-

Model Provider	Model	Standard quality (per image)	Premium quality (per image)	Provisioned Throughput (per hour per model)	Model customization through Fine-tuning
Amazon	Titan Image Generator	512x512: 0.008 USD 1024X1024: 0.01 USD	512x512: 0.01 USD 1024X1024: 0.012 USD	no commitment: N/A 1-month commitment: 16.20 USD 6-month commitment: 13.00 USD	Train(per image seen): 0.005 USD Store each custom model(per month): 1.95 USD
Amazon	Titan Image Generator(custom models)	512x512: 0.018 USD 1024X1024: 0.02 USD	512x512: 0.02 USD 1024X1024: 0.022 USD	no commitment: 23.40 USD 1-month commitment: 21.00 USD 6-month commitment: 16.85 USD	-

Model Provider	Model	On-Demand (per 1000 input tokens)	On-Demand (per input image)	Provisioned Throughput (per hour per model)	Model customization through Fine-tuning
Amazon	Titan Multimodal Embeddings	0.0008 USD	0.00006 USD	no commitment: 9.38 USD 1-month commitment: 8.45 USD 6-month commitment: 6.75 USD	Train(per image seen): 0.0002 USD Store each custom model(per month): 1.95 USD

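Basic Usage of Amazon Bedrock

Getting Started with Amazon Bedrock

To get started with Amazon Bedrock, go to the Model access screen of Amazon Bedrock in the AWS Management Console, click Edit, select the models you want to use, and request access to them with Save changes.
Amazon Bedrock > Model access - AWS Management Console
Note: For Anthropic models, you need to enter information such as your company details and intended purpose when making the request.

Once the request is approved, access to the model is enabled and you can start using it.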

InvokeModel and InvokeModelWithResponseStream of the Amazon Bedrock Runtime API and Their Parameters

This section describes the APIs used to actually work with Amazon Bedrock.
The APIs related to Amazon Bedrock fall broadly into the Bedrock API and the Bedrock Runtime API.

The Bedrock API is used to operate on AWS resources, such as creating custom models through Fine-tuning and purchasing Provisioned Throughput for models.

The Bedrock Runtime API, on the other hand, is used for actual execution: you specify a base model or custom model, send input data (Prompt) in a request, and obtain output data (Completions) from the response.

The Amazon Bedrock Runtime API provides InvokeModel and InvokeModelWithResponseStream for actually invoking models.

InvokeModel returns the entire content of the response to a request at once.

InvokeModelWithResponseStream, by contrast, returns the content of the response gradually as a stream, a small number of characters at a time.
If you have used a chat-style Generative AI service, you have probably seen a screen where the result for a prompt appears a few characters at a time; InvokeModelWithResponseStream is what you can use for that style of display.

InvokeModel and InvokeModelWithResponseStream share the following request parameters.

accept: MIME type of the inference body in the response. (Default: application/json)
contentType: MIME type of the input data in the request. (Default: application/json)
modelId: [Required] Identifier of the model. (Example: ai21.j2-ultra-v1)
body: [Required] Input data in the format specified by contentType. The fields of body are formatted according to the inference parameters supported by each model.
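
As a rough sketch of how the streaming variant might be consumed with the AWS SDK for Python (Boto3), the example below passes the same four parameters to `invoke_model_with_response_stream` and decodes each streamed chunk as JSON. The helper names `iter_stream_chunks` and `invoke_streaming` are hypothetical, the default region is only a placeholder, and the fields inside each decoded chunk differ by model, so this sketch deliberately returns the raw JSON objects instead of assuming a field name. It also assumes the chosen model supports streaming and that access to it has been granted.

```python
import json

def iter_stream_chunks(event_stream):
    """Yield each streamed chunk as a decoded JSON object."""
    for event in event_stream:
        if "chunk" not in event:
            # Anything other than a chunk is treated as an error in this sketch.
            raise RuntimeError(f"unexpected stream event: {list(event)}")
        yield json.loads(event["chunk"]["bytes"])

def invoke_streaming(model_id, request_body, region_name="us-east-1"):
    # Imported here so the chunk decoder above can be used without the SDK installed.
    import boto3

    client = boto3.client("bedrock-runtime", region_name=region_name)
    response = client.invoke_model_with_response_stream(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=json.dumps(request_body),  # fields must follow the chosen model's format
    )
    return iter_stream_chunks(response["body"])
```

A caller would iterate over `invoke_streaming(...)` and append the model-specific text field of each chunk to the display as it arrives.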

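Meaning of Common Inference Parameters

Execution examples of the Amazon Bedrock Runtime API follow, but first I will briefly explain the common inference parameters that are often used in the body of a request to a model.
Note that this explanation prioritizes ease of understanding, so it may differ from the strict definition.

temperature
A parameter that adjusts the randomness and diversity of the model's output probability distribution. Larger values tend to produce more unexpected responses with high randomness and diversity, while smaller values tend to produce the responses estimated as most probable. The usual range of temperature is 0 to 1, but some models accept values above 1. For example, comparing temperature=1.0 with temperature=0.1, temperature=1.0 tends to return more random and diverse responses, and temperature=0.1 tends to return the responses estimated as more probable.
topK
A parameter that adjusts randomness and diversity by limiting the tokens the model considers to the top K. The optimal range of topK depends on the model used. When this value is set, output tokens are selected from among these top K. For example, with topK=10 the model considers only the 10 most probable tokens when generating a response. Put simply, topK limits the range of selectable tokens by the number of candidate tokens, and thereby also adjusts diversity.
topP
A parameter that adjusts randomness and diversity by sampling from the set of tokens accumulated before their cumulative probability exceeds the specified P. The usual range of topP is 0 to 1. For example, with topP=0.9 the model considers tokens in descending order of probability, up to the point just before their cumulative probability exceeds 0.9. Put simply, topP limits the range of selectable tokens based on the cumulative probability of the candidate tokens, and thereby also adjusts randomness and diversity.
maxTokens
A parameter that limits the maximum number of tokens generated and so controls the length of the generated text. For example, with maxTokens=800 the model will not generate text longer than 800 tokens.

In API requests, you combine temperature, topK, and topP to balance confidence against diversity, and use maxTokens to limit the number of output tokens.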
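
The following toy sketch shows how these three parameters could act on a small probability distribution. It is an illustration in plain Python, not the internal implementation of any Bedrock model: the function names and the four logits are made up for the example, and `top_p_filter` follows the simplified description above (stop before the cumulative probability exceeds P, always keeping at least one token), whereas many real implementations also include the token that crosses the threshold.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    # Lower temperature sharpens the distribution; higher temperature flattens it.
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def _keep_and_renormalize(probs, keep):
    keep = set(keep)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def top_k_filter(probs, k):
    # Keep only the k most probable tokens (k >= 1 is assumed).
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _keep_and_renormalize(probs, ranked[:k])

def top_p_filter(probs, p):
    # Keep tokens in descending order until adding one more would exceed p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in ranked:
        if keep and cumulative + probs[i] > p:
            break
        keep.append(i)
        cumulative += probs[i]
    return _keep_and_renormalize(probs, keep)

logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical scores for four candidate tokens
for t in (1.0, 0.1):
    print(t, [round(x, 3) for x in softmax_with_temperature(logits, t)])

probs = softmax_with_temperature(logits, 1.0)
filtered = top_p_filter(top_k_filter(probs, 3), 0.9)
next_token = random.choices(range(len(filtered)), weights=filtered, k=1)[0]
print("sampled token index:", next_token)
```

Tokens removed by either filter get probability 0 and can never be sampled, which is how topK and topP narrow the candidates before the final random choice.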

For the detailed inference parameters of each model in Amazon Bedrock, see "Inference parameters for foundation models - Amazon Bedrock".

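Example of Running invoke_model of Amazon Bedrock Runtime with the AWS SDK for Python (Boto3)

Here is an example of running invoke_model of Amazon Bedrock Runtime with the AWS SDK for Python (Boto3) in an AWS Lambda function.
At the time of writing, the bedrock and bedrock-runtime clients could not yet be called from the default AWS SDK for Python (Boto3) bundled with AWS Lambda functions.
The following example therefore adds the latest AWS SDK for Python (Boto3) as a Lambda Layer and uses the bedrock-runtime client.

・Execution example (AWS Lambda function)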

import json
import os

import boto3

# AWS_REGION is set by the Lambda runtime; the client is created once and reused across invocations.
region = os.environ.get('AWS_REGION')
bedrock_runtime_client = boto3.client('bedrock-runtime', region_name=region)

def lambda_handler(event, context):
    modelId = 'ai21.j2-ultra-v1'
    contentType = 'application/json'
    accept = 'application/json'
    # Request body using the inference parameters of the Jurassic-2 model.
    body = json.dumps({
        "prompt": "Please tell us all the states in the U.S.",
        "maxTokens": 800,
        "temperature": 0.7,
        "topP": 0.95
    })

    response = bedrock_runtime_client.invoke_model(
        modelId=modelId,
        contentType=contentType,
        accept=accept,
        body=body
    )
    # The response body is a stream, so read it and decode the JSON payload.
    response_body = json.loads(response.get('body').read())
    return response_body

・Example result (return value of the AWS Lambda function above)

{
    "id": 1234,
    "prompt": {
        "text": "Please tell us all the states in the U.S.",
        "tokens": [
            (omitted)
        ]
    },
    "completions": [
        {
            "data": {
                "text": "\nUnited States of America is a federal republic consisting of 50 states, a federal district (Washington, D.C., the capital city of the United States), five major territories, and various minor islands. The 50 states are Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, and Wyoming.",
                "tokens": [
                    (omitted)
                ]
            },
            "finishReason": {
                "reason": "endoftext"
            }
        }
    ]
}
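
Based on the response structure shown above, the generated text could be pulled out with a small helper like the following. This is a minimal sketch: `extract_completions` is a hypothetical name, the sample dictionary is an abbreviated stand-in for a real response, and the field layout is specific to the Jurassic-2 response in this example; other models return differently shaped bodies.

```python
def extract_completions(response_body):
    """Return the text and finish reason of each completion in a Jurassic-2 style response."""
    results = []
    for completion in response_body.get("completions", []):
        results.append({
            "text": completion["data"]["text"].strip(),
            "finish_reason": completion["finishReason"]["reason"],
        })
    return results

# Abbreviated sample with the same structure as the result above.
sample_response = {
    "id": 1234,
    "prompt": {"text": "Please tell us all the states in the U.S.", "tokens": []},
    "completions": [
        {
            "data": {"text": "\nUnited States of America is a federal republic.", "tokens": []},
            "finishReason": {"reason": "endoftext"},
        }
    ],
}

for item in extract_completions(sample_response):
    print(item["finish_reason"], "-", item["text"])
```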

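Note: At the time of writing, the latest AWS SDK for Python (Boto3) provides the invoke_model_with_response_stream method for Amazon Bedrock Runtime.
I plan to explain it in detail in a separate article, so it is omitted here.

Example of Running invoke-model of Amazon Bedrock Runtime with the AWS CLI

Here is an example of running invoke-model of Amazon Bedrock Runtime with the AWS CLI.
At the time of writing, AWS CLI Version 2 did not yet support the Amazon Bedrock Runtime API.
The following example was therefore run with a separately installed AWS CLI Version 1, which did support the Amazon Bedrock Runtime API.

・Format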

aws bedrock-runtime invoke-model \
    --region [Region] \
    --model-id "[modelId]" \
    --content-type "[contentType]" \
    --accept "[accept]" \
    --body "[body]" [Output FileName]

・Execution example

aws bedrock-runtime invoke-model \
    --region us-east-1 \
    --model-id "ai21.j2-ultra-v1" \
    --content-type "application/json" \
    --accept "application/json" \
    --body "{\"prompt\": \"Please tell us all the states in the U.S.\", \"maxTokens\": 800,\"temperature\": 0.7,\"topP\": 0.95}" invoke-model-output.txt

・Response example

* Screen output
{"contentType": "application/json"}

* File contents (invoke-model-output.txt)
{"id": 1234,"prompt": {"text": "Please tell us all the states in the U.S.","tokens": [(omitted)]},"completions": [{"data": {"text": "\nUnited States of America is a federal republic consisting of 50 states, a federal district (Washington, D.C., the capital city of the United States), five major territories, and various minor islands. The 50 states are Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, and Wyoming.","tokens": [(omitted)]},"finishReason": {"reason": "endoftext"}}]}

Note: At the time of writing, the AWS CLI does not provide an invoke-model-with-response-stream command for Amazon Bedrock Runtime.
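
Hand-escaping the JSON passed to --body is easy to get wrong, so one option is to generate the command from Python. The sketch below only builds and prints the command line using the same options as the execution example above; `build_invoke_model_command` is a hypothetical helper, and it assumes Python 3.8 or later for `shlex.join`.

```python
import json
import shlex

def build_invoke_model_command(model_id, body, output_file, region="us-east-1"):
    """Build the argument list for `aws bedrock-runtime invoke-model`."""
    return [
        "aws", "bedrock-runtime", "invoke-model",
        "--region", region,
        "--model-id", model_id,
        "--content-type", "application/json",
        "--accept", "application/json",
        "--body", json.dumps(body),  # json.dumps handles the quoting inside the JSON
        output_file,
    ]

request_body = {
    "prompt": "Please tell us all the states in the U.S.",
    "maxTokens": 800,
    "temperature": 0.7,
    "topP": 0.95,
}
command = build_invoke_model_command("ai21.j2-ultra-v1", request_body, "invoke-model-output.txt")

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

The same argument list could also be passed to `subprocess.run(command)` to execute the CLI directly without going through shell quoting at all.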

References:
Amazon Bedrock is now generally available
Amazon Bedrock Is Now Generally Available – Build and Scale Generative AI Applications with Foundation Models
Amazon Bedrock Pricing
GitHub - aws-samples/amazon-bedrock-workshop: This is a workshop designed for Amazon Bedrock a foundational model service.
What is Amazon Bedrock? - Amazon Bedrock
Bedrock API Reference - Amazon Bedrock
Bedrock - Boto3 documentation
BedrockRuntime - Boto3 documentation
bedrock — AWS CLI Command Reference
bedrock-runtime — AWS CLI Command Reference
Amazon Bedrock Model Providers - AWS Management Console
AI and Machine Learning Glossary for AWS - Knowledge Gained While Studying for AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer - Associate
Amazon Bedrock Models as of 2024 - An Analysis of the Comprehensive Model Catalog

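Summary

In this article, I covered Amazon Bedrock's reference materials, model list, pricing, usage, explanations of terms for tokens and parameters, and execution examples of the Runtime API.
While compiling this information, I found that Amazon Bedrock lets you choose from a wide variety of models to suit your use case, and that you can invoke them through AWS SDK and AWS CLI interfaces that fit well with other AWS services.
I will continue to watch Amazon Bedrock from the perspectives of updates, implementation methods, and combinations with other services.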

[English Edition] Basic Information about Amazon Bedrock with API Examples - Model Features, Pricing, How to Use, Explanation of Tokens and Inference Parameters

Written by Hidekazu Konishi