April 18, 2024

Let AI Do the Work, with CrewAI

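This tutorial shows how to use CrewAI to create and deploy multiple AI agents that carry out different tasks. Countless tutorials show off the capabilities of GPT-4, but here we use free, open-source models instead. That lets you learn about AI agents at no cost, and you may be surprised by how well they work.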

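The application we will build has AI agents work through a topic the way a person would and produces a good-quality blog post on any subject. This approach uses AI to streamline content production, freeing up time and resources for other priorities.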

Setting the Stage: Tools and Technologies

Before getting into the code, let's take a moment to understand the main components behind our AI-powered blog writer.

Ollama: Ollama is a tool for running LLMs locally. It offers a collection of pre-trained models, each with its own strengths, which we will use in this project.

LangChain: LangChain is a framework that simplifies building applications on top of large language models. It provides a set of abstractions and tools that make complex AI systems easier to assemble.

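Chroma: Chroma is a high-performance vector database that integrates seamlessly with LangChain. It stores and retrieves text embeddings efficiently, which is essential for the question-answering feature later in this post.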

crewai: The crewai library is a tool for creating cooperating AI agents. In this project we use it to build a team of AI "agents" that work together to produce an engaging blog post.
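
To make the idea of "storing and retrieving embeddings" concrete, here is a minimal pure-Python sketch of what a vector database does conceptually: keep (text, vector) pairs and return the texts whose vectors are closest to a query vector. `ToyVectorStore` and the hand-written three-dimensional vectors are hypothetical illustrations, not Chroma's API; a real embedding model produces vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store for illustration; not Chroma's API."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text, vector):
        self._items.append((text, vector))

    def search(self, query_vector, k=1):
        # Rank stored texts by similarity to the query, most similar first.
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add("cats and dogs", [1.0, 0.0, 0.0])
store.add("stock markets", [0.0, 1.0, 0.0])
store.add("space travel", [0.0, 0.0, 1.0])

# A query vector pointing mostly along the first axis.
print(store.search([0.9, 0.1, 0.0], k=1))
```

The linear scan is fine for three items; a real vector database uses indexing structures so that search stays fast over many vectors.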

Now let's walk through building the AI-powered blog writer step by step.

Setting Up the Environment

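Make sure Python 3.10 is installed on your system.
Install the required Python packages: langchain, crewai, bs4, and chromadb. The code in this post also imports langchain_community; if your langchain release does not pull it in automatically, install langchain-community as well.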

pip install langchain
pip install crewai
pip install bs4
pip install chromadb

Pull the required Ollama models from the cloud.

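gemma: Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind.
mistral: Cost-efficient inference for low-latency workloads.
nomic-embed-text: A high-performance open embedding model with a large token context window. It is used in the second part of this post.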

ollama pull gemma:7b
ollama pull mistral:7b
ollama pull nomic-embed-text

AI Agents

Now for the coding part.

First, create a file named blog.py and add the following content.

from langchain_community.llms import Ollama
from crewai import Agent, Task, Crew
import sys

ollama_gemma = Ollama(model='gemma:7b')
ollama_mistral = Ollama(model='mistral:7b')

We have imported all the libraries we need and then initialized the two AI models downloaded with Ollama.

def write_blog(topic):
 # Create Researcher agent
 researcher = Agent(
     role='Senior Research Analyst',
     goal=f"Uncover research in {topic}",
     backstory="""You work at a leading research company.
     Your expertise lies in identifying emerging trends, research and discover.
     You have a knack for dissecting complex data and presenting actionable insights.""",
     verbose=True,
     allow_delegation=False,
     llm=ollama_gemma
 )

 # Create Writer agent
 writer = Agent(
     role='Content Strategist',
     goal=f"Craft compelling content on {topic}",
     backstory="""You are a renowned Content Strategist, known for your insightful and engaging articles.
     You transform complex concepts into compelling narratives.""",
     verbose=True,
     allow_delegation=False,
     llm=ollama_mistral
 )

 # Create tasks for your agents
 task1 = Task(
   description=f"Conduct a comprehensive analysis of {topic}.",
   expected_output="Full analysis report in bullet points",
   agent=researcher
 )

 task2 = Task(
   description=f"""Using the insights provided, develop an engaging blog
   post that highlights the topic of {topic}.
   Your post should be informative yet accessible, catering to any audience, easy to understand.
   Make it sound cool, avoid complex words so it doesn't sound like AI.""",
   expected_output="Full blog post of at least 4 paragraphs",
   agent=writer
 )

 # Instantiate your crew with a sequential process
 crew = Crew(
     agents=[researcher, writer],
     tasks=[task1, task2],
     verbose=2,  # Set to 1 or 2 to choose the logging level
 )

 # Get your crew to work!
 result = crew.kickoff()
 return result

Now let's walk through, step by step, how this code uses AI to streamline writing a blog post.

We need four things: 1) agents to oversee the work, 2) tasks matched to each agent's expertise, 3) a Crew (a team) that is assembled and assigned those tasks, and 4) the crew carrying out its assigned tasks. That's all.
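
The four pieces above can be sketched without any library at all. The following is a toy model of a sequential crew, assuming that each task's output is handed to the next task as context; `ToyAgent`, `ToyTask`, `run_sequential`, and the two fake model functions are hypothetical names made up for this illustration and are not part of CrewAI.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyAgent:
    role: str
    llm: Callable[[str], str]  # stand-in for a real model call


@dataclass
class ToyTask:
    description: str
    agent: ToyAgent


def run_sequential(tasks: List[ToyTask]) -> str:
    """Run tasks in order, passing each output on as context for the next."""
    context = ""
    for task in tasks:
        prompt = (
            f"Role: {task.agent.role}\n"
            f"Task: {task.description}\n"
            f"Context: {context}"
        )
        context = task.agent.llm(prompt)
    return context


def fake_researcher(prompt: str) -> str:
    # A real agent would call a model; this stub returns fixed notes.
    return "- point A\n- point B"


def fake_writer(prompt: str) -> str:
    # Build the "post" from whatever context the previous task produced.
    notes = prompt.split("Context: ", 1)[1]
    return "Blog post based on:\n" + notes


researcher = ToyAgent("Senior Research Analyst", fake_researcher)
writer = ToyAgent("Content Strategist", fake_writer)
result = run_sequential([
    ToyTask("Analyze the topic.", researcher),
    ToyTask("Write the post.", writer),
])
print(result)
```

Swapping the stub functions for real model calls changes nothing about the control flow, which is the point of separating agents from tasks.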

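This approach lets us use the strengths of two different models (ollama_gemma and ollama_mistral) for the research and writing tasks respectively. By dividing the work and letting each agent focus on what it does best, we streamline the content creation process and produce good-quality blog posts quickly.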

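The name "CrewAI" fits. It draws an analogy between a diverse team whose members each bring their own expertise and a library of AI models that each offer a particular strength. With CrewAI, you can match different models to the tasks they suit best, much as a crew with varied skills covers different parts of a project. The analogy highlights how flexibly AI models can be combined to handle a wide range of problems.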

Finally, let's add a main function that takes the topic from the command line and runs the program.

def main():
 if len(sys.argv) < 2:
   print("Please provide a topic as an argument.")
   print('Usage: python blog.py "AI and data science trends in 2024"')
   sys.exit(1)

 # Get the first command-line argument after the script name
 topic = sys.argv[1]

 # Echo the topic so it is visible at the top of the output
 print("\n\n#### Topic ####")
 print(topic)

 if topic == '':
   print("Topic is empty.")
   sys.exit(1)

 result = write_blog(topic)

 print("\n\n#### Result ####")
 print(result)

if __name__ == '__main__':
 main()
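
As an alternative to reading sys.argv by hand, the same argument handling could be written with the standard library's argparse, which generates the usage message automatically. This is only a sketch of one option, not what blog.py does; `parse_topic` is a hypothetical helper name.

```python
import argparse


def parse_topic(argv):
    """Parse the topic from a list of command-line arguments."""
    parser = argparse.ArgumentParser(
        prog="blog.py",
        description="Generate a blog post on a topic.",
    )
    parser.add_argument("topic", help='e.g. "AI and data science trends in 2024"')
    args = parser.parse_args(argv)

    topic = args.topic.strip()
    if not topic:
        # parser.error prints the usage text and exits with status 2.
        parser.error("topic must not be empty")
    return topic


print(parse_topic(["AI and data science trends in 2024"]))
```

In a script, `parse_topic(sys.argv[1:])` would replace the manual length and empty-string checks.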

For example, to write a blog post about "AI and data science trends in 2024", open a terminal and run the following.

python blog.py "AI and data science trends in 2024"

The output may look something like this.

#### Topic ####
AI and data science trends in 2024

[DEBUG]: == Working Agent: Senior Research Analyst
[INFO]: == Starting Task: Conduct a comprehensive analysis of AI and data science trends in 2024.


> Entering new CrewAgentExecutor chain...
**Thought:**

I am well-positioned to provide a comprehensive analysis of AI and data science trends in 2024. My expertise in identifying emerging trends, research, and dissecting complex data enables me to uncover valuable insights into this rapidly evolving field.

**Final Answer:**

**AI and Data Science Trends in 2024:**

* **AI-powered Data Analytics:** AI is revolutionizing data analytics by automating tasks, improving accuracy, and providing actionable insights. Key advancements include natural language processing (NLP) for text and image analysis, as well as deep learning for predictive modeling.

* **Emerging AI Applications:** AI is permeating various industries, including healthcare, finance, retail, and transportation. Notable applications include self-driving cars, facial recognition, and fraud detection.

.
.
.

[DEBUG]: == Working Agent: Content Strategist
[INFO]: == Starting Task: Using the insights provided, develop an engaging blog
   post that highlights the topic of AI and data science trends in 2024.
   Your post should be informative yet accessible, catering to any audience, easy to understand.
   Make it sound cool, avoid complex words so it doesn't sound like AI.


> Entering new CrewAgentExecutor chain...
Thought: With the insights provided, I'm ready to craft an engaging blog post on AI and data science trends in 2024.

Final Answer:

**Welcome to the Future:** **AI and Data Science Trends Shaping Our World in 2024**

Imagine a world where machines learn, adapt, and make decisions just like us. That's not science fiction anymore; it's our present-day reality. Artificial Intelligence (AI) and Data Science are leading the charge, bringing us exciting advancements that reshape industries and our daily lives. Let's explore some trends that will dominate these fields in 2024.

.
.
.


#### Result ####
**Welcome to the Future:** **AI and Data Science Trends Shaping Our World in 2024**

Imagine a world where machines learn, adapt, and make decisions just like us. That's not science fiction anymore; it's our present-day reality. Artificial Intelligence (AI) and Data Science are leading the charge, bringing us exciting advancements that reshape industries and our daily lives. Let's explore some trends that will dominate these fields in 2024.

**Transforming Data with AI:** **Revolutionizing Analytics**

In the data-driven world we live in today, understanding trends and making informed decisions is crucial. And here comes AI to the rescue! Instead of spending hours manually processing and analyzing data, AI systems are automating tasks, improving accuracy, and providing actionable insights. Key advancements include Natural Language Processing (NLP) for text and image analysis and deep learning for predictive modeling. These technologies enable us to gain valuable insights from vast datasets in various industries like finance, healthcare, and retail.

**AI Everywhere:** **Emerging Applications in Multiple Sectors**

From self-driving cars revolutionizing transportation to facial recognition systems enhancing security, AI is permeating various sectors. In healthcare, it's being used for diagnosis and personalized treatment plans. In finance, AI is powering fraud detection and risk assessments. Retail is experiencing the benefits with personalized recommendations based on customer preferences. The applications of AI are endless!

**Breaking Down Data Silos:** **Data Democratization**

Data is no longer locked away in silos; it's becoming more accessible to everyone thanks to advancements in data visualization tools and platforms. This empowers individuals and organizations to make informed decisions based on data, fostering a data-driven culture where insights lead the way!

**Protecting Our Data:** **Data Privacy and Security**

As we embrace these technological advancements, data privacy and security are becoming growing concerns in the wake of increasing data breaches and cyberattacks. Regulations like GDPR and CCPA are driving the adoption of data protection measures to secure our sensitive information.

.
.
.

In conclusion, these trends in AI and data science are shaping our future in remarkable ways. From revolutionizing industries to making informed decisions, AI and Data Science are leading us towards a more connected, intelligent, and data-driven world. Stay tuned for more updates on these exciting trends!

You can see each agent's output and how the agents cooperate to carry out their tasks.

Bonus

One limitation of an AI model is that it only knows what it was trained on. If you ask it to perform a task it was not trained for, or ask about recent news, it cannot know the answer and may return inaccurate information or hallucinate.

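This section addresses the problem by supplying additional context from other sources, such as PDFs, websites, video, and audio, so that the AI can answer our questions more effectively. Put simply, instead of only sending the query to the AI on its own, we will write a second program that supplies supporting information along with it.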
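
The core idea, sending supporting text together with the question, can be sketched as plain string assembly. The template wording and the `build_prompt` helper below are assumptions made for illustration; they are not the prompt that LangChain's RetrievalQA actually uses.

```python
def build_prompt(question, context_chunks):
    """Combine retrieved text and the user's question into one prompt."""
    if not context_chunks:
        # No supporting information: the model sees only the question.
        return question
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "Who is the current president?",
    ["Excerpt one from a web page.", "Excerpt two from a web page."],
)
print(prompt)
```

Everything after this point in the section is about choosing which excerpts to pass in as `context_chunks`.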

Create a file named ask.py and add the following content.

from langchain_community.llms import Ollama
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA
import sys

ollama = Ollama(model="mistral:7b")

def answer_with_context(question, context_url):
   # Now let's load a document to ask questions against.
   loader = WebBaseLoader(context_url)
   data = loader.load()

   # This file is pretty big. Which means the full document won't fit into the context for the model. So we need to split it up into smaller pieces.
   text_splitter=RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
   all_splits = text_splitter.split_documents(data)

   # It's split up, but we have to find the relevant splits and then submit those to the model.
   # We can do this by creating embeddings and storing them in a vector database.
   # We can use Ollama directly to instantiate an embedding model.
   oembed = OllamaEmbeddings(model="nomic-embed-text")
   vectorstore = Chroma.from_documents(documents=all_splits, embedding=oembed)

   # And here is the relevant part of the document to the question.
   docs = vectorstore.similarity_search(question)
   print("\n\n#### Context From Document ####")
   print("url: ", context_url)
   print(docs)

   # The next thing is to send the question and the relevant parts of the docs to the model to see if we can get a good answer.
   qachain = RetrievalQA.from_chain_type(ollama, retriever=vectorstore.as_retriever())
   output = qachain.invoke({"query": question})
   return output['result']

def answer(question):
   output = ollama.invoke(question)
   return output

def main():
   if len(sys.argv) < 2:
       print("Please provide a question as an argument.")
       print('Usage 1: python ask.py "Who is current President in USA?"')
       print('Usage 2: python ask.py "Who is current President in USA?" "https://en.wikipedia.org/wiki/President_of_the_United_States"')
       # Exit here; otherwise sys.argv[1] below raises IndexError
       sys.exit(1)

   # Get the first command-line argument after the script name
   question = sys.argv[1]

   print("\n\n#### Question ####")
   print(question)

   # If the user provides a second argument, we will use that as the context URL to find the answer.
   context_url = ''
   if len(sys.argv) == 3:
       context_url = sys.argv[2]

   if context_url == '':
       result = answer(question)
   else:
       result = answer_with_context(question, context_url)

   print("\n\n#### Result ####")
   print(result)

if __name__ == '__main__':
   main()

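We created two functions, answer and answer_with_context.

In the answer function, the AI model tries to answer without any additional supporting information.

The answer_with_context function, on the other hand, works as follows.

1. It fetches the data (text) from the given URL.
2. The fetched text is split into chunks of up to 500 characters, converted to vectors with the nomic-embed-text model, and stored in the Chroma vector database.
3. Only the documents relevant to the user's question are retrieved.
4. Finally, the information from the vector database is passed to the AI model, which generates the answer to the question.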
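
To show what the chunk_size and chunk_overlap parameters mean, here is a simplified fixed-width splitter. It is only an approximation for illustration: RecursiveCharacterTextSplitter also tries to break on separators such as paragraph and line boundaries, which this hypothetical `split_text` function does not attempt.

```python
def split_text(text, chunk_size=500, chunk_overlap=0):
    """Cut text into pieces of at most chunk_size characters.

    Consecutive pieces share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


print(split_text("abcdefghij", chunk_size=4, chunk_overlap=0))
print(split_text("abcdefghij", chunk_size=4, chunk_overlap=2))
```

With no overlap, a sentence that straddles a boundary is cut in two; a small overlap keeps some shared text on both sides at the cost of storing more chunks.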

To run the application, use the following command:

> python ask.py "Who is current President in USA?"


#### Question ####
Who is current President in USA?


#### Result ####
As of my knowledge up to 2021, the current President of the United States is Joe Biden. He assumed office on January 20, 2021.

With context:

> python3.10 ask.py "Who is current President in USA?" "https://en.wikipedia.org/wiki/President_of_the_United_States"


#### Question ####
Who is current President in USA?


#### Context From Document ####
url:  https://en.wikipedia.org/wiki/President_of_the_United_States
[Document(page_content='^ "The Presidents of the United States of America".
Enchanted Learning. Retrieved August 2, 2018.\n\n^
"Political Parties of the Presidents". Presidents USA.
Retrieved August 2, 2018.\n\n\nFurther reading', .....
...

#### Result ####
The current President of the United States is Joe Biden.

With a little help, our AI can answer the question correctly. The code is available in this GitHub repository: https://github.com/yong-asial/local-llm

Conclusion

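In this article we learned how to use CrewAI to create multiple AI agents that carry out tasks. In the bonus part, we showed how giving the AI context helps it do a better job. We hope you find this useful; by combining these techniques you may be able to build better and more interesting applications.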

Happy coding.